I tried to build an MR job, but in my case that doesn't work, because if I set the batch size to, for example, 1000 and a row has 5000 columns, the map function is called once per batch chunk. I want to emit something for rows whose column count is bigger than 2500, BUT since the map function is executed for every batch chunk, I can't tell inside a single call whether the whole row has more than 2500 columns.
any ideas?

2013/10/25 lars hofhansl <[email protected]>

> We need to finish up HBASE-8369
>
> ________________________________
> From: Dhaval Shah <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Thursday, October 24, 2013 4:38 PM
> Subject: Re: RE: Add Columnsize Filter for Scan Operation
>
> Well that depends on your use case ;)
>
> There are many nuances/code complexities to keep in mind:
> - merging results of various HFiles (each region can have more than one)
> - merging results of the WAL
> - applying delete markers
> - how about data which is only in memory of region servers and nowhere else
> - applying bloom filters for efficiency
> - what about HBase filters?
>
> At some point you would basically start rewriting an HBase region server
> in your MapReduce job, which is not ideal for maintainability.
>
> Do we ever read MySQL data files directly, or do we issue a SQL query? Kind of
> goes back to the same argument ;)
>
> Sent from Yahoo Mail on Android
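Until something like HBASE-8369 lands, one workaround is to treat the batch chunks as partial counts and sum them per row key before applying the threshold, i.e. have the map phase emit (rowKey, columnsInChunk) and do the threshold check in the reduce phase. A minimal sketch of that counting logic in plain Java (the class and method names are illustrative, not HBase API; in a real job the chunks would come from a batched Scan):

```java
import java.util.*;

// Sketch: a batched Scan splits one row into several chunks, so the
// column count must be summed per row key before the threshold check.
// This mirrors a map phase emitting (rowKey, chunkSize) and a reduce
// phase summing the counts; names are illustrative, not HBase API.
public class RowColumnCount {

    // Sum the per-chunk column counts by row key (the "reduce" step).
    static Map<String, Integer> countColumns(List<Map.Entry<String, Integer>> chunks) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map.Entry<String, Integer> chunk : chunks) {
            totals.merge(chunk.getKey(), chunk.getValue(), Integer::sum);
        }
        return totals;
    }

    // Keep only rows whose total column count exceeds the threshold.
    static Set<String> rowsAbove(Map<String, Integer> totals, int threshold) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            if (e.getValue() > threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // row-1 has 5000 columns delivered in five batches of 1000;
        // row-2 has 2000 columns delivered in two batches.
        List<Map.Entry<String, Integer>> chunks = List.of(
            Map.entry("row-1", 1000), Map.entry("row-1", 1000),
            Map.entry("row-1", 1000), Map.entry("row-1", 1000),
            Map.entry("row-1", 1000),
            Map.entry("row-2", 1000), Map.entry("row-2", 1000));

        // Only row-1 exceeds the 2500-column threshold.
        System.out.println(rowsAbove(countColumns(chunks), 2500));
    }
}
```

This sidesteps the per-chunk visibility problem at the cost of shuffling one count per chunk, while still letting the server side handle the nuances Dhaval lists (HFile merging, delete markers, memstore contents).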
