Hi Lars,

In my case, I just want to use ColumnPaginationFilter() rather than implementing my own filter logic. Is there an easy way to apply this filter on top of an existing scanner? Do I do something like
RegionScannerImpl scanner = new RegionScannerImpl(scan_with_my_filter, original_compaction_scanner)

Thanks
Varun

On Mon, Dec 10, 2012 at 9:09 PM, lars hofhansl <[email protected]> wrote:

> In your case you probably just want to filter on top of the provided
> scanner with preCompact (rather than actually replacing the scanner, which
> preCompactScannerOpen does).
>
> (And sorry I only saw this reply after I sent my own reply to your initial
> question.)
>
> ________________________________
> From: Varun Sharma <[email protected]>
> To: [email protected]
> Sent: Monday, December 10, 2012 7:29 AM
> Subject: Re: Filtering/Collection columns during Major Compaction
>
> Okay - I looked more thoroughly again - I should be able to extract these
> from the region observer.
>
> Thanks!
>
> On Mon, Dec 10, 2012 at 6:59 AM, Varun Sharma <[email protected]> wrote:
>
> > Thanks! This is exactly what I need. I am looking at the code in
> > compactStore() under Store.java, but I am trying to understand why, for
> > the real compaction, smallestReadPoint needs to be passed - I thought the
> > read point was a memstore-only thing. Also, preCompactScannerOpen does
> > not have a way of passing this value.
> >
> > Varun
> >
> > On Mon, Dec 10, 2012 at 6:08 AM, ramkrishna vasudevan <
> > [email protected]> wrote:
> >
> >> Hi Varun,
> >>
> >> If you are using the 0.94 version, you have a coprocessor that gets
> >> invoked before and after compaction selection.
> >> preCompactScannerOpen() lets you create your own scanner, which
> >> actually does the next() operation.
> >> Now if you wrap your own scanner and implement your next(), it will
> >> help you to play with the KVs that you need. So basically you can say
> >> which columns to include and which to exclude.
> >> Does this help you, Varun?
> >>
> >> Regards
> >> Ram
> >>
> >> On Mon, Dec 10, 2012 at 7:28 PM, Varun Sharma <[email protected]>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > My understanding of major compaction is that it rewrites one store
> >> > file by merging the memstore and the store files on disk, cleaning
> >> > out delete tombstones and the puts prior to them, and cleaning out
> >> > excess versions. We want to limit the number of columns per row in
> >> > HBase. Also, we want to limit them in lexicographically sorted order
> >> > - which means we take the top, say, 100 smallest columns (in the
> >> > lexicographical sense), keep only those, and discard the rest.
> >> >
> >> > One way to do this would be to clean out columns in a daily MapReduce
> >> > job. Another way is to clean them out during the major compaction,
> >> > which can be run daily too. I see from the code that a major
> >> > compaction essentially invokes a Scan over the region - so if the
> >> > Scan is invoked with the appropriate filter (say
> >> > ColumnCountGetFilter), would that do the trick?
> >> >
> >> > Thanks
> >> > Varun
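
For concreteness, a minimal sketch of the wrapping approach Lars describes - filtering on top of the provided scanner in preCompact() rather than replacing it - might look roughly like the following. It is written against my reading of the 0.94-era RegionObserver/InternalScanner API; the class name ColumnLimitingObserver, the MAX_COLUMNS constant, and the hand-rolled per-row column limit (standing in for ColumnPaginationFilter, since an InternalScanner wrapper does not take a Filter object directly) are all illustrative assumptions, not code from this thread.

import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.Store;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative observer: during compaction, keep only the first MAX_COLUMNS
// (lexicographically smallest) qualifiers of each row and drop the rest.
public class ColumnLimitingObserver extends BaseRegionObserver {

  private static final int MAX_COLUMNS = 100;  // illustrative limit

  @Override
  public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> e,
      Store store, InternalScanner scanner) {
    // Wrap (rather than replace) the scanner the compaction would have used.
    return new ColumnLimitingScanner(scanner, MAX_COLUMNS);
  }

  // Delegating scanner that drops KeyValues beyond the per-row column limit.
  private static class ColumnLimitingScanner implements InternalScanner {
    private final InternalScanner delegate;
    private final int maxColumns;

    // State carried across next() calls, because one row can span several
    // of the batches handed back to the compaction loop.
    private byte[] currentRow = null;
    private byte[] currentQualifier = null;
    private int columnsSeen = 0;

    ColumnLimitingScanner(InternalScanner delegate, int maxColumns) {
      this.delegate = delegate;
      this.maxColumns = maxColumns;
    }

    // Remove every KeyValue belonging to a column past the limit. KeyValues
    // arrive sorted by row, then qualifier, then version, and a compaction
    // scanner is per column family, so only the qualifier needs tracking.
    private void trim(List<KeyValue> kvs) {
      Iterator<KeyValue> it = kvs.iterator();
      while (it.hasNext()) {
        KeyValue kv = it.next();
        if (currentRow == null || !Bytes.equals(currentRow, kv.getRow())) {
          currentRow = kv.getRow();              // new row: reset bookkeeping
          currentQualifier = null;
          columnsSeen = 0;
        }
        if (currentQualifier == null
            || !Bytes.equals(currentQualifier, kv.getQualifier())) {
          currentQualifier = kv.getQualifier();  // new column within the row
          columnsSeen++;
        }
        if (columnsSeen > maxColumns) {
          it.remove();                           // past the limit: do not rewrite
        }
      }
    }

    public boolean next(List<KeyValue> results) throws IOException {
      boolean more = delegate.next(results);
      trim(results);
      return more;
    }

    public boolean next(List<KeyValue> results, int limit) throws IOException {
      boolean more = delegate.next(results, limit);
      trim(results);
      return more;
    }

    // Some 0.94 releases declare extra next(..., String metric) overloads on
    // InternalScanner; funnel them through the plain variants above.
    public boolean next(List<KeyValue> results, String metric) throws IOException {
      return next(results);
    }

    public boolean next(List<KeyValue> results, int limit, String metric)
        throws IOException {
      return next(results, limit);
    }

    public void close() throws IOException {
      delegate.close();
    }
  }
}

Such an observer would be registered like any other RegionObserver (per table, or region-server-wide via hbase.coprocessor.region.classes). If your HBase version tells the hook whether the compaction is a major one, you would probably apply the limit only then, since a minor compaction sees only a subset of the store files.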
