On Mon, Dec 9, 2013 at 4:18 PM, Terry P. <[email protected]> wrote:

> Thanks Billie and Christopher, sounds like I should have the purge
> iterator run after the VersioningIterator.
>
> Keith, uh oh, I was not aware that not all compactions will see the entire
> row. That sounds like it could be bad for my case! Here is the original
> thread that you helped me with as background:
>
Sometimes Accumulo compacts only a subset of the data in a tablet. This can happen during a minor compaction, or when a major compaction operates on a subset of the tablet's files. A row's columns and updates can be spread across multiple files, so in these cases an iterator may see only a subset of the columns in a row, and it may not see the latest version. Scans and full major compactions see all data.

You can tell the difference when an iterator is initialized. An IteratorEnvironment is passed into the init method. If the scope is majc and isFullMajorCompaction() is true, then you know you will see all data (the same holds if the scope is scan). For minor compactions and partial major compactions, you may want to just let everything pass.

>
> http://mail-archives.apache.org/mod_mbox/accumulo-user/201311.mbox/%3ccagutchryw3rr9pf5bad+psxe-dswl9fyogvv5mn_wj00o2m...@mail.gmail.com%3E
>
> We only have 10-12 k/v pairs per row -- is that a factor? Can you explain
> the nuances with respect to when a compaction won't see the entire row?
>
> Thanks,
> Terry
>
> On Mon, Dec 9, 2013 at 1:34 PM, Keith Turner <[email protected]> wrote:
>
>> On Mon, Dec 9, 2013 at 12:02 PM, Terry P. <[email protected]> wrote:
>>
>>> Greetings all,
>>> With Accumulo v1.4.2, we have a purge filter/iterator that extends
>>> RowFilter and I have a question about what priority it should be
>>> implemented with. I see the default VersioningIterator runs at priority 20.
>>>
>>> Our purge iterator is designed to suppress (scan time) or remove (majc
>>> or minc compactions) rows based on the value in a column. Is it more
>>> efficient to run our purge iterator at a higher priority than the
>>> VersioningIterator, or does it
>>>
>>
>> Are you aware that not all compactions will see the entire row?
>>
>>
>>> really matter? Our VersioningIterator maxVersions is set to the default
>>> of 1 which is what we want/need.
>>>
>>> Thanks in advance,
>>> Terry
>>>
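
To make the scope check in the reply above concrete, here is a rough sketch of the decision logic only. It is not a real Accumulo iterator: the nested Scope enum is a stand-in that mirrors the three scope names (scan, minc, majc), and PurgeScopeCheck, seesAllData and acceptRow are hypothetical names chosen for illustration. In an actual RowFilter subclass, the two inputs would presumably be captured in init() from env.getIteratorScope() and env.isFullMajorCompaction().

```java
// Sketch only: stand-in types, no Accumulo dependency.
final class PurgeScopeCheck {

    // Stand-in for the iterator scope reported by IteratorEnvironment.
    enum Scope { scan, minc, majc }

    private PurgeScopeCheck() {}

    // True when the iterator is guaranteed to see every column and the
    // latest version of each entry in a row: scans and full major compactions.
    static boolean seesAllData(Scope scope, boolean fullMajorCompaction) {
        return scope == Scope.scan
                || (scope == Scope.majc && fullMajorCompaction);
    }

    // Decide whether a row is kept. purgeRuleMatches is the result of the
    // application's own purge test (hypothetical here).
    static boolean acceptRow(boolean purgeRuleMatches, Scope scope,
                             boolean fullMajorCompaction) {
        if (!seesAllData(scope, fullMajorCompaction)) {
            // Minor or partial major compaction: the row may be incomplete,
            // so let everything pass rather than purge on partial data.
            return true;
        }
        // Full view of the row: safe to apply the purge rule.
        return !purgeRuleMatches;
    }
}
```

With this shape, a row that matches the purge rule is still suppressed at scan time and removed during a full major compaction, but survives minor and partial major compactions untouched.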
