On Tue, Feb 26, 2013 at 12:31 PM, Mike Hugo <[email protected]> wrote:
> Our row keys are a combination of two elements, like this:
>
> foo/bar
> foo/baz
> foo/bee
>
> eee/blah
> eee/boo
>
> When running without any ranges set, we're missing an entire prefix worth
> - e.g. we don't get any rows that start with "foo"
>

That sounds like a clue, because Accumulo doesn't know about the format of
your row keys. If it were dropping arbitrary rows, I would expect you to
see some foo-prefixed rows and not others. Are there any other differences
in the two runs? How is the TimestampFilter configured?

Billie

>
> When I tried running with the range set, I did a prefix range on "foo" and
> it then found the rows starting with "foo"
>
>
> On Tue, Feb 26, 2013 at 2:28 PM, Billie Rinaldi <[email protected]> wrote:
>
>> Have you noticed any pattern in the rows it seems to be missing? E.g.
>> every other row, the last row in each tablet, etc.? When you set a range,
>> what range did you set?
>>
>> Billie
>>
>>
>> On Tue, Feb 26, 2013 at 12:17 PM, Mike Hugo <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I'm running a map reduce job over a table using AccumuloRowInputFormat.
>>> For debugging purposes I'm logging the key.getRow() so I can see what rows
>>> it's finding as it progresses.
>>>
>>> If I don't specify any ranges on the input format, it skips significant
>>> number of rows - that is, I don't see any logging indicating that it
>>> traversed them.
>>>
>>> To see if it was a visibility issue, I tried explicitly setting a range,
>>> like this:
>>>
>>> AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);
>>>
>>> When doing that it does process the rows that it otherwise skips.
>>>
>>> The same TimestampFilter is being applied in both scenarios, no other
>>> filters / iterators are being used.
>>>
>>> Any thoughts on why, when run without the ranges specified, it isn't
>>> seeing a significant portion of the data?
>>>
>>> Thanks,
>>>
>>> Mike
>>>
