On Wed, Jun 6, 2012 at 1:46 PM, William Slacum <[email protected]> wrote:
> You're kind of there. Essentially, you can think of your Scanner's
> interactions with the TServers as a tree with a height of two. Your

One comment to add: the Scanner will do this work serially, one tablet
server at a time. The BatchScanner would execute the iterator in parallel
on multiple tablet servers at once.

> Scanner is the "root" and its children are all of the TServers it
> needs to interact with. Essentially, the operation you'd want to is
> sum the number of records each of the children have.
>
> In Accumulo terms, you can use something like a CountingIterator to
> count the number of results on each TServer. You can then sum all of
> those intermediate results to get a total count of results.
>
> On Wed, Jun 6, 2012 at 10:39 AM, Hunter Provyn <[email protected]> wrote:
>> I want to know the number of records a scanner has without actually getting
>> the records from cloudbase.
>> I've been looking at CountingIterator (1.3.4), which has a getCount()
>> method. However, I don't know how
>> to access the instance to call getCount() on it because Cloudbase server
>> just passes back the entries and doesn't expose the instance of the
>> iterator.
>>
>> It is possible to use an AggregatingIterator to aggregate all entries into a
>> single entry whose value is the number of entries. But I was wondering if
>> there was a better way that possibly makes use of the CountingIterator
>> class.
>>
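To make the serial versus parallel point concrete, here is a minimal sketch in plain Java. It does not use the Accumulo or Cloudbase API at all: the class PartialCountSketch and its methods are hypothetical names, and each inner list simply stands in for the entries held by one tablet server. It only illustrates the shape of the computation, in which each server reduces its entries to one partial count and the client sums those partials, either one server at a time or several at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical simulation, not Accumulo code: each inner list plays the
// role of the entries that one tablet server holds for the scanned range.
class PartialCountSketch {

    // Stand-in for a server-side counting iterator: one server's entries
    // are reduced to a single number before anything is sent back.
    static long countOnServer(List<String> entries) {
        return entries.size();
    }

    // Scanner-like behaviour: visit the servers one at a time and sum
    // the partial counts on the client.
    static long sumSerial(List<List<String>> servers) {
        long total = 0;
        for (List<String> entries : servers) {
            total += countOnServer(entries);
        }
        return total;
    }

    // BatchScanner-like behaviour: ask several servers concurrently,
    // then sum the partial counts on the client.
    static long sumParallel(List<List<String>> servers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (List<String> entries : servers) {
                Callable<Long> task = () -> countOnServer(entries);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // blocks until that server's count is ready
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> servers = Arrays.asList(
            Arrays.asList("row1", "row2"),
            Arrays.asList("row3"),
            Arrays.asList("row4", "row5", "row6"));
        System.out.println(sumSerial(servers));      // prints 6
        System.out.println(sumParallel(servers, 2)); // prints 6
    }
}
```

Both methods give the same total; only the order in which the servers are consulted differs, which is the distinction between the Scanner and the BatchScanner described above.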
