An iterator stack of FirstEntryInRowIterator + CountingIterator will return the count of rows in each tablet, which can then be combined on the client side.
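To make the two steps concrete, here is a rough sketch in plain Java. It does not use the Accumulo API at all; it only simulates the idea on in-memory data. The class and method names (RowCountSketch, countRowsInTablet, combine) are made up for illustration, and each entry is assumed to be a {row, column} pair already sorted by row, as a scan would return it. Since tablets split on row boundaries, a row should not straddle two tablets, so the per-tablet counts can simply be added.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: simulates the counting logic, not real Accumulo iterators.
final class RowCountSketch {

    // Counts the rows in one tablet. Entries are {row, column} pairs sorted by row.
    // Only the first entry seen for each row bumps the counter, which is the effect
    // of keeping the first entry per row and then counting what is left.
    static long countRowsInTablet(List<String[]> sortedEntries) {
        long rows = 0;
        String previousRow = null;
        for (String[] entry : sortedEntries) {
            String row = entry[0];
            if (!row.equals(previousRow)) {
                rows++;
                previousRow = row;
            }
        }
        return rows;
    }

    // Client-side step: add up the partial counts returned for each tablet.
    static long combine(List<Long> perTabletCounts) {
        long total = 0;
        for (long count : perTabletCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two made-up tablets: the first holds rows "a" and "b", the second holds row "c".
        List<String[]> tablet1 = Arrays.asList(
                new String[] {"a", "cf:x"},
                new String[] {"a", "cf:y"},
                new String[] {"b", "cf:x"});
        List<String[]> tablet2 = Arrays.asList(
                new String[] {"c", "cf:x"},
                new String[] {"c", "cf:y"});

        long total = combine(Arrays.asList(
                countRowsInTablet(tablet1),
                countRowsInTablet(tablet2)));
        System.out.println("total rows: " + total);
    }
}
```

In a real deployment the per-tablet step would run server-side in the iterator stack and only the small partial counts would travel to the client; the sketch just shows why adding them up gives the table total.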

On Mon, Nov 9, 2015 at 10:25 AM, Josh Elser <[email protected]> wrote:

> Yeah, there's no explicit tracking of all rows in Accumulo, you're stuck
> with enumerating them (or explicitly tracking them yourself at ingest time).
>
> The easiest approach you can take is probably using the
> FirstEntryInRowIterator and counting each row on the client-side.
>
> You could do another summation in a second iterator but this is a little
> tricky to get correct. I tried to touch on this a little in a blog post[1].
> If this is a one-off question you want to answer, doing the summation on
> the client side is likely not to take excessively longer than a server-side
> summation.
>
> [1] https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo
>
> z11373 wrote:
>
>> I want to get total rows of a table (likely has more than 100M rows), I
>> think to get that information, Accumulo would have to iterate all rows :-(
>> This may not be typical Accumulo scenario.
>>
>> Is there a more efficient way to get total number of rows in a table?
>> When Accumulo iterating those items, does it mean it will pull the data to
>> the client? If yes, is there a way to ask it to return just the number,
>> since that's the only data I care.
>>
>> Thanks,
>> Z
>>
>> --
>> View this message in context:
>> http://apache-accumulo.1065345.n5.nabble.com/total-table-rows-tp15484.html
>> Sent from the Developers mailing list archive at Nabble.com.
