Yeah, there's no explicit tracking of all rows in Accumulo; you're stuck
with enumerating them (or explicitly tracking them yourself at ingest time).
The easiest approach you can take is probably using the
FirstEntryInRowIterator and counting each row on the client-side.
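
To make the approach above concrete, here is a rough plain-Java model of it. This is only a sketch and does not use the Accumulo API: RowCountSketch, Entry, firstEntryPerRow and countRows are names made up for illustration. firstEntryPerRow stands in for what FirstEntryInRowIterator does on the server (it assumes entries arrive sorted by row, as a scan returns them), and countRows stands in for the client loop that counts whatever comes back.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only; none of these names are Accumulo classes.
final class RowCountSketch {

    // Hypothetical stand-in for a key: a row id plus a column.
    static final class Entry {
        final String row;
        final String column;

        Entry(String row, String column) {
            this.row = row;
            this.column = column;
        }
    }

    // Models the server-side step: keep only the first entry of each row.
    // Assumes the input is already sorted by row.
    static List<Entry> firstEntryPerRow(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        String lastRow = null;
        for (Entry e : sorted) {
            if (!e.row.equals(lastRow)) {
                out.add(e);
                lastRow = e.row;
            }
        }
        return out;
    }

    // Models the client-side step: one returned entry per row, so counting
    // the returned entries counts the rows.
    static long countRows(List<Entry> sorted) {
        long count = 0;
        for (Entry ignored : firstEntryPerRow(sorted)) {
            count++;
        }
        return count;
    }
}
```

In a real client the filtering would happen on the tablet servers by attaching FirstEntryInRowIterator to the scan, so only one entry per row crosses the network; the counting loop is the only part that runs on the client.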
You could do another summation in a second iterator, but this is a little
tricky to get correct. I tried to touch on this a little in a blog
post[1]. If this is a one-off question you want to answer, doing the
summation on the client side is unlikely to take much longer than a
server-side summation.
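
One part of the server-side variant can be modeled simply: an iterator only sees the tablet it is running on, so even with a summing iterator the client gets back several partial counts and still has to add them up. The sketch below is plain Java, not Accumulo code; PartialSumSketch, partialCount and totalRows are hypothetical names, and it leaves out the harder parts the blog post goes into. It relies on a row never being split across tablets.

```java
import java.util.List;

// Plain-Java model only; none of these names are Accumulo classes.
final class PartialSumSketch {

    // Hypothetical stand-in for a second, summing iterator: collapses the
    // per-row entries seen within one tablet into a single number.
    static long partialCount(List<String> rowsInTablet) {
        return rowsInTablet.size();
    }

    // Client side: one partial count arrives per tablet and they are added.
    static long totalRows(List<List<String>> rowsPerTablet) {
        long total = 0;
        for (List<String> tablet : rowsPerTablet) {
            total += partialCount(tablet);
        }
        return total;
    }
}
```

Either way the client does a final sum; the server-side version just shrinks what is sent back from one entry per row to one number per tablet.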
[1] https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo
z11373 wrote:
I want to get the total number of rows in a table (it likely has more than
100M rows). I think that to get that information, Accumulo would have to
iterate over all rows :-( This may not be a typical Accumulo scenario.
Is there a more efficient way to get the total number of rows in a table?
When Accumulo iterates over those items, does that mean it will pull the
data to the client? If yes, is there a way to ask it to return just the
number, since that's the only data I care about?
Thanks,
Z