Hi,

I was wondering if there is any "recognized" way to obtain table statistics.
Ideally, given a Key range I would like to know the number of distinct rowids, 
entries and amount of data (in bytes) in that key range.
I assume that Accumulo holds at least some of this information internally, 
partly because I can see some of this through the monitor, and partly because 
it must know something about the quantity of data held in order to be able to 
implement the table threshold.

In my case the tables are very static, so the "estimates" that the monitor 
has are likely to be sufficiently accurate for my purposes.

I have found this link
http://apache-accumulo.1065345.n5.nabble.com/Determining-tablets-assigned-to-table-splits-and-the-number-of-rows-in-each-tablet-td11546.html
which describes a process (which I haven't tried yet) to get the number of 
entries in a range. That would probably be sufficient for me and would 
certainly be a good start. However, it seems to use internal data structures 
and non-published APIs, which is less than ideal, and it seems to be written 
against Accumulo version 1.6.

I'm using Accumulo 1.7. Is there anything better that I can do, or is this 
the recommended way to go?

Regards,

Z
