StoreFile query usage report
----------------------------

                 Key: HBASE-4147
                 URL: https://issues.apache.org/jira/browse/HBASE-4147
             Project: HBase
          Issue Type: Improvement
            Reporter: Doug Meil
            Priority: Minor


Detailed information on what HBase is doing in terms of reads is hard to come 
by.

What would be useful is a periodic StoreFile query report.  Specifically, 
this could run on a configured interval (e.g., every 30 or 60 seconds) and 
dump its output to the log files.

The report would list every StoreFile accessed during the reporting period 
(with the Path we would also know the region, CF, and table), the number of 
times the StoreFile was accessed, the size of the StoreFile, and the total 
time (ms) spent processing that StoreFile.
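
A minimal sketch of what such a collector might look like, assuming the read 
path can call a hook once per StoreFile access.  Everything here 
(StoreFileQueryReport, record, drainReport, start, and the log line format) 
is hypothetical and for illustration only; none of it is existing HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

/** Hypothetical sketch of a periodic StoreFile query report; not HBase code. */
public class StoreFileQueryReport {
    private static final Logger LOG = Logger.getLogger("StoreFileQueryReport");

    /** Counters for one StoreFile within the current reporting period. */
    static final class Stats {
        long accessCount;
        long totalTimeMs;
        long sizeBytes;
    }

    private final Map<String, Stats> statsByPath = new ConcurrentHashMap<>();

    /** Assumed hook: called once per StoreFile access by the read path. */
    public void record(String path, long sizeBytes, long elapsedMs) {
        // compute() runs atomically per key, so the plain fields are safe here.
        statsByPath.compute(path, (p, s) -> {
            if (s == null) {
                s = new Stats();
            }
            s.accessCount++;
            s.totalTimeMs += elapsedMs;
            s.sizeBytes = sizeBytes;
            return s;
        });
    }

    /** Returns one line per StoreFile accessed this period and resets it. */
    public List<String> drainReport() {
        List<String> lines = new ArrayList<>();
        for (String path : new ArrayList<>(statsByPath.keySet())) {
            Stats s = statsByPath.remove(path);
            if (s != null) {
                lines.add(String.format(Locale.ROOT,
                    "storefile=%s accesses=%d sizeBytes=%d totalTimeMs=%d",
                    path, s.accessCount, s.sizeBytes, s.totalTimeMs));
            }
        }
        return lines;
    }

    /** Logs the report every intervalSeconds until the executor is shut down. */
    public ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> {
            for (String line : drainReport()) {
                LOG.info(line);
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return ses;
    }
}
```

Keeping the per-access work to a single map update is meant to keep the 
overhead on the read path small; all formatting happens on the reporting 
thread.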

Even this level of summary would be useful for detecting which tables & CFs 
are being accessed the most, and including the StoreFile would provide insight 
into relative "uncompaction" (i.e., lots of StoreFiles).
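
As a rough illustration of that kind of summary, the sketch below rolls 
per-StoreFile access counts up to table/CF and counts how many StoreFiles 
were touched in each.  It assumes paths end in 
<table>/<region>/<cf>/<storefile>; that layout, the class name 
StoreFilePathSummary, and both method names are assumptions made for the 
example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that summarizes StoreFile accesses per table/CF. */
public class StoreFilePathSummary {

    /** Extracts "table/cf", assuming .../<table>/<region>/<cf>/<storefile>. */
    static String tableAndFamily(String path) {
        String[] parts = path.split("/");
        int n = parts.length;
        if (n < 4) {
            throw new IllegalArgumentException("unexpected path: " + path);
        }
        return parts[n - 4] + "/" + parts[n - 2];
    }

    /**
     * Maps "table/cf" to {total accesses, number of StoreFiles accessed}.
     * A high StoreFile count for one CF hints at pending compaction.
     */
    static Map<String, long[]> summarize(Map<String, Long> accessesByPath) {
        Map<String, long[]> out = new TreeMap<>();
        for (Map.Entry<String, Long> e : accessesByPath.entrySet()) {
            long[] agg = out.computeIfAbsent(
                tableAndFamily(e.getKey()), k -> new long[2]);
            agg[0] += e.getValue(); // total accesses for this table/CF
            agg[1] += 1;            // one more StoreFile seen for this table/CF
        }
        return out;
    }
}
```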

I think log output, as opposed to a UI, is an important facet of this.  I'm 
assuming that users will slice and dice this data on their own, so I think we 
should skip any kind of admin view for now (e.g., new JSPs or new APIs to 
expose this data).  Just getting this into the log file would be a big 
improvement.

Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
it will.  However, flying a plane without any instrumentation isn't fun.  :-)  

 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
