[ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4147:
-------------------------

         Priority: Major  (was: Critical)
    Fix Version/s:     (was: 0.96.0)

This turned into a (useful) discussion.  [~eclark] Can you take a look and note 
what in 0.96 metrics2 might help answering the questions Doug poses above?  We 
also have a trace mechanism committed but again it would take some work to get 
it to the level Doug is asking for in the above.

It would seem that this issue should become two issues now: one to improve the 
trace so can go down to the per-storefile level and another to add to metrics 
so can do at the storefile emissions (if possible).

Meantime, marking this as non-critical and moving out of 0.96 while it is w/o a 
sponsor.
                
> StoreFile query usage report
> ----------------------------
>
>                 Key: HBASE-4147
>                 URL: https://issues.apache.org/jira/browse/HBASE-4147
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Doug Meil
>         Attachments: hbase_4147_storefilereport_2011_08_10.pdf, 
> hbase_4147_storefilereport.pdf
>
>
> Detailed information on what HBase is doing in terms of reads is hard to come 
> by.
> What would be useful is to have a periodic StoreFile query report.  
> Specifically, this could run on a configured interval (e.g., every 30 
> seconds, 60 seconds) and dump the output to the log files.
> This would have all StoreFiles accessed during the reporting period (and with 
> the Path we would also know region, CF, and table), # of times the StoreFile 
> was accessed, the size of the StoreFile, and the total time (ms) spent 
> processing that StoreFile.
> Even this level of summary would be useful to detect a which tables & CFs are 
> being accessed the most, and including the StoreFile would provide insight 
> into relative "uncompaction" (i.e., lots of StoreFiles).
> I think the log-output, as opposed to UI, is an important facet with this.  
> I'm assuming that users will slice and dice this data on their own so I think 
> we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
> expose this data).  Just getting this to log-file would be a big improvement.
> Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
> it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
>  
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to