[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653893#comment-16653893
 ] 

Andrew Purtell commented on HBASE-21301:
----------------------------------------

Above, Archana references another internal discussion: "see W-5473921 Enhance 
compaction upgrade decision to consider file statistics". The idea there is:
{quote}
Currently we can decide to upgrade a minor compaction to major if the data 
locality of the store files is below a threshold. There are other reasons we 
may want to upgrade the compaction. For example, the largest store file might 
be full of deleted cells. 
{quote}
Although this was originally formulated to use statistics embedded in the 
hfiles, it seems a lot more natural to do it with time series data kept in a 
system table. What we want to know, for a given store file, is what percentage 
of its cells are covered by tombstones. We can only know that by looking at 
the state of things at some later time, well after the store file itself was 
written. It is easy to see how time series data calculated and queried during 
compaction could support this use case, and a lot harder to see how store file 
metadata could. I mention this only as an example of future work that would be 
enabled by the system metrics table proposed on this issue. I will file 
another issue about this at some point.
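
As a rough illustration of the decision described above, here is a minimal 
sketch in Java. Everything in it is hypothetical: the class, the Sample 
record, and the thresholds are made up for illustration and are not existing 
HBase APIs. It assumes the metrics table can return, per store file, a series 
of samples recording how many of the file's cells were covered by tombstones 
at each observation time.

```java
import java.util.List;

/** Hypothetical sketch only; none of these names are HBase APIs. */
final class CompactionUpgradeSketch {

    /** One time series sample for a store file (assumed metrics table row). */
    static final class Sample {
        final long timestampMs;              // when the observation was made
        final long totalCells;               // cells in the store file
        final long cellsCoveredByTombstones; // cells masked by later deletes

        Sample(long timestampMs, long totalCells, long cellsCoveredByTombstones) {
            this.timestampMs = timestampMs;
            this.totalCells = totalCells;
            this.cellsCoveredByTombstones = cellsCoveredByTombstones;
        }
    }

    /** Fraction of cells covered by tombstones in the most recent sample. */
    static double latestCoveredFraction(List<Sample> samples) {
        Sample latest = null;
        for (Sample s : samples) {
            if (latest == null || s.timestampMs > latest.timestampMs) {
                latest = s;
            }
        }
        if (latest == null || latest.totalCells <= 0) {
            return 0.0; // no data: treat the file as having no covered cells
        }
        return (double) latest.cellsCoveredByTombstones / latest.totalCells;
    }

    /**
     * Upgrade a minor compaction to major if locality is below its threshold
     * (the existing criterion) or if the largest store file is mostly
     * covered by tombstones (the additional criterion discussed above).
     */
    static boolean shouldUpgradeToMajor(double locality,
                                        double minLocality,
                                        List<Sample> largestFileSamples,
                                        double maxCoveredFraction) {
        if (locality < minLocality) {
            return true;
        }
        return latestCoveredFraction(largestFileSamples) > maxCoveredFraction;
    }
}
```

The point of the sketch is only that the tombstone coverage of a file is an 
observation made after the file was written, so it is looked up from the 
samples at compaction time rather than read from the file's own metadata.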

> Heatmap for key access patterns
> -------------------------------
>
>                 Key: HBASE-21301
>                 URL: https://issues.apache.org/jira/browse/HBASE-21301
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Archana Katiyar
>            Assignee: Archana Katiyar
>            Priority: Major
>
> Google recently released a beta feature for Cloud Bigtable which presents a 
> heat map of the keyspace. *Given how hotspotting comes up now and again here, 
> this is a good idea for giving HBase ops a tool to be proactive about it.* 
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a 
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer 
> helps debug performance issues due to unbalanced access patterns across the 
> key space, or single rows that are too large or receiving too much read or 
> write activity. With Key Visualizer, you get a heat map visualization of 
> access patterns over time, along with the ability to zoom into specific key 
> or time ranges, or select a specific row to find the full row key ID that's 
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud 
> Bigtable clusters with sufficient data or activity, and does not affect Cloud 
> Bigtable cluster performance. 
> <<<
> From 
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)



