[ https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653893#comment-16653893 ]

Andrew Purtell commented on HBASE-21301:
----------------------------------------

Above Archana references another internal discussion as "see W-5473921 Enhance compaction upgrade decision to consider file statistics". The idea there is
{quote}
Currently we can decide to upgrade a minor compaction to major if the data locality of the store files is below a threshold. There are other reasons we may want to upgrade the compaction. For example, the largest store file might be full of deleted cells.
{quote}
While originally formulated to use statistics embedded in the hfiles, actually it seems a lot more natural to do this with time series data kept in a system table. What we want to know is, for a given store file, what percentage of its cells are covered by tombstones, but we only know that for a given store file by looking at the state of things at some later time well past when the store file itself is written. It's easy to see how timeseries data calculated and queried during compaction could support the use case, a lot harder to see how store file metadata could.

I mention this only as an example of future work that would be enabled by the system metrics table proposed on this issue. I will file another issue about this at some point.

> Heatmap for key access patterns
> -------------------------------
>
>                 Key: HBASE-21301
>                 URL: https://issues.apache.org/jira/browse/HBASE-21301
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Archana Katiyar
>            Assignee: Archana Katiyar
>            Priority: Major
>
> Google recently released a beta feature for Cloud Bigtable which presents a
> heat map of the keyspace. *Given how hotspotting comes up now and again here,
> this is a good idea for giving HBase ops a tool to be proactive about it.*
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer
> helps debug performance issues due to unbalanced access patterns across the
> key space, or single rows that are too large or receiving too much read or
> write activity. With Key Visualizer, you get a heat map visualization of
> access patterns over time, along with the ability to zoom into specific key
> or time ranges, or select a specific row to find the full row key ID that's
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud
> Bigtable clusters with sufficient data or activity, and does not affect Cloud
> Bigtable cluster performance.
> <<<
> From
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)