[
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660058#comment-16660058
]
Archana Katiyar commented on HBASE-21301:
-----------------------------------------
Sure [~apurtell], thanks.
I was working on the *logic to generate UID for given string*; here the problem
statement is twofold -
* We should be able to look up the uid, given the string.
* Also, we should be able to look up the string back, given the uid.
So, we will have to store two rows in the table for each uid, one for each
direction of lookup.
Generating the uid can be done by using incrementColumnValue (in HTable), but
saving the uid against the string equivalent (and vice versa) will require some
sort of lock because HBase doesn't guarantee atomicity of operations across
rows. I am thinking of using a ZooKeeper lock here.
Following is the algorithm that I am working on -
For given string (like table name)
# Check if there is an entry in the table with given string as key
# If yes, then use the value as uid.
# If not, then follow next set of steps -
# Take a lock at the ZooKeeper level for the given string.
# Check again if there is an entry in the table with given string as key; this
is important as someone else might have taken the lock, finished the work and
released the lock between steps #1 and #4. If there is an entry present,
simply use that value as uid. If no entry is present, then move to the next
steps -
# Get uid by HTable#incrementColumnValue
# Store string equivalent as row key and uid as value
# Store uid as row key and string equivalent as value
# Release ZooKeeper lock
One more thing to take care of here is a system crash/shutdown between #7 and
#8, i.e. the string to uid relationship is stored but the uid to string
relationship could not be stored. To handle this case, we have to check for
the uid to string row entry, and create it if needed, in #2 above.
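The steps above, including the crash repair in #2, can be sketched as a small
in-memory simulation. This is only an illustration under stated assumptions,
not HBase or ZooKeeper client code: the two maps stand in for the two kinds of
rows in the uid table, the AtomicLong stands in for the counter cell that
incrementColumnValue would bump, and a per-string ReentrantLock stands in for
the ZooKeeper lock. The class and method names (UidAllocatorSketch,
getOrCreateUid, lookupString) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** In-memory simulation of the proposed uid allocation; all names are hypothetical. */
class UidAllocatorSketch {
    // Stand-ins for the two kinds of rows: string -> uid and uid -> string.
    final Map<String, Long> stringToUid = new ConcurrentHashMap<>();
    final Map<Long, String> uidToString = new ConcurrentHashMap<>();
    // Stand-in for the counter cell incremented via incrementColumnValue.
    private final AtomicLong counter = new AtomicLong();
    // Stand-in for per-string ZooKeeper locks.
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    long getOrCreateUid(String name) {
        // Step 1: check whether the string already has a uid.
        Long uid = stringToUid.get(name);
        if (uid != null) {
            // Step 2: use the existing uid. Also repair the reverse row in
            // case an earlier writer crashed between steps 7 and 8.
            uidToString.putIfAbsent(uid, name);
            return uid;
        }
        // Step 4: take the lock for this string.
        ReentrantLock lock = locks.computeIfAbsent(name, k -> new ReentrantLock());
        lock.lock();
        try {
            // Step 5: re-check, since another caller may have finished
            // steps 4 to 9 between our step 1 and step 4.
            uid = stringToUid.get(name);
            if (uid == null) {
                uid = counter.incrementAndGet(); // Step 6: allocate a new uid.
                stringToUid.put(name, uid);      // Step 7: string -> uid row.
            }
            uidToString.putIfAbsent(uid, name);  // Step 8: uid -> string row.
            return uid;
        } finally {
            lock.unlock();                       // Step 9: release the lock.
        }
    }

    // Reverse lookup; returns null if the uid is unknown.
    String lookupString(long uid) {
        return uidToString.get(uid);
    }
}
```

In this sketch, calling getOrCreateUid twice with the same string returns the
same uid, and deleting the uid to string entry by hand (to mimic a crash
between #7 and #8) is repaired by the next getOrCreateUid call for that string.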
Alternatively, we can do something along the following lines to avoid taking
the lock -
* For table names, assign and store the uid during table creation.
* For metric names, an admin can assign them while creating the table after
cluster creation. One caveat is that whenever we add new metrics (anything
other than readcount/writecount), the admin has to run a script again. This
process is error-prone.
* For region_uid and block_uid also, an admin can assign them at the beginning.
* For string values like region names and block names, only a single server is
expected to try to create the corresponding uid, so taking the lock can be
avoided.
I would prefer to go with the locking approach because it keeps all the logic
in one place and also reduces the chance of error, since it doesn't require any
manual intervention.
> Heatmap for key access patterns
> -------------------------------
>
> Key: HBASE-21301
> URL: https://issues.apache.org/jira/browse/HBASE-21301
> Project: HBase
> Issue Type: Improvement
> Reporter: Archana Katiyar
> Assignee: Archana Katiyar
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
>
> Google recently released a beta feature for Cloud Bigtable which presents a
> heat map of the keyspace. *Given how hotspotting comes up now and again here,
> this is a good idea for giving HBase ops a tool to be proactive about it.*
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer
> helps debug performance issues due to unbalanced access patterns across the
> key space, or single rows that are too large or receiving too much read or
> write activity. With Key Visualizer, you get a heat map visualization of
> access patterns over time, along with the ability to zoom into specific key
> or time ranges, or select a specific row to find the full row key ID that's
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud
> Bigtable clusters with sufficient data or activity, and does not affect Cloud
> Bigtable cluster performance.
> <<<
> From
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)