[ https://issues.apache.org/jira/browse/HBASE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873208#action_12873208 ]
HBase Review Board commented on HBASE-2531:
-------------------------------------------
Message from: "Kannan Muthukkaruppan" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/104/
-----------------------------------------------------------
(Updated 2010-05-28 18:03:20.789555)
Review request for hbase.
Summary
-------
The new format for a region name contains its encodedName. The encoded name
also serves as the directory name for the region in the filesystem.
New region name format:
<tablename>,<startkey>,<regionIdTimestamp>/<encodedName>/
where <encodedName> is the hexadecimal form of the MD5 hash of
<tablename>,<startkey>,<regionIdTimestamp>
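As a rough illustration of the new format, here is a minimal sketch that builds a region name from its three parts and appends the hex MD5 encoded name. It is not the HRegionInfo code from the patch: the class and method names (RegionNameSketch, md5Hex, newFormatName) are hypothetical, and the lowercase hex output and the UTF-8 conversion of the name are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not the code under review.
class RegionNameSketch {

    /** Returns the MD5 digest of the given name as 32 lowercase hex characters. */
    static String md5Hex(String baseName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // Assumption: the name is converted to bytes as UTF-8.
            byte[] digest = md5.digest(baseName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Builds <tablename>,<startkey>,<regionIdTimestamp>/<encodedName>/ */
    static String newFormatName(String table, String startKey, long regionId) {
        String base = table + "," + startKey + "," + regionId;
        return base + "/" + md5Hex(base) + "/";
    }

    public static void main(String[] args) {
        System.out.println(newFormatName("test1", "6838000000", 1273541236167L));
        System.out.println(newFormatName("test1", "0520100000", 1273541610201L));
    }
}
```

The two sample names are the ones from the issue description; under the old encoding they clash, whereas here each gets its own 128-bit digest.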
The old region name format remains:
<tablename>,<startkey>,<regionIdTimestamp>
For region names in the old format, the encoded name is the 32-bit JenkinsHash
integer value, written as a decimal string.
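To give a feel for why a 32-bit encoded name is prone to the clashes reported in HBASE-2531, the sketch below applies the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n regions and N possible encoded values. It assumes the hash is uniformly distributed, which is an idealization. The class and method names are hypothetical. The 2^31 figure assumes non-negative int values, as produced by the Math.abs call in the code quoted in the issue.

```java
// Hypothetical estimate for illustration; assumes a uniformly distributed hash.
class CollisionEstimate {

    /** Birthday approximation: chance that at least two of n names share a value. */
    static double collisionProbability(long n, double space) {
        double exponent = -((double) n * (n - 1)) / (2.0 * space);
        // -expm1(x) equals 1 - exp(x) and stays accurate when x is very close to 0.
        return -Math.expm1(exponent);
    }

    public static void main(String[] args) {
        double oldSpace = Math.pow(2, 31);   // non-negative 32-bit int values
        double newSpace = Math.pow(2, 128);  // MD5 digest
        for (long n : new long[] {1_000L, 10_000L, 100_000L}) {
            System.out.printf("n=%d old=%.3g new=%.3g%n",
                n, collisionProbability(n, oldSpace), collisionProbability(n, newSpace));
        }
    }
}
```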
**NOTE**
ROOT, the first META region, and regions created by an older version of HBase
(0.20 or prior) will continue to use the old region name format.
In the logs and web UI, old-format region names will show up as:
<tablename>,<startkey>,<regionIdTimestamp>(<jenkinshashEncodedName>)
New format region names will show up as:
<tablename>,<startkey>,<regionIdTimestamp>/<md5hashEncodedName>/
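The display rule above could be sketched roughly as follows. This is only an illustration, not the code in the patch: the class and method names are hypothetical, and detecting a new-format name by a trailing "/<32 lowercase hex chars>/" is an assumption. The Jenkins hash value is passed in rather than computed, to keep the example free of HBase dependencies.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the log / web UI display rule.
class RegionNameFormat {

    // Assumption: a new-format name ends in "/<32 lowercase hex chars>/".
    private static final Pattern NEW_FORMAT =
        Pattern.compile("^.*/[0-9a-f]{32}/$", Pattern.DOTALL);

    static boolean isNewFormat(String regionName) {
        return NEW_FORMAT.matcher(regionName).matches();
    }

    /**
     * New-format names already carry their encoded name and are shown as-is.
     * Old-format names get the Jenkins hash encoded name appended in parentheses.
     */
    static String displayName(String regionName, int jenkinsEncodedName) {
        if (isNewFormat(regionName)) {
            return regionName;
        }
        return regionName + "(" + jenkinsEncodedName + ")";
    }

    public static void main(String[] args) {
        // 12345 is a placeholder, not a real Jenkins hash of this name.
        System.out.println(displayName("test1,6838000000,1273541236167", 12345));
        System.out.println(displayName(
            "test1,6838000000,1273541236167/d41d8cd98f00b204e9800998ecf8427e/", 0));
    }
}
```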
This addresses bug HBASE-2531.
Diffs
-----
trunk/bin/add_table.rb 949322
trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 949322
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 949322
trunk/src/main/resources/hbase-webapps/master/table.jsp 949322
trunk/src/main/resources/hbase-webapps/regionserver/regionserver.jsp 949322
trunk/src/test/java/org/apache/hadoop/hbase/TestEmptyMetaInfo.java 949322
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 949322
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 949322
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 949322
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 949322
Diff: http://review.hbase.org/r/104/diff
Testing (updated)
-------
Unit tests pass. I ran some cluster tests and things seemed to work OK. I have
yet to try a migration test (upgrading from an older server).
Thanks,
Kannan
> 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
> ----------------------------------------------------------------------------
>
> Key: HBASE-2531
> URL: https://issues.apache.org/jira/browse/HBASE-2531
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Kannan Muthukkaruppan
> Priority: Blocker
> Fix For: 0.21.0
>
>
> Kannan tripped over two region names that hash to the same value.
> Here is code demonstrating that his two names hash the same:
> {code}
> package org;
>
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.hbase.util.JenkinsHash;
>
> public class Testing {
>   public static void main(final String [] args) {
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,6838000000,1273541236167")));
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,0520100000,1273541610201")));
>   }
>
>   /**
>    * @param regionName
>    * @return the encodedName
>    */
>   public static int encodeRegionName(final byte [] regionName) {
>     return Math.abs(JenkinsHash.getInstance().hash(regionName,
>       regionName.length, 0));
>   }
> }
> {code}
> Need new encoding mechanism. Will need to migrate old regions to new schema.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.