[ 
https://issues.apache.org/jira/browse/HBASE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873745#action_12873745
 ] 

stack commented on HBASE-2531:
------------------------------

I tried it.  I saw stuff like this:

{code}
2010-05-31 11:19:10,258 INFO org.apache.hadoop.hbase.master.ServerManager: 
Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: 
TestTable,0010151443,1275329539809(717050107): Daughters; 
TestTable,0010151443,1275329944907/80f60878ba8994c3458931a127d77377/, 
TestTable,0010376061,1275329944907/9058243430462cd436fc06b748c0aaca/ from 
sv2borg187,60020,1275329765174; 1 of 1
{code}

That looks really good, as does a scan of .META. (I can see parents and 
daughters; we should get rid of the new-style encoded field in HRegionInfo, 
but we can do that later).
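
For anyone reading along: the 32-character hex strings after each daughter name in the log above are the right length for a 128-bit digest. Here is a minimal sketch of producing such a name, assuming an MD5 digest over the region name bytes. The class and method names are made up for illustration, and exactly which bytes the patch hashes is not shown in this thread, so treat this as a guess and not as the patch's code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the code from the patch.
final class EncodedNameSketch {
  /**
   * @param regionName bytes to digest (assumption: the region name itself)
   * @return 32 lowercase hex characters (128-bit MD5 digest)
   */
  static String encode(final byte[] regionName) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] digest = md.digest(regionName);
      StringBuilder sb = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Two hex digits per byte, zero-padded.
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required of every Java platform, so this should not happen.
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  public static void main(final String[] args) {
    String name = "TestTable,0010151443,1275329944907";
    System.out.println(encode(name.getBytes(StandardCharsets.UTF_8)));
  }
}
```

With 128 bits instead of 32, an accidental clash between two region names becomes vanishingly unlikely.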

Since there are so few things to change (and I can change the separator 
myself), do you want me to fix the above white-spacing, etc., issues on 
commit, Kannan? IOW, you wouldn't have to make a new patch.

Looks great Kannan.

> 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-2531
>                 URL: https://issues.apache.org/jira/browse/HBASE-2531
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Kannan Muthukkaruppan
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2531_v2.patch
>
>
> Kannan tripped over two regionnames that hashed the same.
> Here is code demo'ing that his two names hash the same:
> {code}
> package org;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.hbase.util.JenkinsHash;
> public class Testing {
>   public static void main(final String [] args) {
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,6838000000,1273541236167")));
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,0520100000,1273541610201")));
>   }
>   /**
>    * @param regionName
>    * @return the encodedName
>    */
>   public static int encodeRegionName(final byte [] regionName) {
>     return Math.abs(JenkinsHash.getInstance().hash(regionName,
>       regionName.length, 0));
>   }
> }
> {code}
> Need new encoding mechanism.  Will need to migrate old regions to new schema.
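
As a rough illustration of why a 32-bit encoding clashes so easily, the birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be evaluated for n region names drawn from N possible encoded values. This is only a sketch: it assumes the hash is uniformly distributed, it takes N = 2^31 because Math.abs folds the sign bit away, and the class and method names are made up for illustration.

```java
// Hypothetical sketch: birthday-bound estimate of the chance that at least
// two of n names share an encoded value, assuming a uniform hash.
final class ClashOdds {
  /**
   * @param n number of distinct region names
   * @param space number of possible encoded values (N)
   * @return approximate probability of at least one clash
   */
  static double collisionProbability(final long n, final double space) {
    double x = (double) n * (n - 1) / (2.0 * space);
    // 1 - exp(-x), computed with expm1 for accuracy when x is small.
    return -Math.expm1(-x);
  }

  public static void main(final String[] args) {
    double space = Math.pow(2, 31); // non-negative ints only, because of Math.abs
    for (long n : new long[] {1_000L, 10_000L, 50_000L, 100_000L}) {
      System.out.printf("%d names: p ~ %.4f%n", n, collisionProbability(n, space));
    }
  }
}
```

Under these assumptions the odds of at least one clash approach even at roughly fifty-five thousand names, which is not a lot of regions.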

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
