[jira] Commented: (HBASE-82) row keys should be array of bytes with a specified comparator

Jim Kellerman (JIRA) Thu, 15 May 2008 13:13:25 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597245#action_12597245
 ]


Jim Kellerman commented on HBASE-82:
------------------------------------

Review comments:

HColumnDescriptor

The version number needs to go up to 4, since 3 was already used for column TTL.

HRegionInfo, HStoreKey

How can you be certain that DELIMITER does not occur as some random byte in one 
of the names? Since we don't guard against it now, it is probably fine, but we 
might want to do something about it in the future, because it could cause 
invalid start/end keys. One option to consider is a new format in which each 
part of the region name is preceded by its length.
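
A minimal sketch of the length-prefixed idea, to make the suggestion concrete. 
The class name, the 4-byte big-endian length, and the three-part layout are all 
assumptions for illustration, not a proposed HBase format:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase.
public class LengthPrefixedName {

    // Writes each part as a 4-byte big-endian length followed by the raw bytes,
    // so no byte value inside a part can be mistaken for a separator.
    public static byte[] encode(byte[]... parts) {
        int total = 0;
        for (byte[] part : parts) {
            total += Integer.BYTES + part.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (byte[] part : parts) {
            buf.putInt(part.length);
            buf.put(part);
        }
        return buf.array();
    }

    // Reads the parts back by following the length prefixes.
    public static List<byte[]> decode(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        List<byte[]> parts = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte[] part = new byte[buf.getInt()];
            buf.get(part);
            parts.add(part);
        }
        return parts;
    }
}
```

One trade-off to keep in mind: with a length prefix in front, encoded names no 
longer sort byte-wise in the same order as their first part.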

HTableDescriptor.isLegalTableName: the documentation does not agree with the 
implementation. The documentation says a period is accepted, but the code only 
allows letters, digits and underscore. It used to accept minus as well.
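
For reference, the behaviour described above (letters, digits and underscore 
only) corresponds to a check along these lines. This is a hypothetical 
stand-in written for this comment, not the code in the patch:

```java
// Hypothetical stand-in, not HTableDescriptor itself.
public class TableNameCheck {

    // Accepts only ASCII letters, digits and underscore; rejects empty names.
    public static boolean isLegal(byte[] name) {
        if (name == null || name.length == 0) {
            return false;
        }
        for (byte b : name) {
            boolean ok = (b >= 'a' && b <= 'z')
                || (b >= 'A' && b <= 'Z')
                || (b >= '0' && b <= '9')
                || b == '_';
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

With a check like this, "my_table1" passes while "my.table" and "my-table" are 
both rejected, which is where the documentation and the code part ways.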

HConnectionManager, RegExpRowFilter, HbaseMapWritable: why change from HashMap 
to SortedMap? Is compareTo cheaper than computing hash? Not if map key is the 
computed hash of the byte array.

Why use Bytes.getMapKey instead of Arrays.hashCode(byte[])? Couldn't a lot of 
the Maps be converted to Sets, so that you could use HashSet?
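
To spell out the trade-off behind these questions, here is a small 
self-contained sketch using only java.util. The class and method names are 
made up for illustration, and the Integer-keyed variant assumes, as above, that 
the map key is a hash computed from the byte array:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustration only; none of these names come from the patch.
public class ByteArrayKeys {

    // Unsigned lexicographic comparison of two byte arrays.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // byte[] inherits identity-based hashCode/equals from Object, so a lookup
    // with a different array holding the same content misses.
    public static boolean hashMapFindsEqualContent() {
        Map<byte[], String> map = new HashMap<>();
        map.put(new byte[] {1, 2, 3}, "v");
        return map.containsKey(new byte[] {1, 2, 3});
    }

    // A sorted map with an explicit comparator compares content instead.
    public static boolean treeMapFindsEqualContent() {
        Map<byte[], String> map = new TreeMap<>(ByteArrayKeys::compare);
        map.put(new byte[] {1, 2, 3}, "v");
        return map.containsKey(new byte[] {1, 2, 3});
    }

    // Keying a HashMap by the content hash also finds the entry, but two
    // different arrays with the same hash would share one key.
    public static boolean hashKeyedMapFindsEqualContent() {
        Map<Integer, String> map = new HashMap<>();
        map.put(Arrays.hashCode(new byte[] {1, 2, 3}), "v");
        return map.containsKey(Arrays.hashCode(new byte[] {1, 2, 3}));
    }
}
```

The first method returns false and the other two return true. The sorted map 
costs a compare per tree level on lookup, while the hash-keyed map costs one 
pass over the array per lookup plus the risk of collisions.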

Why change HMasterRegionInterface.regionServerStartup from returning 
HbaseMapWritable to MapWritable?

Migrate.java: no migration is required ???

Bytes.SIZEOF_LONG: why not use Long.SIZE / Byte.SIZE?
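
That is, something like the following, where the class name is just a 
placeholder for this example:

```java
// Placeholder class for illustration; only the constant expression matters.
public class SizeOfLong {

    // Long.SIZE and Byte.SIZE are bit counts (64 and 8), so this is the number
    // of bytes in a long without hard-coding it.
    public static final int SIZEOF_LONG = Long.SIZE / Byte.SIZE;

    public static void main(String[] args) {
        System.out.println(SIZEOF_LONG); // prints 8
    }
}
```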


> row keys should be array of bytes with a specified comparator
> -------------------------------------------------------------
>
>                 Key: HBASE-82
>                 URL: https://issues.apache.org/jira/browse/HBASE-82
>             Project: Hadoop HBase
>          Issue Type: Wish
>            Reporter: Jim Kellerman
>            Assignee: stack
>             Fix For: 0.2.0
>
>         Attachments: 82-v12-ignore-ws.patch, 82-v13-ignore-ws.patch, 
> 82-v2.patch, 82-v3.patch, 82-v4.patch, 82-v5.patch, 82-v7.patch, 82-v8.patch, 
> 82-v9-ignore-ws.patch, 82.patch, Perf.java
>
>
> I have heard from several people that row keys in HBase should be less 
> restricted than hadoop.io.Text.
> What do you think?
> At the very least, a row key has to be a WritableComparable. This would lead 
> to the most general case being either hadoop.io.BytesWritable or 
> hbase.io.ImmutableBytesWritable. The primary difference between these two 
> classes is that hadoop.io.BytesWritable by default allocates 100 bytes and if 
> you do not pay attention to the length, (BytesWritable.getSize()), converting 
> a String to a BytesWritable and vice versa can become problematic. 
> hbase.io.ImmutableBytesWritable, in contrast only allocates as many bytes as 
> you pass in and then does not allow the size to be changed.
> If we were to change from Text to a non-text key, my preference would be for 
> ImmutableBytesWritable, because it has a fixed size once set, and operations 
> like get, etc. do not have to do something like System.arraycopy, where you 
> specify the number of bytes to copy.
> Your comments, questions are welcome on this issue. If we receive enough 
> feedback that Text is too restrictive, we are willing to change it, but we 
> need to hear what would be the most useful thing to change it to as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
