[ 
https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202042#comment-16202042
 ] 

Esteban Gutierrez commented on HBASE-18987:
-------------------------------------------

bq. Not in testOversizedRegionNameForPut 
[~mdrob]: thanks!

bq. The value length can be up to Integer.MAX_VALUE - 1 since we use 4 bytes to 
store it. But the row length is stored in 2 bytes, right? So is allowing 
Integer.MAX_VALUE - 1 for the RK length also correct?

[~anoopsamjohn]: yeah, you are right. The problem seems to run deeper: the 
KeyValue constructor accepts an int for rlength, but there are a few more 
places where we only use a short: {{createEmptyByteArray}} tests whether rlength 
is greater than {{Short.MAX_VALUE}}, and {{rowLen}} in {{KeyOnlyKeyValue}} is a 
short. Also, {{KEYVALUE_INFRASTRUCTURE_SIZE}} depends on {{ROW_LENGTH_SIZE}}, 
which is {{Bytes.SIZEOF_SHORT}}. My test didn't catch that, since you need to 
go all the way to serializing the KV.
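
To illustrate why a 2-byte row-length field cannot carry a length above 
{{Short.MAX_VALUE}}, here is a minimal standalone sketch. It is not HBase code: 
the class {{RowLengthSketch}} and the method {{roundTripRowLength}} are 
hypothetical names, and the only assumption taken from the discussion above is 
that the row length is serialized as a signed 2-byte short.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only, not HBase code. It mimics a serialized key whose
// row-length prefix is a signed 2-byte short, as described in the comment.
class RowLengthSketch {
    // Assumption: the row length occupies 2 bytes on the wire.
    static final int ROW_LENGTH_SIZE = Short.BYTES;

    /** Writes rowLength into a 2-byte field and reads the stored value back. */
    static int roundTripRowLength(int rowLength) {
        ByteBuffer buf = ByteBuffer.allocate(ROW_LENGTH_SIZE);
        buf.putShort((short) rowLength); // narrowing cast keeps only the low 16 bits
        buf.flip();
        return buf.getShort();           // widened back to int on return
    }

    public static void main(String[] args) {
        int fits = Short.MAX_VALUE;        // largest length the field can hold
        int tooLong = Short.MAX_VALUE + 1; // one byte longer than the field allows
        System.out.println(fits + " -> " + roundTripRowLength(fits));
        System.out.println(tooLong + " -> " + roundTripRowLength(tooLong));
    }
}
```

In this sketch a length of {{Short.MAX_VALUE}} survives the round trip, while 
anything larger comes back as a different (negative) number, which is why a 
check that only looks at the int argument would not notice the problem until 
the value is actually written out.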

I think I'm -1 now for this and the truncate approach might be the only 
alternative for now.


> Raise value of HConstants#MAX_ROW_LENGTH
> ----------------------------------------
>
>                 Key: HBASE-18987
>                 URL: https://issues.apache.org/jira/browse/HBASE-18987
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Esteban Gutierrez
>            Assignee: Esteban Gutierrez
>            Priority: Minor
>         Attachments: HBASE-18987.master.001.patch, 
> HBASE-18987.master.002.patch
>
>
> Short.MAX_VALUE hasn't been a problem for a long time, but one of our 
> customers ran into an edge case where the midKey used for the split point was 
> very close to Short.MAX_VALUE. When the split is submitted, we attempt to 
> create the two new daughter regions and name them via 
> {{HRegionInfo.createRegionName()}} so they can be added to META. 
> Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the 
> startKey, the {{Put}} fails because the row key length no longer passes 
> checkRow, which causes the split to fail.
> I tried a couple of alternatives to address this problem, e.g. truncating the 
> startKey, but the number of code changes isn't justified for this edge 
> condition. Since we already use {{Integer.MAX_VALUE - 1}} for 
> {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be ok to use the same limit for 
> the maximum row key. 
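
The failure mode in the description can be sketched as follows. This is a 
simplified illustration, not the HBase implementation: the layout 
{{<table>,<startKey>,<regionId>}}, the class {{RegionNameLengthSketch}}, and 
the helpers {{regionName}} and {{passesRowCheck}} are assumptions made for the 
example. The point is only that a start key which is itself a legal row can 
yield a region name that is too long to be used as a row key.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only, not HBase code. All names and the region-name
// layout below are simplifying assumptions.
class RegionNameLengthSketch {
    // Assumption: the row-length limit equals Short.MAX_VALUE, as in the issue.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    /** Builds a name of the assumed form <table>,<startKey>,<regionId>. */
    static byte[] regionName(byte[] table, byte[] startKey, long regionId) {
        byte[] id = Long.toString(regionId).getBytes(StandardCharsets.UTF_8);
        byte[] name = new byte[table.length + 1 + startKey.length + 1 + id.length];
        int off = 0;
        System.arraycopy(table, 0, name, off, table.length);
        off += table.length;
        name[off++] = (byte) ',';
        System.arraycopy(startKey, 0, name, off, startKey.length);
        off += startKey.length;
        name[off++] = (byte) ',';
        System.arraycopy(id, 0, name, off, id.length);
        return name;
    }

    /** Stand-in for a row-length check: true if the row fits the limit. */
    static boolean passesRowCheck(byte[] row) {
        return row.length <= MAX_ROW_LENGTH;
    }

    public static void main(String[] args) {
        byte[] table = "t1".getBytes(StandardCharsets.UTF_8);
        byte[] midKey = new byte[Short.MAX_VALUE - 10]; // legal, but near the limit
        byte[] name = regionName(table, midKey, 1507800000000L);
        System.out.println("midKey passes check: " + passesRowCheck(midKey));
        System.out.println("region name passes check: " + passesRowCheck(name));
    }
}
```

Here the table name, separators, and region id add more bytes than the 
headroom left by the mid key, so the row built for the META update fails the 
length check even though the mid key itself passed it.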



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
