Well, there's a column width, and then there's the row's width. Unless I'm mistaken... rows can't span regions. Right?
Note: the OP didn't say he got the error trying to add a column, but had issues retrieving a column... I personally never tried to break HBase on such an edge use case... (I like to avoid the problem in the first place....) Has anyone tested this specific limit?

On Sep 11, 2013, at 10:02 AM, "Kevin O'dell" <[email protected]> wrote:

> I have not seen the exact error, but if I recall correctly, jobs will fail if the column is larger than 10 MB and we have not raised the default setting (which I don't have in front of me)?
>
> On Wed, Sep 11, 2013 at 10:53 AM, Michael Segel <[email protected]> wrote:
>
>> Just out of curiosity...
>>
>> How wide are the columns?
>>
>> What's the region size?
>>
>> Does anyone know the error message you'll get if your row is wider than a region?
>>
>> On Sep 11, 2013, at 9:47 AM, John <[email protected]> wrote:
>>
>>> Sorry, I mean 570000 columns, not rows.
>>>
>>> 2013/9/11 John <[email protected]>
>>>
>>>> Thanks for all the answers! The only entry I got in the "hbase-cmf-hbase1-REGIONSERVER-mydomain.org.log.out" log file after executing the get command in the hbase shell is this:
>>>>
>>>> 2013-09-11 16:38:56,175 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooLarge): {"processingtimems":3196,"client":"192.168.0.1:50629","timeRange":[0,9223372036854775807],"starttimems":1378910332920,"responsesize":108211303,"class":"HRegionServer","table":"P_SO","cacheBlocks":true,"families":{"myCf":["ALL"]},"row":"myRow","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>>>>
>>>> After this the RegionServer is down, nothing more. BTW, I found out that the row should have ~570000 rows. The size should be around ~70 MB.
>>>>
>>>> Thanks
>>>>
>>>> 2013/9/11 Bing Jiang <[email protected]>
>>>>
>>>>> Hi John. I think it is a fresh question. Could you post the log from the regionserver that crashed?
>>>>>
>>>>> On Sep 11, 2013 8:38 PM, "John" <[email protected]> wrote:
>>>>>
>>>>>> Okay, I will take a look at the ColumnPaginationFilter.
>>>>>>
>>>>>> I tried to reproduce the error. I created a new table and added one new row with 250,000 columns, but everything works fine if I execute a get against that table. The only difference to my original program is that I added the data directly through the HBase Java API and not with the MapReduce bulk load. Maybe that can be the reason?
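A minimal sketch of that kind of plain client-side insert against the 0.94 Java API (not the poster's actual code; the table, family, and qualifier names are made up, and splitting the row across several Puts is only to keep each RPC small):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: write one row with a very large number of columns through the client API.
public class WideRowInsert {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytestTable");          // placeholder table name
    try {
      byte[] row = Bytes.toBytes("sampleRowKey");            // placeholder row key
      byte[] family = Bytes.toBytes("mycf");                 // placeholder column family
      int totalColumns = 250000;                             // matches the reproduction attempt above
      int columnsPerPut = 10000;                             // chunk the row so no single RPC gets huge
      for (int start = 0; start < totalColumns; start += columnsPerPut) {
        Put put = new Put(row);
        int end = Math.min(start + columnsPerPut, totalColumns);
        for (int i = start; i < end; i++) {
          put.add(family, Bytes.toBytes("col" + i), Bytes.toBytes("value" + i));
        }
        table.put(put);                                      // one RPC per chunk of columns
      }
    } finally {
      table.close();
    }
  }
}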
>>>>>> I wonder a little bit about the HDFS structure if I compare both methods (HBase API / bulk load). If I add the data through the HBase API there is no file in /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*, but if I use the bulk load method there is a file for every time I executed a new bulk load:
>>>>>>
>>>>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
>>>>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
>>>>>> Found 2 items
>>>>>> -rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
>>>>>> -rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
>>>>>>
>>>>>> If I execute a get operation in the hbase shell against the "mytestTable" table I get the result:
>>>>>>
>>>>>> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
>>>>>> ... <-- all results
>>>>>> 250000 row(s) in 38.4440 seconds
>>>>>>
>>>>>> but if I try to get the results for my "bulkLoadTable" I get this (+ the region server crash):
>>>>>>
>>>>>> hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
>>>>>> COLUMN                              CELL
>>>>>>
>>>>>> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
>>>>>> Wed Sep 11 14:21:05 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.io.IOException: Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020 failed on local exception: java.io.EOFException
>>>>>> Wed Sep 11 14:21:06 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>>>>>> Wed Sep 11 14:21:07 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
>>>>>> Wed Sep 11 14:21:08 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>>>>>> Wed Sep 11 14:21:10 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>>>>>> Wed Sep 11 14:21:12 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>>>>>> Wed Sep 11 14:21:16 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>>>>>>
>>>>>> 2013/9/11 Ted Yu <[email protected]>
>>>>>>
>>>>>>> Take a look at
>>>>>>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Sep 11, 2013, at 4:42 AM, John <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> thanks for your fast answer! With "the size becoming too big" I mean I have one row with thousands of columns. For example:
>>>>>>>>
>>>>>>>> myrowkey1 -> column1, column2, column3 ... columnN
>>>>>>>>
>>>>>>>> What do you mean by "change the batch size"? I will try to create a little Java test program to reproduce the problem. It will take a moment.
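A paged read using the ColumnPaginationFilter Ted linked above might look roughly like this with the 0.94 client API (the table, family, and row key below are the ones from this thread; the page size is an arbitrary guess and would need tuning):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: fetch a very wide row a slice at a time instead of in one huge get.
public class PagedWideRowGet {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "bulkLoadTable");
    int pageSize = 10000;                                    // columns per request
    int offset = 0;
    try {
      while (true) {
        Get get = new Get(Bytes.toBytes("oneSpecificRowKey"));
        get.addFamily(Bytes.toBytes("mycf2"));
        // Return at most pageSize columns, starting at the current column offset.
        get.setFilter(new ColumnPaginationFilter(pageSize, offset));
        Result result = table.get(get);
        if (result.isEmpty()) {
          break;                                             // no more columns in this row
        }
        // process result.raw() here ...
        offset += result.size();
      }
    } finally {
      table.close();
    }
  }
}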
>>>>>>>> 2013/9/11 Jean-Marc Spaggiari <[email protected]>
>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> Just to be sure: what does "the size becomes too big" mean? The size of a single column within this row, or the number of columns?
>>>>>>>>>
>>>>>>>>> If it's the number of columns, you can change the batch size to get fewer columns in a single call. Can you share the relevant piece of code doing the call?
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2013/9/11 John <[email protected]>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I store a lot of columns for one row key, and if the size becomes too big the relevant RegionServer crashes when I try to get or scan the row. For example, if I try to get the relevant row I get this error:
>>>>>>>>>>
>>>>>>>>>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
>>>>>>>>>>
>>>>>>>>>> If I try to load the relevant row via Apache Pig and the HBaseStorage loader (which uses the scan operation), I get this message and after that the RegionServer crashes:
>>>>>>>>>>
>>>>>>>>>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
>>>>>>>>>>
>>>>>>>>>> I'm using Cloudera 4.4.0 with HBase 0.94.6-cdh4.4.0.
>>>>>>>>>>
>>>>>>>>>> Any clues?
>>>>>>>>>>
>>>>>>>>>> regards
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
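To make the batch-size idea Jean-Marc mentions above concrete: with the 0.94 client, a scan restricted to the single wide row and batched so each Result carries a bounded number of columns might look roughly like this (the table, family, row key, and batch/caching values are placeholders, not settings confirmed in this thread):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: read one wide row as a sequence of partial Results instead of a single huge response.
public class BatchedWideRowScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "bulkLoadTable");              // placeholder table name
    byte[] row = Bytes.toBytes("oneSpecificRowKey");               // placeholder row key
    // Stop row is the immediate successor of the start row, so only this row is scanned.
    Scan scan = new Scan(row, Bytes.add(row, new byte[] { 0 }));
    scan.addFamily(Bytes.toBytes("mycf2"));                        // placeholder column family
    scan.setBatch(10000);                                          // at most 10000 columns per Result
    scan.setCaching(1);                                            // one batched Result per RPC
    ResultScanner scanner = table.getScanner(scan);
    try {
      int columns = 0;
      for (Result r : scanner) {                                   // the wide row arrives as several partial Results
        columns += r.size();
      }
      System.out.println("columns read: " + columns);
    } finally {
      scanner.close();
      table.close();
    }
  }
}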
