Okay, I will take a look at the ColumnPaginationFilter. I tried to reproduce the error: I created a new table and added one new row with 250,000 columns, but everything works fine if I execute a get on that table. The only difference to my original program is that I added the data directly through the HBase Java API and not with the MapReduce bulk load. Maybe that is the reason?
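For the pagination idea, this is roughly what I plan to try (a minimal sketch against the 0.94 client API; the table name, row key, and page size are just the ones from my shell session below and are nothing special):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class PaginatedGet {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "bulkLoadTable");
    try {
      int pageSize = 10000; // arbitrary page size, would need tuning
      int offset = 0;
      while (true) {
        Get get = new Get(Bytes.toBytes("oneSpecificRowKey"));
        // return at most pageSize columns, starting at the given column offset
        get.setFilter(new ColumnPaginationFilter(pageSize, offset));
        Result result = table.get(get);
        if (result.isEmpty()) {
          break; // no more columns in this row
        }
        // process result.raw() here
        offset += pageSize;
      }
    } finally {
      table.close();
    }
  }
}

That way the wide row is fetched in fixed-size pages of columns, so no single RPC response has to carry all 250,000 cells at once. (I also sketched JM's batch-size suggestion at the bottom of this mail.)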
I wonder a little bit about the HDFS structure if I compare both methods (HBase API vs. bulk load). If I add the data through the HBase API, there is no file in /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf/*, but if I use the bulk load method there is one file for every bulk load I executed:

root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
Found 2 items
-rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
-rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8

If I execute a get operation in the HBase shell on the "mytestTable" table, I get the result:

hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
...  <-- all results
250000 row(s) in 38.4440 seconds

But if I try to get the row from my "bulkLoadTable", I get this (and the region server crashes):

hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
COLUMN                                             CELL
ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
Wed Sep 11 14:21:05 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.io.IOException: Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020 failed on local exception: java.io.EOFException
Wed Sep 11 14:21:06 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
Wed Sep 11 14:21:07 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
Wed Sep 11 14:21:08 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
Wed Sep 11 14:21:10 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
Wed Sep 11 14:21:12 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
Wed Sep 11 14:21:16 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused


2013/9/11 Ted Yu <[email protected]>

> Take a look at
> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
>
> Cheers
>
> On Sep 11, 2013, at 4:42 AM, John <[email protected]> wrote:
>
> > Hi,
> >
> > thanks for your fast answer! With "the size becomes too big" I mean I
> > have one row with thousands of columns. For example:
> >
> > myrowkey1 -> column1, column2, column3 ... columnN
> >
> > What do you mean by "change the batch size"? I'll try to create a little
> > Java test program to reproduce the problem. It will take a moment.
> >
> >
> > 2013/9/11 Jean-Marc Spaggiari <[email protected]>
> >
> >> Hi John,
> >>
> >> Just to be sure: what is "the size becomes too big"? The size of a single
> >> column within this row? Or the number of columns?
> >>
> >> If it's the number of columns, you can change the batch size to get fewer
> >> columns in a single call. Can you share the relevant piece of code doing
> >> the call?
> >>
> >> JM
> >>
> >>
> >> 2013/9/11 John <[email protected]>
> >>
> >>> Hi,
> >>>
> >>> I store a lot of columns for one row key, and if the size becomes too
> >>> big the relevant region server crashes when I try to get or scan the
> >>> row. For example, if I try to get the relevant row I get this error:
> >>>
> >>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer:
> >>> (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
> >>>
> >>> If I try to load the relevant row via Apache Pig and the HBaseStorage
> >>> loader (which uses the scan operation), I get this message and after
> >>> that the region server crashes:
> >>>
> >>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer:
> >>> (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
> >>>
> >>> I'm using Cloudera 4.4.0 with HBase 0.94.6-cdh4.4.0.
> >>>
> >>> Any clues?
> >>>
> >>> regards
> >>
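PS: Regarding JM's batch-size suggestion in the thread above, this is how I understand it (a rough sketch against the 0.94 API; the table name, row key handling, and batch value are placeholder assumptions, not code from my actual program):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "bulkLoadTable");
    try {
      byte[] row = Bytes.toBytes("oneSpecificRowKey");
      Scan scan = new Scan();
      scan.setStartRow(row);
      // stop row is exclusive; appending a zero byte limits the scan to this one row
      scan.setStopRow(Bytes.add(row, new byte[] { 0 }));
      scan.setBatch(1000); // at most 1000 columns per Result; wide rows come back in pieces
      scan.setCaching(1);  // one partial Result per RPC, to keep each response small
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result partial : scanner) {
          // each partial Result holds at most 1000 KeyValues of the row
          // process partial.raw() here
        }
      } finally {
        scanner.close();
      }
    } finally {
      table.close();
    }
  }
}

With setBatch(), a row wider than the batch size is returned as several partial Results, so the region server never has to materialize the whole row in a single response.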
