Can you pastebin a server log snippet from the crashed region server? Thanks
On Sep 11, 2013, at 5:38 AM, John <[email protected]> wrote:

> Okay, I will take a look at the ColumnPaginationFilter.
>
> I tried to reproduce the error. I created a new table and added one new row
> with 250,000 columns, but everything works fine if I execute a get against
> that table. The only difference to my original program is that I added the
> data directly through the HBase Java API and not with the MapReduce bulk
> load. Maybe that is the reason?
>
> I am also a little puzzled by the HDFS structure when I compare the two
> methods (HBase API vs. bulk load). If I add the data through the HBase API,
> there is no file in /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*,
> but if I use the bulk load method there is a file for every bulk load I
> executed:
>
> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
> Found 2 items
> -rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
> -rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
>
> If I execute a get operation in the HBase shell against the "mytestTable"
> table, I get the result:
>
> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
> ... <-- all results
> 250000 row(s) in 38.4440 seconds
>
> But if I try to get the results for my "bulkLoadTable", I get this (plus the
> region server crash):
>
> hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
> COLUMN                                              CELL
>
> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
> Wed Sep 11 14:21:05 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.io.IOException: Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020 failed on local exception: java.io.EOFException
> Wed Sep 11 14:21:06 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:07 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
> Wed Sep 11 14:21:08 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:10 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:12 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:16 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
>
>
> 2013/9/11 Ted Yu <[email protected]>
>
>> Take a look at
>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
>>
>> Cheers
>>
>> On Sep 11, 2013, at 4:42 AM, John <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> thanks for your fast answer! By "size becoming too big" I mean that I have
>>> one row with thousands of columns. For example:
>>>
>>> myrowkey1 -> column1, column2, column3 ... columnN
>>>
>>> What do you mean by "change the batch size"? I will try to create a little
>>> Java test program to reproduce the problem. It will take a moment.
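For reference, a minimal, untested sketch of the paging Ted is pointing at: fetching a wide row in fixed-size column pages with ColumnPaginationFilter instead of one huge get. It assumes the 0.94 Java client; the class name and the page size of 1000 are arbitrary choices, while the table, column family and row key are taken from the shell session quoted above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class PagedGetExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "bulkLoadTable");
        try {
            int pageSize = 1000;   // columns per RPC; tune to your cell sizes
            int offset = 0;
            while (true) {
                Get get = new Get(Bytes.toBytes("oneSpecificRowKey"));
                get.addFamily(Bytes.toBytes("mycf2"));
                // Return at most pageSize columns, starting at the given offset.
                get.setFilter(new ColumnPaginationFilter(pageSize, offset));
                Result result = table.get(get);
                if (result.isEmpty()) {
                    break;                       // no more columns in this row
                }
                for (KeyValue kv : result.raw()) {
                    // process kv.getQualifier() / kv.getValue() here
                }
                offset += pageSize;
            }
        } finally {
            table.close();
        }
    }
}

Whether this avoids the crash for the bulk-loaded row is exactly what the region server log should tell us, hence the pastebin request above.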
>>>
>>> 2013/9/11 Jean-Marc Spaggiari <[email protected]>
>>>
>>>> Hi John,
>>>>
>>>> Just to be sure: what does "the size becomes too big" mean? The size of a
>>>> single column within this row, or the number of columns?
>>>>
>>>> If it is the number of columns, you can change the batch size to get fewer
>>>> columns in a single call. Can you share the relevant piece of code doing
>>>> the call?
>>>>
>>>> JM
>>>>
>>>>
>>>> 2013/9/11 John <[email protected]>
>>>>
>>>>> Hi,
>>>>>
>>>>> I store a lot of columns for one row key, and if the size becomes too big
>>>>> the relevant region server crashes when I try to get or scan the row. For
>>>>> example, if I try to get the relevant row I get this error:
>>>>>
>>>>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer:
>>>>> (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
>>>>>
>>>>> If I try to load the relevant row via Apache Pig and the HBaseStorage
>>>>> loader (which uses the scan operation), I get this message and after that
>>>>> the region server crashes:
>>>>>
>>>>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer:
>>>>> (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
>>>>>
>>>>> I'm using Cloudera 4.4.0 with HBase 0.94.6-cdh4.4.0.
>>>>>
>>>>> Any clues?
>>>>>
>>>>> regards
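On Jean-Marc's batch-size point, a rough, untested sketch of what limiting the number of columns per RPC with Scan.setBatch() could look like for a single wide row, again assuming the 0.94 Java client and reusing the table, family and row key names from the thread; the batch and caching values are arbitrary. With setBatch() the wide row comes back split across several partial Result objects instead of one oversized response, which is what the responseTooLarge warning is about.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedScanExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "bulkLoadTable");
        byte[] row = Bytes.toBytes("oneSpecificRowKey");

        Scan scan = new Scan(row);              // start at the wide row
        scan.addFamily(Bytes.toBytes("mycf2"));
        scan.setBatch(1000);                    // at most 1000 KeyValues per Result
        scan.setCaching(1);                     // one (partial) Result per RPC

        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result partial : scanner) {
                if (!Bytes.equals(partial.getRow(), row)) {
                    break;                      // moved past the row we wanted
                }
                // With setBatch(), the wide row arrives as several partial Results.
                for (KeyValue kv : partial.raw()) {
                    // process kv.getQualifier() / kv.getValue() here
                }
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}

Pig's HBaseStorage would need its own knob for this; the sketch only shows, at the raw client level, the effect Jean-Marc describes.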
