Regards, Vidosh. My responses are inline.
On 02/16/2013 01:03 AM, Vidosh Sahu wrote:
Hi,
I am using HBase 0.92.1-cdh4.1.1. There is a mapper job which does a lot of
operations. This job is failing with the error below. It looks like the
scanner operation timed out. How can I increase the timeout using the
Cloudera Manager web console? Can anyone please help me out?
Well, this question belongs on the CDH mailing list ([email protected]).
Another thing is that scanner performance has seen a lot of improvements
up to the current stable release (which is 0.94.5 right now, thanks to
Lars H. for the information).
I remember a great talk by Mikhail Bautin at the last HBaseCon, called
"Optimizing HBase scanner performance", which explained everything that
was done on this particular topic, essentially [1]:
- HBASE-4433: Avoid extra next if done with row/column
- HBASE-4434: Don't do HFile Scanner next() unless the next KV is needed
- HBASE-2794: Multi-column Bloom Filters
- HBASE-4465: Lazy Seek
- HBASE-4469: Utilize existing ROWCOL Bloom filter
- HBASE-4532: Added a separate ROW-only Bloom filter for DeleteFamily
- HBASE-4585: Seek on deleted KV
- HBASE-4962: Top-of-the-column seek
So, as you can see, it is worth upgrading your HBase installation. CDH 4.2
is out, and CDH 5.0 will be available very soon. [2]
[1] http://www.slideshare.net/cloudera/tag/hbasecon-2012
[2] http://blog.cloudera.com/blog/2012/11/apache-hbase-assignmentmanager-improvements/
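On the timeout itself: the exception quoted below says 614527 ms passed
between two scanner calls while the limit was 60000 ms. The client fetches
rows in batches (scanner caching) and the scanner lease is only renewed by
the next fetch, so the mapper has to finish a whole batch within the
timeout. Besides raising the lease (in releases of this era that is, to my
knowledge, the hbase.regionserver.lease.period property), lowering the
caching is the usual fix. Below is a minimal sketch of that arithmetic in
plain Java. ScannerLeaseBudget and its methods are hypothetical helpers,
not HBase API, and the caching value of 100 is an assumption, since the
log does not show the real one.

```java
// Sketch only: hypothetical helper, not part of HBase.
final class ScannerLeaseBudget {

    private ScannerLeaseBudget() {}

    /** Average time per row, given that one batch of `caching` rows took `elapsedMs`. */
    static long perRowMillis(long elapsedMs, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return elapsedMs / caching;
    }

    /** True if processing one full batch takes longer than the scanner lease. */
    static boolean exceedsLease(int caching, long perRowMs, long leaseMs) {
        return (long) caching * perRowMs > leaseMs;
    }

    /** Largest caching value whose batch still fits in the lease (at least 1). */
    static int maxCaching(long leaseMs, long perRowMs) {
        if (perRowMs <= 0) {
            throw new IllegalArgumentException("perRowMs must be positive");
        }
        return (int) Math.max(1L, leaseMs / perRowMs);
    }

    public static void main(String[] args) {
        long elapsedMs = 614527L;   // from the ScannerTimeoutException message
        long leaseMs = 60000L;      // "timeout is currently set to 60000"
        int assumedCaching = 100;   // ASSUMPTION: not visible in the log

        long perRowMs = perRowMillis(elapsedMs, assumedCaching);
        System.out.println("per-row ms (assumed caching): " + perRowMs);
        System.out.println("batch exceeds lease: "
                + exceedsLease(assumedCaching, perRowMs, leaseMs));
        System.out.println("largest caching that fits: "
                + maxCaching(leaseMs, perRowMs));
    }
}
```

With those assumed numbers, a caching of about 9 rows would keep each batch
under the 60000 ms limit. The real value depends on how long your mapper
spends per row, so measure that first.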
2013-02-16 05:30:45,978 [Thread-158] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
org.apache.hadoop.hbase.client.ScannerTimeoutException: 614527ms passed since the last invocation, timeout is currently set to 60000
at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1302)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:133)
at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -5689607602334883791
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2110)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1345)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:84)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:39)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1325)
at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1293)
... 8 more
2013-02-16 05:30:46,016 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner -
2013-02-16 05:30:46,424 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Job complete: job_local_0005
2013-02-16 05:30:46,431 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Counters: 5
2013-02-16 05:30:46,431 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - FileSystemCounters
2013-02-16 05:30:46,431 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - FILE_BYTES_READ=3827305
2013-02-16 05:30:46,431 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - FILE_BYTES_WRITTEN=3967770
2013-02-16 05:30:46,431 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Map-Reduce Framework
2013-02-16 05:30:46,432 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Map input records=2000
2013-02-16 05:30:46,432 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Spilled Records=0
2013-02-16 05:30:46,433 [pool-1-thread-1] INFO org.apache.hadoop.mapred.JobClient - Map output records=1040
Thanks,
Vidosh
--
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>