Cool… But my MapReduce job doesn't even start;
it fails while creating the record reader...
The record reader fails in
TableMapReduceUtil.convertStringToScan(conf.get(SCAN));
and throws a
java.io.IOException: version not supported
at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:558)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:255)
at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:105)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Basically, deserializing the Scan object from the conf is failing for some
reason.
./zahoor
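
For context, here is a minimal, self-contained sketch (my own illustration, not HBase's actual code) of the Writable-style version check that Scan.readFields performs: the serialized bytes begin with a version byte, and the reader rejects any version it does not recognize. A "version not supported" IOException therefore usually means the Scan bytes were written by a different HBase version (e.g., mismatched client jars shipped with the job) than the one deserializing them in the mapper. The class and helper names below are hypothetical.

```java
import java.io.*;

// Hypothetical sketch of a version-checked Writable-style record,
// mimicking the check in Scan.readFields. Not HBase code.
public class VersionCheckSketch {
    static final byte SUPPORTED_VERSION = 1;

    // Serialize: a leading version byte, then the payload.
    static byte[] write(byte version, String payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(version);
        out.writeUTF(payload);
        out.flush();
        return bos.toByteArray();
    }

    // Deserialize: reject any version newer than we understand,
    // the same failure mode as the "version not supported" IOException.
    static String read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        byte version = in.readByte();
        if (version > SUPPORTED_VERSION) {
            throw new IOException("version not supported");
        }
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        // Same version on both sides: round-trips fine.
        System.out.println(read(write((byte) 1, "ok")));
        // Writer newer than reader: rejected at the version byte.
        try {
            read(write((byte) 2, "newer"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The practical check this suggests: make sure the HBase jars on the submitting client and on the tasktrackers are the same version.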
On 22-Oct-2012, at 9:52 PM, Bryan Beaudreault <[email protected]> wrote:
> I'm not on 0.94.1, but I've found a lot of situations that can cause
> scanner timeouts and other scanner exceptions from M/R. The primary ones
> probably still apply in later versions:
>
>
> - Caching or batching set too high. If caching is set to, e.g., 1000,
> and hbase.rpc.timeout is set to 30 seconds, you need to be able to
> process 1000 records in your mapper in under 30 seconds (minus the
> overhead of actually returning that many records). Otherwise the
> mapper's next call to next() will throw a timeout.
> - Similar to the above, this can happen if the logic in your mapper
> alone is too heavy and takes too long. Just keep in mind that
> hbase.rpc.timeout can be triggered by the time between calls to next().
> - hbase.rpc.timeout > hbase.regionserver.lease.period. If this is the
> case, the RS will time out first and kill the scan. Then your mapper will
> call next() and, since the scan no longer exists, it will throw a
> scanner/lease exception.
> - The filters on the mapper scan are causing too many rows to be
> skipped, such that not enough rows can be collected to return within the
> timeout.
>
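
The caching/timeout budget in the first bullet above can be sketched as simple arithmetic (illustrative numbers and a hypothetical helper, not an HBase API): with scanner caching C, each RPC returns C rows, and the mapper must finish processing them before the next call to next() reaches the RegionServer.

```java
// Hypothetical sketch of the scanner-timeout budget: with caching C,
// the next RPC must happen before hbase.rpc.timeout expires, so the
// mapper's per-row time must satisfy C * perRowMillis < rpcTimeoutMillis.
public class ScannerTimeoutBudget {
    public static boolean fitsBudget(int caching, double perRowMillis,
                                     long rpcTimeoutMillis) {
        return caching * perRowMillis < rpcTimeoutMillis;
    }

    public static void main(String[] args) {
        long rpcTimeoutMillis = 30_000; // e.g., hbase.rpc.timeout = 30 s
        // 1000 rows at 25 ms each = 25 s of work: within budget.
        System.out.println(fitsBudget(1000, 25.0, rpcTimeoutMillis)); // true
        // 1000 rows at 35 ms each = 35 s of work: will time out.
        System.out.println(fitsBudget(1000, 35.0, rpcTimeoutMillis)); // false
    }
}
```

Lowering the scan's caching (or speeding up the mapper) shrinks the left-hand side of that inequality.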
> Hope this helps and/or is still accurate for your version.
>
> On Mon, Oct 22, 2012 at 11:51 AM, J Mohamed Zahoor <[email protected]> wrote:
>
>> I am using 0.94.1
>>
>> ./zahoor
>>
>>
>> On 22-Oct-2012, at 9:17 PM, J Mohamed Zahoor <[email protected]> wrote:
>>
>>> Hi
>>>
>>> I am facing a scanner exception like this when I run an MR job.
>>> Both the input and output are HBase tables (different tables)…
>>> This comes up sporadically on some mappers while all the other mappers
>>> run fine. Even the failed mapper passes on the next attempt.
>>> Any clue on what might be wrong?
>>>
>>> java.lang.NullPointerException
>>> at org.apache.hadoop.hbase.client.Scan.<init>(Scan.java:147)
>>> at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:123)
>>> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>
>>>
>>> ./zahoor
>>
>>