Very reasonable scenario, but the application I run does not delete the input files, so such a race condition could not manifest itself at any point.
Funnily enough, while experimenting we changed some local path permissions and it now seems to work. Thanks! :-)

On Tue, May 14, 2013 at 8:39 PM, Chris Nauroth <[email protected]> wrote:

> Is it possible that you have multiple MR jobs (or other HDFS clients)
> operating on the same file paths that could cause a conflict if run
> concurrently?
>
> At MR job submission time, the MR client identifies the set of input
> splits, which roughly correspond to the blocks of the input HDFS files.
> (This is a simplified description, because CombineFileInputFormat or your
> own custom InputFormat can complicate the picture, but this simplification
> is fine for our purposes.) When map tasks launch, they read from the input
> splits (the HDFS file blocks). If you have an MR job that decides one of
> its input splits needs block X, and then another process decides to delete
> the HDFS file containing block X before the map task that would read the
> block launches, then you'd have a race condition that could trigger a
> problem similar to this.
>
> Typically, the solution is to design applications such that concurrent
> deletes while reading from a particular HDFS file are not possible. For
> example, you might perform file deletion only after the MR job that consumes
> those files has finished, so that you know nothing else is reading while
> you're trying to delete.
>
> BlockMissingException could also show up if you've lost all replicas of a
> block, but this would be extremely rare for a typical deployment with a
> replication factor of 3.
>
> Hope this helps,
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
> On Tue, May 14, 2013 at 2:20 PM, Public Network Services <
> [email protected]> wrote:
>
>> Hi...
>>
>> I am getting a BlockMissingException in a fairly simple application with
>> a few mappers and reducers (see end of message).
>>
>> Looking around on the web has not helped much, including JIRA issues
>> HDFS-767 and HDFS-1907. The configuration variable
>>
>> - dfs.client.baseTimeWindow.waitOn.BlockMissingException
>>
>> does not seem to make a difference, either.
>>
>> The BlockMissingException occurs in some of the runs, while in others
>> execution completes normally, which suggests a possible concurrency issue.
>>
>> Any ideas?
>>
>> Thanks!
>>
>>
>> org.apache.hadoop.yarn.YarnException:
>> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
>> BP-390546703...
>> file=...job.splitmetainfo
>> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1159)
>> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1013)
>> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:985)
>> at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
>> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
>> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
>> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:694)
>> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:119)
>> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:904)
>> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:854)
>> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1070)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
>> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1066)
>> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1025)
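For reference, here is a minimal, hypothetical sketch of the ordering Chris describes in the quoted reply: delete the input files only after the MR job that consumes them has completed. The class name, paths, and argument handling are placeholders (not taken from the actual application), and the mapper/reducer setup is omitted since only the ordering matters here.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ConsumeThenDelete {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);   // HDFS directory the job reads (placeholder)
        Path output = new Path(args[1]);  // HDFS directory the job writes (placeholder)

        Job job = Job.getInstance(conf, "consume-then-delete");
        job.setJarByClass(ConsumeThenDelete.class);
        // Mapper/Reducer classes omitted; this sketch only illustrates the ordering.
        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, output);

        // Block until the job has finished reading every input split.
        boolean ok = job.waitForCompletion(true);

        // Delete the inputs only after the job has completed successfully,
        // so no map task can race against the delete and hit a missing block.
        if (ok) {
            FileSystem fs = FileSystem.get(conf);
            fs.delete(input, true); // recursive delete of the consumed inputs
        }
        System.exit(ok ? 0 : 1);
    }
}

The key point is that fs.delete() runs strictly after job.waitForCompletion(true) returns, so no map task can still be reading a block of the deleted files.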
