Re: LinuxContainerExecutor mkdir failures causing NodeManagers to become unhealthy

Shane Kumpf Mon, 17 Sep 2018 12:25:30 -0700

Hey Jon,

YARN-8751 takes care of the issue that marks the NM unhealthy under these
conditions. If you can open a JIRA with details on the swallowed error,
that would be appreciated. As noted, 3.1.1 has a number of fixes to the
YARN containerization features, so it would be great if you can see if the
issue still occurs with that release.


Thanks,
-Shane

On Mon, Sep 17, 2018 at 1:05 PM Jeff Hubbs <jhubbsl...@att.net> wrote:

> I would also just suggest moving up to 3.1.1 and trying again. Barring
> that, maybe you can take the error message at its word. My experience with
> running Hadoop 3.x jobs is a little limited, but I know that jobs can paint
> a lot of data into /tmp/hadoop-yarn and if your nodes can't absorb a lot of
> expansion in that directory, things will error out albeit softly. Noting
> the way the terasort example behaves in that regard, I set up my worker
> nodes to make /tmp/hadoop-yarn a mount point for its own disk volume whose
> size I can preset and I can also optionally enable transparent compression
> via btrfs. A lot of times, I would expect I could give that volume some
> token small size but in trying to make a 1/5-scale (i.e., 200GB) terasort
> run, 128GiB with compression enabled across five workers wasn't enough.
> 1/10th-scale I could manage but at 1/5, it would fill up one node's
> /tmp/hadoop-yarn, then the next, then the next, etc. Makes me think that
> terasort tries to write the whole dang thing out to a file system outside
> HDFS before making an output file in HDFS.
>
> On 9/17/18 1:55 PM, Eric Badger wrote:
>
> Hi Jonathan,
>
> Have you opened up a YARN JIRA with your findings? If not, that would be
> the next step in debugging the issue and coding up a fix. This certainly
> sounds like a bug and something that we should get to the bottom of.
>
> As far as NodeManagers becoming unhealthy, a config could be added to
> prevent this. But, if you're only seeing 1 failure out of millions of
> tasks, this seems like it would cause more problems than it fixes. 1
> container failing is bad, but a node going bad and failing every container
> that runs on it forever until it is shut down is much, much worse. However,
> if you think that you have a use case that could benefit from the config
> being optional, that is something we could also look into. That would be a
> separate YARN JIRA as well.
>
> Thanks,
>
> Eric
>
> On Mon, Sep 17, 2018 at 12:37 PM, Jonathan Bender <
> jonben...@stripe.com.invalid> wrote:
>
>> Hello,
>>
>> We recently started using CGroups with LinuxContainerExecutor,
>> running Apache Hadoop 3.0.0. Occasionally (once out of many millions of
>> tasks) a YARN container will fail with a message like the following:
>> WARN privileged.PrivilegedOperationExecutor: Shell execution returned
>> exit code: 35. Privileged Execution Operation Stderr:
>> Could not create container dirsCould not create local files and
>> directories
>>
>> Looking at the container executor source it's traceable to errors here:
>> https://github.com/apache/hadoop/blob/release-3.0.0-RC1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c#L1604
>>
>> And ultimately to
>> https://github.com/apache/hadoop/blob/release-3.0.0-RC1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c#L672
>>
>> The root failure seems to be in the underlying mkdir call, but that exit
>> code / errno is swallowed so we don't have more details. We tend to see
>> this when many containers start at the same time for the same application
>> on a host, and suspect it may be related to some race conditions around
>> those shared directories between containers for the same application.
>>
>> Has anyone seen similar failures in using the LinuxContainerExecutor?
>>
>> This issue is compounded because LinuxContainerExecutor renders the node
>> unhealthy in these scenarios:
>> https://github.com/apache/hadoop/blob/release-3.0.0-RC1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java#L566
>>
>> Under some circumstances this seems appropriate, but since this is a
>> transient failure (none of these machines were at capacity for disks,
>> inodes, etc.) we shouldn't take down the NodeManager. This blacklisting
>> behavior was added as part of
>> https://issues.apache.org/jira/browse/YARN-6302 which seems perfectly
>> valid, but perhaps we should make this configurable so certain users can
>> opt out?
>>
>> Cheers,
>> Jon
>>
>
>
>
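
A note on the swallowed errno discussed in this thread: the following is a
minimal Python sketch, not the container-executor's actual C code, of how a
mkdir wrapper might tolerate losing a creation race (EEXIST on an existing
directory) while still reporting the errno for any other failure. The names
mkdir_tolerant and create_concurrently are hypothetical, and whether a race
of this kind really causes the exit code 35 failures is only the suspicion
raised in the original report.

```python
import errno
import os
import tempfile
import threading


def mkdir_tolerant(path, mode=0o750):
    """Create one directory; return 0 on success or the errno on failure.

    EEXIST counts as success when the path is already a directory, so
    concurrent creators of a shared directory do not fail each other.
    """
    try:
        os.mkdir(path, mode)
        return 0
    except FileExistsError:
        # Another creator won the race; fine as long as it is a directory.
        return 0 if os.path.isdir(path) else errno.EEXIST
    except OSError as exc:
        # Report the errno instead of swallowing it.
        print(f"mkdir({path!r}) failed: errno={exc.errno} "
              f"({os.strerror(exc.errno)})")
        return exc.errno


def create_concurrently(path, workers=8):
    """Have several threads create the same directory; collect the results."""
    results = []
    lock = threading.Lock()

    def worker():
        rc = mkdir_tolerant(path)
        with lock:
            results.append(rc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a per-application directory shared by containers.
    shared = os.path.join(tempfile.mkdtemp(), "appcache")
    print(create_concurrently(shared))
```

Returning the errno to the caller, rather than a single generic failure code,
is the part that would help with a report like the one above: it distinguishes
a lost race from a full disk or a permissions problem.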
