Adam Antal created YARN-9667:
--------------------------------

             Summary: Container-executor.c duplicates messages to stdout
                 Key: YARN-9667
                 URL: https://issues.apache.org/jira/browse/YARN-9667
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager, yarn
    Affects Versions: 3.2.0
            Reporter: Adam Antal


When a container is killed by its AM, we get an error message similar to this:
{noformat}
2019-06-30 12:09:04,412 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 143. Privileged Execution Operation Stderr:

Stdout: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_000019/container_e84_1561921629886_0001_01_000019.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
{noformat}
In container-executor.c the fork happens right after the "Creating script 
paths..." message is printed, yet the Stdout log clearly shows that message 
(and "Getting exit code file...") twice. After consulting with [~pbacsko], it 
seems that a flush is missing in container-executor.c before the fork: the 
unflushed output is still in the stdio buffer when the process forks, the child 
inherits a copy of that buffer, and both processes eventually write it out.

I suggest adding a flush there so that the messages are not duplicated: it's a 
bit misleading that the child process appears to write "Getting exit code file" 
and "Creating script paths" even though it clearly does not perform those steps.

A more appealing solution could be to revisit the fprintf-fflush pairs in the 
code and replace each pair with a single call, so that the fflush cannot be 
forgotten accidentally. (A missing fflush can cause this problem anywhere 
output is still buffered when the process forks.)
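
One possible shape for such a single call is sketched below. The name 
{{log_and_flush}} is hypothetical and does not exist in container-executor.c; 
it only illustrates the idea of formatting and flushing in one step.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: prints like fprintf() and always flushes the stream
 * afterwards, so no buffered output is left behind for a later fork().
 * Returns the number of characters written, or a negative value on error. */
static int log_and_flush(FILE *stream, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int written = vfprintf(stream, fmt, args);
    va_end(args);
    if (written < 0) return written;
    if (fflush(stream) != 0) return -1;
    return written;
}
```

A call such as {{log_and_flush(stdout, "main : run as user is %s\n", user)}} 
would then replace an fprintf followed by a separate fflush. The extra flush 
per message costs a write system call each time, which should be acceptable 
for the handful of messages logged per container launch.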

Note: this issue probably affects every call to fork(), not just the one 
from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
