damccorm opened a new issue, #20757:
URL: https://github.com/apache/beam/issues/20757

   Log messages emitted by DoFns are not logged by Spark executors when the 
pipeline is run with Spark in cluster deployment mode (on YARN). Tested on 
Cloudera 6 with Spark 2.4.
   
   I made a test project to reproduce the issue: 
[https://github.com/ventuc/beam-log-test](https://github.com/ventuc/beam-log-test).
 Run it with:
   
   `spark-submit --class beam.tests.log.LogTesting --name LogTesting 
--deploy-mode cluster --master yarn --conf 
"spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" 
--conf 
"spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" 
--files $HOME/log4j.properties beam-log-test-0.0.1-SNAPSHOT.jar`
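
   For reference, a minimal `log4j.properties` of the kind shipped with `--files` above might look like the following. This is only a sketch: the appender settings and the `DEBUG` level are assumptions, and the actual file in the reproduction project may differ; `beam.tests.log` is the package name from the test project.

   ```properties
   # Root logger: INFO and above to the console appender
   log4j.rootLogger=INFO, console

   # Console appender; on YARN, executor output lands in the container's stderr log
   log4j.appender.console=org.apache.log4j.ConsoleAppender
   log4j.appender.console.target=System.err
   log4j.appender.console.layout=org.apache.log4j.PatternLayout
   log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

   # Package of the test DoFns from the reproduction project
   log4j.logger.beam.tests.log=DEBUG
   ```

   With this configuration, messages logged inside a DoFn would be expected in each executor's container log, yet they appear only in the driver's.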
   
   To retrieve logs from YARN run:
   
   `yarn logs -applicationId <app_id>`
   
   As you can see, logs from the beam.tests.log package appear only in the 
driver's log, not in the executors' logs.
   
   There is no documentation about how to handle logging in Beam with the 
Spark runner. Please document it, as also requested in BEAM-792.
   
   
   Imported from Jira 
[BEAM-11735](https://issues.apache.org/jira/browse/BEAM-11735). Original Jira 
may contain additional context.
   Reported by: claventu.

