Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1573#discussion_r17853745
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -205,6 +205,8 @@ Note that for the first option, both executors and the application master will s
     log4j configuration, which may cause issues when they run on the same node (e.g. trying to write
     to the same log file).
     
    +For a streaming application to operate 24/7, you can use log4j's RollingFileAppender in log4j.properties to keep a single log file from filling the disk. To place the log files in YARN's container log directory, set the RollingFileAppender's file location to "${spark.yarn.app.container.log.dir}/spark.log". These log files can then be viewed on YARN's container page while the application is running, and they will be copied to HDFS after the job finishes if log aggregation is enabled.
    --- End diff --
    
    Since this could be used in more than just streaming applications, perhaps we should reword this a little: put the information about ${spark.yarn.app.container.log.dir} first and then give the example using RollingFileAppender with streaming.
    
    Something more like: (note feel free to change the exact wording)
    
    If you need a reference to the proper location to put the log files in YARN so that YARN can properly display and aggregate them, use "${spark.yarn.app.container.log.dir}" in your log4j.properties. For example... (then explain the streaming example).
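
    As a rough sketch of what that example might look like (the appender name, size limits, and layout pattern here are illustrative assumptions, not taken from the PR), a log4j.properties along these lines would roll the log inside the YARN container log directory:

        # Send everything to a rolling file under the YARN container log directory.
        # ${spark.yarn.app.container.log.dir} is filled in from a system property at runtime.
        log4j.rootLogger=INFO, rolling
        log4j.appender.rolling=org.apache.log4j.RollingFileAppender
        log4j.appender.rolling.File=${spark.yarn.app.container.log.dir}/spark.log
        # Cap each file at 50MB and keep at most 5 rotated files so a long-running
        # (e.g. streaming) application cannot fill the disk with a single log file.
        log4j.appender.rolling.MaxFileSize=50MB
        log4j.appender.rolling.MaxBackupIndex=5
        log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
        log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n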


