Martin Kleppmann created SAMZA-215:
--------------------------------------

             Summary: Better logging for interactive command-line tools
                 Key: SAMZA-215
                 URL: https://issues.apache.org/jira/browse/SAMZA-215
             Project: Samza
          Issue Type: Improvement
            Reporter: Martin Kleppmann


At the moment, if you use run-job.sh, it prints out a very long JVM invocation 
(which is arguably not very useful for most users) but no information about 
what has actually happened (e.g. connecting to the YARN RM). Where the 
progress messages get logged to depends on the log4j configuration of the 
user project using Samza.

For example, hello-samza supplies 
{{samza-job-package/src/main/resources/log4j.xml}} which sends the logs to a 
file called {{deploy/samza/undefined-samza-container-name.log}} by default. 
That is not a great experience for new users — if the job won't start up, users 
need to know to look in an obscurely-named log file to see any errors that 
occurred in run-job.sh (e.g. could not connect to YARN RM).

It's good that jobs can supply their own configuration for logging within a 
container. However, for interactive tools like run-job.sh, kill-yarn-job.sh and 
checkpoint-tool.sh (SAMZA-180) it would be much better if the logs just went to 
the console (stdout or stderr).

Suggested solution: we include a default log4j configuration that sends logs to 
the console, and use it in the interactive shell scripts (e.g. run-job.sh). We 
don't use it in run-container.sh and run-am.sh, as those should be configured 
by the job.
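
As a rough sketch of what such a default could look like, assuming log4j 1.x 
(as implied by the {{log4j.xml}} in hello-samza): the file name 
{{log4j-console.xml}}, the choice of stderr and the message pattern are only 
illustrative, not a settled design.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<!-- Hypothetical log4j-console.xml: default for interactive tools only -->
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- Send everything to the console; stderr is an assumption here -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <param name="Target" value="System.err"/>
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern"
             value="%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n"/>
    </layout>
  </appender>
  <root>
    <priority value="info"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
{code}

The interactive scripts could then point the JVM at this file via log4j's 
{{-Dlog4j.configuration=file:<path>}} system property, while run-container.sh 
and run-am.sh would leave it unset so that the job's own configuration applies.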

This will be especially relevant when we make binary releases of Samza. A user 
should be able to download the tgz of a release and immediately use the shell 
scripts for managing jobs, without having to worry about configuring log4j.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
