Re: HADOOP_ROOT_LOGGER
In my experience the default HADOOP_ROOT_LOGGER definition will override any root logger defined in log4j.properties, which is where the problems have arisen. If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh were removed, wouldn't the root logger defined in the log4j.properties file be used? Or do the client commands not read that configuration file? I'm trying to understand why the root logger should be defined outside of the log4j.properties file.

Rob

On 05/22/2014 12:53 AM, Vinayakumar B wrote:
[...]
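On Robert's question of why the file alone doesn't take effect: in the stock Apache Hadoop log4j.properties the root logger is itself defined through a property substitution, and log4j resolves `${...}` references from JVM system properties before falling back to the file, so the `-Dhadoop.root.logger` value set by hadoop-config.sh is what actually fills it in. The fragment below is illustrative; check the file shipped with your distribution:

```properties
# hadoop.root.logger here is only a file-local fallback; a
# -Dhadoop.root.logger=... system property takes precedence in the
# substitution on the following line.
hadoop.root.logger=INFO,console
log4j.rootLogger=${hadoop.root.logger}, EventCounter
```

With this layout, removing the script's default would let the file's own `hadoop.root.logger=INFO,console` line take effect instead.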
Re: HADOOP_ROOT_LOGGER
It's not always practical to edit the log4j.properties file. For one thing, if you're using a management system, there may be many log4j.properties files sprinkled around the system, and it could be difficult to figure out which one you need to edit. For another, you may not (should not?) have permission to do this on a production cluster. Doing something like

HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -cat /foo

has helped me diagnose problems in the past.

best,
Colin

On Thu, May 22, 2014 at 6:34 AM, Robert Rati rr...@redhat.com wrote:
[...]
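Colin's one-liner relies on a standard shell feature: a `VAR=value` assignment prefixed to a command is placed in that one command's environment only and does not persist in the calling shell. A minimal sketch, using a hypothetical variable name `FOO`:

```shell
# Start from a clean slate so the demonstration is self-contained.
unset FOO

# A VAR=value prefix exports the value to that one command's environment...
FOO=bar sh -c 'echo "child sees: $FOO"'
# prints: child sees: bar

# ...but the calling shell's environment is untouched.
echo "parent sees: ${FOO:-<unset>}"
# prints: parent sees: <unset>
```

This is why a one-off `HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs ...` turns on debug output for that single command without changing anything for later commands or daemons.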
Re: HADOOP_ROOT_LOGGER
Ah, that makes sense. Would it make sense to default the root logger to the one defined in the log4j.properties file instead of the static value in the script then? That way an admin can set all the logging properties desired in the log4j.properties file, but can still override with HADOOP_ROOT_LOGGER to debug. It feels a little black-box-y that if HADOOP_ROOT_LOGGER isn't set then the root logger set in log4j.properties is ignored. Maybe this is all very well known and just a bit black-box-y to me since I'm new-ish to Hadoop.

Rob

On 05/22/2014 03:41 PM, Colin McCabe wrote:
[...]
HADOOP_ROOT_LOGGER
I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which sets a root logger if HADOOP_ROOT_LOGGER isn't set. Why is this here/needed? There is a log4j.properties file provided that defines a default logger. I believe the line above will result in overriding whatever is set for the root logger in the log4j.properties file. This has caused some confusion and hacks to work around it. Is there a reason not to remove the above code and just have all the logger definitions in the log4j.properties file? Is there maybe a compatibility concern?

Rob
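The `${HADOOP_ROOT_LOGGER:-INFO,console}` syntax in that line is plain POSIX parameter expansion: the fallback after `:-` is substituted only when the variable is unset or empty. A minimal sketch of why the JVM always receives a `-Dhadoop.root.logger` value, whether or not the variable was set:

```shell
# POSIX ${VAR:-default} expansion, as used in hadoop-config.sh.
# Unset variable: the fallback after ":-" is substituted.
unset HADOOP_ROOT_LOGGER
echo "-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: -Dhadoop.root.logger=INFO,console

# Set variable: the fallback is ignored.
HADOOP_ROOT_LOGGER="DEBUG,console"
echo "-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: -Dhadoop.root.logger=DEBUG,console
```

So even when nothing is exported, the option is still passed to the JVM, which is why the script's default, not the file's, ends up deciding the root logger.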
Re: HADOOP_ROOT_LOGGER
Hi Robert,

I understand your confusion. HADOOP_ROOT_LOGGER is set to the default value INFO,console if it hasn't been set to anything, and logs will be displayed on the console itself. This will be true for any client commands you run, for example:

hdfs dfs -ls /

But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.) HADOOP_ROOT_LOGGER will be set to INFO,RFA if the HADOOP_ROOT_LOGGER env variable is not defined, so that all the log messages of the server daemons go to log files maintained by the RollingFileAppender.

If you want to override all these defaults and set your own log level, then define it as the env variable HADOOP_ROOT_LOGGER, for example:

export HADOOP_ROOT_LOGGER=DEBUG,RFA

Export the above env variable and then start the server scripts or execute client commands; all logs will go to files maintained by the RollingFileAppender.

Regards,
Vinay

On Wed, May 21, 2014 at 6:42 PM, Robert Rati rr...@redhat.com wrote:
[...]

--
Regards,
Vinay
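The server-side defaulting Vinay describes can be sketched with the same `${VAR:-default}` pattern as the client path, just with a file appender as the fallback. The variable name and the INFO,RFA / DEBUG,RFA values are from this thread; the exact line in hadoop-daemon.sh may differ by version:

```shell
# Hedged sketch of the daemon-side defaulting: fall back to INFO,RFA
# (RollingFileAppender) only when the caller exported nothing.
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
echo "$HADOOP_ROOT_LOGGER"
```

Run with nothing exported, this prints INFO,RFA; run after `export HADOOP_ROOT_LOGGER=DEBUG,RFA`, it keeps DEBUG,RFA, matching the override behaviour described above.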