Hello Friends,

I am seeing Hadoop log timestamps and file timestamps that do not match the system time. I found that this problem was discussed on the mailing list earlier, but no solution was posted:
http://www.mail-archive.com/hdfs-user@hadoop.apache.org/msg00310.html

I am wondering what is causing this behaviour and how to fix it. As you can see below, the system time is in UTC, but Hadoop is showing EST.

[had...@hadoop~]$ date
Mon Jan 10 21:54:17 UTC 2011
[had...@hadoop ~]$ hadoop jar $HADOOP_HOME/*examples.jar teragen 1000 tera-dir
Generating 1000 using 2 maps with step of 500
11/01/10 16:50:33 INFO mapred.JobClient: Running job: job_201101061958_2590
11/01/10 16:50:34 INFO mapred.JobClient: map 0% reduce 0%
11/01/10 16:50:40 INFO mapred.JobClient: map 100% reduce 0%
11/01/10 16:50:40 INFO mapred.JobClient: Job complete: job_201101061958_2590
11/01/10 16:50:40 INFO mapred.JobClient: Counters: 6
11/01/10 16:50:40 INFO mapred.JobClient: Job Counters
11/01/10 16:50:40 INFO mapred.JobClient: Launched map tasks=2
11/01/10 16:50:40 INFO mapred.JobClient: FileSystemCounters

I thought a possible workaround would be to add -Duser.timezone in conf/hadoop-env.sh, but that works only on my local system, not on the cluster.

Why is Hadoop reporting the time in EST instead of UTC? Where does Hadoop read its time from?

Any help is greatly appreciated.

Thanks in advance,
Ravi