Cool thx! I already moved ntp to NN (had the same thought). All machines sync to the same apt-mirror, but I'll check the TZ files to make sure. Not much left to try... I even formatted NN, and it didn't go away.
Next would be to dig up hadoop source for timestamps, and run it in a harness, I guess. Could be java versioning/TZ issue, I suppose (Is java tzdata managed separately?) We'll see Best, -stu Sent from my Verizon Wireless BlackBerry -----Original Message----- From: Eric Sammer <esam...@cloudera.com> Date: Thu, 22 Apr 2010 01:17:25 To: <hdfs-user@hadoop.apache.org> Subject: Re: hadoop log timestamps & file timestamps not same as system time My initial inclination is to check the timezone of the machines. An hour difference smells like DST shift. Check that the OS are the same version / patch level (as zoneinfo files may have changed based on legislation recently passed). Confirm that all machines are living in the same zone. Sync'ing to ntp is good too although an hour outage should be fine. If you want to protect against ntp drift due to outage, you can run a canonical ntp server inside the cluster. Given the NN has to be available for the cluster to function, that might be a good place to run one and just have the DN / TT nodes sync to it. HTH. On Wed, Apr 21, 2010 at 6:46 PM, Stuart Smith <stu24m...@yahoo.com> wrote: > Hello, > I've noticed an odd situation: > > The timestamps in my hadoop namenode log (and filesystem) appear to be an > hour behind the actual time on the system (as supplied by `date`). > > I checked that this wasn't user error by stopping and starting hadoop, and > checking the most recent messages (as well as the last time the file was > modified). > > For example, you can see the last modified time, and the timestamp on the > shutdown log message different significantly: > > s...@ubuntu-namenode:~/hadoop/logs$ ls -l > hadoop-stu-namenode-ubuntu-namenode.log > -rw-r--r-- 1 stu stu 22148124 2010-04-21 15:33 > hadoop-stu-namenode-ubuntu-namenode.log > s...@ubuntu-namenode:~/hadoop/logs$ tail -f > hadoop-stu-namenode-ubuntu-namenode.log > .... > 2010-04-21 14:33:19,032 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > SHUTDOWN_MSG: > /************************************************************ > SHUTDOWN_MSG: Shutting down NameNode at ubuntu-namenode/192.168.193.87 > ************************************************************/ > > I also checked this while hadoop was running: the messages are consistently > an hour behind. > > More troubling, this appears affect the filesystem timestamps as well. > > s...@ubuntu-namenode:~/hadoop/bin$ date > Wed Apr 21 15:53:34 PDT 2010 > s...@ubuntu-namenode:~/hadoop/bin$ ./hadoop dfs -touchz /test > s...@ubuntu-namenode:~/hadoop/bin$ ./hadoop dfs -ls / > Found 2 items > drwxr-xr-x - stu supergroup 0 2010-04-20 16:51 /home > -rw-r--r-- 3 stu supergroup 0 2010-04-21 14:53 /test > > > Interestingly, the datanodes appear to be fine: > > s...@ubuntu-hadoop-2:~/hadoop/logs$ ls -l > hadoop-stu-datanode-ubuntu-hadoop-2.log > -rw-r--r-- 1 stu stu 4918505 2010-04-21 15:47 > hadoop-stu-datanode-ubuntu-hadoop-2.log > s...@ubuntu-hadoop-2:~/hadoop/logs$ tail -n 1 > hadoop-stu-datanode-ubuntu-hadoop-2.log > 2010-04-21 15:47:21,875 INFO org.mortbay.log: jetty-6.1.14 > > Is there something wrong here? Am I just doing something crazy/dumb? > > Note that I discovered this after getting some rather odd behavior from hbase > - hbase appears to work fine, except it's continually adding the > regionservers onto the list so, in my itty-bitty cluster of about 4 machines > + 1 namenode, the master reports back ~ 1000s of regionservers and growing.. > There are indications timing issues involved in that as well, but I figure I > should get hdfs working first :) > > Note that I do have NTP running to keep all these boxes in sync, but the NTP > server will go down for an hour or so intermittently (due to some network > issues that don't involve the hadoop cluster). All the system times on the > machines appear to be (at least) within a few seconds of each other. > > I'm running hadoop 0.20.2 > > Any thoughts? Any debugging guidance? I'm pretty stumped. > > Take care, > -stu > > > > -- Eric Sammer phone: +1-917-287-2675 twitter: esammer data: www.cloudera.com