How many DataNodes (DNs) do you have?
If you have more than one, check whether the same thing happens on
another DN.
Check /var/log/messages or dmesg (as Todd suggested). For example, this
is from one of my Ubuntu servers:
dmesg | grep EXT4-fs
[ 1.583836] EXT4-fs (sda7): INFO: recovery required on readonly filesystem
[ 1.583843] EXT4-fs (sda7): write access will be enabled during recovery
[ 2.572935] EXT4-fs (sda7): orphan cleanup on readonly fs
[ 2.620969] EXT4-fs (sda7): ext4_orphan_cleanup: deleting unreferenced inode 455946
[ 2.621015] EXT4-fs (sda7): ext4_orphan_cleanup: deleting unreferenced inode 455942
[ 2.621029] EXT4-fs (sda7): 2 orphan inodes deleted
[ 2.621034] EXT4-fs (sda7): recovery complete
[ 2.785283] EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: (null)
[ 22.041130] EXT4-fs (sda7): re-mounted. Opts: errors=remount-ro
[ 22.505474] EXT4-fs (sda8): mounted filesystem with ordered data mode. Opts: (null)
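A broader filter than the EXT4-fs grep above can be sketched as a small shell function. The name `fs_errors` and the pattern list are my own assumptions, not a complete set of kernel messages, so adjust them for whatever filesystem is on the partition:

```shell
# Sketch: keep only kernel log lines that usually point at filesystem or disk trouble.
# fs_errors is a hypothetical helper name; the patterns are assumptions, not a full list.
# It reads log lines on stdin and prints the matching ones.
fs_errors() {
    grep -iE 'EXT[234]-fs (error|warning)|XFS.*(error|corrupt)|I/O error|remounting filesystem read-only'
}
```

It could be used as `dmesg | fs_errors` or `fs_errors < /var/log/messages`. Routine lines such as "recovery complete" are dropped on purpose, so an empty result only means none of these patterns matched.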
Regards
On 6/6/2011 4:43 PM, Todd Lipcon wrote:
Hi Prem,
My guess is that your Linux filesystem on this partition is corrupt.
Check dmesg for output indicating fs-level errors.
-Todd
On Mon, Jun 6, 2011 at 1:23 PM, Jain, Prem <premanshu.j...@netapp.com> wrote:
Mapuser or hdfs user didn't seem to help, so I switched to root:
[root@hadoop20 mapred]# ls -la /part/data
total 0
drwx------ 3 hdfs hadoop 16 Jun 6 10:22 .
drwxrwxrwx 4 hdfs hadoop 47 May 26 18:36 ..
drwxr-xr-x 4 mapred mapred 35 May 26 21:02 tmp
[root@hadoop20 mapred]#
[root@hadoop20 mapred]# pwd
/part/data/tmp/distcache/642114211252449475_2038269146_799583695/hmaster/user/mapred
[root@hadoop20 mapred]# ls -la
total 0
drwxr-xr-x 3 mapred mapred 22 Jun 6 12:46 .
drwxr-xr-x 3 mapred mapred 19 May 26 21:17 ..
?--------- ? ? ? ? ? input-dir
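The `?--------- ? ? ? ? ?` row means the directory lists a name that `ls` could not stat. A minimal sketch for finding every such entry in a directory, assuming a POSIX shell and file names without newlines; `list_unstatable` is a hypothetical helper name, not an existing tool:

```shell
# Sketch: print names that the directory returns but that cannot be stat'ed.
# list_unstatable is a hypothetical helper; it checks one directory, not a whole tree.
list_unstatable() {
    dir=${1:-.}
    # Plain ls -A only needs the names stored in the directory itself.
    ls -A "$dir" 2>/dev/null | while IFS= read -r name; do
        # -e follows symlinks and -L catches dangling symlinks;
        # if both tests fail, the entry could not be stat'ed at all.
        if [ ! -e "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}
```

Running `list_unstatable .` from the affected directory should print one line per unreadable entry and nothing for a healthy directory.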
-----Original Message-----
From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Monday, June 06, 2011 1:17 PM
To: hdfs-user@hadoop.apache.org
Cc: Jain, Prem
Subject: Re: cant remove files from tmp
* Why are you using the root user for these operations?
* What are the permissions on your data directory (ls -la /part/data)?
Regards
On 6/6/2011 3:41 PM, Jain, Prem wrote:
> I have a wrecked datanode which is giving me a hard time restarting. It
> keeps complaining "Datanode dead, pid file exists". I already tried
> deleting the files, but it seems the files are corrupted and cannot be
> deleted.
>
> ____________________________________________________________________
>
> Here is the log:
> ____________________________________________________________________
>
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG: host = hadoop20/192.168.1.190
> STARTUP_MSG: args = []
> STARTUP_MSG: version = 0.20.2-cdh3u0
> STARTUP_MSG: build = -r 81256ad0f2e4ab2bd34b04f53d25a6c23686dd14;
> compiled by 'root' on Fri Mar 25 20:07:24 EDT 2011
> ************************************************************/
> 2011-06-06 09:11:01,232 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
> 2011-06-06 09:11:01,369 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.Shell$ExitCodeException: du: cannot access `/part/data/tmp/distcache/642114211252449475_2038269146_799583695/hmaster/user/mapred/input-dir': No such file or directory
> du: cannot read directory `/part/data/tmp/mapred/jobcache/job_201105261845_0005': Permission denied
>
>
> _________________________________
> Here is the file I can't delete
> _________________________________
> [root@hadoop20 distcache]# pwd
> /part/data/tmp/distcache
> [root@hadoop20 distcache]# ls -la
> total 0
> drwxr-xr-x 3 mapred mapred 52 May 26 21:36 .
> drwxr-xr-x 4 mapred mapred 35 May 26 21:02 ..
> drwxr-xr-x 3 mapred mapred 20 May 26 21:17 642114211252449475_2038269146_799583695
> [root@hadoop20 distcache]# cd *
> [root@hadoop20 642114211252449475_2038269146_799583695]# ls -la
> total 0
> drwxr-xr-x 3 mapred mapred 20 May 26 21:17 .
> drwxr-xr-x 3 mapred mapred 52 May 26 21:36 ..
> drwxr-xr-x 3 mapred mapred 17 May 26 21:17 hmaster
> [root@hadoop20 642114211252449475_2038269146_799583695]# cd h*
> [root@hadoop20 hmaster]# ls
> user
> [root@hadoop20 hmaster]# cd *
> [root@hadoop20 user]# ls -la
> total 0
> drwxr-xr-x 3 mapred mapred 19 May 26 21:17 .
> drwxr-xr-x 3 mapred mapred 17 May 26 21:17 ..
> drwxr-xr-x 3 mapred mapred 22 May 26 21:17 mapred
> [root@hadoop20 user]# cd m*
> [root@hadoop20 mapred]# ls -la
> total 0
> drwxr-xr-x 3 mapred mapred 22 May 26 21:17 .
> drwxr-xr-x 3 mapred mapred 19 May 26 21:17 ..
> ?--------- ? ? ? ? ? input-dir
> [root@hadoop20 mapred]# rm input-dir
> rm: cannot lstat `input-dir': No such file or directory
> [root@hadoop20 mapred]# touch *
> [root@hadoop20 mapred]# ls
> input-dir input-dir
> [root@hadoop20 mapred]# rm *
> rm: remove regular empty file `input-dir'? y
> rm: cannot lstat `input-dir': No such file or directory
> [root@hadoop20 mapred]# ls -la
> total 0
> drwxr-xr-x 3 mapred mapred 22 Jun 6 12:45 .
> drwxr-xr-x 3 mapred mapred 19 May 26 21:17 ..
> ?--------- ? ? ? ? ? input-dir
> [root@hadoop20 mapred]#
>
--
Marcos Luís Ortíz Valmaseda
Software Engineer (UCI)
http://marcosluis2186.posterous.com
http://twitter.com/marcosluis2186
--
Todd Lipcon
Software Engineer, Cloudera