Thank you for the responses, but I think I figured out the reason for the 800GB of "Non DFS Used": the space was being used by another user who has about 750GB of data files.
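For anyone hitting the same thing: tracking this down is just a matter of comparing disk usage outside the DataNode's block directory. A rough sketch (on a real node you would run `du` against the actual mount, e.g. `/home/hadoop`; the demo below builds a throwaway directory with made-up names like `other_user` so the commands are safe to run anywhere):

```shell
# Sketch: find which top-level directory is eating the "Non DFS Used"
# space. On a real node: du -sk /home/hadoop/* | sort -rn | head
# The demo directory and file names below are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/other_user" "$demo/hadoop_storage"
dd if=/dev/zero of="$demo/other_user/data.bin" bs=1024 count=200 2>/dev/null
dd if=/dev/zero of="$demo/hadoop_storage/blk.bin" bs=1024 count=20 2>/dev/null

# Largest directory first; in my case this pointed straight at the
# other user's files rather than anything Hadoop owned.
top=$(du -sk "$demo"/* | sort -rn | head -n 1)
echo "$top"
rm -rf "$demo"
```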
So after all, I guess "Non DFS Used" has nothing to do with Hadoop, except perhaps for the occasional unremoved task files or log files.

2010/7/7 Michael Segel <[email protected]>

> Non DFS used tends to be logging or some other information on the disk.
> So you can't use hadoop commands to remove the files from the disk.
>
> > Date: Wed, 7 Jul 2010 17:11:38 +0900
> > Subject: How do I remove "Non DFS Used"?
> > From: [email protected]
> > To: [email protected]
> >
> > I was looking at the web interface and found that some of my nodes
> > have an enormous amount of "Non DFS Used".
> >
> > There is even a node with 800GB of "Non DFS Used", which is just
> > ridiculous.
> >
> > I tried to remove them by doing:
> >
> > hadoop namenode -format
> >
> > and I also tried deleting "hadoop.tmp.dir" (in my case,
> > /home/hadoop/hadoop_storage/tmp/).
> >
> > But when I start my cluster again, there it is again with thousands
> > of gigabytes of "Non DFS Used".
> >
> > Can anyone tell me what "Non DFS Used" is and how to remove it
> > forever?
> >
> > Thanks in advance.
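One more note for anyone who lands on this thread later: the web UI doesn't measure "Non DFS Used" directly, it derives it from the other figures the DataNode reports, which is why formatting the namenode or clearing hadoop.tmp.dir can't make it go away. A minimal sketch of the arithmetic (the byte values below are made up for illustration, not taken from this thread):

```shell
# "Non DFS Used" is computed, not measured:
#   non_dfs_used = configured_capacity - dfs_used - dfs_remaining
# so anything on the volume that isn't an HDFS block (another user's
# files, logs, stray task/tmp files) shows up in it. Illustrative values:
capacity=1000000000000   # 1 TB configured capacity, in bytes
dfs_used=150000000000    # bytes held by HDFS block files
remaining=50000000000    # bytes the DataNode reports as free
non_dfs=$((capacity - dfs_used - remaining))
echo "$non_dfs"          # the figure the web UI shows as "Non DFS Used"
```

With these numbers the UI would show 800GB of "Non DFS Used" even though HDFS itself only holds 150GB, which matches what I was seeing.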
