detecting stalled daemons?

2009-10-08 Thread james warren
Quick question for the hadoop / linux masters out there: I recently observed a stalled tasktracker daemon on our production cluster, and was wondering if there were common tests to detect failures so that administration tools (e.g. monit) can automatically restart the daemon. The particular
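
One kind of test a tool such as monit could run is an HTTP probe with a hard timeout. The sketch below is only an illustration, not something proposed in the thread: the function name `daemon_responds` is hypothetical, and the URL in the usage note assumes the tasktracker's web interface is on port 50060. A daemon can stall in ways that leave its web server answering, so a probe like this catches only some failures.

```python
import http.client
import urllib.request


def daemon_responds(url, timeout=5.0):
    """Return True if an HTTP GET to `url` gets a 2xx reply within `timeout` seconds.

    Connection refusals, timeouts, HTTP error statuses and malformed
    replies all count as "not responding".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

A wrapper script could call `daemon_responds("http://localhost:50060/")` (assumed address) and exit non-zero on `False`, which a monitoring tool can treat as a restart trigger.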

hadoop startup problem

2009-10-08 Thread asmaa.atef
Hello everyone, I have a problem with Hadoop startup: every time I try to start Hadoop, the name node does not start, and when I tried to stop the name node, it gave an error: no name node to start. I tried formatting the name node and it works well, but now I have data in Hadoop and formatting name node will

Error: INFO ipc.Client: Retrying connect to server: /192.168.100.11:8020. Already tried 0 time(s).

2009-10-08 Thread santosh gandham
Hi, I am new to Hadoop. I just configured it based on the documentation. While I was running the example program wordcount.java, I got errors. When I ran the command $ /bin/hadoop dfs -mkdir santhosh, I got the error 09/10/08 13:30:12 INFO ipc.Client: Retrying connect to server: /
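
A "Retrying connect to server" message usually means the client could not open a TCP connection to the name node address it was configured with. A quick way to separate a network or daemon-down problem from a Hadoop configuration problem is to try the connection directly. This is a minimal sketch; the helper name `can_connect` is hypothetical, and the host and port in the usage note are simply the ones from the subject line.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, DNS failures and timeouts.
        return False
```

For the error in this thread, `can_connect("192.168.100.11", 8020)` returning `False` would suggest that nothing is listening on that address, for example because the name node is not running or is bound to a different interface.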

Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. I'm using the stock Ext3 as the most tested one, but I wonder, has anyone ever tried, or is anyone even using in production these days, another file system, like JFS, XFS or maybe even Ext4? I'm exploring ways to boost the performance of DataNodes, and this seems like one of the possible avenues. Thanks for

Re: hadoop startup problem

2009-10-08 Thread David Howell
It sounds like the name node is crashing on startup. What kind of errors are there in the name node log? On Thu, Oct 8, 2009 at 4:01 AM, asmaa.atef sw_as...@hotmail.com wrote: Hello everyone, I have a problem with Hadoop startup: every time I try to start Hadoop, the name node does not start, and when

Re: Recommended file-system for DataNode

2009-10-08 Thread Jason Venner
I have used XFS pretty extensively; it seemed to be somewhat faster than ext3. The only trouble we had related to some machines running the PAE 32-bit kernels, where the filesystems would lock up. That is an obscure use case, however. Running JBOD with your dfs.data.dir listing a directory on each

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. Thanks for the info. The question is whether XFS performance justifies switching from the more common Ext3. JBOD is a great approach indeed. Regards. 2009/10/8 Jason Venner jason.had...@gmail.com I have used xfs pretty extensively, it seemed to be somewhat faster than ext3. The only

Re: Recommended file-system for DataNode

2009-10-08 Thread Tom Wheeler
As an aside, there's a short article comparing the two in the latest edition of Linux Journal. It was hardly scientific, but the main points were: - XFS is faster than ext3, especially for large files - XFS is currently unsupported on Red Hat Enterprise, but apparently will be soon. On Thu,

Re: Recommended file-system for DataNode

2009-10-08 Thread Jason Venner
Busy datanodes become bound by the metadata lookup times for the directory and inode entries required to open a block. Anything that optimizes that will help substantially. We are thinking of playing with btrfs, and using a small SSD for our file system metadata, and the spinning disks for the

Re: Recommended file-system for DataNode

2009-10-08 Thread paul
Check out the bottom of this page: http://wiki.apache.org/hadoop/DiskSetup noatime is all we've done in our environment. I haven't found it worth the time to optimize further since we're CPU bound in most of our jobs. -paul On Thu, Oct 8, 2009 at 3:26 PM, Stas Oskin stas.os...@gmail.com
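
To confirm that the data disks are actually mounted with noatime, the mount table can be inspected. The sketch below parses text in the format of /proc/mounts (device, mount point, file system type, comma-separated options) and reports which of the given mount points lack the option. The function names and the `/data/1`, `/data/2` mount points in the usage note are hypothetical, and mount points containing escaped spaces are not handled.

```python
def parse_mounts(text):
    """Map each mount point to its set of mount options, given /proc/mounts-style text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        mounts[mount_point] = set(options.split(","))
    return mounts


def missing_noatime(text, mount_points):
    """Return the listed mount points that are mounted without the noatime option."""
    mounts = parse_mounts(text)
    return [m for m in mount_points
            if m in mounts and "noatime" not in mounts[m]]
```

On a Linux data node this could be called as `missing_noatime(open("/proc/mounts").read(), ["/data/1", "/data/2"])`, with the list replaced by the directories' actual mount points.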

Re: Recommended file-system for DataNode

2009-10-08 Thread Tom Wheeler
I've used XFS on Silicon Graphics machines and JFS on AIX systems -- both were quite fast and extremely reliable, though this long predates my use of Hadoop. To your question, I recently came across a blog that compares performance of several Linux filesystems:

Re: Recommended file-system for DataNode

2009-10-08 Thread Edward Capriolo
On Thu, Oct 8, 2009 at 4:00 PM, Jason Venner jason.had...@gmail.com wrote: noatime is absolutely essential; I forgot to mention it because it is automatic for me now. I have a fun story about atime: I have some Solaris machines with ZFS file systems, and I was doing a find on a 6 level

University of Maryland: cloud computing assistant professor position

2009-10-08 Thread Jimmy Lin
FYI---the University of Maryland is seeking an assistant professor in cloud computing. See job description below. College of Information Studies, Maryland's iSchool, University of Maryland, College Park. Assistant Professor in Cloud Computing. The recently-formed Cloud

retrieving sequenceFile Position of Key in mapper task

2009-10-08 Thread ishwar ramani
Hi, I need to get the position of the key being processed in a mapper task. My input file is a sequence file. I tried the Context, but the best I could get was the input split position and the file name. My other option is to start recording the pos in the key value while generating the

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi Jason. Btrfs is cool; I read that it has 10% better performance than any other FS that comes close to it. Can you post the results of any of your findings here? Regards. 2009/10/8 Jason Venner jason.had...@gmail.com Busy datanodes become bound by the metadata lookup times for the directory and

Re: Recommended file-system for DataNode

2009-10-08 Thread Edward Capriolo
On Thu, Oct 8, 2009 at 9:15 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I heard about this option before, but never actually tried it. There is also another option, called relatime, which is described as being more compatible than noatime. Can anyone comment on this? Regards. 2009/10/8

Re: detecting stalled daemons?

2009-10-08 Thread Todd Lipcon
Hi James, This doesn't quite answer your original question, but if you want to help track down these kinds of bugs, you should grab a stack trace next time this happens. You can do this either using jstack from the command line, by visiting /stacks on the HTTP interface, or by sending the process
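
Capturing the /stacks page at the moment a stall is detected can be automated so the evidence is not lost when the daemon is restarted. This is a minimal sketch under assumptions: the helper name `save_stacks` and the output file naming are hypothetical, and the base URL must be the daemon's own web interface address. If the daemon is hung badly enough that its web server no longer answers, the request raises an exception after the timeout and jstack is the remaining option.

```python
import os
import time
import urllib.request


def save_stacks(base_url, out_dir, timeout=10.0):
    """Fetch <base_url>/stacks and save it to a timestamped file; return the file path."""
    url = base_url.rstrip("/") + "/stacks"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    # File name pattern is illustrative only, e.g. stacks-20091008-212000.txt
    path = os.path.join(out_dir, time.strftime("stacks-%Y%m%d-%H%M%S.txt"))
    with open(path, "wb") as f:
        f.write(data)
    return path
```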

Re: detecting stalled daemons?

2009-10-08 Thread Edward Capriolo
On Thu, Oct 8, 2009 at 9:20 PM, Todd Lipcon t...@cloudera.com wrote: Hi James, This doesn't quite answer your original question, but if you want to help track down these kinds of bugs, you should grab a stack trace next time this happens. You can do this either using jstack from the command

Re: Error: INFO ipc.Client: Retrying connect to server: /192.168.100.11:8020. Already tried 0 time(s).

2009-10-08 Thread .ke. sivakumar
Hi Santosh, Check whether all the datanodes are up and running, using the command 'bin/hadoop dfsadmin -report'. On Thu, Oct 8, 2009 at 4:24 AM, santosh gandham santhosh...@gmail.com wrote: Hi, I am new to Hadoop. I just configured it based on the documentation. While I

Re: retrieving sequenceFile Position of Key in mapper task

2009-10-08 Thread Ahad Rana
Hi Ishwar, You can implement a custom MapRunner and retrieve the position from the reader before calling your map function. Be aware, though, that for block-compressed files, the position returned represents the block start position, not the individual record position. Ahad. On Thu, Oct 8, 2009 at

Re: retrieving sequenceFile Position of Key in mapper task

2009-10-08 Thread Ahad Rana
Oops, memory fails me. To correct my previous statement, for block compressed files, getPosition reflects the position in the input stream of the NEXT compressed block of data, so you have to watch for the change in position after reading the key/value to capture a block transition. Ahad. On Thu,
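
The "watch for the change in position" idea can be illustrated with a small simulation. This is not Hadoop code: `FakeBlockReader` is a hypothetical stand-in that mimics the behaviour described above (the reported position jumps to the start of the next block as soon as a block is loaded), and `tag_block_starts` is an illustrative helper. It assumes the position seen just before a jump is the start of the block that was just loaded.

```python
class FakeBlockReader:
    """Hypothetical stand-in for a reader over a block-compressed file."""

    def __init__(self, blocks, first_block_start):
        # blocks: list of (end_offset, [(key, value), ...]); each block is non-empty.
        self._blocks = list(blocks)
        self._buffered = []
        self._pos = first_block_start

    def get_position(self):
        return self._pos

    def next(self):
        if not self._buffered:
            if not self._blocks:
                return None  # end of input
            end_offset, records = self._blocks.pop(0)
            self._buffered = list(records)
            self._pos = end_offset  # position now points at the NEXT block
        return self._buffered.pop(0)


def tag_block_starts(reader):
    """Yield (key, value, block_start) by watching the position change across reads."""
    block_start = None
    while True:
        before = reader.get_position()
        record = reader.next()
        if record is None:
            return
        if reader.get_position() != before:
            # The read crossed into a new block, which began at `before`.
            block_start = before
        key, value = record
        yield key, value, block_start
```

In a custom MapRunner the same comparison would be made around each read from the real reader, with the resulting block start passed to the map function alongside the key and value.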