Hi,

I am trying to run a Hadoop Streaming job on a 2-node cluster using a binary C program. The program (which comprises a mapper and a reducer) runs fine as long as I pipe the mapper's output directly into the reducer with a Unix pipe, but when I do the same thing through Hadoop Streaming, the job fails. Please see the error trace below:

java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 36
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:366)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
        at org.apache.hadoop.mapred.Child.main(Child.java:211)

Can anyone tell me what this error is due to? It seems the framework is killing my mapper process in its initial stages: I checked the user logs, and no data was dumped to the stdout stream, which can only mean the process died prematurely. Is this a tuning issue (my mapper and reducer instances are both <4G)? Any inputs would be appreciated.

Thanks,
Amritanshu
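A note for anyone debugging the same failure: "subprocess failed with code 36" reports the mapper binary's own exit status, which normally means the binary exited with status 36 on its own rather than being killed by the framework, so the task's stderr log (under the tasktracker's userlogs directory) and any exit/return path in the C code are the first places to look. Below is a minimal sketch of the stdin/stdout contract a streaming mapper has to honor; the file name and the pass-through logic are illustrative, not taken from the actual program.

/* identity_mapper.c: skeleton of a Hadoop Streaming mapper in C.
   Illustrative sketch only; not the program discussed above. */
#include <stdio.h>

int main(void)
{
    char line[4096];

    /* Streaming delivers input records on stdin, one per line. */
    while (fgets(line, sizeof line, stdin) != NULL) {
        /* Emit key<TAB>value records on stdout; here the record is
           passed through unchanged. */
        if (fputs(line, stdout) == EOF) {
            perror("fputs");
            return 1; /* any nonzero status surfaces in the job log as
                         "subprocess failed with code N" */
        }
    }
    if (ferror(stdin)) {
        perror("fgets");
        return 1;
    }
    return 0; /* streaming treats only exit status 0 as success */
}

Running the binary by hand against a sample of the real task input, with stderr captured to a file, will often reproduce the nonzero exit locally even when a hand-built test file pipes through cleanly.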
From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Friday, March 09, 2012 7:43 PM
To: Amritanshu Shekhar
Cc: 'hdfs-user@hadoop.apache.org'
Subject: Re: Error while using libhdfs C API

On 03/09/2012 07:34 AM, Amritanshu Shekhar wrote:

Hi Marcos,
I figured out the compilation issue. It was due to the error.h header file, which was not used and is not present in the distribution.

There is one small issue, however, while testing an HDFS read. I copied an input file to /user/inputData (the directory can be listed with bin/hadoop dfs -ls /user/inputData). The hdfsExists call fails for this directory, yet it works when I copy the file to /tmp. Is it because HDFS only recognizes /tmp as a valid directory? What directory structure does HDFS recognize by default? If it can be overridden through a conf variable, what would that variable be, and where is it set?

Thanks,
Amritanshu

Awesome, Amritanshu. CC to hdfs-user@hadoop.apache.org.
Please post some details on how you solved the compilation issue, so that it is in the mailing list archives.

About your other issue:
1- Did you check that $HADOOP_USER has access to /user/inputData?
HDFS recognizes the directories you enter in hdfs-site.xml in the dfs.name.dir property (NameNode) and the dfs.data.dir property (DataNode); by default it works under the /tmp directory (not recommended in production). Look at Eugene Ciurana's Refcard called "Deploying Hadoop", where he does an amazing job of explaining some tricky configuration tips in a few pages.

Regards
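A minimal sketch of the hdfsExists check under discussion. One common cause of this exact symptom, offered here as an assumption rather than something confirmed in the thread: if the Hadoop configuration is not on the CLASSPATH when the program runs, hdfsConnect("default", 0) quietly resolves to the local filesystem, where /tmp exists but /user/inputData does not. Only the /user/inputData path below comes from the thread; the rest is illustrative.

/* exists_check.c: sanity-check which filesystem the program is
   actually talking to. Illustrative sketch, not the original code. */
#include <stdio.h>
#include "hdfs.h"

int main(void)
{
    /* "default" resolves fs.default.name from the config found on the
       CLASSPATH. Passing an explicit host and port instead, e.g.
       hdfsConnect("namenode-host", 9000), rules out a silent fallback
       to the local filesystem. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (fs == NULL) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    /* hdfsExists returns 0 when the path exists on the connected filesystem. */
    if (hdfsExists(fs, "/user/inputData") == 0)
        printf("/user/inputData exists\n");
    else
        printf("/user/inputData not found; check which filesystem was connected\n");

    hdfsDisconnect(fs);
    return 0;
}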
From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Wednesday, March 07, 2012 7:36 PM
To: Amritanshu Shekhar
Subject: Re: Error while using libhdfs C API

On 03/07/2012 01:15 AM, Amritanshu Shekhar wrote:

Hi Marcos,
Thanks for the quick reply. I am actually using a gmake build system where the library is linked in as a static library (.a) rather than a shared object. It seems strange, since stderr is a standard symbol that should be resolved. Currently I am using the version that came with the distribution ($HOME/c++/Linux-amd64-64/lib/libhdfs.a). I tried building the library from source, but there were build dependencies that could not be resolved. I tried building $HOME/hadoop/hdfs/src/c++/libhdfs by running:

./configure
./make

I got a lot of dependency errors, so I gave up on that effort. If you happen to have a working application that makes use of libhdfs, please let me know. Any inputs would be welcome, as I have hit a roadblock as far as libhdfs is concerned.

Thanks,
Amritanshu

No, Amritanshu. I don't have any examples of the use of the libhdfs API, but I remember that some folks were using it. Search the mailing list archives (http://www.search-hadoop.com). Can you post the errors you got when you tried to compile the library?

Regards and best wishes

From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Monday, March 05, 2012 6:51 PM
To: hdfs-user@hadoop.apache.org
Cc: Amritanshu Shekhar
Subject: Re: Error while using libhdfs C API

Which platform are you using? Did you update the dynamic linker runtime bindings (ldconfig)?

ldconfig $HOME/hadoop/c++/Linux-amd64/lib

Regards

On 03/06/2012 02:38 AM, Amritanshu Shekhar wrote:

Hi,
I was trying to link the 64-bit libhdfs into my application program, but it seems there is an issue with this library. I get the following error:

Undefined                       first referenced
 symbol                             in file
stderr                          libhdfs.a(hdfs.o)
__errno_location                libhdfs.a(hdfs.o)
ld: fatal: Symbol referencing errors. No output written to ../../bin/sun86/mapreduce
collect2: ld returned 1 exit status

Now I was wondering: is this a common error, and is there an actual issue with the library, or am I getting the error because of an incorrect configuration? I am using the following library: $HOME/hadoop/c++/Linux-amd64-64/lib/libhdfs.a

Thanks,
Amritanshu

--
Marcos Luis Ortíz Valmaseda
Sr. Software Engineer (UCI)
http://marcosluis2186.posterous.com
http://postgresql.uci.cu/blog/38
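The two unresolved symbols, stderr and __errno_location, are glibc internals, so this pattern usually means a Linux-built libhdfs.a is being fed to a non-GNU toolchain or C library; the Solaris-style "ld: fatal: Symbol referencing errors" message and the ../../bin/sun86/ output path hint the same way, though that is a guess from the messages alone. For comparison, a link line of roughly this shape works on Linux for a libhdfs program (the file name refers to the sketch earlier in the thread, and the paths are assumptions based on the 0.20-era layout mentioned here; adjust to your install, and note that libhdfs always needs libjvm):

gcc exists_check.c \
    -I$HOME/hadoop/src/c++/libhdfs \
    -L$HOME/hadoop/c++/Linux-amd64-64/lib -lhdfs \
    -L$JAVA_HOME/jre/lib/amd64/server -ljvm \
    -o exists_check

At run time the program additionally needs LD_LIBRARY_PATH to include the libjvm directory and CLASSPATH to include the Hadoop jars and conf directory.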