I tried to specify "WordCountInputFormat" as the input format. Here is the command line:
  bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml -input inputdata/ -output outputdata -inputformat org.apache.hadoop.mapred.pipes.WordCountInputFormat

The MapReduce job does not seem to actually execute; all I get on screen is:

  08/03/13 13:17:44 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
  08/03/13 13:17:45 INFO mapred.JobClient: Running job: job_200803131137_0004
  08/03/13 13:17:46 INFO mapred.JobClient: map 100% reduce 100%
  08/03/13 13:17:47 INFO mapred.JobClient: Job complete: job_200803131137_0004
  08/03/13 13:17:47 INFO mapred.JobClient: Counters: 0

What could the problem be? In the earlier discussion, Owen said:

  The nopipe example needs more documentation. It assumes that it is run
  with the InputFormat from
  src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java,
  which has a very specific input split format. By running with a
  TextInputFormat, it will send binary bytes as the input split and won't
  work right. The nopipe example should probably be recoded to use
  libhdfs too, but that is more complicated to get running as a unit
  test. Also note that since the C++ example is using local file reads,
  it will only work on a cluster if you have nfs or something working
  across the cluster.

Can anybody give more details? (A sketch of the split format Owen describes follows the quoted thread below.)

2008/3/7, 11 Nov. <[EMAIL PROTECTED]>:
>
> Thanks a lot!
>
> 2008/3/4, Amareshwari Sri Ramadasu <[EMAIL PROTECTED]>:
> >
> > Hi,
> >
> > Here is some discussion on how to run wordcount-nopipe:
> > http://www.nabble.com/pipe-application-error-td13840804.html
> > It probably answers your question.
> >
> > Thanks
> >
> > Amareshwari
> >
> > 11 Nov. wrote:
> > > I traced into the C++ record reader code:
> > >
> > >   WordCountReader(HadoopPipes::MapContext& context) {
> > >     std::string filename;
> > >     HadoopUtils::StringInStream stream(context.getInputSplit());
> > >     HadoopUtils::deserializeString(filename, stream);
> > >     struct stat statResult;
> > >     stat(filename.c_str(), &statResult);
> > >     bytesTotal = statResult.st_size;
> > >     bytesRead = 0;
> > >     cout << filename << endl;
> > >     file = fopen(filename.c_str(), "rt");
> > >     HADOOP_ASSERT(file != NULL, "failed to open " + filename);
> > >   }
> > >
> > > I got nothing in the filename variable, which shows that the
> > > InputSplit is empty.
> > >
> > > 2008/3/4, 11 Nov. <[EMAIL PROTECTED]>:
> > >
> > >> Hi colleagues,
> > >> I have set up a single-node cluster to test the pipes examples.
> > >> wordcount-simple and wordcount-part work just fine, but
> > >> wordcount-nopipe won't run. Here is my command line:
> > >>
> > >>   bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml
> > >>   -input input/ -output out-dir-nopipe1
> > >>
> > >> and here is the error message printed on my console:
> > >>
> > >>   08/03/03 23:23:06 WARN mapred.JobClient: No job jar file set. User
> > >>   classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> > >>   08/03/03 23:23:06 INFO mapred.FileInputFormat: Total input paths to process : 1
> > >>   08/03/03 23:23:07 INFO mapred.JobClient: Running job: job_200803032218_0004
> > >>   08/03/03 23:23:08 INFO mapred.JobClient: map 0% reduce 0%
> > >>   08/03/03 23:23:11 INFO mapred.JobClient: Task Id : task_200803032218_0004_m_000000_0, Status : FAILED
> > >>   java.io.IOException: pipe child exception
> > >>           at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:138)
> > >>           at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:83)
> > >>           at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
> > >>           at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1787)
> > >>   Caused by: java.io.EOFException
> > >>           at java.io.DataInputStream.readByte(DataInputStream.java:250)
> > >>           at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:313)
> > >>           at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:335)
> > >>           at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:112)
> > >>
> > >>   task_200803032218_0004_m_000000_0: Hadoop Pipes Exception: failed to open
> > >>   at /home/hadoop/hadoop-0.15.2-single-cluster/src/examples/pipes/impl/wordcount-nopipe.cc:67
> > >>   in WordCountReader::WordCountReader(HadoopPipes::MapContext&)
> > >>
> > >> Could anybody tell me how to fix this? Any help would be appreciated.
> > >> Thanks a lot!
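For what it's worth, the "very specific input split format" Owen mentions is minimal: the split payload is nothing but the input file name, serialized as a string, which is exactly what the quoted C++ constructor pulls out with HadoopUtils::deserializeString. Below is a sketch of what such a split could look like on the Java side. The class name FileNameSplit is hypothetical (the real split type lives inside src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java); it assumes the old org.apache.hadoop.mapred API and that Text.writeString produces the vint-length-plus-UTF-8-bytes encoding that deserializeString expects.

  import java.io.DataInput;
  import java.io.DataOutput;
  import java.io.IOException;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.InputSplit;

  // Hypothetical split type: the serialized form is just the file name,
  // written as a vint length followed by the UTF-8 bytes, matching what
  // the C++ WordCountReader reads with HadoopUtils::deserializeString.
  public class FileNameSplit implements InputSplit {
    private String filename;

    public FileNameSplit() { }                         // needed for readFields()
    public FileNameSplit(String filename) { this.filename = filename; }

    public void write(DataOutput out) throws IOException {
      Text.writeString(out, filename);                 // vint length + UTF-8 bytes
    }

    public void readFields(DataInput in) throws IOException {
      filename = Text.readString(in);
    }

    public long getLength() { return 0; }              // the C++ reader stat()s the file itself
    public String[] getLocations() { return new String[0]; }
  }

This also makes Owen's TextInputFormat remark concrete: a FileSplit serializes the path plus (if I remember the layout right) start and length offsets as raw longs, so deserializeString on the C++ side misreads those "binary bytes"; and if the split never reaches the child intact, filename comes back empty, which is exactly what the trace above showed.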

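If you want to check what the C++ side actually receives in context.getInputSplit(), you can dump the serialized split bytes on the Java side. A hypothetical little driver, reusing the FileNameSplit sketch above ("/tmp/input/part-0" is just a placeholder path):

  import java.io.ByteArrayOutputStream;
  import java.io.DataOutputStream;

  // Prints the raw bytes of a serialized split; a well-formed split for
  // wordcount-nopipe is a vint length followed by the UTF-8 path bytes.
  public class SplitBytesDemo {
    public static void main(String[] args) throws Exception {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      new FileNameSplit("/tmp/input/part-0").write(new DataOutputStream(buf));
      for (byte b : buf.toByteArray()) {
        System.out.printf("%02x ", b);
      }
      System.out.println();
    }
  }

For the placeholder path this prints 11 (the vint length, 17 in hex) followed by the seventeen ASCII bytes of the path; a dump that does not start with a sane vint length suggests the split came from the wrong InputFormat.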