I tried to specify "WordCountInputFormat" as the input format. Here is the command line:
  bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml -input inputdata/ -output outputdata -inputformat org.apache.hadoop.mapred.pipes.WordCountInputFormat

The MapReduce job does not seem to actually execute; all I get on screen is:

  08/03/13 13:17:44 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
  08/03/13 13:17:45 INFO mapred.JobClient: Running job: job_200803131137_0004
  08/03/13 13:17:46 INFO mapred.JobClient: map 100% reduce 100%
  08/03/13 13:17:47 INFO mapred.JobClient: Job complete: job_200803131137_0004
  08/03/13 13:17:47 INFO mapred.JobClient: Counters: 0

What could the problem be? In the earlier discussion, Owen said:

  The nopipe example needs more documentation. It assumes that it is run
  with the InputFormat from
  src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java,
  which has a very specific input split format. By running with a
  TextInputFormat, it will send binary bytes as the input split and won't
  work right. The nopipe example should probably be recoded to use
  libhdfs too, but that is more complicated to get running as a unit
  test. Also note that since the C++ example is using local file reads,
  it will only work on a cluster if you have nfs or something working
  across the cluster.

Can anybody give more details? (A sketch of the split format Owen describes follows the quoted thread below.)

2008/3/7, 11 Nov. <[EMAIL PROTECTED]>:
>
> Thanks a lot!
>
> 2008/3/4, Amareshwari Sri Ramadasu <[EMAIL PROTECTED]>:
> >
> > Hi,
> >
> > Here is some discussion on how to run wordcount-nopipe:
> > http://www.nabble.com/pipe-application-error-td13840804.html
> > It probably answers your question.
> >
> > Thanks
> >
> > Amareshwari
> >
> > 11 Nov. wrote:
> > > I traced into the C++ record reader code:
> > >
> > >   WordCountReader(HadoopPipes::MapContext& context) {
> > >     std::string filename;
> > >     HadoopUtils::StringInStream stream(context.getInputSplit());
> > >     HadoopUtils::deserializeString(filename, stream);
> > >     struct stat statResult;
> > >     stat(filename.c_str(), &statResult);
> > >     bytesTotal = statResult.st_size;
> > >     bytesRead = 0;
> > >     cout << filename << endl;
> > >     file = fopen(filename.c_str(), "rt");
> > >     HADOOP_ASSERT(file != NULL, "failed to open " + filename);
> > >   }
> > >
> > > I got nothing in the filename variable, which shows that the
> > > InputSplit is empty.
> > >
> > > 2008/3/4, 11 Nov. <[EMAIL PROTECTED]>:
> > >
> > >> Hi colleagues,
> > >> I have set up a single-node cluster to test the pipes examples.
> > >> wordcount-simple and wordcount-part work just fine, but
> > >> wordcount-nopipe won't run. Here is my command line:
> > >>
> > >>   bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml
> > >>   -input input/ -output out-dir-nopipe1
> > >>
> > >> and here is the error message printed on my console:
> > >>
> > >>   08/03/03 23:23:06 WARN mapred.JobClient: No job jar file set. User
> > >>   classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> > >>   08/03/03 23:23:06 INFO mapred.FileInputFormat: Total input paths to process : 1
> > >>   08/03/03 23:23:07 INFO mapred.JobClient: Running job: job_200803032218_0004
> > >>   08/03/03 23:23:08 INFO mapred.JobClient: map 0% reduce 0%
> > >>   08/03/03 23:23:11 INFO mapred.JobClient: Task Id : task_200803032218_0004_m_000000_0, Status : FAILED
> > >>   java.io.IOException: pipe child exception
> > >>           at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:138)
> > >>           at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:83)
> > >>           at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
> > >>           at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1787)
> > >>   Caused by: java.io.EOFException
> > >>           at java.io.DataInputStream.readByte(DataInputStream.java:250)
> > >>           at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:313)
> > >>           at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:335)
> > >>           at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:112)
> > >>
> > >>   task_200803032218_0004_m_000000_0: Hadoop Pipes Exception: failed to open
> > >>   at /home/hadoop/hadoop-0.15.2-single-cluster/src/examples/pipes/impl/wordcount-nopipe.cc:67
> > >>   in WordCountReader::WordCountReader(HadoopPipes::MapContext&)
> > >>
> > >> Could anybody tell me how to fix this? Any help would be appreciated.
> > >> Thanks a lot!
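For what it's worth, the "very specific input split format" Owen mentions is minimal: the split payload is nothing but the input file name, serialized as a string, which is exactly what the quoted C++ constructor pulls out with HadoopUtils::deserializeString. Below is a sketch of what such a split could look like on the Java side. The class name FileNameSplit is hypothetical (the real split type lives inside src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java); it assumes the old org.apache.hadoop.mapred API and that Text.writeString produces the vint-length-plus-UTF-8-bytes encoding that deserializeString expects.

  import java.io.DataInput;
  import java.io.DataOutput;
  import java.io.IOException;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.InputSplit;

  // Hypothetical split type: the serialized form is just the file name,
  // written as a vint length followed by the UTF-8 bytes, matching what
  // the C++ WordCountReader reads with HadoopUtils::deserializeString.
  public class FileNameSplit implements InputSplit {
    private String filename;

    public FileNameSplit() { }                         // needed for readFields()
    public FileNameSplit(String filename) { this.filename = filename; }

    public void write(DataOutput out) throws IOException {
      Text.writeString(out, filename);                 // vint length + UTF-8 bytes
    }

    public void readFields(DataInput in) throws IOException {
      filename = Text.readString(in);
    }

    public long getLength() { return 0; }              // the C++ reader stat()s the file itself
    public String[] getLocations() { return new String[0]; }
  }

This also makes Owen's TextInputFormat remark concrete: a FileSplit serializes the path plus (if I remember the layout right) start and length offsets as raw longs, so deserializeString on the C++ side misreads those "binary bytes"; and if the split never reaches the child intact, filename comes back empty, which is exactly what the trace above showed.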

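If you want to check what the C++ side actually receives in context.getInputSplit(), you can dump the serialized split bytes on the Java side. A hypothetical little driver, reusing the FileNameSplit sketch above ("/tmp/input/part-0" is just a placeholder path):

  import java.io.ByteArrayOutputStream;
  import java.io.DataOutputStream;

  // Prints the raw bytes of a serialized split; a well-formed split for
  // wordcount-nopipe is a vint length followed by the UTF-8 path bytes.
  public class SplitBytesDemo {
    public static void main(String[] args) throws Exception {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      new FileNameSplit("/tmp/input/part-0").write(new DataOutputStream(buf));
      for (byte b : buf.toByteArray()) {
        System.out.printf("%02x ", b);
      }
      System.out.println();
    }
  }

For the placeholder path this prints 11 (the vint length, 17 in hex) followed by the seventeen ASCII bytes of the path; a dump that does not start with a sane vint length suggests the split came from the wrong InputFormat.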