Does anybody have any updates on this? How can we use our own RecordReader with Hadoop pipes? When I try to print "context.getInputSplit()", I get the filename along with some junk characters, and as a result the file open fails.
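For what it's worth, my understanding is that context.getInputSplit() hands back the raw bytes of the serialized Java InputSplit, not a bare path. With the default FileInputFormat that is a FileSplit: a length-prefixed path string followed by two 8-byte big-endian longs (the start offset and length), which would explain the "junk characters". Below is a sketch of how I would parse it by hand; the two-byte length prefix matches UTF8.writeString as FileSplit used it in the 0.15.x line, but the layout should be verified against FileSplit.write() in your own tree before relying on it:

    #include <stdint.h>
    #include <string>

    // Sketch: pick apart a serialized FileSplit by hand.
    // Assumed layout (0.15.x-era UTF8.writeString):
    //   2-byte big-endian path length | path bytes | 8-byte start | 8-byte length
    std::string parseFileSplit(const std::string& split,
                               int64_t& start, int64_t& length) {
      const unsigned char* p =
          reinterpret_cast<const unsigned char*>(split.data());
      uint16_t pathLen = (p[0] << 8) | p[1];          // 2-byte length prefix
      std::string filename(split.data() + 2, pathLen);
      const unsigned char* q = p + 2 + pathLen;       // the two trailing longs
      start = length = 0;
      for (int i = 0; i < 8; ++i) {
        start  = (start  << 8) | q[i];
        length = (length << 8) | q[8 + i];
      }
      return filename;
    }

Note also that the recovered path may be a full URI (e.g. hdfs://host:9000/...), which fopen() cannot open; the nopipe reader can only read files reachable through the local filesystem.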
Anybody got it working?

Viral.


11 Nov. wrote:
>
> I traced into the C++ record reader code:
>
>   WordCountReader(HadoopPipes::MapContext& context) {
>     std::string filename;
>     HadoopUtils::StringInStream stream(context.getInputSplit());
>     HadoopUtils::deserializeString(filename, stream);
>     struct stat statResult;
>     stat(filename.c_str(), &statResult);
>     bytesTotal = statResult.st_size;
>     bytesRead = 0;
>     cout << filename << endl;
>     file = fopen(filename.c_str(), "rt");
>     HADOOP_ASSERT(file != NULL, "failed to open " + filename);
>   }
>
> I got nothing in the filename variable, which shows that the InputSplit is
> empty.
>
> 2008/3/4, 11 Nov. <nov.eleve...@gmail.com>:
>>
>> hi colleagues,
>> I have set up a single-node cluster to test the pipes examples.
>> wordcount-simple and wordcount-part work just fine, but
>> wordcount-nopipe won't run. Here is my command line:
>>
>>   bin/hadoop pipes -conf src/examples/pipes/conf/word-nopipe.xml \
>>       -input input/ -output out-dir-nopipe1
>>
>> and here is the error message printed on my console:
>>
>>   08/03/03 23:23:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>>   08/03/03 23:23:06 INFO mapred.FileInputFormat: Total input paths to process : 1
>>   08/03/03 23:23:07 INFO mapred.JobClient: Running job: job_200803032218_0004
>>   08/03/03 23:23:08 INFO mapred.JobClient: map 0% reduce 0%
>>   08/03/03 23:23:11 INFO mapred.JobClient: Task Id : task_200803032218_0004_m_000000_0, Status : FAILED
>>   java.io.IOException: pipe child exception
>>           at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:138)
>>           at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:83)
>>           at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
>>           at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1787)
>>   Caused by: java.io.EOFException
>>           at java.io.DataInputStream.readByte(DataInputStream.java:250)
>>           at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:313)
>>           at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:335)
>>           at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:112)
>>
>>   task_200803032218_0004_m_000000_0: Hadoop Pipes Exception: failed to open
>>   at /home/hadoop/hadoop-0.15.2-single-cluster/src/examples/pipes/impl/wordcount-nopipe.cc:67 in
>>   WordCountReader::WordCountReader(HadoopPipes::MapContext&)
>>
>> Could anybody tell me how to fix this? That would be appreciated.
>> Thanks a lot!
>>

--
View this message in context: http://www.nabble.com/Pipes-example-wordcount-nopipe.cc-failed-when-reading-from-input-splits-tp15807856p24084734.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
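A later note for anyone who lands on this thread: wordcount-nopipe supplies its own C++ RecordReader and RecordWriter, so the job conf has to tell the framework not to create the Java ones, and my understanding is that the example was written against the WordCountInputFormat that ships with the pipes test sources (which emits splits the C++ reader can decode) rather than the default TextInputFormat. A word-nopipe.xml along these lines is what I would expect; the property names are from the 0.15-era code and the executable path is a placeholder, so double-check both against your own tree:

    <?xml version="1.0"?>
    <configuration>
      <property>
        <!-- let the C++ binary supply the RecordReader -->
        <name>hadoop.pipes.java.recordreader</name>
        <value>false</value>
      </property>
      <property>
        <!-- let the C++ binary supply the RecordWriter -->
        <name>hadoop.pipes.java.recordwriter</name>
        <value>false</value>
      </property>
      <property>
        <!-- assumption: the test-only input format that pairs with
             wordcount-nopipe in the pipes test sources -->
        <name>mapred.input.format.class</name>
        <value>org.apache.hadoop.mapred.pipes.WordCountInputFormat</value>
      </property>
      <property>
        <!-- placeholder path to the compiled example -->
        <name>hadoop.pipes.executable</name>
        <value>/path/to/wordcount-nopipe</value>
      </property>
    </configuration>

The EOFException in the BinaryProtocol uplink thread above is most likely just the Java side noticing that the C++ child died after the HADOOP_ASSERT fired; the underlying failure is the fopen() on a path the child cannot open.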