S D wrote:
Thanks for your response. I'm using version 0.19.0 of Hadoop.
I tried your suggestion. Here is the line I use to invoke Hadoop:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar \
-input /user/hadoop/hadoop-input/inputFile.txt \
-output /user/hadoop/hadoop-output \
-
Which version of Hadoop are you using?
You can pass -inputformat
org.apache.hadoop.mapred.lib.NLineInputFormat directly to your streaming job.
You do not need to add the class to your streaming jar.
-Amareshwari
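
For readers following along, the suggestion above might look roughly like the sketch below. The jar path, -input, -output and -inputformat values come from this thread; the /bin/cat mapper, the fallback HADOOP_HOME, and the -jobconf property (which, as far as I know, is the setting NLineInputFormat reads for N in the old mapred API) are assumptions for illustration only. The script just prints the command; drop the echo to actually submit the job.

```shell
#!/bin/sh
# Sketch only: builds and prints a streaming command that uses NLineInputFormat.
# Assumed install location if HADOOP_HOME is not already set.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
STREAMING_JAR="$HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar"

build_cmd() {
  # -mapper /bin/cat is a placeholder mapper (assumption); use your own script.
  # -jobconf sets lines per split (assumed property name; 1 is the default).
  echo hadoop jar "$STREAMING_JAR" \
    -input /user/hadoop/hadoop-input/inputFile.txt \
    -output /user/hadoop/hadoop-output \
    -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat \
    -mapper /bin/cat \
    -jobconf mapred.line.input.format.linespermap=1
}

build_cmd
```

Printing the command first makes it easy to check the quoting and line continuations before anything is submitted to the cluster.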
S D wrote:
Thanks for your response, Amareshwari. I'm unclear on how to take advantage
of NLineInputFormat with Hadoop Streaming. Is the idea that I modify the
streaming jar file (contrib/streaming/hadoop--streaming.jar) to
include the NLineInputFormat class and then pass a command line
configuration param to
You can use NLineInputFormat for this. It puts N lines of input in each
split (N=1 by default).
So, with the default, each map task processes one line.
See
http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html
-Amareshwari
S D wrote:
Hello,
I have a clarifying question