Re: Hadoop Streaming Semantics

2009-02-02 Thread Amareshwari Sriramadasu
S D wrote:
Thanks for your response. I'm using version 0.19.0 of Hadoop. I tried your suggestion. Here is the line I use to invoke Hadoop:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar \
  -input /user/hadoop/hadoop-input/inputFile.txt \
  -output /user/hadoop/hadoop-o

Re: Hadoop Streaming Semantics

2009-02-02 Thread S D
Thanks for your response. I'm using version 0.19.0 of Hadoop. I tried your suggestion. Here is the line I use to invoke Hadoop:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar \
  -input /user/hadoop/hadoop-input/inputFile.txt \
  -output /user/hadoop/hadoop-output \
  -

Re: Hadoop Streaming Semantics

2009-02-01 Thread Amareshwari Sriramadasu
Which version of Hadoop are you using? You can pass -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat directly to your streaming job; you do not need to add the class to your streaming jar. -Amareshwari S D wrote: Thanks for your response, Amareshwari. I'm unclear on how to take advantage of
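For illustration, here is a minimal Python sketch that assembles a streaming invocation with the -inputformat option described above. The jar location and the input/output paths follow the ones quoted elsewhere in this thread; the helper name build_streaming_command and the mapper script name my_mapper.py are hypothetical and not taken from the original messages.

```python
import posixpath

# Class named in the reply above; it ships with Hadoop, so it is passed by name only.
NLINE_INPUT_FORMAT = "org.apache.hadoop.mapred.lib.NLineInputFormat"


def build_streaming_command(hadoop_home, input_path, output_path, mapper,
                            version="0.19.0"):
    """Return the argv list for a streaming job that uses NLineInputFormat.

    `mapper` is whatever executable the job should run (hypothetical here).
    """
    jar = posixpath.join(
        hadoop_home, "contrib", "streaming",
        "hadoop-%s-streaming.jar" % version,
    )
    return [
        "hadoop", "jar", jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-inputformat", NLINE_INPUT_FORMAT,
    ]


if __name__ == "__main__":
    cmd = build_streaming_command(
        "/opt/hadoop",                              # assumed HADOOP_HOME
        "/user/hadoop/hadoop-input/inputFile.txt",
        "/user/hadoop/hadoop-output",
        "my_mapper.py",                             # hypothetical mapper script
    )
    print(" ".join(cmd))
```

The resulting list could be handed to subprocess.call on a machine where the hadoop launcher is on the PATH; that step is left out here because it depends on the local installation.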

Re: Hadoop Streaming Semantics

2009-01-30 Thread S D
Thanks for your response, Amareshwari. I'm unclear on how to take advantage of NLineInputFormat with Hadoop Streaming. Is the idea that I modify the streaming jar file (contrib/streaming/hadoop--streaming.jar) to include the NLineInputFormat class and then pass a command-line configuration parameter to

Re: Hadoop Streaming Semantics

2009-01-29 Thread Amareshwari Sriramadasu
You can use NLineInputFormat for this. It treats every N lines of input as one split (N=1 by default), so each map task processes one line. See http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html -Amareshwari S D wrote: Hello, I have a clarifying question
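The splitting behaviour described above can be pictured with a small Python sketch. This is only a model of the idea (N input lines per split, one split per map task), not Hadoop's actual implementation; the function name n_line_splits is hypothetical.

```python
def n_line_splits(lines, n=1):
    """Group input lines into consecutive chunks of at most n lines.

    Each chunk stands in for one input split, that is, the work given to
    one map task. With the default n=1, every line gets its own map task.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [lines[i:i + n] for i in range(0, len(lines), n)]


if __name__ == "__main__":
    records = ["line one", "line two", "line three"]
    for task_id, split in enumerate(n_line_splits(records)):
        print("map task %d gets %r" % (task_id, split))
```

With three input lines and N left at 1, the sketch yields three splits, which matches the "each map task processes one line" behaviour the reply describes.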