[ 
https://issues.apache.org/jira/browse/HADOOP-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HADOOP-2622:
--------------------------------------------

    Attachment: patch-2622.txt

bq. If users have their own InputFormat, they would have to jar it with the 
streaming jar and use the custom jar, because setInputFormat is done on the 
client side. So passing it via -file does not work. It would be really helpful 
if this were also addressed.

Adding the InputFormat class (along with its package directory hierarchy, if 
any) using -file will work with the current code, provided the jar is added to 
the classpath. 
For example, if a.b.c.MyInputFormat is the input format and the directory 
hierarchy is dir/a/b/c/MyInputFormat.class, then the input format can be added 
to the jar using the following command:
{noformat}
bin/hadoop jar build/contrib/streaming/hadoop-0.17.0-dev-streaming.jar \
  -mapper my.pl -input t.txt -output output \
  -file my.pl -file dir/ \
  -inputformat a.b.c.MyInputFormat
{noformat}
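
As a rough illustration of the layout rule in the example above (this is not 
part of the patch, and the class and method names here are hypothetical), the 
following sketch maps a fully qualified class name to the .class file path 
expected under the directory passed to -file:

```java
public class ClassFileLayout {
    // Hypothetical helper, for illustration only: returns the path at which
    // the class file for className is expected under the given root directory.
    static String expectedClassFile(String root, String className) {
        // Drop a trailing slash so "dir" and "dir/" give the same result.
        String trimmedRoot = root.endsWith("/")
                ? root.substring(0, root.length() - 1)
                : root;
        // Each package component becomes one directory level.
        return trimmedRoot + "/" + className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // Prints dir/a/b/c/MyInputFormat.class
        System.out.println(expectedClassFile("dir/", "a.b.c.MyInputFormat"));
    }
}
```

If the class file is not at that relative path, the class loader will most 
likely be unable to resolve the name given to -inputformat, even when the jar 
is on the classpath.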

Here is a patch that adds the jar file to the classpath.
I tested it by adding an input format, and it worked fine.
Lohit, can you apply this patch and check whether -file works for adding an 
input format?

> Fix -file option in Streaming to use Distributed Cache
> ------------------------------------------------------
>
>                 Key: HADOOP-2622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2622
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.17.0
>
>         Attachments: patch-2622.txt
>
>
> The -file option works by putting the script into the job's jar file by 
> unjar-ing, copying and then jar-ing it again.
> We should rework the -file option to use the DistributedCache and the symlink 
> option it provides.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
