[
https://issues.apache.org/jira/browse/HADOOP-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563091#action_12563091
]
Amareshwari Sri Ramadasu commented on HADOOP-2622:
--------------------------------------------------
Currently, the -file option in streaming jobs ships a file or directory
inside the job jar file.
The -cacheFile and -cacheArchive options add files and archives to the
distributed cache.
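As a rough illustration of the difference between the two mechanisms, here is
a stand-alone sketch. It does not use any Hadoop code: the function names
repack_jar_with_file and link_cached_file are hypothetical, the jar is treated
as a plain zip archive, and a local path stands in for a file already placed
in the distributed cache. The first function mimics the unjar, copy, re-jar
sequence; the second mimics exposing a cached file through a symlink.

```python
import os
import shutil
import tempfile
import zipfile


def repack_jar_with_file(jar_path, extra_file, out_jar_path):
    """Job-jar approach: unpack the jar, copy the file in, re-create the jar."""
    with tempfile.TemporaryDirectory() as work_dir:
        with zipfile.ZipFile(jar_path) as jar:
            jar.extractall(work_dir)                      # "unjar"
        shutil.copy(extra_file, work_dir)                 # copy the shipped file
        with zipfile.ZipFile(out_jar_path, "w", zipfile.ZIP_DEFLATED) as out:
            for root, _dirs, files in os.walk(work_dir):  # "jar" again
                for name in files:
                    full = os.path.join(root, name)
                    out.write(full, os.path.relpath(full, work_dir))
    return out_jar_path


def link_cached_file(cached_path, task_dir, link_name):
    """Cache approach: leave the file in place and symlink it into the task dir."""
    link_path = os.path.join(task_dir, link_name)
    os.symlink(os.path.abspath(cached_path), link_path)
    return link_path
```

In this sketch the first approach rewrites the whole archive for every shipped
file, while the second only creates a link, which is the kind of saving the
distributed cache is meant to offer.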
If we rework the -file option to use the distributed cache, we remove the
ability to ship files in the job jar. Is that acceptable, given that the
files would still be distributed through the distributed cache?
If so, should we deprecate the -file option and ask users to use -cacheFile
and -cacheArchive instead?
Thoughts?
> Fix -file option in Streaming to use Distributed Cache
> ------------------------------------------------------
>
> Key: HADOOP-2622
> URL: https://issues.apache.org/jira/browse/HADOOP-2622
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Amareshwari Sri Ramadasu
> Fix For: 0.17.0
>
>
> The -file option works by putting the script into the job's jar file by
> unjar-ing, copying and then jar-ing it again.
> We should rework the -file option to use the DistributedCache and the symlink
> option it provides.