[
https://issues.apache.org/jira/browse/PIG-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959140#comment-13959140
]
Rohini Palaniswamy commented on PIG-3861:
-----------------------------------------
A few comments:
- Please convert to Set for skipJars, extraJars, etc as well
- The changes to shipToHDFS are very bad: they add FS calls, and the logic is
flimsy if the user passes jars/files via DistributedCache, as in Oozie. Please
revert them and check the conf to see if DistributedCache already has that
file. Take symlinks into account while doing that.
- TestJobControlCompiler.java - Add an assert to check that the jar is in
DistributedCache only once.
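
To illustrate the second point, here is a minimal sketch of a conf-based check
that needs no FS calls. It is not the patch itself. It assumes the cache entries
are available as the comma-separated URI string that the job conf holds (for
example under mapred.cache.files), and it reads "take symlinks into account" as
"ignore the #symlink fragment when comparing". The class and method names
(CacheFileCheck, stripSymlink, alreadyCached) are hypothetical.

```java
final class CacheFileCheck {
    private CacheFileCheck() {}

    // Drop the "#symlink" fragment so that hdfs://nn/lib/a.jar and
    // hdfs://nn/lib/a.jar#a.jar compare as the same cache entry.
    static String stripSymlink(String uri) {
        String s = uri.trim();
        int hash = s.indexOf('#');
        return hash < 0 ? s : s.substring(0, hash);
    }

    // cacheFiles: comma-separated URI list as stored in the conf (assumption).
    // candidate: URI of the file about to be shipped.
    static boolean alreadyCached(String cacheFiles, String candidate) {
        if (cacheFiles == null || cacheFiles.trim().isEmpty()) {
            return false;
        }
        String target = stripSymlink(candidate);
        for (String entry : cacheFiles.split(",")) {
            if (!entry.trim().isEmpty() && stripSymlink(entry).equals(target)) {
                return true;
            }
        }
        return false;
    }
}
```

With something like this, shipToHDFS could skip the copy when alreadyCached
returns true. A real implementation would probably also need to normalize
scheme and authority before comparing, which this sketch does not do.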
> duplicate jars get added to distributed cache
> ---------------------------------------------
>
> Key: PIG-3861
> URL: https://issues.apache.org/jira/browse/PIG-3861
> Project: Pig
> Issue Type: Bug
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Priority: Minor
> Attachments: PIG-3681-1.patch
>
>
> PigContext's scriptJars should handle de-duplication of jars to account for
> script engines, e.g. JythonScriptEngine, that load jars for modules and
> sometimes add the same jar twice.
> Also, JobControlCompiler.shipToHdfs() needs a check against adding the same
> jar more than once under different randomly incremented sub-dirs.
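
The de-duplication described above can be sketched with an insertion-ordered
Set. This is only an illustration, not PigContext's actual code: the class
ScriptJarRegistry and its methods are hypothetical, and jar paths are assumed
to be plain strings that are already in a comparable (normalized) form.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ScriptJarRegistry {
    // LinkedHashSet drops duplicates while keeping first-seen order,
    // so the jars are still shipped in the order they were registered.
    private final Set<String> scriptJars = new LinkedHashSet<>();

    // Returns false when the jar was already registered.
    boolean addScriptJar(String path) {
        return scriptJars.add(path);
    }

    // Defensive copy so callers cannot modify the registry.
    List<String> getScriptJars() {
        return new ArrayList<>(scriptJars);
    }
}
```

A script engine that registers the same jar twice then ends up with a single
entry, and the boolean result lets the caller log or ignore the repeat.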
--
This message was sent by Atlassian JIRA
(v6.2#6252)