[ 
https://issues.apache.org/jira/browse/OOZIE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504410#comment-13504410
 ] 

Mohammad Kamrul Islam commented on OOZIE-1089:
----------------------------------------------

I was considering two alternative options:
Option 1: Before adding any jar file to the DC (DistributedCache), we can 
check whether a jar with the same filename is already in the DC. If so, we 
skip the addition. This way we avoid duplicate files in the DC.

Option 2: Oozie can collect all the jars in a local data structure (say, a 
HashSet). At the end, Oozie can add those jars (from the HashSet) to the 
classpath.
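
To make the two options concrete, here is a rough Java sketch. It is only an 
illustration, not the attached patch: the class and method names are 
hypothetical, and a plain List<String> stands in for the DC classpath entries 
rather than the real Hadoop API. For Option 2 the sketch assumes 
de-duplication by full path string, and it uses a LinkedHashSet instead of a 
plain HashSet on the assumption that classpath order should be preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Oozie or Hadoop.
public class JarDedupSketch {

    // Returns the last path segment, e.g. "/libs/a.jar" -> "a.jar".
    static String fileName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Option 1: skip the addition when a jar with the same filename
    // is already present in the (simulated) DC entries.
    static boolean addIfAbsent(List<String> dcEntries, String jarPath) {
        String name = fileName(jarPath);
        for (String existing : dcEntries) {
            if (fileName(existing).equals(name)) {
                return false; // duplicate filename, nothing added
            }
        }
        dcEntries.add(jarPath);
        return true;
    }

    // Option 2: collect every jar in a set first, then add the unique
    // entries in one pass at the end. Assumption: duplicates are
    // detected by full path, so the same filename under two different
    // directories is kept twice.
    static List<String> collectThenAdd(List<String> jarPaths) {
        Set<String> unique = new LinkedHashSet<>(jarPaths);
        return new ArrayList<>(unique);
    }
}
```

One difference the sketch brings out: Option 1 as described compares 
filenames, while a set of paths only catches exact path duplicates unless it 
is keyed by filename as well.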

Comments?

 
                
> DistributedCache workaround for Hadoop 2.0.2-alpha
> --------------------------------------------------
>
>                 Key: OOZIE-1089
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1089
>             Project: Oozie
>          Issue Type: Bug
>          Components: workflow
>    Affects Versions: 3.3.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 3.3.0
>
>         Attachments: OOZIE-1089.patch
>
>
> As explained in MAPREDUCE-4820, Hadoop 2.0.2-alpha introduced a duplicate 
> check that exposes a change of behavior in how the distributed cache works 
> in Hadoop 2 (as opposed to Hadoop 1).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
