[
https://issues.apache.org/jira/browse/OOZIE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387721#comment-14387721
]
Rohini Palaniswamy commented on OOZIE-1624:
-------------------------------------------
bq. Should be part of the sharelib check; sharelib should not load any file
whose size is 0.
I am not talking about empty files. Different files can have the same name. A
proper dedup check should compare both file size and checksum, and we don't do
that now. Refer to TEZ-1697 for what is involved in doing a duplicate check.
Having an option to exclude based on the file path lets the user choose what
to exclude in case of duplicates.
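To make the size-plus-checksum point concrete, here is a minimal sketch of such a duplicate check. It is not what Oozie or TEZ-1697 implements: the class name JarFingerprint is hypothetical, SHA-256 is an assumed choice of checksum, and the content is held in memory only to keep the example self-contained. Against HDFS one would more likely compare FileStatus.getLen() and FileSystem.getFileChecksum() instead of reading the bytes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical helper: identifies a jar by name, size and content checksum. */
final class JarFingerprint {
    final String name;
    final long size;
    final byte[] checksum;

    JarFingerprint(String name, byte[] content) {
        this.name = name;
        this.size = content.length;
        this.checksum = sha256(content);
    }

    /** Two jars are duplicates only if name, size and checksum all agree. */
    boolean isDuplicateOf(JarFingerprint other) {
        return name.equals(other.name)
                && size == other.size                 // cheap check first
                && Arrays.equals(checksum, other.checksum);
    }

    private static byte[] sha256(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, two files named json.jar with different bytes are not reported as duplicates, which is the case a name-only check gets wrong.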
bq. Full path is never exposed to user and it can change.
In the normal sharelib mode (not metafile), the name of the sharelib is the
same as the directory structure and is exposed to users. Even with
metafile, the shareliblist command exposes the file paths. hbase.*thrift.*.jar
would match both cases. It is easy to write a generic regex for a file
under a given directory, however many levels deep it is nested.
Checking the file path instead of the file name is a simple change. It does not
need a separate JIRA, and we do not have to complicate things with sharelib tag
names.
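As an illustration of matching on the full path rather than the file name, here is a rough sketch. The class SharelibPathExclusion, the sample paths and the pattern are all made up for the example and are not taken from the attached patches. It also assumes the pattern must match the whole path, which may differ from how the patch applies it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: drops sharelib files whose full path matches a regex. */
final class SharelibPathExclusion {
    private SharelibPathExclusion() {}

    static List<String> filter(List<String> paths, String exclusionRegex) {
        Pattern exclusion = Pattern.compile(exclusionRegex);
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            // matches() requires the pattern to cover the entire path.
            if (!exclusion.matcher(path).matches()) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

For example, the pattern `.*/hive/(.*/)?json-[^/]*\.jar` excludes a json jar under a hive directory at any nesting depth, while a jar with the same name under a pig directory is kept. A name-only pattern could not tell those two apart.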
> Exclusion pattern for sharelib.
> -------------------------------
>
> Key: OOZIE-1624
> URL: https://issues.apache.org/jira/browse/OOZIE-1624
> Project: Oozie
> Issue Type: Sub-task
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1624-V2.patch, OOZIE-1624-v1.patch
>
>
> Sharelib may bring in jars that conflict with user jars.
> Ex. the hive sharelib has json-2.xxxx.jar, whereas some user use cases
> need a higher version of the json jar.
> The user should be able to exclude the sharelib json jar and bring their own version.
> <property>
> <name>oozie.action.sharelib.for.hive.exclusion</name>
> <value>json-\*.jar|abc-*.jar</value>
> </property>