[ 
https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977171#comment-14977171
 ] 

Purshotam Shah commented on OOZIE-2347:
---------------------------------------

We noticed an issue with this patch in production.
It happens when
{{oozie.service.HadoopAccessorService.action.configurations.load.default.resources}}
is set to true (the default is false): some files are added to the distributed
cache with a configuration that doesn't contain the default confs.

This happens only for Pig and Hive jobs, as they override the creation of the jobconf.

At runtime we see ErrorCode [JA009], Message [JA009:
java.lang.IllegalArgumentException: Failed to specify server's Kerberos
principal name].
I am doing a quick fix to resolve the issue. We also noticed that the
JavaActionExecutor sharelib code is convoluted, repeats a lot of code, and does
some unnecessary work that is expensive; it needs to be rewritten.
I will create a new JIRA for that.
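
To illustrate the failure mode only, here is a minimal sketch that uses java.util.Properties as a stand-in for Hadoop's Configuration. The class name, method name, and principal value are hypothetical, and the key dfs.namenode.kerberos.principal is an assumption about which setting goes missing. A conf built without the defaults has no principal key, so the lookup fails with the same message reported in JA009.

```java
import java.util.Properties;

class MissingDefaultsSketch {
    // Assumed key, named after the HDFS setting; used here only as a stand-in.
    static final String PRINCIPAL_KEY = "dfs.namenode.kerberos.principal";

    // Mimics a client that refuses to connect without a server principal.
    static String serverPrincipal(Properties conf) {
        String principal = conf.getProperty(PRINCIPAL_KEY);
        if (principal == null || principal.isEmpty()) {
            throw new IllegalArgumentException(
                "Failed to specify server's Kerberos principal name");
        }
        return principal;
    }

    public static void main(String[] args) {
        // "Default resources": the values every conf is expected to inherit.
        Properties defaults = new Properties();
        defaults.setProperty(PRINCIPAL_KEY, "hdfs/_HOST@EXAMPLE.COM"); // hypothetical value

        Properties withDefaults = new Properties(defaults); // falls back to defaults
        Properties withoutDefaults = new Properties();      // defaults were never loaded

        System.out.println("with defaults: " + serverPrincipal(withDefaults));
        try {
            serverPrincipal(withoutDefaults);
        } catch (IllegalArgumentException e) {
            System.out.println("without defaults: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the error surfaces far from its cause: the lookup code is fine, and the problem is which conf object was handed to it.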


> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
>                 Key: OOZIE-2347
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2347
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>             Fix For: trunk
>
>         Attachments: OOZIE-2347-V1.patch, OOZIE-2347-V2.patch, 
> amend-OOZIE-2347-V1.patch
>
>
> We noticed that setting up the job sharelib was slow, and one prime reason was
> that many threads were blocked on "java.util.zip.ZipFile.getEntry":
> <0x00000005c0afda68> (a java.util.jar.JarFile): 0 Thread(s) sleeping, 178
> Thread(s) waiting, 1 Thread(s) locking
> There are a lot of places where we call new Configuration()/new JobConf()
> unnecessarily. These calls can easily be removed to improve performance.
> 1.
> Configuration defaultConf = new Configuration(); is executed for every file we
> add to the classpath.
> {code}
> public static void addFileToClassPath(Path file, Configuration conf, 
> FileSystem fs) throws IOException {
>       Configuration defaultConf = new Configuration();
>       XConfiguration.copy(conf, defaultConf);
>       if (fs == null) {
>         // it fails with conf, therefore we pass defaultConf instead
>         fs = file.getFileSystem(defaultConf);
>       }
>       // Hadoop 0.20/1.x.
>       if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
>           // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in
>           // Hadoop 0.20
>           // Refer OOZIE-1806.
>           String filepath = file.toUri().getPath();
>           String classpath = conf.get("mapred.job.classpath.files");
>           conf.set("mapred.job.classpath.files", classpath == null
>               ? filepath
>               : classpath + System.getProperty("path.separator") + filepath);
>           URI uri = fs.makeQualified(file).toUri();
>           DistributedCache.addCacheFile(uri, conf);
>       }
>       else { // Hadoop 0.23/2.x
>           DistributedCache.addFileToClassPath(file, conf, fs);
>       }
>     }
> {code}
> 2.
> sharelib setup also calls new Configuration(), which is not needed.
> {code}
> public Configuration getShareLibConf(String inputKey, Path path) {
>         Configuration conf = new Configuration();
>         if (shareLibConfigMap.containsKey(inputKey)) {
>             conf = shareLibConfigMap.get(inputKey).get(path);
>         }
>         return conf;
>     }
> {code}
> 3.
> CoordActionInputCheckXCommand.checkPath also creates a jobConf every time.
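
The cost pattern described in the three items above can be sketched without Hadoop on the classpath. This is an illustration under stated assumptions, not the change made by the attached patches: CostlyConf is a hypothetical stand-in for Configuration/JobConf, the fsPrefix value is made up, and the sketch ignores whether a shared instance is safe to reuse across threads in Oozie.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ReuseConfSketch {
    /** Hypothetical stand-in for an object that is costly to construct. */
    static final class CostlyConf {
        static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();
        final String fsPrefix = "hdfs://example-nn"; // made-up default value

        CostlyConf() {
            CONSTRUCTIONS.incrementAndGet(); // count every construction
        }
    }

    /** Pattern from item 1: a fresh conf is built for every file. */
    static List<String> qualifyPerCall(List<String> files) {
        List<String> out = new ArrayList<>();
        for (String file : files) {
            CostlyConf scratch = new CostlyConf(); // one construction per file
            out.add(scratch.fsPrefix + file);
        }
        return out;
    }

    /** Alternative: build the conf once and reuse it for all files. */
    static List<String> qualifyReusing(List<String> files) {
        CostlyConf shared = new CostlyConf(); // single construction
        List<String> out = new ArrayList<>();
        for (String file : files) {
            out.add(shared.fsPrefix + file);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("/lib/a.jar", "/lib/b.jar", "/lib/c.jar");

        int start = CostlyConf.CONSTRUCTIONS.get();
        qualifyPerCall(files);
        int perCall = CostlyConf.CONSTRUCTIONS.get() - start;

        start = CostlyConf.CONSTRUCTIONS.get();
        qualifyReusing(files);
        int reused = CostlyConf.CONSTRUCTIONS.get() - start;

        System.out.println("per call: " + perCall + ", reused: " + reused);
    }
}
```

Both methods return the same qualified paths; only the number of constructions differs, which is where the contention on JarFile in the thread dump would come from if each construction reads resources from jars.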



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
