[ 
https://issues.apache.org/jira/browse/OOZIE-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345077#comment-16345077
 ] 

Attila Sasvari commented on OOZIE-3170:
---------------------------------------

[~jphelps] many thanks for reporting this issue and reviewing related parts of 
the codebase. I added you as a contributor to the project and assigned this 
Jira to you. 

I reproduced the NPE you cited. The diag bundle zip is nevertheless generated 
and contains sharelib info, since that part just uses the OozieClient's 
[listShareLib()|https://github.com/apache/oozie/blob/ef6d0af5edeb18fbc0259d1962ac70f8ad7c2a0c/tools/src/main/java/org/apache/oozie/tools/diag/ServerInfoCollector.java#L42]:
{code:java}
$ bin/oozie-diag-bundle-collector.sh -oozie http://localhost:11000/oozie -output /tmp/jobs/
Checking Connection...Done
Using Temporary Directory: /var/folders/9q/f8p_r6gj0wbck49_dc092q_m0000gp/T/1517319232457-0
Getting Sharelib Information...Done
Getting Configuration...Done
Getting OS Environment Variables...Done
Getting Java System Properties...Done
Getting Queue Dump...Done
Getting Thread Dump...Done
Getting Instrumentation...Done
Getting Metrics...Skipping (Metrics are unavailable)
Creating Zip File: /tmp/jobs/oozie-diag-bundle-1517319233190.zip...Done

$ unzip -l /tmp/jobs/oozie-diag-bundle-1517319233190.zip
    68029  01-30-18 14:33   /effective-oozie-site.xml
     9876  01-30-18 14:33   /instrumentation.txt
    38636  01-30-18 14:33   /java-sys-props.txt
     3807  01-30-18 14:33   /os-env-vars.txt
      279  01-30-18 14:33   /queue-dump.txt
    40032  01-30-18 14:33   /sharelib.txt
   102084  01-30-18 14:33   /thread-dump.html{code}
 * In fact, I am not sure all those Oozie services are really needed here to 
collect diagnostic information. If they are not needed, they should not be 
loaded at all.
 * There is also another problem. By default, logs generated by the tool appear 
in the server log if you run the tool from Oozie's home directory. This can be 
very confusing for an admin or anyone else who reviews the Oozie server logs. 
Setting up logging is the responsibility of the 
[XLogService|https://github.com/apache/oozie/blob/ef6d0af5edeb18fbc0259d1962ac70f8ad7c2a0c/core/src/main/java/org/apache/oozie/service/XLogService.java#L145],
which is started via Services.init(). The log location can be controlled by the 
{{oozie.log.dir}} system property (e.g. 
{{export JAVA_PROPERTIES="-Doozie.log.dir=/tmp/"}} before running the tool). We 
should clarify this in the documentation of the tool and/or change the 
code/script so that, by default, logs are put in the directory where the diag 
bundle is generated.
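
As a rough illustration of the second option (changing the code), the sketch 
below defaults {{oozie.log.dir}} to the bundle output directory unless the user 
has already set it. The class and method names are hypothetical and not part of 
the Oozie codebase; it also assumes the property would be set before 
Services.init() starts the XLogService.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of the Oozie codebase. */
final class DiagLogDirSketch {
    // Property name as mentioned in the comment above.
    static final String OOZIE_LOG_DIR = "oozie.log.dir";

    /** Returns the configured log dir if it is non-blank, else the bundle output dir. */
    static String resolveLogDir(String configured, Path outputDir) {
        if (configured != null && !configured.trim().isEmpty()) {
            return configured;
        }
        return outputDir.toAbsolutePath().toString();
    }

    /**
     * Sets oozie.log.dir to the output dir only when the user has not set it.
     * Assumption: this would run before Services.init() is called.
     */
    static void applyDefaultLogDir(Path outputDir) {
        String current = System.getProperty(OOZIE_LOG_DIR);
        System.setProperty(OOZIE_LOG_DIR, resolveLogDir(current, outputDir));
    }

    public static void main(String[] args) {
        // The first argument stands in for the value of the tool's -output option.
        Path outputDir = Paths.get(args.length > 0 ? args[0] : "/tmp/jobs");
        applyDefaultLogDir(outputDir);
        System.out.println(OOZIE_LOG_DIR + "=" + System.getProperty(OOZIE_LOG_DIR));
    }
}
```

An explicit {{-Doozie.log.dir=...}} would still take precedence, so existing 
setups that rely on {{JAVA_PROPERTIES}} should keep working under this approach.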

> Oozie Diagnostic Bundle tool fails with NPE due to missing service class
> ------------------------------------------------------------------------
>
>                 Key: OOZIE-3170
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3170
>             Project: Oozie
>          Issue Type: Bug
>    Affects Versions: 5.0.0b1
>            Reporter: Jason Phelps
>            Priority: Major
>         Attachments: OOZIE-3170-1.patch
>
>
>  
> When I ran the below command after doing a clean build from the main branch
> {code:java}
> bin/oozie-diag-bundle-collector.sh -oozie http://jphelps-60-1.gce.cloudera.com:11000/oozie -output /tmp/jobs/
> {code}
> It will fail with an NPE. I apologize as I did not copy the client error, but 
> the error in oozie.log is below:
> {code:java}
> 2018-01-25 10:53:58,123 ERROR ShareLibService:517 - SERVER[] org.apache.oozie.service.ServiceException: E0104: Could not fully initialize service [org.apache.oozie.service.ShareLibService], Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the 'oozie admin' CLI command to update the sharelib
> org.apache.oozie.service.ServiceException: E0104: Could not fully initialize service [org.apache.oozie.service.ShareLibService], Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the 'oozie admin' CLI command to update the sharelib
>  at org.apache.oozie.service.ShareLibService.init(ShareLibService.java:144)
>  at org.apache.oozie.service.Services.setServiceInternal(Services.java:386)
>  at org.apache.oozie.service.Services.setService(Services.java:372)
>  at org.apache.oozie.service.Services.loadServices(Services.java:304)
>  at org.apache.oozie.service.Services.init(Services.java:212)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.initOozieServices(DiagBundleCollectorDriver.java:153)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.setHadoopConfig(DiagBundleCollectorDriver.java:135)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.run(DiagBundleCollectorDriver.java:56)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.main(DiagBundleCollectorDriver.java:52)
> Caused by: java.lang.NullPointerException
>  at org.apache.oozie.service.ShareLibService.cacheActionKeySharelibConfList(ShareLibService.java:878)
>  at org.apache.oozie.service.ShareLibService.init(ShareLibService.java:132)
>  ... 8 more
> 2018-01-25 10:53:58,130 INFO PartitionDependencyManagerService:520 - SERVER[] PartitionDependencyManagerService initialized. Dependency cache is org.apache.oozie.dependency.hcat.SimpleHCatDependencyCache
> 2018-01-25 10:53:58,131 FATAL Services:514 - SERVER[] Runtime Exception during Services Load. Check your list of {0} or {1}
> java.lang.NullPointerException
>  at org.apache.oozie.service.PartitionDependencyManagerService.init(PartitionDependencyManagerService.java:81)
>  at org.apache.oozie.service.PartitionDependencyManagerService.init(PartitionDependencyManagerService.java:71)
>  at org.apache.oozie.service.Services.setServiceInternal(Services.java:386)
>  at org.apache.oozie.service.Services.setService(Services.java:372)
>  at org.apache.oozie.service.Services.loadServices(Services.java:304)
>  at org.apache.oozie.service.Services.init(Services.java:212)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.initOozieServices(DiagBundleCollectorDriver.java:153)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.setHadoopConfig(DiagBundleCollectorDriver.java:135)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.run(DiagBundleCollectorDriver.java:56)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.main(DiagBundleCollectorDriver.java:52)
> 2018-01-25 10:53:58,132 FATAL Services:514 - SERVER[] E0103: Could not load service classes, null
> org.apache.oozie.service.ServiceException: E0103: Could not load service classes, null
>  at org.apache.oozie.service.Services.loadServices(Services.java:309)
>  at org.apache.oozie.service.Services.init(Services.java:212)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.initOozieServices(DiagBundleCollectorDriver.java:153)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.setHadoopConfig(DiagBundleCollectorDriver.java:135)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.run(DiagBundleCollectorDriver.java:56)
>  at org.apache.oozie.tools.diag.DiagBundleCollectorDriver.main(DiagBundleCollectorDriver.java:52)
> Caused by: java.lang.NullPointerException
>  at org.apache.oozie.service.PartitionDependencyManagerService.init(PartitionDependencyManagerService.java:81)
>  at org.apache.oozie.service.PartitionDependencyManagerService.init(PartitionDependencyManagerService.java:71)
>  at org.apache.oozie.service.Services.setServiceInternal(Services.java:386)
>  at org.apache.oozie.service.Services.setService(Services.java:372)
>  at org.apache.oozie.service.Services.loadServices(Services.java:304)
>  ... 5 more
>  
> {code}
> From my debugging, it looks like it needs the JobsConcurrencyService to run
>  
> [https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/PartitionDependencyManagerService.java#L81]
>  
> {code:java}
> purgeEnabled = Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode();{code}
> But this service is not loaded by the following:
> [https://github.com/apache/oozie/blob/master/tools/src/main/java/org/apache/oozie/tools/diag/DiagBundleCollectorDriver.java#L149]
> {code:java}
> services.getConf().set(Services.CONF_SERVICE_CLASSES,
>         "org.apache.oozie.service.LiteWorkflowAppService,"
>         + "org.apache.oozie.service.SchedulerService,"
>         + "org.apache.oozie.service.HadoopAccessorService,"
>         + "org.apache.oozie.service.ShareLibService");{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
