-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64579/#review193706
-----------------------------------------------------------


Ship it!

- Dmitro Lisnichenko


On Dec. 13, 2017, 7:12 p.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64579/
> -----------------------------------------------------------
> 
> (Updated Dec. 13, 2017, 7:12 p.m.)
> 
> 
> Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, and Nate Cole.
> 
> 
> Bugs: AMBARI-22644
>     https://issues.apache.org/jira/browse/AMBARI-22644
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> *STR*
> # Deploy HDP-2.6.4.0 cluster with Ambari-2.6.1.0-114
> # Apply HBase patch Upgrade on the cluster (this step is optional)
> # Then apply Spark2 patch Upgrade on the cluster
> # Restart Node Managers
> 
> *Result*
> NM restart fails with the error below:
> ```
> 2017-12-10 07:17:02,559 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
> 2017-12-10 07:17:02,559 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(549)) - Error starting NodeManager
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:197)
>         at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:165)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:348)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:131)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 8 more
> 2017-12-10 07:17:02,562 INFO  nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
> ```
> 
> The Spark properties are being written out correctly, as per AMBARI-22525.
> 
> Initially, we had defined Spark properties for ATS like this:
> ```xml
>   <property>
>     <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
>     <value>{{stack_root}}/${hdp.version}/spark/aux/*</value>
>   </property>
> ```
> 
> When YARN upgrades without Spark, we run into AMBARI-22525. It seems that 
> the shuffle classes are installed as part of the RPM dependencies, but 
> SparkATSPlugin is not.
> 
> So:
> - If we use YARN's version for the Spark classes, then ATS can't find 
> SparkATSPlugin since that is not part of YARN.
> - If we use Spark's version for the classes, then Spark can never upgrade 
> without YARN since NodeManager can't find the new Spark classes. 
> 
> However, it seems like shuffle and ATS use different properties. We changed 
> all 3 properties in AMBARI-22525:
> 
> ```
> yarn.nodemanager.aux-services.spark2_shuffle.classpath
> yarn.nodemanager.aux-services.spark_shuffle.classpath
> yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath
> ```
> 
> It seems like what we need to do is change the Spark shuffle properties 
> back to hdp.version, but leave ATS using the new version, since we're 
> guaranteed to have Spark installed on the ATS machine.
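> 
> A rough sketch of that end state in yarn-site.xml, under stated 
> assumptions: {{spark_version}} is a hypothetical placeholder for whatever 
> Spark-specific version AMBARI-22525 introduced, and the spark2/aux and 
> spark/hdpLib paths are assumed here rather than taken from the diff.
> 
> ```xml
> <!-- Shuffle classpaths go back to the NodeManager's own hdp.version. -->
> <property>
>   <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
>   <value>{{stack_root}}/${hdp.version}/spark/aux/*</value>
> </property>
> <property>
>   <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
>   <!-- Assumed path, mirroring the spark_shuffle value above. -->
>   <value>{{stack_root}}/${hdp.version}/spark2/aux/*</value>
> </property>
> 
> <!-- ATS keeps the Spark-specific version, since Spark is installed there. -->
> <property>
>   <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
>   <!-- {{spark_version}} and hdpLib are hypothetical, for illustration only. -->
>   <value>{{stack_root}}/{{spark_version}}/spark/hdpLib/*</value>
> </property>
> ```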
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py 6b5559cf91 
>   ambari-server/src/main/resources/stacks/HDP/2.5/services/YARN/configuration/yarn-site.xml 29833fbe03 
>   ambari-server/src/main/resources/stacks/HDP/2.6/upgrades/config-upgrade.xml ea0e2cd46b 
> 
> 
> Diff: https://reviews.apache.org/r/64579/diff/2/
> 
> 
> Testing
> -------
> 
> Manual testing on a patched cluster with YARN/Spark
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>
