[ https://issues.apache.org/jira/browse/AMBARI-22628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hurley updated AMBARI-22628:
-------------------------------------
    Description: 
Installing a new cluster can produce values in {{yarn-site.xml}} that have 
{{None}} in the Spark classpath:

{code:xml}
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/usr/hdp/None/spark2/aux/*</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
  <value>/usr/hdp/None/spark/aux/*</value>
</property>

<property>
  <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
  <value>/usr/hdp/None/spark/hdpLib/*</value>
</property>
{code}
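
To check whether a given host is affected, the file can be scanned for values that still contain the literal {{None}} path segment. The following is a minimal sketch, not part of Ambari: the function name {{find_unresolved_values}} and the {{SAMPLE}} document are hypothetical, and the second sample property carries an illustrative value only.

```python
import xml.etree.ElementTree as ET


def find_unresolved_values(xml_text, marker="/None/"):
    """Return (name, value) pairs whose value contains the marker segment."""
    root = ET.fromstring(xml_text)
    hits = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if marker in value:
            hits.append((name, value))
    return hits


# Hypothetical sample: one property copied from this issue, one healthy one.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
    <value>/usr/hdp/None/spark2/aux/*</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark2_shuffle</value>
  </property>
</configuration>"""

for name, value in find_unresolved_values(SAMPLE):
    print(name, "->", value)
```

In practice the text would be read from the {{yarn-site.xml}} on the client host rather than from an inline string.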

This happens because YARN clients on hosts without daemons never receive a 
restart command after the initial {{yarn-site.xml}} is written, so the correct 
values are never filled in. This causes problems when jobs are run on these nodes:

{code}
2017-12-04 10:16:41,789 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED; cause: java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
{code}
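
The link between the {{None}} path and the {{ClassNotFoundException}} can be illustrated with a short sketch. It rests on an assumption this issue does not confirm: that the classpath is rendered by Python string formatting with a version variable that is still unset. The template, version string, and jar name below are all hypothetical.

```python
import glob
import os
import tempfile

# Hypothetical template; the real one used by the stack is not shown in this issue.
TEMPLATE = "{root}/{version}/spark2/aux/*"


def render_classpath(root, version):
    """Format the template; an unset (None) version renders as the text 'None'."""
    return TEMPLATE.format(root=root, version=version)


def classpath_matches(pattern):
    """Return the files that the classpath wildcard would pick up."""
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as root:
    # Build a fake install tree with one jar under a made-up version directory.
    aux_dir = os.path.join(root, "1.2.3.4-5", "spark2", "aux")
    os.makedirs(aux_dir)
    open(os.path.join(aux_dir, "shuffle-example.jar"), "w").close()

    bad = render_classpath(root, None)
    good = render_classpath(root, "1.2.3.4-5")

    print("unset version matches:", len(classpath_matches(bad)), "jar(s)")
    print("real version matches:", len(classpath_matches(good)), "jar(s)")
```

With the version unset, the wildcard points at a directory that does not exist, so no jar is found and the shuffle service class cannot be loaded.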

  was:
Downloaded client configs have invalid values for spark properties in 
yarn-site.xml.

Issue: the spark_version variable is replaced by 'None' in the Spark-related 
config properties of yarn-site.xml in the downloaded client configs.

Attaching downloaded yarn-site.xml
 [^yarn-site.xml] 

Properties with issue:

{code:xml}
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/usr/hdp/None/spark2/aux/*</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
  <value>/usr/hdp/None/spark/aux/*</value>
</property>

<property>
  <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
  <value>/usr/hdp/None/spark/hdpLib/*</value>
</property>
{code}

This happens because YARN clients on hosts without daemons never receive a 
restart command after the initial {{yarn-site.xml}} is written, so the correct 
values are never filled in.


> YARN Shuffle Service Can't Be Found On Client-Only Nodes After New Cluster 
> Install
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-22628
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22628
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.6.1
>            Reporter: Kishor Ramakrishnan
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.6.1
>
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
