sekikn commented on PR #1008:
URL: https://github.com/apache/bigtop/pull/1008#issuecomment-1246519239

   @iwasakims Specifying a class directly (`Class['Hadoop::Init_hdfs']`) assumes that the class is declared somewhere in the applied manifests, so it doesn't work if users try to deploy Spark without Hadoop, as follows.
   
   ```
   $ git diff
   diff --git a/bigtop-deploy/puppet/modules/spark/manifests/init.pp b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   index 06d804e5..3b9440c4 100644
   --- a/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   +++ b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   @@ -63,7 +63,7 @@ class spark {
          hasrestart => true,
          hasstatus  => true,
        }
   -    
   +    Class['Hadoop::Init_hdfs'] -> Class['Spark::Spark_thriftserver']
      }
    
      class client {
   $ ./docker-hadoop.sh -d -C config_rockylinux-8.yaml -r file:///bigtop-home/output -G -k spark -c 1
   
   ...
   
   Error: Could not find resource 'Class[Hadoop::Init_hdfs]' for relationship on 'Class[Spark::Spark_thriftserver]' on node c3c4f6763873.bigtop.apache.org
   ```
   
   Such a situation may be rare, but it's possible if, for example, users want to deploy only Spark onto their existing Hadoop cluster.
   Puppet's resource collector (`Exec<| title == "init hdfs" |>`) can handle such a case, because it simply matches nothing when the resource is absent, so I think it's the better choice here.
   
   ```
   $ git diff
   diff --git a/bigtop-deploy/puppet/modules/spark/manifests/init.pp b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   index 06d804e5..0da9f7f8 100644
   --- a/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   +++ b/bigtop-deploy/puppet/modules/spark/manifests/init.pp
   @@ -63,7 +63,7 @@ class spark {
          hasrestart => true,
          hasstatus  => true,
        }
   -    
   +    Exec<| title == "init hdfs" |> -> Service["spark-thriftserver"]
      }
    
      class client {
   $ ./docker-hadoop.sh -d -C config_rockylinux-8.yaml -r file:///bigtop-home/output -G -k spark -c 1
   
   ...
   
   Notice: /Stage[main]/Spark::Datanucleus/Package[spark-datanucleus]/ensure: created
   Notice: /Stage[main]/Spark::Common/Package[spark-core]/ensure: created
   Notice: /Stage[main]/Spark::Common/File[/etc/spark/conf/spark-env.sh]/content: content changed '{md5}d5faace87add5ae723d31eb6b37e5007' to '{md5}7797eb3bd1be34a7a2d0ebc5bee1c4f4'
   Notice: /Stage[main]/Spark::Common/File[/etc/spark/conf/spark-defaults.conf]/ensure: defined content as '{md5}fc22ef68dcc50913d006da8d4466de4e'
   Notice: /Stage[main]/Spark::Common/File[/etc/spark/conf/log4j.properties]/ensure: defined content as '{md5}7fe3f0915ee2f08d6174bb80518c2b9a'
   Notice: /Stage[main]/Spark::Sparkr/Package[spark-sparkr]/ensure: created
   Notice: /Stage[main]/Spark::Client/Package[spark-python]/ensure: created
   Notice: /Stage[main]/Spark::Client/Package[spark-external]/ensure: created
   Notice: /Stage[main]/Spark::Yarn_shuffle/Package[spark-yarn-shuffle]/ensure: created
   Notice: /Stage[main]/Spark::Spark_thriftserver/Package[spark-thriftserver]/ensure: created
   Notice: /Stage[main]/Spark::Spark_thriftserver/Service[spark-thriftserver]/ensure: ensure changed 'stopped' to 'running'
   Notice: Applied catalog in 1729.48 seconds
   ```
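
   For reference, the difference between the two approaches can be reproduced outside Bigtop with a small standalone manifest. This is only a sketch: the file name `demo.pp` and the `notify` resource are hypothetical stand-ins for the real `spark-thriftserver` service, and nothing named "init hdfs" is declared in it.

   ```puppet
   # demo.pp (hypothetical file name): try it with `puppet apply demo.pp`.
   # The notify resource stands in for Service["spark-thriftserver"].
   notify { 'start thriftserver': }

   # Collector form: no Exec titled "init hdfs" exists in this catalog,
   # so the collector matches nothing and no ordering is added.
   Exec<| title == 'init hdfs' |> -> Notify['start thriftserver']

   # Direct reference form: uncommenting the next line should make the run
   # fail, because the referenced resource is not in the catalog.
   # Exec['init hdfs'] -> Notify['start thriftserver']
   ```

   One thing to keep in mind is that a collector also realizes any matching virtual resources, so the search expression should stay as narrow as the title match used here.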


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.