While working on some data loading to a CDH cluster, I noticed that the aforementioned script requires the cluster to contain exactly the following services and nothing else: HDFS, YARN, HIVE, IMPALA, MAPREDUCE, KUDU, HBASE, ZOOKEEPER.
It also requires all services on the cluster to be in the STARTED state. Both requirements are in line with the original prerequisites from IMPALA-4031 <https://issues.apache.org/jira/browse/IMPALA-4031?focusedCommentId=15920517&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15920517>, but they appear too strict to me.

There shouldn't be any reason why the script can't continue on a cluster with extra services, as long as REQUIRED_SERVICES.issubset(services.keys()) holds, should there?

I patched the script locally to allow it to continue, and so far I have not observed any problems. Are there any caveats to having extraneous services?

Vincent
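
For concreteness, the relaxed check asked about above could look roughly like the sketch below. This is not the actual local patch or the script's real code: apart from REQUIRED_SERVICES, issubset and the STARTED state, every name here (check_services, allow_extra, the shape of the services mapping) is hypothetical, and checking the state of only the required services is an assumption.

```python
# Sketch only: everything except REQUIRED_SERVICES and the "STARTED" state
# is a hypothetical name chosen for illustration.
REQUIRED_SERVICES = frozenset([
    "HDFS", "YARN", "HIVE", "IMPALA", "MAPREDUCE", "KUDU", "HBASE", "ZOOKEEPER",
])


def check_services(services, allow_extra=True):
    """Validate a {service_type: state} mapping.

    Returns the sorted list of extra (non-required) services.
    Raises ValueError if a required service is missing or not STARTED,
    or if extras are present while allow_extra is False.
    """
    present = set(services.keys())

    # Relaxed condition: the required set must be a subset of what is present.
    if not REQUIRED_SERVICES.issubset(present):
        missing = sorted(REQUIRED_SERVICES - present)
        raise ValueError("missing required services: %s" % ", ".join(missing))

    extra = sorted(present - REQUIRED_SERVICES)
    if extra and not allow_extra:
        # Strict behaviour: exactly the required services and nothing else.
        raise ValueError("unexpected extra services: %s" % ", ".join(extra))

    # Assumption: only the required services need to be running.
    not_started = sorted(s for s in REQUIRED_SERVICES if services[s] != "STARTED")
    if not_started:
        raise ValueError("services not STARTED: %s" % ", ".join(not_started))

    return extra
```

With allow_extra=False the function behaves like the strict "exact set" check, so the two behaviours can be compared on the same cluster description.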
