Hey team,

My Beam pipeline seems to be executing twice. The pipeline's business logic is to create one Elasticsearch index, but since it executes twice, the "spark-submit" command always fails, which in turn fails my automation.
The logs are attached. I am running spark-submit on AWS EMR like this:

spark-submit \
  --deploy-mode cluster \
  --conf spark.executor.extraJavaOptions=-DCLOUD_PLATFORM=AWS \
  --conf spark.driver.extraJavaOptions=-DCLOUD_PLATFORM=AWS \
  --conf spark.yarn.am.waitTime=300s \
  --conf spark.executor.extraClassPath=__app__.jar \
  --driver-memory 8G \
  --num-executors 5 \
  --executor-memory 20G \
  --executor-cores 6 \
  --jars s3://vivek-tests/cloud-dataflow-1.0.jar \
  --name new_user_index_mappings_create_dev \
  --class com.noka.beam.common.pipeline.EMRSparkStartPipeline \
  s3://vivek-tests/cloud-dataflow-1.0.jar \
  --job=new-user-index-mappings-create \
  --dateTime=2020-02-04T00:00:00 \
  --isDev=True \
  --incrementalExport=False

Note: the code had been working as expected (i.e. one run of create-index) on AWS EMR 5.17, but we recently upgraded to AWS EMR 5.29. Does anyone know if something changed in the framework, or am I doing something wrong? Please help!

Thanks,
Vivek
application_1581290593006_0004.log
