koochiswathiTR commented on issue #8984:
URL: https://github.com/apache/hudi/issues/8984#issuecomment-1596933153

   [hadoop@ip-100-66-69-75 a206760-PowerUser2]$ spark-submit --packages 
org.apache.hudi:hudi-utilities-bundle_2.12:0.11.1,org.apache.spark:spark-avro_2.11:2.4.4,org.apache.hudi:hudi-spark3-bundle_2.12:0.11.1
 --verbose --driver-memory 4g --executor-memory 16g --num-executors 8   
--driver-cores 10 --executor-cores 10 --class 
org.apache.hudi.utilities.HoodieCompactor 
/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar  
--table-name novusdoc --base-path s3://a206760-novusdoc-s3-dev-use1/novusdoc 
--mode scheduleandexecute --spark-memory 2g  --hoodie-conf 
hoodie.metadata.enable=false --hoodie-conf 
hoodie.compact.inline.trigger.strategy=NUM_COMMITS  --hoodie-conf 
hoodie.compact.inline.max.delta.commits=5
   2023-06-19T10:26:47.109+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0037454 secs]
      [Parallel Time: 1.6 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 418.9, Avg: 419.0, Max: 419.0, Diff: 0.1]
         [Ext Root Scanning (ms): Min: 0.1, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 
1.8]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
            [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, 
Sum: 0.6]
         [Object Copy (ms): Min: 0.9, Avg: 1.0, Max: 1.1, Diff: 0.3, Sum: 8.1]
         [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
            [Termination Attempts: Min: 1, Avg: 6.9, Max: 12, Diff: 11, Sum: 55]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.1]
         [GC Worker Total (ms): Min: 1.3, Avg: 1.4, Max: 1.4, Diff: 0.1, Sum: 
10.9]
         [GC Worker End (ms): Min: 420.3, Avg: 420.3, Max: 420.3, Diff: 0.0]
      [Code Root Fixup: 0.0 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.1 ms]
      [Other: 2.0 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 1.7 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.0 ms]
      [Eden: 24576.0K(24576.0K)->0.0B(34816.0K) Survivors: 0.0B->3072.0K Heap: 
24576.0K(496.0M)->4071.5K(496.0M)]
    [Times: user=0.01 sys=0.00, real=0.00 secs]
   2023-06-19T10:26:47.455+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0053984 secs]
      [Parallel Time: 2.8 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 764.9, Avg: 765.1, Max: 766.4, Diff: 1.5]
         [Ext Root Scanning (ms): Min: 0.0, Avg: 0.3, Max: 0.9, Diff: 0.9, Sum: 
2.4]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
            [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 1]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.6, Diff: 0.6, 
Sum: 0.7]
         [Object Copy (ms): Min: 0.9, Avg: 1.9, Max: 2.4, Diff: 1.5, Sum: 15.2]
         [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 1.5]
            [Termination Attempts: Min: 1, Avg: 15.1, Max: 28, Diff: 27, Sum: 
121]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.1]
         [GC Worker Total (ms): Min: 1.2, Avg: 2.5, Max: 2.7, Diff: 1.5, Sum: 
19.9]
         [GC Worker End (ms): Min: 767.6, Avg: 767.6, Max: 767.6, Diff: 0.0]
      [Code Root Fixup: 0.0 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 2.4 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 2.0 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.0 ms]
      [Eden: 34816.0K(34816.0K)->0.0B(292.0M) Survivors: 3072.0K->5120.0K Heap: 
39486.1K(496.0M)->7351.0K(496.0M)]
    [Times: user=0.02 sys=0.01, real=0.01 secs]
   Using properties file: /usr/lib/spark/conf/spark-defaults.conf
   Adding default property: 
spark.serializer=org.apache.spark.serializer.KryoSerializer
   Adding default property: 
spark.yarn.appMasterEnv.bigdataEnv=bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801
   Adding default property: spark.sql.warehouse.dir=hdfs:///user/spark/warehouse
   Adding default property: 
spark.yarn.dist.files=/etc/hudi/conf/hudi-defaults.conf
   Adding default property: 
spark.sql.parquet.fs.optimized.committer.optimization-enabled=true
   Adding default property: spark.executorEnv.regionShortName=use1
   Adding default property: 
spark.executor.extraJavaOptions=-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M
   Adding default property: 
spark.history.fs.logDirectory=hdfs:///var/log/spark/apps
   Adding default property: 
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem=2
   Adding default property: 
spark.hadoop.mapreduce.output.fs.optimized.committer.enabled=true
   Adding default property: spark.yarn.appMasterEnv.assetId=a206760
   Adding default property: spark.sql.autoBroadcastJoinThreshold=104857600
   Adding default property: spark.eventLog.enabled=true
   Adding default property: spark.shuffle.service.enabled=false
   Adding default property: 
spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
   Adding default property: spark.emr.default.executor.memory=18971M
   Adding default property: 
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2
   Adding default property: spark.kryoserializer.buffer.max=1024m
   Adding default property: 
spark.yarn.historyServer.address=ip-100-66-69-75.3175.aws-int.thomsonreuters.com:18080
   Adding default property: 
spark.stage.attempt.ignoreOnDecommissionFetchFailure=true
   Adding default property: spark.yarn.appMasterEnv.regionFullName=us-east-1
   Adding default property: spark.yarn.appMasterEnv.regionShortName=use1
   Adding default property: 
spark.storage.decommission.shuffleBlocks.enabled=true
   Adding default property: spark.executorEnv.regionFullName=us-east-1
   Adding default property: spark.rpc.askTimeout=480
   Adding default property: spark.sql.streaming.metricsEnabled=true
   Adding default property: spark.locality.wait=6s
   Adding default property: spark.driver.memory=2048M
   Adding default property: spark.decommission.enabled=true
   Adding default property: spark.files.fetchFailure.unRegisterOutputOnHost=true
   Adding default property: spark.executorEnv.assetId=a206760
   Adding default property: spark.executor.defaultJavaOptions=-verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p' 
-Dfile.encoding=UTF-8
   Adding default property: spark.resourceManager.cleanupExpiredHost=true
   Adding default property: spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS=$(hostname 
-f)
   Adding default property: 
spark.sql.emr.internal.extensions=com.amazonaws.emr.spark.EmrSparkSessionExtensions
   Adding default property: spark.emr.default.executor.cores=4
   Adding default property: 
spark.driver.extraJavaOptions=-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M
   Adding default property: 
spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds=2000
   Adding default property: spark.deploy.mode=cluster
   Adding default property: spark.master=yarn
   Adding default property: 
spark.sql.parquet.output.committer.class=com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter
   Adding default property: spark.rpc.message.maxSize=416
   Adding default property: spark.driver.defaultJavaOptions=-verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -Dfile.encoding=UTF-8
   Adding default property: 
spark.executorEnv.correlationId=offline_compaction_schedule
   Adding default property: spark.blacklist.decommissioning.timeout=1h
   Adding default property: 
spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
   Adding default property: fs.s3.maxRetries=1000000
   Adding default property: 
spark.sql.hive.metastore.sharedPrefixes=com.amazonaws.services.dynamodbv2
   Adding default property: spark.executor.memory=18971M
   Adding default property: 
spark.driver.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar
   Adding default property: spark.eventLog.dir=hdfs:///var/log/spark/apps
   Adding default property: 
spark.executorEnv.bigdataEnv=bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801
   Adding default property: spark.dynamicAllocation.enabled=false
   Adding default property: 
spark.executor.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar
   Adding default property: spark.executor.cores=4
   Adding default property: spark.history.ui.port=18080
   Adding default property: spark.blacklist.decommissioning.enabled=true
   Adding default property: 
spark.yarn.appMasterEnv.correlationId=offline_compaction_schedule
   Adding default property: spark.decommissioning.timeout.threshold=20
   Adding default property: spark.yarn.heterogeneousExecutors.enabled=false
   Adding default property: 
spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem=true
   Adding default property: spark.hadoop.yarn.timeline-service.enabled=false
   Adding default property: spark.yarn.executor.memoryOverheadFactor=0.1875
   Warning: Ignoring non-Spark config property: fs.s3.maxRetries
   Parsed arguments:
     master                  yarn
     deployMode              null
     executorMemory          16g
     executorCores           10
     totalExecutorCores      null
     propertiesFile          /usr/lib/spark/conf/spark-defaults.conf
     driverMemory            4g
     driverCores             10
     driverExtraClassPath    
/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar
     driverExtraLibraryPath  
/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
     driverExtraJavaOptions  -Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M
     supervise               false
     queue                   null
     numExecutors            8
     files                   null
     pyFiles                 null
     archives                null
     mainClass               org.apache.hudi.utilities.HoodieCompactor
     primaryResource         
file:/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar
     name                    org.apache.hudi.utilities.HoodieCompactor
     childArgs               [--table-name novusdoc --base-path 
s3://a206760-novusdoc-s3-dev-use1/novusdoc --mode scheduleandexecute 
--spark-memory 2g --hoodie-conf hoodie.metadata.enable=false --hoodie-conf 
hoodie.compact.inline.trigger.strategy=NUM_COMMITS --hoodie-conf 
hoodie.compact.inline.max.delta.commits=5]
     jars                    null
     packages                
org.apache.hudi:hudi-utilities-bundle_2.12:0.11.1,org.apache.spark:spark-avro_2.11:2.4.4,org.apache.hudi:hudi-spark3-bundle_2.12:0.11.1
     packagesExclusions      null
     repositories            null
     verbose                 true
   
   Spark properties used, including those specified through
    --conf and those from the properties file 
/usr/lib/spark/conf/spark-defaults.conf:
     
(spark.sql.emr.internal.extensions,com.amazonaws.emr.spark.EmrSparkSessionExtensions)
     (spark.executor.defaultJavaOptions,-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p' 
-Dfile.encoding=UTF-8)
     (spark.blacklist.decommissioning.timeout,1h)
     (spark.yarn.appMasterEnv.correlationId,offline_compaction_schedule)
     (spark.yarn.executor.memoryOverheadFactor,0.1875)
     (spark.executorEnv.correlationId,offline_compaction_schedule)
     (spark.executorEnv.regionShortName,use1)
     (spark.blacklist.decommissioning.enabled,true)
     
(spark.executor.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native)
     (spark.executorEnv.assetId,a206760)
     (spark.hadoop.yarn.timeline-service.enabled,false)
     (spark.driver.memory,4g)
     (spark.executor.memory,18971M)
     
(spark.executorEnv.bigdataEnv,bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801)
     (spark.sql.parquet.fs.optimized.committer.optimization-enabled,true)
     (spark.sql.warehouse.dir,hdfs:///user/spark/warehouse)
     
(spark.driver.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native)
     
(spark.yarn.historyServer.address,ip-100-66-69-75.3175.aws-int.thomsonreuters.com:18080)
     (spark.yarn.heterogeneousExecutors.enabled,false)
     (spark.rpc.message.maxSize,416)
     (spark.eventLog.enabled,true)
     (spark.storage.decommission.shuffleBlocks.enabled,true)
     (spark.yarn.dist.files,/etc/hudi/conf/hudi-defaults.conf)
     (spark.files.fetchFailure.unRegisterOutputOnHost,true)
     (spark.history.ui.port,18080)
     (spark.stage.attempt.ignoreOnDecommissionFetchFailure,true)
     (spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds,2000)
     (spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS,$(hostname -f))
     (spark.rpc.askTimeout,480)
     (spark.sql.streaming.metricsEnabled,true)
     (spark.driver.defaultJavaOptions,-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Dfile.encoding=UTF-8)
     (spark.serializer,org.apache.spark.serializer.KryoSerializer)
     (spark.executor.extraJavaOptions,-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M)
     (spark.resourceManager.cleanupExpiredHost,true)
     (spark.deploy.mode,cluster)
     (spark.history.fs.logDirectory,hdfs:///var/log/spark/apps)
     (spark.shuffle.service.enabled,false)
     (spark.yarn.appMasterEnv.regionFullName,us-east-1)
     (spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version,2)
     (spark.locality.wait,6s)
     (spark.emr.default.executor.cores,4)
     
(spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem,2)
     (spark.driver.extraJavaOptions,-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M)
     (spark.kryoserializer.buffer.max,1024m)
     (spark.hadoop.mapreduce.output.fs.optimized.committer.enabled,true)
     (spark.yarn.appMasterEnv.regionShortName,use1)
     
(spark.executor.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar)
     (spark.sql.hive.metastore.sharedPrefixes,com.amazonaws.services.dynamodbv2)
     (spark.eventLog.dir,hdfs:///var/log/spark/apps)
     (spark.executorEnv.regionFullName,us-east-1)
     (spark.master,yarn)
     (spark.emr.default.executor.memory,18971M)
     (spark.decommission.enabled,true)
     (spark.dynamicAllocation.enabled,false)
     (spark.yarn.appMasterEnv.assetId,a206760)
     (spark.sql.autoBroadcastJoinThreshold,104857600)
     
(spark.sql.parquet.output.committer.class,com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter)
     
(spark.yarn.appMasterEnv.bigdataEnv,bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801)
     (spark.executor.cores,4)
     (spark.decommissioning.timeout.threshold,20)
     
(spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem,true)
     
(spark.driver.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar)
   
   
   :: loading settings :: url = 
jar:file:/usr/lib/spark/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
   Ivy Default Cache set to: /home/hadoop/.ivy2/cache
   The jars for the packages stored in: /home/hadoop/.ivy2/jars
   org.apache.hudi#hudi-utilities-bundle_2.12 added as a dependency
   org.apache.spark#spark-avro_2.11 added as a dependency
   org.apache.hudi#hudi-spark3-bundle_2.12 added as a dependency
   :: resolving dependencies :: 
org.apache.spark#spark-submit-parent-1341569f-530d-4afe-a08e-cc9ee2167f5c;1.0
           confs: [default]
           found org.apache.hudi#hudi-utilities-bundle_2.12;0.11.1 in central
           found org.apache.htrace#htrace-core;3.1.0-incubating in central
           found org.apache.spark#spark-avro_2.11;2.4.4 in central
           found org.spark-project.spark#unused;1.0.0 in central
           found org.apache.hudi#hudi-spark3-bundle_2.12;0.11.1 in central
   :: resolution report :: resolve 257ms :: artifacts dl 13ms
           :: modules in use:
           org.apache.htrace#htrace-core;3.1.0-incubating from central in 
[default]
           org.apache.hudi#hudi-spark3-bundle_2.12;0.11.1 from central in 
[default]
           org.apache.hudi#hudi-utilities-bundle_2.12;0.11.1 from central in 
[default]
           org.apache.spark#spark-avro_2.11;2.4.4 from central in [default]
           org.spark-project.spark#unused;1.0.0 from central in [default]
           ---------------------------------------------------------------------
           |                  |            modules            ||   artifacts   |
           |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
           ---------------------------------------------------------------------
           |      default     |   5   |   0   |   0   |   0   ||   5   |   0   |
           ---------------------------------------------------------------------
   :: retrieving :: 
org.apache.spark#spark-submit-parent-1341569f-530d-4afe-a08e-cc9ee2167f5c
           confs: [default]
           0 artifacts copied, 5 already retrieved (0kB/12ms)
   2023-06-19T10:26:48.356+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.util.ShutdownHookManager] [ShutdownHookManager]: Adding 
shutdown hook
   Main class:
   org.apache.hudi.utilities.HoodieCompactor
   Arguments:
   --table-name
   novusdoc
   --base-path
   s3://a206760-novusdoc-s3-dev-use1/novusdoc
   --mode
   scheduleandexecute
   --spark-memory
   2g
   --hoodie-conf
   hoodie.metadata.enable=false
   --hoodie-conf
   hoodie.compact.inline.trigger.strategy=NUM_COMMITS
   --hoodie-conf
   hoodie.compact.inline.max.delta.commits=5
   Spark config:
   (spark.serializer,org.apache.spark.serializer.KryoSerializer)
   
(spark.yarn.appMasterEnv.bigdataEnv,bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801)
   (spark.sql.warehouse.dir,hdfs:///user/spark/warehouse)
   (spark.yarn.dist.files,file:/etc/hudi/conf.dist/hudi-defaults.conf)
   (spark.sql.parquet.fs.optimized.committer.optimization-enabled,true)
   (spark.executorEnv.regionShortName,use1)
   (spark.executor.extraJavaOptions,-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M)
   (spark.history.fs.logDirectory,hdfs:///var/log/spark/apps)
   
(spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem,2)
   (spark.hadoop.mapreduce.output.fs.optimized.committer.enabled,true)
   (spark.yarn.appMasterEnv.assetId,a206760)
   (spark.sql.autoBroadcastJoinThreshold,104857600)
   (spark.eventLog.enabled,true)
   (spark.shuffle.service.enabled,false)
   
(spark.driver.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native)
   (spark.emr.default.executor.memory,18971M)
   
(spark.jars,file:/usr/lib/hudi/hudi-utilities-bundle.jar,file:/usr/lib/hudi/hudi-spark3-bundle_2.12-0.11.0-amzn-0.jar)
   (spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version,2)
   (spark.kryoserializer.buffer.max,1024m)
   
(spark.yarn.historyServer.address,ip-100-66-69-75.3175.aws-int.thomsonreuters.com:18080)
   (spark.stage.attempt.ignoreOnDecommissionFetchFailure,true)
   (spark.yarn.appMasterEnv.regionFullName,us-east-1)
   (spark.yarn.appMasterEnv.regionShortName,use1)
   (spark.app.name,org.apache.hudi.utilities.HoodieCompactor)
   (spark.storage.decommission.shuffleBlocks.enabled,true)
   (spark.executorEnv.regionFullName,us-east-1)
   (spark.rpc.askTimeout,480)
   (spark.sql.streaming.metricsEnabled,true)
   (spark.locality.wait,6s)
   (spark.driver.memory,4g)
   (spark.executor.instances,8)
   (spark.decommission.enabled,true)
   (spark.files.fetchFailure.unRegisterOutputOnHost,true)
   (spark.submit.pyFiles,)
   (spark.executorEnv.assetId,a206760)
   (spark.executor.defaultJavaOptions,-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p' 
-Dfile.encoding=UTF-8)
   (spark.resourceManager.cleanupExpiredHost,true)
   (spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS,$(hostname -f))
   
(spark.sql.emr.internal.extensions,com.amazonaws.emr.spark.EmrSparkSessionExtensions)
   (spark.emr.default.executor.cores,4)
   (spark.driver.extraJavaOptions,-Dcom.amazonaws.sdk.disableCbor=true 
-Duser.timezone=GMT -verbose:gc -XX:+UseG1GC -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:MetaspaceSize=300M)
   (spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds,2000)
   (spark.submit.deployMode,client)
   (spark.deploy.mode,cluster)
   (spark.master,yarn)
   
(spark.sql.parquet.output.committer.class,com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter)
   (spark.rpc.message.maxSize,416)
   (spark.driver.defaultJavaOptions,-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Dfile.encoding=UTF-8)
   (spark.executorEnv.correlationId,offline_compaction_schedule)
   (spark.blacklist.decommissioning.timeout,1h)
   
(spark.executor.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native)
   (spark.sql.hive.metastore.sharedPrefixes,com.amazonaws.services.dynamodbv2)
   (spark.executor.memory,16g)
   
(spark.driver.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar)
   (spark.eventLog.dir,hdfs:///var/log/spark/apps)
   
(spark.executorEnv.bigdataEnv,bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801)
   (spark.dynamicAllocation.enabled,false)
   
(spark.executor.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar)
   (spark.executor.cores,10)
   (spark.history.ui.port,18080)
   
(spark.repl.local.jars,file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar,file:///home/hadoop/.ivy2/jars/org.apache.spark_spark-avro_2.11-2.4.4.jar,file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar,file:///home/hadoop/.ivy2/jars/org.apache.htrace_htrace-core-3.1.0-incubating.jar,file:///home/hadoop/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar)
   (spark.blacklist.decommissioning.enabled,true)
   (spark.yarn.appMasterEnv.correlationId,offline_compaction_schedule)
   (spark.decommissioning.timeout.threshold,20)
   (spark.yarn.heterogeneousExecutors.enabled,false)
   
(spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem,true)
   
(spark.yarn.dist.jars,file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar,file:///home/hadoop/.ivy2/jars/org.apache.spark_spark-avro_2.11-2.4.4.jar,file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar,file:///home/hadoop/.ivy2/jars/org.apache.htrace_htrace-core-3.1.0-incubating.jar,file:///home/hadoop/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar)
   (spark.hadoop.yarn.timeline-service.enabled,false)
   (spark.yarn.executor.memoryOverheadFactor,0.1875)
   Classpath elements:
   
file:/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar
   
file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar
   file:///home/hadoop/.ivy2/jars/org.apache.spark_spark-avro_2.11-2.4.4.jar
   
file:///home/hadoop/.ivy2/jars/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar
   
file:///home/hadoop/.ivy2/jars/org.apache.htrace_htrace-core-3.1.0-incubating.jar
   file:///home/hadoop/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar
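   Editor's note on the dependency list above: it mixes a Scala 2.11 / Spark 2.4 artifact (`org.apache.spark_spark-avro_2.11-2.4.4.jar`) with Scala 2.12 / Spark 3 bundles, while the cluster later reports Spark 3.2.1. This is a guess at a cleanup, not a confirmed fix: if spark-avro is needed at all here, the coordinates would have to match the cluster's Spark/Scala build.

   ```shell
   # Illustrative --packages fragment only: align spark-avro with the
   # cluster's Spark 3.2.1 / Scala 2.12 build instead of spark-avro_2.11:2.4.4.
   --packages org.apache.hudi:hudi-utilities-bundle_2.12:0.11.1,org.apache.spark:spark-avro_2.12:3.2.1
   ```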
   
   
   2023-06-19T10:26:48.653+0000 [WARN] [offline_compaction_schedule] 
[org.apache.spark.util.DependencyUtils] [DependencyUtils]: Local jar 
/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar 
does not exist, skipping.
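   Editor's note on the WARN above: `spark-submit` accepts exactly one primary application jar, so the comma-joined value `/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar` is treated as a single, nonexistent path and skipped; comma-separated lists are only valid for `--jars`. A minimal corrected shape, assuming those paths exist on the node (the `--hoodie-conf` and memory flags from the original command would be carried over unchanged):

   ```shell
   # Sketch: one application jar after the options; any additional local
   # jar is shipped via --jars, where a comma-separated list is valid.
   spark-submit \
     --class org.apache.hudi.utilities.HoodieCompactor \
     --jars /usr/lib/hudi/hudi-spark-bundle.jar \
     /usr/lib/hudi/hudi-utilities-bundle.jar \
     --table-name novusdoc \
     --base-path s3://a206760-novusdoc-s3-dev-use1/novusdoc \
     --mode scheduleandexecute
   ```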
   2023-06-19T10:26:48.759+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Running Spark version 
3.2.1-amzn-0
   2023-06-19T10:26:48.783+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceUtils] [ResourceUtils]: 
==============================================================
   2023-06-19T10:26:48.783+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceUtils] [ResourceUtils]: No custom resources 
configured for spark.driver.
   2023-06-19T10:26:48.784+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceUtils] [ResourceUtils]: 
==============================================================
   2023-06-19T10:26:48.784+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Submitted application: 
compactor-novusdoc
   2023-06-19T10:26:48.810+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceProfile] [ResourceProfile]: Default 
ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 
10, script: , vendor: , memory -> name: memory, amount: 2048, script: , vendor: 
, offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: 
Map(cpus -> name: cpus, amount: 1.0)
   2023-06-19T10:26:48.824+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceProfile] [ResourceProfile]: Limiting 
resource is cpus at 10 tasks per executor
   2023-06-19T10:26:48.826+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.resource.ResourceProfileManager] [ResourceProfileManager]: 
Added ResourceProfile id: 0
   2023-06-19T10:26:48.884+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing view acls to: 
hadoop
   2023-06-19T10:26:48.884+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing modify acls to: 
hadoop
   2023-06-19T10:26:48.884+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing view acls groups 
to:
   2023-06-19T10:26:48.885+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing modify acls 
groups to:
   2023-06-19T10:26:48.885+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: SecurityManager: 
authentication disabled; ui acls disabled; users  with view permissions: 
Set(hadoop); groups with view permissions: Set(); users  with modify 
permissions: Set(hadoop); groups with modify permissions: Set()
   2023-06-19T10:26:48.918+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.conf.Configuration.deprecation] [deprecation]: 
mapred.output.compression.codec is deprecated. Instead, use 
mapreduce.output.fileoutputformat.compress.codec
   2023-06-19T10:26:48.918+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.conf.Configuration.deprecation] [deprecation]: 
mapred.output.compression.type is deprecated. Instead, use 
mapreduce.output.fileoutputformat.compress.type
   2023-06-19T10:26:48.919+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.conf.Configuration.deprecation] [deprecation]: 
mapred.output.compress is deprecated. Instead, use 
mapreduce.output.fileoutputformat.compress
   2023-06-19T10:26:49.159+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.network.server.TransportServer] [TransportServer]: Shuffle 
server started on port: 35007
   2023-06-19T10:26:49.168+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.Utils] [Utils]: Successfully started service 
'sparkDriver' on port 35007.
   2023-06-19T10:26:49.177+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.SparkEnv] [SparkEnv]: Using serializer: class 
org.apache.spark.serializer.KryoSerializer
   2023-06-19T10:26:49.196+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkEnv] [SparkEnv]: Registering MapOutputTracker
   2023-06-19T10:26:49.197+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.MapOutputTrackerMasterEndpoint] 
[MapOutputTrackerMasterEndpoint]: init
   2023-06-19T10:26:49.235+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkEnv] [SparkEnv]: Registering BlockManagerMaster
   2023-06-19T10:26:49.300+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkEnv] [SparkEnv]: Registering BlockManagerMasterHeartbeat
   2023-06-19T10:26:49.400+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkEnv] [SparkEnv]: Registering OutputCommitCoordinator
   2023-06-19T10:26:49.404+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.subresultcache.SubResultCacheManager] 
[SubResultCacheManager]: Sub-result caches config to enable false.
   2023-06-19T10:26:49.404+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.subresultcache.SubResultCacheManager] 
[SubResultCacheManager]: Sub-result caches are disabled.
   2023-06-19T10:26:49.423+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Created SSL options for 
ui: SSLOptions{enabled=false, port=None, keyStore=None, keyStorePassword=None, 
trustStore=None, trustStorePassword=None, protocol=None, 
enabledAlgorithms=Set()}
   2023-06-19T10:26:49.504+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.util.log] [log]: Logging initialized @2813ms to 
org.sparkproject.jetty.util.log.Slf4jLog
   2023-06-19T10:26:49.581+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.Server] [Server]: jetty-9.4.43.v20210629; built: 
2021-06-30T11:07:22.254Z; git: 526006ecfa3af7f1a27ef3a288e2bef7ea9dd7e8; jvm 
1.8.0_372-b07
   2023-06-19T10:26:49.606+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.Server] [Server]: Started @2915ms
   2023-06-19T10:26:49.608+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.ui.JettyUtils] [JettyUtils]: Using requestHeaderSize: 8192
   2023-06-19T10:26:49.645+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.AbstractConnector] [AbstractConnector]: Started 
ServerConnector@34dc85a{HTTP/1.1, (http/1.1)}{0.0.0.0:8090}
   2023-06-19T10:26:49.646+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.Utils] [Utils]: Successfully started service 'SparkUI' 
on port 8090.
   2023-06-19T10:26:49.671+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@b8a7e43{/jobs,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.674+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@719843e5{/jobs/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.675+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@58112bc4{/jobs/job,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.676+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@2f5c1332{/jobs/job/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.677+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@7cab1508{/stages,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.678+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@258ee7de{/stages/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.679+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@6d171ce0{/stages/stage,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.680+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@6e1d4137{/stages/stage/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.681+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@29a4f594{/stages/pool,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.682+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@5327a06e{/stages/pool/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.683+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@287f7811{/storage,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.684+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@2b556bb2{/storage/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.684+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@17271176{/storage/rdd,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.685+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@2e34384c{/storage/rdd/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.686+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@1f52eb6f{/environment,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.687+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@58294867{/environment/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.688+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@6fc3e1a4{/executors,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.689+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@2d5f7182{/executors/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.690+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@29ea78b1{/executors/threadDump,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.691+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@7baf6acf{/executors/threadDump/json,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.701+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@7b3315a5{/static,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.702+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@629ae7e{/,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.703+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started o.s.j.s.ServletContextHandler@de88ac6{/api,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.704+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@42fcc7e6{/jobs/job/kill,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.705+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@5da7cee2{/stages/stage/kill,null,AVAILABLE,@Spark}
   2023-06-19T10:26:49.707+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.ui.SparkUI] [SparkUI]: Bound SparkUI to 0.0.0.0, and started 
at http://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8090
   2023-06-19T10:26:49.729+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Added JAR 
file:/usr/lib/hudi/hudi-utilities-bundle.jar at 
spark://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:35007/jars/hudi-utilities-bundle.jar
 with timestamp 1687170408750
   2023-06-19T10:26:49.730+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Added JAR 
file:/usr/lib/hudi/hudi-spark3-bundle_2.12-0.11.0-amzn-0.jar at 
spark://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:35007/jars/hudi-spark3-bundle_2.12-0.11.0-amzn-0.jar
 with timestamp 1687170408750
   2023-06-19T10:26:49.849+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0244707 secs]
      [Parallel Time: 11.2 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 3159.6, Avg: 3159.7, Max: 3159.7, Diff: 
0.1]
         [Ext Root Scanning (ms): Min: 0.7, Avg: 1.5, Max: 4.4, Diff: 3.7, Sum: 
11.7]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 1.0, Max: 2, Diff: 2, Sum: 8]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.5, Max: 1.3, Diff: 1.3, 
Sum: 4.3]
         [Object Copy (ms): Min: 6.6, Avg: 8.9, Max: 9.7, Diff: 3.1, Sum: 71.0]
         [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
            [Termination Attempts: Min: 1, Avg: 128.1, Max: 158, Diff: 157, 
Sum: 1025]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 
0.3]
         [GC Worker Total (ms): Min: 11.0, Avg: 11.0, Max: 11.1, Diff: 0.1, 
Sum: 88.3]
         [GC Worker End (ms): Min: 3170.7, Avg: 3170.7, Max: 3170.7, Diff: 0.0]
      [Code Root Fixup: 0.1 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 13.0 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 12.3 ms]
         [Ref Enq: 0.1 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.3 ms]
      [Eden: 292.0M(292.0M)->0.0B(262.0M) Survivors: 5120.0K->35840.0K Heap: 
299.2M(496.0M)->37864.7K(496.0M)]
    [Times: user=0.09 sys=0.01, real=0.02 secs]
   2023-06-19T10:26:49.974+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.yarn.client.RMProxy] [RMProxy]: Connecting to 
ResourceManager at 
ip-100-66-69-75.3175.aws-int.thomsonreuters.com/100.66.69.75:8032
   2023-06-19T10:26:50.132+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Requesting a new application 
from cluster with 2 NodeManagers
   2023-06-19T10:26:50.432+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.conf.Configuration] [Configuration]: resource-types.xml not 
found
   2023-06-19T10:26:50.432+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.yarn.util.resource.ResourceUtils] [ResourceUtils]: Unable to 
find 'resource-types.xml'.
   2023-06-19T10:26:50.445+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Verifying our application has 
not requested more than the maximum memory capability of the cluster (122880 MB 
per container)
   2023-06-19T10:26:50.445+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Will allocate AM container, 
with 896 MB memory including 384 MB overhead
   2023-06-19T10:26:50.445+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Setting up container launch 
context for our AM
   2023-06-19T10:26:50.446+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Setting up the launch 
environment for our AM container
   2023-06-19T10:26:50.452+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Preparing resources for our AM 
container
   2023-06-19T10:26:50.478+0000 [WARN] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Neither spark.yarn.jars nor 
spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
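   Editor's note: this WARN, together with the ~300 MB `__spark_libs__*.zip` upload that follows it, recurs on every submission. A common mitigation (all paths below are illustrative, not from this cluster) is to stage the Spark runtime on HDFS once and point `spark.yarn.archive` at it:

   ```shell
   # One-time staging of the Spark runtime jars (illustrative paths).
   jar cf spark-libs.zip -C "$SPARK_HOME/jars" .
   hdfs dfs -mkdir -p /apps/spark
   hdfs dfs -put spark-libs.zip /apps/spark/spark-libs.zip
   # Then in spark-defaults.conf (or via --conf on spark-submit):
   #   spark.yarn.archive  hdfs:///apps/spark/spark-libs.zip
   ```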
   2023-06-19T10:26:54.119+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/mnt/tmp/spark-94366315-0ad4-4f1a-8051-1c517b83f435/__spark_libs__4987513252404456461.zip
 -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/__spark_libs__4987513252404456461.zip
   2023-06-19T10:26:54.546+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0166820 secs]
      [Parallel Time: 11.6 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 7856.4, Avg: 7856.7, Max: 7857.8, Diff: 
1.4]
         [Ext Root Scanning (ms): Min: 0.0, Avg: 1.1, Max: 4.5, Diff: 4.5, Sum: 
8.5]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 0.6, Max: 3, Diff: 3, Sum: 5]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.7, Max: 1.6, Diff: 1.6, 
Sum: 5.3]
         [Object Copy (ms): Min: 7.0, Avg: 9.3, Max: 10.5, Diff: 3.5, Sum: 74.6]
         [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
            [Termination Attempts: Min: 1, Avg: 154.9, Max: 198, Diff: 197, 
Sum: 1239]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 
0.3]
         [GC Worker Total (ms): Min: 10.1, Avg: 11.2, Max: 11.5, Diff: 1.4, 
Sum: 89.9]
         [GC Worker End (ms): Min: 7867.9, Avg: 7867.9, Max: 7867.9, Diff: 0.1]
      [Code Root Fixup: 0.2 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 4.7 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 4.1 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.3 ms]
      [Eden: 262.0M(262.0M)->0.0B(262.0M) Survivors: 35840.0K->35840.0K Heap: 
299.0M(496.0M)->37559.0K(496.0M)]
    [Times: user=0.09 sys=0.01, real=0.02 secs]
   2023-06-19T10:26:55.069+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/home/hadoop/.ivy2/jars/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar
 -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar
   2023-06-19T10:26:55.222+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/home/hadoop/.ivy2/jars/org.apache.spark_spark-avro_2.11-2.4.4.jar -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.spark_spark-avro_2.11-2.4.4.jar
   2023-06-19T10:26:55.238+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/home/hadoop/.ivy2/jars/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar 
-> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar
   2023-06-19T10:26:55.239+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0122827 secs]
      [Parallel Time: 11.0 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 8548.8, Avg: 8548.9, Max: 8548.9, Diff: 
0.1]
         [Ext Root Scanning (ms): Min: 0.3, Avg: 0.8, Max: 3.8, Diff: 3.5, Sum: 
6.3]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.6, Max: 1.2, Diff: 1.2, 
Sum: 4.8]
         [Object Copy (ms): Min: 7.0, Avg: 9.3, Max: 10.3, Diff: 3.3, Sum: 74.2]
         [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.4]
            [Termination Attempts: Min: 1, Avg: 137.4, Max: 175, Diff: 174, 
Sum: 1099]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 
0.4]
         [GC Worker Total (ms): Min: 10.8, Avg: 10.9, Max: 10.9, Diff: 0.1, 
Sum: 86.9]
         [GC Worker End (ms): Min: 8559.7, Avg: 8559.7, Max: 8559.8, Diff: 0.1]
      [Code Root Fixup: 0.1 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 1.0 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 0.5 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.2 ms]
      [Eden: 262.0M(262.0M)->0.0B(280.0M) Survivors: 35840.0K->17408.0K Heap: 
298.7M(496.0M)->19127.0K(496.0M)]
    [Times: user=0.09 sys=0.00, real=0.01 secs]
   2023-06-19T10:26:55.407+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/home/hadoop/.ivy2/jars/org.apache.htrace_htrace-core-3.1.0-incubating.jar 
-> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.htrace_htrace-core-3.1.0-incubating.jar
   2023-06-19T10:26:55.426+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/home/hadoop/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/org.spark-project.spark_unused-1.0.0.jar
   2023-06-19T10:26:55.438+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/etc/hudi/conf.dist/hudi-defaults.conf -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/hudi-defaults.conf
   2023-06-19T10:26:55.858+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Creating an archive with the 
config files for distribution at 
/mnt/tmp/spark-94366315-0ad4-4f1a-8051-1c517b83f435/__spark_conf__7322044392243776097.zip.
   2023-06-19T10:26:55.946+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Uploading resource 
file:/mnt/tmp/spark-94366315-0ad4-4f1a-8051-1c517b83f435/__spark_conf__7322044392243776097.zip
 -> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047/__spark_conf__.zip
   2023-06-19T10:26:56.009+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: 
===============================================================================
   2023-06-19T10:26:56.009+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: YARN AM launch context:
   2023-06-19T10:26:56.010+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:     user class: N/A
   2023-06-19T10:26:56.010+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:     env:
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         regionShortName -> use1
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         CLASSPATH -> 
/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/lib/aws-sdk-v2/bundle-2.17.282.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         correlationId -> 
offline_compaction_schedule
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         SPARK_YARN_STAGING_DIR 
-> 
hdfs://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8020/user/hadoop/.sparkStaging/application_1687146322573_0047
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         SPARK_USER -> hadoop
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         regionFullName -> 
us-east-1
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         bigdataEnv -> 
bigdata_environment:dev,bigdata_project:tacticalnovusingest,bigdata_environment-type:DEVELOPMENT,bigdata_region:us-east-1,bigdata_servicename:tactical-novus-ingest,bigdata_version:dev4856801
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         assetId -> a206760
   2023-06-19T10:26:56.011+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         SPARK_PUBLIC_DNS -> 
$(hostname -f)
   2023-06-19T10:26:56.012+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:     resources:
   2023-06-19T10:26:56.063+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         
org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar -> resource { scheme: 
"hdfs" host: "ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.hudi_hudi-utilities-bundle_2.12-0.11.1.jar"
 } size: 62863152 timestamp: 1687170415216 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.064+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         
org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar -> resource { scheme: "hdfs" 
host: "ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.hudi_hudi-spark3-bundle_2.12-0.11.1.jar"
 } size: 61591563 timestamp: 1687170415401 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.064+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         hudi-defaults.conf -> 
resource { scheme: "hdfs" host: 
"ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/hudi-defaults.conf" 
} size: 1410 timestamp: 1687170415845 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.064+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         __spark_libs__ -> 
resource { scheme: "hdfs" host: 
"ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/__spark_libs__4987513252404456461.zip"
 } size: 313860902 timestamp: 1687170415000 type: ARCHIVE visibility: PRIVATE
   2023-06-19T10:26:56.064+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         __spark_conf__ -> 
resource { scheme: "hdfs" host: 
"ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/__spark_conf__.zip" 
} size: 304187 timestamp: 1687170415994 type: ARCHIVE visibility: PRIVATE
   2023-06-19T10:26:56.065+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         
org.apache.spark_spark-avro_2.11-2.4.4.jar -> resource { scheme: "hdfs" host: 
"ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.spark_spark-avro_2.11-2.4.4.jar"
 } size: 187318 timestamp: 1687170415232 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.065+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         
org.apache.htrace_htrace-core-3.1.0-incubating.jar -> resource { scheme: "hdfs" 
host: "ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/org.apache.htrace_htrace-core-3.1.0-incubating.jar"
 } size: 1475955 timestamp: 1687170415420 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.065+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         
org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: 
"ip-100-66-69-75.3175.aws-int.thomsonreuters.com" port: 8020 file: 
"/user/hadoop/.sparkStaging/application_1687146322573_0047/org.spark-project.spark_unused-1.0.0.jar"
 } size: 2777 timestamp: 1687170415433 type: FILE visibility: PRIVATE
   2023-06-19T10:26:56.065+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:     command:
   2023-06-19T10:26:56.066+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:         {{JAVA_HOME}}/bin/java 
-server -Xmx512m -Djava.io.tmpdir={{PWD}}/tmp 
-Dspark.yarn.app.container.log.dir=<LOG_DIR> 
org.apache.spark.deploy.yarn.ExecutorLauncher --arg 
'ip-100-66-69-75.3175.aws-int.thomsonreuters.com:35007' --properties-file 
{{PWD}}/__spark_conf__/__spark_conf__.properties --dist-cache-conf 
{{PWD}}/__spark_conf__/__spark_dist_cache__.properties 1> <LOG_DIR>/stdout 2> 
<LOG_DIR>/stderr
   2023-06-19T10:26:56.066+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: 
===============================================================================
   2023-06-19T10:26:56.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing view acls to: 
hadoop
   2023-06-19T10:26:56.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing modify acls to: 
hadoop
   2023-06-19T10:26:56.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing view acls groups 
to:
   2023-06-19T10:26:56.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: Changing modify acls 
groups to:
   2023-06-19T10:26:56.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: SecurityManager: 
authentication disabled; ui acls disabled; users  with view permissions: 
Set(hadoop); groups with view permissions: Set(); users  with modify 
permissions: Set(hadoop); groups with modify permissions: Set()
   2023-06-19T10:26:56.090+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: AM resources: Map()
   2023-06-19T10:26:56.091+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: spark.yarn.maxAppAttempts is 
not set. Cluster's default value will be used.
   2023-06-19T10:26:56.092+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Created resource capability for 
AM request: <memory:896, max memory:9223372036854775807, vCores:1, max 
vCores:2147483647>
   2023-06-19T10:26:56.093+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Submitting application 
application_1687146322573_0047 to ResourceManager
   2023-06-19T10:26:56.124+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hadoop.yarn.client.api.impl.YarnClientImpl] [YarnClientImpl]: 
Submitted application application_1687146322573_0047
   2023-06-19T10:26:57.127+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Application report for 
application_1687146322573_0047 (state: ACCEPTED)
   2023-06-19T10:26:57.130+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:
            client token: N/A
            diagnostics: AM container is launched, waiting for AM container to 
Register with RM
            ApplicationMaster host: N/A
            ApplicationMaster RPC port: -1
            queue: default
            start time: 1687170416103
            final status: UNDEFINED
            tracking URL: 
http://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:20888/proxy/application_1687146322573_0047/
            user: hadoop
   [... the same "Application report for application_1687146322573_0047 (state: ACCEPTED)" report repeated once per second from 10:26:58 to 10:27:05, identical apart from the timestamp ...]
   2023-06-19T10:27:05.989+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.network.server.TransportServer] [TransportServer]: New 
connection accepted for remote address /100.66.95.167:57800.
   2023-06-19T10:27:06.143+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]: Application report for 
application_1687146322573_0047 (state: RUNNING)
   2023-06-19T10:27:06.143+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.deploy.yarn.Client] [Client]:
            client token: N/A
            diagnostics: N/A
            ApplicationMaster host: 100.66.95.167
            ApplicationMaster RPC port: -1
            queue: default
            start time: 1687170416103
            final status: UNDEFINED
            tracking URL: 
http://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:20888/proxy/application_1687146322573_0047/
            user: hadoop
   2023-06-19T10:27:06.152+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.network.server.TransportServer] [TransportServer]: Shuffle 
server started on port: 32849
   2023-06-19T10:27:06.152+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.Utils] [Utils]: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 32849.
   2023-06-19T10:27:06.152+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.network.netty.NettyBlockTransferService] 
[NettyBlockTransferService]: Server created on 
ip-100-66-69-75.3175.aws-int.thomsonreuters.com:32849
   2023-06-19T10:27:06.301+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.ui.ServerInfo] [ServerInfo]: Adding filter to /metrics/json: 
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
   2023-06-19T10:27:06.303+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.handler.ContextHandler] [ContextHandler]: 
Started 
o.s.j.s.ServletContextHandler@5c134052{/metrics/json,null,AVAILABLE,@Spark}
   2023-06-19T10:27:06.323+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.deploy.history.SingleEventLogFileWriter] 
[SingleEventLogFileWriter]: Logging events to 
hdfs:/var/log/spark/apps/application_1687146322573_0047.inprogress
   2023-06-19T10:27:06.519+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Adding shutdown hook
   2023-06-19T10:27:06.553+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Loading HoodieTableMetaClient from s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:06.772+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0194145 secs]
      [Parallel Time: 13.6 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 20082.0, Avg: 20082.1, Max: 20082.2, Diff: 
0.1]
         [Ext Root Scanning (ms): Min: 0.5, Avg: 1.1, Max: 5.2, Diff: 4.7, Sum: 
9.0]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.7, Max: 1.5, Diff: 1.5, 
Sum: 5.3]
         [Object Copy (ms): Min: 8.3, Avg: 11.5, Max: 12.8, Diff: 4.4, Sum: 
91.9]
         [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8]
            [Termination Attempts: Min: 1, Avg: 243.8, Max: 313, Diff: 312, 
Sum: 1950]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 
0.4]
         [GC Worker Total (ms): Min: 13.4, Avg: 13.5, Max: 13.6, Diff: 0.2, 
Sum: 108.1]
         [GC Worker End (ms): Min: 20095.6, Avg: 20095.6, Max: 20095.6, Diff: 
0.1]
      [Code Root Fixup: 0.2 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 5.4 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 4.8 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.3 ms]
      [Eden: 280.0M(280.0M)->0.0B(262.0M) Survivors: 17408.0K->35840.0K Heap: 
298.7M(496.0M)->37559.0K(496.0M)]
    [Times: user=0.10 sys=0.00, real=0.02 secs]
   2023-06-19T10:27:07.272+0000 [INFO] [offline_compaction_schedule] 
[com.amazon.ws.emr.hadoop.fs.util.ClientConfigurationFactory] 
[ClientConfigurationFactory]: Set initial getObject socket timeout to 2000 ms.
   2023-06-19T10:27:07.546+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0190510 secs]
      [Parallel Time: 15.1 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 20856.5, Avg: 20856.5, Max: 20856.6, Diff: 
0.1]
         [Ext Root Scanning (ms): Min: 0.3, Avg: 1.1, Max: 5.7, Diff: 5.4, Sum: 
9.1]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.3]
         [Code Root Scanning (ms): Min: 0.0, Avg: 0.9, Max: 2.1, Diff: 2.1, 
Sum: 7.6]
         [Object Copy (ms): Min: 9.2, Avg: 12.7, Max: 14.2, Diff: 5.0, Sum: 
101.5]
         [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
            [Termination Attempts: Min: 1, Avg: 191.4, Max: 236, Diff: 235, 
Sum: 1531]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 
0.4]
         [GC Worker Total (ms): Min: 14.9, Avg: 15.0, Max: 15.0, Diff: 0.1, 
Sum: 119.7]
         [GC Worker End (ms): Min: 20871.5, Avg: 20871.5, Max: 20871.5, Diff: 
0.1]
      [Code Root Fixup: 0.2 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 3.6 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 3.0 ms]
         [Ref Enq: 0.0 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.3 ms]
      [Eden: 262.0M(262.0M)->0.0B(266.0M) Survivors: 35840.0K->31744.0K Heap: 
298.7M(496.0M)->33975.0K(496.0M)]
    [Times: user=0.12 sys=0.00, real=0.02 secs]
   2023-06-19T10:27:08.214+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableConfig] [HoodieTableConfig]: Loading 
table properties from 
s3://a206760-novusdoc-s3-dev-use1/novusdoc/.hoodie/hoodie.properties
   2023-06-19T10:27:08.231+0000 [INFO] [offline_compaction_schedule] 
[com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem] [S3NativeFileSystem]: 
Opening 's3://a206760-novusdoc-s3-dev-use1/novusdoc/.hoodie/hoodie.properties' 
for reading
   2023-06-19T10:27:08.367+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) 
from s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:08.367+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Loading Active commit timeline for s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:08.460+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.timeline.HoodieActiveTimeline] 
[HoodieActiveTimeline]: Loaded instants upto : 
Option{val=[20230619102516597__deltacommit__COMPLETED]}
   2023-06-19T10:27:08.473+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.utilities.HoodieCompactor] [HoodieCompactor]: 
HoodieCompactorConfig {
      --base-path s3://a206760-novusdoc-s3-dev-use1/novusdoc,
      --table-name novusdoc,
      --instant-time null,
      --parallelism 200,
      --schema-file null,
      --spark-master null,
      --spark-memory 2g,
      --retry 0,
      --schedule false,
      --mode scheduleandexecute,
      --strategy 
org.apache.hudi.table.action.compact.strategy.LogFileSizeBasedCompactionStrategy,
      --props null,
      --hoodie-conf [hoodie.metadata.enable=false, 
hoodie.compact.inline.trigger.strategy=NUM_COMMITS, 
hoodie.compact.inline.max.delta.commits=5]
   }
   2023-06-19T10:27:08.474+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.utilities.HoodieCompactor] [HoodieCompactor]: Running Mode: 
[scheduleandexecute]
   2023-06-19T10:27:08.474+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.utilities.HoodieCompactor] [HoodieCompactor]: Step 1: Do 
schedule
   2023-06-19T10:27:08.651+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.embedded.EmbeddedTimelineService] 
[EmbeddedTimelineService]: Starting Timeline service !!
   2023-06-19T10:27:08.652+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.embedded.EmbeddedTimelineService] 
[EmbeddedTimelineService]: Overriding hostIp to 
(ip-100-66-69-75.3175.aws-int.thomsonreuters.com) found in spark-conf. It was 
null
   2023-06-19T10:27:08.661+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.view.FileSystemViewManager] 
[FileSystemViewManager]: Creating View Manager with storage type :MEMORY
   2023-06-19T10:27:08.661+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.view.FileSystemViewManager] 
[FileSystemViewManager]: Creating in-memory based Table View
   2023-06-19T10:27:08.671+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.hudi.org.eclipse.jetty.util.log] [log]: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.apache.hudi.org.eclipse.jetty.util.log) 
via org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   2023-06-19T10:27:08.672+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.org.eclipse.jetty.util.log] [log]: Logging initialized 
@21982ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   2023-06-19T10:27:08.723+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.hudi.timeline.service.handlers.MarkerHandler] [MarkerHandler]: 
MarkerHandler FileSystem: s3
   2023-06-19T10:27:08.723+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.hudi.timeline.service.handlers.MarkerHandler] [MarkerHandler]: 
MarkerHandler batching params: batchNumThreads=20 batchIntervalMs=50ms
   2023-06-19T10:27:08.767+0000: [GC pause (G1 Evacuation Pause) (young), 
0.0285504 secs]
      [Parallel Time: 21.1 ms, GC Workers: 8]
         [GC Worker Start (ms): Min: 22077.6, Avg: 22077.7, Max: 22078.6, Diff: 
1.0]
         [Ext Root Scanning (ms): Min: 0.2, Avg: 1.7, Max: 7.8, Diff: 7.6, Sum: 
13.3]
         [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
            [Processed Buffers: Min: 0, Avg: 0.4, Max: 1, Diff: 1, Sum: 3]
         [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
         [Code Root Scanning (ms): Min: 0.0, Avg: 1.0, Max: 2.2, Diff: 2.2, 
Sum: 7.9]
         [Object Copy (ms): Min: 13.1, Avg: 18.0, Max: 19.9, Diff: 6.8, Sum: 
143.7]
         [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.8]
            [Termination Attempts: Min: 1, Avg: 229.8, Max: 299, Diff: 298, 
Sum: 1838]
         [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 
0.3]
         [GC Worker Total (ms): Min: 20.0, Avg: 20.8, Max: 21.0, Diff: 1.0, 
Sum: 166.5]
         [GC Worker End (ms): Min: 22098.5, Avg: 22098.6, Max: 22098.6, Diff: 
0.0]
      [Code Root Fixup: 0.3 ms]
      [Code Root Purge: 0.0 ms]
      [Clear CT: 0.2 ms]
      [Other: 7.0 ms]
         [Choose CSet: 0.0 ms]
         [Ref Proc: 6.3 ms]
         [Ref Enq: 0.1 ms]
         [Redirty Cards: 0.1 ms]
         [Humongous Register: 0.0 ms]
         [Humongous Reclaim: 0.0 ms]
         [Free CSet: 0.3 ms]
      [Eden: 266.0M(266.0M)->0.0B(259.0M) Survivors: 31744.0K->38912.0K Heap: 
299.2M(496.0M)->44866.0K(496.0M)]
    [Times: user=0.16 sys=0.01, real=0.04 secs]
   2023-06-19T10:27:08.818+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]:
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   2023-06-19T10:27:08.819+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]: Starting Javalin ...
   2023-06-19T10:27:08.957+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]: Listening on http://localhost:42997/
   2023-06-19T10:27:08.957+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]: Javalin started in 142ms \o/
   2023-06-19T10:27:08.957+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.timeline.service.TimelineService] [TimelineService]: Starting 
Timeline server on port :42997
   2023-06-19T10:27:08.957+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.embedded.EmbeddedTimelineService] 
[EmbeddedTimelineService]: Started embedded timeline server at 
ip-100-66-69-75.3175.aws-int.thomsonreuters.com:42997
   2023-06-19T10:27:08.970+0000 [WARN] [offline_compaction_schedule] 
[org.apache.hudi.utilities.HoodieCompactor] [HoodieCompactor]: No instant time 
is provided for scheduling compaction.
   2023-06-19T10:27:08.973+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.BaseHoodieWriteClient] [BaseHoodieWriteClient]: 
Scheduling table service COMPACT
   2023-06-19T10:27:08.974+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.BaseHoodieWriteClient] [BaseHoodieWriteClient]: 
Scheduling compaction at instant time :20230619102708972
   2023-06-19T10:27:08.978+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Loading HoodieTableMetaClient from s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:08.990+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableConfig] [HoodieTableConfig]: Loading 
table properties from 
s3://a206760-novusdoc-s3-dev-use1/novusdoc/.hoodie/hoodie.properties
   2023-06-19T10:27:08.990+0000 [INFO] [offline_compaction_schedule] 
[com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem] [S3NativeFileSystem]: 
Opening 's3://a206760-novusdoc-s3-dev-use1/novusdoc/.hoodie/hoodie.properties' 
for reading
   2023-06-19T10:27:09.067+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) 
from s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:09.068+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.HoodieTableMetaClient] [HoodieTableMetaClient]: 
Loading Active commit timeline for s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:09.070+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: user=dr.who 
aclsEnabled=false viewAcls=hadoop viewAclsGroups=
   2023-06-19T10:27:09.113+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.timeline.HoodieActiveTimeline] 
[HoodieActiveTimeline]: Loaded instants upto : 
Option{val=[20230619102516597__deltacommit__COMPLETED]}
   2023-06-19T10:27:09.121+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.view.FileSystemViewManager] 
[FileSystemViewManager]: Creating View Manager with storage type :REMOTE_FIRST
   2023-06-19T10:27:09.121+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.common.table.view.FileSystemViewManager] 
[FileSystemViewManager]: Creating remote first table view
   2023-06-19T10:27:09.128+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.table.action.compact.ScheduleCompactionActionExecutor] 
[ScheduleCompactionActionExecutor]: Checking if compaction needs to be run on 
s3://a206760-novusdoc-s3-dev-use1/novusdoc
   2023-06-19T10:27:09.137+0000 [DEBUG] [offline_compaction_schedule] 
[org.apache.spark.SecurityManager] [SecurityManager]: user=dr.who 
aclsEnabled=false viewAcls=hadoop viewAclsGroups=
   2023-06-19T10:27:09.184+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.BaseHoodieClient] [BaseHoodieClient]: Stopping Timeline 
service !!
   2023-06-19T10:27:09.184+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.embedded.EmbeddedTimelineService] 
[EmbeddedTimelineService]: Closing Timeline server
   2023-06-19T10:27:09.184+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.timeline.service.TimelineService] [TimelineService]: Closing 
Timeline Service
   2023-06-19T10:27:09.184+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]: Stopping Javalin ...
   2023-06-19T10:27:09.195+0000 [INFO] [offline_compaction_schedule] 
[io.javalin.Javalin] [Javalin]: Javalin has stopped
   2023-06-19T10:27:09.195+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.timeline.service.TimelineService] [TimelineService]: Closed 
Timeline Service
   2023-06-19T10:27:09.195+0000 [INFO] [offline_compaction_schedule] 
[org.apache.hudi.client.embedded.EmbeddedTimelineService] 
[EmbeddedTimelineService]: Closed Timeline server
   2023-06-19T10:27:09.196+0000 [WARN] [offline_compaction_schedule] 
[org.apache.hudi.utilities.HoodieCompactor] [HoodieCompactor]: Couldn't do 
schedule
   2023-06-19T10:27:09.211+0000 [INFO] [offline_compaction_schedule] 
[org.sparkproject.jetty.server.AbstractConnector] [AbstractConnector]: Stopped 
Spark@34dc85a{HTTP/1.1, (http/1.1)}{0.0.0.0:8090}
   2023-06-19T10:27:09.238+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.ui.SparkUI] [SparkUI]: Stopped Spark web UI at 
http://ip-100-66-69-75.3175.aws-int.thomsonreuters.com:8090
   2023-06-19T10:27:09.708+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.MapOutputTrackerMasterEndpoint] 
[MapOutputTrackerMasterEndpoint]: MapOutputTrackerMasterEndpoint stopped!
   2023-06-19T10:27:09.749+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.SparkContext] [SparkContext]: Successfully stopped 
SparkContext
   2023-06-19T10:27:09.751+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.ShutdownHookManager] [ShutdownHookManager]: Shutdown 
hook called
   2023-06-19T10:27:09.751+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.ShutdownHookManager] [ShutdownHookManager]: Deleting 
directory /mnt/tmp/spark-94366315-0ad4-4f1a-8051-1c517b83f435
   2023-06-19T10:27:09.756+0000 [INFO] [offline_compaction_schedule] 
[org.apache.spark.util.ShutdownHookManager] [ShutdownHookManager]: Deleting 
directory /mnt/tmp/spark-f72ca80c-54af-4f64-bcaa-176fe9cc27e4
   Heap
    garbage-first heap   total 507904K, used 192322K [0x00000006c0000000, 
0x00000006c0100f80, 0x00000007c0000000)
     region size 1024K, 183 young (187392K), 38 survivors (38912K)
    Metaspace       used 102404K, capacity 108290K, committed 108544K, reserved 
1144832K
     class space    used 13406K, capacity 14036K, committed 14080K, reserved 
1048576K
   [hadoop@ip-100-66-69-75 a206760-PowerUser2]$
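
   The run above ends with `Couldn't do schedule`, i.e. the scheduler decided no compaction was needed and exited. With `hoodie.compact.inline.trigger.strategy=NUM_COMMITS` and `hoodie.compact.inline.max.delta.commits=5`, one likely cause is that fewer than 5 delta commits have accumulated since the last compaction (the timeline here ends at a single completed deltacommit). A way to probe this is to run the compactor in schedule-only mode with a lower threshold — this is a sketch, not a verified fix; the lowered threshold value is an assumption for testing, and the jar path must match your EMR layout:

   ```shell
   # Schedule-only probe: pass a single application jar (not a comma-separated
   # pair, which spark-submit treats as one nonexistent file) and lower the
   # NUM_COMMITS threshold to see whether the trigger condition is the blocker.
   # max.delta.commits=1 is a test value, not a production recommendation.
   spark-submit \
     --class org.apache.hudi.utilities.HoodieCompactor \
     /usr/lib/hudi/hudi-utilities-bundle.jar \
     --base-path s3://a206760-novusdoc-s3-dev-use1/novusdoc \
     --table-name novusdoc \
     --mode schedule \
     --hoodie-conf hoodie.metadata.enable=false \
     --hoodie-conf hoodie.compact.inline.trigger.strategy=NUM_COMMITS \
     --hoodie-conf hoodie.compact.inline.max.delta.commits=1
   ```

   If scheduling succeeds with the lower threshold, the original run was simply below the 5-delta-commit trigger rather than failing.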

