PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   `
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   Log Length: 104910
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: 
yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: 
yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups 
to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: 
authentication disabled; ui acls disabled; users  with view permissions: 
Set(yarn, hdfs); groups with view permissions: Set(); users  with modify 
permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit 
local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: 
appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application 
in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context 
initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling 
Configs will not be in effect as spark.scheduler.mode is not set to FAIR at 
instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: 
yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: 
yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups 
to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: 
authentication disabled; ui acls disabled; users  with view permissions: 
Set(yarn, hdfs); groups with view permissions: Set(); users  with modify 
permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 
'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: 
BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at 
/data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 
912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter 
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, 
/jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, 
/stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, 
/storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, 
/executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, 
/api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: 
unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started 
ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on 
port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at 
http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created 
YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn 
extension services with app application_1618828995116_0162 and attemptId 
Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than 
spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please 
update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of 
spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors 
and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on 
xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager 
BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block 
manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager 
BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 
7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: 
BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter 
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to 
hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than 
spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please 
update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of 
spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors 
and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: 
Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at 
xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   
===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> 
{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> 
hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" 
host: "xxx" port: 8020 file: 
"/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar"
 } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: 
"/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 
timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: 
"/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" 
} size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" 
host: "xxx" port: 8020 file: 
"/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar"
 } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: 
"/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip"
 } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   
===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than 
spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please 
update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of 
spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors 
and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: 
ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor 
container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of 
overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container 
requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter 
thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : 
xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container 
container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, 
launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: 
yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy 
: xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: 
Registered executor NettyRpcEndpointRef(spark-client://Executor) 
(10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has 
registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend 
is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: 
YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties 
to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; 
some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block 
manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta 
streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, 
hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, 
hoodie.filesystem.view.type=EMBEDDED_KV_STORE, 
hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ,
 hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, 
hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, 
group.id=hudi_group_080, auto.offset.reset=earliest, 
hoodie.insert.shuffle.parallelism=2, 
hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, 
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator,
 hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, 
hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, 
hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest,
 hoodie.datasource.write.recordkey.field=id, 
hoodie.upsert.shuffle.parallelism=2, 
hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing 
/user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing 
Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema 
:[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"},
 
{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer 
running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : 
Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
        allow.auto.create.topics = true
        auto.commit.interval.ms = 5000
        auto.offset.reset = earliest
        bootstrap.servers = [xxx]
        check.crcs = true
        client.dns.lookup = default
        client.id = 
        client.rack = 
        connections.max.idle.ms = 540000
        default.api.timeout.ms = 60000
        enable.auto.commit = true
        exclude.internal.topics = true
        fetch.max.bytes = 52428800
        fetch.max.wait.ms = 500
        fetch.min.bytes = 1
        group.id = hudi_group_080
        group.instance.id = null
        heartbeat.interval.ms = 3000
        interceptor.classes = []
        internal.leave.group.on.close = true
        isolation.level = read_uncommitted
        key.deserializer = class 
org.apache.kafka.common.serialization.StringDeserializer
        max.partition.fetch.bytes = 1048576
        max.poll.interval.ms = 300000
        max.poll.records = 500
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partition.assignment.strategy = [class 
org.apache.kafka.clients.consumer.RangeAssignor]
        receive.buffer.bytes = 65536
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.mechanism = GSSAPI
        security.protocol = PLAINTEXT
        security.providers = null
        send.buffer.bytes = 131072
        session.timeout.ms = 10000
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = https
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        value.deserializer = class 
io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: 
KafkaAvroDeserializerConfig values: 
        schema.registry.url = [xxx]
        max.schemas.per.subject = 1000
        specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a 
known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but 
isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' 
was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't 
a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known 
config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known 
config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a 
known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a 
known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 
'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer 
clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 
5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, 
set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka 
for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty 
commit.
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write 
Client
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema 
:[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"},
 
{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline 
service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp 
to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded 
rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to 
org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on 
port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded 
timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already 
running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new 
instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant 
[==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 
20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to 
metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not 
enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at 
SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair 
at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey 
at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at 
SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 
(countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: 
List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: 
List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 
(MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which 
has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as 
values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number 
of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor 
container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of 
overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container 
requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new 
executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored 
as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at 
SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 
with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : 
xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container 
container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, 
launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: 
yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy 
: xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on 
xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on 
xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey 
at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable 
stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 
(ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no 
missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as 
values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored 
as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in 
memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ResultStage 2 (ShuffledRDD[6] at countByKey at 
SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 
with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in 
memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at 
SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at 
SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number 
of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at 
HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at 
HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 
(collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 
(MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has 
no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as 
values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored 
as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in 
memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks 
from ResultStage 3 (MapPartitionsRDD[8] at flatMap at 
HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 
with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in 
memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at 
HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at 
HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at 
HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at 
HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 
(collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 
(MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no 
missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as 
values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored 
as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in 
memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks 
from ResultStage 4 (MapPartitionsRDD[10] at map at 
HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 
with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in 
memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at 
HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at 
HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at 
SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 
(countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at 
SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 
(countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: 
List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: 
List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 
(MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which 
has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as 
values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored 
as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in 
memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at 
SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 
with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in 
memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey 
at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable 
stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 
(ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no 
missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as 
values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored 
as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in 
memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ResultStage 7 (ShuffledRDD[15] at countByKey at 
SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 
with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in 
memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at 
SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at 
SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, 
IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 
0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence 
list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from 
persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at 
BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair 
at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair 
at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 
(flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 
(countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at 
BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 
(countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: 
List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: 
List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 
(MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which 
has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as 
values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored 
as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in 
memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at 
SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 
with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in 
memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair 
at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable 
stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 
12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 
(MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), 
which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as 
values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored 
as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in 
memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at 
BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions 
Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 
with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in 
memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on 
xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on 
xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 
(countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable 
stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 
(ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), 
which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as 
values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored 
as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in 
memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from 
broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
from ResultStage 13 (ShuffledRDD[32] at countByKey at 
BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions 
Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 
with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in 
memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send 
map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 
13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 
13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, 
whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at 
BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at 
BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload 
profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, 
partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file 
exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for 
toInstant 
?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  
0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending 
clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets 
info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 175
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 62
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 9
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 148
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 105
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 143
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 55
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 209
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 154
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 147
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 163
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 69
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 34
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 100
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 1
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 193
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 169
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 27
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 16
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 115
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 120
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 106
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 174
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 210
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 96
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 6
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 57
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 133
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 11
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 74
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 107
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 164
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 172
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 176
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 194
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 109
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 37
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 177
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 128
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 182
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 205
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 30
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled 
for 20210526183328
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 102
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 180
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 150
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 186
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 89
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 223
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 47
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 158
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 162
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 88
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 39
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 8
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 29
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 124
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 75
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 165
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 217
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 134
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 
on xxx:35696 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 
on xxx:38417 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 35
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 216
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 22
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 114
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 152
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 42
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 94
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 145
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 126
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 144
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 168
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 
on xxx:38417 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 
on xxx:35696 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 149
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 38
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 70
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 15
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 118
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 166
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 207
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 170
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 171
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 65
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 97
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 110
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 222
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 87
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 
on xxx:38417 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 
on xxx:35696 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 192
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 201
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 117
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 123
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 12
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 60
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 84
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 127
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 91
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 136
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 45
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 200
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 64
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 
on xxx:38417 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 
on xxx:35696 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 92
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 0
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 81
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 185
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 214
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 21
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 31
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 67
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 112
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 178
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 208
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 78
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 73
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 131
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 
on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 
on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 61
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 3
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 
on xxx:38417 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 
on xxx:35696 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at 
DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at 
DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 
on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 
on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 
on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 
on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 
on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 
on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at 
DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at 
DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at 
SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at 
SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT 
numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 
20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant 
complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file 
exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for 
toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed 
[==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[==>20210526183328__deltacommit__REQUESTED], 
[==>20210526183328__deltacommit__INFLIGHT], 
[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is 
enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning 
at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for 
basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based 
view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  
0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending 
clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : 
(http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at 
/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. 
Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl_open.cc:230] Creating manifest 1 
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family [default]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/version_set.cc:3610] Recovered from manifest 
file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001
 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 
0, log_number is 0,prev_log_number is 0,max_column_family is 
0,min_log_number_to_keep is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/version_set.cc:2936] Creating manifest 6
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: 
Registered executor NettyRpcEndpointRef(spark-client://Executor) 
(10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has 
registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/column_family.cc:475] --------------- Options for column family 
[hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:1546] Created column family 
[hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting 
replacedFileGroups to ROCKSDB based file-system view at 
/tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on 
hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting 
replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  
0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending 
compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing 
external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending 
clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file 
groups in pending clustering to ROCKSDB based file-system view at 
/tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on 
hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting 
replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB 
based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on 
hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken 
(msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, 
Refresh=779, handle=11, Check=1], Success=true, 
Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1,
 Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block 
manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No 
need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is 
already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed 
attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to 
metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 
20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table 
service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling 
compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading 
HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: 
[hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, 
mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, 
hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: 
[DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs 
(auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties 
from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from 
/user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit 
timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants 
[[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager 
with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first 
table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: 
Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 
successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from 
persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from 
persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded 
timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline 
server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : 
[db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline 
server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta 
streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped 
Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number 
of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down 
all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: 
Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping 
SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO 
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, 
exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering 
ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be 
successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory 
hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory 
/data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
   `


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

