aznwarmonkey opened a new issue #4803: URL: https://github.com/apache/hudi/issues/4803
Hello, I am trying to run clustering and the job fails without much indication as to why. Here's the command I am using to run clustering:

```sh
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /usr/lib/hudi/hudi-utilities-bundle.jar \
  --props s3://path-to-test/clustering.properties \
  --mode scheduleAndExecute \
  --base-path s3://path-to-test/data/hudi/test/country/ \
  --table-name country \
  --spark-memory 1g
```

Here's the properties file:

```
hoodie.clustering.async.enabled=true
hoodie.clustering.async.max.commits=1
hoodie.clustering.plan.strategy.target.file.max.bytes=1073741824
hoodie.clustering.plan.strategy.small.file.limit=629145600
hoodie.clustering.execution.strategy.class=org.apache.hudi.client.clustering.run.strategy.SparkSortAndSizeExecutionStrategy
hoodie.clustering.plan.strategy.sort.columns=enrich_selector_id
```

And here's the console output of the clustering job:

```shell
22/02/13 20:00:48 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 1541, ip-172-31-74-236.ec2.internal, executor 3, partition 0, PROCESS_LOCAL, 7927 bytes)
22/02/13 20:00:48 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 1542, ip-172-31-69-5.ec2.internal, executor 2, partition 1, PROCESS_LOCAL, 7922 bytes)
22/02/13 20:00:48 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-172-31-74-236.ec2.internal:46827 (size: 102.2 KB, free: 4.8 GB)
22/02/13 20:00:48 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-172-31-69-5.ec2.internal:43517 (size: 102.2 KB, free: 366.1 MB)
22/02/13 20:00:49 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/hoodie.properties' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213084348.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213103928.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213122909.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213142348.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213162102.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213181256.replacecommit.requested' for reading
22/02/13 20:00:50 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 1541) in 1859 ms on ip-172-31-74-236.ec2.internal (executor 3) (1/2)
22/02/13 20:00:50 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 1542) in 1865 ms on ip-172-31-69-5.ec2.internal (executor 2) (2/2)
22/02/13 20:00:50 INFO YarnScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
22/02/13 20:00:50 INFO DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 1.891 s
22/02/13 20:00:50 INFO DAGScheduler: Job 3 finished: collect at HoodieSparkEngineContext.java:78, took 1.893633 s
22/02/13 20:00:50 INFO Javalin: Stopping Javalin ...
22/02/13 20:00:50 INFO Javalin: Javalin has stopped
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213084348.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213103928.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213122909.replacecommit' for reading
22/02/13 20:00:50 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213142348.replacecommit' for reading
22/02/13 20:00:51 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213162102.replacecommit' for reading
22/02/13 20:00:51 INFO S3NativeFileSystem: Opening 's3://path-to-test/data/hudi/test/country/.hoodie/20220213181256.replacecommit.requested' for reading
22/02/13 20:00:51 ERROR HoodieClusteringJob: Clustering with basePath: s3://path-to-test/data/hudi/test/country/, tableName: country, runningMode: scheduleAndExecute failed
22/02/13 20:00:51 INFO SparkUI: Stopped Spark web UI at http://ip-172-31-66-151.ec2.internal:4041
22/02/13 20:00:51 INFO YarnClientSchedulerBackend: Interrupting monitor thread
22/02/13 20:00:51 INFO YarnClientSchedulerBackend: Shutting down all executors
22/02/13 20:00:51 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/02/13 20:00:51 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
22/02/13 20:00:51 INFO YarnClientSchedulerBackend: Stopped
22/02/13 20:00:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/02/13 20:00:51 INFO MemoryStore: MemoryStore cleared
22/02/13 20:00:51 INFO BlockManager: BlockManager stopped
22/02/13 20:00:51 INFO BlockManagerMaster: BlockManagerMaster stopped
22/02/13 20:00:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/02/13 20:00:51 INFO SparkContext: Successfully stopped SparkContext
22/02/13 20:00:51 INFO ShutdownHookManager: Shutdown hook called
22/02/13 20:00:51 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-3dd6c522-17c3-48a5-a809-8a0ad56c6da7
22/02/13 20:00:51 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-f841de1d-4101-49bc-8178-7bc79ede16b3
```

Does anyone have any insight into why this error is happening?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
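Since the `ERROR HoodieClusteringJob ... failed` line above doesn't print the underlying exception, one thing that may help is re-running the job with Hudi's driver-side logging turned up so the actual stack trace is emitted. This is only a sketch: it uses the standard log4j 1.x override mechanism that Spark 2.x supports, and the file path and logger levels chosen here are assumptions, not anything from the original report.

```sh
# Assumed log4j 1.x override (path /tmp/hudi-debug-log4j.properties is arbitrary):
# raise org.apache.hudi logging to DEBUG so the exception behind the
# generic "Clustering ... failed" message is printed with its stack trace.
cat > /tmp/hudi-debug-log4j.properties <<'EOF'
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.logger.org.apache.hudi=DEBUG
EOF

# Same clustering command as above, with the log4j override applied
# to the driver JVM via -Dlog4j.configuration.
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  --driver-java-options "-Dlog4j.configuration=file:/tmp/hudi-debug-log4j.properties" \
  /usr/lib/hudi/hudi-utilities-bundle.jar \
  --props s3://path-to-test/clustering.properties \
  --mode scheduleAndExecute \
  --base-path s3://path-to-test/data/hudi/test/country/ \
  --table-name country \
  --spark-memory 1g
```

With DEBUG logging the driver log should show the exception that caused `HoodieClusteringJob` to report failure, which would make the root cause much easier to pin down.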
