[
https://issues.apache.org/jira/browse/HUDI-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YangXuan updated HUDI-3945:
---------------------------
Description:
This problem occurs if you perform the following operations:
1. Create a MOR table and run the upsert operation on it five times.
2. Get the timestamp of the last upsert, assuming it is 20220318191221.
3. Schedule a compaction with spark-submit, using that timestamp plus 5
(20220318191226) as the instant time:
    spark-submit --conf \
      "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/hudi/error_log4j.properties" \
      --jars /opt/client/Hudi/hudi/lib/hudi-client-common*.jar \
      --class org.apache.hudi.utilities.HoodieCompactor \
      /opt/client/Hudi/hudi/lib/hudi-utilities*.jar \
      --base-path /tmp/testdb/tb_test_mor --table-name tb_test_mor \
      --parallelism 100 --spark-memory 1G \
      --schema-file /tmp/json/compact_tb_base.json \
      --instant-time 20220318191226 --schedule \
      --strategy org.apache.hudi.table.action.compact.strategy.UnBoundedCompactionStrategy
4. Run the compaction with spark-submit. The compaction itself completes
successfully, but the Spark application does not exit:
    spark-submit --conf \
      "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/hudi/error_log4j.properties" \
      --num-executors 4 \
      --jars /opt/client/Hudi/hudi/lib/hudi-client-common*.jar \
      --class org.apache.hudi.utilities.HoodieCompactor \
      /opt/client/Hudi/hudi/lib/hudi-utilities*.jar \
      --base-path /tmp/testdb/tb_test_mor --table-name tb_test_mor \
      --parallelism 100 --spark-memory 1G \
      --schema-file /tmp/json/compact_tb_base.json \
      --instant-time 20220318191226
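
The "timestamp plus 5" in step 3 can be derived as in the sketch below. It
assumes the instant time uses the 14-digit yyyyMMddHHmmss layout seen in this
report and that "plus 5" means five seconds; the helper name shift_instant is
hypothetical and is not part of Hudi.

```python
from datetime import datetime, timedelta

# Assumption: instant times use the compact yyyyMMddHHmmss layout shown in the report.
INSTANT_FORMAT = "%Y%m%d%H%M%S"


def shift_instant(instant: str, seconds: int) -> str:
    """Return the instant time `seconds` later, in the same 14-digit layout."""
    parsed = datetime.strptime(instant, INSTANT_FORMAT)
    return (parsed + timedelta(seconds=seconds)).strftime(INSTANT_FORMAT)


if __name__ == "__main__":
    # Last upsert instant from step 2, shifted by five seconds for step 3.
    print(shift_instant("20220318191221", 5))  # prints 20220318191226
```

Parsing the value as a date rather than adding 5 to the integer keeps the
result valid when the seconds roll over into the next minute.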
> After the async compaction operation is complete, the task should exit.
> -----------------------------------------------------------------------
>
> Key: HUDI-3945
> URL: https://issues.apache.org/jira/browse/HUDI-3945
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: YangXuan
> Priority: Major
> Labels: pull-request-available
>