[ https://issues.apache.org/jira/browse/HUDI-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YangXuan updated HUDI-3945:
---------------------------
    Description: 
This problem occurs if you perform the following operations:

1. Create a MOR table and perform the upsert operation on it five times.

2. Get the timestamp of the last upsert; assume it is 20220318191221.

3. Schedule a compaction with spark-submit, using the last upsert timestamp
plus 5 (20220318191226) as the instant time.

spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/hudi/error_log4j.properties" \
  --jars /opt/client/Hudi/hudi/lib/hudi-client-common*.jar \
  --class org.apache.hudi.utilities.HoodieCompactor \
  /opt/client/Hudi/hudi/lib/hudi-utilities*.jar \
  --base-path /tmp/testdb/tb_test_mor \
  --table-name tb_test_mor \
  --parallelism 100 \
  --spark-memory 1G \
  --schema-file /tmp/json/compact_tb_base.json \
  --instant-time 20220318191226 \
  --schedule \
  --strategy org.apache.hudi.table.action.compact.strategy.UnBoundedCompactionStrategy

4. Run the compaction with spark-submit. The compaction itself completes
successfully, but the Spark application does not exit.

spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/hudi/error_log4j.properties" \
  --num-executors 4 \
  --jars /opt/client/Hudi/hudi/lib/hudi-client-common-*.jar \
  --class org.apache.hudi.utilities.HoodieCompactor \
  /opt/client/Hudi/hudi/lib/hudi-utilities_*.jar \
  --base-path /tmp/testdb/tb_test_mor \
  --table-name tb_test_mor \
  --parallelism 100 \
  --spark-memory 1G \
  --schema-file /tmp/json/compact_tb_base.json \
  --instant-time 20220318191226
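
The instant time passed to both commands is the last upsert timestamp plus
five seconds. A minimal sketch of that arithmetic, assuming the timestamp is
in yyyyMMddHHmmss form (as the example value suggests); the helper name
offset_instant is hypothetical and not part of Hudi:

```python
from datetime import datetime, timedelta

# Assumed instant format: yyyyMMddHHmmss (14 digits, second precision).
INSTANT_FORMAT = "%Y%m%d%H%M%S"


def offset_instant(instant: str, seconds: int = 5) -> str:
    """Return the instant time shifted forward by the given number of seconds."""
    parsed = datetime.strptime(instant, INSTANT_FORMAT)
    return (parsed + timedelta(seconds=seconds)).strftime(INSTANT_FORMAT)


if __name__ == "__main__":
    # Last upsert timestamp from step 2.
    print(offset_instant("20220318191221"))  # 20220318191226
```

Parsing the value as a date rather than adding 5 to the integer keeps the
result valid when the seconds field rolls over into the next minute or day.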

> After the async compaction operation is complete, the task should exit.
> -----------------------------------------------------------------------
>
>                 Key: HUDI-3945
>                 URL: https://issues.apache.org/jira/browse/HUDI-3945
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: YangXuan
>            Priority: Major
>              Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.7#820007)