cshuo commented on issue #13908:
URL: https://github.com/apache/hudi/issues/13908#issuecomment-3305929603
@nsivabalan I followed the steps you provided but cannot reproduce the
problem. I noticed that the exception stack you posted occurred in
`HoodieStorage#rename` on a local file. Can the issue be reproduced
consistently on your machine, or is it intermittent? If there is anything
wrong with my reproduction steps below, please point it out.
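For the record, I confirmed the table version before and after each step via the `hoodie.table.version` key in `hoodie.properties`. The snippet below only simulates the file in a temp dir for illustration; on a real table, grep `<base-path>/.hoodie/hoodie.properties` directly:

```shell
# Simulated hoodie.properties with illustrative values; on a real table,
# grep the file under <base-path>/.hoodie/ instead.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hoodie.properties" <<'EOF'
hoodie.table.name=hudi_table
hoodie.table.type=MERGE_ON_READ
hoodie.table.version=6
EOF
# Print the current table version key
grep '^hoodie.table.version' "$tmpdir/hoodie.properties"   # hoodie.table.version=6
rm -rf "$tmpdir"
```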
---
* Spark: spark-3.5.6-bin-hadoop3
* Hudi for table v6: 0.15.0
* Hudi for table v9: master (1.1.0-SNAPSHOT)
### Start spark sql shell with hudi spark bundle 0.15.0
```bash
export SPARK_VERSION=3.5
./bin/spark-sql --packages org.apache.hudi:hudi-spark$SPARK_VERSION-bundle_2.12:0.15.0 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
  --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar' \
  --conf 'spark.sql.warehouse.dir=file:///tmp/warehouse' \
  --conf 'spark.sql.catalogImplementation=in-memory'
```
### Create table in v6
```sql
CREATE TABLE hudi_table (
  ts BIGINT,
  uuid STRING,
  rider STRING,
  driver STRING,
  fare DOUBLE,
  city STRING
) USING HUDI
OPTIONS (
  type = 'mor',
  primaryKey = 'uuid',
  preCombineField = 'ts'
)
PARTITIONED BY (city)
LOCATION 'file:///tmp/hudi_v6_table';
```
### Insert 3 batches of data
```sql
INSERT INTO hudi_table VALUES
  (1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
  (1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70,'san_francisco'),
  (1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90,'san_francisco'),
  (1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
  (1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'),
  (1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40,'sao_paulo'),
  (1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06,'chennai'),
  (1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
INSERT INTO hudi_table VALUES
  (1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
  (1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70,'san_francisco'),
  (1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90,'san_francisco'),
  (1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
  (1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'),
  (1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40,'sao_paulo'),
  (1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06,'chennai'),
  (1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
INSERT INTO hudi_table VALUES
  (1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
  (1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70,'san_francisco'),
  (1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90,'san_francisco'),
  (1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
  (1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'),
  (1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40,'sao_paulo'),
  (1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06,'chennai'),
  (1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
```
### Start hudi cli in Local master dev branch
```bash
$ cd ../hudi/packaging/hudi-cli-bundle/
$ ./hudi-cli-with-bundle.sh
hudi->connect --path /tmp/hudi_v6_table
hudi:hudi_table->commits show
38769 [main] INFO org.apache.hudi.common.table.timeline.versioning.v2.ActiveTimelineV2 [] - Loaded instants upto : Option{val=[20250918151414760__20250918151420286__commit__COMPLETED]}
╔═══════════════════╤═════════════════════╤═══════════════════╤═════════════════════╤══════════════════════════╤═══════════════════════╤══════════════════════════════╤══════════════╗
║ CommitTime        │ Total Bytes Written │ Total Files Added │ Total Files Updated │ Total Partitions Written │ Total Records Written │ Total Update Records Written │ Total Errors ║
╠═══════════════════╪═════════════════════╪═══════════════════╪═════════════════════╪══════════════════════════╪═══════════════════════╪══════════════════════════════╪══════════════╣
║ 20250918151414760 │ 1.2 MB              │ 0                 │ 3                   │ 3                        │ 8                     │ 8                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20250918151246172 │ 4.4 KB              │ 0                 │ 3                   │ 3                        │ 8                     │ 8                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20250918151242843 │ 4.4 KB              │ 0                 │ 3                   │ 3                        │ 8                     │ 8                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20250918151200170 │ 1.2 MB              │ 3                 │ 0                   │ 3                        │ 8                     │ 0                            │ 0            ║
╚═══════════════════╧═════════════════════╧═══════════════════╧═════════════════════╧══════════════════════════╧═══════════════════════╧══════════════════════════════╧══════════════╝
```
### Upgrade table to v9
```bash
hudi:hudi_table->upgrade table --toVersion 9 --sparkMaster local
```
```bash
# output
76715 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO SparkMain: Table at "/tmp/hudi_v6_table" upgraded / downgraded to version "NINE".
76716 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO SparkContext: SparkContext is stopping with exitCode 0.
76722 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO SparkUI: Stopped Spark web UI at http://mba.lan:4041
76732 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
76769 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO MemoryStore: MemoryStore cleared
76769 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO BlockManager: BlockManager stopped
76773 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO BlockManagerMaster: BlockManagerMaster stopped
76777 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
76786 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO SparkContext: Successfully stopped SparkContext
76792 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO ShutdownHookManager: Shutdown hook called
76793 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO ShutdownHookManager: Deleting directory /private/var/folders/br/97x3mf4d32l8t6clbtjmnpdh0000gn/T/spark-225cf190-1bb3-4371-849c-3d80ec9ef382
76806 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:14:20 INFO ShutdownHookManager: Deleting directory /private/var/folders/br/97x3mf4d32l8t6clbtjmnpdh0000gn/T/spark-4f04efe1-fe90-49ec-9bef-ef77dde8c1fe
76883 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Loading HoodieTableMetaClient from /tmp/hudi_v6_table
76883 [main] INFO org.apache.hudi.common.table.HoodieTableConfig [] - Loading table properties from /tmp/hudi_v6_table/.hoodie/hoodie.properties
76885 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Finished Loading Table of type MERGE_ON_READ(version=2) from /tmp/hudi_v6_table
Hoodie table upgraded/downgraded to NINE
```
```bash
shuo@mba /t/h/.hoodie> tree
.
├── archived
├── hoodie.properties
└── timeline
├── 20250918151200170.deltacommit.inflight
├── 20250918151200170.deltacommit.requested
├── 20250918151200170_20250918151205642.deltacommit
├── 20250918151242843.deltacommit.inflight
├── 20250918151242843.deltacommit.requested
├── 20250918151242843_20250918151244561.deltacommit
├── 20250918151246172.deltacommit.inflight
├── 20250918151246172.deltacommit.requested
├── 20250918151246172_20250918151247836.deltacommit
├── 20250918151414760.compaction.inflight
├── 20250918151414760.compaction.requested
├── 20250918151414760_20250918151420286.commit
└── history
4 directories, 13 files
```
### Build async index 
```bash
./bin/spark-submit \
  --class org.apache.hudi.utilities.HoodieIndexer \
  /Users/cshuo/code/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.12-1.1.0-SNAPSHOT.jar \
  --props /Users/cshuo/Downloads/indexer.properties \
  --mode scheduleAndExecute \
  --base-path /tmp/hudi_v6_table \
  --table-name hudi_table \
  --index-types FILES \
  --parallelism 1 \
  --spark-memory 1g
```
```bash
# timeline after async index
cshuo@mba /t/h/.hoodie> tree
.
├── archived
├── hoodie.properties
├── metadata
│ └── files
│ └── files-0000-0_0-7-13_00000000000000000.hfile
└── timeline
├── 20250918153201977.deltacommit.inflight
├── 20250918153201977.deltacommit.requested
├── 20250918153201977_20250918153209461.deltacommit
├── 20250918153210887.deltacommit.inflight
├── 20250918153210887.deltacommit.requested
├── 20250918153210887_20250918153212856.deltacommit
├── 20250918153217685.deltacommit.inflight
├── 20250918153217685.deltacommit.requested
├── 20250918153217685_20250918153219447.deltacommit
├── 20250918153326554.compaction.inflight
├── 20250918153326554.compaction.requested
├── 20250918153326554_20250918153332855.commit
├── 20250918153458448.indexing.inflight
├── 20250918153458448.indexing.requested
├── 20250918153458448_20250918153501146.indexing
└── history
6 directories, 17 files
```
### Downgrade table to v6
```bash
hudi:hudi_table->downgrade table --toVersion 6 --sparkMaster local
# output
...
25672 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO SparkMain: Table at "/tmp/hudi_v6_table" upgraded / downgraded to version "SIX".
25673 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO SparkContext: SparkContext is stopping with exitCode 0.
25679 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO SparkUI: Stopped Spark web UI at http://mba.lan:4040
25701 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
25722 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO MemoryStore: MemoryStore cleared
25722 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO BlockManager: BlockManager stopped
25724 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO BlockManagerMaster: BlockManagerMaster stopped
25727 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
25733 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO SparkContext: Successfully stopped SparkContext
25735 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO ShutdownHookManager: Shutdown hook called
25735 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO ShutdownHookManager: Deleting directory /private/var/folders/br/97x3mf4d32l8t6clbtjmnpdh0000gn/T/spark-8a9d6d3c-b171-435e-a35c-5174265cb858
25739 [Thread-3] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 25/09/18 15:37:44 INFO ShutdownHookManager: Deleting directory /private/var/folders/br/97x3mf4d32l8t6clbtjmnpdh0000gn/T/spark-7aec3ccf-94ac-41bd-9e55-68fb7f219e9d
25799 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Loading HoodieTableMetaClient from /tmp/hudi_v6_table
25799 [main] INFO org.apache.hudi.common.table.HoodieTableConfig [] - Loading table properties from /tmp/hudi_v6_table/.hoodie/hoodie.properties
25806 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Finished Loading Table of type MERGE_ON_READ(version=1) from /tmp/hudi_v6_table
Hoodie table upgraded/downgraded to SIX
```
```bash
cshuo@mba /t/h/.hoodie> tree
.
├── 20250918153201977.deltacommit
├── 20250918153201977.deltacommit.inflight
├── 20250918153201977.deltacommit.requested
├── 20250918153210887.deltacommit
├── 20250918153210887.deltacommit.inflight
├── 20250918153210887.deltacommit.requested
├── 20250918153217685.deltacommit
├── 20250918153217685.deltacommit.inflight
├── 20250918153217685.deltacommit.requested
├── 20250918153326554.commit
├── 20250918153326554.compaction.inflight
├── 20250918153326554.compaction.requested
├── 20250918153458448.indexing
├── 20250918153458448.indexing.inflight
├── 20250918153458448.indexing.requested
├── archived
├── hoodie.properties
├── metadata
│ ├── column_stats
│ │ ├── col-stats-0000-0_0-14-47_20250918153458448.hfile
│ │ └── col-stats-0001-0_1-14-48_20250918153458448.hfile
│ └── files
│ ├── files-0000-0_0-3-4_20250918153738391.hfile
│ └── files-0000-0_0-7-13_00000000000000000.hfile
└── timeline
└── history
7 directories, 20 files
```