LinMingQiang commented on issue #5886:
URL: https://github.com/apache/hudi/issues/5886#issuecomment-1158381755
Reproducing this problem requires writing a large amount of data; setting
'hoodie.parquet.small.file.limit' = '20' reproduces it quickly. I also found
that after a 'clean' completes, the error appears in the next commit instant,
and the fileID reported as nonexistent is always one from the last commit
instant.
During debugging, I found that:
1: The error occurs in the first commit after the clean.
```
-rw-r--r-- 1 hunter staff 5646 Jun 17 09:13 20220617091324896.commit
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091324896.commit.requested
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091324896.inflight
-rw-r--r-- 1 hunter staff 4028 Jun 17 09:13 20220617091336735.commit
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091336735.commit.requested
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091336735.inflight
-rw-r--r-- 1 hunter staff 5650 Jun 17 09:13 20220617091341327.commit
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091341327.commit.requested
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091341327.inflight
-rw-r--r-- 1 hunter staff 1561 Jun 17 09:13 20220617091348754.clean
-rw-r--r-- 1 hunter staff 1699 Jun 17 09:13 20220617091348754.clean.inflight
-rw-r--r-- 1 hunter staff 1699 Jun 17 09:13 20220617091348754.clean.requested
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091358406.commit.requested
-rw-r--r-- 1 hunter staff 0 Jun 17 09:13 20220617091358406.inflight
```
```
-rw-r--r-- 1 hunter staff 7772443 Jun 17 09:12 536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091245620.parquet
-rw-r--r-- 1 hunter staff 12079980 Jun 17 09:13 536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091257227.parquet
-rw-r--r-- 1 hunter staff 12125690 Jun 17 09:13 536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091302805.parquet
-rw-r--r-- 1 hunter staff 5261158 Jun 17 09:13 5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091324896.parquet
-rw-r--r-- 1 hunter staff 8276875 Jun 17 09:13 5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091336735.parquet
-rw-r--r-- 1 hunter staff 8322925 Jun 17 09:13 5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091341327.parquet
-rw-r--r-- 1 hunter staff 18307016 Jun 17 09:12 61b5598d-7879-439b-87d8-04ce14b95a4e_0-1-0_20220617091237415.parquet
-rw-r--r-- 1 hunter staff 18356323 Jun 17 09:12 61b5598d-7879-439b-87d8-04ce14b95a4e_0-1-0_20220617091245620.parquet
-rw-r--r-- 1 hunter staff 18646634 Jun 17 09:13 b4c27901-1c35-4224-8485-8540278d85c3_0-1-0_20220617091341327.parquet
-rw-r--r-- 1 hunter staff 14465106 Jun 17 09:13 d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091302805.parquet
-rw-r--r-- 1 hunter staff 16797868 Jun 17 09:13 d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091317899.parquet
-rw-r--r-- 1 hunter staff 16844180 Jun 17 09:13 d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091324896.parquet
```
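To make the listing above easier to read, here is a rough Python sketch (not part of Hudi) that groups the base files by fileID and picks out the file groups whose newest file was written by the last completed commit, `20220617091341327`. It assumes the usual `<fileId>_<writeToken>_<instantTime>.parquet` naming; the helper names are made up for illustration. The result only narrows down the candidates for the "nonexistent" fileID, it does not prove which one fails.

```python
from collections import defaultdict

# Base file names copied from the partition listing above.
BASE_FILES = [
    "536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091245620.parquet",
    "536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091257227.parquet",
    "536b7835-0754-4187-96dc-1a5ecc5c152a_0-1-0_20220617091302805.parquet",
    "5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091324896.parquet",
    "5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091336735.parquet",
    "5ab5fc26-12bc-4832-b70c-499ef968b427_0-1-0_20220617091341327.parquet",
    "61b5598d-7879-439b-87d8-04ce14b95a4e_0-1-0_20220617091237415.parquet",
    "61b5598d-7879-439b-87d8-04ce14b95a4e_0-1-0_20220617091245620.parquet",
    "b4c27901-1c35-4224-8485-8540278d85c3_0-1-0_20220617091341327.parquet",
    "d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091302805.parquet",
    "d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091317899.parquet",
    "d9b1e244-098c-487b-b927-4c76bb6a8e97_0-1-0_20220617091324896.parquet",
]

# Newest instant with a .commit file in the timeline listing (before the clean).
LAST_COMPLETED_COMMIT = "20220617091341327"


def parse_base_file(name):
    """Split '<fileId>_<writeToken>_<instantTime>.parquet' into its three parts."""
    stem = name[: -len(".parquet")]
    file_id, write_token, instant = stem.split("_")
    return file_id, write_token, instant


def group_by_file_id(names):
    """Map each fileID to the sorted list of instants that wrote a base file for it."""
    groups = defaultdict(list)
    for name in names:
        file_id, _, instant = parse_base_file(name)
        groups[file_id].append(instant)
    return {file_id: sorted(instants) for file_id, instants in groups.items()}


groups = group_by_file_id(BASE_FILES)

# File groups whose newest base file came from the last completed commit.
touched_by_last_commit = sorted(
    file_id
    for file_id, instants in groups.items()
    if instants[-1] == LAST_COMPLETED_COMMIT
)

for file_id in touched_by_last_commit:
    print(file_id, groups[file_id])
```

In this listing only two file groups qualify, and one of them (`b4c27901-...`) was created by that commit, so it has a single base file.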
2: `HoodieTableFileSystemView.partitionToFileGroupsMap` sometimes does not
contain the FileGroups ``, but sometimes does.
So when I execute `hoodieTable.getMetadataTable().reset()` after
`HoodieMergeHandler.getLatestBaseFile` throws the error and then re-execute
`hoodieTable.getBaseFileOnlyView().getLatestBaseFile(partitionPath, fileId)`, it
works well.
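
The reset-then-retry workaround I describe can be sketched generically like this. This is only an illustration in Python with stand-in objects: `StaleView`, `get_with_view_reset`, and their methods are hypothetical and are not Hudi APIs; they only mimic a cached view that misses a file group until it is reset.

```python
class StaleView:
    """Hypothetical stand-in for a cached file-system view (not a Hudi class)."""

    def __init__(self, files_on_storage):
        self._storage = files_on_storage  # what is really on storage
        self._cache = {}                  # cached view, starts out stale

    def reset(self):
        # Rebuild the cached view from storage.
        self._cache = dict(self._storage)

    def latest_base_file(self, file_id):
        # Raises KeyError (a LookupError) when the cache lacks the file group.
        return self._cache[file_id]


def get_with_view_reset(lookup, reset_view, retries=1):
    """Call lookup(); if it fails, reset the view and try again."""
    for attempt in range(retries + 1):
        try:
            return lookup()
        except LookupError:
            if attempt == retries:
                raise  # still missing after the reset, so surface the error
            reset_view()


FILE_ID = "b4c27901-1c35-4224-8485-8540278d85c3"
view = StaleView({FILE_ID: FILE_ID + "_0-1-0_20220617091341327.parquet"})

result = get_with_view_reset(lambda: view.latest_base_file(FILE_ID), view.reset)
print(result)
```

A retry like this only hides the symptom; the open question is still why the view is missing the file group right after the clean.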