loukey_j created HUDI-4133:
------------------------------
Summary: Spark snapshot query on a MOR table loses data
Key: HUDI-4133
URL: https://issues.apache.org/jira/browse/HUDI-4133
Project: Apache Hudi
Issue Type: Bug
Components: core
Reporter: loukey_j
Suppose two non-overlapping batches of data are written in turn by Flink to a
new, non-partitioned Hudi MOR table.
The Hoodie timeline and log files are as follows:
hdfs dfs -ls hdfs://xxx/mor_test/.hoodie
0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/.aux
0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/.schema
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/.temp
5291 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit.inflight
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit.requested
5291 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit.inflight
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit.requested
0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/archived
798 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/hoodie.properties
hdfs dfs -ls hdfs://xxx/mor_test/
13316 2022-05-21 16:42 hdfs://xxx/mor_test/.00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164201245.log.1_0-2-0
28395 2022-05-21 16:42 hdfs://xxx/mor_test/.00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164214473.log.1_0-2-0
0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie
100 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie_partition_metadata
Run the following SQL as a Spark snapshot query:
'select distinct _hoodie_commit_time from mor_test_rt'
The expected result is both 20220521164201245 and 20220521164214473, but the
actual result is only 20220521164214473.
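The log file names in the listing can be inspected with a small script. The sketch below is illustrative and not part of Hudi: it assumes the log file naming pattern .<fileId>_<instant>.log.<version>_<writeToken>, and the helper names (parse_log_file_name, group_by_file_id) are hypothetical. It shows only that both log files carry the same file ID with different instants in their names; whether this is related to the missing commit time is not established by this report.

```python
import re
from collections import defaultdict

# Assumed naming pattern for Hudi log files (an assumption, not taken from the report):
#   .<fileId>_<instant>.log.<version>_<writeToken>
LOG_FILE_RE = re.compile(
    r"^\.(?P<file_id>[^_]+)_(?P<instant>\d+)\.log\.(?P<version>\d+)_(?P<write_token>.+)$"
)


def parse_log_file_name(name):
    """Split a log file name into its assumed components."""
    match = LOG_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised log file name: {name}")
    return match.groupdict()


def group_by_file_id(names):
    """Map each file ID to the sorted instants found in its log file names."""
    groups = defaultdict(list)
    for name in names:
        parsed = parse_log_file_name(name)
        groups[parsed["file_id"]].append(parsed["instant"])
    return {file_id: sorted(instants) for file_id, instants in groups.items()}


# The two log file names copied from the listing in this report.
log_files = [
    ".00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164201245.log.1_0-2-0",
    ".00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164214473.log.1_0-2-0",
]

slices = group_by_file_id(log_files)
for file_id, instants in slices.items():
    print(file_id, instants)
```

Both names resolve to a single file ID with two distinct instants, matching the two deltacommits on the timeline.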
--
This message was sent by Atlassian Jira
(v8.20.7#820007)