juntaozhang opened a new pull request, #6342:
URL: https://github.com/apache/paimon/pull/6342
### Purpose
On Spark, a data evolution table can return different query results before and
after compaction. Example:
```sql
CREATE TABLE s (id INT, b INT);
INSERT INTO s VALUES (1, 11), (2, 22);
CREATE TABLE t (id INT, b INT, c INT) TBLPROPERTIES (
  'row-tracking.enabled' = 'true',
  'data-evolution.enabled' = 'true'
);
INSERT INTO t VALUES (2, 2, 2), (3, 3, 3);
MERGE INTO t
USING s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.b = s.b
WHEN NOT MATCHED THEN INSERT (id, b, c) VALUES (id, b, 0);
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;
CALL sys.compact(table => 't');
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;
```
Before compaction:
```text
+----+----+---+---------+------------------+
| id | b  | c | _ROW_ID | _SEQUENCE_NUMBER |
+----+----+---+---------+------------------+
| 2  | 22 | 2 | 0       | 2                |
| 3  | 3  | 3 | 1       | 2                |
| 1  | 11 | 0 | 2       | 2                |
+----+----+---+---------+------------------+
```
After compaction, rows 0 and 1 are each split into two rows that share the same `_ROW_ID`:
```text
+--------+----+--------+---------+------------------+
| id     | b  | c      | _ROW_ID | _SEQUENCE_NUMBER |
+--------+----+--------+---------+------------------+
| <NULL> | 22 | <NULL> | 0       | 2                |
| 2      | 2  | 2      | 0       | 1                |
| <NULL> | 3  | <NULL> | 1       | 2                |
| 3      | 3  | 3      | 1       | 1                |
| 1      | 11 | 0      | 2       | 2                |
+--------+----+--------+---------+------------------+
```
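To make the expected result concrete, here is a minimal Python sketch of the merge the reader is presumed to perform. It is a model under stated assumptions, not Paimon's implementation: `merge_by_row_id` is a hypothetical helper, and it assumes that a `NULL` in a fragment means "column not written in this file" and that the fragment with the higher `_SEQUENCE_NUMBER` wins. Applied to the post-compaction rows, it reproduces the pre-compaction output.

```python
from collections import defaultdict

COLUMNS = ("id", "b", "c")

# Rows as returned after compaction: (id, b, c, _ROW_ID, _SEQUENCE_NUMBER).
after_compaction = [
    (None, 22, None, 0, 2),
    (2, 2, 2, 0, 1),
    (None, 3, None, 1, 2),
    (3, 3, 3, 1, 1),
    (1, 11, 0, 2, 2),
]


def merge_by_row_id(rows):
    """Hypothetical model: collapse fragments sharing a _ROW_ID into one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[3]].append(row)

    merged = []
    for row_id in sorted(groups):
        # Apply fragments oldest first so that newer non-null values win.
        # Assumption: None means the column is absent from that fragment.
        fragments = sorted(groups[row_id], key=lambda r: r[4])
        values = [None] * len(COLUMNS)
        for fragment in fragments:
            for i in range(len(COLUMNS)):
                if fragment[i] is not None:
                    values[i] = fragment[i]
        merged.append((*values, row_id, fragments[-1][4]))
    return merged


for row in merge_by_row_id(after_compaction):
    print(row)
```

Under these assumptions the merged rows are `(2, 22, 2, 0, 2)`, `(3, 3, 3, 1, 2)` and `(1, 11, 0, 2, 2)`, which match the table shown before compaction.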
This PR disables compaction for data evolution tables in Spark to align with the
Flink behavior (#6112).
### Tests
### API and Format
### Documentation