yiminli-sp commented on PR #5804:
URL: https://github.com/apache/paimon/pull/5804#issuecomment-3809303825
Hi maintainers, I'd like to raise an issue with default value handling
that we encountered while using Paimon to synchronize data from MySQL. Our
typical workflow is an initial full dump that queries MySQL directly, followed
by incremental synchronization via CDC.
**The Core Issue:**
After a Paimon table has fully synchronized a MySQL table, if a new column
with a default value is subsequently added to the MySQL source, querying the
existing data in Paimon returns null for that column. In contrast, querying the
same historical data in MySQL returns the defined default value, leading to a
data discrepancy.
**Our Previous Workaround & Current Challenge:**
Previously, we addressed this by setting the column's default value as a
table option with the DDL
`ALTER TABLE xxx SET TBLPROPERTIES('fields.xxx.default-value' = 'xxx')`,
which let the Paimon reader replace null with the intended default.
After upgrading to Paimon 1.3, this workaround no longer works: the reader
no longer appears to apply the default value from the table option. As a
result, we have to run a new full dump to resolve the inconsistency, which is
time-consuming and resource-intensive.
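To make the behavior we relied on concrete, here is a minimal Python sketch of read-side default filling. It is only an illustration of the semantics described above, not Paimon's actual reader code; the function name `apply_read_side_defaults`, the sample rows, and the `status` column are hypothetical, and type conversion of the default string is left out.

```python
from typing import Any, Dict, List


def apply_read_side_defaults(
    rows: List[Dict[str, Any]],
    columns: List[str],
    table_options: Dict[str, str],
) -> List[Dict[str, Any]]:
    """Replace null (None) values with the default configured in table options.

    Hypothetical helper: it mimics the old read-side behavior described in
    this comment and is not part of any Paimon API.
    """
    # Collect defaults from options of the form 'fields.<column>.default-value'.
    defaults = {}
    for col in columns:
        key = f"fields.{col}.default-value"
        if key in table_options:
            defaults[col] = table_options[key]

    # A column missing from an old row is treated as null, then filled
    # with the default if one is configured; otherwise it stays None.
    return [
        {
            col: (defaults.get(col) if row.get(col) is None else row[col])
            for col in columns
        }
        for row in rows
    ]


# Row 1 was written before the 'status' column existed; row 2 after.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "status": "active"},
]
columns = ["id", "name", "status"]
options = {"fields.status.default-value": "pending"}

filled = apply_read_side_defaults(rows, columns, options)
```

With the option set, the historical row reads back as `pending` instead of null, which matches what MySQL returns for rows that existed before the column was added. Without the option, the same row reads back as null, which is the discrepancy described above.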
**Our Questions:**
1. Could you please share the background or rationale for removing the
default value logic from the reader side in Paimon version 1.2+? Understanding
the design decision would be very helpful.
2. Given this change, is there a recommended or more elegant way to handle
the case where a source table adds a new column with a default value?
3. Are there any plans to support a mechanism that automatically propagates
source table default values to already-synchronized data in a future release?
We greatly appreciate Paimon and the team's work. Thank you for your time
and any guidance you can provide.