Dear Paimon Community,
I hope this email finds you well. I’m currently working with Paimon primary key
tables and trying to leverage Flink SQL to retrieve the row kind (referred to
as op, with values like +I, -U, +U, -D) for business logic operations. However,
I’ve encountered some challenges and would appreciate your guidance.
Here are the two scenarios I’ve tested so far:
Scenario 1: Implicit Row Kind in SELECT *When I execute SELECT * on a Paimon
primary key table via Flink SQL, the result set includes a dynamic op column
that reflects the row kind. However, if I try to explicitly reference this op
column (e.g., SELECT op FROM ... or WHERE op IN ('-U', '+U')), Flink SQL throws
an error stating that the column op does not exist in the table schema. This is
because op is a virtual column added dynamically by Paimon, not a physical
column in the table.
Scenario 2: Explicit rowkind.field ConfigurationI attempted to use the
rowkind.field table property to bind a physical column (e.g., op VARCHAR(2))
for storing row kind values. While this allows me to reference op in queries,
it requires manually inserting op values (e.g., +I, -U) during data ingestion.
This approach is not ideal, as I want Paimon to automatically determine the row
kind (based on insert/update/delete operations) without manual intervention.
My Question:Is there an alternative approach to retrieve the automatically
generated row kind (op) in Flink SQL queries without needing to:
Manually maintain a physical op column via rowkind.field.
Only rely on SELECT * (which is not scalable for complex business logic).
I aim to directly use the row kind in operations like SELECT * FROM xxx WHERE
op IN ('-U', '+U') to filter and process records based on their change type,
all while letting Paimon handle the row kind generation automatically.
Any insights or best practices you can share would be extremely helpful. Thank
you in advance for your time and support!
Best regards,
Q7