yzeng1618 commented on issue #10317:
URL: https://github.com/apache/seatunnel/issues/10317#issuecomment-3736674399

   <img width="973" height="356" alt="Image" src="https://github.com/user-attachments/assets/428a6d5f-cedc-4385-af6c-b90620667130" />
   
   1. Why it fails only when FieldRename is enabled
   
   - The Iceberg sink resolves identifier fields from 
iceberg.table.primary-keys during table creation (SchemaUtils.toIcebergSchema 
in connector-iceberg).
   
   - If any configured PK field name does not exist in the incoming schema, 
structType.field(pk) returns null, and the current code then throws an NPE.
   
   - In FieldRename, each replacements_with_regex rule is treated as a regex 
only when is_regex = true is set on that rule. If is_regex is omitted (null), 
the rule is treated as a literal full match, so patterns like 
(?<=[a-z0-9])(?=[A-Z]) never match.
   
   - Result: columns may become invoicenum/vendorid (lowercase only) instead of 
invoice_num/vendor_id, so the PK list can’t be resolved and the Iceberg sink 
throws the NPE.
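
To make the difference concrete, here is a minimal Java sketch of the two matching modes described above. It is not SeaTunnel's actual FieldRename code: the class `FieldRenameSketch`, the `rename` helper, and the camelCase input names `invoiceNum`/`vendorId` are assumptions made for illustration, and the final lowercasing stands in for whatever case conversion the job applies.

```java
import java.util.Locale;

public class FieldRenameSketch {
    // Hypothetical helper (not the real FieldRename implementation):
    // applies one replacement rule, then lowercases the result.
    static String rename(String field, String replaceFrom, String replaceTo, Boolean isRegex) {
        String renamed;
        if (Boolean.TRUE.equals(isRegex)) {
            // Regex mode: the lookaround pattern matches the empty position
            // between a lowercase letter or digit and an uppercase letter.
            renamed = field.replaceAll(replaceFrom, replaceTo);
        } else {
            // is_regex omitted (null) or false: literal full match only,
            // so a regex pattern never equals a real column name.
            renamed = field.equals(replaceFrom) ? replaceTo : field;
        }
        return renamed.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String rule = "(?<=[a-z0-9])(?=[A-Z])";
        System.out.println(rename("invoiceNum", rule, "_", null)); // invoicenum
        System.out.println(rename("invoiceNum", rule, "_", true)); // invoice_num
        System.out.println(rename("vendorId", rule, "_", true));   // vendor_id
    }
}
```

With the rule left in literal mode the underscore is never inserted, which is how a primary-key list written as invoice_num/vendor_id can end up pointing at columns that do not exist.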
   
   2. Workaround
   - Add is_regex = true to each FieldRename.replacements_with_regex item, e.g. 
{ replace_from = "(?<=[a-z0-9])(?=[A-Z])", replace_to = "_", is_regex = true }
   
   - As a quick validation, you can also temporarily change 
iceberg.table.primary-keys to match the actual output column names after 
FieldRename.
   
   3. Follow-up / fix plan 
   
   - Make FieldRename default is_regex to true (align behavior with docs and 
TableRename) and update the docs accordingly.
   
   - Improve the Iceberg sink to fail fast with a clear “missing PK field(s)” 
error instead of throwing an NPE.
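
The fail-fast check could look roughly like the sketch below. It is only an illustration of the idea, not the planned patch: `PrimaryKeyCheck` and `requirePrimaryKeys` are hypothetical names, and plain collections of field names stand in for the Iceberg schema types used in SchemaUtils.toIcebergSchema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PrimaryKeyCheck {
    // Hypothetical validation helper: collects every configured PK that is
    // absent from the schema and reports them all in one exception.
    static void requirePrimaryKeys(Set<String> schemaFields, List<String> primaryKeys) {
        List<String> missing = new ArrayList<>();
        for (String pk : primaryKeys) {
            if (!schemaFields.contains(pk)) {
                missing.add(pk);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                    "Missing PK field(s) " + missing
                            + " in incoming schema; available fields: " + schemaFields);
        }
    }

    public static void main(String[] args) {
        // Column names as they arrive when the rename rule did not match.
        Set<String> fields = new LinkedHashSet<>(Arrays.asList("invoicenum", "vendorid"));
        try {
            requirePrimaryKeys(fields, Arrays.asList("invoice_num", "vendor_id"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Listing both the missing keys and the available field names in the message lets the user spot a rename mismatch without attaching a debugger.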

