jiwen624 opened a new pull request, #56351:
URL: https://github.com/apache/spark/pull/56351
### What changes were proposed in this pull request?
A surgical fix focused on customer-facing error messages only: the three
`TableOutputResolver` sites that render a column/field name in
`INCOMPATIBLE_DATA_FOR_TABLE` errors now pass it as `toSQLId(Seq(name))`, so a
dotted name like `b.c` shows as `` `b.c` `` rather than `` `b`.`c` ``. The root
cause is `toSQLId(String)` splitting names on dots, but fixing that shared
primitive would touch ~130 call sites and isn't worth the churn/risk, so this
targets only the affected, user-visible write-path renders.
Affected errors: `INCOMPATIBLE_DATA_FOR_TABLE.{EXTRA_COLUMNS,
EXTRA_STRUCT_FIELDS, STRUCT_MISSING_FIELDS}`.
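
To illustrate the difference between the two call shapes, here is a minimal standalone sketch. It is not Spark's implementation: `ToSqlIdSketch`, `quotePart`, `toSqlIdString`, and `toSqlIdParts` are hypothetical names, and the string variant is reduced to a plain split on dots as described above. The real helper may handle quoting and other cases that this model ignores.

```scala
// Simplified model of the two quoting paths; all names here are hypothetical.
object ToSqlIdSketch {
  // Quote a single name part with backticks, doubling any embedded backtick.
  def quotePart(part: String): String =
    "`" + part.replace("`", "``") + "`"

  // Models the Seq-based call: every element is treated as one whole name part.
  def toSqlIdParts(parts: Seq[String]): String =
    parts.map(quotePart).mkString(".")

  // Models the String-based call, assumed here to split on every dot.
  def toSqlIdString(name: String): String =
    toSqlIdParts(name.split('.').toSeq)

  def main(args: Array[String]): Unit = {
    println(toSqlIdString("b.c"))     // prints `b`.`c` (two identifiers)
    println(toSqlIdParts(Seq("b.c"))) // prints `b.c` (one identifier)
  }
}
```

Wrapping the name in a single-element `Seq` keeps the dot inside one quoted identifier, which is the rendering this PR applies at the three affected sites.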
### Why are the changes needed?
A column or struct field whose name contains a dot is misreported in V2
write/`INSERT` error messages: `` `b`.`c` `` reads as two nested identifiers
rather than the single name `b.c`, which confuses users.
### Does this PR introduce _any_ user-facing change?
Yes, error-message text only (no behavior change). For a dotted name,
`EXTRA_COLUMNS`, for example, now renders `` `b.c` `` instead of `` `b`.`c` ``.
### How was this patch tested?
New tests in `DataFrameWriterV2Suite` covering the three error classes
(append and by-position `INSERT`); each fails before the change and passes
after.
### Was this patch authored or co-authored using generative AI tooling?
Yes. Claude Code