cloud-fan commented on a change in pull request #30412:
URL: https://github.com/apache/spark/pull/30412#discussion_r526640152
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -3088,7 +3088,12 @@ class Analyzer(override val catalogManager: CatalogManager)
       val projection = TableOutputResolver.resolveOutputColumns(
         v2Write.table.name, v2Write.table.output, v2Write.query, v2Write.isByName, conf)
       if (projection != v2Write.query) {
-        v2Write.withNewQuery(projection)
+        val cleanedTable = v2Write.table match {
+          case r: DataSourceV2Relation =>
+            r.copy(output = r.output.map(CharVarcharUtils.cleanAttrMetadata))
Review comment:
No, it doesn't. The metadata itself is fine, as it's harmless. We only need to watch out for the specific rules that look at the char/varchar metadata and make sure they are idempotent. In fact, the added cast and length-check expression is wrapped in an `Alias` that retains the char/varchar metadata, so the output attributes of the `Project` above the v2 relation still carry the metadata. That's necessary because we rely on it later to pad char type columns in comparisons.
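
To make it concrete, here is a rough spark-shell style sketch (not the resolver code itself; it assumes the `__CHAR_VARCHAR_TYPE_STRING` metadata key and the `CharVarcharUtils.getRawType` helper) of how an `Alias` with explicit metadata keeps the char/varchar info on the `Project`'s output, while `cleanAttrMetadata` strips it from the relation's attributes so a second run of the rule has nothing left to rewrite:

```scala
// Sketch only: illustrates metadata retention vs. cleaning, not the actual rule.
import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast}
import org.apache.spark.sql.catalyst.util.CharVarcharUtils
import org.apache.spark.sql.types.{MetadataBuilder, StringType}

// A relation output attribute: string type, with the original char(5) type
// recorded in metadata (key name assumed here).
val rawMeta = new MetadataBuilder()
  .putString("__CHAR_VARCHAR_TYPE_STRING", "char(5)")
  .build()
val attr = AttributeReference("c", StringType, nullable = true, rawMeta)()

// The cast/length-check expression is wrapped in an Alias carrying the original
// metadata, so the Project's output still exposes the char/varchar info ...
val aliased = Alias(Cast(attr, StringType), attr.name)(
  explicitMetadata = Some(attr.metadata))
assert(CharVarcharUtils.getRawType(aliased.metadata).isDefined)

// ... while cleaning the relation's attributes removes that metadata there,
// which is what keeps the rule idempotent.
val cleaned = CharVarcharUtils.cleanAttrMetadata(attr)
assert(CharVarcharUtils.getRawType(cleaned.metadata).isEmpty)
```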