srielau commented on code in PR #52638:
URL: https://github.com/apache/spark/pull/52638#discussion_r2463083342
##########
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala:
##########
@@ -45,18 +46,188 @@ class DataTypeAstBuilder extends SqlBaseParserBaseVisitor[AnyRef] {
withOrigin(ctx)(StructType(visitColTypeList(ctx.colTypeList)))
}
- override def visitStringLiteralValue(ctx: StringLiteralValueContext): Token =
- Option(ctx).map(_.STRING_LITERAL.getSymbol).orNull
+ /**
+ * Visits a stringLit context that may contain multiple singleStringLit children (which can be
+ * either singleStringLitWithoutMarker or parameterMarker). When multiple children are present,
+ * they are coalesced into a single token.
+ */
+ override def visitStringLit(ctx: StringLitContext): Token = {
+ if (ctx == null) {
+ return null
+ }
+
+ import scala.jdk.CollectionConverters._
+
+ // Collect tokens from all singleStringLit children.
+ // Each child is either a singleStringLitWithoutMarker or a parameterMarker.
+ val tokens = ctx
+ .singleStringLit()
+ .asScala
+ .map { child =>
+ visit(child).asInstanceOf[Token]
+ }
+ .toSeq
+
+ if (tokens.isEmpty) {
+ null
+ } else if (tokens.size == 1) {
+ // Fast path: single token, return unchanged
+ tokens.head
+ } else {
+ // Multiple tokens: create coalesced token
+ createCoalescedStringToken(tokens)
Review Comment:
Claude objects for these reasons:
The downside of moving coalescing to AstBuilder is that standalone
DataTypeAstBuilder instances would lose it, which would break features
such as:
- parseDataType() with comments that use coalesced strings
- parseTableSchema() with column comments that use coalesced strings
- Any other pure data type parsing that involves string literals
We would then have to duplicate the logic, introduce helper methods or
traits, or break existing functionality. The current design avoids all
three by keeping the coalescing in the base class (DataTypeAstBuilder),
so every builder that extends it gets the behavior.
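To make the argument concrete, here is a minimal sketch of the inheritance
point. It is not the PR's code: BaseBuilder, FullBuilder, visitComment and
unquote are hypothetical stand-ins, plain strings replace ANTLR tokens, and
every literal is assumed to be a well-formed single-quoted string.

```scala
// Hypothetical stand-ins for illustration only; none of these are Spark classes.
object CoalescingSketch {

  // Plays the role of the base builder: it owns the coalescing logic.
  class BaseBuilder {
    // Assumes each literal is at least two characters and wrapped in quotes.
    protected def unquote(literal: String): String =
      literal.substring(1, literal.length - 1)

    // Coalesces adjacent string literals into one value.
    def visitStringLit(parts: Seq[String]): Option[String] = parts match {
      case Seq()       => None                               // no children
      case Seq(single) => Some(unquote(single))              // fast path
      case many        => Some(many.map(unquote).mkString)   // coalesce
    }
  }

  // Plays the role of the full builder: it inherits coalescing unchanged.
  class FullBuilder extends BaseBuilder {
    def visitComment(parts: Seq[String]): String =
      visitStringLit(parts).getOrElse("")
  }
}
```

With this layout, a standalone BaseBuilder still coalesces 'a' 'b' into ab,
and FullBuilder gets the same behavior without any duplicated logic. If
visitStringLit lived only in FullBuilder, the standalone case would lose it.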
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]