srielau commented on code in PR #52765:
URL: https://github.com/apache/spark/pull/52765#discussion_r2521491775
##########
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala:
##########
@@ -60,12 +60,52 @@ import org.apache.spark.sql.types.{ArrayType, BinaryType, BooleanType, ByteType,
*
* @see
* [[org.apache.spark.sql.catalyst.parser.AstBuilder]] for the full SQL statement parser
+ *
+ * ==CRITICAL: Extracting Identifier Names==
+ *
+ * When extracting identifier names from parser contexts, you MUST use the helper methods provided
+ * by this class instead of calling ctx.getText() directly:
+ *
+ * - '''getIdentifierText(ctx)''': For single identifiers (column names, aliases, window names)
+ * - '''getIdentifierParts(ctx)''': For qualified identifiers (table names, schema.table)
+ *
+ * '''DO NOT use ctx.getText() or ctx.identifier.getText()''' directly! These methods do not
+ * handle the IDENTIFIER('literal') syntax and will cause incorrect behavior.
+ *
+ * The IDENTIFIER('literal') syntax allows string literals to be used as identifiers at parse time
+ * (e.g., IDENTIFIER('my_col') resolves to the identifier my_col). If you use getText(), you'll
+ * get the raw text "IDENTIFIER('my_col')" instead of "my_col", breaking the feature.
+ *
+ * Example:
+ * {{{
+ * // WRONG - does not handle IDENTIFIER('literal'):
+ * val name = ctx.identifier.getText
Review Comment:
Yes! These slipped through the cracks. I must have fat-fingered my own grep to
miss those.
I'll create a follow-up.
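
For readers without the PR open, here is a minimal sketch of the pitfall the doc comment above describes. It does not use the real ANTLR or Spark classes: `IdentCtx`, `PlainIdent`, `IdentifierLiteral`, and `identifierText` are hypothetical stand-ins invented for illustration, and the actual `getIdentifierText` implementation in `DataTypeAstBuilder` may well differ.

```scala
// Toy model of a parser context. These types are assumptions for
// illustration only; they are NOT the ANTLR-generated Spark contexts.
sealed trait IdentCtx {
  // Mimics a parse tree's getText: the raw source text of the node.
  def getText: String
}

// A plain identifier such as my_col.
final case class PlainIdent(name: String) extends IdentCtx {
  def getText: String = name
}

// An identifier written as IDENTIFIER('literal').
final case class IdentifierLiteral(literal: String) extends IdentCtx {
  def getText: String = s"IDENTIFIER('$literal')"
}

object IdentifierSketch {
  // Hypothetical analogue of getIdentifierText: unwraps the literal
  // form instead of returning the raw source text.
  def identifierText(ctx: IdentCtx): String = ctx match {
    case PlainIdent(name)       => name
    case IdentifierLiteral(lit) => lit
  }

  def main(args: Array[String]): Unit = {
    val ctx: IdentCtx = IdentifierLiteral("my_col")
    println(ctx.getText)         // prints IDENTIFIER('my_col')
    println(identifierText(ctx)) // prints my_col
  }
}
```

In this toy model both forms of the identifier resolve to the same name only when the helper is used, which is why any call site that still reads `getText` directly would misbehave for the `IDENTIFIER('literal')` form.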