sunchao commented on code in PR #56547:
URL: https://github.com/apache/spark/pull/56547#discussion_r3437516845
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala:
##########
@@ -136,6 +141,82 @@ case class GetJsonObject(json: Expression, path: Expression)
     copy(json = newLeft, path = newRight)
 }
+object GetJsonObject {
+  private[sql] def simpleTopLevelField(path: UTF8String): Option[String] = {
+    try {
+      Option(path).flatMap(value => JsonPathParser.parse(value.toString)).collect {
+        case List(PathInstruction.Key, PathInstruction.Named(fieldName)) => fieldName
+      }
+    } catch {
+      // Numeric subscripts are parsed as Long and can overflow before the parser returns None.
+      case _: NumberFormatException => None
+    }
+  }
+}
+
+/**
+ * Extracts multiple simple top-level fields from a JSON string in one parse. This is an internal
+ * expression used to share sibling [[GetJsonObject]] expressions; unsupported JSON paths remain
+ * as independent GetJsonObject expressions.
+ */
+case class MultiGetJsonObject(
Review Comment:
Thanks for the detailed suggestion. I removed GET_JSON_OBJECT from the inner
pruning predicate in the latest commit.
For the larger codegen change, I agree that RuntimeReplaceable would be
cleaner once #56575 is available. Since that PR is still open/WIP, and emitting
Invoke directly today would make the optimized plan less readable, I would
prefer to retain the current internal expression in this PR and handle that
refactor as a follow-up. The current doGenCode contains no JSON-processing
logic; it only evaluates the child, handles nullability, and delegates to
MultiGetJsonObjectEvaluator.
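
For readers following the quoted hunk, here is a minimal standalone sketch of the "simple top-level field" idea: only paths of the form `$.name` qualify for the shared parse, and every other path stays on the independent GetJsonObject route. It does not use Spark's JsonPathParser. The regex, the object name SimplePathSketch, and the splitPaths helper are assumptions made for illustration, and the regex is deliberately narrower than the real parser (for example, it does not accept the bracketed `$['name']` form).

```scala
object SimplePathSketch {
  // Hypothetical stand-in for the parser-based check in the hunk:
  // accept `$.name` only, where `name` contains no further path syntax.
  private val TopLevel = """^\$\.([^.\[\]*']+)$""".r

  // Returns the field name for a simple top-level path, None otherwise.
  // A null path yields None because Option(null) is None.
  def simpleTopLevelField(path: String): Option[String] =
    Option(path).collect { case TopLevel(name) => name }

  // Splits sibling paths into (field names that could share one parse,
  // paths that would remain independent lookups).
  def splitPaths(paths: Seq[String]): (Seq[String], Seq[String]) = {
    val (simple, other) = paths.partition(p => simpleTopLevelField(p).isDefined)
    (simple.flatMap(p => simpleTopLevelField(p)), other)
  }
}
```

With this sketch, `SimplePathSketch.splitPaths(Seq("$.a", "$.b", "$.c[0]"))` places `a` and `b` in the shared group and leaves `$.c[0]` in the fallback group.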
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]