sunchao commented on code in PR #56685:
URL: https://github.com/apache/spark/pull/56685#discussion_r3460925830
##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/SharedJsonParseBenchmark.scala:
##########
@@ -75,6 +75,40 @@ object SharedJsonParseBenchmark extends SqlBasedBenchmark {
}
data.unpersist()
+
+ val nestedData = spark.range(0, rows, 1, 4)
+ .select(to_json(struct(struct(Seq.tabulate(fieldCount) { index =>
+ fieldValue.as(s"field_$index")
+ }: _*).as("payload"))).as("json"))
+ .cache()
+ nestedData.count()
+
+ Seq(2, 4, 8, 16).foreach { selectedFieldCount =>
+ val pathBenchmark = new Benchmark(
+ s"get_json_object extracting $selectedFieldCount of $fieldCount nested fields",
Review Comment:
Thanks, addressed. I regenerated and committed all three result files with
Spark's official benchmark workflow: [JDK
17](https://github.com/sunchao/spark/actions/runs/28035861052), [JDK
21](https://github.com/sunchao/spark/actions/runs/28035861792), and [JDK
25](https://github.com/sunchao/spark/actions/runs/28035859413). The generated
commits are 152293f5cda, 8c97ee4d0fa, and 6cff419d7fb, and the PR description
now links the hosted runs and reports the JDK 17 results.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala:
##########
@@ -189,10 +194,18 @@ case class MultiGetJsonObject(
final override val nodePatterns: Seq[TreePattern] = Seq(GET_JSON_OBJECT)
+ @transient
+ private lazy val namedPaths = fallbackPaths.map { path =>
Review Comment:
Thanks, addressed in bb0fff01e3e. I removed `fieldNames` from both
`MultiGetJsonObject` and `MultiGetJsonObjectEvaluator`, derive the arity from
`fallbackPaths`, and updated the optimizer and affected tests.
`OptimizeJsonExprsSuite` (24/24), `JsonFunctionsSuite` (106/106), and the
touched-module Scalastyle checks pass on JDK 17.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]