sunchao commented on code in PR #56685:
URL: https://github.com/apache/spark/pull/56685#discussion_r3460925830


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/SharedJsonParseBenchmark.scala:
##########
@@ -75,6 +75,40 @@ object SharedJsonParseBenchmark extends SqlBasedBenchmark {
       }
 
       data.unpersist()
+
+      val nestedData = spark.range(0, rows, 1, 4)
+        .select(to_json(struct(struct(Seq.tabulate(fieldCount) { index =>
+          fieldValue.as(s"field_$index")
+        }: _*).as("payload"))).as("json"))
+        .cache()
+      nestedData.count()
+
+      Seq(2, 4, 8, 16).foreach { selectedFieldCount =>
+        val pathBenchmark = new Benchmark(
+          s"get_json_object extracting $selectedFieldCount of $fieldCount nested fields",

Review Comment:
   Thanks, addressed. I regenerated and committed all three result files with 
Spark's official benchmark workflow: 
[JDK 17](https://github.com/sunchao/spark/actions/runs/28035861052), 
[JDK 21](https://github.com/sunchao/spark/actions/runs/28035861792), and 
[JDK 25](https://github.com/sunchao/spark/actions/runs/28035859413). The generated 
commits are 152293f5cda, 8c97ee4d0fa, and 6cff419d7fb, and the PR description 
now links the hosted runs and reports the JDK 17 results.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala:
##########
@@ -189,10 +194,18 @@ case class MultiGetJsonObject(
 
   final override val nodePatterns: Seq[TreePattern] = Seq(GET_JSON_OBJECT)
 
+  @transient
+  private lazy val namedPaths = fallbackPaths.map { path =>

Review Comment:
   Thanks, addressed in bb0fff01e3e. I removed `fieldNames` from both 
`MultiGetJsonObject` and `MultiGetJsonObjectEvaluator`, derived the arity from 
`fallbackPaths` instead, and updated the optimizer and affected tests. 
`OptimizeJsonExprsSuite` (24/24), `JsonFunctionsSuite` (106/106), and the 
touched-module Scalastyle checks pass on JDK 17.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


