fqaiser94 commented on a change in pull request #29795:
URL: https://github.com/apache/spark/pull/29795#discussion_r495623163
##########
File path: sql/core/benchmarks/UpdateFieldsBenchmark-results.txt
##########
@@ -0,0 +1,26 @@
+================================================================================================
+Add 2 columns and drop 2 columns at 3 different depths of nesting
+================================================================================================
+
+OpenJDK 64-Bit Server VM 1.8.0_212-b03 on Mac OS X 10.14.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Add 2 columns and drop 2 columns at 3 different depths of nesting:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------------------------------
+To non-nullable StructTypes using performant method                            10             11           2          0.0      Infinity       1.0X
+To nullable StructTypes using performant method                                 9             10           1          0.0      Infinity       1.0X
+To non-nullable StructTypes using non-performant method                      2457           2464          10          0.0      Infinity       0.0X
+To nullable StructTypes using non-performant method                         42641          43804        1644          0.0      Infinity       0.0X
Review comment:
As expected, this last result isn't great (roughly 43 seconds).
This is partly due to the non-performant method and partly because
the optimizer rules can't yet fully optimize complex nullable
StructType scenarios (I've documented these scenarios in this
[commit](https://github.com/apache/spark/pull/29795/commits/4fe48b4287c81e73276165453477811211e341d9)).
It should be possible to improve the optimizer rules further in the future.
I have a couple of simple ideas I'm toying with, but it will take me a
while to reason about and test whether they are safe from a correctness
point of view.
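
Because the `Relative` column rounds to one decimal place, both non-performant rows show up as `0.0X`, which hides how large the gap is. Below is a minimal sketch that recomputes the ratios from the best times in the table. It assumes `Relative` is the first row's best time divided by each row's best time, and it works from the rounded millisecond values printed above, so the results are approximate. The dictionary keys and helper names are made up for illustration.

```python
# Best times in milliseconds, copied from the benchmark table above.
# These are the rounded values as printed, so the ratios are approximate.
best_ms = {
    "non-nullable, performant": 10,
    "nullable, performant": 9,
    "non-nullable, non-performant": 2457,
    "nullable, non-performant": 42641,
}

# Assumption: the first row is the baseline for the Relative column.
baseline_ms = best_ms["non-nullable, performant"]


def relative(case: str) -> float:
    """Baseline best time divided by this case's best time."""
    return baseline_ms / best_ms[case]


def slowdown(case: str) -> float:
    """How many times slower this case is than the baseline."""
    return best_ms[case] / baseline_ms


for case, ms in best_ms.items():
    print(
        f"{case}: {ms / 1000:.3f} s, "
        f"relative {relative(case):.4f}X, "
        f"slowdown {slowdown(case):.1f}x"
    )
```

On these rounded numbers, the nullable non-performant case comes out about three orders of magnitude slower than the baseline, and about 17 times slower than the non-nullable non-performant case.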