fqaiser94 commented on a change in pull request #29795:
URL: https://github.com/apache/spark/pull/29795#discussion_r495623163



##########
File path: sql/core/benchmarks/UpdateFieldsBenchmark-results.txt
##########
@@ -0,0 +1,26 @@
+================================================================================================
+Add 2 columns and drop 2 columns at 3 different depths of nesting
+================================================================================================
+
+OpenJDK 64-Bit Server VM 1.8.0_212-b03 on Mac OS X 10.14.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+Add 2 columns and drop 2 columns at 3 different depths of nesting:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-------------------------------------------------------------------------------------------------------------------------------------------------
+To non-nullable StructTypes using performant method                            10             11           2          0.0      Infinity       1.0X
+To nullable StructTypes using performant method                                 9             10           1          0.0      Infinity       1.0X
+To non-nullable StructTypes using non-performant method                      2457           2464          10          0.0      Infinity       0.0X
+To nullable StructTypes using non-performant method                         42641          43804        1644          0.0      Infinity       0.0X

Review comment:
       As expected, this last result isn't great (about 43 seconds).
   This is partly due to the non-performant method and partly because the optimizer rules can't fully optimize complex nullable StructType scenarios (I've documented these scenarios in this [commit](https://github.com/apache/spark/pull/29795/commits/4fe48b4287c81e73276165453477811211e341d9)).

   It should be possible to improve the optimizer rules further in the future. I have a couple of simple ideas I'm toying with, but it will take me a while to reason about and test whether they are safe from a correctness point of view.
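
   For a sense of scale: the Relative column rounds to 0.0X for both slow cases, so it hides how far apart they are. The slowdown factors can be recomputed from the Best Time(ms) column. This is only a rough sketch using the rounded figures reported above; the dictionary keys are shorthand labels chosen here for illustration, not names from the benchmark code.

```python
# Best Time (ms) values copied from the benchmark results above.
# The keys are shorthand labels for the four benchmark cases.
best_ms = {
    "non-nullable, performant": 10,
    "nullable, performant": 9,
    "non-nullable, non-performant": 2457,
    "nullable, non-performant": 42641,
}

# The first row of the table serves as the baseline.
baseline = best_ms["non-nullable, performant"]


def slowdown(case: str) -> float:
    """Return how many times slower `case` is than the baseline row."""
    return best_ms[case] / baseline


for case, ms in best_ms.items():
    print(f"{case:30s} {ms:>6d} ms  {slowdown(case):8.1f}x baseline")

# Extra cost of nullability within the non-performant method alone.
nullable_penalty = (
    best_ms["nullable, non-performant"] / best_ms["non-nullable, non-performant"]
)
print(f"nullable vs non-nullable (non-performant): {nullable_penalty:.1f}x")
```

   On these rounded numbers, the non-performant method is roughly 250x slower than the baseline for non-nullable structs and over 4000x slower for nullable ones, so nullability alone accounts for about a 17x gap within the non-performant method.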




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



