dongjoon-hyun commented on a change in pull request #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
URL: https://github.com/apache/spark/pull/23964#discussion_r266103343
 
 

 ##########
 File path: sql/core/benchmarks/OrcNestedSchemaPruningBenchmark-results.txt
 ##########
 @@ -6,35 +6,42 @@ OpenJDK 64-Bit Server VM 1.8.0_201-b09 on Linux 3.10.0-862.3.2.el7.x86_64
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Selection:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Top-level column                                    117            154          23          8.5         117.5       1.0X
-Nested column                                      1271           1295          26          0.8        1270.5       0.1X
+Top-level column                                    131            150          25          7.7         130.6       1.0X
+Nested column                                       922            954          21          1.1         922.2       0.1X
 
 OpenJDK 64-Bit Server VM 1.8.0_201-b09 on Linux 3.10.0-862.3.2.el7.x86_64
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Limiting:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Top-level column                                    431            488          73          2.3         431.2       1.0X
-Nested column                                      1738           1777          24          0.6        1738.3       0.2X
+Top-level column                                    446            477          50          2.2         445.5       1.0X
+Nested column                                      1328           1366          44          0.8        1328.4       0.3X
 
 OpenJDK 64-Bit Server VM 1.8.0_201-b09 on Linux 3.10.0-862.3.2.el7.x86_64
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Repartitioning:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Top-level column                                    349            381          87          2.9         348.7       1.0X
-Nested column                                      4374           4456         125          0.2        4373.6       0.1X
+Top-level column                                    357            386          33          2.8         356.8       1.0X
+Nested column                                      1266           1274           7          0.8        1266.3       0.3X
 
 Review comment:
   The nested-column repartitioning case becomes about 3 times faster (best time 4374 ms -> 1266 ms).
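
   As a quick check of the speedup, here is a minimal sketch that divides the old best time by the new one for each "Nested column" row. The numbers are copied from the Best Time(ms) column of the diff; the `best_ms` table and the `speedup` helper are names made up for this illustration and are not part of the benchmark code.

```python
# Best Time(ms) for the "Nested column" rows, copied from the diff:
# (before this change, after this change).
best_ms = {
    "Selection": (1271, 922),
    "Limiting": (1738, 1328),
    "Repartitioning": (4374, 1266),
}


def speedup(before_ms, after_ms):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_ms / after_ms


# Hypothetical helper table: speedup factor per benchmark case.
speedups = {case: speedup(before, after) for case, (before, after) in best_ms.items()}

for case, factor in speedups.items():
    print(f"{case}: {factor:.2f}x")
```

   By this measure the repartitioning case improves by roughly 3.5x, while the selection and limiting cases improve by roughly 1.4x and 1.3x. These are single best-time samples, so the ratios are only indicative.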

