LuciferYang commented on code in PR #47310:
URL: https://github.com/apache/spark/pull/47310#discussion_r1677272149


##########
sql/core/benchmarks/DataSourceReadBenchmark-results.txt:
##########
@@ -1,431 +1,438 @@
-DataSourceReadBenchmark-jdk21-results.txt================================================================================================
+================================================================================================
 SQL Single Numeric Column Scan
 ================================================================================================
 
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1023-azure
 AMD EPYC 7763 64-Core Processor
 SQL Single BOOLEAN Column Scan:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-SQL CSV                                           10363          10364           2          1.5         658.9       1.0X
-SQL Json                                           8667           8699          46          1.8         551.0       1.2X
-SQL Parquet Vectorized: DataPageV1                  103            114           8        153.3           6.5     101.0X
-SQL Parquet Vectorized: DataPageV2                  101            111           6        155.4           6.4     102.4X
-SQL Parquet MR: DataPageV1                         1809           1813           6          8.7         115.0       5.7X
-SQL Parquet MR: DataPageV2                         1715           1720           8          9.2         109.0       6.0X
-SQL ORC Vectorized                                  139            146           5        113.1           8.8      74.5X
-SQL ORC MR                                         1508           1511           5         10.4          95.8       6.9X
-
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+SQL CSV                                           10854          10862          12          1.4         690.1       1.0X
+SQL Json                                           8728           8896         238          1.8         554.9       1.2X
+SQL Json with UnsafeRow                            9797           9841          62          1.6         622.9       1.1X
+SQL Parquet Vectorized: DataPageV1                  105            119           8        149.2           6.7     103.0X
+SQL Parquet Vectorized: DataPageV2                  108            115           6        146.2           6.8     100.9X
+SQL Parquet MR: DataPageV1                         1861           1872          16          8.5         118.3       5.8X
+SQL Parquet MR: DataPageV2                         1770           1771           1          8.9         112.5       6.1X
+SQL ORC Vectorized                                  147            154           3        107.2           9.3      74.0X
+SQL ORC MR                                         1650           1650           0          9.5         104.9       6.6X
+
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1023-azure
 AMD EPYC 7763 64-Core Processor
 Parquet Reader Single BOOLEAN Column Scan:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1                    88             90           2        178.9           5.6       1.0X
-ParquetReader Vectorized: DataPageV2                    95             96           1        166.2           6.0       0.9X
-ParquetReader Vectorized -> Row: DataPageV1             73             74           1        215.3           4.6       1.2X
-ParquetReader Vectorized -> Row: DataPageV2             81             83           1        193.1           5.2       1.1X
+ParquetReader Vectorized: DataPageV1                    96             97           1        163.7           6.1       1.0X
+ParquetReader Vectorized: DataPageV2                   102            104           4        154.4           6.5       0.9X
+ParquetReader Vectorized -> Row: DataPageV1             75             77           1        208.5           4.8       1.3X
+ParquetReader Vectorized -> Row: DataPageV2             82             83           2        192.8           5.2       1.2X
 
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1023-azure
 AMD EPYC 7763 64-Core Processor
 SQL Single TINYINT Column Scan:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-SQL CSV                                           11538          11589          73          1.4         733.5       1.0X
-SQL Json                                           9586           9596          14          1.6         609.5       1.2X
-SQL Parquet Vectorized: DataPageV1                  109            116           6        144.8           6.9     106.2X
-SQL Parquet Vectorized: DataPageV2                  110            118           8        142.6           7.0     104.6X
-SQL Parquet MR: DataPageV1                         1901           1953          74          8.3         120.9       6.1X
-SQL Parquet MR: DataPageV2                         1817           1832          22          8.7         115.5       6.4X
-SQL ORC Vectorized                                  118            126           7        133.6           7.5      98.0X
-SQL ORC MR                                         1505           1535          43         10.5          95.7       7.7X
-
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+SQL CSV                                           10361          10395          48          1.5         658.7       1.0X
+SQL Json                                           9825           9848          32          1.6         624.7       1.1X
+SQL Json with UnsafeRow                           10692          10700          11          1.5         679.8       1.0X
+SQL Parquet Vectorized: DataPageV1                  108            115           6        145.6           6.9      95.9X
+SQL Parquet Vectorized: DataPageV2                  106            115           6        147.9           6.8      97.4X
+SQL Parquet MR: DataPageV1                         1924           1937          18          8.2         122.4       5.4X
+SQL Parquet MR: DataPageV2                         1841           1858          25          8.5         117.0       5.6X
+SQL ORC Vectorized                                  113            117           4        138.8           7.2      91.4X
+SQL ORC MR                                         1554           1564          14         10.1          98.8       6.7X
+
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1023-azure
 AMD EPYC 7763 64-Core Processor
 Parquet Reader Single TINYINT Column Scan:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ---------------------------------------------------------------------------------------------------------------------------
-ParquetReader Vectorized: DataPageV1                    93             94           1        169.9           5.9       1.0X
-ParquetReader Vectorized: DataPageV2                    93             94           1        169.1           5.9       1.0X
-ParquetReader Vectorized -> Row: DataPageV1             61             62           1        258.0           3.9       1.5X
-ParquetReader Vectorized -> Row: DataPageV2             61             62           1        258.4           3.9       1.5X
+ParquetReader Vectorized: DataPageV1                    85             88           4        185.9           5.4       1.0X
+ParquetReader Vectorized: DataPageV2                    84             86           2        186.5           5.4       1.0X
+ParquetReader Vectorized -> Row: DataPageV1             62             64           1        252.7           4.0       1.4X
+ParquetReader Vectorized -> Row: DataPageV2             62             63           1        253.9           3.9       1.4X
 
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1023-azure
 AMD EPYC 7763 64-Core Processor
 SQL Single SMALLINT Column Scan:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-SQL CSV                                           12200          12203           5          1.3         775.7       1.0X
-SQL Json                                           9813           9854          57          1.6         623.9       1.2X
-SQL Parquet Vectorized: DataPageV1                  101            107           6        156.1           6.4     121.0X
-SQL Parquet Vectorized: DataPageV2                  129            135           6        122.3           8.2      94.9X
-SQL Parquet MR: DataPageV1                         1968           1989          29          8.0         125.1       6.2X
-SQL Parquet MR: DataPageV2                         1913           1916           3          8.2         121.6       6.4X
-SQL ORC Vectorized                                  130            135           6        120.8           8.3      93.7X
-SQL ORC MR                                         1593           1600          10          9.9         101.3       7.7X
-
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
+SQL CSV                                           10958          10970          18          1.4         696.7       1.0X
+SQL Json                                          10164          10169           7          1.5         646.2       1.1X
+SQL Json with UnsafeRow                           11113          11137          33          1.4         706.5       1.0X

Review Comment:
   ~So using UnsafeRow is slower than not using it? Is this a negative effect of saving memory?~
   
   I have since read the PR description.
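
   For reference, the size of the gap that the struck-through question asks about can be read directly from the quoted rows. The following is a minimal Python sketch that uses only the Best Time(ms) values from the `+` rows in the hunk above. The helper name `slowdown_pct` is illustrative and not part of Spark, and single benchmark runs like these carry noise, so the percentages are rough indications only.

```python
# Best Time (ms) values copied from the "+" rows of the quoted diff hunk.
best_ms = {
    "BOOLEAN": {"json": 8728, "json_unsafe_row": 9797},
    "TINYINT": {"json": 9825, "json_unsafe_row": 10692},
    "SMALLINT": {"json": 10164, "json_unsafe_row": 11113},
}


def slowdown_pct(base_ms, other_ms):
    """Return the percentage by which other_ms exceeds base_ms."""
    return (other_ms / base_ms - 1.0) * 100.0


# Print the relative slowdown of "SQL Json with UnsafeRow" against "SQL Json".
for column_type, times in best_ms.items():
    pct = slowdown_pct(times["json"], times["json_unsafe_row"])
    print(f"{column_type}: Json with UnsafeRow is {pct:.1f}% slower than Json")
```

   On these three scans the UnsafeRow variant comes out roughly 9% to 12% slower in best time.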



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

