GitHub user SongYadong commented on the issue:

    https://github.com/apache/spark/pull/22348
  
    @dongjoon-hyun You are right. The DataSourceReadBenchmark results show that the benefit is too small, and in some cases it is masked by run-to-run fluctuation.
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_141-b15 on Windows 7 6.1
    Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
    
    Before:
    ```
    Parquet Reader Single TINYINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       330 /  334         47.7          21.0       1.0X
    ParquetReader Vectorized -> Row                213 /  301         73.7          13.6       1.5X
    ```
    After:
    ```
    Parquet Reader Single TINYINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       292 /  366         53.8          18.6       1.0X
    ParquetReader Vectorized -> Row                254 /  286         62.0          16.1       1.2X
    ```
    
    
    
    Before:
    ```
    Parquet Reader Single SMALLINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       391 /  425         40.2          24.9       1.0X
    ParquetReader Vectorized -> Row                371 /  407         42.4          23.6       1.1X
    ```
    After:
    ```
    Parquet Reader Single SMALLINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       435 /  485         36.1          27.7       1.0X
    ParquetReader Vectorized -> Row                398 /  440         39.5          25.3       1.1X
    ```
    
    
    
    Before:
    ```
    Parquet Reader Single INT Column Scan:   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       453 /  516         34.7          28.8       1.0X
    ParquetReader Vectorized -> Row                542 /  563         29.0          34.5       0.8X
    ```
    After:
    ```
    Parquet Reader Single INT Column Scan:   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       533 /  602         29.5          33.9       1.0X
    ParquetReader Vectorized -> Row                549 /  570         28.6          34.9       1.0X
    ```
    
    
    
    
    Before:
    ```
    Parquet Reader Single BIGINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       800 /  817         19.7          50.9       1.0X
    ParquetReader Vectorized -> Row                530 /  686         29.7          33.7       1.5X
    ```
    After:
    ```
    Parquet Reader Single BIGINT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       692 /  847         22.7          44.0       1.0X
    ParquetReader Vectorized -> Row                580 /  610         27.1          36.9       1.2X
    ```
    
    
    
    
    Before:
    ```
    Parquet Reader Single FLOAT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       467 /  543         33.7          29.7       1.0X
    ParquetReader Vectorized -> Row                457 /  507         34.4          29.1       1.0X
    ```
    After:
    ```
    Parquet Reader Single FLOAT Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       584 /  600         26.9          37.1       1.0X
    ParquetReader Vectorized -> Row                546 /  555         28.8          34.7       1.1X
    ```
    
    
    
    
    Before:
    ```
    Parquet Reader Single DOUBLE Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       801 /  808         19.6          50.9       1.0X
    ParquetReader Vectorized -> Row                590 /  668         26.7          37.5       1.4X
    ```
    After:
    ```
    Parquet Reader Single DOUBLE Column Scan: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       833 /  858         18.9          53.0       1.0X
    ParquetReader Vectorized -> Row                672 /  722         23.4          42.7       1.2X
    ```
    
    
    Before:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                      2098 / 2263          5.0         200.1      13.3X
    ```
    After:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                      1943 / 2084          5.4         185.3      14.0X
    ```
    
    
    
    Before:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                      1930 / 1980          5.4         184.0      15.0X
    ```
    After:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                      1873 / 1875          5.6         178.6      16.7X
    ```
    
    
    Before:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       301 /  356         34.9          28.7      83.1X
    ```
    After:
    ```
    String with Nulls Scan:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    ParquetReader Vectorized                       506 /  542         20.7          48.3      53.0X
    ```
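
To make the fluctuation easier to see, the before/after numbers for the `ParquetReader Vectorized` rows of the six numeric-column tables above can be reduced to percentage changes. The following is only a rough sketch in Python, not part of the benchmark itself: the names `RESULTS`, `pct_change`, and `summarize` are made up for illustration, and the values are the Best/Avg times (ms) copied by hand from the tables.

```python
# Best/Avg times in ms for the "ParquetReader Vectorized" rows,
# copied by hand from the tables above: (before, after), each as (best, avg).
RESULTS = {
    "TINYINT":  ((330, 334), (292, 366)),
    "SMALLINT": ((391, 425), (435, 485)),
    "INT":      ((453, 516), (533, 602)),
    "BIGINT":   ((800, 817), (692, 847)),
    "FLOAT":    ((467, 543), (584, 600)),
    "DOUBLE":   ((801, 808), (833, 858)),
}


def pct_change(before, after):
    """Relative change in percent; a negative value means the run got faster."""
    return (after - before) / before * 100.0


def summarize(results):
    """Map each benchmark name to (best change %, avg change %)."""
    summary = {}
    for name, ((b_best, b_avg), (a_best, a_avg)) in results.items():
        summary[name] = (pct_change(b_best, a_best), pct_change(b_avg, a_avg))
    return summary


if __name__ == "__main__":
    for name, (best, avg) in summarize(RESULTS).items():
        print(f"{name:<9} best {best:+6.1f}%   avg {avg:+6.1f}%")
```

With these numbers, the best time improves in only two of the six cases (TINYINT and BIGINT), and in both of those the average time gets worse, which is what one would expect if the differences are mostly noise from single runs.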
    
    


