Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/21288
  
    @dongjoon-hyun I got the same result under the same conditions (enough memory), but with `--driver-memory 3g` (less memory) I got slightly different results:
    ```
    // --driver-memory=3g (default)
    OpenJDK 64-Bit Server VM 1.8.0_171-b10 on Linux 4.14.33-51.37.amzn1.x86_64
    Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
    Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    Parquet Vectorized                          10084 / 10154          1.6         641.1       1.0X
    Parquet Vectorized (Pushdown)                  967 / 1008         16.3          61.5      10.4X
    Native ORC Vectorized                       11088 / 11116          1.4         705.0       0.9X
    Native ORC Vectorized (Pushdown)               270 /  278         58.2          17.2      37.3X
    
    Select 1 string row (value = '7864320'): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    Parquet Vectorized                          10032 / 10085          1.6         637.8       1.0X
    Parquet Vectorized (Pushdown)                  959 /  998         16.4          61.0      10.5X
    Native ORC Vectorized                       11104 / 11128          1.4         706.0       0.9X
    Native ORC Vectorized (Pushdown)               259 /  277         60.6          16.5      38.7X
    ...
    
    
    // --driver-memory=10g
    OpenJDK 64-Bit Server VM 1.8.0_171-b10 on Linux 4.14.33-51.37.amzn1.x86_64
    Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
    Select 0 string row (value IS NULL):     Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    Parquet Vectorized                            9201 / 9300          1.7         585.0       1.0X
    Parquet Vectorized (Pushdown)                   89 /  105        176.3           5.7     103.1X
    Native ORC Vectorized                         8886 / 8898          1.8         564.9       1.0X
    Native ORC Vectorized (Pushdown)               110 /  128        143.4           7.0      83.9X
    
    Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    ------------------------------------------------------------------------------------------------
    Parquet Vectorized                            9336 / 9357          1.7         593.6       1.0X
    Parquet Vectorized (Pushdown)                  927 /  937         17.0          58.9      10.1X
    Native ORC Vectorized                         9026 / 9041          1.7         573.9       1.0X
    Native ORC Vectorized (Pushdown)               257 /  272         61.1          16.4      36.3X
    ...
    ```
    Does Parquet have a smaller memory footprint? I'm currently looking into this (I updated the results for the enough-memory case).
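
    For anyone comparing the two runs, here is a minimal sketch of how the `Relative` column seems to follow from the best times: each case's value looks like the first row's best time divided by that case's best time, rounded to one decimal. That reading is an assumption inferred from the numbers above, and `relative_speedups` is a hypothetical helper, not part of the benchmark code.

    ```python
    def relative_speedups(best_ms):
        """Return baseline_best / case_best per case, rounded to one decimal.

        Assumes the first entry is the 1.0X baseline row, as in the tables above.
        """
        baseline = next(iter(best_ms.values()))
        return {name: round(baseline / ms, 1) for name, ms in best_ms.items()}


    # Best times (ms) copied from the "'7864320' < value < '7864320'" tables.
    driver_3g = {
        "Parquet Vectorized": 10084,
        "Parquet Vectorized (Pushdown)": 967,
        "Native ORC Vectorized": 11088,
        "Native ORC Vectorized (Pushdown)": 270,
    }
    driver_10g = {
        "Parquet Vectorized": 9336,
        "Parquet Vectorized (Pushdown)": 927,
        "Native ORC Vectorized": 9026,
        "Native ORC Vectorized (Pushdown)": 257,
    }

    for label, table in (("3g", driver_3g), ("10g", driver_10g)):
        for name, speedup in relative_speedups(table).items():
            print(f"{label:>3}  {name:<34} {speedup}X")
    ```

    With these inputs the helper reproduces the reported column (e.g. 10.4X and 37.3X for the pushdown cases at 3g), which shows that the non-pushdown ORC case drops from 1.0X to 0.9X of the Parquet baseline when the driver memory is reduced.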

