Andy Grove created ARROW-10453:
----------------------------------

             Summary: [Rust] [DataFusion] Performance degradation after 
removing specialization
                 Key: ARROW-10453
                 URL: https://issues.apache.org/jira/browse/ARROW-10453
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust, Rust - DataFusion
    Affects Versions: 3.0.0
            Reporter: Andy Grove
             Fix For: 3.0.0


The following commit caused a significant performance regression (query 1 is 
roughly 36% slower) in the TPC-H benchmark running against an SF=100 data set.
{code:java}
29e9d13481ea6acc3f74cda108ed34ef8a411ba2 is the first bad commit
commit 29e9d13481ea6acc3f74cda108ed34ef8a411ba2
Author: Jorge C. Leitao <[email protected]>
Date:   Sun Oct 18 21:05:48 2020 +0200

    ARROW-10002: [Rust] Remove trait specialization from arrow crate
    
    This PR removes trait specialization by leveraging the compiler to remove trivial `if` statements.
    
    I verified that the assembly code was the same in a [simple example](https://rust.godbolt.org/z/qrcW8W). I do not know if this generalizes to our use-case, but I suspect so as LLVM is (hopefully) removing trivial branches like `if a != a`.
    
    The change `get_data_type()` to `DATA_TYPE` is not necessary. I did it before realizing this. IMO it makes it more explicit that this is not a function, but a constant, but we can revert it.
    
    Closes #8485 from jorgecarleitao/simp_types
    
    Authored-by: Jorge C. Leitao <[email protected]>
    Signed-off-by: Neville Dipale <[email protected]>

:040000 040000 cbdaf3c9e924ec0e51d178df73169956b2bf723f 87c79e17378196b61dce9c5373e008ee94620d58 M     rust
{code}
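For readers unfamiliar with the pattern the commit message describes, here is a minimal sketch of branching on an associated constant instead of using trait specialization. All names here (`DataType`, `PrimitiveType`, `BooleanType`, `Int32Type`, `value_buffer_len`) are simplified, hypothetical stand-ins, not the arrow crate's actual definitions. The assumption is that, after monomorphization, the compiler sees the `if` condition as a constant and drops the dead branch; whether that holds in every hot path is exactly what this issue questions.

```rust
// Hypothetical, simplified stand-ins for the arrow crate's types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    Boolean,
    Int32,
}

pub trait PrimitiveType {
    /// Associated constant, in the spirit of the `DATA_TYPE` change
    /// mentioned in the commit message.
    const DATA_TYPE: DataType;
}

pub struct BooleanType;
pub struct Int32Type;

impl PrimitiveType for BooleanType {
    const DATA_TYPE: DataType = DataType::Boolean;
}

impl PrimitiveType for Int32Type {
    const DATA_TYPE: DataType = DataType::Int32;
}

/// Bytes needed to store `len` values of `T` (illustrative only).
/// With specialization, the boolean case would be a separate impl;
/// here it is an ordinary branch on a per-type constant.
pub fn value_buffer_len<T: PrimitiveType>(len: usize) -> usize {
    if T::DATA_TYPE == DataType::Boolean {
        // Booleans are bit-packed: one bit per value, rounded up to bytes.
        (len + 7) / 8
    } else {
        // This sketch only models 4-byte values for the non-boolean case.
        len * 4
    }
}
```

Each instantiation, such as `value_buffer_len::<Int32Type>`, is compiled separately, so the comparison is between two values known at compile time. Comparing the generated assembly before and after the commit would show whether the optimizer actually eliminates the branch in the real code.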
Benchmark command:
{code:java}
cargo run --release --bin tpch -- --iterations 3 \
  --path /mnt/tpch/parquet-100GB --format parquet --query 1 \
  --batch-size 4096 --concurrency 24
{code}
Before this commit:
{code:java}
Query 1 iteration 0 took 13629 ms
Query 1 iteration 1 took 13450 ms
Query 1 iteration 2 took 13465 ms {code}
After this commit:
{code:java}
Query 1 iteration 0 took 18586 ms
Query 1 iteration 1 took 18297 ms
Query 1 iteration 2 took 18253 ms {code}
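To put a number on the regression, the timings above can be reduced to a relative slowdown. The following is a small sketch. The helper names `mean_ms` and `slowdown_percent` are made up for illustration, and averaging the three iterations is an assumption about how to summarize them, not something the benchmark itself reports.

```rust
/// Arithmetic mean of a set of timings in milliseconds.
fn mean_ms(samples: &[u64]) -> f64 {
    samples.iter().sum::<u64>() as f64 / samples.len() as f64
}

/// Relative slowdown of `after` versus `before`, in percent.
fn slowdown_percent(before: &[u64], after: &[u64]) -> f64 {
    (mean_ms(after) / mean_ms(before) - 1.0) * 100.0
}

// Timings copied from the benchmark output above.
const BEFORE_MS: [u64; 3] = [13629, 13450, 13465];
const AFTER_MS: [u64; 3] = [18586, 18297, 18253];
```

Calling `slowdown_percent(&BEFORE_MS, &AFTER_MS)` gives a slowdown of about 36% for query 1 on this data set.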



