andygrove commented on pull request #8409:
URL: https://github.com/apache/arrow/pull/8409#issuecomment-706429236


   The results are pretty interesting to me.
   
   Without `--mem-table`:
   
   ```
   Running benchmarks with the following options: TpchOpt { query: 1, debug: false, iterations: 3, concurrency: 24, batch_size: 4096, path: "/mnt/tpch/s1/parquet", file_format: "parquet", mem_table: false }
   Query 1 iteration 0 took 241 ms
   Query 1 iteration 1 took 164 ms
   Query 1 iteration 2 took 167 ms
   ```
   
   With `--mem-table`:
   
   ```
   Running benchmarks with the following options: TpchOpt { query: 1, debug: false, iterations: 3, concurrency: 24, batch_size: 4096, path: "/mnt/tpch/s1/parquet", file_format: "parquet", mem_table: true }
   Loading data into memory
   Loaded data into memory in 11240 ms
   Query 1 iteration 0 took 353 ms
   Query 1 iteration 1 took 302 ms
   Query 1 iteration 2 took 322 ms
   ```
   
   I filed https://issues.apache.org/jira/browse/ARROW-10251 to fix the
   single-threaded loading in MemTable, but I'm not sure why the actual query
   time is slower for mem tables than for Parquet.
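
To put a number on the gap, here is a minimal sketch that compares the mean iteration times from the two runs above. The helper names `mean_ms` and `slowdown` are made up for illustration and are not part of the benchmark crate; the timings are copied from the logs.

```rust
/// Mean of a set of iteration timings, in milliseconds.
/// Returns NaN for an empty slice.
fn mean_ms(samples: &[u64]) -> f64 {
    samples.iter().sum::<u64>() as f64 / samples.len() as f64
}

/// Ratio of the mean of `slow` to the mean of `fast`.
/// A value above 1.0 means `slow` really is slower on average.
fn slowdown(fast: &[u64], slow: &[u64]) -> f64 {
    mean_ms(slow) / mean_ms(fast)
}

fn main() {
    // Per-iteration timings (ms) copied from the benchmark output above.
    let parquet: [u64; 3] = [241, 164, 167];
    let mem_table: [u64; 3] = [353, 302, 322];

    println!("parquet mean:   {:.1} ms", mean_ms(&parquet));
    println!("mem table mean: {:.1} ms", mean_ms(&mem_table));
    println!("slowdown (all iterations):  {:.2}x", slowdown(&parquet, &mem_table));

    // Iteration 0 is the slowest in both runs, so also compare
    // iterations 1 and 2 on their own.
    println!(
        "slowdown (iterations 1..):  {:.2}x",
        slowdown(&parquet[1..], &mem_table[1..])
    );
}
```

With only three iterations per run this is a rough indication rather than a rigorous measurement, but the mem table run is slower on every iteration, not just on average.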



