[jira] [Created] (IMPALA-8888) Profile fetch performance when result spooling is enabled

Sahil Takiar (Jira) Fri, 23 Aug 2019 09:41:23 -0700

Sahil Takiar created IMPALA-8888:
------------------------------------

             Summary: Profile fetch performance when result spooling is enabled
                 Key: IMPALA-8888
                 URL: https://issues.apache.org/jira/browse/IMPALA-8888
             Project: IMPALA
          Issue Type: Sub-task
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar



Profile the performance of fetching rows when result spooling is enabled. There 
are a few queries that can be used to benchmark the performance:

{{time ./bin/impala-shell.sh -B -q "select l_orderkey from tpch_parquet.lineitem" > /dev/null}}

{{time ./bin/impala-shell.sh -B -q "select * from tpch_parquet.orders" > /dev/null}}

The first fetches one column and 6,001,215 rows; the second fetches 9 columns 
and 1,500,000 rows. Together they cover a mix of rows fetched vs. columns fetched.

The baseline for the benchmark should be the commit prior to IMPALA-8780.

The benchmark should measure both latency and CPU usage (to see whether the 
copy into {{BufferedTupleStream}} adds significant overhead).

The benchmark should also use various fetch sizes to see whether increasing 
the fetch size improves performance when result spooling is enabled (ideally 
it should). It would also be useful to run some fetches between machines, 
since that better reflects network round-trip latencies.
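
As a starting point, the timing runs described above could be scripted roughly as follows. This is only a sketch, not part of Impala: the helper names ({{impala_shell_argv}}, {{time_command}}, {{summarize}}, {{benchmark_queries}}) and the run count are assumptions made for illustration. It is Unix-only because it relies on the {{resource}} module. The CPU figure it reports covers only the impala-shell client process, so the CPU cost of the copy into {{BufferedTupleStream}}, which happens inside impalad, would still have to be measured separately on the server side. Setting the fetch size is left to the caller, who can extend the argument list as needed.

```python
import resource
import statistics
import subprocess
import time

# The two benchmark queries from this issue.
QUERIES = [
    "select l_orderkey from tpch_parquet.lineitem",
    "select * from tpch_parquet.orders",
]


def impala_shell_argv(query, shell="./bin/impala-shell.sh"):
    """Build the impala-shell command line used in this issue for one query."""
    return [shell, "-B", "-q", query]


def time_command(argv, runs=3):
    """Run argv `runs` times, discarding stdout.

    Returns a list of (wall_seconds, child_cpu_seconds) tuples, one per run.
    The CPU value is user + system time of the child process only.
    """
    samples = []
    for _ in range(runs):
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        wall = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
        samples.append((wall, cpu))
    return samples


def summarize(samples):
    """Reduce per-run samples to median wall-clock and median CPU seconds."""
    return {
        "runs": len(samples),
        "wall_median": statistics.median(w for w, _ in samples),
        "cpu_median": statistics.median(c for _, c in samples),
    }


def benchmark_queries(runs=3):
    """Time every query in QUERIES; requires a running Impala cluster."""
    return {q: summarize(time_command(impala_shell_argv(q), runs)) for q in QUERIES}
```

Calling {{benchmark_queries()}} once on the commit prior to IMPALA-8780 and once on the commit under test gives two sets of medians that can be compared directly. Using the median rather than the mean keeps a single slow run from skewing the result.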



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
