[jira] [Resolved] (IMPALA-5243) Slow codegen for wide Avro tables

Philip Zeyliger (JIRA) Wed, 29 Nov 2017 15:52:56 -0800

     [ https://issues.apache.org/jira/browse/IMPALA-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Philip Zeyliger resolved IMPALA-5243.
-------------------------------------
    Resolution: Fixed

{code}
commit 43ef80e4f1c93ea69883a4670c79ba10b0ed0432
Author: Philip Zeyliger <[email protected]>
Date:   Wed Sep 13 09:04:56 2017 -0700

    IMPALA-5243: Speed up code gen for wide Avro tables.

    HdfsAvroScanner::CodegenMaterializeTuple generates a function whose size
    is linear in the number of columns. On 1000-column tables, codegen time
    is significant. This commit roughly halves it for wide tables.
    (Note that this had been much worse in recent history (<= Impala 2.9).)

    It does so by breaking up MaterializeTuple() into multiple smaller
    functions and then calling them in order. When breaking up into
    200-column chunks, there is a noticeable speed-up.

    I've made the helper code for generating LLVM function prototypes
    have a mutable function name, so that the builder can be re-used
    multiple times.

    I've checked by inspecting optimized LLVM that in the case where there's
    only 1 helper function, code gets inlined so that there doesn't seem to
    be an extra function.

    I measured codegen time for various "step sizes." The case where there
    are no helper functions takes about 2.7s. The best case was a step size
    of about 200, with timings of about 1.3s.

    For the query
    "select count(int_col16) from functional_avro.widetable_1000_cols",
    codegen times as a function of step size are roughly as follows. This is
    averaged across 5 executions, and rounded to 0.1s.

       step     time (s)
         10     2.4
         50     2.5
         75     2.9
        100     3.0
        125     3.0
        150     1.4
        175     1.3
        200     1.3 <-- chosen step size
        225     1.5
        250     1.4
        300     1.6
        400     1.6
        500     1.8
       1000     2.7

    The raw data was generated like so, with some code that let me change the
    step size at runtime:

      $ (for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do
          for try in $(seq 5); do
            echo $step > /tmp/step_size.txt; echo -n "$step "
            impala-shell.sh -q "select count(int_col16) from functional_avro.widetable_1000_cols; profile;" 2> /dev/null |
              grep -A9 'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' |
              sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo
          done
        done) | tee out.txt
      ...
      200  CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms
      ...
      1000  CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms

    I have run the core tests with this change.

    Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
    Reviewed-on: http://gerrit.cloudera.org:8080/8211
    Reviewed-by: Tim Armstrong <[email protected]>
    Tested-by: Impala Public Jenkins
{code}
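
The chunking idea in the commit message can be illustrated with a small sketch. This is not the Impala implementation (which emits LLVM IR from C++); it only shows, in Python, how a per-column loop can be split into fixed-size helper calls that run in order. The names chunk_ranges, materialize_chunk and materialize_tuple are hypothetical, and the default step of 200 is the value the commit reports as fastest.

```python
def chunk_ranges(num_columns, step_size):
    """Split column indices [0, num_columns) into consecutive half-open ranges."""
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return [(start, min(start + step_size, num_columns))
            for start in range(0, num_columns, step_size)]


def materialize_chunk(row, start, end):
    """Hypothetical stand-in for one generated helper covering columns [start, end)."""
    return [row[i] for i in range(start, end)]


def materialize_tuple(row, step_size=200):
    """Materialize a whole row by calling one small helper per chunk, in order."""
    out = []
    for start, end in chunk_ranges(len(row), step_size):
        out.extend(materialize_chunk(row, start, end))
    return out
```

With 1000 columns and a step size of 200 this yields five helper calls; with a step size of 1000 it degenerates to a single helper, which is the case the commit message says gets inlined.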

> Slow codegen for wide Avro tables
> ---------------------------------
>
>                 Key: IMPALA-5243
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5243
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.7.0, Impala 2.8.0
>            Reporter: Alexander Behm
>            Assignee: Philip Zeyliger
>              Labels: codegen, performance, ramp-up
>         Attachments: screenshot-1.png
>
>
> Codegen gets rather expensive when scanning wide Avro tables (>500 columns), 
> regardless of how many columns are materialized by the query.
> {code}
> select count(int_col16) from functional_avro.widetable_250_cols;
> +------------------+
> | count(int_col16) |
> +------------------+
> | 10               |
> +------------------+
> Fetched 1 row(s) in 0.93s
> select count(int_col16) from functional_avro.widetable_500_cols;
> +------------------+
> | count(int_col16) |
> +------------------+
> | 10               |
> +------------------+
> Fetched 1 row(s) in 2.87s
> select count(int_col16) from widetable_1000_cols;
> +------------------+
> | count(int_col16) |
> +------------------+
> | 10               |
> +------------------+
> Fetched 1 row(s) in 10.58s
> {code}
> For the last query with 1000 columns, here's the codegen snippet from the 
> query profile:
> {code}
>         CodeGen:(Total: 10s115ms, non-child: 10s115ms, % non-child: 100.00%)
>            - CodegenTime: 530.211us
>            - CompileTime: 1s683ms
>            - LoadTime: 0.000ns
>            - ModuleBitcodeSize: 1.98 MB (2073044)
>            - NumFunctions: 32 (32)
>            - NumInstructions: 8.41K (8413)
>            - OptimizationTime: 8s416ms
>            - PeakMemoryUsage: 4.11 MB (4307456)
>            - PrepareTime: 15.357ms
> {code}
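
To compare counters such as OptimizationTime and the CodeGen total, the duration strings in the profile need to be converted to a common unit. The following is a minimal sketch, assuming only the unit suffixes that appear in this message (s, ms, us, ns); parse_duration is a hypothetical helper, not part of Impala or impala-shell, and it rejects anything else rather than guessing.

```python
import re

# Seconds per unit, limited to the suffixes seen in the profiles quoted here.
_UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# "ms", "us" and "ns" are listed before "s" so the longer suffix is tried first.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|s)")
_WHOLE = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|us|ns|s))+")


def parse_duration(text):
    """Convert a profile duration such as '10s115ms' or '530.211us' to seconds."""
    text = text.strip()
    if not _WHOLE.fullmatch(text):
        raise ValueError("unrecognized duration: %r" % text)
    return sum(float(value) * _UNIT_SECONDS[unit]
               for value, unit in _TOKEN.findall(text))


def optimization_share(total, optimization):
    """Fraction of the CodeGen total spent in OptimizationTime."""
    return parse_duration(optimization) / parse_duration(total)
```

For the profile quoted in the issue, optimization_share("10s115ms", "8s416ms") comes out at roughly 0.83, that is, most of the codegen time for the 1000-column table was spent in LLVM optimization.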



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
