[ https://issues.apache.org/jira/browse/IMPALA-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889826#comment-17889826 ]
ASF subversion and git services commented on IMPALA-11943:
----------------------------------------------------------
Commit c5b7c6395b1f161bbc55596bb3694fbcf3e658c6 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c5b7c6395 ]
IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
String functions that have both UTF-8 and the traditional ASCII
behaviors have checks for the UTF8_MODE query option. The check is
intended to be replaced with constants during codegen in
LlvmCodeGen::InlineConstFnAttrs().
However, as mentioned in the method comment, InlineConstFnAttrs() only
replaces call instructions inside the current function. To replace the
call to FunctionContextImpl::GetConstFnAttr() inside the callee
functions, we have to inline the callee functions (by annotating them
with IR_ALWAYS_INLINE).
This patch annotates UTF-8 related string functions with
IR_ALWAYS_INLINE to make sure the checks on UTF8_MODE are all replaced
in codegen. Note that builtin functions don't need the annotation if
they are not invoked by other builtin functions, because
InlineConstFnAttrs() will be invoked recursively in the expression tree.
See ScalarFnCall::GetCodegendComputeFnImpl().
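The inlining requirement described above can be illustrated with a toy
model. This is only a sketch and not Impala code: ToyFunction, ToyModule,
InlineAlways, UnreplacedCalls, RemainingRuntimeChecks, BuildModule and the
function names "Instr" and "Utf8Helper" are all hypothetical. It assumes a
replacement pass that only rewrites GetConstFnAttr calls appearing directly
in the body of the function it is run on, and shows that a call hidden in a
helper is only reached once the helper is marked for inlining.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR function: just the names it calls, in order.
struct ToyFunction {
  std::vector<std::string> calls;
  bool always_inline;  // models the IR_ALWAYS_INLINE annotation
};
using ToyModule = std::map<std::string, ToyFunction>;

const char* const kConstCall = "GetConstFnAttr";

// Returns the body of `name` after splicing in every callee that is
// marked always_inline (recursively).
std::vector<std::string> InlineAlways(const ToyModule& m, const std::string& name) {
  std::vector<std::string> body;
  for (const std::string& callee : m.at(name).calls) {
    auto it = m.find(callee);
    if (it != m.end() && it->second.always_inline) {
      std::vector<std::string> inlined = InlineAlways(m, callee);
      body.insert(body.end(), inlined.begin(), inlined.end());
    } else {
      body.push_back(callee);
    }
  }
  return body;
}

// Counts the GetConstFnAttr calls executed by `name` when no replacement
// pass has touched it or its callees.
int UnreplacedCalls(const ToyModule& m, const std::string& name) {
  int n = 0;
  for (const std::string& callee : m.at(name).calls) {
    if (callee == kConstCall) {
      ++n;
    } else if (m.count(callee) > 0) {
      n += UnreplacedCalls(m, callee);
    }
  }
  return n;
}

// Models a pass that replaces only the direct GetConstFnAttr calls in the
// (post-inlining) body of `entry`; returns how many calls still run.
int RemainingRuntimeChecks(const ToyModule& m, const std::string& entry) {
  int remaining = 0;
  for (const std::string& callee : InlineAlways(m, entry)) {
    if (callee == kConstCall) continue;  // direct call: becomes a constant
    if (m.count(callee) > 0) remaining += UnreplacedCalls(m, callee);
  }
  return remaining;
}

// Hypothetical module: a builtin "Instr" that checks the option itself and
// also calls a helper "Utf8Helper" that checks it again.
ToyModule BuildModule(bool annotate_helper) {
  ToyFunction instr;
  instr.calls = {kConstCall, "Utf8Helper"};
  instr.always_inline = false;
  ToyFunction helper;
  helper.calls = {kConstCall};
  helper.always_inline = annotate_helper;
  ToyModule m;
  m["Instr"] = instr;
  m["Utf8Helper"] = helper;
  return m;
}
```

In this model, RemainingRuntimeChecks(BuildModule(false), "Instr") leaves
one check behind in the helper, while annotating the helper
(BuildModule(true)) leaves none.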
Perf tests:
Ran PERF_STRING queries in targeted-perf (scale factor 100) on parquet
format. Saw significant improvements:
+-----------------+--------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+---------+
| Query           | File Format  | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval    |
+-----------------+--------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+---------+
| PERF_STRING-Q6  | parquet/none | 11.98  | 12.53       |   -4.39%   | 0.44%     | 0.51%          | 30    |   -4.59%       | -6.54   | -36.46  |
| PERF_STRING-Q9  | parquet/none | 11.76  | 12.35       |   -4.77%   | 0.38%     | 0.41%          | 30    |   -5.04%       | -6.54   | -47.82  |
| PERF_STRING-Q7  | parquet/none | 9.88   | 10.44       | I -5.34%   | 0.83%     | 1.10%          | 30    | I -5.64%       | -6.54   | -21.64  |
| PERF_STRING-Q13 | parquet/none | 9.52   | 10.08       | I -5.56%   | 0.55%     | 0.59%          | 30    | I -5.89%       | -6.54   | -38.72  |
| PERF_STRING-Q11 | parquet/none | 10.97  | 11.64       | I -5.72%   | 0.44%     | 0.61%          | 30    | I -6.00%       | -6.54   | -42.72  |
| PERF_STRING-Q4  | parquet/none | 5.30   | 5.66        | I -6.33%   | 1.06%     | 1.44%          | 30    | I -6.49%       | -6.54   | -19.84  |
| PERF_STRING-Q3  | parquet/none | 5.27   | 5.70        | I -7.43%   | 1.03%     | 0.94%          | 30    | I -8.12%       | -6.54   | -30.40  |
| PERF_STRING-Q5  | parquet/none | 5.87   | 6.36        | I -7.69%   | 1.18%     | 0.88%          | 30    | I -8.47%       | -6.54   | -30.09  |
| PERF_STRING-Q12 | parquet/none | 6.56   | 7.15        | I -8.33%   | 0.67%     | 0.73%          | 30    | I -9.03%       | -6.54   | -47.94  |
| PERF_STRING-Q10 | parquet/none | 6.62   | 7.24        | I -8.54%   | 0.84%     | 0.60%          | 30    | I -9.34%       | -6.54   | -48.21  |
| PERF_STRING-Q2  | parquet/none | 4.77   | 5.25        | I -9.20%   | 0.89%     | 1.05%          | 30    | I -10.17%      | -6.54   | -37.90  |
| PERF_STRING-Q1  | parquet/none | 4.16   | 4.62        | I -9.80%   | 2.10%     | 1.10%          | 30    | I -11.27%      | -6.54   | -24.50  |
| PERF_STRING-Q8  | parquet/none | 5.10   | 6.57        | I -22.40%  | 0.87%     | 0.73%          | 30    | I -28.74%      | -6.54   | -123.34 |
+-----------------+--------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+---------+
Note that Q8-Q13 are new queries added by this patch to reveal the
performance difference.
The perf test command is:
  bin/single_node_perf_run.py --iterations 30 --scale 100 \
    --table_formats parquet/none --num_impalads 3 \
    --query_names 'PERF_STRING.*' f4a321cf8 c8d2b6a90
Running on this branch:
https://github.com/stiga-huang/impala/commits/inline-utf8-func-perf-test
Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Reviewed-on: http://gerrit.cloudera.org:8080/19535
Reviewed-by: Daniel Becker <[email protected]>
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Performance regression in utf8 string functions due to utf8_mode checks
> -----------------------------------------------------------------------
>
> Key: IMPALA-11943
> URL: https://issues.apache.org/jira/browse/IMPALA-11943
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Attachments: instr-perf.hist.txt, mask-perf.hist.txt
>
>
> IMPALA-2019 adds UTF-8 aware behavior for string functions. The behavior
> is turned on by a query option, UTF8_MODE.
> String functions that have both UTF-8 and the traditional ASCII behaviors
> check the UTF8_MODE query option. The check is intended to be
> replaced with constants during codegen in LlvmCodeGen::InlineConstFnAttrs().
> However, LlvmCodeGen::InlineConstFnAttrs() only replaces call instructions
> inside the current function. To replace the call to
> FunctionContextImpl::GetConstFnAttr() inside the callee functions, we have to
> inline the callee functions (by annotating them with IR_ALWAYS_INLINE).
> The UTF-8 related string functions are not annotated with IR_ALWAYS_INLINE,
> which causes a performance regression when they are used in predicates, i.e.
> when they are called in predicate expressions.
> A perf-top measurement shows that a noticeable portion of the CPU time
> (the 3.79% and 3.44% entries below) is spent in GetConstFnAttr():
> {noformat}
> Overhead  Shared Object   Symbol
>   34.92%  impalad         [.] snappy::RawUncompress
>    7.05%  perf-24093.map  [.] impala::Operators::Gt_IntVal_IntValWrapper:c2426f4b421512bc:5a20733f00000000
>    7.02%  perf-24100.map  [.] impala::Operators::Gt_IntVal_IntValWrapper:c2426f4b421512bc:5a20733f00000000
>    7.00%  perf-24096.map  [.] impala::Operators::Gt_IntVal_IntValWrapper:c2426f4b421512bc:5a20733f00000000
>    6.26%  [kernel]        [k] native_flush_tlb_one_user
>    3.79%  impalad         [.] impala::FunctionContextImpl::GetConstFnAttr
>    3.44%  impalad         [.] impala::FunctionContextImpl::GetConstFnAttr
>    3.02%  impalad         [.] impala::ScalarColumnReader<impala::StringValue, (parquet::Type::type)6, true>::ReadSlotsNoConversion
>    1.74%  [kernel]        [k] smp_call_function_many
>    1.59%  [kernel]        [k] copy_user_enhanced_fast_string
>    1.34%  impalad         [.] impala::RuntimeState::query_options
>    1.21%  impalad         [.] impala::RuntimeState::query_ctx
> {noformat}
> Attached the perf histograms. [^instr-perf.hist.txt] is measured for query
> {code:sql}
> select count(*) from tpch100_parquet.lineitem
> where instr(l_comment, 'egular courts above the') > 0;{code}
> [^mask-perf.hist.txt] is measured for query
> {code:sql}
> select count(*) from tpch100_parquet.lineitem
> where mask(l_comment) is null;{code}