[ https://issues.apache.org/jira/browse/IMPALA-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell reassigned IMPALA-12363:
--------------------------------------
Assignee: Joe McDonnell
> Upgrade re2 to version 2023-03-01 or higher
> -------------------------------------------
>
> Key: IMPALA-12363
> URL: https://issues.apache.org/jira/browse/IMPALA-12363
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.3.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Major
>
> There has been a lot of development on Google's re2 since the version that we
> currently use (20190301). In a prototype using version 2023-03-01, it seems
> to help TPC-H Q13, which has an "o_comment not like '%special%requests%'"
> predicate:
> {noformat}
> (I) Improvement: TPCH(42) TPCH-Q13 [parquet / none / none] (5.26s -> 4.77s [-9.43%])
> +---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
> | Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
> +---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
> | 03:AGGREGATE        | 8.84%      | 478.98ms | 503.19ms | -4.81%     | 1.74%     | 642.76ms | 695.25ms | -7.55%     | 3      | 15    | 6.30M  | 6.22M     |
> | 02:HASH JOIN        | 9.35%      | 506.60ms | 532.76ms | -4.91%     | 1.49%     | 664.59ms | 738.50ms | -10.01%    | 3      | 15    | 64.42M | 6.38M     |
> | F00:EXCHANGE SENDER | 38.39%     | 2.08s    | 1.99s    | +4.49%     | 0.87%     | 2.39s    | 2.28s    | +4.77%     | 3      | 15    | -1     | -1        |
> | 01:SCAN HDFS        | 38.93%     | 2.11s    | 2.64s    | -20.17%    | 0.88%     | 2.37s    | 2.99s    | -20.87%    | 3      | 15    | 62.32M | 6.30M     |
> +---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
> {noformat}
> This is with
> mt_dop=5,runtime_filter_min_size=8192,runtime_filter_max_size=2097152,max_num_runtime_filters=50,runtime_filter_wait_time_ms=10000.
> Beyond 2023-03-01, re2 takes an Abseil dependency. Those later versions may
> have further improvements (they replace some std::unordered_map structures
> with Abseil's hash table). We can look into them, but that is a little bit
> more work than moving to 2023-03-01.
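
For readers unfamiliar with why a regex library matters for this query, here is a rough sketch of how a LIKE pattern such as '%special%requests%' can be turned into a regular expression and evaluated per row. This is an illustration only, not Impala's code: it uses Python's standard `re` module as a stand-in for re2, the helper names `like_to_regex` and `not_like` are hypothetical, and LIKE escape characters are deliberately not handled.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into a regular expression.

    Hypothetical helper for illustration; escape characters are not handled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any sequence of characters
        elif ch == "_":
            parts.append(".")   # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return "".join(parts)


def not_like(value: str, pattern: str) -> bool:
    """Return True when value does NOT match the LIKE pattern."""
    # DOTALL lets the wildcards span newlines; fullmatch anchors both ends,
    # since LIKE must match the whole string.
    regex = re.compile(like_to_regex(pattern), re.DOTALL)
    return regex.fullmatch(value) is None


if __name__ == "__main__":
    pattern = "%special%requests%"
    print(like_to_regex(pattern))
    print(not_like("handle special packages and requests", pattern))
    print(not_like("requests that are special", pattern))
```

Because a predicate like this is evaluated once for every scanned row, the cost of the regex match would show up in the scan operator, which is consistent with 01:SCAN HDFS showing the largest change in the table above.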