Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4350

Change subject: IMPALA-4026: Implement double-buffering for BlockingQueue.
......................................................................

IMPALA-4026: Implement double-buffering for BlockingQueue.

With recent changes to improve the parquet scanner's efficiency,
row batches are produced more quickly, leading to higher contention
in the blocking queue shared between scanner threads and the scan
node. The contention happens between different producers (i.e. the
scanner threads) and also, to a lesser extent, between the scanner
threads and the scan node.

This change addresses the contention between the scanner threads and
the scan node by splitting the queue into a 'read_list_' and a
'write_list_'. The consumers will consume from 'read_list_' until
it's exhausted while the producers will enqueue into 'write_list_'
until it's full. When 'read_list_' is exhausted, the consumer will
atomically swap 'read_list_' with 'write_list_'. This reduces the
contention/overhead in two ways:

(1) 'read_list_' and 'write_list_' are protected by two different
    locks, so the consumer only contends for the write lock when
    'read_list_' is empty.
(2) The consumer only signals the producer after an entire
    'read_list_' is consumed, instead of signalling it per entry
    in the 'read_list_'.

This change also converts BlockingQueue to use POSIX pthread
primitives instead of the Boost library, which introduces some
unnecessary overhead (as observed from VTune profiles).

With this change, the running time of the regressed query
primitive_filter_bigint_non_selective went from 1.6s to 0.8s,
improving by 50%.

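The read/write list scheme described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in be/src/util/blocking-queue.h: the class name DoubleBufferedQueue, the method names, and the lock and condition-variable names are hypothetical, and it uses std::mutex and std::condition_variable for portability where the change itself uses POSIX pthread primitives.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical sketch of a double-buffered blocking queue.
// Lock order is always get_lock_ first, then put_lock_.
template <typename T>
class DoubleBufferedQueue {
 public:
  explicit DoubleBufferedQueue(std::size_t max_elements)
      : max_elements_(max_elements) {
    assert(max_elements > 0);
  }

  // Producer side: blocks while 'write_list_' is full.
  // Returns false if the queue has been shut down.
  bool BlockingPut(T val) {
    std::unique_lock<std::mutex> write_lock(put_lock_);
    put_cv_.wait(write_lock, [this] {
      return shutdown_ || write_list_.size() < max_elements_;
    });
    if (shutdown_) return false;
    write_list_.push_back(std::move(val));
    write_lock.unlock();
    get_cv_.notify_one();  // Wake a consumer that may be waiting to swap.
    return true;
  }

  // Consumer side: pops from 'read_list_' under get_lock_ only. The write
  // lock is taken just when 'read_list_' is empty and a swap is needed.
  // Returns false once the queue is shut down and fully drained.
  bool BlockingGet(T* out) {
    std::unique_lock<std::mutex> read_lock(get_lock_);
    if (read_list_.empty()) {
      std::unique_lock<std::mutex> write_lock(put_lock_);
      get_cv_.wait(write_lock,
                   [this] { return shutdown_ || !write_list_.empty(); });
      if (write_list_.empty()) return false;
      read_list_.swap(write_list_);  // Take the whole batch at once.
      write_lock.unlock();
      put_cv_.notify_all();  // One signal per swapped batch, not per entry.
    }
    *out = std::move(read_list_.front());
    read_list_.pop_front();
    return true;
  }

  // Unblocks all waiters; entries already queued can still be drained.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> write_lock(put_lock_);
      shutdown_ = true;
    }
    get_cv_.notify_all();
    put_cv_.notify_all();
  }

 private:
  const std::size_t max_elements_;
  std::mutex get_lock_;              // Protects read_list_.
  std::deque<T> read_list_;
  std::mutex put_lock_;              // Protects write_list_ and shutdown_.
  std::deque<T> write_list_;
  bool shutdown_ = false;
  std::condition_variable get_cv_;   // Signalled when write_list_ gains data.
  std::condition_variable put_cv_;   // Signalled when write_list_ is emptied.
};
```

In this sketch a producer notifies on every put for simplicity; a tuned version would likely signal only when a consumer is actually waiting. Entries come out in FIFO order because the consumer drains a whole swapped batch before taking the next one.
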
+---------------------+-----------------------+---------+------------+------------+----------------+
| Workload            | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+---------------------+-----------------------+---------+------------+------------+----------------+
| TARGETED-PERF(_300) | parquet / none / none | 34.74   | -4.56%     | 9.75       | -4.50%         |
+---------------------+-----------------------+---------+------------+------------+----------------+

+---------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload            | Query                                                  | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+---------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TARGETED-PERF(_300) | primitive_conjunct_ordering_1                          | parquet / none / none | 10.72  | 9.84        | +8.95%     | 2.57%     | 0.53%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_selective                      | parquet / none / none | 1.09   | 1.01        | +7.42%     | 2.59%     | 4.91%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_2                             | parquet / none / none | 6.01   | 5.71        | +5.38%     | 0.07%     | 0.96%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_non_selective                  | parquet / none / none | 1.04   | 0.99        | +4.89%     | 2.44%     | 2.46%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_5                          | parquet / none / none | 40.60  | 38.93       | +4.29%     | 3.50%     | 1.02%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_decimal_selective                     | parquet / none / none | 0.92   | 0.88        | +3.70%     | 0.06%     | 2.69%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_1                             | parquet / none / none | 1.70   | 1.66        | +2.74%     | 1.26%     | 1.43%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_empty_build_join_1                           | parquet / none / none | 13.31  | 13.07       | +1.79%     | 1.04%     | 0.98%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_lowndv                        | parquet / none / none | 3.62   | 3.59        | +0.73%     | 2.10%     | 0.04%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_like                           | parquet / none / none | 5.77   | 5.74        | +0.45%     | 3.07%     | 1.76%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_3                             | parquet / none / none | 58.96  | 58.73       | +0.39%     | 0.43%     | 0.27%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_highndv                       | parquet / none / none | 23.84  | 23.78       | +0.26%     | 0.67%     | 1.27%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_pk                            | parquet / none / none | 89.36  | 89.41       | -0.05%     | 2.64%     | 1.74%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_3                          | parquet / none / none | 1.53   | 1.53        | -0.23%     | 0.37%     | 0.07%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_4                          | parquet / none / none | 1.17   | 1.18        | -1.22%     | 1.30%     | 0.01%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_decimal_lowndv.test                  | parquet / none / none | 3.49   | 3.62        | -3.48%     | 1.45%     | 0.72%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_bigint_selective                      | parquet / none / none | 0.64   | 0.67        | -3.66%     | 4.04%     | 0.09%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_exchange_shuffle                             | parquet / none / none | 76.82  | 80.02       | -4.00%     | 0.28%     | 1.07%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_exchange_broadcast                           | parquet / none / none | 84.30  | 88.32       | -4.56%     | 0.67%     | 1.02%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_top-n_all                                    | parquet / none / none | 34.24  | 35.89       | -4.59%     | 1.85%     | 0.26%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_topn_bigint                                  | parquet / none / none | 5.02   | 5.28        | -4.89%     | 1.52%     | 0.98%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_orderby_bigint                               | parquet / none / none | 30.72  | 32.55       | -5.62%     | 0.00%     | 0.94%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_2                          | parquet / none / none | 73.94  | 78.98       | -6.39%     | 3.50%     | 2.23%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_shuffle_join_one_to_many_string_with_groupby | parquet / none / none | 228.99 | 244.67      | -6.41%     | 0.29%     | 0.74%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_orderby_all                                  | parquet / none / none | 108.38 | 116.47      | -6.94%     | 0.73%     | 2.14%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_decimal_highndv                      | parquet / none / none | 24.78  | 26.76       | -7.40%     | 0.08%     | 4.73%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_shuffle_join_union_all_with_groupby          | parquet / none / none | 74.98  | 83.63       | -10.34%    | 3.71%     | 1.21%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_decimal_non_selective                 | parquet / none / none | 0.88   | 1.12        | I -21.79%  | 1.14%     | 0.08%          | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_bigint_non_selective                  | parquet / none / none | 0.74   | 1.60        | I -53.54%  | 3.50%     | 1.53%          | 1           | 3     |
+---------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54
---
M be/src/util/blocking-queue.h
1 file changed, 113 insertions(+), 57 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/4350/1
--
To view, visit http://gerrit.cloudera.org:8080/4350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <[email protected]>
