[
https://issues.apache.org/jira/browse/IMPALA-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17966278#comment-17966278
]
ASF subversion and git services commented on IMPALA-13964:
----------------------------------------------------------
Commit 14597c7e2fd44ca21bcdfcdc6a73b5a7c4aa6241 in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=14597c7e2 ]
IMPALA-13964: Fix test_tuple_cache_tpc_queries.py flakiness
This makes two changes to deflake test_tuple_cache_tpc_queries.py.
First, it increases the runtime filter wait time from 60 seconds to
600 seconds. Correctness verification slows down the path that
produces the runtime filter. The slowdown depends on the speed of
storage, so that path can become very slow on test machines.
Second, this skips correctness checking for locations that are
immediately after streaming aggregations. Streaming aggregations can
produce variable output that the correctness checking can't handle.
For example, for a grouping aggregation computing a sum, the
preaggregation might produce (A: 3), or (A: 2), (A: 1), or
(A: 1), (A: 1), (A: 1). The finalization sees these as equivalent.
This change marks the nodes as variable starting with the
preaggregation and clears the mark at the finalize stage.
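The equivalence described above can be illustrated with a minimal Python sketch. It is not Impala code: finalize_sum is a hypothetical stand-in for the finalize stage, and the three row lists are the three preaggregation outputs from the example.

```python
from collections import defaultdict


def finalize_sum(partial_rows):
    """Merge partial (key, partial_sum) rows into final per-key sums.

    Hypothetical stand-in for the finalize stage of a grouping sum.
    """
    totals = defaultdict(int)
    for key, partial in partial_rows:
        totals[key] += partial
    return dict(totals)


# Three possible preaggregation outputs for the same three input rows.
fully_aggregated = [("A", 3)]
partially_aggregated = [("A", 2), ("A", 1)]
passed_through = [("A", 1), ("A", 1), ("A", 1)]

outputs = [fully_aggregated, partially_aggregated, passed_through]
finalized = [finalize_sum(rows) for rows in outputs]

# The intermediate row sets differ from one another, so a row-by-row
# comparison taken right after the preaggregation would report a mismatch.
rows_differ = len({tuple(sorted(rows)) for rows in outputs}) == 3

# After finalization, all three collapse to the same result.
results_match = all(result == {"A": 3} for result in finalized)
```

A check that compares rows directly after the preaggregation sees three different outputs, while a check placed after the finalize stage sees a single one.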
When skipping correctness checking, the tuple cache node does not
hit the cache normally. This guarantees that its children will run
and go through correctness checking.
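One way to picture the marking is the following Python sketch. It is only an illustration under stated assumptions, not the planner code from this commit: PlanNode, output_is_variable, skips_verification and the single-child plan shape are all hypothetical names and simplifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanNode:
    """Hypothetical plan node; not Impala's actual planner class."""
    name: str
    kind: str  # "scan", "preagg", "exchange", "finalize" or "tuple_cache"
    child: Optional["PlanNode"] = None


def output_is_variable(node: Optional[PlanNode]) -> bool:
    """Return True if the node's output may legitimately differ between runs."""
    if node is None:
        return False
    if node.kind == "preagg":
        return True  # the mark starts at the streaming preaggregation
    if node.kind == "finalize":
        return False  # the mark is cleared at the finalize stage
    return output_is_variable(node.child)  # other nodes inherit from the child


def skips_verification(cache_node: PlanNode) -> bool:
    """A tuple cache node skips correctness checking when its input is variable."""
    assert cache_node.kind == "tuple_cache"
    return output_is_variable(cache_node.child)


# Hypothetical linear plan: scan -> preagg -> cache -> exchange -> finalize -> cache
scan = PlanNode("scan", "scan")
preagg = PlanNode("preagg", "preagg", scan)
cache_after_preagg = PlanNode("cache1", "tuple_cache", preagg)
exchange = PlanNode("exchange", "exchange", cache_after_preagg)
finalize = PlanNode("finalize", "finalize", exchange)
cache_after_finalize = PlanNode("cache2", "tuple_cache", finalize)
```

In this sketch, cache_after_preagg skips verification and cache_after_finalize does not. Per the description above, a node that skips verification also avoids the normal cache hit, so its children still run and are checked at their own locations.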
Testing:
- Ran test_tuple_cache_tpc_queries.py locally
- Added a frontend test for this specific case
Change-Id: If5e1be287bdb489a89aea3b2d7bec416220feb9a
Reviewed-on: http://gerrit.cloudera.org:8080/23010
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Michael Smith <[email protected]>
> test_tuple_cache_tpc_queries.py intermittently shows errors for TPC-DS queries
> ------------------------------------------------------------------------------
>
> Key: IMPALA-13964
> URL: https://issues.apache.org/jira/browse/IMPALA-13964
> Project: IMPALA
> Issue Type: Task
> Components: Backend, Frontend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Critical
> Labels: broken-build
>
> test_tuple_cache_tpc_queries.py failed with some tuple cache correctness
> verification errors on some TPC-DS queries. For example:
> {noformat}
> query_test.test_tuple_cache_tpc_queries.TestTupleCacheTpcdsQuery.test_tpcds[protocol:
> beeswax | table_format: parquet/none | exec_option: {'test_replan': 1,
> 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000,
> 'disable_codegen': False, 'abort_on_error': 1,
> 'exec_single_node_rows_threshold': 0}-0-tpcds-decimal_v2-q72]
> E Inconsistent tuple cache found: Result '[(11581 80525 327 2452547 2452588
> 197 145263 69)]' of file
> '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-2/tuple-cache-debug-dump/1bc6486bcce556626e0e1705bd7f9578_3685260755/314cc074d3b8c64c:c66d4d0300000003_37.bad'
> doesn't exist in the reference file:
> '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-2/tuple-cache-debug-dump/1bc6486bcce556626e0e1705bd7f9578_3685260755/314cc074d3b8c64c:c66d4d0300000003_37_2840c3f9468fae3d:ac5f338400000003_37_ref.bad'.{noformat}
> This showed up in a nightly job. There were also failures for Q72, Q97,
> Q23-1, Q23-2. This does not reproduce on my development machine.