[
https://issues.apache.org/jira/browse/IMPALA-13727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923877#comment-17923877
]
Quanlong Huang commented on IMPALA-13727:
-----------------------------------------
Reproduced this with more output on the profile:
{code:python}
query_test/test_scanners.py:903: in test_multiple_blocks_mt_dop
assert ranges_per_host[host] == 2,\
E AssertionError: ScanRangesComplete for
impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27002): should be
2 in profile:{code}
The problem is that every fragment instance in [^profile.txt] except the last
one has a timing summary like "(Total: 49.999ms, non-child: 0.000ns, %
non-child: 0.00%)" right after the hostname part:
{noformat}
$ grep -E 'ScanRangesComplete|host=' profile.txt
Instance d843c27e276bfa7a:a86450f600000000 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27000):(Total: 49.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000001 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27000):(Total: 19.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000002 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27000):(Total: 19.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000003 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27001):(Total: 19.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000004 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27001):(Total: 19.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000006 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27002):(Total: 19.999ms, non-child: 0.000ns, % non-child: 0.00%)
- ScanRangesComplete: 1 (1)
Instance d843c27e276bfa7a:a86450f600000005 (host=impala-ec2-rhel88-m7g-4xlarge-ondemand-1b29.vpc.cloudera.com:27002):
- ScanRangesComplete: 1 (1){noformat}
"(Total:" is also matched by the regex:
{code:python}
host_list = re.findall(r'host=(\S+:[0-9]*)', result.runtime_profile)
{code}
https://github.com/apache/impala/blob/9b93ab8b55901fabd0db3dfca5fb5209122c0e34/tests/query_test/test_scanners.py#L885
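To see the mismatch concretely, here is a minimal sketch that applies the same
pattern to two lines shaped like the ones in the profile. The hostname
host-a.example.com and the sample_profile string are placeholders made up for
illustration, not taken from the attached profile.

```python
import re

# Hypothetical, shortened profile excerpt. Both instances run on the same
# host:port, but only the first line carries the "(Total: ...)" summary.
sample_profile = (
    "Instance 0 (host=host-a.example.com:27002):(Total: 19.999ms, non-child: 0.000ns)\n"
    "Instance 1 (host=host-a.example.com:27002):\n"
)

# The pattern currently used by the test.
captured = re.findall(r'host=(\S+:[0-9]*)', sample_profile)

# The same host shows up under two different strings, so any per-host
# counter keyed on these values is split across two keys.
for value in captured:
    print(repr(value))
```

With keys that differ like this, a per-host count of 1 instead of 2 is what one
would expect to see for the host of the last instance.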
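Note that "\S" matches any non-whitespace character, including ")", ":" and
"(", and "[0-9]*" also matches an empty string. So the captured group extends
to the last ":" before the first whitespace, which is the one in "(Total:" for
every instance except the last. We should change this regex to match only the
hostname and port.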
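One possible tightening, as a sketch only and not necessarily the fix that will
be committed: stop the host part at the first ":" or ")" and require at least
one digit for the port. HOST_RE, instances_per_host and the sample hostname are
hypothetical names used for illustration, and the pattern assumes hostnames
contain no ":" (so it would not handle bare IPv6 addresses).

```python
import re
from collections import Counter

# Assumed replacement pattern: the host part cannot contain whitespace,
# ':' or ')', and the port needs one or more digits.
HOST_RE = re.compile(r'host=([^\s:)]+:[0-9]+)')


def instances_per_host(profile_text):
    """Count how many fragment instances report each host:port."""
    return Counter(HOST_RE.findall(profile_text))


# Hypothetical excerpt: same host:port with and without the timing summary.
sample_profile = (
    "Instance 0 (host=host-a.example.com:27002):(Total: 19.999ms, non-child: 0.000ns)\n"
    "Instance 1 (host=host-a.example.com:27002):\n"
)

counts = instances_per_host(sample_profile)
print(counts)
```

Both lines now map to the same key, so the two instances on that host are
counted together regardless of whether the timing summary is present.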
> TestParquet.test_multiple_blocks_mt_dop failed by unexpected ranges_per_host
> ----------------------------------------------------------------------------
>
> Key: IMPALA-13727
> URL: https://issues.apache.org/jira/browse/IMPALA-13727
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Attachments: profile.txt
>
>
> The test could fail in any exec vectors, e.g.
> {code}
> query_test.test_scanners.TestParquet.test_multiple_blocks_mt_dop[protocol:
> beeswax | table_format: parquet/none | exec_option: {'test_replan': 1,
> 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0,
> 'disable_codegen': True, 'abort_on_error': 1, 'debug_action':
> '-1:OPEN:[email protected]',
> 'exec_single_node_rows_threshold': 0}] {code}
> Stacktrace
> {code:python}
> query_test/test_scanners.py:903: in test_multiple_blocks_mt_dop
> assert ranges_per_host[host] == 2
> E assert 1 == 2{code}
> Standard Error
> {noformat}
> SET
> client_identifier=query_test/test_scanners.py::TestParquet::()::test_multiple_blocks_mt_dop[protocol:beeswax|table_format:parquet/none|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'de;
> SET test_replan=1;
> SET mt_dop=2;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=True;
> SET abort_on_error=1;
> SET debug_action=-1:OPEN:[email protected];
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select count(l_orderkey) from functional_parquet.lineitem_sixblocks;
> -- 2025-02-01 20:42:19,750 INFO MainThread: Started query
> 0347a7702c366f22:89a1448a00000000
> SET
> client_identifier=query_test/test_scanners.py::TestParquet::()::test_multiple_blocks_mt_dop[protocol:beeswax|table_format:parquet/none|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'de;{noformat}