[ 
https://issues.apache.org/jira/browse/IMPALA-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17538139#comment-17538139
 ] 

Quanlong Huang commented on IMPALA-11295:
-----------------------------------------

After downloading and extracting the logs, I found the profile by its query id 
in the folder 
ubuntu-16.04-dockerised-tests-5677/logs/ee_tests/impalad_coord_exec-0/profiles. 
Using the impala-profile-tool utility, I extracted the text-format profile 
(uploaded as [^test_multiple_blocks_mt_dop.profile.txt] ).

Then I extracted the ScanRangesComplete counters and the instance hosts:
{code}
$ egrep 'ScanRangesComplete|host=' test_multiple_blocks_mt_dop.profile.txt
      Instance d04d0645dcd4ee6f:dc3ba79f00000000 (host=192.168.124.4:27000):
         - ScanRangesComplete: 1 (1)
      Instance d04d0645dcd4ee6f:dc3ba79f00000001 (host=192.168.124.5:27000):
           - ScanRangesComplete: 1 (1)
      Instance d04d0645dcd4ee6f:dc3ba79f00000006 (host=192.168.124.4:27000):
           - ScanRangesComplete: 2 (2)
      Instance d04d0645dcd4ee6f:dc3ba79f00000005 (host=192.168.124.4:27000):
           - ScanRangesComplete: 0 (0)
      Instance d04d0645dcd4ee6f:dc3ba79f00000002 (host=192.168.124.5:27000):
           - ScanRangesComplete: 1 (1)
      Instance d04d0645dcd4ee6f:dc3ba79f00000004 (host=192.168.124.6:27000):
           - ScanRangesComplete: 1 (1)
      Instance d04d0645dcd4ee6f:dc3ba79f00000003 (host=192.168.124.6:27000):
           - ScanRangesComplete: 1 (1)
{code}
The test asserts that each host has 2 scan ranges in total, which holds for 
the output above. The problem is that the test assumes the counters are ordered 
by host, which is not guaranteed:
{code:python}
# Again skip the Averaged Fragment; it comes first in the runtime profile.
# With mt_dop 2, every backend will have 2 instances which are printed consecutively
# in the profile.
for i in range(1, len(ranges_complete_list), 2):
  assert int(ranges_complete_list[i]) + int(ranges_complete_list[i + 1]) == 2
{code}
In the above profile, the first counter and host are for the averaged fragment, 
which can be ignored. The following 6 counters and their hosts are:
{noformat}
1 192.168.124.5
2 192.168.124.4
0 192.168.124.4
1 192.168.124.5
1 192.168.124.6
1 192.168.124.6
{noformat}
The test should instead sum the counters per host and assert on those totals.
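
A minimal sketch of host-based summing, assuming the hosts and counters have already been extracted into two parallel lists with the averaged fragment entry dropped. The helper name sum_ranges_by_host and the list names are hypothetical and not part of the existing test; the values are the six pairs listed above.

```python
from collections import defaultdict


def sum_ranges_by_host(hosts, ranges_complete):
    """Sum ScanRangesComplete counters per host, regardless of instance order."""
    totals = defaultdict(int)
    for host, count in zip(hosts, ranges_complete):
        totals[host] += int(count)
    return dict(totals)


# Hypothetical inputs: the six (counter, host) pairs from the profile above,
# in the order they appear, with the averaged fragment already skipped.
hosts = [
    "192.168.124.5", "192.168.124.4", "192.168.124.4",
    "192.168.124.5", "192.168.124.6", "192.168.124.6",
]
ranges_complete = ["1", "2", "0", "1", "1", "1"]

totals = sum_ranges_by_host(hosts, ranges_complete)
# Every host should have completed 2 scan ranges in total.
for host, total in totals.items():
    assert total == 2, "host %s completed %d scan ranges" % (host, total)
```

Keying the totals by host makes the check independent of the order in which instances are printed, which is the assumption that broke the pairwise loop.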

> TestParquet.test_multiple_blocks_mt_dop failed by unexpected 
> ranges_complete_list
> ---------------------------------------------------------------------------------
>
>                 Key: IMPALA-11295
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11295
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: flaky
>         Attachments: test_multiple_blocks_mt_dop.profile.txt
>
>
> Saw this in [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/5677]
> {noformat}
> query_test.test_acid.TestAcid.test_acid_compute_stats[protocol: beeswax | 
> exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none] (from pytest)
> {noformat}
> *Stacktrace*
> {code:python}
> query_test/test_scanners.py:701: in test_multiple_blocks_mt_dop
>     assert int(ranges_complete_list[i]) + int(ranges_complete_list[i + 1]) == 2
> E   assert (1 + 2) == 2
> E    +  where 1 = int('1')
> E    +  and   2 = int('2')
> {code}
> *Standard Error*
> {code:sql}
> SET mt_dop=2;
> -- executing against localhost:21000
> select count(l_orderkey) from functional_parquet.lineitem_sixblocks;
> -- 2022-05-16 22:44:28,068 INFO     MainThread: Started query d04d0645dcd4ee6f:dc3ba79f00000000
> SET 
> client_identifier=query_test/test_scanners.py::TestParquet::()::test_multiple_blocks_mt_dop[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'debug_action':'HDFS_SCANNER_;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)