[ 
https://issues.apache.org/jira/browse/DISPATCH-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301679#comment-17301679
 ] 

Charles E. Rolke commented on DISPATCH-1660:
--------------------------------------------

I have a Xenial Docker build to test this issue locally. I have rebuilt and run 
it many times since this issue was opened. It ran 17,000 successful passes over 
the weekend. Running it more today or again in the near future seems like a 
waste of time.

A good next step would be to run tests and debug builds on the system where the 
failure happens. Could anyone help me connect to the live debug session in 
Travis CI? According to 
[https://docs.travis-ci.com/user/running-build-in-debug-mode/] the debug 
pathway is there, but it looks like there are tricks to getting it to work for 
a regular user. Thanks in advance for help getting started.

Hats off to whoever got the backtrace into the self test output. This really 
helps. But ...
 * Which router was it that crashed? Did more than one router crash? From the 
test log I can't tell.
 * There are 15 calls to qdr_check_addr_CT in the code base. Which one was it 
that failed in this case? In the backtrace shown here control goes through an 
anonymous function to get to qdr_check_addr_CT(). Debugging an issue is a lot 
easier when you can see the function names in the backtrace. See DISPATCH-1971.
 * There is no access to the router logs. There are a lot of hints and clues in 
the router logs.

Problems similar to the one reported here happened during development of the 
oversize message blocking feature. Connection closures are normally discovered 
by proton and dispatch is notified of them. Oversize message rejection reverses 
that pattern and the connection closure is initiated in dispatch code. Now 
dispatch and proton have to get the events and states lined up so that the 
closure is effected properly and nothing breaks. On Xenial this is not 
happening as expected.

Given a system on which even some small percentage of the tests fail, a good 
debug strategy would be to add logging to identify when physical addr hashes 
are created and destroyed.
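
As a rough sketch of how such logging could be checked after a test run, 
assume the router were patched to emit one line per address hash event. The 
"ADDR_HASH create|destroy|use key=..." format below is hypothetical and does 
not exist in the router today. A small script can then flag any key that is 
used or destroyed while it is not live:

```python
import re

# Hypothetical log line format; the real router emits nothing like this today.
EVENT_RE = re.compile(r"ADDR_HASH (create|destroy|use) key=(\S+)")


def find_lifecycle_errors(lines):
    """Return (line_number, key, problem) tuples for suspicious events."""
    live = set()        # keys currently created and not yet destroyed
    problems = []
    for num, line in enumerate(lines, start=1):
        match = EVENT_RE.search(line)
        if not match:
            continue    # unrelated log line
        event, key = match.groups()
        if event == "create":
            if key in live:
                problems.append((num, key, "create while live"))
            live.add(key)
        elif event == "destroy":
            if key not in live:
                problems.append((num, key, "destroy while not live"))
            live.discard(key)
        else:  # "use"
            if key not in live:
                problems.append((num, key, "use while not live"))
    return problems
```

Feeding it the lines of each router log in turn would also show which router 
saw the bad access, assuming the logs can be retrieved from the failing system.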

Perhaps the fix is to keep safe-pointers to qd_hash_item_t objects in the hash 
buckets. Perhaps not. Only enlightened debugging will tell.
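
To illustrate the safe-pointer idea only, here is a minimal Python sketch. 
Dispatch is written in C, and the Pool and SafePtr names are made up for this 
example; it is not the router's implementation. Each slot carries a sequence 
number that is bumped when the slot is freed, so a stale handle is detected 
instead of being dereferenced:

```python
class SafePtr:
    """Handle that remembers the slot's sequence number at allocation time."""

    def __init__(self, pool, slot, seq):
        self.pool = pool
        self.slot = slot
        self.seq = seq

    def is_valid(self):
        # Valid only if the slot has not been freed since this handle was made.
        return self.pool._seqs[self.slot] == self.seq

    def deref(self):
        """Return the object, or None if the slot was freed or reused."""
        return self.pool._items[self.slot] if self.is_valid() else None


class Pool:
    """Slot pool whose slots carry a sequence number bumped on every free."""

    def __init__(self):
        self._items = []
        self._seqs = []
        self._free = []

    def alloc(self, value):
        if self._free:
            slot = self._free.pop()      # reuse a freed slot
            self._items[slot] = value
        else:
            slot = len(self._items)      # grow the pool
            self._items.append(value)
            self._seqs.append(0)
        return SafePtr(self, slot, self._seqs[slot])

    def free(self, ptr):
        if not ptr.is_valid():
            return                       # stale handle: ignore double free
        self._items[ptr.slot] = None
        self._seqs[ptr.slot] += 1        # invalidates all existing handles
        self._free.append(ptr.slot)
```

A hash bucket holding such a handle would get None back after the item is 
freed, even if the slot has since been reused for a different address.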

> Intermittent failure in system_tests_policy_oversize_basic
> ----------------------------------------------------------
>
>                 Key: DISPATCH-1660
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-1660
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 1.12.0
>            Reporter: Ganesh Murthy
>            Assignee: Charles E. Rolke
>            Priority: Major
>
> {noformat}
>      Start 25: system_tests_policy_oversize_basic25: Test command: 
> /usr/bin/python "/foo/qpid-dispatch/build/tests/run.py" "-m" "unittest" "-v" 
> "system_tests_policy_oversize_basic"
> 25: Test timeout computed to be: 600
> 25: test_40_block_oversize_INTA_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_41_block_oversize_INTA_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_42_block_oversize_INTA_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_43_block_oversize_INTA_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_44_block_oversize_INTB_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_45_block_oversize_INTB_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: OversizeMessageTransferTest - e46
> 25: 2020-05-20 12:11:40.002177 on_start
> 25: 2020-05-20 12:11:40.002267 on_start: opening receiver connection to 
> amqp://0.0.0.0:24197
> 25: 2020-05-20 12:11:40.002545 on_start: opening   sender connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:11:40.002714 on_start: Creating receiver
> 25: 2020-05-20 12:11:40.002878 on_start: Creating sender
> 25: 2020-05-20 12:11:40.002997 on_start: done
> 25: 2020-05-20 12:12:40.003628 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:12:45.289228 test_46 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_46_block_oversize_INTB_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: OversizeMessageTransferTest - e47
> 25: 2020-05-20 12:12:45.310757 on_start
> 25: 2020-05-20 12:12:45.310873 on_start: opening receiver connection to 
> amqp://0.0.0.0:24199
> 25: 2020-05-20 12:12:45.311197 on_start: opening   sender connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:12:45.311385 on_start: Creating receiver
> 25: 2020-05-20 12:12:45.311570 on_start: Creating sender
> 25: 2020-05-20 12:12:45.311701 on_start: done
> 25: 2020-05-20 12:13:45.311633 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:13:48.055263 test_47 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_47_block_oversize_INTB_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: test_48_block_oversize_EA1_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: OversizeMessageTransferTest - e49
> 25: 2020-05-20 12:13:48.275921 on_start
> 25: 2020-05-20 12:13:48.275995 on_start: opening receiver connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:13:48.276285 on_start: opening   sender connection to 
> amqp://0.0.0.0:24197
> 25: 2020-05-20 12:13:48.276454 on_start: Creating receiver
> 25: 2020-05-20 12:13:48.276615 on_start: Creating sender
> 25: 2020-05-20 12:13:48.276735 on_start: done
> 25: 2020-05-20 12:14:48.277283 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:14:51.017235 test_49 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_49_block_oversize_EA1_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: test_4a_block_oversize_EA1_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: OversizeMessageTransferTest - e4b
> 25: 2020-05-20 12:14:51.235405 on_start
> 25: 2020-05-20 12:14:51.235481 on_start: opening receiver connection to 
> amqp://0.0.0.0:24199
> 25: 2020-05-20 12:14:51.235753 on_start: opening   sender connection to 
> amqp://0.0.0.0:24197
> 25: 2020-05-20 12:14:51.235920 on_start: Creating receiver
> 25: 2020-05-20 12:14:51.236082 on_start: Creating sender
> 25: 2020-05-20 12:14:51.236201 on_start: done
> 25: 2020-05-20 12:15:51.236483 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:15:51.246181 test_4b test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_4b_block_oversize_EA1_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: test_4c_block_oversize_EB1_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... skipped 
> 'EB1 sending to INT.A may be blocked by EB1 limit and also by INT.B limit. 
> That condition is tested in compound test.'
> 25: test_4d_block_oversize_EB1_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... skipped 
> 'EB1 sending to INT.B may be blocked by EB1 limit and also by INT.B limit. 
> That condition is tested in compound test.'
> 25: test_4e_block_oversize_EB1_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... skipped 
> 'EB1 sending to EA1 may be blocked by EB1 limit and also by INT.B limit. That 
> condition is tested in compound test.'
> 25: test_4f_block_oversize_EB1_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: test_50_allow_undersize_INTA_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: OversizeMessageTransferTest - e51
> 25: 2020-05-20 12:15:53.637504 on_start
> 25: 2020-05-20 12:15:53.637586 on_start: opening receiver connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:15:53.637871 on_start: opening   sender connection to 
> amqp://0.0.0.0:24194
> 25: 2020-05-20 12:15:53.638042 on_start: Creating receiver
> 25: 2020-05-20 12:15:53.638208 on_start: Creating sender
> 25: 2020-05-20 12:15:53.638336 on_start: done
> 25: 2020-05-20 12:16:53.638800 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:16:56.375683 test_51 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_51_allow_undersize_INTA_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: test_52_allow_undersize_INTA_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... ok
> 25: OversizeMessageTransferTest - e53
> 25: 2020-05-20 12:16:56.981609 on_start
> 25: 2020-05-20 12:16:56.981683 on_start: opening receiver connection to 
> amqp://0.0.0.0:24199
> 25: 2020-05-20 12:16:56.981950 on_start: opening   sender connection to 
> amqp://0.0.0.0:24194
> 25: 2020-05-20 12:16:56.982119 on_start: Creating receiver
> 25: 2020-05-20 12:16:56.982283 on_start: Creating sender
> 25: 2020-05-20 12:16:56.982401 on_start: done
> 25: 2020-05-20 12:17:56.983678 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:17:56.992594 test_53 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_53_allow_undersize_INTA_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: OversizeMessageTransferTest - e54
> 25: 2020-05-20 12:17:57.011138 on_start
> 25: 2020-05-20 12:17:57.011215 on_start: opening receiver connection to 
> amqp://0.0.0.0:24194
> 25: 2020-05-20 12:17:57.011485 on_start: opening   sender connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:17:57.011653 on_start: Creating receiver
> 25: 2020-05-20 12:17:57.011811 on_start: Creating sender
> 25: 2020-05-20 12:17:57.011929 on_start: done
> 25: 2020-05-20 12:18:57.012497 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:18:59.752664 test_54 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_54_allow_undersize_INTB_INTA 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: OversizeMessageTransferTest - e55
> 25: 2020-05-20 12:18:59.772203 on_start
> 25: 2020-05-20 12:18:59.772287 on_start: opening receiver connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:18:59.772555 on_start: opening   sender connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:18:59.772723 on_start: Creating receiver
> 25: 2020-05-20 12:18:59.772885 on_start: Creating sender
> 25: 2020-05-20 12:18:59.773005 on_start: done
> 25: 2020-05-20 12:19:59.772590 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:20:02.522946 test_55 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_55_allow_undersize_INTB_INTB 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: OversizeMessageTransferTest - e56
> 25: 2020-05-20 12:20:02.542721 on_start
> 25: 2020-05-20 12:20:02.542801 on_start: opening receiver connection to 
> amqp://0.0.0.0:24197
> 25: 2020-05-20 12:20:02.543089 on_start: opening   sender connection to 
> amqp://0.0.0.0:24195
> 25: 2020-05-20 12:20:02.543269 on_start: Creating receiver
> 25: 2020-05-20 12:20:02.543433 on_start: Creating sender
> 25: 2020-05-20 12:20:02.543553 on_start: done
> 25: 2020-05-20 12:21:02.543622 self.timeout Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: 2020-05-20 12:21:05.284396 test_56 test error: Timeout Expired: n_sent=0 
> n_rcvd=0 n_rejected=0 n_aborted=0
> 25: test_56_allow_undersize_INTB_EA1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... FAIL
> 25: test_57_allow_undersize_INTB_EB1 
> (system_tests_policy_oversize_basic.MaxMessageSizeBlockOversize) ... 
> 25/66 Test #25: system_tests_policy_oversize_basic ................***Timeout 
> 600.01 sec
>  {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
