[ 
https://issues.apache.org/jira/browse/PROTON-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855513#comment-16855513
 ] 

Robbie Gemmell commented on PROTON-2056:
----------------------------------------

{quote}This does leave open the explanation for the sender side disposition in 
Ganesh Murthy's trace in 
https://issues.apache.org/jira/browse/PROTON-2056?focusedCommentId=16852243&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16852243.
{quote}
Ganesh later noted he applied the patch to the wrong checkout/install 
originally, so are those traces not just reflective of the original bug?
 
https://issues.apache.org/jira/browse/PROTON-2056?focusedCommentId=16852374&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16852374

I expect the buggy behaviour likely differs (on the wire) based on whether the 
two dispositions are processed from the same read, in that if they were it 
would already know it is remotely-settled before actually processing the 
release, and might behave differently.

If they arrive in the same read, I might expect no settled disposition would 
get sent to the receiver in that case even though the sender erroneously 
locally settles, because the transport knows it's already remotely-settled. I'd 
expect that the local settlement could (I haven't verified) also prevent the 
on_settled callback actually firing for the second disposition even though it 
was already read. It certainly should stop it firing in my mind, per earlier 
comments, since the delivery was locally settled by the point it could.

Alternatively, if they arrived in different reads, processing the release could 
actually cause it to send a settled disposition to the receiver when the 
erroneous local settlement occurs, because it doesn't know yet that it is 
remotely settled (notification of which is still in flight or hasn't occurred at 
all). Similarly, I'd expect the local settlement to then mean the on_settled 
callback doesn't fire when the settled disposition arrives from the receiver.

So those could just be two buggy runs where the behaviour differed because the 
frame timing did?
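
To make the two scenarios above concrete, here is a minimal toy model, not Proton code. ToySender, run and the frame tuples are hypothetical names invented for illustration, and the model bakes in the unverified assumption from the paragraphs above that every disposition in one read is decoded before any handler runs.

```python
from dataclasses import dataclass, field


@dataclass
class ToySender:
    """Hypothetical model of a sending endpoint; not the Proton API."""
    buggy: bool = True  # True: auto-settle on any disposition update
    remotely_settled: bool = False
    locally_settled: bool = False
    sent_to_peer: list = field(default_factory=list)
    callbacks: list = field(default_factory=list)

    def read(self, frames):
        """Process all (state, settled) dispositions from one read."""
        # Assumption: the whole read is decoded before handlers run, so
        # remote settlement is already known if both frames came together.
        for _state, settled in frames:
            self.remotely_settled = self.remotely_settled or settled
        for state, settled in frames:
            self._dispatch(state, settled)

    def _dispatch(self, state, settled):
        if self.locally_settled:
            return  # no callbacks for a delivery we already settled
        if settled:
            self.callbacks.append("on_settled")
            should_settle = True
        else:
            self.callbacks.append("on_" + state)
            should_settle = self.buggy  # the bug: settle without remote settle
        if should_settle:
            self.locally_settled = True
            if not self.remotely_settled:
                # Peer has not settled yet, so our settlement goes on the wire.
                self.sent_to_peer.append("disposition(settled=true)")


def run(buggy, reads):
    """Feed a list of reads to a fresh ToySender and report what happened."""
    sender = ToySender(buggy=buggy)
    for frames in reads:
        sender.read(frames)
    return sender.callbacks, sender.sent_to_peer


RELEASED = ("released", False)  # first frame: state only
SETTLED = ("released", True)    # second frame: settled=true

same_read = run(True, [[RELEASED, SETTLED]])
separate_reads = run(True, [[RELEASED], [SETTLED]])
fixed = run(False, [[RELEASED], [SETTLED]])
```

In this model both buggy runs miss on_settled, but only the separate-reads run puts a settled disposition on the wire, which is the kind of difference in traces I mean. With buggy=False the sender waits for the remote settlement and on_settled fires.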
{quote}I've checked and all the proton-c based bindings (actually python, C++, 
ruby, go) have exactly the same behaviour that the python binding does. So I'm 
not sure what "in other cases" means here?
{quote}
In particular I was thinking of a case outwith Qpid where I implemented 
auto-settle at Vert.x, ironically based on how I thought it was actually being 
done in the [other] Proton bindings, both at the time I did it and when writing 
that comment the other day.
{quote}Another important thing we need to rectify is the documentation so that 
we explicitly call out what "automatic sender settlement" means. We should be 
explicit and say that the sender will settle automatically when it receives a 
settlement from the receiver with no intervention - Previously the conditions 
which provoked the automatic settlement were unspecified (but clearly 
important!).
{quote}
Agreed. The wording I used for a setter method for the above case was "Sets 
whether sent deliveries should be automatically locally-settled once they have 
become remotely-settled by the receiving peer."
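
As a rough sketch of what that wording implies, assuming a hypothetical ToySenderLink class (this is neither the Vert.x nor the Proton API, and all names here are invented for illustration):

```python
class ToyDelivery:
    """Hypothetical delivery record; not a Proton type."""

    def __init__(self):
        self.remote_state = None
        self.remotely_settled = False
        self.locally_settled = False


class ToySenderLink:
    """Hypothetical sender link illustrating the documented rule."""

    def __init__(self):
        self._auto_settle = True

    def set_auto_settle(self, auto_settle):
        """Sets whether sent deliveries should be automatically
        locally-settled once they have become remotely-settled by the
        receiving peer."""
        self._auto_settle = bool(auto_settle)

    def on_remote_update(self, delivery, state, settled):
        """Record a disposition update received from the peer."""
        delivery.remote_state = state
        delivery.remotely_settled = delivery.remotely_settled or settled
        # A state change alone (e.g. released) must not trigger settlement;
        # only remote settlement does.
        if self._auto_settle and delivery.remotely_settled:
            delivery.locally_settled = True


link = ToySenderLink()
delivery = ToyDelivery()
link.on_remote_update(delivery, "released", settled=False)
after_state_only = delivery.locally_settled
link.on_remote_update(delivery, "released", settled=True)
after_remote_settle = delivery.locally_settled
```

The point of the sketch is only that the trigger is remote settlement, not the arrival of a terminal state.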
{quote}I will raise some extra JIRAs to cover the non python fixes and doc 
improvement.
{quote}
Great, thanks. Per above, it hadn't actually occurred to me that they all might 
be affected by the same issue!

> [proton-python]  on_settled callback not called when disposition arrives in 2 
> frames
> ------------------------------------------------------------------------------------
>
>                 Key: PROTON-2056
>                 URL: https://issues.apache.org/jira/browse/PROTON-2056
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c, python-binding
>    Affects Versions: proton-c-0.28.0
>            Reporter: Ganesh Murthy
>            Priority: Major
>         Attachments: proton-2056.patch
>
>
> When very large anonymous messages are sent to the router and these messages 
> have no receiver, they are immediately released. The router waits for the 
> entire large message to arrive in the router before settling it. Due to this, 
> in some cases, two disposition frames are sent for the same delivery, the 
> first has state=released and the second has settled=true as seen below
>  
> {noformat}
> [0x56330c891430]:0 <- @disposition(21) [role=true, first=315, 
> state=@released(38) []]
> [0x56330c891430]:0 <- @disposition(21) [role=true, first=315, settled=true, 
> state=@released(38) []]{noformat}
>  
> When this case happens, the on_settled is not called for the python binding. 
> The on_released is called. The on_settled must be called when a settlement 
> arrives for every delivery. I observed this behavior in a python system test 
> in Dispatch Router. The test called
> test_51_anon_sender_mobile_address_large_msg_edge_to_edge_two_interior can be 
> found in tests/system_tests_edge_router.py
> The test does not fail all the time but when it does it is due to the 
> on_settled not being called for deliveries that have this two part 
> disposition.
>  
> I tried in vain to write a standalone python reproducer. I could not do it.
>  
> To run the specific system test run the following from the 
> qpid-dispatch/build folder
>  
> {noformat}
>  /usr/bin/python "/home/gmurthy/opensource/qpid-dispatch/build/tests/run.py" 
> "-m" "unittest" "-v" 
> "system_tests_edge_router.RouterTest.test_51_anon_sender_mobile_address_large_msg_edge_to_edge_two_interior"{noformat}
>  
> The following is the test failure
> {noformat}
> test_51_anon_sender_mobile_address_large_msg_edge_to_edge_two_interior 
> (system_tests_edge_router.RouterTest) ... FAIL
> ======================================================================
> FAIL: test_51_anon_sender_mobile_address_large_msg_edge_to_edge_two_interior 
> (system_tests_edge_router.RouterTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
> "/home/gmurthy/opensource/qpid-dispatch/tests/system_tests_edge_router.py", 
> line 964, in 
> test_51_anon_sender_mobile_address_large_msg_edge_to_edge_two_interior
>     self.assertEqual(None, test.error)
> AssertionError: None != u'Timeout Expired - n_sent=350 n_accepted=300 
> n_modified=0 n_released=48'
> ----------------------------------------------------------------------
> Ran 1 test in 17.661s
> FAILED (failures=1)
> {noformat}


