[ 
https://issues.apache.org/jira/browse/DISPATCH-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746172#comment-16746172
 ] 

Bart van Andel commented on DISPATCH-1019:
------------------------------------------

Updated source code links in current master:
- https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L156
- https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L362

Links to the source code lines at the time of writing:
- 
https://github.com/apache/qpid-dispatch/blob/fff61db817fa310ff01061f288c5f86f8d5be262/src/parse.c#L151
- 
https://github.com/apache/qpid-dispatch/blob/fff61db817fa310ff01061f288c5f86f8d5be262/src/iterator.c#L323

> Messaging instability in networks with many clients / addresses.
> ----------------------------------------------------------------
>
>                 Key: DISPATCH-1019
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-1019
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Routing Engine
>    Affects Versions: 1.1.0
>            Reporter: Marcel Meulemans
>            Priority: Major
>
> After DISPATCH-966 has been fixed I am still experiencing problems in a 
> network with many clients / addresses. I am running a three node fully 
> connected mesh of dispatch routers with 10000 attached clients all with two 
> address messaging at around 100 msg/sec.
> In the logs I am seeing the following errors:
> {{2018-05-29 14:31:05.145732 +0000 ERROR (error) Invalid message: 
> Insufficient Data to Determine Tag}}
>  {{2018-05-29 14:31:05.145748 +0000 ERROR (error) Invalid message: Can't 
> convert message field body}}
> Which, in turn, lead to python errors like:
> {{2018-05-29 14:31:05.145971 +0000 ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
> area=0 mobile_seq=0)}}
> {{2018-05-29 14:31:05.146130 +0000 ROUTER (error) Exception in control 
> message processing}}
> {{ Traceback (most recent call last):}}
> {{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 
> 157, in handleControlMessage}}
> {{ self.mobile_address_engine.handle_mau(msg, now)}}
> {{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 
> 97, in handle_mau}}
> {{ node = self.node_tracker.router_node(msg.id)}}
> {{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, 
> in router_node}}
> {{ return self.nodes[node_id]}}
> {{ KeyError: None}}
> {{2018-05-29 14:31:05.146175 +0000 ROUTER (error) Control message error: 
> opcode=MAU body=None}}
> I have tracked down the cause of the "Insufficient Data to Determine Tag" 
> message to the following: During the call to {{qd_parse}} of the {{MAU}} 
> message the {{qd_iterator_t}} reaches the end of the buffer list before it 
> should. Specifically the call to {{qd_iterator_advance}} here 
> [https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151,] 
> "fails" to move forward a certain number of bytes (e.g. 31) even though the 
> {{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
> 84802). The "fail" is because we reach the end of the buffer list before we 
> should (here 
> [https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323|https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323).]).
> What I have not been able to figure out yet is why this happens because it is 
> not consistent. Many large MAU message are parsed correctly only sometimes 
> not. I am able to reproduce these errors every time I run my tests. There may 
> be a time component involved because the more logging I to the router code, 
> the less often the errors seem to occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org

Reply via email to