messenger store and links

2013-10-24 Thread Bozo Dragojevic

Hi!

Chasing down a weird behavior...

Looking at Messenger's pni_pump_out() and how it's used from
pn_messenger_endpoints():

  link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
  while (link) {
    if (pn_link_is_sender(link)) {
      pni_pump_out(messenger,
                   pn_terminus_get_address(pn_link_target(link)),
                   link);



Is it really fair to assume that the target address is always expected to be
non-NULL?



I've added a bit of debug code to pn_messenger_endpoints() so it reads:

  link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
  while (link) {
    if (pn_link_is_sender(link)) {
      static int addrnull, addrok;
      const char *address = pn_terminus_get_address(pn_link_target(link));
      if (!address) {
        addrnull++;
      } else {
        addrok++;
      }
      fprintf(stderr, "links with null address: %d, links with ok address: %d\n",
              addrnull, addrok);
      pni_pump_out(messenger, address, link);


and I never see 'addrok' change from 0


When pni_pump_out() is called with address == NULL:

int pni_pump_out(pn_messenger_t *messenger, const char *address,
                 pn_link_t *sender)
{
  pni_entry_t *entry = pni_store_get(messenger->outgoing, address);
  if (!entry) return 0;

pni_store_get cheerfully returns the first message on the list.


The end effect is that random clients start receiving messages not directed
at them.



For some inexplicable reason it mostly works out while there are just two
clients connected to the messenger and we're not pushing it really hard.
Still trying to come up with a simple test case.

Can anyone shed some light on how addressing at the link level is supposed
to work in Messenger?


Bozzo


Re: messenger store and links

2013-10-24 Thread Rafael Schloming
Can you post the exact addresses and routing configuration you're using and
which direction messages are flowing? I'd like to try to simulate this
using the example send/recv scripts.

My guess is that the issue may not be so much related to whether the
addresses are NULL or not but whether there are multiple receivers
competing for the same messages.

--Rafael


On Thu, Oct 24, 2013 at 11:52 AM, Bozo Dragojevic bo...@digiverse.si wrote:

 Hi!

 Chasing down a weird behavior...

 looking at messengers pni_pump_out() and how it's used from
 pn_messenger_endpoints()

link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
while (link) {
  if (pn_link_is_sender(link)) {
      pni_pump_out(messenger,
                   pn_terminus_get_address(pn_link_target(link)),
                   link);


 is it really fair to assume that target address is always expected to be
 non NULL?


 I've added a bit of debug code to pn_messenger_endpoints() so it reads:

   link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
   while (link) {
 if (pn_link_is_sender(link)) {
   static int addrnull, addrok;
    const char *address = pn_terminus_get_address(pn_link_target(link));
   if (!address) {
 addrnull++;
   } else {
 addrok++;
   }
    fprintf(stderr, "links with null address: %d, links with ok address: %d\n",
            addrnull, addrok);
   pni_pump_out(messenger, address, link);


 and I never see 'addrok' change from 0


 when pni_pump_out is called with address==NULL:

 int pni_pump_out(pn_messenger_t *messenger, const char *address, pn_link_t
 *sender)
 {
   pni_entry_t *entry = pni_store_get(messenger->outgoing, address);
   if (!entry) return 0;

 pni_store_get cheerfully returns the first message on the list.


 The end effect is that random clients start receiving messages not directed
 at them.


 For some inexplicable reason it mostly works out while there are just two
 clients connected to the messenger and we're not pushing it really hard.
 Still trying to come up with a simple test case.

 Can anyone shed some light on how addressing at the link level is supposed
 to work in Messenger?

 Bozzo



Re: messenger store and links

2013-10-24 Thread Rafael Schloming
If this is with trunk I'm guessing you might be noticing a change in
reply-to behaviour from a recent fix:
https://issues.apache.org/jira/browse/PROTON-278

As you mention, previously an unset reply-to would get automatically filled
in with the messenger's name. This is no longer the case. That behaviour
was unintentional as there are times when you legitimately want the
reply-to to be left unset. The intended behaviour was to expand ~ at the
beginning of an address to the messenger's name. That is now how trunk
behaves, so if you set your reply-to's to ~ then your problem might go
away, although your question is still an interesting one as I believe if
you wished you could intentionally set up competing receivers using
explicit non-uuid addresses that collide.

--Rafael


On Thu, Oct 24, 2013 at 2:49 PM, Bozo Dragojevic bo...@digiverse.si wrote:

 All messengers are created with the default constructor (uuid-based names).
 The 'broker' messenger does a pn_messenger_subscribe("amqp://~0.0.0.0:8194").
 All messages are constructed with address amqp://127.0.0.1:8194 and leave
 reply_to unset (so it's set to amqp://$uuid).

 Broker does application-level routing of messages
   a publisher sends a special 'register' message
   replies are constructed using stored 'reply_to' address from incoming
 message
   forwarded messages are constructed using stored 'reply_to' address from
 incoming 'registration' messages

 Messenger routing facility is not used in any way.
 All Messengers are running in async mode (broker and client library share
 the same 'event loop code').
 We're using outgoing trackers, mostly for the 'buffered' check
 All incoming messages are accepted as soon as they are processed.
 All outgoing messages are settled as soon as they are not buffered anymore

 Maybe it'd be possible to simulate the situation by commenting out the
 pni_pump_out() in pn_messenger_put(), or at least by checking whether the
 sender link address really has anything to do with the address calculated
 in pn_messenger_put().

 Bozzo


 On 24. 10. 13 20:25, Rafael Schloming wrote:

 Can you post the exact addresses and routing configuration you're using
 and
 which direction messages are flowing? I'd like to try to simulate this
 using the example send/recv scripts.

 My guess is that the issue may not be so much related to whether the
 addresses are NULL or not but whether there are multiple receivers
 competing for the same messages.

 --Rafael


 On Thu, Oct 24, 2013 at 11:52 AM, Bozo Dragojevic bo...@digiverse.si
 wrote:

  Hi!

 Chasing down a weird behavior...

 looking at messengers pni_pump_out() and how it's used from
 pn_messenger_endpoints()

  link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
  while (link) {
    if (pn_link_is_sender(link)) {
      pni_pump_out(messenger,
                   pn_terminus_get_address(pn_link_target(link)),
                   link);


 is it really fair to assume that target address is always expected to be
 non NULL?


 I've added a bit of debug code to pn_messenger_endpoints() so it reads:

link = pn_link_head(conn, PN_LOCAL_ACTIVE | PN_REMOTE_ACTIVE);
while (link) {
  if (pn_link_is_sender(link)) {
static int addrnull, addrok;
     const char *address = pn_terminus_get_address(pn_link_target(link));
if (!address) {
  addrnull++;
} else {
  addrok++;
}
     fprintf(stderr, "links with null address: %d, links with ok address: %d\n",
             addrnull, addrok);
pni_pump_out(messenger, address, link);


 and I never see 'addrok' change from 0


 when pni_pump_out is called with address==NULL:

 int pni_pump_out(pn_messenger_t *messenger, const char *address,
 pn_link_t
 *sender)
 {
   pni_entry_t *entry = pni_store_get(messenger->outgoing, address);
   if (!entry) return 0;

 pni_store_get cheerfully returns the first message on the list.


 The end effect is that random clients start receiving messages not directed
 at them.


 For some inexplicable reason it mostly works out while there are just two
 clients connected to the messenger and we're not pushing it really hard.
 Still trying to come up with a simple test case.

 Can anyone shed some light on how addressing at the link level is supposed
 to work in Messenger?

 Bozzo





Re: messenger store and links

2013-10-24 Thread Bozo Dragojevic

We don't have that commit yet. The last merge from trunk was:
PROTON-440: fix from hiram
git-svn-id:
https://svn.apache.org/repos/asf/qpid/proton/trunk@1531508
13f79535-47bb-0310-9956-ffa450edef68


Maybe this trace of an unsuspecting client (recv) connecting to a broker
in a 'hosed' state can illustrate it.


What happened before is this:
The broker was started. A publisher and a slow subscriber were started, and
they published approx 2000 messages. When the publisher was done it
disconnected gracefully, and the slow subscriber (living in a daemon thread)
just died. The broker was left alone with approx 800 messages inside the
messenger's store, but it is unaware the messages exist and thinks all is
well (until the keepalive timeout kicks in).



$ PN_TRACE_FRM=1 ./recv amqp://127.0.0.1:8194 2>&1 | head -200
Connected to 127.0.0.1:8194
- SASL
[0x7fd051814600:0] - @sasl-init(65) [mechanism=:ANONYMOUS, 
initial-response=b]

- SASL
[0x7fd051814600:0] - @sasl-mechanisms(64) 
[sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]

[0x7fd051814600:0] - @sasl-outcome(68) [code=0]
- AMQP
[0x7fd05180e600:0] - @open(16) 
[container-id=8FC1DAF7-1783-41D7-A892-9CDC45BCAEA7]

- AMQP
[0x7fd05180e600:0] - @open(16) 
[container-id=CEF14E1A-4D53-44AE-A513-F82C489AEB63, hostname=127.0.0.1]
[0x7fd05180e600:0] - @begin(17) [next-outgoing-id=0, 
incoming-window=2147483647, outgoing-window=0]
[0x7fd05180e600:0] - @attach(18) [name=receiver-xxx, handle=0, 
role=true, snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) 
[durable=0, timeout=0, dynamic=false], target=@target(41) [durable=0, 
timeout=0, dynamic=false], initial-delivery-count=0]
[0x7fd05180e600:0] - @flow(19) [incoming-window=2147483647, 
next-outgoing-id=0, outgoing-window=0, handle=0, delivery-count=0, 
link-credit=1024, drain=false]
[0x7fd05180e600:0] - @begin(17) [remote-channel=0, next-outgoing-id=0, 
incoming-window=2147483647, outgoing-window=1]
[0x7fd05180e600:0] - @attach(18) [name=receiver-xxx, handle=0, 
role=false, snd-settle-mode=2, rcv-settle-mode=0, initial-delivery-count=0]
[0x7fd05180e600:0] - @transfer(20) [handle=0, delivery-id=0, 
delivery-tag=bw\x00\x00\x00\x00\x00\x00\x00, message-format=0, 
settled=true, more=false] (551) 
\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00z\x00\x00\x00\x0d@@\xa1+amqp://d80a10f2-5e82-4fa4-a3d1-03ff188bbffa@\xa1+amqp://8FC1DAF7-1783-41D7-A892-9CDC45BCAEA7@@@\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xd0\x00\x00\x01\x8a\x00\x00\x00\x06q\x00\x00\x00\x04q\x00\x00\x00\x08`\x00\x01\xa1\x0d... 
(truncated)
[0x7fd05180e600:0] - @transfer(20) [handle=0, delivery-id=1, 
delivery-tag=bx\x00\x00\x00\x00\x00\x00\x00, message-format=0, 
settled=true, more=false] (551) 
\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00z\x00\x00\x00\x0d@@\xa1+amqp://d80a10f2-5e82-4fa4-a3d1-03ff188bbffa@\xa1+amqp://8FC1DAF7-1783-41D7-A892-9CDC45BCAEA7@@@\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xd0\x00\x00\x01\x8a\x00\x00\x00\x06q\x00\x00\x00\x04q\x00\x00\x00\x08`\x00\x01\xa1\x0d... 
(truncated)
[0x7fd05180e600:0] - @transfer(20) [handle=0, delivery-id=2, 
delivery-tag=by\x00\x00\x00\x00\x00\x00\x00, message-format=0, 
settled=true, more=false] (551) 
\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00z\x00\x00\x00\x0d@@\xa1+amqp://d80a10f2-5e82-4fa4-a3d1-03ff188bbffa@\xa1+amqp://8FC1DAF7-1783-41D7-A892-9CDC45BCAEA7@@@\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xd0\x00\x00\x01\x8a\x00\x00\x00\x06q\x00\x00\x00\x04q\x00\x00\x00\x08`\x00\x01\xa1\x0d... 
(truncated)

... lots more

Note the two addresses in the transfer frames.
The 'to' address is lowercase and distinct from either container-id in the
open frames; it is the address of the already-gone slow subscriber.
The 'reply_to' address is set and is from the broker, which is consistent
with not having the change from PROTON-278.


Below is the slice of the broker trace of the same exchange. It includes the
extra printf from pn_messenger_endpoints() before it calls pni_pump_out().


Accepted from localhost:53480
- SASL
[0x7fbbc9846200:0] - @sasl-init(65) [mechanism=:ANONYMOUS, 
initial-response=b]
[0x7fbbc9846200:0] - @sasl-mechanisms(64) 
[sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]

[0x7fbbc9846200:0] - @sasl-outcome(68) [code=0]
- SASL
- AMQP
[0x7fbbc9840c00:0] - @open(16) 
[container-id=8FC1DAF7-1783-41D7-A892-9CDC45BCAEA7]

- AMQP
[0x7fbbc9840c00:0] - @open(16) 
[container-id=CEF14E1A-4D53-44AE-A513-F82C489AEB63, hostname=127.0.0.1]
[0x7fbbc9840c00:0] - @begin(17) [next-outgoing-id=0, 
incoming-window=2147483647, outgoing-window=0]
[0x7fbbc9840c00:0] - @attach(18) [name=receiver-xxx, handle=0, 
role=true, snd-settle-mode=2, rcv-settle-mode=0, source=@source(40) 
[durable=0, timeout=0, dynamic=false], target=@target(41) [durable=0, 
timeout=0, dynamic=false], initial-delivery-count=0]
[0x7fbbc9840c00:0]