Is there any way to recover from Messenger errors short of completely freeing the messenger instance and starting with a new one?

I've been deliberately making it fail, for example by starting a messenger with subscriptions like this:
amqp://~0.0.0.0,localhost:5672

With no broker running, the first subscription should succeed and the second one should fail.

In my case it's a bit more awkward because it's fully asynchronous, but what I see is that a connection instance to localhost:5672 gets created, because pn_connect contains this test:

  if (connect(sock, addr->ai_addr, addr->ai_addrlen) == -1) {
    if (errno != EINPROGRESS) {
      pn_i_error_from_errno(io->error, "connect");
      freeaddrinfo(addr);
      close(sock);
      return PN_INVALID_SOCKET;
    }
  }

Because my connect is on a non-blocking socket, connect() returns -1 with errno set to EINPROGRESS, so the socket is treated as valid even though the connection subsequently fails.
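For anyone wanting to reproduce this outside Messenger, here is a minimal sketch in plain POSIX C of how a non-blocking connect can hand back a valid descriptor and only fail later. The helpers start_connect and finish_connect are hypothetical names of my own, not Proton functions, and the sketch assumes IPv4 loopback only. The deferred result is read with getsockopt(SO_ERROR) once poll() reports the socket writable.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of Proton): start a non-blocking connect to
 * 127.0.0.1:port. Returns the fd, or -1 with *err_out set to the errno value
 * if the attempt failed immediately. */
static int start_connect(unsigned short port, int *err_out)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1) { *err_out = errno; return -1; }

    int flags = fcntl(sock, F_GETFL, 0);
    if (flags == -1 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) == -1) {
        *err_out = errno;
        close(sock);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Same shape as the pn_connect test: EINPROGRESS is not treated as an error. */
    if (connect(sock, (struct sockaddr *)&addr, sizeof addr) == -1 &&
        errno != EINPROGRESS) {
        *err_out = errno;
        close(sock);
        return -1;
    }
    *err_out = 0;
    return sock; /* a valid fd, but the connection may still fail later */
}

/* Hypothetical helper: wait up to timeout_ms for the connect to resolve.
 * Returns 0 if connected, otherwise an errno value such as ECONNREFUSED
 * (ETIMEDOUT if nothing happened in time). Closes the socket on failure. */
static int finish_connect(int sock, int timeout_ms)
{
    struct pollfd pfd = { .fd = sock, .events = POLLOUT };
    int err = 0;
    socklen_t len = sizeof err;

    int n = poll(&pfd, 1, timeout_ms);
    if (n == 0) {
        err = ETIMEDOUT;
    } else if (n == -1) {
        err = errno;
    } else if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) == -1) {
        err = errno;
    }

    if (err != 0) close(sock); /* release the fd so it is not leaked */
    return err;
}
```

Calling start_connect on a port with nothing listening and then finish_connect on the returned fd should report ECONNREFUSED, either immediately or after the poll, depending on the platform.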


I've actually got a listener that can detect the "Connection refused", but what I can't seem to do is cleanly clear the connection object.

I've tried all sorts of hacks around pn_messenger_resolve/pni_messenger_reclaim. In that case pn_messenger_resolve found the connection object for the name "localhost:5672" OK, and I then tried a pni_messenger_reclaim hack to clear it, but that didn't seem to close the underlying socket.

I also tried to find the relevant selectable (via pn_messenger_selectable) that matched the file descriptor of the failed connection, and then tried a pni_connection_finalize(sel) hack on it. In that case I do seem to free up the connection and the underlying socket gets closed, but when I subsequently try to connect (to the working amqp://~0.0.0.0), although I get an accept on the right file descriptor, I then get an assertion failed at messenger.c,151,pni_context at Error


So in short: a connection object gets created because of a connect on a non-blocking socket, that connect subsequently and asynchronously fails, and there doesn't seem to be any way to tidy up the failed connection.

To be clear, if I have subscriptions
amqp://~0.0.0.0,localhost:5672

and I ignore any errors, don't bother to try and tidy up, and subsequently do a client connection to amqp://0.0.0.0, my client connects fine, but on the next file descriptor up from the one created by the failed localhost:5672 connection. So basically my failed subscription has leaked a connection. That is, the listen fd for amqp://~0.0.0.0 is 3, the (failed) fd for localhost:5672 is 4, and when I connect to amqp://~0.0.0.0 the accept fd is 5. It really should be 4, but I can't get shot of the connection object etc. for localhost:5672.
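The reason I expect the accept fd to be 4 is that POSIX hands out the lowest-numbered free descriptor, so a properly closed socket's number gets reused by the next one opened. A minimal sketch of that reasoning follows. It is plain POSIX with no Proton involved, and fd_is_reused_after_close is a hypothetical name for illustration only; it assumes a single-threaded process where nothing else opens descriptors in between.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a socket opened after closing `stale`
 * gets the same descriptor number, 0 if not, -1 if a socket() call failed. */
static int fd_is_reused_after_close(void)
{
    int keep = socket(AF_INET, SOCK_STREAM, 0);  /* stands in for the listener */
    int stale = socket(AF_INET, SOCK_STREAM, 0); /* stands in for the failed connection */
    if (keep == -1 || stale == -1) {
        if (keep != -1) close(keep);
        if (stale != -1) close(stale);
        return -1;
    }

    close(stale); /* what tidying up the failed connection should amount to */

    int next = socket(AF_INET, SOCK_STREAM, 0);  /* stands in for the accepted fd */
    if (next == -1) {
        close(keep);
        return -1;
    }

    int reused = (next == stale);
    close(next);
    close(keep);
    return reused;
}
```

If the failed connection's socket had really been closed, the accepted connection would land on its number in the same way. Seeing 5 instead of 4 is what tells me the descriptor is still open.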

The only way to deal with it seems to be to free the messenger and create a new one whenever anything fails, which is a pain because the subscription amqp://~0.0.0.0 is actually fine.


TBH messenger's error handling is driving me nuts. It has been mentioned in a few threads that it might be better to give up on messenger and just use engine.

Is messenger really irredeemably broken? Without decent error handling/recovery it's very little use in a production environment.

Frase




