I'm *really* interested, Frase.

-Ray Keating, Pontimax Technologies LLC

-----Original Message-----
From: Fraser Adams [mailto:[email protected]] 
Sent: Monday, September 8, 2014 2:07 PM
To: [email protected]
Subject: Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Messenger gurus seem to be keeping their heads down a bit.

Is it *really* just Alan and me who are interested in understanding the error 
handling/reconnection behaviour of Messenger?

Is anybody using it in "industrial strength" applications or is it just being 
used in quick and dirty demos? Without error handling and reconnection 
mechanisms I'm struggling to see how it can be used for the former.

I can likely hack things and Alan also mentioned that he "cheats", but I'd 
really like to know from people who really understand messenger how to do it 
*properly*.

Frase


On 05/09/14 14:17, Alan Conway wrote:
> On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
>> On 03/09/14 23:29, Alan Conway wrote:
>>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
>>>> Hello,
>>>> I've probably missed something, but I don't know how to reliably 
>>>> detect failures and reconnect.
>>>>
>>>> If I send to an address from a freshly stood-up Messenger
>>>> instance and the address can't be found, things aren't too bad: I
>>>> wind up with an ECONNREFUSED that I could do something with.
>>>> However, if I've been sending messages to a valid address and then
>>>> kill off the consumer, I see:
>>>>
>>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted 
>>>> [0x513380]:ERROR[-2] connection aborted
>>>>
>>>> CONNECTION ERROR connection aborted (remote)
>>>>
>>>> The thing is that all of these are *internally* generated messages 
>>>> sent to the console via fprintf, so my *application* doesn't really 
>>>> know about them (though I could be crafty and interpose my own 
>>>> cheeky fprintf to intercept them). That doesn't quite sound like 
>>>> the desired behaviour for a robust system?
>>>>
>>>>
>>>> Similarly, if I do manage to trap an error, what's the correct way to
>>>> continue? As it happens, my app currently carries on silently doing
>>>> nothing useful, and keeps doing so even when the peer restarts
>>>> (so there is no magic internal reconnection logic as far as I can see).
>>>>
>>>> Do I have to do a
>>>> messenger.stop()
>>>> messenger.start()
>>>>
>>>> cycle to get things going again? I'm guessing so, but I'd like to
>>>> know the "correct"/expected way to write Messenger code that
>>>> is robust against remote failures. As far as I can see there are no
>>>> examples of that sort of thing.
>>> I've come up against similar problems, I think it's an area that 
>>> needs some work in Proton. Is anybody already working on/thinking 
>>> about this area?
>>>
>>> Cheers,
>>> Alan.
>>>
>> I'd definitely like to know how others deal with this sort of thing.
> I cheat. I've been using proton in dispatch system tests, I come up 
> against these issues when I start up some proton/dispatch network and 
> try to use it too quickly before things have settled down. I have some 
> tweaks in my test harness to wait till things are ready so there are 
> no errors :) That's not a solution for general non-test situations - 
> although knowing how to wait till things are ready is always useful.
>
> https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
>
> class Messenger adds a "flush" method that pumps the Messenger event 
> loop till there is no more work to do. Otherwise subscribe() in 
> particular gives no way to tell when the subscription is active.
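
The "flush" idea described above can be sketched roughly as follows. This is a minimal sketch, not the code from system_test.py: it assumes an object with a work(timeout) method that returns a truthy value while it performed work and a falsy value once idle. FakeMessenger, the step size and the overall deadline are hypothetical and exist only to make the example self-contained.

```python
import time


def flush(messenger, step=0.1, max_seconds=10.0):
    """Pump messenger.work() until it reports no more work.

    Returns True if the messenger went idle, False if the deadline
    (an assumption added here so the loop cannot spin forever) was hit.
    """
    deadline = time.monotonic() + max_seconds
    while time.monotonic() < deadline:
        if not messenger.work(step):
            return True
    return False


class FakeMessenger:
    """Hypothetical stand-in: reports work for a fixed number of calls."""

    def __init__(self, pending):
        self.pending = pending
        self.calls = 0

    def work(self, timeout):
        self.calls += 1
        if self.pending > 0:
            self.pending -= 1
            return True
        return False
```

Calling flush() right after a subscribe would give the caller a point at which the pending work has at least been pumped, though it says nothing about whether the remote side accepted it.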
>
> Note: My situation is a bit special in that dispatch creates addresses 
> dynamically on subscribe and my tests involve slow stuff like 
> waypoints to brokers etc. That introduces a delay in subscribe that 
> probably isn't visible when the address is created beforehand.
>
> There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for 
> qpidd and dispatch router to be ready respectively so I can be sure 
> that when I connect with proton they'll be listening. Those wait for
> the expected listening ports to be connectable and, in the case of
> dispatch, also do a qmf check to make sure that all expected outgoing
> connectors are there.
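
Waiting for a listening port to become connectable can be sketched as below. This is an illustration only, not the Qpidd.wait_ready or Qdrouterd.wait_ready code: the function name wait_port and its timeout and interval parameters are assumptions, and the qmf check for outgoing connectors is not covered.

```python
import socket
import time


def wait_port(host, port, timeout=10.0, interval=0.1):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True once the port accepts a connection, False if it is
    still not connectable when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a probe connection.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test harness might call wait_port("127.0.0.1", 5672) after starting a broker and before creating any client, assuming the broker listens on that port.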
>
>> For info: leaving aside that I can't necessarily trap all the
>> errors without being devious around fprintf (which to be fair works,
>> but it's a bit sneaky and, if you have multiple Messenger instances,
>> won't tell you which one the error relates to), when I do get an
>> error I appear to have to start from scratch. In other words:
>>
>> message.free();
>> messenger.free();
>> message = new proton.Message();
>> messenger = new proton.Messenger();
>> messenger.start();
>>
>> If I try to restart the original messenger or use existing queue I 
>> get no joy. It's not the end of the world but I've no idea what 
>> robust Messenger code is *supposed* to look like.
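
The "start from scratch" workaround can be expressed as a generic retry loop. The sketch below is not Proton code: make_client, send() and close() are hypothetical names, and it assumes the failure reaches the application as a ConnectionError, which is exactly what this thread reports Messenger does not reliably do.

```python
import time


def send_with_rebuild(make_client, payload, attempts=3, delay=0.0):
    """Send payload, discarding and rebuilding the client after each failure.

    Returns the client that succeeded; re-raises the last error if every
    attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        client = make_client()          # always start from a fresh client
        try:
            client.send(payload)
            return client
        except ConnectionError as exc:
            last_error = exc
            client.close()              # free the broken client
            time.sleep(delay)
    raise last_error


class FlakyClient:
    """Hypothetical client whose first few sends fail."""
    failures_left = 2

    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, payload):
        if FlakyClient.failures_left > 0:
            FlakyClient.failures_left -= 1
            raise ConnectionError("connection aborted")
        self.sent.append(payload)

    def close(self):
        self.closed = True
```

The design choice here is to never reuse a client after an error, mirroring the observation above that restarting the original messenger did not work.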
>>
>> Presumably Alan and I aren't the only people who might like to be 
>> able to trap errors and restart? Or does every one else write code 
>> that never fails ;->
> I always wondered how everybody but me can do that. Sigh. For you and 
> me I think we need to do some work on proton's error handling.
>
> - proton (or any library!) should NEVER EVER write anything direct to 
> stdout or stderr. It needs a (very simple) logging facility that can 
> write to stderr by default but can be redirected elsewhere.
> - proton should never log an error without also returning some useful 
> error condition to the application.
>
> Proton has some useful pn_error_* functions, they just need to be used
> more widely. In dispatch I introduced an errno-style thread-local
> error code/message (in proton it would be a pn_error_t*). That allows
> sensible error messages out of functions that want to return something
> else (e.g. return a pointer or null and set the thread error). It also
> allows you to work around lazy error handling (temporarily of course
> (hahahaha)) - a caller a couple of stack frames up can detect an error
> even if intermediate functions didn't check & propagate errors
> properly. I'm not advocating lazy error checking but in C it is hard
> to get everything.
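
The errno-style thread-local error can be sketched as follows. This shows the general technique in Python rather than the dispatch C implementation; set_error, last_error, find_route and careless_lookup are hypothetical names, and the error code 404 is arbitrary.

```python
import threading

_state = threading.local()   # each thread sees its own "error" attribute


def set_error(code, message):
    """Record an error for the current thread only."""
    _state.error = (code, message)


def clear_error():
    """Reset the current thread's error slot."""
    _state.error = None


def last_error():
    """Return the current thread's (code, message), or None if unset."""
    return getattr(_state, "error", None)


def find_route(table, address):
    """Return a route or None; on failure, record why in the thread slot."""
    route = table.get(address)
    if route is None:
        set_error(404, "no route for %s" % address)
    return route


def careless_lookup(table, address):
    """Intermediate function that forgets to check for failure."""
    find_route(table, address)
    return "done"
```

Even though careless_lookup ignores the failure, a caller further up can still call last_error() afterwards and find out what went wrong, and errors set on one thread are not visible on another.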
>
> FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does 
> proton already do things I've missed that would make this unnecessary?
>
> Cheers,
> Alan.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected] For 
> additional commands, e-mail: [email protected]
>


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected] For additional 
commands, e-mail: [email protected]

