Re: I think that's a blocker...
This isn't necessarily a proton bug. Nothing in the referenced checkin actually touches the logic around allocating/freeing error strings; it merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where previously it threw away the error information. This suggests that there is perhaps a pre-existing bug in dispatch where it is calling pn_send/pn_recv with a pn_io_t that has been freed, and it is only now triggering due to the additional asserts encountered by no longer ignoring the error information.

I could be mistaken, but I would try reproducing this under valgrind. That will tell you where the first free occurred, and that should hopefully make it obvious whether this is indeed a proton bug or whether dispatch is somehow freeing the pn_io_t sooner than it should. (FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)

--Rafael

On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com wrote: ...but if not, somebody please feel free to correct me. The Jira that I just created -- PROTON-826 -- is for a bug I found with my topology testing of the Dispatch Router, in which I repeatedly kill and restart a router and make sure that the router network comes back to the same topology that it had before. As of checkin 01cb00c -- which had no Jira -- it is pretty easy for my test to blow core. It looks like an error string is being double-freed (maybe) in the proton library. ( full info in the Jira: https://issues.apache.org/jira/browse/PROTON-826 )
Re: I think that's a blocker...
Good point! I'm afraid it will take me the rest of my life to reproduce under valgrind ... but ... I'll see what I can do. In the meantime -- I'm not sure what to do with a Jira if the provenance is in doubt...

- Original Message - This isn't necessarily a proton bug. Nothing in the referenced checkin actually touches the logic around allocating/freeing error strings, it merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where previously it threw away the error information. This would suggest that there is perhaps a pre-existing bug in dispatch where it is calling pn_send/pn_recv with a pn_io_t that has been freed, and it is only now triggering due to the additional asserts that are encountered due to not ignoring the error information. I could be mistaken, but I would try reproducing this under valgrind. That will tell you where the first free occurred and that should hopefully make it obvious whether this is indeed a proton bug or whether dispatch is somehow freeing the pn_io_t sooner than it should. (FWIW, if it is indeed a proton bug, then I would agree it is a blocker.) --Rafael On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com wrote: ...but if not, somebody please feel free to correct me. The Jira that I just created -- PROTON-826 -- is for a bug I found with my topology testing of the Dispatch Router, in which I repeatedly kill and restart a router and make sure that the router network comes back to the same topology that it had before. As of checkin 01cb00c -- which had no Jira -- it is pretty easy for my test to blow core. It looks like an error string is being double-freed (maybe) in the proton library. ( full info in the Jira. https://issues.apache.org/jira/browse/PROTON-826 )
[jira] [Commented] (PROTON-614) PHP Fatal error: Uncaught exception 'MessengerException' with message '[-5]: no valid sources'
[ https://issues.apache.org/jira/browse/PROTON-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336569#comment-14336569 ] Rajkumar commented on PROTON-614:

Does anyone know why this error is coming? I am getting this error in the C language too. PHP Fatal error: Uncaught exception 'MessengerException' with message '[-5]: no valid sources'

---
Key: PROTON-614 URL: https://issues.apache.org/jira/browse/PROTON-614 Project: Qpid Proton Issue Type: Bug Components: php-binding Affects Versions: 0.7 Environment: Ubuntu Linux on Amazon EC2 Reporter: Jose Berardo Cunha Priority: Minor Labels: php, php-proton

Sorry, but I don't know if it is really a bug or if I'm doing something wrong. There is no documentation for Proton PHP, so here I am. Every time I try to execute the recv.php sample I get this error:

[0x16d2180]:ERROR[0] (null)
[0x16d2180]:ERROR[0] (null)
CONNECTION ERROR connection aborted (remote)
PHP Fatal error: Uncaught exception 'MessengerException' with message '[-5]: no valid sources' in /usr/share/php/proton.php:61
Stack trace:
#0 /usr/share/php/proton.php(146): Messenger->_check(-5)
#1 /home/ubuntu/qpid-proton-0.7/tests/smoke/recv.php(20): Messenger->recv()
#2 {main}
thrown in /usr/share/php/proton.php on line 61

I've tried all of these command lines:
$ php recv.php
$ php recv.php amqp://0.0.0.0
$ php recv.php amqp://127.0.0.1
$ php recv.php amqp://localhost
$ php recv.php amqp://localhost:5672
$ php recv.php amqp://localhost/myqueue
$ php recv.php amqp://localhost:5672/myqueue

Where myqueue is my first queue, created at the Qpid 0.28 Web Console. I got the same connection abort on send.php, but what is weird is that when I run send.php against an ActiveMQ Apollo broker I get no error message, and for a second or two I can see a line at its web console referring to the message I sent to it. So I presumed that send.php is able to connect to an AMQP broker, but I don't know why it doesn't connect to Qpid 0.28.
Please, what am I doing wrong? What are "valid sources", and where can I find more documentation about the Proton PHP library? Even the proton.php and cproton.php files created by Swig have no comments. Thank you. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Proposed SASL changes (API and functional)
On Tue, 2015-02-24 at 15:48 -0500, Andrew Stitcher wrote: As many of you know, I've been working on implementing a SASL AMQP protocol layer that does more than PLAIN and ANONYMOUS for proton-c. I'm currently at a point where the work is reasonably functional (with some gaps). I've put together a fairly comprehensive account of this work on the Apache wiki: https://cwiki.apache.org/confluence/x/B5cWAw If you are at all interested please go and look at the proposal and comment on it there. You can see my actual code changes in my github proton repo: https://github.com/astitcher/qpid-proton/commits/sasl-work [This is my working branch, so not all the changes make a lot of sense; just pay attention to the tip of the branch.] In a short while, when people have had enough time to absorb the proposal and comment, I will post a code review of the actual code changes. As there are substantial API changes, I'd like to get this in for 0.9 because we were intending to stabilise the API at this point.

This looks very good to me. One ignorant question: Qpid has a min/max Security Strength Factor for encryption rather than a binary enable/disable. Is that relevant here? Cheers, Alan.
Re: I think that's a blocker...
On Wed, Feb 25, 2015 at 9:53 AM, Alan Conway acon...@redhat.com wrote: On Wed, 2015-02-25 at 09:04 -0500, Michael Goulish wrote: Good point! I'm afraid it will take me the rest of my life to reproduce under valgrind .. but ... I'll see what I can do

Try this in your environment: export MALLOC_PERTURB_=66

That will cause malloc to immediately fill freed memory with 0x42 bytes, so it is obvious when you gdb the core dump if someone is using freed memory. It's not as informative as valgrind, but it has no performance impact that I can detect, and it often helps to crash faster and closer to the real problem. Freed memory can hold valid-seeming values for a while, so your code may not notice immediately, whereas 4242424242 is rarely valid for anything.

In the meantime -- I'm not sure what to do with a Jira if the provenance is in doubt... Maybe just put a note on it till we know more.

+1 FWIW, given that it looks like we are gonna wait for Andrew's SASL changes to land, and Gordon is away till the end of the week and still working on some documentation that would be nice to get into the release, I suspect you have at least a few days to investigate the issue before we get a release candidate for 0.9. --Rafael
Re: I think that's a blocker...
Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way.

-Ted

On 02/25/2015 08:55 AM, Rafael Schloming wrote: This isn't necessarily a proton bug. Nothing in the referenced checkin actually touches the logic around allocating/freeing error strings, it merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where previously it threw away the error information. This would suggest that there is perhaps a pre-existing bug in dispatch where it is calling pn_send/pn_recv with a pn_io_t that has been freed, and it is only now triggering due to the additional asserts that are encountered due to not ignoring the error information. I could be mistaken, but I would try reproducing this under valgrind. That will tell you where the first free occurred and that should hopefully make it obvious whether this is indeed a proton bug or whether dispatch is somehow freeing the pn_io_t sooner than it should. (FWIW, if it is indeed a proton bug, then I would agree it is a blocker.) --Rafael On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com wrote: ...but if not, somebody please feel free to correct me. The Jira that I just created -- PROTON-826 -- is for a bug I found with my topology testing of the Dispatch Router, in which I repeatedly kill and restart a router and make sure that the router network comes back to the same topology that it had before. As of checkin 01cb00c -- which had no Jira -- it is pretty easy for my test to blow core. It looks like an error string is being double-freed (maybe) in the proton library. ( full info in the Jira. https://issues.apache.org/jira/browse/PROTON-826 )
Re: I think that's a blocker...
On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote: Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way.

Interesting... so this is what the docs say:

/**
 * A ::pn_io_t manages IO for a group of pn_socket_t handles. A
 * pn_io_t object may have zero or one pn_selector_t selectors
 * associated with it (see ::pn_io_selector()). If one is associated,
 * all the pn_socket_t handles managed by a pn_io_t must use that
 * pn_selector_t instance.
 *
 * The pn_io_t interface is single-threaded. All methods are intended
 * to be used by one thread at a time, except that multiple threads
 * may use:
 *
 *   ::pn_write()
 *   ::pn_send()
 *   ::pn_recv()
 *   ::pn_close()
 *   ::pn_selector_select()
 *
 * provided at most one thread is calling ::pn_selector_select() and
 * the other threads are operating on separate pn_socket_t handles.
 */

I think this has been somewhat modified by the constraints from the windows implementation, and I'm not sure I understand completely what the constraints are there, or entirely what is being described above, but on the posix front, the pn_io_t is little more than just a holder for an error slot, and you should have one of these per thread. It shouldn't be a problem to use send/recv/etc from multiple threads though so long as you pass in the pn_io_t from the current thread. --Rafael
Re: I think that's a blocker...
On 02/25/2015 11:52 AM, Rafael Schloming wrote: On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote: Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way. Interesting... so this is what the docs say: /** * A ::pn_io_t manages IO for a group of pn_socket_t handles. A * pn_io_t object may have zero or one pn_selector_t selectors * associated with it (see ::pn_io_selector()). If one is associated, * all the pn_socket_t handles managed by a pn_io_t must use that * pn_selector_t instance. * * The pn_io_t interface is single-threaded. All methods are intended * to be used by one thread at a time, except that multiple threads * may use: * * ::pn_write() * ::pn_send() * ::pn_recv() * ::pn_close() * ::pn_selector_select() * * provided at most one thread is calling ::pn_selector_select() and * the other threads are operating on separate pn_socket_t handles. */ I claim that the commit-in-question violates the text above. Calls to pn_send() and pn_recv() are no longer thread-safe because they now use the shared error record. I think this has been somewhat modified by the constraints from the windows implementation, and I'm not sure I understand completely what the constraints are there, or entirely what is being described above, but on the posix front, the pn_io_t is little more than just a holder for an error slot, and you should have one of these per thread. It shouldn't be a problem to use send/recv/etc from multiple threads though so long as you pass in the pn_io_t from the current thread. It's not desirable to allocate sockets to threads up front (i.e.
Assuming, as you suggest, that pn_io_t is nothing more than a thread-specific error notepad seems like a recipe for future disaster because pn_io_t is clearly intended to be more than that. -Ted
Re: I think that's a blocker...
A pn_io_t is heavyweight in Windows, because it has an opposite usage pattern and moves a lot of kernel stuff into user space compared to POSIX. The quoted documentation was my attempt to capture the Dispatch usage pattern, which I assumed would be typical of an application trying to spread proton engine use between threads: basically single access to pn_selector_select() via a condition variable, and no more than one thread working on a given selectable (using proton engine encoding/decoding etc., not just io). In the end, we could just add a zillion locks into the Windows code and make it look like it is as thread safe as the POSIX counterpart (which has implicit safety when it does in the kernel what Windows is doing in user space), but that would defeat using IO completion ports at all. The documentation was my attempt to balance performance with sophisticated proton usage on multiple platforms. Note that there is only one pn_selector_t allowed per pn_io_t (a very strong Windows completion port requirement, and sockets are bound to a single completion port for life). On Wed, Feb 25, 2015 at 8:52 AM, Rafael Schloming r...@alum.mit.edu wrote: On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote: Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way. Interesting... so this is what the docs say: /** * A ::pn_io_t manages IO for a group of pn_socket_t handles. A * pn_io_t object may have zero or one pn_selector_t selectors * associated with it (see ::pn_io_selector()). If one is associated, * all the pn_socket_t handles managed by a pn_io_t must use that * pn_selector_t instance. * * The pn_io_t interface is single-threaded.
All methods are intended * to be used by one thread at a time, except that multiple threads * may use: * * ::pn_write() * ::pn_send() * ::pn_recv() * ::pn_close() * ::pn_selector_select() * * provided at most one thread is calling ::pn_selector_select() and * the other threads are operating on separate pn_socket_t handles. */ I think this has been somewhat modified by the constraints from the windows implementation, and I'm not sure I understand completely what the constraints are there, or entirely what is being described above, but on the posix front, the pn_io_t is little more than just a holder for an error slot, and you should have one of these per thread. It shouldn't be a problem to use send/recv/etc from multiple threads though so long as you pass in the pn_io_t from the current thread. --Rafael
PROTON-827: Reactive client binding for the go programming language
I plan to start working on a go golang.org binding for proton. I envisage a SWIG binding similar to the other swig-based bindings (python, ruby, etc.) and an API layer similar to the new reactive Python API (based on the C reactor.) This will be an exploratory effort to begin with, I'd like to hear from anybody who might be interested in using such a thing or helping to implement it. Cheers, Alan.
[jira] [Commented] (PROTON-826) recent checkin causes frequent double-free or corruption crash
[ https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336824#comment-14336824 ] michael goulish commented on PROTON-826:

It looks like the problem here is just that the error struct used in proton-c/src/error.c is not thread safe -- so I am opening a new Jira for Dispatch. I am leaving this one open for now, however, because other applications using proton will encounter this. Either something could be changed in proton to make this less thread-hostile, or ... it could be publicized better? Please feel free to close when appropriate.

recent checkin causes frequent double-free or corruption crash
--
Key: PROTON-826 URL: https://issues.apache.org/jira/browse/PROTON-826 Project: Qpid Proton Issue Type: Bug Components: proton-c Affects Versions: 0.9 Reporter: michael goulish Priority: Blocker

In my dispatch testing I am seeing frequent crashes in the proton library that began with proton checkin 01cb00c on 2015-02-15, "report read and write errors through the transport". The output at crash-time says this:

*** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or corruption (fasttop): 0x020ee880 ***
=== Backtrace: ===
/lib64/libc.so.6[0x3e3d875a4f]
/lib64/libc.so.6[0x3e3d87cd78]
/lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
/lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
/lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
/lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
/lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
/lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
/home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]

The backtrace from the core file looks like this:

#0  0x003e3d835877 in raise () from /lib64/libc.so.6
#1  0x003e3d836f68 in abort () from /lib64/libc.so.6
#2  0x003e3d875a54 in __libc_message () from /lib64/libc.so.6
#3  0x003e3d87cd78 in _int_free () from /lib64/libc.so.6
#4  0x7fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140) at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
#5  0x7fbf8a59b311 in pn_error_set (error=error@entry=0x1501140, code=code@entry=-2, text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable") at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
#6  0x7fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2, fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8) at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
#7  0x7fbf8a59b402 in pn_error_format (error=error@entry=0x1501140, code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s") at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
#8  0x7fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140, msg=msg@entry=0x7fbf8a5bbe1a "recv") at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
#9  0x7fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
#10 0x7fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)

And I can prevent the crash from happening, apparently forever, by commenting out this line: free(error->text); in the function pn_error_clear in the file proton-c/src/error.c. The error text that is being freed, which causes the crash, looks like this:

$2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable", root = 0x0, code = -2}

My dispatch test creates a router network and then repeatedly kills and restarts a randomly-selected router. After this proton checkin it almost never gets through 5 iterations without this crash. After I commented out that line, it got through more than 500 iterations before I stopped it.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PROTON-827) Reactive client binding for the go programming language.
Alan Conway created PROTON-827: -- Summary: Reactive client binding for the go programming language. Key: PROTON-827 URL: https://issues.apache.org/jira/browse/PROTON-827 Project: Qpid Proton Issue Type: Improvement Components: proton-c Affects Versions: 0.9 Reporter: Alan Conway Assignee: Alan Conway Develop a reactive API binding in go http://golang.org/, similar to the existing reactive python API illustrated in examples/python. It should follow the pattern of the existing python and C reactive APIs as far as possible while respecting common conventions and idioms of the go language. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: I think that's a blocker...
On Wed, Feb 25, 2015 at 12:48 PM, Ted Ross tr...@redhat.com wrote: On 02/25/2015 11:52 AM, Rafael Schloming wrote: On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote: Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way. Interesting... so this is what the docs say: /** * A ::pn_io_t manages IO for a group of pn_socket_t handles. A * pn_io_t object may have zero or one pn_selector_t selectors * associated with it (see ::pn_io_selector()). If one is associated, * all the pn_socket_t handles managed by a pn_io_t must use that * pn_selector_t instance. * * The pn_io_t interface is single-threaded. All methods are intended * to be used by one thread at a time, except that multiple threads * may use: * * ::pn_write() * ::pn_send() * ::pn_recv() * ::pn_close() * ::pn_selector_select() * * provided at most one thread is calling ::pn_selector_select() and * the other threads are operating on separate pn_socket_t handles. */ I claim that the commit-in-question violates the text above. Calls to pn_send() and pn_recv() are no longer thread-safe because they now use the shared error record. You could be right. I'm not entirely sure how to interpret the above text. I don't know that I would necessarily consider pn_send/pn_recv to be methods of pn_io_t. I think this has been somewhat modified by the constraints from the windows implementation, and I'm not sure I understand completely what the constraints are there, or entirely what is being described above, but on the posix front, the pn_io_t is little more than just a holder for an error slot, and you should have one of these per thread. It shouldn't be a problem to use send/recv/etc from multiple threads though so long as you pass in the pn_io_t from the current thread. It's not desirable to allocate sockets to threads up front (i.e.
partition the set of sockets into per-thread slots). I know you didn't say that was needed but it's what I infer from the docs for pn_io_t. I don't think the posix implementation requires this. Assuming, as you suggest, that pn_io_t is nothing more than a thread-specific error notepad seems like a recipe for future disaster because pn_io_t is clearly intended to be more than that. It may not work out so well on windows, I honestly don't know what the situation is there, but certainly for posix systems I think we need *something* in this area to function as a context that can be associated with thread-or-smaller granularities. Having some way to return error information is just one example of a useful context to be able to use at thread or smaller granularities. If the windows I/O APIs require a heavier-weight interface, then perhaps we need to factor it into two different parts. --Rafael
Re: I think that's a blocker...
Maybe my head is just thick today, but even staring at the docs a couple times and reading through what you have below, I can't say I quite understand what you're going for. What are the actual constraints for the windows APIs and what is the heavyweight stuff pn_io_t is doing? --Rafael On Wed, Feb 25, 2015 at 1:02 PM, Cliff Jansen cliffjan...@gmail.com wrote: A pn_io_t is heavyweight in Windows, because it has an opposite usage pattern and moves a lot of kernel stuff into user space compared to POSIX. The quoted documentation was my attempt to capture the Dispatch usage pattern, which I assumed would be typical of an application trying to spread proton engine use between threads: basically single access to pn_selector_select() via a condition variable, and no more than one thread working on a given selectable (using proton engine encoding/decoding etc., not just io). In the end, we could just add a zillion locks into the Windows code and make it look like it is as thread safe as the POSIX counterpart (which has implicit safety when it does in the kernel what Windows is doing in user space), but that would defeat using IO completion ports at all. The documentation was my attempt of balancing performance with sophisticated proton usage on multiple platforms. Note that there is only one pn_selector_t allowed per pn_io_t (a very strong Windows completion port requirement, and sockets are bound to a single completion port for life). On Wed, Feb 25, 2015 at 8:52 AM, Rafael Schloming r...@alum.mit.edu wrote: On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote: Would it be safe to assume that any operations on driver->io are not thread safe? Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way. Interesting... so this is what the docs say: /** * A ::pn_io_t manages IO for a group of pn_socket_t handles.
A * pn_io_t object may have zero or one pn_selector_t selectors * associated with it (see ::pn_io_selector()). If one is associated, * all the pn_socket_t handles managed by a pn_io_t must use that * pn_selector_t instance. * * The pn_io_t interface is single-threaded. All methods are intended * to be used by one thread at a time, except that multiple threads * may use: * * ::pn_write() * ::pn_send() * ::pn_recv() * ::pn_close() * ::pn_selector_select() * * provided at most one thread is calling ::pn_selector_select() and * the other threads are operating on separate pn_socket_t handles. */ I think this has been somewhat modified by the constraints from the windows implementation, and I'm not sure I understand completely what the constraints are there, or entirely what is being described above, but on the posix front, the pn_io_t is little more than just a holder for an error slot, and you should have one of these per thread. It shouldn't be a problem to use send/recv/etc from multiple threads though so long as you pass in the pn_io_t from the current thread. --Rafael
Re: Proposed SASL changes (API and functional)
On Wed, 2015-02-25 at 10:27 +0100, Jakub Scholz wrote: ... But I find this part a bit dangerous: Classically in protocols where SASL was not optional the way to avoid double authentication was to use the EXTERNAL SASL mechanism. With AMQP, SASL is optional, so if SSL is used for client authentication the SASL layer could be entirely omitted and so using EXTERNAL is not necessary. This is really just a statement about how AMQP 1.0 works - if you like - it is an aside praising the good protocol design sense of the standard's authors (you know who you are!). I understand the idea and I would even agree that this is the proper way how to do it in the long term. But I'm not sure whether all brokers support this concept. For example, I'm not sure whether you can configure the Qpid C++ broker in a way to accept AMQP 1.0 connections with SSL Client Authentication without SASL EXTERNAL while at the same time accepting AMQP 0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that allowing SSL Client Authentication only without SASL could cause some serious incompatibilities - I think both should be possible / supported. And both are supported. The qpidd 0-10 support is not going to change. The qpidd 1.0 support is on a different code path so there is little bleed over in functionality. The proton server code can auto detect which protocol layers the client is using, and subject to it being an allowed protocol configuration, authenticate it. Other AMQP 1.0 implementations may not support leaving out the SASL layer and so you can certainly always tell the client to use it (even if it adds no useful functionality as in the ANONYMOUS and EXTERNAL cases). So as far as the current plans for proton go if you require SSL client authentication it will happen whether or not a SASL layer is there. 
As EXTERNAL and better SSL integration with the transport code is not yet implemented there may be something significant I've missed in this analysis, in which case it's all subject to change! I hope that helps. Andrew
Re: Proposed SASL changes (API and functional)
On Tue, 2015-02-24 at 15:48 -0500, Andrew Stitcher wrote: ... If you are at all interested please go and look at the proposal and comment on it there. Thank you very much to Alan and Jakub for commenting on my proposal. The reason I asked people to comment over on the wiki is that it is very hard to find a discussion like this related to a specific proposal after some time has elapsed if it is in email, whereas actually attached to the proposal on the wiki keeps all the relevant comments together. If it is ok with them I will copy the comments over there: Alan, Jakub? Thanks Andrew
Re: Proposed SASL changes (API and functional)
On Wed, 2015-02-25 at 10:46 -0500, Alan Conway wrote: ... One ignorant question: Qpid has a min/max Security Strength Factor for encryption rather than a binary enable/disable. Is that relevant here? (Hardly an ignorant question!) You make a very good point, and this design may indeed be a little simplistic - largely because I've not implemented the encryption side yet! 1. I doubt that max ssf is all that useful in practice. 2. Effectively pn_transport_require_encryption() is the same as setting min ssf 1, but is simpler to understand! An alternative might be pn_transport_require_ssf(int) however that isn't as clear and it's not obvious how to choose the ssf value. Perhaps the '1' should be configurable differently. Some input from those who did the similar work in qpidd might be useful. Just some random wittering. Andrew
[jira] [Created] (PROTON-828) Python binding does not support MODIFIED delivery state
Ken Giusti created PROTON-828: - Summary: Python binding does not support MODIFIED delivery state Key: PROTON-828 URL: https://issues.apache.org/jira/browse/PROTON-828 Project: Qpid Proton Issue Type: Bug Components: python-binding Affects Versions: 0.8 Reporter: Ken Giusti Assignee: Ken Giusti Priority: Blocker Fix For: 0.9

>>> import proton
>>> proton.RELEASED
RELEASED
>>> proton.MODIFIED
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'MODIFIED'
>>> proton.ACCEPTED
ACCEPTED

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
ApacheCon North America 2015
Hi Everyone, I'll be attending ApacheCon in April. I was wondering if there are others that plan to go and if so would there be any interest in having an informal BOF/hackathon/get-together either during the conference or after hours? --Rafael
Re: PROTON-827: Reactive client binding for the go programming language
+1 Have you thought about how to integrate Go channels with AMQP links? Richard On Wed, Feb 25, 2015 at 12:13 PM, Alan Conway acon...@redhat.com wrote: I plan to start working on a go golang.org binding for proton. I envisage a SWIG binding similar to the other swig-based bindings (python, ruby, etc.) and an API layer similar to the new reactive Python API (based on the C reactor). This will be an exploratory effort to begin with; I'd like to hear from anybody who might be interested in using such a thing or helping to implement it. Cheers, Alan.
[jira] [Created] (PROTON-829) Possible reference counting bug in pn_clear_tpwork
Alan Conway created PROTON-829:
----------------------------------

             Summary: Possible reference counting bug in pn_clear_tpwork
                 Key: PROTON-829
                 URL: https://issues.apache.org/jira/browse/PROTON-829
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-c
    Affects Versions: 0.8
            Reporter: Alan Conway
            Assignee: Alan Conway
             Fix For: 0.9

See QPID-6415, which describes a core dump in the qpid tests that appears when using the current 0.9 proton master. The qpid tests pass OK with proton 0.8. The valgrind output in QPID-6415 shows that a connection is deleted while it is being finalized, by a call from pn_connection_unbound to pn_clear_tpwork. I do not yet understand the details, but removing the following strange code fixes the problem and passes the proton test suite without valgrind errors:

{noformat}
--- a/proton-c/src/engine/engine.c
+++ b/proton-c/src/engine/engine.c
@@ -690,10 +690,10 @@ void pn_clear_tpwork(pn_delivery_t *delivery)
 {
     LL_REMOVE(connection, tpwork, delivery);
     delivery->tpwork = false;
-    if (pn_refcount(delivery) > 0) {
-      pn_incref(delivery);
-      pn_decref(delivery);
-    }
   }
 }
{noformat}

The code is strange because (a) you should never examine a refcount except for debugging purposes, and (b) under normal refcounting semantics incref+decref is a no-op. Is removing this code OK?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
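The hazard of an incref/decref pair can be illustrated with a toy refcount model: the pair is a genuine no-op only while the object is alive, but once finalization has begun, the trailing decref re-enters the finalizer, i.e. a double free. This is a generic sketch of one plausible failure mode, not a diagnosis of the actual proton bug (proton's object system, with its hierarchical refcounting between deliveries and connections, is more involved than this).

```python
# Toy refcounted object: decref at zero triggers finalization, which
# here just counts "frees" so a double free is observable.
class Ref:
    def __init__(self):
        self.count = 1
        self.frees = 0

    def incref(self):
        self.count += 1

    def decref(self):
        self.count -= 1
        if self.count == 0:
            self.finalize()

    def finalize(self):
        self.frees += 1          # stands in for free()

r = Ref()
r.incref(); r.decref()           # live object: the pair is a no-op
assert (r.count, r.frees) == (1, 0)

r.decref()                       # last reference dropped: freed once
assert r.frees == 1

r.incref(); r.decref()           # same pair after finalization began:
assert r.frees == 2              # finalize runs again -> "double free"
```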
[jira] [Commented] (PROTON-818) Reactor C soak tests
[ https://issues.apache.org/jira/browse/PROTON-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338047#comment-14338047 ]

ASF subversion and git services commented on PROTON-818:

Commit cc791045eb725ee60da4e3f2e62d7f35c9d82455 in qpid-proton's branch refs/heads/master from Clifford Jansen [ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=cc79104 ]

PROTON-818: Reactor C soak test

> Reactor C soak tests
> --------------------
>                 Key: PROTON-818
>                 URL: https://issues.apache.org/jira/browse/PROTON-818
>             Project: Qpid Proton
>          Issue Type: Test
>          Components: proton-c
>    Affects Versions: 0.9
>            Reporter: Cliff Jansen
>            Assignee: Cliff Jansen
>
> Provide analogous programs to msgr-send and msgr-recv that can extend the soak tests to reactor sample programs.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PROTON-824) Windows fails testIdleTimeout with assert p.conn.remote_condition
[ https://issues.apache.org/jira/browse/PROTON-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336311#comment-14336311 ]

ASF subversion and git services commented on PROTON-824:

Commit 37a0d6b07708beb2eeb21cd2bd97bd756f33ee71 in qpid-proton's branch refs/heads/master from Clifford Jansen [ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=37a0d6b ]

PROTON-824: idle timeout test on Windows: make selector PN_ERROR handling more like POSIX, especially for broken connections

> Windows fails testIdleTimeout with assert p.conn.remote_condition
> -----------------------------------------------------------------
>                 Key: PROTON-824
>                 URL: https://issues.apache.org/jira/browse/PROTON-824
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: 0.9
>         Environment: Windows Server 2008 or 2012, Visual Studio 2010, x86
>            Reporter: Chuck Rolke
>
> {noformat}
> 1: proton_tests.engine.ServerTest.testIdleTimeout . fail
> 1: Error during test: Traceback (most recent call last):
> 1:   File "D:/Users/crolke/git/rh-qpid-proton/tests/python/proton-test", line 355, in run
> 1:     phase()
> 1:   File "D:\Users\crolke\git\rh-qpid-proton\tests\python\proton_tests\engine.py", line 1919 (or so), in testIdleTimeout
> 1:     assert p.conn.remote_condition
> 1: AssertionError
> {noformat}
> Playing with the program's explicit timeout (trying 10 instead of 3) gets the test to pass sometimes. It passes sometimes with 3 as well, but normally fails.
> In debugging this, it looks like there is no synchronization between what a test shows through print statements and what the proton library shows through PN_TRACE_FRM statements. Are there any hints for lining these up?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Proposed SASL changes (API and functional)
Hi Andrew,

I'm definitely not a Proton expert, so please excuse me if I missed something. But I find this part a bit dangerous:

> Classically in protocols where SASL was not optional, the way to avoid double authentication was to use the EXTERNAL SASL mechanism. With AMQP, SASL is optional, so if SSL is used for client authentication the SASL layer could be entirely omitted, and so using EXTERNAL is not necessary.

I understand the idea, and I would even agree that this is the proper way to do it in the long term. But I'm not sure whether all brokers support this concept. For example, I'm not sure whether you can configure the Qpid C++ broker to accept AMQP 1.0 connections with SSL client authentication without SASL EXTERNAL while at the same time accepting AMQP 0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that allowing SSL client authentication only without SASL could cause some serious incompatibilities - I think both should be possible / supported.

Regards
Jakub

On Tue, Feb 24, 2015 at 9:48 PM, Andrew Stitcher astitc...@redhat.com wrote:

As many of you know, I've been working on implementing a SASL AMQP protocol layer that does more than PLAIN and ANONYMOUS for proton-c. I'm currently at a point where the work is reasonably functional (with some gaps).

I've put together a fairly comprehensive account of this work on the Apache wiki: https://cwiki.apache.org/confluence/x/B5cWAw If you are at all interested, please go and look at the proposal and comment on it there.

You can see my actual code changes in my github proton repo: https://github.com/astitcher/qpid-proton/commits/sasl-work [This is my working branch, so not all the changes make a lot of sense; just pay attention to the tip of the branch.]

In a short while, when people have had enough time to absorb the proposal and comment, I will post a code review of the actual code changes.
As there are substantial API changes I'd like to get this in for 0.9 because we were intending to stabilise the API at this point. Thanks. Andrew
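Jakub's compatibility concern can be restated as a small decision table for how a broker judges one inbound connection. Everything here is hypothetical, no broker exposes exactly these flags; the sketch only contrasts the proposal's "SSL auth, no SASL layer" path with a broker policy that insists on a SASL layer.

```python
def accepted(ssl_client_auth, offers_sasl, mech, require_sasl):
    """Toy broker decision for one inbound connection.

    ssl_client_auth: the TLS layer verified a client certificate
    offers_sasl:     the client performed a SASL negotiation at all
    mech:            the SASL mechanism chosen (if any)
    require_sasl:    broker policy mandates a SASL layer
    """
    if offers_sasl:
        if mech == "EXTERNAL":
            # EXTERNAL delegates to the SSL layer's client-cert check
            return ssl_client_auth
        return False  # other mechanisms elided in this sketch
    # No SASL layer at all: acceptable only if policy permits omission
    return ssl_client_auth and not require_sasl

# The proposal's path: SSL client auth with the SASL layer omitted
assert accepted(True, False, None, require_sasl=False)
# Jakub's worry: the same connection against a broker that mandates SASL
assert not accepted(True, False, None, require_sasl=True)
# The classic EXTERNAL path works under either policy
assert accepted(True, True, "EXTERNAL", require_sasl=True)
```

The point of the table is simply that both rows need to be supported: clients using EXTERNAL and clients omitting SASL entirely should both be able to authenticate via SSL.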