Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
This isn't necessarily a proton bug. Nothing in the referenced checkin
actually touches the logic around allocating/freeing error strings, it
merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where
previously it threw away the error information. This would suggest that
there is perhaps a pre-existing bug in dispatch where it is calling
pn_send/pn_recv with a pn_io_t that has been freed, and it is only now
triggering due to the additional asserts that are encountered due to not
ignoring the error information.

I could be mistaken, but I would try reproducing this under valgrind. That
will tell you where the first free occurred and that should hopefully make
it obvious whether this is indeed a proton bug or whether dispatch is
somehow freeing the pn_io_t sooner than it should.

(FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)

--Rafael

On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com
wrote:

 ...but if not, somebody please feel free to correct me.

 The Jira that I just created -- PROTON-826 -- is for a
 bug I found with my topology testing of the Dispatch Router,
 in which I repeatedly kill and restart a router and make
 sure that the router network comes back to the same topology
 that it had before.

 As of checkin 01cb00c -- which had no Jira -- it is pretty
 easy for my test to blow core.  It looks like an error
 string is being double-freed (maybe) in the proton library.

 ( full info in the Jira.  https://issues.apache.org/jira/browse/PROTON-826
 )





Re: I think that's a blocker...

2015-02-25 Thread Michael Goulish

Good point!  I'm afraid it will take me the rest of my life
to reproduce under valgrind ... but ... I'll see what I can do.

In the meantime -- I'm not sure what to do with a Jira if the
provenance is in doubt...


- Original Message -
 This isn't necessarily a proton bug. Nothing in the referenced checkin
 actually touches the logic around allocating/freeing error strings, it
 merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where
 previously it threw away the error information. This would suggest that
 there is perhaps a pre-existing bug in dispatch where it is calling
 pn_send/pn_recv with a pn_io_t that has been freed, and it is only now
 triggering due to the additional asserts that are encountered due to not
 ignoring the error information.
 
 I could be mistaken, but I would try reproducing this under valgrind. That
 will tell you where the first free occurred and that should hopefully make
 it obvious whether this is indeed a proton bug or whether dispatch is
 somehow freeing the pn_io_t sooner than it should.
 
 (FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)
 
 --Rafael
 
 On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com
 wrote:
 
  ...but if not, somebody please feel free to correct me.
 
  The Jira that I just created -- PROTON-826 -- is for a
  bug I found with my topology testing of the Dispatch Router,
  in which I repeatedly kill and restart a router and make
  sure that the router network comes back to the same topology
  that it had before.
 
  As of checkin 01cb00c -- which had no Jira -- it is pretty
  easy for my test to blow core.  It looks like an error
  string is being double-freed (maybe) in the proton library.
 
  ( full info in the Jira.  https://issues.apache.org/jira/browse/PROTON-826
  )
 
 
 
 


[jira] [Commented] (PROTON-614) PHP Fatal error: Uncaught exception 'MessengerException' with message '[-5]: no valid sources'

2015-02-25 Thread Rajkumar (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336569#comment-14336569
 ] 

Rajkumar commented on PROTON-614:
-

Does anyone know why this error occurs? I am getting this error from the C 
language binding too. 

 PHP Fatal error:  Uncaught exception 'MessengerException' with message '[-5]: 
 no valid sources'
 ---

 Key: PROTON-614
 URL: https://issues.apache.org/jira/browse/PROTON-614
 Project: Qpid Proton
  Issue Type: Bug
  Components: php-binding
Affects Versions: 0.7
 Environment: Ubuntu Linux on Amazon EC2
Reporter: Jose Berardo Cunha
Priority: Minor
  Labels: php, php-proton

 Sorry, but I don't know if it is really a bug or I'm doing something wrong. 
 There is no documentation for Proton PHP, so here I am.
 Every time I try to execute recv.php sample I get this error:
 [0x16d2180]:ERROR[0] (null)
 [0x16d2180]:ERROR[0] (null)
 CONNECTION ERROR connection aborted (remote)
 PHP Fatal error:  Uncaught exception 'MessengerException' with message '[-5]: 
 no valid sources' in /usr/share/php/proton.php:61
 Stack trace:
 #0 /usr/share/php/proton.php(146): Messenger->_check(-5)
 #1 /home/ubuntu/qpid-proton-0.7/tests/smoke/recv.php(20): Messenger->recv()
 #2 {main}
   thrown in /usr/share/php/proton.php on line 61
 I've tried all of this command lines:
 $ php recv.php 
 $ php recv.php amqp://0.0.0.0
 $ php recv.php amqp://127.0.0.1
 $ php recv.php amqp://localhost
 $ php recv.php amqp://localhost:5672
 $ php recv.php amqp://localhost/myqueue
 $ php recv.php amqp://localhost:5672/myqueue
 Where myqueue is my first queue created at Qpid 0.28 Web Console.
 I got the same connection-aborted error on send.php, but what is weird is that 
 when I run send.php against an ActiveMQ Apollo broker I get no error message, 
 and for one or two seconds I can see a line referring to my message at its 
 web console. So I presumed that send.php is able to connect to an AMQP 
 broker, but I don't know why it doesn't connect to Qpid 0.28. 
 Please, what am I doing wrong, what are "valid sources", and where can I find 
 more documentation about the Proton PHP library?
 Even the proton.php and cproton.php created by Swig have no comments.
 Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Proposed SASL changes (API and functional)

2015-02-25 Thread Alan Conway
On Tue, 2015-02-24 at 15:48 -0500, Andrew Stitcher wrote:
 As many of you know I've been working on implementing a SASL AMQP
 protocol layer that does more than PLAIN and ANONYMOUS for proton-c.
 
 I'm currently at a point where the work is reasonably functional
 (with some gaps)
 
 I've put together a fairly comprehensive account of this work on the
 Apache wiki: https://cwiki.apache.org/confluence/x/B5cWAw
 
 If you are at all interested please go and look at the proposal and
 comment on it there.
 
 You can see my actual code changes in my github proton repo:
 https://github.com/astitcher/qpid-proton/commits/sasl-work
 
 [This is my working branch, so not all the changes make a lot of sense,
 just pay attention to the tip of the branch]
 
 In a short while when people have had enough time to absorb the proposal
 and comment I will post a code review of the actual code changes. As
 there are substantial API changes I'd like to get this in for 0.9
 because we were intending to stabilise the API at this point.

This looks very good to me.

One ignorant question: Qpid has a min/max Security Strength Factor for
encryption rather than a binary enable/disable. Is that relevant here?

Cheers,
Alan.



Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 9:53 AM, Alan Conway acon...@redhat.com wrote:

 On Wed, 2015-02-25 at 09:04 -0500, Michael Goulish wrote:
  Good point!  I'm afraid it will take me the rest of my life
  to reproduce under valgrind .. but ... I'll see what I can do

 Try this in your environment:
  export MALLOC_PERTURB_=66
 That will cause malloc to immediately fill freed memory with 0x42 bytes
 so it is obvious when you gdb the core dump if someone is using freed
 memory.

 It's not as informative as valgrind but has no performance impact that I
 can detect, and it often helps to crash faster and closer to the real
 problem. Freed memory can hold valid-seeming values for a while so your
 code may not notice immediately, whereas 4242424242 is rarely valid for
 anything.
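The suggestion above is a one-line environment change before launching the test run (the qdrouterd invocation is illustrative):

```shell
# 66 decimal == 0x42, so glibc overwrites freed heap memory with 'B'
# bytes; use-after-free reads then stand out in the core dump.
export MALLOC_PERTURB_=66
printf '0x%x\n' 66    # confirm the perturb byte value
# then run the usual test, e.g.:
# /home/mick/dispatch/install/sbin/qdrouterd -c qdrouterd.conf
```
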

  In the meantime -- I'm not sure what to do with a Jira if the
  provenance is in doubt...

 Maybe just put a note on it till we know more.


+1

FWIW, given that it looks like we are gonna wait for Andrew's SASL changes
to land, and Gordon is away till the end of the week and still working on
some documentation that would be nice to get into the release, I suspect
you have at least a few days to investigate the issue before we get a
release candidate for 0.9.

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Ted Ross
Would it be safe to assume that any operations on driver->io are not 
thread safe?


Dispatch is a multi-threaded application.  It looks to me as though 
io->error is a resource shared across the threads in an unsafe way.


-Ted

On 02/25/2015 08:55 AM, Rafael Schloming wrote:

This isn't necessarily a proton bug. Nothing in the referenced checkin
actually touches the logic around allocating/freeing error strings, it
merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where
previously it threw away the error information. This would suggest that
there is perhaps a pre-existing bug in dispatch where it is calling
pn_send/pn_recv with a pn_io_t that has been freed, and it is only now
triggering due to the additional asserts that are encountered due to not
ignoring the error information.

I could be mistaken, but I would try reproducing this under valgrind. That
will tell you where the first free occurred and that should hopefully make
it obvious whether this is indeed a proton bug or whether dispatch is
somehow freeing the pn_io_t sooner than it should.

(FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)

--Rafael

On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish mgoul...@redhat.com
wrote:


...but if not, somebody please feel free to correct me.

The Jira that I just created -- PROTON-826 -- is for a
bug I found with my topology testing of the Dispatch Router,
in which I repeatedly kill and restart a router and make
sure that the router network comes back to the same topology
that it had before.

As of checkin 01cb00c -- which had no Jira -- it is pretty
easy for my test to blow core.  It looks like an error
string is being double-freed (maybe) in the proton library.

( full info in the Jira.  https://issues.apache.org/jira/browse/PROTON-826
)







Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote:

 Would it be safe to assume that any operations on driver->io are not
 thread safe?

 Dispatch is a multi-threaded application.  It looks to me as though
 io->error is a resource shared across the threads in an unsafe way.


Interesting... so this is what the docs say:

/**
 * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
 * pn_io_t object may have zero or one pn_selector_t selectors
 * associated with it (see ::pn_io_selector()).  If one is associated,
 * all the pn_socket_t handles managed by a pn_io_t must use that
 * pn_selector_t instance.
 *
 * The pn_io_t interface is single-threaded. All methods are intended
 * to be used by one thread at a time, except that multiple threads
 * may use:
 *
 *   ::pn_write()
 *   ::pn_send()
 *   ::pn_recv()
 *   ::pn_close()
 *   ::pn_selector_select()
 *
 * provided at most one thread is calling ::pn_selector_select() and
 * the other threads are operating on separate pn_socket_t handles.
 */

I think this has been somewhat modified by the constraints from the windows
implementation, and I'm not sure I understand completely what the
constraints are there, or entirely what is being described above, but on
the posix front, the pn_io_t is little more than just a holder for an error
slot, and you should have one of these per thread. It shouldn't be a
problem to use send/recv/etc from multiple threads though so long as you
pass in the pn_io_t from the current thread.
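The "one per thread" suggestion can be sketched with C11 thread-local storage; `err_slot` stands in for the error-holding part of pn_io_t so the example compiles without the Proton headers (all names here are hypothetical, not Proton API):

```c
#include <string.h>

typedef struct { int code; char text[128]; } err_slot;

/* one slot per thread: no sharing, so no locking and no double free */
static _Thread_local err_slot thread_err;

/* analogous in spirit to pn_error_format(): record an error without
 * touching any other thread's slot */
void err_set(int code, const char *msg)
{
    thread_err.code = code;
    strncpy(thread_err.text, msg, sizeof(thread_err.text) - 1);
    thread_err.text[sizeof(thread_err.text) - 1] = '\0';
}

int err_code(void) { return thread_err.code; }
const char *err_text(void) { return thread_err.text; }
```

Each thread calling pn_send/pn_recv would pass its own context, so an error recorded on one thread can never clobber or free another thread's string.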

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Ted Ross



On 02/25/2015 11:52 AM, Rafael Schloming wrote:

On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote:


Would it be safe to assume that any operations on driver->io are not
thread safe?

Dispatch is a multi-threaded application.  It looks to me as though
io->error is a resource shared across the threads in an unsafe way.



Interesting... so this is what the docs say:

/**
  * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
  * pn_io_t object may have zero or one pn_selector_t selectors
  * associated with it (see ::pn_io_selector()).  If one is associated,
  * all the pn_socket_t handles managed by a pn_io_t must use that
  * pn_selector_t instance.
  *
  * The pn_io_t interface is single-threaded. All methods are intended
  * to be used by one thread at a time, except that multiple threads
  * may use:
  *
  *   ::pn_write()
  *   ::pn_send()
  *   ::pn_recv()
  *   ::pn_close()
  *   ::pn_selector_select()
  *
  * provided at most one thread is calling ::pn_selector_select() and
  * the other threads are operating on separate pn_socket_t handles.
  */


I claim that the commit-in-question violates the text above.  Calls to 
pn_send() and pn_recv() are no longer thread-safe because they now use 
the shared error record.




I think this has been somewhat modified by the constraints from the windows
implementation, and I'm not sure I understand completely what the
constraints are there, or entirely what is being described above, but on
the posix front, the pn_io_t is little more than just a holder for an error
slot, and you should have one of these per thread. It shouldn't be a
problem to use send/recv/etc from multiple threads though so long as you
pass in the pn_io_t from the current thread.


It's not desirable to allocate sockets to threads up front (i.e. 
partition the set of sockets into per-thread slots).  I know you didn't 
say that was needed but it's what I infer from the docs for pn_io_t.


Assuming, as you suggest, that pn_io_t is nothing more than a 
thread-specific error notepad seems like a recipe for future disaster 
because pn_io_t is clearly intended to be more than that.


-Ted


Re: I think that's a blocker...

2015-02-25 Thread Cliff Jansen
A pn_io_t is heavyweight in Windows, because it has an opposite usage
pattern and moves a lot of kernel stuff into user space compared to
POSIX.

The quoted documentation was my attempt to capture the Dispatch usage
pattern, which I assumed would be typical of an application trying to
spread proton engine use between threads: basically single access to
pn_selector_select() via a condition variable, and no more than one
thread working on a given selectable (using proton engine
encoding/decoding etc., not just io).

In the end, we could just add a zillion locks into the Windows code
and make it look like it is as thread safe as the POSIX counterpart
(which has implicit safety when it does in the kernel what Windows is
doing in user space), but that would defeat using IO completion ports
at all.  The documentation was my attempt at balancing performance
with sophisticated proton usage on multiple platforms.

Note that there is only one pn_selector_t allowed per pn_io_t (a very
strong Windows completion port requirement, and sockets are bound to a
single completion port for life).

On Wed, Feb 25, 2015 at 8:52 AM, Rafael Schloming r...@alum.mit.edu wrote:
 On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote:

 Would it be safe to assume that any operations on driver->io are not
 thread safe?

 Dispatch is a multi-threaded application.  It looks to me as though
 io->error is a resource shared across the threads in an unsafe way.


 Interesting... so this is what the docs say:

 /**
  * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
  * pn_io_t object may have zero or one pn_selector_t selectors
  * associated with it (see ::pn_io_selector()).  If one is associated,
  * all the pn_socket_t handles managed by a pn_io_t must use that
  * pn_selector_t instance.
  *
  * The pn_io_t interface is single-threaded. All methods are intended
  * to be used by one thread at a time, except that multiple threads
  * may use:
  *
  *   ::pn_write()
  *   ::pn_send()
  *   ::pn_recv()
  *   ::pn_close()
  *   ::pn_selector_select()
  *
  * provided at most one thread is calling ::pn_selector_select() and
  * the other threads are operating on separate pn_socket_t handles.
  */

 I think this has been somewhat modified by the constraints from the windows
 implementation, and I'm not sure I understand completely what the
 constraints are there, or entirely what is being described above, but on
 the posix front, the pn_io_t is little more than just a holder for an error
 slot, and you should have one of these per thread. It shouldn't be a
 problem to use send/recv/etc from multiple threads though so long as you
 pass in the pn_io_t from the current thread.

 --Rafael


PROTON-827: Reactive client binding for the go programming language

2015-02-25 Thread Alan Conway
I plan to start working on a go golang.org binding for proton. I
envisage a SWIG binding similar to the other swig-based bindings
(python, ruby, etc.) and an API layer similar to the new reactive Python
API (based on the C reactor.)

This will be an exploratory effort to begin with, I'd like to hear from
anybody who might be interested in using such a thing or helping to
implement it.

Cheers,
Alan.



[jira] [Commented] (PROTON-826) recent checkin causes frequent double-free or corruption crash

2015-02-25 Thread michael goulish (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336824#comment-14336824
 ] 

michael goulish commented on PROTON-826:


It looks like the problem here is just that the error struct used in  
proton-c/src/error.c is not thread safe -- so I am opening a new Jira for 
Dispatch.

I am leaving this one open for now, however, because other applications using 
proton will encounter this.  Either something could be changed in proton to 
make this less thread-hostile, or ... it could be publicized better?

Please feel free to close when appropriate.



 recent checkin causes frequent double-free or corruption crash
 --

 Key: PROTON-826
 URL: https://issues.apache.org/jira/browse/PROTON-826
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Priority: Blocker

 In my dispatch testing I am seeing frequent crashes in the proton library that 
 began with proton checkin 01cb00c on 2015-02-15, "report read and write 
 errors through the transport".
 The output at crash-time says this:
 ---
 *** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or 
 corruption (fasttop): 0x020ee880 ***
 === Backtrace: =
 /lib64/libc.so.6[0x3e3d875a4f]
 /lib64/libc.so.6[0x3e3d87cd78]
 /lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
 /lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
 /lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
 /lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
 /lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
 /lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
 /home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]
 The backtrace from the core file looks like this:
 
 #0  0x003e3d835877 in raise () from /lib64/libc.so.6
 #1  0x003e3d836f68 in abort () from /lib64/libc.so.6
 #2  0x003e3d875a54 in __libc_message () from /lib64/libc.so.6
 #3  0x003e3d87cd78 in _int_free () from /lib64/libc.so.6
 #4  0x7fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
 at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
 #5  0x7fbf8a59b311 in pn_error_set (error=error@entry=0x1501140, 
 code=code@entry=-2,
 text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
 at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
 #6  0x7fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2, 
 fmt=<optimized out>,
 ap=ap@entry=0x7fbf801a6de8) at 
 /home/mick/rh-qpid-proton/proton-c/src/error.c:81
 #7  0x7fbf8a59b402 in pn_error_format (error=error@entry=0x1501140, 
 code=<optimized out>,
 fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s") at 
 /home/mick/rh-qpid-proton/proton-c/src/error.c:89
 #8  0x7fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140,
 msg=msg@entry=0x7fbf8a5bbe1a "recv")
 at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
 #9  0x7fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>, 
 buf=<optimized out>,
 size=<optimized out>) at 
 /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
 #10 0x7fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)
 -
 And I can prevent the crash from happening, apparently forever, by commenting 
 out this line:
   free(error->text);
 in the function  pn_error_clear
 in the file proton-c/src/error.c
 The error text that is being freed which causes the crash looks like this:
   $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable", 
 root = 0x0, code = -2}
 My dispatch test creates a router network and then repeatedly kills and 
 restarts a randomly-selected router.  After this proton checkin it almost 
 never gets through 5 iterations without this crash.  After I commented out 
 that line, it got through more than 500 iterations before I stopped it.





[jira] [Created] (PROTON-827) Reactive client binding for the go programming language.

2015-02-25 Thread Alan Conway (JIRA)
Alan Conway created PROTON-827:
--

 Summary: Reactive client binding for the go programming language.
 Key: PROTON-827
 URL: https://issues.apache.org/jira/browse/PROTON-827
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: 0.9
Reporter: Alan Conway
Assignee: Alan Conway


Develop a reactive API binding in go http://golang.org/, similar to the 
existing reactive python API illustrated in examples/python. It should follow 
the pattern of the existing python and C reactive APIs as far as possible while 
respecting common conventions and idioms of the go language.





Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 12:48 PM, Ted Ross tr...@redhat.com wrote:



 On 02/25/2015 11:52 AM, Rafael Schloming wrote:

 On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote:

  Would it be safe to assume that any operations on driver->io are not
 thread safe?

 Dispatch is a multi-threaded application.  It looks to me as though
 io->error is a resource shared across the threads in an unsafe way.


 Interesting... so this is what the docs say:

 /**
   * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
   * pn_io_t object may have zero or one pn_selector_t selectors
   * associated with it (see ::pn_io_selector()).  If one is associated,
   * all the pn_socket_t handles managed by a pn_io_t must use that
   * pn_selector_t instance.
   *
   * The pn_io_t interface is single-threaded. All methods are intended
   * to be used by one thread at a time, except that multiple threads
   * may use:
   *
   *   ::pn_write()
   *   ::pn_send()
   *   ::pn_recv()
   *   ::pn_close()
   *   ::pn_selector_select()
   *
   * provided at most one thread is calling ::pn_selector_select() and
   * the other threads are operating on separate pn_socket_t handles.
   */


 I claim that the commit-in-question violates the text above.  Calls to
 pn_send() and pn_recv() are no longer thread-safe because they now use the
 shared error record.


You could be right. I'm not entirely sure how to interpret the above text.
I don't know that I would necessarily consider pn_send/pn_recv to be
methods of pn_io_t.



 I think this has been somewhat modified by the constraints from the
 windows
 implementation, and I'm not sure I understand completely what the
 constraints are there, or entirely what is being described above, but on
 the posix front, the pn_io_t is little more than just a holder for an
 error
 slot, and you should have one of these per thread. It shouldn't be a
 problem to use send/recv/etc from multiple threads though so long as you
 pass in the pn_io_t from the current thread.


 It's not desirable to allocate sockets to threads up front (i.e. partition
 the set of sockets into per-thread slots).  I know you didn't say that was
 needed but it's what I infer from the docs for pn_io_t.


I don't think the posix implementation requires this.


 Assuming, as you suggest, that pn_io_t is nothing more than a
 thread-specific error notepad seems like a recipe for future disaster
 because pn_io_t is clearly intended to be more than that.


It may not work out so well on windows, I honestly don't know what the
situation is there, but certainly for posix systems I think we need
*something* in this area to function as a context that can be associated
with thread-or-smaller granularities. Having some way to return error
information is just one example of a useful context to be able to use at
thread or smaller granularities. If the windows I/O APIs require a
heavier-weight interface, then perhaps we need to factor it into two
different parts.

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
Maybe my head is just thick today, but even staring at the docs a couple
times and reading through what you have below, I can't say I quite
understand what you're going for. What are the actual constraints for the
windows APIs and what is the heavyweight stuff pn_io_t is doing?

--Rafael

On Wed, Feb 25, 2015 at 1:02 PM, Cliff Jansen cliffjan...@gmail.com wrote:

 A pn_io_t is heavyweight in Windows, because it has an opposite usage
 pattern and moves a lot of kernel stuff into user space compared to
 POSIX.

 The quoted documentation was my attempt to capture the Dispatch usage
 pattern, which I assumed would be typical of an application trying to
 spread proton engine use between threads: basically single access to
 pn_selector_select() via a condition variable, and no more than one
 thread working on a given selectable (using proton engine
 encoding/decoding etc., not just io).

 In the end, we could just add a zillion locks into the Windows code
 and make it look like it is as thread safe as the POSIX counterpart
 (which has implicit safety when it does in the kernel what Windows is
 doing in user space), but that would defeat using IO completion ports
 at all.  The documentation was my attempt at balancing performance
 with sophisticated proton usage on multiple platforms.

 Note that there is only one pn_selector_t allowed per pn_io_t (a very
 strong Windows completion port requirement, and sockets are bound to a
 single completion port for life).

 On Wed, Feb 25, 2015 at 8:52 AM, Rafael Schloming r...@alum.mit.edu
 wrote:
  On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross tr...@redhat.com wrote:
 
 Would it be safe to assume that any operations on driver->io are not
  thread safe?
 
  Dispatch is a multi-threaded application.  It looks to me as though
 io->error is a resource shared across the threads in an unsafe way.
 
 
  Interesting... so this is what the docs say:
 
  /**
   * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
   * pn_io_t object may have zero or one pn_selector_t selectors
   * associated with it (see ::pn_io_selector()).  If one is associated,
   * all the pn_socket_t handles managed by a pn_io_t must use that
   * pn_selector_t instance.
   *
   * The pn_io_t interface is single-threaded. All methods are intended
   * to be used by one thread at a time, except that multiple threads
   * may use:
   *
   *   ::pn_write()
   *   ::pn_send()
   *   ::pn_recv()
   *   ::pn_close()
   *   ::pn_selector_select()
   *
   * provided at most one thread is calling ::pn_selector_select() and
   * the other threads are operating on separate pn_socket_t handles.
   */
 
  I think this has been somewhat modified by the constraints from the
 windows
  implementation, and I'm not sure I understand completely what the
  constraints are there, or entirely what is being described above, but on
  the posix front, the pn_io_t is little more than just a holder for an
 error
  slot, and you should have one of these per thread. It shouldn't be a
  problem to use send/recv/etc from multiple threads though so long as you
  pass in the pn_io_t from the current thread.
 
  --Rafael



Re: Proposed SASL changes (API and functional)

2015-02-25 Thread Andrew Stitcher
On Wed, 2015-02-25 at 10:27 +0100, Jakub Scholz wrote:
 ...
 But I find this part a bit dangerous:
 Classically in protocols where SASL was not optional the way to avoid
 double authentication was to use the EXTERNAL SASL mechanism. With AMQP,
 SASL is optional, so if SSL is used for client authentication the SASL
 layer could be entirely omitted and so using EXTERNAL is not necessary.
 

This is really just a statement about how AMQP 1.0 works - if you like -
it is an aside praising the good protocol design sense of the standard's
authors (you know who you are!).

 I understand the idea and I would even agree that this is the proper way
 how to do it in the long term. But I'm not sure whether all brokers support
 this concept. For example, I'm not sure whether you can configure the Qpid
 C++ broker in a way to accept AMQP 1.0 connections with SSL Client
 Authentication without SASL EXTERNAL while at the same time accepting AMQP
 0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that
 allowing SSL Client Authentication only without SASL could cause some
 serious incompatibilities - I think both should be possible / supported.

And both are supported.

The qpidd 0-10 support is not going to change. The qpidd 1.0 support is
on a different code path so there is little bleed over in functionality.

The proton server code can auto detect which protocol layers the client
is using, and subject to it being an allowed protocol configuration,
authenticate it.

Other AMQP 1.0 implementations may not support leaving out the SASL
layer and so you can certainly always tell the client to use it (even if
it adds no useful functionality as in the ANONYMOUS and EXTERNAL cases).

So as far as the current plans for proton go if you require SSL client
authentication it will happen whether or not a SASL layer is there.

As EXTERNAL and better SSL integration with the transport code is not
yet implemented there may be something significant I've missed in this
analysis, in which case  it's all subject to change!

I hope that helps.

Andrew



Re: Proposed SASL changes (API and functional)

2015-02-25 Thread Andrew Stitcher
On Tue, 2015-02-24 at 15:48 -0500, Andrew Stitcher wrote:
 ...
 If you are at all interested please go and look at the proposal and
 comment on it there.

Thank you very much to Alan and Jakub for commenting on my proposal.

The reason I asked people to comment over on the wiki is that it is very
hard to find a discussion like this related to a specific proposal after
some time has elapsed if it is in email, whereas actually attached to
the proposal on the wiki keeps all the relevant comments together.

If it is ok with them I will copy the comments over there:
Alan, Jakub?

Thanks

Andrew




Re: Proposed SASL changes (API and functional)

2015-02-25 Thread Andrew Stitcher
On Wed, 2015-02-25 at 10:46 -0500, Alan Conway wrote:
 ...
 One ignorant question: Qpid has a min/max Security Strength Factor for
 encryption rather than a binary enable/disable. Is that relevant here?

(Hardly an ignorant question!) You make a very good point, and this
design may indeed be a little simplistic - largely because I've not
implemented the encryption side yet!

1. I doubt that max ssf is all that useful in practice.
2. Effectively pn_transport_require_encryption() is the same as setting a
min ssf of 1, but is simpler to understand! An alternative might be
pn_transport_require_ssf(int); however, that isn't as clear, and it's not
obvious how to choose the ssf value. Perhaps the '1' should be
configurable differently.

Some input from those who did the similar work in qpidd might be useful.

Just some random wittering.

Andrew




[jira] [Created] (PROTON-828) Python binding does not support MODIFIED delivery state

2015-02-25 Thread Ken Giusti (JIRA)
Ken Giusti created PROTON-828:
-

 Summary: Python binding does not support MODIFIED delivery state
 Key: PROTON-828
 URL: https://issues.apache.org/jira/browse/PROTON-828
 Project: Qpid Proton
  Issue Type: Bug
  Components: python-binding
Affects Versions: 0.8
Reporter: Ken Giusti
Assignee: Ken Giusti
Priority: Blocker
 Fix For: 0.9


>>> import proton
>>> proton.RELEASED
RELEASED
>>> proton.MODIFIED
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'MODIFIED'
>>> proton.ACCEPTED
ACCEPTED



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


ApacheCon North America 2015

2015-02-25 Thread Rafael Schloming
Hi Everyone,

I'll be attending ApacheCon in April. I was wondering if there are others
that plan to go and if so would there be any interest in having an informal
BOF/hackathon/get-together either during the conference or after hours?

--Rafael


Re: PROTON-827: Reactive client binding for the go programming language

2015-02-25 Thread Richard Li
+1

Have you thought about how to integrate Go channels with AMQP links?

Richard

On Wed, Feb 25, 2015 at 12:13 PM, Alan Conway acon...@redhat.com wrote:

 I plan to start working on a go golang.org binding for proton. I
 envisage a SWIG binding similar to the other swig-based bindings
 (python, ruby, etc.) and an API layer similar to the new reactive Python
 API (based on the C reactor.)

 This will be an exploratory effort to begin with, I'd like to hear from
 anybody who might be interested in using such a thing or helping to
 implement it.

 Cheers,
 Alan.




[jira] [Created] (PROTON-829) Possible reference counting bug in pn_clear_tpwork

2015-02-25 Thread Alan Conway (JIRA)
Alan Conway created PROTON-829:
--

 Summary: Possible reference counting bug in pn_clear_tpwork
 Key: PROTON-829
 URL: https://issues.apache.org/jira/browse/PROTON-829
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.8
Reporter: Alan Conway
Assignee: Alan Conway
 Fix For: 0.9


See QPID-6415 which describes a core dump in the qpid tests that appears when 
using the current 0.9 proton master. The qpid tests pass OK with proton 0.8.

The valgrind output in QPID-6415 shows that a connection is deleted while it is 
being finalized by a call from pn_connection_unbound to pn_clear_tpwork.

I do not yet understand the details, but removing the following strange code 
fixes the problem and passes the proton test suite without valgrind errors:

{noformat}
--- a/proton-c/src/engine/engine.c
+++ b/proton-c/src/engine/engine.c
@@ -690,10 +690,10 @@ void pn_clear_tpwork(pn_delivery_t *delivery)
   {
     LL_REMOVE(connection, tpwork, delivery);
     delivery->tpwork = false;
-    if (pn_refcount(delivery) > 0) {
-      pn_incref(delivery);
-      pn_decref(delivery);
-    }
   }
 }
{noformat}

The code is strange because
a) you should never examine a refcount except for debugging purposes
b) under normal refcounting semantics incref+decref is a no-op.

Is removing this code OK?





[jira] [Commented] (PROTON-818) Reactor C soak tests

2015-02-25 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338047#comment-14338047
 ] 

ASF subversion and git services commented on PROTON-818:


Commit cc791045eb725ee60da4e3f2e62d7f35c9d82455 in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=cc79104 ]

PROTON-818: Reactor C soak test


 Reactor C soak tests
 

 Key: PROTON-818
 URL: https://issues.apache.org/jira/browse/PROTON-818
 Project: Qpid Proton
  Issue Type: Test
  Components: proton-c
Affects Versions: 0.9
Reporter: Cliff Jansen
Assignee: Cliff Jansen

 Provide analogous programs to msgr-send and msgr-recv that can extend the 
 soak tests to reactor sample programs.





[jira] [Commented] (PROTON-824) Windows fails testIdleTimeout with assert p.conn.remote_condition

2015-02-25 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PROTON-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336311#comment-14336311
 ] 

ASF subversion and git services commented on PROTON-824:


Commit 37a0d6b07708beb2eeb21cd2bd97bd756f33ee71 in qpid-proton's branch 
refs/heads/master from Clifford Jansen
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=37a0d6b ]

PROTON-824: idle timeout test on Windows: make selector PN_ERROR handling more 
like POSIX, especially for broken connections


 Windows fails testIdleTimeout with assert p.conn.remote_condition
 -

 Key: PROTON-824
 URL: https://issues.apache.org/jira/browse/PROTON-824
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: 0.9
 Environment: Windows Server 2008 or 2012
 Visual studio 2010, x86
Reporter: Chuck Rolke

 {noformat}
 1: proton_tests.engine.ServerTest.testIdleTimeout ... fail
 1: Error during test:  Traceback (most recent call last):
 1: File "D:/Users/crolke/git/rh-qpid-proton/tests/python/proton-test", line 355, in run
 1:   phase()
 1: File "D:\Users\crolke\git\rh-qpid-proton\tests\python\proton_tests\engine.py", line 1919 (or so), in testIdleTimeout
 1:   assert p.conn.remote_condition
 1:   AssertionError
 {noformat}
 Playing with the program's explicit timeout (trying 10 instead of 3) gets the 
 test to pass sometimes. It passes sometimes with 3 as well, but normally fails.
 In debugging this it looks like there was no synchronization between what a 
 test will show through print statements and what the proton library shows 
 through PN_TRACE_FRM statements. Are there any hints for lining these up?





Re: Proposed SASL changes (API and functional)

2015-02-25 Thread Jakub Scholz
Hi Andrew,

I'm definitely not a Proton expert, so please excuse me if I missed
something.

But I find this part a bit dangerous:
"Classically in protocols where SASL was not optional the way to avoid
double authentication was to use the EXTERNAL SASL mechanism. With AMQP,
SASL is optional, so if SSL is used for client authentication the SASL
layer could be entirely omitted and so using EXTERNAL is not necessary."

I understand the idea and I would even agree that this is the proper way
how to do it in the long term. But I'm not sure whether all brokers support
this concept. For example, I'm not sure whether you can configure the Qpid
C++ broker in a way to accept AMQP 1.0 connections with SSL Client
Authentication without SASL EXTERNAL while at the same time accepting AMQP
0-10 connections only with SASL EXTERNAL. Therefore I would be afraid that
allowing SSL Client Authentication only without SASL could cause some
serious incompatibilities - I think both should be possible / supported.

Regards
Jakub

On Tue, Feb 24, 2015 at 9:48 PM, Andrew Stitcher astitc...@redhat.com
wrote:

 As many of you know I've been working on implementing a SASL AMQP
 protocol layer that does more than PLAIN and ANONYMOUS for proton-c.

 I'm currently at a point where the work is reasonably functional
 (with some gaps).

 I've put together a fairly comprehensive account of this work on the
 Apache wiki: https://cwiki.apache.org/confluence/x/B5cWAw

 If you are at all interested please go and look at the proposal and
 comment on it there.

 You can see my actual code changes in my github proton repo:
 https://github.com/astitcher/qpid-proton/commits/sasl-work

 [This is my working branch, so not all the changes make a lot of sense,
 just pay attention to the tip of the branch]

 In a short while when people have had enough time to absorb the proposal
 and comment I will post a code review of the actual code changes. As
 there are substantial API changes I'd like to get this in for 0.9
 because we were intending to stabilise the API at this point.

 Thanks.

 Andrew