Re: PN_REACTOR_QUIESCED

2015-10-14 Thread Rafael Schloming
It wasn't actually an accidental commit. If I recall correctly I ended up
using it more like a 0xDEADBEEF value. It makes it easy to distinguish
between the failure mode of an actual hang (e.g. infinite loop or blocking
call inside a handler) vs reaching a state where there are simply no more
events to process. I guess you can think of it like a heartbeat in a way.
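The behaviour described above can be modelled in a few lines of Python. This is an illustrative sketch only, not proton's reactor code: a loop that fires an explicit "quiesced" callback when its event queue drains lets you tell an idle reactor apart from one stuck inside a handler.

```python
# Toy event loop: the on_quiesced callback fires when there are no more
# events to process, so a silent log means a genuine hang (e.g. a
# blocking call inside a handler) rather than "nothing left to do".
def run(events, on_event, on_quiesced):
    while True:
        if not events:
            # No work left: report quiescence. The handler may schedule
            # more events; returning False ends the loop.
            if not on_quiesced(events):
                return
        else:
            on_event(events.pop(0))
```

A handler could requeue work from `on_quiesced`, which is what makes it behave like a heartbeat.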

--Rafael

On Tue, Oct 13, 2015 at 10:56 AM, Michael Goulish 
wrote:

>
> But it's obvious how this constant was chosen.
>
> With circular reasoning.
>
> 
>
>
>
> - Original Message -
> > On Mon, 2015-10-12 at 16:05 -0400, aconway wrote:
> > > ...
> > > +1, that looks like the right fix. 3141 is an odd choice of default,
> > > even for a mathematician.
> > >
> >
> > At this point, I'm desperately trying to find an appropriate pi joke :
> > -)
> >
> > Andrew
> >
> >
>


Re: All about proton memory management (or C++, python and Go - Oh My!)

2015-08-18 Thread Rafael Schloming
Nice writeup!

--Rafael

On Mon, Aug 17, 2015 at 4:34 PM, aconway  wrote:

> I've been doing a lot of thinking about memory management and proton
> bindings in 4 languages (C, Go, python and C++) and I have Seen The
> Light. Here is a write-up, I'd appreciate feedback in the form of
> email, reviewboard diffs, regular diffs, or just commit improvements if
> you're a committer. I will add this to the official proton API doc after
> incorporating feedback:
>
> https://github.com/apache/qpid-proton/blob/master/docs/markdown/memory_
> management.md
>
>
>


0.10 beta1

2015-07-21 Thread Rafael Schloming
Hi Everyone,

As requested, here is 0.10 beta1. I've attached a log of all commits since
alpha1. You can download from the usual places:

Source artifacts:

  https://people.apache.org/~rhs/qpid-proton-0.10-beta1/

Java binaries:

  https://repository.apache.org/content/repositories/orgapacheqpid-1037

--Rafael
commit 115bd26cbc5501d5bbf850fb8b7811aaf17ba507
Author: Rafael Schloming 
Date:   Tue Jul 21 12:22:14 2015 -0400

Release 0.10

commit 8f82638b7e8299c8c3544516e4b8a18d185d3325
Author: Kenneth Giusti 
Date:   Mon Jul 20 15:08:54 2015 -0400

PROTON-943: bump libqpid-proton shared library major version for 0.10 
release

commit d9ce3cfd0916ae3719cb39a83a6174c5f88b10bb
Author: Gordon Sim 
Date:   Mon Jul 20 14:00:33 2015 -0400

PROTON-905: fix to prevent crash with latest qpidd

commit 02789482e5652dee08fa3fef2d03f8ec6b2f69fa
Merge: 22ca57e 4476e97
Author: Andrew Stitcher 
Date:   Mon Jul 20 01:58:35 2015 -0400

This closes #47

commit 22ca57e140eb8cbdd0c334f3aab63e457453b307
Author: Andrew Stitcher 
Date:   Mon Jul 20 01:29:46 2015 -0400

NO-JIRA: Pass in some more env vars to the tox tests to run them without 
skipping any

commit 4476e97527f4a3b6ae8e85f0bd82cb660270eb16
Author: Andrew Stitcher 
Date:   Thu Jul 9 16:43:20 2015 -0400

NO-JIRA: Change travis configuration to use container based builds

commit 71f0be88ee8649e98571e6208858945f1d2708f6
Author: Andrew Stitcher 
Date:   Fri Jul 17 17:36:23 2015 -0400

NO-JIRA: Make tox get the path for swig from the SWIG env var if present

commit 198af3dbadc5f01f5333bab6313f812ccab0b750
Author: Andrew Stitcher 
Date:   Thu Jul 16 18:32:15 2015 -0400

NO-JIRA: Skip Extended SASL tests if we can't find saslpasswd2

commit eec9cb33ab5ea08ed515a10015a3643f3d09b261
Author: Andrew Stitcher 
Date:   Thu Jul 16 03:24:21 2015 -0400

NO-JIRA: Minor changes to CMake to detect and pass some extra things to 
tests

commit 4ee726002804d7286a8c76b42e0a0717e0798822
Author: mgoulish 
Date:   Fri Jul 17 10:29:13 2015 -0400

PROTON-919: make C behave same as Java wrt channel_max error

commit 17250c94799ac1551fdd53683d3f28f13bdc4764
Author: Robert Gemmell 
Date:   Thu Jul 16 18:12:02 2015 +0100

NO-JIRA: add a couple of clarifying comments

commit 3459aa239252412d77a125d34d7dc51b65ed0201
Author: Robert Gemmell 
Date:   Tue Jul 14 11:23:03 2015 +0100

PROTON-947: deprecate stale methods, to be removed after the imminent 
release

commit b67a2a943017910bcf8bf67a05aafed93ab7b8b1
Author: Robert Gemmell 
Date:   Mon Jul 13 17:28:31 2015 +0100

PROTON-944: add ability to set a default state for use when 
settling/freeing received deliveries without having previously set/sent 
disposition state for them

commit 6d873ebed766fa8a1108f72837ce9b4e05cc5e09
Author: dcristoloveanu 
Date:   Thu Jul 9 13:01:30 2015 -0700

- Fix 2 Code Analysis warnings in iocp.c
- Fix one realloc leak and the fact that the realloc result was not checked.
  This rippled through codec.c as more functions needed to have error 
checking.

Fix amended by astitc...@apache.org

This closes #45

commit 7e43dc32dbdc7a5e7f3b9ac09ddb13c42d168ad1
Author: Andrew Stitcher 
Date:   Thu Jul 9 16:52:27 2015 -0400

PROTON-904: No longer need to include libuuid header

commit ed8e0144a4fccec6d5270198d2dc3aa3e9b22b4b
Author: Luca Ceresoli 
Date:   Fri Jul 10 10:13:47 2015 +0200

proton-c: fix C compiler detection with _ARG1/_ARG2

The C compiler commandline in CMake is composed by the concatenation of
CMAKE_C_COMPILER + CMAKE_C_COMPILER_ARG1 + CMAKE_C_COMPILER_ARG2.

In most use cases the two additional argument variables are empty, thus
CMAKE_C_COMPILER can be used without any noticeable difference.

The Buildroot embedded Linux build system [0], however, optionally exploits
the CMAKE_C_COMPILER_ARG1 variable to speed up the cross-compilation of
CMake-based packages using ccache. It does so by setting [1]:

  CMAKE_C_COMPILER  = /path/to/ccache
  CMAKE_C_COMPILER_ARG1 = /path/to/cross-gcc

This works fine with other CMake-based packages, but proton-c's
CMakeLists.txt calls gcc to extract the compiler version. It does so by
calling "${CMAKE_C_COMPILER} -dumpversion", without honoring the two extra
arguments.
Within Buildroot with ccache enabled, this means calling
"/path/to/ccache -dumpversion", which fails with the error:

  ccache: invalid option -- 'd'

Fix the compiler check by adding the two arguments.

[0] http://buildroot.net/
[1] http://git.buildroot.net/buildroot/tree/support/misc/toolchainfile.cmake.in?id=2015.05

Signed-off-by: Luca Ceresoli 

This closes #46

commit 246007f488950b0ccbd734eadd194c690bd8a049
Author: Kenneth Giusti 
Date:   Thu Jul 9 10:13:52 2015 -0400

NO-JIRA: update developer documentation regarding Python support

commit 2b41931dbb

Re: 0.10 beta?

2015-07-21 Thread Rafael Schloming
Ok, I'll spin one shortly.

Thanks for posting, BTW. It's easy enough to put them out, but harder to
know when people are ready to actually test them. ;-)

--Rafael

On Tue, Jul 21, 2015 at 8:42 AM, Robbie Gemmell 
wrote:

> I think it would be good to do a beta for 0.10, given the alpha has
> been out a couple weeks and had various issues on the proton-j side.
> There are a couple of remaining blockers still needing to be resolved, but
> it would be good to keep the process moving forwards and aid the
> likelihood of getting a release out quickly once those are resolved.
>
> There are currently 6 remaining unresolved JIRAs assigned a 0.10 fix-for:
> http://s.apache.org/ytK
>
> These 2 are currently listed as blockers:
> https://issues.apache.org/jira/browse/PROTON-923
> https://issues.apache.org/jira/browse/PROTON-950
>
> Robbie
>


Re: proton 0.10 blocker

2015-07-20 Thread Rafael Schloming
I'm fine going ahead with Gordon's fix. I don't have a lot of time to dig
into the refcounting issue personally right now, but I'd at least leave the
bug open until we have made it through a bit more testing. I have an uneasy
feeling it (or something closely related) may pop up again if we push
harder on testing.

--Rafael


On Mon, Jul 20, 2015 at 10:43 AM, Ken Giusti  wrote:

> +1 for using Gordon's fix for now - we can cut a beta and see if it holds
> up.
>
> Since there are some unanswered questions regarding the patch's behavior,
> I'll leave the bug open - drop the blocker status - and assign it over to
> Rafi.  He's better cerebrally equipped to understand the reference counting
> implementation than I certainly am.
>
>
>
> - Original Message -
> > From: "Robbie Gemmell" 
> > To: d...@qpid.apache.org, proton@qpid.apache.org
> > Sent: Monday, July 20, 2015 12:03:06 PM
> > Subject: Re: proton 0.10 blocker
> >
> > On 17 July 2015 at 23:32, Gordon Sim  wrote:
> > > On 07/17/2015 10:04 PM, Rafael Schloming wrote:
> > >>
> > >> On Fri, Jul 17, 2015 at 12:47 PM, Gordon Sim  wrote:
> > >>
> > >>> On 07/17/2015 08:15 PM, Rafael Schloming wrote:
> > >>>
> > >>>> On Fri, Jul 17, 2015 at 11:56 AM, Gordon Sim 
> wrote:
> > >>>>
> > >>>>   On 07/17/2015 07:17 PM, Rafael Schloming wrote:
> > >>>>>
> > >>>>>
> > >>>>>   Given this I believe the incref/decref pair is indeed running
> into
> > >>>>>>
> > >>>>>> problems
> > >>>>>> when being invoked from inside a finalizer. I'd be curious if an
> > >>>>>> alternative fix would work. I suspect you could replace the
> additional
> > >>>>>> conditions you added to the if predicate with this:
> > >>>>>>
> > >>>>>>  pn_refcount(endpoint) > 0
> > >>>>>>
> > >>>>>>
> > >>>>> If the refcount is *not* 0, what does the incref/decref sequence
> > >>>>> accomplish?
> > >>>>>
> > >>>>
> > >>>>
> > >>>> I believe the answer to this is the same as the answer I just
> posted on
> > >>>> the
> > >>>> other thread, i.e. the incref may trigger the incref hook (see
> > >>>> pn_xxx_incref in engine.c), and this in turn may update the endpoint
> > >>>> state
> > >>>> and adjust the refcount accordingly. The decref may then end up
> > >>>> finalizing
> > >>>> the object.
> > >>>>
> > >>>
> > >>> Right, understood now.
> > >>>
> > >>> Unfortunately replacing the additional conditions with just that
> check on
> > >>> the refcount doesn't prevent the crash though.
> > >>>
> > >>
> > >> Doh, not the result I was hoping for. Does it yield the same stack
> trace
> > >> as
> > >> before?
> > >
> > >
> > > Yes
> > >
> >
> > Given that and all who looked thinking the earlier proposal was safe,
> > is it worth going with that change at least for now? Knocking things
> > off the blocker list and getting an RC (or even just a beta) out would
> > be really really good.
> >
> > Robbie
> >
>
> --
> -K
>


Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]

2015-07-17 Thread Rafael Schloming
On Fri, Jul 17, 2015 at 10:45 AM, Gordon Sim  wrote:

> Still digesting the explanation (thanks!) but one follow up question:
>
> On 07/17/2015 04:37 PM, Rafael Schloming wrote:
>
>> it isn't actually possible to use the object when their refcount is 0.
>>
>
> What is the purpose of the incref/decref pattern then, e.g. as used in
> pn_session_free()? That is designed to trigger the finalizer, right? If so
> it would seem that can only happen if the reference count is 0 at the point
> when pn_session_free() is called.
>
> (And if it was valid to call pn_session_free for a given session, then
> that session would have been valid for use before that, no?)
>

The refcount should never be zero going into pn_session_free. The situation
where it triggers the finalizer is when the refcount is 1 upon entering
pn_session_free, but the referenced boolean is false meaning that the one
remaining reference that exists is the parent -> child pointer (in this
case the connection -> session pointer). The incref triggers the
pn_session_incref hook which flips the reference around so that the session
-> connection pointer is now the reference and the connection -> session
pointer is the borrowed reference. Logically this momentarily increases the
refcount from 1 to 2 and then the hook decreases it back to 1 and the
decref triggers the finalizer.

Now if pn_session_free is called when the refcount is > 1, or if
pn_ep_decref ends up posting events (events create references to endpoint
objects and thereby increase the refcount), then the pn_incref/decref ends
up being a noop and the session is simply marked as free but finalized when
the last reference to it is decref'ed.
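The arithmetic described above can be sketched as a toy Python model. The names and fields here are illustrative only, not proton's actual C structures: `referenced` plays the role of the flag distinguishing a "real" reference from the parent -> child pointer, and the incref hook performs the flip.

```python
# Toy model of the pn_session_free incref/decref dance described above.
class Session:
    def __init__(self):
        self.refcount = 1        # held by the parent -> child pointer
        self.referenced = False  # True once a "real" reference exists
        self.freed = False
        self.finalized = False

    def incref(self):
        self.refcount += 1
        if not self.referenced:
            # incref hook: flip the parent->child reference into a
            # child->parent reference; the net refcount change is zero
            self.referenced = True
            self.refcount -= 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.finalized = True

def session_free(s):
    # Mark the object free, then do the incref/decref pair: if the only
    # remaining reference was the parent pointer, this finalizes it now;
    # otherwise it is a noop and finalization happens on the last decref.
    s.freed = True
    s.incref()
    s.decref()
```

With a refcount of exactly 1 going in, `session_free` finalizes immediately; if something else (e.g. an event) holds a reference, it merely marks the session free.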

--Rafael


Re: proton 0.10 blocker

2015-07-17 Thread Rafael Schloming
Hi Gordon,

I did my best to dump some useful info on the refcounting stuff in the
other thread. I also posted a comment on the review. As I said there it
would be helpful to see the stack trace from the crash in order to figure
out if the fix is merely a workaround.

--Rafael


On Wed, Jul 15, 2015 at 9:30 AM, Gordon Sim  wrote:

> The latest proton code is causing crashes in qpid-cpp tests that use it.
> I've tracked the problem down to the fix for PROTON-905[1] and proposed an
> enhancement to that fix, https://reviews.apache.org/r/36509/, which
> avoids the crash.
>
> Could someone who understands the logic controlling the lifecycle of
> pn_session_t and pn_link_t objects in detail review and approve this please?
>
>
> [1] https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=653f4e5
>


Re: Semantics of proton refcounts [was Re: proton 0.10 blocker]

2015-07-17 Thread Rafael Schloming
On Thu, Jul 16, 2015 at 7:46 AM, Gordon Sim  wrote:

> On 07/16/2015 02:40 PM, aconway wrote:
>
>> Can someone who understands the proton use of refcounts please add some
>> doc comments to explain the semantics? Apologies if this is already
>> there and I missed it, tell me to RTFM.
>>
>
> I'm not entirely sure I understand it. However having spent a couple of
> days reading and puzzling over the code, I'll try and offer some answers
> where I think I can, and add some questions of my own.
>
>  For comparison here is what "refcount" traditionally means ("tradition"
>> includes CORBA, boost::intrusive_ptr, std::shared_ptr etc.) I'm not
>> saying we have to make proton conform, but we should document how it
>> differs to save confusion.
>>
>> 1. A refcount is a count of the number of references (owning pointers)
>> to an object.
>>
>
> Yes, that broadly speaking is the purpose in proton. However this was a
> recent addition and it is not used uniformly inside the library itself. Not
> only that but the old api, where the application (generally) owned the
> pointers it created until it called pn_xxx_free, is still supported
> alongside the newer mode of use, as e.g. employed in the python binding,
> where the application uses only incref and decref.
>
>  2. Objects are created with refcount=1, the creator owns the first
>> reference.
>>
>
> This is not always the case and was one of the points I found surprising.
> Though a newly created 'object' does indeed have its ref count set to 1,
> for pn_session() and pn_link(), which are the functions actually used by
> the application to create new sessions or links, the implementation of
> those functions includes an explicit decref, meaning the objects are
> returned with a ref count of 0.
>

Are they really returned with a ref count of 0? I don't think proton
objects can actually exist with a refcount of 0 outside a finalizer. What
should actually happen is that the finalizer for the newly created session
will run and cause the parent of the session or link (the connection or
session) to inspect the child's state. Based on that state it may create a
new reference to the child, e.g. if there is outstanding work on the wire
to be done for that session or link. In the case of a new session or link
this is always the case, so it will always end up creating a new reference
to it, but this should never result in a child with a zero refcount
being returned or visible in any way to user code.

I suspect this was done to simplify things for the newer mode of use, where
> a wrapper object can always be instantiated and it simply increments the
> ref count. Would be good to have that confirmed though, as to me this is
> the source of confusion and complexity.
>

Yes, this is certainly true, but it is also to accommodate the memory model
the C interface exposes. In the C interface sessions and links are owned by
their parent objects, e.g. freeing the connection frees all the contained
sessions. The way this is accommodated is that the parent object is what
owns the reference to the child by default. What is returned from the
pn_session()/pn_link() calls is a borrowed reference. (This changes when
you do an incref, see below for more details.)


> However this means that in the old mode of use, e.g.
> pn_session()/pn_session_free(), the object may have a refcount of 0 even
> though it has not been freed by the application.
>

As mentioned above, this should never be able to happen outside of a
finalizer. Perhaps what is confusing here is that the finalizer can create
a new reference to the object that is being finalized thereby causing the
pn_decref() to *appear* to be a noop, however what is really happening is
an important state change, the last reference is being deleted, and the
finalizer has decided to create a new reference for other purposes (e.g.
because there is outstanding transport work to be done with that object or
because it is going into a pool) and the state of the object (or related
objects/state) is being changed in key ways to reflect this.

Note that this pattern is not at all unique to proton's refcount system.
All GC systems that have finalizers accommodate these sorts of semantics,
i.e. finalizers causing objects to become live again. This is fundamental
to finalizers since they are just user code and can do whatever they want,
including create new references.
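The "finalizers can make objects live again" point has a direct analogue in plain CPython, where `__del__` runs when the last reference is dropped and is free to create a new reference. This sketch is only an analogy for the pattern described above, not proton code (note that CPython's reference counting runs `__del__` immediately here, and since PEP 442 it is called at most once per object):

```python
# Finalizer "resurrection": the finalizer creates a new reference to the
# object being finalized, making it live again.
revived = []

class Node:
    def __del__(self):
        # The finalizer decides to keep the object alive, e.g. because
        # there is still outstanding work associated with it.
        revived.append(self)

n = Node()
del n  # drops the last reference; __del__ runs and revives the object
```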

What is confusing about it here is more to do with how this capability is
being used to interact with all the engine data structures that predate
both the refcount system and the collection objects that make use of the
refcount system. I believe ultimately the engine data structures should be
reworked to use the various collection objects now available, at which
point a lot of this will become much more centralized, self contained, and
understandable and likely won't require so much magic (or at least the
magic will be in one place rather than spread around like it is now).


>
>  3. If anot

Re: 0.10 alpha1

2015-07-17 Thread Rafael Schloming
FWIW, I'm currently swapping in enough context to confidently +1
PROTON-905, as well as hopefully answer a few related questions on the list.

--Rafael

On Fri, Jul 17, 2015 at 3:35 AM, Robbie Gemmell 
wrote:

> There are currently 11 unresolved JIRAs assigned a 0.10 fix-for:
> http://s.apache.org/ytK
>
> Of those, 4 are listed as blockers:
>
> https://issues.apache.org/jira/browse/PROTON-905
> Long-lived connections leak sessions and links
>
> https://issues.apache.org/jira/browse/PROTON-923
> [SASL] PN_TRANSPORT_ERROR event not raised
>
> https://issues.apache.org/jira/browse/PROTON-943
> Bump library (.so) major version for proton-c libraries
>
> https://issues.apache.org/jira/browse/PROTON-950
> SASL PLAIN over cleartext should be supported
>
> It would be good if any other JIRAs could be updated to indicate they
> are considered blockers so we know what is actually remaining, or to
> move them to either the 0.11 or no fix version if they won't be
> tackled in 0.10 (which should be most things now; please let's get it
> out, I'm starting to wish I did 0.9.2 so I can release the JMS client
> :P )
>
> Robbie
>
> On 7 July 2015 at 13:26, Ken Giusti  wrote:
> > Yay Rafi!  Thanks!
> >
> >
> > A simple query of currently outstanding blocker JIRAs affecting 0.9+
> shows only three:
> >
> > https://issues.apache.org/jira/browse/PROTON-826  (unassigned)
> > https://issues.apache.org/jira/browse/PROTON-923  (asticher)
> > https://issues.apache.org/jira/browse/PROTON-934  (rschloming)
> >
> >
> > The remaining open bugs affecting 0.9+ are:
> >
> >
> https://issues.apache.org/jira/browse/PROTON-826?jql=project%20%3D%20PROTON%20AND%20status%20in%20%28Open%2C%20%22In%20Progress%22%2C%20Reopened%29%20AND%20affectedVersion%20in%20%280.9%2C%200.9.1%2C%200.10%29%20ORDER%20BY%20priority%20DESC
> >
> >
> >
> >
> > - Original Message -
> >> From: "Rafael Schloming" 
> >> To: proton@qpid.apache.org
> >> Sent: Tuesday, July 7, 2015 1:28:17 AM
> >> Subject: 0.10 alpha1
> >>
> >> As promised, here is the first alpha for 0.10. It's posted in the usual
> >> places:
> >>
> >> Source code is here:
> >>
> >> http://people.apache.org/~rhs/qpid-proton-0.10-alpha1/
> >>
> >> Java binaries are here:
> >>
> >>
> https://repository.apache.org/content/repositories/orgapacheqpid-1036
> >>
> >> Please check it out and follow up with any issues.
> >>
> >> --Rafael
> >>
> >
> > --
> > -K
>


Re: Proton reactor documentation

2015-07-16 Thread Rafael Schloming
There is a sort of tutorial for the python version. It should provide you
with a pretty good start. You can find it at
examples/python/reactor/README.md

--Rafael

On Wed, Jul 15, 2015 at 3:24 PM, aconway  wrote:

> I'm documenting the C++ binding but the proton C reactor.h is very
> light on documentation. Is anyone working on this? I'm figuring it out
> from source but it would be good to have some docs there.
>
> Cheers,
> Alan.
>


Re: Proton Devs using Linux: please run the python-tox-test unit tests!!

2015-07-09 Thread Rafael Schloming
Is it worth putting some version of this in DEVELOPERS.md?

--Rafael

On Wed, Jul 8, 2015 at 7:48 AM, Ken Giusti  wrote:

>
> Devs,
>
> As you probably know, I've pushed changes to the proton python bindings
> that make proton compatible with python3.
>
> Since then, I've hit bugs in the python3 stuff that could've been caught
> by running the above unit test on a linux system that has python3 installed.
>
> This test currently only runs on linux, and requires both python3 and
> extra python tools be installed in order to run it.  I suspect most devs
> don't have these tools installed by default.   If the tools are not
> available - or are not current - ctest will skip running these tests.
>
> Most current linux distros - I'm running Fedora 21 btw - support
> installing both python2.x and python3.x in parallel.  Most default to just
> having python 2.x installed - you usually have to install python3 manually.
>
> Once you have python3 installed, you will also need to have an up-to-date
> version of the 'tox' and 'virtualenv' tools installed.
>
> For example, on my F21 box:  "sudo yum install python-tox
> python-virtualenv"  does the trick.
>
> Note: the unit tests require version 1.7+ of python-tox.  If that isn't
> available to you, you can use 'python-pip' to either overwrite the
> installed version of tox with a newer one, or install a local copy of tox
> in your home directory:
>
> $ sudo pip install -U tox   # this overwrites
> or
> $ pip install --user -U tox  # will put tox in ~/.local - you'll have to
> update your PATH/PYTHONPATH to look there
>
> Once all that is done, a simple 'make test' should run the python-tox-test.
>
>
> Doing all this is optional, and will increase the time it takes to run the
> unit tests, but it prevents inadvertent regressions to the python3 support.
> And it will be greatly appreciated by yours truly!
>
> thanks all,
>
> -K
>


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-08 Thread Rafael Schloming
On Wed, Jul 8, 2015 at 8:58 AM, Gordon Sim  wrote:

> On 07/08/2015 03:49 PM, Rafael Schloming wrote:
>
>> I think what is confusing about the spec language is that it is defining
>> the meaning of the value in terms of the sending endpoint's state and then
>> not saying anything about when the sending endpoint is obligated to
>> communicate the value to the receiver. This is because (at least as
>> originally conceived) it is never directly obligated to communicate the
>> value to the receiver, i.e. it could just choose to grow its array
>> indefinitely.
>>
>> By contrast the incoming-window is defined in terms of the receiver's
>> endpoint state and it is (relatively) clearly communicated that the
>> receiver is obligated to communicate a non-zero incoming-window if it
>> wishes to receive transfer frames.
>>
>> So while I agree that the language as defined requires the outgoing window
>> to be increased from zero, I don't think this implies (at least without a
>> more careful reading) that the sender needs to communicate this fact
>> before
>> sending transfers. It owns the value so it can effectively increase the
>> outgoing-window to 1 momentarily, send the transfer, and then decrease it
>> back to zero without needing to sandwich that transfer in flow frames in
>> order to signal the changing window values.
>>
>
> I think that is a bit of a stretch given the language the spec does use. I
> agree it doesn't explicitly state the rule in those terms, however it does
> say:
>
> The remote-outgoing-window reflects the maximum number of
> incoming transfers that MAY arrive without exceeding the
> remote endpoint’s outgoing-window. This value MUST be
> decremented after every incoming transfer frame is received,
> and recomputed when informed of the remote session endpoint
> state.
>
> I.e. the language suggests that a compliant receiver must do some work to
> keep track of the value and states that this value is tied to the maximum
> number of transfers that may arrive.
>
>  Put another way, the rule that says it is illegal for the sender to send
>> transfers when the receiver's incoming window is zero creates an
>> obligation
>> for the receiver to communicate its window, but there is no corresponding
>> rule for the receiver, e.g. nothing says it is illegal for the receiver to
>> send credit when the sender's outgoing-window is zero, nor that the
> >> receiver must compute its incoming-window based on the sender's
>> outgoing-window,
>>
>
> I certainly agree that nothing ties the receiver's incoming window to the
> sender's outgoing window in any fixed way.
>
>  so there is no obligation similarly implied for the sender.
>>
>
> I disagree here. I think the fact that begin and flow have the
> outgoing-window field defined as mandatory, and the fact that there is
> specific (though bewildering) language that appears to mandate the tracking
> of the sender's outgoing window by the receiver, *implies* (though does not
> explicitly spell out) that the sender is supposed to keep the receiver
> informed of its outgoing window.
>
>  Further I think the sender should not take the lack of credit as grounds
>>> to set a window of 0. The receiver knows it has not issued credit. (At
>>> the
>>> link level, the sender can also indicate that it has messages awaiting
>>> link
>>> credit).
>>>
>>>
>> I agree that we shouldn't do this in our implementation, however I think
>> it
>> is a valid interpretation,
>>
>
> The flow frame describes the outgoing window field as defining:
>
> the maximum number of outgoing transfer frames that the
> endpoint could potentially currently send, if it was not
> constrained by restrictions imposed by its peer’s
> incoming-window.
>
> To me that explicitly states that the value of the sender's outgoing window
> is not tied to the value of the receiver's incoming window, i.e. that a
> value of 0 for the sender's outgoing window cannot reasonably be interpreted
> as an expectation for the receiver to open its incoming window.
>
> I don't see a single sentence that would tie the session outgoing window
> to link level credit, which is an entirely distinct mechanism.
>
> The one thing the spec does say of the remote-outgoing-window is that
> "settling outstanding transfers can cause the window to grow". Settling
> deliveries frees the sender from needing to track them which clearly
> *might* enable a constrained sender to then send some more. Likewise a
> received outcome 

Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-08 Thread Rafael Schloming
On Wed, Jul 8, 2015 at 8:29 AM, Robbie Gemmell 
wrote:

> The wording of "This identifies a current maximum outgoing transfer-id
> that can be computed by subtracting one from the sum of
> outgoing-window and next-outgoing-id." when describing it, coupled
> with the requirement for the peer session to track it and decrement
> the remote-outgoing-window when receiving transfers, does suggest to
> me that things above the advertised point should not be sent (without
> first asynchronously communicating an increase at least) since it would
> define a 'maximum' below the next-outgoing-id.
>
> On the other hand, you are right that it doesnt explicitly define when
> we need to update the peer, and there is a specific error condition
> symbol for when the peer exceeds the incoming window but there is no
> equivalent condition for doing the same with the outgoing window.
>
> I think the wording is wooly enough we could be here for a while
> figuring it out and still end up being unsure what it actually says
> though :)
>

Fair enough, and I think the conservative thing to do is certainly to
asynchronously notify your peer when the window changes, however as in my
other reply I don't see how doing that actually eliminates the possibility
of deadlock with service-bus. Asynchronously notifying your peer when you
change the window is a very different requirement than being obliged to
notify your peer that the window is nonzero in order to receive credit.


> I think Gordon means in regard to the fact that we send a
> transfer when our outgoing window is [initially] 0 without first
> increasing it to e.g. 1, due to the use of the phrasing I mentioned
> earlier that defines a "maximum" id below our [initial] next outgoing
> id. If so, I tend to share his view on that.
>

Gotcha, I certainly agree that the robust thing to do would be to issue the
flow frame in such cases.


> Ultimately I think we just set a big window and essentially forgot it
> exists for now, we aren't using it at all within proton currently
> after all. We might as well stop calculating the remote value given we
> never actually read it.
>
> > I don't think we should do the max-frame-size thing
> > though as this encourages the notion that the incoming-window is somehow
> > dependent on the outgoing-window which as I said above is I think unsafe.
> >
> > I guess setting it to a large constant for now is fairly reasonable, but
> I
> > do think we should encourage service-bus to make its implementation more
> > robust in this regard.
> >
> > --Rafael
>
> Can I take that as a +1 on the approach I proposed changes for on
> PROTON-936 / https://github.com/apache/qpid-proton/pull/42 ?
>

Yes

--Rafael


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-08 Thread Rafael Schloming
On Wed, Jul 8, 2015 at 8:28 AM, iPC  wrote:

> // quotes from the spec ---
> 2.5.6 Session Flow Control
> The Session Endpoint assigns each outgoing transfer frame an implicit
> transfer-id from a session scoped sequence.
>
> outgoing-window: The outgoing-window defines the maximum number of outgoing
> transfer frames that the endpoint can currently send. This identifies a
> current maximum outgoing transfer-id that can be computed by subtracting
> one from the sum of outgoing-window and next-outgoing-id.
> // end quotes ---
>
> Outgoing-window is a mandatory field on flow. The sender's outgoing window
> is communicated to the peer. If the sender sends a 0 outgoing window but
> subsequently sends a transfer frame, the implicit outgoing transfer-id
> would exceed the limit implied by the flow frame.
>
> I agree that the sender can maintain a local outgoing window to satisfy any
> local implementation specific logic, but the value communicated to the peer
> must allow further transfer frames to be sent.
>
> I don't think it makes sense to use a 0 outgoing window to request
> credit (assuming it is the link credit mentioned in early mails). Using a 0
> link credit would make more sense.
>

That's certainly a fair interpretation, but even under that interpretation
a potential deadlock with service-bus may still exist. Consider an
implementation that sets its initial outgoing window to zero, waits for
credit, and then upon receiving credit expands its window, sends a flow
frame with the updated window, and then sends a transfer. I expect this
implementation would deadlock with service-bus, but I don't think it is
violating the spec under any of the interpretations proposed in this thread.

--Rafael


Re: Timeline to drop Java 6 support for Proton?

2015-07-08 Thread Rafael Schloming
I'm +1 on dropping Java 6.

--Rafael

On Wed, Jul 8, 2015 at 6:59 AM, Robbie Gemmell 
wrote:

> Epic bump.
>
> As per https://issues.apache.org/jira/browse/PROTON-935 the build is
> currently broken again on Java 6. We need to either update it to
> compile on Java 6, since that is still the builds compiler
> source/target, or alternatively drop support for Java 6 and require
> Java 7.
>
> I'd do the latter given that no one except the CI box seems to be
> testing it, Java 7 is already EOL itself, and most if not all of the
> dependent projects that I am aware of using proton-j already require
> Java 7 themselves now.
>
> Robbie
>
> On 24 September 2014 at 15:24, Robbie Gemmell 
> wrote:
> > The compilation issue I missed in the patch was test-only this time,
> > but it could have as easily been in non-test code. The other tests now
> > failing might actually point to some functionality under test not
> > working under Java 6 at runtime though, which is more of an issue. If
> > the tests showing it didn't exist, or the CI job had been using either
> > the current or previous major Java release, then that might not have
> > been noticed prior to release.
> >
> > Whether it compiles or not isn't the only reason to drop support.
> > Releasing new versions that people can continue deploying to EOL
> > platforms in years to come isn't necessarily helping anyone if we
> > aren't in fact properly ensuring it really works there. If we don't
> > truly support it, we should probably cut it.
> >
> > Whether we do it now, or later, I just think it would be a good idea to
> > actually decide on a timeline.
> >
> > Robbie
> >
> >
> > On 24 September 2014 14:11, Clebert Suconic  wrote:
> >>
> >> This is just testing... can't you have a java7 tests folder? you
> >> would be able to still have java7 specific tests.
> >>
> >>
> >>
> >> On Sep 24, 2014, at 7:13 AM, Robbie Gemmell 
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > With Qpid 0.30 we have made the move to requiring Java 7+.
> >> > Currently, proton still allows for use of Java 6, so I wonder what
> >> > people's thoughts are on the timing of a similar move for Proton?
> >> > I'd personally like to do it soon since Java 6 is EOL, but if not
> >> > then I think we should at least decide when we will.
> >> >
> >> > Robbie
> >> >
> >> > Background:
> >> > I committed a patch yesterday which contained some Java 7 API usage
> >> > in its tests, and subsequently broke the ASF Jenkins jobs that are
> >> > still using Java 6 (I'm using 8). Having now noticed this I updated
> >> > the test to make it compile and run on Java 6, unfortunately having
> >> > to disable use of some of the input aimed at testing the defect in
> >> > question. Everything now compiles and the test in question passes,
> >> > but the overall test run is still failing because it turns out some
> >> > other new changes in recent days mean there are now a couple of URL
> >> > tests which fail on Java 6 (but work on Java 8).
> >>
> >
>


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-08 Thread Rafael Schloming
On Wed, Jul 8, 2015 at 2:38 AM, Robbie Gemmell 
wrote:

> On 8 July 2015 at 10:03, Gordon Sim  wrote:
> > On 07/08/2015 02:22 AM, Rafael Schloming wrote:
> >>
> >> a value of zero is actually what
> >> signals that the receiver needs to take some action here, and arguably
> an
> >> initial value of zero is correct since it is signaling that the receiver
> >> needs to take action (in this case issue credit).
> >
> >
> > My interpretation is that if 0 is sent as the initial value, the sender
> > cannot legally send any transfers without first expanding the window by
> > sending a flow with a non-zero value.
> >
>
> Agreed. We don't currently do that, so we are typically violating the
> window (Messenger being an exception, on its initial send to a node at
> least).
>
> In the case the remote incoming window is also 0 (which it is in the
> case which prompted this discussion) we would also have to
> synchronously wait for a flow 'response' increasing it before we could
> send anything, since we also need to know the remote incoming window
> actually enabled us to send.
>

See my reply to Gordon for more details, but I think we need to be careful
in distinguishing between the rules the spec mandates for
updating/maintaining the value locally, and the rules the spec mandates for
when it is necessary to communicate the value to the peer. These definitely
should not be conflated because you don't want to notify the peer whenever
you update the value locally; this would be way too chatty.


> > Further I think the sender should not take the lack of credit as grounds
> to
> > set a window of 0. The receiver knows it has not issued credit. (At the
> link
> > level, the sender can also indicate that it has messages awaiting link
> > credit).
> >
>
> That was my take on reading things, that the two are essentially
> separate mechanisms, and on top the two windows are separate as well.
>
> If the initial outgoing window was set to 0 because the receiver hasn't
> issued, would there be a need for the field to exist on Begin? It
> would seem like much of the time it wouldn't be used and a Flow would
> have to be used instead.
>
> > In the case where the sender's implementation involves a fixed amount of
> > buffer space and requires messages to be settled before it can send more,
> > the receiver would not be able to know that without getting some signal.
> So
> > to my mind that is the only case for which it would make sense to send an
> > outgoing window of 0. (I'm not sure how useful this is in practice and I
> > don't believe it applies to proton at present anyway).
> >
> > I think as it stands proton is violating the spec, and should be changed
> to
> > send a non-zero outgoing window.
>
> That is what I did for the proposed changes on
> https://issues.apache.org/jira/browse/PROTON-936 /
> https://github.com/apache/qpid-proton/pull/42. Essentially setting it
> initially on begin to max int (unless otherwise configured) and
> leaving it there for any subsequent flows. We could later (when we
> arent right before a release) look to make it smarter if needed.
>

See my other email for more details, but I agree this seems like a
reasonable change for now.

--Rafael


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-08 Thread Rafael Schloming
On Wed, Jul 8, 2015 at 2:03 AM, Gordon Sim  wrote:

> On 07/08/2015 02:22 AM, Rafael Schloming wrote:
>
>> a value of zero is actually what
>> signals that the receiver needs to take some action here, and arguably an
>> initial value of zero is correct since it is signaling that the receiver
>> needs to take action (in this case issue credit).
>>
>
> My interpretation is that if 0 is sent as the initial value, the sender
> cannot legally send any transfers without first expanding the window by
> sending a flow with a non-zero value.
>

I think what is confusing about the spec language is that it is defining
the meaning of the value in terms of the sending endpoint's state and then
not saying anything about when the sending endpoint is obligated to
communicate the value to the receiver. This is because (at least as
originally conceived) it is never directly obligated to communicate the
value to the receiver, i.e. it could just choose to grow its array
indefinitely.

By contrast the incoming-window is defined in terms of the receiver's
endpoint state and it is (relatively) clearly communicated that the
receiver is obligated to communicate a non-zero incoming-window if it
wishes to receive transfer frames.

So while I agree that the language as defined requires the outgoing window
to be increased from zero, I don't think this implies (at least without a
more careful reading) that the sender needs to communicate this fact before
sending transfers. It owns the value so it can effectively increase the
outgoing-window to 1 momentarily, send the transfer, and then decrease it
back to zero without needing to sandwich that transfer in flow frames in
order to signal the changing window values.
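
Under that reading, a sender with an advertised window of zero could behave like this sketch (purely illustrative, not proton's implementation):

```python
# Illustrative sketch of the "momentary window" interpretation described
# above: the sender owns outgoing-window, so it may bump the value locally
# around each transfer without emitting flow frames to advertise the change.

class Sender:
    def __init__(self):
        self.next_outgoing_id = 0
        self.outgoing_window = 0  # value last advertised in begin/flow

    def send(self, payload):
        self.outgoing_window += 1            # momentarily grow the window
        frame = (self.next_outgoing_id, payload)
        self.next_outgoing_id += 1
        self.outgoing_window -= 1            # shrink it back; no flow sent
        return frame
```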

Put another way, the rule that says it is illegal for the sender to send
transfers when the receiver's incoming window is zero creates an obligation
for the receiver to communicate its window, but there is no corresponding
rule for the receiver, e.g. nothing says it is illegal for the receiver to
send credit when the sender's outgoing-window is zero, nor that the
receiver must compute its incoming-window based on the sender's
outgoing-window, so there is no obligation similarly implied for the sender.


>
> Further I think the sender should not take the lack of credit as grounds
> to set a window of 0. The receiver knows it has not issued credit. (At the
> link level, the sender can also indicate that it has messages awaiting link
> credit).
>

I agree that we shouldn't do this in our implementation, however I think it
is a valid interpretation, which implies that making the incoming window
dependent on your peers outgoing window (as service-bus is doing) is
probably not a safe thing to do.


>
> In the case where the sender's implementation involves a fixed amount of
> buffer space and requires messages to be settled before it can send more,
> the receiver would not be able to know that without getting some signal. So
> to my mind that is the only case for which it would make sense to send an
> outgoing window of 0. (I'm not sure how useful this is in practice and I
> don't believe it applies to proton at present anyway).
>
> I think as it stands proton is violating the spec, and should be changed
> to send a non-zero outgoing window.
>

Per above I don't believe it is violating the spec, but given that it has
been misinterpreted at least once, I certainly agree that some behavior
change is warranted. I don't think we should do the max-frame-size thing
though as this encourages the notion that the incoming-window is somehow
dependent on the outgoing-window, which, as I said above, I think is unsafe.

I guess setting it to a large constant for now is fairly reasonable, but I
do think we should encourage service-bus to make its implementation more
robust in this regard.

--Rafael


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-07 Thread Rafael Schloming
On Tue, Jul 7, 2015 at 4:29 AM, Gordon Sim  wrote:

> On 07/07/2015 07:22 AM, Rafael Schloming wrote:
>
>> IIRC, the definition of the outgoing-window was largely motivated by the
>> need to express to receivers certain conditions under which they may be
>> required to settle deliveries in order to receive more. For example if an
>> implementation uses a fixed sized array to store deliveries, and this
>> array is keyed by the offset of the delivery-id from the smallest
>> unsettled delivery, then although the sender may have sufficient
>> credit to send more transfers, it may not actually be capable of doing
>> this because the next delivery-id would land outside the range of
>> deliveries that are currently represented within its fixed size array.
>>
>
> The outgoing-window is measured in transfers, right? So in this case each
> slot in the array would be a *transfer* with a single delivery possibly
> spanning multiple slots.
>

Yes, good point.


>> This could happen for example if the receiver issues N credits (where
>> N is the size of the sender's fixed array) and settles deliveries 2
>> through N. The sender is then stuck with an unsettled delivery in the
>> first slot of its fixed sized array and cannot send another delivery
>> until that first delivery is settled.
>>
>> Given this, it's certainly true an outgoing-window of 0 is kind of strange
>> and useless.
>>
>
> Isn't that exactly the mechanism by which a sender, such as the one in
> your description above, would indicate its inability to send further
> transfers?


Gah, sorry, jet lag... you are right. I was thinking of the window measured
from the oldest unsettled transfers, however it is actually measured from
the next-outgoing-id, and so as you say a value of zero is actually what
signals that the receiver needs to take some action here, and arguably an
initial value of zero is correct since it is signaling that the receiver
needs to take action (in this case issue credit).

--Rafael


Re: AMQP 1.0 session outgoing-window usage / meaning

2015-07-06 Thread Rafael Schloming
IIRC, the definition of the outgoing-window was largely motivated by the
need to express to receivers certain conditions under which they may be
required to settle deliveries in order to receive more. For example if an
implementation uses a fixed sized array to store deliveries, and this array
is keyed by the offset of the delivery-id from the smallest unsettled
delivery, then although the sender may have sufficient credit to send more
transfers, it may not actually be capable of doing this because the next
delivery-id would land outside the range of deliveries that are currently
represented within its fixed size array. This could happen for example if
the receiver issues N credits (where N is the size of the sender's fixed
array) and settles deliveries 2 through N. The sender is then stuck with an
unsettled delivery in the first slot of its fixed sized array and cannot
send another delivery until that first delivery is settled.
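
The stuck-sender scenario can be made concrete with a small model (illustrative only; the class name, array size, and settlement API here are invented for the example):

```python
# Model of a sender whose deliveries live in a fixed-size array keyed by
# the offset of the transfer-id from the oldest unsettled transfer. One
# old unsettled delivery can block sending even when credit remains.

class FixedArraySender:
    def __init__(self, size):
        self.size = size
        self.base = 0       # transfer-id of the oldest unsettled transfer
        self.next_id = 0
        self.unsettled = set()

    def try_send(self):
        if self.next_id - self.base >= self.size:
            return False    # next id falls outside the array: stuck
        self.unsettled.add(self.next_id)
        self.next_id += 1
        return True

    def settle(self, transfer_id):
        self.unsettled.discard(transfer_id)
        # the array base only advances past contiguously settled slots
        while self.base < self.next_id and self.base not in self.unsettled:
            self.base += 1
```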

Given this, it's certainly true an outgoing-window of 0 is kind of strange
and useless. It's probably also true that it is never super useful for the
incoming window of the receiver to be larger than the outgoing window of
the sender (or vice versa) since one can't ever exceed the other, so I'd
say your largest-possible-int default and max-frame-like treatment are
fairly appropriate.

--Rafael


On Fri, Jul 3, 2015 at 8:57 AM, Robbie Gemmell 
wrote:

> Rob, Rafi, as authors of the spec and the related code in proton, do
> you have any thoughts to add here?
>
> Barring any discussion otherwise I will be looking to change proton to
> at least optionally allow controlling the outgoing window along the
> lines I mentioned near the end of my original mail.
>
> Robbie
>
> On 2 July 2015 at 00:15, Robbie Gemmell  wrote:
> > Thanks James. Some expansion which may be useful to add.
> >
> > When comparing the older JMS client, proton-c via the Messenger API,
> > and the new JMS client using proton-j, it's important to note that they
> > aren't all doing the same thing even where their underlying
> > implementations do seem to share the same behaviour in the cases of
> > proton-c and proton-j.
> >
> > The older JMS client initializes its outgoing window to a fixed number
> > in the session Begin frame and then doesnt seem to ever change it for
> > subsequent Flow frames, and simply manages whether its session can
> > later send transfer frames based on the current value of the remote
> > incoming window. Proton-J and Proton-C similarly only base their
> > session level decision to send transfers on the remote incoming window
> > and not their own outgoing window (which as noted below means they
> > violate their advertised outgoing window, which is often going to be
> > 0).
> >
> > Proton-C and Proton-J both currently look to set the outgoing window
> > at any given time to a calculated value based on either the number of
> > buffered messages or the buffered bytes divided by frame size. If
> > there are no buffered messages at the point the Begin and Flow frames
> > are generated, then the outgoing-window will be set to 0. This appears
> > to function the same for both proton-c and proton-j. A key point
> > though is that I think much of the historic usage of proton-c against
> > Service Bus has been via the Messenger API, which works somewhat
> > differently than many others in that it looks to create a session and
> > a sender and sends the messages in one pipelined sequence of transport
> > output, which means that by the point the Begin frame actually gets
> > generated there are indeed buffered messages to send which means the
> > outgoing-window is initialised to a value greater than zero. Other
> > APIs which create the session as a distinct step thus wont ever have
> > buffered messages when the Begin frame gets created and so the
> > outgoing-window is initialised to 0, which is the behaviour observed
> > with the new JMS client using proton-j and also what I saw when trying
> > proton-c via the Qpid Messaging C++ client (against qpidd).
> >
> > Robbie
> >
> >
> > On 1 July 2015 at 20:54, James Birdsall  wrote:
> >> FYI, I have forwarded this and important bits of the preceding
> >> discussion to our AMQP stack dev within the ServiceBus team.
> >>
> >> Both the Qpid JMS AMQP 1.0 "legacy" client and Proton-C have been
> >> working fine with Azure SB for years now. Proton-J, however, is not
> >> something we have explored previously, and obviously there is
> >> something different about its behavior compared to the other clients.
> >>
> >> The Qpid JMS client is our recommended JMS client for interop with
> >> ServiceBus, and we would like to keep up with the times and not have
> >> to direct customers to the legacy client, so we are very interested
> >> in figuring out the correct resolution to this issue.
> >>
> >> -Original Message-
> >> From: Robbie Gemmell [mailto:robbie.gemm...@gmail.com]
> >> Sent: Wednesday, July 1, 2015 7:48 AM
> >> To: us...@qpid.apache.org; proton@qpid.apache.org
> >> S

0.10 alpha1

2015-07-06 Thread Rafael Schloming
As promised, here is the first alpha for 0.10. It's posted in the usual
places:

Source code is here:

http://people.apache.org/~rhs/qpid-proton-0.10-alpha1/

Java binaries are here:

https://repository.apache.org/content/repositories/orgapacheqpid-1036

Please check it out and follow up with any issues.

--Rafael


Re: Schedule for the 0.10 release?

2015-07-06 Thread Rafael Schloming
+1

The only issue that has me worried here is the sasl interop story between
proton-c and proton-j. I can cut a release later today just to give us
something to poke at, but there may still be sasl work needed.

--Rafael


On Mon, Jul 6, 2015 at 9:57 AM, Flavio Percoco  wrote:

> On 06/07/15 12:35 -0400, Ken Giusti wrote:
>
>> Hi all,
>>
>> I remember recently that there was a rumor that we may be able to start
>> the release of 0.10 at the end of june.
>>
>> Now that my july 4th long weekend has worn off - what's left to be done
>> before we can cut an alpha?
>>
>
> Yup, I shot a request and there seemed to be consensus. I'd really
> love to see the 0.10 release out this week.
>
> Thanks,
> Flavio
>
> --
> @flaper87
> Flavio Percoco
>


Re: [38/38] qpid-proton git commit: implemented sasl sniffing for proton-j; this allows the reactor interop tests to pass

2015-07-06 Thread Rafael Schloming
FWIW, my changes in this area really represent the minimum diff necessary
to get the reactor branch to land. None of this is related to the reactor
changes per se, it just so happens the reactor tests include several tests
that check interop between proton-c and proton-j and these tests keep
stumbling over incompatibilities that are currently quite easy to arise
given the current state of the sasl implementations.

While I agree 100% that the APIs should converge, at the moment I'm
actually slightly more worried about the on-the-wire interop issues. More
specifically, while it's bad for proton-c and proton-j APIs to look
different, it's *really* bad if the default settings for one result in a
configuration that won't interop with the default settings for the other.

--Rafael

On Mon, Jul 6, 2015 at 10:28 AM, Andrew Stitcher 
wrote:

> On Mon, 2015-07-06 at 13:14 -0400, Andrew Stitcher wrote:
> > On Mon, 2015-07-06 at 17:48 +0100, Robbie Gemmell wrote:
> > > ...
> > > The old toggle only used to define whether sasl was required or not
> > > (which it historically was once you enabled the sasl layer, and the
> > > toggle was never implemented in proton-j), whereas IIRC the new
> > > 'requireAuth' governs that but also whether ANONYMOUS is allowed or
> > > not when a SASL layer is used, is that correct?
> >
> > That is true, but I think it actually more useful to be able to select
> > authenticated or not compared to using SASL or not (because ANONYMOUS is
> > unauthenticated but uses SASL).
> >
> > The C implementation does the actual enforcement when it reads the AMQP
> > header, which would obviously be a significant change to the Java
> > implementation, but I really do think gives a more satisfactory user
> > result.
>
> The reason for the complexity and the checking at AMQP header time is to
> allow SSL certificates as a valid form of authentication (not
> necessarily only used with SASL EXTERNAL). If you don't need to support
> that (or at least not yet) then "require authentication" can simply mean
> require the SASL layer but don't offer the ANONYMOUS mechanism. That is
> what earlier versions of the C code did*, and I think that would be
> relatively simple to implement in Java too.
>
> * The C code will still not offer ANONYMOUS as a possible mechanism if
> authentication is required. But the overall meaning of the flag is more
> complex than this as explained.
>
> Andrew
>
>
>


Re: [38/38] qpid-proton git commit: implemented sasl sniffing for proton-j; this allows the reactor interop tests to pass

2015-07-06 Thread Rafael Schloming
I wired in allowSkip in a very minimal way just to restore the ability to
force the old behaviour. It would be fairly trivial to change the name of
course, however it appears there are a bunch of other related changes that
go along with it, e.g. adding a bunch of accessors and fixing the error
behavior. Currently if you put in require authentication the java sasl
layer will simply die with a TransportException if it sees a regular AMQP
header, and the tests appear to expect something more graceful.

I stopped there because I noticed a bunch of other unimplemented stuff and
I wasn't sure how deep the bottom of the rabbit hole was.

--Rafael

On Mon, Jul 6, 2015 at 11:11 AM, Andrew Stitcher 
wrote:

> On Mon, 2015-07-06 at 10:56 -0400, Rafael Schloming wrote:
> > On Mon, Jul 6, 2015 at 9:52 AM, Robbie Gemmell  >
> > wrote:
> >
> > > Is this change allowing clients to skip the SASL layer when connecting
> > > to servers that have enabled the SASL layer? If so, how is the new
> > > default behaviour disabled?
> > >
> >
> > Yes, it was necessary to allow the tests to pass.
> >
> >
> > > The existing but unimplemented 'allowSkip' method previously intended
> > > to enable such behaviour still doesn't do anything, so is there a way
> > > to require clients use a SASL layer as would have been previously
> > > after enabling SASL for a proton-j (and in the past a proton-c)
> > > server?
> > >
> >
> > Ah, I didn't notice that. Thanks for pointing it out. I'll wire it up and
> > cross my fingers that the tests still pass.
>
> Allow_skip is no longer present in the C API it is replaced with the
> require_auth (now on the transport object) API.
>
> So it would make more sense to implement that and remove allow_skip.
>
> >
> > --Rafael
>
>
>


Re: ProtonJ compilation and test failures

2015-07-06 Thread Rafael Schloming
Any sort of missing class really should be a compile time exception, which
I think means you must have stale class files *somewhere*. You could try
doing a find checkout -name "*.class" just as a sanity check. Also, it's
possible something in your local maven repo is somehow coming into play,
maybe blow that away and rebuild it and/or do an mvn install to be sure
that remote dependencies aren't out of sync with local code?

--Rafael

On Mon, Jul 6, 2015 at 11:02 AM, Gordon Sim  wrote:

> On 07/06/2015 02:23 PM, Robbie Gemmell wrote:
>
>> On 6 July 2015 at 14:17, Gordon Sim  wrote:
>>
>>> On 07/06/2015 01:24 PM, Rafael Schloming wrote:
>>>
>>>>
>>>> Can you try doing an mvn clean and seeing if it is still an issue?
>>>>
>>>
>>>
>>> I see the same thing after mvn clean
>>>
>>>
>> Does cleaning the checkout as a whole make any difference?
>>
>
> Doesn't seem to, no.
>
>
>> To preview what would be deleted:
>> git clean -n -d -x -e "*.classpath" -e "*.project" -e "*.settings" .
>>
>> To actually delete things:
>> git clean -f -d -x -e "*.classpath" -e "*.project" -e "*.settings" .
>>
>> The -e flags are protecting project files generated by Eclipse...if
>> you dont use it, no need for them.
>>
>>
>


Re: [38/38] qpid-proton git commit: implemented sasl sniffing for proton-j; this allows the reactor interop tests to pass

2015-07-06 Thread Rafael Schloming
On Mon, Jul 6, 2015 at 9:52 AM, Robbie Gemmell 
wrote:

> Is this change allowing clients to skip the SASL layer when connecting
> to servers that have enabled the SASL layer? If so, how is the new
> default behaviour disabled?
>

Yes, it was necessary to allow the tests to pass.


> The existing but unimplemented 'allowSkip' method previously intended
> to enable such behaviour still doesn't do anything, so is there a way
> to require clients use a SASL layer as would have been previously
> after enabling SASL for a proton-j (and in the past a proton-c)
> server?
>

Ah, I didn't notice that. Thanks for pointing it out. I'll wire it up and
cross my fingers that the tests still pass.

--Rafael


Re: ProtonJ compilation and test failures

2015-07-06 Thread Rafael Schloming
Can you do a git pull and give it another shot?

I believe what is happening is that when maven launches the jython tests,
it doesn't seem to include the jython shim in the class path. For some
reason, this isn't an issue if the .class files that jython generates are
hanging around in the source tree from a previous build, or if the
JYTHONPATH environment variable is set to include the appropriate path.

I'm not sure how long this has been the case, but I suspect it may actually
predate the reactor changes. I think they just triggered the latent issue
for you by adding another file that did not already have a generated .class
file sitting around in the tree.

I've modified JythonTest to add the shim to the classpath and it now works
for me without needing any help from a prior cmake build.

--Rafael



On Mon, Jul 6, 2015 at 8:25 AM, Robbie Gemmell 
wrote:

> Were you running it after having previously used the cmake build in
> the same terminal?
>
> I do indeed have the definition in ctypes, with the cproton file
> importing everything from ctypes. The maven build failed when I ran it
> directly in my git-clean'ed checkout. It then passed when run
> indirectly via cmake and make test. It then passed if run directly
> again. However as it turns out, it also passed in the same terminal
> after I git-clean'ed the checkout again. It failed if I run it again
> directly in the same checkout but using a new terminal where I hadn't
> used the cmake build.
>
> Robbie
>
> On 6 July 2015 at 11:57, Rafael Schloming  wrote:
> > I just ran a maven-only clean build locally with no problems.
> >
> > You should have PN_MILLIS_MAX defined in
> > proton-j/src/main/resources/ctypes.py, and this should be imported from
> > proton-j/src/main/resources/cproton.py. Can you verify that this is as
> > expected?
> >
> > --Rafael
> >
> > On Mon, Jul 6, 2015 at 5:50 AM, Robbie Gemmell  >
> > wrote:
> >
> >> The recent changes on Proton-J seemed to have created some issues:
> >>
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-proton-j/1032/console
> >>
> >> The module currently requires Java 7 to compile, which is slightly
> >> out of sync with the compiler source+target still being set to Java 6
> >> (which the above job is using).
> >>
> >> Once using Java 8 to do the maven build locally, the python tests then
> >> failed with:
> >>
> >> proton_tests.utils.SyncRequestResponseTest.test_request_response ... fail
> >> Error during test:  Traceback (most recent call last):
> >> File "/home/gemmellr/workspace/proton/tests/python/proton-test",
> >> line 360, in run
> >>   phase()
> >> File
> >> "/home/gemmellr/workspace/proton/tests/python/proton_tests/utils.py",
> >> line 89, in test_request_response
> >>   connection = BlockingConnection(server.url, timeout=self.timeout)
> >> File
> >>
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/utils.py",
> >> line 195, in __init__
> >>   self.wait(lambda: not (self.conn.state & Endpoint.REMOTE_UNINIT),
> >> File
> >>
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/utils.py",
> >> line 229, in wait
> >>   container_timeout = self.container.timeout
> >> File
> >>
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/reactor.py",
> >> line 104, in _get_timeout
> >>   return millis2timeout(pn_reactor_get_timeout(self._impl))
> >> File
> >>
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/__init__.py",
> >> line 2337, in millis2timeout
> >>   if millis == PN_MILLIS_MAX: return None
> >>   NameError: global name 'PN_MILLIS_MAX' is not defined
> >>
> >> I notice that the TravisCI job did pass:
> >> https://travis-ci.org/apache/qpid-proton/builds/69665060
> >>
> >> I guess the main difference is it ran via cmake so the proton-c build
> >> was performed before the proton-j tests were run.
> >>
> >> Robbie
> >>
>


Re: ProtonJ compilation and test failures

2015-07-06 Thread Rafael Schloming
Can you try doing an mvn clean and seeing if it is still an issue?

A class entirely missing like that is usually due to mvn not recompiling
everything that is impacted by a given change.

--Rafael

On Mon, Jul 6, 2015 at 8:11 AM, Gordon Sim  wrote:

> All the ProtonJInterop tests fail for me, and the python-test then hangs.
> The error for each is something like:
>
>> 2: proton_tests.reactor_interop.ReactorInteropTest. \
>> 2: Error: Could not find or load main class
>> org.apache.qpid.proton.ProtonJInterop
>> 2: test_protonc_to_protonj_1
>> ... fail
>> 2: Error during test:  Traceback (most recent call last):
>> 2: File "/home/gordon/projects/proton-git/tests/python/proton-test",
>> line 360, in run
>> 2:   phase()
>> 2: File
>> "/home/gordon/projects/proton-git/tests/python/proton_tests/reactor_interop.py",
>> line 141, in test_protonc_to_protonj_1
>> 2:   self.protonc_to_protonj(1)
>> 2: File
>> "/home/gordon/projects/proton-git/tests/python/proton_tests/reactor_interop.py",
>> line 124, in protonc_to_protonj
>> 2:   assert(java_thread.result == 0)
>> 2:   AssertionError
>> 2: proton_tests.reactor_interop.ReactorInteropTest. \
>> 2: Error: Could not find or load main class
>> org.apache.qpid.proton.ProtonJInterop
>>
>
>


Re: ProtonJ compilation and test failures

2015-07-06 Thread Rafael Schloming
I just ran a maven-only clean build locally with no problems.

You should have PN_MILLIS_MAX defined in
proton-j/src/main/resources/ctypes.py, and this should be imported from
proton-j/src/main/resources/cproton.py. Can you verify that this is as
expected?

--Rafael

On Mon, Jul 6, 2015 at 5:50 AM, Robbie Gemmell 
wrote:

> The recent changes on Proton-J seemed to have created some issues:
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-proton-j/1032/console
>
> The module currently requires Java 7 to compile, which is slightly
> out of sync with the compiler source+target still being set to Java 6
> (which the above job is using).
>
> Once using Java 8 to do the maven build locally, the python tests then
> failed with:
>
> proton_tests.utils.SyncRequestResponseTest.test_request_response ... fail
> Error during test:  Traceback (most recent call last):
> File "/home/gemmellr/workspace/proton/tests/python/proton-test",
> line 360, in run
>   phase()
> File
> "/home/gemmellr/workspace/proton/tests/python/proton_tests/utils.py",
> line 89, in test_request_response
>   connection = BlockingConnection(server.url, timeout=self.timeout)
> File
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/utils.py",
> line 195, in __init__
>   self.wait(lambda: not (self.conn.state & Endpoint.REMOTE_UNINIT),
> File
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/utils.py",
> line 229, in wait
>   container_timeout = self.container.timeout
> File
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/reactor.py",
> line 104, in _get_timeout
>   return millis2timeout(pn_reactor_get_timeout(self._impl))
> File
> "/home/gemmellr/workspace/proton/tests/../proton-c/bindings/python/proton/__init__.py",
> line 2337, in millis2timeout
>   if millis == PN_MILLIS_MAX: return None
>   NameError: global name 'PN_MILLIS_MAX' is not defined
>
> I notice that the TravisCI job did pass:
> https://travis-ci.org/apache/qpid-proton/builds/69665060
>
> I guess the main difference is it ran via cmake so the proton-c build
> was performed before the proton-j tests were run.
>
> Robbie
>


Re: ruby test failures on master?

2015-06-24 Thread Rafael Schloming
FWIW, I'm on fedora 20 also.

--Rafael

On Wed, Jun 24, 2015 at 8:33 AM, Ken Giusti  wrote:

> I'm seeing exactly the same failures on latest proton master on my fedora
> 20 box.
>
> - Original Message -----
> > From: "Rafael Schloming" 
> > To: proton@qpid.apache.org
> > Sent: Wednesday, June 24, 2015 8:10:03 AM
> > Subject: ruby test failures on master?
> >
> > Is anyone else seeing the ruby tests fail on master?
> >
> > I've attached the test output.
> >
> > --Rafael
> >
>
> --
> -K
>


ruby test failures on master?

2015-06-24 Thread Rafael Schloming
Is anyone else seeing the ruby tests fail on master?

I've attached the test output.

--Rafael
Start testing: Jun 24 06:24 EDT
--
3/11 Testing: ruby-unit-test
3/11 Test: ruby-unit-test
Command: "/usr/bin/python" "/home/rhs/proton/proton-c/env.py" 
"PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/home/rhs/.local/bin:/home/rhs/bin:/home/rhs/proton/build/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c"
 
"RUBYLIB=/home/rhs/proton/tests/ruby:/home/rhs/proton/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c:/home/rhs/proton/proton-c/bindings/ruby/lib"
 "/home/rhs/proton/tests/ruby/proton-test"
Directory: /home/rhs/proton/build/proton-c
"ruby-unit-test" start time: Jun 24 06:24 EDT
Output:
--
/home/rhs/proton/tests/ruby/proton_tests/interop.rb:16:in 
`': uninitialized constant Qpid::Proton::Data (NameError)
from /home/rhs/proton/tests/ruby/proton_tests/interop.rb:15:in `'
from /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in 
`require'
from /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in 
`require'
from /home/rhs/proton/tests/ruby/proton-test:9:in `'

Test time =   0.13 sec
--
Test Failed.
"ruby-unit-test" end time: Jun 24 06:24 EDT
"ruby-unit-test" time elapsed: 00:00:00
--

4/11 Testing: ruby-spec-test
4/11 Test: ruby-spec-test
Command: "/usr/bin/python" "/home/rhs/proton/proton-c/env.py" 
"PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/home/rhs/.local/bin:/home/rhs/bin:/home/rhs/proton/build/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c"
 
"RUBYLIB=/home/rhs/proton/tests/ruby:/home/rhs/proton/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c/bindings/ruby:/home/rhs/proton/build/proton-c:/home/rhs/proton/proton-c/bindings/ruby/lib"
 "/usr/bin/rspec"
Directory: /home/rhs/proton/proton-c/bindings/ruby
"ruby-spec-test" start time: Jun 24 06:24 EDT
Output:
--
/home/rhs/proton/proton-c/bindings/ruby/spec/qpid/proton/exception_handling_spec.rb:25:in
 `': uninitialized constant 
Qpid::Proton::ExceptionHandling (NameError)
from 
/home/rhs/proton/proton-c/bindings/ruby/spec/qpid/proton/exception_handling_spec.rb:24:in
 `'
from 
/home/rhs/proton/proton-c/bindings/ruby/spec/qpid/proton/exception_handling_spec.rb:22:in
 `'
from 
/home/rhs/proton/proton-c/bindings/ruby/spec/qpid/proton/exception_handling_spec.rb:20:in
 `'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/configuration.rb:896:in 
`load'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/configuration.rb:896:in 
`block in load_spec_files'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/configuration.rb:896:in 
`each'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/configuration.rb:896:in 
`load_spec_files'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/command_line.rb:22:in 
`run'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:80:in `run'
from 
/usr/share/gems/gems/rspec-core-2.14.7/lib/rspec/core/runner.rb:17:in `block in 
autorun'
simplecov available
Coverage report generated for RSpec to 
/home/rhs/proton/proton-c/bindings/ruby/coverage. 1299 / 2928 LOC (44.36%) 
covered.

Test time =   0.53 sec
--
Test Failed.
"ruby-spec-test" end time: Jun 24 06:24 EDT
"ruby-spec-test" time elapsed: 00:00:00
--

End testing: Jun 24 06:24 EDT


Re: Can we release proton 0.10? Can we add Py3K to that release?

2015-06-16 Thread Rafael Schloming
I'd like to get the proton-j-reactor branch into 0.10 also. It should be
ready soon, so if py3k can be sorted and merged in a similar timeframe we
could target a release for the end of the month.

--Rafael

On Tue, Jun 16, 2015 at 3:32 PM, Flavio Percoco  wrote:

> Greetings,
>
> I've been looking with great pleasure all the progress happening in
> proton lately and I was wondering whether it'd be possible to have an
> 0.10 release cut soon.
>
> There are some bugfixes I'm personally interested in but also some
> important changes (specifically in the python bindings) that will make
> consuming proton easier for users (OpenStack among those).
>
> Is there a chance for the above to happen any time soon?
>
> Can I push my request a bit further and ask for the py3k code to be
> merged as well?
>
> All the above are key pieces to make proton more consumable and allow
> for services like OpenStack to fully adopt it.
>
> Thanks,
> Flavio
>
> --
> @flaper87
> Flavio Percoco
>


Re: proton-j reactor branch

2015-06-15 Thread Rafael Schloming
On Mon, Jun 15, 2015 at 3:07 PM, Adrian Preston  wrote:

>   3) Write Jython shims so the existing Python tests can run against the
>  Proton-J reactor.
>

I have much of this done in my local checkout, I just need to patch it up a
bit and push it.

--Rafael


Re: [Proton-J] Improving the engine impl.

2015-05-13 Thread Rafael Schloming
On Wed, May 13, 2015 at 12:58 PM, Rajith Muditha Attapattu <
rajit...@gmail.com> wrote:

> If you look at the engine impl, you would see TransportLink and a LinkImpl
> (ditto for Session, Delivery etc..)
>
> 1. Is this separation necessary? Could we not collapse the two into one?
>
> 2. The extra copying of data btw the two layers could possibly be a
> performance issue.
>

Logically it is good to keep them distinct since the TransportLink state is
only valid when the Link is bound to a connection and needs to be
cleared/reinitialized on unbind/rebind. I doubt there is much overhead to
it so I'd leave it be for now.
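The distinction can be illustrated with a toy model (names and fields here are illustrative, not proton-j code): the Link-side state survives rebinding, while the TransportLink-side state is only meaningful per binding and is reset on unbind.

```python
class Link:
    """Toy model of the LinkImpl/TransportLink split described above."""
    def __init__(self, name):
        self.name = name              # LinkImpl state: survives rebinding
        self.transport_state = None   # TransportLink state: per-binding only
    def bind(self):
        self.transport_state = {"credit": 0, "unsettled": {}}
    def unbind(self):
        self.transport_state = None

link = Link("sender")
link.bind()
link.transport_state["credit"] = 10
link.unbind()                                # transport-side state discarded
link.bind()
assert link.transport_state["credit"] == 0   # reinitialized fresh on rebind
assert link.name == "sender"                 # link-side state persists
```

Collapsing the two would mean manually clearing every transport-only field on unbind, which is exactly the bookkeeping the separation avoids.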

--Rafael


Re: [Proton-J] Removing unused methods in the Message interface

2015-05-13 Thread Rafael Schloming
On Wed, May 13, 2015 at 12:44 PM, Rajith Muditha Attapattu <
rajit...@gmail.com> wrote:

> This is a courtesy notice that I will be removing the following methods
> from the Proton Message interface in 48 hrs unless I hear any objections.
>
> I believe Robbie has already mentioned this in the past.
>
> Rajith
>
> Object save();
>
> String toAMQPFormat(Object value);
>
> Object parseAMQPFormat(String value);
>
> void setMessageFormat(MessageFormat format);
>
> MessageFormat getMessageFormat();
>
> void clear();
>
> MessageError getError();
>

I'd suggest leaving clear(), but I'm +1 on removing the others.

It looks to me like clear() is a reasonable method to have since it would
be cumbersome to set the body back to null and then individually reset all
the headers, however what it does now looks broken to me. It currently
*just* sets the body to null when it should really clear all the headers
and properties as well, i.e. do the same thing that is done at the
beginning of decode(). In fact the beginning of decode() should probably
just call clear() (once it's fixed of course).

--Rafael
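A sketch of the suggested semantics (a Python model, not the proton-j implementation): clear() resets the body and all headers/properties, and decode() starts by calling it.

```python
class Message:
    """Model of the suggested clear()/decode() relationship."""
    def __init__(self):
        self.clear()
    def clear(self):
        # reset the body *and* all headers/properties, not just the body
        self.body = None
        self.properties = {}
        self.durable = False
    def decode(self, encoded):
        self.clear()             # decode() starts from a clean slate
        self.body = encoded      # stand-in for real AMQP decoding

m = Message()
m.durable = True
m.properties["k"] = "v"
m.decode(b"payload")
assert m.body == b"payload"
assert m.durable is False and m.properties == {}
```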


Re: codec changes

2015-05-12 Thread Rafael Schloming
On Mon, May 11, 2015 at 3:49 PM, Alan Conway  wrote:

> On Thu, 2015-05-07 at 15:53 -0400, Rafael Schloming wrote:
> > I believe where we ended up was standardizing on a single snprintf-style
> > signature for all encode functions, i.e.:
> >
> > // always returns encoded size, never overwrites limit, pass in
> NULL, 0
> > to compute size in advance
> > // returns < 0 if there was an actual error of some sort, e.g. xxx_t
> > cannot be validly encoded for some reason
> > ssize_t pn_xxx_encode(xxx_t, char *buf, size_t limit);
> >
> > And transitioning to this with a feature macro.
>
> I'm good with sprintf-style.
>
> I'm not sure what you mean by a feature macro. Does that addresses
> binary compatibility or is just a source-level switch? If we break
> binary compatibility we need to make sure we bump the library version
> appropriately to avoid breaking existing apps. If we set the switch to
> default to the new behavior we should make sure the compiler barfs on
> old code and it's fairly obvious what to do about it.
>

I've never actually defined feature test macros before, just used them, so
I'm just making this up as we go along. What I imagine could work though is
to define pn_xxx_encode2 in the .c file to provide the new behavior. By
default it would not be visible from the header files unless the
appropriate feature macro was defined upon inclusion, in which case we
would alias pn_xxx_encode to pn_xxx_encode2, e.g.:

--
#include 
#include 
...
int err = pn_message_encode(msg, buf, &in_out_size); // we could make this
spew a deprecated warning by default
--

--
#define PN_STANDARD_ENCODE_SIGNATURE
#include 
#include 
...
ssize_t encoded_size = pn_message_encode(msg, buf, bufsize);
--

This would allow people to incrementally upgrade and would not break binary
compatibility. We could of course at some later point break binary
compatibility if we want to and clean up the encode2 symbols, especially if
we have other reasons to change ABI.

One thing I'm not sure of is the granularity of the feature test, e.g.
should the macro be specific to the change (PN_STANDARD_ENCODE_SIGNATURE)
or should it be related to the version somehow, e.g. (PN_0_10_SOURCE) or
something like that. I guess both could be an option too, e.g.
PN_0_10_SOURCE could be an alias for all the new features in 0.10,
presumably just this and possibly the sasl stuff if there is a way to make
that work.

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-12 Thread Rafael Schloming
On Tue, May 12, 2015 at 10:34 AM, Darryl L. Pierce 
wrote:

> On Tue, May 12, 2015 at 09:44:41AM -0400, Rafael Schloming wrote:
> > On Tue, May 12, 2015 at 8:34 AM, Darryl L. Pierce 
> > wrote:
> >
> > > On Tue, May 12, 2015 at 05:45:20AM -0400, Rafael Schloming wrote:
> > > > Can you post an isolated reproducer with just your definition of
> > > pn_rbkey_t
> > > > and a code version of the 5 steps that lead to the seg fault?
> > >
> > > On my PROTON-781-reactive-ruby-apis branch is an example named
> > > "$REPO/examples/ruby/registry_test.rb" which does it. It's very pared
> > > down, only
> > > creating 3 Transport instances and consistently produces the segfault.
> > >
> >
> > That branch has about a 15 thousand line delta from master. That's a lot
> of
> > lines of code to hide a subtle memory bug. The pn_rbkey_t definition and
> > enough ruby code to use it in a proof of concept should only require a
> few
> > hundred line delta from master. I suggest producing just this delta for
> two
> > reasons. 1) Just the act of producing it will help narrow down where the
> > bug is, and 2) it gives me a much smaller delta to look at so I can be
> more
> > useful to you.
> >
> > (I did run the registry_test.rb that you pointed to through a debugger,
> > however its not obvious what the issue is upon inspection of the trace,
> and
> > while the ruby.i delta is pretty self contained, the details of how you
> are
> > using it from ruby are somewhere buried in the 15K diff.)
>
> I've pulled the pertinent pieces into a new branch:
>
> http://github.com/mcpierce/Proton/tree/rbkey-isolation
>
> The changes are all here:
>
>
> http://github.com/mcpierce/Proton/commit/2dc770992a7c02fe2054a4a77325a199e39b9c94
>
> and it blows up exactly as I've been experiencing.
>

I made a bunch of line comments.

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-12 Thread Rafael Schloming
On Tue, May 12, 2015 at 8:34 AM, Darryl L. Pierce 
wrote:

> On Tue, May 12, 2015 at 05:45:20AM -0400, Rafael Schloming wrote:
> > Can you post an isolated reproducer with just your definition of
> pn_rbkey_t
> > and a code version of the 5 steps that lead to the seg fault?
>
> On my PROTON-781-reactive-ruby-apis branch is an example named
> "$REPO/examples/ruby/registry_test.rb" which does it. It's very pared
> down, only
> creating 3 Transport instances and consistently produces the segfault.
>

That branch has about a 15 thousand line delta from master. That's a lot of
lines of code to hide a subtle memory bug. The pn_rbkey_t definition and
enough ruby code to use it in a proof of concept should only require a few
hundred line delta from master. I suggest producing just this delta for two
reasons. 1) Just the act of producing it will help narrow down where the
bug is, and 2) it gives me a much smaller delta to look at so I can be more
useful to you.

(I did run the registry_test.rb that you pointed to through a debugger,
however its not obvious what the issue is upon inspection of the trace, and
while the ruby.i delta is pretty self contained, the details of how you are
using it from ruby are somewhere buried in the 15K diff.)

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-12 Thread Rafael Schloming
Can you post an isolated reproducer with just your definition of pn_rbkey_t
and a code version of the 5 steps that lead to the seg fault?

--Rafael


On Mon, May 11, 2015 at 5:21 PM, Darryl L. Pierce 
wrote:

> On Thu, May 07, 2015 at 01:02:13PM -0400, Rafael Schloming wrote:
> > The way the python code does this is by checking whenever a C object is
> > returned to python code. If the record contains an attachment indicating
> > that the C object has previously been wrapped, it uses this to
> > construct/retrieve an appropriate wrapper object. If it doesn't have the
> > appropriate attachment then it uses the record API to define/set the
> > attachment to the appropriate value. I presume you could do something
> > similar with ruby.
>
> After we chatted the other day, I've tried the following approach, using
> the pn_transport_t type as my test bed since it has relatively fewer
> dependencies. However, the plumbing in Proton for objects isn't quite
> clear to me and my code's not quite working the way we had discussed and
> I'm not sure why.
>
> The goal is to have a type that will live for as long as one of the
> impls in Proton lives; i.e., when we create something like
> pn_transport_t, the attachment created for this would hold some type
> that will get finalized when the pn_transport_t type is finalized. And
> that type would be the hook to clean up the single instance of a Ruby
> class that wraps the underlying C type.
>
> I've created a new type in the ruby.i descriptor for Swig and named it
> pn_rbkey_t, with three fields: void* registry (a pointer to an object
> held in Ruby), char* method (the name of a method to invoke on that
> object), and char* key_value (the argument to be passed to that method).
>
> The code defines pn_rbkey_initialize and pn_rbkey_finalize methods, as
> well as getter and setter methods for the three fields. But I've put
> debugging into the code and never see the pn_rbkey_finalize method being
> invoked.
>
> My registry_test app does the following:
>
> 1. create an instance of pn_transport_t: impl = Cproton.pn_transport
> 2. create a Ruby Transport object: transport =  Transport.wrap(impl)
>a. puts a weak reference to the Transport into the hashtable
>b. creates a pn_rbkey_t object and sets it as the sole record for the
>   pn_transport_t object
>c. calls Cproton.pn_incref on the pn_rbkey_t instance
> 3. remove the reference: transport = nil
> 4. call garbage collection: ObjectSpace.garbage_collect
> 5. get the object back: transport = Transport.wrap(impl)
>a. calls pn_transport_attachment and retrieves the record created in 2
>b. should then be able to get the key_value from the pn_rbkey_t type
>c. should then get the object out of the hashtable to return
>
> It's at step 5 that the example app segfaults. The segfault happens
> when, from Ruby, there's a call to print the attachment retrieve in 5a.
> Swig isn't failing since it's returning a value that seems to have been
> cached. But when Swig tries to retrieve data from the pn_rbkey_t struct
> underneath of it, *THAT* seems to have been reaped by Proton and Swig
> then segfaults, thinking there was an object still under the covers.
>
> Any ideas or suggestions of where to look for what's going on?
>
> --
> Darryl L. Pierce, Sr. Software Engineer @ Red Hat, Inc.
> Delivering value year after year.
> Red Hat ranks #1 in value among software vendors.
> http://www.redhat.com/promo/vendor/
>
>


Re: Python 3 port is 'done'

2015-05-08 Thread Rafael Schloming
On Thu, Apr 30, 2015 at 11:18 AM, Robbie Gemmell 
wrote:

> On 30 April 2015 at 15:56, Ken Giusti  wrote:
> >
> >
> > - Original Message -
> >> From: "Robbie Gemmell" 
> >> To: proton@qpid.apache.org
> >> Cc: us...@qpid.apache.org
> >> Sent: Thursday, April 30, 2015 10:20:07 AM
> >> Subject: Re: Python 3 port is 'done'
> >>
> >> On 29 April 2015 at 21:05, Ken Giusti  wrote:
> >> >
> >> > Well, done enough to consider merging to master.
> >> >
> >> > While the patch is quite large, most of the changes are simple syntax
> >> > changes to avoid non-python3 compliant syntax.
> >> >
> >> > The code is available on the kgiusti-python3 branch at the Apache
> repo.
> >> >
> >> >
> https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;a=shortlog;h=refs/heads/kgiusti-python3
> >> >
> >> > I've also made a patch that can be viewed up on reviewboard:
> >> >
> >> > https://reviews.apache.org/r/33691/
> >> >
> >> > I've verified that the unit tests and python examples run under
> python2.6,
> >> > 2.7, and python3.3.   I'd appreciate if folks would take this patch
> for a
> >> > spin and report back their experience.
> >> >
> >> > Known Issues:
> >> >
> >> > These changes will be incompatible with earlier versions of the
> python 2.x
> >> > series.  I know for a fact that python versions <= 2.4 won't even
> parse
> >> > this patch, and I suspect getting such older versions of python to
> work
> >> > would require lots of effort.   I'm a little unsure of how well
> python 2.5
> >> > will be supported - I have yet to test that far back.  I also didn't
> test
> >> > anything earlier than 3.3 in the python3.x stream.
> >> >
> >> > --
> >> > -K
> >>
> >> I gave things a kick with Python 2.7, and Jython 2.5.3 without issue.
> >>
> >> I also tried the maven build with Jython 2.7 RC3 (there was a new one)
> >> and things exploded similarly to the way they did before.
> >>
> >
> > Thanks Robbie.
> >
> > What kind of issues does Jython 2.7 complain about?  I'll have to
> install that RC at some point... :(
> >
> > -K
>
> Lots of the tests fail due to error, most if not all of which seem to
> be "TypeError: Type not compatible with array". As I say though, this
> isn't to do with your changes since it did that last time I tried too
> :)
>

I happened to run into a similar error recently and tracked the issue down.
I believe this error is due to an API incompatibility between jython 2.5
and jython 2.7. In jython 2.5 you need to convert python strings into java
byte arrays manually and you use the 'array' constructor from the jarray
module to do this, e.g. something like:
array("string-that-has-bytes-in-it", 'b') will construct a java byte array
from a python string. In jython 2.7 this conversion is done automatically
(which is nice). Sadly though, the array constructor that worked in jython
2.5 now barfs with exactly the TypeError message you are showing above. I
was able to get a simple smoke test working with proton-j and jython 2.7 by
removing the array constructor and just passing the string directly,
however this will not work on 2.5, so we probably either need to write our
own conversion function that behaves differently for different jython
versions or do a full switch over to 2.7.

--Rafael
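A version-aware conversion helper along these lines could bridge the two behaviors (the helper name is an assumption; the `jarray` import only exists under jython, so everywhere else it degrades to a pass-through):

```python
import sys
try:
    from jarray import array as _jarray   # only available under jython
except ImportError:
    _jarray = None

def to_java_bytes(s):
    # jython 2.5 needs an explicit java byte[]; jython 2.7 converts python
    # strings automatically (and its array() rejects strings with the
    # "Type not compatible with array" TypeError seen above)
    if _jarray is not None and sys.version_info[:2] < (2, 7):
        return _jarray(s, 'b')
    return s

assert to_java_bytes("abc") == "abc"   # pass-through outside jython 2.5
```

Routing every string-to-bytes crossing through one such function would let the tests run on both jython versions until a full switch to 2.7 is made.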


Re: codec changes

2015-05-07 Thread Rafael Schloming
I believe where we ended up was standardizing on a single snprintf-style
signature for all encode functions, i.e.:

// always returns encoded size, never overwrites limit, pass in NULL, 0
to compute size in advance
// returns < 0 if there was an actual error of some sort, e.g. xxx_t
cannot be validly encoded for some reason
ssize_t pn_xxx_encode(xxx_t, char *buf, size_t limit);

And transitioning to this with a feature macro.

--Rafael
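The proposed calling convention can be modeled in a few lines of Python (an illustration of the contract only; the real function is C and the "encoding" here is a stand-in):

```python
def pn_xxx_encode_model(value, buf):
    """Model of the proposed convention: always return the full encoded
    size, never write past len(buf)."""
    encoded = bytes(value)            # stand-in for real AMQP encoding
    n = min(len(buf), len(encoded))
    buf[:n] = encoded[:n]             # truncate at the limit, never overflow
    return len(encoded)

# two-pass usage: size with an empty buffer, then encode for real
needed = pn_xxx_encode_model(b"payload", bytearray(0))
buf = bytearray(needed)
assert pn_xxx_encode_model(b"payload", buf) == needed
assert bytes(buf) == b"payload"
```

This is exactly the snprintf idiom: a short first call computes the size, and any return value larger than the limit signals truncation rather than an error.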

On Thu, May 7, 2015 at 3:28 PM, Andrew Stitcher 
wrote:

> On Wed, 2015-05-06 at 14:03 -0400, Rafael Schloming wrote:
> > We seem to have reached consensus here, but I haven't seen any commits on
> > this. We should probably fix this before 0.10 so we don't end up putting
> > out a new API and then deprecating it in the next release. Is anyone
> > actually working on this?
>
> Could you articulate the consensus then.
>
> Asserting "we have reached consensus" without explicitly stating what
> you think the consensus to be is remarkably likely to be proven wrong by
> subsequent events!
>
> Andrew
>
>
>


Re: Development workflow and release process [WAS: Concurrent Go API for proton is, erm, GO!]

2015-05-07 Thread Rafael Schloming
On Thu, May 7, 2015 at 2:52 PM, Alan Conway  wrote:

>
> - Original Message -
> > The recent landing of the Go changes make me think that we should be more
> > explicit about our development process with respect to new language
> > bindings (or possibly in general). There are two problems I would like to
> > address.
> >
> > First, a bunch of code just landed on trunk without prior
> > communication/peer review right as we are trying to stabilize for 0.10.
> > With the go binding work having started/proceeded directly on trunk, I
> > can't tell if this is a rush commit to get stuff into 0.10, or if it's
> just
> > more ongoing development that was assumed to not impact the stated 0.10
> > goals.
> >
> > Secondly, from a release management perspective it is in general awkward
> to
> > have early stage development mixed in with changes to a stable codebase.
> > The git history between 0.9, 0.9.1, and master is currently a mix of high
> > fidelity changes, e.g. discrete bug fixes/feature enhancements all
> > cluttered up with a bunch of more noisy checkpoint/work-in-progress style
> > commits for the go binding that are a normal part of early stage
> > development work. This makes things hard when it comes to release
> > management as there is a lot of noise to sort through when running git
> > cherry and the like.
> >
> > I'd like to propose getting a bit more formal about the following policy,
> > especially now that we are fully using git and branches are cheap. I
> think
> > a number of people already follow this implicitly, but as a whole we are
> > somewhat inconsistent about it (myself included at times):
> >
> > 1. For developing new language bindings (and really for any development
> > work that will involve enough new stuff to have a noisy commit history)
> we
> > use branches. This is already the case with the Ruby/C++/Python3
> bindings,
> > as well as the SASL work.
> >
> > 2. We should discuss on the mailing list before we land major features.
> We
> > were trying to stabilize trunk for a 0.10 release, and this hasn't been
> in
> > the discussion, and a number of things have been broken in the recent
> > commits.
>
> :))
>
> I didn't follow the process 'cause there wasn't one :) This process makes
> perfect sense, I will move the go work to a "go" branch to comply.
>
> In fairness to me I did ask on the list whether I should do this on a
> branch or whether it was OK on trunk blah, blah, whine, whine, poor me etc.
> In fairness to the new process I did break the build with a go commit. Not
> because of the go binding but because of an emacs keyboard twitch leaving
> random characters in a python file I was "viewing". Being on a branch would
> have saved me that embarrassment.
>

I think I pretty much said go ahead on master, so apologies for singling
you out. It's awesome that lots of new binding work is happening and I
think it's just a natural part of any project's growing pains to introduce
a bit more process when dealing with a code base that has both
stable/mature parts and newly expanding parts.

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-07 Thread Rafael Schloming
On Thu, May 7, 2015 at 11:49 AM, Darryl L. Pierce 
wrote:

> On Thu, May 07, 2015 at 11:32:33AM -0400, Rafael Schloming wrote:
> > On Thu, May 7, 2015 at 10:40 AM, Darryl L. Pierce 
> > wrote:
> >
> > > On Thu, May 07, 2015 at 09:57:49AM -0400, Rafael Schloming wrote:
> > > > On Thu, May 7, 2015 at 9:41 AM, Darryl L. Pierce  >
> > > wrote:
> > > 
> > > > > To help with this, two additional callback APIs were added to the
> > > Proton
> > > > > libraries: pn_record_set_callback and pn_record_has_callback.
> These two
> > > > > functions work to help register a method to be called whenever a
> record
> > > > > is deleted to enable memory management. This way the
> above-mentioned
> > > key
> > > > > can be properly deleted, and the value stored in the hash table
> > > > > discarded.
> > > >
> > > > I would need to see the code in detail, but I suspect you don't need
> to
> > > add
> > > > a pn_record_set_callback/get_callback to achieve roughly the
> > > functionality.
> > > > I *think* you could simply define a pn_class_t that is a reference
> > > counted
> > > > holder of your key. You could then put your callback logic in the
> > > finalizer
> > > > for that class, and when proton's reference counting triggers the
> > > > finalizer, it will run the callback logic at the appropriate time.
> > >
> > > (edit)
> > >
> > > As I was writing up a description of the code I realized I have already
> > > done what you suggest above WRT the pni_rbhandler_t type. I could use
> > > the same logic to create a pni_rbrecord_t type and manage its lifecycle
> > > the same way the handler's lifecycles are managed, yeah?
> >
> > Yes, I believe so.
>
> Since records are created when a struct is initially created, I'm not
> sure how to go about attaching the key to its lifecycle since the
> dynamic language isn't explicitly creating the record.
>

The way the python code does this is by checking whenever a C object is
returned to python code. If the record contains an attachment indicating
that the C object has previously been wrapped, it uses this to
construct/retrieve an appropriate wrapper object. If it doesn't have the
appropriate attachment then it uses the record API to define/set the
attachment to the appropriate value. I presume you could do something
similar with ruby.
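In outline (all names hypothetical), the pattern is a wrapper cache keyed by the underlying C object, with the record attachment playing the role of the dictionary here:

```python
_wrappers = {}   # stands in for pn_record attachments on the C objects

class Wrapper:
    def __init__(self, impl):
        self._impl = impl

def wrap(impl):
    # on every C-to-python crossing: reuse the existing wrapper if the
    # "record" already carries one, otherwise create and attach it
    w = _wrappers.get(impl)
    if w is None:
        w = Wrapper(impl)
        _wrappers[impl] = w   # in real code: stored via the record API
    return w

t1 = wrap("pn_transport@0x1")
t2 = wrap("pn_transport@0x1")
assert t1 is t2    # the same C object always maps to the same wrapper
```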

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-07 Thread Rafael Schloming
On Thu, May 7, 2015 at 10:40 AM, Darryl L. Pierce 
wrote:

> On Thu, May 07, 2015 at 09:57:49AM -0400, Rafael Schloming wrote:
> > On Thu, May 7, 2015 at 9:41 AM, Darryl L. Pierce 
> wrote:
> 
> > > To help with this, two additional callback APIs were added to the
> Proton
> > > libraries: pn_record_set_callback and pn_record_has_callback. These two
> > > functions work to help register a method to be called whenever a record
> > > is deleted to enable memory management. This way the above-mentioned
> key
> > > can be properly deleted, and the value stored in the hash table
> > > discarded.
> >
> > I would need to see the code in detail, but I suspect you don't need to
> add
> > a pn_record_set_callback/get_callback to achieve roughly the
> functionality.
> > I *think* you could simply define a pn_class_t that is a reference
> counted
> > holder of your key. You could then put your callback logic in the
> finalizer
> > for that class, and when proton's reference counting triggers the
> > finalizer, it will run the callback logic at the appropriate time.
>
> (edit)
>
> As I was writing up a description of the code I realized I have already
> done what you suggest above WRT the pni_rbhandler_t type. I could use
> the same logic to create a pni_rbrecord_t type and manage its lifecycle
> the same way the handler's lifecycles are managed, yeah?
>

Yes, I believe so.

--Rafael


Re: Introducing the Ruby Reactive APIs

2015-05-07 Thread Rafael Schloming
On Thu, May 7, 2015 at 9:41 AM, Darryl L. Pierce  wrote:

> I've been working on this codebase since the beginning of the year. The
> two branches [1, 2] in my git repo represent the low-level engine APIs
> and the higher-level reactive APIs, respectively.
>
> I'm still working through the set of example apps for the reactive APIs,
> but at this point I feel this is close enough that I want to start
> getting feedback from people.
>
> == Memory Concerns
>
> Of particular important is memory management: the Proton libraries use
> reference counting to manage object lifespans, while Ruby uses mark and
> sweep operations for garbage collection. So ensuring that pure Ruby
> objects aren't reaped when they've only known to the Proton libraries,
> in the case of event handlers specifically, has been a challenge and one
> that's sure to have some cases that need fixing.
>
> The first model explored was to attach the Ruby wrapper objects to
> the Swig-generated wrappers for the underlying C structs in Proton.
> Which worked at first, but turned out to be not useful. The reason being
> that the Swig bindings were themselves being reaped when they went out
> of scope; i.e., Swig doesn't maintain them by providing a mark operation
> until disposal of the underlying C structs. So this path, while
> initially promising, was discarded.
>
> The current model uses a hash table that is attached to the Qpid::Proton
> module. When objects are stored for use by the C libraries, they are
> tucked away in this hash table with a unique key generated based on
> memory addresses. A copy of that key, as a char*, is given to Proton to
> use later when the object is being retrieved.
>
> To help with this, two additional callback APIs were added to the Proton
> libraries: pn_record_set_callback and pn_record_has_callback. These two
> functions work to help register a method to be called whenever a record
> is deleted to enable memory management. This way the above-mentioned key
> can be properly deleted, and the value stored in the hash table
> discarded.
>

I would need to see the code in detail, but I suspect you don't need to add
a pn_record_set_callback/get_callback to achieve roughly the functionality.
I *think* you could simply define a pn_class_t that is a reference counted
holder of your key. You could then put your callback logic in the finalizer
for that class, and when proton's reference counting triggers the
finalizer, it will run the callback logic at the appropriate time.
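The suggestion can be modeled like this (a Python sketch; the class and method names stand in for a pn_class_t with a finalizer and are not proton API): a reference-counted holder whose finalizer runs the cleanup callback when the last reference is dropped.

```python
class RbKeyHolder:
    """Model of a refcounted key holder with a finalizer callback."""
    def __init__(self, key, on_final):
        self._key = key
        self._on_final = on_final
        self._refs = 1
    def incref(self):
        self._refs += 1
    def decref(self):
        self._refs -= 1
        if self._refs == 0:
            self._on_final(self._key)   # finalizer: delete the hash entry

registry = {"transport-0x1234": object()}
holder = RbKeyHolder("transport-0x1234", lambda k: registry.pop(k))
holder.incref()
holder.decref()
holder.decref()                          # last ref gone -> finalizer fires
assert "transport-0x1234" not in registry
```

With this shape, proton's normal reference counting drives the cleanup, so no extra per-record callback hooks are needed in the C library.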


> The reference counting aspect of the Proton libraries is a concern as
> well. The code currently increments and decrements references in the
> same places as the Python code, but there are likely more places where
> such reference accounting need to be added.
>
> [1] http://github.com/mcpierce/Proton/tree/PROTON-799-Ruby-engine-apis
> [2] http://github.com/mcpierce/Proton/tree/PROTON-781-reactive-ruby-apis
> --
> Darryl L. Pierce, Sr. Software Engineer @ Red Hat, Inc.
> Delivering value year after year.
> Red Hat ranks #1 in value among software vendors.
> http://www.redhat.com/promo/vendor/
>
>


Re: [GitHub] qpid-proton pull request: Reactor

2015-05-07 Thread Rafael Schloming
On Thu, May 7, 2015 at 7:29 AM, gemmellr  wrote:

> Github user gemmellr commented on a diff in the pull request:
>
> https://github.com/apache/qpid-proton/pull/29#discussion_r29843571
>
> --- Diff:
> proton-j/src/main/java/org/apache/qpid/proton/engine/impl/EventImpl.java ---
> @@ -157,12 +165,51 @@ public void dispatch(Handler handler)
>  case TRANSPORT_CLOSED:
>  handler.onTransportClosed(this);
>  break;
> +case REACTOR_FINAL:
> +handler.onReactorFinal(this);
> +break;
> +case REACTOR_QUIESCED:
> +handler.onReactorQuiesced(this);
> +break;
> +case REACTOR_INIT:
> +handler.onReactorInit(this);
> +break;
> +case SELECTABLE_ERROR:
> +handler.onSelectableError(this);
> +break;
> +case SELECTABLE_EXPIRED:
> +handler.onSelectableExpired(this);
> +break;
> +case SELECTABLE_FINAL:
> +handler.onSelectableFinal(this);
> +break;
> +case SELECTABLE_INIT:
> +handler.onSelectableInit(this);
> +break;
> +case SELECTABLE_READABLE:
> +handler.onSelectableReadable(this);
> +break;
> +case SELECTABLE_UPDATED:
> +handler.onSelectableWritable(this);
> +break;
> +case SELECTABLE_WRITABLE:
> +handler.onSelectableWritable(this);
> +break;
> +case TIMER_TASK:
> +handler.onTimerTask(this);
> +break;
>  default:
>  handler.onUnhandled(this);
>  break;
>  }
> +
> +Iterator children = handler.children();
> --- End diff --
>
> Commenting here to avoid spamming the semi-unrelated JIRA linked with
> the other PR that got merged.
>
> I have only skimmed this PR, since I haven't got much experience of
> the reactor code in any of the other languages, so I'm not sure what many
> things are actually meant to do.
>
> It felt a little strange at times that some of the Connection etc
> objects now have some very reactor-specific methods even though you might
> not be using the reactor with them (i.e. all existing use), but I can
> certainly live with that if that's how it works elsewhere, and they
> presumably don't hurt anything if you don't use them.
>

There may be a way to reduce the coupling a little bit there (I'm currently
looking into that), but in the end it simply becomes awkward to write event
handlers without being able to navigate from all the various engine objects
back to the reactor that holds them.


> I did spot this one specific bit though, where I think it would be nice
> to optimise for what anyone using this currently will be doing. Existing
> handlers won't have any children, and that would currently mean that every
> time you dispatch an event you will create a pointless iterator, so it
> would be nice to avoid that in the case of no children.
>

Agreed. I need to tweak event dispatch to support some other scenarios, so
I'll probably be messing with that code anyways. I'll make sure it doesn't
create extraneous objects when I do.

--Rafael


Re: Question about pn_reactor and threads.

2015-05-06 Thread Rafael Schloming
On Wed, May 6, 2015 at 10:59 AM, Alan Conway  wrote:

> On Thu, 2015-04-30 at 21:51 -0400, Rafael Schloming wrote:
> > What sort of work/connections are you talking about here? Are you talking
> > about processing AMQP messages in a thread pool or are you talking about
> > writing some sort of multithreaded I/O handler?
>
> I'm talking about running multiple proton instances in a thread pool,
> like the Qpid C++ broker does. It uses a thread safe poller (based on
> epoll or somesuch) to dispatch data from many connections to a small
> thread pool. The poller serializes work per connection. There is a
> proton engine per connection so work to each engine is serialized but
> different engines run concurrently. The worker threads feed the proton
> transport, process events, and move messages to and from thread-safe
> queues (which are potentially shared among all connections)
>
> I'm trying to understand if the C reactor can help with that. I have the
> impression that the reactor requires all of the engines and their
> handlers to run in a single thread. From there I could put messages into
> a thread pool to process them concurrently. However the engines share no
> data so there is no reason not to run them in parallel as well.
> Serializing everything thru the reactor doesn't seem to buy me anything,
> but it reduces concurrency and makes the reactor thread a bottleneck.
>

Your assumption that the engines share no state is not valid in general.
Given that the application can attach arbitrary state to connections,
sessions, links, deliveries, etc, it is entirely possible (and quite
common) that a whole group of connections might actually be fairly
intimately intertwined via shared application state. The reactor permits
you to code such an application without having to worry about concurrency.

On the other hand, using a reactor per connection doesn't seem to give
> me anything over just using an engine per connection.
>

In this case it still provides a common event dispatch framework. If you
write your code against the engine directly, then your code is basically
going to be one big event handler for every possible event that might occur
on a connection. This is fine until you start needing different behavior on
different links, say you want to apply one flow control policy on one link
and a different one on another link. You now either need to turn your one
big event handler into spaghetti, or you need a way to associate different
handlers with different links.

You can of course do all this in an ad-hoc way rather than using the
reactor, but the advantage of using the reactor is that by being consistent
about how we do this, we can build up a library of useful handlers that can
work well together.
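
The kind of dispatch being described — a handler attached to the specific object an event refers to, falling back to one catch-all handler — can be sketched roughly as follows. This is a simplified model, not the actual proton API; the `Link` class, `dispatch` function, and handler names are hypothetical:

```python
class Link:
    """A link that may carry its own handler, e.g. a per-link
    flow control policy."""
    def __init__(self, name, handler=None):
        self.name = name
        self.handler = handler

def dispatch(event_name, link, default_handler):
    # Prefer the handler attached to the object the event refers to;
    # fall back to the one big catch-all handler otherwise.
    handler = link.handler or default_handler
    return handler(event_name, link)

# Two different flow-control policies as trivial handlers.
strict = lambda ev, link: ("strict", link.name)
lenient = lambda ev, link: ("lenient", link.name)

a = Link("a", handler=strict)   # gets its own policy
b = Link("b")                   # uses the default

assert dispatch("flow", a, lenient) == ("strict", "a")
assert dispatch("flow", b, lenient) == ("lenient", "b")
```

With this shape, adding a new per-link policy means attaching a handler to that link rather than growing the catch-all handler into spaghetti.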

--Rafael


Re: codec changes

2015-05-06 Thread Rafael Schloming
We seem to have reached consensus here, but I haven't seen any commits on
this. We should probably fix this before 0.10 so we don't end up putting
out a new API and then deprecating it in the next release. Is anyone
actually working on this?

--Rafael

On Wed, Apr 15, 2015 at 2:29 PM, Andrew Stitcher 
wrote:

> On Wed, 2015-04-15 at 07:13 -0400, Rafael Schloming wrote:
> > On Tue, Apr 14, 2015 at 1:27 PM, Alan Conway  wrote:
> >
> > > That works for me, now how do we manage the transition? I don't think
> we
> > > can afford to inflict "yum update proton; all proton apps crash" on our
> > > users. That means we cannot change the behavior of existing function
> > > names. I don't much like adding encode_foo2 for every encode_foo but I
> > > don't see what else to do.
> > >
> >
> > We could mark the old ones as deprecated and add the new ones as
> > pn_xxx_encode2 and provide a feature macro that #defines pn_xxx_encode to
> > pn_xxx_encode2. Then after a sufficient transition period we can get rid
> of
> > the old versions.
>
> Along these lines it is also possible to keep ABI (binary library
> compatibility) by using versioned library symbols as well. There is a
> level of linker magic needed, but at least on Unix it's well understood,
> if a little arcane.
>
> Another perfectly reasonable approach would be to create a new name for
> the new API and deprecate the old name.
>
> So for example deprecate pn_data_t in favour of pn_value_t (or whatever
> better but new name). Where pn_value_t has all the old functionality of
> pn_data_t and indeed may be the same internal structure initially, but
> with a different interface.
>
> Incidentally C++ does make this easier because it allows function
> overloading.
>
> Andrew
>
>
>


Development workflow and release process [WAS: Concurrent Go API for proton is, erm, GO!]

2015-05-06 Thread Rafael Schloming
The recent landing of the Go changes make me think that we should be more
explicit about our development process with respect to new language
bindings (or possibly in general). There are two problems I would like to
address.

First, a bunch of code just landed on trunk without prior
communication/peer review right as we are trying to stabilize for 0.10.
With the go binding work having started/proceeded directly on trunk, I
can't tell if this is a rush commit to get stuff into 0.10, or if it's just
more ongoing development that was assumed to not impact the stated 0.10
goals.

Secondly, from a release management perspective it is in general awkward to
have early stage development mixed in with changes to a stable codebase.
The git history between 0.9, 0.9.1, and master is currently a mix of
high-fidelity changes, e.g. discrete bug fixes/feature enhancements, all
cluttered up with a bunch of noisier checkpoint/work-in-progress style
commits for the go binding that are a normal part of early stage
development work. This makes things hard when it comes to release
management as there is a lot of noise to sort through when running git
cherry and the like.

I'd like to propose getting a bit more formal about the following policy,
especially now that we are fully using git and branches are cheap. I think
a number of people already follow this implicitly, but as a whole we are
somewhat inconsistent about it (myself included at times):

1. For developing new language bindings (and really for any development
work that will involve enough new stuff to have a noisy commit history) we
use branches. This is already the case with the Ruby/C++/Python3 bindings,
as well as the SASL work.

2. We should discuss on the mailing list before we land major features. We
were trying to stabilize trunk for a 0.10 release, and this hasn't been in
the discussion, and a number of things have been broken in the recent
commits.

--Rafael

On Tue, May 5, 2015 at 7:48 PM, Alan Conway  wrote:

> First serious stab at a concurrent Go API for proton with working examples
> (send.go, receive.go) that inter-operate with the python examples :)
>
> Read all about it:
> https://github.com/apache/qpid-proton/tree/master/proton-c/bindings/go
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> For additional commands, e-mail: dev-h...@qpid.apache.org
>
>


[RESULT] [VOTE]: Release Proton 0.9.1-rc1 as 0.9.1

2015-05-02 Thread Rafael Schloming
The vote passes with 7 +1's and no other votes.

--Rafael

On Wed, Apr 29, 2015 at 3:34 PM, Rafael Schloming  wrote:

> Hi Everyone,
>
> I've put out an RC for 0.9.1 in the usual places.
>
> Source artifacts are here:
> https://people.apache.org/~rhs/qpid-proton-0.9.1-rc1/
>
> Java binaries are here:
> https://repository.apache.org/content/repositories/orgapacheqpid-1033
>
> Please check them out and register your vote:
>
> [  ]: Yes, release Proton 0.9.1-rc1 as 0.9.1
> [  ]: No, ...
>
> --Rafael
>
>


Re: [VOTE]: Release Proton 0.9.1-rc1 as 0.9.1

2015-05-02 Thread Rafael Schloming
On Wed, Apr 29, 2015 at 3:34 PM, Rafael Schloming  wrote:

> Hi Everyone,
>
> I've put out an RC for 0.9.1 in the usual places.
>
> Source artifacts are here:
> https://people.apache.org/~rhs/qpid-proton-0.9.1-rc1/
>
> Java binaries are here:
> https://repository.apache.org/content/repositories/orgapacheqpid-1033
>
> Please check them out and register your vote:
>
>
[ x ]: Yes, release Proton 0.9.1-rc1 as 0.9.1

--Rafael


Re: Question about pn_reactor and threads.

2015-04-30 Thread Rafael Schloming
What sort of work/connections are you talking about here? Are you talking
about processing AMQP messages in a thread pool or are you talking about
writing some sort of multithreaded I/O handler?

--Rafael

On Thu, Apr 30, 2015 at 2:47 PM, Alan Conway  wrote:

> Can the proton reactor be used to deliver work from multiple connections
> to a thread pool, where work from a given connection is only handled by
> one thread at a time (so access to each pn_transport and it's stuff is
> serialized)? That is a pretty standard model for servers.
>
> It doesn't look to me like this is the case but I may be missing
> something. If it is the case, what's the pattern for doing it?
>
> Cheers,
> Alan.
>
>


Re: 0.10 release time frame?

2015-04-30 Thread Rafael Schloming
I'd like to see one fairly soon. I'm currently working through a few
sasl-related interop issues between proton-c and proton-j, but once that is
done and Gordon's map fix lands, I think we would be in decent shape to put
out a 0.10 in short order.

--Rafael

On Thu, Apr 30, 2015 at 3:06 PM, Rajith Muditha Attapattu <
rajit...@gmail.com> wrote:

> I'm interested in knowing the timelines the community has in mind for the
> 0.10 release.
>
> A tentative date for alpha and beta cycles would be helpful in planning the
> work tasks and vacation time.
>
> Regards,
>
> Rajith
>


Re: Python 3 port is 'done'

2015-04-30 Thread Rafael Schloming
On Thu, Apr 30, 2015 at 8:35 AM, Ken Giusti  wrote:

>
>
> - Original Message -
> > From: "Rafael Schloming" 
> > To: proton@qpid.apache.org
> > Sent: Wednesday, April 29, 2015 4:24:09 PM
> > Subject: Re: Python 3 port is 'done'
> >
> > What happens when I run make test and I have both python2 and python3
> > installed on my system? Do the tests run once under each version or does
> > one of the versions 'win'?
>
> At this point it only runs on the 'default' version - whatever
> /usr/bin/python resolves to.
>
> I like the idea of having it run on all installed python versions, but I
> haven't explored how to do that yet.
>
> I've been using virtualenv [1] to switch between the two versions of
> python I have installed on my development station.  Tox [2] is probably the
> best approach to enable testing against multiple python environments.
>
> I'll look into tox a bit and see what I can come up with.
>

My system comes with both python and python3 on my path. Just running
python3 manually on proton/tests/proton-test will run it with the python3
interpreter. I don't know how standard this setup is (I'm running stock
fedora 20), but it would be pretty easy to do a check in cmake and run the
tests using python3 if present.

I'm also a fan of running both python versions if present, but I also don't
want to double the time it takes to run through the tests. Given that we
are mostly looking for syntactic incompatibilities in the wrapper code
here, I wonder if it would be sufficient to run a subset of the tests that
is likely to give us good coverage on the wrapper code but doesn't bother
trying to exercise all the C code twice. Obviously if this proves
insufficient we could expand the subset.

--Rafael


Re: Python 3 port is 'done'

2015-04-29 Thread Rafael Schloming
What happens when I run make test and I have both python2 and python3
installed on my system? Do the tests run once under each version or does
one of the versions 'win'?

--Rafael

On Wed, Apr 29, 2015 at 4:05 PM, Ken Giusti  wrote:

>
> Well, done enough to consider merging to master.
>
> While the patch is quite large, most of the changes are simple syntax
> changes to avoid non-python3 compliant syntax.
>
> The code is available on the kgiusti-python3 branch at the Apache repo.
>
>
> https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;a=shortlog;h=refs/heads/kgiusti-python3
>
> I've also made a patch that can be viewed up on reviewboard:
>
> https://reviews.apache.org/r/33691/
>
> I've verified that the unit tests and python examples run under python2.6,
> 2.7, and python3.3.   I'd appreciate if folks would take this patch for a
> spin and report back their experience.
>
> Known Issues:
>
> These changes will be incompatible with earlier versions of the python 2.x
> series.  I know for a fact that python versions <= 2.4 won't even parse
> this patch, and I suspect getting such older versions of python to work
> would require lots of effort.   I'm a little unsure of how well python 2.5
> will be supported - I have yet to test that far back.  I also didn't test
> anything earlier than 3.3 in the python3.x stream.
>
> --
> -K
>


[VOTE]: Release Proton 0.9.1-rc1 as 0.9.1

2015-04-29 Thread Rafael Schloming
Hi Everyone,

I've put out an RC for 0.9.1 in the usual places.

Source artifacts are here:
https://people.apache.org/~rhs/qpid-proton-0.9.1-rc1/

Java binaries are here:
https://repository.apache.org/content/repositories/orgapacheqpid-1033

Please check them out and register your vote:

[  ]: Yes, release Proton 0.9.1-rc1 as 0.9.1
[  ]: No, ...

--Rafael


Re: candidate commits for 0.9.1

2015-04-29 Thread Rafael Schloming
On Wed, Apr 29, 2015 at 12:38 PM, Gordon Sim  wrote:

> On 04/27/2015 01:45 PM, Gordon Sim wrote:
>
>> On 04/27/2015 01:14 PM, Rafael Schloming wrote:
>>
>>> I also added PROTON-858 as a release blocker.
>>>
>>
>> I've been trying to get a fix proposal together for that. I'll post it
>> for review as soon as I'm reasonably confident, still seeing some issues
>> at present (not 100% sure they are related, but am assuming so).
>>
>
> Just to update the status here. Although I have positive reviews for the
> simple patch, I have encountered some issues even with that during stress
> testing.
>
> I can't say for sure whether these are caused by my change as the test
> showing them up doesn't run long enough without the change. However until I
> know for sure I am not keen to commit it.
>

It sounds like the patch has at least improved things in your scenario. Do
you think it's likely that the patch could have made things worse in some
other way?

--Rafael


Re: candidate commits for 0.9.1

2015-04-29 Thread Rafael Schloming
Ok, I'll spin up an RC shortly.

--Rafael

On Wed, Apr 29, 2015 at 12:48 PM, Robbie Gemmell 
wrote:

> On 29 April 2015 at 17:38, Gordon Sim  wrote:
> > On 04/27/2015 01:45 PM, Gordon Sim wrote:
> >>
> >> On 04/27/2015 01:14 PM, Rafael Schloming wrote:
> >>>
> >>> I also added PROTON-858 as a release blocker.
> >>
> >>
> >> I've been trying to get a fix proposal together for that. I'll post it
> >> for review as soon as I'm reasonably confident, still seeing some issues
> >> at present (not 100% sure they are related, but am assuming so).
> >
> >
> > Just to update the status here. Although I have positive reviews for the
> > simple patch, I have encountered some issues even with that during stress
> > testing.
> >
> > I can't say for sure whether these are caused by my change as the test
> > showing them up doesn't run long enough without the change. However
> until I
> > know for sure I am not keen to commit it.
> >
> > If this is holding up things for the proton-j side, which after all is
> what
> > motivated the release in the first place, I would suggest we continue
> > without the fix for PROTON-858.
>
> I think that might be a good idea, certainly I won't argue against it.
> If we aren't confident in the fix, we might end up needing a respin
> that could push things out further.
>
> At the end of the day, we can easily do a 0.9.2 when we think it is
> ready. It's possibly also worth saying that even if we didn't do that,
> it's likely less of an issue for folks using proton-c to just patch the
> source they build from, whereas many using proton-j will be doing so
> via binaries at Maven Central and so need us to push a release out to
> get the changes.
>
> Robbie
>


Re: candidate commits for 0.9.1

2015-04-29 Thread Rafael Schloming
On Wed, Apr 29, 2015 at 6:16 AM, Dominic Evans 
wrote:

> -Robbie Gemmell  wrote: -
> > There were some changes on master and the branch yesterday, so I have
> > updated the commit lists again. The current categorised list of
> > commits is now at:
> > http://people.apache.org/~robbie/qpid/proton/0.9.1/git-cherry-pass3-categorised.txt
> >
> > As before, only the commits at the very bottom have been picked from
> > master to the 0.9.x branch. All the previous commits mentioned in the
> > file have not. If you want anything else included you need to say so,
> > or do so.
> >
> > We are essentially waiting for PROTON-858 at this point, which there
> > still seems to be a lot of discussion going on about. If we can't land
> > it quickly with confidence I'd like to suggest possibly deferring it,
> > as we can always do more releases.
>
> Thanks Robbie.
>
> Ongoing, is the plan that we should continue to backport cherry-picked
> bugfix
> commits from master and keep the 0.9.x series going for possible future
> point
> releases?
>

I think the SASL stuff should stabilize soon, and as that is the major
delta, we could simply do a 0.10 when that happens.

--Rafael


Re: candidate commits for 0.9.1

2015-04-27 Thread Rafael Schloming
This one should definitely go in:

+ aa5ea2b62fd5680bc2a36bee14f72e037d8cc276 close the transport when the
selector reports an error

I also added PROTON-858 as a release blocker.

--Rafael


On Mon, Apr 27, 2015 at 8:07 AM, Gordon Sim  wrote:

> On 04/27/2015 12:46 PM, Robbie Gemmell wrote:
>
>> I have gone through the git cherry output and categorised the
> >> remaining commits from master that don't have a direct equivalent on
>> the 0.9.x branch, splitting according to what they update i.e. mainly
>> by language. I listed some as excluded based on what they are for, e.g
>> the 0.10 sasl work, and any Go-only changes (because the Go bits were
>> not in 0.9).
>>
>>
>> http://people.apache.org/~robbie/qpid/proton/0.9.1/git-cherry-pass1-categorised.txt
>>
>> I'd like to get the 0.9.1 release out this week, which would mean
>> starting a vote tomorrow, so if you want specific commits included
>> then please shout now.
>>
>
> I think all the commits under 'Python (+Examples)' can be included for
> 0.9.1.
>
>


Re: Messenger race condition on Android?

2015-04-27 Thread Rafael Schloming
On Mon, Apr 20, 2015 at 8:12 PM, Adam Wynne  wrote:

> Hi Rafael,
>
> My answers to your questions are below...
>
> On Fri, Apr 17, 2015 at 8:33 AM Rafael Schloming-3 [via Qpid] <
> ml-node+s2158936n7623117...@n2.nabble.com> wrote:
>
> > > On Fri, Apr 17, 2015 at 8:09 AM, Adam Wynne <[hidden email]> wrote:
> >
> > > Sorry for the cross-post but I didn't get any hits on the user list and
> > I
> > > now
> > > think this could be a bug.
> > >
> > > I think I am seeing a race condition with Messenger on Android only:
> > >
> > > When I do the typical put/send sequence in a Thread started from an
> > Android
> > > Activity, the message is not received by a subscribed peer.  If I kill
> > the
> > > Activity, the peer will complain that the connection is broken.  So it
> > > seems that the connection is being made but the data is not sent.  Here
> > is
> > > an example code snippet:
> > >
> > > Messenger messenger = Messenger.Factory.create();
> > > // do other things like create a message
> > > messenger.put(msg);
> > > // Thread.sleep(200);
> > > messenger.send();
> > >
> > > However when  I uncomment the sleep statement above, the message is
> > > received without any problem.   The message is also received if I
> > attempt
> > > to debug to see what is happening in put().
> > >
> > > I noticed that put() does not simply add the message to a queue, it
> also
> > > uses nio methods to do some encoding of the message.  I'm wondering if
> > > since it is not blocking, is there some encoding method happening while
> > the
> > > send() is being processed, causing the message to be lost.
> > >
> > > We also noticed that there is a big CPU usage (up to 40%) spike during
> > the
> > > put/send process, which seems extreme for just a tcp send.
> > >
> >
> > Hi Adam,
> >
> > Apologies in advance for the barrage of questions, but  some additional
> > information would be helpful.
> >
> > What version of the code are you working with?
> >
> I first tried with 0.9, then I built the latest from source and had the
> same results each time
>
> > Is your thread a long running thread or does it terminate shortly after
> > the
> > code you have posted?
> >
> The thread is long running
>
> > What exactly is receiving the message at the other end of the connection?
> >
> I have tried 2 subscribers with the same results:  one in android and one
> on a macbook.  I get the same results on mac.
>
> > Does a similar thread arrangement reproduce the issue outside of Android,
> > and if so would it be possible to post a reproducer?
> >
> No, I couldn't reproduce in a standard JVM.   Do you want me to post the
> android app?
>

That would certainly be helpful.

--Rafael


Re: Error handling in the proton-c reactor (and how it might relate to a Java port)

2015-04-24 Thread Rafael Schloming
Hi Adrian,

See inline for answers...

On Thu, Apr 23, 2015 at 12:17 PM, Adrian Preston 
wrote:

> Hello all,
>
> While porting the proton-c reactor to Java, I've found a few error paths
> that I wasn't sure how best to handle.
>
> I have some ideas (see below), but if this stuff is already written down
> somewhere - feel free to suitably admonish me (and then point me towards
> it...)
>
> 1) When an error occurs while the reactor is servicing a connection: the
> connection is closed with a transport error.  This is already implemented
> by various functions in reactor/connection.c (e.g. pni_handle_bound, to
> pick one at random), so I expect Java following suit shouldn't be too
> contentious.
>

Yes


> 2) When an error occurs while the reactor is accepting a connection: a
> PN_SELECTABLE_ERROR event is delivered to the acceptor's collector.  This
> might necessitate a new pn_acceptor_attachments function to associate a
> handler with an acceptor (casting to selectable strikes me as something
> that might break in the future...).  Aside: should it be possible to
> associate a pn_error (Java Throwable?) with an event, so that it is
> possible to report the underlying cause for a PN_SELECTABLE_ERROR?
>

A pn_acceptor_attachments function makes sense to me.

Regarding your other question. In general I've been trying to stick to
having each event reference only a single object, and also reference state
in the object model rather than carry state itself, so I might consider
adding an accessor to pn_selectable_t to store/extract error information
instead of storing it on the event.

3) In the Java reactor it is possible for an unchecked (derived from
> RuntimeException) exception to be thrown from a handler.  Delivering a
> PN_SELECTABLE_ERROR to the selectable seems like the wrong thing to do
> (because the handler that threw the exception might not be associated with
> a selectable, or the exception could be thrown while handling
> PN_SELECTABLE_ERROR).  Logging the exception then swallowing it seems
> likely to result in situations where the reactor appears to have hung.  So
> the best I've come up with is that the Java equivalent of
> pn_reactor_process throws an exception - but then I'm not clear what state
> the reactor should be left in?  Permanently failing, by throwing a
> "ReactorBorked" exception from any future pn_reactor_process invocation?
> Also, if this happens should the reactor be responsible for reclaiming the
> resources used by its children (e.g. closing their sockets)?
>

The python wrapper of the reactor has a similar situation since python code
can also throw runtime exceptions. From my experience coding against the
API, you definitely want to know sooner rather than later exactly what has
gone wrong. It can be easy to miss errors that scroll by in a log, so I
would definitely not attempt to continue executing automatically. That said
I would try not to leave the reactor in a permanently borked state either
since you might want the option to fire events related to shutdown after an
error.

What I've done in python is roughly the following. I catch and save any
exceptions that occur during dispatch of the current event to its handlers.
When that event has been dispatched to all handlers, I throw an exception
(it's anonymous currently, but it should probably be some sort of
DispatchException) from Reactor.process() that references any exceptions
that occurred during dispatch of that event. This by default results in the
reactor failing fast when an exception occurs, but also leaves things in a
state where the user can easily log the exception and call process again if
they wish to continue.

Regarding reclaiming resources, I don't attempt to close sockets or
anything like that since for my use cases when the reactor fails the whole
process exits. In C this will happen when the reactor is freed, but
obviously in python and/or java you would be depending on GC to make that
happen and it might not be soon enough, so it may make sense to add a
method that would explicitly do that sort of cleanup.
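
The approach described — save any exceptions raised during dispatch, finish delivering the event to all handlers, then fail fast from process() in a way the caller can log and resume from — can be sketched as follows. This is a minimal model, not the actual Reactor.process() code; `DispatchException` is the hypothetical name mentioned above:

```python
class DispatchException(Exception):
    """Raised after event dispatch completes, referencing every
    exception that occurred while dispatching that event."""
    def __init__(self, causes):
        super(DispatchException, self).__init__(
            "%d handler(s) failed" % len(causes))
        self.causes = causes

def process(event, handlers):
    errors = []
    # Dispatch the event to *all* handlers, saving any exceptions...
    for handler in handlers:
        try:
            handler(event)
        except Exception as exc:
            errors.append(exc)
    # ...then fail fast, but leave the caller free to log and call
    # process() again if they wish to continue.
    if errors:
        raise DispatchException(errors)

seen = []
good = seen.append
def bad(event):
    raise ValueError(event)

try:
    process("on_timer", [good, bad, good])
except DispatchException as e:
    assert len(e.causes) == 1
assert seen == ["on_timer", "on_timer"]  # both good handlers still ran
```

Note that the failing handler does not prevent later handlers from seeing the event, and the reactor object itself is left in a usable state rather than permanently borked.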

--Rafael


Re: New release?

2015-04-23 Thread Rafael Schloming
There are a couple of proton-c changes that while not as critical as the
proton-j stuff would make sense to go out in such a release, e.g. there is
a two line fix that avoids zombie connections building up when the network
dies in just the right way, so I'm +1 on a quick turnaround release.

--Rafael

On Thu, Apr 23, 2015 at 6:39 AM, Robbie Gemmell 
wrote:

> Hi folks,
>
> I would like to propose doing a new release. There have been quite a
> few important fixes or changes since 0.9, mainly in proton-j, that I
> would like to see made available for use in dependent projects such as
> the JMS client. These include things such as preventing a few memory
> leaks, some changes to stop erroneous attach frames being sent, and
> some updates in the new heartbeat support to align its timing
> behaviour with proton-c.
>
> I don't think there is anything on master specific to proton-j that I
> wouldn't include currently, however it seems likely the same isn't true
> for proton-c right now, e.g. with the large SASL changes having just
> had their initial landing this week and given that the next release
> was probably not expected to be for a noticeable period of time. As a
> result I would propose doing a point release based on 0.9, branching
> from the 0.9 tag (e.g to 0.9.x) and then cherry picking any desired
> changes to include.
>
> Thoughts?
>
> Robbie
>


Re: problems with master after sasl changes

2015-04-23 Thread Rafael Schloming
FYI, I was able to get past all of my issues by installing
cyrus-sasl-devel. Apparently the fallback implementation that is there when
cyrus is not available doesn't handle pipelining, and this was responsible
for both the test failures and the interop issue. Andrew is working on
fixing this.

--Rafael

On Wed, Apr 22, 2015 at 3:02 PM, Gordon Sim  wrote:

> On 04/22/2015 05:42 PM, Gordon Sim wrote:
>
>> On 04/21/2015 12:52 PM, Rafael Schloming wrote:
>>
>>> I'm seeing a couple of issues with the recently landed sasl changes. I'm
>>> getting four test failures in the python tests (see details at the end).
>>> I'm also seeing interop issues with the proton.js built prior to these
>>> changes, and with these changes in place the javascript build seems to be
>>> messed up (not finding new symbols).
>>>
>>> Is anyone else seeing similar issues?
>>>
>>
>> I can't even get as far as the tests. On a clean build
>> (cyrus-sasl-lib-2.1.23-31 installed) I get:
>>
>>  /home/gordon/projects/proton-git/proton-c/src/sasl/cyrus_sasl.c: In
>>> function ‘pn_sasl_free’:
>>> /home/gordon/projects/proton-git/proton-c/src/sasl/cyrus_sasl.c:656:15:
>>> error:
>>> implicit declaration of function ‘sasl_client_done’
>>> [-Wimplicit-function-declaration]
>>> /home/gordon/projects/proton-git/proton-c/src/sasl/cyrus_sasl.c:658:15:
>>> error:
>>> implicit declaration of function ‘sasl_server_done’
>>> [-Wimplicit-function-declaration]
>>> make[2]: ***
>>> [proton-c/CMakeFiles/qpid-proton.dir/src/sasl/cyrus_sasl.c.o] Error 1
>>>
>>
> This is a cyrus version issue, see
> https://issues.apache.org/jira/browse/PROTON-859
>
>  If I set the SASL_IMPL in cmake to null (is that how you turn it off),
>> then I get:
>>
>
> I got past this with a completely fresh build directory.
>
>
>


problems with master after sasl changes

2015-04-21 Thread Rafael Schloming
I'm seeing a couple of issues with the recently landed sasl changes. I'm
getting four test failures in the python tests (see details at the end).
I'm also seeing interop issues with the proton.js built prior to these
changes, and with these changes in place the javascript build seems to be
messed up (not finding new symbols).

Is anyone else seeing similar issues?

--Rafael

proton_tests.sasl.SaslTest.testPipelined2
 fail
Error during test:  Traceback (most recent call last):
File "/home/rhs/proton/tests/python/proton-test", line 355, in run
  phase()
File "/home/rhs/proton/tests/python/proton_tests/sasl.py", line 161, in
testPipelined2
  assert len(out1) > 0
  AssertionError
proton_tests.sasl.SaslTest.testPipelinedClient
... fail
Error during test:  Traceback (most recent call last):
File "/home/rhs/proton/tests/python/proton-test", line 355, in run
  phase()
File "/home/rhs/proton/tests/python/proton_tests/sasl.py", line 68, in
testPipelinedClient
  assert self.s1.outcome == SASL.OK
  AssertionError
proton_tests.sasl.SaslTest.testPipelinedClientFail
... fail
Error during test:  Traceback (most recent call last):
File "/home/rhs/proton/tests/python/proton-test", line 355, in run
  phase()
File "/home/rhs/proton/tests/python/proton_tests/sasl.py", line 95, in
testPipelinedClientFail
  assert self.s1.outcome == SASL.AUTH
  AssertionError
proton_tests.sasl.SaslTest.testSaslAndAmqpInSingleChunk
.. fail
Error during test:  Traceback (most recent call last):
File "/home/rhs/proton/tests/python/proton-test", line 355, in run
  phase()
File "/home/rhs/proton/tests/python/proton_tests/sasl.py", line 140, in
testSaslAndAmqpInSingleChunk
  assert self.s2.outcome == SASL.OK
  AssertionError


Re: Messenger race condition on Android?

2015-04-17 Thread Rafael Schloming
On Fri, Apr 17, 2015 at 8:09 AM, Adam Wynne  wrote:

> Sorry for the cross-post but I didn't get any hits on the user list and I
> now
> think this could be a bug.
>
> I think I am seeing a race condition with Messenger on Android only:
>
> When I do the typical put/send sequence in a Thread started from an Android
> Activity, the message is not received by a subscribed peer.  If I kill the
> Activity, the peer will complain that the connection is broken.  So it
> seems that the connection is being made but the data is not sent.  Here is
> an example code snippet:
>
> Messenger messenger = Messenger.Factory.create();
> // do other things like create a message
> messenger.put(msg)
>// Thread.sleep(200)
> messenger.send()
>
> However when  I uncomment the sleep statement above, the message is
> received without any problem.   The message is also received if I attempt
> to debug to see what is happening in put().
>
> I noticed that put() does not simply add the message to a queue, it also
> uses nio methods to do some encoding of the message.  I'm wondering if
> since it is not blocking, is there some encoding method happening while the
> send() is being processed, causing the message to be lost.
>
> We also noticed that there is a big CPU usage (up to 40%) spike during the
> put/send process, which seems extreme for just a tcp send.
>

Hi Adam,

Apologies in advance for the barrage of questions, but  some additional
information would be helpful.

What version of the code are you working with?
Is your thread a long running thread or does it terminate shortly after the
code you have posted?
What exactly is receiving the message at the other end of the connection?
Does a similar thread arrangement reproduce the issue outside of Android,
and if so would it be possible to post a reproducer?

Thanks,

--Rafael


Re: [RFC] Strategy for porting Proton to Python 3

2015-04-17 Thread Rafael Schloming
On Thu, Apr 16, 2015 at 9:54 AM, Ken Giusti  wrote:

>
> Hi all,
>
> I'm building on the work done by Dominic and Mickael to get all the proton
> python bits to work under both python2 and python3.   See [1].
>
> I think this will entail a lot of little changes to the python sources and
> the unit tests.  Rather than check in a single huge patch, I'm going to
> break it up over several patches.
>
> The first bunch of patches will simply 'modernize' the existing python
> code.  Old style syntax that is not forward compatible with python 3 will
> be replaced (e.g. print "foo" --> print("foo"), etc).  I'll use a tool
> called 'futurize' which is part of the python future toolset [2], [3].
>
> Once all python code is updated, then I'll begin introducing python 3
> specific patches, including the work already done by Dominic and Mickael.
> Of course I'll verify that none of these changes will break python 2.
>  I've got a local CI system that can build/test in both environments.
>
> From a discussion with Dominic, we agreed that it would be A Good Thing to
> use one of the existing Py2 <--> Py3 abstraction libraries.  These
> libraries provide utilities for writing code that works under both python
> versions.  I've used 'six' in the past [4] and found it quite helpful - it
> will eliminate a lot of the messy conditional code one has to hack in order
> to support both languages.
>
> However, this library is not part of the standard python library.  This
> means introducing a new dependency.
>
> Personally, I don't think this is a big deal - use of 'six' is ubiquitous
> among python packages.  It's available freely via pypi, and though most
> distros.
>
> So that's the Big Question - is everyone comfortable with this additional
> dependency?   Does anyone have a better alternative?  Has anyone ported
> other large python codebases - what was your experience?
>
> thanks,
>
> -K
>
>
> [1] https://issues.apache.org/jira/browse/PROTON-490
> [2] http://python-future.org/index.html
> [3] http://python-future.org/futurize_cheatsheet.html
> [4] https://pythonhosted.org/six/
>


This makes sense to me, assuming the jython issues can be sorted. Would it
make sense to modify the build to automatically run the tests under both 2
and 3 to avoid introducing new code that only works on one version?
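
For illustration, this is the sort of version conditional a compatibility
layer centralizes ('six' provides ready-made equivalents such as six.PY2 and
six.text_type; the stdlib-only sketch below uses illustrative names, not
anything from proton):

```python
import sys

PY2 = sys.version_info[0] == 2

def to_text(value, encoding="utf-8"):
    """Return a text string under both Python 2 and 3.

    Without a shared helper, every module ends up repeating a version
    check like this one; an abstraction library keeps it in one place.
    """
    text_type = unicode if PY2 else str  # noqa: F821 (py2-only builtin)
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)
```

The conditional expression only touches the `unicode` builtin when actually
running under Python 2, so the same source parses and runs on both versions.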

--Rafael


Re: codec changes

2015-04-15 Thread Rafael Schloming
On Tue, Apr 14, 2015 at 1:27 PM, Alan Conway  wrote:

> That works for me, now how do we manage the transition? I don't think we
> can afford to inflict "yum update proton; all proton apps crash" on our
> users. That means we cannot change the behavior of existing function
> names. I don't much like adding encode_foo2 for every encode_foo but I
> don't see what else to do.
>

We could mark the old ones as deprecated and add the new ones as
pn_xxx_encode2 and provide a feature macro that #defines pn_xxx_encode to
pn_xxx_encode2. Then after a sufficient transition period we can get rid of
the old versions.

--Rafael


Re: codec changes

2015-04-12 Thread Rafael Schloming
On Fri, Apr 10, 2015 at 11:37 AM, Alan Conway  wrote:

>
> On Wed, 2015-04-08 at 14:50 -0400, Andrew Stitcher wrote:
> > On Wed, 2015-04-08 at 10:48 -0400, Alan Conway wrote:
> > > ...
> > > The issue is that we need buffer growth AND control over allocation.
> > > pn_buffer_t forces use of malloc/free/realloc. That won't help if
> you're
> > > trying to get the data into a buffer allocated by Go for example. I
> > > agree a growable buffer type is a nicer API than raw function pointers,
> > > but the buffer type itself would need to use function pointers so we
> can
> > > replace the functions used for alloc/free/realloc.
> >
> > I think this is a pretty general issue actually that also crops up in
> > embedded systems, where you want some different control over memory
> > allocation.
> >
> > I think it might be better addressed by making malloc/free/realloc
> > replaceable for the whole of proton - would that solve your go issue?
>
> Sadly no. A proton realloc surrogate is just going to be passed void*
> and return void*. There's no memory safe way to implement that in Go,
> unless someone keeps a Go pointer to the returned buffer it gets garbage
> collected. There's no way to communicate a Go pointer thru a C proton
> interface.
>
> Given all that, I think I prefer the sprintf-style solution for growing
> buffers. It keeps the memory management strictly on the caller side of
> the pn_ interface which is simpler.
>
> If there is a big pn_data overhaul in the works then maybe we should not
> be talking about pn_data_encode and instead make sure we fix it in the
> overhaul.
>
> I notice we have these other APIs:
>
> codec.h:499:PN_EXTERN int pn_data_format(pn_data_t *data, char *bytes,
> size_t *size);
> message.h:739:PN_EXTERN int pn_message_encode(pn_message_t *msg, char
> *bytes, size_t *size);
> ssl.h:320:PN_EXTERN int pn_ssl_get_peer_hostname( pn_ssl_t *ssl, char
> *hostname, size_t *bufsize );
>
> That is a better signature IMO, the new (backward compatible) syntax for
> such functions would be:
>
> return 0: *size is updated to be the actual size.
> return PN_OVERFLOW: *size is updated to be the required size.
> return < 0: *size is undefined.
> don't return > 0
>
> Perhaps for the overhaul we should ensure all such functions (including
> the replacement for pn_data_encode) follow this pattern?
>

I'm a big +1 on standardizing on a single snprintf style signature for all
encode functions.

The signatures you mention above (similar to sprintf but using an IN/OUT
parameter) were my first attempt at a standard encode/decode signature. I
agree with you that at first it looks a bit more appealing for some reason,
however after using it for a while I actually found it quite cumbersome
relative to the traditional snprintf style. In the common case it takes
almost twice as much code to use it since you always have to declare,
initialize, and check an extra variable in separate steps due to the IN/OUT
parameter. It also doesn't map that nicely when you swig it since most
languages don't do the IN/OUT parameter thing well, so you have to have
more cumbersome/custom typemaps and/or wrapper code to deal with the
impedance mismatch.

For those reasons I propose that we standardize on the traditional snprintf
rather than using the IN/OUT variant.
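
For comparison, the IN/OUT variant can be modeled like this (a sketch only,
not real proton code; a one-element list plays the role of the C `size_t*`
out-parameter, which itself hints at the binding-level impedance mismatch
described above):

```python
PN_OVERFLOW = -3  # illustrative value; the real constant lives in proton's error codes

def encode_inout(values, buf, size):
    """IN/OUT convention: size[0] holds the usable capacity on entry and
    is rewritten to the actual (or required) size on exit.  Returns 0 on
    success, PN_OVERFLOW if buf was too small.  Names are illustrative."""
    data = b"".join(v.to_bytes(4, "big") for v in values)
    if len(data) > size[0]:
        size[0] = len(data)  # report the required size
        return PN_OVERFLOW
    buf[:len(data)] = data
    size[0] = len(data)      # report the actual size
    return 0

def encode_inout_growing(values, buf):
    # Caller side: declare, initialize, and re-check `size` on every
    # call -- the extra bookkeeping a snprintf-style return value avoids.
    size = [len(buf)]
    if encode_inout(values, buf, size) == PN_OVERFLOW:
        buf.extend(bytearray(size[0] - len(buf)))
        size = [len(buf)]
        encode_inout(values, buf, size)
    return size[0]
```

Even in this toy form, each call site carries an extra mutable variable
through three separate steps, which is the cumbersomeness being described.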

--Rafael


Re: codec changes

2015-04-08 Thread Rafael Schloming
On Wed, Apr 8, 2015 at 10:48 AM, Alan Conway  wrote:

> On Tue, 2015-04-07 at 17:57 -0400, Rafael Schloming wrote:
> > Maybe I'm not following something, but I don't see how passing around
> > allocation functions actually solves any ownership problems. I would
> think
> > the ownership problems come from pn_data_t holding onto a pointer
> > regardless of whether that pointer was gotten from malloc or from a
> > callback. I'm assuming you want to be able to do something like use a
> > pn_data_t that is owned by one thread to encode into a buffer and pass
> that
> > buffer over to another thread. How does passing in a bunch of allocators
> > help with this? If the pn_data_t holds a pointer to whatever those
> > allocators return, then aren't you going to have ownership issues no
> matter
> > what?
> >
> > To answer Bozzo's original question, I think that it's good to keep the
> > encoder/decoder decoupled from the buffer for a number of reasons. In
> > addition to the ownership issues that Alan points out, the encoded data
> may
> > have a different lifecycle from the pn_data_t that created it for non
> > thread related reasons, or you may simply want to encode directly into a
> > frame buffer and avoid an extra copy.
> >
> > If the goal here is simply to provide a convenient way to avoid having to
> > repeat the resizing loop then I would suggest simply providing a
> > convenience API that accepts a growable buffer of some sort. This
> provides
> > both convenience and avoids ownership issues with holding pointers. I'm
> > thinking something along the lines of:
> >
> > pn_data_t data = ...;
> > pn_string/buffer_t buf = ...;
> > pn_data_encode_conveniently(data, buf); // obviously needs a better
> name
> >
> > It is perhaps not *quite* as convenient in the minimal case as pn_data_t
> > holding the buffer internally, but it is an improvement in general and
> > probably simpler than having to mess with function pointers in the case
> > where the buffer's lifecycle is independent from the pn_data_t. (Not
> that I
> > really understand how that would work anyways.)
>
> The issue is that we need buffer growth AND control over allocation.
> pn_buffer_t forces use of malloc/free/realloc. That won't help if you're
> trying to get the data into a buffer allocated by Go for example. I
> agree a growable buffer type is a nicer API than raw function pointers,
> but the buffer type itself would need to use function pointers so we can
> replace the functions used for alloc/free/realloc.
>

I'm skeptical that passing around a set of function pointers whether
directly to pn_data_t or to pn_buffer_t is actually fewer lines of code
than simply writing the analogous convenience encoding function for that
language. For python it would likely be fewer lines of code to simply
define something like this:

ssize_t pn_data_encode2bytearray(pn_data_t *data, PyObject *bytearray) {
    ssize_t size = pn_data_encode(data, PyByteArray_AsString(bytearray),
                                  PyByteArray_Size(bytearray));
    if (size < 0) { return size; }
    if (size > PyByteArray_Size(bytearray)) {
        PyByteArray_Resize(bytearray, size);
        return pn_data_encode(data, PyByteArray_AsString(bytearray),
                              PyByteArray_Size(bytearray));
    }
    return size;  /* it already fit -- report the encoded size */
}

This would also be more flexible since you could pass in any bytearray
object from python, reuse it as much as you like, and have full control
over the lifecycle of that object, whereas I suspect given the signature of
the function pointers you are defining you would need to choose some sort
of predefined allocation strategy for creating whatever go object you are
encoding into. This sounds to me like it is both less convenient and less
flexible, but its entirely possible I'm just not understanding how you
intend to use this. Perhaps it would clarify things if you could post an
actual example usage of how you imagine using this that hopefully
illustrates why it is fewer lines of code than the alternatives?

--Rafael


Re: codec changes

2015-04-07 Thread Rafael Schloming
Maybe I'm not following something, but I don't see how passing around
allocation functions actually solves any ownership problems. I would think
the ownership problems come from pn_data_t holding onto a pointer
regardless of whether that pointer was gotten from malloc or from a
callback. I'm assuming you want to be able to do something like use a
pn_data_t that is owned by one thread to encode into a buffer and pass that
buffer over to another thread. How does passing in a bunch of allocators
help with this? If the pn_data_t holds a pointer to whatever those
allocators return, then aren't you going to have ownership issues no matter
what?

To answer Bozzo's original question, I think that it's good to keep the
encoder/decoder decoupled from the buffer for a number of reasons. In
addition to the ownership issues that Alan points out, the encoded data may
have a different lifecycle from the pn_data_t that created it for non
thread related reasons, or you may simply want to encode directly into a
frame buffer and avoid an extra copy.

If the goal here is simply to provide a convenient way to avoid having to
repeat the resizing loop then I would suggest simply providing a
convenience API that accepts a growable buffer of some sort. This provides
both convenience and avoids ownership issues with holding pointers. I'm
thinking something along the lines of:

pn_data_t data = ...;
pn_string/buffer_t buf = ...;
pn_data_encode_conveniently(data, buf); // obviously needs a better name

It is perhaps not *quite* as convenient in the minimal case as pn_data_t
holding the buffer internally, but it is an improvement in general and
probably simpler than having to mess with function pointers in the case
where the buffer's lifecycle is independent from the pn_data_t. (Not that I
really understand how that would work anyways.)

--Rafael


On Tue, Apr 7, 2015 at 1:21 PM, Alan Conway  wrote:

> On Tue, 2015-04-07 at 11:00 -0400, Andrew Stitcher wrote:
> > On Tue, 2015-04-07 at 09:38 -0400, Alan Conway wrote:
> > > On Tue, 2015-03-31 at 19:17 +0200, Božo Dragojevič wrote:
> > > > Given the memory overhead of a pn_data_t before encoding, why not
> have it
> > > > own an encode buffer? it could get by with exactly that grow_buffer()
> > > > callback if ownership is the issue .
> > >
> >
> > I think the best way to do this would be to introduce a new class to sit
> > on top of the existing pn_data_t which does this, rather than extending
> > the current pn_data_t.
> >
> > So I think the below is fine, but I'd prefer to avoid stuffing it all
> > into pn_data_t especially as I think the class is, long term, going
> > away.
>
> Who's replacing it ;) ? Are they taking notes?
>
> Cheers,
> Alan.
>
>


[ANNOUNCE] Qpid Proton 0.9 released

2015-04-02 Thread Rafael Schloming
Hi Everyone,

Qpid Proton 0.9 is now officially available. You can find it here:

http://qpid.apache.org/releases/qpid-proton-0.9/index.html

In addition to numerous bug fixes and improvements, the 0.9 release
includes a new reactive API that makes it significantly easier to integrate
AMQP into an existing application, or to build custom AMQP applications
from scratch.

The release also includes a number of fixes and/or enhancements resulting
in API changes, some of these are noted below:

- pn_buffer_t is no longer public API
- pn_bytes_t.bytes is renamed to pn_bytes_t.start
- pn_sasl_{client|server}() is replaced by pn_transport_set_server() and
client being the default.

There is also an important Windows SSL/TLS functionality change:

The Proton SSL/TLS module for 0.8 using the native Microsoft SChannel
libraries was not configurable and successful handshakes resulted
solely from having the appropriate CA certificate in the official
Windows Trusted Root CA store.  It is now possible to specify
alternate trusted root CA databases and to turn off certificate
checking altogether, using the same Proton API conventions as for
OpenSSL on Posix systems.

In particular, Proton applications in Windows will not check server
certificates at all in 0.9 unless the capability is explicitly enabled
using the pn_ssl_domain_set_trusted_ca_db() function.  To use the
system Trusted Root CA store:

  pn_ssl_domain_set_trusted_ca_db(d, "sys:root")

Or to use a file based PKCS#12 certificate store:

  pn_ssl_domain_set_trusted_ca_db(d, "mycerts.p12")

For more information, see the detailed release notes posted on the release
page at the URL above.

--Rafael


Re: Idle Timeout of a Connection

2015-04-01 Thread Rafael Schloming
On Wed, Apr 1, 2015 at 6:00 AM, Dominic Evans 
wrote:

> 2.4.5 Idle Timeout Of A Connection
>
> "To avoid spurious timeouts, the value in idle-time-out SHOULD be half the
> peer's actual timeout threshold"
>
> So, to me, this means on the @open performative the client should flow
> (e.g.,) 3 as the idleTimeOut it would like to negotiate, but should
> actually only enforce that data is received from the other end within 6
> milliseconds before it closes the session+connection.
>
> However, if that is the case, then the code in proton-c (pn_tick_amqp in
> transport.c) and proton-j (#tick() in TransportImpl.java) would appear to
> be doing the wrong thing?
> Currently it *halves* the advertised remote_idle_timeout of the peer in
> order to determine what deadline to adhere to for sending empty keepalive
> frames to the remote end.
> Similarly it uses its local_idle_timeout as-is to determine if the remote
> end hasn't send data recently enough (closing the link with
> resource-limit-exceeded when the deadline elapses). This would seem to mean
> that empty frames are being sent twice as often as they need to be, and
> resource-limit-exceeded is being fired too soon.
>
> It seems to me that instead it should used remote_idle_timeout as-is for
> determining the deadline for sending data, and the local_idle_timeout
> specified by the client user should either be doubled when determining the
> deadline or halved before sending it in the @open frame.
>
> Thoughts?
>

I believe your interpretation is correct. I've certainly noticed idle
frames being sent significantly more often than I would have expected, but
I haven't had time to dig into the cause.
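
Under this reading, the deadline arithmetic works out as follows (an
illustrative sketch of the rule, not the proton implementation; all names
are made up):

```python
def advertised_idle_timeout(local_threshold_ms):
    """Per AMQP 1.0 section 2.4.5, advertise half of the timeout you
    actually enforce, giving the peer generous margin."""
    return local_threshold_ms // 2

def keepalive_deadline(remote_advertised_ms, last_send_ms):
    """Send an empty frame before the peer's advertised timeout elapses.
    No further halving is needed -- the peer already halved its real
    threshold before advertising it."""
    return last_send_ms + remote_advertised_ms

def local_expiry_deadline(local_threshold_ms, last_recv_ms):
    """Close the connection only after the full local threshold (twice
    the advertised value) passes with no incoming data."""
    return last_recv_ms + local_threshold_ms
```

With the behaviour described above (halving the remote value again, and
enforcing the local value as-is), keepalives go out twice as often as
needed and connections are reaped at half the intended idle period.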

--Rafael


Re: proton-j reactor implementation?

2015-03-31 Thread Rafael Schloming
On Tue, Mar 31, 2015 at 2:48 PM, Alan Conway  wrote:

> On Tue, 2015-03-31 at 11:26 -0400, Rafael Schloming wrote:
> > On Tue, Mar 31, 2015 at 11:02 AM, Alan Conway 
> wrote:
> >
> > > On Mon, 2015-03-30 at 00:11 +0100, Adrian Preston wrote:
> > > > Hello all,
> > > >
> > > > I've been following the development of the reactor API and think
> that it
> > > looks really neat. Is anyone working on a pure Java version? I'd be
> > > interested in helping.
> > > >
> > > > Regards
> > > > - Adrian
> > > > Unless stated otherwise above:
> > > > IBM United Kingdom Limited - Registered in England and Wales with
> number
> > > 741598.
> > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
> PO6
> > > 3AU
> > > >
> > >
> > > I'm currently working on a Go version which is not directly relevant,
> > > but porting directly from the python handlers.py and it is pretty
> > > straightforward. That's where I would start. For Go I also had to wrap
> a
> > > bunch of lower-level proton details but the task should be easier for
> > > Java since all that stuff already exists in Java.
> > >
> > > In Go I am not using the proton reactor to establish or select over
> > > connections, so I'm not using any of the reactor or selectable events.
> I
> > > have a goroutine per connection pumping a proton transport with
> separate
> > > event handling per-connection so we have connection concurrency but
> each
> > > proton engine is only used in a single thread.
> > >
> >
> > This sounds to me like it is based on a bit of a misunderstanding of how
> > the reactor works. The reactor doesn't actually establish or select over
> > connections in C/Python. It is actually just mediating the request for
> > connection creation between whatever piece of application code wants a
> new
> > connection, and whatever handler has been configured to deal with I/O
> > related events. This allows for a significant amount of flexibility since
> > you can have multiple I/O implementations without having to hard code
> your
> > app to work against a specific one. This is just as important a
> requirement
> > in Go or Java or any other language.
>
> Yep I think I've misstated the issue for Go and I agree with you on
> Java.
>
> The real reason I'm not using the reactor is because event-loop
> programming is counter to the whole design of Go. I want to minimize it
> to handling AMQP-related events.
>

What makes you think this? If anything I would argue that go actually
embraces the concept of event-loop programming. For example they have
elevated the select API into a first class language construct (See
https://gobyexample.com/select).

Also, perhaps more importantly, the most significant difference between go
and other languages is the emphasis on concurrency via ownership (i.e.
passing pointers across concurrent queues) rather than on using explicit
mutexes. Event loops are most popular exactly where they enable a
processing pipeline that defines clear ownership of program constructs such
that programmers can take advantage of some concurrency without having to
explicitly reason about locks (e.g. nodejs, vert.x, spring reactor, ...).

Case in point, using the reactor API (either in C or in Python), I can
quite trivially write a simple program that shuffles messages from a link
on one connection to a link on another connection. It's not much more than
this:

def on_message(self, event):
sender = lookup(event.receiver)
sender.send(transform(event.message))

In fact this example would not only be how you'd write such a proxy, but it
would also be how you'd write a simple request/response processor. The
thing is, the only reason this works is because the reactor owns both of
these connections and so it is safe for the event handler to manipulate
both connections at the same time. In the model you've described below,
this code or any similar code would be fraught with read/modify/write
hazards because every connection is potentially owned by a distinct thread
at any given point. To do something like this you would either need to
start introducing explicit locks (which is frowned upon in go, and even so
would have some fairly fundamental deadlocking issues), or you would need
to somehow separate the interaction between the two connections, e.g.
basically inject code in the goroutine that actually owns the connection so
it can run safely. Once you have a generic way of injecting this code
though I claim you've rebuilt the reactor API since that is one o

codec changes

2015-03-31 Thread Rafael Schloming
Hi Alan,

Sorry I didn't comment on this sooner, I didn't have time to comment on
your original review request during my travels, however I do have some
thoughts on the changes you made to the codec interface. I noticed you
added a separate accessor for the size:

ssize_t pn_data_encoded_size(pn_data_t *data);

This is alongside the original encode method:

ssize_t pn_data_encode(pn_data_t *data, char *bytes, size_t size);

I think this API choice while nice in that it is backwards compatible is
also going to result in code that is roughly twice as slow as it needs to
be in the most common case. Based on my experience implementing and
profiling codec in C, Python, and Java, computing the size of the encoded
data seems to usually be roughly the same amount of work as actually
encoding it regardless of the implementation language. Therefore code like
this:

if (buffer_size() < pn_data_encoded_size(data)) grow_buffer();
pn_data_encode(data, buffer, buffer_size());

Can end up being roughly twice as slow as code like this:

ssize_t err;
while ((err = pn_data_encode(data, buffer, buffer_size())) ==
PN_OVERFLOW) {
grow_buffer();
}

Admittedly the latter form is much more awkward in those cases where you
don't care about performance, so I'm all for providing something nicer, but
I think a better API change would be to steal a page from the C stdio.h
APIs and have pn_data_encode always return the number of bytes that would
have been written had there been enough space. This allows you to write the
simplified encode as above:

if (buffer_size() < pn_data_encode(data, NULL, 0)) grow_buffer();
pn_data_encode(data, buffer, buffer_size());

Or use a more optimal form:

   ssize_t n = pn_data_encode(data, buffer, buffer_size());
   if (n > buffer_size()) {
   grow_buffer();
   pn_data_encode(data, buffer, buffer_size());
   }

This makes the slow/convenient form possible, and provides some options
that are a bit less awkward than the loop, but it also makes it very clear
that when you use the slow/convenient form you are incurring roughly twice
the cost of the alternative.

Normally I wouldn't be overly fussed by something like this, and I realize
what I'm suggesting is a breaking change relative to what you provided, but
based on what profiling we've done in the past, codec is probably the most
significant source of overhead that we add to an application, and exactly
this sort of double encode effect is almost always one of the first things
you hit when you try to optimize. Given this, I think it would be a good
thing if the API accurately reflects the relative cost of the different
styles of use.
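
To make the proposed convention concrete, here is a small model of it (a
sketch with illustrative names and a fake wire format, not proton code):
the encode function always returns the number of bytes the encoding would
need, writing only as much as fits.

```python
def encode_into(values, buf):
    """snprintf-style convention: write what fits into buf, but always
    return the full encoded size.  Passing an empty buffer is then the
    moral equivalent of pn_data_encode(data, NULL, 0) -- a pure sizing
    pass."""
    data = b"".join(v.to_bytes(4, "big") for v in values)  # fake wire format
    n = min(len(data), len(buf))
    buf[:n] = data[:n]
    return len(data)

def encode_growing(values, buf):
    """Optimal caller pattern: encode once, and only re-encode when the
    first attempt overflowed -- at most two passes, versus a sizing pass
    plus an encoding pass every single time."""
    n = encode_into(values, buf)
    if n > len(buf):
        buf.extend(bytearray(n - len(buf)))
        encode_into(values, buf)
    return n
```

The convenient-but-slow form (`if len(buf) < encode_into(values, bytearray(0)): ...`)
remains expressible, and its double cost is visible right at the call site.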

Thoughts?

--Rafael


Re: proton-j reactor implementation?

2015-03-31 Thread Rafael Schloming
On Tue, Mar 31, 2015 at 11:02 AM, Alan Conway  wrote:

> On Mon, 2015-03-30 at 00:11 +0100, Adrian Preston wrote:
> > Hello all,
> >
> > I've been following the development of the reactor API and think that it
> looks really neat. Is anyone working on a pure Java version? I'd be
> interested in helping.
> >
> > Regards
> > - Adrian
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU
> >
>
> I'm currently working on a Go version which is not directly relevant,
> but porting directly from the python handlers.py and it is pretty
> straightforward. That's where I would start. For Go I also had to wrap a
> bunch of lower-level proton details but the task should be easier for
> Java since all that stuff already exists in Java.
>
> In Go I am not using the proton reactor to establish or select over
> connections, so I'm not using any of the reactor or selectable events. I
> have a goroutine per connection pumping a proton transport with separate
> event handling per-connection so we have connection concurrency but each
> proton engine is only used in a single thread.
>

This sounds to me like it is based on a bit of a misunderstanding of how
the reactor works. The reactor doesn't actually establish or select over
connections in C/Python. It is actually just mediating the request for
connection creation between whatever piece of application code wants a new
connection, and whatever handler has been configured to deal with I/O
related events. This allows for a significant amount of flexibility since
you can have multiple I/O implementations without having to hard code your
app to work against a specific one. This is just as important a requirement
in Go or Java or any other language.


> I'm not sure what the right approach is for Java. Having a C-based
> reactor is useful in C and for some bindings (e.g. the python binding
> uses it) but in languages that have their own version of event
> loops/polling/selecting it may be better to go with the native
> approach.
>

The reactor is pure code/data structure. I believe the correct approach for
Java would be a straightforward port, and the correct approach for Go would
be a simple binding, just like all the other pure code/data pieces
(connection, transport, etc). Thinking of the reactor as part of the I/O
subsystem is to misunderstand how it works. The reactor proper has been
carefully designed to not directly incorporate any I/O dependencies at all.

In other words, don't think of the reactor as analogous to or a replacement
for the old Driver, think of the reactor as a (potentially)
multi-connection engine, or in UML terms:

Reactor <>---> Connection <>--> Session <>---> Link <>---> Delivery

Please excuse the ascii art UML. The diamonds are supposed to imply
containment by composition.
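
A toy model of that containment -- the reactor as pure code/data structure,
with events fed in from an external I/O layer -- might look like this (all
names are illustrative, not the proton API):

```python
class Reactor:
    """Owns connections and dispatches queued events to their handlers;
    performs no I/O itself, which is what makes a straight port or thin
    binding feasible."""

    def __init__(self):
        self.connections = []
        self.queue = []

    def connection(self, handler):
        # Mediate the request for a new connection: record it and emit
        # an event for whatever layer handles I/O concerns.
        conn = {"handler": handler, "links": []}
        self.connections.append(conn)
        self.queue.append(("connection_init", conn))
        return conn

    def push(self, event_type, conn):
        """An external I/O layer injects events here."""
        self.queue.append((event_type, conn))

    def process(self):
        while self.queue:
            event_type, conn = self.queue.pop(0)
            method = getattr(conn["handler"], "on_" + event_type, None)
            if method is not None:
                method(conn)
```

Because the reactor owns every connection it dispatches for, a handler can
safely touch two connections in one callback, which is what makes the
proxy/request-response example above safe without explicit locks.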

--Rafael


Re: Proton 0.9 final tarball

2015-03-31 Thread Rafael Schloming
Hi Irina,

I've updated as described in the git pull thread on the PAX extension
issue. I'm just waiting for the mirrors to update before updating the web
site and making the announcement.

--Rafael

On Mon, Mar 30, 2015 at 2:01 PM, Irina Boverman  wrote:

> Rafael,
> Will you update it or not (http://archive.apache.org/dist/qpid/proton/0.9/
> )?
> When will it be announced?
> --
> Regards, Irina.
>


Re: [RESULT] [VOTE]: Proton 0.9-rc-3

2015-03-27 Thread Rafael Schloming
For a start you can look at the changes thread I posted a couple days ago.
I will be pulling this together into a complete list early next week and
will put it out wit the release announcement.

--Rafael

On Fri, Mar 27, 2015 at 6:54 AM, Kritikos, Alex <
alex.kriti...@softwareag.com> wrote:

> Rafael,
>
> could you or someone else point me to a change list from 0.8?
>
> Thanks,
>
> Alex
> On 26 Mar 2015, at 22:20, Ted Ross  wrote:
>
> > Rafael,
> >
> > Do you have an ETA for the final bits?  We're anxious to build some
> downstream packages.
> >
> > -Ted
> >
> > On 03/22/2015 02:44 PM, Rafael Schloming wrote:
> >> This vote passes with 8 binding +1's and no other votes. I will push the
> >> final bits soon.
> >>
> >> --Rafael
> >>
> >> On Tue, Mar 17, 2015 at 9:42 AM, Rafael Schloming 
> wrote:
> >>
> >>> Hi Everyone,
> >>>
> >>> Here's a quick respin of 0.9-rc-3. The only changes from rc-2 are
> exactly
> >>> those two mentioned on the rc-2 vote thread. I've included them at the
> end
> >>> for reference. You can find the source artifacts in the usual location:
> >>>
> >>> https://people.apache.org/~rhs/qpid-proton-0.9-rc-3/
> >>>
> >>> Java binaries are here:
> >>>
> >>>
> https://repository.apache.org/content/repositories/orgapacheqpid-1031
> >>>
> >>> Please check it out and register your vote:
> >>>
> >>> [   ] Yes, release Proton 0.9-rc-3 as 0.9 final
> >>> [   ] No, because ...
> >>>
> >>> --Rafael
> >>>
> >>> ==
> >>>
> >>> commit 810088b14dedcd12a9474687ba9cd05fc8297188
> >>> Author: Dominic Evans 
> >>> Date:   Mon Mar 16 12:18:20 2015 +
> >>>
> >>> PROTON-834: further UTF-8 encoder fixes
> >>>
> >>> After commit c65e897 it turned out there were still some issues
> with
> >>> strings containing a codepoint >0xDBFF which was being incorrectly
> >>> treated as a surrogate pair in the calculateUTF8Length method.
> >>>
> >>> Fixed this up and added some more test coverage.
> >>>
> >>> Closes #13
> >>>
> >>> (cherry picked from commit
> 7b9b516d445ab9e86a0313709c77218d901435b1)
> >>>
> >>> commit c2042d7d26c4383047dac2709d1a2effe0b11419
> >>> Author: Alan Conway 
> >>> Date:   Mon Mar 16 09:51:28 2015 -0400
> >>>
> >>> PROTON-839: Proton 0.9 RC 2 blocker -
> >>> proton_tests.utils.SyncRequestResponse
> >>>
> >>> Fix to reactor.py, check for lack of SSL domain.
> >>>
> >>> (cherry picked from commit
> e31df015a79d791e62caf9bef3f29bdfd77042ef)
> >>>
> >>>
> >>
>
>
>


Re: [RESULT] [VOTE]: Proton 0.9-rc-3

2015-03-27 Thread Rafael Schloming
I pushed the final bits this morning, both via a git tag and signed
tarballs in svn dist. (I still haven't quite figured out how to get signed
git tags to work for me, so it's just a regular tag at the moment.)

The web site update and announcement will probably come on Monday.

--Rafael


On Thu, Mar 26, 2015 at 4:20 PM, Ted Ross  wrote:

> Rafael,
>
> Do you have an ETA for the final bits?  We're anxious to build some
> downstream packages.
>
> -Ted
>
>
> On 03/22/2015 02:44 PM, Rafael Schloming wrote:
>
>> This vote passes with 8 binding +1's and no other votes. I will push the
>> final bits soon.
>>
>> --Rafael
>>
>> On Tue, Mar 17, 2015 at 9:42 AM, Rafael Schloming 
>> wrote:
>>
>>  Hi Everyone,
>>>
>>> Here's a quick respin of 0.9-rc-3. The only changes from rc-2 are exactly
>>> those two mentioned on the rc-2 vote thread. I've included them at the
>>> end
>>> for reference. You can find the source artifacts in the usual location:
>>>
>>>  https://people.apache.org/~rhs/qpid-proton-0.9-rc-3/
>>>
>>> Java binaries are here:
>>>
>>>  https://repository.apache.org/content/repositories/
>>> orgapacheqpid-1031
>>>
>>> Please check it out and register your vote:
>>>
>>> [   ] Yes, release Proton 0.9-rc-3 as 0.9 final
>>> [   ] No, because ...
>>>
>>> --Rafael
>>>
>>> ==
>>>
>>> commit 810088b14dedcd12a9474687ba9cd05fc8297188
>>> Author: Dominic Evans 
>>> Date:   Mon Mar 16 12:18:20 2015 +
>>>
>>>  PROTON-834: further UTF-8 encoder fixes
>>>
>>>  After commit c65e897 it turned out there were still some issues with
>>>  strings containing a codepoint >0xDBFF which was being incorrectly
>>>  treated as a surrogate pair in the calculateUTF8Length method.
>>>
>>>  Fixed this up and added some more test coverage.
>>>
>>>  Closes #13
>>>
>>>  (cherry picked from commit 7b9b516d445ab9e86a0313709c7721
>>> 8d901435b1)
>>>
>>> commit c2042d7d26c4383047dac2709d1a2effe0b11419
>>> Author: Alan Conway 
>>> Date:   Mon Mar 16 09:51:28 2015 -0400
>>>
>>>  PROTON-839: Proton 0.9 RC 2 blocker -
>>> proton_tests.utils.SyncRequestResponse
>>>
>>>  Fix to reactor.py, check for lack of SSL domain.
>>>
>>>  (cherry picked from commit e31df015a79d791e62caf9bef3f29b
>>> dfd77042ef)
>>>
>>>
>>>
>>


Items for the 0.9 Release notes/announcement

2015-03-24 Thread Rafael Schloming
I'm trying to put together a relatively complete set of changes and release
notes for 0.9. If there is any particular feature, be it new or some
existing behaviour change, that is worthy of being mentioned in the release
notes or release announcement, please follow up with a suitable blurb on
this thread.

--Rafael


[RESULT] [VOTE]: Proton 0.9-rc-3

2015-03-22 Thread Rafael Schloming
This vote passes with 8 binding +1's and no other votes. I will push the
final bits soon.

--Rafael

On Tue, Mar 17, 2015 at 9:42 AM, Rafael Schloming  wrote:

> Hi Everyone,
>
> Here's a quick respin of 0.9-rc-3. The only changes from rc-2 are exactly
> those two mentioned on the rc-2 vote thread. I've included them at the end
> for reference. You can find the source artifacts in the usual location:
>
> https://people.apache.org/~rhs/qpid-proton-0.9-rc-3/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1031
>
> Please check it out and register your vote:
>
> [   ] Yes, release Proton 0.9-rc-3 as 0.9 final
> [   ] No, because ...
>
> --Rafael
>
> ==
>
> commit 810088b14dedcd12a9474687ba9cd05fc8297188
> Author: Dominic Evans 
> Date:   Mon Mar 16 12:18:20 2015 +
>
> PROTON-834: further UTF-8 encoder fixes
>
> After commit c65e897 it turned out there were still some issues with
> strings containing a codepoint >0xDBFF which was being incorrectly
> treated as a surrogate pair in the calculateUTF8Length method.
>
> Fixed this up and added some more test coverage.
>
> Closes #13
>
> (cherry picked from commit 7b9b516d445ab9e86a0313709c77218d901435b1)
>
> commit c2042d7d26c4383047dac2709d1a2effe0b11419
> Author: Alan Conway 
> Date:   Mon Mar 16 09:51:28 2015 -0400
>
> PROTON-839: Proton 0.9 RC 2 blocker -
> proton_tests.utils.SyncRequestResponse
>
> Fix to reactor.py, check for lack of SSL domain.
>
> (cherry picked from commit e31df015a79d791e62caf9bef3f29bdfd77042ef)
>
>


[VOTE]: Proton 0.9-rc-3

2015-03-16 Thread Rafael Schloming
Hi Everyone,

Here's a quick respin of 0.9-rc-3. The only changes from rc-2 are exactly
those two mentioned on the rc-2 vote thread. I've included them at the end
for reference. You can find the source artifacts in the usual location:

https://people.apache.org/~rhs/qpid-proton-0.9-rc-3/

Java binaries are here:

https://repository.apache.org/content/repositories/orgapacheqpid-1031

Please check it out and register your vote:

[   ] Yes, release Proton 0.9-rc-3 as 0.9 final
[   ] No, because ...

--Rafael

==

commit 810088b14dedcd12a9474687ba9cd05fc8297188
Author: Dominic Evans 
Date:   Mon Mar 16 12:18:20 2015 +

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method.

Fixed this up and added some more test coverage.

Closes #13

(cherry picked from commit 7b9b516d445ab9e86a0313709c77218d901435b1)

commit c2042d7d26c4383047dac2709d1a2effe0b11419
Author: Alan Conway 
Date:   Mon Mar 16 09:51:28 2015 -0400

PROTON-839: Proton 0.9 RC 2 blocker -
proton_tests.utils.SyncRequestResponse

Fix to reactor.py, check for lack of SSL domain.

(cherry picked from commit e31df015a79d791e62caf9bef3f29bdfd77042ef)


Re: [VOTE] Proton 0.9 RC 2

2015-03-16 Thread Rafael Schloming
I'm happy to do a quick turnaround on RC 3 if you/he wants to pull the fix
onto the release branch.

--Rafael

On Tue, Mar 17, 2015 at 5:54 AM, Robbie Gemmell 
wrote:

> Dominic has posted a further change for PROTON-834 that fixes a corner
> case not seen when making an earlier change in 0.9.
>
> I think we should include it if doing an RC3 to pick up the change
> Alan also made earlier.
>
> Robbie
>
> On 16 March 2015 at 12:20, Robbie Gemmell 
> wrote:
> > [ +1 ] Yes, release Proton 0.9-rc-2 as 0.9 final
> >
> > I tested out RC2 as follows:
> > - Checked license/notice files present.
> > - Verified sigs and checksums match.
> > - Built everything using cmake and ran the tests.
> > - Verified building/running the Qpid C++ broker against proton-c.
> > - Built proton-j and ran the tests using maven.
> > - Verified using proton-j in the build+tests for the new Qpid JMS client.
> >
> > Robbie
> >
> > On 16 March 2015 at 05:04, Rafael Schloming  wrote:
> >> Hi Everyone,
> >>
> >> I've posted 0.9 RC 2 in the usual places. The source artifacts are
> >> available here:
> >>
> >> https://people.apache.org/~rhs/qpid-proton-0.9-rc-2/
> >>
> >> Java binaries are available here:
> >>
> >>
> https://repository.apache.org/content/repositories/orgapacheqpid-1029
> >>
> >> The changes from RC 1 are very minor, I've appended the commit log if
> >> you're interested in the details. Please have a look and register your
> vote:
> >>
> >> [   ] Yes, release Proton 0.9-rc-2 as 0.9 final
> >> [   ] No, because ...
> >>
> >> --Rafael
> >>
> >> 
> >>
> >> commit b6506126afbc51bab4c97bf847f2f07f2cc4e6e2
> >> Author: Rafael Schloming 
> >> Date:   Mon Mar 16 17:42:56 2015 +1300
> >>
> >> Release 0.9-rc-2
> >>
> >> commit 0fbe80e2f81a1e8a2df55f6221b987391d5dc336
> >> Author: Bozo Dragojevic 
> >> Date:   Fri Mar 13 14:01:35 2015 +0100
> >>
> >> PROTON-838: proton-hawtdispatch cannot connect with SSL
> >>
> >> Remove one of calls to transport.connecting() as this causes
> >> an assert with SslTransport
> >>
> >> (cherry picked from commit f8ca35f3e007b99e0a5365e154e067840adcefb0)
> >>
> >> commit ea01c014fedd8dff134131f93b7d957aabac70ec
> >> Author: Alan Conway 
> >> Date:   Wed Mar 11 11:17:23 2015 -0400
> >>
> >> PROTON-836: Missing import SSLUnavailable in reactor.py.
> >>
> >> commit 59efe4f6b2ea597332a2191665c498dbf44b8bd8
> >> Author: Gordon Sim 
> >> Date:   Mon Mar 9 14:37:35 2015 +
> >>
> >> NO-JIRA: lack of ssl support should not prevent Container being used
> >>
> >> commit 27851a6d979342178817734423b83b37840a
> >> Author: Robert Gemmell 
> >> Date:   Mon Mar 9 19:36:06 2015 +
> >>
> >> add missing NOTICE file
> >>
> >> cherry-picked from 36e32d2309bb0a96e63e9874758de8906a22ec69
> >>
> >> commit 77403b3c091b2c6dfcc766d978130a34f765ee29
> >> Author: Kenneth Giusti 
> >> Date:   Mon Mar 9 13:38:56 2015 -0400
> >>
> >> NO-JIRA: fix documentation build
> >>
> >> (cherry picked from commit bc2b630eb969710b04a861797567ab2dc368020a)
>
> -
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
>
>


[VOTE] Proton 0.9 RC 2

2015-03-15 Thread Rafael Schloming
Hi Everyone,

I've posted 0.9 RC 2 in the usual places. The source artifacts are
available here:

https://people.apache.org/~rhs/qpid-proton-0.9-rc-2/

Java binaries are available here:

https://repository.apache.org/content/repositories/orgapacheqpid-1029

The changes from RC 1 are very minor, I've appended the commit log if
you're interested in the details. Please have a look and register your vote:

[   ] Yes, release Proton 0.9-rc-2 as 0.9 final
[   ] No, because ...

--Rafael



commit b6506126afbc51bab4c97bf847f2f07f2cc4e6e2
Author: Rafael Schloming 
Date:   Mon Mar 16 17:42:56 2015 +1300

Release 0.9-rc-2

commit 0fbe80e2f81a1e8a2df55f6221b987391d5dc336
Author: Bozo Dragojevic 
Date:   Fri Mar 13 14:01:35 2015 +0100

PROTON-838: proton-hawtdispatch cannot connect with SSL

Remove one of calls to transport.connecting() as this causes
an assert with SslTransport

(cherry picked from commit f8ca35f3e007b99e0a5365e154e067840adcefb0)

commit ea01c014fedd8dff134131f93b7d957aabac70ec
Author: Alan Conway 
Date:   Wed Mar 11 11:17:23 2015 -0400

PROTON-836: Missing import SSLUnavailable in reactor.py.

commit 59efe4f6b2ea597332a2191665c498dbf44b8bd8
Author: Gordon Sim 
Date:   Mon Mar 9 14:37:35 2015 +

NO-JIRA: lack of ssl support should not prevent Container being used

commit 27851a6d979342178817734423b83b37840a
Author: Robert Gemmell 
Date:   Mon Mar 9 19:36:06 2015 +

add missing NOTICE file

cherry-picked from 36e32d2309bb0a96e63e9874758de8906a22ec69

commit 77403b3c091b2c6dfcc766d978130a34f765ee29
Author: Kenneth Giusti 
Date:   Mon Mar 9 13:38:56 2015 -0400

NO-JIRA: fix documentation build

(cherry picked from commit bc2b630eb969710b04a861797567ab2dc368020a)


Re: VOTE: Release Proton 0.9-rc-1 as 0.9 final

2015-03-12 Thread Rafael Schloming
Hi Everyone,

FYI, I'm going to be out of contact for a few days again. So far the two
issues discovered in this RC are quite isolated, so please continue to test
RC 1 and make sure any new fixes go on the 0.9 branch. I will spin another
RC off of the branch as soon as I'm back.

Thanks,

--Rafael


On Tue, Mar 10, 2015 at 12:57 AM, Rafael Schloming  wrote:

> Hi Everyone,
>
> I've posted 0.9-rc-1 in the usual places. Please have a look and register
> your vote:
>
> Source code can be found here:
>
> http://people.apache.org/~rhs/qpid-proton-0.9-rc-1/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1025
>
> [   ] Yes, release Proton 0.9-rc-1 as 0.9 final
> [   ] No, because ...
> --Rafael
>


Re: Review Request 31681: suggestion: rename proton.utils to proton.sync?

2015-03-10 Thread Rafael Schloming
Sorry to follow up on this late. I'm currently traveling and have spotty
connectivity.

To be honest I haven't been able to spare the time to looked very closely
at the code in question until recently, so most of my comments are less
about the name and more a rather late set of questions around the
intentions behind the code itself.

My tour of the code started with an attempt to figure out what the term
"sync" was intended to mean since "synchronous" is commonly used to refer
to both a blocking programming style and a request/response messaging
pattern, but the two aren't necessarily correlated. (Blocking APIs can do
asynchronous messaging, and non-blocking APIs can do synchronous
messaging.) I expected a closer look at the code to clear up the ambiguity
for me, but it actually doesn't. The code uses the term Blocking throughout
suggesting that it refers to the programming style, yet it also seems
(possibly intentionally) limited to purely synchronous scenarios, e.g.
there is a window on the receive side, but send is hardcoded to have a
window of 1 and just from looking at the code it is hard to tell whether
this is an intentional design choice or whether the initial use case simply
doesn't happen to include send windows larger than 1.

Looking at the SyncRequestResponse part, it's also possible I have the
wrong end of the stick and that all the various Blocking* classes are
mostly implementation details and the real point of the thing is to be an
rpc tool.

I think some description of the intended scope of the API would be helpful
in providing a less provisional name. As it stands this code could be
anything from the very early start of a general purpose blocking API for
proton, to a simple convenience API for one particular scenario. Where it
falls on this spectrum would significantly influence both its name and
maturity level.

--Rafael


On Fri, Mar 6, 2015 at 8:56 AM, Justin Ross  wrote:

>This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31681/
>
> Ship it!
>
> I like the name.
>
>
> - Justin Ross
>
> On March 3rd, 2015, 2:39 p.m. UTC, Gordon Sim wrote:
>   Review request for qpid, Alan Conway, Justin Ross, and Rafael Schloming.
> By Gordon Sim.
>
> *Updated March 3, 2015, 2:39 p.m.*
>  *Repository: * qpid-proton-git
> Description
>
> In order to convey a little bit more about what the contents do. Just a 
> suggestion.
>
>   Diffs
>
>- examples/python/helloworld_blocking.py (d9a24a9)
>- examples/python/sync_client.py (82cd85f)
>- proton-c/bindings/python/CMakeLists.txt (7f54033)
>- proton-c/bindings/python/proton/sync.py (PRE-CREATION)
>- proton-c/bindings/python/proton/utils.py (d5e2e0a)
>- tests/python/proton_tests/sync.py (PRE-CREATION)
>- tests/python/proton_tests/utils.py (727b586)
>
> View Diff <https://reviews.apache.org/r/31681/diff/>
>


Re: VOTE: Release Proton 0.9-rc-1 as 0.9 final

2015-03-09 Thread Rafael Schloming
FYI, unlike previous releases I've created a 0.9 branch so that work can
continue on trunk without impacting the release. Please ensure that any
fixes intended for the release actually end up on the 0.9 release branch.

--Rafael

On Mon, Mar 9, 2015 at 7:57 AM, Rafael Schloming  wrote:

> Hi Everyone,
>
> I've posted 0.9-rc-1 in the usual places. Please have a look and register
> your vote:
>
> Source code can be found here:
>
> http://people.apache.org/~rhs/qpid-proton-0.9-rc-1/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1025
>
> [   ] Yes, release Proton 0.9-rc-1 as 0.9 final
> [   ] No, because ...
> --Rafael
>


Re: VOTE: Release Proton 0.9-rc-1 as 0.9 final

2015-03-09 Thread Rafael Schloming
Can you pull this over to the 0.9 branch?

--Rafael

On Mon, Mar 9, 2015 at 10:39 AM, Gordon Sim  wrote:

> On 03/09/2015 02:14 PM, Ken Giusti wrote:
>
>> Additionally, the following python unit tests fail unless the openssl
>> libraries are installed:
>>
>> proton_tests.engine.ServerTest.testIdleTimeout
>> proton_tests.engine.ServerTest.testKeepalive
>> proton_tests.messenger.IdleTimeoutTest.testIdleTimeout
>> proton_tests.utils.SyncRequestResponseTest.test_request_response
>>
>> All the other ssl tests raise the Skip exception, which I believe should
>> be done for the above as well.
>>
>> All failures have a traceback similar to this:
>>
>> 1: Error during test:  Traceback (most recent call last):
>> 1: File 
>> "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/tests/python/proton-test",
>> line 355, in run
>> 1:   phase()
>> 1: File 
>> "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/tests/python/proton_tests/engine.py",
>> line 1882, in testIdleTimeout
>> 1:   server = common.TestServer(idle_timeout=idle_timeout)
>> 1: File 
>> "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/tests/python/proton_tests/common.py",
>> line 127, in __init__
>> 1:   self.reactor = Container(self)
>> 1: File "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/proton-c/
>> bindings/python/proton/reactor.py", line 588, in __init__
>> 1:   self.ssl = SSLConfig()
>> 1: File "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/proton-c/
>> bindings/python/proton/reactor.py", line 566, in __init__
>> 1:   self.client = SSLDomain(SSLDomain.MODE_CLIENT)
>> 1: File "/home/kgiusti/Downloads/qpid-proton-0.9-rc-1/proton-c/
>> bindings/python/proton/__init__.py", line 3371, in __init__
>> 1:   raise SSLUnavailable()
>> 1:   SSLUnavailable
>>
>> I suspect the reactor shouldn't fail in these cases if SSL is not
>> available.
>>
>
> I have checked in a fix for that to master.
>
>
>
> -
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
>
>


Re: VOTE: Release Proton 0.9-rc-1 as 0.9 final

2015-03-09 Thread Rafael Schloming
Can you pull this over to the 0.9 branch?

--Rafael

On Mon, Mar 9, 2015 at 1:41 PM, Ken Giusti  wrote:

> FWIW:  pushed a fix to these doc errors:
>
>
> https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;a=commit;h=bc2b630eb969710b04a861797567ab2dc368020a
>
>
>
>
> - Original Message -
> > From: "Ken Giusti" 
> > To: us...@qpid.apache.org
> > Cc: proton@qpid.apache.org
> > Sent: Monday, March 9, 2015 9:13:07 AM
> > Subject: Re: VOTE: Release Proton 0.9-rc-1 as 0.9 final
> >
> > Anyone else getting the following errors when building the docs?
> >
> >
> > ;; This buffer is for notes you don't want to save, and for Lisp
> evaluation.
> > ;; If you want to create a file, visit that file with C-x C-f,
> > ;; then enter the text in that file's own buffer.
> >
> > Generating example index...
> > finalizing index lists...
> > lookup cache used 974/65536 hits=10705 misses=982
> > finished...
> > Built target docs-c
> >   [
> >   Error: Could not find a file or object named proton/handlers.py
> >   Error: Could not find a file or object named proton/reactor.py
> >   Error: Could not find a file or object named proton/utils.py
> >   Error: Could not find a file or object named proton/wrapper.py
> >   [..
> >
> +
> > | File
> > |
> /home/kgiusti/work/proton/0.9rc1/qpid-proton-0.9-rc-1/proton-c/bindings/python/proton/__init__.py,
> > | line 3778, in proton.Url
> > |   Warning: Line 3781: Improper paragraph indentation.
> > |
> >
>  
> [........]
> > Built target docs-py
> >
> >
> >
> >
> > - Original Message -
> > > From: "Rafael Schloming" 
> > > To: proton@qpid.apache.org, us...@qpid.apache.org
> > > Sent: Monday, March 9, 2015 7:57:53 AM
> > > Subject: VOTE: Release Proton 0.9-rc-1 as 0.9 final
> > >
> > > Hi Everyone,
> > >
> > > I've posted 0.9-rc-1 in the usual places. Please have a look and
> register
> > > your vote:
> > >
> > > Source code can be found here:
> > >
> > > http://people.apache.org/~rhs/qpid-proton-0.9-rc-1/
> > >
> > > Java binaries are here:
> > >
> > >
> https://repository.apache.org/content/repositories/orgapacheqpid-1025
> > >
> > > [   ] Yes, release Proton 0.9-rc-1 as 0.9 final
> > > [   ] No, because ...
> > > --Rafael
> > >
> >
> > --
> > -K
> >
>
> --
> -K
>


VOTE: Release Proton 0.9-rc-1 as 0.9 final

2015-03-09 Thread Rafael Schloming
Hi Everyone,

I've posted 0.9-rc-1 in the usual places. Please have a look and register
your vote:

Source code can be found here:

http://people.apache.org/~rhs/qpid-proton-0.9-rc-1/

Java binaries are here:

https://repository.apache.org/content/repositories/orgapacheqpid-1025

[   ] Yes, release Proton 0.9-rc-1 as 0.9 final
[   ] No, because ...
--Rafael


Re: 0.9 release schedule

2015-03-09 Thread Rafael Schloming
Ok, I'll push out a 0.9 RC ASAP.

On the general topic of API stability, I think the key measure of
"stability" that I would personally like to see (be it 0.9 or 0.10) is not
that we somehow freeze APIs and guarantee to never change them, but rather
that we change them in ways that are backwards compatible. This doesn't
limit us as much as you might think, it just means we need to put in a bit
more work for certain changes, e.g. start using feature macros. The point
of 0.9 was to get as many changes out of the way as possible before
incurring the extra overhead associated with maintaining full backwards
compatibility.

Once we are satisfied we can maintain this guarantee, I think we should go
to 1.0 rather than sticking with the perpetual 0.x theme.

As for newly introduced APIs, I think once we hit 1.0 we probably need to
put some process in place around bringing new APIs into the codebase.
Something that makes it clear to users whether something is at that 1.x
level or not.

--Rafael


On Fri, Mar 6, 2015 at 11:58 AM, Alan Conway  wrote:

> On Fri, 2015-03-06 at 08:50 -0500, Ken Giusti wrote:
> >
> > - Original Message -
> > > From: "Andrew Stitcher" 
> > > To: proton@qpid.apache.org
> > > Sent: Monday, March 2, 2015 8:46:10 PM
> > > Subject: Re: 0.9 release schedule
> > >
> > > On Mon, 2015-03-02 at 20:00 +, Gordon Sim wrote:
> > > > On 03/02/2015 07:07 PM, Rafael Schloming wrote:
> > > > > Hi Everyone,
> > > > >
> > > > > I'd like to propose spinning the first beta (or possibly just RC)
> for 0.9
> > > > > sometime next week. We've been using alphas to get some early eyes
> on
> > > > > some
> > > > > of the new APIs in this release. I think when Andrew's SASL work
> lands
> > > > > there will be no remaining work for 0.9 in the category of major
> API
> > > > > changes/improvements. That should hopefully put us in a position to
> > > > > quickly
> > > > > test and stabilize things and get 0.9 out the door.
> > > >
> > > > The 0.9 release was originally scheduled for the end of 2014. We've
> had
> > > > three alphas already. To me, its already too late in the cycle for
> > > > 'major API changes/improvements'.
> > > >
> > > > As mentioned on the other thread, in my opinion it would be better to
> > > > land Andrew's work during 0.10 allowing for less rushed review,
> > > > evaluation and testing.
> > >
> > > I'm happy to let the new API work be more carefully reviewed. The only
> > > reason to me to get it in 0.9 is that 0.9 was intended to be a point
> for
> > > API stability from then on. And the transport API is a significant
> > > change in the engine API. Pushing it off means allowing 0.10 to break
> > > API compatibility.
> > >
> > > Andrew
> > >
> >
> > In a general sense, how can we be comfortable introducing an API in a
> 0.x release, and consider it "stable"?   Wouldn't it make sense to expose
> the *completed* API for at least one release before we propose stabilizing
> it?
> >
> > For example, the reactor API is new in 0.9, but until 0.9 is released I
> suspect that this API won't be fully explored by users.  And of course we
> won't uncover any potential gotchas with the new API until it has seen some
> adoption.  At that point we may need to change/enhance the api.
> >
> > Seems to me we should get the reactor API out in 0.9, consider it
> complete, and stabilize that *portion* of the API for 0.10 - possibly
> longer given the scope of that API.  The SASL API would then be a candidate
> for stabilization in 0.10 - assuming it has been completed in time - with
> 0.11 being a realistic target for considering the SASL API stable.
> >
> > In other words, when the project considers an API to be complete (from
> the developer's point of view), then there should be at least one release
> that contains that API before we consider it a candidate for stabilization.
> >
> > Just MHO...
> >
>
>
> Hear hear! This is still a young and evolving project, we need to
> release our developments *quickly* so real developers can use them and
> tell us how they need fixing. We are now in SEVERE feature-creep mode
> with this release. Dispatch is dependent on unreleased features and is
> suffering as a consequence. I am sure there are others in the same boat.
> It is not good to make early adopters suffer! Lets release what we have
> now, and then do another release *quickly* for things that are not yet
> finished but are important  (e.g. the transport changes).
>
> Apart from being a problem for users, slow releases make developers want
> to stuff all their new work into the current release because they fear
> there won't be another release for ages, and it becomes a
> self-fulfilling prophecy. Lets nip this in the bud and get back to a
> healthy schedule of regular, rapid releases.
>
> Cheers,
> Alan.
>
>


0.9 release schedule

2015-03-02 Thread Rafael Schloming
Hi Everyone,

I'd like to propose spinning the first beta (or possibly just RC) for 0.9
sometime next week. We've been using alphas to get some early eyes on some
of the new APIs in this release. I think when Andrew's SASL work lands
there will be no remaining work for 0.9 in the category of major API
changes/improvements. That should hopefully put us in a position to quickly
test and stabilize things and get 0.9 out the door.

Please plan any work you're doing accordingly, it would be good to get 0.9
closed down quickly.

FYI, I am currently traveling, so I'll only be online intermittently over
the next two weeks. I will be checking in next week though and assuming
there are no issues, I will spin the release as soon as Andrew's work has
landed.

--Rafael


Re: pn_incref/pn_decref and Ruby

2015-03-02 Thread Rafael Schloming
On Mon, Mar 2, 2015 at 1:12 PM, Darryl L. Pierce  wrote:

> Do I need to use reference counting from within the Proton library even
> though I'm writing in a language that doesn't use them?
>
> I asked because I see calls to pn_incref/pn_decref in the Python
> bindings.
>

You are implicitly using reference counting whenever you use the C API. Any
proton object constructed via the C API has an initial reference count of
1, and most "pn__free" methods are just aliases for pn_decref.
Whether or not you need to explicitly use pn_incref/pn_decref at any point
other than these times depends on your overall scheme for managing memory.
If two ruby objects will ever share a reference to the same C object, then
explicitly using pn_incref/pn_decref is likely the most sensible way to
manage things. If however you are maintaining a one-to-one mapping between
ruby wrapper and C object, then you may not have to use pn_incref/decref
explicitly.

--Rafael


Re: Proposed SASL changes (API and functional)

2015-02-26 Thread Rafael Schloming
This is from Andrew's wiki comment. Sorry to paste it back to the list, but
I'm having some difficulty commenting there:

>
>1. Setters with no getters
>Philosophically I don't agree that you need to make all properties
>read/write. I see no particular reason to make these properties readable
>since they will never change once they are set, or in the case of the
>password should actually not be accessible once set (because the
>implementation *should* be scrubbing the bytes from memory after use).
>In my view if the application needs the value again it already has it.
>In the case of the read-only property authenticated user I definitely
>think that needs to be read only.
>Having said that, I don't feel that strongly about getters for the
>client username and hostname.
>
> There's actually an important point here worth noting. With reactive use,
I don't think it's true, pragmatically speaking, that the application has
the value again when it needs it. When your primary means of operating on a
connection is through handling events, the only state you have easy access
to is what is provided by those events. Taking your suggestion, if I wanted
to do something simple like log a debug statement from an event handler and
include the hostname and/or username of the connection in that info, my
only recourse would be to malloc some sort of ancillary struct and attach
it to the connection and fill it in with the very same hostname that the
connection is holding internally, or alternatively access some sort of
global state that holds a map from the connection back to that same
information. If your point is that this is possible then of course that's
true, but it doesn't seem at all reasonable.

>
>1. inconsistency with existing use of "remote" in API
>I take your point - I'm happy to remove "remote" from the API name -
>would "connected" be all right? pn_transport_set_hostname() just
>doesn't seem specific enough to me - it might just as well be telling the
>transport what *our* hostname is.
>2. Redundancy of pn_connection_set_hostname() and
>pn_transport_set_remote_hostname()
>Yes these are definitely redundant, and I would need to deprecate the
>connection version and set it from the transport when binding them together
>- good catch.
>The transport version must be primary, as (in principle at least, if
>not in the current implementation) you don't need the connection object
>until you have authenticated the transport, and authentication and
>encryption may need to know the hostname you are connecting to. I think it
>would have to be an error to rebind (on reconnect) a connection with a
>different hostname than the transport hostname.
>
> This isn't consistent with how the connection and transport are actually
used. With the reactive API, the user creates the connection endpoint first
and configures it with all the relevant details that it cares about. The
transport isn't created at all until the client actually opens the
connection (which could be somewhere completely different from where it
configures the connection). It's also important to note that the user
doesn't actually create the transport at all. The default handlers do this
when the PN_CONNECTION_LOCAL_OPEN event occurs. The users don't even need
to be aware that a transport object exists at all if they don't care to
customize it. This is a nice property and would be difficult to maintain if
you start pushing connection level configuration like hostname into the
transport.

I think if you flip things around it actually makes more sense. As a server
you are going to have a transport prior to having a connection, and in this
case you want to access the hostname-that-your-remote-peer desires for
vhost purposes, but it makes no sense to actually set it. As a client, a
transport is pretty much useless until it is bound to a connection, as you
can't safely do much more than send the protocol header, so the natural
sequence is to create the connection first and set the hostname you want to
connect to, and not worry about the Transport.

>
>1. Having a separate set_user/set_password
>That would make sense. However from this conversation I'm wondering
>actually if we should more carefully distinguish the client and server
>ends. And so have a client only API to set user/password and a server only
>API to extract the authenticated username.
>
> So in conclusion how about:
>
>- Changing pn_transport_set_remote_hostname() →
>pn_transport_set_connect_hostname() (connect/connected/connection?)
>- Adding pn_transport_get_connect_hostname()
>(connect/connected/connection?)
>- Deprecating pn_connection_set/get_hostname() in favour of
>pn_transport_set/get_connect_hostname()
>Actually changing the pn_transport_bind() code would be required too.
>- Removing pn_transport_set_user_password() and pn_transport_get_user()
>

Re: I think that's a blocker...

2015-02-26 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 9:23 PM, Cliff Jansen  wrote:

> Two usage cases, very desirable in Proton, happen to be mostly trivial
> on POSIX but mostly non-trivial on Windows: multi-threading and
> external loops.  Proton io and selector classes are POSIX-y in look
> and feel, but IO completion ports are very different and can mimic the
> POSIX functionality only so far.  The documented restrictions I wrote
> (PROTON-668) tries to curb expectations for cross platform
> functionality (but still allow use cases like Dispatch).
>
> This, however, is not a cross platform issue.  This particular problem
> is confined to user space and affects all platforms equally.
> Presumably the API needs fixing or we agree to take a step backwards.
>

There is a pretty severe limitation without this fix since there is no way
for the API to communicate socket errors back to the user. Also, even if we
roll back the fix, the wouldblock flag is accessed with exactly the same
pattern as the error slot. I would assume this means there is still an
issue even with the fix rolled back: two different threads could
overwrite the wouldblock flag, and one or both of them could then get
the wrong value, producing a hang or a spurious error.

Note that Bozo previously pointed out (Proton mailing list) that the
> pn_io_t API had threading inconsistencies with pn_io_error() and
> pn_io_wouldblock().  Perhaps a pn_selectable_t should be passed in
> instead of a socket parameter, or proton should maintain a map of
> errors and wouldblock state for sockets until they are pn_close'd (as
> the Windows code already does for completion port descriptor state).
> The former would be more consistent with proton overall, the latter
> would require some user space locking.  Another possibility could be
> to pass in pointers to error/wouldblock state as part of the io call.
>

Passing in specific pointers to error and/or wouldblock state seems like it
is less flexible than passing in a pointer to something more abstract that
can contain not only the error state, but also whatever other thread local
state makes sense. My assumption is/was that pn_io_t provides this context.


> While I still think this is not a Windows issue, and the documentation
> is supposed to reflect the Dispatch pattern and not handcuff it, here
> is more about the pn_io_t implementation:
>
>   Global state is in struct iocp_t, per socket state is in struct
> iocpdesc_t
>   (see iocp.h, the OS looks after this stuff in POSIX)
>
>   It has to set up and tear down the Windows socket system.
>   It has to "know" if the application is using an external loop or
> pn_selector_select
>   Setup the completion port subsystem (unless external loop)
>   It has to find IOCP state for each passed in socket
>   Manage multiple concurrent io operations per socket (N writes + 1 read)
>   Notify PN_READABLE, PN_WRITABLE, etc changes to the selector (if
> any) for each call
>   Do an elaborate tear down on pn_io_free (drain completions, force
> close dangling sockets)
>
> Regarding the documentation, I looked at Dispatch, which had been
> using Proton in a multi-threaded manner for some time with
> considerable success.  The old driver.c (now deprecated) allowed
> simultaneous threads to do
>
>   pn_connector_process()  <- but no two threads sharing a
> connector/transport/socket
>   pn_driver_wait_n() combined with pn_connector_next() <- only one
> thread waiting/choosing at a time
>   pn_driver_wakeup() <- any thread, any time, to unstick things
>   everything else (listen accept connect) considered non-thread safe
>
> which provides plenty of parallelism if you have more than one connection.
>
> The documentation I wrote tried to say that you could do that much on
> any platform, but no more (without risking undefined behavior).
> Things (and the documentation) get further complicated by supporting
> external loops, which prevents the use of IO completion ports
> completely for a given pn_io_t and uses a different set of system
> calls.
>
> Perhaps the doc restrictions could be summarized as:
>
>   One pn_io_t/pn_selector_t, one thread -> no restrictions
>   One pn_io_t/pn_selector_t, multi threads -> limited thread safety
> (Dispatch)
>   One pn_io_t, no pn_selector_t, external loop, one thread -> no
> restrictions
>   One pn_io_t, no selector, external loop, multi threads -> ???
>   multiple pn_io_t: doable, but sockets must stick to one pn_io_
>
>
> Some difficulties you might not expect: Linux doesn't care if sockets
> move between selectors, or if one thread is reading on a socket while
> another is writing to the same one.  Simple things like this would
> have major design and performance implications for Proton on Windows.
>

It sounds like one way or another we need at least some design changes. I
don't think it's workable to have overlapping/close but distinct semantics
for the API on different platforms (e.g. you can move sockets on one
platform but not on another). I'm starting to think…

ApacheCon North America 2015

2015-02-25 Thread Rafael Schloming
Hi Everyone,

I'll be attending ApacheCon in April. I was wondering if there are others
that plan to go and if so would there be any interest in having an informal
BOF/hackathon/get-together either during the conference or after hours?

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
Maybe my head is just thick today, but even staring at the docs a couple
times and reading through what you have below, I can't say I quite
understand what you're going for. What are the actual constraints for the
windows APIs and what is the heavyweight stuff pn_io_t is doing?

--Rafael

On Wed, Feb 25, 2015 at 1:02 PM, Cliff Jansen  wrote:

> A pn_io_t is heavyweight in Windows, because it has an opposite usage
> pattern and moves a lot of kernel stuff into user space compared to
> POSIX.
>
> The quoted documentation was my attempt to capture the Dispatch usage
> pattern, which I assumed would be typical of an application trying to
> spread proton engine use between threads: basically single access to
> pn_selector_select() via a condition variable, and no more than one
> thread working on a given selectable (using proton engine
> encoding/decoding etc., not just io).
>
> In the end, we could just add a zillion locks into the Windows code
> and make it look like it is as thread safe as the POSIX counterpart
> (which has implicit safety when it does in the kernel what Windows is
> doing in user space), but that would defeat using IO completion ports
> at all.  The documentation was my attempt of balancing performance
> with sophisticated proton usage on multiple platforms.
>
> Note that there is only one pn_selector_t allowed per pn_io_t (a very
> strong Windows completion port requirement, and sockets are bound to a
> single completion port for life).
>
> On Wed, Feb 25, 2015 at 8:52 AM, Rafael Schloming 
> wrote:
> > On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross  wrote:
> >
> >> Would it be safe to assume that any operations on driver->io are not
> >> thread safe?
> >>
> >> Dispatch is a multi-threaded application.  It looks to me as though
> >> io->error is a resource shared across the threads in an unsafe way.
> >>
> >
> > Interesting... so this is what the docs say:
> >
> > /**
> >  * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
> >  * pn_io_t object may have zero or one pn_selector_t selectors
> >  * associated with it (see ::pn_io_selector()).  If one is associated,
> >  * all the pn_socket_t handles managed by a pn_io_t must use that
> >  * pn_selector_t instance.
> >  *
> >  * The pn_io_t interface is single-threaded. All methods are intended
> >  * to be used by one thread at a time, except that multiple threads
> >  * may use:
> >  *
> >  *   ::pn_write()
> >  *   ::pn_send()
> >  *   ::pn_recv()
> >  *   ::pn_close()
> >  *   ::pn_selector_select()
> >  *
> >  * provided at most one thread is calling ::pn_selector_select() and
> >  * the other threads are operating on separate pn_socket_t handles.
> >  */
> >
> > I think this has been somewhat modified by the constraints from the
> windows
> > implementation, and I'm not sure I understand completely what the
> > constraints are there, or entirely what is being described above, but on
> > the posix front, the pn_io_t is little more than just a holder for an
> error
> > slot, and you should have one of these per thread. It shouldn't be a
> > problem to use send/recv/etc from multiple threads though so long as you
> > pass in the pn_io_t from the current thread.
> >
> > --Rafael
>


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 12:48 PM, Ted Ross  wrote:

>
>
> On 02/25/2015 11:52 AM, Rafael Schloming wrote:
>
>> On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross  wrote:
>>
>>  Would it be safe to assume that any operations on driver->io are not
>>> thread safe?
>>>
>>> Dispatch is a multi-threaded application.  It looks to me as though
>>> io->error is a resource shared across the threads in an unsafe way.
>>>
>>>
>> Interesting... so this is what the docs say:
>>
>> /**
>>   * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
>>   * pn_io_t object may have zero or one pn_selector_t selectors
>>   * associated with it (see ::pn_io_selector()).  If one is associated,
>>   * all the pn_socket_t handles managed by a pn_io_t must use that
>>   * pn_selector_t instance.
>>   *
>>   * The pn_io_t interface is single-threaded. All methods are intended
>>   * to be used by one thread at a time, except that multiple threads
>>   * may use:
>>   *
>>   *   ::pn_write()
>>   *   ::pn_send()
>>   *   ::pn_recv()
>>   *   ::pn_close()
>>   *   ::pn_selector_select()
>>   *
>>   * provided at most one thread is calling ::pn_selector_select() and
>>   * the other threads are operating on separate pn_socket_t handles.
>>   */
>>
>
> I claim that the commit-in-question violates the text above.  Calls to
> pn_send() and pn_recv() are no longer thread-safe because they now use the
> shared error record.
>

You could be right. I'm not entirely sure how to interpret the above text.
I don't know that I would necessarily consider pn_send/pn_recv to be
"methods" of pn_io_t.


>
>> I think this has been somewhat modified by the constraints from the
>> windows
>> implementation, and I'm not sure I understand completely what the
>> constraints are there, or entirely what is being described above, but on
>> the posix front, the pn_io_t is little more than just a holder for an
>> error
>> slot, and you should have one of these per thread. It shouldn't be a
>> problem to use send/recv/etc from multiple threads though so long as you
>> pass in the pn_io_t from the current thread.
>>
>
> It's not desirable to allocate sockets to threads up front (i.e. partition
> the set of sockets into per-thread slots).  I know you didn't say that was
> needed but it's what I infer from the docs for pn_io_t.
>

I don't think the posix implementation requires this.


> Assuming, as you suggest, that pn_io_t is nothing more than a
> thread-specific error notepad seems like a recipe for future disaster
> because pn_io_t is clearly intended to be more than that.
>

It may not work out so well on windows, I honestly don't know what the
situation is there, but certainly for posix systems I think we need
*something* in this area to function as a context that can be associated
with thread-or-smaller granularities. Having some way to return error
information is just one example of a useful context to be able to use at
thread or smaller granularities. If the windows I/O APIs require a
heavier-weight interface, then perhaps we need to factor it into two
different parts.

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 10:49 AM, Ted Ross  wrote:

> Would it be safe to assume that any operations on driver->io are not
> thread safe?
>
> Dispatch is a multi-threaded application.  It looks to me as though
> io->error is a resource shared across the threads in an unsafe way.
>

Interesting... so this is what the docs say:

/**
 * A ::pn_io_t manages IO for a group of pn_socket_t handles.  A
 * pn_io_t object may have zero or one pn_selector_t selectors
 * associated with it (see ::pn_io_selector()).  If one is associated,
 * all the pn_socket_t handles managed by a pn_io_t must use that
 * pn_selector_t instance.
 *
 * The pn_io_t interface is single-threaded. All methods are intended
 * to be used by one thread at a time, except that multiple threads
 * may use:
 *
 *   ::pn_write()
 *   ::pn_send()
 *   ::pn_recv()
 *   ::pn_close()
 *   ::pn_selector_select()
 *
 * provided at most one thread is calling ::pn_selector_select() and
 * the other threads are operating on separate pn_socket_t handles.
 */

I think this has been somewhat modified by the constraints from the windows
implementation, and I'm not sure I understand completely what the
constraints are there, or entirely what is being described above, but on
the posix front, the pn_io_t is little more than just a holder for an error
slot, and you should have one of these per thread. It shouldn't be a
problem to use send/recv/etc from multiple threads though so long as you
pass in the pn_io_t from the current thread.

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
On Wed, Feb 25, 2015 at 9:53 AM, Alan Conway  wrote:

> On Wed, 2015-02-25 at 09:04 -0500, Michael Goulish wrote:
> > Good point!  I'm afraid it will take me the rest of my life
> > to reproduce under valgrind .. but ... I'll see what I can do
>
> Try this in your environment:
>  export MALLOC_PERTURB_=66
> That will cause malloc to immediately fill freed memory with 0x42 bytes
> so it is obvious when you gdb the core dump if someone is using freed
> memory.
>
> It's not as informative as valgrind but has no peformance impact that I
> can detect, and it often helps to crash faster and closer to the real
> problem. Freed memory can hold valid-seeming values for a while so your
> code may not notice immediately, whereas 4242424242 is rarely valid for
> anything.
>
> > In the meantime -- I'm not sure what to do with a Jira if the
> > provenance is in doubt...
>
> Maybe just put a note on it till we know more.
>

+1

FWIW, given that it looks like we are gonna wait for Andrew's SASL changes
to land, and Gordon is away till the end of the week and still working on
some documentation that would be nice to get into the release, I suspect
you have at least a few days to investigate the issue before we get a
release candidate for 0.9.

--Rafael


Re: I think that's a blocker...

2015-02-25 Thread Rafael Schloming
This isn't necessarily a proton bug. Nothing in the referenced checkin
actually touches the logic around allocating/freeing error strings, it
merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where
previously it threw away the error information. This would suggest that
there is perhaps a pre-existing bug in dispatch where it is calling
pn_send/pn_recv with a pn_io_t that has been freed, and it is only now
triggering because of the additional asserts that are hit now that the
error information is no longer ignored.

I could be mistaken, but I would try reproducing this under valgrind. That
will tell you where the first free occurred and that should hopefully make
it obvious whether this is indeed a proton bug or whether dispatch is
somehow freeing the pn_io_t sooner than it should.

(FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)

--Rafael

On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish 
wrote:

> ...but if not, somebody please feel free to correct me.
>
> The Jira that I just created -- PROTON-826 -- is for a
> bug I found with my topology testing of the Dispatch Router,
> in which I repeatedly kill and restart a router and make
> sure that the router network comes back to the same topology
> that it had before.
>
> As of checkin 01cb00c -- which had no Jira -- it is pretty
> easy for my test to blow core.  It looks like an error
> string is being double-freed (maybe) in the proton library.
>
> ( full info in the Jira.  https://issues.apache.org/jira/browse/PROTON-826
> )
>
>
>

