Re: API and terms: idle-time-out and heartbeat intervals.

Rob Godfrey Mon, 17 Oct 2016 13:45:33 -0700

OK - that (with the above documentation) seems fine to me - I like the use
of Heartbeat to properly from the notion of the timeout threshold.


-- Rob

On 17 October 2016 at 21:09, Alan Conway <acon...@redhat.com> wrote:

> On Mon, 2016-10-17 at 17:08 +0100, Rob Godfrey wrote:
> > If this API is not also making any decision on the (local) "timeout
> > threshold" then that seems eminently reasonable (and what anyone with
> > a
> > passing familiarity of the spec would expect).
> >
> > If the API is using the same call to also determine when the library
> > will
> > actually consider the peer has ceased to function, then I would be
> > more
> > concerned (albeit in an abstract way as I'm not actually currently
> > using
> > the API at all :-) ).
> >
>
> Here's what I am considering for Go:
>
> Called by the end initiating the connection:
>
> // Heartbeat returns a ConnectionOption that requests the maximum delay
> // between sending frames for the remote peer. If we don't receive any
> frames
> // within 2*delay we will close the connection.
> //
> func Heartbeat(delay time.Duration) ConnectionOption {
>         // Proton-C divides the idle-timeout by 2 before sending, so
> compensate.
>         return func(c *connection) { c.engine.Transport().SetIdleTimeout(2
> * delay) }
> }
>
> Called to query an incoming connection:
>
>         // Heartbeat is the maximum delay between sending frames that the
> remote peer
>         // has requested of us. If the interval expires an empty
> "heartbeat" frame
>         // will be sent automatically to keep the connection open.
>         Heartbeat() time.Duration
>
> Key point is there is just *one* number that is passed to the API, sent
> on the wire, and available to the other end, and it has a clearly
> described meaning. The built-in automatic behavior is also described.
>
> IMO using "idle timeout" to mean one thing at one end of the
> connection, and a different thing at the other (and wait, which one
> goes on the wire again?) is insane.
>
> > -- Rob
> >
> > On 17 October 2016 at 16:57, Alan Conway <acon...@redhat.com> wrote:
> >
> > >
> > > On Fri, 2016-09-30 at 14:40 +0100, Rob Godfrey wrote:
> > > >
> > > > I don't think it's a bug - it's a completely valid (if chatty)
> > > > choice.
> > > >
> > > > To my mind the semantics of the field are this:
> > > >
> > > > The sender of the open frame is saying "if I do not receive any
> > > > data
> > > > for this period of time, I reserve the right to assume that you
> > > > are
> > > > no
> > > > longer functioning correctly"
> > > > On receiving this information, the peer should decide on a
> > > > strategy
> > > > that ensures to the best of its ability, that if it is
> > > > functioning
> > > > normally it will be generating data sufficiently frequently that
> > > > the
> > > > condition does not occur.  If this peer knows that it is
> > > > susceptible
> > > > to delays outside its direct control (such as garbage collection
> > > > pauses) it should take account of this in how often it schedules
> > > > sending heartbeat frames.
> > > > At the same time, the sender of this value should give some
> > > > leeway
> > > > before actually shutting down to account for unexpected transport
> > > > delays, or processing delays on its side.  Since it is unaware of
> > > > the
> > > > nature of the allowances made at the peer, it may decide to be
> > > > particularly generous in the leeway it grants.
> > > >
> > > > I think the spec mentioning a ratio (such as twice) is massively
> > > > unhelpful.
> > > >
> > > > In terms of the API - what is the applications intent in setting
> > > > the
> > > > value?  Is the intent to ensure transport activity within a
> > > > specific
> > > > period, or to set a hard limit on how long the library will wait
> > > > before generating some sort of timeout exception?
> > >
> > > My guess is that in terms of the API, the application's intent in
> > > calling "set_idle_timeout(T)" is to set the formally-specified AMQP
> > > value called "idle-timeout" to T. I don't think anybody's first
> > > reading
> > > of the API doc or the spec would suggest to them that if they want
> > > to
> > > set a wire timeout of T they actually need to call
> > > set_idle_timeout(T*2) on one side of the connection and it will
> > > magically come out as T on the other.
> > >
> > > Spec wording aside, does anybody actually disagree with that?
> > >
> > >
> > > >
> > > > -- Rob
> > > >
> > > >
> > > > On 30 September 2016 at 14:25, Ken Giusti <kgiu...@redhat.com>
> > > > wrote:
> > > > >
> > > > >
> > > > > CC'ing user's list
> > > > >
> > > > > Given the interpretations of the spec discussed below is the
> > > > > way
> > > > > Proton handles this functionality inconsistent with itself?
> > > > >
> > > > > 1) Proton sends an open frame with _half_ the value of the
> > > > > timeout
> > > > > as set by the application.
> > > > > 2) Proton interprets the idle-timeout in the received open as
> > > > > the
> > > > > actual time out interval, and pessimistically ;) generates
> > > > > heartbeats as 1/2 that value.
> > > > >
> > > > > Is this a bug in the Proton implementation?
> > > > >
> > > > > -K
> > > > >
> > > > > ----- Original Message -----
> > > > > >
> > > > > >
> > > > > > From: "Alan Conway" <acon...@redhat.com>
> > > > > > To: d...@qpid.apache.org
> > > > > > Sent: Thursday, September 29, 2016 9:11:51 AM
> > > > > > Subject: Re: API and terms: idle-time-out and heartbeat
> > > > > > intervals.
> > > > > >
> > > > > > On Wed, 2016-09-28 at 22:33 +0100, Robbie Gemmell wrote:
> > > > > > >
> > > > > > >
> > > > > > > The spec unfortunately says what it says, and even if there
> > > > > > > were
> > > > > > > scope
> > > > > > > to update it, I'm not sure there is a way to change it to
> > > > > > > address the
> > > > > > > main issue (which regardless of any other clarity issues,
> > > > > > > is I
> > > > > > > think
> > > > > > > just that it only says you SHOULD advertise half your
> > > > > > > 'actual
> > > > > > > timeout') that wouldn't result in an 'incompatible change'
> > > > > > > of
> > > > > > > behaviour.
> > > > > >
> > > > > > I look at this differently. We agree on the semantics of
> > > > > > idle-
> > > > > > time-out
> > > > > > defined by the spec. Now forget the spec wording, it is easy
> > > > > > to
> > > > > > explain
> > > > > > clearly what the semantics are. It is not "half your actual
> > > > > > time
> > > > > > out",
> > > > > > it is the *exact* frame rate you expect from your peer. The
> > > > > > peer's
> > > > > > *only* task is to respect that frame rate, they do not need
> > > > > > any
> > > > > > more
> > > > > > information than that - they certainly don't need to guess at
> > > > > > what your
> > > > > > margin for error is for keeping the connection open.
> > > > > >
> > > > > > The margin for error is an implementation detail that does
> > > > > > *not*
> > > > > > need
> > > > > > to be advertised. Double the frame rate is a reasonable
> > > > > > general
> > > > > > recommendation, but the ideal margin depends on latency
> > > > > > variability in
> > > > > > the system, you can imagine systems where it might be better
> > > > > > to
> > > > > > have a
> > > > > > larger or a smaller margin.
> > > > > >
> > > > > > The name 'idle-time-out' is not an ideal choice, 'heartbeat'
> > > > > > or
> > > > > > 'max-
> > > > > > frame-delay' might be better. But once we know what it means,
> > > > > > the
> > > > > > semantics are perfectly clear.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Knowing the exact enforced timeout could certainly be
> > > > > > > clearer
> > > > > > > in some
> > > > > > > ways, but if the period you had to send after wasnt defined
> > > > > > > (e.g
> > > > > > > half)
> > > > > > > then it would essentially leave you with the same problem
> > > > > > > as
> > > > > > > now:
> > > > > > > deciding exactly how early you should send heartbeat frames
> > > > > > > to
> > > > > > > avoid
> > > > > > > a
> > > > > > > [more-precisely-known] timeout.
> > > > > >
> > > > > > The only important thing is to agree on the meaning of the
> > > > > > advertised
> > > > > > value: is it the frame rate or the connection close
> > > > > > threshold?
> > > > > > The work
> > > > > >  of adjusting at one end or the other is equivalent.
> > > > > >
> > > > > > I think we agree it is currently defined as the frame rate, I
> > > > > > see
> > > > > > no
> > > > > > benefit in trying to change the meaning. (There would be no
> > > > > > drawback
> > > > > > either if the spec was not already published and implemented
> > > > > > by
> > > > > > multiple implementors, but it is)
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > With what the spec says we effectively have a lower-bound
> > > > > > > on
> > > > > > > the
> > > > > > > actual-timeout (if cases the peer didnt half their actual-
> > > > > > > timeout
> > > > > > > before advertising a number) and an upper bound (if they
> > > > > > > did)
> > > > > > > due to
> > > > > > > the use of SHOULD as opposed to MUST or some other more
> > > > > > > fixed
> > > > > > > definition of behaviour. So we currently try to send frames
> > > > > > > often
> > > > > > > enough to satisfy that lower bound, i.e the number we
> > > > > > > received.
> > > > > > > That
> > > > > > > essentially reduces it to the the same problem as if we
> > > > > > > were
> > > > > > > trying
> > > > > > > to
> > > > > > > satisfy a theoretical exactly-known actual-timeout (just
> > > > > > > using
> > > > > > > a
> > > > > > > smaller number that we dont want to use, because we
> > > > > > > followed
> > > > > > > the
> > > > > > > specs
> > > > > > > recommendation). Currently we do this by sending frames
> > > > > > > after
> > > > > > > half
> > > > > > > the
> > > > > > > advertised period, i.e. what we know is actually a quarter
> > > > > > > of
> > > > > > > the
> > > > > > > actual-timeout enforced by peers such as proton which are
> > > > > > > following
> > > > > > > the specs recommendations. We could choose to do it less
> > > > > > > often
> > > > > > > than
> > > > > > > that by reducing how conservative it is, e.g use 75+% of
> > > > > > > advertised
> > > > > > > value instead for example, which should have no impact in
> > > > > > > the
> > > > > > > cases
> > > > > > > where peers had followed the recommendation (as we would
> > > > > > > still
> > > > > > > be
> > > > > > > sending more than twice as quickly as needed to satisfy
> > > > > > > their
> > > > > > > actual-timeout), but would put us closer to spurious
> > > > > > > timeout (
> > > > > > > if e.g
> > > > > > > there was a delay in delivering/processing the heartbeat)
> > > > > > > against any
> > > > > > > peers that didn't follow the recommendation.
> > > > > > >
> > > > > > > On 28 September 2016 at 21:26, Justin Ross <justin.ross@gma
> > > > > > > il.c
> > > > > > > om>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > IMO, the overall picture is simpler, and easier to
> > > > > > > > explain to
> > > > > > > > third
> > > > > > > > parties, if we go the way Ken suggested.  When a remote
> > > > > > > > peer
> > > > > > > > sends
> > > > > > > > you an
> > > > > > > > idle timeout value, it is an expression of an (actual,
> > > > > > > > not
> > > > > > > > simply
> > > > > > > > "advertised") guarantee - "I will expire the connection
> > > > > > > > after
> > > > > > > > X
> > > > > > > > time
> > > > > > > > without receiving a frame from you".
> > > > > > > >
> > > > > > > > We could also legitimately go the direction you
> > > > > > > > suggest.  *But* its
> > > > > > > > name is
> > > > > > > > "idle timeout".  We can't easily change the name.  I
> > > > > > > > think we
> > > > > > > > should take
> > > > > > > > the spec text that goes with the name, and the behavior
> > > > > > > > of
> > > > > > > > our
> > > > > > > > components,
> > > > > > > > firmly in the direction Ken suggests.
> > > > > > > >
> > > > > > > > Off topic: why is this on the dev list?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Sep 28, 2016 at 12:36 PM, Alan Conway <aconway@re
> > > > > > > > dhat
> > > > > > > > .com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, 2016-09-28 at 10:13 -0400, Ken Giusti wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I've had a hand in the way Proton/C interprets the
> > > > > > > > > > meaning of
> > > > > > > > > > 'idle-
> > > > > > > > > > timeout' and I've never liked the solution.  I think
> > > > > > > > > > Proton/C's
> > > > > > > > > > behavior is not 'pessimistic' as much as it is
> > > > > > > > > > 'conservative'
> > > > > > > > > > for the
> > > > > > > > > > sake of interoperability.  This, unfortunately ends
> > > > > > > > > > up
> > > > > > > > > > with a
> > > > > > > > > > needless idle frame chattiness when both ends are
> > > > > > > > > > Proton-
> > > > > > > > > > based.
> > > > > > > > > >
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > From: "Rob Godfrey" <rob.j.godf...@gmail.com>
> > > > > > > > > > > To: "qpid" <d...@qpid.apache.org>
> > > > > > > > > > > Sent: Wednesday, September 28, 2016 6:19:05 AM
> > > > > > > > > > > Subject: Re: API and terms: idle-time-out and
> > > > > > > > > > > heartbeat
> > > > > > > > > > > intervals.
> > > > > > > > > > >
> > > > > > > > > > > I agree that specifying that the communicated
> > > > > > > > > > > figure
> > > > > > > > > > > should
> > > > > > > > > > > be
> > > > > > > > > > > "half"
> > > > > > > > > > > the "actual" timeout was a mistake.
> > > > > > > > > > >
> > > > > > > > > > > What the spec should have tried to communicate is
> > > > > > > > > > > that
> > > > > > > > > > > the
> > > > > > > > > > > sender
> > > > > > > > > > > should communicate a value somewhat less than the
> > > > > > > > > > > period it
> > > > > > > > > > > uses to
> > > > > > > > > > > determine that the connection has actually timed-
> > > > > > > > > > > out to
> > > > > > > > > > > allow
> > > > > > > > > > > for
> > > > > > > > > > > the
> > > > > > > > > > > receiver to process and emit a heartbeat frame.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Wouldn't it be much clearer to simply send the
> > > > > > > > > > _actual_
> > > > > > > > > > idle
> > > > > > > > > > timeout
> > > > > > > > > > value?
> > > > > > > > >
> > > > > > > > > My read is that is exactly what it does: It sends the
> > > > > > > > > max
> > > > > > > > > time
> > > > > > > > > that the
> > > > > > > > > *sender* of frames may be idle. The receiver of frames
> > > > > > > > > SHOULD be
> > > > > > > > > more
> > > > > > > > > patient than that. The wording of the "discussion"
> > > > > > > > > around
> > > > > > > > > it and
> > > > > > > > > the
> > > > > > > > > choice of terms is a bit cloudy but, the text that
> > > > > > > > > describes
> > > > > > > > > idle-time-
> > > > > > > > > out seems clear enough: it is the max interval between
> > > > > > > > > sending
> > > > > > > > > frames.
> > > > > > > > > The frame receiver SHOULD wait longer that that before
> > > > > > > > > closing,
> > > > > > > > > and 2x
> > > > > > > > > seems a reasonable suggestion, but that's for the impl
> > > > > > > > > to
> > > > > > > > > decide.
> > > > > > > > >
> > > > > > > > > It's weird that it says "idle-time-out should be half
> > > > > > > > > the
> > > > > > > > > threshold"
> > > > > > > > > instead of "the threshold should be twice the idle-
> > > > > > > > > time-
> > > > > > > > > out" but
> > > > > > > > > it's
> > > > > > > > > logically equivalent.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Having the spec suggest "communicating a value
> > > > > > > > > > *somewhat
> > > > > > > > > > less*"
> > > > > > > > >
> > > > > > > > > The wording is odd but the semantics are you
> > > > > > > > > communicate
> > > > > > > > > *exactly* the
> > > > > > > > > max frame delay you want and then you SHOULD set your
> > > > > > > > > connection
> > > > > > > > > close
> > > > > > > > > threshold to something bigger. The other end doesn't
> > > > > > > > > need
> > > > > > > > > to know
> > > > > > > > > how
> > > > > > > > > much bigger, they just need to know what rate to send
> > > > > > > > > frames.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > [emphasis mine] leaves the implementation open for
> > > > > > > > > > interpretation -
> > > > > > > > > > which is exactly how we got into this mess in the
> > > > > > > > > > first
> > > > > > > > > > place.  Developers are a smart bunch - they know that
> > > > > > > > > > keep
> > > > > > > > > > alive
> > > > > > > > > > traffic will have to be sent frequently enough to
> > > > > > > > > > prevent
> > > > > > > > > > idle
> > > > > > > > > > timeout.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >  Similarly the sender
> > > > > > > > > > > should ensure that a frame has been emitted well
> > > > > > > > > > > within
> > > > > > > > > > > the
> > > > > > > > > > > timeout
> > > > > > > > > > > period to allow for any communication / processing
> > > > > > > > > > > delay.
> > > > > > > > > >
> > > > > > > > > > Agreed - perfectly acceptable for the spec to point
> > > > > > > > > > this
> > > > > > > > > > out.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >  In practice
> > > > > > > > > > > these "wiggle room" factors should not be
> > > > > > > > > > > determined by
> > > > > > > > > > > the
> > > > > > > > > > > application level timeout setting but by sensible
> > > > > > > > > > > calculations on
> > > > > > > > > > > transport delay variance / processing time,
> > > > > > > > > > > etc...  these
> > > > > > > > > > > calculation
> > > > > > > > > > > may differ between different use-cases /
> > > > > > > > > > > environments
> > > > > > > > > > > (for
> > > > > > > > > > > example
> > > > > > > > > > > in
> > > > > > > > > > > a low latency / real-time environment you may be
> > > > > > > > > > > able
> > > > > > > > > > > to make
> > > > > > > > > > > hard
> > > > > > > > > > > guarantees about the number of milliseconds that
> > > > > > > > > > > communication /
> > > > > > > > > > > processing delay will take... on the other hand if
> > > > > > > > > > > you
> > > > > > > > > > > are
> > > > > > > > > > > using an
> > > > > > > > > > > interpreted language with stop-the-world garbage
> > > > > > > > > > > collection
> > > > > > > > > > > you may
> > > > > > > > > > > not be able to say much better than the delay
> > > > > > > > > > > should be
> > > > > > > > > > > less
> > > > > > > > > > > than
> > > > > > > > > > > 30s
> > > > > > > > > > > or whatever).
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes - very important things to keep in mind when
> > > > > > > > > > implementing
> > > > > > > > > > this.  But the spec shouldn't be making these
> > > > > > > > > > suggestions
> > > > > > > > > > for
> > > > > > > > > > different implementation options. The spec should be
> > > > > > > > > > as
> > > > > > > > > > concise
> > > > > > > > > > as
> > > > > > > > > > possible about the mandated behavior, and leave the
> > > > > > > > > > implementation to
> > > > > > > > > > the developers.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think application level APIs should be in terms
> > > > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > timeouts
> > > > > > > > > > > that
> > > > > > > > > > > will affect the application.  The AMQP library
> > > > > > > > > > > should
> > > > > > > > > > > be
> > > > > > > > > > > massaging
> > > > > > > > > > > those numbers in such a way that they can fulfil
> > > > > > > > > > > the
> > > > > > > > > > > application
> > > > > > > > > > > requirements.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Agreed.  Now, is there _any_ way we can suggest an
> > > > > > > > > > update
> > > > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > spec?  Perhaps an errata, etc?
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > -- Rob
> > > > > > > > > > >
> > > > > > > > > > > On 28 September 2016 at 10:42, Robbie Gemmell
> > > > > > > > > > > <robbie.gemmell
> > > > > > > > > > > @gmail
> > > > > > > > > > > .com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On 27 September 2016 at 22:24, Alan Conway <aconw
> > > > > > > > > > > > ay@r
> > > > > > > > > > > > edhat.
> > > > > > > > > > > > com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 2016-09-27 at 15:37 -0400, Alan Conway
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I want to clarify and document the meaning of
> > > > > > > > > > > > > > these
> > > > > > > > > > > > > > terms for
> > > > > > > > > > > > > > our
> > > > > > > > > > > > > > APIs,
> > > > > > > > > > > > > > presently I can't find anywhere where they
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > documented
> > > > > > > > > > > > > > clearly.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The AMQP spec says: "Each peer has its own
> > > > > > > > > > > > > > (independent) idle
> > > > > > > > > > > > > > timeout.
> > > > > > > > > > > > > > At connection open each peer communicates the
> > > > > > > > > > > > > > maximum
> > > > > > > > > > > > > > period between activity (frames) on the
> > > > > > > > > > > > > > connection that
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > desires
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > its partner.The open frame carries the
> > > > > > > > > > > > > > idletime-
> > > > > > > > > > > > > > out
> > > > > > > > > > > > > > field for this purpose. To avoid spurious
> > > > > > > > > > > > > > timeouts, the
> > > > > > > > > > > > > > value
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > idle-
> > > > > > > > > > > > > > time-out SHOULD be half the peer’s
> > > > > > > > > > > > > > actual timeout threshold."
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In other words: if I send you an "open" frame
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > idle-time-
> > > > > > > > > > > > > > out=N
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > means *you* should not wait for longer than N
> > > > > > > > > > > > > > milliseconds to
> > > > > > > > > > > > > > send a
> > > > > > > > > > > > > > frame to me. It does not mean *I* will close
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > connection
> > > > > > > > > > > > > > after N
> > > > > > > > > > > > > > milliseconds, I SHOULD be more patient and
> > > > > > > > > > > > > > wait
> > > > > > > > > > > > > > for N*2
> > > > > > > > > > > > > > ms to
> > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > closing prematurely due to minor timing
> > > > > > > > > > > > > > wobbles.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think the choice of name is slightly
> > > > > > > > > > > > > > ambiguous
> > > > > > > > > > > > > > but
> > > > > > > > > > > > > > the spec
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > clear
> > > > > > > > > > > > > > on the semantics, so it's important to
> > > > > > > > > > > > > > document
> > > > > > > > > > > > > > it to
> > > > > > > > > > > > > > remove
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > ambiguity.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Anybody disagree?
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sigh. Sadly proton-C interprets "idle-timeout"
> > > > > > > > > > > > > differently
> > > > > > > > > > > > > depending on
> > > > > > > > > > > > > which end of the connection you are on:
> > > > > > > > > > > > >
> > > > > > > > > > > > >       // as per the recommendation in the spec,
> > > > > > > > > > > > > advertise
> > > > > > > > > > > > > half
> > > > > > > > > > > > > our
> > > > > > > > > > > > >       // actual timeout to the remote
> > > > > > > > > > > > >       const pn_millis_t idle_timeout =
> > > > > > > > > > > > > transport-
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > local_idle_timeout
> > > > > > > > > > > > >           ? (transport->local_idle_timeout/2)
> > > > > > > > > > > > >           : 0;
> > > > > > > > > > > > >
> > > > > > > > > > > > > So in proton, pn_set_idle_timeout does NOT mean
> > > > > > > > > > > > > set
> > > > > > > > > > > > > the
> > > > > > > > > > > > > AMQP
> > > > > > > > > > > > > idle-
> > > > > > > > > > > > > timeout value, it means set the local "receive
> > > > > > > > > > > > > timeout"
> > > > > > > > > > > > > value
> > > > > > > > > > > > > and send
> > > > > > > > > > > > > half that as the AMQP "send timeout" for the
> > > > > > > > > > > > > peer.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm tempted to use a new term in the Go API:
> > > > > > > > > > > > > "heartbeat".
> > > > > > > > > > > > > To me
> > > > > > > > > > > > > that
> > > > > > > > > > > > > clearly means the "send timeout" (hearts beat,
> > > > > > > > > > > > > they
> > > > > > > > > > > > > don't
> > > > > > > > > > > > > listen for
> > > > > > > > > > > > > beats) so it coincides with the meaning of the
> > > > > > > > > > > > > AMQP
> > > > > > > > > > > > > "idle-
> > > > > > > > > > > > > timeout", but
> > > > > > > > > > > > > without the ambiguity that is exacerbated by
> > > > > > > > > > > > > proton
> > > > > > > > > > > > > interpreting it
> > > > > > > > > > > > > both ways.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Proton may seem to behave differently on each
> > > > > > > > > > > > end,
> > > > > > > > > > > > but I
> > > > > > > > > > > > don't
> > > > > > > > > > > > think
> > > > > > > > > > > > its necessarily a bad thing that it does, and it
> > > > > > > > > > > > is
> > > > > > > > > > > > also I
> > > > > > > > > > > > think
> > > > > > > > > > > > largely just reflecting an annoying bit in the
> > > > > > > > > > > > spec
> > > > > > > > > > > > around
> > > > > > > > > > > > this
> > > > > > > > > > > > where
> > > > > > > > > > > > different behaviours are allowed for, whereas it
> > > > > > > > > > > > would be
> > > > > > > > > > > > easier
> > > > > > > > > > > > if it
> > > > > > > > > > > > had less wiggle room.
> > > > > > > > > > > >
> > > > > > > > > > > > The transport setter/getter for the local timeout
> > > > > > > > > > > > takes the
> > > > > > > > > > > > 'actual
> > > > > > > > > > > > timeout' and then sends half of it as the
> > > > > > > > > > > > advertised
> > > > > > > > > > > > value
> > > > > > > > > > > > in the
> > > > > > > > > > > > Open
> > > > > > > > > > > > sent. This makes a certain amount of sense since
> > > > > > > > > > > > it
> > > > > > > > > > > > ensures
> > > > > > > > > > > > that
> > > > > > > > > > > > appropriate behaviour is actually satisfied,
> > > > > > > > > > > > rather
> > > > > > > > > > > > than
> > > > > > > > > > > > expecting the
> > > > > > > > > > > > user to ensure they only give half the value they
> > > > > > > > > > > > really
> > > > > > > > > > > > want for
> > > > > > > > > > > > their actual timeout. The getter for the remote
> > > > > > > > > > > > timeout
> > > > > > > > > > > > value on
> > > > > > > > > > > > the
> > > > > > > > > > > > other hand returns the advertised value from the
> > > > > > > > > > > > Open
> > > > > > > > > > > > that
> > > > > > > > > > > > is
> > > > > > > > > > > > received. I expect it does that since it cant
> > > > > > > > > > > > actually ever
> > > > > > > > > > > > return the
> > > > > > > > > > > > remotes 'actual timeout' without making an
> > > > > > > > > > > > assumption, i.e
> > > > > > > > > > > > that
> > > > > > > > > > > > they
> > > > > > > > > > > > did in fact advertise half (or less) of their
> > > > > > > > > > > > actual
> > > > > > > > > > > > timeout,
> > > > > > > > > > > > which
> > > > > > > > > > > > the spec only says that they SHOULD do.
> > > > > > > > > > > >
> > > > > > > > > > > > Yes the local setter taking the advertised value
> > > > > > > > > > > > may
> > > > > > > > > > > > have
> > > > > > > > > > > > been
> > > > > > > > > > > > better
> > > > > > > > > > > > for method consistency with the remote getter. On
> > > > > > > > > > > > the
> > > > > > > > > > > > other
> > > > > > > > > > > > hand,
> > > > > > > > > > > > sending of necessary heartbeats is handled
> > > > > > > > > > > > directly
> > > > > > > > > > > > by the
> > > > > > > > > > > > transport
> > > > > > > > > > > > during the tick process, so users may not
> > > > > > > > > > > > necessarily
> > > > > > > > > > > > even
> > > > > > > > > > > > use
> > > > > > > > > > > > the
> > > > > > > > > > > > getter themselves, and proton uses that remote
> > > > > > > > > > > > value
> > > > > > > > > > > > internally
> > > > > > > > > > > > by
> > > > > > > > > > > > pessimistically halfing it to account for the
> > > > > > > > > > > > case
> > > > > > > > > > > > that
> > > > > > > > > > > > folks on
> > > > > > > > > > > > the
> > > > > > > > > > > > other end did not advertise half their actual
> > > > > > > > > > > > timeout
> > > > > > > > > > > > (since the
> > > > > > > > > > > > spec
> > > > > > > > > > > > doesnt require that they do). Side note: proton
> > > > > > > > > > > > could
> > > > > > > > > > > > arguably be
> > > > > > > > > > > > less
> > > > > > > > > > > > pessimistic here and go for say a percentage much
> > > > > > > > > > > > nearer
> > > > > > > > > > > > the full
> > > > > > > > > > > > advertised value, but then you'd probably need to
> > > > > > > > > > > > start
> > > > > > > > > > > > guaging
> > > > > > > > > > > > how
> > > > > > > > > > > > close is too close.
> > > > > > > > > > > >
> > > > > > > > > > > > I think ensuring the doccumentation on the
> > > > > > > > > > > > methods is
> > > > > > > > > > > > clear
> > > > > > > > > > > > what
> > > > > > > > > > > > they
> > > > > > > > > > > > do is sufficient enough here. I actually prefer
> > > > > > > > > > > > idle-
> > > > > > > > > > > > timeout as
> > > > > > > > > > > > an
> > > > > > > > > > > > name rather than heartbeat due to the way this
> > > > > > > > > > > > all
> > > > > > > > > > > > works.
> > > > > > > > > > > > Since
> > > > > > > > > > > > you
> > > > > > > > > > > > only tell the other side [half] your timeout, you
> > > > > > > > > > > > dont
> > > > > > > > > > > > actually
> > > > > > > > > > > > have
> > > > > > > > > > > > direct control over when they send any needed
> > > > > > > > > > > > empty
> > > > > > > > > > > > frames
> > > > > > > > > > > > to
> > > > > > > > > > > > satisfy
> > > > > > > > > > > > it (as the above shows, we might send them more
> > > > > > > > > > > > often
> > > > > > > > > > > > than
> > > > > > > > > > > > they
> > > > > > > > > > > > require) and 'heartbeat' might seem to imply that
> > > > > > > > > > > > you
> > > > > > > > > > > > do,
> > > > > > > > > > > > and
> > > > > > > > > > > > possibly
> > > > > > > > > > > > even that they need be sent at that period all
> > > > > > > > > > > > the
> > > > > > > > > > > > time
> > > > > > > > > > > > even
> > > > > > > > > > > > despite
> > > > > > > > > > > > regular traffic, which is not the case.
> > > > > > > > > > > >
> > > > > > > > > > > > Robbie
> > > > > > > > > > > >
> > > > > > > > > > > > -----------------------------------------------
> > > > > > > > > > > > ----
> > > > > > > > > > > > ------
> > > > > > > > > > > > ------
> > > > > > > > > > > > ------
> > > > > > > > > > > > To unsubscribe, e-mail: dev-unsubscr...@qpid.apac
> > > > > > > > > > > > he.o
> > > > > > > > > > > > rg
> > > > > > > > > > > > For additional commands, e-mail: dev-h...@qpid.ap
> > > > > > > > > > > > ache
> > > > > > > > > > > > .org
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > -------------------------------------------------
> > > > > > > > > > > ----
> > > > > > > > > > > ------
> > > > > > > > > > > ------
> > > > > > > > > > > ----
> > > > > > > > > > > To unsubscribe, e-mail: dev-unsubscribe@qpid.apache
> > > > > > > > > > > .org
> > > > > > > > > > > For additional commands, e-mail: dev-h...@qpid.apac
> > > > > > > > > > > he.o
> > > > > > > > > > > rg
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -----------------------------------------------------
> > > > > > > > > ----
> > > > > > > > > ------
> > > > > > > > > ------
> > > > > > > > > To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> > > > > > > > > For additional commands, e-mail: dev-help@qpid.apache.o
> > > > > > > > > rg
> > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > > > ---------------------------------------------------------
> > > > > > > ----
> > > > > > > --------
> > > > > > > To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> > > > > > > For additional commands, e-mail: dev-h...@qpid.apache.org
> > > > > > >
> > > > > >
> > > > > >
> > > > > > -----------------------------------------------------------
> > > > > > ----
> > > > > > ------
> > > > > > To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> > > > > > For additional commands, e-mail: dev-h...@qpid.apache.org
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > -K
> > > > >
> > > > > -------------------------------------------------------------
> > > > > ----
> > > > > ----
> > > > > To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> > > > > For additional commands, e-mail: users-h...@qpid.apache.org
> > > > >
> > > >
> > > > ---------------------------------------------------------------
> > > > ------
> > > > To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> > > > For additional commands, e-mail: users-h...@qpid.apache.org
> > > >
> > >
> > >
> > > -----------------------------------------------------------------
> > > ----
> > > To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> > > For additional commands, e-mail: users-h...@qpid.apache.org
> > >
> > >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
>
>

Re: API and terms: idle-time-out and heartbeat intervals.

Reply via email to