Re: API and terms: idle-time-out and heartbeat intervals.

Ken Giusti Wed, 28 Sep 2016 07:24:08 -0700

I've had a hand in the way Proton/C interprets the meaning of 'idle-timeout' 
and I've never liked the solution.  I think Proton/C's behavior is not 
'pessimistic' as much as it is 'conservative' for the sake of interoperability. 
 This, unfortunately ends up with a needless idle frame chattiness when both 
ends are Proton-based.


----- Original Message -----
> From: "Rob Godfrey" <rob.j.godf...@gmail.com>
> To: "qpid" <dev@qpid.apache.org>
> Sent: Wednesday, September 28, 2016 6:19:05 AM
> Subject: Re: API and terms: idle-time-out and heartbeat intervals.
> 
> I agree that specifying that the communicated figure should be "half"
> the "actual" timeout was a mistake.
> 
> What the spec should have tried to communicate is that the sender
> should communicate a value somewhat less than the period it uses to
> determine that the connection has actually timed-out to allow for the
> receiver to process and emit a heartbeat frame.


Wouldn't it be much clearer to simply send the _actual_ idle timeout value?  
Having the spec suggest "communicating a value *somewhat less*" [emphasis mine] 
leaves the implementation open for interpretation - which is exactly how we got 
into this mess in the first place.  Developers are a smart bunch - they know 
that keep alive traffic will have to be sent frequently enough to prevent idle 
timeout. 


>  Similarly the sender
> should ensure that a frame has been emitted well within the timeout
> period to allow for any communication / processing delay.

Agreed - perfectly acceptable for the spec to point this out.

>  In practice
> these "wiggle room" factors should not be determined by the
> application level timeout setting but by sensible calculations on
> transport delay variance / processing time, etc...  these calculation
> may differ between different use-cases / environments (for example in
> a low latency / real-time environment you may be able to make hard
> guarantees about the number of milliseconds that communication /
> processing delay will take... on the other hand if you are using an
> interpreted language with stop-the-world garbage collection you may
> not be able to say much better than the delay should be less than 30s
> or whatever).
> 

Yes - very important things to keep in mind when implementing this.  But the 
spec shouldn't be making these suggestions for different implementation 
options. The spec should be as concise as possible about the mandated behavior, 
and leave the implementation to the developers.

> I think application level APIs should be in terms of the timeouts that
> will affect the application.  The AMQP library should be massaging
> those numbers in such a way that they can fulfil the application
> requirements.
> 

Agreed.  Now, is there _any_ way we can suggest an update to the spec?  Perhaps 
an errata, etc?

> -- Rob
> 
> On 28 September 2016 at 10:42, Robbie Gemmell <robbie.gemm...@gmail.com>
> wrote:
> > On 27 September 2016 at 22:24, Alan Conway <acon...@redhat.com> wrote:
> >> On Tue, 2016-09-27 at 15:37 -0400, Alan Conway wrote:
> >>> I want to clarify and document the meaning of these terms for our
> >>> APIs,
> >>> presently I can't find anywhere where they are documented clearly.
> >>>
> >>> The AMQP spec says: "Each peer has its own (independent) idle
> >>> timeout.
> >>> At connection open each peer communicates the maximum
> >>> period between activity (frames) on the connection that it desires
> >>> from
> >>> its partner.The open frame carries the idletime-out
> >>> field for this purpose. To avoid spurious timeouts, the value in
> >>> idle-
> >>> time-out SHOULD be half the peer’s
> >>> actual timeout threshold."
> >>>
> >>> In other words: if I send you an "open" frame with idle-time-out=N
> >>> that
> >>> means *you* should not wait for longer than N milliseconds to send a
> >>> frame to me. It does not mean *I* will close the connection after N
> >>> milliseconds, I SHOULD be more patient and wait for N*2 ms to avoid
> >>> closing prematurely due to minor timing wobbles.
> >>>
> >>> I think the choice of name is slightly ambiguous but the spec is
> >>> clear
> >>> on the semantics, so it's important to document it to remove the
> >>> ambiguity.
> >>>
> >>> Anybody disagree?
> >>>
> >>
> >> Sigh. Sadly proton-C interprets "idle-timeout" differently depending on
> >> which end of the connection you are on:
> >>
> >>       // as per the recommendation in the spec, advertise half our
> >>       // actual timeout to the remote
> >>       const pn_millis_t idle_timeout = transport->local_idle_timeout
> >>           ? (transport->local_idle_timeout/2)
> >>           : 0;
> >>
> >> So in proton, pn_set_idle_timeout does NOT mean set the AMQP idle-
> >> timeout value, it means set the local "receive timeout" value and send
> >> half that as the AMQP "send timeout" for the peer.
> >>
> >> I'm tempted to use a new term in the Go API: "heartbeat". To me that
> >> clearly means the "send timeout" (hearts beat, they don't listen for
> >> beats) so it coincides with the meaning of the AMQP "idle-timeout", but
> >> without the ambiguity that is exacerbated by proton interpreting it
> >> both ways.
> >>
> >>
> >
> > Proton may seem to behave differently on each end, but I don't think
> > its necessarily a bad thing that it does, and it is also I think
> > largely just reflecting an annoying bit in the spec around this where
> > different behaviours are allowed for, whereas it would be easier if it
> > had less wiggle room.
> >
> > The transport setter/getter for the local timeout takes the 'actual
> > timeout' and then sends half of it as the advertised value in the Open
> > sent. This makes a certain amount of sense since it ensures that
> > appropriate behaviour is actually satisfied, rather than expecting the
> > user to ensure they only give half the value they really want for
> > their actual timeout. The getter for the remote timeout value on the
> > other hand returns the advertised value from the Open that is
> > received. I expect it does that since it cant actually ever return the
> > remotes 'actual timeout' without making an assumption, i.e that they
> > did in fact advertise half (or less) of their actual timeout, which
> > the spec only says that they SHOULD do.
> >
> > Yes the local setter taking the advertised value may have been better
> > for method consistency with the remote getter. On the other hand,
> > sending of necessary heartbeats is handled directly by the transport
> > during the tick process, so users may not necessarily even use the
> > getter themselves, and proton uses that remote value internally by
> > pessimistically halfing it to account for the case that folks on the
> > other end did not advertise half their actual timeout (since the spec
> > doesnt require that they do). Side note: proton could arguably be less
> > pessimistic here and go for say a percentage much nearer the full
> > advertised value, but then you'd probably need to start guaging how
> > close is too close.
> >
> > I think ensuring the doccumentation on the methods is clear what they
> > do is sufficient enough here. I actually prefer idle-timeout as an
> > name rather than heartbeat due to the way this all works. Since you
> > only tell the other side [half] your timeout, you dont actually have
> > direct control over when they send any needed empty frames to satisfy
> > it (as the above shows, we might send them more often than they
> > require) and 'heartbeat' might seem to imply that you do, and possibly
> > even that they need be sent at that period all the time even despite
> > regular traffic, which is not the case.
> >
> > Robbie
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> > For additional commands, e-mail: dev-h...@qpid.apache.org
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> For additional commands, e-mail: dev-h...@qpid.apache.org
> 
> 

-- 
-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org

Re: API and terms: idle-time-out and heartbeat intervals.

Reply via email to