Re: [openstack-dev] 'retry' option

2014-06-29 Thread Flavio Percoco

On 28/06/14 22:49 +0100, Mark McLoughlin wrote:
> On Fri, 2014-06-27 at 17:02 +0100, Gordon Sim wrote:
>> A question about the new 'retry' option. The doc says:
>>
>>  By default, cast() and call() will block until the
>>  message is successfully sent.
>>
>> What does 'successfully sent' mean here?
>
> Unclear, ambiguous, probably driver dependent etc.
>
> The 'blocking' we're talking about here is establishing a connection
> with the broker. If the connection has been lost, then cast() will block
> until the connection has been re-established and the message 'sent'.
>
>> Does it mean 'written to the wire' or 'accepted by the broker'?
>>
>> For the impl_qpid.py driver, each send is synchronous, so it means
>> accepted by the broker[1].
>>
>> What does the impl_rabbit.py driver do? Does it just mean 'written to
>> the wire', or is it using RabbitMQ confirmations to get notified when
>> the broker accepts it (standard 0-9-1 has no way of doing this)?
>
> I don't know, but it would be nice if someone did take the time to
> figure it out and document it :)
>
> Seriously, some docs around the subtle ways that the drivers differ from
> one another would be helpful ... particularly if it exposed incorrect
> assumptions API users are currently making.

+1

We should also keep this in mind for the AMQP 1.0 driver. Detailed
documentation for it is a must. Specifically, I'd like to see
documented how its architecture differs from the other drivers. The
more we can write down about it, the better.

>> If the intention is to block until accepted by the broker, that has
>> obvious performance implications. On the other hand if it means block
>> until written to the wire, what is the advantage of that? Was that a
>> deliberate feature or perhaps just an accident of implementation?
>>
>> The use case for the new parameter, as described in the git commit,
>> seems to be motivated by wanting to avoid the blocking when sending
>> notifications. I can certainly understand that desire.
>>
>> However, notifications and casts feel like inherently asynchronous
>> things to me, and perhaps having/needing the synchronous behaviour is
>> the real issue?
>
> It's not so much about sync vs async as about the failure mode. By
> default, if we lose our connection with the broker, we wait until we can
> re-establish it rather than throwing exceptions (requiring the API
> caller to have its own retry logic) or quietly dropping the message.
>
> The use case for ceilometer is to allow its RPCPublisher to have a
> publishing policy - block until the samples have been sent, queue (in an
> in-memory, fixed-length queue) if we don't have a connection to the
> broker, or drop them if we don't have a connection to the broker.
>
>  https://review.openstack.org/77845
>
> I do understand that the ambiguity around what message delivery
> guarantees are implicit in cast() isn't ideal, but that's not what
> adding this 'retry' parameter was about.

Besides documenting what each driver does, it might also be useful to
document what the expected guarantees are. The way you just described
them, in terms of connections, sounds quite reasonable to me.

>> Calls, by contrast, are inherently synchronous, but at
>> present the retry controls only the sending of the request. If the
>> server fails, the call may time out regardless of the value of 'retry'.
>>
>> Just in passing, I'd suggest that renaming the new parameter to
>> max_reconnects would make its current behaviour and values clearer.
>> The name 'retry' sounds like a yes/no type value, and retry=0 vs. retry=1
>> is the reverse of what I would intuitively expect.
>
> Sounds reasonable. Would you like to submit a patch? Quick turnaround is
> important, because if Ceilometer starts using this retry parameter
> before we rename it, I'm not sure it'll be worth the hassle.


+1

Flavio

--
@flaper87
Flavio Percoco


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] 'retry' option

2014-06-28 Thread Mark McLoughlin
On Fri, 2014-06-27 at 17:02 +0100, Gordon Sim wrote:
> A question about the new 'retry' option. The doc says:
> 
>  By default, cast() and call() will block until the
>  message is successfully sent.
> 
> What does 'successfully sent' mean here?

Unclear, ambiguous, probably driver dependent etc.

The 'blocking' we're talking about here is establishing a connection
with the broker. If the connection has been lost, then cast() will block
until the connection has been re-established and the message 'sent'.

>  Does it mean 'written to the wire' or 'accepted by the broker'?
> 
> For the impl_qpid.py driver, each send is synchronous, so it means 
> accepted by the broker[1].
> 
> What does the impl_rabbit.py driver do? Does it just mean 'written to 
> the wire', or is it using RabbitMQ confirmations to get notified when 
> the broker accepts it (standard 0-9-1 has no way of doing this)?

I don't know, but it would be nice if someone did take the time to
figure it out and document it :)

Seriously, some docs around the subtle ways that the drivers differ from
one another would be helpful ... particularly if it exposed incorrect
assumptions API users are currently making.

> If the intention is to block until accepted by the broker, that has
> obvious performance implications. On the other hand if it means block 
> until written to the wire, what is the advantage of that? Was that a 
> deliberate feature or perhaps just an accident of implementation?
> 
> The use case for the new parameter, as described in the git commit, 
> seems to be motivated by wanting to avoid the blocking when sending 
> notifications. I can certainly understand that desire.
> 
> However, notifications and casts feel like inherently asynchronous 
> things to me, and perhaps having/needing the synchronous behaviour is 
> the real issue?

It's not so much about sync vs async as about the failure mode. By
default, if we lose our connection with the broker, we wait until we can
re-establish it rather than throwing exceptions (requiring the API
caller to have its own retry logic) or quietly dropping the message.
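
To make those failure modes concrete, here is a minimal sketch. It is
not oslo.messaging code: send_with_retry, FlakyConnection and
ConnectionLost are hypothetical names, and the reading of 'retry' used
here (None = keep trying forever, 0 = no retry, N = at most N retries)
is an assumption made for illustration.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised while the broker connection is down."""


class FlakyConnection:
    """Toy stand-in for a broker connection that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.sent = []

    def send(self, msg):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionLost("broker unreachable")
        self.sent.append(msg)


def send_with_retry(conn, msg, retry=None, delay=0.0):
    """Send msg, retrying according to 'retry' (assumed semantics).

    retry=None: keep trying until the send goes through (the caller blocks).
    retry=0:    no retry; the first failure is raised to the caller.
    retry=N:    allow at most N further attempts after the first failure.

    Returns the number of retries that were needed.
    """
    attempts = 0
    while True:
        try:
            conn.send(msg)
            return attempts
        except ConnectionLost:
            if retry is not None and attempts >= retry:
                raise  # the caller now decides: own retry logic, or drop
            attempts += 1
            time.sleep(delay)
```

With retry=None the caller never sees ConnectionLost and simply blocks;
with retry=0 the first failure surfaces immediately, which is what lets
a caller implement its own queue-or-drop behaviour on top.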

The use case for ceilometer is to allow its RPCPublisher to have a
publishing policy: block until the samples have been sent, queue them
(in an in-memory, fixed-length queue) if we don't have a connection to
the broker, or drop them if we don't have a connection to the broker.

  https://review.openstack.org/77845
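
Roughly, the three policies could look like the sketch below. This is
an illustration, not the code in that review: PolicyPublisher, the
policy names "block", "queue" and "drop", and the send(sample, retry)
callable are all hypothetical, and the callable is assumed to raise
ConnectionError when the broker is unreachable and retry=0.

```python
from collections import deque


class PolicyPublisher:
    """Toy publisher with 'block', 'queue' and 'drop' policies."""

    def __init__(self, send, policy="block", max_queue_length=1024):
        self._send = send      # hypothetical callable: send(sample, retry)
        self.policy = policy
        # A bounded deque silently discards the oldest entry when full.
        self.pending = deque(maxlen=max_queue_length)

    def publish(self, sample):
        if self.policy == "block":
            # Wait for the connection to come back, however long it takes.
            self._send(sample, retry=None)
        elif self.policy == "drop":
            try:
                self._send(sample, retry=0)
            except ConnectionError:
                pass  # no connection: the sample is lost
        elif self.policy == "queue":
            self.pending.append(sample)
            self._flush()
        else:
            raise ValueError("unknown policy: %r" % (self.policy,))

    def _flush(self):
        # Send queued samples oldest first; stop at the first failure and
        # leave the remainder queued for the next publish() call.
        while self.pending:
            try:
                self._send(self.pending[0], retry=0)
            except ConnectionError:
                return
            self.pending.popleft()
```

Note that "queue" and "drop" both depend on the send failing fast, which
is exactly what a non-blocking 'retry' value provides.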

I do understand that the ambiguity around what message delivery
guarantees are implicit in cast() isn't ideal, but that's not what
adding this 'retry' parameter was about.

> Calls, by contrast, are inherently synchronous, but at
> present the retry controls only the sending of the request. If the
> server fails, the call may time out regardless of the value of 'retry'.
>
> Just in passing, I'd suggest that renaming the new parameter to
> max_reconnects would make its current behaviour and values clearer.
> The name 'retry' sounds like a yes/no type value, and retry=0 vs. retry=1
> is the reverse of what I would intuitively expect.

Sounds reasonable. Would you like to submit a patch? Quick turnaround is
important, because if Ceilometer starts using this retry parameter
before we rename it, I'm not sure it'll be worth the hassle.

Thanks,
Mark.




[openstack-dev] 'retry' option

2014-06-27 Thread Gordon Sim

A question about the new 'retry' option. The doc says:

    By default, cast() and call() will block until the
    message is successfully sent.

What does 'successfully sent' mean here? Does it mean 'written to the 
wire' or 'accepted by the broker'?


For the impl_qpid.py driver, each send is synchronous, so it means 
accepted by the broker[1].


What does the impl_rabbit.py driver do? Does it just mean 'written to 
the wire', or is it using RabbitMQ confirmations to get notified when 
the broker accepts it (standard 0-9-1 has no way of doing this)?


If the intention is to block until accepted by the broker, that has
obvious performance implications. On the other hand if it means block 
until written to the wire, what is the advantage of that? Was that a 
deliberate feature or perhaps just an accident of implementation?


The use case for the new parameter, as described in the git commit, 
seems to be motivated by wanting to avoid the blocking when sending 
notifications. I can certainly understand that desire.


However, notifications and casts feel like inherently asynchronous 
things to me, and perhaps having/needing the synchronous behaviour is 
the real issue? Calls, by contrast, are inherently synchronous, but at
present the retry controls only the sending of the request. If the
server fails, the call may time out regardless of the value of 'retry'.
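
To illustrate the point about calls, here is a minimal sketch of a call
as two separate steps. It is not the library's implementation: the
call() helper, the send(request, retry) callable and the reply queue
are hypothetical stand-ins.

```python
import queue


def call(send, replies, request, timeout, retry=None):
    """Send a request, then wait for the reply.

    'retry' only governs the first step (getting the request sent).
    The wait for the reply is bounded by 'timeout' alone, so a server
    that dies after the request went out still produces a timeout,
    whatever 'retry' is set to.
    """
    send(request, retry=retry)           # step 1: may block or fail per 'retry'
    try:
        return replies.get(timeout=timeout)   # step 2: independent of 'retry'
    except queue.Empty:
        raise TimeoutError("no reply within %s seconds" % timeout)
```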


Just in passing, I'd suggest that renaming the new parameter to
max_reconnects would make its current behaviour and values clearer.
The name 'retry' sounds like a yes/no type value, and retry=0 vs. retry=1
is the reverse of what I would intuitively expect.


--Gordon.

[1] I've personally considered that somewhat unnecessary.
