Re: [Openstack] Burrow (queue service)

2011-02-24 Thread Jay Pipes
On Thu, Feb 24, 2011 at 3:56 PM, Eric Day  wrote:
> Hi Sandy,
>
> On Thu, Feb 24, 2011 at 07:42:34PM +, Sandy Walsh wrote:
>> Looks like a fun project Eric. I only got caught up on the ML this weekend 
>> and I'm behind again already.
>
> It's a never-ending battle. I find routing all messages from jaypipes
> into /dev/null helps. :)

I heard that.

-jay


Re: [Openstack] Burrow (queue service)

2011-02-24 Thread Eric Day
Hi Sandy,

On Thu, Feb 24, 2011 at 07:42:34PM +, Sandy Walsh wrote:
> Looks like a fun project Eric. I only got caught up on the ML this weekend 
> and I'm behind again already. 

It's a never-ending battle. I find routing all messages from jaypipes
into /dev/null helps. :)

> 1. Will broadcast queues be supported? 

As we discussed on IRC, this can be one of two things: broadcast
(unreliable) or replication (reliable). The former is part of the
service. The latter is possible using workers from the start, but
in the future we'll probably add a special "router" API for these
specialized workers. This will look kind of like exchanges in AMQP. For
more info on both, see:

http://wiki.openstack.org/QueueService#Multi-cast_Event_Notifications
http://wiki.openstack.org/QueueService#Other_Features
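
To make the distinction concrete, here's a rough Python sketch of the
two patterns. The endpoints, the JSON payload, and the assumption that
a plain GET leaves messages in place for other readers are all
illustrative, not the final API:

import requests

BASE = "http://queue.example.com/account"

# Broadcast (unreliable): every interested reader polls the same
# queue, and reading does not remove messages, so each reader sees
# every event it manages to poll before expiry.
events = requests.get(BASE + "/events").json()

# Replication (reliable): a specialized "router" worker consumes each
# message once and fans it out by re-PUTting the body into one queue
# per subscriber, then deletes the original.
def route(message_id, body, subscribers):
    for sub in subscribers:
        requests.put("%s/%s/%s" % (BASE, sub, message_id), data=body)
    requests.delete("%s/events/%s" % (BASE, message_id))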

> 2. What are the guarantees of idempotence in the event of retries/errors/etc? 
> (not applicable to broadcast obviously).

Basically, for messages that were added with persistence, we err on
the side of duplicating the message rather than losing it. The client
can be fully aware of this. If a delete fails (because the server
holding the message is down), the client can ignore the failure (when
duplicates are fine), revert the work, or note that a duplicate may be
coming (record the message ID somewhere to check against later, when
the duplicate may show up).
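
A sketch of that last strategy in Python (the endpoints are
hypothetical, and the "possible duplicates" set would live in durable
storage in practice):

import requests

BASE = "http://queue.example.com/account/queue"
possible_dups = set()  # persist this somewhere durable in practice

def do_work(body):
    pass  # placeholder for the real, ideally idempotent, work

def process(message_id, body):
    if message_id in possible_dups:
        # A duplicate of work we already finished; drop it.
        possible_dups.discard(message_id)
        return
    do_work(body)
    try:
        requests.delete("%s/%s" % (BASE, message_id)).raise_for_status()
    except requests.RequestException:
        # The server holding the message may be down, so it could be
        # redelivered later; remember the ID rather than failing.
        possible_dups.add(message_id)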

> 3. Could there be support for WAN clusters?  I.e., only forward certain 
> queues to remote locations to keep chatter down?

Sure, we can construct any topology we want. For a public queue
service, I'm thinking of having a separate cluster per zone (whatever
that means for a given deployment), and then clients can use the local
or a remote zone, probably preferring the local one for speed.
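
From the client side, preferring the local zone could be as simple as
an ordered endpoint list, sketched here in Python (the hostnames are
hypothetical):

import requests

ZONES = ["http://queue.zone-local.example.com",   # same zone, fast
         "http://queue.zone-remote.example.com"]  # WAN fallback

def put_message(path, body):
    for base in ZONES:
        try:
            response = requests.put(base + path, data=body, timeout=2)
            response.raise_for_status()
            return response
        except requests.RequestException:
            continue  # this zone is unreachable, try the next one
    raise RuntimeError("no queue zone reachable")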

> 4. Are all operations assumed to be one-way, or is it assumed that return 
> values are part of each call?  I.e., do I have to set up other queues for 
> return values as callbacks, or is that part of the framework? Is it optional?

Right now it's all async. A sync operation with a 60 second timeout
could be done with:

PUT /account/queue/<id>
GET /account/queue/<id>_result?wait=60

And you would write your worker to PUT the result in a message named
<id>_result.
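
As a sketch of what that looks like from a client, assuming Python and
the illustrative URLs above:

import uuid
import requests

BASE = "http://queue.example.com/account/queue"

def call_sync(body, timeout=60):
    msg_id = uuid.uuid4().hex
    # Fire the request into the queue for some worker to pick up.
    requests.put("%s/%s" % (BASE, msg_id), data=body)
    # Block for up to 'timeout' seconds waiting for the worker to PUT
    # a message named <id>_result with the answer.
    result = requests.get("%s/%s_result?wait=%d" % (BASE, msg_id, timeout))
    return result.text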

-Eric


Re: [Openstack] Burrow (queue service)

2011-02-24 Thread Sandy Walsh
> Let me know if there are any questions I didn't answer thoroughly enough.

Looks like a fun project Eric. I only got caught up on the ML this weekend and 
I'm behind again already. 

Some questions that I jotted down from the previous discussions were:

1. Will broadcast queues be supported? 

2. What are the guarantees of idempotence in the event of retries/errors/etc? 
(not applicable to broadcast obviously).

3. Could there be support for WAN clusters?  I.e., only forward certain 
queues to remote locations to keep chatter down?

4. Are all operations assumed to be one-way, or is it assumed that return 
values are part of each call?  I.e., do I have to set up other queues for 
return values as callbacks, or is that part of the framework? Is it optional?

Thanks
-S


Re: [Openstack] Burrow (queue service)

2011-02-24 Thread Eric Day
Hi Greg,

I've updated the wiki with a number of the items you mention below,
and have added some responses inline. I'm CC'ing the openstack list
so others can see as well.

On Wed, Feb 23, 2011 at 10:49:31AM -0600, gregory_alth...@dell.com wrote:
> 1.  What are the functional constraints of the queue?  In the email chain, 
> you and others mentioned some, but I think the doc needs to be clearer as 
> to the goals/requirements.  Is order maintained?  With regard to whom?  Who 
> decides order?  Are they client relative?  Queue relative?  Proxy relative?  
> I think the document mentions that duplicates might show up and the worker 
> may need to handle this through idempotent operation.  Is the expected base 
> operation a FIFO?  Are LIFOs possible?

See: http://wiki.openstack.org/QueueService#Behavior_and_Constraints

Let me know if there are any questions I didn't answer thoroughly
enough.

> 2.  What is the expectation of the client?  Is it only to track the API, or 
> is it intended to be functional in the operation and distribution of the 
> messages?  For example, the clients in example deployments 1 and 3 are 
> simple RESTful clients that don't have to worry about availability, server 
> lists, distribution mechanisms, etc., while the client in example 2 appears 
> to be doing the work of the proxy without calling out the proxy: each 
> client seems to be maintaining connections and dealing with routing. This 
> seems a little onerous for a client implementation. The two seem different, 
> but I don't think they should be.

Your description is correct, and both are possible. See the last
paragraph in the "Behavior_and_Constraints" section linked above.

> 3.  Who is responsible for account authentication in use cases 1 and 2?  In 
> example 3, it seems that the proxy is the control point, but in the other 
> cases it isn't clear (I mean the queue server is going to do it, but it seems 
> like that isn't that component's job).

The authentication will be pluggable and can be inserted into, or
bypassed in, either the proxy or the queue server. For example, the
proxy server may do the real auth and send an 'already authed' header
to the queue server, assuming this is a trusted link in the
deployment. I want to keep this flexible to accommodate different
deployments.
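
For illustration only, the proxy side of that might look something
like this in Python (the header name and the token check are made up):

import requests

QUEUE_SERVER = "http://queue-internal.example.com"

def validate_token(token):
    # Placeholder; a real deployment plugs in its own auth backend.
    return "demo-account" if token == "secret" else None

def proxy_request(method, path, token, body=None):
    account = validate_token(token)  # the real auth happens here
    if account is None:
        return 401, "unauthorized"
    # Trusted proxy -> queue server link: forward the request with an
    # "already authed" header so the queue server can skip auth.
    resp = requests.request(method, QUEUE_SERVER + path, data=body,
                            headers={"X-Authed-Account": account})
    return resp.status_code, resp.text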

> 4.  Components might want to be split into pieces.  Half the components 
> are logical data elements (queue, account, message), while the other half 
> are actors.  Since Erlang is the chosen language, you may want to skew the 
> blueprint toward that paradigm.

Thanks, I split them up into groups.

> 5. Are there monitoring or feedback mechanisms in the system?  What are 
> your intentions or thoughts?

The logging will be modular, and one of the modules will feed
logs/diagnostics/accounting information back into the queue service
under special queue names. For example, I could subscribe to
/account/_accounting to see all accounting-related events for my
account, or to /_accounting for all accounting info (for service
billing). You can also write service plugins in the server/proxy to
push these elsewhere. I still need to work out the details, but this
is related to the "firehose" discussion in a thread from last week.
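
As a rough sketch of a consumer of that feed (the queue names are from
the example above; the long-poll loop and plain-text payload are
assumptions):

import time
import requests

BASE = "http://queue.example.com"

def follow_accounting(account):
    # /account/_accounting for one account, /_accounting for all.
    url = "%s/%s/_accounting" % (BASE, account)
    while True:
        resp = requests.get(url + "?wait=30")  # long-poll for events
        if resp.status_code == 200:
            handle_event(resp.text)
        else:
            time.sleep(1)  # nothing available yet; back off briefly

def handle_event(event):
    print(event)  # e.g. hand off to a billing or monitoring system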

> 6. What are the security constraints between client <-> proxy, proxy <-> 
> queue, queue <-> proxy, and proxy <-> worker?

This is deployment specific and pluggable. All interfaces will be
modular, so you could use SSL, plain http, or other protocols which
may or may not be secure. For example, if I'm running this in my own
DC behind a DMZ and want optimal throughput, I won't be running with
any secure protocols. Public cloud will be just the opposite.

> 7. Given my concerns with item 2, have you considered going full 
> Swift-style and requiring a proxy in all deployments (it may run in the 
> same Erlang process)?  It seems that by acknowledging that up front and 
> modularizing it at the beginning, you get a more consistent model for all 
> the deployment cases.  Also, we get subcomponents that can be built and 
> validated independently.  The proxy in this model basically becomes an API 
> endpoint and router.  The queue is message persistence at the levels you 
> mentioned.  Clients and workers get to be simple REST clients.

Because clients in some scenarios will want to manage failover or
spread the load themselves, I'd like to keep the proxy optional.
Clients and workers can still be simple REST clients when using a
single queue server or proxy.

We could always require a proxy up front, but from the user's
perspective this would look no different than speaking to the queue
service directly. We should definitely keep these as separate modules
and validate them independently, but I don't see a good reason to
*require* a proxy in all configurations, even if it's in-process.

I'd like to hear more from you on this, perhaps I'm not understanding
the reasoning well enough.

> 8.  I really like the choice of erlang for this and should make for a very 
> clean