I'm looking at Qpid as a possible replacement for a home-grown back-end for
a large real-time chat operation. My design target is 1,000,000 simultaneous
online users, none of whom is trusted (all the brokers and exchanges run in
a controlled data center), each of whom is subscribed to between two and
ten brokers through their individual queues. Exchanges are all fanout;
queues are all unreliable. In addition, there are about 200,000
communication channels ("chat rooms" or whatever -- not AMQP channels),
which are created by the controlled servers, have their own incoming and
outgoing exchanges, and enforce rules on the messages flowing through them.

The logical set-up is something like:

Server creates a channel:
1) Create "channelname".in exchange
2) Create "channelname".out exchange
3) Create server-specific queue
4) Bind "channelname".in to the server-specific queue
5) Subscribe to the server-specific queue
At times, send messages on the "channelname".out exchange.

Client joins the system:
1) Create a "clientname".in exchange
2) Create a client-specific queue
3) Bind "clientname".in to the client-specific queue
4) Subscribe to the client-specific queue
This allows the system to send messages targeted to the user.

Client joins a channel:
5) Bind "channelname".out to the client-specific queue
6) At times, send messages to the "channelname".in exchange

Now, to make this scale appropriately, I'll need to use
clustering/federation. However, the description of the federation commands
for Qpid seems to assume that message processing will be the bottleneck. In
this system, I'm almost certain that simply having a large number of
clients connected to a physical machine will be the main bottleneck. (If it
isn't, that probably points to an opportunity for optimizing the innards of
the system.)

So, let's assume that each 1U pizza box or blade can handle 20,000
connected users or processes. That means I'd need about 60 boxes/blades (50
for the 1,000,000 users, and 10 for the 200,000 server processes). It also
means there are 20,000 exchanges on each user-facing box, and 40,000
exchanges on each channel/chat-room-facing box. (Btw: I'm currently running
1 Gb/s Ethernet ports and backbone, but I could upgrade to a 10 Gb/s
backbone if needed.)

In the description of federation, it's unclear how to set this up so that
it load-balances appropriately without putting mirrors of all 1,400,000
exchanges on all of the server boxes (which would probably not be the way
to go). The most promising part I've seen is the queue route.
Suppose I create a fully connected set of 60 physical boxes, each running a
broker, with 20,000-40,000 exchanges on each. Perhaps I could create a
queue route that routes from a given exchange output to a given queue, for
delivery to the client over the internet. Vice versa, maybe I could have a
route from the client broker to the chat-room broker for each room the
client wants to send messages to.

So, questions:

1) Is this reasonable? Am I thinking wrong somewhere? Is 40,000 exchanges
on a single machine with 20,000 separately connected clients a reasonable
expectation for a quad-core Xeon blade-type server? (Each connection may
send 5 messages per second at most, so actual message traffic won't be that
great.)

2) Does a queue route have to have a queue on both the source and
destination sides, or can I queue-route from a local exchange to a remote
queue? How would I set up a queue route from a locally connected client to
a remote exchange? (For posting messages on the chat-room exchange.)

3) I'm separating exchanges for each client and channel/room here, with
full fan-out for each exchange. This seems simplest -- is it also the best
performing? Would it be better to have a single direct exchange per box,
and use the routing key instead?

4) How do I do reasonable authentication in this system? The kinds of
authentication and authorization I need are:

- The user is authenticated through external means (some super-user
process) and issued a ticket (Kerberos-style). When connecting to a broker,
the user uses this ticket.

- The user can only receive data from the user-specific queue for that
user. This queue is likely created by external means (super-user process),
but subscribed to by the (remote) user client.

- The user can send data only to exchanges that some external means
(super-user process) allows. When a user is joined to a channel/room, the
system should open up the exchange for that room to the user.
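For question 3, the routing-key variant can be simulated without a broker:
one direct exchange per box, where the routing key is the client (or room)
name and each binding maps a key to a queue. The class below is my own toy
model of AMQP direct-exchange matching, not Qpid code, just to show that it
expresses the same delivery pattern as 20,000 separate fanout exchanges.

```python
from collections import defaultdict

class DirectExchange:
    """Toy model of AMQP direct-exchange semantics: a message is copied to
    every queue bound with a routing key equal to the message's key."""
    def __init__(self):
        self.bindings = defaultdict(set)   # routing_key -> set of queue names
        self.queues = defaultdict(list)    # queue name -> delivered messages

    def bind(self, queue, routing_key):
        self.bindings[routing_key].add(queue)

    def publish(self, routing_key, body):
        for queue in self.bindings[routing_key]:
            self.queues[queue].append(body)

# One direct exchange per box instead of one fanout exchange per client:
ex = DirectExchange()
ex.bind("alice.queue", "alice")   # replaces the "alice.in" fanout exchange
ex.bind("bob.queue", "alice")     # a second subscriber to the same key
ex.publish("alice", "hello")
ex.publish("carol", "dropped")    # no binding for "carol" -> no delivery
print(ex.queues["alice.queue"])   # -> ['hello']
```

Whether the broker's internal routing-table lookup beats 20,000 small
fanout exchanges is exactly the performance question being asked.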
I *could* enforce this at the application level (when processing messages
in the channel/room server code), but I'd prefer to weed out invalid
message sources earlier than that.

- Binding of exchanges to queues is controlled by external means
(super-user process), not by the user.

The only things a remote user would be allowed to do are subscribe to the
user-specific queue, and post messages on a small subset of brokers. That
subset is separate for each user, and will change dynamically.

The group and authorization system does not seem like a good fit here,
because users mutate (and are created) all the time; anything that lives in
a text configuration file is unlikely to scale well with user flow. Peak
sign-up periods may see 100+ new users sign up per minute, all of whom
expect to be able to enter the system immediately (assuming the proper
client software is installed).

Would it fit the model better if any message *from* a user also went to a
user-specific queue, which in turn could be bound (perhaps based on routing
key) to the appropriate exchanges for the appropriate targets? Would that
make authentication and authorization easier to enforce?

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.
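The dynamic per-user allow-list described above is easy to express as a
publish-time gate. The sketch below is application-level pseudologic with
hypothetical names -- not a Qpid ACL feature -- but it shows the intended
semantics: the super-user process opens and closes exchanges per user at
join/leave time, with no configuration-file reload.

```python
class PublishGate:
    """Publish-time authorization sketch: each user may publish only to
    exchanges an external super-user process has explicitly opened for
    them. Membership changes at join/leave time, entirely in memory."""
    def __init__(self):
        self.allowed = {}  # user -> set of exchange names

    def join_room(self, user, room):
        # Super-user process opens the room's inbound exchange to the user.
        self.allowed.setdefault(user, set()).add(f"{room}.in")

    def leave_room(self, user, room):
        self.allowed.get(user, set()).discard(f"{room}.in")

    def may_publish(self, user, exchange):
        return exchange in self.allowed.get(user, set())

gate = PublishGate()
gate.join_room("alice", "room42")
print(gate.may_publish("alice", "room42.in"))  # True
gate.leave_room("alice", "room42")
print(gate.may_publish("alice", "room42.in"))  # False
```

This is essentially what the user-specific outbound queue idea would buy:
the gate lives in one controlled place per user rather than in every
channel server.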
