On Wed, 10 Jun 2009 10:55:23 -0700, Jonathan Feally <vult...@netvulture.com> wrote:

> Paul J Stevens wrote:
>> Some kind of replication of the dbmail_users table across database
>> servers would still be required, like we do now between ldap and sql.
>> That would allow *any* of the database backends to function as the
>> primary backend - in fact, any of the active daemons/tools would be
>> able to use any of the backends for its initial database connection,
>> switching to another if the authenticated user so requires.

Right, simply shard the tables that contain messages on a per-user basis. Figuring out a sharding policy for shared mailboxes is probably the only really hard part; otherwise it's totally standard practice.

One technique I've seen is to have one common database and several users databases. The common database holds configs, user-shard lookups, and any other data that isn't associated with a user. The users databases contain all the heavy stuff. If a shard gets too big, a resharding tool can move a user from one database to another, verify their data, flip their shard-lookup entry in the common database to the new database, and then clean up. I wish I still had time to code this - it's a lot of fun to shard out an app!
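Roughly, the lookup-and-reshard flow I mean is something like the sketch below. The table and column names (user_shard, messages) are made up for the example - this is not the actual dbmail schema - and sqlite3 is just a stand-in for whichever RDBMS the shards actually run on:

# Sketch of a common-db shard lookup plus a reshard pass.
# Hypothetical schema: common.user_shard(user_id, shard) plus a
# messages table on each users database. Not dbmail's real schema.
import sqlite3

def shard_for_user(common, user_id):
    """Ask the common database which users database holds this user."""
    row = common.execute(
        "SELECT shard FROM user_shard WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None

def reshard_user(common, shards, user_id, dst):
    """Move one user to a new shard: copy, verify, flip the lookup, clean up."""
    src = shards[shard_for_user(common, user_id)]
    rows = src.execute(
        "SELECT id, mailbox, body FROM messages WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    shards[dst].executemany(
        "INSERT INTO messages (id, mailbox, body, user_id) VALUES (?, ?, ?, ?)",
        [(*r, user_id) for r in rows],
    )
    shards[dst].commit()
    # Verify the copy before flipping the common database entry.
    copied = shards[dst].execute(
        "SELECT COUNT(*) FROM messages WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
    assert copied == len(rows), "copy incomplete - leaving lookup alone"
    common.execute(
        "UPDATE user_shard SET shard = ? WHERE user_id = ?", (dst, user_id)
    )
    common.commit()
    # Only now is it safe to clean the user out of the old shard.
    src.execute("DELETE FROM messages WHERE user_id = ?", (user_id,))
    src.commit()

# Usage, e.g.:
#   common = sqlite3.connect("common.db")
#   shards = {0: sqlite3.connect("users0.db"), 1: sqlite3.connect("users1.db")}
#   reshard_user(common, shards, user_id=42, dst=1)

The ordering is the whole point: the lookup entry only flips after the copy is verified, and the old rows are only deleted after the flip, so a crashed reshard leaves at worst a duplicate to garbage-collect, never a lost user.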
> My thought on separating the tables like I had stated is that you
> could have a circular replication cluster or ndb cluster on the front
> end that all the daemons could use; that would include the acl and
> mailboxes tables. Changes to these tables are few and far between,
> making the replication of data in this front-end cluster much faster
> than that of message insertion and cleanup.
>
> The rule set for which backend database mailboxes go on could be per
> mailbox, per user, or mixed. If the rule was per user, the first 10
> mailboxes would be created on the server set in the user's row. If
> that setting changed, then new mailboxes would go to a different
> backend. The backend servers could be a simple master-slave or
> master-master replication setup with only 1 server in that cluster
> talking to dbmail. I think this approach with a fully replicating
> front-end cluster and multiple backend clusters could be used to have
> a huge user base around the world, with users being homed to a
> backend near their actual location, and using a front-end connection
> at that same site.
>
> I could have data center locations in Los Angeles, New York, London,
> and India. I have users all around, but primarily closer to one of
> those areas. A user would use the pop3 or imap daemon running at
> their closest data center, and the rule is set such that their
> backend is one of the backends at that same data center. But I have
> developers in India and Los Angeles that need to be able to access a
> public mailbox on a New York backend. The front-ends are replicating
> over a vpn or private links, and the dbmail daemons can connect to
> any backend over this private meshed network. I'm in Los Angeles and
> open the public mailbox - that mailbox's data is pulled from New York
> to Los Angeles to the daemon I'm connected to, then the results are
> sent on to me, the client. Same as if I had done it connecting to the
> India site. This should not break any functionality, as each mailbox
> is fully stored on the same backend. I do like the approach of having
> it per mailbox, as some users have a lot of email in a lot of
> mailboxes.
>
> Of course, all of this advanced separation of message storage could
> be essentially disabled by just putting all the tables on the same
> database server as in backend #0 - the default.
>
> -Jon
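The per-mailbox routing Jon describes is really just one extra lookup against the replicated front-end tables before any message data is touched. Something like the sketch below, where the table and column names (mailbox_backend, messages), the BACKENDS map, and the DSN strings are all invented for the example (with sqlite3-style placeholders), not taken from the real schema:

# Sketch of per-mailbox backend routing via the front-end cluster.
# Hypothetical schema and DSNs throughout - not the dbmail schema.

# Backend #0 is the default; pointing every mailbox at it effectively
# disables the whole separation, as Jon notes.
BACKENDS = {
    0: "host=la.example.com dbname=dbmail",  # Los Angeles
    1: "host=ny.example.com dbname=dbmail",  # New York
}

def backend_for_mailbox(frontend, mailbox_id):
    """Resolve a mailbox to its backend via the replicated tables."""
    cur = frontend.cursor()
    cur.execute(
        "SELECT backend FROM mailbox_backend WHERE mailbox_id = ?",
        (mailbox_id,),
    )
    row = cur.fetchone()
    return row[0] if row else 0  # fall back to the default backend

def open_mailbox(frontend, connect, mailbox_id):
    """Fetch a mailbox from whichever backend owns it.

    'connect' is any DB-API connect function. The daemon does the
    cross-site fetch over the private mesh; the client just sees the
    results, so a New York mailbox opened from Los Angeles behaves
    the same as a local one - each mailbox lives whole on one backend.
    """
    backend = connect(BACKENDS[backend_for_mailbox(frontend, mailbox_id)])
    try:
        cur = backend.cursor()
        cur.execute(
            "SELECT id, subject FROM messages WHERE mailbox_id = ?",
            (mailbox_id,),
        )
        return cur.fetchall()
    finally:
        backend.close()

Since the mailbox-to-backend table lives in the fully replicated front-end cluster, that lookup is always local and cheap; only the message fetch ever crosses the wide-area links.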