Though Cassandra supports multi-DC, cross-availability-zone deployments well, this doesn't mean every Cassandra-backed implementation does.
And James doesn't:

 - IMAP's reliance on incremental monotonic counters means strong consistency, which doesn't play well with high latencies (2-4 round trips)
 - multiple levels of metadata make it prone to inconsistencies if not operated with quorum consistency
 - and quorum consistency means cross-availability-zone reads and writes, which is a latency and throughput show stopper.

TL;DR: James distributed server can work on multi-DC, but with significant shortcomings, and only with a region-wide setup, not a worldwide one.

--
Best regards,

Benoit TELLIER

General manager of Linagora VIETNAM.
Product owner for Team-Mail product.
Chairman of the Apache James project.

Mail: btell...@linagora.com
Tel: (0033) 6 77 26 04 58 (WhatsApp, Signal)

On Mar 8, 2025 10:48 AM, Jean Helou <jhe...@apache.org> wrote:

Hi Matt,

This has turned into a rather long answer. The first part is more about James in general, the second is more about your specific setup :)

As far as I'm aware James itself is stateless. I don't think you lose counter values when you restart your main server. Thus, you should be able to spin up as many James instances as you want and point them to the same storage without issues. Even if there are some asynchronous state updates, the state should eventually converge.

The difficulty is distributed storage, not distributed processing. For instance, if you spin up a MariaDB on one of your new VPSs and reload a backup from your main MariaDB, the states of both databases will immediately start to diverge as they are unaware of each other: new messages delivered to your main server since the backup will not be visible on the VPS, and messages read on the VPS will still appear unread on the main server.

From there you will want to look into replication, but simple primary/secondary replication will throw errors on writes to the secondary, making your secondary James instance fill its error logs with failed writes. The next step is multi-master replication, which is something I never tried.
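
The divergence described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `MailboxStore`, `last_uid` and `deliver` are hypothetical names invented for the illustration, not James or MariaDB APIs, and a real mailbox schema holds far more state. It shows two copies restored from the same backup each handing out the same next counter value for a different message.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class MailboxStore:
    """Toy stand-in for one database holding a single mailbox (hypothetical)."""
    last_uid: int = 0
    messages: dict = field(default_factory=dict)  # uid -> subject

    def deliver(self, subject: str) -> int:
        # Message identifiers grow monotonically within a mailbox,
        # so each store simply hands out last_uid + 1.
        self.last_uid += 1
        self.messages[self.last_uid] = subject
        return self.last_uid


# State of the main server at backup time.
main = MailboxStore()
main.deliver("welcome")
main.deliver("invoice")

# "Reload the backup" on a VPS: an independent copy, unaware of the original.
replica = copy.deepcopy(main)

# Each server keeps accepting mail on its own.
uid_on_main = main.deliver("delivered to main after the backup")
uid_on_replica = replica.deliver("delivered to the VPS after the backup")

# Same identifier, different message: the two states have diverged.
conflicts = {
    uid: (main.messages[uid], replica.messages[uid])
    for uid in main.messages.keys() & replica.messages.keys()
    if main.messages[uid] != replica.messages[uid]
}
print(uid_on_main, uid_on_replica)  # both stores allocated 3
print(conflicts)
```

Merging the two stores afterwards would need one of the messages to be renumbered, which is why copying the database in the background is not enough on its own.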

The distributed James app demonstrates a fully distributed system, including a distributed database (Cassandra), a distributed message broker (RabbitMQ, IIRC), a distributed search engine (OpenSearch), etc. This allows you to have as many James nodes as you want, all talking to as many messaging/storage nodes as you want, all fully synced and with write semantics that offer reasonable consistency. This is a setup that makes sense for massive deployments, if you wanted to build the next Google Mail for example.

The use of blob storage (S3-like) to store message contents is an orthogonal concern. Database storage is fairly expensive compared to blob storage, and storing large blobs in databases, while doable, is usually not recommended, at least not without specific table design. The same is true for message brokers. The alternatives are storing on the file system, which is not distributed, or using a blob store. I'm almost certain you can configure the distributed app (or build a variant of it) so that it does not use blob storage, but I wouldn't recommend it.

Now, how all this applies to your setup :)

My understanding is that for now you have a single, rather powerful machine hosting both James and MariaDB. The James instance handles both SMTP and IMAP or POP. I'll also assume that you don't intend to start operating a multi-DC Cassandra cluster :) Finally, I'll assume the VPSs are rather small at this price :)

If they are large enough to host a clone of your main MariaDB and its data, you can use one for MariaDB and another for James. Start from a backup of the main MariaDB, then use IMAP sync to get eventual consistency between the mailboxes on your main server and the replica.

You can go further and spread the workload of the main server too. Start a James instance configured for IMAP/POP on a couple of VPS instances, and keep the DB config pointing at the main MariaDB.
Change your clients' config, and eventually you can drop the corresponding listeners on the main server if you want. Do the same for SMTP and put the new instances at a higher priority than the one running on the main server. After a while you can even stop the James process on the main server entirely :)

The downside of course is increased latency, both from client to VPS and from VPS to VPS or to the main database server.

I hope that opens avenues for exploration :)

Have fun

On Sat, Mar 8, 2025 at 03:06, cryptearth <cryptea...@cryptearth.de.invalid> wrote:

> Hello there dear James devs and fellow James users,
>
> my hoster OVH currently offers me a great deal on VPSs for less than 12
> bucks a year (less than 1 buck per month) in several datacenters around
> the world. I'm really tempted to get that deal as I have some ideas to
> utilize multiple servers - having them around the world like in
> Australia and Canada is just a bonus.
> One thing I plan to implement is to set up James on each of the servers.
> But then the question came up: How to synchronize them?
> Currently I use my home server only as a backup without any
> synchronization with my main root server. In fact: It's currently not
> running due to some issues I have with my home server I have to fix
> first before I get James running again.
> Now when scaling up to several servers around the world it would be cool
> to take advantage of that by combining them with synchronization. But as
> the additional systems are VPSs only I'd like to set up a master-slave
> setup with each slave James on the VPSs syncing up to the master James on
> my powerful root server.
> First I thought about fetchmail to at least pull in mails from the
> slaves to the master - but fetchmail is only part of the deprecated
> spring build.
> As I like to have my mail storage in a database I would
> like to keep using the guice-jpa build instead of switching to the
> guice-distributed which doesn't use jpa and seems to be meant for use
> with AWS S3 buckets.
> I also could write some java code using the java mail api working in a
> fetchmail way itself - but I'm unsure how to inject mails from other
> servers properly into the main server so they look as if they were
> received by the master server itself.
> Could it be done by just synchronizing the MariaDB databases in the
> background or would fiddling with the database while James is running
> screw it up, like the several counters for mails and mailboxes?
> If James 3.x isn't suited for such a use case maybe that's something to
> be considered for 4.0? Or is that too late into the current development
> now and would delay a 4.0 release?
>
> I would like to explore this idea further to see if and how James can be
> used in a distributed cluster like other mailers can. Building a James
> mail server cluster sounds just cool - and seen from "well, big
> companies like google have several hundreds to thousands mail servers
> deployed around the globe all working together" it sure has to be
> possible with James as well - as broken down it's just some listeners on
> some server sockets with some database backend synchronized by a message
> bus. This should be extendable across multiple servers.
>
> Have a nice weekend everyone.
>
> Greetings from Germany,
>
> Matt
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
> For additional commands, e-mail: server-user-h...@james.apache.org