Hello Benoit!

Matthieu and I are trying to create a server instance to host our family's email :) Being the geeks that we are, we are not satisfied with a single-node JPA deployment and want to preserve the scalability and resilience properties of distributed James while minimizing the run cost of a small-scale deployment, so as to keep our hobby in a reasonable[1] price range.

Since a mail server is such a complex piece of software, we chose to concentrate our efforts on a smaller scope first: SMTP processing only. Our initial deployment will thus be SMTP only; then we intend to look into building an IMAP/JMAP instance, a bit like what's described in "distributed james - specialized instances" [2]. At least that's what's guiding our forays into the James codebase.

We looked at various options and found a Pulsar as a service that's fairly cheap over at clever-cloud; they also have a reasonably cheap S3 clone. This is what led to the initial Pulsar dev, as the cost of other MQ SaaS offerings was much higher.

That leaves the primary datastore. We initially targeted Cassandra because we knew that worked, but that made the run cost explode. At the moment we intend to start with a hosted PostgreSQL instance instead, if only to finally be able to see the Pulsar code run without ruining ourselves :)

With this clarified, I'll try to answer your mail:

> The name makes it hard to grasp what goals are researched, we also miss
> a little statement/README about this new product.

The goals did change a couple of times, but it was always meant to be a "simpler", SMTP-specialized instance compared to what could be achieved with the main distributed server app.

> Here is a little explaination of the goal of this application I can propose:
>
> ```
> This artifact leverage Apache Pulsar to deliver a mailet container that
> scales your mail processing.
> It can be used both for inbound and outbound behavior customization.
> Thanks to Apache Pulsar, get your hands on a fully featured distributed
> mail queue to manage efficiently your email delivery.
> It targets minimal dependencies: solely Apache Pulsar and an object store.
> ```
>
> If we reach a consensus I would be happy to write its README, including
> sample configuration and start instructions.

Minimal dependencies to this level would be great, but we looked into it and we don't expect to be able to minimize them that much in the short/medium term.

> I understand from recent tickets (JAMES-3761 JAMES-3762 and JAMES-3763)
> that it is intendeed to not get Cassandra as a dependency. Some features
> are optional and can easily be dropped (recipient rewritting) however
> some others would eventually need an alternative implementation (I am
> thinking to domains, users). What is the plan regarding this?

Offering an alternative to Cassandra, which is an expensive fit for smaller-scale deployments, would indeed require an alternative implementation of:

- DomainList (5 methods)
- RecipientRewriteTable with its regex mapping (26 methods)
- UserRepository (15 methods)

I am not sure we can completely drop RecipientRewriteTable. I think we looked into it with Matthieu and it wasn't as optional as it first looked. I think maybe it had something to do with error handling, which made it a mandatory dependency for the mailet container. Maybe we will try dropping it again in our next attempt at starting the app...

If we can't drop it, it means 46 behaviours to implement. That's quite a lot for the time we can allocate to our evening fun-first pair programming sessions :)

For the time being, we intend to run with JPA instead so we can get our instances up and running, start accumulating feedback from a real-world deployment, and work on documenting the configuration and run sides.

I realize we could build our assembly privately or in a different repository, but we also need a demo place to plug in the implementations we consider to be the most valuable for the community: the Pulsar mailqueue and the blob store mailrepository. I do feel that these two are components that most James deployments would benefit from.

> Regarding the artifact name I am uneasy with it. Generally we tend to be
> too technical on the artifact name and base it on backing technologies
> rather than focussing on the intent. Also, 'relay' is one of the many
> possible features, but there are others. As such I think the
> 'smtp-relay' part of the name is too restrictive.

I wholeheartedly agree, I am quite ashamed of being the author of the current naming.

> Alternative concepts for names:
> - Replace smtp-relay with smtp only to make it more generic

+1

> - Replace smtp-relay by mail-processing which I think might be the
> main target of such a server.

`mail-processing` feels too generic to me; while it might be a good fit, I wouldn't be able to tell what such an app does. I think smtp is a better name, as that is the only mail protocol spoken by the app.

> - Replace pulsar-cassandra by scaling or distributed prefix that would
> emphasis the intent rather than the technology

Within the context of what I explained above, I am not sure which, if any, is appropriate. Maybe scaling is less constraining than distributed. If we end up depending upon JPA we probably won't be able to say "distributed" :)

> - If we are to stick with technology names maybe just keep 'pulsar' as
> Cassandra usage looks accidental and contributors

Cassandra usage is indeed accidental: the initial app cloned the distributed app, simplified it and swapped RabbitMQ with Pulsar.

> Which would lead to (scalar combination of the above):
> - pulsar-smtp / distributed-smtp
> - pulsar-mail-processing / distributed-mail-processing /
> scaling-mail-processing

So maybe scaling-pulsar-smtp?

Jean

[1] For some definition of "reasonable" that we carefully avoid investigating too closely; ignorance is bliss.

[2] https://github.com/apache/james-project/blob/master/server/apps/distributed-app/docs/modules/ROOT/pages/architecture/specialized-instances.adoc#distributed-james-server--specialized-instances
I couldn't navigate to this page from the James website. I remember having done so accidentally in the past, but I couldn't find the path this time around. Also, https://james.apache.org/server/objectives.html has a dead link to Distributed Email server <https://james.staged.apache.org/james-distributed-app/3.7.0/index.html>