Hello Jean,

Did you have a look at the configuration-based server-data implementations?

They could fit your goals while minimizing storage costs.

I'm thinking of XMLDomainList and XMLRecipientRewriteTable. We would likely miss a corresponding XML user repository, but this should not be hard to implement if needed.
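
As a rough idea of the effort, here is a minimal sketch of a read-only, XML-backed user store. It is not the James UsersRepository interface: the XmlUserStore class, the <users>/<user> elements and the name/passwordHash attributes are all hypothetical names made up for illustration, and only JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: a read-only user store loaded once from XML.
// It does NOT implement the real James UsersRepository contract.
final class XmlUserStore {
    // Lower-cased username -> password hash, as declared in the XML.
    private final Map<String, String> passwordHashByUser = new HashMap<>();

    // Parses a document such as:
    // <users><user name="jean@example.org" passwordHash="..."/></users>
    static XmlUserStore fromXml(String xml) {
        XmlUserStore store = new XmlUserStore();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList users = doc.getElementsByTagName("user");
            for (int i = 0; i < users.getLength(); i++) {
                Element user = (Element) users.item(i);
                store.passwordHashByUser.put(
                    normalize(user.getAttribute("name")),
                    user.getAttribute("passwordHash"));
            }
        } catch (Exception e) {
            // Malformed XML or I/O failure: surface it as a configuration error.
            throw new IllegalArgumentException("Invalid user configuration", e);
        }
        return store;
    }

    // Usernames are compared case-insensitively in this sketch.
    private static String normalize(String username) {
        return username.toLowerCase(Locale.ROOT);
    }

    boolean contains(String username) {
        return passwordHashByUser.containsKey(normalize(username));
    }

    Optional<String> passwordHash(String username) {
        return Optional.ofNullable(passwordHashByUser.get(normalize(username)));
    }

    int count() {
        return passwordHashByUser.size();
    }
}
```

Since the store is read-only, every mutating operation of a real user repository (add, remove, update) would have to be rejected or left unsupported. That is probably acceptable for a family-sized deployment where users are edited in the configuration file.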

For the artifact name, +1 for scaling-pulsar-smtp.

Best regards,

Benoit

On 11/05/2022 04:50, Jean Helou wrote:
Hello Benoit !

Matthieu and I are trying to create a server instance to host our family's
email :)
Being the geeks that we are, we are not satisfied with a single node JPA
deployment and want
to preserve the scalability and resilience properties of distributed james
while minimizing the
run cost of a small scale deployment so as to keep our hobby in a
reasonable[1] price range.

Since a mail server is such a complex piece of software, we chose to
concentrate our efforts
on a smaller scope first: SMTP processing only. Our initial deployment will
thus be SMTP
only, then we intend to look into building an IMAP/JMAP instance, a bit
like what's described in
"distributed james - specialized instances" [2]. At least that's what's
guiding our forays into james
codebase.

We looked at various options and found a Pulsar-as-a-service offering that's
fairly cheap over at clever-cloud; they also have a reasonably cheap S3 clone.
This is what led to the initial pulsar dev, as the cost of other
MQ SaaS offerings was much higher.
That leaves the primary datastore: we initially targeted Cassandra because
we knew it worked, but that made the run cost explode.

At the moment we intend to start with a hosted postgresql instance
instead. If only to finally be able to see the pulsar code run without
ruining ourselves :)

With this clarified, I'll try to answer your mail :

The name makes it hard to grasp what goals are pursued; we also miss
a short statement/README about this new product.

The goals did change a couple of times, but it was always meant to be a
"simpler", SMTP-specialized instance compared to what could be achieved with
the main distributed server app.

Here is a short explanation of the goal of this application that I can propose:
```
This artifact leverages Apache Pulsar to deliver a mailet container that
scales your mail processing.
It can be used both for inbound and outbound behavior customization.
Thanks to Apache Pulsar, get your hands on a fully featured distributed
mail queue to efficiently manage your email delivery.
It targets minimal dependencies: solely Apache Pulsar and an object store.
```

If we reach a consensus I would be happy to write its README, including
sample configuration and start instructions.

Minimal dependencies to this level would be great but we looked into it and
we don't expect to be able to minimize them that much, in the short/medium
term.


I understand from recent tickets (JAMES-3761, JAMES-3762 and JAMES-3763)
that it is intended to not get Cassandra as a dependency. Some features
are optional and can easily be dropped (recipient rewriting); however,
some others would eventually need an alternative implementation (I am
thinking of domains, users). What is the plan regarding this?

Offering an alternative to cassandra which is an expensive fit for smaller
scale deployments would indeed require an alternative implementation to:
- DomainList (5 methods)
- RecipientRewriteTable with its regex mapping (26 methods)
- UserRepository (15 methods)

I am not sure we can completely drop RecipientRewriteTable, I think we
looked into it with Matthieu and it wasn't as optional as it first looked.
I think maybe it had something to do with error handling which made it a
mandatory dependency for the mailet container.
Maybe we will try dropping it again in our next attempt at starting the
app...

If we can't drop it, it means 46 behaviours to implement. That's quite a
lot for the time we can allocate to our
evening fun-first pair programming sessions :)
For the time being, we intend to run with JPA instead so we can get our
instances up and running, start accumulating feedback from a real world
deployment and work on documenting the configuration and run sides.

I realize we could build our assembly privately or in a different
repository but we also need a demo place to plug in the implementations we
consider to be the most valuable for the community : the pulsar mailqueue
and the blob store mailrepository.
I do feel that these two are components that most James deployments would
benefit from.

Regarding the artifact name I am uneasy with it. Generally we tend to be
too technical on the artifact name and base it on backing technologies
rather than focussing on the intent. Also, 'relay' is one of the many
possible features, but there are others. As such I think the
'smtp-relay' part of the name is too restrictive.

I wholeheartedly agree, I am quite ashamed of being the author of the current
naming.

Alternative concepts for names:
   - Replace smtp-relay with smtp only to make it more generic

+1

   - Replace smtp-relay by mail-processing which I think might be the
main target of such a server.

`mail-processing` feels too generic to me, while it might be a good fit I
wouldn't be able to tell what such an app does. I think smtp is a better
name as that is the only mail protocol
spoken by the app.

   - Replace pulsar-cassandra by scaling or distributed prefix that would
emphasize the intent rather than the technology

Within the context of what I explained above, I am not sure which, if any,
is appropriate.
Maybe scaling is less constraining than distributed. If we end up depending
upon JPA we probably won't be able to say "distributed" :)

   - If we are to stick with technology names maybe just keep 'pulsar' as
Cassandra usage looks accidental and contributors

Cassandra usage is indeed accidental: the initial app cloned the distributed
app,
simplified it and swapped RabbitMQ for Pulsar.

Which would lead to (scalar combination of the above):
   - pulsar-smtp / distributed-smtp
   - pulsar-mail-processing / distributed-mail-processing /
scaling-mail-processing

So maybe scaling-pulsar-smtp ?

Jean

[1] For some definition of "reasonable" that we carefully avoid
investigating too closely; ignorance is bliss
[2]
https://github.com/apache/james-project/blob/master/server/apps/distributed-app/docs/modules/ROOT/pages/architecture/specialized-instances.adoc#distributed-james-server--specialized-instances
I couldn't navigate to this page from the james website, I remember having
done so accidentally in the past, but I couldn't find the path this time
around. Also https://james.apache.org/server/objectives.html has a dead
link to Distributed Email server
<https://james.staged.apache.org/james-distributed-app/3.7.0/index.html>

