Re: MailstoreApi (was Re: ElasticSearch upgrade to 8.2)

Benoit TELLIER Fri, 10 Jun 2022 01:53:21 -0700

Hello Jean,

Answers inlined.


On 10/06/2022 15:22, Jean Helou wrote:

I fork the thread to respond on the MailRepository part :D


    > fun quiz can you tell looking
    > only at the documentation and code comments the difference between :
    > MailRepositoryStore, MailRepositoryUrlStore and MailRepository
    > all of which
    > are in mailrepository-api  )
    Game accepted.
      - Mail repository is storage for emails, with their processing
        context (long term storage, differs from mail queue which is a flow).
      - Mail repositories are identified by their URL
      - Mail repositories can be created through the mail repository
        store by supplying a URL
      - MailRepositoryUrlStore is an implementation detail of
        MailRepositoryStore, and brings persistence to mail repositories
        (that are created through webadmin, configuration changes etc.)


Almost :D

MailRepositoryStore only has 2 implementations: in memory and a spring based one.

If I understand it correctly the MailRepositoryStore is actually a computing cache. Roughly equivalent to Map<MailRepositoryUrl, MailRepository> with a generic factory method to create the MailRepository when it is not already in the cache.

Yes.
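
The "computing cache" reading above can be illustrated in a few lines of Java. This is only a minimal sketch under stated assumptions, not the James code: SketchRepository, InMemorySketchRepository and ComputingRepositoryCache are hypothetical names, and URLs are plain strings rather than MailRepositoryUrl.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a mail repository; NOT the James MailRepository interface.
interface SketchRepository {
    String url();
}

// Trivial implementation, present only so the sketch compiles and runs.
class InMemorySketchRepository implements SketchRepository {
    private final String url;

    InMemorySketchRepository(String url) {
        this.url = url;
    }

    @Override
    public String url() {
        return url;
    }
}

// Computing cache: URL -> repository, created by the factory on first access.
class ComputingRepositoryCache {
    private final Map<String, SketchRepository> cache = new ConcurrentHashMap<>();
    private final Function<String, SketchRepository> factory;

    ComputingRepositoryCache(Function<String, SketchRepository> factory) {
        this.factory = factory;
    }

    // Returns the cached repository, invoking the factory only when the URL is unknown.
    SketchRepository select(String url) {
        return cache.computeIfAbsent(url, factory);
    }

    // Number of repositories created so far.
    int size() {
        return cache.size();
    }
}
```

Calling select twice with the same URL hands back the same instance, and nothing survives a restart, which is the part the URL store is meant to cover.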

The factory method relies on a statically injected config that maps "protocols" to the FQDN of the corresponding implementation. When the "store" resolves a MailRepository through its MailRepositoryUrl, it retrieves the FQDN from the protocol part of the url then delegates to spring, guice or a static map to actually get the corresponding implementation.

The naming Store and InMemory got me mightily confused when I tried to sort this out to inject the blob mail repository store.

MailRepositoryUrls start with a protocol such as cassandra://, blob:// or file://, then a repository "id".
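
To make the protocol lookup concrete, here is a hedged Java sketch of splitting a URL into its protocol and its repository "id", then resolving the protocol against a static map. SketchRepositoryUrl and ProtocolResolver are hypothetical names, the configured implementation names are placeholders, and the real James classes may well differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class: <protocol>://<path>, e.g. cassandra://var/mail/error.
final class SketchRepositoryUrl {
    private static final String SEPARATOR = "://";

    final String protocol;
    final String path;

    private SketchRepositoryUrl(String protocol, String path) {
        this.protocol = protocol;
        this.path = path;
    }

    static SketchRepositoryUrl parse(String url) {
        int index = url.indexOf(SEPARATOR);
        if (index <= 0) {
            throw new IllegalArgumentException("Expected <protocol>://<path> but got " + url);
        }
        return new SketchRepositoryUrl(
            url.substring(0, index),
            url.substring(index + SEPARATOR.length()));
    }
}

// Hypothetical resolver: a static map from protocol to implementation name.
final class ProtocolResolver {
    private final Map<String, String> implementationByProtocol;

    ProtocolResolver(Map<String, String> implementationByProtocol) {
        // Defensive copy so later changes to the caller's map are not observed.
        this.implementationByProtocol = new HashMap<>(implementationByProtocol);
    }

    // Looks up the implementation name registered for the URL's protocol.
    String implementationFor(String url) {
        String protocol = SketchRepositoryUrl.parse(url).protocol;
        String implementation = implementationByProtocol.get(protocol);
        if (implementation == null) {
            throw new IllegalArgumentException("No implementation configured for protocol " + protocol);
        }
        return implementation;
    }
}
```

With a map containing only "cassandra", resolving cassandra://var/mail/error returns the configured name, while a URL with an unconfigured protocol is rejected with an IllegalArgumentException.
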
The MailRepositoryUrlStore has 3 implementations: cassandra, jpa and in memory. I am not yet clear on what having a persistent store brings over using the in memory store.
    Now to have a MailRepositoryStore not based on Cassandra, the memory
    implementation is good enough if manual creation of mail repositories
    is forbidden (aka through webadmin) and if configuration is
    homogeneous in the James cluster.
Even if you were to create a mail repository manually, I don't understand how anything would be stored in it if it is not mentioned in the server's configuration (mailet container's config most likely).

One case is "it was mentioned in the server configuration, and no longer is".

Without such persistence you could not, for instance, reprocess mail repositories that you had been using.


--------------

Another case is "parametric mail repository", i.e. cassandra://var/mail/customera.com/rejected

One such example is Data Leak Prevention, cf. https://github.com/apache/james-project/tree/master/server/mailet/mailets/src/main/java/org/apache/james/transport/matchers/dlp

And its friend https://github.com/apache/james-project/blob/master/server/mailet/mailets/src/main/java/org/apache/james/transport/mailets/ToSenderDomainRepository.java

I might want to access a mail repository that exists and contains stuff, but is not provisioned locally because the James server I am using has not yet rejected an email for this domain since it was started.


--------------

Another thing is the difference between mailrepository URL / path (which I am not a fan of)


The idea was not to leak through webadmin the underlying storage structure

URL: cassandra://var/mail/error
PATH: var/mail/error

Then you need to translate between the path and the URL, which is not trivial in the face of several underlying storage technologies (jdbc + file for example)

Even if there is a way to dynamically make James store mails in a mail repository that is not mentioned in the configuration, the in memory implementation will still register it when it is used. I guess that only leaves discoverability of existing MailRepositoryUrls across restarts when a URL is not used much. That leaves me wondering what the actual use case is ...

Well the one time I had to deal with mail repositories with a customer, listing them was handy.

That being said, I also share the feeling that "listing URLs in use" through MailRepositoryUrlStore might be overkill.

Instead we could rely on each MailRepository implementation to list the URLs it actually contains, thus drop MailRepositoryUrlStore altogether, make it an implementation detail.


We would get :

- MailRepositoryUrlSupplier interface with an implementation for each MailRepository implementation.
- Implementations can base decisions on their underlying storage, thus removing the need for additional metadata.
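
A rough Java sketch of what the two points above could look like. Only the interface name comes from this thread; the listUrls method, StorageBackedUrlSupplier and UrlListing are assumptions made for illustration, and a sorted set stands in for whatever the real backend would query.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Interface name from the proposal; the single method is an assumption.
interface MailRepositoryUrlSupplier {
    Stream<String> listUrls();
}

// Hypothetical supplier deriving URLs from what its own storage contains,
// so no separate URL table is needed.
class StorageBackedUrlSupplier implements MailRepositoryUrlSupplier {
    private final String protocol;
    private final Set<String> storedPaths = new TreeSet<>();

    StorageBackedUrlSupplier(String protocol) {
        this.protocol = protocol;
    }

    // Stand-in for "a mail was stored under this path" in a real backend.
    void recordPath(String path) {
        storedPaths.add(path);
    }

    @Override
    public Stream<String> listUrls() {
        return storedPaths.stream().map(path -> protocol + "://" + path);
    }
}

// Aggregates the suppliers of every configured implementation.
class UrlListing {
    private final List<MailRepositoryUrlSupplier> suppliers;

    UrlListing(List<MailRepositoryUrlSupplier> suppliers) {
        this.suppliers = suppliers;
    }

    // Returns all known URLs across suppliers, sorted for stable output.
    List<String> listAll() {
        return suppliers.stream()
            .flatMap(MailRepositoryUrlSupplier::listUrls)
            .sorted()
            .collect(Collectors.toList());
    }
}
```

The listing is then computed on demand from each backend, so a repository such as cassandra://var/mail/customera.com/rejected shows up even on a node that never wrote to it, provided the backend itself can enumerate its contents.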

I would support such a refactoring. One less Cassandra table makes me happy ;-)

    Ideally MailRepositoryUrlStore should not have been in the API.
Interesting, according to git history it was introduced by https://issues.apache.org/jira/browse/JAMES-2418 but that only says that the last point I mention above (the discoverability part) is needed but not why it is needed :D

I hope I did get better at writing issues since then :-P


cheers
jean


Cheers
