Hi,

I would like to implement some kind of analytics with James (simple examples: the most frequent terms per user, natural language analysis, ...). It could be done in batch once mail is persisted in the mailbox, via mailbox listeners, in the future remote/local delivery services we talked about, or via mailets.
Is it recommended to do it via mailets (current or future API)?

I also find routing very important and would love to see features such as those described in EIP (Enterprise Integration Patterns), as implemented by the Apache Camel project.

Tks,

Eric


On 11/01/2011 17:00, Stefano Bagnara wrote:
Hi all,

In the last year we successfully separated the mailet-api and a
collection of generic matchers/mailets from the main James codebase.
Yet we still have many mailets in the code base, mainly because they
depend on some James-specific behaviour/class. I think this is not
good for mailets and that this is the right place to find a solution.
In the past we discussed wishlists for mailets v3, but we always got
stuck. I am trying again and I hope to find some time to push this a
bit further.

In order to allow more generic mailets I think we need to evolve the
MailetContext to provide some more services.

Here are a couple of easy and enabling changes:

+ dns queries: SPF, whitelists, blacklists, DKIM. A lot of mail
processing nowadays requires dns calls. Mailets could use the java
dns resolver, but we know that from a server point of view it is
better to work with dnsjava: if we provide some generic api to run a
dns query, then each container will be able to decide how to implement
it and mailets would be simpler and more portable.
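
A minimal sketch of what such a generic dns api could look like. The
names DnsLookup, StaticDnsLookup and BlacklistCheck are hypothetical
and not part of the current mailet api; the in-memory implementation
only stands in for a container that might delegate to dnsjava.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical container-agnostic DNS api (an assumption, not the mailet api).
interface DnsLookup {
    // Returns the record values for the given name and type, or an empty list.
    List<String> query(String name, String recordType);
}

// In-memory stand-in for a container implementation (e.g. one backed by dnsjava).
final class StaticDnsLookup implements DnsLookup {
    private final Map<String, List<String>> records;

    // Keys have the form "<TYPE> <lowercase name>", e.g. "A mail.example.org".
    StaticDnsLookup(Map<String, List<String>> records) {
        this.records = records;
    }

    @Override
    public List<String> query(String name, String recordType) {
        String key = recordType + " " + name.toLowerCase(Locale.ROOT);
        return records.getOrDefault(key, Collections.<String>emptyList());
    }
}

// Example consumer: a blacklist check that only depends on DnsLookup.
final class BlacklistCheck {
    private final DnsLookup dns;
    private final String zone;

    BlacklistCheck(DnsLookup dns, String zone) {
        this.dns = dns;
        this.zone = zone;
    }

    // An IPv4 address counts as listed if "<reversed octets>.<zone>" has an A record.
    boolean isListed(String ipv4) {
        String[] o = ipv4.split("\\.");
        String reversed = o[3] + "." + o[2] + "." + o[1] + "." + o[0];
        return !dns.query(reversed + "." + zone, "A").isEmpty();
    }
}
```

With something like this, the blacklist logic stays the same whichever
resolver the container plugs in behind DnsLookup.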

+ redirect/forward mailets tree: we have a bunch of mailets that are
not generic because they duplicate the current "Mail". The mailet api
doesn't provide a way to do this, so our mailets simply have to use
"new MailImpl(mail)". What about exposing message creation and message
duplication methods from the mailet context? e.g.:
    Mail createMail(String sender);
    Mail duplicateMail(String sender, Mail original);
   (as an alternative to having the sender as a parameter we could add
a setSender to the Mail interface. Not sure what's best)
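
To make the two methods above concrete, here is a small self-contained
sketch. The Mail interface below is a deliberately reduced stand-in
for org.apache.mailet.Mail, and MailFactory, SimpleMail and
SimpleMailFactory are hypothetical names; in the proposal the two
methods would live on the mailet context itself.

```java
import java.util.ArrayList;
import java.util.List;

// Reduced stand-in for the real Mail interface (assumption for this sketch).
interface Mail {
    String getSender();
    List<String> getRecipients();
    String getBody();
}

// The two proposed methods, isolated in a hypothetical factory interface.
interface MailFactory {
    Mail createMail(String sender);
    Mail duplicateMail(String sender, Mail original);
}

final class SimpleMail implements Mail {
    private final String sender;
    private final List<String> recipients;
    private final String body;

    SimpleMail(String sender, List<String> recipients, String body) {
        this.sender = sender;
        // Copy the list so a duplicate never shares state with its original.
        this.recipients = new ArrayList<>(recipients);
        this.body = body;
    }

    @Override public String getSender() { return sender; }
    @Override public List<String> getRecipients() { return recipients; }
    @Override public String getBody() { return body; }
}

// A container would supply its own implementation; mailets see only MailFactory.
final class SimpleMailFactory implements MailFactory {
    @Override
    public Mail createMail(String sender) {
        return new SimpleMail(sender, new ArrayList<String>(), "");
    }

    @Override
    public Mail duplicateMail(String sender, Mail original) {
        return new SimpleMail(sender, original.getRecipients(), original.getBody());
    }
}
```

A redirect-style mailet written against such an interface would no
longer need to call a container-specific constructor.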


Then I have a couple of other changes. These will require some more
thinking, but I'd like to throw them in; maybe someone else shares my
needs:

+ routing: in a mailet container we "move mails" around. Yet the
mailet-api just provides the "Mail.state" handling. Should an
implementor simply "embrace" state and use it as a destination url so
that each mailet can define a new destination url, or should the
mailet api provide something smarter for rerouting? (I feel it's
better to skip this for now, but I still think it is on the table)

+ simple persistence: I find that a mailet often needs some sort of
simple persistence or some sort of "advanced/dynamic" configuration:
   a list of domains for a whitelist, a list of usernames (list<string>)
   a list of email addresses for a mailing list expansion (list<string>)
   a map of aliases (map<string,string>)
   a properties-like configuration (map<string,string>)
These are simple lists of strings or maps of strings to strings.
Why don't we provide a way for mailets to declare that they work with
a "named map" or a "named list" and automatically provide
persistence/lookup methods in a container-agnostic way?
This way a container could provide a simple file-based implementation
(properties files work too!) or a jdbc/jpa/jcr-based implementation
(configured in the container, not in the mailet).
I know this may sound "limited", but mailets could then encode/decode
their data in simple strings to be added as values in the map, and
this would be a very small cost compared to writing the persistence
backend (or being limited to a simple file or jdbc solution hardcoded
in the mailet itself).
I think this would be a great step forward! But there are details to
be worked out: the content should be "editable" both by the mailet
(e.g. a mailet keeping statistics) and by the container (e.g. a
container monitoring the filesystem for changes), so we probably need
to define how new data is "flushed" and how persistence is monitored
for changes. Maybe we could even provide just maps, as lists are a
subset (the keys of a map are not so different from a list).
Sendmail provides "compiled" (hash db) lists and maps to be used in
processing for virtusertable, mailertable, domaintable and aliases,
because this is generic enough to be useful.
WDYT? (maybe it's better to start a new topic if answering this one).
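
As a rough illustration of the "named map" idea, here is a sketch of a
properties-file backed store. NamedMapStore and PropertiesMapStore are
hypothetical names, the load/flush split is only one possible answer
to the "flushing" question above, and change monitoring is left out
entirely.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical container-agnostic api a mailet would program against.
interface NamedMapStore {
    // Returns a snapshot of the named map (empty if it does not exist yet).
    Map<String, String> load(String name);

    // Persists the given content as the new state of the named map.
    void flush(String name, Map<String, String> content);
}

// One possible container-side implementation: one .properties file per map.
final class PropertiesMapStore implements NamedMapStore {
    private final Path directory;

    PropertiesMapStore(Path directory) {
        this.directory = directory;
    }

    private Path fileFor(String name) {
        return directory.resolve(name + ".properties");
    }

    @Override
    public Map<String, String> load(String name) {
        Map<String, String> result = new TreeMap<>();
        Path file = fileFor(name);
        if (!Files.exists(file)) {
            return result;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : props.stringPropertyNames()) {
            result.put(key, props.getProperty(key));
        }
        return result;
    }

    @Override
    public void flush(String name, Map<String, String> content) {
        Properties props = new Properties();
        props.putAll(content);
        try (OutputStream out = Files.newOutputStream(fileFor(name))) {
            props.store(out, "named map: " + name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A "named list" could reuse the same store by keeping only the keys of
the map, and a jdbc or jcr implementation could replace the file one
without touching the mailet.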

Stefano
