* Jonas Liljegren <[EMAIL PROTECTED]> [2008-02-18T16:45:32]
> http://jonas.liljegren.org/perl/dist/Email-Classifier-0.01.tar.gz

I finally got around to flipping through this, tonight!

I think my main concern relates to the way that one would use Classifier, and
what it would return.  I imagine this as a very basic use case:

  my $classifier = Email::Classifier->new({
    classifiers => [ ... ],
    ...
  });

  my $report = $classifier->classify($email);

  if ($report->type eq 'bounce') {
    my $details = $report->details;
    
    log("got bounce from $details->{remote_mta}");

    if ($details->{status_code} =~ /^5/) {
      ...
    }
  }

A classifier returns a report, which has a globally fixed API.  It can provide
details, which are defined by the winning classifier.  The report is generated
by the specific classifiers set up in construction.  The first one to issue a
report wins.  If no classifier matches, we return false.
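
To make that concrete, here is a minimal sketch of the flow described above.
It is not a proposed implementation.  Everything beyond new, classify, type,
and details is an assumption for illustration: the Email::Classifier::Report
class name, the Email::Classifier::Toy::NullSender plugin, and the idea of
passing the email around as a plain string.

```perl
use strict;
use warnings;

# Hypothetical report class: type() and details() are the whole fixed API.
package Email::Classifier::Report;

sub new {
    my ($class, $arg) = @_;
    return bless {
        type    => $arg->{type},
        details => $arg->{details} || {},
    }, $class;
}

sub type    { $_[0]->{type} }
sub details { $_[0]->{details} }

package Email::Classifier;

sub new {
    my ($class, $arg) = @_;
    $arg ||= {};
    return bless { classifiers => [ @{ $arg->{classifiers} || [] } ] }, $class;
}

# Try each classifier in construction order; the first report wins.
sub classify {
    my ($self, $email) = @_;

    for my $classifier (@{ $self->{classifiers} }) {
        my $report = $classifier->classify($email);
        return $report if $report;
    }

    return;    # nothing matched: false in scalar context, empty in list context
}

# Toy classifier, purely for illustration: treats the email as a plain string.
package Email::Classifier::Toy::NullSender;

sub new { return bless {}, $_[0] }

sub classify {
    my ($self, $email) = @_;
    return unless $email =~ /^Return-Path: <>$/m;
    return Email::Classifier::Report->new({
        type    => 'bounce',
        details => { reason => 'null return path' },
    });
}

package main;

my $classifier = Email::Classifier->new({
    classifiers => [ Email::Classifier::Toy::NullSender->new ],
});

my $report = $classifier->classify("Return-Path: <>\nSubject: failure\n\n...");
print "type: ", $report->type, "\n" if $report;
```

The only thing the caller ever relies on is the report's type and details
methods; what goes into details is up to whichever classifier produced it.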

This will handle most use cases, I think.

It will also be easy to say:

  my @reports = $classifier->all_classifications($email);

...to get every report, in case we think the ones after the first hit will be
useful.

If a $report has one and only one type, we could say:

  my $report_set     = $classifier->all_classifications($email);
  my @bounce_reports = $report_set->of_type('bounce');

...but now I'm getting into exotica.  I think the really nice simple bit of
design is that Email::Classifier runs an email through a set of specified
classifiers, stops when it gets a hit, and returns an object with a globally
constant API.
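
For what it's worth, the all_classifications and of_type calls above could be
sketched roughly like this.  The Email::Classifier::ReportSet class, its
reports method, the Toy::Always plugin, and the use of wantarray to serve both
the list and scalar forms are all assumptions of mine, not settled API.

```perl
use strict;
use warnings;

package Email::Classifier::Report;

sub new     { my ($class, $arg) = @_; return bless { %$arg }, $class }
sub type    { $_[0]->{type} }
sub details { $_[0]->{details} }

# Hypothetical container returned in scalar context.
package Email::Classifier::ReportSet;

sub new     { my ($class, @reports) = @_; return bless { reports => [@reports] }, $class }
sub reports { @{ $_[0]->{reports} } }

# Only meaningful if each report has exactly one type.
sub of_type {
    my ($self, $type) = @_;
    return grep { $_->type eq $type } $self->reports;
}

package Email::Classifier;

sub new {
    my ($class, $arg) = @_;
    $arg ||= {};
    return bless { classifiers => [ @{ $arg->{classifiers} || [] } ] }, $class;
}

# Unlike classify, keep going after the first hit and collect every report.
sub all_classifications {
    my ($self, $email) = @_;

    my @reports;
    for my $classifier (@{ $self->{classifiers} }) {
        my $report = $classifier->classify($email);
        push @reports, $report if $report;
    }

    # Assumption: list context gets the reports, scalar context gets a set.
    return wantarray ? @reports : Email::Classifier::ReportSet->new(@reports);
}

# Toy classifier for illustration: reports a fixed type for any email.
package Email::Classifier::Toy::Always;

sub new { my ($class, $type) = @_; return bless { type => $type }, $class }

sub classify {
    my ($self, $email) = @_;
    return Email::Classifier::Report->new({ type => $self->{type}, details => {} });
}

package main;

my $classifier = Email::Classifier->new({
    classifiers => [
        Email::Classifier::Toy::Always->new('bounce'),
        Email::Classifier::Toy::Always->new('spam'),
        Email::Classifier::Toy::Always->new('bounce'),
    ],
});

my @reports        = $classifier->all_classifications('...');
my $report_set     = $classifier->all_classifications('...');
my @bounce_reports = $report_set->of_type('bounce');
```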

It's easy, then, to say that the classifiers for Email::Classifier are, in
turn, Email::Classifiers.  So, Email::Classifier::Bounce is just an
Email::Classifier built up of all its Email::Classifier-compliant plugins for
specific bounce formats.
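
A rough sketch of that nesting, under the same caveats: the Report class, the
Email::Classifier::Bounce::Toy plugin, and its made-up X-Toy-Status header are
hypothetical, and the email is just a string here.

```perl
use strict;
use warnings;

package Email::Classifier::Report;

sub new     { my ($class, $arg) = @_; return bless { %$arg }, $class }
sub type    { $_[0]->{type} }
sub details { $_[0]->{details} }

package Email::Classifier;

sub new {
    my ($class, $arg) = @_;
    $arg ||= {};
    return bless { classifiers => [ @{ $arg->{classifiers} || [] } ] }, $class;
}

sub classify {
    my ($self, $email) = @_;
    for my $classifier (@{ $self->{classifiers} }) {
        my $report = $classifier->classify($email);
        return $report if $report;
    }
    return;
}

# Hypothetical leaf plugin for one made-up bounce format.  It overrides
# classify and inspects the email (a plain string here) itself.
package Email::Classifier::Bounce::Toy;
our @ISA = ('Email::Classifier');

sub classify {
    my ($self, $email) = @_;
    return unless $email =~ /^X-Toy-Status: (\d\.\d\.\d)$/m;
    return Email::Classifier::Report->new({
        type    => 'bounce',
        details => { status_code => $1 },
    });
}

# The bounce classifier only supplies a default plugin list; the inherited
# classify does the rest.
package Email::Classifier::Bounce;
our @ISA = ('Email::Classifier');

sub new {
    my ($class) = @_;
    return $class->SUPER::new({
        classifiers => [ Email::Classifier::Bounce::Toy->new ],
    });
}

package main;

my $classifier = Email::Classifier->new({
    classifiers => [ Email::Classifier::Bounce->new ],
});

my $report = $classifier->classify("X-Toy-Status: 5.1.1\n\nno such user");
print "bounce with status ", $report->details->{status_code}, "\n" if $report;
```

The outer classifier neither knows nor cares that its one entry is itself a
stack of plugins; it only calls classify and checks for a true return.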

This also avoids all the is_* methods, conflict detection, AUTOLOAD, and
typeglob wrangling in the code you posted.

I also think that totally ditching Mail::DeliveryStatus::BounceParser is a good
start: it will be a nightmare to fuss with or extend, even in the form you've
pruned it down to.

So, if I am not full of crap, the two big tasks are:

  1. implement Email::Classifier, which should be really easy
  2. implement various useful classifiers, starting with some bounce processors

A good precursor to (2) is to build up a good corpus of messages.

Thoughts?

-- 
rjbs
