Re: possible changes for Mail::DeliveryStatus::BounceParser

2006-08-04 Thread Ricardo SIGNES
* William Yardley [EMAIL PROTECTED] [2006-08-03T23:25:31]
  Excellent; I would much rather trust the data that's supposed to tell
  us something than the data whose entrails we have ripped out and read.
 
 Yeah. The only problem is what to do when there is some sort of conflict
 between the two bits of data. In theory, you'd hope that people

If we find examples of this, the behavior can be tailored to those specific
examples, rather than remain so general.  I think this is still a win.

 I've been thinking about writing a blocked std_reason for bounces that
 appear to come from spam or virus filters, or other site policy type
 blocks - this might make dealing with those blocks a bit easier for
 organizations... even the most legitimate of lists end up getting
 blocked by someone, somehow.

Sounds good.

 You mentioned something about no_problemo - maybe we should get rid of
 it entirely?

Well, I wish it was something like not_a_bounce, but we can't just change it,
as things probably rely on it.  Adding code to make MBP return something
else based on an option seems like it could introduce bugs with no clear
benefit (beyond feeling less silly when checking std_reason).  What we *can* do
is only return it when std_reason is called on a parsed non-bounce, rather than
actually setting it on the object.  So:

  sub std_reason { return $_[0]->{std_reason} || 'no_problemo' }

(We can make 'get' use methods to get internal values, to make this work.  This
lets us provide more dynamic behavior, which is good.  'get' is a dumb method.)
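
A minimal sketch of the 'get' idea, assuming a plain hash-based object.  The
package name Sketch::Report and the %DYNAMIC whitelist are hypothetical, made
up for illustration; this is not the module's actual code.

```perl
package Sketch::Report;   # hypothetical stand-in, not the real class
use strict;
use warnings;

# Fields that 'get' should answer by calling a method instead of reading
# the hash directly.  (Assumed list, for illustration only.)
my %DYNAMIC = map { $_ => 1 } qw(std_reason);

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Computed on demand: nothing is stored on the object for non-bounces.
sub std_reason { return $_[0]->{std_reason} || 'no_problemo' }

# 'get' goes through the accessor for whitelisted fields and falls back
# to the raw hash entry for everything else.
sub get {
    my ($self, $field) = @_;
    return $self->$field if $DYNAMIC{$field};
    return $self->{$field};
}

package main;

my $plain = Sketch::Report->new(email => 'user@example.com');
print $plain->get('std_reason'), "\n";   # answered by the method
print $plain->get('email'), "\n";        # answered by the hash
```

The whitelist keeps 'get' from dispatching to arbitrary method names, which a
bare "call it if the object can" check would allow.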

That would also let us drop ->{is_bounce} and have the is_bounce method return
the std_reason, leaving it undef for non-bounces.  Same behavior, more
information.
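
Roughly, and only as a sketch (Sketch::Bounce is a hypothetical class, not the
real report object), the two methods could sit side by side like this:

```perl
package Sketch::Bounce;   # hypothetical stand-in
use strict;
use warnings;

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Returns the stored reason for bounces (a true value) and undef for
# non-bounces, so existing boolean checks keep working.
sub is_bounce { return $_[0]->{std_reason} }

# Callers that ask for std_reason directly still see 'no_problemo'.
sub std_reason { return $_[0]->{std_reason} || 'no_problemo' }

package main;

my $bounce = Sketch::Bounce->new(std_reason => 'user_unknown');
my $other  = Sketch::Bounce->new;

for my $report ($bounce, $other) {
    if (my $why = $report->is_bounce) {
        print "bounce: $why\n";
    }
    else {
        print "not a bounce (std_reason says ", $report->std_reason, ")\n";
    }
}
```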

-- 
rjbs
 
 w
 


signature.asc
Description: Digital signature


Re: possible changes for Mail::DeliveryStatus::BounceParser

2006-08-03 Thread Ricardo SIGNES
* William Yardley [EMAIL PROTECTED] [2006-08-03T18:09:01]
 I've also been working at building some tests for some of the problems
 we've seen, and building a bigger corpus of emails to use for testing.

I am really looking forward to having a nice body of messages selected for use
as proof that feature X is any good.

 I have most recently been working on messing around with the regexes for
 user_unknown (see changes around line 796), and using 5.1.0, 5.1.1,
 5.1.2 and 5.2.2 errors[1] in the status report as a preferred method of
 determining $report->std_reason over text regexes (see changes around
 line 374).

Excellent; I would much rather trust the data that's supposed to tell us
something than the data whose entrails we have ripped out and read.
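
For illustration only, the "status code first, regex second" preference might
look something like the sketch below.  It assumes the status field has already
been pulled out as a string such as "5.1.1".  Only the four codes come from
this thread; the reason names they map to, the helper names, and the fallback
regex are assumptions, not what the diff actually does.

```perl
use strict;
use warnings;

# Assumed mapping from status codes to reason names (illustrative guesses).
my %REASON_FOR_STATUS = (
    '5.1.0' => 'user_unknown',
    '5.1.1' => 'user_unknown',
    '5.1.2' => 'domain_error',
    '5.2.2' => 'over_quota',
);

# Stand-in for the module's text regexes (hypothetical pattern).
sub reason_from_text {
    my ($text) = @_;
    return 'user_unknown' if $text =~ /no such user|user unknown/i;
    return 'unknown';
}

# Prefer the structured status field; fall back to text matching only
# when the status is missing or not in the table.
sub guess_std_reason {
    my ($status, $text) = @_;
    if (defined $status && exists $REASON_FOR_STATUS{$status}) {
        return $REASON_FOR_STATUS{$status};
    }
    return reason_from_text(defined $text ? $text : '');
}

print guess_std_reason('5.2.2', 'user unknown'), "\n";    # status wins
print guess_std_reason(undef, 'No such user here'), "\n"; # regex fallback
```

If the status and the text disagree, as in the first call, the table wins;
special cases for known conflicts could be added ahead of the lookup.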

 I also ripped out some AOL / Hotmail specific hacks which I'm pretty
 sure are way out of date now.

I bet there are more yet to go.

 http://veggiechinese.net/bounce_parser_diff_3.txt
 
 None of these changes are checked in; at this point, I'm just soliciting
 opinions, and hoping some other folks might be willing to test these
 changes.

I'd suggest you check them in; it'll be easier to test them.  Consider making a
branch.  Looking at the code changes, I think checking them in would be good.
They're definitely an improvement, unless I'm missing some obnoxious bug.

 I made a few suggestions (the last 2) at:
 http://emailproject.perl.org/wiki/Mail::DeliveryStatus::BounceParser

"Bounce unless otherwise" is, yes, dumb.  I think that leads to a lot of grief.
It makes sense in the project's original context, when it was known that it
should only be seeing bounces, and the author wanted to be able to exclude some
things.  In production, it's an obnoxious assumption.

It should be easy to make that an option.
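
Something like this, perhaps.  This is a sketch under assumptions: the class
name Sketch::Parser, the option name assume_bounce, and the three-valued
"evidence" argument are all hypothetical and do not exist in the module.

```perl
package Sketch::Parser;   # hypothetical stand-in
use strict;
use warnings;

sub new {
    my ($class, %opt) = @_;
    # Default to the old behavior so existing callers see no change.
    my $assume = exists $opt{assume_bounce} ? $opt{assume_bounce} : 1;
    return bless { assume_bounce => $assume }, $class;
}

# $evidence is 'bounce', 'not_bounce', or undef when the parser found
# nothing conclusive either way.
sub is_bounce {
    my ($self, $evidence) = @_;
    if (defined $evidence) {
        return $evidence eq 'bounce' ? 1 : 0;
    }
    return $self->{assume_bounce} ? 1 : 0;
}

package main;

my $legacy = Sketch::Parser->new;
my $strict = Sketch::Parser->new(assume_bounce => 0);

print "legacy, no evidence: ", $legacy->is_bounce(undef), "\n";
print "strict, no evidence: ", $strict->is_bounce(undef), "\n";
```

The option only decides the inconclusive case; a message with clear evidence
is classified the same way under either setting.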

I think I'll be deploying these changes, after a bit more testing; I hope to
see some improvements in junk avoidance.

-- 
rjbs

