http://bugzilla.spamassassin.org/show_bug.cgi?id=4176





------- Additional Comments From [EMAIL PROTECTED]  2005-05-06 16:20 -------
Subject: Re:  [review] RFE: please add pattern for nate.com redirector

On Thu, May 05, 2005 at 08:31:04PM -0700, [EMAIL PROTECTED] wrote:
> There's a lot of stuff in the comments here. I want to make sure that your 
> vote
> is about what ended in the patch and not about things I only talked about 
> doing.

It was the patch, but ...

> 1) Mail::SpamAssassin->parse() was called in three places as a class method,
> even though the POD doesn't document it as being one. That happened not to 
> break
> anything before now, but should be fixed independent of this bug. It's a one
> line change in two of those places, and a three line change in the third. The
> API for parse() does not have to be changed.

That's fine.

masses/corpora/mass-find-nonspam:  my $ma = Mail::SpamAssassin->parse 
($dataref);

This has no M::SA object.  I'd probably just change it to use
M::SA::Message->new().

sa-learn.raw:  my $ma = Mail::SpamAssassin->parse($dataref);
tools/speedtest:        my $mail = Mail::SpamAssassin->parse (\*FILE);

There's a M::SA object already, so doing $sa->parse() is fine with me.

This shouldn't be part of this ticket.

> 2) The Mail::SpamAssassin instance is passed in to the constructors for 
> Message,
> Node, and HTML so they can have access to configuration options. That adds an
> argument to the constructor, a 'main' instance variable that is assigned in 
> the
> constructor and deleted in the finish() method.

This is what I'm -1 on.

> 3) The code that actually performs the enhancement.
> #1 should be done anyway. #3 is not the issue you are bringing up.

Sort of.  I'm actually somewhat negative against #3 as well.  This is yet
another reserved IP/RegistrarBoundary thing, except the potential list of
these is much MUCH larger than those other two.  I think I'd rather have a
"open redirector domain" list via SURBL or something that gets you a few
points if your domain has an open redirector available.  Gives some incentive
to those people to close the redirector.

> I don't know that #2 is a big change to the core API. The three constructors 
> are
>  only called in a handful of places in all. It could be argued that we should
> not have core functionality like message parsing and HTML rendering that 
> cannot
> be modified by any configuration options.

Yes, it is a big change.  Message and Message::Node are designed to be
independent of SpamAssassin.  The code's job is to parse a message and
return an internal data format, there's no configuration involved in
that process.

In this case, there's no need to have anything change in the current process.
The HTML is rendered and URIs are pulled out and cooked appropriately.   The
text URIs are handled separately and would also need to be gone over by the
code.

So if we want this support for just the URIBL stuff, it's easy to just add to
that plugin.  If we want it in general, it's a bit more complex.  As far as
our code is concerned, get_uri_list() is the way to get the URI list.  The
URIBL plugin is the only thing that I can see that directly accesses the
HTML uri_detail hash (and nothing seems to look at the old uri array).

This may be a good time to change get_uri_list().  If it's called with
no parameters, return what we currently do (array of uris).  If it's
called with some new parameter, return a uri_detail-esque hash which
includes both HTML and parsed URIs (trivial).  Change URIBL plugin to use
that new get_uri_list() call.  Have get_uri_list() do the full gambit
of redirection, skip it in uri_list_canonify().  Heck, skip the whole
canonification step in HTML.





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
You are on the CC list for the bug, or are watching someone who is.

Reply via email to