Re: RFC: Plan for faster updates

Theo Van Dinter 9 Feb 2005 18:01:55 -0000

On Tue, Feb 08, 2005 at 09:49:37PM -0800, Robert Menschel wrote:
> Agreed.  Also thinking that to reduce bandwidth, it might be a good
> idea to separate "core" rule scores from "updates" containing new and
> changed rules.


Yeah, it's possible to do, but I'm not so worried about that right now.  The
likelihood is we'll just do rule additions.  Scores are possible, but it's
arduous enough now to get the scores done for a minor release (takes ~1 month).

> And I assume that *.*.3 would also be viable to accept rules for all
> 3.x.x versions, or more to the point, *.*.2 could be used within SARE
> to flag rules that apply to all 2.xx versions that predate 3.0.0.

Sure.  Actually, just do *.3 and *.2 then. :)  That works easily if the rule
sets include the "if version" blocks, BTW.
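
A rough sketch of how those wildcard names could be matched, in Python for
illustration only.  It assumes the labels are the version components in
reverse order (so 3.0.2 becomes "2.0.3"), which is not spelled out in this
message; reversed_version and pattern_applies are hypothetical helper names.

```python
from fnmatch import fnmatchcase


def reversed_version(version):
    """Turn '3.0.2' into '2.0.3' (assumed label order, most specific first)."""
    return ".".join(reversed(version.split(".")))


def pattern_applies(pattern, version):
    """Return True if a wildcard pattern such as '*.3' covers the version.

    fnmatch's '*' also matches dots, so '*.3' covers every 3.x and 3.x.y.
    """
    return fnmatchcase(reversed_version(version), pattern)
```

With this, pattern_applies("*.3", "3.0.2") and pattern_applies("*.2", "2.64")
are both true, while pattern_applies("*.3", "2.64") is false.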

> Where do these updates come from?  When would the GPG signature be
> applied, and by whom/what?  Within SARE we have multiple working

When the new release is put up on the mirror master, it'd have the SHA1 and
GPG files created and mirrored out at the same time.  The theory being that
you'd do the release, potentially wait some amount of time for mirrors to pick
up the changes, then make the DNS change.  Alternately, have the TTL on the RR
be large enough such that you randomly account for client/mirror timing, and
update DNS when the master is ready.

> files, and I can see our scripts combining all files that match a
> given criteria into a single channel file. The original files are
> sometimes signed to validate them, but I don't see any value to having
> an automated script sign the compilation. I suppose it might be a YMMV
> situation.

Yeah, that's all up to the channel owners how they want to do things.
The goal is for the users to validate that what they received is valid.
If you're not paranoid, you care that the file downloaded wasn't corrupt
or MITM modified, so SHA1 is sufficient for now.  If you're paranoid that
someone may have altered the mirrored copy (and the SHA1 file of course),
you want GPG so you can be sure the files are as intended by your trusted
"author".

Most people would find the SHA1 file sufficient, the more paranoid among us
would do GPG.
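
For the non-paranoid case, the client-side check could look roughly like
this.  It is a minimal Python sketch, not the update script itself; it
assumes the SHA1 file holds the hex digest as its first whitespace-separated
field, and both function names are made up for illustration.  The GPG check
is left out since it needs an external keyring.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Compute the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha1_file(data_path, sha1_path):
    """Compare a downloaded file against the digest stored in its SHA1 file.

    Assumption: the digest is the first field, optionally followed by a
    filename (the layout the sha1sum tool writes).
    """
    with open(sha1_path, "r", encoding="ascii") as fh:
        fields = fh.read().split()
    if not fields:
        return False  # empty SHA1 file, nothing to compare against
    return sha1_of_file(data_path) == fields[0].lower()
```

This only catches corruption or tampering in transit; anyone who can replace
the mirrored file can replace the SHA1 file next to it too.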

> I would think that the compilation script could simply cat the
> component files together.  eg [I often use shell as my meta language]:

Right, but then that negates the idea of having multiple files ala
##_type.cf.  Plus it also makes things like plugins impossible to
distribute this way (*.cf, *.pre, and *.pm files are all separate...)

I was thinking about smooshing them into a cheap concat format (separate
files with a null or something, since we're distributing text files), then
have the update script split them out.  If we're going that route though,
handling compression of the file would require the update script to do
IO::Zlib or something, so we may as well do Archive::* to handle this all.

Guess it depends how many perl module requirements we want to have.
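
To make the cheap concat idea concrete, here is one possible layout,
sketched in Python rather than Perl: alternating filename and content
fields separated by a single null byte.  The layout and the names
pack_files and unpack_files are assumptions for illustration, not a
settled format.

```python
from typing import Dict


def pack_files(files: Dict[str, str]) -> bytes:
    """Join name/content pairs into one blob, separated by null bytes."""
    fields = []
    for name, text in files.items():
        if "\0" in name or "\0" in text:
            # The separator only works because text files never contain nulls.
            raise ValueError("null byte in %r" % name)
        fields.append(name.encode("utf-8"))
        fields.append(text.encode("utf-8"))
    return b"\0".join(fields)


def unpack_files(blob: bytes) -> Dict[str, str]:
    """Split a blob produced by pack_files back into name -> content."""
    if not blob:
        return {}
    fields = blob.split(b"\0")
    if len(fields) % 2 != 0:
        raise ValueError("truncated blob: filename without content")
    return {
        fields[i].decode("utf-8"): fields[i + 1].decode("utf-8")
        for i in range(0, len(fields), 2)
    }
```

This keeps *.cf, *.pre, and *.pm files separate with no extra modules, but
it has no compression and no per-file metadata, which is where an archive
module starts to look like the simpler choice.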

> I would agree that we want all channel files to come before local.cf
> alphabetically, and also want them to have reasonably short names.

I thought about that (I was originally just going to take the first
section of the channel FQDN for the filename), but decided that since
this doesn't require much/any effort from the admin/user (SA will just
include the .cf file, or at worst the admin will have to add an include
line for it once), the length doesn't really matter.

> Let's say SARE includes an "english" channel, containing our rules
> that work well in the English language, USA, UK, Australia, etc., but
> does not work nearly as well for sites that receive emails in other
> languages.  Let's then say that SARU (our Russian counterparts) create
> a channel which simply rescores our "english" channel to reflect
> mass-check results in their part of the world. How can we guarantee
> that their channel file scores override our scores?

Channels are non-coupled, since their sources are completely independent.
I would expect, if this behavior were required, either the admins would
have to use includes to specify the order, or the channels would have
to use some form of "if language", etc, methodology.

Interdependencies between channels cause lots of issues which I was trying to
avoid by basically not supporting it. ;)

> It'd be good if those channels could be provided either directly in
> the command line (one or two additional channels) or through an input
> file (a dozen or so channels).

Yeah, it's doable on the commandline.  I was trying to avoid a config file for
the whole script, although I could see "--channel-file /path/to/file" as an
option.  Good suggestion. :)
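
Something along these lines, sketched in Python for illustration.  Only
"--channel-file" comes from this thread; the repeatable "--channel" flag,
the one-channel-per-line file layout, and the "#" comment handling are all
assumptions, and the function names are hypothetical.

```python
import argparse


def read_channel_file(path):
    """Read one channel name per line, skipping blanks and '#' comments."""
    channels = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            name = line.split("#", 1)[0].strip()
            if name:
                channels.append(name)
    return channels


def parse_channels(argv):
    """Collect channels from the command line and an optional channel file."""
    parser = argparse.ArgumentParser(description="channel option sketch")
    parser.add_argument("--channel", action="append", default=[],
                        help="channel name; may be given more than once")
    parser.add_argument("--channel-file",
                        help="file listing one channel name per line")
    args = parser.parse_args(argv)

    channels = list(args.channel)
    if args.channel_file:
        channels.extend(read_channel_file(args.channel_file))

    # Drop duplicates while keeping the order in which channels were named.
    seen = set()
    unique = []
    for name in channels:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique
```

One or two channels go straight on the command line; a dozen go in the
file, and the rest of the script still needs no config file of its own.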

> To help those who need to put these into a user_prefs file, it'd be
> good to include an option(s) which specifies that a) output will be to
> $HOME/.spamassassin/user_prefs, b) all channel files should be

(a) is -1 -- the script is not going to edit a file, it's going to
overwrite it.  There are all sorts of issues with editing.

> concatenated together, along with a core user_prefs file, and c)

-1

For users, they can simply have the updates go into
~/.spamassassin/updates (or something), then from their user_prefs
include the updates/channel.cf file.  There's no reason this needs to
be different for users than admins.

> This might be a good place to also --lint the received channel file,
> and fail any channel file that fails --lint.

Hrm.  Part of me agrees, and part of me thinks it's the responsibility
of the channel admin not to put in broken stuff.  Basically, you'd have
to lint the entire config to deal with all the dependencies, and then
that may not work due to local customizations, people calling M::SA not
"spamassassin" or "spamd", etc.

> This has good potential.

:)  Thanks.

-- 
Randomly Generated Tagline:
There are still some other things to do, so don't think if I didn't fix
 your favorite bug that your bug report is in the bit bucket.  (It may be,
 but don't think it.  :-)  Larry Wall in <[EMAIL PROTECTED]>
