-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Warren Togami writes:
> Theo Van Dinter wrote:
> > On Mon, Nov 21, 2005 at 08:38:05PM -0800, Justin Mason wrote:
> > 
> >>well, it's more than that.  with a small number of corpora, the
> >>scores will be over-optimised for those people.   It's a tricky
> >>problem....
> > 
> > I've actually been thinking about this a bit.  Our normal mass-check runs
> > are heavily weighted towards a small number of people already.  For 3.1,
> > we used 9 people's logs.  It totalled 1766844 messages (bmenschel's
> > wasn't included apparently).  Breaking it down:
> > 
> > Percent     Provider
> > -------     ----------
> > 33.93       jm
> > 31.00       theo
> > 9.35        daf
> > 7.68        rod
> > 6.05        parkerm
> > 5.62        bzoetekouw
> > 5.11        quinlan
> > 1.20        cthielen
> > 0.07        misak
> > 
> > So basically Justin is 34%, I'm 31%, and everyone else combined is 35%.
> > So in reality, the scores are far more tuned for Justin and myself than
> > any other single person.
> > 
> > This is something I've been trying to think about wrt doing weekly score
> > generations for use by sa-update, but no real solution has come to mind yet.
> 
> We seriously need to improve documentation and tools to make it easier 
> for people to understand and do this.  At our company we need to almost 
> cripple SpamAssassin in our Asian office because of the FP levels.  We need 
> better representation especially from non-Western users in mass checks.
> 
> I for example am trying to get a few native Japanese employees at my 
> office to participate because of the total lack of Asian representation 
> currently in mass check.  They misunderstood the sorting directions at 
> first, so I need to train them myself to make sure they do a good job at it.

true.  although without useful rules that work on Asian spam, the results
aren't going to be great.

by the way, I've decided I'll run a 3.0.5 mass-check on my corpora, if
needed.
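
fwiw, the arithmetic behind the breakdown quoted above can be sketched in a
few lines of Python.  the percentages and the 1766844 total are copied from
Theo's message; the function names (approx_counts, summarize) are made up
for illustration, and the per-person counts are only approximations, since
they are rebuilt from rounded percentages rather than from the real logs.

```python
# Percent shares and total copied from the breakdown quoted in this thread.
TOTAL_MESSAGES = 1766844
SHARES = {
    "jm": 33.93, "theo": 31.00, "daf": 9.35, "rod": 7.68, "parkerm": 6.05,
    "bzoetekouw": 5.62, "quinlan": 5.11, "cthielen": 1.20, "misak": 0.07,
}


def approx_counts(shares, total):
    """Approximate per-contributor message counts from rounded percentages.

    Hypothetical helper: the real counts would come from the mass-check
    logs, which are not reproduced here.
    """
    return {name: round(total * pct / 100) for name, pct in shares.items()}


def summarize(shares, top=("jm", "theo")):
    """Return the share of each named top contributor plus everyone else."""
    summary = {name: shares[name] for name in top}
    summary["everyone else"] = sum(
        pct for name, pct in shares.items() if name not in top
    )
    return summary


if __name__ == "__main__":
    for name, count in approx_counts(SHARES, TOTAL_MESSAGES).items():
        print(f"{name:12s} ~{count} messages")
    for name, pct in summarize(SHARES).items():
        print(f"{name:14s} {pct:.0f}%")
```

the second loop prints the same three-way split Theo describes: two
contributors at roughly a third each, and the other seven sharing the rest.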

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFDg7FOMJF5cimLx9ARAishAKC2Qqs1x10Kn7vzY+8YH+AFIemkYQCgqifE
b2vxt1b8Mq3Lq2nFoO+KQjU=
=43nt
-----END PGP SIGNATURE-----
