Re: 3.1.0 schedule

2005-06-27 Thread Nix
On Sun, 26 Jun 2005, Justin Mason spake: Nix writes: I'm still not sure how intensely to de-dupe: should I zap articles with identical bodies? identical bodies except for MIME headers? identical bodies except for identifiable bayes poison? Until the obfu rules came in, I'd have said the

Re: 3.1.0 schedule

2005-06-27 Thread Nix
On Sun, 26 Jun 2005, Theo Van Dinter moaned: On Sun, Jun 26, 2005 at 05:23:24PM -0700, Justin Mason wrote: I may still have an account (username `nix'): but that was a long, long time ago --- pre-Apache, I think --- and I'm not sure if it's still there. No such account. :( OK, it

Re: 3.1.0 schedule

2005-06-27 Thread Theo Van Dinter
On Mon, Jun 27, 2005 at 03:54:35PM +0100, Nix wrote: run with --learn=N -- we're going to want to figure out N small # for large # of messages, large # for small # of messages? That sounds like an optimization problem to me (find that percentage which yields the greatest accuracy when

Re: 3.1.0 schedule

2005-06-27 Thread Nix
On Mon, 27 Jun 2005, Theo Van Dinter spake: On Mon, Jun 27, 2005 at 03:54:35PM +0100, Nix wrote: run with --learn=N -- we're going to want to figure out N small # for large # of messages, large # for small # of messages? That sounds like an optimization problem to me (find that

Re: 3.1.0 schedule

2005-06-26 Thread Theo Van Dinter
On Sat, Jun 25, 2005 at 06:29:44PM -0700, Justin Mason wrote: Hey -- I presume we won't be going ahead with this schedule, since nobody's voted, explicitly given a thumbs-up, or updated the details on how mass-checks now work in 3.1.0... Ok, so the first step is to announce this is coming up

Re: 3.1.0 schedule

2005-06-26 Thread Nix
On Sun, 26 Jun 2005, Theo Van Dinter spake: On Sat, Jun 25, 2005 at 06:29:44PM -0700, Justin Mason wrote: Hey -- I presume we won't be going ahead with this schedule, since nobody's voted, explicitly given a thumbs-up, or updated the details on how mass-checks now work in 3.1.0... Ok, so

Re: 3.1.0 schedule

2005-06-26 Thread Justin Mason
Nix writes: On Sun, 26 Jun 2005, Theo Van Dinter spake: On Sat, Jun 25, 2005 at 06:29:44PM -0700, Justin Mason wrote: Hey -- I presume we won't be going ahead with this schedule, since nobody's voted, explicitly given a thumbs-up, or updated

Re: 3.1.0 schedule

2005-06-26 Thread Theo Van Dinter
On Sun, Jun 26, 2005 at 05:23:24PM -0700, Justin Mason wrote: I may still have an account (username `nix'): but that was a long, long time ago --- pre-Apache, I think --- and I'm not sure if it's still there. No such account. :( Update docs, please! I've still got to work out what

Re: 3.1.0 schedule

2005-06-25 Thread Justin Mason
Hey -- I presume we won't be going ahead with this schedule, since nobody's voted, explicitly given a thumbs-up, or updated the details on how mass-checks now work in 3.1.0... --j. Daniel Quinlan writes: [EMAIL PROTECTED] (Justin Mason) writes:

3.1.0 schedule

2005-06-21 Thread Justin Mason
let's get this properly underway... how's about this. - today to Mon, 2005-06-27: clean up our corpora, get ready for mass-checking, try out mass-check to spot any big memory leaks or whatnot. - Mon, 2005-06-27 to Wed, 2005-07-06: mass-checks; move to C-T-R? - Wed,

Re: 3.1.0 schedule

2005-06-21 Thread Daniel Quinlan
[EMAIL PROTECTED] (Justin Mason) writes: - Mon, 2005-06-27 to Wed, 2005-07-06: mass-checks; move to C-T-R? One week is enough. It's single pass now, remember, so we could say Tuesday. Either way... (Daniel, we can now get all scoresets from one mass-check run, right?)

3.1.0 schedule rev 2

2005-06-21 Thread Justin Mason
OK, a redo after a little chat -- with an extra optional week at the end. - today to Mon, 2005-06-27: clean up our corpora, get ready for mass-checking, try out mass-check to spot any big memory leaks or whatnot. - Mon, 2005-06-27 to Wed,