>
> Uhm, what is it you want to do?

I'm sorry I wasn't clear. I am looking for downloadable ham corpora so that
I can try to develop an automatic or semi-automatic way of finding new
rules.

> I guess what irritates me most is, that it's unclear why you want to
> perform a "mass check on the server" in the context of asking for other
> folks ham corpus.


After I generate new rules, I would need to test their accuracy somehow, and
the mass check seems to be a good way to do that. So I guess my question
about the mass check is whether my rules would be tested on others' corpora
as well as on my own corpus.

> And no, you cannot download other contributors ham corpus. See the
> section Privacy in your reference.

I read that, but I wasn't sure whether it was a warning against using
others' corpora for purposes other than evaluating rules. Thanks for the
clarification and for the quick reply.


2010/10/15 Karsten Bräckelmann <[email protected]>

> On Fri, 2010-10-15 at 17:29 -0300, Marco Ribeiro wrote:
> > Does anyone know of a good up-to-date Ham Corpus? I'm using
> > SpamArchive for spam, but I haven't found a good one for Ham.
> > I'm not sure if I understood this [1] correctly, but does that mean I
> > must upload my own corpus if I want to perform a mass check on the
> > server? If not, is it possible to download others' corpora?
>
> Uhm, what is it you want to do?
>
> The UploadedCorpora [1] wiki page is for Rule QA -- the masscheck refers
> to checking SA rules against a corpus of ham and spam. Hand classified
> ham and spam that is, to evaluate performance and accuracy of SA rules
> and re-scoring.
>
> I guess what irritates me most is, that it's unclear why you want to
> perform a "mass check on the server" in the context of asking for other
> folks ham corpus.
>
> And no, you cannot download other contributors ham corpus. See the
> section Privacy in your reference.
>
>
> [1] http://wiki.apache.org/spamassassin/UploadedCorpora
>
> --
> char *t="\10pse\0r\0dtu...@ghno
> \x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
> main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8?
> c<<=1:
> (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0;
> }}}
>
>
