> Uhm, what is it you want to do?

I'm sorry I wasn't clear. I am looking for downloadable ham corpora so that I can try to develop an automatic or semi-automatic way to find new rules.

> I guess what irritates me most is, that it's unclear why you want to
> perform a "mass check on the server" in the context of asking for other
> folks ham corpus.

After I generate new rules, I would need to test their accuracy somehow, and the mass check seems to be a good way to do that. So my question about the mass check is whether my rules will be tested on others' corpora as well as on my own corpus.

> And no, you cannot download other contributors ham corpus. See the
> section Privacy in your reference.

I read that, but I wasn't sure whether it was a warning against using others' corpora for purposes other than evaluating rules. Thanks for the clarification and for the quick reply.

2010/10/15 Karsten Bräckelmann <[email protected]>

> On Fri, 2010-10-15 at 17:29 -0300, Marco Ribeiro wrote:
> > Does anyone know of a good up-to-date Ham Corpus? I'm using
> > SpamArchive for spam, but I haven't found a good one for Ham.
> > I'm not sure if I understood this [1] correctly, but does that mean I
> > must upload my own corpus if I want to perform a mass check on the
> > server? If not, is it possible to download others' corpora?
>
> Uhm, what is it you want to do?
>
> The UploadedCorpora [1] wiki page is for Rule QA -- the masscheck refers
> to checking SA rules against a corpus of ham and spam. Hand classified
> ham and spam that is, to evaluate performance and accuracy of SA rules
> and re-scoring.
>
> I guess what irritates me most is, that it's unclear why you want to
> perform a "mass check on the server" in the context of asking for other
> folks ham corpus.
>
> And no, you cannot download other contributors ham corpus. See the
> section Privacy in your reference.
>
>
> [1] http://wiki.apache.org/spamassassin/UploadedCorpora
>
> --
> char *t="\10pse\0r\0dtu...@ghno
> \x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
> main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8?
> c<<=1:
> (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0;
> }}}
