On Wed, 14 Jul 2004 11:27:28 -0400 (EDT) Richard Welty <[EMAIL PROTECTED]> wrote:

RW> > RW> 2) it would be nice if there were a way to corpus train from
RW> > RW> Message|Spam; the existing entries are for error training which
RW> > RW> isn't quite the same thing.
RW> 
RW> >  No, it isn't, but is it really common to train it like this?
RW> 
RW> if someone doesn't have a big corpus on hand, they may be stuck
RW> doing it the slow way -- although admittedly, they'll probably train a lot
RW> of ham from a folder, turn up dspam and start training spam using misclassified
RW> materials, in which case they would use the existing buttons.

 This was exactly what I thought and also why I didn't add it.

RW> what dialog are they in? i see file and folder, but no way to right click
RW> on a spam and corpus train with it.

 No, these buttons are only in the spam filters options dialog.

RW> >  Yes, adding a "clear" button to the statistics would be nice but this is
RW> > really low priority to be honest... And I don't know how to do it either
RW> > (but then I didn't even look).
RW> 
RW> i'll look into it.

 Thanks!

RW> > RW> also, purge after training
RW> > RW> completion is important, as the early databases get quite large and
RW> > RW> once in production, they can be shrunk a lot.
RW> 
RW> >  It didn't shrink at all for me. But it has stopped growing, which is
RW> > already nice (as it was at 55MB).
RW> 
RW> i may be confusing MySQL/PostgreSQL practice, since MySQL's
RW> OPTIMIZE TABLE and PostgreSQL's VACUUM FULL both reclaim
RW> disk space (at a cost of table locks for the duration.)

 In any case, it looks like it would be a really good idea to add this as
most users will probably need it after this release, not the next one (when
they will have already trained dspam).
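
 One possible explanation for a purge that leaves the database file at the
same size: deleting rows usually only frees pages inside the file, and a
separate compaction step is needed to give the space back. A minimal sketch
of that effect, using SQLite from Python purely as a stand-in (this is not
dspam's storage driver, and the table and function names are made up for
illustration); VACUUM here plays roughly the role of OPTIMIZE TABLE or
VACUUM FULL:

```python
import os
import sqlite3
import tempfile


def size_before_and_after_vacuum(rows=20000):
    """Fill a table, delete every row, and return the file size
    measured before and after VACUUM as (before, after)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        # Autocommit mode, so VACUUM never runs inside an implicit transaction.
        conn = sqlite3.connect(path, isolation_level=None)
        # Make sure freed pages are NOT returned automatically.
        conn.execute("PRAGMA auto_vacuum = NONE")
        # Hypothetical table standing in for a token database.
        conn.execute("CREATE TABLE tokens (token TEXT, hits INTEGER)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO tokens VALUES (?, ?)",
            ((f"token-{i:08d}" * 4, i) for i in range(rows)),
        )
        conn.execute("COMMIT")
        # The "purge": pages go to the free list, the file keeps its size.
        conn.execute("DELETE FROM tokens")
        before = os.path.getsize(path)
        # Rebuild the file and drop the free pages.
        conn.execute("VACUUM")
        conn.close()
        after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after = size_before_and_after_vacuum()
    print(f"after purge: {before} bytes, after VACUUM: {after} bytes")
```

 If dspam's purge behaves the same way with the real backends, the purge
would have to be followed by the backend's own compaction command for the
file to shrink.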

RW> >  I'm almost surely using it incorrectly because I create and destroy
RW> > DSPAM_CTX for each message. Apparently I can reuse the same one for all the
RW> > messages. Also, iterating over messages is much slower than it could have
RW> > been (although I suspect it doesn't play a big role here). In any case,
RW> > a progress dialog should definitely be added...
RW> 
RW> does libdspam require an extension to support this? i can take it up with
RW> jonz

 I've asked him repeatedly whether it is even legal to use dspam like
this, i.e. by keeping the same DSPAM_CTX around, but unfortunately I have
failed to get an adequate answer. It looks like I may have exhausted his
patience/time limits :-(
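
 To make the two usage patterns concrete, here is a rough sketch in Python.
FakeContext is a hypothetical stand-in that only counts how often a context
is set up; it is not the libdspam API, and whether libdspam really allows
the shared variant is exactly the open question above.

```python
class FakeContext:
    """Hypothetical stand-in for a classifier context; NOT the libdspam API."""

    opened = 0  # how many contexts have been set up so far

    def __init__(self):
        FakeContext.opened += 1
        self.trained = []

    def train(self, message):
        self.trained.append(message)

    def close(self):
        self.trained = None


def train_with_fresh_contexts(messages):
    """Current approach: create and destroy one context per message."""
    for message in messages:
        ctx = FakeContext()
        try:
            ctx.train(message)
        finally:
            ctx.close()


def train_with_shared_context(messages):
    """Alternative: one context for the whole batch
    (only valid if the library permits reuse)."""
    ctx = FakeContext()
    try:
        for message in messages:
            ctx.train(message)
    finally:
        ctx.close()
```

 With N messages the first variant pays the setup cost N times and the
second only once, which is why the answer matters for training speed.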

 Regards,
VZ



_______________________________________________
Mahogany-Developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/mahogany-developers
