On Wed, 14 Jul 2004 00:14:04 +0200 (Romance Daylight Time) Vadim Zeitlin <[EMAIL PROTECTED]> wrote:
> On Tue, 13 Jul 2004 11:30:51 -0400 (EDT) Richard Welty <[EMAIL PROTECTED]> wrote:
>
> RW> 0) a setup wizard might be a good idea; it could encourage people to
> RW> use dspam in the mythical "right" way (e.g., don't send spam to
> RW> the trash, but to a quarantine folder for review before disposal,
> RW> that sort of thing.)
>
> I've at least documented this in the manual but, of course, such a wizard
> would be nice. I've even thought of showing one when the spam filters
> options dialog is opened for the first time but probably won't have time to
> implement it.

for entertainment value, i've done some preliminary writing about
what one might look like. i might want to take a crack at putting one
together, but i won't promise any timeline on it.

> RW> 1) if configure doesn't see sqlite, even if --with-dspam is specified,
> RW> it builds without giving any indication that dspam was skipped,
>
> Well, it does say "Cannot find sqlite header - support for DSPAM
> disabled." but I guess a line in the summary for DSPAM could be a good
> idea. Added.

i'm sure that message came out, but i never saw it.

> RW> and the menu entries in Message|Spam still show up and when used, don't
> RW> generate error messages.
>
> This is by design. DSPAM is just one of the spam filters and there could be
> more of them (currently there are 2: DSPAM and my old home-brewed one).
> You'd get an error only if there are no filters at all.

i see. that makes sense. i was unaware of the multiple spam filters
model that you are working to.

> RW> 2) it would be nice if there were a way to corpus train from
> RW> Message|Spam; the existing entries are for error training which
> RW> isn't quite the same thing.
>
> No, it isn't, but is it really common to train it like this?

if someone doesn't have a big corpus on hand, they may be stuck
doing it the slow way -- although admittedly, they'll probably train
a lot of ham from a folder, turn up dspam and start training spam
using misclassified materials, in which case they would use the
existing buttons.

> It doesn't
> cost much to add these menu commands but I thought that it was wiser to put
> them in a separate dialog because they're needed so rarely (basically I've
> used them only once).

what dialog are they in? i see file and folder, but no way to
right click on a spam and corpus train with it.

> RW> 4) it would be good to add a button for clearing statistics on
> RW> the Edit|Spam Filters menu for dspam, as after you're satisfied
> RW> with the training, you usually are well advised to clear stats,
> RW> as errors in training tend to skew totals for a Very Long Time
> RW> after dspam moves into production mode.
>
> Yes, adding "clear" button to the statistics would be nice but this is
> really low priority to be honest... And I don't know how to do it either
> to be honest (but then I didn't even look).

i'll look into it.

> RW> 5) it'd be nice if the displayed stats computed percentages
> RW> for false positives, false negatives, etc.
>
> I was quite confused by dspam statistics so I just gave the same output as
> dspam_stats. If you feel like improving it, just hack the relevant code in
> DspamFilter.cpp and send me the patch, I'd be happy to apply it.

i'll look into this, too.

> RW> also, purge after training
> RW> completion is important, as the early databases get quite large and
> RW> once in production, they can be shrunk a lot.
>
> It didn't shrink at all for me. But it has stopped growing which is
> already nice (as it was at 55Mb).

i may be confusing this with MySQL/PostgreSQL practice, since MySQL's
OPTIMIZE TABLE and PostgreSQL's VACUUM FULL both reclaim disk space
(at a cost of table locks for the duration.)
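
going back to (5) for a moment: a rough sketch of the kind of percentages i have in mind is below. this is only an illustration under stated assumptions. the SpamCounts struct and its field names are made up for the example and do not mirror what dspam_stats or DspamFilter.cpp actually expose, so the real counters would have to be mapped onto them first.

```cpp
#include <cassert>

// Hypothetical counters; the names are illustrative only and are NOT
// the fields used by dspam_stats or by DspamFilter.cpp.
struct SpamCounts
{
    unsigned long spamCaught;      // spam correctly flagged as spam
    unsigned long hamPassed;       // ham correctly passed as ham
    unsigned long falsePositives;  // ham wrongly flagged as spam
    unsigned long falseNegatives;  // spam wrongly passed as ham
};

// Returns part/total as a percentage, or 0 when nothing has been counted
// yet, so a freshly cleared database does not divide by zero.
double Percent(unsigned long part, unsigned long total)
{
    assert(part <= total);
    return total == 0 ? 0.0 : 100.0 * part / total;
}

// Share of all real ham that was wrongly flagged as spam.
double FalsePositiveRate(const SpamCounts& c)
{
    return Percent(c.falsePositives, c.falsePositives + c.hamPassed);
}

// Share of all real spam that slipped through as ham.
double FalseNegativeRate(const SpamCounts& c)
{
    return Percent(c.falseNegatives, c.falseNegatives + c.spamCaught);
}

// Share of all messages that were classified correctly.
double OverallAccuracy(const SpamCounts& c)
{
    const unsigned long correct = c.spamCaught + c.hamPassed;
    return Percent(correct, correct + c.falsePositives + c.falseNegatives);
}
```

the results could be shown next to the raw counts with something like printf("%.2f%%", FalsePositiveRate(counts)). each rate is taken over the messages of its own class (ham for false positives, spam for false negatives) rather than over all mail, so a heavily spam-dominated mailbox doesn't hide a bad false positive rate.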

> I'm almost surely using it incorrectly because I create and destroy
> DSPAM_CTX for each message. Apparently I can reuse the same one for all the
> messages. Also, iterating over messages is much slower than it could have
> been (although I suspect it doesn't play a big role here). In any case,
> progress dialog should definitely be added...

does libdspam require an extension to support this? i can take
it up with jonz

richard
-- 
Richard Welty    [EMAIL PROTECTED]
Averill Park Networking    518-573-7592
Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security

_______________________________________________
Mahogany-Developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/mahogany-developers