On Wed, 14 Jul 2004 00:14:04 +0200 (Romance Daylight Time) Vadim Zeitlin <[EMAIL PROTECTED]> wrote:

> On Tue, 13 Jul 2004 11:30:51 -0400 (EDT) Richard Welty <[EMAIL PROTECTED]> wrote:

> RW> 0) a setup wizard might be a good idea; it could encourage people to
> RW> use dspam in the mythical "right" way (e.g., don't send spam to
> RW> the trash, but to a quarantine folder for review before disposal,
> RW> that sort of thing.)

>  I've at least documented this in the manual but, of course, such a wizard
> would be nice. I've even thought of showing one when the spam filter
> options dialog is opened for the first time but probably won't have time to
> implement it.

for entertainment value, i've done some preliminary writing about what
one might look like. i might want to take a crack at putting one together,
but i won't promise any timeline on it.

> RW> 1) if configure doesn't see sqlite, even if --with-dspam is specified,
> RW> it builds without giving any indication that dspam was skipped,

>  Well, it does say "Cannot find sqlite header - support for DSPAM
> disabled." but I guess a line in the summary for DSPAM could be a good
> idea. Added.

i'm sure that message came out, but i never saw it.

> RW> and the menu entries in Message|Spam still show up and when used, don't
> RW> generate error messages.

>  This is by design. DSPAM is just one of the spam filters and there could be
> more of them (currently there are 2: DSPAM and my old home-brewed one).
> You'd get an error only if there are no filters at all.

i see. that makes sense. i was unaware of the multiple spam filters model
that you are working to.

> RW> 2) it would be nice if there were a way to corpus train from
> RW> Message|Spam; the existing entries are for error training which
> RW> isn't quite the same thing.

>  No, it isn't, but is it really common to train it like this?

if someone doesn't have a big corpus on hand, they may be stuck
doing it the slow way -- although admittedly, they'll probably train a lot
of ham from a folder, turn up dspam and start training spam using misclassified
materials, in which case they would use the existing buttons.

> It doesn't
> cost much to add these menu commands but I thought that it was wiser to put
> them in a separate dialog because they're needed so rarely (basically I've
> used them only once).

what dialog are they in? i see file and folder, but no way to right click
on a spam and corpus train with it.

> RW> 4) it would be good to add a button for clearing statistics on
> RW> the Edit|Spam Filters menu for dspam, as after you're satisfied
> RW> with the training, you usually are well advised to clear stats,
> RW> as errors in training tend to skew totals for a Very Long Time
> RW> after dspam moves into production mode.

>  Yes, adding a "clear" button to the statistics would be nice but this is
> really low priority to be honest... And I don't know how to do it either
> (but then I didn't even look).

i'll look into it.

> RW> 5) it'd be nice if the displayed stats computed percentages
> RW> for false positives, false negatives, etc.

>  I was quite confused by dspam statistics so I just gave the same output as
> dspam_stats. If you feel like improving it, just hack the relevant code in
> DspamFilter.cpp and send me the patch, I'd be happy to apply it.

i'll look into this, too.
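
A rough sketch of the percentage calculation item 5 asks for, in Python for brevity (the real change would go in DspamFilter.cpp). The four counter names are hypothetical and are not taken from dspam_stats or libdspam. The rates use one common convention: false positives as a share of all ham, false negatives as a share of all spam.

```python
def error_rates(true_pos, true_neg, false_pos, false_neg):
    """Turn raw classification counts into percentages.

    All four parameter names are illustrative assumptions:
      true_pos  - spam correctly caught
      true_neg  - ham correctly delivered
      false_pos - ham wrongly flagged as spam
      false_neg - spam wrongly delivered as ham
    """
    def pct(part, whole):
        # Avoid dividing by zero before any mail has been processed.
        return 100.0 * part / whole if whole else 0.0

    ham = true_neg + false_pos    # everything that really was ham
    spam = true_pos + false_neg   # everything that really was spam
    return {
        "false_positive_pct": pct(false_pos, ham),
        "false_negative_pct": pct(false_neg, spam),
        "accuracy_pct": pct(true_pos + true_neg, ham + spam),
    }
```

Because the totals include mistakes made during training, these percentages stay skewed until the counters are cleared, which is the reason for the "clear" button in item 4.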

> RW> also, purge after training
> RW> completion is important, as the early databases get quite large and
> RW> once in production, they can be shrunk a lot.

>  It didn't shrink at all for me. But it has stopped growing which is
> already nice (as it was at 55Mb).

i may be confusing this with MySQL/PostgreSQL practice, since MySQL's
OPTIMIZE TABLE and PostgreSQL's VACUUM FULL both reclaim
disk space (at the cost of table locks for the duration).
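
For comparison, SQLite has a VACUUM statement of its own: with auto_vacuum off, deleting rows only puts pages on a free list, and the file shrinks only when VACUUM rebuilds it. The sketch below shows that on a throwaway file. The table and column names are made up, and whether the purge path used here ever issues VACUUM is an assumption that would need checking.

```python
import os
import sqlite3
import tempfile


def sizes_around_vacuum(rows=5000):
    """Return (full, after_delete, after_vacuum) file sizes in bytes."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        # Make sure deleted pages are kept on the free list, not truncated.
        conn.execute("PRAGMA auto_vacuum = NONE")
        # Hypothetical schema, only here to give the file some bulk.
        conn.execute("CREATE TABLE tokens (token TEXT, hits INTEGER)")
        conn.executemany(
            "INSERT INTO tokens VALUES (?, ?)",
            (("token-%08d" % i * 4, i) for i in range(rows)),
        )
        conn.commit()
        full = os.path.getsize(path)

        # Simulate a purge: the rows go away but the file keeps its pages.
        conn.execute("DELETE FROM tokens")
        conn.commit()
        after_delete = os.path.getsize(path)

        # VACUUM must run outside a transaction, hence the commit above.
        conn.execute("VACUUM")
        conn.close()
        after_vacuum = os.path.getsize(path)
        return full, after_delete, after_vacuum
    finally:
        os.remove(path)
```

If the database stops growing after a purge but never gets smaller, that matches the free-list behaviour: new tokens reuse the freed pages, and nothing is returned to the filesystem.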

 
>  I'm almost surely using it incorrectly because I create and destroy
> DSPAM_CTX for each message. Apparently I can reuse the same one for all the
> messages. Also, iterating over messages is much slower than it could have
> been (although I suspect it doesn't play big role here). In any case,
> progress dialog should definitely be added...

does libdspam require an extension to support this? i can take it up with
jonz

richard
-- 
Richard Welty                                         [EMAIL PROTECTED]
Averill Park Networking                                         518-573-7592
    Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security



_______________________________________________
Mahogany-Developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/mahogany-developers
