Isn't that exact same logic an argument for having the maximum number of
duplicate subjects apply to the HAM / notspam folder too?  5000 or 15000 of
the same message sent individually by (untrainable / apathetic) users would
fill the notspam folder and mess up HMM / Bayesian right?
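
To make the concern concrete, here is a rough sketch of how a pile of identical messages can drag a token's score. It uses a simplified Graham-style frequency ratio, not ASSP's actual Bayesian / HMM formula, and the function name and all counts are made up for illustration. The same effect runs in the other direction for duplicates in the spam folder.

```python
def token_spam_probability(spam_hits, spam_total, ham_hits, ham_total):
    """Simplified Graham-style estimate of how 'spammy' one token looks.

    Each *_hits value is the number of stored messages containing the token;
    each *_total is the size of that collection. Illustration only; this is
    an assumption, not the formula ASSP uses.
    """
    spam_freq = spam_hits / spam_total
    ham_freq = ham_hits / ham_total
    return spam_freq / (spam_freq + ham_freq)


# Hypothetical corpus: the token shows up in 30 of 10000 spam files
# and 20 of 10000 notspam files.
before = token_spam_probability(30, 10000, 20, 10000)

# Now 5000 copies of one message containing the token land in notspam.
after = token_spam_probability(30, 10000, 20 + 5000, 10000 + 5000)

print(f"before duplicates: {before:.3f}")
print(f"after duplicates:  {after:.3f}")
```

In this toy model the token goes from leaning slightly spammy to looking almost purely like ham, only because one message was stored thousands of times.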

And for those RE / FWD / No subject emails, maybe we could have ASSP ignore
subjects shorter than say 5 or 6 characters when deleting duplicate file
names?  Then those files could get wiped out oldest first during the
maintenance.
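
Roughly what I have in mind, as a sketch only. This is not how ASSP's maintenance is implemented; the function, its parameters, and the filename pattern are hypothetical, based on names like Your_Donation_Receipt--123456.txt.

```python
import re
from collections import defaultdict

# Hypothetical pattern for names like "Your_Donation_Receipt--123456.txt".
NAME_RE = re.compile(r"^(?P<subject>.*)--(?P<num>\d+)\.\w+$")


def files_to_prune(files, max_dups=3, min_subject_len=6):
    """Return the file names a duplicate check would delete.

    files is an iterable of (filename, mtime) pairs. For every subject the
    newest max_dups files are kept and older ones are returned. Subjects
    shorter than min_subject_len (for example "RE" or "FWD") are skipped
    here and left to the normal oldest-first cleanup.
    """
    groups = defaultdict(list)
    for name, mtime in files:
        match = NAME_RE.match(name)
        if not match:
            continue  # not a subject-based name, leave it alone
        subject = match.group("subject")
        if len(subject) < min_subject_len:
            continue  # too short to be a meaningful duplicate key
        groups[subject.lower()].append((mtime, name))

    doomed = []
    for entries in groups.values():
        entries.sort(reverse=True)  # newest first
        doomed.extend(name for _, name in entries[max_dups:])
    return sorted(doomed)


# Five copies of one subject plus four short "RE" subjects.
sample = [(f"Your_Donation_Receipt--{i:06d}.txt", i) for i in range(1, 6)]
sample += [(f"RE--{i:06d}.txt", i) for i in range(10, 14)]
print(files_to_prune(sample))
```

With the sample above, only the two oldest "Your_Donation_Receipt" files are selected; the "RE" files are not touched by the duplicate check at all.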

On Thu, Mar 10, 2016 at 11:18 AM, Thomas Eckardt <thomas.ecka...@thockar.com> wrote:

> Just think about the logic behind Bayesian and HMM - this will answer your
> question.
>
> Having the same mail in the spam folder multiple times will score its
> content as extremely spam-heavy, even if your users send the same
> content, just less often.
>
> Thomas
>
>
>
>
>
> From:    K Post <nntp.p...@gmail.com>
> To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> Date:    10.03.2016 16:58
> Subject: Re: [Assp-test] Max Number Duplicate File Names
>
>
>
> I know you're all RTFM, but there's plenty of places in the GUI where the
> description isn't exactly clear or right.  For example
>
> MaxFiles
> If you're not using subjects as file names ( UseSubjectsAsMaillogNames ),
> this is the maximum number of files to keep in each collection (spam &
> nonspam)
> It's actually less than this -- files get a random number between 1 and
> MaxFiles.
>
> I AM using subjects as file names, and MaxFiles DOES control the maximum
> number of files in each collection when MaintBayesCollection is on and no
> max age is set, despite what the description says.  The language is not
> clear, and that makes us assume things, sometimes incorrectly, about what
> the GUI really means.  We've been working this way since ASSP came out.
> Because of this, I had no way of knowing that MaxAllowedDups >really< only
> applied to the spam collection.  I assumed the GUI meant the whole log of
> spam and NOTspam.  I don't think that's an unreasonable assumption.  Call
> it an oversight or a mistake on my part, but none of that justifies an
> angry-sounding response from you.
>
> I'm not looking for a fight, but I feel like I have to keep justifying
> myself after you appear to be so angry with me, and the rest of us, who
> turn to you for enlightenment.  You're carrying the entire weight of this
> project on your shoulders.  It's a lot, I know.  Can we move on and have a
> reasonable discussion here?
>
> Is there a reason that MaxAllowedDups shouldn't also apply to the notspam
> collection?   Shouldn't we want that to be the case for the same reason
> that we have it for spam?   Maybe also to the errors collections?
>
> If we don't, wouldn't a staff member sending the same basic message to
> 5000 people (against my wishes, but I can't control everything) push 1/3
> of the other notspam messages out of the rebuild process?  How about if
> 20k messages are sent?
>
> Maybe I'm just not understanding, and that's why I'm asking, but I hope it
> doesn't result in any more scolding.
>
> Thank you
>
>
> On Thu, Mar 10, 2016 at 4:15 AM, Thomas Eckardt <thomas.ecka...@thockar.com> wrote:
>
> > >There are about 600 of those files in NotSpam.
> >
> > 'MaxAllowedDups','Max Number of Duplicate File Names'
> >   'The maximum number of logged files with the same filename (subject)
> > that are stored in the spam folder (spamlog),........
> >
> > I'll write in Hebrew - possibly the English is better if you translate it
> > back to English.
> >
> > Thomas
> >
> >
> >
> > From:    K Post <nntp.p...@gmail.com>
> > To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> > Date:    10.03.2016 00:29
> > Subject: [Assp-test] Max Number Duplicate File Names
> >
> >
> >
> > I've got UseSubjectsAsMaillogNames checked (the messages are stored in the
> > folders using the subject name followed by a 6-digit number, as expected)
> >
> > I've got MaxAllowedDups set to 3
> >
> > MaxBayesFileAge is 0
> > MaxFiles is 15000
> >
> > I'm noticing that MaxAllowedDups doesn't seem to be working.
> >
> > For example, a couple users often send emails with the subject
> > "Your Donation Receipt"
> > There are about 600 of those files in NotSpam.
> > Your_Donation_Receipt--123456.txt
> > where 123456 is a random differing number.
> >
> > Shouldn't only 3 of these files exist in the folder (with the exception of
> > those that were sent since the rebuild / maintenance window)?
> >
> > Thanks
> >
> >
>
> ------------------------------------------------------------------------------
> > Transform Data into Opportunity.
> > Accelerate data analysis in your applications with
> > Intel Data Analytics Acceleration Library.
> > Click to learn more.
> > http://pubads.g.doubleclick.net/gampad/clk?id=278785111&iu=/4140
> > _______________________________________________
> > Assp-test mailing list
> > Assp-test@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/assp-test
> >
> >
> >
> >
> > DISCLAIMER:
> > *******************************************************
> > This email and any files transmitted with it may be confidential,
> legally
> > privileged and protected in law and are intended solely for the use of
> the
> >
> > individual to whom it is addressed.
> > This email was multiple times scanned for viruses. There should be no
> > known virus in this email!
> > *******************************************************
> >