On Tue, 16 Mar 2021 13:16:49 -0400
Steve Dondley wrote:

> I have been accumulating spam/ham samples and sorting them out into
> different directories on my server. As new spam/ham comes in, I throw
> it into the existing pile and then run "sa-learn --spam|--ham" on the
> whole pile.
>
> It dawned on me that this will get very slow as I eventually collect
> tens of thousands of emails. So I'm wondering if it's better to:
>
> 1) Place all new, incoming spam/ham into empty directories
> 2) Run sa-learn only on these directories with small samples

Why with small samples? Just train on new spam and ham and then move them.

> 3) Once done, move these new emails to an archive of spam/ham samples
> 4) Repeat
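
The train-then-move loop could look something like the sketch below. It is only an illustration, not a tested setup: the function name `train_and_archive`, its parameters, and the directory layout are made up for this example, and it assumes message filenames are unique so nothing in the archive gets overwritten. The only SpamAssassin pieces it relies on are `sa-learn --spam <dir>` and `sa-learn --ham <dir>`.

```python
import shutil
import subprocess
from pathlib import Path


def train_and_archive(incoming, archive, kind, run=subprocess.run):
    """Train on the new messages in `incoming`, then move them to `archive`.

    `kind` is "spam" or "ham". `run` is a hypothetical hook that defaults to
    subprocess.run, so the sa-learn call can be replaced in a dry run.
    Returns the number of messages moved.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")

    incoming, archive = Path(incoming), Path(archive)
    messages = [p for p in incoming.iterdir() if p.is_file()]
    if not messages:
        return 0  # nothing new, so skip the sa-learn call entirely

    # sa-learn accepts a directory and learns from the messages inside it.
    run(["sa-learn", f"--{kind}", str(incoming)], check=True)

    # Only move messages after training succeeded (check=True raises on failure).
    archive.mkdir(parents=True, exist_ok=True)
    for msg in messages:
        shutil.move(str(msg), str(archive / msg.name))
    return len(messages)
```

Calling it once per category, for example `train_and_archive("new-spam", "archive-spam", "spam")` and the same for ham (directory names are placeholders), keeps each sa-learn run proportional to the new mail rather than to the whole archive.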