On Aug 13, 2009, at 4:07 PM, Warren Togami wrote:
On 08/13/2009 11:04 AM, Justin Mason wrote:
IMHO, none of the network tests should be used during masscheck for
ham older than 4 weeks. Thoughts?
If we had enough ham to get useful results with that limit, sure. As
it is, I'm not sure that's the case.
If we are in agreement that old network tests are not good, all we
are doing is collecting large numbers of bad results and clouding
the statistical measurements of recent network tests.
I wouldn't say that we're in agreement. You're measuring historical
accuracy for reuse results. At that specific point in time, the domain
example.com was on the SURBL/URIDB/etc. list. How good was the list at
actual receive time? That's what you're measuring. You aren't going
back later and running those domains through the current lists.
Historical accuracy of network tests is key. Providing corpora without
SpamAssassin rules from actual receive time does not help scoring; it
hurts it.
Michael