Sanford Whiteman wrote:
Just  fine.  Tens of thousands is a very, very different story. Again,
you  seem  to  be  missing  the point in thinking these two situations
don't  present  different  requirements.  "Solely  for  the purpose of
scaleability" is one of the purest and most commendable motivations in
application  design, since it encompasses both "in the wild" stability
and  performance  under  a  simple  umbrella.  Far  from a dirty word,
scaleability  is  what  makes so many open-source projects work in the
enterprise,   despite  their  many  other  foibles.  If  you  start  a
development  project  with  an express disregard for it, count out the
most capable programmers.
  

My friend is one of the most capable programmers you will find. He has done a great deal of work within Microsoft's framework over the last 5 years, and I don't expect this to be a challenge for him.  I'm still waiting to see if he wants to take this on.

In terms of scale, I would not expect a server to handle much more than 500,000 messages a day in a full Declude/IMail environment. With an average of more than 10 pieces of spam per address per day, a solution of this sort would need to resolve against 50,000 or so E-mail addresses.  While I'm not at all sure how to properly index this information for rapid use, I do know that you could split the data into user and domain, querying the domain first and then the user. For the most part, that would mean one query (full string match) against about 1,000 domains, and then another against an average of maybe 50 user addresses.  Pete over at Sniffer has figured out how to search the entire source of a message with tens of thousands of rules, complete with wildcards, and he does that quite efficiently considering that the application loads the entire rule base every time it is hit with a message.  I think a capable programmer would not be bothered by the demands at all.  There's absolutely no reason why this couldn't be done.
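
To make the domain-then-user split concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed implementation: the function names (build_index, is_known_address) are hypothetical, the input is assumed to be one user@domain address per line, and addresses are compared case-insensitively. It also swaps the "query" steps for in-memory hash lookups (a dict of sets), so neither step scans the 1,000 domains or the 50 users.

```python
from collections import defaultdict


def build_index(lines):
    """Build a {domain: set_of_users} index from 'user@domain' lines.

    Assumption: one address per line; blank lines and lines starting
    with '#' are ignored; malformed entries are skipped.
    """
    index = defaultdict(set)
    for line in lines:
        address = line.strip().lower()
        if not address or address.startswith("#"):
            continue
        user, sep, domain = address.rpartition("@")
        if not sep or not user or not domain:
            continue  # no '@', or an empty user or domain part
        index[domain].add(user)
    return dict(index)


def is_known_address(index, address):
    """Check the domain first, then the user within that domain."""
    user, sep, domain = address.strip().lower().rpartition("@")
    if not sep:
        return False
    users = index.get(domain)  # first lookup: full-string domain match
    if users is None:
        return False
    return user in users  # second lookup: user within that domain


if __name__ == "__main__":
    # Hypothetical sample data standing in for the text file.
    sample = [
        "alice@example.com",
        "bob@example.com",
        "# a comment line",
        "carol@example.org",
    ]
    index = build_index(sample)
    print(is_known_address(index, "Alice@Example.com"))
    print(is_known_address(index, "dave@example.com"))
```

With data sourced from a text file, the index could be built once with something like build_index(open(path)) and kept in memory, rather than re-read for every message.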

If you have a recommendation for how to best handle the task where data is initially sourced from a text file, please share it and I will pass that on.

Thanks,

Matt
-- 
=====================================================
MailPure custom filters for Declude JunkMail Pro.
http://www.mailpure.com/software/
=====================================================

