Sahil Tandon wrote:
>
>Thanks Mark, this seems like the ideal approach. I'll try to hack
>something together borrowing from the various handlers (namely
>AvoidDuplicates.py) that are already in use.

Actually, AvoidDuplicates.py could serve as a good example, but it is not currently used. It is experimental and is not included in the default GLOBAL_PIPELINE.

>If I can understand how
>Mailman keeps the in-memory dictionary of Message-IDs mentioned in
>AvoidDuplicates.py, and implement an analogue for our use-case, that
>would do it.

The major problem with keeping these data in memory, other than purging "old" entries so that the dictionary doesn't grow too large, is that in-memory data aren't shared between runners. If the incoming queue is sliced, the multiple copies of IncomingRunner do not have access to each other's data. In your case, the input to the hash on which runners are sliced includes all the message headers and the listname, so it is likely that the "equivalent but different" listname messages will be in different slices of the hash space.

This is not a concern if IncomingRunner is not sliced. It is also not a concern with a disk-based cache as long as buffers are flushed after writing, because IncomingRunner locks the list whose message is being processed, which should prevent race conditions between different slices of IncomingRunner.

>The goal is to check whether a tuple of (message-id,
>listname) already exists in the dict and, if it does, raise
>Errors.DiscardMessage; otherwise, add the tuple to the dict and do
>nothing.

I would make a dictionary keyed on message-id + the canonical listname with value = the time seen. Then I could just check if the key for the current message exists and proceed as above, and I also have time stamps so I can periodically remove old entries.

-- 
Mark Sapiro <[email protected]>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B.
Dylan
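
For illustration, the dictionary described above (keyed on message-id plus canonical listname, with the time seen as the value) might look roughly like the sketch below. This is only a standalone sketch, not Mailman code: SeenCache, its check() and purge() methods, the max_age default and the local DiscardMessage class are all made-up names for the example. In a real handler the exception would be Errors.DiscardMessage, and how the canonical listname is derived is left to the caller (lowercasing here is just an assumption).

```python
import time


class DiscardMessage(Exception):
    """Local stand-in for Mailman's Errors.DiscardMessage (assumption)."""


class SeenCache:
    """Maps (message-id, canonical listname) to the time it was first seen."""

    def __init__(self, max_age=3600.0):
        # Seconds to remember an entry; the default is an arbitrary assumption.
        self.max_age = max_age
        self._seen = {}

    def purge(self, now=None):
        """Remove entries older than max_age; return how many were dropped."""
        now = time.time() if now is None else now
        cutoff = now - self.max_age
        stale = [key for key, seen_at in self._seen.items() if seen_at < cutoff]
        for key in stale:
            del self._seen[key]
        return len(stale)

    def check(self, message_id, listname, now=None):
        """Raise DiscardMessage for a repeat; otherwise record the key."""
        now = time.time() if now is None else now
        self.purge(now)
        # Assumption: lowercasing is enough to canonicalize the listname.
        key = (message_id, listname.lower())
        if key in self._seen:
            raise DiscardMessage("duplicate of %s for list %s" % key)
        self._seen[key] = now


cache = SeenCache(max_age=600)
cache.check("<abc@example.com>", "mylist")        # first copy: recorded
try:
    cache.check("<abc@example.com>", "MyList")    # same key: discarded
except DiscardMessage as exc:
    print("discarded:", exc)
```

As written this lives in one process, so it only helps if IncomingRunner is not sliced. With slices, the same mapping would have to be kept on disk and flushed after each write.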
