>> IMO there are 2 types of spam:
>> 1. made by accident (e.g. "* * * * *" instead of "@weekly" in crontab)
>> 2. made intentionally
>> The 1st can be handled by UUID - just drop any old related dataset from 
>> the inbox when a new one arrives
>> For the 2nd: what about accepting only datasets from "valid" UUIDs, meaning 
>> those where on average just 1 dataset/week/IPv4 (maybe /16 block) arrived in 
>> the last few weeks/months ?
Well, inspired by what Tor people do with Tor bridge stats:

- Create a UUID (never published, known only to the client and the Gentoo 
stats server)
- Calculate a hash of it. The hash is allowed to be published. The hash may be 
associated with contact information. The contact data may or may not be 
published. The hash is used for contacting people in case of questions.
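
A minimal sketch of the two steps above, in Python. The function names and 
the choice of a random (version 4) UUID and SHA-256 are my assumptions, 
nothing here is an agreed design:

```python
import hashlib
import uuid


def new_client_uuid() -> str:
    """Create the private UUID (assumption: a random version-4 UUID)."""
    return str(uuid.uuid4())


def public_hash(client_uuid: str) -> str:
    """Derive the publishable identifier (assumption: SHA-256, hex-encoded)."""
    return hashlib.sha256(client_uuid.encode("ascii")).hexdigest()


if __name__ == "__main__":
    secret = new_client_uuid()   # stays on the client and the stats server
    print(public_hash(secret))   # this value may be published
```

The same UUID always gives the same hash, so the server can recompute the 
published value from the UUID it receives.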

The stats sent by the client contain the UUID.
Stats are sent to a holding area on the stats server, where they stay for a 
while. When a new stats file arrives, the stats server deletes all older stats 
files for that UUID from that area.
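
Roughly, the "newest file per UUID wins" rule could look like the sketch 
below. StatsInbox and its methods are hypothetical names, and it keeps 
everything in memory instead of on disk just to show the logic:

```python
class StatsInbox:
    """Holding area that keeps only the newest stats per UUID (hypothetical)."""

    def __init__(self) -> None:
        self._latest: dict[str, bytes] = {}

    def receive(self, client_uuid: str, stats: bytes) -> bool:
        """Store new stats; return True if older stats for this UUID were dropped."""
        replaced = client_uuid in self._latest
        self._latest[client_uuid] = stats  # the older entry, if any, is discarded
        return replaced

    def pending(self) -> dict[str, bytes]:
        """Return a copy of what is currently waiting in the holding area."""
        return dict(self._latest)


if __name__ == "__main__":
    inbox = StatsInbox()
    for _ in range(1000):                  # e.g. a "* * * * *" crontab accident
        inbox.receive("uuid-a", b"stats")
    print(len(inbox.pending()))            # one entry per UUID remains
```

That way an accidental flood from one client never takes more than one slot 
in the holding area.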

Stats are trusted if they meet the conditions already mentioned by Brian Dolbec.

IMO we should not care about detecting spam, just try to detect valid UUIDs.

