On Friday 26 Sep 2003 16:02, Pascal wrote:
> The goal of the hashing was to address the problem of keeping the list
> of IPs out of the hands of spammers.

That would be rather difficult to do while keeping the operation quick. 
Additionally, 1-way hashing is not guaranteed to produce unique results. If 
you take all possible 32-bit integers and compute the MD5 hash of each, I don't 
think there is any way to guarantee, other than by exhaustive search, that 
no two numbers will hash to the same value. This is bad 
because you could accidentally blacklist a good server along with an evil 
one. That is, of course, rather unlikely.
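For concreteness, a minimal sketch (Python is my choice here, nothing in the scheme mandates it) of hashing a single IPv4 address with MD5. The point is that distinct IPs almost certainly give distinct digests, but with 2^32 inputs mapping into a 2^128 space there is no proof that no pair collides short of checking them all:

```python
import hashlib
import socket

def hash_ip(ip: str) -> str:
    """Hash an IPv4 address (its 32-bit packed form) with MD5."""
    packed = socket.inet_aton(ip)           # 4-byte big-endian form of the IP
    return hashlib.md5(packed).hexdigest()  # 128-bit digest, hex-encoded

# Distinct inputs, (almost certainly) distinct digests:
print(hash_ip("192.0.2.1"))
print(hash_ip("192.0.2.2"))
```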

Of course, if the hashing function is complex enough to take 1 second to 
calculate on modern hardware, then it will take about 136 years of CPU time to 
work out every possible combination. However, this could be narrowed down 
considerably by only checking known fertile IP blocks, e.g. South America, 
the Far East, etc.
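The 136-year figure follows directly from the size of the IPv4 space at one hash per second:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ~31.6 million seconds

total_ips = 2 ** 32                     # every possible 32-bit address
years = total_ips / SECONDS_PER_YEAR    # at 1 second per hash
print(round(years, 1))                  # roughly 136 years of CPU time
```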

This would also mean that we are limiting server throughput to about 60 
emails/hour, notwithstanding any caching, which would likely help quite a lot.

The problem there is how long it will be before zombie machines start being 
used for a big distributed number-crunching project to discover open relays. 
Then we are pretty much back to square one...

> I have looked into obtaining an IP
> listing from existing RBLs and that seems to be their biggest concern.

That is indeed a problem. This would only be a solution if EVERYBODY were 
running an RBL blocker.

> Some even make you sign paperwork to that effect.  I left the phrasing
> generic ("one-way hashing function") because that is an area in which I
> am weak and was hoping to receive feedback on.  Is there no function
> computationally intensive enough to use in this application?

Not really, no.

The only way around it that I can see is manifest-less uploading of files 
under an SSK. That way you just insert one file per IP address. If the 
file retrieves, the IP is RBLed.
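A sketch of that mapping, with the `rbl/` naming scheme as an illustrative assumption and a `fetch` callable standing in for a real Freenet client request (the actual client API is not shown); the whole idea is one deterministically named file per blacklisted IP:

```python
def rbl_key_name(ip: str) -> str:
    """Deterministic per-IP document name under the shared SSK.
    The 'rbl/' prefix is an assumption, not an established convention."""
    return "rbl/" + ip

def is_blacklisted(ip: str, fetch) -> bool:
    """A successful retrieval means 'listed'. `fetch` stands in for a
    Freenet request and should return None on a DNF (Data Not Found)."""
    return fetch(rbl_key_name(ip)) is not None

# Usage with a dict standing in for the network:
store = {"rbl/198.51.100.7": b"listed"}
print(is_blacklisted("198.51.100.7", store.get))  # True
print(is_blacklisted("192.0.2.1", store.get))     # False
```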

Unfortunately, it means that:

1) Locating files would be slow. There is no way to tell if a file is 
really there or not, only whether we can find it or not. That means that up 
to HTL (hops-to-live) nodes will have to process a request and return either 
true or false. This would cause rather heavy network load in terms of 
processing, even if not in terms of data transfers. We are talking about a 
delay of up to several minutes for each relaying mail server. Now imagine how 
many Freenet hits there would be from a heavily loaded mail server. The 
chances are that Freenet would fall over.

Having said that, this is not too bad, because the chances are that a lot of 
spam will come from the same relay in a fairly short amount of time, so the 
files (or the DNF -- Data Not Found -- responses) will be cached.

2) Locating files would be unreliable. Without a manifest, we cannot tell 
if a file is there or not. We can only say that it isn't there within our 
data horizon.

3) It would cause a huge network load. Even with routing working perfectly, 
without manifests it would probably be more efficient to just store an 
inverse list, i.e. the list of all non-RBLed hosts. The chances are that 
those files would be found faster.

4) We would need a DNS server daemon that looks things up based on the data in 
Freenet. This would need writing.
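To illustrate the core of such a daemon, a sketch assuming the conventional DNSBL query naming (octets reversed, e.g. `2.0.0.127.rbl.example` asks about 127.0.0.2); the zone name and the `freenet_has` predicate standing in for the Freenet lookup are both assumptions:

```python
def query_name_to_ip(qname: str, zone: str = "rbl.example") -> str:
    """Convert a DNSBL-style query name back to the IP it encodes.
    '2.0.0.127.rbl.example' -> '127.0.0.2' (octets are reversed)."""
    suffix = "." + zone
    if not qname.endswith(suffix):
        raise ValueError("query outside our zone")
    octets = qname[: -len(suffix)].split(".")
    return ".".join(reversed(octets))

def is_listed(qname: str, freenet_has) -> bool:
    """True (i.e. answer with an A record) iff the per-IP file retrieves.
    `freenet_has` stands in for the actual Freenet lookup."""
    return freenet_has(query_name_to_ip(qname))
```

A real daemon would wrap this in an actual DNS server loop; only the lookup logic is sketched here.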

To summarize, if we want to keep the list of RBLed hosts secret:
It would probably kill the network.
If it doesn't kill the network, it would be too slow.
It would require writing a special daemon to handle such things.

I think that would be just too much work...

Gordan
_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl