On 6/12/2009 3:29 AM, Scott Bennett wrote:
> In other words, by restricting just port 43 exits to only the legitimate whois
> IP addresses, I eliminated at least 70% of *all* exits through my tor node,
> which suggests to me that the vast, overwhelming majority of exits from the
> tor network are illegitimate and place a terribly taxing load upon the tor
> network as a whole.

Scott,

Thanks for your continued analysis; this is interesting information. However, the list of WHOIS servers you mentioned (and I snipped for brevity) is by no means a complete set of "the legitimate WHOIS IP addresses". In fact, it's much too small to draw any significant conclusions from, for at least two major reasons:

1) Any .com or .net WHOIS query that hits whois.verisign-grs.com (aka whois.internic.net in your list) with a legitimate domain name will result in a referral to an individual registrar's WHOIS server. Clients will often follow that referral, and your exit policy would not allow the resulting connection. There are potentially tens of thousands of these registrar WHOIS servers out there.

2) Your list excludes all ccTLD WHOIS servers. While the number of domains registered in ccTLDs is not significant compared to .com/.net, ccTLDs are quite popular in a number of places, particularly in some where Tor is also quite popular, e.g. Germany.

I'd be interested in seeing a comparison done with a more complete list.

I understand you feel very strongly about sampling the contents of the traffic, and that's perfectly understandable and appropriate, but sampling is probably the only way to make a firm determination of how much of this exit traffic really is WHOIS without crafting a VERY large exit policy. It may be possible, with appropriately engineered tools, to sample the traffic in a suitably anonymous way and still draw some conclusions, perhaps by simply attempting to determine whether the TCP session involves mostly text or binary data. That may still be a bit too intrusive, so I suppose we might just never know.

Given these shortcomings in the list, I definitely wouldn't suggest that such a list be considered a "default", as you'd be blocking a potentially significant amount of legitimate WHOIS traffic.

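To make point 1 a bit more concrete, here is a rough Python sketch of what a referral-following client does with the registry's reply. Everything in it is an assumption for illustration: the "Whois Server:" field label, the sample response, the hostnames under .test, and the helper names find_referral and allowed_by_policy. A real exit policy also matches on IP addresses rather than hostnames, so the hostname set below is only a stand-in.

```python
import re

# Assumed field label for the registrar referral line in a registry reply;
# real responses may label this differently.
REFERRAL_RE = re.compile(
    r"^\s*(?:Registrar\s+)?Whois Server:\s*(\S+)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def find_referral(response):
    """Return the referred registrar WHOIS host, or None if there is none."""
    match = REFERRAL_RE.search(response)
    return match.group(1).lower() if match else None


def allowed_by_policy(host, allowed_hosts):
    """Stand-in for an exit policy check (real policies match IP addresses)."""
    return host in allowed_hosts


# Made-up registry reply for a .com domain, not captured from a real server.
SAMPLE = """\
   Domain Name: EXAMPLE.COM
   Registrar: EXAMPLE REGISTRAR, INC.
   Whois Server: whois.example-registrar.test
   Referral URL: http://www.example-registrar.test
"""

# A short allow list in the spirit of the one under discussion.
ALLOWED = {"whois.verisign-grs.com", "whois.internic.net"}

referral = find_referral(SAMPLE)
print("referral:", referral)
print("second hop allowed:", allowed_by_policy(referral, ALLOWED))
```

The first query reaches the registry without trouble, but the follow-up connection to the registrar's server is the one a short allow list rejects, even though the lookup itself is entirely ordinary.
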
If you do attempt to dig up a more complete list of WHOIS servers, I'd certainly be interested to see what you come up with. Of course, I understand you're doing this all on your own time and dime, and I would never suggest that you're by any means obligated to do so. :)

Best Regards,
Tim

