Anyone interested in this matter might want to inspect their own Cherokee log files --- especially if their site uses mixed-case URLs.
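For anyone who wants to check their own logs, here is a rough sketch of one way to do it. It assumes the log is in an Apache-style "combined" format (an assumption about how your logger is configured, so adjust the pattern if yours differs), and `wrong_case_hits` and `canonical_paths` are hypothetical names made up for this example. It counts, per IP and user agent, the requests that match a known path in everything but letter case.

```python
import re
from collections import Counter

# Assumption: Apache-style "combined" log lines; adjust to your logger's format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def wrong_case_hits(lines, canonical_paths):
    """Count (ip, agent) pairs that request a known path with the wrong case."""
    by_lower = {p.lower(): p for p in canonical_paths}
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        canonical = by_lower.get(path.lower())
        if canonical is not None and path != canonical:
            hits[(m.group("ip"), m.group("agent"))] += 1
    return hits
```

Feeding it an open log file and the list of real paths, then calling `hits.most_common(10)`, should show which clients are responsible for most of the wrong-case requests.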
90% or more of the offending requests come from one source, and every URL it requests is entirely lowercased. That is their doing, not ours --- for sure.

IP: 207.46.119.86
Browser: MSNPTC/1.0 (compatible; MSIE 6.0; Windows NT 5.2; MyIE2; .NET CLR 1.1.4322; .NET CLR 1.0.3705)

That is a Microsoft spider of some sort --- the same requests come from other IPs as well. Someone called it the Microsoft AdCenter spider.

traceroute:
 3: ge-7-3-0-0.dal-64cb-1b.ntwk.msn.net 1.382ms asymm 5
 4: ge-4-3-0-0.dal-64cb-1a.ntwk.msn.net 2.059ms
 5: xe-1-0-1-0.bay-16c-1b.ntwk.msn.net 43.291ms
 6: ge-1-2-0-0.by2-64c-1b.ntwk.msn.net 43.870ms
 7: ten7-4.by2-76c-1a.ntwk.msn.net 43.570ms
 8: 10.22.48.2 43.445ms asymm 9

I have about an 18 megabyte log file of these invalid requests from February so far :(

Anyone claiming Microsoft engineers good products needs to be thoroughly scrutinized. This same spider is also known to pseudo-DDoS sites.

Once I implemented the redirect, I saw 5 or more additional requests per second on our application server. Quite an instant jump in traffic, just for handling their mess.

Off to find some official contact managing this "spider".

On Sun, Feb 20, 2011 at 10:54 AM, pub crawler <[email protected]> wrote:
> We have a high traffic problem.
>
> Have a directory:
>
> http://www.website.com/Directory/whatever.php
>
> Search spiders are going nuts requesting this (1000's of these wrong
> requests a day)
> http://www.website.com/directory/whatever.php
>
> How do I simply handle this transparently to internally redirect to
> the proper /Directory subdirectory instead of the wrong /directory
> subdirectory?
>
> (I am at a loss on the regex stuff - anyone with a useful
> tool/builder/reference please recommend).
>
> Thanks!
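For reference, the logic of the redirect asked about above can be sketched in a few lines. This is not Cherokee configuration, just an illustration of the idea in Python; `canonical_redirect` and `canonical_dirs` are hypothetical names, and `Directory` is the example directory from the original question. It compares the first path segment case-insensitively against the known directory names and returns the correctly cased path only when the case is wrong.

```python
def canonical_redirect(path, canonical_dirs=("Directory",)):
    """Return the correctly cased path to redirect to, or None if no fix is needed."""
    by_lower = {d.lower(): d for d in canonical_dirs}
    parts = path.split("/")
    # For an absolute path, parts[0] is "" and parts[1] is the first directory.
    if len(parts) < 3:
        return None
    wanted = by_lower.get(parts[1].lower())
    if wanted is None or parts[1] == wanted:
        return None  # unknown directory, or the case is already correct
    parts[1] = wanted
    return "/".join(parts)
```

Returning `None` for paths that are already correct matters: a rule that matches case-insensitively and always redirects would also catch `/Directory/...` and loop. The same caution applies if this is expressed as a regex rule in the web server instead.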
