What exactly is the issue here?

Lewis

On Thu, Oct 25, 2012 at 4:59 PM, Alex diNorcia <[email protected]> wrote:
> http://alex.dinorcia.net/robots.txt has been in place and unchanged since
> Aug 24, 2004.
>
> * I'd also point out that it's crawling poorly to boot. The link it
> originally entered the directory with was
> http://alex.dinorcia.net/stuff_i_got_in_emails/?C=M;O=D
> It appears to append the descending-order part of the query string
> (";O=D") to each file URL, which then returns a 404 error.
>
> Here are some of the 14516 log entries that do not obey the rules:
> 119.139.27.64 - - [25/Oct/2012:04:22:08 -0400] "GET
> /stuff_i_got_in_emails/Japanese%20Engrish%204.jpg;O=D HTTP/1.0" 404 246 "-"
> "HD nutch agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:05:20:50 -0400] "GET
> /stuff_i_got_in_emails/LeafBlower.jpg;O=D HTTP/1.0" 404 238 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:06:26:43 -0400] "GET
> /stuff_i_got_in_emails/snowmen3.gif;O=D HTTP/1.0" 404 236 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:07:01:49 -0400] "GET
> /stuff_i_got_in_emails/Everything.About.The.Doctor.jpg;O=D HTTP/1.0" 404 255
> "-" "HD nutch agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:08:12:06 -0400] "GET
> /stuff_i_got_in_emails/fucked.jpg;O=D HTTP/1.0" 404 234 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:08:18:54 -0400] "GET
> /stuff_i_got_in_emails/H28.gif;O=D HTTP/1.0" 404 231 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:08:26:50 -0400] "GET
> /stuff_i_got_in_emails/Oprahs-Bees.gif;O=D HTTP/1.0" 404 239 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:08:50:31 -0400] "GET
> /stuff_i_got_in_emails/Reindeer_Mural.jpg;O=D HTTP/1.0" 404 242 "-" "HD
> nutch agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:09:02:52 -0400] "GET
> /stuff_i_got_in_emails/snowmen4.gif;O=D HTTP/1.0" 404 236 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:09:04:52 -0400] "GET
> /stuff_i_got_in_emails/ATT00173.jpg;O=D HTTP/1.0" 404 236 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:09:22:19 -0400] "GET
> /stuff_i_got_in_emails/?C=S;O=A HTTP/1.0" 200 159957 "-" "HD nutch
> agent/Nutch-1.1 (Think)"
> 119.139.27.64 - - [25/Oct/2012:10:55:09 -0400] "GET
> /stuff_i_got_in_emails/outofthecloset%20(5).jpg;O=D HTTP/1.0" 404 246 "-"
> "HD nutch agent/Nutch-1.1 (Think)"
>
>
>
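
On the second point in the report (the ";O=D" suffix on every file request), here is a minimal Python sketch of what the logs suggest. It assumes the directory index links to each file by a plain relative name such as "LeafBlower.jpg". The function names are made up for illustration, and resolve_with_param_carryover is a hypothetical faulty resolver, not Nutch's actual code. Standard relative resolution drops the base URL's query string, while carrying the trailing ";..." segment over to the link reproduces the requests that got 404s.

```python
from urllib.parse import urljoin

# Base URL taken from the report: a directory index sorted by
# modification time, descending.
BASE = "http://alex.dinorcia.net/stuff_i_got_in_emails/?C=M;O=D"


def resolve_standard(base: str, href: str) -> str:
    """Resolve href against base with the standard library's urljoin."""
    return urljoin(base, href)


def resolve_with_param_carryover(base: str, href: str) -> str:
    """Hypothetical faulty resolver (an assumption, not Nutch's code).

    It copies everything from the last ';' in the base URL onto any
    link that has no ';' of its own.
    """
    resolved = urljoin(base, href)
    cut = base.rfind(";")
    if cut != -1 and ";" not in href:
        resolved += base[cut:]
    return resolved


if __name__ == "__main__":
    for href in ("LeafBlower.jpg", "snowmen3.gif", "?C=S;O=A"):
        print("standard: ", resolve_standard(BASE, href))
        print("carryover:", resolve_with_param_carryover(BASE, href))
```

If the crawler behaves like the second function, that would also explain why the "?C=S;O=A" request in the log returned 200: that link carries its own ";" segment, so nothing extra is appended.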



-- 
Lewis
