The NPR site has become self-aware and sent millions of clones of itself
around the internet. Now it is searching for all its children so it can take
over the world?

OK, just kidding. Do the requests arrive at regular intervals? It would be
helpful to know whether this is a single distributed system sniffing your
site or whether each IP appears to be acting individually.
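
One way to check the timing is sketched below. It is only a rough sketch and
makes several assumptions: that the access log is in Apache's Combined Log
Format, that the probes can be picked out by a "Java/" user-agent plus the
paths Mark listed, and that the names used here (PROBED, probe_times,
gaps_seconds, summarize) are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime
from statistics import median

# Assumption: the server writes Apache Combined Log Format lines.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# The non-existent paths from Mark's message.
PROBED = {
    "/about/privacypolicy.html", "/about/termsofuse.html",
    "/audiohelp/progstream.html", "/blogs", "/corrections", "/email",
    "/help", "/help/communityfaq.html", "/music", "/ombudsman", "/podcast",
}


def probe_times(lines):
    """Collect request timestamps per IP for Java-agent hits on PROBED paths."""
    by_ip = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        if not m["agent"].startswith("Java/"):
            continue
        if m["path"].rstrip("/") not in PROBED:
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        by_ip[m["ip"]].append(ts)
    return by_ip


def gaps_seconds(times):
    """Seconds between consecutive requests, after sorting by time."""
    ordered = sorted(times)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def summarize(by_ip):
    """Map each IP to (request count, median gap in seconds or None)."""
    summary = {}
    for ip, times in by_ip.items():
        gaps = gaps_seconds(times)
        summary[ip] = (len(times), median(gaps) if gaps else None)
    return summary
```

Feeding it the log would look something like
summarize(probe_times(open("access.log"))), where the file name is a
placeholder. If most IPs show a similar median gap, that would hint at one
coordinated system; widely varying gaps would suggest the clients are acting
on their own. Neither result would be conclusive by itself.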

Ryan

On Thu, Jun 3, 2010 at 5:54 PM, Keith Aric Hall <[email protected]> wrote:

> hrmm...do you have any NPR links on your site? If I do a search
> for "/help/communityfaq.html" AND "/ombudsman/" or
> "/audiohelp/progstream.html" I get references to NPR.org pages.
>
>
> Keith Aric Hall
>
> http://keitharichall.com/
> http://twitter.com/keitharichall
>
>
>
>
>
> On Jun 3, 2010, at 5:29 PM, Mark Phillip wrote:
>
> Evening folks,
>
> I have pretty high expectations for the Refresh Austin list whenever I have
> a tough question, but I might have found one stump-worthy.
>
> A couple of months ago I started seeing requests in my web server access
> log for "/ombudsman". I don't have an Ombudsman page, so it returned a 404.
> Digging a little deeper, I found the same IP repeatedly requesting the same
> set of non-existent pages on my site:
>
> /about/privacypolicy.html
> /about/termsofuse.html
> /audiohelp/progstream.html
> /blogs
> /corrections
> /email
> /help
> /help/communityfaq.html
> /music
> /ombudsman
> /podcast
>
> After a bit more digging, I realized that it wasn't coming from just one IP
> address. It turns out there are dozens of IP addresses all requesting the
> same non-existent URLs. The IPs are scattered across the globe with no
> common thread. The only user-agent listed in each request is a member of
> the "Java/1.6.0" family.
>
> I am 100% stumped on this one. All my Googling for community-sourced
> Java-based search spiders has come up completely empty.
>
>
> Any thoughts?  Solve this and I'll buy you a beer on Tuesday.
>
>
>
>
> Thanks,
> Mark
> http://markphillip.com
>
>
> --
> Our Web site: http://www.RefreshAustin.org/
>
> You received this message because you are subscribed to the Google Groups
> "Refresh Austin" group.
>
> [ Posting ]
> To post to this group, send email to [email protected]
> Job-related postings should follow http://tr.im/refreshaustinjobspolicy
> We do not accept job posts from recruiters.
>
> [ Unsubscribe ]
> To unsubscribe from this group, send email to
> [email protected]
>
> [ More Info ]
> For more options, visit this group at
> http://groups.google.com/group/Refresh-Austin
>
