Hi All,
Webmasters for some of the sites we are crawling are reporting high
numbers of requests resulting in 404s.
It appears that the type value in <script type="text/javascript"> is being
treated as a link relative to the current page, which is causing the bot to
generate hundreds of page-not-found errors.
E.g. on http://www.mysite.com/page1/
if there is a <script type="text/javascript"> tag, the bot will try to access the URL
http://www.mysite.com/page1/text/javascript
which of course never resolves to a valid URL.
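
To illustrate what I think is happening, here is a minimal Python sketch using only the standard library. It is not the crawler's own code, and the LINK_ATTRS allow-list and the extract_links helper are hypothetical names made up for this example. The first part shows that resolving "text/javascript" as a relative link against the page URL gives exactly the URL the webmasters are seeing. The second part shows one possible way to avoid it, assuming the extractor only follows attributes that are meant to hold URLs (href, src) and ignores type.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "http://www.mysite.com/page1/"

# Resolving the type attribute value as a relative link reproduces the bad URL.
bad_url = urljoin(BASE, "text/javascript")
print(bad_url)

# Hypothetical allow-list: for each tag, the one attribute treated as a link.
LINK_ATTRS = {"a": "href", "link": "href", "script": "src", "img": "src"}


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from allow-listed attributes only."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = LINK_ATTRS.get(tag)
        for name, value in attrs:
            # Attributes such as type="text/javascript" are skipped here.
            if name == wanted and value:
                self.links.append(urljoin(self.base, value))


def extract_links(base, html):
    parser = LinkExtractor(base)
    parser.feed(html)
    parser.close()
    return parser.links


html = '<script type="text/javascript"></script><a href="page2.html">next</a>'
print(extract_links(BASE, html))
```

With this approach the script tag contributes no link unless it has a src attribute, so the bogus /text/javascript request is never generated.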
Cheers,
Euan Clark