NoRobotsClient doesn't follow the standard
------------------------------------------
Key: LABS-198
URL: https://issues.apache.org/jira/browse/LABS-198
Project: Labs
Issue Type: Bug
Components: Droids
Reporter: Javier Puerto
I see that the URL for robots.txt is resolved relative to the current path,
which does not follow the robots standard.
...
  public static URL findRobotsUrl(URL base, String prefix)
      throws MalformedURLException {
    URL url = new URL(base, "robots.txt");
    boolean exist = existUrl(url);
...
It should be new URL(base, "/robots.txt"); so that robots.txt is always
looked up at the root of the host.
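
To illustrate the difference, here is a minimal sketch. It is not the Droids
code: the class name RobotsUrlDemo, the helper names, and the example.com URL
are made up for the example, and it uses java.net.URI.resolve instead of the
URL constructor shown above, on the assumption that both resolve these two
references the same way.

```java
import java.net.URI;

final class RobotsUrlDemo {

    // Hypothetical helper: resolves "robots.txt" without a leading slash,
    // so the result stays in the directory of the base URL.
    static String relativeLookup(String base) {
        return URI.create(base).resolve("robots.txt").toString();
    }

    // Hypothetical helper: resolves "/robots.txt" with a leading slash,
    // so the result is always at the root of the host.
    static String rootLookup(String base) {
        return URI.create(base).resolve("/robots.txt").toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/a/b/page.html";
        // prints http://example.com/a/b/robots.txt
        System.out.println(relativeLookup(base));
        // prints http://example.com/robots.txt
        System.out.println(rootLookup(base));
    }
}
```

Only the second form gives the location that the references below describe
for robots.txt.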
I found this on the web:
* http://www.robotstxt.org/norobots-rfc.txt (sec 3.1)
* http://en.wikipedia.org/wiki/Robots.txt
* http://www.w3.org/TR/html4/appendix/notes.html#h-B.4.1.1
I have attached a patch that fixes this behavior.
Regards.