NoRobotsClient doesn't follow the standard
------------------------------------------

                 Key: LABS-198
                 URL: https://issues.apache.org/jira/browse/LABS-198
             Project: Labs
          Issue Type: Bug
          Components: Droids
            Reporter: Javier Puerto


I see that the robots.txt URL is resolved relative to the current path, which 
does not follow the robots exclusion standard.

...
  public static URL findRobotsUrl(URL base, String prefix) throws MalformedURLException {
    URL url = new URL(base, "robots.txt");
    boolean exist = existUrl(url);
...

It should be "new URL(base, "/robots.txt");", so that the file is always looked 
up at the root of the host.
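
A minimal sketch of the difference, assuming a made-up base URL; the class name 
RobotsUrlDemo and the helper robotsUrl are hypothetical and not part of Droids:

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RobotsUrlDemo {

    // Hypothetical helper: resolves robots.txt against the root of the host,
    // because the spec starts with a leading slash.
    static URL robotsUrl(URL base) throws MalformedURLException {
        return new URL(base, "/robots.txt");
    }

    public static void main(String[] args) throws MalformedURLException {
        // Assumed example page located two directories below the site root.
        URL base = new URL("http://example.com/docs/guide/index.html");

        // Relative spec: resolved against the directory of the base URL.
        System.out.println(new URL(base, "robots.txt"));
        // prints http://example.com/docs/guide/robots.txt

        // Absolute-path spec: resolved against the root of the host.
        System.out.println(robotsUrl(base));
        // prints http://example.com/robots.txt
    }
}
```

Only the second form matches the location that the references below describe.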

I found this on the web: 
 * http://www.robotstxt.org/norobots-rfc.txt (sec 3.1)
 * http://en.wikipedia.org/wiki/Robots.txt
 * http://www.w3.org/TR/html4/appendix/notes.html#h-B.4.1.1

I have attached a patch that fixes this behavior.

Regards.
