One of the exceptions I see a lot in my log files is an invalid URI exception like this:

2012-02-13 15:05:50,217 ERROR org.apache.nutch.protocol.httpclient.Http: java.lang.IllegalArgumentException: Invalid uri 'http://www.prolitegear.com/site/xdpy/ssg/Shelters/Shelter Accessories.html': escaped absolute path not valid

2012-02-13 15:05:50,226 ERROR org.apache.nutch.protocol.httpclient.Http: java.lang.IllegalArgumentException: Invalid uri 'http://www.prolitegear.com/activity/Adventure Racing/index.html': escaped absolute path not valid


(There is a space between "Shelter" and "Accessories".) At first I thought it was because of the space in the link, but these addresses from the same site go through with no problem:

2012-02-13 15:05:50,114 INFO org.apache.nutch.fetcher.Fetcher: fetching http://www.prolitegear.com/site/xdpy/ssg/Shelters/Shelter Accessories.html
2012-02-13 15:05:50,105 INFO org.apache.nutch.fetcher.Fetcher: fetching http://www.prolitegear.com/site/xdpy/ssg/Accessories/Sun Protection.html
2012-02-13 15:05:50,149 INFO org.apache.nutch.fetcher.Fetcher: fetching http://www.prolitegear.com/site/xdpy/ssg/Bargains & Closeouts/Sleeping Bags: 0° to 20° F.html
2012-02-13 15:05:50,100 INFO org.apache.nutch.fetcher.Fetcher: fetching http://www.prolitegear.com/site/xdpy/ssg/Climbing Gear/Protection.html


Does anybody have any idea what might be wrong here? (I am using protocol-httpclient, and all the links are actually valid; they work if you copy and paste them into a browser.)
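To illustrate the space theory, here is a minimal sketch that uses only the JDK's java.net.URI, not the Nutch or HttpClient code itself. The class name EscapeDemo and its helper methods are hypothetical, made up for this example. It shows that a raw space makes a URI string invalid, while percent-encoding the space as %20 gives a valid one. Whether this is really what trips up protocol-httpclient is an assumption on my part.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of Nutch or HttpClient.
class EscapeDemo {

    // Returns true if the string parses as a URI without any escaping applied.
    static boolean isValid(String raw) {
        try {
            new URI(raw);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds a URI from its parts. The multi-argument constructor
    // percent-encodes characters that are illegal in a path, such as spaces.
    static String escapePath(String host, String path) throws URISyntaxException {
        return new URI("http", host, path, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        String host = "www.prolitegear.com";
        String path = "/site/xdpy/ssg/Shelters/Shelter Accessories.html";

        // The raw form with a space is rejected by the parser.
        System.out.println(isValid("http://" + host + path));

        // The escaped form replaces the space with %20 and parses cleanly.
        String escaped = escapePath(host, path);
        System.out.println(escaped);
        System.out.println(isValid(escaped));
    }
}
```

If the escaped form turns out to be what the plugin needs, one option might be to normalize the URLs this way before they reach the fetcher, but I have not verified that.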


--
Kaveh Minooie

www.plutoz.com
