Hi

I get a few errors while crawling sites. It usually happens when Nutch
attempts to retrieve PDFs or other documents.

fetching
http://www.un.org/esa/sustdev/publications/sdea/2_urbaine/pdf/03_partie_1.pdf
fetching
http://www.bilfingerberger.com/C125710E004ABFC5/Print/W26MNKY3155MARSEN
fetching
http://www.un.org/esa/sustdev/publications/sdea/1_villageoise/pdf/03_partie_1.pdf
fetching
http://www.sourcesecurity.com/companies/micro-site/verint-systems/case-studies.html
fetching http://www.britishland.com/images/Biodiversity Programme.pdf
*fetch of http://www.britishland.com/images/Biodiversity Programme.pdf
failed with: java.lang.IllegalArgumentException: Invalid uri
'http://www.britishland.com/images/Biodiversity Programme.pdf':
escaped absolute path not valid*
fetching
http://www.sourcesecurity.com/technical-details/cctv/image-capture/lenses/fujinon-fe185co57ha-1.html
fetching
http://www.sourcesecurity.com/product-filter/cctv/enclosures-and-fittings/consoles-racks-and-desks.html


As far as I can see, Nutch doesn't convert links properly: for some reason
it doesn't URL-escape them (the failing URL has a space in its path). Could
someone tell me whether there is a patch, or help me identify the place
where this happens?
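
To illustrate the kind of escaping I would expect, here is a minimal sketch
using only the JDK. It is not Nutch code: the class and method names are
made up for illustration, and where something like this would hook into the
crawl is exactly what I don't know. It also assumes the incoming link is not
already escaped, because the multi-argument java.net.URI constructors always
quote '%', so an existing %20 would turn into %2520.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper, not part of Nutch.
public class UrlEscapeSketch {

    // Percent-encodes characters that are illegal in a URL (such as spaces).
    // Assumes the input has NOT been escaped already.
    static String escape(String raw) {
        try {
            // java.net.URL only splits the string into components;
            // it does not reject the space in the path.
            URL url = new URL(raw);
            // The multi-argument URI constructor quotes illegal characters
            // in each component it is given.
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot escape: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.britishland.com/images/Biodiversity%20Programme.pdf
        System.out.println(
                escape("http://www.britishland.com/images/Biodiversity Programme.pdf"));
    }
}
```

With the link escaped like this before it reaches the fetcher, I would
expect the "escaped absolute path not valid" error to go away, but I have
not verified that inside Nutch.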

-- 
Best Regards
Alexander Aristov
