Re: Problem to crawl pdf content in urls

lewis john mcgibbney Fri, 16 Sep 2011 08:38:55 -0700

I think at the very least you should provide some log output for the URLs
which are not being fetched. This would give us a chance of providing
accurate info.

http.content.limit is one of many options which might be the problem
here.
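
As a rough sketch only, and assuming truncation really is the cause (the
logs would need to confirm this), the property can be overridden in
conf/nutch-site.xml. The default is 65536 bytes, so larger PDFs are cut off
and then fail to parse; a negative value disables truncation.

```xml
<!-- conf/nutch-site.xml: goes inside the existing <configuration> element -->
<!-- Assumption: the PDFs exceed the default 65536-byte limit. -->
<property>
  <name>http.content.limit</name>
  <!-- -1 means no truncation; a larger positive byte count also works -->
  <value>-1</value>
</property>
```

If the PDFs still do not show up after this change, the limit was not the
problem and the fetcher and parser logs are the next place to look.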

Thank you

On Fri, Sep 16, 2011 at 6:57 AM, Mohammad Anbari <[email protected]> wrote:

> I have some URLs that contain many PDF links and I want to index them,
> but when I start crawling with Nutch 1.3, no PDF links are fetched. Is there
> any config I am missing?
> Thanks
>

-- 
*Lewis*
