Hi,

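After every crawl iteration, check your crawldb (webdb) with the readdb
tool to see how many URLs were fetched and how many are still unfetched.
There is plenty of material linked from the wiki on this topic. Also
check your urlfilters, since overly strict filters limit what gets crawled.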
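
As a rough sketch of what I mean, assuming Nutch 1.x run from its install
directory, crawl data under ./crawl, and the regex URL filter in
conf/regex-urlfilter.txt (all three are assumptions, adjust to your setup):

```shell
# Assumed location of the crawl data; override by exporting CRAWL_DIR.
CRAWL_DIR="${CRAWL_DIR:-crawl}"

if [ -x bin/nutch ]; then
  # Print crawldb statistics (fetched vs. unfetched URL counts, etc.).
  bin/nutch readdb "$CRAWL_DIR/crawldb" -stats

  # Show the active (non-comment, non-blank) URL filter rules, if the file exists.
  if [ -f conf/regex-urlfilter.txt ]; then
    grep -Ev '^(#|$)' conf/regex-urlfilter.txt
  fi
else
  echo "bin/nutch not found; run this from the Nutch install directory"
fi
```

If the stats show few fetched URLs and a large unfetched count, the crawl
probably needs more iterations or a larger topN; if the total URL count
itself is small, the filters are the more likely culprit.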

hth

Lewis

On Fri, Oct 5, 2012 at 6:08 PM, Hailong Yang <[email protected]> wrote:
> Dear all,
>
>
>
> I am trying to crawl a large index (maybe more than 2 GB) for future
> analysis. However, after 11 hours of crawling, the crawl directory
> was 1.2 GB in total, but the index was only 50 MB. Could
> someone tell me how to configure the crawl so that I can retrieve a large
> enough index? Thank you!
>
>
>
> Best
>
>
>
> Hailong
>



-- 
Lewis
