Hi,

I have a list of around 1,000 seed URLs, which I crawl to depth 2 or 3.
This is done on a local machine with the following configuration (no
other resource-intensive processes running):
Dual Core (2.4 GHz),
4 GB RAM

It takes around 14-15 hours to crawl this seed list, which generates
around 21k pages of web content. Is there any way to optimize this so it
takes less time? The Nutch (1.2) settings are all at their defaults.
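From reading nutch-default.xml, the fetcher properties below look like the likely knobs, since by default Nutch fetches politely (10 threads total, one thread per host, with a delay between requests to the same host). The values here are guesses on my part, not tested settings:

```xml
<!-- Possible overrides in conf/nutch-site.xml; property names are from
     nutch-default.xml, the values are untested guesses. -->
<configuration>
  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
    <description>Total fetcher threads; the default is 10.</description>
  </property>
  <property>
    <name>fetcher.server.delay</name>
    <value>1.0</value>
    <description>Seconds between requests to the same host; default 5.0.</description>
  </property>
  <property>
    <name>fetcher.threads.per.host</name>
    <value>2</value>
    <description>Concurrent threads per host; default 1.</description>
  </property>
</configuration>
```

Would raising these be safe with ~1,000 seeds, or is the bottleneck likely elsewhere (e.g. a few slow hosts dominating the fetch)?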

Thanks for the help.

Regards,

Bharat Goyal

