I second this. Best practices for a production system: how to keep re-crawling without bloating the whole system.

On 5/17/2010 3:40 AM, Piet van Remortel wrote:
Re-crawling, and controlling that process, seems to me like an issue that
needs covering.

Thanks

Piet
Belgium

On Mon, May 17, 2010 at 9:32 AM, Alexander Aristov <[email protected]> wrote:

I would definitely want to see answers to questions about distributed
search.

Starting with crawling: how to run it in distributed mode and where to
store the collected pages and indexes. And ending with questions about the
relevancy of results obtained from different search servers.


Best Regards
Alexander Aristov


On 17 May 2010 05:27, Dennis Kubes <[email protected]> wrote:

Hi Everyone,

It has been a long time coming, but I have finally started to write a book
on Nutch.  It will be self-published and should hopefully be available in
PDF / paperback form in less than a month.

A while back we discussed a Nutch training seminar on the list.  I am not
ready to do a full-on seminar yet, but I will be putting up some training
and tutorial videos in the next few weeks.  I will update the list as those
become available.

I already have a general outline but it would help me to know the
following:

1) What types of things would you want explained in a book / videos on
Nutch?
2) What are the biggest problems you face using Nutch?
3) Anything special you would like answered or explained?

Thanks in advance for any responses.

Dennis




