Yes, to cluster the job you need to use MapReduce on several boxes. That said, 100 million files will also work on a single powerful box. There are some configuration values in nutch-default.xml that can improve indexing speed.
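As a rough sketch of what such tuning could look like: the indexer.* properties below are the ones I would look at first. The values are only illustrative assumptions, not tested recommendations, so check the names and defaults against the nutch-default.xml of your version. Put the overrides in nutch-site.xml rather than editing nutch-default.xml itself.

```xml
<!-- Overrides for nutch-site.xml. All values are illustrative assumptions. -->
<property>
  <name>indexer.mergeFactor</name>
  <!-- Higher values merge segments less often: faster indexing, more open files. -->
  <value>100</value>
</property>
<property>
  <name>indexer.minMergeDocs</name>
  <!-- Number of documents buffered in memory before a segment is written: more RAM, fewer disk writes. -->
  <value>1000</value>
</property>
```

Larger values trade memory and open file handles for speed, so raise them step by step and watch the heap and the file descriptor limit.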
On 28.12.2005 at 09:56, R.Mayoran wrote:
Hi, I need to index about 100 million files. Is it possible to cluster this job? Are there any suggestions to increase the speed of indexing? Thank you in advance. Mayu.
