Hi,

--- Andrzej Bialecki <[EMAIL PROTECTED]> wrote:

> Matthias Jaekle wrote:
> > Hi Andrzej,
> > 
> > thanks for your response. I am not really familiar with the Lucene
> > internals.
> > 
> > I am just running Nutch with the default parameters on a Debian sarge
> > system with an ext3 file system, a maximum of 1024 open files, and
> > 1 GB RAM.
> > 
> > So is ext3 a bad file system for millions of files?
> 
> AFAIK reiserfs comes out much better in benchmarks than
> ext3.noatime, especially for small files.

Never used reiserfs, but I heard the same.

> > I cannot change the file system at the moment. So I think I should
> > change the parameters.
> > 
> > Which values would you suggest for
> > * indexer.mergeFactor?
> > * indexer.minMergeDocs?
> > * indexer.maxMergeDocs?
> > * indexer.termIndexInterval?

You probably don't want to touch indexer.termIndexInterval or
indexer.maxMergeDocs (the latter determines the maximum size of an
individual segment).
How high you can go with minMergeDocs is determined by your RAM/heap,
since that many documents are buffered in memory before being written
out as a segment. Your maximum open file descriptor limit determines
how high you can go with mergeFactor.
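
To put a rough number on the mergeFactor side, here is a back-of-the-envelope
sketch. It is not a Lucene or Nutch API, just arithmetic under assumptions I
am making up for illustration: a merge reads mergeFactor segments and writes
one new one, each segment consists of about 8 files (the real count depends
on the index format and the number of indexed fields), and 100 descriptors
are held back for the JVM, logs and sockets. The function names and default
values are hypothetical.

```python
def estimate_open_files(merge_factor, files_per_segment=8):
    """Rough count of files open during one merge.

    Assumption: merge_factor segments are read while one new segment
    is written, and every segment has files_per_segment files.
    """
    return (merge_factor + 1) * files_per_segment


def max_merge_factor(fd_limit, files_per_segment=8, reserve=100):
    """Largest merge factor whose estimate fits under the descriptor limit.

    reserve is an assumed number of descriptors kept free for everything
    else the process needs (jars, logs, sockets).
    """
    usable = fd_limit - reserve
    return max(usable // files_per_segment - 1, 0)


if __name__ == "__main__":
    limit = 1024  # the open-file limit mentioned in the question
    factor = max_merge_factor(limit)
    print("estimated upper bound for mergeFactor:", factor)
    print("estimated files open during a merge:", estimate_open_files(factor))
```

Treat the result as an upper bound under those assumptions rather than a
recommended setting. An index with many indexed fields has more files per
segment, so pass a larger files_per_segment for that case.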

Otis

____________________________________________________________________
Simpy -- simpy.com -- tags, social bookmarks, personal search engine
