Well, the problem is not just about IndexSearcher on a big index. The
problem is about many PythonThreads.

I can open a 20 GB or 50 GB index with IndexSearcher without a problem.

That's not what you said yesterday.

But when I try to create many PythonThreads that operate on an
IndexSearcher opened on a big index, I receive exceptions like these:
GC Warning: Header allocation failed: Dropping block.
GC Warning: Out of Memory!  Returning NIL!

Have you asked the [EMAIL PROTECTED] mailing list about this error ?

And when I decreased the number of threads to 2 or 3, the error went
away.
So, the question is: how can PythonThread affect this? Why does a
smaller number of them not produce exceptions of this kind?

There may be overhead involved in having multiple threads against a given index. Have you tried this under Java yet ? Have you asked the lucene-user mailing list ? A PythonThread is really a wrapper around a Java/libgcj thread that Python is tricked into thinking is one of its own.
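One way to test whether the thread count itself is the trigger is to keep the shared searcher but cap concurrency with a small fixed pool. Here is a minimal, stdlib-only sketch; the names `run_bounded_searches` and `search_fn` are mine, not PyLucene API, and `search_fn` is a placeholder for whatever you do with the shared IndexSearcher. With real PythonThreads you would substitute PyLucene's thread class for `threading.Thread`.

```python
import queue
import threading

def run_bounded_searches(queries, search_fn, max_workers=3):
    """Run search_fn over all queries using at most max_workers threads.

    Sharing one searcher across a small, fixed pool of threads keeps
    per-thread memory and GC pressure bounded, instead of spawning one
    thread per query.
    """
    work = queue.Queue()
    for q in queries:
        work.put(q)

    results = []
    lock = threading.Lock()

    def worker():
        # Drain the queue until it is empty, then exit.
        while True:
            try:
                q = work.get_nowait()
            except queue.Empty:
                return
            r = search_fn(q)  # stand-in for searcher.search(q)
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If the GC warnings disappear at `max_workers=2` or `3` but return at higher counts, that would confirm it is the number of live threads, not the index size, that matters.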


By the way, you say that you have a 51 GB index and a 20 GB index. What is the
size of the biggest single index file in these index directories ? There used
to be a bug in libgcj where it couldn't support files bigger than 2 or 4 GB (I
don't remember which). I know this bug is fixed in gcj 4.0; a PyLucene user
actually verified that. I do not know whether this bug has been fixed in the
version of gcj we're using, gcj 3.4.3.
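A quick way to answer that question is to scan the index directory for its biggest file. A stdlib-only sketch (the helper name `largest_index_file` is mine):

```python
import os

def largest_index_file(index_dir):
    """Return (path, size_in_bytes) of the biggest file under index_dir."""
    biggest, biggest_size = None, -1
    for root, _dirs, names in os.walk(index_dir):
        for name in names:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size > biggest_size:
                biggest, biggest_size = path, size
    return biggest, biggest_size

# Threshold relevant to the old libgcj large-file bug.
TWO_GB = 2 * 1024 ** 3
```

If the size returned is over `TWO_GB` (or 4 GB), you may be near the limit that older libgcj versions could not handle.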

Also, I just learned that using an unoptimized index is going to require more memory. How much more is a question for lucene-user as well. Optimizing your index is likely to push you over the 4 GB per-file limit in gcj < 4.0, though. Have you tried it (backing up your existing index first) ?
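Since optimizing rewrites the index in place, it is worth scripting the backup so it always happens first. A stdlib sketch; `backup_then_optimize` and `optimize_fn` are names I made up for illustration. With PyLucene, `optimize_fn` would open an IndexWriter on the directory and call its optimize() method.

```python
import shutil

def backup_then_optimize(index_dir, backup_dir, optimize_fn):
    """Copy the index aside before optimizing, so a failed optimize
    (e.g. hitting the large-file limit mid-merge) doesn't destroy
    the only copy.
    """
    # copytree raises if backup_dir already exists, which conveniently
    # refuses to clobber a previous backup.
    shutil.copytree(index_dir, backup_dir)
    optimize_fn(index_dir)
```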

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
