Hello, Andi.

Well, the problem is not just about using IndexSearcher on a big index. The
problem is about having many PythonThreads.

I can open a 20 GB or 50 GB index with IndexSearcher without a problem.
But when I create many PythonThreads that operate on an
IndexSearcher opened on a big index, I get warnings like these:
GC Warning: Header allocation failed: Dropping block.
GC Warning: Out of Memory!  Returning NIL!

When I decreased the number of threads to 2 or 3, the
error went away.
So the question is: how can PythonThread affect this? Why does a smaller
number of them not produce errors of this kind?

>> I have following structure of program:
>>
>> I am trying to create 10 running threads of LuceneWorkerThread from
>> main thread.
>>
>> When I set the number of threads to 4 or fewer, it runs without
>> an exception. The number of PythonThreads affects this exception!
>>
>> Any ideas?
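
To make the pattern concrete, here is a minimal sketch of a fixed-size worker pool that shares one searcher, so the thread count can be changed in one place. It is not my actual program: FakeSearcher and run_queries are hypothetical names, FakeSearcher only stands in for an IndexSearcher opened on the big index, and plain threading.Thread is used where the real code uses PythonThread. Whether sharing a single searcher changes the GC warnings is an open question, not something this sketch shows.

```python
import queue
import threading


class FakeSearcher:
    """Hypothetical stand-in for an IndexSearcher opened on a large index."""

    def search(self, query):
        return "hits for %s" % query


def run_queries(searcher, queries, num_workers):
    """Run all queries on num_workers threads that share one searcher."""
    tasks = queue.Queue()
    for q in queries:
        tasks.put(q)

    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker pulls queries until the queue is empty, then exits.
        while True:
            try:
                q = tasks.get_nowait()
            except queue.Empty:
                return
            hits = searcher.search(q)
            with lock:
                results[q] = hits

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, going from 10 workers to 3 is a change to num_workers only, which makes it easy to find the thread count at which the warnings start.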

AV> I don't know what the memory requirements are for a given index size. That is
AV> a question for the lucene-user or lucene-dev mailing lists. If you reach the
AV> libgcj memory limits, there may be some environment variables you can set to
AV> change them too. That is a question for the [EMAIL PROTECTED] mailing list. I
AV> do not know what they are either, but I've seen them posted and discussed
AV> before there.

AV> Here is what I'd do to resolve this issue:

AV>    1. Try it under Java, if it fails the same way, solve it there first, with
AV>       help from the lucene mailing lists. To open up a huge index with Java
AV>       Lucene, you actually don't need to write a single line of java code, the
AV>       src/demo/org/apache/lucene/demo/SearchFiles.java should be enough, it
AV>       takes a command line argument to an index directory.
AV>       If there are patches that can be applied to the Java Lucene code base
AV>       before it gets compiled into PyLucene, that help with this, I'd be happy
AV>       to apply them.

AV>    2. If it works fine under Java yet fails under python/PyLucene, investigate
AV>       what the libgcj memory limits are, with help from the [EMAIL PROTECTED]
AV>       mailing list.

AV>    3. If all fails then we have a harder problem that will require some
AV>       rethinking and some fixing of the underlying software packages used.

AV> By the way, you say that you have a 51 Gb index and a 20 Gb index. What is the
AV> size of the biggest single index files in these index directories? There used
AV> to be a bug in libgcj that it couldn't support files bigger than 2 or 4 Gb (I
AV> don't remember). I know this bug is fixed in gcj 4.0, a PyLucene user actually
AV> verified that. I do not know that this bug has been fixed in the version of
AV> gcj we're using, gcj 3.4.3.
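
To answer the file size question, a small script like the one below can list the largest files in an index directory. This is only a sketch using the Python standard library; largest_files is a hypothetical helper name, and the directory path passed to it has to be the real index directory.

```python
import os


def largest_files(index_dir, top=5):
    """Return up to `top` (size_in_bytes, file_name) pairs, largest first."""
    sizes = []
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        # Skip subdirectories; only regular files are of interest here.
        if os.path.isfile(path):
            sizes.append((os.path.getsize(path), name))
    sizes.sort(reverse=True)
    return sizes[:top]
```

Any entry larger than 2 * 1024 ** 3 bytes would be a candidate for the libgcj large-file bug mentioned above.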

AV> Andi..




Yura Smolsky.

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
