Sorry, I was wrong again.
I can use IndexReader after all... please disregard my last email :-D
----- Original Message -----
From: "MariLuz Elola" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, July 07, 2005 4:16 PM
Subject: Re: OUTOFMEMORY ERROR
Erik, I have a problem.
First, I have created several IndexWriters.
One of them has 210,000 documents, and in the future there will be
IndexWriters with millions of documents.
I need to obtain all the documents.
I am searching with the query ID:0* because this query returns all the
documents.
Specifically, I am reading the ID metadata (hits.doc(start).get(.ID)) to
collect the IDs of all the documents of a specific IndexWriter.
I am getting an OutOfMemoryError while doing it.
About maxClauseCount (by default 1024), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
You gave me an idea: to use IndexReader instead of IndexSearcher for
getting all the documents.
I don't think it is possible to use IndexReader, because I need the ID,
not the physical files:
Directory directory = FSDirectory.getDirectory(path, false);
IndexReader reader = IndexReader.open(directory);
for (int i = 0; i < reader.maxDoc(); i++) ............
Moreover, "directory" holds the documents of all the IndexWriters.
Mari Luz
----- Original Message -----
From: "MariLuz Elola" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, July 07, 2005 3:40 PM
Subject: Re: OUTOFMEMORY ERROR
Thanks Erik,
I was wrong: the query that actually throws the OutOfMemoryError is ==>
ID:0* -ID:xtent.
I have tried to reproduce the error with the query ID:0* alone, but the
exception doesn't appear.
I will use IndexReader instead of IndexSearcher for getting all the
documents. It's a good idea.
Another thing: when the user searches without entering any query,
internally I create the following query ==> ID:0* OR NOT ID:xtent. After
QueryParser parses it, I get ID:0* -ID:xtent (translated ==> ID:0*
AND NOT ID:xtent), don't I? Is QueryParser working incorrectly???
About maxClauseCount (by default 1024), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
Mari Luz
----- Original Message -----
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, July 07, 2005 2:46 PM
Subject: Re: OUTOFMEMORY ERROR
On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
The query is ==> ID:0*
This query returns all the documents, exactly 210,000 documents.
If the user doesn't specify any criterion in the search user
interface, the server searches all the documents.
Doing a prefix query (which ID:0* is) internally builds a
BooleanQuery OR'ing all unique terms in the ID field that begin with
a "0". The built-in limit is 1,024 clauses in a BooleanQuery.
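To make the expansion described above concrete, here is a toy model in plain Java. It is not Lucene's implementation: the class name, the sorted set standing in for the term dictionary, and the zero-padded IDs are all assumptions made for illustration. It only shows why a short prefix over a field with many unique terms runs into a clause limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: NOT Lucene code. It mimics rewriting a prefix query into
// one OR clause per matching term, with a cap on the number of clauses.
class PrefixExpansionSketch {

    // Same default limit the thread mentions for BooleanQuery.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    // Collects every term that starts with the prefix; fails once the cap is exceeded.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<String>();
        // tailSet(prefix) begins at the first term >= prefix in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("more than " + maxClauses + " clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> ids = new TreeSet<String>();
        for (int i = 0; i < 2000; i++) {
            ids.add(String.format("%07d", i)); // hypothetical zero-padded IDs
        }
        // A narrow prefix matches few terms and stays under the cap.
        System.out.println(expand(ids, "000001", DEFAULT_MAX_CLAUSES).size());
        // A prefix shared by every ID needs one clause per unique term.
        try {
            expand(ids, "0", DEFAULT_MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println("limit hit: " + e.getMessage());
        }
    }
}
```

Raising the cap removes the exception, but each matching term still becomes a clause held in memory. If every document has a unique ID, a prefix shared by all of them would need on the order of one clause per document.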
You will need to re-think your approach. If the goal is to return
all documents, then use IndexReader to walk them. If the goal is to
have a general user query expression where ID:0* would be entered you
will need to account for that possibility with more system resources
and bumping up the BooleanQuery limit or indexing differently so that
there are not so many terms being put into the BooleanQuery. It is
difficult to offer specific advice as I'm not sure what your use
cases are.
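The "walk them" suggestion can be sketched as a loop over document numbers that hands each ID to a callback instead of collecting search hits. The sketch below is plain Java and deliberately does not depend on Lucene: DocSource and IdHandler are hypothetical interfaces invented here. With Lucene 1.4.x an adapter could presumably delegate to IndexReader.maxDoc(), IndexReader.isDeleted(int) and IndexReader.document(int).get("ID"), where the field name "ID" is taken from the thread.

```java
// Hypothetical stand-in for the few reader operations the walk needs.
interface DocSource {
    int maxDoc();             // one greater than the largest document number
    boolean isDeleted(int n); // true if slot n holds a deleted document
    String id(int n);         // stored ID field of document n
}

// Hypothetical callback so IDs are processed one at a time, not accumulated.
interface IdHandler {
    void handle(String id);
}

class WalkAllDocsSketch {

    // Visits every live document exactly once and returns how many were visited.
    static int walk(DocSource source, IdHandler handler) {
        int visited = 0;
        for (int n = 0; n < source.maxDoc(); n++) {
            if (source.isDeleted(n)) {
                continue; // deleted slots still count toward maxDoc()
            }
            handler.handle(source.id(n));
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        final String[] ids = {"0001", null, "0003"}; // null marks a deleted slot
        DocSource fake = new DocSource() {
            public int maxDoc() { return ids.length; }
            public boolean isDeleted(int n) { return ids[n] == null; }
            public String id(int n) { return ids[n]; }
        };
        int count = walk(fake, new IdHandler() {
            public void handle(String id) { System.out.println(id); }
        });
        System.out.println(count + " live documents");
    }
}
```

No query is built at all in this approach, so the clause limit never comes into play, and memory use depends on what the handler does with each ID rather than on the size of the index.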
Erik
Mari Luz
----- Original Message -----
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, July 06, 2005 8:12 PM
Subject: Re: OUTOFMEMORY ERROR
We'll need some more details to help. What query was it?
Erik
On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:
Hi, I have a problem when I run a simple query, without sorting,
against an index with 210,000 documents.
After executing the query several times I get an OutOfMemoryError.
I am creating a new IndexSearcher(pathDir) for every search.
I don't know whether I need to create a single IndexSearcher
and cache it.
If I search an index with only 50,000 documents, the OutOfMemoryError
doesn't appear.
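The idea of creating one IndexSearcher and caching it can be sketched as a small per-path cache. This is plain Java without any Lucene dependency: SearcherFactory and SearcherCacheSketch are hypothetical names, and the factory body is only assumed to be something like new IndexSearcher(pathDir). The sketch shows the sharing pattern; it does not claim to be the cause of, or the fix for, the error reported here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory; with Lucene 1.4.x its body might be "new IndexSearcher(path)".
interface SearcherFactory<S> {
    S open(String indexPath);
}

// Keeps one shared searcher per index path instead of opening one per search.
class SearcherCacheSketch<S> {
    private final Map<String, S> cache = new HashMap<String, S>();
    private final SearcherFactory<S> factory;

    SearcherCacheSketch(SearcherFactory<S> factory) {
        this.factory = factory;
    }

    // synchronized so concurrent requests cannot open the same index twice
    synchronized S get(String indexPath) {
        S searcher = cache.get(indexPath);
        if (searcher == null) {
            searcher = factory.open(indexPath);
            cache.put(indexPath, searcher);
        }
        return searcher;
    }

    public static void main(String[] args) {
        final int[] opened = {0};
        SearcherCacheSketch<String> cache = new SearcherCacheSketch<String>(
            new SearcherFactory<String>() {
                public String open(String indexPath) {
                    opened[0]++;
                    return "searcher for " + indexPath; // stand-in object
                }
            });
        cache.get("/hypothetical/index");
        cache.get("/hypothetical/index");
        System.out.println("opened " + opened[0] + " time(s)");
    }
}
```

A cached searcher only sees the index as it was when it was opened, so a real cache would also need some way to reopen the searcher after the index changes and to close the old one.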
------------------------
ENVIRONMENT DESCRIPTION:
------------------------
---SERVER---
MEMORY 2GB
APP SERVER Jboss3.2.3
JAVA_OPTS -Xmx640M -Xms640M
----LUCENE 1.4.3-------
INDEX +- 210,000 documents
EACH DOCUMENT +- 20 fields (metadatas)
SIZE TEXT DOCUMENT 1k
------------------------
ERROR:
------------------------
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected Error; nested exception is:
java.lang.OutOfMemoryError
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.handleException(LogInterceptor.java:374)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.Container.invoke(Container.java:700)
18:52:18,662 ERROR [STDERR] at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
18:52:18,662 ERROR [STDERR] at sun.reflect.DelegatingMethodAccessorImpl.invok
.
.
Exception java.lang.OutOfMemoryError: requested 4 bytes for CMS: Work
queue overflow; try -XX:-CMSParallelRemarkEnabled. Out of swap space?
Could anybody help me???
Thanks in advance
Mari Luz