I had this problem when using Xerces to parse XML documents. I think the problem
lies in the Java garbage collector. I worked around it with a shell script that
invokes a separate Java program for each XML file, which adds that file to the index.
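
A rough sketch of the one-JVM-per-file idea, written here as a small Java driver rather than a shell script. It assumes a modern JDK (ProcessBuilder.inheritIO needs Java 7 or later). IndexOneFile is a hypothetical main class, not part of Lucene, that would take an index directory and one XML file and add that file to the index.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: run a separate JVM for each XML file, so that all memory used while
 * parsing and indexing one file is released when that JVM exits.
 * "IndexOneFile" is a hypothetical main class; it is not part of Lucene.
 */
class PerFileIndexer {

    /** Builds the command line that launches a child JVM for one XML file. */
    static List<String> buildCommand(String mainClass, String indexDir, File xmlFile) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path")); // reuse the parent's classpath
        cmd.add(mainClass);
        cmd.add(indexDir);
        cmd.add(xmlFile.getPath());
        return cmd;
    }

    /** Runs the child JVM, waits for it to finish, and returns its exit code. */
    static int indexInChildJvm(String mainClass, String indexDir, File xmlFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(mainClass, indexDir, xmlFile));
        pb.inheritIO(); // show the child's output in this console
        return pb.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: PerFileIndexer <indexDir> <xmlFile>...");
            return;
        }
        // One child JVM per XML file, run one after the other.
        for (int i = 1; i < args.length; i++) {
            int exit = indexInChildJvm("IndexOneFile", args[0], new File(args[i]));
            if (exit != 0) {
                System.err.println("Indexing failed for " + args[i]
                        + " (exit code " + exit + ")");
            }
        }
    }
}
```

Because each child JVM exits after one file, whatever memory the parser or indexer held goes away with the process. The price is JVM startup time for every file.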

Hope this helps...

Aaron
---------- Original Message ----------------------------------
From: "Rob Outar" <[EMAIL PROTECTED]>
Reply-To: "Lucene Users List" <[EMAIL PROTECTED]>
Date:  Fri, 14 Feb 2003 08:43:34 -0500

>Forgot to mention I am indexing thousands of XML files.  I ran a little test to
>see if that file was the problem, but it was indexed successfully after some
>time, and memory usage was huge.  I think that because I index these files
>one after the other, something is not getting cleaned up, leading to the
>exception.
>
>Thanks,
>
>Rob
>
>
>-----Original Message-----
>From: Rob Outar [mailto:[EMAIL PROTECTED]]
>Sent: Friday, February 14, 2003 8:25 AM
>To: Lucene Users List
>Subject: RE: OutOfMemoryException while Indexing an XML file
>
>
>So, to the best of your knowledge, the Lucene Document object should not cause
>the exception even though the XML file is huge and thousands of fields are
>being added to the Lucene Document object?
>
>Thanks,
>
>Rob
>
>
>-----Original Message-----
>From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
>Sent: Friday, February 14, 2003 8:21 AM
>To: Lucene Users List
>Subject: Re: OutOfMemoryException while Indexing an XML file
>
>
>Nothing in the code snippet you sent would cause that exception.
>If I were you I'd run it under a profiler to quickly see where the leak
>is.  You can even use something free like JMP.
>
>Otis
>
>--- Rob Outar <[EMAIL PROTECTED]> wrote:
>> Hi all,
>>
>> I was using the sample code, provided I believe by Doug Cutting, to index
>> an XML file. The XML file was 2 MB (kind of large), but while adding fields
>> to the Document object I got an OutOfMemoryException. I work with XML files
>> a lot, and I can easily parse that 2 MB file into a DOM tree. I can't imagine
>> a Lucene document being larger than a DOM tree. Pasted below is the SAX
>> handler.
>>
>> import java.io.IOException;
>>
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.xml.sax.Attributes;
>> import org.xml.sax.SAXException;
>> import org.xml.sax.helpers.DefaultHandler;
>>
>> public class XMLDocumentBuilder extends DefaultHandler {
>>
>>     /** A buffer for the character data of the current XML element */
>>     private StringBuffer elementBuffer = new StringBuffer();
>>
>>     private Document mDocument;
>>
>>     // SAXReader is not shown in this message.
>>     public void buildDocument(Document doc, String xmlFile)
>>             throws IOException, SAXException {
>>         this.mDocument = doc;
>>         SAXReader.parse(xmlFile, this);
>>     }
>>
>>     public void startElement(String uri, String localName, String qName,
>>             Attributes atts) {
>>         // start collecting text for the new element
>>         elementBuffer.setLength(0);
>>
>>         if (atts != null) {
>>             // add each attribute as a stored, indexed, tokenized field
>>             for (int i = 0; i < atts.getLength(); i++) {
>>                 String attname = atts.getLocalName(i);
>>                 mDocument.add(new Field(attname, atts.getValue(i),
>>                         true, true, true));
>>             }
>>         }
>>     }
>>
>>     // called when character data is found
>>     public void characters(char[] text, int start, int length) {
>>         elementBuffer.append(text, start, length);
>>     }
>>
>>     // add the buffered text as a field named after the element
>>     public void endElement(String uri, String localName, String qName) {
>>         mDocument.add(Field.Text(localName, elementBuffer.toString()));
>>     }
>>
>>     public Document getDocument() {
>>         return mDocument;
>>     }
>> }
>>
>> Any help would be appreciated.
>>
>> Thanks,
>>
>> Rob
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>