Re: out of memory while indexing one single file

Otis Gospodnetic Tue, 08 Jun 2004 08:38:57 -0700

Hello,

I don't know if the author of CLucene is on this list.  You may get
better help on the CLucene mailing list or forum on sf.net.


Otis

--- Yue Sun <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> First, I am not sure whether I should post my question here, since I
> am using CLucene (the C++ port of Lucene) to build indexes. I hope
> someone here can help me.
> 
> I am indexing on a Solaris machine with 1 GB of memory. I use a RAM
> writer and an FS writer, and write into the FS index once in a while.
> I am currently testing with single input files. For files smaller
> than 50 MB the program works well. With bigger files it runs out of
> the 1 GB of memory and crashes, no matter how I set parameters such as
> the merge factor and the frequency of writing to disk. My input files
> are in ASN.1 format, each with nested entries, and each entry has a
> varying number of fields. I index every outermost entry as a Lucene
> document and each data field as a Lucene field. So, unlike most
> setups, the number of fields indexed in my program is quite large:
> some files have more than 1000 different field names. There is no
> problem with the maximum number of file descriptors.
> 
> For the files that fail, some Lucene documents have more than 40,000
> field pairs (duplicate field names with different values). I think
> this is why so much memory is consumed. One of the failing cases has
> an input file of 66 MB and crashes after processing about 3800
> documents.
> 
> Is there any way to make the program use less memory? Any
> suggestions would be appreciated!
> 
> Regards,
> Yue Sun
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
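
For readers following the thread, the setup described in the quoted
message (documents buffered by a RAM writer and written into the FS
index every so often) can be sketched roughly as below. This is only an
illustration: Document, DiskIndex, BufferedIndexer and makeDocument are
hypothetical stand-ins, not CLucene classes, and flushing on a
field-count budget rather than a document count is an assumption made
here because the documents in question carry up to 40,000 field pairs
each. It shows the buffering pattern only and is not a confirmed fix
for the crash.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an index document: a list of (name, value) pairs.
// Duplicate field names with different values are allowed.
struct Document {
    std::vector<std::pair<std::string, std::string>> fields;
};

// Hypothetical stand-in for the on-disk index. It only counts what it
// receives; a real FS writer would merge the batch into its segments.
struct DiskIndex {
    std::size_t documents = 0;
    std::size_t merges = 0;

    void merge(const std::vector<Document>& batch) {
        documents += batch.size();
        ++merges;
    }
};

// Buffers documents in memory and hands them to the disk index once the
// number of buffered fields reaches a budget (assumption: field count is a
// better proxy for memory use than document count when documents are huge).
class BufferedIndexer {
public:
    BufferedIndexer(DiskIndex& disk, std::size_t maxBufferedFields)
        : disk_(disk), maxBufferedFields_(maxBufferedFields) {}

    void add(Document doc) {
        bufferedFields_ += doc.fields.size();
        buffer_.push_back(std::move(doc));
        if (bufferedFields_ >= maxBufferedFields_) {
            flush();
        }
    }

    // Writes any buffered documents to disk and empties the buffer.
    void flush() {
        if (buffer_.empty()) {
            return;
        }
        disk_.merge(buffer_);
        buffer_.clear();
        bufferedFields_ = 0;
    }

    std::size_t bufferedFields() const { return bufferedFields_; }

private:
    DiskIndex& disk_;
    std::size_t maxBufferedFields_;
    std::size_t bufferedFields_ = 0;
    std::vector<Document> buffer_;
};

// Builds a document with the given number of fields, all sharing one name.
Document makeDocument(std::size_t fieldCount) {
    Document doc;
    doc.fields.reserve(fieldCount);
    for (std::size_t i = 0; i < fieldCount; ++i) {
        doc.fields.emplace_back("field", "value" + std::to_string(i));
    }
    return doc;
}
```

With a budget of 50,000 fields, two 40,000-field documents would
trigger one flush after the second document is added. Note that a
single very large document is still held in memory in full while it is
being built, so a budget like this bounds only what accumulates between
flushes.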


