Hello, I don't know if the author of CLucene is on this list. You may get better help on CLucene mailing list or forum on sf.net.
Otis --- Yue Sun <[EMAIL PROTECTED]> wrote: > Hi, > > First, I am not sure if I should post my question here, since I am > using > CLucene (C++ port of Lucene) to build indexes. Hope someone here > could > help me. > > I am indexing at a solaris machine with 1G memory. I use ram writer > and > fs writer, and write into fs index once a while. Now I am testing to > index single input files. While testing on files < 50M, the program > works well. While indexing bigger file, it runs out of 1G memory and > crashes, whatever I set some parameters such as merge factor and the > frequency writing to disk. My input files are in ASN.1 format, each > with > nested entries, and each entry with various number of fields. I index > > every outermost entry as a lucene document, and each data field as a > lucene field. So what is different from others, the number of fields > indexed in my program is quite big. Some files have more than 1000 > different field names. There is no problem with max file descriptors. > > For those failed, some lucene documents have more than 40,000 field > pairs (duplicate field names with different values). I think it is > the > reason why memory is consumed vastly. One of the failed cases is with > an > input file size: 66M, and crashes after processing about 3800 > documents. > > Is there any way to improve the program to use less memory? Any > suggestion would be apprecited! > > Regards, > Yue Sun > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
