Kelvin,

<snip>
>
>This seems like a silly question, but will keeping hold of Document objects
>cause me to run into "Too many files open" problems? If each document object

No, unless you fail to close any files you read the document fields from.
It depends on how you obtain your document fields.

>has a Field.Text which contains a Reader, and the Reader isn't closed till
>the document is indexed, would this be an issue? Is the memory consumed by

I have not used Readers yet, so I don't know.

>Document objects directly proportional to the size of the object the Reader
>reads?

I think/hope the point of using a Reader is to avoid reading the whole
document into some buffer, so that the add() method of the index writer only
needs to tokenize the stream from the Reader.

As for memory usage during indexing: I have indexed docs with around 100,000
terms in a single String passed to Field(), with the maximum number of terms
per field set to ten million. The JVM occasionally takes more memory, but I
have not seen it use more than 17 MB yet (-verbose option to java).

I'd suggest reconsidering the use of a Hashtable to communicate between
threads. I know a Hashtable is thread safe, but some form of queue is closer
to what one would expect there. Also, with a bounded queue a limit on memory
usage is easily enforced, because the feeding thread will wait as long as
needed whenever the queue is full.

For more about queues:
http://g.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html
The FAQ entry there about producer and consumer threads convinced me to use
bounded queues after I got some out-of-memory crashes...

Have fun,
Ype
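
P.S. A minimal sketch of the bounded-queue idea above, in case it helps. It
is not taken from the thread: it assumes java.util.concurrent's
ArrayBlockingQueue in place of the util.concurrent package linked above, and
the class name, the DONE marker and the use of plain Strings as stand-ins for
documents are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedIndexingQueue {
    // Hypothetical end-of-input marker. It is compared by identity, so no
    // real document string can be mistaken for it.
    private static final String DONE = new String("DONE");

    /**
     * Feeds docCount stand-in documents through a queue holding at most
     * `capacity` items and returns what the consumer thread received.
     */
    public static List<String> run(int docCount, int capacity)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> indexed = new ArrayList<>();

        // Consumer: this is where a document would be handed to the index writer.
        Thread consumer = new Thread(() -> {
            try {
                for (String doc = queue.take(); doc != DONE; doc = queue.take()) {
                    indexed.add(doc);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: put() blocks while the queue is full, so at most
        // `capacity` documents are waiting in memory at any time.
        for (int i = 0; i < docCount; i++) {
            queue.put("doc-" + i);
        }
        queue.put(DONE);

        // join() makes the consumer's writes to `indexed` visible here.
        consumer.join();
        return indexed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10, 3).size() + " documents consumed");
    }
}
```

The capacity is the knob that bounds memory: the feeding thread simply waits
in put() until the indexing thread has caught up.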
