IndexSearcher
Maybe too general a question, but is there anything about creating an IndexSearcher(directory) object that would make the instantiation really slow?

I have one index where the instantiation is very fast, to the point where I don't need to do any pooling. A new index I have created takes a very long time to create the IndexSearcher object. With a 30 MB index, it can take about 30 seconds just to instantiate an IndexSearcher. It almost seems like it is reading the index at that point.

The only difference between the indexes is the number of fields indexed; the newer one has only one field indexed.

Is there any way to speed up that instantiation, or do I have to use a pooling system?

Thanks for any suggestions,
-Gus

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: IndexSearcher
Thanks Hoss,

I did figure out that I was putting about 400 stored fields per document into my new index, more than in my prior indexes. Reducing the number of stored fields seems to have helped significantly. I do call writer.optimize() after loading in documents, but I'm not sure how I would set the number of segments?

I think I will keep the IndexSearcher statically for all instances. The slow times I was seeing weren't acceptable even for that, though.

Since this is a case of really only needing to search on one field and using the index as a storage medium for the rest of the data (pretty much textual data), I'm thinking it would make sense to get the latest version of Lucene and create a two-field index. Something like:

Field1: id
Field2: serialized data object

Any reason why that wouldn't be fast? I have been having elusive memory issues with my other usage; maybe you just helped me find that solution as well.

Thanks,
-Gus

-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 22, 2006 4:02 PM
To: java-user@lucene.apache.org
Subject: Re: IndexSearcher

: I have one index where the instantiation is very fast, to the point where I
: don't need to do any pooling. A new index I have created, takes a very long
: time to create the IndexSearcher object. With a 30mb index, it can take
: about 30 seconds just to instantiate an IndexSearcher(). It almost seems
: like it is reading the index at that point.
:
: The only difference between the indexes has been the # of fields indexed.
: The newer one only having one field indexed.

If I remember correctly, the IndexSearcher constructor doesn't do anything but open an IndexReader. IndexReader.open() opens a MultiReader on all of the segments, and each of the SegmentReaders opens up a bunch of files.

So off the top of my head, one thing that can make a difference in the new IndexSearcher times is how many segments you have in your index (i.e., is it optimized?). Using the compound file format can probably make a difference as well.

: Any ways to speed up that instantiation? Or do I have to use a pooling
: system?

Even if you get it down to 0.1 seconds, I would still reuse the same IndexSearcher as much as possible. See previous replies from me in the archive about memory for my reasoning.

-Hoss
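The advice to reuse one searcher could be sketched roughly as follows. This is only an illustration, not code from the thread: `SharedInstance` is a hypothetical helper, and it is kept generic so the sketch compiles without Lucene on the classpath. In the scenario above, the type parameter would be IndexSearcher and the factory would open it on the index directory.

```java
import java.util.function.Supplier;

/**
 * Lazily creates one expensive object and hands the same instance to
 * every caller. Hypothetical helper: in the thread above, T would be
 * the IndexSearcher; it is left generic here to avoid a Lucene dependency.
 */
final class SharedInstance<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, creating it on the first call only. */
    T get() {
        T result = instance;
        if (result == null) {
            synchronized (this) {
                // Re-check under the lock so the factory runs at most once.
                result = instance;
                if (result == null) {
                    result = factory.get();
                    instance = result;
                }
            }
        }
        return result;
    }
}
```

With something like this held in a static field, the slow open is paid once per process rather than once per search. The sketch does not handle reopening after the index changes, which a real application would need to add.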
RE: Custom sort/basic question
Hey John,

My understanding is that if you add a field with the same name as a previously added field, you will be overwriting the value stored in the document. So if you add:

doc.add(Field.Text("sequence", "1"));
doc.add(Field.Text("sequence", "2"));
doc.add(Field.Text("sequence", "3"));

afterwards, the field sequence would hold a value of 3. I'm guessing that by now you have already tested that.

-Gus

-----Original Message-----
From: John Powers [mailto:[EMAIL PROTECTED]
Sent: Monday, November 21, 2005 3:01 PM
To: java-user@lucene.apache.org
Subject: Custom sort/basic question

If I add keywords to a document at the same time, will they stay in that order?

Create new doc A:

doc.add(Field.Text("category", "toys"));
doc.add(Field.Text("sequence", "235"));
doc.add(Field.Text("category", "bears"));
doc.add(Field.Text("sequence", "63"));
doc.add(Field.Text("category", "trucks"));
doc.add(Field.Text("sequence", "56"));

Create new doc B:

doc.add(Field.Text("category", "computers"));
doc.add(Field.Text("sequence", "7"));
doc.add(Field.Text("category", "bears"));
doc.add(Field.Text("sequence", "12"));
doc.add(Field.Text("category", "trucks"));
doc.add(Field.Text("sequence", "772"));

I want to sort on the right sequence number, so I need to find the right category. If I iterate through doc.getFields("category") and find the category I want is #2, then can I get sequence #2 and know it's the right one? Or does everything get jumbled up in the indexing process?
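One way to avoid depending on how same-named fields are stored or ordered is to give each category its own sequence field, so the sequence for "bears" is looked up by name rather than by position. The sketch below is only an illustration under that assumption: the `sequence_` prefix and the `CategorySequenceFields` class are made up for this example, and plain maps stand in for the document so that it runs without Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Builds one field name per category, e.g. "sequence_bears" -> "63".
 * The naming scheme is hypothetical; each entry of the returned map
 * would become one field added to the document.
 */
final class CategorySequenceFields {
    static final String PREFIX = "sequence_";

    /** Name of the field holding the sequence number for a category. */
    static String fieldFor(String category) {
        return PREFIX + category;
    }

    /** Turns category -> sequence pairs into field name -> field value pairs. */
    static Map<String, String> toFields(Map<String, Integer> sequenceByCategory) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : sequenceByCategory.entrySet()) {
            fields.put(fieldFor(e.getKey()), String.valueOf(e.getValue()));
        }
        return fields;
    }
}
```

For doc A above, this yields the fields sequence_toys, sequence_bears and sequence_trucks, and a search restricted to the bears category could then sort on the field named by fieldFor("bears"). Whether the values need zero-padding or a particular field type to sort numerically depends on the Lucene version in use.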