IndexSearcher

2006-02-22 Thread Gus Kormeier
Maybe too general a question, but is there anything about creating an
IndexSearcher( directory) object that would make the instantiation really
slow?


I have one index where the instantiation is very fast, to the point where I
don't need to do any pooling.  A new index I have created, takes a very long
time to create the IndexSearcher object.  With a 30mb index, it can take
about 30 seconds just to instantiate an IndexSearcher().  It almost seems
like it is reading the index at that point.


The only difference between the indexes has been the # of fields indexed.
The newer one only having one field indexed.

Any ways to speed up that instantiation? Or do I have to use a pooling
system?

Thanks for any suggestions,
-Gus

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: IndexSearcher

2006-02-22 Thread Gus Kormeier
Thanks Hoss,
I did figure out that I was putting about 400 stored fields per
document into my new index; more than my prior indexes. 
Reducing the number of stored fields seems to have helped significantly.

I do call writer.optimize() after loading in documents, but not sure how I
would set the # of segments?
I think I will keep the IndexSearcher statically for all instances. The slow
times I was seeing, weren't even sufficient for that though.

Since this is a case of really only needing to search on one field and use
the index as a storage medium for the rest of the data(pretty much textual
data), I'm thinking it would make sense to get the latest version of lucene
and create a two field index.
Something like:
Field1: id
Field2: serialized data object.

Any reason why that wouldn't be fast?

I have been having elusive memory issues with my other usage, maybe you just
helped me find that solution as well.
Thanks,
-Gus

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 22, 2006 4:02 PM
To: java-user@lucene.apache.org
Subject: Re: IndexSearcher


: I have one index where the instantiation is very fast, to the point where
I
: don't need to do any pooling.  A new index I have created, takes a very
long
: time to create the IndexSearcher object.  With a 30mb index, it can take
: about 30 seconds just to instantiate an IndexSearcher().  It almost seems
: like it is reading the index at that point.
:
:
: The only difference between the indexes has been the # of fields indexed.
: The newer one only having one field indexed.

If i remember correctly, The IndexSearcher constructor doesn't do anything
but open an IndexReader ... IndexReader.open() opens a MultiReader on all
of the segments, and each of the SegmentReaders open up a bunch of files.

so off hte top of my head, one thing that can make a differnece in the
new IndexSearcher times, is how many segments you have in your index
(ie: is it optimized?) ... using the compound fileformat can probably make
a difference as well.

: Any ways to speed up that instantiation? Or do I have to use a pooling
: system?

Even if you get it down to 0.1 seconds,i would still reuse the same
IndexSearcher as much as possible.  See previous replies from me in
the archive about memory for my reasoning.



-Hoss


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: Custom sort/basic question

2005-11-22 Thread Gus Kormeier
Hey John,
My understanding is that if you add a field with the same name as a
previous field added, you will be overwriting the value stored in the
document.

So if you add:
doc.add(Field.Text(sequence, 1));
doc.add(Field.Text(sequence, 2));
doc.add(Field.Text(sequence, 3));

Afterwards, the field sequence would hold a value of 3.

I'm guessing that by now you have already tested that.
-Gus


-Original Message-
From: John Powers [mailto:[EMAIL PROTECTED]
Sent: Monday, November 21, 2005 3:01 PM
To: java-user@lucene.apache.org
Subject: Custom sort/basic question


If I add keywords to a document at the same time, will they stay in that
order?


Create New doc A
doc.add(Field.Text(category, toys));
doc.add(Field.Text(sequence, 235));

doc.add(Field.Text(category, bears));
doc.add(Field.Text(sequence, 63));

doc.add(Field.Text(category, trucks));
doc.add(Field.Text(sequence, 56));



Create New doc B
doc.add(Field.Text(category, computers));
doc.add(Field.Text(sequence, 7));

doc.add(Field.Text(category, bears));
doc.add(Field.Text(sequence, 12));

doc.add(Field.Text(category, trucks));
doc.add(Field.Text(sequence, 772));

I want to sort on the right sequence number, so I need to find the right
category.   If I iterator through doc.getFields( category) and find the
category I want is #2, then can I get sequence #2 and know its the right
one?

Or does everything get jumbled up in the indexing process?


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]