Re: Number Searches vs Character

Chris Hostetter Mon, 30 Jan 2006 14:24:23 -0800

PrefixQuery is implemented as a BooleanQuery using term expansion: it is
rewritten into one clause per indexed term that starts with the prefix.
what that means is that a prefix query on a common prefix is much more
expensive than a prefix query on a less common prefix.  not just in terms
of the number of documents that match, but because of the number of terms
that match the prefix.
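
to make the term-count point concrete, here is a minimal sketch that does not
use Lucene at all.  it models the term dictionary as a sorted set and counts
how many terms a prefix expands to.  the class name, the helper names and the
sample data (5000 ids starting with "12", three words starting with "de") are
all made up for illustration.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    /** Returns every term in the sorted dictionary that starts with the prefix. */
    static SortedSet<String> expand(TreeSet<String> terms, String prefix) {
        // In sorted order, all strings starting with the prefix fall in the
        // half-open range [prefix, prefix + '\uffff').
        return terms.subSet(prefix, prefix + Character.MAX_VALUE);
    }

    /** Builds a hypothetical term dictionary: many numeric ids, a few words. */
    static TreeSet<String> sampleTerms() {
        TreeSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 5000; i++) {
            terms.add(String.valueOf(120000000 + i)); // unique ids starting with "12"
        }
        for (int i = 0; i < 100; i++) {
            terms.add(String.valueOf(300000000 + i)); // ids with a different prefix
        }
        terms.add("description");
        terms.add("detail");
        terms.add("deluxe");
        terms.add("catalog");
        terms.add("my");
        return terms;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = sampleTerms();
        // Each matching term would become one clause in the expanded query.
        System.out.println("12* expands to " + expand(terms, "12").size() + " terms");
        System.out.println("de* expands to " + expand(terms, "de").size() + " terms");
    }
}
```

with this made-up data, 12* expands to 5000 terms and de* to 3, even though
both prefixes are two characters long.  an index full of unique numeric ids
tends to look like the first case.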


assuming "pq" is your PrefixQuery object, take a look at the output of
calling pq.rewrite(yourReader).toString() and compare the difference
between your 12* and your de* queries ... i'm guessing you'll find that
the 12* approach is a lot bigger.

if you poke around you'll find mention of a ConstantScoreRangeQuery ...
using the ideas from that, you can implement a much faster version of
PrefixQuery that doesn't score documents based on term frequency ... which
may be ok depending on your needs.
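
a rough sketch of that idea, again in plain Java rather than against the
Lucene API: walk the terms that share the prefix once, and set a bit for
every document that contains any of them.  no per-term clause is built and
no score is computed; every hit is treated the same.  the postings map, the
class name and the sample data are assumptions for illustration, not
Lucene's actual data structures.

```java
import java.util.BitSet;
import java.util.TreeMap;

public class ConstantScorePrefixSketch {

    /**
     * Collects the ids of all documents containing at least one term that
     * starts with the prefix. Term frequency is ignored entirely.
     */
    static BitSet matchPrefix(TreeMap<String, int[]> postings, String prefix) {
        BitSet hits = new BitSet();
        // All terms with the prefix sort into [prefix, prefix + '\uffff').
        for (int[] docIds : postings.subMap(prefix, prefix + Character.MAX_VALUE).values()) {
            for (int docId : docIds) {
                hits.set(docId); // a document is either a hit or not
            }
        }
        return hits;
    }

    /** Hypothetical postings: term -> ids of the documents containing it. */
    static TreeMap<String, int[]> samplePostings() {
        TreeMap<String, int[]> postings = new TreeMap<>();
        postings.put("123123123", new int[] {0});
        postings.put("124000001", new int[] {1});
        postings.put("125555555", new int[] {2, 3});
        postings.put("catalog", new int[] {0, 1});
        postings.put("deluxe", new int[] {2});
        postings.put("description", new int[] {0, 1, 2, 3});
        return postings;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> postings = samplePostings();
        System.out.println("12*  -> docs " + matchPrefix(postings, "12"));
        System.out.println("cat* -> docs " + matchPrefix(postings, "cat"));
    }
}
```

the cost still grows with the number of matching terms, but the work per
term is just setting bits, and the result can be used as a filter.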


: Date: Mon, 30 Jan 2006 13:51:15 -0500
: From: "Aigner, Thomas" <[EMAIL PROTECTED]>
: Reply-To: [email protected]
: To: [email protected]
: Subject: Number Searches vs Character
:
:
: I am curious what would be the difference between searching for a number
: versus a character.
:
: I have a large index consisting of a few fields (So index would look
: something like:  " 123123123 my description my catalog")
:
: Searching for 12* is much slower than searching for de*
: I don't have any issues searching for 3 or more characters.. just 1 or 2
: and the wildcard.
:       Stored this way: doc.add(Field.Text("allindex", all,true));
:       Using 1.4.3
:
: Any reason why that would be and how I can help speed that up?
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
:



-Hoss

