You're not using any type of phrase search; the phrases need to be quoted. Try:
( (title:"John Bush"^4.0) OR (body:"John Bush") ) AND ( (title:John^4.0
body:John) AND (title:Bush^4.0 body:Bush) )
or maybe, with a slop of 4:
( (title:"John Bush"~4^4.0) OR (body:"John Bush"~4) ) AND (
(title:John^4.0 body:John) AND (title:Bush^4.0 body:Bush) )
--
Jason Pump
Technical Architect
Healthline
660 Third Street, Ste. 100
San Francisco, CA 94107
direct dial 415.281.3133
cell 510.812.1784
www.healthline.com
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
If the documents have some sort of fixed ranking value (pageweight) and
the documents are arranged in the index in that order then at some point
you can say there is no reason to look for more matches, e.g. even if
the words were next to each other in query order, the document couldn't
you're ever going to do is to protect the index as well as
you do the original documents.
jch
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
If you store a hash code of the word rather than the actual word, you
should be able to search for stuff but not be able to actually retrieve
it; you can trade precision for security based on the number of bits
in the hash code (e.g. 32 or 64 bits). I'd think a 64-bit hash would be
a
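A rough sketch of the hashed-term idea, in Python rather than Lucene. The names hash_term, build_index and search are made up for illustration, and BLAKE2b is just one convenient hash, not something the thread specifies. Note that an unkeyed hash like this can still be attacked by hashing a dictionary of likely words, so treat it as an illustration of the precision/size trade-off only.

```python
import hashlib
from collections import defaultdict


def hash_term(term: str, bits: int = 64) -> str:
    """Hash a lowercased term down to `bits` bits (must be a multiple of 8)."""
    digest = hashlib.blake2b(term.lower().encode("utf-8"), digest_size=bits // 8)
    return digest.hexdigest()


def build_index(docs: dict, bits: int = 64) -> dict:
    """Map hashed terms to the ids of the documents that contain them."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[hash_term(word, bits)].add(doc_id)
    return dict(index)


def search(index: dict, word: str, bits: int = 64) -> set:
    """Look a word up by hashing it the same way the indexer did."""
    return index.get(hash_term(word, bits), set())


docs = {"d1": "apple banana", "d2": "banana orange"}
index = build_index(docs)
print(search(index, "banana"))
```

Fewer bits means a smaller index but more collisions, i.e. more false hits at search time.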
of
the document, banana and orange at the end. Wouldn't your optimization
stop at the word apple and just return this word highlighted? Or do you
know of a way to quantify the match?
-Original Message-
From: Jason Pump [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 10, 2007 1:49 PM
To: java
Renaud, one optimization you can do on this is to try the first 10kb,
see if it finds text worth highlighting, if not, with a slight overlap
try the next 9.9kb - 19.9kb or just 9.9kb - end if you're feeling lazy.
This assumes that most good matches are at the start of the document,
and that
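Something like the following sketches the chunked approach in Python. The function name, the plain substring test standing in for a real highlighter, and the 100-character overlap (10,000 minus 9,900) are assumptions for illustration.

```python
def first_chunk_with_match(text, terms, chunk_size=10_000, overlap=100):
    """Return (offset, chunk) for the first chunk containing any term, else None.

    Chunks advance by chunk_size - overlap, so a match that straddles a
    chunk boundary still falls entirely inside the next chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    lowered = [t.lower() for t in terms]
    start = 0
    while start < len(text):
        chunk = text[start:start + chunk_size]
        chunk_lower = chunk.lower()
        # Stand-in for "found text worth highlighting".
        if any(t in chunk_lower for t in lowered):
            return start, chunk
        if start + chunk_size >= len(text):
            break  # that was the last chunk
        start += chunk_size - overlap
    return None
```

Only the returned chunk would then be handed to the real highlighter, so most documents cost one 10kb pass.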
are normalized as follows: ALL CAP words are prepended with a_ and
Capitalized words are prepended with c_ after downcasing. Digits are all
replaced with 0.
Cheers,
Boris
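For reference, the normalization described above could be sketched like this in Python. The function name is made up, and treating a single capital letter as Capitalized rather than ALL CAPS is an assumption, since the description doesn't say which rule wins there.

```python
import re


def normalize(token: str) -> str:
    """Apply the a_/c_ case prefixes and replace every ASCII digit with 0."""
    token = re.sub(r"[0-9]", "0", token)
    # Assumption: one-letter tokens such as "A" count as Capitalized.
    if len(token) > 1 and token.isupper():
        return "a_" + token.lower()
    if token[:1].isupper():
        return "c_" + token.lower()
    return token


print([normalize(t) for t in "NASA launched Apollo in 1969".split()])
```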
On 8/30/06, Jason Pump [EMAIL PROTECTED] wrote:
Is there a large list of words and their frequency in the English
language? Obviously it would differ by corpus, but I would like to see
what's already available.
It's a string comparison. Making the 5 a 05 would be a simple workaround.
Jason
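To illustrate the padding workaround outside Lucene, here is a small Python sketch. The pad_number helper and the width of 10 are assumptions; the width just has to be the same at index time and at query time, and wide enough for the largest value.

```python
def pad_number(n: int, width: int = 10) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if n < 0:
        raise ValueError("this simple padding scheme only handles n >= 0")
    return str(n).zfill(width)


# As plain strings, "5" sorts after "10"; padded, the order is numeric again.
print("5" < "10")
print(pad_number(5, 2) < pad_number(10, 2))
```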
Peter W. wrote:
Hello,
I'm trying to do a numerical search for a property in Lucene using
RangeFilter.Less
without using both RangeQuery and test cases.
Here's the code that I expect would return one hit :