I am simply trying to count whether a phrase exists in a document or
not, so I think the "score-simple" search option should work for me.
If I do this:
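    xquery version "1.0-ml";
    import module namespace search = "http://marklogic.com/appservices/search"
        at "/MarkLogic/appservices/search/search.xqy";

    let $search :=
      search:search('"new trial"',
        <options xmlns="http://marklogic.com/appservices/search">
          <page-length>100</page-length>
          <search-option>score-simple</search-option>
        </options>
      )
    return $search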
I get a score of 8 for each doc that has at least 1 occurrence of '"new
trial"'. Note: I have "fast phrase searches" set to true.
If I do '"trial of [something]"', I get a score of 16 for each document
that has '"trial of [something]"'. And, if I do '"new trial of
[something1]"', I actually get 24...
What's going on here?
An esteemed colleague of mine explains that the "fast phrase" index only
covers bi-grams, so a matched 3-word phrase always counts as 2 matches
(i.e., a score-simple score of 16) because it is made up of 2 bi-grams.
I'd like for a phrase, no matter how long, to be counted as 1. How can
I get the counting effect that I want?
Do I always have to rely on the fact that the score-simple score for an
exact match of a phrase will be 8 * (n - 1), where n is the number of
terms in the phrase?
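To make that arithmetic concrete, here is a small sketch of the pattern I
am observing. It is only my working assumption (8 points per bi-gram), and
local:expected-phrase-score is a hypothetical helper of my own, not a
MarkLogic function. It only applies to phrases of two or more terms.

```xquery
xquery version "1.0";

(: Hypothetical helper, not part of MarkLogic: the score-simple value I
   expect for an exact phrase match, ASSUMING 8 points per bi-gram.
   Only meaningful for phrases of two or more terms. :)
declare function local:expected-phrase-score($phrase as xs:string) as xs:integer
{
  let $terms := tokenize(normalize-space($phrase), " ")
  return 8 * (count($terms) - 1)
};

(
  local:expected-phrase-score("new trial"),                          (: 2 terms, 1 bi-gram :)
  local:expected-phrase-score("something1 something2 something3")   (: 3 terms, 2 bi-grams :)
)
(: evaluates to the sequence (8, 16) under this assumption :)
```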
Is a partial match for a phrase ever returned? If my phrase is
"something1 something2 something3", I won't ever get a score of 8,
correct? I'll either get a match with a score of 16, or nothing.
So, if I have '"something1 something2 something3" OR word' my possible
results are ONLY:
24 - I matched the phrase and the word
16 - I matched the phrase
8 - I matched the word
No other possibilities, right?
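Put another way, this is the enumeration I have in mind. It is a sketch
that assumes the scores of the OR branches simply add up, with 16 for the
3-term phrase and 8 for the single word; local:expected-or-score is a
hypothetical helper, not anything from the search API.

```xquery
xquery version "1.0";

(: Hypothetical helper: the score I expect for '"a b c" OR word',
   ASSUMING the phrase contributes 16, the word contributes 8,
   and the contributions are summed. :)
declare function local:expected-or-score(
  $phrase-matched as xs:boolean,
  $word-matched as xs:boolean
) as xs:integer
{
  (if ($phrase-matched) then 16 else 0) + (if ($word-matched) then 8 else 0)
};

(: Enumerate every combination in which the document matches at all. :)
for $p in (true(), false()), $w in (true(), false())
where $p or $w
return local:expected-or-score($p, $w)
(: evaluates to the sequence (24, 16, 8) under these assumptions :)
```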
Thanks,
David Steiner
Consulting Research Scientist
Global Architecture & Research iLabs
LexisNexis Group
Toll Free: 800-227-9597 ext. 51894
Direct: 937-865-1894