karl wettin wrote:
28 apr 2007 kl. 07.52 skrev Kun Hong:
karl wettin wrote:
27 apr 2007 kl. 14.11 skrev Erik Hatcher:
On Apr 27, 2007, at 6:39 AM, karl wettin wrote:
27 apr 2007 kl. 12.36 skrev Erik Hatcher:
Unless someone has some other tricks I'm not aware of, that is.
I guess it would be possible to add start/stop-tokens such as ^
and $ to the indexed text: "^ the $" and place a phrase query with
0 slop.
True true. That'd work too.
Thanks for the replies and discussion.
I think I didn't express my problems correctly. The problem is I
want to
find documents containing only the "the" token in the title field,
but not
necessarily with only one appearance. For example, if the query is
"the",
I want to find documents whose title is "the", "the the" or "the the
the".
I'm not sure if you mean that it should treat all repetative tokens as
only one token? Then you are better of using a filter when analyzing
text you insert to the index: rather than creating one token for each
the in "the the the the the the" you only create one. You might also
want to use this filter when parsing user queries. (It will be hard to
find the band 'the the'.)
I can't just use filters because I have to cater for other titles that
are not just stop words, which
should be analyzed normally. (I know this requirement is a bit fussy).
If not and what you write above is all you want to match, nothing
more, nothing less, then you could do something like this:
(dry coded and untested.)
int n = 3; // the; the the; the the the
String field = "title";
String token = "the";
BooleanQuery bq = new BooleanQuery();
for (int i=0;i<n;i++) {
Term[] terms = new Term[i+2];
terms[0] = new Term(field, "^");
for (int j=0;j<i;j++) {
terms[j+1] = new Term(field, token);
}
terms[i+2] = new Term(field, "$");
bq.add(new BooleanClause(new PhraseQuery(terms, 0), Orrcurs.SHOULD);
}
This seems to be a solution, but with n fixed. But I think it is good
enough for me now. :)
Thanks a lot.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]