Great info Morus,
After making the "escape the dash" change to the QueryParser:
    Query query = QueryParser.parse("+category:HW\\-NCI_TOPICS AND SPACE",
                                    "description",
                                    analyzer);
    Hits hits = searcher.search(query);
    System.out.println("query.ToString = " + query.toString("description"));
    // Note: this assertion only passes with the escape left in, so the term is not really "as-is".
    assertEquals("HW-NCI_TOPICS kept as-is",
                 "+category:HW\\-NCI_TOPICS +space", query.toString("description"));
    assertEquals("doc found!", 1, hits.length());
I'm still getting this output:
domain.lucenesearch.KeywordAnalyzer:
[HW-NCI_TOPICS]
query.ToString = +category:HW\-NCI_TOPICS +space
junit.framework.AssertionFailedError: doc found! expected:<1> but was:<0>
It looks like the bug, http://issues.apache.org/bugzilla/show_bug.cgi?id=27491,
was fixed today:
------- Additional Comments From Otis Gospodnetic <mailto:[EMAIL PROTECTED]>
2004-03-24 10:10 -------
Although tft-monitor should not really result in a phrase query "tft monitor", I
agree that this is better than converting it to tft AND NOT monitor (tft -monitor).
Moreover, I have seen query syntax where '-' characters are used for phrase
queries instead or in addition to quotes, so one could use either morus-walter
or "morus walter".
I applied your change, as it doesn't look like it breaks anything, and I hope
nobody relied on ill behaviour where tft-monitor would result in AND NOT query.
-----------
But I assume this fix won't come out for some time. Is there a way I can get this fix
sooner?
I'm up against a deadline and would very much like this functionality.
Going one step further with the KeywordAnalyzer that I wrote, I changed this method
so that it skips the escape character:
    protected boolean isTokenChar(char c)
    {
        // Everything except the backslash escape is part of the token.
        return c != '\\';
    }
The test then returns with a space:
healthecare.domain.lucenesearch.KeywordAnalyzer:
[HW-NCI_TOPICS]
query.ToString = +category:"HW -NCI_TOPICS" +space
junit.framework.ComparisonFailure: HW-NCI_TOPICS kept as-is
Expected:+category:HW\-NCI_TOPICS +space
Actual :+category:"HW -NCI_TOPICS" +space <----note space where escape was.
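
For what it's worth, here is a plain-Java sketch of what seems to be happening. It does not use any Lucene classes. It assumes two things that have not been checked against the Lucene source: that the query parser hands the term to the analyzer with the backslash still in it, and that the tokenizer emits each maximal run of characters for which isTokenChar returns true as one token. The class and method names (EscapeSplitDemo, tokenize) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation only; EscapeSplitDemo is a hypothetical name, not a Lucene class.
final class EscapeSplitDemo {

    // Mirrors the isTokenChar method above: the backslash is not a token character.
    static boolean isTokenChar(char c) {
        return c != '\\';
    }

    // Assumed tokenizer behaviour: every maximal run of token characters becomes one token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Term as it appears in the query, backslash included.
        System.out.println(tokenize("HW\\-NCI_TOPICS")); // prints [HW, -NCI_TOPICS]
        // Term as it was indexed.
        System.out.println(tokenize("HW-NCI_TOPICS"));   // prints [HW-NCI_TOPICS]
    }
}
```

If those assumptions hold, the escaped term turns into two tokens, which would explain the phrase query "HW -NCI_TOPICS" in the output, and neither form matches the single indexed term HW-NCI_TOPICS.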
thanks,
chad.
-----Original Message-----
From: Morus Walter [mailto:[EMAIL PROTECTED]
Sent: Wed 3/24/2004 1:43 AM
To: Lucene Users List
Cc:
Subject: RE: Query syntax on Keyword field question
Chad Small writes:
> Here is my attempt at a KeywordAnalyzer, although it is not working. Excuse
> the length of the message, but I wanted to give actual code.
>
> With this output:
>
> Analzying "HW-NCI_TOPICS"
> org.apache.lucene.analysis.WhitespaceAnalyzer:
> [HW-NCI_TOPICS]
> org.apache.lucene.analysis.SimpleAnalyzer:
> [hw] [nci] [topics]
> org.apache.lucene.analysis.StopAnalyzer:
> [hw] [nci] [topics]
> org.apache.lucene.analysis.standard.StandardAnalyzer:
> [hw] [nci] [topics]
> healthecare.domain.lucenesearch.KeywordAnalyzer:
> [HW-NCI_TOPICS]
>
> query.ToString = category:HW -"nci topics" +space
>
> junit.framework.ComparisonFailure: HW-NCI_TOPICS kept as-is
> Expected:+category:HW-NCI_TOPICS +space
> Actual :category:HW -"nci topics" +space
>
Well, the query parser currently does not allow `-' within words.
So before your analyzer is called, the query parser reads one word HW, a `-'
operator, and one word NCI_TOPICS.
The latter is analyzed as "nci topics" because it's no longer in field category,
I guess.
I suggested changing this. See
http://issues.apache.org/bugzilla/show_bug.cgi?id=27491
Either you escape the - using category:HW\-NCI_TOPICS in your query
(untested, and I don't know where the escape character will be removed)
or you apply my suggested change.
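
A minimal sketch of the escaping option, in plain Java and without any Lucene classes. QueryTermEscaper and escapeDashes are hypothetical names, not part of Lucene. The helper only puts a backslash in front of every `-'; it does not check whether a dash is already escaped, and it says nothing about where (or whether) the query parser removes the escape again.

```java
// Hypothetical helper, not part of Lucene: escapes '-' in a term so that the
// query parser does not read it as the NOT operator.
final class QueryTermEscaper {

    private QueryTermEscaper() {
    }

    // Returns the term with a backslash inserted before every '-'.
    static String escapeDashes(String term) {
        StringBuilder out = new StringBuilder(term.length() + 4);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '-') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints category:HW\-NCI_TOPICS
        System.out.println("category:" + escapeDashes("HW-NCI_TOPICS"));
    }
}
```

The resulting string would then be used as the term part of the query text passed to the query parser.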
Another option for using keywords with query parser might be adding a
keyword syntax to the query parser.
Something like category:key("HW-NCI_TOPICS") or category="HW-NCI_TOPICS".
HTH
Morus
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]