There is one other detail that should clarify the situation. At query time,
the query parser itself breaks your query into space-delimited terms and
calls the analyzer separately for each of those terms, each of which is
treated as if it were a quoted phrase. So it doesn't matter whether it is the
standard analyzer, the word delimiter filter, or some other filter that
breaks up the compound term.
And the default "query operator" only applies to the "terms" as the query
parser parsed them, not to the sub-terms of a compound term like CD-ROM or
gb-mb.
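
As a rough illustration of that flow, here is a toy model in Python. It is
not Solr or Lucene code: the names analyze and parse and the SYNONYMS table
are made up for this sketch, and the way sub-terms are combined is simply
copied from the parsedquery strings quoted further down in this thread. It
also assumes the second query in that report was effectively two
space-separated terms (gb mb), which is what its parsed output suggests.

```python
# Toy model only. Nothing here is a real Solr/Lucene API.
# Synonym groups taken from the parsedquery output quoted in this thread.
SYNONYMS = {
    "gb": ["gb", "gib", "gigabyte", "gigabytes"],
    "mb": ["mb", "mib", "megabyte", "megabytes"],
}


def analyze(chunk):
    """Hypothetical analyzer: split on hyphens, then expand synonyms.

    Returns a list of positions; each position is a list of terms.
    """
    positions = []
    for piece in chunk.lower().split("-"):
        if piece:
            positions.append(SYNONYMS.get(piece, [piece]))
    return positions


def parse(query, field, default_op="AND"):
    """Split on whitespace first, then analyze each chunk on its own."""
    prefix = "+" if default_op == "AND" else ""
    clauses = []
    for chunk in query.split():
        positions = analyze(chunk)
        terms = [term for position in positions for term in position]
        if not terms:
            continue
        if len(positions) > 1:
            # Compound term (e.g. gb-mb): every sub-term and synonym ends up
            # as its own clause, mirroring the first parsedquery in the report.
            clauses.extend(f"{prefix}{field}:{term}" for term in terms)
        elif len(terms) > 1:
            # Single position with synonyms: one grouped clause of alternatives.
            group = " ".join(f"{field}:{term}" for term in terms)
            clauses.append(f"{prefix}({group})")
        else:
            clauses.append(f"{prefix}{field}:{terms[0]}")
    return " ".join(clauses)


print(parse("gb-mb", "name"))
print(parse("gb mb", "name"))
```

The point of the sketch is only the ordering: the whitespace split happens
before any analyzer runs, so the analyzer never sees "gb mb" as a whole, and
a hyphenated term arrives as a single chunk that is expanded afterwards.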
-- Jack Krupansky
-----Original Message-----
From: Alireza Salimi
Sent: Wednesday, July 04, 2012 12:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms and hyphens
Wow, I didn't know that. Is there a way to disable this feature? I mean, is
it something coming from the Analyzer?
On Wed, Jul 4, 2012 at 12:26 PM, Jack Krupansky
<j...@basetechnology.com> wrote:
Terms with embedded special characters are treated as phrases with spaces
in place of the special characters. So, "gb-mb" is treated as if you had
enclosed the term in quotes.
-- Jack Krupansky
-----Original Message-----
From: Alireza Salimi
Sent: Wednesday, July 04, 2012 6:50 AM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms and hyphens
Hi,
Does anybody know why the hyphen '-' combined with q.op=AND causes such a big
difference between the two queries? I thought hyphens were removed by
StandardTokenizer, which means that theoretically the two queries should be
the same!
Thanks
On Tue, Jul 3, 2012 at 4:05 PM, Alireza Salimi <alireza.sal...@gmail.com>
wrote:
Hi,
I'm not sure if anybody has experienced this behavior before or not.
I noticed that 'hyphen' plays a very important role here.
I used Solr's default example directory.
http://localhost:8983/solr/select/?q=name:(gb-mb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
results in "parsedquery":"+name:gb +name:gib +name:gigabyte
+name:gigabytes +name:mb +name:mib +name:megabyte +name:megabytes",
While searching
http://localhost:8984/solr/select/?q=name:(gbmb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
results in "parsedquery":"+(name:gb name:gib name:gigabyte
name:gigabytes) +(name:mb name:mib name:megabyte name:megabytes)",
If you look at the first query (the one with the hyphen), you can see that
the parsed result is totally different. I know that hyphens are special
characters in Solr, but there's no way the first query can return any
entries, because it requires ALL of the synonyms to match.
Am I missing something here?
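
For anyone who wants to reproduce the comparison, here is a small sketch in
Python that builds the same kind of debug URL and pulls out the parsedquery
string. The function names debug_url and parsed_query are made up for this
example, the base URL assumes the default example server on port 8983, and
the lookup assumes the JSON debug output exposes parsedquery under a
top-level "debug" key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def debug_url(q, q_op="AND", base="http://localhost:8983/solr/select/"):
    """Build a select URL with debugQuery enabled (base URL is an assumption)."""
    params = {
        "q": q,
        "q.op": q_op,
        "start": 0,
        "rows": 10,
        "indent": "on",
        "debugQuery": "on",
        "wt": "json",
    }
    return base + "?" + urlencode(params)


def parsed_query(url):
    """Fetch the response and return the parsedquery debug string.

    Assumes the JSON response has the shape {"debug": {"parsedquery": ...}}.
    """
    with urlopen(url) as response:
        body = json.load(response)
    return body["debug"]["parsedquery"]


# Example usage against a running Solr instance:
#   print(parsed_query(debug_url("name:(gb-mb)")))
#   print(parsed_query(debug_url("name:(gb mb)")))
```

Using urlencode keeps the parentheses, colon and any spaces in the query
properly escaped, which avoids the URL being mangled when it is pasted into
a mail client.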
Thanks
--
Alireza Salimi
Java EE Developer