Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-08 Thread Øystein F. Steimler
On Monday 01 June 2009 16:50, Sam Michaels wrote:
> So the fix for this problem would be
>
> 1. Stop using WordDelimiterFilter for queries (what is the alternative) OR
> 2. Not allow any search strings without any alphanumeric characters..

We ran into the same problem when a PatternReplaceFilter replaced every 
character in a token. I've been working around the bug by using a 
LengthFilter to drop tokens of zero length.

.øs
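
For readers who want to see the workaround in miniature, here is a small Python sketch. It does not use Solr; pattern_replace and length_filter are hypothetical stand-ins for the two filters named above, and the pattern and sample tokens are assumptions made up for illustration.

```python
import re

def pattern_replace(tokens, pattern, replacement=""):
    """Rewrite each token with a regex; empty results are kept, as in the bug."""
    return [re.sub(pattern, replacement, tok) for tok in tokens]

def length_filter(tokens, min_len=1, max_len=255):
    """Keep only tokens whose length lies within [min_len, max_len]."""
    return [tok for tok in tokens if min_len <= len(tok) <= max_len]

tokens = ["bathing", "@", "#42"]
# Strip everything that is not a letter or digit; "@" becomes an empty token.
replaced = pattern_replace(tokens, r"[^A-Za-z0-9]")
# The length filter then removes the zero-length token before it reaches the parser.
cleaned = length_filter(replaced)
print(replaced)
print(cleaned)
```

The point of the second step is only that a zero-length token never survives to the query parser; the bounds 1 and 255 are arbitrary choices for this sketch.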

> Yonik Seeley-2 wrote:
> > OK, here's the deal:
> >
> > -features:foo features:(\...@#$%\^&\*\(\))
> > -features:foo features:(\...@#$%\^&\*\(\))
> > -features:foo
> > -features:foo
> >
> > The text analysis is throwing away non alphanumeric chars (probably
> > the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
> > away term queries when the token is zero length (after analysis).
> > Solr then interprets the left over "-features:foo" as "all documents
> > not containing foo in the features field", so you get a bunch of
> > matches.
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> > On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
> >> Walter,
> >>
> >> The analysis link does not produce any matches for either @ or
> >> !...@#$%^&*() strings when I try to match against bathing. I'm worried that
> >> this might be
> >> the symptom of another problem (which has not revealed itself yet) and
> >> want
> >> to get to the bottom of this...
> >>
> >> Thank you.
> >> sm
> >>
> >> Walter Underwood wrote:
> >>> Use the [analysis] link on the Solr admin UI to get more info on
> >>> how this is being interpreted.
> >>>
> >>> However, I am curious about why this is important. Do users enter
> >>> this query often? If not, maybe it is not something to spend time on.
> >>>
> >>> wunder
> >>>
> >>> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>  Here is the output from the debug query when I'm trying to match the
>  String @
>  against Bathing (should not match)
> 
>  
>  3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>    0.9994 = queryWeight(activity_type:NAME), product of:
>      3.2689075 = idf(docFreq=153, numDocs=1489)
>      0.30591258 = queryNorm
>    3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product
>  of: 1.0 = tf(termFreq(activity_type:NAME)=1)
>      3.2689075 = idf(docFreq=153, numDocs=1489)
>      1.0 = fieldNorm(field=activity_type, doc=0)
>  
> 
>  Looks like the AND clause in the search string is ignored...
> 
>  SM.
> 
>  ryantxu wrote:
> > two key things to try (for anyone ever wondering why a query matches
> > documents)
> >
> > 1.  add &debugQuery=true and look at the explain text below --
> > anything that contributed to the score is listed there
> > 2.  check /admin/analysis.jsp -- this will let you see how analyzers
> > break text up into tokens.
> >
> > Not sure off hand, but I'm guessing the WordDelimiterFilterFactory
> > has something to do with it...
> >
> >
> > On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
> >
> > wrote:
> >> Hi,
> >>
> >> I'm running Solr 1.3/Java 1.6.
> >>
> >> When I run a query like  - (activity_type:NAME) AND
> >> title:(\...@#$%\^&\*\(\))
> >> all the documents are returned even though there is not a single
> >> match.
> >> There is no title that matches the string (which has been escaped).
> >>
> >> My document structure is as follows
> >>
> >> 
> >> NAME
> >> Bathing
> >> 
> >> 
> >>
> >>
> >> The title field is of type text_title which is described below.
> >>
> >>  >> positionIncrementGap="100">
> >>      
> >>        
> >>        
> >>         >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> >>        
> >>        
> >>      
> >>      
> >>        
> >>         >> synonyms="synonyms.txt"
> >> ignoreCase="true" expand="true"/>
> >>         >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> >>        
> >>        
> >>
> >>      
> >>    
> >>
> >> When I run the query against Luke, no results are returned. Any
> >> suggestions
> >> are appreciated.
> >>
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.

Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

Yonik,

Done, here is the link.
https://issues.apache.org/jira/browse/SOLR-1196

SM.


Yonik Seeley-2 wrote:
> 
> On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels  wrote:
>>
>> So the fix for this problem would be
>>
>> 1. Stop using WordDelimiterFilter for queries (what is the alternative)
>> OR
>> 2. Not allow any search strings without any alphanumeric characters..
> 
> Short term workaround for you, yes.
> I would classify this surprising behavior as a bug we should
> eventually fix though.  Could you open a JIRA issue for it?
> 
> -Yonik
> http://www.lucidimagination.com
> 
>> SM.
>>
>>
>> Yonik Seeley-2 wrote:
>>>
>>> OK, here's the deal:
>>>
>>> -features:foo
>>> features:(\...@#$%\^&\*\(\))
>>> -features:foo features:(\...@#$%\^&\*\(\))
>>> -features:foo
>>> -features:foo
>>>
>>> The text analysis is throwing away non alphanumeric chars (probably
>>> the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
>>> away term queries when the token is zero length (after analysis).
>>> Solr then interprets the left over "-features:foo" as "all documents
>>> not containing foo in the features field", so you get a bunch of
>>> matches.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>>
>>> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:

 Walter,

 The analysis link does not produce any matches for either @ or
 !...@#$%^&*()
 strings when I try to match against bathing. I'm worried that this
 might
 be
 the symptom of another problem (which has not revealed itself yet) and
 want
 to get to the bottom of this...

 Thank you.
 sm


 Walter Underwood wrote:
>
> Use the [analysis] link on the Solr admin UI to get more info on
> how this is being interpreted.
>
> However, I am curious about why this is important. Do users enter
> this query often? If not, maybe it is not something to spend time on.
>
> wunder
>
> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>
>>
>> Here is the output from the debug query when I'm trying to match the
>> String @
>> against Bathing (should not match)
>>
>> 
>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>   0.9994 = queryWeight(activity_type:NAME), product of:
>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>     0.30591258 = queryNorm
>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product
>> of:
>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>     1.0 = fieldNorm(field=activity_type, doc=0)
>> 
>>
>> Looks like the AND clause in the search string is ignored...
>>
>> SM.
>>
>>
>> ryantxu wrote:
>>>
>>> two key things to try (for anyone ever wondering why a query matches
>>> documents)
>>>
>>> 1.  add &debugQuery=true and look at the explain text below --
>>> anything that contributed to the score is listed there
>>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>>> break text up into tokens.
>>>
>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory
>>> has
>>> something to do with it...
>>>
>>>
>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
>>> wrote:

 Hi,

 I'm running Solr 1.3/Java 1.6.

 When I run a query like  - (activity_type:NAME) AND
 title:(\...@#$%\^&\*\(\))
 all the documents are returned even though there is not a single
 match.
 There is no title that matches the string (which has been escaped).

 My document structure is as follows

 
 NAME
 Bathing
 
 


 The title field is of type text_title which is described below.

 >>> positionIncrementGap="100">
      
        
        
        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
 catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        
        
      
      
        
        >>> synonyms="synonyms.txt"
 ignoreCase="true" expand="true"/>
        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
 catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        
        

      
    

 When I run the query against Luke, no results are returned. Any
 suggestions
 are appreciated.


 --
 View this message in context:
 http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
 Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Yonik Seeley
On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels  wrote:
>
> So the fix for this problem would be
>
> 1. Stop using WordDelimiterFilter for queries (what is the alternative) OR
> 2. Not allow any search strings without any alphanumeric characters..

Short-term workaround for you, yes.
I would classify this surprising behavior as a bug that we should
eventually fix, though.  Could you open a JIRA issue for it?

-Yonik
http://www.lucidimagination.com

> SM.
>
>
> Yonik Seeley-2 wrote:
>>
>> OK, here's the deal:
>>
>> -features:foo features:(\...@#$%\^&\*\(\))
>> -features:foo features:(\...@#$%\^&\*\(\))
>> -features:foo
>> -features:foo
>>
>> The text analysis is throwing away non alphanumeric chars (probably
>> the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
>> away term queries when the token is zero length (after analysis).
>> Solr then interprets the left over "-features:foo" as "all documents
>> not containing foo in the features field", so you get a bunch of
>> matches.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
>>>
>>> Walter,
>>>
>>> The analysis link does not produce any matches for either @ or !...@#$%^&*()
>>> strings when I try to match against bathing. I'm worried that this might
>>> be
>>> the symptom of another problem (which has not revealed itself yet) and
>>> want
>>> to get to the bottom of this...
>>>
>>> Thank you.
>>> sm
>>>
>>>
>>> Walter Underwood wrote:

 Use the [analysis] link on the Solr admin UI to get more info on
 how this is being interpreted.

 However, I am curious about why this is important. Do users enter
 this query often? If not, maybe it is not something to spend time on.

 wunder

 On 5/31/09 2:56 PM, "Sam Michaels"  wrote:

>
> Here is the output from the debug query when I'm trying to match the
> String @
> against Bathing (should not match)
>
> 
> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>   0.9994 = queryWeight(activity_type:NAME), product of:
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     0.30591258 = queryNorm
>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>     1.0 = tf(termFreq(activity_type:NAME)=1)
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     1.0 = fieldNorm(field=activity_type, doc=0)
> 
>
> Looks like the AND clause in the search string is ignored...
>
> SM.
>
>
> ryantxu wrote:
>>
>> two key things to try (for anyone ever wondering why a query matches
>> documents)
>>
>> 1.  add &debugQuery=true and look at the explain text below --
>> anything that contributed to the score is listed there
>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>> break text up into tokens.
>>
>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>> something to do with it...
>>
>>
>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
>> wrote:
>>>
>>> Hi,
>>>
>>> I'm running Solr 1.3/Java 1.6.
>>>
>>> When I run a query like  - (activity_type:NAME) AND
>>> title:(\...@#$%\^&\*\(\))
>>> all the documents are returned even though there is not a single
>>> match.
>>> There is no title that matches the string (which has been escaped).
>>>
>>> My document structure is as follows
>>>
>>> 
>>> NAME
>>> Bathing
>>> 
>>> 
>>>
>>>
>>> The title field is of type text_title which is described below.
>>>
>>> >> positionIncrementGap="100">
>>>      
>>>        
>>>        
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>>      
>>>      
>>>        
>>>        >> synonyms="synonyms.txt"
>>> ignoreCase="true" expand="true"/>
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>>
>>>      
>>>    
>>>
>>> When I run the query against Luke, no results are returned. Any
>>> suggestions
>>> are appreciated.
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>



>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>

Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

So the fix for this problem would be one of the following:

1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
2. Disallow search strings that contain no alphanumeric characters.

SM.
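
The second option can be sketched on the client side in a few lines of Python. This is only an illustration under assumptions: has_searchable_text is a hypothetical helper name, and it runs before the query is sent to Solr rather than inside Solr itself.

```python
def has_searchable_text(user_input: str) -> bool:
    """Return True if the input contains at least one letter or digit.

    str.isalnum is Unicode-aware, so accented letters and non-Latin
    scripts count as searchable text too.
    """
    return any(ch.isalnum() for ch in user_input)

for candidate in ["Bathing", "@", "C++"]:
    # Only inputs that pass the check would be forwarded to the search server.
    print(candidate, has_searchable_text(candidate))
```

A check like this only hides the symptom for symbol-only input; the analyzer can still empty out a single clause inside a longer query.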


Yonik Seeley-2 wrote:
> 
> OK, here's the deal:
> 
> -features:foo features:(\...@#$%\^&\*\(\))
> -features:foo features:(\...@#$%\^&\*\(\))
> -features:foo
> -features:foo
> 
> The text analysis is throwing away non alphanumeric chars (probably
> the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
> away term queries when the token is zero length (after analysis).
> Solr then interprets the left over "-features:foo" as "all documents
> not containing foo in the features field", so you get a bunch of
> matches.
> 
> -Yonik
> http://www.lucidimagination.com
> 
> 
> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
>>
>> Walter,
>>
>> The analysis link does not produce any matches for either @ or !...@#$%^&*()
>> strings when I try to match against bathing. I'm worried that this might
>> be
>> the symptom of another problem (which has not revealed itself yet) and
>> want
>> to get to the bottom of this...
>>
>> Thank you.
>> sm
>>
>>
>> Walter Underwood wrote:
>>>
>>> Use the [analysis] link on the Solr admin UI to get more info on
>>> how this is being interpreted.
>>>
>>> However, I am curious about why this is important. Do users enter
>>> this query often? If not, maybe it is not something to spend time on.
>>>
>>> wunder
>>>
>>> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>>>

 Here is the output from the debug query when I'm trying to match the
 String @
 against Bathing (should not match)

 
 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
   0.9994 = queryWeight(activity_type:NAME), product of:
     3.2689075 = idf(docFreq=153, numDocs=1489)
     0.30591258 = queryNorm
   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
     1.0 = tf(termFreq(activity_type:NAME)=1)
     3.2689075 = idf(docFreq=153, numDocs=1489)
     1.0 = fieldNorm(field=activity_type, doc=0)
 

 Looks like the AND clause in the search string is ignored...

 SM.


 ryantxu wrote:
>
> two key things to try (for anyone ever wondering why a query matches
> documents)
>
> 1.  add &debugQuery=true and look at the explain text below --
> anything that contributed to the score is listed there
> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
> break text up into tokens.
>
> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
> something to do with it...
>
>
> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
> wrote:
>>
>> Hi,
>>
>> I'm running Solr 1.3/Java 1.6.
>>
>> When I run a query like  - (activity_type:NAME) AND
>> title:(\...@#$%\^&\*\(\))
>> all the documents are returned even though there is not a single
>> match.
>> There is no title that matches the string (which has been escaped).
>>
>> My document structure is as follows
>>
>> 
>> NAME
>> Bathing
>> 
>> 
>>
>>
>> The title field is of type text_title which is described below.
>>
>> > positionIncrementGap="100">
>>      
>>        
>>        
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>      
>>      
>>        
>>        > synonyms="synonyms.txt"
>> ignoreCase="true" expand="true"/>
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>
>>      
>>    
>>
>> When I run the query against Luke, no results are returned. Any
>> suggestions
>> are appreciated.
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23816242.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Yonik Seeley
OK, here's the deal:

-features:foo features:(\...@#$%\^&\*\(\))
-features:foo features:(\...@#$%\^&\*\(\))
-features:foo
-features:foo

The text analysis is throwing away non-alphanumeric chars (probably
the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
away term queries when the token is zero length (after analysis).
Solr then interprets the leftover "-features:foo" as "all documents
not containing foo in the features field", so you get a bunch of
matches.

-Yonik
http://www.lucidimagination.com
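
The chain of events described above can be imitated with a toy model in Python. Nothing here is Solr or Lucene code: analyze, parse and search are hypothetical helpers, the analyzer is a crude stand-in that splits on non-alphanumerics, and the three documents are invented for the example.

```python
import re

def analyze(text):
    """Toy analyzer: split on runs of non-alphanumerics and lowercase the rest."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def parse(clauses):
    """clauses: (occur, field, text) tuples, occur is '-' (prohibited) or ''.

    Clauses whose text analyzes to no tokens are silently dropped.
    """
    kept = []
    for occur, field, text in clauses:
        tokens = analyze(text)
        if tokens:
            kept.append((occur, field, tokens))
    return kept

def search(docs, parsed):
    """Return docs matching the parsed clauses, pure-negative queries included."""
    if not parsed:
        return []
    prohibited = [(f, t) for occ, f, t in parsed if occ == "-"]
    positive = [(f, t) for occ, f, t in parsed if occ != "-"]

    def has(doc, field, tokens):
        doc_tokens = analyze(doc.get(field, ""))
        return any(tok in doc_tokens for tok in tokens)

    hits = []
    for doc in docs:
        if any(has(doc, f, t) for f, t in prohibited):
            continue
        # With no positive clause left, the query means
        # "everything except the prohibited terms".
        if positive and not any(has(doc, f, t) for f, t in positive):
            continue
        hits.append(doc)
    return hits

docs = [
    {"id": 1, "features": "foo bar"},
    {"id": 2, "features": "baz"},
    {"id": 3, "features": "qux"},
]
query = [("-", "features", "foo"), ("", "features", "@#$%^&*()")]
parsed = parse(query)
print(parsed)
print([d["id"] for d in search(docs, parsed)])
```

In the model the symbol-only clause disappears during parsing, so only the prohibited clause is left and every document without "foo" comes back, which mirrors the behaviour reported in this thread.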


On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
>
> Walter,
>
> The analysis link does not produce any matches for either @ or !...@#$%^&*()
> strings when I try to match against bathing. I'm worried that this might be
> the symptom of another problem (which has not revealed itself yet) and want
> to get to the bottom of this...
>
> Thank you.
> sm
>
>
> Walter Underwood wrote:
>>
>> Use the [analysis] link on the Solr admin UI to get more info on
>> how this is being interpreted.
>>
>> However, I am curious about why this is important. Do users enter
>> this query often? If not, maybe it is not something to spend time on.
>>
>> wunder
>>
>> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>>
>>>
>>> Here is the output from the debug query when I'm trying to match the
>>> String @
>>> against Bathing (should not match)
>>>
>>> 
>>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>>   0.9994 = queryWeight(activity_type:NAME), product of:
>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>     0.30591258 = queryNorm
>>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>     1.0 = fieldNorm(field=activity_type, doc=0)
>>> 
>>>
>>> Looks like the AND clause in the search string is ignored...
>>>
>>> SM.
>>>
>>>
>>> ryantxu wrote:

 two key things to try (for anyone ever wondering why a query matches
 documents)

 1.  add &debugQuery=true and look at the explain text below --
 anything that contributed to the score is listed there
 2.  check /admin/analysis.jsp -- this will let you see how analyzers
 break text up into tokens.

 Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
 something to do with it...


 On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>
> Hi,
>
> I'm running Solr 1.3/Java 1.6.
>
> When I run a query like  - (activity_type:NAME) AND
> title:(\...@#$%\^&\*\(\))
> all the documents are returned even though there is not a single match.
> There is no title that matches the string (which has been escaped).
>
> My document structure is as follows
>
> 
> NAME
> Bathing
> 
> 
>
>
> The title field is of type text_title which is described below.
>
>  positionIncrementGap="100">
>      
>        
>        
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>      
>      
>        
>         synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>
>      
>    
>
> When I run the query against Luke, no results are returned. Any
> suggestions
> are appreciated.
>
>
> --
> View this message in context:
> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

Walter,

The analysis link does not produce any matches for either @ or !...@#$%^&*()
strings when I try to match against bathing. I'm worried that this might be
the symptom of another problem (which has not revealed itself yet) and want
to get to the bottom of this...

Thank you.
sm


Walter Underwood wrote:
> 
> Use the [analysis] link on the Solr admin UI to get more info on
> how this is being interpreted.
> 
> However, I am curious about why this is important. Do users enter
> this query often? If not, maybe it is not something to spend time on.
> 
> wunder
> 
> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
> 
>> 
>> Here is the output from the debug query when I'm trying to match the
>> String @
>> against Bathing (should not match)
>> 
>> 
>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>   0.9994 = queryWeight(activity_type:NAME), product of:
>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>     0.30591258 = queryNorm
>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>     1.0 = fieldNorm(field=activity_type, doc=0)
>> 
>> 
>> Looks like the AND clause in the search string is ignored...
>> 
>> SM.
>> 
>> 
>> ryantxu wrote:
>>> 
>>> two key things to try (for anyone ever wondering why a query matches
>>> documents)
>>> 
>>> 1.  add &debugQuery=true and look at the explain text below --
>>> anything that contributed to the score is listed there
>>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>>> break text up into tokens.
>>> 
>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>>> something to do with it...
>>> 
>>> 
>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
 
 Hi,
 
 I'm running Solr 1.3/Java 1.6.
 
 When I run a query like  - (activity_type:NAME) AND
 title:(\...@#$%\^&\*\(\))
 all the documents are returned even though there is not a single match.
 There is no title that matches the string (which has been escaped).
 
 My document structure is as follows
 
 
 NAME
 Bathing
 
 
 
 
 The title field is of type text_title which is described below.
 
 >>> positionIncrementGap="100">
      
        
        
        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
 catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        
        
      
      
        
        >>> synonyms="synonyms.txt"
 ignoreCase="true" expand="true"/>
        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
 catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        
        
 
      
    
 
 When I run the query against Luke, no results are returned. Any
 suggestions
 are appreciated.
 
 
 --
 View this message in context:
 http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
>>> 
>>> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Walter Underwood
Use the [analysis] link on the Solr admin UI to get more info on
how this is being interpreted.

However, I am curious about why this is important. Do users enter
this query often? If not, maybe it is not something to spend time on.

wunder

On 5/31/09 2:56 PM, "Sam Michaels"  wrote:

> 
> Here is the output from the debug query when I'm trying to match the String @
> against Bathing (should not match)
> 
> 
> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>   0.9994 = queryWeight(activity_type:NAME), product of:
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     0.30591258 = queryNorm
>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>     1.0 = tf(termFreq(activity_type:NAME)=1)
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     1.0 = fieldNorm(field=activity_type, doc=0)
> 
> 
> Looks like the AND clause in the search string is ignored...
> 
> SM.
> 
> 
> ryantxu wrote:
>> 
>> two key things to try (for anyone ever wondering why a query matches
>> documents)
>> 
>> 1.  add &debugQuery=true and look at the explain text below --
>> anything that contributed to the score is listed there
>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>> break text up into tokens.
>> 
>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>> something to do with it...
>> 
>> 
>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>> 
>>> Hi,
>>> 
>>> I'm running Solr 1.3/Java 1.6.
>>> 
>>> When I run a query like  - (activity_type:NAME) AND
>>> title:(\...@#$%\^&\*\(\))
>>> all the documents are returned even though there is not a single match.
>>> There is no title that matches the string (which has been escaped).
>>> 
>>> My document structure is as follows
>>> 
>>> 
>>> NAME
>>> Bathing
>>> 
>>> 
>>> 
>>> 
>>> The title field is of type text_title which is described below.
>>> 
>>> >> positionIncrementGap="100">
>>>      
>>>        
>>>        
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>>      
>>>      
>>>        
>>>        >> ignoreCase="true" expand="true"/>
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>> 
>>>      
>>>    
>>> 
>>> When I run the query against Luke, no results are returned. Any
>>> suggestions
>>> are appreciated.
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>>> 
>> 
>> 



Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

Here is the output from the debug query when I'm trying to match the String @
against Bathing (should not match)


3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
  0.9994 = queryWeight(activity_type:NAME), product of:
    3.2689075 = idf(docFreq=153, numDocs=1489)
    0.30591258 = queryNorm
  3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
    1.0 = tf(termFreq(activity_type:NAME)=1)
    3.2689075 = idf(docFreq=153, numDocs=1489)
    1.0 = fieldNorm(field=activity_type, doc=0)


Looks like the AND clause in the search string is ignored...

SM.
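
As a rough cross-check of the explain output, the numbers can be recomputed in Python. The idf formula used here, 1 + ln(numDocs / (docFreq + 1)), is an assumption about the similarity in use; it does reproduce the idf printed above. The queryWeight line in the paste reads 0.9994, which is not the product of the two factors listed beneath it, so some digits may have been lost in the archive.

```python
import math

num_docs, doc_freq = 1489, 153

# Assumed formula (classic Lucene-style idf); it is not stated in the thread.
idf = 1.0 + math.log(num_docs / (doc_freq + 1))

query_norm = 0.30591258   # copied from the explain output
tf = 1.0                  # termFreq = 1
field_norm = 1.0          # copied from the explain output

query_weight = idf * query_norm        # "queryWeight ... product of"
field_weight = tf * idf * field_norm   # "fieldWeight ... product of"
score = query_weight * field_weight    # top-level "weight ... product of"

print(idf, query_weight, field_weight, score)
```

Every factor in the score comes from activity_type:NAME. No title term appears anywhere in the explanation, which is consistent with the title clause contributing nothing to the match.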


ryantxu wrote:
> 
> two key things to try (for anyone ever wondering why a query matches
> documents)
> 
> 1.  add &debugQuery=true and look at the explain text below --
> anything that contributed to the score is listed there
> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
> break text up into tokens.
> 
> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
> something to do with it...
> 
> 
> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>
>> Hi,
>>
>> I'm running Solr 1.3/Java 1.6.
>>
>> When I run a query like  - (activity_type:NAME) AND
>> title:(\...@#$%\^&\*\(\))
>> all the documents are returned even though there is not a single match.
>> There is no title that matches the string (which has been escaped).
>>
>> My document structure is as follows
>>
>> 
>> NAME
>> Bathing
>> 
>> 
>>
>>
>> The title field is of type text_title which is described below.
>>
>> > positionIncrementGap="100">
>>      
>>        
>>        
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>      
>>      
>>        
>>        > ignoreCase="true" expand="true"/>
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>
>>      
>>    
>>
>> When I run the query against Luke, no results are returned. Any
>> suggestions
>> are appreciated.
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23807341.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

Upon some further experimentation, I found that even @ on its own matches
all the documents. However, when I append the wildcard * to @ (@*), there is
no match...

SM
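
One possible reading of the @ versus @* difference, sketched in Python. It rests on an assumption that is not confirmed anywhere in this thread: that wildcard terms skip text analysis and are searched verbatim, while plain terms are analyzed first. analyze and clause_terms are hypothetical helpers, not Solr functions.

```python
import re

def analyze(text):
    """Toy analyzer: split on runs of non-alphanumerics and lowercase the rest."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def clause_terms(text):
    """Return the terms a clause would search for, or None if it is dropped."""
    if text.endswith("*"):
        # Assumption: wildcard terms bypass analysis and are kept as typed.
        return [text]
    tokens = analyze(text)
    return tokens or None

for raw in ["@", "@*", "Bathing"]:
    print(repr(raw), clause_terms(raw))
```

Under that assumption the plain @ clause vanishes and leaves the rest of the query to match on its own, while @* stays in the query as a real clause that no indexed term satisfies, so nothing matches.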


Sam Michaels wrote:
> 
> Hi,
> 
> I'm running Solr 1.3/Java 1.6.  
> 
> When I run a query like  - (activity_type:NAME) AND
> title:(\...@#$%\^&\*\(\)) all the documents are returned even though there
> is not a single match. There is no title that matches the string (which
> has been escaped). 
> 
> My document structure is as follows
> 
> 
> NAME
> Bathing
> 
> 
> 
> 
> The title field is of type text_title which is described below. 
> 
>  positionIncrementGap="100">
>   
> 
> 
>  generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> 
> 
>   
>   
> 
>  ignoreCase="true" expand="true"/>
>  generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> 
> 
> 
>   
> 
> 
> When I run the query against Luke, no results are returned. Any
> suggestions are appreciated.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23804381.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

In terms of relevance, no results should be returned. Instead, all the
documents are returned, in alphabetical order.


Walter Underwood wrote:
> 
> I'm really curious. What is the most relevant result for that query?
> 
> wunder
> 
> On 5/30/09 7:35 PM, "Ryan McKinley"  wrote:
> 
>> two key things to try (for anyone ever wondering why a query matches
>> documents)
>> 
>> 1.  add &debugQuery=true and look at the explain text below --
>> anything that contributed to the score is listed there
>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>> break text up into tokens.
>> 
>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>> something to do with it...
>> 
>> 
>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>> 
>>> Hi,
>>> 
>>> I'm running Solr 1.3/Java 1.6.
>>> 
>>> When I run a query like  - (activity_type:NAME) AND
>>> title:(\...@#$%\^&\*\(\))
>>> all the documents are returned even though there is not a single match.
>>> There is no title that matches the string (which has been escaped).
>>> 
>>> My document structure is as follows
>>> 
>>> 
>>> NAME
>>> Bathing
>>> 
>>> 
>>> 
>>> 
>>> The title field is of type text_title which is described below.
>>> 
>>> >> positionIncrementGap="100">
>>>      
>>>        
>>>        
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>>      
>>>      
>>>        
>>>        >> ignoreCase="true" expand="true"/>
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>> 
>>>      
>>>    
>>> 
>>> When I run the query against Luke, no results are returned. Any
>>> suggestions
>>> are appreciated.
>>> 
>>> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23804060.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Walter Underwood
I'm really curious. What is the most relevant result for that query?

wunder

On 5/30/09 7:35 PM, "Ryan McKinley"  wrote:

> two key things to try (for anyone ever wondering why a query matches
> documents)
> 
> 1.  add &debugQuery=true and look at the explain text below --
> anything that contributed to the score is listed there
> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
> break text up into tokens.
> 
> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
> something to do with it...
> 
> 
> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>> 
>> Hi,
>> 
>> I'm running Solr 1.3/Java 1.6.
>> 
>> When I run a query like  - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
>> all the documents are returned even though there is not a single match.
>> There is no title that matches the string (which has been escaped).
>> 
>> My document structure is as follows
>> 
>> 
>> NAME
>> Bathing
>> 
>> 
>> 
>> 
>> The title field is of type text_title which is described below.
>> 
>> > positionIncrementGap="100">
>>      
>>        
>>        
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>      
>>      
>>        
>>        > ignoreCase="true" expand="true"/>
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>> 
>>      
>>    
>> 
>> When I run the query against Luke, no results are returned. Any suggestions
>> are appreciated.
>> 
>> 



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Ryan McKinley
Two key things to try (for anyone wondering why a query matches documents):

1.  Add &debugQuery=true and look at the explain text in the response --
anything that contributed to the score is listed there.
2.  Check /admin/analysis.jsp -- this shows how the analyzers
break text up into tokens.

Not sure offhand, but I'm guessing the WordDelimiterFilterFactory has
something to do with it...
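
The first suggestion can be sketched as a small script that builds the request URL. This is a minimal illustration, not something posted in the thread: the host and port (localhost:8983), the /select path and the helper name build_debug_url are assumptions, and the query is a simplified form of the one under discussion.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

def build_debug_url(base_url, query):
    # Hypothetical helper: append debugQuery=true to a Solr select request
    # so the response includes the parsed query and the explain output.
    params = {"q": query, "debugQuery": "true"}
    return base_url + "/select?" + urlencode(params)

# Assumed local Solr instance; adjust to the real host and core.
url = build_debug_url(
    "http://localhost:8983/solr",
    r"-(activity_type:NAME) AND title:(\@)",
)
print(url)
```

Opening that URL in a browser should show the debug section of the response, where the parsed form of the query reveals whether the title clause survived analysis.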


On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>
> Hi,
>
> I'm running Solr 1.3/Java 1.6.
>
> When I run a query like  - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
> all the documents are returned even though there is not a single match.
> There is no title that matches the string (which has been escaped).
>
> My document structure is as follows
>
> 
> NAME
> Bathing
> 
> 
>
>
> The title field is of type text_title which is described below.
>
>  positionIncrementGap="100">
>      
>        
>        
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>      
>      
>        
>         ignoreCase="true" expand="true"/>
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>
>      
>    
>
> When I run the query against Luke, no results are returned. Any suggestions
> are appreciated.
>
>


When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Sam Michaels

Hi,

I'm running Solr 1.3/Java 1.6.  

When I run a query like  - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
all the documents are returned even though there is not a single match.
There is no title that matches the string (which has been escaped). 

My document structure is as follows


NAME
Bathing




The title field is of type text_title which is described below. 

[fieldType definition for text_title; the XML markup was stripped by the archive]


When I run the query against Luke, no results are returned. Any suggestions
are appreciated.


-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
Sent from the Solr - User mailing list archive at Nabble.com.