Yonik,

Done, here is the link.
https://issues.apache.org/jira/browse/SOLR-1196
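
For the short-term workaround (option 2 in the quoted list below), a minimal sketch of a client-side guard might look like the following. This is only an assumption about how an application could pre-check user input before sending it to Solr: `has_searchable_text` is a hypothetical helper, not part of Solr, and `str.isalnum()` only approximates what the analyzer chain actually keeps.

```python
def has_searchable_text(user_input: str) -> bool:
    """Return True if the input contains at least one letter or digit.

    Hypothetical pre-check: input made up only of punctuation would be
    reduced to nothing by the analyzer, so the application can reject it
    instead of sending a clause that silently disappears from the query.
    """
    return any(ch.isalnum() for ch in user_input)


if __name__ == "__main__":
    for text in ["Bathing", "@", "!@#$%^&*()"]:
        # Only send the title clause to Solr when this check passes.
        print(repr(text), has_searchable_text(text))
```

When the check fails, the application can return an empty result (or an error message) without querying Solr at all.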
SM.

Yonik Seeley-2 wrote:
>
> On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels <mas...@yahoo.com> wrote:
>>
>> So the fix for this problem would be
>>
>> 1. Stop using WordDelimiterFilter for queries (what is the alternative)
>> OR
>> 2. Not allow any search strings without any alphanumeric characters..
>
> Short term workaround for you, yes.
> I would classify this surprising behavior as a bug we should
> eventually fix though. Could you open a JIRA issue for it?
>
> -Yonik
> http://www.lucidimagination.com
>
>> SM.
>>
>> Yonik Seeley-2 wrote:
>>>
>>> OK, here's the deal:
>>>
>>> <str name="rawquerystring">-features:foo
>>> features:(\...@#$%\^&\*\(\))</str>
>>> <str name="querystring">-features:foo features:(\...@#$%\^&\*\(\))</str>
>>> <str name="parsedquery">-features:foo</str>
>>> <str name="parsedquery_toString">-features:foo</str>
>>>
>>> The text analysis is throwing away non alphanumeric chars (probably
>>> the WordDelimiterFilter). The Lucene (and Solr) query parser throws
>>> away term queries when the token is zero length (after analysis).
>>> Solr then interprets the left over "-features:foo" as "all documents
>>> not containing foo in the features field", so you get a bunch of
>>> matches.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels <mas...@yahoo.com> wrote:
>>>>
>>>> Walter,
>>>>
>>>> The analysis link does not produce any matches for either @ or
>>>> !...@#$%^&*()
>>>> strings when I try to match against bathing. I'm worried that this
>>>> might be the symptom of another problem (which has not revealed
>>>> itself yet) and want to get to the bottom of this...
>>>>
>>>> Thank you.
>>>> sm
>>>>
>>>> Walter Underwood wrote:
>>>>>
>>>>> Use the [analysis] link on the Solr admin UI to get more info on
>>>>> how this is being interpreted.
>>>>>
>>>>> However, I am curious about why this is important. Do users enter
>>>>> this query often?
>>>>> If not, maybe it is not something to spend time on.
>>>>>
>>>>> wunder
>>>>>
>>>>> On 5/31/09 2:56 PM, "Sam Michaels" <mas...@yahoo.com> wrote:
>>>>>
>>>>>>
>>>>>> Here is the output from the debug query when I'm trying to match the
>>>>>> String @
>>>>>> against Bathing (should not match)
>>>>>>
>>>>>> <str name="GLOM-1">
>>>>>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>>>>>   0.99999994 = queryWeight(activity_type:NAME), product of:
>>>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>>>     0.30591258 = queryNorm
>>>>>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>>>>>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>>>     1.0 = fieldNorm(field=activity_type, doc=0)
>>>>>> </str>
>>>>>>
>>>>>> Looks like the AND clause in the search string is ignored...
>>>>>>
>>>>>> SM.
>>>>>>
>>>>>> ryantxu wrote:
>>>>>>>
>>>>>>> two key things to try (for anyone ever wondering why a query matches
>>>>>>> documents)
>>>>>>>
>>>>>>> 1. add &debugQuery=true and look at the explain text below --
>>>>>>> anything that contributed to the score is listed there
>>>>>>> 2. check /admin/analysis.jsp -- this will let you see how analyzers
>>>>>>> break text up into tokens.
>>>>>>>
>>>>>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory
>>>>>>> has something to do with it...
>>>>>>>
>>>>>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels <mas...@yahoo.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm running Solr 1.3/Java 1.6.
>>>>>>>>
>>>>>>>> When I run a query like - (activity_type:NAME) AND
>>>>>>>> title:(\...@#$%\^&\*\(\))
>>>>>>>> all the documents are returned even though there is not a single
>>>>>>>> match.
>>>>>>>> There is no title that matches the string (which has been escaped).
>>>>>>>>
>>>>>>>> My document structure is as follows
>>>>>>>>
>>>>>>>> <doc>
>>>>>>>>   <str name="activity_type">NAME</str>
>>>>>>>>   <str name="title">Bathing</str>
>>>>>>>>   ....
>>>>>>>> </doc>
>>>>>>>>
>>>>>>>> The title field is of type text_title which is described below.
>>>>>>>>
>>>>>>>> <fieldType name="text_title" class="solr.TextField"
>>>>>>>>            positionIncrementGap="100">
>>>>>>>>   <analyzer type="index">
>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>>     <!-- in this example, we will only use synonyms at query time
>>>>>>>>     <filter class="solr.SynonymFilterFactory"
>>>>>>>>             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>>>>>>>>     -->
>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>>>>>>             catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>>>>>>   </analyzer>
>>>>>>>>   <analyzer type="query">
>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>>     <filter class="solr.SynonymFilterFactory"
>>>>>>>>             synonyms="synonyms.txt"
>>>>>>>>             ignoreCase="true" expand="true"/>
>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>>>>>>             catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>>>>>>   </analyzer>
>>>>>>>> </fieldType>
>>>>>>>>
>>>>>>>> When I run the query against Luke, no results are returned. Any
>>>>>>>> suggestions are appreciated.
>>>>>>>>
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
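
To illustrate the explanation quoted above (analysis strips the non-alphanumeric characters, and the parser then drops the clause whose text produced no tokens), here is a toy simulation. It is a sketch under stated assumptions, not Solr or Lucene code: `analyze`, `surviving_clauses` and `search` are hypothetical helpers, the regex only roughly imitates the WordDelimiterFilter plus lowercasing, all clauses are treated as required (AND), and the second document ("Dressing") is made-up sample data.

```python
import re


def analyze(text):
    """Toy analyzer (assumption): keep runs of letters/digits, lowercased."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def surviving_clauses(clauses):
    """Drop every (field, text) clause whose text analyzes to zero tokens."""
    return [(field, analyze(text)) for field, text in clauses if analyze(text)]


def search(docs, clauses):
    """Return docs that satisfy all surviving clauses (every clause is required)."""
    kept = surviving_clauses(clauses)
    return [
        doc for doc in docs
        if all(set(tokens) <= set(analyze(doc.get(field, "")))
               for field, tokens in kept)
    ]


# Made-up sample index; only the first document mirrors the thread.
docs = [
    {"activity_type": "NAME", "title": "Bathing"},
    {"activity_type": "NAME", "title": "Dressing"},
]

# (activity_type:NAME) AND title:(!@#$%^&*())
query = [("activity_type", "NAME"), ("title", "!@#$%^&*()")]

print(surviving_clauses(query))   # [('activity_type', ['name'])]
print(len(search(docs, query)))   # 2
```

The title clause vanishes before matching starts, so only activity_type:NAME is evaluated and every document with that value comes back, which is consistent with the debugQuery output earlier in the thread.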