Peter:

bq: I don't have a requestHandler named "/select".

Right, that was just an example of a request handler. Your
"/scoresearch" handler _does_ have edismax as its default "defType",
so assuming you're using that one, it makes no difference at all
whether you specify &defType=edismax on the URL or not. You'd see a
difference if you specified "&defType=any_parser_other_than_edismax"
though ;)
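
To make the point about handler defaults concrete, here is a trimmed sketch of the handler quoted further down. Only the "defType" entry is kept; the comments are my reading of how "defaults" behave, not something taken from your config.

```xml
<!-- Sketch only: a cut-down copy of the /scoresearch handler from this thread. -->
<requestHandler name="/scoresearch" class="solr.SearchHandler">
  <!-- Entries under "defaults" are used only when the request does not
       supply the same parameter, so a defType on the URL wins over this one. -->
  <lst name="defaults">
    <str name="defType">edismax</str>
  </lst>
</requestHandler>
```

With this in place, a request with no defType and one with &defType=edismax are parsed the same way, while something like &defType=lucene would switch to the standard parser for that one request.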

As for the rest, I'll leave you in the much more capable hands of
Markus since he has, you know, real knowledge in this area rather than
my generalities....

Best,
Erick

On Mon, Mar 12, 2018 at 1:33 AM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> Hi,
>
> Glad to hear you removed the gramming, but Kraaij-Pohlmann isn't going to 
> solve all problems either, for example molens => molen, but molen => mool, 
> and many more like that. You can solve this by adding manual rules to 
> StemmerOverrideFilter, but due to the compound nature of words, you would 
> need to add it for all the mills.
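
A minimal sketch of what the override could look like, using the index analyzer posted later in this thread as the base. The file name stemdict_nl.txt and the "windmolen" entry are assumptions made up for illustration; the same filter would normally go into the query analyzer as well.

```xml
<!-- Sketch: index-time chain from the thread with one override filter added. -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Must sit before the stemmer: matched tokens are rewritten and marked
       as keywords, so the Snowball filter leaves them alone.
       stemdict_nl.txt is a hypothetical file with one tab-separated
       "word [TAB] stem" pair per line, for example:
         molen [TAB] molen
         windmolen [TAB] windmolen -->
  <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict_nl.txt" ignoreCase="true"/>
  <filter class="solr.SnowballPorterFilterFactory" language="Kp" protected="protwords_nl.txt"/>
</analyzer>
```

Because the dictionary matches whole tokens, every compound ending in "molen" needs its own line, which is the maintenance cost mentioned above.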
>
> Regarding the compounds, Dutch is (more or less) just another Germanic 
> language and uses compounds just like German, Swedish etc. To deal with that 
> you can try the vanilla HyphenationCompoundWordTokenFilter (or something like 
> that). Be sure not to set minWordLength too low, or you'll get plenty of bad 
> results. The major drawback of this token filter is that it emits overlapping 
> terms, and may not always work with compounds of which the head is a plural, 
> such as dierenzaak or scholierenkorting.
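
A hedged sketch of the compound filter in an index-time chain. As far as I know the factory attribute for the minimum word length is spelled minWordSize. The file names hyphenation_nl.xml and compound_parts_nl.txt are hypothetical, and the size values are illustrative starting points, not tuned recommendations.

```xml
<!-- Sketch: decompounding at index time only, so "dierenzaak" can also
     emit its parts as overlapping tokens. -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- hyphenator: a hyphenation grammar file for Dutch (hypothetical name).
       dictionary: optional list of allowed subwords (hypothetical name).
       minWordSize: words shorter than this are never split. -->
  <filter class="solr.HyphenationCompoundWordTokenFilterFactory"
          hyphenator="hyphenation_nl.xml"
          dictionary="compound_parts_nl.txt"
          minWordSize="8"
          minSubwordSize="4"
          maxSubwordSize="15"
          onlyLongestMatch="true"/>
  <filter class="solr.SnowballPorterFilterFactory" language="Kp" protected="protwords_nl.txt"/>
</analyzer>
```

Checking the output on the Analysis screen for words like dierenzaak and scholierenkorting should show quickly whether the plural parts come out as hoped.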
>
> Also add an ASCIIFoldingFilter or ICUFoldingFilter to get rid of accents, or 
> you may have trouble finding a café.
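
As a sketch, accent folding is a single filter line, placed after lowercasing and before the stemmer in both the index and the query analyzer. The two options below are alternatives; which one fits depends on whether the ICU analysis jars are available in your install, which I am only assuming here.

```xml
<filter class="solr.LowerCaseFilterFactory"/>
<!-- Option 1: part of core Solr; folds "café" to "cafe". -->
<filter class="solr.ASCIIFoldingFilterFactory"/>
<!-- Option 2, instead of option 1: needs the ICU analysis-extras jars.
<filter class="solr.ICUFoldingFilterFactory"/>
-->
<filter class="solr.SnowballPorterFilterFactory" language="Kp" protected="protwords_nl.txt"/>
```

Documents indexed before the change keep their old tokens, so a reindex is needed for the folding to take effect on existing content.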
>
> Regards,
> Markus
>
> -----Original message-----
>> From:PeterKerk <petervdk...@hotmail.com>
>> Sent: Sunday 11th March 2018 23:55
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr search engine configuration
>>
>> Sorry for this lengthy post, but I wanted to be complete.
>>
>> The only occurrence of edismax in solrconfig.xml is this one:
>>
>>       <requestHandler name="/scoresearch" class="solr.SearchHandler"
>> default="true">
>>
>>                       <lst name="defaults">
>>                         <str name="defType">edismax</str>
>>                         <str name="echoParams">explicit</str>
>>                         <int name="rows">10</int>
>>
>>                         <str name="qf">double_score</str>
>>                         <str name="debug">false</str>
>>                         <str name="q.alt">*:*</str>
>>               </lst>
>>       </requestHandler>
>>
>> I don't have a requestHandler named "/select".
>>
>>
>> Also, removing the gramming definitely helped! :-)
>>
>> I tried to simplify my setup first and then expand, so what I have now is
>> this:
>>
>>
>>       <fieldType name="searchtext_nl" class="solr.TextField"
>> positionIncrementGap="100">
>>       <analyzer type="index">
>>               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>               <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords_nl.txt"/>
>>               <filter class="solr.LowerCaseFilterFactory"/>
>>               <filter class="solr.SnowballPorterFilterFactory" language="Kp"
>> protected="protwords_nl.txt"></filter>
>>
>>
>>       </analyzer>
>>       <analyzer type="query">
>>               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>               <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords_nl.txt"/>
>>               <filter class="solr.LowerCaseFilterFactory"/>
>>               <filter class="solr.SnowballPorterFilterFactory" language="Kp"
>> protected="protwords_nl.txt"></filter>
>>
>>
>>       </analyzer>
>>     </fieldType>
>>
>>       <field name="title_search_global" type="searchtext_nl" indexed="true"
>> stored="true"/>
>>
>> In my database I have these 4 values for "title" that populate
>> "title_search_global"
>>
>> "Hi there dier something else"
>> "Hi there dieren zaak something else"
>> "Hi there dierenzaak something else"
>> "Hi there dierzaak something else"
>>
>> ps. "dier" is singular of plural "dieren".
>>
>> Using this query:
>> http://localhost:8983/solr/search-global/select?q=title_search_global%3A(dieren+zaak)&fq=(lang%3A%22nl%22+OR+lang%3A%22all%22)&fl=id%2Ctitle&wt=xml&indent=true&defType=edismax&qf=title_search_global&stopwords=true&lowercaseOperators=true&debug=true
>>
>> These results are found:
>> "Hi there dier something else"
>> "Hi there dieren zaak something else"
>>
>> And these are NOT:
>> "Hi there dierenzaak something else"
>> "Hi there dierzaak something else"
>>
>> I'd expect it should be fairly easy (although I don't know how) to also
>> include result "dierenzaak", by compounding the 2 query values. And yes you
>> are correct: in Dutch "dieren zaak" would mean the same as "dierenzaak". Not
>> sure what logic would also include "dierzaak"
>>
>> Regarding your question: yes, I do consider "dieren zaak somethingelse" an
>> exact match of "dieren zaak".
>> So I also checked the usage of pf parameters with edismax (based on these
>> links:
>> https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html,
>> http://blog.thedigitalgroup.com/vijaym/understanding-phrasequery-and-slop-in-solr/)
>> And also for dismax:
>> https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Theqs_QueryPhraseSlop_Parameter
>>
>> But I can't find any examples of how to actually use these parameters.
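
One possible way to set these parameters, sketched as handler defaults. The boost of 10 and the slop values are illustrative assumptions, and the sketch assumes the query is sent as plain q=dieren zaak with qf naming the field rather than with the field:(...) syntax. The same names work on the URL, for example &pf=title_search_global^10&ps=0.

```xml
<!-- Sketch: phrase boosting added to the /scoresearch handler. -->
<requestHandler name="/scoresearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title_search_global</str>
    <!-- pf: extra score for documents where all query terms occur
         close together in this field; it does not change what matches. -->
    <str name="pf">title_search_global^10</str>
    <!-- ps: slop allowed for the pf phrase; 0 means adjacent and in order. -->
    <int name="ps">0</int>
    <!-- qs: slop for phrases the user typed in quotes. -->
    <int name="qs">0</int>
  </lst>
</requestHandler>
```

With debug=true the boost should appear as an extra phrase clause in parsedquery, which is an easy way to confirm it is being applied.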
>>
>>
>> The search results, including debug info, are here:
>>
>>
>> <response>
>>     <lst name="responseHeader">
>>         <int name="status">0</int>
>>         <int name="QTime">7</int>
>>         <lst name="params">
>>             <str name="q">title_search_global:(dieren zaak)</str>
>>             <str name="defType">edismax</str>
>>             <str name="debug">true</str>
>>             <str name="indent">true</str>
>>             <str name="qf">title_search_global</str>
>>             <str name="fl">id,title</str>
>>             <str name="fq">(lang:"nl" OR lang:"all")</str>
>>             <str name="wt">xml</str>
>>             <str name="lowercaseOperators">true</str>
>>             <str name="stopwords">true</str>
>>         </lst>
>>     </lst>
>>     <result name="response" numFound="2" start="0">
>>         <doc>
>>             <str name="title">dieren zaak</str>
>>             <str name="id">115_3699638</str>
>>         </doc>
>>         <doc>
>>             <str name="title">dier</str>
>>             <str name="id">115_3699637</str>
>>         </doc>
>>     </result>
>>     <lst name="debug">
>>         <str name="rawquerystring">title_search_global:(dieren zaak)</str>
>>         <str name="querystring">title_search_global:(dieren zaak)</str>
>>         <str name="parsedquery">
>> (+(title_search_global:dier title_search_global:zaak))/no_coord
>> </str>
>>         <str name="parsedquery_toString">
>> +(title_search_global:dier title_search_global:zaak)
>> </str>
>>         <lst name="explain">
>>             <str name="115_3699638">
>> 5.489122 = (MATCH) sum of: 2.4387078 = (MATCH)
>> weight(title_search_global:dier in 51) [DefaultSimilarity], result of:
>> 2.4387078 = score(doc=51,freq=1.0 = termFreq=1.0 ), product of: 0.66654336 =
>> queryWeight, product of: 5.8539815 = idf(docFreq=3, maxDocs=513) 0.113861546
>> = queryNorm 3.6587384 = fieldWeight in 51, product of: 1.0 = tf(freq=1.0),
>> with freq of: 1.0 = termFreq=1.0 5.8539815 = idf(docFreq=3, maxDocs=513)
>> 0.625 = fieldNorm(doc=51) 3.050414 = (MATCH) weight(title_search_global:zaak
>> in 51) [DefaultSimilarity], result of: 3.050414 = score(doc=51,freq=1.0 =
>> termFreq=1.0 ), product of: 0.7454662 = queryWeight, product of: 6.5471287 =
>> idf(docFreq=1, maxDocs=513) 0.113861546 = queryNorm 4.091955 = fieldWeight
>> in 51, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
>> 6.5471287 = idf(docFreq=1, maxDocs=513) 0.625 = fieldNorm(doc=51)
>> </str>
>>             <str name="115_3699637">
>> 1.9509662 = (MATCH) product of: 3.9019325 = (MATCH) sum of: 3.9019325 =
>> (MATCH) weight(title_search_global:dier in 50) [DefaultSimilarity], result
>> of: 3.9019325 = score(doc=50,freq=1.0 = termFreq=1.0 ), product of:
>> 0.66654336 = queryWeight, product of: 5.8539815 = idf(docFreq=3,
>> maxDocs=513) 0.113861546 = queryNorm 5.8539815 = fieldWeight in 50, product
>> of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.8539815 =
>> idf(docFreq=3, maxDocs=513) 1.0 = fieldNorm(doc=50) 0.5 = coord(1/2)
>> </str>
>>             <str name="110_141">
>> 0.9754831 = (MATCH) product of: 1.9509662 = (MATCH) sum of: 1.9509662 =
>> (MATCH) weight(title_search_global:dier in 132) [DefaultSimilarity], result
>> of: 1.9509662 = score(doc=132,freq=1.0 = termFreq=1.0 ), product of:
>> 0.66654336 = queryWeight, product of: 5.8539815 = idf(docFreq=3,
>> maxDocs=513) 0.113861546 = queryNorm 2.9269907 = fieldWeight in 132, product
>> of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.8539815 =
>> idf(docFreq=3, maxDocs=513) 0.5 = fieldNorm(doc=132) 0.5 = coord(1/2)
>> </str>
>>         </lst>
>>         <str name="QParser">ExtendedDismaxQParser</str>
>>         <null name="altquerystring" />
>>         <null name="boost_queries" />
>>         <arr name="parsed_boost_queries" />
>>         <null name="boostfuncs" />
>>         <arr name="filter_queries">
>>             <str>(lang:"nl" OR lang:"all")</str>
>>         </arr>
>>         <arr name="parsed_filter_queries">
>>             <str>lang:nl lang:all</str>
>>         </arr>
>>         <lst name="timing">
>>             <double name="time">7.0</double>
>>             <lst name="prepare">
>>                 <double name="time">4.0</double>
>>                 <lst name="query">
>>                     <double name="time">4.0</double>
>>                 </lst>
>>                 <lst name="facet">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="mlt">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="highlight">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="stats">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="debug">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>             </lst>
>>             <lst name="process">
>>                 <double name="time">3.0</double>
>>                 <lst name="query">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="facet">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="mlt">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="highlight">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="stats">
>>                     <double name="time">0.0</double>
>>                 </lst>
>>                 <lst name="debug">
>>                     <double name="time">3.0</double>
>>                 </lst>
>>             </lst>
>>         </lst>
>>     </lst>
>> </response>
>>
>>
>> PS. had to laugh out loud about that professor joke :-D
>>
>>
>>
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
