For the record, I solved this problem by removing stop words in my analyzer
for wordfield.

We often run into this problem when there are stop-word discrepancies between
fields.
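
To make the discrepancy concrete, here is a toy Python sketch. It is not Solr
code: the stop-word list, the two analyzers and the function names are all
made up for illustration. It only mimics the rule explained further down this
thread, namely that the per-word query shape is kept when every qf field
yields the same number of tokens, and the per-field shape is used otherwise.

```python
# Toy model only: this is not Solr code, and the stop-word list is made up.
STOP_WORDS = {"de", "la", "le"}


def analyze_keep_all(text):
    """Hypothetical analyzer: lowercase and keep every token."""
    return text.lower().split()


def analyze_drop_stop_words(text):
    """Hypothetical analyzer: lowercase and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def query_shape(analyzers, user_query):
    """Return 'per-word' when every field yields the same number of tokens,
    else 'per-field' (the shape where all words must match a single field)."""
    counts = {len(analyze(user_query)) for analyze in analyzers.values()}
    return "per-word" if len(counts) == 1 else "per-field"


# One field drops stop words, the other does not: token counts differ.
mixed = {"wordfield": analyze_drop_stop_words, "edgefield": analyze_keep_all}
# Both fields treat stop words the same way: token counts agree.
aligned = {"wordfield": analyze_keep_all, "edgefield": analyze_keep_all}

print(query_shape(mixed, "musee de paris"))    # per-field
print(query_shape(aligned, "musee de paris"))  # per-word
```

The real parser compares more than token counts, so treat this only as a way
to see why aligning stop-word handling across fields changes the parsedQuery.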

On Thu, Nov 16, 2023 at 09:28, elisabeth benoit <elisaelisael...@gmail.com>
wrote:

>
> Thanks a lot for taking time to answer.
>
> I'll have to figure out a workaround. Decreasing mm is not an option for
> me; maybe I'll use a boost for this extra field.
>
> Best regards,
> Elisabeth
>
> On Tue, Nov 14, 2023 at 12:05, Mikhail Khludnev <m...@apache.org> wrote:
>
>> Ok. Right.
>> (one two three four five six seven)~7 means match all of them, i.e. in fact
>> +one +two +three +four +five +six +seven
>> Here we can see that the way dismax handles fields with different analyzers
>> is far from perfect.
>> You can either decrease mm
>>
>> https://solr.apache.org/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter
>> or experiment with mm.autoRelax=true
>>
>> https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html#TheExtendedDisMaxQueryParser-Themm.autoRelaxParameter
>>
>>
>> On Mon, Nov 13, 2023 at 10:33 PM elisabeth benoit <
>> elisaelisael...@gmail.com>
>> wrote:
>>
>> > Okay, thanks for the answer. The thing is,
>> >
>> > when there is no *wordfield* in the *qf* param, but only *edgefield1* and
>> > *edgefield2*, I get this parsedQuery
>> >
>> > parsedQuery =
>> >  +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>> >  DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>> >  DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>> >  DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>> >  DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>> >  DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>> >  DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>> >
>> > and Solr does return documents
>> >
>> > but when I have *wordfield* and *edgefield* in *qf* instead, I get this
>> > parsedQuery
>> >
>> > parsedQuery =
>> >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>> >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>> >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee edgefield:maillol
>> >  edgefield:61 edgefield:r edgefield:grenelle edgefield:75007
>> >  edgefield:paris)~7)))"
>> >
>> > and Solr does not return any documents.
>> >
>> > That is what makes me think there is something wrong with the second
>> > parsedQuery.
>> >
>> > Best regards,
>> > Elisabeth
>> >
>> >
>> >
>> > On Mon, Nov 13, 2023 at 20:15, Mikhail Khludnev <m...@apache.org> wrote:
>> >
>> > > >
>> > > >  the first case listed in my mail
>> > > > parsedQuery =
>> > > >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>> > > >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>> > > >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee edgefield:maillol
>> > > >  edgefield:61 edgefield:r edgefield:grenelle edgefield:75007
>> > > >  edgefield:paris)~7)))"
>> > >
>> > >
>> > > > The OR is different, it is all words must match wordfield OR all words
>> > > > must match edgefield, but no mix between the two fields are allowed.
>> > >
>> > >
>> > > It doesn't work this way. These two queries differ only in scoring/results
>> > > ordering, i.e.
>> > > this query matches docs: {wordfield:musee, edgefield:musee} as well as
>> > > {wordfield:musee, edgefield:maillol}, {wordfield:musee},
>> > > {edgefield:maillol}.
>> > > This explanation might be useful
>> > > https://lucidworks.com/post/solr-boolean-operators/
>> > > Note: DisMax works like OR/| but takes max instead of sum as a score.
>> > >
>> > > On Mon, Nov 13, 2023 at 7:21 PM elisabeth benoit <
>> > > elisaelisael...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > Thanks for your answer.
>> > > >
>> > > > I mean that in the second case listed in my mail, the query is
>> > > > parsedQuery =
>> > > >  +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>> > > >  DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>> > > >  DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>> > > >  DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>> > > >  DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>> > > >  DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>> > > >  DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>> > > >
>> > > > and so the way I read it is "musee" can match edgefield1 OR edgefield2,
>> > > > "maillol" can match edgefield1 OR edgefield2, and so on, so solr can
>> > > > return a doc where some query words match with edgefield1 and some
>> > > > other query words with edgefield2.
>> > > >
>> > > > But in the first case listed in my mail
>> > > >
>> > > > parsedQuery =
>> > > >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>> > > >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>> > > >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee edgefield:maillol
>> > > >  edgefield:61 edgefield:r edgefield:grenelle edgefield:75007
>> > > >  edgefield:paris)~7)))"
>> > > >
>> > > > The OR is different: either all words must match wordfield OR all words
>> > > > must match edgefield, but no mix between the two fields is allowed.
>> > > >
>> > > > So I cannot search both fields at the same time.
>> > > >
>> > > > I hope this is clear!
>> > > >
>> > > > I would like to search both fields in same query.
>> > > >
>> > > > Best regards,
>> > > > Elisabeth
>> > > >
>> > > > On Mon, Nov 13, 2023 at 17:02, Mikhail Khludnev <m...@apache.org> wrote:
>> > > >
>> > > > > Hello Elisabeth.
>> > > > > DisMax analyses user input across the given qf fields. If the number
>> > > > > of resulting tokens is different, it can't apply the default logic
>> > > > > (a per-word sum over per-field maximums) and flips to a max over
>> > > > > sums. The good news is that the difference between the two
>> > > > > approaches is only scoring.
>> > > > > What exactly do you mean by the absence of "matching words to be in
>> > > > > two different fields"?
>> > > > >
>> > > > > On Mon, Nov 13, 2023 at 5:01 PM elisabeth benoit <
>> > > > > elisaelisael...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hello,
>> > > > > >
>> > > > > > I am using solr 7.3.1 with ExtendedDismaxQParser.
>> > > > > >
>> > > > > > I have an edge n-grams field and a normal text field. When I mix
>> > > > > > those two in the same query, i.e. *qf=edgefield wordfield*, and use
>> > > > > > the option *debugQuery=on*, I see that the parsedQuery is different,
>> > > > > > i.e. all words should match the same field.
>> > > > > >
>> > > > > > i.e. parsedQuery =
>> > > > > >
>> > > > > > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>> > > > > > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>> > > > > > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee edgefield:maillol
>> > > > > > edgefield:61 edgefield:r edgefield:grenelle edgefield:75007
>> > > > > > edgefield:paris)~7)))"
>> > > > > >
>> > > > > > When instead I use two edgefields with *qf=edgefield1 edgefield2*
>> > > > > >
>> > > > > > parsedQuery =
>> > > > > > +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>> > > > > > DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>> > > > > > DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>> > > > > > DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>> > > > > > DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>> > > > > > DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>> > > > > > DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>> > > > > >
>> > > > > > In the second case, edismax allows matching words to be in two
>> > > > > > different fields, but not in the first case.
>> > > > > >
>> > > > > > Is there a way to have the same behaviour, i.e. case two, in all
>> > > > > > cases?
>> > > > > >
>> > > > > > best regards,
>> > > > > > Elisabeth
>> > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Sincerely yours
>> > > > > Mikhail Khludnev
>> > > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Sincerely yours
>> > > Mikhail Khludnev
>> > >
>> >
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
