[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836249#comment-16836249 ]

Fredrik Rodland edited comment on SOLR-12243 at 5/9/19 6:07 PM:

I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large {{OR}}-queries). Having {{pf}} enabled on fields with a substantial number of synonyms resulted in the pf-portion of the query growing "exponentially", and a single query was enough to take down an entire Solr server. By adjusting the number of {{OR}}-queries we were able to increase the memory required for running the query.

Example (id has synonyms enabled, companyname has not):

*{{A.}}* {{q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname}}

results in this pf-part of the edismax query:

{{(+DisjunctionMaxQuery((companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))}}

*{{B.}}* {{q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname}}

results in this pf-part of the edismax query:

{{(+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))}}

B. above is just a reasonably short example to show our point. Our actual queries (and the resulting {{pf}} {{DisjunctionMaxQuery}}) are a *lot longer*. Increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially".

was (Author: fmr): I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large OR-queries). Having \{code}pf\{code} enabled on fields with a substansial amount of synonym resulted in the pf-portion of the query growing "exponentially" and resulted in one single query taking down an entire solr-server. By adjusting the number of OR-queries we were able to increase the memory required for running the query.
example (id has synonyms enabled, companyname has not): q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname results in pf-part of edismax-query (+DisjunctionMaxQuery((companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01)) q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname results in pf-part of edismax-query (+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially" > Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >
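A note on the growth in example B above: the ten id: phrases are what results when every synonym alternative at one position is combined with every alternative at the other positions (five variants of samfunnsviter times two of psykolog). The following is a minimal Java sketch of that arithmetic only. The class and method names are hypothetical, and it does not use or reproduce Solr's actual expansion code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Solr: models per-position synonym alternatives. */
final class PhraseVariantCount {

    /** Number of distinct phrases = product of the number of alternatives at each position. */
    static long countVariants(List<List<String>> positions) {
        long total = 1;
        for (List<String> alternatives : positions) {
            // multiplyExact throws ArithmeticException instead of silently overflowing
            total = Math.multiplyExact(total, (long) alternatives.size());
        }
        return total;
    }

    /** Enumerates every phrase; the size of the result equals countVariants(positions). */
    static List<String> expand(List<List<String>> positions) {
        List<String> phrases = new ArrayList<>();
        phrases.add("");
        for (List<String> alternatives : positions) {
            List<String> next = new ArrayList<>();
            for (String prefix : phrases) {
                for (String alt : alternatives) {
                    next.add(prefix.isEmpty() ? alt : prefix + " " + alt);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    public static void main(String[] args) {
        // Alternatives read off the id: phrases in example B
        List<List<String>> positions = List.of(
            List.of("samfunnsviter", "samfunnsvitar", "social scientist", "statsviter", "samfunnsøkonom"),
            List.of("klima"),
            List.of("miljø"),
            List.of("psykolog", "psykologspesialist"),
            List.of("rus"),
            List.of("ortopedi"),
            List.of("odontologi"));
        System.out.println(countVariants(positions)); // 5 * 2 = 10
        System.out.println(expand(positions).size()); // 10
    }
}
```

Under this model, each additional term with k alternatives multiplies the phrase count by k, which is consistent with the "exponential" growth reported in the comment.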
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836454#comment-16836454 ] Fredrik Rodland edited comment on SOLR-12243 at 5/9/19 3:05 PM: Thanks for taking the time to explain and link other issues [~mgibney]. Good we're not alone here. For the time being we've limited pf to only allow non-synonym fields as pf is really not that crucial for our site. was (Author: fmr): Thanks for taking the time to explain and link other issues [~mgibney]. Good we're not alone here. For the time being we've disabled limited pf to only allow non-synonym fields as pf is really not that crucial for our site. > Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >Reporter: Elizabeth Haubert >Assignee: Steve Rowe >Priority: Major > Fix For: 7.6, 8.0 > > Attachments: SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, > SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, > multiword-synonyms.txt, schema.xml, solrconfig.xml > > Time Spent: 10m > Remaining Estimate: 0h > > synonyms.txt: > {code} > allergic, hypersensitive > aspirin, acetylsalicylic acid > dog, canine, canis familiris, k 9 > rat, rattus > {code} > request handler: > {code:xml} > > > > edismax > 0.4 > title^100 > title~20^5000 > title~11 > title~22^1000 > text > > 3<-1 6<-3 9<30% > *:* > 25 > > > {code} > Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the > above list will not be generated. 
> "allergic reaction dog" will generate pf2: "allergic reaction", but not > pf:"allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction > dog" > "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin > dose" or pf3:"aspirin dose ?" > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836249#comment-16836249 ] Fredrik Rodland edited comment on SOLR-12243 at 5/9/19 9:58 AM: I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large OR-queries). Having \{code}pf\{code} enabled on fields with a substansial amount of synonym resulted in the pf-portion of the query growing "exponentially" and resulted in one single query taking down an entire solr-server. By adjusting the number of OR-queries we were able to increase the memory required for running the query. example (id has synonyms enabled, companyname has not): q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname results in pf-part of edismax-query (+DisjunctionMaxQuery((companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01)) q=( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname results in pf-part of edismax-query (+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? 
samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially" was (Author: fmr): I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large OR-queries). Having \{code}pf\{code} enabled on fields with a substansial amount of synonym resulted in the pf-portion of the query growing "exponentially" and resulted in one single query taking down an entire solr-server. By adjusting the number of OR-queries we were able to increase the memory required for running the query. example (id has synonyms enabled, companyname has not): {code:java} q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname\ {code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery((companyname:\"? samfunnsviter klima miljø ? ? 
psykolog rus ortopedi odontologi\"~5)~0.01))\{code} {code:java} q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname\ {code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially" > Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >Reporter: Elizabeth Haubert >Assignee: Steve Rowe >Priority: Major > Fix For: 7.6, 8
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836249#comment-16836249 ] Fredrik Rodland edited comment on SOLR-12243 at 5/9/19 9:56 AM: I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large OR-queries). Having \{code}pf\{code} enabled on fields with a substansial amount of synonym resulted in the pf-portion of the query growing "exponentially" and resulted in one single query taking down an entire solr-server. By adjusting the number of OR-queries we were able to increase the memory required for running the query. example (id has synonyms enabled, companyname has not): {code:java} q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname\ {code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery((companyname:\"? samfunnsviter klima miljø ? ? 
psykolog rus ortopedi odontologi\"~5)~0.01))\{code} {code:java} q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname\ {code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially" was (Author: fmr): I am aware that this issue is closed, but nonetheless: I think this actually broke something regarding expansion of synonyms for large queries (possibly large OR-queries). Having \{code}pf\{code} enabled on fields with a substansial amount of synonym resulted in the pf-portion of the query growing "exponentially" and resulted in one single query taking down an entire solr-server. By adjusting the number of OR-queries we were able to increase the memory required for running the query. example (id has synonyms enabled, companyname has not): {code}q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=companyname\{code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery((companyname:\"? 
samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} {code}q= ( samfunnsviter (klima OR miljø) ) NOT ( psykolog%20 OR rus OR ortopedi OR odontologi )&debugQuery=true&pf=id companyname\{code} results in pf-part of edismax-query {code}(+DisjunctionMaxQuery(((id:\"samfunnsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsvitar klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykolog rus ortopedi odontologi\"~5 id:\"social scientist klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykolog rus ortopedi odontologi\"~5 id:\"statsviter klima miljø psykologspesialist rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykolog rus ortopedi odontologi\"~5 id:\"samfunnsøkonom klima miljø psykologspesialist rus ortopedi odontologi\"~5) | companyname:\"? samfunnsviter klima miljø ? ? psykolog rus ortopedi odontologi\"~5)~0.01))\{code} increasing the number of OR-terms or synonyms results in the id-part of the query growing "exponentially" > Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >Reporter: Elizabeth Haubert >Assignee: Steve Rowe >
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667457#comment-16667457 ]

Uwe Schindler edited comment on SOLR-12243 at 10/29/18 5:19 PM:

Thanks Elizabeth and Steve,

I think the problem Sarowe mentioned was actually the cause of the failing test (I expected something like this). This is also the reason why I don't like the current architecture: EDismax relies on the (internal) structure of the queries that the query builder produces! IMHO, we should maybe add a "Lucene" version of the dismax parser for easier testing. I also figured out that the phrase expansions in particular are useful for Lucene users, too. I have built custom query parsers for several people, and for every one of them the phrase expansion logic had to be reinvented.

Elizabeth: I think the permutation problem is not new with the recent Lucene fixes. This problem should also have happened with Span expansions, right? Maybe we should add an option to limit the number of phrase expansions (as a safety feature). If that limit is reached, the phrase expansion should be stopped (maybe then generating only bigrams and no trigrams).

was (Author: thetaphi): Thanks Elizabeth and Steve, I think the problem Sarowe mentioned was actually the problem for the failing test (I expected something like this). This is also the reason why I don't like the current architecture. EDismax relies on the (internal) structure of queries that querybuilder produces! IMHO, we should maybe add a "Lucene" version of the dismax parser for easier testing. Also I figured out that especially the phrase expansions are useful for Lucene users, too. I had several people I made a custom query parser for and for all of those you hd to reinvent the phrase expansion stuff. Elizabeth: I think the permutation problem is not new with the recent Lucene fixed. This problem also happened with Span expansions, right? Maybe we should add an option to limit the number of phrase expansions (as a safety feature). If those limits are reached, the phrase expansion should be stopped (maybe then only bigrams and no trigrams).

> Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >Reporter: Elizabeth Haubert >Assignee: Uwe Schindler >Priority: Major > Attachments: SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, > SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, multiword-synonyms.txt, > schema.xml, solrconfig.xml > > Time Spent: 10m > Remaining Estimate: 0h > > synonyms.txt: > {code} > allergic, hypersensitive > aspirin, acetylsalicylic acid > dog, canine, canis familiris, k 9 > rat, rattus > {code} > request handler: > {code:xml} > > > > edismax > 0.4 > title^100 > title~20^5000 > title~11 > title~22^1000 > text > > 3<-1 6<-3 9<30% > *:* > 25 > > > {code} > Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the > above list will not be generated. > "allergic reaction dog" will generate pf2: "allergic reaction", but not > pf:"allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction > dog" > "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin > dose" or pf3:"aspirin dose ?"
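The safety limit suggested in the comment above (stop expanding once a limit is reached, perhaps falling back from trigrams to bigrams) could look roughly like the following. This is only a sketch under stated assumptions: the class, the method names, and the idea of counting expansions per shingle size are all hypothetical, and no such option is claimed to exist in Solr or Lucene.

```java
import java.util.List;

/** Hypothetical sketch of a phrase-expansion cap; not an existing Solr/Lucene API. */
final class PhraseExpansionLimit {

    /**
     * Total number of expanded phrases over all shingles of the given size, where
     * alternativesPerPosition.get(i) is the number of synonym alternatives at position i.
     */
    static long expansionsForShingleSize(List<Integer> alternativesPerPosition, int shingleSize) {
        long total = 0;
        for (int start = 0; start + shingleSize <= alternativesPerPosition.size(); start++) {
            long product = 1;
            for (int i = start; i < start + shingleSize; i++) {
                // exact arithmetic: throws ArithmeticException on overflow
                product = Math.multiplyExact(product, (long) alternativesPerPosition.get(i));
            }
            total = Math.addExact(total, product);
        }
        return total;
    }

    /**
     * Returns 3 if trigram expansion stays within the limit, otherwise 2 if bigram
     * expansion does, otherwise 0 (no phrase expansion at all).
     */
    static int largestAllowedShingleSize(List<Integer> alternativesPerPosition, long maxExpansions) {
        for (int size = 3; size >= 2; size--) {
            if (expansionsForShingleSize(alternativesPerPosition, size) <= maxExpansions) {
                return size;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Integer> counts = List.of(4, 4, 4, 4);   // four terms, four alternatives each
        System.out.println(expansionsForShingleSize(counts, 3)); // 64 + 64 = 128
        System.out.println(expansionsForShingleSize(counts, 2)); // 16 * 3 = 48
        System.out.println(largestAllowedShingleSize(counts, 100)); // 2: trigrams exceed the limit
    }
}
```

Counting before building keeps the check cheap: the decision is made from the per-position alternative counts alone, before any query objects are allocated.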
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657074#comment-16657074 ]

Uwe Schindler edited comment on SOLR-12243 at 10/19/18 4:51 PM:

That's what I mean, it's still linked together. The main bug is still in Lucene: the Lucene query builder creates a query that does not correctly implement span queries on multi-term synonyms, because it uses the wrong query type. The issues here come from the fact that dismax relies on the internal implementation of the Lucene code, which is not a good thing. The Solr code should not do this; instead we should add something to Lucene that can create those pf auto-phrase queries. I was missing that in a query parser of my own, too. So basically it would be good to have an additional query builder method in Lucene that analyzes some text and then builds configurable shingles that are connected with span/phrase queries using a slop. This code should not depend on the structure of a span/boolean query that was parsed before.

I'd like to wait a few days until the Lucene issue is solved, then review the changes here and adapt them as necessary. In the longer term, I'd like to get rid of the query instanceof spaghetti code and move the query construction for dismax-like queries using term shingles (bigrams, trigrams) to a separate builder class, so that it is more reusable.

was (Author: thetaphi): That's waht I mean, it's still linked together. The main bug is still in Lucene, because the Lucene Query builder creates a query that does not correctly implement span queries on multi-term synonyms, because it uses the wrong query type. The issues here are coming from the fact that dismax relies on the interal implementation of the lucene code, which is not a good thing. The solr code should not do this and instead we should add something into Lucene that can create those pf auto-phrase queries. I was missing that in an own query parser, too.
So basically it would be good to have some additional query builder method in Lucene that analyzes some text and then builds configureable shingles that are connected with span/phrase using a slop. This code should not depend on the structure of a span/boolean query that was parsed before. I'd like to wait a few days until the Lucene issue is solved and then review the changes here and adapt them as necessary. On the longer term, I'd like to get rid of the query instance of shingling and move the query construction for dismax-like queries to a separate builder class. So it's better resuseable. > Edismax missing phrase queries when phrases contain multiterm synonyms > -- > > Key: SOLR-12243 > URL: https://issues.apache.org/jira/browse/SOLR-12243 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.1 > Environment: RHEL, MacOS X > Do not believe this is environment-specific. >Reporter: Elizabeth Haubert >Assignee: Uwe Schindler >Priority: Major > Attachments: SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, > SOLR-12243.patch, SOLR-12243.patch > > Time Spent: 10m > Remaining Estimate: 0h > > synonyms.txt: > {code} > allergic, hypersensitive > aspirin, acetylsalicylic acid > dog, canine, canis familiris, k 9 > rat, rattus > {code} > request handler: > {code:xml} > > > > edismax > 0.4 > title^100 > title~20^5000 > title~11 > title~22^1000 > text > > 3<-1 6<-3 9<30% > *:* > 25 > > > {code} > Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the > above list will not be generated. > "allergic reaction dog" will generate pf2: "allergic reaction", but not > pf:"allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction > dog" > "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin > dose" or pf3:"aspirin dose ?" 
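The separate builder proposed in the comment above (analyze the text, then build shingles without looking at a previously parsed query) might start from something like this. It is a sketch only: the class and method names are hypothetical, the tokens are assumed to come out of an analyzer already, and the output is a query-syntax string rather than real Lucene Query objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a shingle builder; not an existing Lucene or Solr class. */
final class ShingleSketch {

    /** Returns every run of `size` consecutive tokens (size 2 = bigrams, 3 = trigrams). */
    static List<List<String>> shingles(List<String> tokens, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start + size <= tokens.size(); start++) {
            result.add(new ArrayList<>(tokens.subList(start, start + size)));
        }
        return result;
    }

    /** Renders one shingle as a sloppy phrase in query syntax, e.g. title:"a b"~11. */
    static String toPhrase(String field, List<String> shingle, int slop) {
        return field + ":\"" + String.join(" ", shingle) + "\"~" + slop;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("allergic", "reaction", "dog");
        for (List<String> bigram : shingles(tokens, 2)) {
            System.out.println(toPhrase("title", bigram, 11));
        }
        // prints:
        // title:"allergic reaction"~11
        // title:"reaction dog"~11
    }
}
```

Because the shingles are derived from the token list alone, nothing here depends on whether the main query was parsed into span or boolean clauses, which is the decoupling the comment asks for. Synonym expansion per shingle is deliberately left out.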
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456929#comment-16456929 ]

Elizabeth Haubert edited comment on SOLR-12243 at 4/27/18 7:26 PM:

The fix I pushed up really only handles well, for pf2, the case where you start with the single-word synonym; that is, matching "foo bar" queries to "foo tropical cyclone" documents. This was a real problem for my use case, because the pf clauses weren't being generated at all. The other direction, matching "foo tropical cyclone" queries to "foo bar" documents, is harder. I've gone a little way into the pf2 "b tropical" problem, but it is a deeper problem than the spans getting thrown out because they were the wrong type of query. Start small.

Here's what I've got for the other direction: One of the first things edismax does is generate a list of the different kinds of clauses from the user query, and that seems to be unaffected by the sow flag. So "foo tropical cyclone" has three bareword clauses: "foo", "tropical", and "cyclone". But 'foo "tropical cyclone"' (with quotes) has two: a bareword foo and a phrase "tropical cyclone". When it goes to construct pf2 and pf3, edismax re-assembles the bareword clauses, then makes the 2- and 3-word shingles. So "foo tropical cyclone" would get pf2="foo tropical" and "tropical cyclone". pf2="foo tropical" can't get expanded, because it is missing cyclone, and will go through as it is; "tropical cyclone" will get expanded, but then removed as not a phrase, not just because it is a Span. That seems consistent if we think of "tropical cyclone" as a single entity.

So to do anything, we need to address how the shingle queries are being constructed. I opened Jira-12260 to start looping the phrases into the pf clauses, not just the barewords, because that has some other weird semantics. So 'foo "tropical cyclone" baz' (with quotes) would generate pf="foo baz", which is unintuitive; it would make more sense if it became "foo "tropical cyclone"" and ""tropical cyclone" baz". Beyond looking a little into whether the graph queries could handle the phrase, I haven't really dug into how to do that yet. That matters here, because if that works and the semantics are acceptable, multi-word synonyms are already handled as quoted in the logic that creates the graph queries. So it would (probably) be safe to take that another step and stuff the multiword synonyms into a single phrase clause, rather than individual bareword clauses. Maybe.

was (Author: ehaubert): The fix I pushed up really only handles the case where you're starting with the single-word synonym well. So matching "foo bar" queries to "foo tropical cyclone" documents. This was a real problem for my use case, because the pf clauses weren't being generated at all. The other direction, to match "foo tropical cyclone" queries to "foo bar" documents is harder. I've gone a little ways into the pf2 "b tropical" problem, but it is a deeper problem than the spans getting thrown out because they were the wrong type of query. Start small. Here's what I've got for the other direction: One of first thing edismax does is generate a list of different kinds of clauses off the user query, and that seems to be unaffected by the sow flag. So "foo tropical cyclone" has three bareword clauses: "foo", "tropical", and "cyclone". But 'foo "tropical cyclone"' (with quotes) has two: a bareword foo and a phrase "tropical cyclone". When it goes to construct pf2 and pf3, edismax re-assembles the bareword clauses, then makes the 2- and 3- word shingles.
So "foo tropical cyclone" would get pf2="foo tropical" and "tropical cyclone", pf2="foo tropical" can't get expanded, because it is missing cyclone, and will go through such as it is; "tropical cyclone" will get expanded, but then removed as not a phrase, not just because it is a Span. That seems consistent if we think of "tropical cyclone" as a single entity. So to do anything, we need to address how the shingle queries are being constructed. I opened Jira-12260 to start looping in the phrases to pf clauses, not just the barewords, because that has some other weird semantics. So 'foo "tropical cyclone" baz' (with quotes) would generate pf="foo baz", which is unintuitive - it would make more sense if it became "foo "tropical cyclone"" and "tropical cyclone" baz. Beyond looking a little into whether the graph queries could handle the phrase, I haven't really dug how to do that yet. That matters here, because if that works and the semantics are acceptable, multi-word synoynms are already handled as quoted in the logic that creates the graph queries. So it would (probably) be safe to take that another step to stuff the multiword synonyms into a single phrase clause, rather than individual bareword clauses. May
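The clause behaviour described in this comment (quoted phrases and barewords become separate clauses, and only the barewords are re-assembled for pf) can be illustrated with a small model. This is an assumption-laden sketch, not edismax's real clause parser: the class, the regular expression, and the method names are hypothetical, and operators, field prefixes, and escaping are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model of bareword vs. phrase clauses; not edismax's actual implementation. */
final class ClauseSketch {

    /** One clause of the user query: either a bareword or a quoted phrase. */
    static final class Clause {
        final String text;
        final boolean phrase;

        Clause(String text, boolean phrase) {
            this.text = text;
            this.phrase = phrase;
        }
    }

    // Either a double-quoted run (group 1) or a run of non-whitespace characters (group 2)
    private static final Pattern CLAUSE = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static List<Clause> split(String userQuery) {
        List<Clause> clauses = new ArrayList<>();
        Matcher m = CLAUSE.matcher(userQuery);
        while (m.find()) {
            if (m.group(1) != null) {
                clauses.add(new Clause(m.group(1), true));
            } else {
                clauses.add(new Clause(m.group(2), false));
            }
        }
        return clauses;
    }

    /** Joins only the bareword clauses, mirroring the re-assembly step described above. */
    static String barewordPhrase(List<Clause> clauses) {
        List<String> words = new ArrayList<>();
        for (Clause c : clauses) {
            if (!c.phrase) {
                words.add(c.text);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(barewordPhrase(split("foo \"tropical cyclone\" baz"))); // foo baz
        System.out.println(barewordPhrase(split("foo tropical cyclone")));         // foo tropical cyclone
    }
}
```

In this model the quoted phrase simply drops out of the re-assembled text, which reproduces the unintuitive pf="foo baz" result mentioned in the comment.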
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456834#comment-16456834 ]

Alessandro Benedetti edited comment on SOLR-12243 at 4/27/18 5:58 PM:

Hi [~ehaubert], thanks for the reply. I think the current patch could be completed by adding a test that verifies the actual query parsing (building). In the end the bug affects the query parsing (building), so testing the results per query can be effective, but it does not test the bugfix itself. I will just post a rough copy and paste here; if the Jira is still open I will push a PR with the fix in the next few days.

Adding something like this should work:

public void testEdismaxQueryParsing_multiTermWithPf_shouldParseCorrectPhraseQueries() throws Exception {
  Query q = QParser.getParser("foo a b bar","edismax",true, req(params("sow", "false","qf", "text^10","pf", "text^10","pf2", "text^5","pf3", "text^8"))).getQuery();
  assertEquals("+(" +
      "((text:foo)^10.0) ((text:a)^10.0) ((text:b)^10.0) (((+text:tropical +text:cyclone) text:bar)^10.0)) " +
      "((spanNear([text:foo, text:a, text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^10.0) " +
      "(((text:\"foo a\")^5.0) ((text:\"a b\")^5.0) ((spanNear([text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^5.0)) " +
      "(((text:\"foo a b\")^8.0) ((spanNear([text:a, text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^8.0))",
      q.toString());

  q = QParser.getParser("foo a b tropical cyclone","edismax",true, req(params("qf", "text^10","pf", "text^10","pf2", "text^5","pf3", "text^8"))).getQuery();
  assertEquals("+(" +
      "((text:foo)^10.0) ((text:a)^10.0) ((text:b)^10.0) ((text:bar (+text:tropical +text:cyclone))^10.0)) " +
      "((spanNear([text:foo, text:a, text:b, spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)])], 0, true))^10.0) " +
      "(((text:\"foo a\")^5.0) ((text:\"a b\")^5.0) ((text:\"b tropical\")^5.0)) {color:#ff}*(spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)]))^5.0))"*{color} +
      "(((text:\"foo a b\")^8.0) ((text:\"a b tropical\")^8.0) ((spanNear([text:b, spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)])], 0, true))^8.0))",
      q.toString());
}

*N.B.* The second part fails for pf2 because, for the query "foo a b tropical cyclone", pf2 generates just: ((text:\"foo a\")^5.0) ((text:\"a b\")^5.0) ((text:\"b tropical\")^5.0)), which I believe is incorrect, as an additional span query should be generated: ( (spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)]))^5.0)). I will investigate further in the next few days; I just wanted to bring it to the community's attention :)

was (Author: alessandro.benedetti): Hi [~ehaubert], thanks for the reply. I think the current patch could be completed adding a test that verifies the actual query (building) parsing. The bug affects the query (building) parsing in the end, so, testing on results per query can be effective, but it's not testing the bugfix.
Adding something like this should work : public void testEdismaxQueryParsing_multiTermWithPf_shouldParseCorrectPhraseQueries() throws Exception { Query q = QParser.getParser("foo a b bar","edismax",true, req(params("sow", "false","qf", "text^10","pf", "text^10","pf2", "text^5","pf3", "text^8"))).getQuery(); assertEquals("+(" + "((text:foo)^10.0) ((text:a)^10.0) ((text:b)^10.0) (((+text:tropical +text:cyclone) text:bar)^10.0)) " + "((spanNear([text:foo, text:a, text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^10.0) " + "(((text:\"foo a\")^5.0) ((text:\"a b\")^5.0) ((spanNear([text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^5.0)) " + "(((text:\"foo a b\")^8.0) ((spanNear([text:a, text:b, spanOr([spanNear([text:tropical, text:cyclone], 0, true), text:bar])], 0, true))^8.0))", q.toString()); q = QParser.getParser("foo a b tropical cyclone","edismax",true, req(params("qf", "text^10","pf", "text^10","pf2", "text^5","pf3", "text^8"))).getQuery(); assertEquals("+(" + "((text:foo)^10.0) ((text:a)^10.0) ((text:b)^10.0) ((text:bar (+text:tropical +text:cyclone))^10.0)) " + "((spanNear([text:foo, text:a, text:b, spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)])], 0, true))^10.0) " + "(((text:\"foo a\")^5.0) ((text:\"a b\")^5.0) ((text:\"b tropical\")^5.0)) {color:#FF}*(spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)]))^5.0))"*{color} + "(((text:\"foo a b\")^8.0) ((text:\"a b tropical\")^8.0) ((spanNear([text:b, spanOr([text:bar, spanNear([text:tropical, text:cyclone], 0, true)])], 0, true))^8.0))", q.toString()); } *N.B.* The second part is failing for pf2, because for the query "foo a b tropical cyclone" , pf2 is generating just
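To make the pf2 expectation in the N.B. concrete: a five-term query has four adjacent word pairs, so pf2 would be expected to contribute four clauses, the last one covering "tropical cyclone". The following is a minimal, Solr-independent sketch of that count. The class name `PhraseShingles` and the method `shingles` are hypothetical names used only for illustration, and this is not how edismax builds its phrase queries internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: lists the adjacent word n-grams
// that pf2 (n = 2) and pf3 (n = 3) would be expected to cover.
final class PhraseShingles {

    // Returns every run of n consecutive terms, joined by a single space.
    static List<String> shingles(List<String> terms, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= terms.size(); i++) {
            out.add(String.join(" ", terms.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("foo a b tropical cyclone".split(" "));
        // prints [foo a, a b, b tropical, tropical cyclone]
        System.out.println(shingles(terms, 2));
        // prints [foo a b, a b tropical, b tropical cyclone]
        System.out.println(shingles(terms, 3));
    }
}
```

For the test's second query this lists four pairs, which lines up with the three phrase clauses plus the one span clause that the N.B. expects from pf2, and three triples, which lines up with the three pf3 clauses in the expected string.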
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456720#comment-16456720 ]

Elizabeth Haubert edited comment on SOLR-12243 at 4/27/18 5:16 PM:

My understanding from the [HowToContribute|https://wiki.apache.org/solr/HowToContribute] is that it is supposed to happen automagically if the patch is named correctly, but I didn't knowingly do anything to cause it to happen. The code is pretty straightforward, but I'm having a bit of a learning curve on the non-code things that need to happen.

> Edismax missing phrase queries when phrases contain multiterm synonyms
> --
>
> Key: SOLR-12243
> URL: https://issues.apache.org/jira/browse/SOLR-12243
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 7.1
> Environment: RHEL, MacOS X
> Do not believe this is environment-specific.
> Reporter: Elizabeth Haubert
> Priority: Major
> Attachments: SOLR-12243.patch
>
> synonyms.txt:
> allergic, hypersensitive
> aspirin, acetylsalicylic acid
> dog, canine, canis familiris, k 9
> rat, rattus
>
> request handler:
> edismax
> 0.4
> title^100
> title~20^5000
> title~11
> title~22^1000
> text
> 3<-1 6<-3 9<30%
> *:*
> 25
>
> Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the above list will not be generated.
> "allergic reaction dog" will generate pf2: "allergic reaction", but not pf:"allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction dog"
> "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin dose" or pf3:"aspirin dose ?"
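The two terms singled out in the description, "dog" and "aspirin", are the ones whose synonym groups contain a multi-word entry, which fits the issue title. A minimal sketch for checking which query terms fall into that category is given here. It copies the synonym groups from the description as-is; the class `MultiTermSynonyms` and the method `hasMultiTermSynonym` are hypothetical names for illustration and do not reflect how Solr's synonym filters work internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: flags terms whose synonym group
// contains at least one multi-word entry.
final class MultiTermSynonyms {

    // Synonym groups copied verbatim from the synonyms.txt in the issue description.
    static final List<String> RULES = Arrays.asList(
        "allergic, hypersensitive",
        "aspirin, acetylsalicylic acid",
        "dog, canine, canis familiris, k 9",
        "rat, rattus");

    // True if the term belongs to a group that also has a multi-word entry.
    static boolean hasMultiTermSynonym(String term) {
        for (String rule : RULES) {
            List<String> entries = new ArrayList<>();
            for (String entry : rule.split(",")) {
                entries.add(entry.trim());
            }
            if (!entries.contains(term)) {
                continue;
            }
            for (String entry : entries) {
                if (entry.contains(" ")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String term : "allergic reaction dog".split(" ")) {
            System.out.println(term + " -> " + hasMultiTermSynonym(term));
        }
    }
}
```

With the groups above, only "dog" is flagged in "allergic reaction dog", and the missing phrase queries listed in the description for that query are exactly the ones that include "dog".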
[jira] [Comment Edited] (SOLR-12243) Edismax missing phrase queries when phrases contain multiterm synonyms
[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452972#comment-16452972 ]

Elizabeth Haubert edited comment on SOLR-12243 at 4/25/18 7:54 PM:

I'd really like to make a bugfix release of the 7_1 branch with this, although the problem is still present on 7.x as well. Thoughts? The actual change is quite small.