Re: Porter Stem filter and employing
The easiest way to think about it is that the “mm” (minimum should match) parameter is a sliding scale between pure OR and pure AND: at 0, any single matching clause is enough to return the doc, and at 100% every clause must match for the doc to be returned.

But no, I don’t know of any other explanation pages for that parameter.

Best,
Erick

> On Mar 7, 2019, at 1:37 AM, Marisol Redondo wrote:
>
> [...] I discovered that in my customized RequestHandler I'm using
> defType=edismax and mm=100% (and other params); when I remove the mm
> param, I get the documents.
>
> [...] Is there any other documentation that can help me understand how
> this parameter works? I don't want to break all the searches by removing it.
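As a rough illustration of that sliding scale, here is a small Python sketch of how a simple mm value (a plain integer or a percentage) turns into a required number of matching clauses. It follows the rounding described in the Solr reference guide for the dismax parser, but the function name is made up for this example, and conditional mm expressions such as "2<75%" are not handled.

```python
def min_should_match(mm: str, optional_clauses: int) -> int:
    """Required number of matching clauses for a simple mm spec (sketch only)."""
    spec = mm.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are rounded down. A negative percentage states how many
        # clauses may be missing rather than how many must match.
        share = optional_clauses * abs(percent) // 100
        required = share if percent >= 0 else optional_clauses - share
    else:
        count = int(spec)
        required = count if count >= 0 else optional_clauses + count
    # Keep the result within 0..optional_clauses.
    return max(0, min(required, optional_clauses))


for spec in ("0", "1", "50%", "100%", "-25%"):
    print(spec, "->", min_should_match(spec, 4), "of 4 clauses")
```

With mm=100% every clause is required, so a single query term that fails to match is enough to hide a document.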
Re: Porter Stem filter and employing
Following Erick's idea, I started to look at fields and queries other than the title field itself. I switched to the normal request handler (/select) and started adding parameters to see whether any of the parameters in my query causes this problem.

I discovered that in my customized RequestHandler I'm using defType=edismax and mm=100% (and other params); when I remove the mm param, I get the documents.

I have been looking for information about this parameter and have only found one page in the Solr guide:
https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html

Is there any other documentation that can help me understand how this parameter works? I don't want to break all the searches by removing it.

Thanks for all your help

On Mon, 4 Mar 2019 at 17:11, Erick Erickson wrote:

> [...] Now, if your search is "q=employ” the actual search will be against
> your default field, as though you had entered “q=text:employ”. [...]
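The trial-and-error approach described in this message can be sketched in Python: start from the handler's parameter set and build one request URL per parameter, each with that one parameter left out, then compare the hit counts by hand. The host, collection name, and query text are placeholders; only defType and mm come from this thread.

```python
from urllib.parse import urlencode

# Placeholder base URL: host and collection name are assumptions for this sketch.
BASE = "http://localhost:8983/solr/mycollection/select"

params = {"q": "employing carer", "defType": "edismax", "mm": "100%"}


def variants_without_each(params: dict, keep=("q",)) -> dict:
    """Map each removable parameter name to a URL that omits just that parameter."""
    urls = {}
    for name in params:
        if name in keep:
            continue
        reduced = {k: v for k, v in params.items() if k != name}
        urls[name] = BASE + "?" + urlencode(reduced)
    return urls


for dropped, url in variants_without_each(params).items():
    print("without", dropped + ":", url)
```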
Re: Porter Stem filter and employing
First, if you _changed_ the analysis chain without re-indexing all documents, that could account for it.

Second, the analysis page is a little tricky. It _assumes_ that the words you put in the boxes are going into the field you select. So let’s say you have a field “title” that has stemming turned on. Let’s further assume your default search field is “text” (this is configured in solrconfig.xml, the “df” parameter in your request handler).

Now, if your search is “q=employ”, the actual search will be against your default field, as though you had entered “q=text:employ”. This is a common problem. Adding “&debug=query” to the search and looking at parsedquery_toString in the response will show you what the query parsing actually produced, and may help.

Best,
Erick

> On Mar 4, 2019, at 3:13 AM, Marisol Redondo wrote:
>
> Thank you for the answer and heading me to this solution. But I've already
> used this filter for index analysis and I'm not getting any result. [...]
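To see which field a query really runs against, the debug output can also be read programmatically. This Python sketch only builds the request URL and pulls the parsed query out of an already-decoded JSON response. The base URL is a placeholder, and the sample response dict is hand-written for illustration, not real Solr output.

```python
from urllib.parse import urlencode


def debug_url(base: str, query: str) -> str:
    """Build a select URL that asks Solr to include query-parsing debug info."""
    return base + "?" + urlencode({"q": query, "debug": "query", "wt": "json"})


def parsed_query(response: dict) -> str:
    """Return the parsedquery_toString entry from a decoded response, or ''."""
    return response.get("debug", {}).get("parsedquery_toString", "")


# Placeholder host and collection name.
print(debug_url("http://localhost:8983/solr/mycollection/select", "employ"))

# Hand-written stand-in for a response, shaped like the debug section.
sample = {"debug": {"parsedquery_toString": "text:employ"}}
print(parsed_query(sample))
```

If the field shown in the parsed query is not the one that was tested on the analysis page, the analysis page results do not apply to that search.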
Re: Porter Stem filter and employing
Thank you for the answer and for pointing me to this solution. But I already use this filter for index analysis and I'm still not getting any results, and I don't understand why.

So maybe the problem is something else? But I don't see what it could be, because when I use the Analysis tool I get the same result for index and query (the input to this filter was "employing carer"):

*PSF (Index)*

text              emploi                 carer
raw_bytes         [65 6d 70 6c 6f 69]    [63 61 72 65 72]
start             0                      12
end               9                      17
positionLength    1                      1
type
position          1                      3
keyword           FALSE                  FALSE

*PSF (query)*

text              emploi                 carer
raw_bytes         [65 6d 70 6c 6f 69]    [63 61 72 65 72]
start             0                      12
end               9                      17
positionLength    1                      1
type
position          1                      3
keyword           FALSE                  FALSE

On Fri, 1 Mar 2019 at 15:51, Shawn Heisey wrote:

> When you are using a stemming filter, you will need to use the same
> filter on both the query analysis and the index analysis, so that
> similar words are stemmed to the same root in both cases, leading to
> matches. [...]
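The raw_bytes values in the analysis output above can be checked independently of Solr. A short Python snippet that decodes the hex values shows the tokens that both the index side and the query side hold:

```python
# Hex values copied from the raw_bytes rows of the analysis output.
raw_tokens = ["65 6d 70 6c 6f 69", "63 61 72 65 72"]

# bytes.fromhex ignores the spaces between the byte pairs.
decoded = [bytes.fromhex(hex_bytes).decode("utf-8") for hex_bytes in raw_tokens]
print(decoded)  # ['emploi', 'carer']
```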
Re: Porter Stem filter and employing
On 3/1/2019 4:38 AM, Marisol Redondo wrote:
> When using the PorterStemFilter, I saw that the word "employing" is changed
> to "emploi" and my document is not found in the query to Solr because of
> that.
>
> This also happens with other words that end in -ying, such as annoying or
> deploying.
>
> Is there any patch for this filter or should I create a new Jira issue?

When you are using a stemming filter, you will need to use the same filter on both the query analysis and the index analysis, so that similar words are stemmed to the same root in both cases, leading to matches.

If the other steps in your analysis chain are changing the words so that the stemming filter cannot recognize the word, that might also cause problems.

Thanks,
Shawn
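To illustrate why an odd-looking stem is harmless when both sides are analyzed the same way, here is a deliberately tiny Python stand-in for a stemmer. It implements only two simplified rules in the spirit of the Porter algorithm (drop a trailing "ing", then turn a final "y" into "i" when the rest of the word contains a vowel). It is not the PorterStemFilter and will mis-stem most other words.

```python
VOWELS = set("aeiou")


def has_vowel(text: str) -> bool:
    return any(ch in VOWELS for ch in text)


def toy_stem(word: str) -> str:
    """Two simplified Porter-like rules; an illustration, not a real stemmer."""
    w = word.lower()
    if w.endswith("ing") and has_vowel(w[:-3]):
        w = w[:-3]            # employing -> employ
    if w.endswith("y") and has_vowel(w[:-1]):
        w = w[:-1] + "i"      # employ -> emploi
    return w


def matches(indexed_word: str, query_word: str) -> bool:
    """A match happens when both sides reduce to the same token."""
    return toy_stem(indexed_word) == toy_stem(query_word)


print(toy_stem("employing"), toy_stem("employ"))
print(matches("employing", "employ"))
```

The stem "emploi" is never shown to users; it only has to be identical at index time and at query time.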
Porter Stem filter and employing
Hi.

When using the PorterStemFilter, I saw that the word "employing" is changed to "emploi", and my document is not found by the query to Solr because of that.

This also happens with other words that end in -ying, such as annoying or deploying.

Is there any patch for this filter, or should I create a new Jira issue?

Thanks