[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894718#comment-16894718 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

Okay, thank you [~tomoko]

> Add SpanishMinimalStemFilter
> ----------------------------
>
>                 Key: LUCENE-8936
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8936
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: vinod kumar
>            Assignee: Tomoko Uchida
>            Priority: Major
>             Fix For: master (9.0), 8.3
>
>         Attachments: LUCENE-8936.patch, LUCENE-8936.patch
>
>
> SpanishMinimalStemmerFilter is a less aggressive stemmer than
> SpanishLightStemmerFilter.
> Ex:
> input tokens -> output tokens
> 1. camiseta niños -> *camiseta* and *nino*
> 2. camisas -> camisa
> *camisetas* and *camisas* are t-shirts and shirts respectively.
> Stemming both tokens to *camis* makes them match each other, so a query for
> camisas (shirts) returns both t-shirts and shirts.
> SpanishMinimalStemmerFilter helps handle these cases.
> Importantly, it also preserves the gender of tokens.
> Ex: *niños*, *niñas*, *chicos* and *chicas* are stemmed to *nino*, *nina*,
> *chico* and *chica*.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
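The behaviour described in the issue quoted above can be illustrated with a small, self-contained Java sketch. This is not the stemmer from the attached patch: the class `ToyMinimalStemmer` and its methods `fold` and `stem` are hypothetical names, and the single rule used here (fold diacritics, then drop one trailing "s" after a, e or o) is an assumption chosen only because it reproduces the input/output pairs listed in the description.

```java
import java.text.Normalizer;

/** Toy illustration only. NOT the SpanishMinimalStemmer from LUCENE-8936.patch. */
final class ToyMinimalStemmer {

  /** Folds diacritics (n with tilde becomes n) by decomposing and dropping combining marks. */
  static String fold(String token) {
    String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
    return decomposed.replaceAll("\\p{M}", "");
  }

  /** Assumed rule: strip one trailing 's' that follows a, e or o; leave short tokens alone. */
  static String stem(String token) {
    String t = fold(token);
    int n = t.length();
    if (n >= 4 && t.charAt(n - 1) == 's' && "aeo".indexOf(t.charAt(n - 2)) >= 0) {
      return t.substring(0, n - 1);
    }
    return t;
  }

  public static void main(String[] args) {
    // U+00F1 (n with tilde) is written as an escape to avoid source-encoding issues.
    String[] tokens = {"camisetas", "camisas", "ni\u00f1os", "ni\u00f1as", "chicos", "chicas"};
    for (String token : tokens) {
      System.out.println(token + " -> " + stem(token));
    }
  }
}
```

Because only the plural marker is removed, "camisetas" and "camisas" keep distinct stems ("camiseta" and "camisa"), and the final a/o that carries gender is preserved. A more aggressive stemmer that reduced both words to "camis" would conflate them, which is the problem the issue describes.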
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894716#comment-16894716 ]

ASF subversion and git services commented on LUCENE-8936:
---------------------------------------------------------

Commit a229e711cabb6027eecd06e2c9ec92002d2b6949 in lucene-solr's branch refs/heads/branch_8x from vinod kumar
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a229e71 ]

LUCENE-8936: Add SpanishMinimalStemFilter

Signed-off-by: Tomoko Uchida
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894715#comment-16894715 ]

ASF subversion and git services commented on LUCENE-8936:
---------------------------------------------------------

Commit 8c8d8abddc9f5f8c92943e50d6169882e7188c44 in lucene-solr's branch refs/heads/master from vinod kumar
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8c8d8ab ]

LUCENE-8936: Add SpanishMinimalStemFilter

Signed-off-by: Tomoko Uchida
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894713#comment-16894713 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

Hi [~tomoko]

vinod.nandikolm...@yahoo.com
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894708#comment-16894708 ]

Tomoko Uchida commented on LUCENE-8936:
---------------------------------------

Hi [~vinod1812],

could you tell me your e-mail address so that I can credit you, with name and e-mail, as the author of the commit? (I cannot find it on the mailing list or in Jira.)
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894528#comment-16894528 ]

Lucene/Solr QA commented on LUCENE-8936:
----------------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | compile | 1m 25s | master passed |
|| || || || Patch Compile Tests ||
| +1 | compile | 1m 16s | the patch passed |
| +1 | javac | 1m 16s | the patch passed |
| +1 | Release audit (RAT) | 1m 16s | the patch passed |
| +1 | Check forbidden APIs | 1m 16s | the patch passed |
| +1 | Validate source patterns | 1m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 7m 48s | common in the patch passed. |
| | | 13m 4s | |

|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8936 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12976054/LUCENE-8936.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 4050ddc |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/199/testReport/ |
| modules | C: lucene lucene/analysis/common U: lucene |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/199/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894454#comment-16894454 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

okay.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894421#comment-16894421 ]

Tomoko Uchida commented on LUCENE-8936:
---------------------------------------

+1 to the patch. Let us wait one day or so, then commit the changes to the master and 8.x branches.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894406#comment-16894406 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

[~tomoko] Thank you. Have removed it. Thanks for your time.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894402#comment-16894402 ]

Tomoko Uchida commented on LUCENE-8936:
---------------------------------------

[~vinod1812]: I noticed your name is credited in the {{SpanishMinimalStemmer}} Javadocs. Lucene/Solr source code does not carry {{@author}} tags or the names of people who donated code. Credits appear only in the commit log and CHANGES. Can you please remove it?
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894394#comment-16894394 ]

Tomoko Uchida commented on LUCENE-8936:
---------------------------------------

Hi [~vinod1812],

the patch looks fine. I cannot actually review the {{SpanishMinimalStemmer}} class (I don't understand Spanish), but the other parts look okay to me. It also passed {{ant precommit}} (thanks!). I will commit it after waiting 24 hours if there are no other comments.

About the GitHub PR: only Lucene/Solr committers have write permission to the apache/lucene-solr repo, so you have to fork the repo and open a pull request from your fork. This time a patch has already been provided, so you do not need to do so.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894346#comment-16894346 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

[~atris] GitHub permission to apache/lucene-solr.git is denied for me. Can you also suggest how I can get permission? Thanks for your time.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894340#comment-16894340 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

Thank you [~atris]. Attached the patch file [^LUCENE-8936.patch]
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894336#comment-16894336 ]

Atri Sharma commented on LUCENE-8936:
-------------------------------------

Hello Vinod! Welcome to the community.

Thank you for your contribution. I would suggest following either of two approaches:

1) Attach a patch to this JIRA, or
2) Open a pull request on the Lucene-Solr GitHub repository.

Somebody will review your contribution soon and provide feedback.
[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter
[ https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894276#comment-16894276 ]

vinod kumar commented on LUCENE-8936:
-------------------------------------

[~atris] Can you please help me with this? I have done all the development, but access is denied when I try to raise a pull request.