Re: Welcome Tomoko Uchida as Lucene/Solr committer
Congratulations Tomoko! On Tuesday, April 9, 2019, 8:48:03 PM GMT+3, Robert Muir wrote: Welcome! On Mon, Apr 8, 2019 at 11:21 AM Uwe Schindler wrote: > > Hi all, > > Please join me in welcoming Tomoko Uchida as the latest Lucene/Solr committer! > > She has been working on https://issues.apache.org/jira/browse/LUCENE-2562 for > several years with awesome progress and finally we got the fantastic Luke as > a branch on ASF JIRA: > https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/jira/lucene-2562-luke-swing-3 > Looking forward to the first release of Apache Lucene 8.1 with Luke bundled > in the distribution. I will take care of merging it to master and 8.x > branches together with her once she got the ASF account. > > Tomoko also helped with the Japanese and Korean Analyzers. > > Congratulations and Welcome, Tomoko! Tomoko, it's traditional for you to > introduce yourself with a brief bio. > > Uwe & Robert (who nominated Tomoko) > > - > Uwe Schindler > Achterdiek 19, D-28357 Bremen > https://www.thetaphi.de > eMail: u...@thetaphi.de > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Gus Heck as Lucene/Solr committer
Congratulations! On Friday, November 2, 2018, 7:13:35 PM GMT+3, Varun Thacker wrote: Congratulations and welcome Gus! On Thu, Nov 1, 2018 at 5:22 AM David Smiley wrote: Hi all, Please join me in welcoming Gus Heck as the latest Lucene/Solr committer! Congratulations and Welcome, Gus! Gus, it's traditional for you to introduce yourself with a brief bio. ~ David -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
Re: Welcome Tim Allison as a Lucene/Solr committer
Congratulations! On Saturday, November 3, 2018, 1:43:31 AM GMT+3, Nhat Nguyen wrote: Welcome Tim! On Fri, Nov 2, 2018 at 6:33 PM Tommaso Teofili wrote: Welcome Tim!!! Tommaso On Fri, Nov 2, 2018 at 22:30, Steve Rowe wrote: > > Welcome Tim! > > Steve > > On Fri, Nov 2, 2018 at 12:20 PM Erick Erickson > wrote: >> >> Hi all, >> >> Please join me in welcoming Tim Allison as the latest Lucene/Solr committer! >> >> Congratulations and Welcome, Tim! >> >> It's traditional for you to introduce yourself with a brief bio. >> >> Erick >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >>
Re: Welcome Nhat Nguyen as Lucene/Solr committer
Congratulations and Welcome! On Tuesday, June 19, 2018, 7:20:48 PM GMT+3, Jason Gerlowski wrote: Welcome Nhat! On Tue, Jun 19, 2018 at 10:10 AM, Varun Thacker wrote: > Congratulations and welcome Nhat! > > On Tue, Jun 19, 2018 at 10:16 AM, Alan Woodward wrote: >> Welcome Nhat! >> >> >>> On 18 Jun 2018, at 21:41, Adrien Grand wrote: >>> >>> Hi all, >>> >>> Please join me in welcoming Nhat Nguyen as the latest Lucene/Solr committer. >>> Nhat, it's tradition for you to introduce yourself with a brief bio. >>> >>> Congratulations and Welcome! >>> >>> Adrien >>> >> > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene/Solr 7.3.1 RC2
+1 SUCCESS! [1:19:56.690027] Ahmet On Saturday, May 12, 2018, 12:41:16 AM GMT+3, Michael McCandless wrote: +1 SUCCESS! [0:40:57.887333] Mike McCandless http://blog.mikemccandless.com On Fri, May 11, 2018 at 1:09 PM, Adrien Grand wrote: > +1 > SUCCESS! [1:33:37.370199] > > On Wed, May 9, 2018 at 16:59, Mark Miller wrote: >> Even before I saw that comment, I was thinking poor Alan... >> >> - Mark >> >> >> On Wed, May 9, 2018 at 7:31 AM Alan Woodward wrote: >>> +1 >>> SUCCESS! [3:10:43.862442] >>> >>> My internet has been really very slow today... >>> >>> On Wed, May 9, 2018 at 5:50 AM, Đạt Cao Mạnh >>> wrote: Please vote for release candidate 2 for Lucene/Solr 7.3.1 The artifact can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC2-revae0705edb59eaa567fe13ed3a222fdadc7153680/ You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC2-revae0705edb59eaa567fe13ed3a222fdadc7153680 Here’s my +1 SUCCESS! [0:53:47.443795] >>> >>> >> -- >> - Mark >> about.me/markrmiller >> >
Re: [VOTE] Release Lucene/Solr 7.3.1 RC1
+1 SUCCESS! [1:15:16.705804] Ahmet On Wednesday, May 2, 2018, 9:55:04 PM GMT+3, David Smiley wrote: +1 SUCCESS! [1:04:51.914445] On Wed, May 2, 2018 at 12:32 PM Michael McCandless wrote: > +1 > > SUCCESS! [0:49:04.927108] > > Mike McCandless > > http://blog.mikemccandless.com > > On Wed, May 2, 2018 at 6:40 AM, Đạt Cao Mạnh wrote: >> Please vote for release candidate 1 for Lucene/Solr 7.3.1 >> >> The artifacts can be downloaded from: >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC1-rev8fa7687413558b3bc65cbbbeb722a21314187e6a >> >> You can run the smoke tester directly with this command: >> >> python3 -u dev-tools/scripts/smokeTestRelease.py \ >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC1-rev8fa7687413558b3bc65cbbbeb722a21314187e6a >> >> Here's my +1 >> SUCCESS! [0:52:14.381028] >> > -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
Re: Welcome Jason Gerlowski as committer
Congratulations and welcome Jason! On Friday, February 9, 2018, 11:58:06 AM GMT+3, Alan Woodward wrote: Welcome Jason! > On 8 Feb 2018, at 17:02, David Smiley wrote: > > Hello everyone, > > It's my pleasure to announce that Jason Gerlowski is our latest committer for > Lucene/Solr in recognition for his contributions to the project! Please join > me in welcoming him. Jason, it's tradition for you to introduce yourself > with a brief bio. > > Congratulations and Welcome! > -- > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker > LinkedIn: http://linkedin.com/in/davidwsmiley | Book: > http://www.solrenterprisesearchserver.com >
[jira] [Commented] (SOLR-10308) Solr fails to work with Guava 21.0
[ https://issues.apache.org/jira/browse/SOLR-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357783#comment-16357783 ]

Ahmet Arslan commented on SOLR-10308:

{quote}I don't know if it was correct to assume UTF-8 on the hashString usages{quote}

I believe it should be as follows. I confirmed this earlier in SOLR-11260:

* HashFunction.hashString -> HashFunction.hashUnencodedChars

> Solr fails to work with Guava 21.0
>
> Key: SOLR-10308
> URL: https://issues.apache.org/jira/browse/SOLR-10308
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 6.4.2
> Reporter: Vincent Massol
> Priority: Major
> Attachments: SOLR-10308.patch
>
> This is what we get:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: com.google.common.base.Objects.firstNonNull(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
>     at org.apache.solr.handler.component.HighlightComponent.prepare(HighlightComponent.java:118)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2299)
>     at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:178)
>     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>     at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
>     at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:957)
>     at org.xwiki.search.solr.internal.AbstractSolrInstance.query(AbstractSolrInstance.java:117)
>     at org.xwiki.query.solr.internal.SolrQueryExecutor.execute(SolrQueryExecutor.java:122)
>     at org.xwiki.query.internal.DefaultQueryExecutorManager.execute(DefaultQueryExecutorManager.java:72)
>     at org.xwiki.query.internal.SecureQueryExecutorManager.execute(SecureQueryExecutorManager.java:67)
>     at org.xwiki.query.internal.DefaultQuery.execute(DefaultQuery.java:287)
>     at org.xwiki.query.internal.ScriptQuery.execute(ScriptQuery.java:237)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.velocity.util.introspection.UberspectImpl$VelMethodImpl.doInvoke(UberspectImpl.java:395)
>     at org.apache.velocity.util.introspection.UberspectImpl$VelMethodImpl.invoke(UberspectImpl.java:384)
>     at org.apache.velocity.runtime.parser.node.ASTMethod.execute(ASTMethod.java:173)
>     ... 183 more
> {noformat}
> Guava 21 has removed some signatures that Solr is currently using.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
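The remark above about hashString and UTF-8 can be illustrated without Guava. As far as I understand the Guava documentation, hashUnencodedChars performs no character encoding and feeds the low byte and then the high byte of each char to the hash, whereas hashing with an explicit UTF-8 charset feeds the encoded bytes. The sketch below only shows that the two byte sequences differ; the class and method names are hypothetical and are not part of Guava or Solr.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class CharBytesDemo {
    private CharBytesDemo() {}

    /**
     * Returns the bytes of the input with no character encoding applied:
     * for every char, the low byte followed by the high byte.
     * Hypothetical helper for illustration only.
     */
    public static byte[] unencodedCharBytes(CharSequence input) {
        byte[] out = new byte[input.length() * 2];
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            out[2 * i] = (byte) c;             // low byte
            out[2 * i + 1] = (byte) (c >>> 8); // high byte
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "ab";
        // Two bytes per char, no encoding step.
        System.out.println(Arrays.toString(unencodedCharBytes(s)));
        // One byte per ASCII char after UTF-8 encoding.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the inputs to the hash differ, switching a call site from the unencoded form to a UTF-8 form would generally change the resulting hash values, which is presumably why the one-to-one replacement matters.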
[jira] [Commented] (SOLR-10308) Solr fails to work with Guava 21.0
[ https://issues.apache.org/jira/browse/SOLR-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327479#comment-16327479 ]

Ahmet Arslan commented on SOLR-10308:

Unfortunately, the patch affects the following hdfs test cases.

{noformat}
[junit4] HEARTBEAT J1 PID(26866@...): 2018-01-16T19:27:40, stalled for 7988s at: HdfsNNFailoverTest (suite)
[junit4] HEARTBEAT J2 PID(26860@...): 2018-01-16T19:27:40, stalled for 8591s at: StressHdfsTest (suite)
[junit4] HEARTBEAT J0 PID(26869@...): 2018-01-16T19:28:04, stalled for 7141s at: MoveReplicaHDFSFailoverTest (suite)
[junit4] HEARTBEAT J3 PID(26880@...): 2018-01-16T19:28:04, stalled for 8352s at: CheckHdfsIndexTest (suite)
[junit4] HEARTBEAT J1 PID(26866@...): 2018-01-16T19:28:40, stalled for 8048s at: HdfsNNFailoverTest (suite)
[junit4] HEARTBEAT J2 PID(26860@...): 2018-01-16T19:28:40, stalled for 8651s at: StressHdfsTest (suite)
[junit4] HEARTBEAT J0 PID(26869@...): 2018-01-16T19:29:04, stalled for 7201s at: MoveReplicaHDFSFailoverTest (suite)
[junit4] HEARTBEAT J3 PID(26880@...): 2018-01-16T19:29:04, stalled for 8412s at: CheckHdfsIndexTest (suite)
{noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SOLR-10308) Solr fails to work with Guava 21.0
[ https://issues.apache.org/jira/browse/SOLR-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327071#comment-16327071 ]

Ahmet Arslan commented on SOLR-10308:

{quote}relocate the version of guava used{quote}

[~mdrob] can you please provide some context/pointers? I am not familiar with the relocate concept.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SOLR-10308) Solr fails to work with Guava 21.0
[ https://issues.apache.org/jira/browse/SOLR-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327068#comment-16327068 ]

Ahmet Arslan commented on SOLR-10308:

{{ant precommit}} passed. I am running {{ant test}} now, but it looks like it has hung:

{noformat}
[junit4] HEARTBEAT J0 PID(2348@...): 2018-01-16T15:19:55, stalled for 28263s at: HdfsDirectoryTest (suite)
[junit4] HEARTBEAT J0 PID(2348@...): 2018-01-16T15:20:55, stalled for 28323s at: HdfsDirectoryTest (suite)
[junit4] HEARTBEAT J0 PID(2348@...): 2018-01-16T15:21:55, stalled for 28383s at: HdfsDirectoryTest (suite)
{noformat}

I am not sure whether this is due to the Guava update. I will try to figure it out.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SOLR-10308) Solr fails to work with Guava 21.0
[ https://issues.apache.org/jira/browse/SOLR-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326657#comment-16326657 ]

Ahmet Arslan commented on SOLR-10308:

{quote}Fixing the three guava usages in Solr code that are incompatible with version 21 should be pretty easy{quote}

I have a patch in SOLR-11260 for this. I needed it for another reason: to be able to use a third-party NLP library for my [Turkish analysis plugin|https://github.com/iorixxx/lucene-solr-analysis-turkish]. A drop-in upgrade of Guava breaks highlighting. I patched Solr 6.6.0 and am using it in a production-like environment.

Does this patch solve [~vmassol]'s problem? Does it break any existing functionality?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] Release Lucene/Solr 7.2.1 RC1
+1 SUCCESS! [4:00:41.664562] Ahmet On Saturday, January 13, 2018, 7:06:47 PM GMT+3, Kevin Risden wrote: Ishan - Try docker run -it openjdk:9-jdk. java was replaced with openjdk. java:9-jdk has version 9b149 whereas openjdk:9-jdk has version 9.0.1-11. This should have been fixed before Java 9 GA. https://github.com/docker-library/openjdk/issues/101 Kevin Risden On Sat, Jan 13, 2018 at 6:09 AM, Ishan Chattopadhyaya wrote: This also happens with 7.2.0 and 7.1.0. Could be something to do with the official Java image. Nothing that stops the RC, I think. On Sat, Jan 13, 2018 at 5:11 PM, Ishan Chattopadhyaya wrote: I spun up a docker container with Java 9 (java:9-jdk) from docker hub [0]. Downloaded the Solr 7.2.1 RC1 tarball and unzipped it. Tried to start it, but it failed citing some crypto issue: https://gist.github.com/anonymous/ed1a179b1043190b5f6fd635c6a47f23 I'm trying out the same for 7.2.0 and earlier versions to see if this is a recent regression. [0] - docker run -it java:9-jdk On Wed, Jan 10, 2018 at 11:04 PM, Adrien Grand wrote: +1 SUCCESS! [1:29:47.999770] On Wed, Jan 10, 2018 at 18:03, Tomas Fernandez Lobbe wrote: +1 SUCCESS! [1:04:34.912689] On Jan 10, 2018, at 8:01 AM, Alan Woodward wrote: +1 SUCCESS! [1:43:16.772919] I need to get a new test machine... On 10 Jan 2018, at 09:51, Dawid Weiss wrote: +1 SUCCESS! [1:31:30.029815] Dawid On Wed, Jan 10, 2018 at 10:46 AM, Shalin Shekhar Mangar wrote: +1 SUCCESS! [1:13:22.042124] On Wed, Jan 10, 2018 at 8:00 AM, jim ferenczi wrote: Please vote for release candidate 1 for Lucene/Solr 7.2.1 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.2.1-RC1-revb2b6438b37073bee1fca40374e85bf91aa457c0b You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.2.1-RC1-revb2b6438b37073bee1fca40374e85bf91aa457c0b Here's my +1 SUCCESS! [0:38:10.689623] -- Regards, Shalin Shekhar Mangar.
Re: Welcome Ignacio Vera as Lucene/Solr committer
Congratulations Ignacio! Ahmet On Thursday, January 11, 2018, 9:43:50 PM GMT+3, Martin Gainty wrote: Welcome, Ignacio! Martín __ From: Erick Erickson Sent: Thursday, January 11, 2018 12:39 PM To: dev@lucene.apache.org Subject: Re: Welcome Ignacio Vera as Lucene/Solr committer Welcome Ignacio! On Thu, Jan 11, 2018 at 9:09 AM, Karl Wright wrote: > > Welcome, Ignacio! > Karl > > On Thu, Jan 11, 2018 at 11:46 AM, Steve Rowe wrote: > >> Congrats and welcome Ignacio! >> >> -- >> Steve >> www.lucidworks.com >> >> >> >>> On Jan 11, 2018, at 11:35 AM, Adrien Grand wrote: >>> >>> Hi all, >>> >>> Please join me in welcoming Ignacio Vera as the latest Lucene/Solr >>> committer. >>> Ignacio, it's tradition for you to introduce yourself with a brief bio. >>> >>> Congratulations and Welcome! >> >> >> -- >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >
[jira] [Assigned] (SOLR-11260) Update Guava to 23.0
[ https://issues.apache.org/jira/browse/SOLR-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan reassigned SOLR-11260: --- Assignee: Ahmet Arslan > Update Guava to 23.0 > > > Key: SOLR-11260 > URL: https://issues.apache.org/jira/browse/SOLR-11260 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6 >Reporter: Ahmet Arslan > Assignee: Ahmet Arslan >Priority: Minor > Fix For: master (8.0) > > Attachments: SOLR-11260.patch > > > Solr 6.6.0 depends on a pretty old version of guava. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Karl Wright to the PMC
Congratulations Karl! Ahmet On Thursday, December 28, 2017, 7:32:41 PM GMT+3, Steve Rowe wrote: Congrats and welcome Karl! -- Steve www.lucidworks.com > On Dec 28, 2017, at 9:08 AM, Adrien Grand wrote: > > I am pleased to announce that Karl Wright has accepted the PMC's invitation > to join. > > Welcome Karl!
Re: Welcome Dennis Gove to the PMC
Congratulations Dennis! Ahmet On Wednesday, December 27, 2017, 7:56:58 PM GMT+3, Dawid Weiss wrote: Congratulations Dennis! Dawid On Wed, Dec 27, 2017 at 5:37 PM, Anshum Gupta wrote: > Congratulations and welcome Dennis! > > On Wed, Dec 27, 2017 at 4:59 PM Steve Rowe wrote: >> >> Congrats and welcome Dennis! >> >> -- >> Steve >> www.lucidworks.com >> >> > On Dec 26, 2017, at 8:12 AM, Joel Bernstein wrote: >> > >> > I am pleased to announce that Dennis Gove has accepted the PMC's >> > invitation to join. >> > >> > Welcome Dennis! >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >
Re: Welcome Ahmet Arslan as Lucene/Solr committer
Hi, Thanks to all for the warm welcome. It is such an honor to be invited by the PMC. I am an Assistant Professor in the Department of Computer Engineering at Anadolu University, Turkey. My current research interests include selective information retrieval and index term weighting. I started using Lucene during my master's studies for academic purposes. Later on, I worked on a number of commercial search projects using Apache Lucene/Solr. I am very proud of being part of this team! Thanks, Ahmet On Monday, December 18, 2017, 4:42:34 PM GMT+3, Steve Rowe <sar...@gmail.com> wrote: Congrats and welcome Ahmet! -- Steve www.lucidworks.com > On Dec 17, 2017, at 5:15 AM, Adrien Grand <jpou...@gmail.com> wrote: > > Hi all, > > Please join me in welcoming Ahmet Arslan as the latest Lucene/Solr committer. > Ahmet, it's tradition for you to introduce yourself with a brief bio. > > Congratulations and Welcome! > > Adrien
[jira] [Updated] (SOLR-11260) Update Guava to 23.0
[ https://issues.apache.org/jira/browse/SOLR-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-11260: Attachment: SOLR-11260.patch Patch replaces two methods that are removed in Guava 23.0 * Objects.firstNonNull -> MoreObjects.firstNonNull * HashFunction.hashString -> HashFunction.hashUnencodedChars > Update Guava to 23.0 > > > Key: SOLR-11260 > URL: https://issues.apache.org/jira/browse/SOLR-11260 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 6.6 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: master (8.0) > > Attachments: SOLR-11260.patch > > > Solr 6.6.0 depends on a pretty old version of guava. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
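For readers who want to see what the first replacement in the patch amounts to: Guava's MoreObjects.firstNonNull returns the first argument if it is non-null, otherwise the second, and throws a NullPointerException if both are null. A minimal Guava-free sketch of the same contract is shown below. It is not the code from SOLR-11260.patch, and the class name is hypothetical.

```java
public final class NullDefaults {
    private NullDefaults() {}

    /**
     * Returns the first argument if it is non-null, otherwise the second.
     * Throws NullPointerException when both are null, mirroring the
     * documented contract of Guava's MoreObjects.firstNonNull.
     */
    public static <T> T firstNonNull(T first, T second) {
        if (first != null) {
            return first;
        }
        if (second != null) {
            return second;
        }
        throw new NullPointerException("Both parameters are null");
    }

    public static void main(String[] args) {
        String configured = null; // e.g. an optional request parameter that was not set
        System.out.println(firstNonNull(configured, "default"));
    }
}
```

Calling code that only needs this behaviour could use such a helper to stay independent of which Guava version is on the classpath; the patch itself keeps Guava and switches to the method in MoreObjects.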
[jira] [Created] (SOLR-11260) Update Guava to 23.0
Ahmet Arslan created SOLR-11260: --- Summary: Update Guava to 23.0 Key: SOLR-11260 URL: https://issues.apache.org/jira/browse/SOLR-11260 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 6.6 Reporter: Ahmet Arslan Priority: Minor Fix For: master (8.0) Solr 6.6.0 depends on a pretty old version of guava. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
lucene 6.6.0 download link redirects to 6.5.1
Hi, The Lucene download page redirects to http://www-eu.apache.org/dist/lucene/java/6.5.1 for me. Solr's link is correct. Ahmet
TestUAX29URLEmailTokenizer inconsistent adding dots and apostrophes to URLs and Emails
Hi,

I extracted e-mails and URLs from certain TREC collections using TestUAX29URLEmailTokenizer combined with TypeTokenFilter. High-frequency terms reveal that:
* some e-mail addresses start with apostrophes
* some e-mails or URLs end with a period.

I ran a few tests, and this behaviour occurs only if the entity is the first or last term in the text. If the entity is in the middle of the text, UAXURLET strips apostrophes and dots. For example, "Contact me at java-u...@lucene.apache.org. or dev@lucene.apache.org." will produce

java-u...@lucene.apache.org.
dev@lucene.apache.org

Notice that the first e-mail has a dot, while the second does not. Why does UAXURLET behave differently for the first/last token? Could this be a bug? It looks like dots and apostrophes are legal parts of the entities, but as a consequence abbreviations such as W.Va. D-W.Va. v.ye. are recognized as URLs. I created 8 test cases to get your opinions on this before creating a Jira issue.

    public void testURLEndingWithDot2() throws IOException {
      BaseTokenStreamTestCase.assertAnalyzesTo(a, "My Web addresses are www.apache.org. and lucene.apache.org",
          new String[] {"My","Web","addresses", "are","www.apache.org","and","lucene.apache.org"},
          new String[] {"","","","","","",""});
    }

    public void testEMailStartingWithApostrophe2() throws IOException {
      BaseTokenStreamTestCase.assertAnalyzesTo(a, "'g...@usgs.gov 'cber_i...@a1.cber.fda.gov.",
          new String[] {"g...@usgs.gov","cber_i...@a1.cber.fda.gov"},
          new String[] {"","","",""});
    }

P.S. I observed a somewhat similar phenomenon with the ICU tokenizer. It sets the script attribute to Latin for words that consist of numbers. But if the whole text is composed of words that consist of numbers, the script attribute is set to Common.

Thanks,
Ahmet
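To make the expectation in the test cases above concrete, here is a minimal sketch in plain Java with no Lucene dependency. It only illustrates the normalization the tests expect (no leading apostrophes, no trailing periods). It is not how UAX29URLEmailTokenizer works internally, and `EntityTrimSketch` is a hypothetical helper, not part of Lucene.

```java
final class EntityTrimSketch {
    private EntityTrimSketch() {}

    /** Strips leading apostrophes and trailing periods from a single token. */
    static String trim(String token) {
        int start = 0;
        int end = token.length();
        // Skip apostrophes at the front of the token.
        while (start < end && token.charAt(start) == '\'') {
            start++;
        }
        // Drop periods at the end of the token; inner dots are kept.
        while (end > start && token.charAt(end - 1) == '.') {
            end--;
        }
        return token.substring(start, end);
    }
}
```

With such a rule, "www.apache.org." and "www.apache.org" would map to the same term regardless of where they appear in the text. It would also shorten abbreviations such as "W.Va.", so it is only a way to state the expected output, not a proposed fix.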
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch Fix TestStopFilterFactory and TestSuggestStopFilterFactory failure > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: 7.0 > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7773) Remove unused/deprecated token types from StandardTokenizer
[ https://issues.apache.org/jira/browse/LUCENE-7773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072994#comment-16072994 ] Ahmet Arslan commented on LUCENE-7773: -- Can someone please look into this issue? This issue addresses a [TODO|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/analysis/standard/StandardTokenizer.java#L43] introduced by [~rcmuir] in [this|https://github.com/apache/lucene-solr/commit/bc3a3dc5d47af0c00748468b1ae14b4a18854366] commit. > Remove unused/deprecated token types from StandardTokenizer > --- > > Key: LUCENE-7773 > URL: https://issues.apache.org/jira/browse/LUCENE-7773 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.5 >Reporter: Ahmet Arslan >Priority: Minor > Labels: analyzers > Fix For: 7.0 > > Attachments: LUCENE-7773.patch, LUCENE-7773.patch > > > StandardTokenizer does not recognize e-mail, company etc. This issue removes > those token types. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch
Sorry for the huge delay. This patch addresses the issues raised by David.
* consumeAllTokens is used by LimitTokenOffset and LimitTokenPosition too.
* applies Yonik's concept
* improved javadoc. Some arguments are difficult since they have different meanings in different components.
* covers a few more overlooked analysis factories
* spotted a copy-paste mistake
Any feedback is appreciated.
> Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: 7.0 > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch Finally {{ant precommit}} passes with this patch. It checks missing javadocs using *level=package* for icu, morfologik, phonetic, and suggest. > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch bring the patch to master. > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7773) Remove unused/deprecated token types from StandardTokenizer
[ https://issues.apache.org/jira/browse/LUCENE-7773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7773: - Attachment: LUCENE-7773.patch Make the {{TestAnalyzers}} compile again. > Remove unused/deprecated token types from StandardTokenizer > --- > > Key: LUCENE-7773 > URL: https://issues.apache.org/jira/browse/LUCENE-7773 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.5 >Reporter: Ahmet Arslan >Priority: Minor > Labels: analyzers > Fix For: master (7.0) > > Attachments: LUCENE-7773.patch, LUCENE-7773.patch > > > StandardTokenizer does not recognize e-mail, company etc. This issue removes > those token types. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7773) Remove unused/deprecated token types from StandardTokenizer
[ https://issues.apache.org/jira/browse/LUCENE-7773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7773: - Attachment: LUCENE-7773.patch This patch removes old types. > Remove unused/deprecated token types from StandardTokenizer > --- > > Key: LUCENE-7773 > URL: https://issues.apache.org/jira/browse/LUCENE-7773 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.5 >Reporter: Ahmet Arslan >Priority: Minor > Labels: analyzers > Fix For: master (7.0) > > Attachments: LUCENE-7773.patch > > > StandardTokenizer does not recognize e-mail, company etc. This issue removes > those token types. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-7773) Remove unused/deprecated token types from StandardTokenizer
Ahmet Arslan created LUCENE-7773: Summary: Remove unused/deprecated token types from StandardTokenizer Key: LUCENE-7773 URL: https://issues.apache.org/jira/browse/LUCENE-7773 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 6.5 Reporter: Ahmet Arslan Priority: Minor Fix For: master (7.0) StandardTokenizer does not recognize e-mail, company etc. This issue removes those token types. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7602) Fix compiler warnings for ant clean compile
[ https://issues.apache.org/jira/browse/LUCENE-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15780903#comment-15780903 ] Ahmet Arslan commented on LUCENE-7602: -- I think the current issue will clean up three previous issues. > Fix compiler warnings for ant clean compile > --- > > Key: LUCENE-7602 > URL: https://issues.apache.org/jira/browse/LUCENE-7602 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Paul Elschot >Priority: Minor > Labels: build > Fix For: trunk > > Attachments: LUCENE-7602-ContextMap-lucene.patch, > LUCENE-7602-ContextMap-solr.patch, LUCENE-7602.patch, LUCENE-7602.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7602) Fix compiler warnings for ant clean compile
[ https://issues.apache.org/jira/browse/LUCENE-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15777253#comment-15777253 ] Ahmet Arslan commented on LUCENE-7602: -- can't we just use Map<Object,Object> instead of Map or ContextMap? > Fix compiler warnings for ant clean compile > --- > > Key: LUCENE-7602 > URL: https://issues.apache.org/jira/browse/LUCENE-7602 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Paul Elschot >Priority: Minor > Labels: build > Fix For: trunk > > Attachments: LUCENE-7602-ContextMap-lucene.patch, > LUCENE-7602-ContextMap-solr.patch, LUCENE-7602.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
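A minimal sketch of the question in the LUCENE-7602 comment above, assuming a context parameter that is currently declared as a raw Map. The class and method names are hypothetical; only the generics behaviour is the point.

```java
import java.util.Map;

final class ContextMapSketch {
    private ContextMapSketch() {}

    // Raw type: compiles, but javac reports rawtypes/unchecked warnings
    // unless they are suppressed, as done here for illustration.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void putRaw(Map context, Object key, Object value) {
        context.put(key, value);
    }

    // Parameterized type: same runtime behaviour, no warning to suppress.
    static void putTyped(Map<Object, Object> context, Object key, Object value) {
        context.put(key, value);
    }
}
```

One trade-off to keep in mind: Java generics are invariant, so a caller holding a `Map<String, Object>` cannot pass it where `Map<Object, Object>` is declared, while the raw signature accepts it.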
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch Patch that adds javadocs. {{ant documentation-lint}} still fails for some reason that I cannot figure out. > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch, > LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7599) replace TestRandomChains.Predicate with java.util.function.Predicate
[ https://issues.apache.org/jira/browse/LUCENE-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7599: - Attachment: LUCENE-7599.patch Patch that replaces ArgProducer with Function<Random,Object> > replace TestRandomChains.Predicate with java.util.function.Predicate > > > Key: LUCENE-7599 > URL: https://issues.apache.org/jira/browse/LUCENE-7599 > Project: Lucene - Core > Issue Type: Improvement > Components: general/test >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Priority: Trivial > Labels: test > Fix For: master (7.0) > > Attachments: LUCENE-7599.patch, LUCENE-7599.patch > > > {{TestRandomChains}} has its own Predicate interface which can be replaced > with {{java.util.function.Predicate}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
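The replacement mentioned in the LUCENE-7599 update above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the registry, the class name, and the two producers are hypothetical stand-ins for whatever the test actually registers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

final class ArgProducerSketch {
    private ArgProducerSketch() {}

    // Hypothetical registry: argument type -> producer of a random value.
    // Function<Random, Object> takes the place of a custom single-method interface.
    static final Map<Class<?>, Function<Random, Object>> PRODUCERS = new HashMap<>();

    static {
        PRODUCERS.put(int.class, random -> random.nextInt(100)); // lambda, boxed to Integer
        PRODUCERS.put(boolean.class, Random::nextBoolean);        // method reference, boxed to Boolean
    }

    static Object newRandomArg(Random random, Class<?> type) {
        Function<Random, Object> producer = PRODUCERS.get(type);
        if (producer == null) {
            throw new IllegalArgumentException("no producer for " + type);
        }
        return producer.apply(random);
    }
}
```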
[jira] [Updated] (LUCENE-7599) replace TestRandomChains.Predicate with java.util.function.Predicate
[ https://issues.apache.org/jira/browse/LUCENE-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7599: - Attachment: LUCENE-7599.patch Patch that removes {{TestRandomChains.Predicate}} in favour of {{java.util.function.Predicate}}. It simplifies code with lambda expressions or method references. > replace TestRandomChains.Predicate with java.util.function.Predicate > > > Key: LUCENE-7599 > URL: https://issues.apache.org/jira/browse/LUCENE-7599 > Project: Lucene - Core > Issue Type: Improvement > Components: general/test >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Priority: Trivial > Labels: test > Fix For: master (7.0) > > Attachments: LUCENE-7599.patch > > > {{TestRandomChains}} has its own Predicate interface which can be replaced > with {{java.util.function.Predicate}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
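As a rough illustration of why the JDK interface simplifies things, here is a small self-contained sketch. `PredicateSketch` and `keepMatching` are hypothetical names and do not come from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class PredicateSketch {
    private PredicateSketch() {}

    // A method reference replaces an anonymous class implementing a custom interface.
    static final Predicate<String> IS_EMPTY = String::isEmpty;

    // JDK predicates compose out of the box; a hand-rolled interface would need its own negate().
    static final Predicate<String> NOT_EMPTY = IS_EMPTY.negate();

    /** Returns the items accepted by the predicate, in their original order. */
    static <T> List<T> keepMatching(List<T> items, Predicate<? super T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }
}
```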
[jira] [Created] (LUCENE-7599) replace TestRandomChains.Predicate with java.util.function.Predicate
Ahmet Arslan created LUCENE-7599: Summary: replace TestRandomChains.Predicate with java.util.function.Predicate Key: LUCENE-7599 URL: https://issues.apache.org/jira/browse/LUCENE-7599 Project: Lucene - Core Issue Type: Improvement Components: general/test Affects Versions: 6.3 Reporter: Ahmet Arslan Priority: Trivial Fix For: master (7.0) {{TestRandomChains}} has its own Predicate interface which can be replaced with {{java.util.function.Predicate}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15758938#comment-15758938 ] Ahmet Arslan commented on LUCENE-7585: -- I tried adding javadocs to fields in the interface, but it did not solve the missing javadocs problem. {{documentation-lint}} complains/fails for the lucene/analysis/modules, which are explicitly defined with the level of method in [lucene/build.xml|https://github.com/apache/lucene-solr/blob/master/lucene/build.xml] {code:xml} {code} I figured that this method=(level|class|none) thing is about [checkJavaDocs.py|https://github.com/apache/lucene-solr/blob/master/dev-tools/scripts/checkJavaDocs.py]. Any pointer how to document interface fields so that level="method" passes in checkJavaDocs.py? Or, can we remove above xml fragment from build.xml? > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch a few more refactoring including the overlooked code point filter factory. > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749577#comment-15749577 ] Ahmet Arslan commented on LUCENE-7585: -- By inconsistency, I mean the strategy used to retrieve those parameters from the args map. Some factories use an inline string constant, e.g. getBoolean(args, "reverse"); others define a private or public static final String for the key. > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
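The idea of a shared constants interface can be sketched in plain Java as below. This is an assumption-laden illustration, not the patch: `CommonParamsSketch`, `StopFactorySketch`, and the local `getBoolean` helper are hypothetical stand-ins written for this example, and the helper is only modelled on the kind of argument lookup the comment describes.

```java
import java.util.Map;

// Hypothetical stand-in for a shared constants interface.
interface CommonParamsSketch {
    String IGNORE_CASE = "ignoreCase";
    String PRESERVE_ORIGINAL = "preserveOriginal";
}

// Hypothetical factory that reads its arguments through the shared keys
// instead of repeating inline string literals.
final class StopFactorySketch implements CommonParamsSketch {
    final boolean ignoreCase;

    StopFactorySketch(Map<String, String> args) {
        this.ignoreCase = getBoolean(args, IGNORE_CASE, false);
    }

    // Local helper for this sketch: consumes the key and parses it as a boolean.
    static boolean getBoolean(Map<String, String> args, String name, boolean defaultValue) {
        String value = args.remove(name);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }
}
```

With every factory referring to `IGNORE_CASE`, a typo in the key becomes a compile error rather than a silently ignored argument.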
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737593#comment-15737593 ]
Ahmet Arslan commented on LUCENE-7585: -- Here is an excerpt from {{documentation-lint}}
{code}
[exec] build/docs/analyzers-icu/org/apache/lucene/analysis/icu/segmentation/ICUTokenizerFactory.html
[exec] missing Fields: CONSUME_ALL_TOKENS
[exec] missing Fields: DELIMITER
[exec] missing Fields: DICTIONARY
[exec] missing Fields: ENCODER
[exec] missing Fields: FORMAT
[exec] missing Fields: IGNORE_CASE
[exec] missing Fields: LUCENE_MATCH_VERSION
[exec] missing Fields: MAX
[exec] missing Fields: MAX_TOKEN_LENGTH
[exec] missing Fields: MIN
[exec] missing Fields: PATTERN
[exec] missing Fields: PRESERVE_ORIGINAL
[exec] missing Fields: PROTECTED
[exec] missing Fields: TYPES
[exec] missing Fields: WORDS
[exec]
[exec] Missing javadocs were found!
{code}
> Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736501#comment-15736501 ]
Ahmet Arslan commented on LUCENE-7585: -- Thank you for looking into this. Initially, I was planning to move all existing parameters to a common interface. I figured that the interface would grow very large, since certain factories have many specific parameters. I moved the most common parameters to the interface. However, a lot still remain in the codebase. For example, the ngram package has "minGramSize" and "maxGramSize" in common. The phonetic module has "maxCodeLength" and "inject". What would be the preferred course of action here?
* Handle packages and modules locally? If yes, how?
* Move all parameters to the interface unconditionally.
* Devise an algorithm: move a parameter if it is shared by at least two packages or modules.
* ?
> Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
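The "shared by at least two packages or modules" rule from the list above could be sketched like this. It is only an illustration of the selection rule: the class name and the input map (package or module name to the parameter names it uses) are hypothetical, and nothing here scans the real codebase.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class SharedParamsSketch {
    private SharedParamsSketch() {}

    /**
     * Returns the parameter names used by at least minModules of the given
     * packages/modules, sorted alphabetically.
     */
    static Set<String> sharedParams(Map<String, Set<String>> paramsByModule, int minModules) {
        // Count in how many packages/modules each parameter name occurs.
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> params : paramsByModule.values()) {
            for (String param : params) {
                counts.merge(param, 1, Integer::sum);
            }
        }
        // Keep only the parameters that reach the threshold.
        Set<String> shared = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minModules) {
                shared.add(entry.getKey());
            }
        }
        return shared;
    }
}
```

Under this rule, a parameter such as "minGramSize" that is used in a single package would stay local, and only names that occur in two or more packages would be candidates for the common interface.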
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch Properly created patch that includes proposed changes (alphabetisation and lucene_match_version). Ant {{documentation-lint}} complains about factories of icu. Any pointer how to fix it? > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch, LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7585: - Attachment: LUCENE-7585.patch > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch > > > Certain parameters (String constants) are same/common for multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define most common constants in > ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-7585) Interface for common parameters used across analysis factories
Ahmet Arslan created LUCENE-7585: Summary: Interface for common parameters used across analysis factories Key: LUCENE-7585 URL: https://issues.apache.org/jira/browse/LUCENE-7585 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 6.3 Reporter: Ahmet Arslan Priority: Minor Fix For: master (7.0) Certain parameters (String constants) are same/common for multiple analysis factories. Some examples are {{ignoreCase}}, {{dictionary}}, and {{preserveOriginal}}. These string constants are handled inconsistently in different factories. This is an effort to define most common constants in ({{CommonAnalysisFactoryParams}}) interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7525) ASCIIFoldingFilter.foldToASCII performance issue due to large compiled method size
[ https://issues.apache.org/jira/browse/LUCENE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616922#comment-15616922 ] Ahmet Arslan commented on LUCENE-7525: -- Can workings of ICUFoldingFilter give any insight here? > ASCIIFoldingFilter.foldToASCII performance issue due to large compiled method > size > -- > > Key: LUCENE-7525 > URL: https://issues.apache.org/jira/browse/LUCENE-7525 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.2.1 >Reporter: Karl von Randow > Attachments: ASCIIFolding.java, ASCIIFoldingFilter.java, > TestASCIIFolding.java > > > The {{ASCIIFoldingFilter.foldToASCII}} method has an enormous switch > statement and is too large for the HotSpot compiler to compile; causing a > performance problem. > The method is about 13K compiled, versus the 8KB HotSpot limit. So splitting > the method in half works around the problem. > In my tests splitting the method in half resulted in a 5X performance > increase. > In the test code below you can see how slow the fold method is, even when it > is using the shortcut when the character is less than 0x80, compared to an > inline implementation of the same shortcut. > So a workaround is to split the method. I'm happy to provide a patch. It's a > hack, of course. Perhaps using the {{MappingCharFilterFactory}} with an input > file as per SOLR-2013 would be a better replacement for this method in this > class? 
> {code:java}
> import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter;
> import org.junit.Assert;
> import org.junit.Test;
>
> public class ASCIIFoldingFilterPerformanceTest {
>   private static final int ITERATIONS = 1_000_000;
>
>   // Plain ASCII input: every character takes the c < 0x80 shortcut.
>   @Test
>   public void testFoldShortString() {
>     char[] input = "testing".toCharArray();
>     char[] output = new char[input.length * 4];
>     for (int i = 0; i < ITERATIONS; i++) {
>       ASCIIFoldingFilter.foldToASCII(input, 0, output, 0, input.length);
>     }
>   }
>
>   // Accented input: every character goes through the large switch.
>   @Test
>   public void testFoldShortAccentedString() {
>     char[] input = "éúéúøßüäéúéúøßüä".toCharArray();
>     char[] output = new char[input.length * 4];
>     for (int i = 0; i < ITERATIONS; i++) {
>       ASCIIFoldingFilter.foldToASCII(input, 0, output, 0, input.length);
>     }
>   }
>
>   // Inline copy of the ASCII shortcut, for comparison with the method call.
>   @Test
>   public void testManualFoldTinyString() {
>     char[] input = "t".toCharArray();
>     char[] output = new char[input.length * 4];
>     for (int i = 0; i < ITERATIONS; i++) {
>       int k = 0;
>       for (int j = 0; j < 1; ++j) {
>         final char c = input[j];
>         if (c < '\u0080') {
>           output[k++] = c;
>         } else {
>           Assert.assertTrue(false);
>         }
>       }
>     }
>   }
> }
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
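The workaround described in the quoted issue (keep the ASCII shortcut, split the large switch into smaller methods) can be sketched in miniature as follows. This is a toy illustration, not Lucene code: the class is hypothetical, the mapping covers only a handful of characters, and it folds one char to one char, whereas the real method can emit several output characters per input character.

```java
final class SplitFoldSketch {
    private SplitFoldSketch() {}

    /** Small dispatcher: ASCII fast path first, then one of two smaller methods. */
    static char fold(char c) {
        if (c < '\u0080') {
            return c; // ASCII shortcut, no switch involved
        }
        return c < '\u0100' ? foldLatin1(c) : foldBeyondLatin1(c);
    }

    // First half of the (toy) mapping: U+0080..U+00FF.
    private static char foldLatin1(char c) {
        switch (c) {
            case '\u00E4': return 'a'; // a with diaeresis
            case '\u00E9': return 'e'; // e with acute
            case '\u00FA': return 'u'; // u with acute
            case '\u00FC': return 'u'; // u with diaeresis
            default: return c;
        }
    }

    // Second half of the (toy) mapping: U+0100 and above.
    private static char foldBeyondLatin1(char c) {
        switch (c) {
            case '\u0100': return 'A'; // A with macron
            case '\u0101': return 'a'; // a with macron
            default: return c;
        }
    }
}
```

The point of the split is that each method stays small, so each one remains within whatever size limit the JIT compiler applies. Whether a particular split is enough would have to be measured.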
[jira] [Commented] (LUCENE-7377) Remove ClassicSimilarity?
[ https://issues.apache.org/jira/browse/LUCENE-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374882#comment-15374882 ] Ahmet Arslan commented on LUCENE-7377:
I think an implementation of TFIDF should stay in Lucene, but it should extend SimilarityBase and have simple, single-line code in the org.apache.lucene.search.similarities.SimilarityBase#score method, e.g.:
{code}
return tf * log2(((double) stats.getNumberOfDocuments() / (double) stats.getDocFreq()) + 1);
{code}
The current TFIDFSimilarity and ClassicSimilarity are hard to understand.
> Remove ClassicSimilarity?
>
> Key: LUCENE-7377
> URL: https://issues.apache.org/jira/browse/LUCENE-7377
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
>
> ClassicSimilarity was relying on coordination factors in order to produce good scores. Now that coords are gone, it is quite a bad option compared to e.g. BM25Similarity.
> Maybe we should remove ClassicSimilarity entirely in master and deprecate it in 6.x in order to encourage users to move to BM25Similarity rather than stay on a Similarity impl of lesser quality?
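To make the one-line formula in the comment above concrete, here is a standalone sketch of the arithmetic. It deliberately does not depend on Lucene: `TfIdfSketch` and its parameter names are hypothetical, and the plain arguments stand in for the values the comment reads from `stats`. Inside Lucene the same expression would sit in a `SimilarityBase` subclass, whose exact `score` signature differs between versions.

```java
final class TfIdfSketch {
    private TfIdfSketch() {}

    // Base-2 logarithm via the change-of-base rule.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // tf * log2(N / df + 1), the expression quoted in the comment:
    // N is the number of documents, df the number of documents containing the term.
    static double score(double tf, long numberOfDocuments, long docFreq) {
        return tf * log2(((double) numberOfDocuments / (double) docFreq) + 1);
    }
}
```

For example, a term that occurs once in a document and appears in 1 of 3 documents gets tf * log2(3/1 + 1) = 1 * log2(4) = 2.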
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349836#comment-15349836 ] Ahmet Arslan commented on SOLR-9250:
Yes, this is a known issue with wildcard queries.
> Search breaks with EU symbol € and wildcard *
>
> Key: SOLR-9250
> URL: https://issues.apache.org/jira/browse/SOLR-9250
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Server
> Affects Versions: 5.3.1
> Reporter: Tim Nolan
> Attachments: contact-name-analyze.png, contact-name-field-type.png
>
> While testing UTF-8 character searches, which worked, we have noticed a combination that fails. Testing with the data {{Tùûüÿ€àâæçéèêëïîôœm}}, we found the search worked, but by adding a wild-card (e.g. {{Tùûüÿ€àâæçéèêëïîôœm*}}), the search fails. Adding the wildcard before the {{€}} symbol worked (i.e. {{Tùûüÿ*}}).
> Showing the logs for these queries:
> {noformat:title=Full text without wildcard, hit=1}
> 2016-06-25 13:16:34.361 [qtp237852351-21] INFO org.apache.solr.core.SolrCore.Request – [core-name] webapp=/solr path=/select params={q=Tùûüÿ€àâæçéèêëïîôœm=true=type:CONTACT=12=json&_=1466860594348} hits=1 status=0 QTime=0
> {noformat}
> {noformat:title=Full text with wildcard, hit=0}
> 2016-06-25 13:16:41.172 [qtp237852351-16] INFO org.apache.solr.core.SolrCore.Request – [core-name] webapp=/solr path=/select params={q=Tùûüÿ€àâæçéèêëïîôœm*=true=type:CONTACT=12=json&_=1466860601160} hits=0 status=0 QTime=0
> {noformat}
> {noformat:title=Partial text before € with wildcard, hit=1}
> 2016-06-25 13:16:52.135 [qtp237852351-18] INFO org.apache.solr.core.SolrCore.Request – [core-name] webapp=/solr path=/select params={q=Tùûüÿ*=true=type:CONTACT=12=json&_=1466860612125} hits=1 status=0 QTime=2
> {noformat}
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349767#comment-15349767 ] Ahmet Arslan commented on SOLR-9250:
Yes, this one, but you need to make the chains visible. It is the tag in schema. Anyway, the problem looks like your tokenizer breaks your sample input at the EU char. Please use the analysis admin page to see how your example text is tokenized/indexed. Have you read https://wiki.apache.org/solr/MultitermQueryAnalysis ?
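The explanation given in this thread (the index-time tokenizer splits the text at the € character, while a wildcard query is not analyzed) can be illustrated with a toy model. This is a sketch under loud assumptions: the reporter's real field type was never shown in these messages, so the letter-only tokenizer below is hypothetical, and `WildcardSketch` and its methods are invented for the example rather than taken from Solr.

```java
import java.util.ArrayList;
import java.util.List;

final class WildcardSketch {
    private WildcardSketch() {}

    // Toy index-time tokenizer (an assumption): ends the current token at
    // every character that is not a letter, which includes the euro sign.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Plain query: analyzed like the indexed text, so it is split the same way
    // and matches when every resulting token is present in the index.
    static boolean analyzedQueryMatches(List<String> indexedTerms, String query) {
        return indexedTerms.containsAll(tokenize(query));
    }

    // Wildcard (prefix) query: the raw prefix is compared to the indexed terms
    // without any analysis, so a prefix containing the euro sign finds nothing.
    static boolean prefixQueryMatches(List<String> indexedTerms, String rawPrefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(rawPrefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With the sample text from the issue, this toy tokenizer would index two terms, the part before € and the part after it. The full text without a wildcard then matches, the full text used as a prefix does not, and the prefix `Tùûüÿ` does, which lines up with the hits=1, hits=0, hits=1 pattern in the quoted logs.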
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349738#comment-15349738 ] Ahmet Arslan commented on SOLR-9250:
Please, we need to see the field *type* definition, where the analyzer elements are chained.
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349677#comment-15349677 ] Ahmet Arslan commented on SOLR-9250:
bq. I'm not sure what you mean by that statement
Please see https://wiki.apache.org/solr/MultitermQueryAnalysis
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349674#comment-15349674 ] Ahmet Arslan commented on SOLR-9250:
We need to see the field type definition for that field. The index-time analyzer may be breaking words at the EU symbol or something.
[jira] [Commented] (SOLR-9250) Search breaks with EU symbol € and wildcard *
[ https://issues.apache.org/jira/browse/SOLR-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15349652#comment-15349652 ] Ahmet Arslan commented on SOLR-9250:
What do you mean by saying the search fails? Does it throw an exception? Does it not return the expected results? Wildcard queries are not analyzed, by the way. Please ask questions of this type on the user mailing list.
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346949#comment-15346949 ] Ahmet Arslan commented on LUCENE-7287:
This is a new feature that has never been released, so a new ticket may not be needed.
> New lemma-tizer plugin for ukrainian language.
>
> Key: LUCENE-7287
> URL: https://issues.apache.org/jira/browse/LUCENE-7287
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Dmytro Hambal
> Priority: Minor
> Labels: analysis, language, plugin
> Fix For: master (7.0), 6.2
> Attachments: LUCENE-7287.patch, Screen Shot 2016-06-23 at 8.23.01 PM.png, Screen Shot 2016-06-23 at 8.41.28 PM.png
>
> Hi all,
> I wonder whether you are interested in supporting a plugin which provides a mapping between ukrainian word forms and their lemmas. Some tests and docs go out-of-the-box =) .
> https://github.com/mrgambal/elasticsearch-ukrainian-lemmatizer
> It's really simple but still works and generates some value for its users.
> More: https://github.com/elastic/elasticsearch/issues/18303
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346875#comment-15346875 ] Ahmet Arslan commented on LUCENE-7287:
Hi, multiple tokens are OK, but multiple identical tokens look weird, no? Have you checked the screenshot that includes RemoveDuplicatesTokenFilterFactory (RDTF)?
bq. Shall I create mappings_uk.txt so we can use it in solr?
Let's ask Michael. Either a separate file, or we can just recommend using the mapping char filter with the recommended mappings. Maybe we can place the uk_mappings.txt file under https://github.com/apache/lucene-solr/tree/master/solr/server/solr/configsets/sample_techproducts_configs/conf/lang
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346857#comment-15346857 ] Ahmet Arslan commented on LUCENE-7287:
Please see the screenshots in the attachments section at the beginning of the page and let me know what you think.
[jira] [Updated] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7287:
Attachment: Screen Shot 2016-06-23 at 8.41.28 PM.png
Here is the screen shot of analysis admin page, with RemoveDuplicatesTokenFilter added. {code:xml} {code}
[jira] [Updated] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7287:
Attachment: Screen Shot 2016-06-23 at 8.23.01 PM.png
{code:xml} {code}
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346816#comment-15346816 ] Ahmet Arslan commented on LUCENE-7287:
Hi, I was able to run the analyzer successfully without a mapping char filter, because the character mappings are hardcoded in the code. I am attaching an analysis screenshot. However, it looks like we need a remove-duplicates token filter at the end: the Morfologik filter appears to inject multiple tokens at the same position.
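The idea behind putting a remove-duplicates filter at the end of the chain can be sketched as follows. This is a simplified model, not Lucene's implementation: `DedupSketch` and its `Token` class are hypothetical, and a token with a position increment of 0 is assumed to sit at the same position as the token before it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DedupSketch {
    private DedupSketch() {}

    // Minimal stand-in for a token: its text plus the position increment
    // relative to the previous token (0 means "same position").
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Drops a token when an identical term was already emitted at the same position.
    static List<Token> removeDuplicates(List<Token> input) {
        List<Token> output = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token token : input) {
            if (token.positionIncrement > 0) {
                seenAtPosition.clear(); // moved to a new position, start over
            }
            if (seenAtPosition.add(token.term)) {
                output.add(token);      // first time this term appears here
            }
        }
        return output;
    }
}
```

In this model, a lemmatizer that emits the same lemma twice for one word form would leave a single token at that position, while distinct lemmas at the same position are all kept.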
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344403#comment-15344403 ] Ahmet Arslan commented on LUCENE-7287:
Only committers have rights to edit the Confluence wiki. Contributors can post the proposed change/addition as a message at the end of the page.
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344290#comment-15344290 ] Ahmet Arslan commented on LUCENE-7287:
I think you, as the author of Ukrainian. Thanks!
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343877#comment-15343877 ] Ahmet Arslan commented on LUCENE-7287:
So the Solr field type counterpart of this analyzer would be something like: {code:xml} {code}
It would be nice to add an entry for Ukrainian to https://cwiki.apache.org/confluence/display/solr/Language+Analysis
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342944#comment-15342944 ] Ahmet Arslan commented on LUCENE-7287:
Can we use this analyzer in Solr? {code:xml} {code}
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326628#comment-15326628 ] Ahmet Arslan commented on LUCENE-7287:
Maybe MappingCharFilter could be used instead of a token filter?
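The suggestion above, normalizing characters before tokenization rather than in a token filter, can be sketched as a plain string-to-string mapping step. Everything here is an assumption made for illustration: `CharMappingSketch` is a hypothetical class, and the three mappings (two apostrophe variants and a combining accent) are examples, not the contents of the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CharMappingSketch {
    private CharMappingSketch() {}

    // Illustrative mapping table (assumed): source sequence -> replacement.
    private static final Map<String, String> MAPPINGS = new LinkedHashMap<>();
    static {
        MAPPINGS.put("\u2019", "'"); // right single quotation mark -> ASCII apostrophe
        MAPPINGS.put("\u02BC", "'"); // modifier letter apostrophe -> ASCII apostrophe
        MAPPINGS.put("\u0301", "");  // combining acute accent (stress mark) -> removed
    }

    // Applies every mapping to the raw text, before any tokenizer sees it.
    static String normalize(String text) {
        String result = text;
        for (Map.Entry<String, String> mapping : MAPPINGS.entrySet()) {
            result = result.replace(mapping.getKey(), mapping.getValue());
        }
        return result;
    }
}
```

The appeal of this design is that the tokenizer and all later filters see already-normalized text, so the mapping table can live in a configuration file instead of being hardcoded in a filter.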
[jira] [Commented] (SOLR-8676) It's not possible to use a different log4.properties file on windows
[ https://issues.apache.org/jira/browse/SOLR-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314246#comment-15314246 ] Ahmet Arslan commented on SOLR-8676: [~mkhludnev] do you mind looking into SOLR-8445 too? > It's not possible to use a different log4.properties file on windows > > > Key: SOLR-8676 > URL: https://issues.apache.org/jira/browse/SOLR-8676 > Project: Solr > Issue Type: Bug >Affects Versions: 5.4.1 >Reporter: Kristine Jetzke >Assignee: Mikhail Khludnev > Attachments: SOLR-8676.patch, SOLR-8676.patch, verifying SOLR-8676.txt > > > It's currently not possible to change the location of the log4j.properties > file on windows. The value of {{LOG4J_CONFIG}} always gets replaced with the > default value {{server\resources\log4j.properties}}. Thus, this file inside > the server directory needs to be changed after every update. > See attached patch for a fix. Unfortunately, I couldn't figure out why > {{LOG4J_CONFIG}} was set to empty. I tested manually that logging still works > when running an example so I hope that this line is really just obsolete. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9174) After Solr 5.5, mm parameter doesn't work properly
[ https://issues.apache.org/jira/browse/SOLR-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308266#comment-15308266 ] Ahmet Arslan commented on SOLR-9174:
Can someone explain why (e)dismax should honor/respect/care about the {{q.op}} parameter? (e)dismax has its own parameter, {{mm}}, for the task.
> After Solr 5.5, mm parameter doesn't work properly
>
> Key: SOLR-9174
> URL: https://issues.apache.org/jira/browse/SOLR-9174
> Project: Solr
> Issue Type: Bug
> Components: query parsers, search
> Affects Versions: 5.5, 6.0, 6.0.1
> Reporter: Issei Nishigata
>
> "mm" parameter does not work properly, when I set "q.op=AND" after Solr 5.5.
> In Solr 5.4, mm parameter works expectedly with the following setting.
> [schema]
> {code:xml}
> maxGramSize="2"/>
> {code}
> [request]
> {quote}
> http://localhost:8983/solr/collection1/select?defType=edismax&q.op=AND&mm=2&q=solar
> {quote}
> After Solr 5.5, the result will not be the same as Solr 5.4.
> [Solr 5.4]
> {code:xml}
> ...
> 2
> solar
> edismax
> AND
> ...
> 0
> solr
> solar
> solar
> (+DisjunctionMaxQuerytext:so text:ol text:la text:ar)~2/no_coord
> +(((text:so text:ol text:la text:ar)~2))
> ...
> {code}
> [Solr 6.0.1]
> {code:xml}
> ...
> 2
> solar
> edismax
> AND
> ...
> solar
> solar
> (+DisjunctionMaxQuery(((+text:so +text:ol +text:la +text:ar/no_coord
> +((+text:so +text:ol +text:la +text:ar))
> ...
> {code}
> As shown above, parsedquery also differs between Solr 5.4 and Solr 6.0.1 (after Solr 5.5).
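The difference between the two parsed queries quoted in the issue, optional clauses with a minimum-should-match of 2 versus every clause being required, can be worked through with a small model. This is a sketch only: `MinShouldMatchSketch` is hypothetical, it assumes the field produces plain character bigrams (the quoted schema shows only `maxGramSize="2"`), and it models clause counting rather than Solr's query parser.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MinShouldMatchSketch {
    private MinShouldMatchSketch() {}

    // Character bigrams of the text, e.g. "solar" -> so, ol, la, ar.
    static List<String> bigrams(String text) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            grams.add(text.substring(i, i + 2));
        }
        return grams;
    }

    // Number of query clauses (bigrams) that also occur in the document.
    static int matchingClauses(String docText, String queryText) {
        Set<String> docTerms = new HashSet<>(bigrams(docText));
        int count = 0;
        for (String clause : bigrams(queryText)) {
            if (docTerms.contains(clause)) {
                count++;
            }
        }
        return count;
    }

    // Solr 5.4 style in the issue: clauses are optional, at least mm must match.
    static boolean matchesWithMm(String docText, String queryText, int mm) {
        return matchingClauses(docText, queryText) >= mm;
    }

    // Solr 6.0.1 style in the issue: every clause carries '+', so all must match.
    static boolean matchesAllRequired(String docText, String queryText) {
        return matchingClauses(docText, queryText) == bigrams(queryText).size();
    }
}
```

For the query `solar` against a document containing `solr`, two of the four bigrams (`so`, `ol`) match. That satisfies mm=2, so the 5.4-style query finds the document, while the all-required form does not, which is the change in behavior the reporter describes.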
Re: [VOTE] Release Lucene/Solr 6.0.1 RC2
+1

SUCCESS! [1:00:26.085469]

On Wednesday, May 25, 2016 11:27 AM, Tommaso Teofili wrote:

I got the same warning on the GPG key signature but could not reproduce David's issue; I'm not sure what it could be. I'd say if no one else can reproduce it, let's go ahead with the release.

+1 on my side.

SUCCESS! [1:19:14.997834]

Regards,
Tommaso

On Wed, May 25, 2016 at 06:48, David Smiley wrote:

> I tried to run the smoke tester directly on my machine and it failed right after unpacking. Given others' success, it must be user error. What might the problem be?
>
>   unpack lucene-6.0.1.tgz...
>     verify JAR metadata/identity/no javax.* or java.* classes...
> Traceback (most recent call last):
>   File "dev-tools/scripts/smokeTestRelease.py", line 1412, in <module>
>     main()
>   File "dev-tools/scripts/smokeTestRelease.py", line 1356, in main
>     smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, c.is_signed, ' '.join(c.test_args))
>   File "dev-tools/scripts/smokeTestRelease.py", line 1393, in smokeTest
>     unpackAndVerify(java, 'lucene', tmpDir, artifact, gitRevision, version, testArgs, baseURL)
>   File "dev-tools/scripts/smokeTestRelease.py", line 590, in unpackAndVerify
>     verifyUnpacked(java, project, artifact, unpackPath, gitRevision, version, testArgs, tmpDir, baseURL)
>   File "dev-tools/scripts/smokeTestRelease.py", line 712, in verifyUnpacked
>     checkAllJARs(os.getcwd(), project, gitRevision, version, tmpDir, baseURL)
>   File "dev-tools/scripts/smokeTestRelease.py", line 270, in checkAllJARs
>     checkJARMetaData('JAR file "%s"' % fullPath, fullPath, gitRevision, version)
>   File "dev-tools/scripts/smokeTestRelease.py", line 202, in checkJARMetaData
>     (desc, verify))
> RuntimeError: JAR file "/private/tmp/smoke_lucene_6.0.1_c7510a0fdd93329ec04c853c8557f4a3f2309eaf/unpack/lucene-6.0.1/analysis/common/lucene-analyzers-common-6.0.1.jar" is missing "X-Compile-Source-JDK: 8" inside its META-INF/MANIFEST.MF
>
> Separately from the smoke test, I've downloaded this RC to use it on a new project and haven't found issues yet.
>
> On Tue, May 24, 2016 at 1:19 PM Anshum Gupta wrote:
>
>> Thanks for doing the release, Steve. All looks good to me, but I think you should get someone to sign your GPG key :)
>>
>> I see this warning while running the tests: GPG: gpg: WARNING: This key is not certified with a trusted signature!
>>
>> Here's my +1!
>>
>> SUCCESS! [1:05:50.755245]
>>
>> On Tue, May 24, 2016 at 5:24 AM, Michael McCandless wrote:
>>
>>> +1
>>>
>>> SUCCESS! [0:31:57.451386]
>>>
>>> Mike McCandless
>>> http://blog.mikemccandless.com
>>>
>>> On Tue, May 24, 2016 at 12:13 AM, Steve Rowe wrote:
>>>
>>>> Please vote for release candidate 2 for Lucene/Solr 6.0.1. (I found a couple of problems in CHANGES after I committed RC1 to Subversion, so I didn’t call the vote, and cut RC2 instead.)
>>>>
>>>> The artifacts can be downloaded from:
>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.1-RC2-revc7510a0fdd93329ec04c853c8557f4a3f2309eaf
>>>>
>>>> You can run the smoke tester directly with this command:
>>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>>   https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.1-RC2-revc7510a0fdd93329ec04c853c8557f4a3f2309eaf
>>>>
>>>> Here’s my +1. Docs, changes and javadocs look good.
>>>>
>>>> SUCCESS! [0:26:34.596490]
>>>>
>>>> --
>>>> Steve
>>>> www.lucidworks.com
>>
>> --
>> Anshum Gupta
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
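For readers who want to inspect the manifest attribute named in the error message above, here is a minimal Java sketch. It is not what smokeTestRelease.py does internally (that script is Python); the class and method names (`ManifestCheckSketch`, `parse`, `hasMainAttribute`) are made up for this illustration, and the manifest text in `main` is a hand-written sample, not the content of the real JAR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

final class ManifestCheckSketch {
    // Hypothetical helper: parse manifest text (as found in META-INF/MANIFEST.MF).
    static Manifest parse(String manifestText) {
        try {
            return new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: true if the main section has the attribute with exactly this value.
    static boolean hasMainAttribute(Manifest manifest, String name, String expected) {
        return expected.equals(manifest.getMainAttributes().getValue(name));
    }

    public static void main(String[] args) {
        // Sample manifest text, assumed for the example only.
        String sample = "Manifest-Version: 1.0\r\nX-Compile-Source-JDK: 8\r\n\r\n";
        Manifest manifest = parse(sample);
        System.out.println(hasMainAttribute(manifest, "X-Compile-Source-JDK", "8"));
    }
}
```

For a JAR on disk, `java.util.jar.JarFile#getManifest()` returns the same kind of `Manifest` object, so the check can be pointed at the file named in the RuntimeError.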
[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.
[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299381#comment-15299381 ] Ahmet Arslan commented on LUCENE-7287: This looks like a wrapper around a string-to-string mapping. There is no need to write custom Lucene code for this: just replace the comma with a tab in the {{mapping_sorted.csv}} file and use the good old {{StemmerOverrideFilter}}, which has a fast lookup that does not require a {{termAtt.toString()}} conversion. > New lemma-tizer plugin for ukrainian language. > -- > > Key: LUCENE-7287 > URL: https://issues.apache.org/jira/browse/LUCENE-7287 > Project: Lucene - Core > Issue Type: New Feature > Components: modules/analysis >Reporter: Dmytro Hambal >Priority: Minor > Labels: analysis, language, plugin > > Hi all, > I wonder whether you are interested in supporting a plugin which provides a > mapping between ukrainian word forms and their lemmas. Some tests and docs go > out-of-the-box =) . > https://github.com/mrgambal/elasticsearch-ukrainian-lemmatizer > It's really simple but still works and generates some value for its users. > More: https://github.com/elastic/elasticsearch/issues/18303
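The comma-to-tab conversion suggested in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: it supposes each line of {{mapping_sorted.csv}} holds exactly one word form and one lemma separated by a single comma (the actual file layout is not shown in this thread), and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CsvToTabSketch {
    // Assumption: a line looks like "wordform,lemma"; only the first comma is replaced.
    static String toTabSeparated(String csvLine) {
        int comma = csvLine.indexOf(',');
        if (comma < 0) {
            return csvLine; // no comma: leave the line untouched
        }
        return csvLine.substring(0, comma) + '\t' + csvLine.substring(comma + 1);
    }

    // Converts every line of an in-memory file.
    static List<String> convert(List<String> csvLines) {
        List<String> out = new ArrayList<>(csvLines.size());
        for (String line : csvLines) {
            out.add(toTabSeparated(line));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample entries, not taken from the real mapping file.
        for (String line : convert(Arrays.asList("forms,form", "lemmas,lemma"))) {
            System.out.println(line);
        }
    }
}
```

The resulting tab-separated file is the kind of dictionary a stemmer override filter is usually configured with; whether the word form or the lemma comes first in the real CSV would need to be checked against the plugin's repository.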
Re: Using rows=-1 for "give me all"
Hi Steffensen, I'm not sure about rows=-1, but retrieval engines are optimized to return the top-N results. However, there is a dedicated mechanism for "give me all": https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets Ahmet On Monday, May 23, 2016 11:38 PM, Per Steffensen wrote: Hi, Back when we used 4.4.0, I believe a query with rows=-1 returned all matching documents. In 5.1.0 (the one we are using now), rows=-1 triggers a validation exception. If I remove the code that throws that exception, it seems that rows=-1 behaves like rows=0. Has the support for rows=-1 (give me all) been reintroduced in a release after 5.1.0? If yes, in which JIRA ticket? If no, are there any plans to reintroduce it? Was there any good reason for changing the rows=-1 behavior? Am I the only one who liked it? :-) Regards, Per Steffensen
[jira] [Comment Edited] (LUCENE-7148) Support boolean subset matching
[ https://issues.apache.org/jira/browse/LUCENE-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227887#comment-15227887 ] Ahmet Arslan edited comment on LUCENE-7148 at 4/6/16 7:37 AM: bq. Perhaps you mean something like Solr's frange that filters based on the value? Exactly. Given that q=john smith, let's assume that we have a field titleLength that stores the number of words in the field. We can even extract that info from norm doc values later on. Something like: {noformat} fq={!frange l=0 u=0 cache=false cost=200} sub(titleLength, sum(termfreq(title,'smith'), termfreq(title,'john'))) {noformat} bq. That would be O(docs) as it evaluates per doc. Can't we make this filter query execute last, with cache=false cost=150? was (Author: iorixxx): bq. Perhaps you mean something like Solr's frange that filters based on the value? Exactly. Given that q=john smith, let's assume that we have a field titleLength that stores the number of words in the field. We can even extract that info from norm doc values later on. Something like fq={!frange l=0 u=0} sub(titleLength, sum(termfreq(title,'smith'), termfreq(title,'john'))) bq. That would be O(docs) as it evaluates per doc. Can't we make this filter query execute last, with cache=false cost=150? > Support boolean subset matching > --- > > Key: LUCENE-7148 > URL: https://issues.apache.org/jira/browse/LUCENE-7148 > Project: Lucene - Core > Issue Type: New Feature > Components: core/search >Affects Versions: 5.x >Reporter: Otmar Caduff > Labels: newbie > > In Lucene, I know of the possibility of Occur.SHOULD, Occur.MUST and the > “minimum should match” setting on the boolean query. > Now, when querying, I want to > - (1) match the documents which either contain all the terms of the query > (Occur.MUST for all terms would do that) or, > - (2) if all terms for a given field of a document are a subset of the query > terms, that document should match as well.
> Example: > Document d has field f with terms A, B, C > Query with the following terms should match that document: > A > B > A B > A B C > A B C D > Query with the following terms should not match: > D > A B D
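The matching rules in the issue description, and the counting idea behind the frange filter in the comment, can be illustrated with plain Java collections. This is only a sketch of the semantics, not Lucene or Solr code; all names are invented for the example, and the counting variant assumes the query terms are distinct. Note that the counting check covers only rule (2), the "field is a subset of the query" case.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SubsetMatchSketch {
    // Convenience factory for a term set.
    static Set<String> terms(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Rule (1): the field contains every query term, or
    // rule (2): every term of the field is also a query term.
    static boolean matches(Set<String> fieldTerms, Set<String> queryTerms) {
        return fieldTerms.containsAll(queryTerms) || queryTerms.containsAll(fieldTerms);
    }

    // Counting form of rule (2): field length minus the number of field tokens
    // that are query terms must be zero (the idea behind "frange l=0 u=0").
    static boolean fieldCoveredByQuery(List<String> fieldTokens, Set<String> queryTerms) {
        long covered = fieldTokens.stream().filter(queryTerms::contains).count();
        return fieldTokens.size() - covered == 0;
    }

    public static void main(String[] args) {
        Set<String> doc = terms("A", "B", "C");
        System.out.println(matches(doc, terms("A", "B", "C", "D"))); // true
        System.out.println(matches(doc, terms("A", "B", "D")));      // false
        System.out.println(fieldCoveredByQuery(Arrays.asList("john", "smith"), terms("john", "smith"))); // true
    }
}
```

Evaluating such a check per document is what makes the approach O(docs), which is why the comment proposes running it last as an uncached, high-cost filter.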
[jira] [Commented] (LUCENE-7148) Support boolean subset matching
[ https://issues.apache.org/jira/browse/LUCENE-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227887#comment-15227887 ] Ahmet Arslan commented on LUCENE-7148: bq. Perhaps you mean something like Solr's frange that filters based on the value? Exactly. Given that q=john smith, let's assume that we have a field titleLength that stores the number of words in the field. We can even extract that info from norm doc values later on. Something like fq={!frange l=0 u=0} sub(titleLength, sum(termfreq(title,'smith'), termfreq(title,'john'))) bq. That would be O(docs) as it evaluates per doc. Can't we make this filter query execute last, with cache=false cost=150? > Support boolean subset matching > --- > > Key: LUCENE-7148 > URL: https://issues.apache.org/jira/browse/LUCENE-7148 > Project: Lucene - Core > Issue Type: New Feature > Components: core/search >Affects Versions: 5.x >Reporter: Otmar Caduff > Labels: newbie > > In Lucene, I know of the possibility of Occur.SHOULD, Occur.MUST and the > “minimum should match” setting on the boolean query. > Now, when querying, I want to > - (1) match the documents which either contain all the terms of the query > (Occur.MUST for all terms would do that) or, > - (2) if all terms for a given field of a document are a subset of the query > terms, that document should match as well. > Example: > Document d has field f with terms A, B, C > Query with the following terms should match that document: > A > B > A B > A B C > A B C D > Query with the following terms should not match: > D > A B D
[jira] [Commented] (LUCENE-7148) Support boolean subset matching
[ https://issues.apache.org/jira/browse/LUCENE-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227378#comment-15227378 ] Ahmet Arslan commented on LUCENE-7148: Can't we have a function query that just returns the number of matching terms here? Then we could compare it with the document length. > Support boolean subset matching > --- > > Key: LUCENE-7148 > URL: https://issues.apache.org/jira/browse/LUCENE-7148 > Project: Lucene - Core > Issue Type: New Feature > Components: core/search >Affects Versions: 5.x >Reporter: Otmar Caduff > Labels: newbie > > In Lucene, I know of the possibility of Occur.SHOULD, Occur.MUST and the > “minimum should match” setting on the boolean query. > Now, when querying, I want to > - (1) match the documents which either contain all the terms of the query > (Occur.MUST for all terms would do that) or, > - (2) if all terms for a given field of a document are a subset of the query > terms, that document should match as well. > Example: > Document d has field f with terms A, B, C > Query with the following terms should match that document: > A > B > A B > A B C > A B C D > Query with the following terms should not match: > D > A B D
Re: [VOTE] Release Lucene/Solr 6.0.0 RC2
+1 SUCCESS! [1:42:49.802039] Ahmet On Tuesday, April 5, 2016 1:09 AM, Anshum Gupta wrote: Thanks for taking this up Nick! Here's my +1: SUCCESS! [0:38:14.023246] On Fri, Apr 1, 2016 at 1:44 PM, Nicholas Knize wrote: > Please vote for the RC2 release candidate for Lucene/Solr 6.0.0. > Artifacts: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.0-RC2-rev48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 > Smoke tester: python3 -u dev-tools/scripts/smokeTestRelease.py https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.0-RC2-rev48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 > Here's my +1: > SUCCESS! [0:28:59.770357] > - Nick Knize -- Anshum Gupta
Re: Welcome Karl Wright as a Lucene/Solr committer!
Welcome Karl! On Monday, April 4, 2016 6:54 PM, Robert Muir wrote: Welcome Karl! On Mon, Apr 4, 2016 at 10:40 AM, Karl Wright wrote: > Hi all, > > Professionally, I've been active in software development since the 1970s. > My interests include many things related to software development, as well as > areas as varied as geology, carpentry, and gardening. I'm the PMC chair for > the ManifoldCF project, as well as a committer on other Apache projects such > as Http Components. > > My current employer is HERE, Inc., a spin-off from Nokia that sells > map data, services, and search capabilities. > > I'm also the contributor and principal author of the Geo3D package, which is > now part of Lucene under the spatial3d module. I intend to continue to > contribute to this package for the foreseeable future. > > Thanks!! > Karl > > > On Mon, Apr 4, 2016 at 10:28 AM, Michael McCandless > wrote: >> >> I'm pleased to announce that Karl Wright has accepted the Lucene PMC's >> invitation to become a committer. >> >> Karl, it's tradition that you introduce yourself with a brief bio. >> >> Karma has been granted to your pre-existing account, so that you can >> add yourself to the committers section of the Who We Are page on the >> website: http://lucene.apache.org/whoweare.html >> >> Congratulations and welcome! >> >> Mike McCandless >> >> http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-7132) ScoreDoc.score() returns a different value than that of Explanation's
[ https://issues.apache.org/jira/browse/LUCENE-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210008#comment-15210008 ] Ahmet Arslan commented on LUCENE-7132: It is really hard to decipher what is going on inside the good old TFIDFSimilarity.
{code:title=TFIDFSimilarity.IDFStats.normalize|borderStyle=solid}
@Override
public void normalize(float queryNorm, float boost) {
  this.boost = boost;
  this.queryNorm = queryNorm;
  queryWeight = queryNorm * boost * idf.getValue();
  value = queryWeight * idf.getValue(); // idf for document
}
{code}
* Why does the query weight have an IDF multiplicand?
* Why is TFIDFSimilarity.IDFStats#value set to IDF squared?
* Why is TFIDFSimilarity.IDFStats#value needed even though we have TFIDFSimilarity.IDFStats.idf.getValue()?
* TFIDFSimilarity.TFIDFSimScorer#score returns tf(freq) * IDFStats.value, which looks like tf x IDF x IDF to me.
> ScoreDoc.score() returns a different value than that of Explanation's > - > > Key: LUCENE-7132 > URL: https://issues.apache.org/jira/browse/LUCENE-7132 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 5.5 >Reporter: Ahmet Arslan >Assignee: Steve Rowe > Attachments: LUCENE-7132.patch, SOLR-8884.patch, SOLR-8884.patch, > debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically.
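To make the arithmetic in the quoted normalize() method concrete, here is a small sketch with plain floats. It is not Lucene code: the class and method names are invented, the idf value is passed in as an arbitrary number rather than computed, and field norms are left out, which is an assumption made to keep the example short.

```java
final class IdfSquaredSketch {
    // Mirrors the two assignments in the quoted normalize():
    // queryWeight = queryNorm * boost * idf, then value = queryWeight * idf.
    static float value(float queryNorm, float boost, float idf) {
        float queryWeight = queryNorm * boost * idf;
        return queryWeight * idf;
    }

    // tf multiplied by value, as described in the last bullet of the comment.
    static float score(float tf, float queryNorm, float boost, float idf) {
        return tf * value(queryNorm, boost, idf);
    }

    public static void main(String[] args) {
        float tf = 3.0f;
        float idf = 2.0f;
        // With queryNorm = 1 and boost = 1 this is tf * idf * idf = 3 * 2 * 2.
        System.out.println(score(tf, 1.0f, 1.0f, idf)); // prints 12.0
    }
}
```

As far as I recall, the TFIDFSimilarity Javadoc explains the square by idf appearing once on the query side and once on the document side of the vector-space formula, which would answer the first two bullets; that reading should be checked against the Javadoc for the version in question.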
[jira] [Updated] (LUCENE-7132) ScoreDoc.score() returns a different value than that of Explanation's
[ https://issues.apache.org/jira/browse/LUCENE-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7132: - Component/s: core/search > ScoreDoc.score() returns a different value than that of Explanation's > - > > Key: LUCENE-7132 > URL: https://issues.apache.org/jira/browse/LUCENE-7132 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 5.5 >Reporter: Ahmet Arslan >Assignee: Steve Rowe > Attachments: LUCENE-7132.patch, SOLR-8884.patch, SOLR-8884.patch, > debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7132) ScoreDoc.score() returns a different value than that of Explanation's
[ https://issues.apache.org/jira/browse/LUCENE-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209155#comment-15209155 ] Ahmet Arslan commented on LUCENE-7132: -- Thanks Steve for taking care of this! > ScoreDoc.score() returns a different value than that of Explanation's > - > > Key: LUCENE-7132 > URL: https://issues.apache.org/jira/browse/LUCENE-7132 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.5 > Reporter: Ahmet Arslan >Assignee: Steve Rowe > Attachments: LUCENE-7132.patch, SOLR-8884.patch, SOLR-8884.patch, > debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7132) ScoreDoc.score() returns a different value than that of Explanation's
[ https://issues.apache.org/jira/browse/LUCENE-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7132: Attachment: LUCENE-7132.patch Lucene-only patch. Interestingly, the *testExplainScoreEquality* method also failed once for me, which can be reproduced with: {{ant test -Dtestcase=TestExplain -Dtests.method=testExplainScoreEquality -Dtests.seed=B90C674F754D524 -Dtests.locale=de -Dtests.timezone=Etc/GMT-12 -Dtests.asserts=true -Dtests.file.encoding=UTF-8}} However, the *testRajeshData* method fails more frequently. > ScoreDoc.score() returns a different value than that of Explanation's > - > > Key: LUCENE-7132 > URL: https://issues.apache.org/jira/browse/LUCENE-7132 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.5 > Reporter: Ahmet Arslan >Assignee: Steve Rowe > Attachments: LUCENE-7132.patch, SOLR-8884.patch, SOLR-8884.patch, > debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically.
[jira] [Updated] (LUCENE-7132) ScoreDoc.score() returns a different value than that of Explanation's
[ https://issues.apache.org/jira/browse/LUCENE-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7132: - Summary: ScoreDoc.score() returns a different value than that of Explanation's (was: fl=score returns a different value than that of Explain's) > ScoreDoc.score() returns a different value than that of Explanation's > - > > Key: LUCENE-7132 > URL: https://issues.apache.org/jira/browse/LUCENE-7132 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.5 > Reporter: Ahmet Arslan >Assignee: Steve Rowe > Attachments: SOLR-8884.patch, SOLR-8884.patch, debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8884) fl=score returns a different value than that of Explain's
[ https://issues.apache.org/jira/browse/SOLR-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208940#comment-15208940 ] Ahmet Arslan commented on SOLR-8884: Can someone who has the appropriate permissions please move SOLR-8884 to LUCENE-? > fl=score returns a different value than that of Explain's > - > > Key: SOLR-8884 > URL: https://issues.apache.org/jira/browse/SOLR-8884 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 5.5 >Reporter: Ahmet Arslan > Attachments: SOLR-8884.patch, SOLR-8884.patch, debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically.
[jira] [Updated] (SOLR-8884) fl=score returns a different value than that of Explain's
[ https://issues.apache.org/jira/browse/SOLR-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8884: Attachment: SOLR-8884.patch This is truly a Lucene-level bug. The attached patch includes a failing test case. It can be reproduced with: {{ant test -Dtestcase=TestExplain -Dtests.method=testRajeshData -Dtests.seed=D5E55A7E84F4C82C -Dtests.slow=true -Dtests.locale=es-HN -Dtests.timezone=Asia/Samarkand -Dtests.asserts=true -Dtests.file.encoding=UTF-8}} > fl=score returns a different value than that of Explain's > - > > Key: SOLR-8884 > URL: https://issues.apache.org/jira/browse/SOLR-8884 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 5.5 >Reporter: Ahmet Arslan > Attachments: SOLR-8884.patch, SOLR-8884.patch, debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically.
[jira] [Updated] (SOLR-8884) fl=score returns a different value than that of Explain's
[ https://issues.apache.org/jira/browse/SOLR-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8884: --- Attachment: SOLR-8884.patch Randomized test case for Lucene in hopes that it will trigger sometime. Will try to write Solr counterpart. > fl=score returns a different value than that of Explain's > - > > Key: SOLR-8884 > URL: https://issues.apache.org/jira/browse/SOLR-8884 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 5.5 >Reporter: Ahmet Arslan > Attachments: SOLR-8884.patch, debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8884) fl=score returns a different value than that of Explain's
[ https://issues.apache.org/jira/browse/SOLR-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8884: Attachment: debug.xml Here is Rajesh's response file, which demonstrates the problem. > fl=score returns a different value than that of Explain's > - > > Key: SOLR-8884 > URL: https://issues.apache.org/jira/browse/SOLR-8884 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 5.5 >Reporter: Ahmet Arslan > Attachments: debug.xml > > > Some of the folks > [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes > explain's score can be different than the score requested by fields > parameter. Interestingly, Explain's scores would create a different ranking > than the original result list. This is something users experience, but it > cannot be re-produced deterministically.
[jira] [Created] (SOLR-8884) fl=score returns a different value than that of Explain's
Ahmet Arslan created SOLR-8884: -- Summary: fl=score returns a different value than that of Explain's Key: SOLR-8884 URL: https://issues.apache.org/jira/browse/SOLR-8884 Project: Solr Issue Type: Bug Components: search Affects Versions: 5.5 Reporter: Ahmet Arslan Some of the folks [reported|http://find.searchhub.org/document/80666f5c3b86ddda] that sometimes explain's score can be different than the score requested by fields parameter. Interestingly, Explain's scores would create a different ranking than the original result list. This is something users experience, but it cannot be re-produced deterministically. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7014) Use TimeUnit.TARGETUNIT.convert() to convert between time units
[ https://issues.apache.org/jira/browse/LUCENE-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7014: Attachment: LUCENE-7014.patch I started to incorporate the suggested changes. This patch includes only the {{org.apache.lucene.index.*}} files. For three digits, I switched to milliseconds. However, I rounded with {{%.1f}}. Is this reasonable in terms of precision loss? Maybe we should not touch these cases? > Use TimeUnit.TARGETUNIT.convert() to convert between time units > --- > > Key: LUCENE-7014 > URL: https://issues.apache.org/jira/browse/LUCENE-7014 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master, 5.4.1 > Reporter: Ahmet Arslan >Priority: Minor > Fix For: 5.5, master, 6.0 > > Attachments: LUCENE-7014.patch, LUCENE-7014.patch > > > Re-phrased from [~steve_rowe]'s > [comment|https://issues.apache.org/jira/browse/LUCENE-6823?focusedCommentId=14941283=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14941283] > : > System.nanoTime(), which is guaranteed to be monotonic, is now used to > record elapsed times. In several places, conversion from nanoseconds to > some target unit (e.g. seconds, milliseconds) is performed using hard-coded > conversion constants, which is prone to mistakes. > It would be nice to use {{TimeUnit.TARGETUNIT.convert(sourceDuration, > TimeUnit.SOURCEUNIT)}} instead.
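The conversion proposed in the issue can be sketched as follows. This is a minimal illustration, not code from the attached patch; the class and method names are made up for the example. One thing the sketch makes visible is that {{TimeUnit.convert}} truncates, which is relevant to the precision question raised in the comment: fractional output still needs a floating-point division.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

final class ElapsedTimeSketch {
    // Hypothetical helper: whole milliseconds between two System.nanoTime() readings.
    // TimeUnit.convert truncates, so 1,999,999 ns becomes 1 ms.
    static long elapsedMillis(long startNanos, long endNanos) {
        return TimeUnit.MILLISECONDS.convert(endNanos - startNanos, TimeUnit.NANOSECONDS);
    }

    // Hypothetical helper: fractional seconds, using TimeUnit for the constant
    // instead of a hand-written 1_000_000_000.
    static double elapsedSecondsFractional(long startNanos, long endNanos) {
        return (endNanos - startNanos) / (double) TimeUnit.SECONDS.toNanos(1);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(20);
        long end = System.nanoTime();
        System.out.println(elapsedMillis(start, end) + " ms");
        System.out.printf(Locale.ROOT, "%.1f s%n", elapsedSecondsFractional(start, end));
    }
}
```

Keeping the unit names in the call makes a wrong constant (for example 1,000 where 1,000,000 was meant) impossible to write, which is the mistake the issue is trying to rule out.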
[jira] [Updated] (LUCENE-7014) Use TimeUnit.TARGETUNIT.convert() to convert between time units
[ https://issues.apache.org/jira/browse/LUCENE-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated LUCENE-7014: Attachment: LUCENE-7014.patch > Use TimeUnit.TARGETUNIT.convert() to convert between time units > --- > > Key: LUCENE-7014 > URL: https://issues.apache.org/jira/browse/LUCENE-7014 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master, 5.4.1 > Reporter: Ahmet Arslan >Priority: Minor > Fix For: 5.5, master, 6.0 > > Attachments: LUCENE-7014.patch > > > Re-phrased from [~steve_rowe]'s > [comment|https://issues.apache.org/jira/browse/LUCENE-6823?focusedCommentId=14941283=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14941283] > : > System.nanoTime(), which is guaranteed to be monotonic, is now used to > record elapsed times. In several places, conversion from nanoseconds to > some target unit (e.g. seconds, milliseconds) is performed using hard-coded > conversion constants, which is prone to mistakes. > It would be nice to use {{TimeUnit.TARGETUNIT.convert(sourceDuration, > TimeUnit.SOURCEUNIT)}} instead.
[jira] [Created] (LUCENE-7014) Use TimeUnit.TARGETUNIT.convert() to convert between time units
Ahmet Arslan created LUCENE-7014: Summary: Use TimeUnit.TARGETUNIT.convert() to convert between time units Key: LUCENE-7014 URL: https://issues.apache.org/jira/browse/LUCENE-7014 Project: Lucene - Core Issue Type: Improvement Affects Versions: 5.4.1, master Reporter: Ahmet Arslan Priority: Minor Fix For: 5.5, master, 6.0 Re-phrased from [~steve_rowe]'s [comment|https://issues.apache.org/jira/browse/LUCENE-6823?focusedCommentId=14941283=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14941283] : System.nanoTime(), which is guaranteed to be monotonic, is now used to record elapsed times. In several places, conversion from nanoseconds to some target unit (e.g. seconds, milliseconds) is performed using hard-coded conversion constants, which is prone to mistakes. It would be nice to use {{TimeUnit.TARGETUNIT.convert(sourceDuration, TimeUnit.SOURCEUNIT)}} instead.
[jira] [Updated] (SOLR-8445) fix line separator in log4j.properties files
[ https://issues.apache.org/jira/browse/SOLR-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8445: --- Fix Version/s: 5.5 > fix line separator in log4j.properties files > > > Key: SOLR-8445 > URL: https://issues.apache.org/jira/browse/SOLR-8445 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.4, Trunk >Reporter: Ahmet Arslan >Priority: Trivial > Labels: log4j, logging > Fix For: 5.5, Trunk > > Attachments: SOLR-8445.patch, SOLR-8445.patch > > > new line is mistyped in conversion pattern -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8570) Make discountOverlaps' initialization value consistent across subclasses of SimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8570: --- Fix Version/s: 5.5 > Make discountOverlaps' initialization value consistent across subclasses of > SimilarityFactory > -- > > Key: SOLR-8570 > URL: https://issues.apache.org/jira/browse/SOLR-8570 > Project: Solr > Issue Type: Improvement >Affects Versions: 5.4 >Reporter: Ahmet Arslan >Priority: Minor > Labels: similarity > Fix For: 5.5, Trunk > > Attachments: SOLR-8570.patch, SOLR-8570.patch > > > Subclasses of SimilarityFactory have a member variable named > {{discountOverlaps}}. > In ClassicSimilarityFactory, it is initialized to {{true}} in SOLR-5561. > Since discountOverlaps' default value is true, we should do the same in > remaining subclasses.
[jira] [Updated] (SOLR-8445) fix line separator in log4j.properties files
[ https://issues.apache.org/jira/browse/SOLR-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8445: --- Attachment: SOLR-8445.patch Patch generated by {{git diff}}. > fix line separator in log4j.properties files > > > Key: SOLR-8445 > URL: https://issues.apache.org/jira/browse/SOLR-8445 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.4, Trunk >Reporter: Ahmet Arslan >Priority: Trivial > Labels: log4j, logging > Fix For: Trunk > > Attachments: SOLR-8445.patch, SOLR-8445.patch > > > new line is mistyped in conversion pattern
[jira] [Updated] (SOLR-8570) Make discountOverlaps' initialization value consistent across subclasses of SimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8570: --- Attachment: SOLR-8570.patch Patch generated by {{git diff origin/master..SOLR-8570}}. > Make discountOverlaps' initialization value consistent across subclasses of > SimilarityFactory > -- > > Key: SOLR-8570 > URL: https://issues.apache.org/jira/browse/SOLR-8570 > Project: Solr > Issue Type: Improvement >Affects Versions: 5.4 >Reporter: Ahmet Arslan >Priority: Minor > Labels: similarity > Fix For: Trunk > > Attachments: SOLR-8570.patch, SOLR-8570.patch > > > Subclasses of SimilarityFactory have a member variable named > {{discountOverlaps}}. > In ClassicSimilarityFactory, it is initialized to {{true}} in SOLR-5561. > Since discountOverlaps' default value is true, we should do the same in > remaining subclasses.
Re: [VOTE] Release Lucene/Solr 5.3.2-RC2
+1 SUCCESS! [1:38:55.940645] On Tuesday, January 19, 2016 10:25 PM, Yonik Seeley wrote: +1 -Yonik On Mon, Jan 18, 2016 at 11:23 AM, Anshum Gupta wrote: > Please vote for the RC2 release candidate for Lucene/Solr 5.3.2 > > The artifacts can be downloaded from: > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.3.2-RC2-rev1725196 > > You can run the smoke tester directly with this command: > python3 -u dev-tools/scripts/smokeTestRelease.py > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.3.2-RC2-rev1725196 > > Here's my +1 > > SUCCESS! [0:26:22.094521] > > -- > Anshum Gupta
Re: [VOTE] Release Lucene/Solr 5.4.1 RC2
+1 SUCCESS! [1:50:21.498224] On Wednesday, January 20, 2016 1:28 AM, Tomás Fernández Löbbe wrote: +1 SUCCESS! [1:27:55.987215] On Tue, Jan 19, 2016 at 12:25 PM, Yonik Seeley wrote: +1 > >-Yonik > > >On Mon, Jan 18, 2016 at 9:38 AM, Adrien Grand wrote: >> Please vote for the RC2 release candidate for Lucene/Solr 5.4.1 >> >> This release candidate contains 3 additional changes compared to the RC1: >> - SOLR-8496: multi-select faceting and getDocSet(List) can match >> deleted docs >> - SOLR-8418: Adapt to changes in LUCENE-6590 for use of boosts with >> MLTHandler and Simple/CloudMLTQParser >> - SOLR-8561: Add fallback to ZkController.getLeaderProps for a mixed >> 5.4-pre-5.4 deployments >> >> The artifacts can be downloaded from: >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.1-RC2-rev1725212 >> >> You can run the smoke tester directly with this command: >> python3 -u dev-tools/scripts/smokeTestRelease.py >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.1-RC2-rev1725212 >> >> The smoke tester already passed for me both with the local and remote >> artifacts, so here is my +1.
[jira] [Updated] (SOLR-8445) fix line separator in log4j.properties files
[ https://issues.apache.org/jira/browse/SOLR-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8445: --- Fix Version/s: Trunk > fix line separator in log4j.properties files > > > Key: SOLR-8445 > URL: https://issues.apache.org/jira/browse/SOLR-8445 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.4, Trunk >Reporter: Ahmet Arslan >Priority: Trivial > Labels: log4j, logging > Fix For: Trunk > > Attachments: SOLR-8445.patch > > > new line is mistyped in conversion pattern
[jira] [Updated] (SOLR-8445) fix line separator in log4j.properties files
[ https://issues.apache.org/jira/browse/SOLR-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8445: --- Labels: log4j logging (was: ) > fix line separator in log4j.properties files > > > Key: SOLR-8445 > URL: https://issues.apache.org/jira/browse/SOLR-8445 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.4, Trunk >Reporter: Ahmet Arslan >Priority: Trivial > Labels: log4j, logging > Fix For: Trunk > > Attachments: SOLR-8445.patch > > > new line is mistyped in conversion pattern
[jira] [Updated] (SOLR-8570) Make discountOverlaps' initialization value consistent across subclasses of SimilarityFactory
[ https://issues.apache.org/jira/browse/SOLR-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8570: --- Attachment: SOLR-8570.patch I had this patch handy. However, does moving {{protected boolean discountOverlaps = true;}} into SimilarityFactory break any good practices? > Make discountOverlaps' initialization value consistent across subclasses of > SimilarityFactory > -- > > Key: SOLR-8570 > URL: https://issues.apache.org/jira/browse/SOLR-8570 > Project: Solr > Issue Type: Improvement >Affects Versions: 5.4 >Reporter: Ahmet Arslan >Priority: Minor > Labels: similarity > Fix For: Trunk > > Attachments: SOLR-8570.patch > > > Subclasses of SimilarityFactory have a member variable named > {{discountOverlaps}}. > In ClassicSimilarityFactory, it is initialized to {{true}} in SOLR-5561. > Since discountOverlaps' default value is true, we should do the same in > remaining subclasses.
[jira] [Created] (SOLR-8570) Make discountOverlaps' initialization value consistent across subclasses of SimilarityFactory
Ahmet Arslan created SOLR-8570: -- Summary: Make discountOverlaps' initialization value consistent across subclasses of SimilarityFactory Key: SOLR-8570 URL: https://issues.apache.org/jira/browse/SOLR-8570 Project: Solr Issue Type: Improvement Affects Versions: 5.4 Reporter: Ahmet Arslan Priority: Minor Fix For: Trunk Subclasses of SimilarityFactory have a member variable named {{discountOverlaps}}. In ClassicSimilarityFactory, it is initialized to {{true}} in SOLR-5561. Since discountOverlaps' default value is true, we should do the same in remaining subclasses.
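To make the proposal in SOLR-8570 concrete, here is a rough sketch of declaring the field once, with its default, in the base class so that every subclass starts from the same value. All classes below (SketchSimilarityFactory, SketchClassicFactory, SketchBM25Factory) are hypothetical stand-ins written for this example; they are not Solr's real SimilarityFactory hierarchy, and the Map-based init() is an assumption, not Solr's actual signature.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for a factory base class; NOT Solr's SimilarityFactory.
abstract class SketchSimilarityFactory {
  // Declared once with its default, so subclasses cannot disagree on the initial value.
  protected boolean discountOverlaps = true;

  // Assumed configuration hook: an explicit setting still overrides the default.
  void init(Map<String, String> params) {
    String value = params.get("discountOverlaps");
    if (value != null) {
      discountOverlaps = Boolean.parseBoolean(value);
    }
  }

  boolean getDiscountOverlaps() {
    return discountOverlaps;
  }
}

// Subclasses inherit the field and its default instead of redeclaring it.
final class SketchClassicFactory extends SketchSimilarityFactory {}

final class SketchBM25Factory extends SketchSimilarityFactory {}

final class DiscountOverlapsExample {
  public static void main(String[] args) {
    SketchSimilarityFactory classic = new SketchClassicFactory();
    classic.init(Collections.<String, String>emptyMap());

    SketchSimilarityFactory bm25 = new SketchBM25Factory();
    bm25.init(Collections.singletonMap("discountOverlaps", "false"));

    System.out.println("classic: " + classic.getDiscountOverlaps());
    System.out.println("bm25: " + bm25.getDiscountOverlaps());
  }
}
```

Whether a shared protected field is preferable to initializing the field separately in each subclass is the open question raised on the issue; the sketch only shows what the shared-field variant would look like.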
[jira] [Commented] (LUCENE-6818) Implementing Divergence from Independence (DFI) Term-Weighting for Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106999#comment-15106999 ] Ahmet Arslan commented on LUCENE-6818: -- Thanks [~rcmuir] for taking care of this. bq. For the solr factory changes around discountOverlaps, can you make a separate issue for that? Created SOLR-8570 > Implementing Divergence from Independence (DFI) Term-Weighting for Lucene/Solr > -- > > Key: LUCENE-6818 > URL: https://issues.apache.org/jira/browse/LUCENE-6818 > Project: Lucene - Core > Issue Type: New Feature > Components: core/query/scoring >Affects Versions: 5.3 >Reporter: Ahmet Arslan >Assignee: Robert Muir >Priority: Minor > Labels: similarity > Fix For: 5.5, Trunk > > Attachments: LUCENE-6818.patch, LUCENE-6818.patch, LUCENE-6818.patch, > LUCENE-6818.patch, LUCENE-6818.patch > > > As explained in the > [write-up|http://lucidworks.com/blog/flexible-ranking-in-lucene-4], many > state-of-the-art ranking model implementations have been added to Apache Lucene. > This issue aims to include the DFI model, which is the non-parametric counterpart > of the Divergence from Randomness (DFR) framework. > DFI is both parameter-free and non-parametric: > * parameter-free: it does not require any parameter tuning or training. > * non-parametric: it does not make any assumptions about word frequency > distributions on document collections. > It is highly recommended *not* to remove stopwords (very common terms: the, > of, and, to, a, in, for, is, on, that, etc.) with this similarity. > For more information see: [A nonparametric term weighting method for > information retrieval based on measuring the divergence from > independence|http://dx.doi.org/10.1007/s10791-013-9225-4]
[jira] [Updated] (SOLR-839) XML Query Parser support (deftype=xmlparser)
[ https://issues.apache.org/jira/browse/SOLR-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-839: -- Attachment: SOLR-839.patch This patch replaces utf8 constant string with StandardCharsets.UTF_8 as suggested by LUCENE-5560 > XML Query Parser support (deftype=xmlparser) > > > Key: SOLR-839 > URL: https://issues.apache.org/jira/browse/SOLR-839 > Project: Solr > Issue Type: New Feature > Components: query parsers >Affects Versions: 1.3, 5.4, Trunk >Reporter: Erik Hatcher >Assignee: Christine Poerschke >Priority: Minor > Fix For: Trunk > > Attachments: SOLR-839-object-parser.patch, SOLR-839.patch, > SOLR-839.patch, SOLR-839.patch, lucene-xml-query-parser-2.4-dev.jar > > > Lucene contrib includes a query parser that is able to create the > full-spectrum of Lucene queries, using an XML data structure. > This patch adds "xml" query parser support to Solr. > Example (from > {{lucene/queryparser/src/test/org/apache/lucene/queryparser/xml/NestedBooleanQuery.xml}}): > {code} > > > > > > doesNotExistButShouldBeOKBecauseOtherClauseExists > > > > > bank > > > {code}
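For context on the change described in this comment, the following sketch contrasts a charset-name string with the StandardCharsets.UTF_8 constant. The class and method names (Utf8ConstantExample, encode, encodeByName, asStream) are invented for illustration and do not come from the SOLR-839 patch; only the String, ByteArrayInputStream and StandardCharsets calls are standard JDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not code from the SOLR-839 patch.
final class Utf8ConstantExample {

  // Charset given by name: a typo in the string is only detected at run time,
  // and the caller has to deal with a checked exception.
  static byte[] encodeByName(String text) throws UnsupportedEncodingException {
    return text.getBytes("UTF-8");
  }

  // Charset given by constant: checked by the compiler, no checked exception.
  static byte[] encode(String text) {
    return text.getBytes(StandardCharsets.UTF_8);
  }

  // Typical use when a query string has to be handed to a parser as a stream.
  static InputStream asStream(String text) {
    return new ByteArrayInputStream(encode(text));
  }

  public static void main(String[] args) throws Exception {
    String text = "bank";
    System.out.println(encode(text).length + " bytes via StandardCharsets.UTF_8");
    System.out.println(encodeByName(text).length + " bytes via the \"UTF-8\" name");
    System.out.println(asStream(text).available() + " bytes available in the stream");
  }
}
```

Both variants produce the same bytes; the constant simply removes the possibility of a misspelled charset name and the need to handle UnsupportedEncodingException.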
[jira] [Updated] (SOLR-8445) fix line separator in log4j.properties files
[ https://issues.apache.org/jira/browse/SOLR-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-8445: --- Attachment: SOLR-8445.patch > fix line separator in log4j.properties files > > > Key: SOLR-8445 > URL: https://issues.apache.org/jira/browse/SOLR-8445 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.4, Trunk >Reporter: Ahmet Arslan >Priority: Trivial > Attachments: SOLR-8445.patch > > > new line is mistyped in conversion pattern
[jira] [Created] (SOLR-8445) fix line separator in log4j.properties files
Ahmet Arslan created SOLR-8445: -- Summary: fix line separator in log4j.properties files Key: SOLR-8445 URL: https://issues.apache.org/jira/browse/SOLR-8445 Project: Solr Issue Type: Bug Components: Server Affects Versions: 5.4, Trunk Reporter: Ahmet Arslan Priority: Trivial new line is mistyped in conversion pattern