[jira] [Resolved] (SOLR-12254) TestInPlaceUpdatesDistrib reproducing failures

2019-07-26 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N resolved SOLR-12254.
-
Resolution: Fixed

Closing this, as there are no failures after SOLR-12801

> TestInPlaceUpdatesDistrib reproducing failures
> --
>
> Key: SOLR-12254
> URL: https://issues.apache.org/jira/browse/SOLR-12254
> Project: Solr
>  Issue Type: Bug
>  Components: Tests, update
>Reporter: Steve Rowe
>Priority: Major
>
> From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/205/], 100% 
> reproducing (see [https://builds.apache.org/job/Lucene-Solr-repro/535/]):
> {noformat}
> Checking out Revision 3d21fda4ce1c899f31b8f00e200eb1ac0d23d17b 
> (refs/remotes/origin/branch_7x)
> [...]
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestInPlaceUpdatesDistrib -Dtests.method=test 
> -Dtests.seed=9BC71F2BDDB8F28A -Dtests.multiplier=2 -Dtests.nightly=true 
> -Dtests.slow=true 
> -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
>  -Dtests.locale=ru-RU -Dtests.timezone=Hongkong -Dtests.asserts=true 
> -Dtests.file.encoding=ISO-8859-1
>[junit4] ERROR   23.6s J2 | TestInPlaceUpdatesDistrib.test <<<
>[junit4]> Throwable #1: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at https://127.0.0.1:56916/collection1: ERROR adding document 
> SolrInputDocument(fields: [id=-216, title_s=title-216, id_i=-216, 
> _version_=1598231319283761152])
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([9BC71F2BDDB8F28A:139320F173449F72]:0)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
>[junit4]>  at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
>[junit4]>  at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.addDocAndGetVersion(TestInPlaceUpdatesDistrib.java:1105)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.buildRandomIndex(TestInPlaceUpdatesDistrib.java:1150)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.docValuesUpdateTest(TestInPlaceUpdatesDistrib.java:318)
>[junit4]>  at 
> org.apache.solr.update.TestInPlaceUpdatesDistrib.test(TestInPlaceUpdatesDistrib.java:144)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
>[junit4]>  at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
>[junit4]>  at java.lang.Thread.run(Thread.java:748)
> [...]
>[junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
> {title_s=PostingsFormat(name=LuceneFixedGap), id=Lucene50(blocksize=128), 
> id_field_copy_that_does_not_support_in_place_update_s=PostingsFormat(name=Memory)},
>  docValues:{inplace_updatable_float=DocValuesFormat(name=Lucene70), 
> id_i=DocValuesFormat(name=Direct), _version_=DocValuesFormat(name=Asserting), 
> id=DocValuesFormat(name=Memory), 
> inplace_updatable_int_with_default=DocValuesFormat(name=Lucene70), 
> inplace_updatable_float_with_default=DocValuesFormat(name=Direct)}, 
> maxPointsInLeafNode=922, maxMBSortInHeap=5.690194493492291, 
> sim=RandomSimilarity(queryNorm=true): {}, locale=ru-RU, timezone=Hongkong
>[junit4]   2> NOTE: Linux 3.13.0-88-generic amd64/Oracle Corporation 
> 1.8.0_152 (64-bit)/cpus=4,threads=1,free=127774192,total=523763712
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8936) Add SpanishMinimalStemFilter

2019-07-26 Thread vinod kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894276#comment-16894276
 ] 

vinod kumar commented on LUCENE-8936:
-

[~atris] can you please help me with this? I have completed the development, 
but access is denied for me to raise a pull request.

> Add SpanishMinimalStemFilter
> 
>
> Key: LUCENE-8936
> URL: https://issues.apache.org/jira/browse/LUCENE-8936
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: vinod kumar
>Priority: Major
>
> SpanishMinimalStemmerFilter is a less aggressive stemmer than 
> SpanishLightStemmerFilter.
> Ex:
> input tokens -> output tokens
>  1. camiseta niños -> *camiseta* and *nino*
>  2. camisas -> camisa
> *camisetas* and *camisas* are t-shirts and shirts respectively.
>  Stemming both tokens to *camis* would match both and return both t-shirts 
> and shirts for the query camisas (shirts). 
> SpanishMinimalStemmerFilter will help handle these cases.
> Importantly, it will preserve the gender context of tokens.
> Ex: *niños*, *niñas*, *chicos* and *chicas* are stemmed to *nino*, *nina*, 
> *chico* and *chica*
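To make the proposal concrete, a plural-only stemmer of the kind described might look like the sketch below. This is hypothetical illustration code, not the actual patch: it strips only plural suffixes, which is why gender-bearing endings survive.

```java
public class SpanishMinimalStemDemo {
    // Hypothetical minimal Spanish stemmer: strips plural suffixes only,
    // so gender-bearing endings ("-o" / "-a") are preserved.
    static String stemMinimal(String term) {
        if (term.length() > 4 && term.endsWith("es")) {
            return term.substring(0, term.length() - 2);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stemMinimal("camisas"));   // camisa
        System.out.println(stemMinimal("camisetas")); // camiseta
        System.out.println(stemMinimal("chicos"));    // chico
        System.out.println(stemMinimal("chicas"));    // chica
    }
}
```

A light stemmer, by contrast, also strips the gender vowel, collapsing e.g. camisetas and camisas toward a common stem.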






[jira] [Commented] (SOLR-12555) Replace try-fail-catch test patterns

2019-07-26 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894275#comment-16894275
 ] 

Munendra S N commented on SOLR-12555:
-

 [^SOLR-12555.patch] 
For all other packages

> Replace try-fail-catch test patterns
> 
>
> Key: SOLR-12555
> URL: https://issues.apache.org/jira/browse/SOLR-12555
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 8.0
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Trivial
> Attachments: SOLR-12555-sorted-by-package.txt, SOLR-12555.patch, 
> SOLR-12555.patch, SOLR-12555.patch, SOLR-12555.txt
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> I recently added some test code through SOLR-12427 which used the following 
> test anti-pattern:
> {code}
> try {
>   actionExpectedToThrowException();
>   fail("I expected this to throw an exception, but it didn't");
> } catch (Exception e) {
>   assertOnThrownException(e);
> }
> {code}
> Hoss (rightfully) objected that this should instead be written using the 
> formulation below, which is clearer and more concise.
> {code}
> SolrException e = expectThrows(() -> {...});
> {code}
> We should remove many of these older formulations where it makes sense.  Many 
> of them were written before {{expectThrows}} was introduced, and having the 
> old style assertions around makes it easier for them to continue creeping in.
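For readers unfamiliar with the pattern, the sketch below shows a minimal stand-in for expectThrows. Lucene's real helper lives in LuceneTestCase; this simplified, self-contained version is for illustration only.

```java
public class ExpectThrowsDemo {
    @FunctionalInterface
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    // Minimal stand-in for LuceneTestCase.expectThrows: runs the action,
    // fails if nothing is thrown or the wrong type is thrown, and
    // returns the typed exception for further assertions.
    static <T extends Throwable> T expectThrows(Class<T> expectedType, ThrowingRunnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            if (expectedType.isInstance(t)) {
                return expectedType.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass(), t);
        }
        throw new AssertionError("Expected " + expectedType.getSimpleName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        IllegalArgumentException e = expectThrows(IllegalArgumentException.class,
                () -> { throw new IllegalArgumentException("bad input"); });
        System.out.println(e.getMessage()); // prints "bad input"
    }
}
```

The key advantage over try-fail-catch: the "nothing was thrown" failure cannot be accidentally swallowed by the catch block, and the caller gets a correctly-typed exception to assert on.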






[jira] [Updated] (SOLR-12555) Replace try-fail-catch test patterns

2019-07-26 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-12555:

Status: Patch Available  (was: Open)

> Replace try-fail-catch test patterns
> 
>
> Key: SOLR-12555
> URL: https://issues.apache.org/jira/browse/SOLR-12555
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 8.0
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Trivial
> Attachments: SOLR-12555-sorted-by-package.txt, SOLR-12555.patch, 
> SOLR-12555.patch, SOLR-12555.patch, SOLR-12555.txt
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> I recently added some test code through SOLR-12427 which used the following 
> test anti-pattern:
> {code}
> try {
>   actionExpectedToThrowException();
>   fail("I expected this to throw an exception, but it didn't");
> } catch (Exception e) {
>   assertOnThrownException(e);
> }
> {code}
> Hoss (rightfully) objected that this should instead be written using the 
> formulation below, which is clearer and more concise.
> {code}
> SolrException e = expectThrows(() -> {...});
> {code}
> We should remove many of these older formulations where it makes sense.  Many 
> of them were written before {{expectThrows}} was introduced, and having the 
> old style assertions around makes it easier for them to continue creeping in.






[jira] [Updated] (SOLR-12555) Replace try-fail-catch test patterns

2019-07-26 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-12555:

Attachment: SOLR-12555.patch

> Replace try-fail-catch test patterns
> 
>
> Key: SOLR-12555
> URL: https://issues.apache.org/jira/browse/SOLR-12555
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 8.0
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Trivial
> Attachments: SOLR-12555-sorted-by-package.txt, SOLR-12555.patch, 
> SOLR-12555.patch, SOLR-12555.patch, SOLR-12555.txt
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> I recently added some test code through SOLR-12427 which used the following 
> test anti-pattern:
> {code}
> try {
>   actionExpectedToThrowException();
>   fail("I expected this to throw an exception, but it didn't");
> } catch (Exception e) {
>   assertOnThrownException(e);
> }
> {code}
> Hoss (rightfully) objected that this should instead be written using the 
> formulation below, which is clearer and more concise.
> {code}
> SolrException e = expectThrows(() -> {...});
> {code}
> We should remove many of these older formulations where it makes sense.  Many 
> of them were written before {{expectThrows}} was introduced, and having the 
> old style assertions around makes it easier for them to continue creeping in.






[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-12.0.1) - Build # 24454 - Unstable!

2019-07-26 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/24454/
Java: 64bit/jdk-12.0.1 -XX:-UseCompressedOops -XX:+UseSerialGC

7 tests failed.
FAILED:  org.apache.solr.cloud.AliasIntegrationTest.testClusterStateProviderAPI

Error Message:
[testClusterStateProviderAPI] expected:<2> but was:<1>

Stack Trace:
java.lang.AssertionError: [testClusterStateProviderAPI] expected:<2> but was:<1>
at 
__randomizedtesting.SeedInfo.seed([DA122FAB54BD28A:1276BED6C6402BC1]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at 
org.apache.solr.cloud.AliasIntegrationTest.testClusterStateProviderAPI(AliasIntegrationTest.java:299)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:835)


FAILED:  org.apache.solr.cloud.rule.RulesTest.doIntegrationTest

Error Message:
Timeout occurred while waiting response from server at: 
https://127.0.0.1:37885/solr

Stack Trace:

[jira] [Commented] (SOLR-13645) Add analytics function to format/extract components from dates

2019-07-26 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894246#comment-16894246
 ] 

Lucene/Solr QA commented on SOLR-13645:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate ref guide {color} | 
{color:green}  1m 35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 15s{color} 
| {color:red} analytics in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  9m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.analytics.function.mapping.DateFormatFunctionTest |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13645 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12976002/SOLR-13645-Analytics-function-for-date-components.patch
 |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  validaterefguide  |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 4050ddc59b |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
| unit | 
https://builds.apache.org/job/PreCommit-SOLR-Build/508/artifact/out/patch-unit-solr_contrib_analytics.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/508/testReport/ |
| modules | C: solr/contrib/analytics solr/solr-ref-guide U: solr |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/508/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Add analytics function to format/extract components from dates
> --
>
> Key: SOLR-13645
> URL: https://issues.apache.org/jira/browse/SOLR-13645
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Minor
> Attachments: SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch
>
>
> It's helpful when running analytics to be able to manipulate dates, such as 
> extracting the month/day/year, converting to the week of year, etc., along 
> with other formatting that many existing libraries provide.  I have a patch 
> going through final testing that will add this to the analytics library.
> One thing I'm somewhat ambivalent about is that it exposes our use of Java 
> date parsing in the analytics function, because the syntax is the same format 
> string that SimpleDateFormat accepts.  Ideally there would be an abstraction 
> between the analytics language and what's used on the backend to implement 
> it.  On the other hand, implementing a syntax for time/date formatting has 
> been done many times before, and this is not the only place where Java date 
> particulars show through.  It would be good to revisit this at a later time.
>  
>  
>  
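As context for the SimpleDateFormat concern above, the same pattern-letter syntax looks like this in plain Java. This is an illustration of the format-string syntax only; the actual analytics function names and expression syntax are not shown here.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateFormatDemo {
    public static void main(String[] args) throws Exception {
        // Parse an ISO-style timestamp using a SimpleDateFormat pattern.
        SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        parser.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date d = parser.parse("2019-07-26T12:00:00Z");

        // Extract components using the same pattern letters a format string would use.
        SimpleDateFormat year = new SimpleDateFormat("yyyy");
        year.setTimeZone(TimeZone.getTimeZone("UTC"));
        SimpleDateFormat week = new SimpleDateFormat("ww"); // week of year (locale-dependent)
        week.setTimeZone(TimeZone.getTimeZone("UTC"));

        System.out.println(year.format(d)); // 2019
        System.out.println(week.format(d));
    }
}
```

Exposing this syntax directly means any future move away from SimpleDateFormat on the backend would be a breaking change for users, which is the abstraction concern raised above.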




[jira] [Commented] (SOLR-13643) ResponseBuilder should provide accessors/setters for analytics response handling

2019-07-26 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894241#comment-16894241
 ] 

Lucene/Solr QA commented on SOLR-13643:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
57s{color} | {color:green} analytics in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 33m 
19s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13643 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12975999/SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch
 |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 4050ddc59b |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/507/testReport/ |
| modules | C: solr/contrib/analytics solr/core U: solr |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/507/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> ResponseBuilder should provide accessors/setters for analytics response 
> handling
> 
>
> Key: SOLR-13643
> URL: https://issues.apache.org/jira/browse/SOLR-13643
> Project: Solr
>  Issue Type: Task
>  Components: Response Writers
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Trivial
> Attachments: 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch, 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now inside o.a.s.h.c.AnalyticsComponent.java, fields inside 
> ResponseBuilder are accessed directly.  Since they're in the same package, 
> this is OK at compile time.  But when the Solr core and Analytics jars are 
> loaded at runtime by Solr, they are loaded by different classloaders, which 
> causes an IllegalAccessError during request handling.  There must be 
> something different about my setup which is why I am running into this, but 
> it seems like a good idea to abstract the fields behind setters/getters 
> anyway.
>  
>  
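The fix described is straightforward. A sketch of the accessor pattern follows; the class, field, and method names here are hypothetical stand-ins, not the actual ResponseBuilder API.

```java
// Hypothetical sketch: hiding a previously package-private field behind
// public accessors, so callers loaded by a different classloader go through
// public methods instead of direct field access (which can throw
// IllegalAccessError across classloaders at runtime).
public class ResponseBuilderSketch {
    private Object analyticsRequestManager; // was package-private, accessed directly

    public Object getAnalyticsRequestManager() {
        return analyticsRequestManager;
    }

    public void setAnalyticsRequestManager(Object manager) {
        this.analyticsRequestManager = manager;
    }
}
```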






[JENKINS] Lucene-Solr-NightlyTests-8.x - Build # 162 - Still Failing

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.x/162/

No tests ran.

Build Log:
[...truncated 25 lines...]
ERROR: Failed to check out http://svn.apache.org/repos/asf/lucene/test-data
org.tmatesoft.svn.core.SVNException: svn: E175002: connection refused by the 
server
svn: E175002: OPTIONS request failed on '/repos/asf/lucene/test-data'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:112)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:96)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:765)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:352)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:340)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:910)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:702)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:113)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1035)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getLatestRevision(DAVRepository.java:164)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.getRevisionNumber(SvnNgRepositoryAccess.java:119)
at 
org.tmatesoft.svn.core.internal.wc2.SvnRepositoryAccess.getLocations(SvnRepositoryAccess.java:178)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.createRepositoryFor(SvnNgRepositoryAccess.java:43)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgAbstractUpdate.checkout(SvnNgAbstractUpdate.java:831)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:26)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:11)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgOperationRunner.run(SvnNgOperationRunner.java:20)
at 
org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at 
org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1239)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at 
hudson.scm.subversion.CheckoutUpdater$SubversionUpdateTask.perform(CheckoutUpdater.java:133)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:176)
at 
hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:134)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:1041)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:1017)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:990)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3086)
at hudson.remoting.UserRequest.perform(UserRequest.java:212)
at hudson.remoting.UserRequest.perform(UserRequest.java:54)
at hudson.remoting.Request$2.run(Request.java:369)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketConnection.run(SVNSocketConnection.java:57)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
... 4 more
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)

[jira] [Commented] (SOLR-13616) Possible racecondition/deadlock between collection DELETE and PrepRecovery ? (TestPolicyCloud failures)

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894232#comment-16894232
 ] 

ASF subversion and git services commented on SOLR-13616:


Commit 4050ddc59beeff2be5a862782579ceb8e5775c60 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4050ddc ]

Harden RulesTest

* ensure all collections/replicas are active

* use waitForState or waitForActiveCollection before checking rules/snitch to 
prevent false failures on stale state

* ensure cluster policy is cleared after each test method

Some of these changes should also help ensure we don't get (more) spurious 
failures due to SOLR-13616
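The waitForState/waitForActiveCollection calls mentioned above boil down to polling the cluster state until a condition holds or a timeout elapses. A generic sketch of that pattern, under the assumption that it is a simple poll loop rather than Solr's actual watcher-based implementation:

```java
import java.util.function.BooleanSupplier;

public class WaitForStateDemo {
    // Polls the condition until it holds or the timeout elapses.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(50); // avoid a busy spin between checks
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~200ms, well inside the 2s timeout.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 200, 2000);
        System.out.println(ok); // true
    }
}
```

Asserting against state only after such a wait succeeds is what prevents the stale-state false failures the commit message describes.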


> Possible racecondition/deadlock between collection DELETE and PrepRecovery ? 
> (TestPolicyCloud failures)
> ---
>
> Key: SOLR-13616
> URL: https://issues.apache.org/jira/browse/SOLR-13616
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Major
> Attachments: SOLR-13616.test-incomplete.patch, 
> thetaphi_Lucene-Solr-master-Linux_24358.log.txt
>
>
> Based on some recent jenkins failures in TestPolicyCloud, I suspect there is 
> a possible deadlock condition when attempting to delete a collection while 
> recovery is in progress.
> I haven't been able to identify exactly where/why/how the problem occurs, but 
> it does not appear to be a test specific problem, and seems like it could 
> potentially affect anyone unlucky enough to issue poorly timed DELETE.
> Details to follow in comments...






[jira] [Commented] (SOLR-13616) Possible racecondition/deadlock between collection DELETE and PrepRecovery ? (TestPolicyCloud failures)

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894231#comment-16894231
 ] 

ASF subversion and git services commented on SOLR-13616:


Commit 32da33936532a13523be522a8e86820c5bd9a497 in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=32da339 ]

Harden RulesTest

* ensure all collections/replicas are active

* use waitForState or waitForActiveCollection before checking rules/snitch to 
prevent false failures on stale state

* ensure cluster policy is cleared after each test method

Some of these changes should also help ensure we don't get (more) spurious 
failures due to SOLR-13616

(cherry picked from commit 4050ddc59beeff2be5a862782579ceb8e5775c60)


> Possible racecondition/deadlock between collection DELETE and PrepRecovery ? 
> (TestPolicyCloud failures)
> ---
>
> Key: SOLR-13616
> URL: https://issues.apache.org/jira/browse/SOLR-13616
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Major
> Attachments: SOLR-13616.test-incomplete.patch, 
> thetaphi_Lucene-Solr-master-Linux_24358.log.txt
>
>
> Based on some recent jenkins failures in TestPolicyCloud, I suspect there is 
> a possible deadlock condition when attempting to delete a collection while 
> recovery is in progress.
> I haven't been able to identify exactly where/why/how the problem occurs, but 
> it does not appear to be a test-specific problem, and seems like it could 
> potentially affect anyone unlucky enough to issue a poorly timed DELETE.
> Details to follow in comments...






[jira] [Resolved] (SOLR-13599) ReplicationFactorTest high failure rate on Windows jenkins VMs after 2019-06-22 OS/java upgrades

2019-07-26 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-13599.
-
Resolution: Cannot Reproduce

not a single jenkins failure in this test since backporting the logging 
additions to branch_8x on July 8.

doesn't seem like there is much more we can do here

> ReplicationFactorTest high failure rate on Windows jenkins VMs after 
> 2019-06-22 OS/java upgrades
> 
>
> Key: SOLR-13599
> URL: https://issues.apache.org/jira/browse/SOLR-13599
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Major
> Attachments: thetaphi_Lucene-Solr-master-Windows_8025.log.txt
>
>
> We've started seeing some weirdly consistent (but not reliably reproducible) 
> failures from ReplicationFactorTest when running on Uwe's Windows jenkins 
> machines.
> The failures all seem to have started on June 22 -- when Uwe upgraded the 
> Java version on his Windows VMs -- but they happen across all versions of 
> java tested, and on both master and branch_8x.
> While this test failed a total of 5 times, in different ways, on various 
> jenkins boxes between 2019-01-01 and 2019-06-21, it seems to have failed on 
> all but 1 or 2 of Uwe's "Windows" jenkins builds since 2019-06-22, and 
> when it fails the {{reproduceJenkinsFailures.py}} logic used in Uwe's jenkins 
> builds frequently fails anywhere from 1-4 additional times.
> All of these failures occur in the exact same place, with the exact same 
> assertion: that the expected replicationFactor of 2 was not achieved, and an 
> rf=1 (ie: only the master) was returned, when sending a _batch_ of documents 
> to a collection with 1 shard, 3 replicas; while 1 of the replicas was 
> partitioned off due to a closed proxy.
> In the handful of logs I've examined closely, the 2nd "live" replica does in 
> fact log that it received & processed the update, but with a QTime of over 30 
> seconds, and then it immediately logs an 
> {{org.eclipse.jetty.io.EofException: Reset cancel_stream_error}} Exception -- 
> meanwhile, the leader has one {{updateExecutor}} thread logging copious 
> amounts of {{java.net.ConnectException: Connection refused: no further 
> information}} regarding the replica that was partitioned off, before a second 
> {{updateExecutor}} thread ultimately logs 
> {{java.util.concurrent.ExecutionException: 
> java.util.concurrent.TimeoutException: idle_timeout}} regarding the "live" 
> replica.
> 
> What makes this perplexing is that this is not the first time in the test 
> that documents were added to this collection while one replica was 
> partitioned off, but it is the first time that all 3 of the following are 
> true _at the same time_:
> # the collection has recovered after some replicas were partitioned and 
> re-connected
> # a batch of multiple documents is being added
> # one replica has been "re" partitioned.
> ...prior to the point when this failure happens, only individual document 
> adds were tested while replicas were partitioned.  Batches of adds were only 
> tested when all 3 replicas were "live" after the proxies were re-opened and 
> the collection had fully recovered.  The failure also comes from the first 
> update to happen after a replica's proxy port has been "closed" for the 
> _second_ time.
> While this confluence of events might conceivably trigger some weird bug, 
> what makes these failures _particularly_ perplexing is that:
> * the failures only happen on Windows
> * the failures only started after the Windows VM update on June-22.






[JENKINS] Lucene-Solr-SmokeRelease-master - Build # 1402 - Failure

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1402/

No tests ran.

Build Log:
[...truncated 24552 lines...]
[asciidoctor:convert] asciidoctor: ERROR: about-this-guide.adoc: line 1: 
invalid part, must have at least one section (e.g., chapter, appendix, etc.)
[asciidoctor:convert] asciidoctor: ERROR: solr-glossary.adoc: line 1: invalid 
part, must have at least one section (e.g., chapter, appendix, etc.)
 [java] Processed 2590 links (2119 relative) to 3409 anchors in 259 files
 [echo] Validated Links & Anchors via: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr-ref-guide/bare-bones-html/

-dist-changes:
 [copy] Copying 4 files to 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/package/changes

package:

-unpack-solr-tgz:

-ensure-solr-tgz-exists:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr.tgz.unpacked
[untar] Expanding: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/package/solr-9.0.0.tgz
 into 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr.tgz.unpacked

generate-maven-artifacts:

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.


[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-26 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894184#comment-16894184
 ] 

Hoss Man commented on SOLR-13579:
-

Honestly, i'm still very lost.

Part of my struggle is i'm trying to wade into the patch, and review the APIs 
and functionality it contains, while knowing – as you mentioned – that not 
all the details are here, and it's not fully fleshed out w/everything you 
intend as far as configuration and customization and having more concrete 
implementations beyond just the {{CacheManagerPlugin}}.

I know that in your mind there is more that can/should be done, and that some 
of this code is just "placeholder" for later, but i don't have enough 
familiarity with the "long term" plan to really understand what in the current 
patch is placeholder or stub APIs, vs what is "real" and exists because of long 
term visions for how all of these pieces can be used together in a more 
generalized system – ie: what classes might have surface APIs that look more 
complex than needed given what's currently implemented in the patch, because of 
how you envision those classes being used in the future?

Just to pick one example, take my question about the "ResourceManagerPool" vs 
"ResourceManagerPlugin" – in your reply you said...
{quote}The code in ResourceManagerPool is independent of the type of 
resource(s) that a pool can manage. ...
{quote}
...but the code in {{ResourceManagerPlugin}} is _also_ independent of any 
specific type of resource(s) that a pool can manage – those specifics only 
exist in the concrete subclasses. Hence the crux of my question is why these 
two very generalized pieces of abstract functionality/data collection couldn't 
just be a single abstract base class for all (concrete) ResourceManagerPlugin 
subclasses to extend?

Your followup gives a clue...
{quote}...perhaps at some point we could allow a single pool to manage several 
aspects of a component, in which case a pool could have several plugins.
{quote}
but w/o some "concrete hypothetical" examples of what that might look like, 
it's hard to evaluate if the current APIs are the "best" approach, or if maybe 
there is something better/simpler.
{quote}Also, there can be different pools of the same type, each used for a 
different group of components that support the same management aspect. For 
example, for searcher caches we may want to eventually create separate pools 
for filterCache, queryResultCache and fieldValueCache. All of these pools would 
use the same plugin implementation CacheManagerPlugin but configured with 
different params and limits.
{quote}
But even in this situation, there could be multiple *instances* of a 
{{CacheManagerPlugin}}, one for each pool, each with different params and 
limits, w/o needing distinction between the {{ResourceManagerPlugin}} 
concept/instances and the {{ResourceManagerPool}} concept/instances.

(To be clear, i'm not trying to harp on the specific design/separation/linkage 
of {{ResourceManagerPlugin}} vs {{ResourceManagerPool}} – these are just some 
of the first classes i looked at and had questions about. I'm just using them 
as examples of where/how it's hard to ask questions or form opinions about the 
current API/code w/o having a better grasp of some "concrete specifics" (or even 
"hypothetical specifics") of when/how/where/why each of these APIs are expected 
to be used and interact w/each other.)

Another example of where i got lost as to the specific motivation behind some 
of these APIs in the long term view is in the "loose coupling" that currently 
exists in the patch between the {{ManagedComponent}} API and 
{{ResourceManagerPlugin}}:
 As i understand it:
 * An object in Solr supports being managed by a particular subclass of 
{{ResourceManagerPlugin}} if and only if it extends {{ManagedComponent}} and 
implements {{ManagedComponent.getManagedResourceTypes()}} such that the 
resulting {{Collection}} contains a String matching the return value of 
a {{ResourceManagerPlugin.getType()}} for that particular 
{{ResourceManagerPlugin}}
 ** ie: {{SolrCache}} extends the {{ManagedComponent}} interface, and all 
classes implementing {{SolrCache}} should/must implement 
{{getManagedResourceTypes()}} by returning a java {{Collection}} containing 
{{CacheManagerPlugin.TYPE}}
 * once some {{ManagedComponent}} instances are "registered in a pool" and 
managed by a specific {{ResourceManagerPlugin}} instance then that plugin 
expects to be able to call {{ManagedComponent.setResourceLimits(Map limits)}} 
and {{ManagedComponent.getResourceLimits()}} on all of those 
{{ManagedComponent}} instances, and that both Maps should contain/support a set 
of {{String}} keys specific to that {{ResourceManagerPlugin}} subclass according 
to {{ResourceManagerPlugin.getControlledParams()}}
 ** ie: {{CacheManagerPlugin.getControlledParams()}} returns a java 
{{Collection}} containing 

Re: patch review for github PRs

2019-07-26 Thread Sidhwaney, Neal B
Thank you, Steve.


Sounds like something I can get started on. Forgive me but I am brand new to 
the infrastructure used for this project.


Is patch_tested.txt generated by the build system or by jenkins-admin? If I'm 
modifying the format of it, would I need to modify something else besides 
jenkins-admin? It doesn't look like it writes the file, only reads it.


Could you give an example filter URL that is passed to jenkins-admin? I tried 
various options but couldn't get it to actually call into Jenkins to start any 
builds (or even print that it would do so, since I was not running it in live mode).


Thanks,


Neal



From: Steve Rowe 
Sent: Wednesday, July 24, 2019 9:41:50 AM
To: dev@lucene.apache.org 
Subject: Re: patch review for github PRs

Hi Neal,

Patches welcome!

IMHO the ideal way to choose patch vs. Github PR, if there are both, is to 
choose the most recent.  I'm not sure if timestamp data is available, though.

As I understand the problem, the modifications to jenkins-admin.py will be:

1. Modify the format of patch_tested.txt (used by jenkins-admin.py to remember 
the latest attachment IDs per JIRA issue) to include Github PR references 
(currently it's just Issue-ID/attachment-ID pairs).
2. Trigger downstream Jenkins jobs using the correct parameters for Github PRs. 
 I'm not sure if this will require modifications to downstream jobs like 
PreCommit-SOLR-Build and PreCommit-LUCENE-Build.  Looking at the build script 
here 
https://builds.apache.org/job/PreCommit-SOLR-Build/configure
 : when it invokes Yetus's test-patch script, it ignores the $ATTACHMENT_ID 
parameter currently sent with the build request by jenkins-admin.py, so maybe 
Yetus's test-patch script will auto-detect Github PRs without any additional 
effort?  (Since it currently auto-detects which attached patch to use in some 
way.)
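For illustration only — a hypothetical, backward-compatible shape for the patch_tested.txt records described in step 1, where the existing Issue-ID/attachment-ID pairs stay valid and a "PR-<number>" reference is allowed instead (the real file's syntax may well differ):

```java
// Hypothetical extension of a patch_tested.txt record: the existing
// "ISSUE,ATTACHMENT-ID" lines stay valid, and "ISSUE,PR-<number>" lines
// record a GitHub PR instead. The comma-separated syntax is an assumption.
class TestedRecord {
  final String issue;  // e.g. "SOLR-13579"
  final String ref;    // attachment ID like "12978041", or "PR-678"

  TestedRecord(String line) {
    String[] parts = line.split(",", 2);
    this.issue = parts[0];
    this.ref = parts[1];
  }

  boolean isPullRequest() { return ref.startsWith("PR-"); }
}
```

A format like this would let jenkins-admin keep its "have we already tested this?" bookkeeping unchanged for attachments while still remembering which PR head it last triggered.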

One wrinkle here: the Yetus version used by PreCommit-SOLR-Build and 
PreCommit-LUCENE-Build is pinned at v0.7.0, but the latest Yetus version is 
0.10.0 . We should almost certainly upgrade the Yetus version used by those 
jobs as part of this effort.  Related: 
https://issues.apache.org/jira/browse/LUCENE-8515

If you'd like to start work on this, please create a JIRA issue.  At a minimum, 
a YETUS issue will be required for the jenkins-admin.py changes.

Thanks,
Steve

On Jul 24, 2019, at 12:24 PM, Sidhwaney, Neal B 
<neal.sidhwa...@providence.org> wrote:

Hello,

I can try to take a look at this if nobody else has the time right now.

From a brief glance at the code, it looks like the right thing to do would be to 
detect Github PR links in addition to the existing attachment logic, and send 
those JIRA issues to test-jira.  If I understand the email below, it should 
prefer the Github PR without having to do anything special?

Thank you,

Neal

From: Steve Rowe <sar...@gmail.com>
Sent: Friday, July 19, 2019 4:01 PM
To: dev@lucene.apache.org <dev@lucene.apache.org>
Subject: Re: patch review for github PRs
Subject: Re: patch review for github PRs

Hi Mike,

I don't think so, based on this SOLR-10912 comment[1] from Allen Wittenauer, 
who works on Yetus:

Github PR support is sort of there.

test-patch does. It can take either a github PR directly on the command line or 
passed via a JIRA. If it gets told to test a JIRA that references a github PR, 
it will defer to the PR as the source of the patch. In other words, if a JIRA 
issue references a github PR and has a patch attached, it will use the github 
PR and ignore the attachments.

However!

The job on Jenkins that feeds test-patch is NOT github aware. The original 
version was built before github integration existed. To make matters worse, 
that code was locked away in a repository no one really had access to modify. 
As of a month or so ago, that code is now part of Apache Yetus ( 
https://github.com/apache/yetus/blob/master/precommit/jenkins/jenkins-admin.py
 ), so there is an opportunity for us to fix this problem and add better 
asf<->github integration.

jenkins-admin.py has moved here: 

[jira] [Assigned] (LUCENE-6336) AnalyzingInfixSuggester needs duplicate handling

2019-07-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/LUCENE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned LUCENE-6336:
---

Assignee: (was: Jan Høydahl)

> AnalyzingInfixSuggester needs duplicate handling
> 
>
> Key: LUCENE-6336
> URL: https://issues.apache.org/jira/browse/LUCENE-6336
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.10.3, 5.0
>Reporter: Jan Høydahl
>Priority: Major
>  Labels: lookup, suggester
> Attachments: LUCENE-6336.patch
>
>
> Spinoff from LUCENE-5833 but else unrelated.
> Using {{AnalyzingInfixSuggester}} which is backed by a Lucene index and 
> stores payload and score together with the suggest text.
> I did some testing with Solr, producing the DocumentDictionary from an index 
> with multiple documents containing the same text, but with random weights 
> between 0-100. Then I got duplicate identical suggestions sorted by weight:
> {code}
> {
>   "suggest":{"languages":{
>   "engl":{
> "numFound":101,
> "suggestions":[{
> "term":"English",
> "weight":100,
> "payload":"0"},
>   {
> "term":"English",
> "weight":99,
> "payload":"0"},
>   {
> "term":"English",
> "weight":98,
> "payload":"0"},
> ---etc all the way down to 0---
> {code}
> I also reproduced the same behavior in AnalyzingInfixSuggester directly. So 
> there is a need for some duplicate removal here, either while building the 
> local suggest index or during lookup. Only the highest weight suggestion for 
> a given term should be returned.
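The dedup asked for here — keep only the highest-weight suggestion per term — could look like the following sketch (plain collections, not the suggester's actual lookup API):

```java
import java.util.*;

// Keep only the highest-weight suggestion per term, returning results in
// descending weight order. A sketch with plain collections, not the
// AnalyzingInfixSuggester API itself.
class Suggestion {
  final String term;
  final long weight;
  Suggestion(String term, long weight) { this.term = term; this.weight = weight; }
}

class SuggestionDedup {
  static List<Suggestion> dedupe(List<Suggestion> in) {
    Map<String, Suggestion> best = new HashMap<>();
    for (Suggestion s : in) {
      // merge() keeps whichever of the old/new suggestion has the higher weight
      best.merge(s.term, s, (a, b) -> a.weight >= b.weight ? a : b);
    }
    List<Suggestion> out = new ArrayList<>(best.values());
    out.sort((a, b) -> Long.compare(b.weight, a.weight));
    return out;
  }
}
```

Doing this at lookup time (as above) is simpler; doing it while building the local suggest index would also shrink numFound, which the example response shows ballooning to 101.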






[JENKINS] Lucene-Solr-Tests-master - Build # 3458 - Failure

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/3458/

All tests passed

Build Log:
[...truncated 64776 lines...]
-ecj-javadoc-lint-tests:
[mkdir] Created dir: /tmp/ecj202473320
 [ecj-lint] Compiling 48 source files to /tmp/ecj202473320
 [ecj-lint] invalid Class-Path header in manifest of jar file: 
/home/jenkins/.ivy2/cache/org.restlet.jee/org.restlet/jars/org.restlet-2.3.0.jar
 [ecj-lint] invalid Class-Path header in manifest of jar file: 
/home/jenkins/.ivy2/cache/org.restlet.jee/org.restlet.ext.servlet/jars/org.restlet.ext.servlet-2.3.0.jar
 [ecj-lint] --
 [ecj-lint] 1. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 23)
 [ecj-lint] import javax.naming.NamingException;
 [ecj-lint]
 [ecj-lint] The type javax.naming.NamingException is not accessible
 [ecj-lint] --
 [ecj-lint] 2. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 28)
 [ecj-lint] public class MockInitialContextFactory implements 
InitialContextFactory {
 [ecj-lint]  ^
 [ecj-lint] The type MockInitialContextFactory must implement the inherited 
abstract method InitialContextFactory.getInitialContext(Hashtable)
 [ecj-lint] --
 [ecj-lint] 3. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 30)
 [ecj-lint] private final javax.naming.Context context;
 [ecj-lint]   
 [ecj-lint] The type javax.naming.Context is not accessible
 [ecj-lint] --
 [ecj-lint] 4. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 33)
 [ecj-lint] context = mock(javax.naming.Context.class);
 [ecj-lint] ^^^
 [ecj-lint] context cannot be resolved to a variable
 [ecj-lint] --
 [ecj-lint] 5. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 33)
 [ecj-lint] context = mock(javax.naming.Context.class);
 [ecj-lint]
 [ecj-lint] The type javax.naming.Context is not accessible
 [ecj-lint] --
 [ecj-lint] 6. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 36)
 [ecj-lint] when(context.lookup(anyString())).thenAnswer(invocation -> 
objects.get(invocation.getArgument(0)));
 [ecj-lint]  ^^^
 [ecj-lint] context cannot be resolved
 [ecj-lint] --
 [ecj-lint] 7. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 38)
 [ecj-lint] } catch (NamingException e) {
 [ecj-lint]  ^^^
 [ecj-lint] NamingException cannot be resolved to a type
 [ecj-lint] --
 [ecj-lint] 8. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 45)
 [ecj-lint] public javax.naming.Context getInitialContext(Hashtable env) {
 [ecj-lint]
 [ecj-lint] The type javax.naming.Context is not accessible
 [ecj-lint] --
 [ecj-lint] 9. ERROR in 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/MockInitialContextFactory.java
 (at line 46)
 [ecj-lint] return context;
 [ecj-lint]^^^
 [ecj-lint] context cannot be resolved to a variable
 [ecj-lint] --
 [ecj-lint] 9 problems (9 errors)

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/build.xml:634: 
The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/build.xml:101: 
The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/build.xml:651:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/common-build.xml:479:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/common-build.xml:2015:
 The following error occurred while executing this line:

[JENKINS] Lucene-Solr-8.2-Linux (32bit/jdk1.8.0_201) - Build # 475 - Unstable!

2019-07-26 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.2-Linux/475/
Java: 32bit/jdk1.8.0_201 -server -XX:+UseParallelGC

5 tests failed.
FAILED:  org.apache.solr.handler.admin.IndexSizeEstimatorTest.testEstimator

Error Message:


Stack Trace:
java.lang.NullPointerException
at 
__randomizedtesting.SeedInfo.seed([78DD45FC30BCC5BF:B7002AAEBCC4540F]:0)
at 
org.apache.lucene.codecs.simpletext.SimpleTextStoredFieldsReader.visitDocument(SimpleTextStoredFieldsReader.java:108)
at 
org.apache.solr.handler.admin.IndexSizeEstimator.estimateStoredFields(IndexSizeEstimator.java:512)
at 
org.apache.solr.handler.admin.IndexSizeEstimator.estimate(IndexSizeEstimator.java:197)
at 
org.apache.solr.handler.admin.IndexSizeEstimatorTest.testEstimator(IndexSizeEstimatorTest.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.handler.admin.IndexSizeEstimatorTest.testEstimator

Error Message:


Stack Trace:

[jira] [Commented] (SOLR-13645) Add analytics function to format/extract components from dates

2019-07-26 Thread Neal Sidhwaney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894094#comment-16894094
 ] 

Neal Sidhwaney commented on SOLR-13645:
---

FYI (no action necessary): the patch has been updated to apply to HEAD. Thanks!

Neal

> Add analytics function to format/extract components from dates
> --
>
> Key: SOLR-13645
> URL: https://issues.apache.org/jira/browse/SOLR-13645
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Minor
> Attachments: SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch
>
>
> It's helpful when running analytics to be able to do manipulation on dates 
> such as extracting month/day/year, converting to the week of year, etc, and 
> other formatting as many existing libraries provide.  I have a patch going 
> through final testing that will add this to the analytics library.
> One thing I'm sort of ambivalent about is that it exposes that we use Java 
> date parsing in the analytics function, because the syntax is the same format 
> string that SimpleDateFormat accepts.  Ideally there would be an abstraction 
> between the analytics language and what's used on the backend to implement 
> it.  On the other hand, implementing a syntax for time/date formatting is 
> something that's been done many many times before, and this is not the only 
> place where Java date particulars show through.  It would be good to revisit 
> this at a later time.
>  
>  
>  
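The kind of date-component extraction and formatting described above can be sketched with the standard Java time API. This is an illustration only; the function names and syntax in the attached patch may differ, and the patch itself is said to use SimpleDateFormat-style pattern strings:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.WeekFields;

public class DateComponentsSketch {
    public static void main(String[] args) {
        LocalDate date = LocalDate.parse("2019-07-26");

        // Extract individual components, as an analytics date function might.
        int year  = date.getYear();        // 2019
        int month = date.getMonthValue();  // 7
        int day   = date.getDayOfMonth();  // 26

        // Week-of-year conversion; ISO-8601 week rules are used here.
        int week = date.get(WeekFields.ISO.weekOfWeekBasedYear()); // 30

        // Formatting with a SimpleDateFormat-style pattern string -- the
        // coupling to Java date syntax that the comment above worries about.
        String formatted = date.format(DateTimeFormatter.ofPattern("yyyy/MM/dd"));

        System.out.println(year + " " + month + " " + day + " " + week + " " + formatted);
    }
}
```

Note that the week number depends on the week-numbering rules chosen (ISO vs. locale-based), which is exactly the sort of backend detail that leaks through a pattern-string syntax.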



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13645) Add analytics function to format/extract components from dates

2019-07-26 Thread Neal Sidhwaney (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Sidhwaney updated SOLR-13645:
--
Attachment: SOLR-13645-Analytics-function-for-date-components.patch

> Add analytics function to format/extract components from dates
> --
>
> Key: SOLR-13645
> URL: https://issues.apache.org/jira/browse/SOLR-13645
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Minor
> Attachments: SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch, 
> SOLR-13645-Analytics-function-for-date-components.patch
>
>
> It's helpful when running analytics to be able to do manipulation on dates 
> such as extracting month/day/year, converting to the week of year, etc., and 
> other formatting as many existing libraries provide.  I have a patch going 
> through final testing that will add this to the analytics library.
> One thing I'm sort of ambivalent about is that it exposes that we use Java 
> date parsing in the analytics function, because the syntax is the same format 
> string that SimpleDateFormat accepts.  Ideally there would be an abstraction 
> between the analytics language and what's used on the backend to implement 
> it.  On the other hand, implementing a syntax for time/date formatting is 
> something that's been done many many times before, and this is not the only 
> place where Java date particulars show through.  It would be good to revisit 
> this at a later time.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894086#comment-16894086
 ] 

ASF subversion and git services commented on LUCENE-8920:
-

Commit fe0c042470dc1a1ba7ffd27f91ac7bc96c3254a0 in lucene-solr's branch 
refs/heads/master from Michael Sokolov
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fe0c042 ]

LUCENE-8920: remove Arc setters, moving implementations into Arc, or copying 
data into consumers


> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mike Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?
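The intermediate label -> id -> arc-offset lookup suggested in the second quoted idea can be sketched in plain Java. This is a toy illustration with made-up array names, not the actual byte-level FST encoding:

```java
import java.util.Arrays;

public class DenseLabelLookupSketch {
    public static void main(String[] args) {
        // Sparse arc labels for one FST node; direct addressing would reserve
        // a slot for every value in [0, 200], paying for all the gaps.
        int[] labels = {3, 7, 42, 200};

        // Intermediate lookup: label -> dense id (-1 means "no arc").
        // Only this small table pays for the gaps.
        int[] labelToId = new int[256];
        Arrays.fill(labelToId, -1);
        for (int id = 0; id < labels.length; id++) {
            labelToId[labels[id]] = id;
        }

        // Dense id -> arc offset: 4 entries regardless of label spread,
        // so per-arc metadata (10-20 bytes in some cases) is never duplicated
        // into gap slots.
        long[] arcOffsets = {0L, 24L, 48L, 72L};

        int label = 42;
        int id = labelToId[label];
        long offset = (id >= 0) ? arcOffsets[id] : -1L;
        System.out.println(offset); // 48
    }
}
```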



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894085#comment-16894085
 ] 

ASF subversion and git services commented on LUCENE-8920:
-

Commit 760f2dbdcb29b993aab8f981d84ccbf2e20e9fa5 in lucene-solr's branch 
refs/heads/master from Michael Sokolov
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=760f2dbd ]

LUCENE-8920: encapsulate FST.Arc data


> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mike Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894087#comment-16894087
 ] 

ASF subversion and git services commented on LUCENE-8920:
-

Commit 92d4e712d5d50d745c5a6c10dacda66198974116 in lucene-solr's branch 
refs/heads/master from Michael Sokolov
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=92d4e71 ]

LUCENE-8920: refactor FST binary search


> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mike Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov closed pull request #800: Lucene-8920: refactor FST.Arc and utilities

2019-07-26 Thread GitBox
msokolov closed pull request #800: Lucene-8920: refactor FST.Arc and utilities
URL: https://github.com/apache/lucene-solr/pull/800
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on issue #800: Lucene-8920: refactor FST.Arc and utilities

2019-07-26 Thread GitBox
msokolov commented on issue #800: Lucene-8920: refactor FST.Arc and utilities
URL: https://github.com/apache/lucene-solr/pull/800#issuecomment-515579943
 
 
   pushed from my dev box


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on a change in pull request #800: Lucene-8920: refactor FST.Arc and utilities

2019-07-26 Thread GitBox
msokolov commented on a change in pull request #800: Lucene-8920: refactor 
FST.Arc and utilities
URL: https://github.com/apache/lucene-solr/pull/800#discussion_r307886991
 
 

 ##
 File path: 
lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTOrdTermsReader.java
 ##
 @@ -305,7 +305,7 @@ public String toString() {
 }
 
 // Only wraps common operations for PBF interact
-abstract class BaseTermsEnum extends org.apache.lucene.index.BaseTermsEnum {
+abstract class  BaseTermsEnum extends org.apache.lucene.index.BaseTermsEnum {
 
 Review comment:
   Yeah, not sure how that crept in. I'll fix this and merge shortly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8936) Add SpanishMinimalStemFilter

2019-07-26 Thread vinod kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinod kumar updated LUCENE-8936:

Description: 
SpanishMinimalStemmerFilter is a less aggressive stemmer than 
SpanishLightStemmerFilter

Ex:

input tokens -> output tokens
 1. camiseta niños -> *camiseta* and *nino*
 2. camisas -> camisa

*camisetas* and *camisas* are t-shirts and shirts respectively.
 Stemming both tokens down to *camis* would match both and return both 
t-shirts and shirts for the query camisas (shirts). SpanishMinimalStemmerFilter 
will help handle these cases.

And importantly, it will preserve gender context in tokens.

Ex:  *niños* ,*niñas* *chicos* and *chicas* are stemmed to *nino*, *nina*, 
*chico* and *chica*

  was:
SpanishMinimalStemmerFilter is less aggressive stemmer than 
SpanishLightStemmerFilter

Ex: 
1. camiseta niños -> *camiseta* and *niños*
2. camisas -> camisa

Here *camisetas* and *camisas* are t-shirts and shirts respectively.
Stemming both of the tokens to *camis* will match both tokens and returns both 
t-shirts and shirts for query camisas(shirts).

SpanishMinimalStemmerFilter will help handling these cases.
It will preserve gender context with tokens *niños* ,*niñas* *chicos* and 
*chicas*



> Add SpanishMinimalStemFilter
> 
>
> Key: LUCENE-8936
> URL: https://issues.apache.org/jira/browse/LUCENE-8936
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: vinod kumar
>Priority: Major
>
> SpanishMinimalStemmerFilter is a less aggressive stemmer than 
> SpanishLightStemmerFilter
> Ex:
> input tokens -> output tokens
>  1. camiseta niños -> *camiseta* and *nino*
>  2. camisas -> camisa
> *camisetas* and *camisas* are t-shirts and shirts respectively.
>  Stemming both tokens down to *camis* would match both and return 
> both t-shirts and shirts for the query camisas (shirts). 
> SpanishMinimalStemmerFilter will help handle these cases.
> And importantly, it will preserve gender context in tokens.
> Ex:  *niños* ,*niñas* *chicos* and *chicas* are stemmed to *nino*, *nina*, 
> *chico* and *chica*
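The plural-stripping plus diacritic-folding behavior illustrated above can be sketched in plain Java. This mirrors the examples given (niños -> nino, camisas -> camisa); the actual Lucene filter is a TokenFilter and its exact rules may differ:

```java
public class SpanishMinimalStemSketch {
    // Fold common Spanish diacritics to their base letters.
    static char fold(char c) {
        switch (c) {
            case 'á': return 'a';
            case 'é': return 'e';
            case 'í': return 'i';
            case 'ó': return 'o';
            case 'ú': case 'ü': return 'u';
            case 'ñ': return 'n';
            default:  return c;
        }
    }

    // Strip only a trailing plural 's', keeping the gender-bearing vowel
    // (so chicos/chicas stay distinct), then fold diacritics.
    static String stem(String term) {
        String t = term;
        if (t.length() > 3 && t.endsWith("s")) {
            t = t.substring(0, t.length() - 1);
        }
        StringBuilder sb = new StringBuilder(t.length());
        for (int i = 0; i < t.length(); i++) {
            sb.append(fold(t.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stem("niños"));   // nino
        System.out.println(stem("camisas")); // camisa
        System.out.println(stem("chicas"));  // chica
    }
}
```

Because only the plural marker is removed, *chico* and *chica* stem to different terms, unlike an aggressive stemmer that would collapse both to *chic*.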



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-BadApples-NightlyTests-master - Build # 72 - Still Failing

2019-07-26 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/job/Lucene-Solr-BadApples-NightlyTests-master/72/

All tests passed

Build Log:
[...truncated 68364 lines...]
BUILD FAILED
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/build.xml:662:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/build.xml:101:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/solr/build.xml:625:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/solr/build.xml:594:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/lucene/common-build.xml:2463:
 Failed to load 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-master/checkout/dev-tools/doap/solr.rdf

Total time: 520 minutes 37 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-8936) Add SpanishMinimalStemFilter

2019-07-26 Thread vinod kumar (JIRA)
vinod kumar created LUCENE-8936:
---

 Summary: Add SpanishMinimalStemFilter
 Key: LUCENE-8936
 URL: https://issues.apache.org/jira/browse/LUCENE-8936
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: vinod kumar


SpanishMinimalStemmerFilter is less aggressive stemmer than 
SpanishLightStemmerFilter

Ex: 
1. camiseta niños -> *camiseta* and *niños*
2. camisas -> camisa

Here *camisetas* and *camisas* are t-shirts and shirts respectively.
Stemming both of the tokens to *camis* will match both tokens and returns both 
t-shirts and shirts for query camisas(shirts).

SpanishMinimalStemmerFilter will help handling these cases.
It will preserve gender context with tokens *niños* ,*niñas* *chicos* and 
*chicas*




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13643) ResponseBuilder should provide accessors/setters for analytics response handling

2019-07-26 Thread Neal Sidhwaney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894070#comment-16894070
 ] 

Neal Sidhwaney commented on SOLR-13643:
---

FYI, no action necessary, but I updated the patch to apply against HEAD. 

Thanks,
Neal

> ResponseBuilder should provide accessors/setters for analytics response 
> handling
> 
>
> Key: SOLR-13643
> URL: https://issues.apache.org/jira/browse/SOLR-13643
> Project: Solr
>  Issue Type: Task
>  Components: Response Writers
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Trivial
> Attachments: 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch, 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now inside o.a.s.h.c.AnalyticsComponent.java, fields inside 
> ResponseBuilder are accessed directly.  Since they're in the same package, 
> this is OK at compile time.  But when the Solr core and Analytics jars are 
> loaded at runtime by Solr, they are loaded by different classloaders, which 
> causes an IllegalAccessError during request handling.  There must be something 
> different about my setup which is why I am running into this, but it seems 
> like a good idea to abstract away the fields behind setters/getters anyway.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13643) ResponseBuilder should provide accessors/setters for analytics response handling

2019-07-26 Thread Neal Sidhwaney (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Sidhwaney updated SOLR-13643:
--
Attachment: SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch

> ResponseBuilder should provide accessors/setters for analytics response 
> handling
> 
>
> Key: SOLR-13643
> URL: https://issues.apache.org/jira/browse/SOLR-13643
> Project: Solr
>  Issue Type: Task
>  Components: Response Writers
>Affects Versions: 8.1.1
>Reporter: Neal Sidhwaney
>Priority: Trivial
> Attachments: 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch, 
> SOLR-13643-Create-accessors-setters-in-ResponseBuild.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now inside o.a.s.h.c.AnalyticsComponent.java, fields inside 
> ResponseBuilder are accessed directly.  Since they're in the same package, 
> this is OK at compile time.  But when the Solr core and Analytics jars are 
> loaded at runtime by Solr, they are loaded by different classloaders, which 
> causes an IllegalAccessError during request handling.  There must be something 
> different about my setup which is why I am running into this, but it seems 
> like a good idea to abstract away the fields behind setters/getters anyway.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene/Solr 8.2.0

2019-07-26 Thread Namgyu Kim
Hi Ignacio,

I found a wrong space character in Optimizations part of the Lucene News.
("  Optimizations" should be " Optimizations")
So I changed it on CMS and submitted.
It looks fine now, but needs checking.

Thanks,
Namgyu


[jira] [Commented] (SOLR-13650) Support for named global classloaders

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894037#comment-16894037
 ] 

ASF subversion and git services commented on SOLR-13650:


Commit 24039e6f80685c768c1b08d46a0f71c033fe213b in lucene-solr's branch 
refs/heads/jira/SOLR-13650 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=24039e6 ]

SOLR-13650: support for named packages


> Support for named global classloaders
> -
>
> Key: SOLR-13650
> URL: https://issues.apache.org/jira/browse/SOLR-13650
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Priority: Major
>
> {code:json}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add-package": {
>"name": "my-package" ,
>   "url" : "http://host:port/url/of/jar",
>   "sha512":""
>   }
> }' http://localhost:8983/api/cluster
> {code}
> This means that Solr creates a globally accessible classloader with a name 
> {{my-package}} which contains all the jars of that package. 
> A component should be able to use the package by using the {{"package" : 
> "my-package"}}.
> eg:
> {code:json}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "create-searchcomponent": {
>   "name": "my-searchcomponent" ,
>   "class" : "my.path.to.ClassName",
>  "package" : "my-package"
>   }
> }' http://localhost:8983/api/c/mycollection/config 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12532) Slop specified in query string is not preserved for certain phrase searches

2019-07-26 Thread Steve Rowe (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893979#comment-16893979
 ] 

Steve Rowe commented on SOLR-12532:
---

Sure, I'll take a look in the next couple days.

> Slop specified in query string is not preserved for certain phrase searches
> ---
>
> Key: SOLR-12532
> URL: https://issues.apache.org/jira/browse/SOLR-12532
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 7.4
>Reporter: Brad Sumersford
>Priority: Major
> Attachments: SOLR-12532.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Note: This only impacts specific settings for the WordDelimiterGraphFilter as 
> detailed below.
> When a phrase search is parsed by the SolrQueryParser, and the phrase search 
> results in a graph token stream, the resulting SpanNearQuery created does not 
> have the slop correctly set.
> h4. Conditions
>  - Slop provided in the query string (ex: a trailing ~2)
>  - WordDelimiterGraphFilterFactory with query time preserveOriginal and 
> generateWordParts
>  - query string includes a term that contains a word delimiter
> h4. Example
> Field: wdf_partspreserve 
>  – WordDelimiterGraphFilterFactory 
>   preserveOriginal="1"
>   generateWordParts="1"
> Data: you just can't
>  Search: wdf_partspreserve:"you can't"~2 -> 0 Results
> h4. Cause
> The slop supplied by the query string is applied in 
> SolrQueryParserBase#getFieldQuery, which will set the slop only for 
> PhraseQuery and MultiPhraseQuery. Since "can't" will be broken down into 
> multiple tokens, analyzeGraphPhrase will be triggered when the Query is being 
> constructed, which will return a SpanNearQuery instead of a (Multi)PhraseQuery.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12532) Slop specified in query string is not preserved for certain phrase searches

2019-07-26 Thread Michael Gibney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893976#comment-16893976
 ] 

Michael Gibney commented on SOLR-12532:
---

[~steve_rowe], I'd be curious to know what you think of this issue/PR, if you 
have a chance to take a look.

> Slop specified in query string is not preserved for certain phrase searches
> ---
>
> Key: SOLR-12532
> URL: https://issues.apache.org/jira/browse/SOLR-12532
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 7.4
>Reporter: Brad Sumersford
>Priority: Major
> Attachments: SOLR-12532.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Note: This only impacts specific settings for the WordDelimiterGraphFilter as 
> detailed below.
> When a phrase search is parsed by the SolrQueryParser, and the phrase search 
> results in a graph token stream, the resulting SpanNearQuery created does not 
> have the slop correctly set.
> h4. Conditions
>  - Slop provided in the query string (ex: a trailing ~2)
>  - WordDelimiterGraphFilterFactory with query time preserveOriginal and 
> generateWordParts
>  - query string includes a term that contains a word delimiter
> h4. Example
> Field: wdf_partspreserve 
>  – WordDelimiterGraphFilterFactory 
>   preserveOriginal="1"
>   generateWordParts="1"
> Data: you just can't
>  Search: wdf_partspreserve:"you can't"~2 -> 0 Results
> h4. Cause
> The slop supplied by the query string is applied in 
> SolrQueryParserBase#getFieldQuery, which will set the slop only for 
> PhraseQuery and MultiPhraseQuery. Since "can't" will be broken down into 
> multiple tokens, analyzeGraphPhrase will be triggered when the Query is being 
> constructed, which will return a SpanNearQuery instead of a (Multi)PhraseQuery.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2019-07-26 Thread Michael Gibney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893958#comment-16893958
 ] 

Michael Gibney commented on SOLR-7798:
--

[~joel.bernstein], I circled back to this, and squash-rebased [PR 
325|https://github.com/apache/lucene-solr/pull/325] on current master. The 
patch applies cleanly and passes precommit and all tests, so it should be 
solid. I'm sorry for the false start (in Feb. 2018); if you'd be willing to 
take another look at this, I think this will now _actually_ be as 
straightforward as it initially should have been!

> Improve robustness of ExpandComponent
> -
>
> Key: SOLR-7798
> URL: https://issues.apache.org/jira/browse/SOLR-7798
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Reporter: Jörg Rathlev
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: expand-component.patch, expand-npe.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{ExpandComponent}} causes a {{NullPointerException}} if accidentally 
> used without prior collapsing of results.
> If there are multiple documents in the result which have the same term value 
> in the expand field, the size of the {{ordBytes}}/{{groupSet}} differs from 
> the {{count}} value, and the {{getGroupQuery}} method creates an incompletely 
> filled {{bytesRef}} array, which later causes a {{NullPointerException}} when 
> trying to sort the terms.
> The attached patch extends the test to demonstrate the error, and modifies 
> the {{getGroupQuery}} methods to create the array based on the size of the 
> input maps.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With Multiple Ranges

2019-07-26 Thread GitBox
atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With 
Multiple Ranges
URL: https://github.com/apache/lucene-solr/pull/794#issuecomment-515476696
 
 
   @jpountz I ran precommit on the latest version again and rebased with master 
-- all came back clean


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2019-07-26 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893852#comment-16893852
 ] 

David Smiley commented on LUCENE-8776:
--

[~venkat11] I am curious if you've tried the latest version of the 
UnifiedHighlighter that can be configured to use the "WeightMatches" API.  This 
is toggled via the 
{{org.apache.lucene.search.uhighlight.UnifiedHighlighter.HighlightFlag#WEIGHT_MATCHES}}
 flag.  This isn't the default at the Lucene level but ought to be changed to 
be.  You may notice some highlighting differences, like for phrases and spans 
in which phrases as a whole get one pair of open/close tags instead of the 
constituent words.  There's a chance that this mode ameliorates your 
highlighting woes with offsets.

> Start offset going backwards has a legitimate purpose
> -
>
> Key: LUCENE-8776
> URL: https://issues.apache.org/jira/browse/LUCENE-8776
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.6
>Reporter: Ram Venkat
>Priority: Major
>
> Here is the use case where startOffset can go backwards:
> Say there is a line "Organic light-emitting-diode glows", and I want to run 
> span queries and highlight them properly. 
> During index time, light-emitting-diode is split into three words, which 
> allows me to search for 'light', 'emitting' and 'diode' individually. The 
> three words occupy adjacent positions in the index, as 'light' adjacent to 
> 'emitting' and 'light' at a distance of two words from 'diode' need to match 
> this word. So, the order of words after splitting are: Organic, light, 
> emitting, diode, glows. 
> But, I also want to search for 'organic' being adjacent to 
> 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. 
> The way I solved this was to also generate 'light-emitting-diode' at two 
> positions: (a) In the same position as 'light' and (b) in the same position 
> as 'glows', like below:
> ||organic||light||emitting||diode||glows||
> | |light-emitting-diode| |light-emitting-diode| |
> |0|1|2|3|4|
> The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets 
> are obviously the same. This works beautifully in Lucene 5.x in both 
> searching and highlighting with span queries. 
> But when I try this in Lucene 7.6, it hits the condition "Offsets must not go 
> backwards" at DefaultIndexingChain:818. This IllegalArgumentException is 
> being thrown without any comments on why this check is needed. As I explained 
> above, startOffset going backwards is perfectly valid, to deal with word 
> splitting and span operations on these specialized use cases. On the other 
> hand, it is not clear what value is added by this check and which highlighter 
> code is affected by offsets going backwards. This same check is done at 
> BaseTokenStreamTestCase:245. 
> I see others talk about how this check found bugs in WordDelimiter etc. but 
> it also prevents legitimate use cases. Can this check be removed?  
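The token layout from the table above can be written out in plain Java to show exactly which emission trips the check (a hypothetical Token record for illustration; this is not Lucene's TokenStream API):

```java
import java.util.List;

public class BackwardsOffsetsSketch {
    // Minimal stand-in for an analysis token: term, position, char offsets.
    record Token(String term, int position, int startOffset, int endOffset) {}

    public static void main(String[] args) {
        // "Organic light-emitting-diode glows" analyzed as described above:
        // the compound token is emitted at positions 1 AND 3 with the same
        // character offsets (8-28).
        List<Token> tokens = List.of(
            new Token("organic",              0,  0,  7),
            new Token("light",                1,  8, 13),
            new Token("light-emitting-diode", 1,  8, 28),
            new Token("emitting",             2, 14, 22),
            new Token("diode",                3, 23, 28),
            new Token("light-emitting-diode", 3,  8, 28), // 8 < 23: "backwards"
            new Token("glows",                4, 29, 34)
        );

        // The indexing-chain check rejects exactly this situation: a token
        // whose startOffset is smaller than the previous token's startOffset.
        int prev = -1;
        for (Token t : tokens) {
            if (t.startOffset() < prev) {
                System.out.println("backwards offset at: " + t.term());
            }
            prev = t.startOffset();
        }
    }
}
```

In token order, the second copy of the compound starts at offset 8 after *diode* started at 23, which is the condition DefaultIndexingChain rejects even though each token's offsets are individually sensible.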



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8369) Remove the spatial module as it is obsolete

2019-07-26 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893833#comment-16893833
 ] 

David Smiley commented on LUCENE-8369:
--

IMO, to most users anything beyond points, rectangles, and point-radius is 
exotic/specialized.  Many search apps don't have any spatial at all, for 
that matter.

> Remove the spatial module as it is obsolete
> ---
>
> Key: LUCENE-8369
> URL: https://issues.apache.org/jira/browse/LUCENE-8369
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/spatial
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
> Attachments: LUCENE-8369.patch
>
>
> The "spatial" module is at this juncture nearly empty with only a couple 
> utilities that aren't used by anything in the entire codebase -- 
> GeoRelationUtils, and MortonEncoder.  Perhaps it should have been removed 
> earlier in LUCENE-7664 which was the removal of GeoPointField which was 
> essentially why the module existed.  Better late than never.






[GitHub] [lucene-solr] atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal 
Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515451533
 
 
   I looked at the testGrouping failure -- it looks like the test assumes that 
the second-pass collector will always collect at least docsInGroup hits, and 
thus picks a random offset there to start with. However, that is not always 
true, and the number of hits collected can be smaller.
   
   I think the ideal way to use topDocs is to get totalHits from the 
corresponding TopDocsCollector and use that as the fence when setting the 
starting point.
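A minimal sketch of that fencing idea in plain Java (the names here are illustrative; the real test would read `totalHits` off the `TopDocsCollector` rather than take it as a parameter):

```java
import java.util.Random;

public class StartFence {
    // Pick a random starting offset fenced by the number of hits actually
    // collected, so a later topDocs(start, n) call can never go out of range.
    static int randomStart(Random random, int collectedHits) {
        if (collectedHits <= 0) {
            return 0; // nothing collected: the only safe start is 0
        }
        return random.nextInt(collectedHits); // always in [0, collectedHits)
    }

    public static void main(String[] args) {
        // With 3 collected hits, start is always one of 0, 1, or 2.
        System.out.println(randomStart(new Random(), 3));
    }
}
```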
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal 
Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515440912
 
 
   > > I'm wondering whether we should consider removing this `topDocs` variant 
which takes a `howMany` parameter. I can't think of a good reason to pass 
anything other than `numHits-start` there. Is there a use case I'm missing?
   > 
   > I think it was intended to allow people to "scroll" through hits and 
"pick" a small subset, e.g. get the hits from 4th best to 8th best. I agree 
that there is no concrete use case that comes to mind though, so +1 to 
removing it
   
   Another question is -- what is the expected behaviour when start > size? 
Should we error out, return a null TopDocs (as we do today), or normalize 
the start value to 0?





[GitHub] [lucene-solr] atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal 
Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515432956
 
 
   > I'm wondering whether we should consider removing this `topDocs` variant 
which takes a `howMany` parameter. I can't think of a good reason to pass 
anything other than `numHits-start` there. Is there a use case I'm missing?
   
   I think it was intended to allow people to "scroll" through hits and "pick" 
a small subset, e.g. get the hits from 4th best to 8th best. I agree that 
there is no concrete use case that comes to mind though, so +1 to removing it





[GitHub] [lucene-solr] jpountz commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
jpountz commented on issue #769: LUCENE-8905: Better Error Handling For Illegal 
Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515432397
 
 
   I'm wondering whether we should consider removing this `topDocs` variant 
which takes a `howMany` parameter. I can't think of a good reason to pass 
anything other than `numHits-start` there. Is there a use case I'm missing?





[GitHub] [lucene-solr] atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
atris commented on issue #769: LUCENE-8905: Better Error Handling For Illegal 
Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515429984
 
 
   > I ran top-level `ant test` and hit this exception on the PR as of 
yesterday:
   > 
   > ```
   >[junit4] Suite: org.apache.lucene.search.grouping.TestGrouping
   >[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGrouping 
-Dtests.method=testRandom -Dtests.seed=F08BE82DC9B3756C -Dtests.badapples=true 
-Dtests.locale=es-BZ -Dtests.timezone\
   > =Africa/Gaborone -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   >[junit4] ERROR   1.65s J0 | TestGrouping.testRandom <<<
   >[junit4]> Throwable #1: java.lang.IllegalArgumentException: 
Expected value of starting position is between 0 and 3, got 7
   >[junit4]>at 
__randomizedtesting.SeedInfo.seed([F08BE82DC9B3756C:82C7CD2278D3C31F]:0)
   >[junit4]>at 
org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:141)
   >[junit4]>at 
org.apache.lucene.search.grouping.TopGroupsCollector.getTopGroups(TopGroupsCollector.java:163)
   >[junit4]>at 
org.apache.lucene.search.grouping.TestGrouping.getTopGroups(TestGrouping.java:319)
   >[junit4]>at 
org.apache.lucene.search.grouping.TestGrouping.testRandom(TestGrouping.java:965)
   >[junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >[junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   >[junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >[junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
   >[junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
   >[junit4]   2> NOTE: test params are: codec=Asserting(Lucene80): 
{groupend=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128))),
 sort1=FST50, sort2=TestB\
   > 
loomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128))),
 content=PostingsFormat(name=Direct), group=FST50}, 
docValues:{author=DocValuesFormat(name=Asserting), sort\
   > 1=DocValuesFormat(name=Lucene80), id=DocValuesFormat(name=Lucene80), 
sort2=DocValuesFormat(name=Direct), 
___soft_deletes=DocValuesFormat(name=Lucene80), 
group=DocValuesFormat(name=Lucene80)\
   > }, maxPointsInLeafNode=2036, maxMBSortInHeap=7.045070063116054, 
sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@19474185),
 locale=es-BZ, timezone=Africa/Gaborone
   >[junit4]   2> NOTE: Linux 4.4.0-96-generic amd64/Oracle Corporation 
11.0.2 (64-bit)/cpus=8,threads=1,free=510432528,total=536870912
   >[junit4]   2> NOTE: All tests run in this JVM: [TestGrouping]
   > ```
   
   Interesting, I could not reproduce this on previous versions of Lucene, but 
rebasing got me this error. I added the check Adrien suggested and got the 
following failure:
   
   java.lang.IllegalArgumentException: Requested number of hits greater than 
original hits limit for the collector. size 4 start 2 howMany 49
   
   I think that happens because today we clamp howMany to be within valid 
limits when it is larger than the available number of hits, rather than 
erroring out.
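A minimal sketch of the clamping behaviour described, assuming the semantics above (this is illustrative, not the actual `TopDocsCollector` code, and the `clamp` name is hypothetical):

```java
public class ClampHowMany {
    // Current behaviour as described above: howMany is silently reduced to
    // the number of hits still available after start, instead of erroring.
    static int clamp(int size, int start, int howMany) {
        return Math.max(0, Math.min(howMany, size - start));
    }

    public static void main(String[] args) {
        // The failure above: size 4, start 2, howMany 49 -> only 2 hits remain.
        System.out.println(clamp(4, 2, 49));
    }
}
```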





[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893786#comment-16893786
 ] 

Adrien Grand commented on LUCENE-8935:
--

Woops indeed you are right. +1 to the attached patch!

> BooleanQuery with no scoring clauses cannot skip documents when running 
> TOP_SCORES mode
> ---
>
> Key: LUCENE-8935
> URL: https://issues.apache.org/jira/browse/LUCENE-8935
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8935.patch
>
>
> Today a boolean query that is composed of filtering clauses only (more than 
> one) cannot skip documents when the search is executed with the TOP_SCORES 
> mode. However since all documents have a score of 0 it should be possible to 
> early terminate the query as soon as we collected enough top hits. Wrapping 
> the resulting boolean scorer in a constant score scorer should allow early 
> termination in this case and would speed up the retrieval of top hits case 
> considerably if the total hit count is not requested.
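The early-termination argument above can be sketched in plain Java (illustrative only; the attached patch achieves this by wrapping the resulting boolean scorer in a constant-score scorer, and the method name here is hypothetical):

```java
public class ConstantScoreEarlyTermination {
    // When every matching document has the same score (here, 0), the first
    // k matches already are the top k, so collection can stop early.
    static int docsVisited(int[] matchingDocs, int k) {
        int visited = 0;
        for (int doc : matchingDocs) {
            visited++;
            if (visited == k) {
                break; // no later hit can displace a constant-score hit
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // With 1000 matches and k=10, only 10 documents need to be visited.
        System.out.println(docsVisited(new int[1000], 10));
    }
}
```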






[jira] [Commented] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-07-26 Thread Ignacio Vera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893763#comment-16893763
 ] 

Ignacio Vera commented on LUCENE-8928:
--

I tried to see the effect running on 1D ranges and it is the same as above. So 
+1 to apply this change only when numDims > 2, as it seems the right trade-off.

> BKDWriter could make splitting decisions based on the actual range of values
> 
>
> Key: LUCENE-8928
> URL: https://issues.apache.org/jira/browse/LUCENE-8928
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> Currently BKDWriter assumes that splitting on one dimension has no effect on 
> values in other dimensions. While this may be ok for geo points, this is 
> usually not true for ranges (or geo shapes, which are ranges too). Maybe we 
> could get better indexing by re-computing the range of values on each 
> dimension before making the choice of the split dimension?






[jira] [Comment Edited] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-07-26 Thread Ignacio Vera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893763#comment-16893763
 ] 

Ignacio Vera edited comment on LUCENE-8928 at 7/26/19 12:02 PM:


I tried to see the effect running on 1D ranges and it is the same as above. So 
+1 to apply this change only when numDims > 2 as it seems the right trade off.


was (Author: ivera):
I tried to see the effect running on 1D ranges and it is the same as above. So 
+1 to apply this change only when numDims > 2 as it seems the right tree off.

> BKDWriter could make splitting decisions based on the actual range of values
> 
>
> Key: LUCENE-8928
> URL: https://issues.apache.org/jira/browse/LUCENE-8928
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> Currently BKDWriter assumes that splitting on one dimension has no effect on 
> values in other dimensions. While this may be ok for geo points, this is 
> usually not true for ranges (or geo shapes, which are ranges too). Maybe we 
> could get better indexing by re-computing the range of values on each 
> dimension before making the choice of the split dimension?






[JENKINS] Lucene-Solr-NightlyTests-8.2 - Build # 14 - Still Failing

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.2/14/

No tests ran.

Build Log:
[...truncated 25 lines...]
ERROR: Failed to check out http://svn.apache.org/repos/asf/lucene/test-data
org.tmatesoft.svn.core.SVNException: svn: E175002: connection refused by the 
server
svn: E175002: OPTIONS request failed on '/repos/asf/lucene/test-data'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:112)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:96)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:765)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:352)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:340)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:910)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:702)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:113)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1035)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getLatestRevision(DAVRepository.java:164)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.getRevisionNumber(SvnNgRepositoryAccess.java:119)
at 
org.tmatesoft.svn.core.internal.wc2.SvnRepositoryAccess.getLocations(SvnRepositoryAccess.java:178)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.createRepositoryFor(SvnNgRepositoryAccess.java:43)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgAbstractUpdate.checkout(SvnNgAbstractUpdate.java:831)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:26)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:11)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgOperationRunner.run(SvnNgOperationRunner.java:20)
at 
org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at 
org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1239)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at 
hudson.scm.subversion.CheckoutUpdater$SubversionUpdateTask.perform(CheckoutUpdater.java:133)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:176)
at 
hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:134)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:1041)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:1017)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:990)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3086)
at hudson.remoting.UserRequest.perform(UserRequest.java:212)
at hudson.remoting.UserRequest.perform(UserRequest.java:54)
at hudson.remoting.Request$2.run(Request.java:369)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketConnection.run(SVNSocketConnection.java:57)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
... 4 more
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)

[JENKINS] Lucene-Solr-Tests-8.x - Build # 316 - Failure

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-8.x/316/

1 tests failed.
FAILED:  org.apache.solr.cloud.LeaderElectionTest.testStressElection

Error Message:
Could not get leader props for collection1 shard1

Stack Trace:
java.lang.RuntimeException: Could not get leader props for collection1 shard1
at 
__randomizedtesting.SeedInfo.seed([105D35EF39B8F981:EF0BE1415EF6E892]:0)
at 
org.apache.solr.cloud.LeaderElectionTest.getLeaderUrl(LeaderElectionTest.java:263)
at 
org.apache.solr.cloud.LeaderElectionTest.getLeaderThread(LeaderElectionTest.java:416)
at 
org.apache.solr.cloud.LeaderElectionTest.testStressElection(LeaderElectionTest.java:525)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)




Build Log:
[...truncated 13869 lines...]
   [junit4] Suite: org.apache.solr.cloud.LeaderElectionTest
   [junit4]   2> Creating dataDir: 

[jira] [Commented] (LUCENE-8369) Remove the spatial module as it is obsolete

2019-07-26 Thread Ignacio Vera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893752#comment-16893752
 ] 

Ignacio Vera commented on LUCENE-8369:
--

My best example is LUCENE-8746: I am trying to refactor the classes that 
contain the spatial logic, and having them in different packages makes it very 
difficult.

In addition, how do we assess what is common and what is exotic? Maybe 
pointInBox (which is a range anyway) is the most common case, but 
pointInPolygon might start moving into the exotic area.

> Remove the spatial module as it is obsolete
> ---
>
> Key: LUCENE-8369
> URL: https://issues.apache.org/jira/browse/LUCENE-8369
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/spatial
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
> Attachments: LUCENE-8369.patch
>
>
> The "spatial" module is at this juncture nearly empty with only a couple 
> utilities that aren't used by anything in the entire codebase -- 
> GeoRelationUtils, and MortonEncoder.  Perhaps it should have been removed 
> earlier in LUCENE-7664 which was the removal of GeoPointField which was 
> essentially why the module existed.  Better late than never.






[GitHub] [lucene-solr] mikemccand commented on issue #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
mikemccand commented on issue #769: LUCENE-8905: Better Error Handling For 
Illegal Arguments
URL: https://github.com/apache/lucene-solr/pull/769#issuecomment-515420557
 
 
   I ran top-level `ant test` and hit this exception on the PR as of yesterday:
   
   ```
  [junit4] Suite: org.apache.lucene.search.grouping.TestGrouping
  [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGrouping 
-Dtests.method=testRandom -Dtests.seed=F08BE82DC9B3756C -Dtests.badapples=true 
-Dtests.locale=es-BZ -Dtests.timezone\
   =Africa/Gaborone -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
  [junit4] ERROR   1.65s J0 | TestGrouping.testRandom <<<
  [junit4]> Throwable #1: java.lang.IllegalArgumentException: Expected 
value of starting position is between 0 and 3, got 7
  [junit4]>at 
__randomizedtesting.SeedInfo.seed([F08BE82DC9B3756C:82C7CD2278D3C31F]:0)
  [junit4]>at 
org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:141)
  [junit4]>at 
org.apache.lucene.search.grouping.TopGroupsCollector.getTopGroups(TopGroupsCollector.java:163)
  [junit4]>at 
org.apache.lucene.search.grouping.TestGrouping.getTopGroups(TestGrouping.java:319)
  [junit4]>at 
org.apache.lucene.search.grouping.TestGrouping.testRandom(TestGrouping.java:965)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
  [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
  [junit4]   2> NOTE: test params are: codec=Asserting(Lucene80): 
{groupend=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128))),
 sort1=FST50, sort2=TestB\
   
loomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128))),
 content=PostingsFormat(name=Direct), group=FST50}, 
docValues:{author=DocValuesFormat(name=Asserting), sort\
   1=DocValuesFormat(name=Lucene80), id=DocValuesFormat(name=Lucene80), 
sort2=DocValuesFormat(name=Direct), 
___soft_deletes=DocValuesFormat(name=Lucene80), 
group=DocValuesFormat(name=Lucene80)\
   }, maxPointsInLeafNode=2036, maxMBSortInHeap=7.045070063116054, 
sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@19474185),
 locale=es-BZ, timezone=Africa/Gaborone
  [junit4]   2> NOTE: Linux 4.4.0-96-generic amd64/Oracle Corporation 
11.0.2 (64-bit)/cpus=8,threads=1,free=510432528,total=536870912
  [junit4]   2> NOTE: All tests run in this JVM: [TestGrouping]
   ```





[GitHub] [lucene-solr] atris commented on a change in pull request #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
atris commented on a change in pull request #769: LUCENE-8905: Better Error 
Handling For Illegal Arguments
URL: https://github.com/apache/lucene-solr/pull/769#discussion_r307702699
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopDocsCollector.java
 ##
 @@ -136,14 +136,21 @@ public TopDocs topDocs(int start, int howMany) {
 // pq.size() or totalHits.
 int size = topDocsSize();
 
-// Don't bother to throw an exception, just return an empty TopDocs in case
-// the parameters are invalid or out of range.
-// TODO: shouldn't we throw IAE if apps give bad params here so they dont
-// have sneaky silent bugs?
-if (start < 0 || start >= size || howMany <= 0) {
+
+if (start < 0 || start > size) {
 
 Review comment:
   @jpountz That can allow potential abuse by passing in Integer.MAX_VALUE for 
numHits along with a value of start > size. With this check gone, we would not 
be able to detect that case.
   
   Should we keep the existing check and add the one you proposed?





[JENKINS] Lucene-Solr-8.2-Linux (64bit/jdk1.8.0_201) - Build # 473 - Failure!

2019-07-26 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.2-Linux/473/
Java: 64bit/jdk1.8.0_201 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 68107 lines...]
BUILD FAILED
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/build.xml:634: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/build.xml:101: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/solr/build.xml:624: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/solr/build.xml:593: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/lucene/common-build.xml:2552: 
Failed to load 
/home/jenkins/workspace/Lucene-Solr-8.2-Linux/dev-tools/doap/solr.rdf

Total time: 83 minutes 45 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2


Re: Lucene/Solr 8.2.0

2019-07-26 Thread Ignacio Vera
Sorry about that, just pushed a fix

On Fri, Jul 26, 2019 at 12:34 PM Jason Gerlowski 
wrote:

> Hey Ignacio,
>
> I think in the course of cutting the 8.2 release, you might've broken
> precommit on master.  I'm getting the following error consistently:
>
> [xmlproperty] [Fatal Error] solr.rdf:685:5: The element type "release"
> must be terminated by the matching end-tag "".
> ...
> BUILD FAILED
>
> When I look at dev-tools/doap/solr.rdf, the new 8.2.0 release section
> doesn't have a closing tag.
>
> Best,
>
> Jason
>
> On Fri, Jul 19, 2019 at 2:50 AM Adrien Grand  wrote:
> >
> > +1
> >
> > On Thu, Jul 18, 2019 at 9:38 AM Ignacio Vera  wrote:
> > >
> > > Hi,
> > >
> > > As there are no blockers for the release of Lucene/Solr 8.2 and the
> > > branch is stable, I am planning to build the first release candidate
> > > tomorrow (Friday). Please let us know if there is any concern/issue
> > > that needs to be dealt with before moving to the next step.
> > >
> > >
> > > On Mon, Jul 15, 2019 at 11:32 PM Michael Sokolov 
> wrote:
> > >>
> > >> Thanks, good catch, I'll set the current version back to 6. I haven't
> > >> seen any comments on the (trivial) PR, so I'll push tonight in order
> > >> to keep the release train rolling
> > >>
> > >> On Mon, Jul 15, 2019 at 3:28 PM David Smiley <
> david.w.smi...@gmail.com> wrote:
> > >> >
> > >> > Disable or rollback; I'm good either way.  I think you should
> un-bump the FST version since the feature becomes entirely experimental.
> > >> >
> > >> > ~ David Smiley
> > >> > Apache Lucene/Solr Search Developer
> > >> > http://www.linkedin.com/in/davidwsmiley
> > >> >
> > >> >
> > >> > On Mon, Jul 15, 2019 at 12:34 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
> > >> >>
> > >> >> +1 to rollback and having a 8.3 as soon as we nail this down (even
> if that is days or 1-2 weeks after 8.2).
> > >> >>
> > >> >> On Mon, 15 Jul, 2019, 9:22 PM Michael Sokolov, 
> wrote:
> > >> >>>
> > >> >>> I guess whether we roll back depends on timing. I think we are
> close
> > >> >>> to a release though, and these changes are complex and will
> require
> > >> >>> further testing, so rollback seems reasonable to me. I think from
> code
> > >> >>> management perspective it will be simplest to disable direct
> > >> >>> addressing for now, rather than actually reverting the various
> commits
> > >> >>> that are in place. I can post a patch doing that today.
> > >> >>>
> > >> >>> I like the ideas you have for compressing FSTs further. It was
> > >> >>> bothering me that we store the labels needlessly. I do think that
> > >> >>> before making more radical changes to Arc though, I would like to
> add
> > >> >>> some encapsulation so that we can be a bit freer without being
> > >> >>> concerned about the abstraction leaking (Several classes depend
> on the
> > >> >>> Arc internals today). EG I'd like to make its members private and
> add
> > >> >>> getters. I know this is a performance-sensitive area, and maybe
> we had
> > >> >>> a reason for not using them? Do we have some experience that
> suggests
> > >> >>> that would be a performance issue? My assumption is that JIT
> > >> >>> compilation would make that free, but I haven't tested.
> > >> >>>
> > >> >>> On Mon, Jul 15, 2019 at 11:36 AM Adrien Grand 
> wrote:
> > >> >>> >
> > >> >>> > That would be great. I wonder whether we could also make the
> > >> >>> > encoding a bit more efficient. For instance, I noticed that arc
> > >> >>> > metadata is pretty large in some cases (10-20 bytes), which makes
> > >> >>> > gaps very costly.
> > >> >>> > Associating each label with a dense id and having an
> intermediate
> > >> >>> > lookup, ie. lookup label -> id and then id->arc offset instead
> of
> > >> >>> > doing label->arc directly could save a lot of space in some
> cases?
> > >> >>> > Also it seems that we are repeating the label in the arc
> metadata when
> > >> >>> > array-with-gaps is used, even though it shouldn't be necessary
> since
> > >> >>> > the label is implicit from the address?
> > >> >>> >
> > >> >>> > Do you think we can have a mitigation for worst-case scenarios
> > >> >>> > in 8.2
> > >> >>> > or should we revert from branch_8_2 to keep the release process
> going
> > >> >>> > and work on this for 8.3?
> > >> >>> >
> > >> >>> > On Mon, Jul 15, 2019 at 5:12 PM Michael Sokolov <
> msoko...@gmail.com> wrote:
> > >> >>> > >
> > >> >>> > > Thanks for the nice test, Adrien. Yes, the tradeoff of direct
> > >> >>> > > addressing is heavily data-dependent. I think we can improve
> the
> > >> >>> > > situation here by tracking, per-FST instance, the size
> increase we're
> > >> >>> > > seeing while building (or perhaps do a preliminary pass before
> > >> >>> > > building) in order to decide whether to apply the encoding.
> > >> >>> > >
> > >> >>> > > On Mon, Jul 15, 2019 at 9:02 AM Adrien Grand <
> jpou...@gmail.com> wrote:
> > >> >>> > > >
> > >> >>> > > > I dug this a bit and suspect that the issue is mostly with
> one field
> > >> >>> > > > that is not 

Re: Lucene/Solr 8.2.0

2019-07-26 Thread Jason Gerlowski
Hey Ignacio,

I think in the course of cutting the 8.2 release, you might've broken
precommit on master.  I'm getting the following error consistently:

[xmlproperty] [Fatal Error] solr.rdf:685:5: The element type "release"
must be terminated by the matching end-tag "</release>".
...
BUILD FAILED

When I look at dev-tools/doap/solr.rdf, the new 8.2.0 release section
doesn't have a closing tag.

Best,

Jason

On Fri, Jul 19, 2019 at 2:50 AM Adrien Grand  wrote:
>
> +1
>
> On Thu, Jul 18, 2019 at 9:38 AM Ignacio Vera  wrote:
> >
> > Hi,
> >
> > As there are no blockers for the release of Lucene/Solr 8.2 and the branch 
> > is stable, I am planning to build the first release candidate tomorrow 
> > (Friday). Please let us know if there is any concern/issue that needs to 
> > be dealt with before moving to the next step.
> >
> >
> > On Mon, Jul 15, 2019 at 11:32 PM Michael Sokolov  wrote:
> >>
> >> Thanks, good catch, I'll set the current version back to 6. I haven't
> >> seen any comments on the (trivial) PR, so I'll push tonight in order
> >> to keep the release train rolling
> >>
> >> On Mon, Jul 15, 2019 at 3:28 PM David Smiley  
> >> wrote:
> >> >
> >> > Disable or rollback; I'm good either way.  I think you should un-bump 
> >> > the FST version since the feature becomes entirely experimental.
> >> >
> >> > ~ David Smiley
> >> > Apache Lucene/Solr Search Developer
> >> > http://www.linkedin.com/in/davidwsmiley
> >> >
> >> >
> >> > On Mon, Jul 15, 2019 at 12:34 PM Ishan Chattopadhyaya 
> >> >  wrote:
> >> >>
> >> >> +1 to rollback and having a 8.3 as soon as we nail this down (even if 
> >> >> that is days or 1-2 weeks after 8.2).
> >> >>
> >> >> On Mon, 15 Jul, 2019, 9:22 PM Michael Sokolov,  
> >> >> wrote:
> >> >>>
> >> >>> I guess whether we roll back depends on timing. I think we are close
> >> >>> to a release though, and these changes are complex and will require
> >> >>> further testing, so rollback seems reasonable to me. I think from code
> >> >>> management perspective it will be simplest to disable direct
> >> >>> addressing for now, rather than actually reverting the various commits
> >> >>> that are in place. I can post a patch doing that today.
> >> >>>
> >> >>> I like the ideas you have for compressing FSTs further. It was
> >> >>> bothering me that we store the labels needlessly. I do think that
> >> >>> before making more radical changes to Arc though, I would like to add
> >> >>> some encapsulation so that we can be a bit freer without being
> >> >>> concerned about the abstraction leaking (Several classes depend on the
> >> >>> Arc internals today). EG I'd like to make its members private and add
> >> >>> getters. I know this is a performance-sensitive area, and maybe we had
> >> >>> a reason for not using them? Do we have some experience that suggests
> >> >>> that would be a performance issue? My assumption is that JIT
> >> >>> compilation would make that free, but I haven't tested.
> >> >>>
> >> >>> On Mon, Jul 15, 2019 at 11:36 AM Adrien Grand  
> >> >>> wrote:
> >> >>> >
> >> >>> > That would be great. I wonder whether we could also make the encoding 
> >> >>> > a bit more efficient. For instance, I noticed that arc metadata is 
> >> >>> > pretty large in some cases (10-20 bytes), which makes gaps very costly.
> >> >>> > Associating each label with a dense id and having an intermediate
> >> >>> > lookup, ie. lookup label -> id and then id->arc offset instead of
> >> >>> > doing label->arc directly could save a lot of space in some cases?
> >> >>> > Also it seems that we are repeating the label in the arc metadata 
> >> >>> > when
> >> >>> > array-with-gaps is used, even though it shouldn't be necessary since
> >> >>> > the label is implicit from the address?
> >> >>> >
> >> >>> > Do you think we can have a mitigation for worst-case scenarios in 8.2
> >> >>> > or should we revert from branch_8_2 to keep the release process going
> >> >>> > and work on this for 8.3?
> >> >>> >
> >> >>> > On Mon, Jul 15, 2019 at 5:12 PM Michael Sokolov  
> >> >>> > wrote:
> >> >>> > >
> >> >>> > > Thanks for the nice test, Adrien. Yes, the tradeoff of direct
> >> >>> > > addressing is heavily data-dependent. I think we can improve the
> >> >>> > > situation here by tracking, per-FST instance, the size increase 
> >> >>> > > we're
> >> >>> > > seeing while building (or perhaps do a preliminary pass before
> >> >>> > > building) in order to decide whether to apply the encoding.
> >> >>> > >
> >> >>> > > On Mon, Jul 15, 2019 at 9:02 AM Adrien Grand  
> >> >>> > > wrote:
> >> >>> > > >
> >> >>> > > > I dug this a bit and suspect that the issue is mostly with one 
> >> >>> > > > field
> >> >>> > > > that is not part of the data but auto-generated: the ID field. 
> >> >>> > > > It is a
> >> >>> > > > slight variant of Flake IDs, so it's not random, it includes a
> >> >>> > > > timestamp and a sequence number, and I suspect that its patterns
> >> >>> > > > combined with the larger alphabet than ascii makes 
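
Adrien's suggestion above — associating each label with a dense id and looking up arc offsets through that id, instead of repeating the label in per-arc metadata — can be sketched in plain Java. The class names here (`DenseArcIndex`, `DenseArcIndexDemo`) are illustrative stand-ins, not Lucene's actual FST classes:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Two-step lookup: label -> dense id, then dense id -> arc offset.
// With a dense id table, per-arc metadata no longer needs to repeat
// the label, and a gap in the label space costs one small table entry
// instead of a full (10-20 byte) arc slot.
class DenseArcIndex {
    private final Map<Integer, Integer> labelToId = new HashMap<>();
    private final List<Long> idToArcOffset = new ArrayList<>();

    // Register an arc for `label` stored at `arcOffset`.
    void addArc(int label, long arcOffset) {
        labelToId.put(label, idToArcOffset.size());
        idToArcOffset.add(arcOffset);
    }

    // Returns the arc offset for `label`, or -1 if the label is absent.
    long lookup(int label) {
        Integer id = labelToId.get(label);
        return id == null ? -1 : idToArcOffset.get(id);
    }
}

public class DenseArcIndexDemo {
    public static void main(String[] args) {
        DenseArcIndex idx = new DenseArcIndex();
        idx.addArc('a', 100L);
        idx.addArc('z', 140L); // large label gap, no wasted arc slots
        System.out.println(idx.lookup('a')); // 100
        System.out.println(idx.lookup('z')); // 140
        System.out.println(idx.lookup('b')); // -1
    }
}
```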

[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893713#comment-16893713
 ] 

Jim Ferenczi commented on LUCENE-8935:
--

Sorry, I misunderstood the logic, but the number of scoring clauses is already 
computed from the pruned list of scorers, so the current patch works. It's the 
scorer supplier that can be null, but in that case it would not appear in 
Boolean2ScorerSupplier.

> BooleanQuery with no scoring clauses cannot skip documents when running 
> TOP_SCORES mode
> ---
>
> Key: LUCENE-8935
> URL: https://issues.apache.org/jira/browse/LUCENE-8935
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8935.patch
>
>
> Today a boolean query that is composed of filtering clauses only (more than 
> one) cannot skip documents when the search is executed with the TOP_SCORES 
> mode. However, since all documents have a score of 0, it should be possible to 
> early-terminate the query as soon as we have collected enough top hits. 
> Wrapping the resulting boolean scorer in a constant-score scorer should allow 
> early termination in this case and would speed up the retrieval of top hits 
> considerably if the total hit count is not requested.
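
The wrapping idea in this issue — every hit of a filter-only boolean query scores 0, so the scorer can be wrapped to return a constant score, letting a TOP_SCORES collector stop once enough hits are buffered — can be sketched with a minimal stand-in interface. `SimpleScorer` and `ConstantScoreWrapper` are illustrative, not Lucene's real Scorer API:

```java
// Minimal iterator-plus-score abstraction standing in for a scorer.
interface SimpleScorer {
    int nextDoc();   // returns -1 when exhausted
    float score();
}

// Wraps another scorer and ignores its score entirely, always
// returning the configured constant (0 for filter-only queries).
class ConstantScoreWrapper implements SimpleScorer {
    private final SimpleScorer in;
    private final float constant;

    ConstantScoreWrapper(SimpleScorer in, float constant) {
        this.in = in;
        this.constant = constant;
    }

    public int nextDoc() { return in.nextDoc(); }
    public float score() { return constant; }
}

public class ConstantScoreDemo {
    public static void main(String[] args) {
        int[] docs = {3, 7, 12};
        // A filter-only scorer: it can iterate matches but has no
        // meaningful score to offer.
        SimpleScorer filterOnly = new SimpleScorer() {
            int i = 0;
            public int nextDoc() { return i < docs.length ? docs[i++] : -1; }
            public float score() { throw new UnsupportedOperationException(); }
        };
        SimpleScorer scorer = new ConstantScoreWrapper(filterOnly, 0f);
        for (int doc = scorer.nextDoc(); doc != -1; doc = scorer.nextDoc()) {
            System.out.println(doc + " -> " + scorer.score()); // 3 -> 0.0, etc.
        }
    }
}
```

Because every document scores the same constant, no later hit can be more competitive than the ones already collected, which is exactly what makes early termination safe here.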



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)




[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893708#comment-16893708
 ] 

Jim Ferenczi commented on LUCENE-8935:
--

The logic is already at the bottom of Boolean2ScorerSupplier#get, but good call 
on the SHOULD clause that can produce a null scorer.

We can check the number of scoring clauses after the build instead of checking 
the number of scorer suppliers. I'll work on a fix.

> BooleanQuery with no scoring clauses cannot skip documents when running 
> TOP_SCORES mode
> ---
>
> Key: LUCENE-8935
> URL: https://issues.apache.org/jira/browse/LUCENE-8935
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8935.patch
>
>
> Today a boolean query that is composed of filtering clauses only (more than 
> one) cannot skip documents when the search is executed with the TOP_SCORES 
> mode. However, since all documents have a score of 0, it should be possible to 
> early-terminate the query as soon as we have collected enough top hits. 
> Wrapping the resulting boolean scorer in a constant-score scorer should allow 
> early termination in this case and would speed up the retrieval of top hits 
> considerably if the total hit count is not requested.






[JENKINS] Lucene-Solr-SmokeRelease-8.x - Build # 160 - Still Failing

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-8.x/160/

No tests ran.

Build Log:
[...truncated 24989 lines...]
[asciidoctor:convert] asciidoctor: ERROR: about-this-guide.adoc: line 1: invalid part, must have at least one section (e.g., chapter, appendix, etc.)
[asciidoctor:convert] asciidoctor: ERROR: solr-glossary.adoc: line 1: invalid part, must have at least one section (e.g., chapter, appendix, etc.)
 [java] Processed 2590 links (2119 relative) to 3408 anchors in 259 files
 [echo] Validated Links & Anchors via: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr-ref-guide/bare-bones-html/

-dist-changes:
 [copy] Copying 4 files to 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/package/changes

package:

-unpack-solr-tgz:

-ensure-solr-tgz-exists:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr.tgz.unpacked
[untar] Expanding: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/package/solr-8.3.0.tgz
 into 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr.tgz.unpacked

generate-maven-artifacts:

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = /home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

[...truncated: repeated resolve / ivy-availability-check / ivy-fail / ivy-configure cycles...]

[GitHub] [lucene-solr] atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With Multiple Ranges

2019-07-26 Thread GitBox
atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With 
Multiple Ranges
URL: https://github.com/apache/lucene-solr/pull/794#issuecomment-515390656
 
 
   @jpountz Updated the same


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-11.0.3) - Build # 24450 - Failure!

2019-07-26 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/24450/
Java: 64bit/jdk-11.0.3 -XX:+UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 67663 lines...]
BUILD FAILED
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:634: The following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:101: The following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build.xml:625: The following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build.xml:594: The following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/common-build.xml:2463: Failed to load /home/jenkins/workspace/Lucene-Solr-master-Linux/dev-tools/doap/solr.rdf

Total time: 89 minutes 10 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Setting ANT_1_8_2_HOME=/home/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any


[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893690#comment-16893690
 ] 

Adrien Grand commented on LUCENE-8935:
--

The approach works for me. I'm wondering whether, if we put this logic at the 
very bottom of Boolean2ScorerSupplier#get instead, we'd also cover the case 
where there is a SHOULD clause in addition to the FILTER clauses but it 
produces a null scorer.

> BooleanQuery with no scoring clauses cannot skip documents when running 
> TOP_SCORES mode
> ---
>
> Key: LUCENE-8935
> URL: https://issues.apache.org/jira/browse/LUCENE-8935
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8935.patch
>
>
> Today a boolean query that is composed of filtering clauses only (more than 
> one) cannot skip documents when the search is executed with the TOP_SCORES 
> mode. However, since all documents have a score of 0, it should be possible to 
> early-terminate the query as soon as we have collected enough top hits. 
> Wrapping the resulting boolean scorer in a constant-score scorer should allow 
> early termination in this case and would speed up the retrieval of top hits 
> considerably if the total hit count is not requested.






[GitHub] [lucene-solr] jpountz commented on a change in pull request #797: LUCENE-8921: IndexSearcher.termStatistics requires docFreq totalTermFreq instead of TermStates

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #797: LUCENE-8921: 
IndexSearcher.termStatistics requires docFreq totalTermFreq instead of 
TermStates
URL: https://github.com/apache/lucene-solr/pull/797#discussion_r307667928
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
 ##
 @@ -861,17 +861,17 @@ public String toString() {
   
   /**
* Returns {@link TermStatistics} for a term, or {@code null} if
-   * the term does not exist.
+   * the term does not exist (if docFreq == 0).
* 
* This can be overridden for example, to return a term's statistics
* across a distributed collection.
* @lucene.experimental
*/
-  public TermStatistics termStatistics(Term term, TermStates context) throws IOException {
-if (context.docFreq() == 0) {
+  public TermStatistics termStatistics(Term term, int docFreq, long totalTermFreq) throws IOException {
+if (docFreq == 0) {
 
 Review comment:
Maybe we should do it in two steps: first change the signature and keep the 
docFreq == 0 check, which could probably go in 8.x, and then reject calls when 
docFreq == 0, which we might only want to push to 9.0?





[GitHub] [lucene-solr] jpountz commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r30713
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -279,6 +280,125 @@ public void collect(int doc) throws IOException {
 
   }
 
+  /*
+   * Implements a TopFieldCollector that terminates early based on a global
+   * scoreboard which is shared amongst multiple collectors.
+   * NOTE: This should ideally be outside of TopFieldCollector since it does
+   * not have private access, but we keep it here to limit the visibility
+   * of dependent classes
+   */
+  public static class GlobalStateFieldCollector extends TopFieldCollector {
+
+final Sort sort;
+final FieldValueHitQueue queue;
+final AtomicInteger globalTotalHits;
+
+final GlobalStateCollectorManager.GlobalStateCallback callback;
+
+public GlobalStateFieldCollector(Sort sort, FieldValueHitQueue queue,
+ int numHits, int totalHitsThreshold, AtomicInteger globalTotalHits,
+ GlobalStateCollectorManager.GlobalStateCallback callback) {
+  super(queue, numHits, totalHitsThreshold, sort.needsScores());
+  this.sort = sort;
+  this.queue = queue;
+  this.globalTotalHits = globalTotalHits;
+  this.callback = callback;
+}
+
+@Override
+public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
+  docBase = context.docBase;
+
+  final LeafFieldComparator[] comparators = queue.getComparators(context);
+  final int[] reverseMul = queue.getReverseMul();
+  final Sort indexSort = context.reader().getMetaData().getSort();
+  final boolean canEarlyTerminate = canEarlyTerminate(sort, indexSort);
+
+  return new MultiComparatorLeafCollector(comparators, reverseMul) {
+
+boolean collectedAllCompetitiveHits = false;
+
+@Override
+public void setScorer(Scorable scorer) throws IOException {
+  super.setScorer(scorer);
+  updateMinCompetitiveScore(scorer);
+}
+
+@Override
+public void collect(int doc) throws IOException {
+  // Increment local hit counter
+  totalHits++;
+
+  if (queueFull) {
+if (collectedAllCompetitiveHits || !isHitCompetitive(doc, scorer)) {
+  // since docs are visited in doc Id order, if compare is 0, it means
+  // this document is larger than anything else in the queue, and
+  // therefore not competitive.
+  if (canEarlyTerminate) {
+// Check the global scoreboard to see total hits accumulated yet
+if (globalTotalHits.incrementAndGet() > totalHitsThreshold) {
+  totalHitsRelation = Relation.GREATER_THAN_OR_EQUAL_TO;
+  throw new CollectionTerminatedException();
+} else {
+  collectedAllCompetitiveHits = true;
+}
+  } else if (totalHitsRelation == Relation.EQUAL_TO) {
+// we just reached totalHitsThreshold, we can start setting the min
+// competitive score now
+//TODO: Should we also update competitive score globally?
+updateMinCompetitiveScore(scorer);
+  }
+  return;
+}
+
+// This hit is competitive - replace bottom element in queue & 
adjustTop
+comparator.copy(bottom.slot, doc);
+updateBottom(doc);
+comparator.setBottom(bottom.slot);
+updateMinCompetitiveScore(scorer);
+//Increment global hit counter
+globalTotalHits.incrementAndGet();
+  } else {
+// Startup transient: queue hasn't gathered numHits yet
+
+//Increment global hit counter
+globalTotalHits.incrementAndGet();
+
+final int slot = totalHits - 1;
+
+// Copy hit into queue
+comparator.copy(slot, doc);
+add(slot, doc);
+if (queueFull) {
+  comparator.setBottom(bottom.slot);
+  updateMinCompetitiveScore(scorer);
+}
+  }
+}
+
+// Check if hit is competitive and set the global value accordingly
+private boolean isHitCompetitive(int doc, Scorable scorer) throws IOException {
+  // Check if hit is locally competitive
+  if (reverseMul * comparator.compareBottom(doc) > 0) {
+// Hit was competitive locally, but was it globally competitive?
+if (callback.getGlobalMinCompetitiveScore() > scorer.score()) {
+  return false;
+} else {
+  // Hit was locally and globally competitive, set the right
+  // global minimum competitive score
+  
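
The shared-scoreboard pattern in the diff above — several per-segment collectors incrementing one `AtomicInteger` and terminating once the combined count crosses a threshold — can be sketched independently of Lucene. `SharedCountCollector` and `CollectionDone` are illustrative names; `CollectionDone` stands in for Lucene's `CollectionTerminatedException`:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Signals that this collector has gathered enough hits globally.
class CollectionDone extends RuntimeException {}

// Each collector bumps a counter shared across all collectors and
// stops collecting once the global hit count passes the threshold,
// even if its own local count is still small.
class SharedCountCollector {
    private final AtomicInteger globalHits;
    private final int threshold;
    int localHits = 0;

    SharedCountCollector(AtomicInteger globalHits, int threshold) {
        this.globalHits = globalHits;
        this.threshold = threshold;
    }

    void collect(int doc) {
        localHits++;
        if (globalHits.incrementAndGet() > threshold) {
            throw new CollectionDone(); // enough hits gathered globally
        }
    }
}

public class SharedCountDemo {
    public static void main(String[] args) {
        AtomicInteger global = new AtomicInteger();
        SharedCountCollector a = new SharedCountCollector(global, 5);
        SharedCountCollector b = new SharedCountCollector(global, 5);
        int terminated = 0;
        try { for (int d = 0; d < 4; d++) a.collect(d); } catch (CollectionDone e) { terminated++; }
        try { for (int d = 0; d < 4; d++) b.collect(d); } catch (CollectionDone e) { terminated++; }
        // a collects 4 hits without terminating; b terminates once the
        // shared count exceeds 5.
        System.out.println(global.get() + " " + terminated); // prints "6 1"
    }
}
```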

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893677#comment-16893677
 ] 

Jim Ferenczi commented on LUCENE-8933:
--

{quote}
Should we go further and check that the concatenation of the segments is equal 
to the surface form?
{quote}

+1 too, the user dictionary should be used for segmentation purposes only. This 
would be a breaking change, though, since users seem to abuse this functionality 
to normalize input (see the example). Maybe we can check the length in 8.x and 
the content in master only?
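
The two validation levels discussed above can be sketched in plain Java: the lenient 8.x-style check only requires segment lengths to sum to the surface form's length, while the strict master-style check requires the concatenated segments to reproduce the surface form exactly. `UserDictCheck` is an illustrative name, not JapaneseTokenizer's actual validation code:

```java
public class UserDictCheck {
    // Lenient check: segment lengths must sum to the surface form's
    // length, so offsets stay in range even if content differs.
    static boolean lengthsMatch(String surface, String[] segments) {
        int total = 0;
        for (String s : segments) total += s.length();
        return total == surface.length();
    }

    // Strict check: the segments must concatenate back to the surface
    // form, ruling out "normalizing" dictionary entries entirely.
    static boolean contentMatches(String surface, String[] segments) {
        return String.join("", segments).equals(surface);
    }

    public static void main(String[] args) {
        // Pure segmentation entry: passes both checks.
        System.out.println(contentMatches("日本経済新聞", new String[] {"日本", "経済", "新聞"}));
        // "Normalizing" entry with matching lengths but different
        // content: passes the lenient check, would fail the strict one.
        System.out.println(lengthsMatch("日本経済新聞", new String[] {"日本", "けいざい"}));
    }
}
```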

> JapaneseTokenizer creates Token objects with corrupt offsets
> 
>
> Key: LUCENE-8933
> URL: https://issues.apache.org/jira/browse/LUCENE-8933
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
>
> An Elasticsearch user reported the following stack trace when parsing 
> synonyms. It looks like the only reason why this might occur is if the offset 
> of a {{org.apache.lucene.analysis.ja.Token}} is not within the expected range.
>  
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.copyBuffer(CharTermAttributeImpl.java:44)
>  ~[lucene-core-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - 
> nknize - 2018-12-07 14:44:20]
> at 
> org.apache.lucene.analysis.ja.JapaneseTokenizer.incrementToken(JapaneseTokenizer.java:486)
>  ~[?:?]
> at 
> org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:318)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.elasticsearch.index.analysis.ESSolrSynonymParser.analyze(ESSolrSynonymParser.java:57)
>  ~[elasticsearch-6.6.1.jar:6.6.1]
> at 
> org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:114)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.elasticsearch.index.analysis.SynonymTokenFilterFactory.buildSynonyms(SynonymTokenFilterFactory.java:154)
>  ~[elasticsearch-6.6.1.jar:6.6.1]
> ... 24 more
> {noformat}






[JENKINS] Lucene-Solr-Tests-8.2 - Build # 26 - Failure

2019-07-26 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-8.2/26/

All tests passed

Build Log:
[...truncated 68044 lines...]
BUILD FAILED
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/build.xml:634: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/build.xml:101: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/solr/build.xml:624: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/solr/build.xml:593: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/lucene/common-build.xml:2552: Failed to load /home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-8.2/dev-tools/doap/solr.rdf

Total time: 112 minutes 39 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any


[jira] [Resolved] (LUCENE-8915) Document that RateLimiter's limits may be updated over time

2019-07-26 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-8915.
--
   Resolution: Fixed
Fix Version/s: 8.3

Thanks [~atris]

> Document that RateLimiter's limits may be updated over time
> ---
>
> Key: LUCENE-8915
> URL: https://issues.apache.org/jira/browse/LUCENE-8915
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> RateLimiter does not allow dynamic configuration of the rate limit today. 
> This limits the kind of applications that the functionality can be applied 
> to. This Jira tracks 1) allowing the rate limiter to change limits
> dynamically, and 2) adding a RateLimiter subclass that exposes this.
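For context, Lucene's RateLimiter.SimpleRateLimiter exposes a setMBPerSec-style setter for exactly this purpose. A minimal standalone sketch of a limiter whose limit can be changed while writers are running (this class is illustrative, not Lucene's actual implementation):

```java
/** Minimal sketch of a rate limiter with a dynamically updatable limit.
 *  Illustrative only; not Lucene's RateLimiter API. */
class DynamicRateLimiter {
    // volatile so writer threads observe limit changes made by a control thread
    private volatile double mbPerSec;

    DynamicRateLimiter(double mbPerSec) {
        setMbPerSec(mbPerSec);
    }

    /** May be called at any time; subsequent pauses use the new rate. */
    void setMbPerSec(double mbPerSec) {
        if (mbPerSec <= 0) {
            throw new IllegalArgumentException("rate must be positive: " + mbPerSec);
        }
        this.mbPerSec = mbPerSec;
    }

    double getMbPerSec() {
        return mbPerSec;
    }

    /** How long (in nanos) the caller should pause after writing `bytes`. */
    long pauseNanos(long bytes) {
        double seconds = (bytes / (1024.0 * 1024.0)) / mbPerSec;
        return (long) (seconds * 1_000_000_000L);
    }
}
```

Doubling the rate halves the computed pause, so a controller thread can throttle or unthrottle an in-flight merge without recreating the limiter.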






[jira] [Updated] (LUCENE-8915) Document that RateLimiter's limits may be updated over time

2019-07-26 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-8915:
-
Summary: Document that RateLimiter's limits may be updated over time  (was: Allow RateLimiter To Have Dynamic Limits)

> Document that RateLimiter's limits may be updated over time
> ---
>
> Key: LUCENE-8915
> URL: https://issues.apache.org/jira/browse/LUCENE-8915
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> RateLimiter does not allow dynamic configuration of the rate limit today. 
> This limits the kind of applications that the functionality can be applied 
> to. This Jira tracks 1) allowing the rate limiter to change limits
> dynamically, and 2) adding a RateLimiter subclass that exposes this.






[jira] [Commented] (LUCENE-8915) Allow RateLimiter To Have Dynamic Limits

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893672#comment-16893672
 ] 

ASF subversion and git services commented on LUCENE-8915:
-

Commit bbaa02ddc6da505f250ccf7b6e603e92d40628c7 in lucene-solr's branch 
refs/heads/branch_8x from Atri Sharma
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bbaa02d ]

LUCENE-8915 : Improve Javadocs for RateLimiter and SimpleRateLimiter (#789)




> Allow RateLimiter To Have Dynamic Limits
> 
>
> Key: LUCENE-8915
> URL: https://issues.apache.org/jira/browse/LUCENE-8915
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> RateLimiter does not allow dynamic configuration of the rate limit today. 
> This limits the kind of applications that the functionality can be applied 
> to. This Jira tracks 1) allowing the rate limiter to change limits
> dynamically, and 2) adding a RateLimiter subclass that exposes this.






[jira] [Commented] (LUCENE-8915) Allow RateLimiter To Have Dynamic Limits

2019-07-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893670#comment-16893670
 ] 

ASF subversion and git services commented on LUCENE-8915:
-

Commit 42fadbff79ca93b693e577d13a5a4ebc35d75cbb in lucene-solr's branch 
refs/heads/master from Atri Sharma
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=42fadbf ]

LUCENE-8915 : Improve Javadocs for RateLimiter and SimpleRateLimiter (#789)




> Allow RateLimiter To Have Dynamic Limits
> 
>
> Key: LUCENE-8915
> URL: https://issues.apache.org/jira/browse/LUCENE-8915
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> RateLimiter does not allow dynamic configuration of the rate limit today. 
> This limits the kind of applications that the functionality can be applied 
> to. This Jira tracks 1) allowing the rate limiter to change limits
> dynamically, and 2) adding a RateLimiter subclass that exposes this.






[GitHub] [lucene-solr] jpountz merged pull request #789: LUCENE-8915 : Improve Javadocs for RateLimiter and SimpleRateLimiter

2019-07-26 Thread GitBox
jpountz merged pull request #789: LUCENE-8915 : Improve Javadocs for 
RateLimiter and SimpleRateLimiter
URL: https://github.com/apache/lucene-solr/pull/789
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Updated] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Ferenczi updated LUCENE-8935:
-
Attachment: LUCENE-8935.patch
Status: Open  (was: Open)

Here is a patch that wraps the boolean scorer in a constant score scorer when 
there is no scoring clause and the score mode is TOP_SCORES.

> BooleanQuery with no scoring clauses cannot skip documents when running 
> TOP_SCORES mode
> ---
>
> Key: LUCENE-8935
> URL: https://issues.apache.org/jira/browse/LUCENE-8935
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8935.patch
>
>
> Today a boolean query that is composed of filtering clauses only (more than 
> one) cannot skip documents when the search is executed with the TOP_SCORES 
> mode. However since all documents have a score of 0 it should be possible to 
> early terminate the query as soon as we have collected enough top hits.
> Wrapping the resulting boolean scorer in a constant score scorer should allow
> early termination in this case and would speed up the retrieval of top hits
> considerably if the total hit count is not requested.
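The idea in Jim's patch, reporting a constant score while still delegating matching, can be sketched outside Lucene like this (the interface and class names below are illustrative stand-ins, not Lucene's API):

```java
// Toy stand-in for Lucene's Scorer abstraction; illustrative only.
interface SimpleScorer {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();  // advance to the next matching doc id

    float score();  // score of the current doc
}

/** Wraps a filter-only scorer so every hit reports the same constant score.
 *  A TOP_SCORES collector can then raise its minimum competitive score and
 *  terminate early once enough hits at that constant score are collected. */
class ConstantScoreWrapper implements SimpleScorer {
    private final SimpleScorer in;  // the filter-only boolean scorer
    private final float constant;   // all hits share this score (0 for pure filters)

    ConstantScoreWrapper(SimpleScorer in, float constant) {
        this.in = in;
        this.constant = constant;
    }

    @Override
    public int nextDoc() {
        return in.nextDoc();  // matching is unchanged
    }

    @Override
    public float score() {
        return constant;      // scoring never delegates to the wrapped scorer
    }
}
```

The design point is that the wrapped scorer's document iteration is untouched; only `score()` is replaced, so the collector sees uniform scores and can stop as soon as the top-hits queue is full.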






[GitHub] [lucene-solr] atris commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
atris commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307663405
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -279,6 +280,125 @@ public void collect(int doc) throws IOException {
 
   }
 
+  /*
+   * Implements a TopFieldCollector that terminates early based on a global
+   * scoreboard which is shared amongst multiple collectors.
+   * NOTE: This should ideally be outside of TopFieldCollector since it does
+   * not have private access, but we keep it here to limit the visibility
+   * of dependent classes
+   */
+  public static class GlobalStateFieldCollector extends TopFieldCollector {
+
+    final Sort sort;
+    final FieldValueHitQueue queue;
+    final AtomicInteger globalTotalHits;
+
+    final GlobalStateCollectorManager.GlobalStateCallback callback;
+
+    public GlobalStateFieldCollector(Sort sort, FieldValueHitQueue queue, int numHits,
+        int totalHitsThreshold, AtomicInteger globalTotalHits,
+        GlobalStateCollectorManager.GlobalStateCallback callback) {
+      super(queue, numHits, totalHitsThreshold, sort.needsScores());
+      this.sort = sort;
+      this.queue = queue;
+      this.globalTotalHits = globalTotalHits;
+      this.callback = callback;
+    }
+
+    @Override
+    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
+      docBase = context.docBase;
+
+      final LeafFieldComparator[] comparators = queue.getComparators(context);
+      final int[] reverseMul = queue.getReverseMul();
+      final Sort indexSort = context.reader().getMetaData().getSort();
+      final boolean canEarlyTerminate = canEarlyTerminate(sort, indexSort);
+
+      return new MultiComparatorLeafCollector(comparators, reverseMul) {
+
+        boolean collectedAllCompetitiveHits = false;
+
+        @Override
+        public void setScorer(Scorable scorer) throws IOException {
+          super.setScorer(scorer);
+          updateMinCompetitiveScore(scorer);
+        }
+
+        @Override
+        public void collect(int doc) throws IOException {
+          // Increment local hit counter
+          totalHits++;
+
+          if (queueFull) {
+            if (collectedAllCompetitiveHits || !isHitCompetitive(doc, scorer)) {
+              // since docs are visited in doc Id order, if compare is 0, it means
+              // this document is larger than anything else in the queue, and
+              // therefore not competitive.
+              if (canEarlyTerminate) {
+                // Check the global scoreboard to see total hits accumulated yet
+                if (globalTotalHits.incrementAndGet() > totalHitsThreshold) {
+                  totalHitsRelation = Relation.GREATER_THAN_OR_EQUAL_TO;
+                  throw new CollectionTerminatedException();
+                } else {
+                  collectedAllCompetitiveHits = true;
+                }
+              } else if (totalHitsRelation == Relation.EQUAL_TO) {
+                // we just reached totalHitsThreshold, we can start setting the min
+                // competitive score now
+                // TODO: Should we also update competitive score globally?
+                updateMinCompetitiveScore(scorer);
+              }
+              return;
+            }
+
+            // This hit is competitive - replace bottom element in queue & adjustTop
+            comparator.copy(bottom.slot, doc);
+            updateBottom(doc);
+            comparator.setBottom(bottom.slot);
+            updateMinCompetitiveScore(scorer);
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+          } else {
+            // Startup transient: queue hasn't gathered numHits yet
+
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+
+            final int slot = totalHits - 1;
+
+            // Copy hit into queue
+            comparator.copy(slot, doc);
+            add(slot, doc);
+            if (queueFull) {
+              comparator.setBottom(bottom.slot);
+              updateMinCompetitiveScore(scorer);
+            }
+          }
+        }
+
+        // Check if hit is competitive and set the global value accordingly
+        private boolean isHitCompetitive(int doc, Scorable scorer) throws IOException {
+          // Check if hit is locally competitive
+          if (reverseMul * comparator.compareBottom(doc) > 0) {
+            // Hit was competitive locally, but was it globally competitive?
+            if (callback.getGlobalMinCompetitiveScore() > scorer.score()) {
+              return false;
+            } else {
+              // Hit was locally and globally competitive, set the right
+              // global minimum competitive score
+              
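The shared-scoreboard mechanism this diff relies on can be reduced to a tiny standalone sketch: each collector increments one shared counter and stops its leaf once the combined count crosses the threshold (class and method names here are illustrative, not the PR's):

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Minimal sketch of a global hit budget shared across collectors.
 *  Illustrative only; not the API proposed in the PR. */
class GlobalHitBudget {
    private final AtomicInteger globalHits = new AtomicInteger();
    private final int threshold;

    GlobalHitBudget(int threshold) {
        this.threshold = threshold;
    }

    /** Counts one non-competitive hit; returns true when the caller
     *  should terminate its leaf early (combined count exceeds threshold). */
    boolean countAndCheck() {
        return globalHits.incrementAndGet() > threshold;
    }
}
```

Because the counter is an AtomicInteger, parallel leaf collectors on different threads can update it without locking, at the cost of some cache-line contention under heavy collection.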

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893662#comment-16893662
 ] 

Adrien Grand commented on LUCENE-8933:
--

Should we go further and check that the concatenation of the segments is equal 
to the surface form?
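The check Adrien suggests could look like the following, assuming the decompounded segments are available as plain strings (the helper name is hypothetical):

```java
/** Hypothetical sanity check: concatenating the decompounded segments
 *  should reproduce the original surface form exactly. */
class SurfaceFormCheck {
    static boolean segmentsMatchSurface(String surface, String[] segments) {
        StringBuilder sb = new StringBuilder();
        for (String segment : segments) {
            sb.append(segment);
        }
        return sb.toString().equals(surface);
    }
}
```

A mismatch here would catch corrupt segment offsets at tokenization time, well before they surface as an ArrayIndexOutOfBoundsException in a downstream consumer like the synonym parser.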

> JapaneseTokenizer creates Token objects with corrupt offsets
> 
>
> Key: LUCENE-8933
> URL: https://issues.apache.org/jira/browse/LUCENE-8933
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
>
> An Elasticsearch user reported the following stack trace when parsing 
> synonyms. It looks like the only reason why this might occur is if the offset 
> of a {{org.apache.lucene.analysis.ja.Token}} is not within the expected range.
>  
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.copyBuffer(CharTermAttributeImpl.java:44) ~[lucene-core-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:20]
> at org.apache.lucene.analysis.ja.JapaneseTokenizer.incrementToken(JapaneseTokenizer.java:486) ~[?:?]
> at org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:318) ~[lucene-analyzers-common-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at org.elasticsearch.index.analysis.ESSolrSynonymParser.analyze(ESSolrSynonymParser.java:57) ~[elasticsearch-6.6.1.jar:6.6.1]
> at org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:114) ~[lucene-analyzers-common-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70) ~[lucene-analyzers-common-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at org.elasticsearch.index.analysis.SynonymTokenFilterFactory.buildSynonyms(SynonymTokenFilterFactory.java:154) ~[elasticsearch-6.6.1.jar:6.6.1]
> ... 24 more
> {noformat}






[jira] [Created] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8935:


 Summary: BooleanQuery with no scoring clauses cannot skip 
documents when running TOP_SCORES mode
 Key: LUCENE-8935
 URL: https://issues.apache.org/jira/browse/LUCENE-8935
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Jim Ferenczi


Today a boolean query that is composed of filtering clauses only (more than 
one) cannot skip documents when the search is executed with the TOP_SCORES 
mode. However since all documents have a score of 0 it should be possible to 
early terminate the query as soon as we have collected enough top hits. Wrapping 
the resulting boolean scorer in a constant score scorer should allow early 
termination in this case and would speed up the retrieval of top hits 
considerably if the total hit count is not requested.






[GitHub] [lucene-solr] jpountz commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307660097
 
 


[GitHub] [lucene-solr] atris commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
atris commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307658298
 
 


[GitHub] [lucene-solr] jpountz commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307658181
 
 


[GitHub] [lucene-solr] atris commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
atris commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307656572
 
 


[GitHub] [lucene-solr] atris commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
atris commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307656294
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -279,6 +280,125 @@ public void collect(int doc) throws IOException {
 
   }
 
+  /*
+   * Implements a TopFieldCollector that terminates early based on a global
+   * scoreboard which is shared amongst multiple collectors.
+   * NOTE: This should ideally be outside of TopFieldCollector since it does
+   * not have private access, but we keep it here to limit the visibility
+   * of dependent classes
+   */
+  public static class GlobalStateFieldCollector extends TopFieldCollector {
+
+    final Sort sort;
+    final FieldValueHitQueue queue;
+    final AtomicInteger globalTotalHits;
+
+    final GlobalStateCollectorManager.GlobalStateCallback callback;
+
+    public GlobalStateFieldCollector(Sort sort, FieldValueHitQueue queue, int numHits, int totalHitsThreshold,
+                                     AtomicInteger globalTotalHits, GlobalStateCollectorManager.GlobalStateCallback callback) {
+      super(queue, numHits, totalHitsThreshold, sort.needsScores());
+      this.sort = sort;
+      this.queue = queue;
+      this.globalTotalHits = globalTotalHits;
+      this.callback = callback;
+    }
+
+    @Override
+    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
+      docBase = context.docBase;
+
+      final LeafFieldComparator[] comparators = queue.getComparators(context);
+      final int[] reverseMul = queue.getReverseMul();
+      final Sort indexSort = context.reader().getMetaData().getSort();
+      final boolean canEarlyTerminate = canEarlyTerminate(sort, indexSort);
+
+      return new MultiComparatorLeafCollector(comparators, reverseMul) {
+
+        boolean collectedAllCompetitiveHits = false;
+
+        @Override
+        public void setScorer(Scorable scorer) throws IOException {
+          super.setScorer(scorer);
+          updateMinCompetitiveScore(scorer);
+        }
+
+        @Override
+        public void collect(int doc) throws IOException {
+          // Increment local hit counter
+          totalHits++;
+
+          if (queueFull) {
+            if (collectedAllCompetitiveHits || !isHitCompetitive(doc, scorer)) {
+              // Since docs are visited in doc id order, if compare is 0, it means
+              // this document is larger than anything else in the queue, and
+              // therefore not competitive.
+              if (canEarlyTerminate) {
+                // Check the global scoreboard for the total hits accumulated so far
+                if (globalTotalHits.incrementAndGet() > totalHitsThreshold) {
+                  totalHitsRelation = Relation.GREATER_THAN_OR_EQUAL_TO;
+                  throw new CollectionTerminatedException();
+                } else {
+                  collectedAllCompetitiveHits = true;
+                }
+              } else if (totalHitsRelation == Relation.EQUAL_TO) {
+                // We just reached totalHitsThreshold; we can start setting the min
+                // competitive score now
+                // TODO: Should we also update the competitive score globally?
+                updateMinCompetitiveScore(scorer);
+              }
+              return;
+            }
+
+            // This hit is competitive - replace the bottom element in the queue & adjustTop
+            comparator.copy(bottom.slot, doc);
+            updateBottom(doc);
+            comparator.setBottom(bottom.slot);
+            updateMinCompetitiveScore(scorer);
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+          } else {
+            // Startup transient: queue hasn't gathered numHits yet
+
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+
+            final int slot = totalHits - 1;
+
+            // Copy hit into queue
+            comparator.copy(slot, doc);
+            add(slot, doc);
+            if (queueFull) {
+              comparator.setBottom(bottom.slot);
+              updateMinCompetitiveScore(scorer);
+            }
+          }
+        }
+
+        // Check whether the hit is competitive and set the global value accordingly
+        private boolean isHitCompetitive(int doc, Scorable scorer) throws IOException {
+          // Check if the hit is locally competitive
+          if (reverseMul * comparator.compareBottom(doc) > 0) {
+            // Hit was competitive locally, but was it globally competitive?
+            if (callback.getGlobalMinCompetitiveScore() > scorer.score()) {
+              return false;
+            } else {
+              // Hit was locally and globally competitive; set the right
+              // global minimum competitive score
+

[GitHub] [lucene-solr] jpountz commented on a change in pull request #769: LUCENE-8905: Better Error Handling For Illegal Arguments

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #769: LUCENE-8905: Better Error 
Handling For Illegal Arguments
URL: https://github.com/apache/lucene-solr/pull/769#discussion_r307654649
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopDocsCollector.java
 ##
 @@ -136,14 +136,21 @@ public TopDocs topDocs(int start, int howMany) {
 // pq.size() or totalHits.
 int size = topDocsSize();
 
-// Don't bother to throw an exception, just return an empty TopDocs in case
-// the parameters are invalid or out of range.
-// TODO: shouldn't we throw IAE if apps give bad params here so they dont
-// have sneaky silent bugs?
-if (start < 0 || start >= size || howMany <= 0) {
+
+if (start < 0 || start > size) {
 
 Review comment:
   I don't think we should validate `start` against `size` since `size` depends 
on the number of matches, and callers have no clue about its value at this 
point? Otherwise we are making this method almost impossible to use?
   
   The thing that we should validate is that `start + howMany` is less than or 
equal to the `numHits` parameter (which doesn't seem to be passed to 
TopDocsCollector by its sub classes at the moment, but we could)?
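The constraint suggested in the comment above can be sketched in isolation. The helper below is hypothetical (its name and shape are not Lucene API); it only illustrates validating the paging window against the collector's configured `numHits`, which callers know up front, rather than against the match-dependent result size, which they cannot know:

```java
// Hypothetical sketch of the suggested validation: bounds-check the requested
// page [start, start + howMany) against the collector's configured numHits,
// not against the match-dependent result size.
class PagingValidation {
    static boolean isValidPage(int start, int howMany, int numHits) {
        if (start < 0 || howMany <= 0) {
            return false; // negative offsets and empty pages are never valid
        }
        // Use long arithmetic so start + howMany cannot overflow int.
        return (long) start + howMany <= numHits;
    }
}
```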


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With Multiple Ranges

2019-07-26 Thread GitBox
atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With 
Multiple Ranges
URL: https://github.com/apache/lucene-solr/pull/794#issuecomment-515372993
 
 
   @jpountz updated the same.
   
   ant precommit looks fine


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893478#comment-16893478
 ] 

Jim Ferenczi commented on LUCENE-8933:
--

{quote}
If there are no other opinions or objections, I'd like to create a patch that 
adds a validation rule to the UserDictionary.
{quote}

Thanks [~tomoko]!

{quote}
For the purpose of format validation, I think it would be better to check that 
the sum of the lengths of the segments equals the length of the surface form, 
i.e., we should also not allow an entry such as "aabbcc,a b c,aa bb cc,pos_tag" even 
if it does not cause any exceptions.
{quote}

+1
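The rule discussed above — reject an entry whose segment lengths do not add up to the length of its surface form — is small enough to sketch. The helper below is hypothetical, not the actual UserDictionary code:

```java
// Hypothetical sketch of the proposed UserDictionary format check: an entry
// like "aabbcc,a b c,aa bb cc,pos_tag" is rejected because its segment
// lengths (1 + 1 + 1) do not sum to the surface form length (6).
class UserDictEntryCheck {
    static boolean segmentLengthsMatchSurface(String surface, String segmentation) {
        int sum = 0;
        // Segments are whitespace-separated in the dictionary's CSV field.
        for (String segment : segmentation.split(" ")) {
            sum += segment.length();
        }
        return sum == surface.length();
    }
}
```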



> JapaneseTokenizer creates Token objects with corrupt offsets
> 
>
> Key: LUCENE-8933
> URL: https://issues.apache.org/jira/browse/LUCENE-8933
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
>
> An Elasticsearch user reported the following stack trace when parsing 
> synonyms. It looks like the only reason why this might occur is if the offset 
> of a {{org.apache.lucene.analysis.ja.Token}} is not within the expected range.
>  
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.copyBuffer(CharTermAttributeImpl.java:44)
>  ~[lucene-core-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - 
> nknize - 2018-12-07 14:44:20]
> at 
> org.apache.lucene.analysis.ja.JapaneseTokenizer.incrementToken(JapaneseTokenizer.java:486)
>  ~[?:?]
> at 
> org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:318)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.elasticsearch.index.analysis.ESSolrSynonymParser.analyze(ESSolrSynonymParser.java:57)
>  ~[elasticsearch-6.6.1.jar:6.6.1]
> at 
> org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:114)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70)
>  ~[lucene-analyzers-common-7.6.0.jar:7.6.0 
> 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:44:48]
> at 
> org.elasticsearch.index.analysis.SynonymTokenFilterFactory.buildSynonyms(SynonymTokenFilterFactory.java:154)
>  ~[elasticsearch-6.6.1.jar:6.6.1]
> ... 24 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8927) Cut Over To Set.copyOf and Set.Of From Collections.unmodifiableSet

2019-07-26 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893477#comment-16893477
 ] 

Adrien Grand commented on LUCENE-8927:
--

Thanks for fixing [~ichattopadhyaya]!

> Cut Over To Set.copyOf and Set.Of From Collections.unmodifiableSet
> --
>
> Key: LUCENE-8927
> URL: https://issues.apache.org/jira/browse/LUCENE-8927
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #803: LUCENE-8929: Early Terminating CollectorManager with Global Hitcount

2019-07-26 Thread GitBox
jpountz commented on a change in pull request #803: LUCENE-8929: Early 
Terminating CollectorManager with Global Hitcount
URL: https://github.com/apache/lucene-solr/pull/803#discussion_r307643964
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -279,6 +280,125 @@ public void collect(int doc) throws IOException {
 
   }
 
+  /*
+   * Implements a TopFieldCollector that terminates early based on a global
+   * scoreboard which is shared amongst multiple collectors.
+   * NOTE: This should ideally be outside of TopFieldCollector since it does
+   * not have private access, but we keep it here to limit the visibility
+   * of dependent classes
+   */
+  public static class GlobalStateFieldCollector extends TopFieldCollector {
+
+    final Sort sort;
+    final FieldValueHitQueue queue;
+    final AtomicInteger globalTotalHits;
+
+    final GlobalStateCollectorManager.GlobalStateCallback callback;
+
+    public GlobalStateFieldCollector(Sort sort, FieldValueHitQueue queue, int numHits, int totalHitsThreshold,
+                                     AtomicInteger globalTotalHits, GlobalStateCollectorManager.GlobalStateCallback callback) {
+      super(queue, numHits, totalHitsThreshold, sort.needsScores());
+      this.sort = sort;
+      this.queue = queue;
+      this.globalTotalHits = globalTotalHits;
+      this.callback = callback;
+    }
+
+    @Override
+    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
+      docBase = context.docBase;
+
+      final LeafFieldComparator[] comparators = queue.getComparators(context);
+      final int[] reverseMul = queue.getReverseMul();
+      final Sort indexSort = context.reader().getMetaData().getSort();
+      final boolean canEarlyTerminate = canEarlyTerminate(sort, indexSort);
+
+      return new MultiComparatorLeafCollector(comparators, reverseMul) {
+
+        boolean collectedAllCompetitiveHits = false;
+
+        @Override
+        public void setScorer(Scorable scorer) throws IOException {
+          super.setScorer(scorer);
+          updateMinCompetitiveScore(scorer);
+        }
+
+        @Override
+        public void collect(int doc) throws IOException {
+          // Increment local hit counter
+          totalHits++;
+
+          if (queueFull) {
+            if (collectedAllCompetitiveHits || !isHitCompetitive(doc, scorer)) {
+              // Since docs are visited in doc id order, if compare is 0, it means
+              // this document is larger than anything else in the queue, and
+              // therefore not competitive.
+              if (canEarlyTerminate) {
+                // Check the global scoreboard for the total hits accumulated so far
+                if (globalTotalHits.incrementAndGet() > totalHitsThreshold) {
+                  totalHitsRelation = Relation.GREATER_THAN_OR_EQUAL_TO;
+                  throw new CollectionTerminatedException();
+                } else {
+                  collectedAllCompetitiveHits = true;
+                }
+              } else if (totalHitsRelation == Relation.EQUAL_TO) {
+                // We just reached totalHitsThreshold; we can start setting the min
+                // competitive score now
+                // TODO: Should we also update the competitive score globally?
+                updateMinCompetitiveScore(scorer);
+              }
+              return;
+            }
+
+            // This hit is competitive - replace the bottom element in the queue & adjustTop
+            comparator.copy(bottom.slot, doc);
+            updateBottom(doc);
+            comparator.setBottom(bottom.slot);
+            updateMinCompetitiveScore(scorer);
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+          } else {
+            // Startup transient: queue hasn't gathered numHits yet
+
+            // Increment global hit counter
+            globalTotalHits.incrementAndGet();
+
+            final int slot = totalHits - 1;
+
+            // Copy hit into queue
+            comparator.copy(slot, doc);
+            add(slot, doc);
+            if (queueFull) {
+              comparator.setBottom(bottom.slot);
+              updateMinCompetitiveScore(scorer);
+            }
+          }
+        }
+
+        // Check whether the hit is competitive and set the global value accordingly
+        private boolean isHitCompetitive(int doc, Scorable scorer) throws IOException {
+          // Check if the hit is locally competitive
+          if (reverseMul * comparator.compareBottom(doc) > 0) {
+            // Hit was competitive locally, but was it globally competitive?
+            if (callback.getGlobalMinCompetitiveScore() > scorer.score()) {
+              return false;
+            } else {
+              // Hit was locally and globally competitive; set the right
+              // global minimum competitive score
+
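Stripped of the queue bookkeeping, the shared-scoreboard idea this diff revolves around is small. A minimal sketch with hypothetical names (not the PR's classes): every per-slice collector counts hits into one shared AtomicInteger, so any collector may terminate its leaf as soon as the global count crosses the threshold, instead of waiting for its own local count to get there.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a global hit-count scoreboard shared by several
// collectors. Names are hypothetical; this is not the PR's actual code.
class GlobalHitScoreboard {
    private final AtomicInteger globalTotalHits = new AtomicInteger();
    private final int totalHitsThreshold;

    GlobalHitScoreboard(int totalHitsThreshold) {
        this.totalHitsThreshold = totalHitsThreshold;
    }

    // Called once per collected hit by any collector; returns true when the
    // caller may stop collecting its leaf (the global count is past the
    // threshold), mirroring the CollectionTerminatedException path above.
    boolean countHitAndCheckTerminate() {
        return globalTotalHits.incrementAndGet() > totalHitsThreshold;
    }
}
```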

[GitHub] [lucene-solr] jpountz commented on issue #794: LUCENE-8769: Introduce Range Query Type With Multiple Ranges

2019-07-26 Thread GitBox
jpountz commented on issue #794: LUCENE-8769: Introduce Range Query Type With 
Multiple Ranges
URL: https://github.com/apache/lucene-solr/pull/794#issuecomment-515362265
 
 
   It's probably fine for sandbox. Can you add TODOs about handling overlapping 
ranges at rewrite time, and organizing query ranges in a tree structure 
similarly to EdgeTree so that we don't have to walk query ranges linearly when 
running the query?
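The rewrite-time merge mentioned in the first TODO might look like the following for single-dimension numeric ranges. This is a hypothetical sketch, not the PR's code: sort the ranges by lower bound, then fold each range that overlaps its predecessor into it, leaving fewer disjoint ranges to walk per document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of merging overlapping query ranges at rewrite time.
class RangeMerge {
    // Each range is {lower, upper}, bounds inclusive; input need not be sorted.
    static long[][] mergeRanges(long[][] ranges) {
        long[][] sorted = ranges.clone();
        Arrays.sort(sorted, Comparator.comparingLong(r -> r[0]));
        List<long[]> merged = new ArrayList<>();
        for (long[] range : sorted) {
            if (!merged.isEmpty() && range[0] <= merged.get(merged.size() - 1)[1]) {
                // Overlaps the previous range: extend its upper bound.
                long[] last = merged.get(merged.size() - 1);
                last[1] = Math.max(last[1], range[1]);
            } else {
                merged.add(new long[] {range[0], range[1]});
            }
        }
        return merged.toArray(new long[0][]);
    }
}
```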


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13655) Cut Over Collections.unmodifiedSet usages to Set.*

2019-07-26 Thread Atri Sharma (JIRA)
Atri Sharma created SOLR-13655:
--

 Summary: Cut Over Collections.unmodifiedSet usages to Set.*
 Key: SOLR-13655
 URL: https://issues.apache.org/jira/browse/SOLR-13655
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Atri Sharma






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With Multiple Ranges

2019-07-26 Thread GitBox
atris commented on issue #794: LUCENE-8769: Introduce Range Query Type With 
Multiple Ranges
URL: https://github.com/apache/lucene-solr/pull/794#issuecomment-515349726
 
 
   Any comments on this one, please? Happy to iterate


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[ANNOUNCE] Apache Lucene 8.2.0 released

2019-07-26 Thread Ignacio Vera
## 26 July 2019, Apache Lucene™ 8.2.0 available


The Lucene PMC is pleased to announce the release of Apache Lucene 8.2.0.


Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.


This release contains numerous bug fixes, optimizations, and improvements,
some of which are highlighted below. The release is available for immediate
download at:


  


### Lucene 8.2.0 Release Highlights:


 API Changes:


  * Intervals queries have been moved from the sandbox to the queries module.


 New Features


  * New XYShape Field and Queries for indexing and querying general
cartesian geometries.

  * Snowball stemmer/analyzer for the Estonian language.

  * Provide a FeatureSortField to allow sorting search hits by descending
value of a feature.

  * Add a new KoreanNumberFilter that can change Hangul characters to
numbers and process decimal points.

  * Add doc-value support to range fields.

  * Add a monitor subproject (previously the Luwak monitoring library) that
allows a stream of documents to be matched against a set of registered
queries in an efficient manner.

  * Add a numeric range query in sandbox that takes advantage of index
sorting.


 Optimizations


  * Use exponential search instead of binary search in
IntArrayDocIdSet#advance method.

  * Use incoming thread for execution if IndexSearcher has an executor. Now
caller threads execute at least one search on an index even if there is an
executor provided to minimize thread context switching.

  * New storing strategy for BKD tree leaves with low cardinality that can
lower storage costs and can be used at search time to speed up queries.

  * Load frequencies lazily only when needed in BlockDocsEnum and
BlockImpactsEverythingEnum.

  * Phrase queries now leverage impacts.
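As an aside on the exponential-search item above: the technique can be illustrated with a small standalone sketch (not the actual IntArrayDocIdSet code). Gallop forward in doubling steps to bracket the target, then binary-search inside the bracket, so the cost grows with the distance advanced rather than with the array size.

```java
// Illustrative exponential (galloping) search over a sorted int array,
// in the spirit of the IntArrayDocIdSet#advance optimization. Names and
// signature are hypothetical, not Lucene API.
class ExponentialSearch {
    // Returns the index of the first element >= target, searching from 'from';
    // returns sorted.length when no such element exists.
    static int advance(int[] sorted, int from, int target) {
        int bound = 1;
        // Gallop: double the step until we pass the target or the array end.
        while (from + bound < sorted.length && sorted[from + bound] < target) {
            bound <<= 1;
        }
        // The answer now lies in [from + bound/2, from + bound].
        int lo = from + (bound >> 1);
        int hi = Math.min(sorted.length, from + bound + 1);
        // Plain binary search within the bracketed window [lo, hi).
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```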


Please read CHANGES.txt for a full list of new features and changes:


  


Note: The Apache Software Foundation uses an extensive mirroring network for

distributing releases. It is possible that the mirror you are using may not
have

replicated the release yet. If that is the case, please try another mirror.

This also applies to Maven access.


[ANNOUNCE] Apache Solr 8.2.0 released

2019-07-26 Thread Ignacio Vera
## 26 July 2019, Apache Solr™ 8.2.0 available


The Lucene PMC is pleased to announce the release of Apache Solr 8.2.0.


Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search, dynamic clustering, database
integration, rich document handling, and geospatial search. Solr is highly
scalable, providing fault tolerant distributed search and indexing, and
powers the search and navigation features of many of the world's largest
internet sites.


Solr 8.2.0 is available for immediate download at:


  


### Solr 8.2.0 Release Highlights:


 New features


  * Add an update param failOnVersionConflicts=false so that updates do not
fail when there is a version conflict

  * Add facet2D Streaming Expression.

  * Prefer replicas on nodes with the same system properties as the query
master

  * OpenTracing support for Solr

  * Raw index data analysis tool (extension of COLSTATUS collection
command).

  * Add recNum Stream Evaluator.

  * Allow zplot to visualize 2D clusters and convex hulls.

  * Add a field type for Estonian language to default managed_schema,
document about Estonian language analysis in Solr Ref Guide


 Bug Fixes


  * Intermittent 401's for internode requests with basicauth enabled.

  * In 8.1, atomic updates were broken (NPE) when the schema declared the
new _nest_path_ field, even if you weren't using nested docs. In-place
updates were not affected.

  * Fix atomic update encoding issue for UUID, enum, bool, and binary
fields.

  * It was impossible to delete a collection with the same name as an existing
alias. This also fixes a bug in REINDEXCOLLECTION when used with
removeSource=true, which could lead to data loss.


Please read CHANGES.txt for a full list of new features and changes:


  


Solr 8.2.0 also includes features, optimizations, and bug fixes in the
corresponding Apache Lucene release:


  


Note: The Apache Software Foundation uses an extensive mirroring network for

distributing releases. It is possible that the mirror you are using may not
have

replicated the release yet. If that is the case, please try another mirror.

This also applies to Maven access.


[JENKINS] Lucene-Solr-BadApples-master-Linux (64bit/jdk-12.0.1) - Build # 239 - Unstable!

2019-07-26 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-BadApples-master-Linux/239/
Java: 64bit/jdk-12.0.1 -XX:-UseCompressedOops -XX:+UseG1GC

8 tests failed.
FAILED:  org.apache.solr.cloud.rule.RulesTest.doIntegrationTest

Error Message:
Timeout occurred while waiting response from server at: 
https://127.0.0.1:39807/solr

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occurred while 
waiting response from server at: https://127.0.0.1:39807/solr
at 
__randomizedtesting.SeedInfo.seed([62988027E0B4C067:87ABC7A6FCC03265]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:667)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:245)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1128)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:897)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:829)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:228)
at 
org.apache.solr.cloud.MiniSolrCloudCluster.deleteAllCollections(MiniSolrCloudCluster.java:547)
at 
org.apache.solr.cloud.rule.RulesTest.removeCollections(RulesTest.java:65)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at