[jira] [Commented] (LUCENE-9049) Remove FST cachedRootArcs now redundant with direct-addressing

2019-11-18 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977221#comment-16977221
 ] 

Bruno Roustant commented on LUCENE-9049:


[~jdconradson] I was about to start but I haven't started yet, so yes, you can 
work on it. I'll be glad to review.

> Remove FST cachedRootArcs now redundant with direct-addressing
> --
>
> Key: LUCENE-9049
> URL: https://issues.apache.org/jira/browse/LUCENE-9049
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Bruno Roustant
>Priority: Major
>
> With LUCENE-8920, FST most often encodes top-level nodes with 
> direct-addressing (instead of an array for binary search). This probably makes 
> the cachedRootArcs redundant, so they should be removed, which will also 
> reduce the code.
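For context, the trade-off between the two encodings can be sketched as follows 
(a hypothetical illustration, not Lucene's actual FST code):

```java
// Hypothetical sketch of the two arc-lookup strategies discussed above.
import java.util.Arrays;

public class ArcLookupSketch {
    // Binary search: O(log n) comparisons over a sorted array of arc labels.
    static int binarySearchArc(int[] sortedLabels, int label) {
        return Arrays.binarySearch(sortedLabels, label);
    }

    // Direct addressing: index arcs by (label - firstLabel) in O(1), at the
    // cost of slots for absent labels (tracked with a presence bitmap here).
    static int directAddressArc(int firstLabel, boolean[] present, int label) {
        int idx = label - firstLabel;
        return (idx >= 0 && idx < present.length && present[idx]) ? idx : -1;
    }

    public static void main(String[] args) {
        int[] labels = {'a', 'c', 'f'};
        boolean[] present = new boolean[6]; // covers labels 'a'..'f'
        for (int l : labels) present[l - 'a'] = true;
        System.out.println(binarySearchArc(labels, 'c'));        // prints 1
        System.out.println(directAddressArc('a', present, 'c')); // prints 2
    }
}
```

Direct addressing trades a little space (slots for absent labels) for O(1) arc 
lookup, which is what makes a separate root-arc cache redundant.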



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13942) /api/cluster/zk/* to fetch raw ZK data

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977180#comment-16977180
 ] 

ASF subversion and git services commented on SOLR-13942:


Commit 935a2987f8677dae79b360ec50630a42ec8473c3 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=935a298 ]

SOLR-13942: /api/cluster/zk/* to fetch raw ZK data


> /api/cluster/zk/* to fetch raw ZK data
> --
>
> Key: SOLR-13942
> URL: https://issues.apache.org/jira/browse/SOLR-13942
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> If the requested path is a node with children, show the list of child nodes 
> and their metadata.
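Purely as an illustration of that behavior (the actual response format and 
field names are not specified in this issue), a listing for a parent node 
might look something like:

```json
{
  "child-node-1": { "version": 2, "dataLength": 120, "numChildren": 0 },
  "child-node-2": { "version": 0, "dataLength": 0,   "numChildren": 3 }
}
```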






[jira] [Updated] (SOLR-13942) /api/cluster/zk/* to fetch raw ZK data

2019-11-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-13942:
--
Description: If the requested path is a node with children show the list of 
child nodes and their meta data  (was: an extra parameter {{raw=true}} should 
just dump the content of the node)

> /api/cluster/zk/* to fetch raw ZK data
> --
>
> Key: SOLR-13942
> URL: https://issues.apache.org/jira/browse/SOLR-13942
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> If the requested path is a node with children, show the list of child nodes 
> and their metadata.






[jira] [Updated] (SOLR-13942) /api/cluster/zk/* to fetch raw ZK data

2019-11-18 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-13942:
--
Summary: /api/cluster/zk/* to fetch raw ZK data  (was: 
/solr/admin/zookeeper should have an option to get raw data)

> /api/cluster/zk/* to fetch raw ZK data
> --
>
> Key: SOLR-13942
> URL: https://issues.apache.org/jira/browse/SOLR-13942
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> an extra parameter {{raw=true}} should just dump the content of the node






[jira] [Created] (SOLR-13944) CollapsingQParserPlugin throws NPE instead of bad request

2019-11-18 Thread Stefan (Jira)
Stefan created SOLR-13944:
-

 Summary: CollapsingQParserPlugin throws NPE instead of bad request
 Key: SOLR-13944
 URL: https://issues.apache.org/jira/browse/SOLR-13944
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.3.1
Reporter: Stefan


 I noticed the following NPE:
{code:java}
java.lang.NullPointerException at 
org.apache.solr.search.CollapsingQParserPlugin$OrdFieldValueCollector.finish(CollapsingQParserPlugin.java:1021)
 at 
org.apache.solr.search.CollapsingQParserPlugin$OrdFieldValueCollector.finish(CollapsingQParserPlugin.java:1081)
 at 
org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:230)
 at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1602)
 at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1419)
 at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:584)
{code}
If I am correct, the problem was already addressed in SOLR-8807. The fix was 
not working in this case, though, because of a syntax error in the query (I 
used the local parameter syntax twice instead of combining it). The relevant 
part of the query is:
{code:java}
={!tag=collapser}{!collapse field=productId sort='merchantOrder asc, price 
asc, id asc'}
{code}
After discussing this on the mailing list, I was asked to open a ticket, 
because this situation should result in a bad request instead of a 
NullPointerException (see 
[https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201911.mbox/%3CCAMJgJxTuSb%3D8szO8bvHiAafJOs08O_NMB4pcaHOXME4Jj-GO2A%40mail.gmail.com%3E]).
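For reference, the two prefixes can be merged into a single local-params 
expression, since {{tag}} is itself a local parameter (a sketch using the 
values from the report; the parameter name was truncated in the query above, 
{{fq}} is my assumption):

```
fq={!collapse tag=collapser field=productId sort='merchantOrder asc, price asc, id asc'}
```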






subscribe

2019-11-18 Thread Varun Thacker



[jira] [Updated] (LUCENE-9053) java.lang.AssertionError: inputs are added out of order lastInput=[f0 9d 9c 8b] vs input=[ef ac 81 67 75 72 65]

2019-11-18 Thread gitesh (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gitesh updated LUCENE-9053:
---
Description: 
Even if the inputs are sorted in Unicode order, I get the following exception 
while creating the FST:

 
{code:java}
// Input values (keys). These must be provided to the Builder in Unicode sorted order!
String[] inputValues = {"퐴", "figure", "flagship"};
long[] outputValues = {5, 7, 12};
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder<Long> builder = new Builder<>(FST.INPUT_TYPE.BYTE1, outputs);
BytesRefBuilder scratchBytes = new BytesRefBuilder();
IntsRefBuilder scratchInts = new IntsRefBuilder();
for (int i = 0; i < inputValues.length; i++) {
  scratchBytes.copyChars(inputValues[i]);
  builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), outputValues[i]);
}
FST<Long> fst = builder.finish();
Long value = Util.get(fst, new BytesRef("figure"));
System.out.println(value);
{code}
 Please note that figure and flagship above are using the ligature character fl.

  was:
Even if the inputs are sorted in Unicode order, I get the following exception 
while creating the FST:

 
{code:java}
// Input values (keys). These must be provided to the Builder in Unicode sorted order!
String[] inputValues = {"퐴", "figure", "flagship"};
long[] outputValues = {5, 7, 12};
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder<Long> builder = new Builder<>(FST.INPUT_TYPE.BYTE1, outputs);
BytesRefBuilder scratchBytes = new BytesRefBuilder();
IntsRefBuilder scratchInts = new IntsRefBuilder();
for (int i = 0; i < inputValues.length; i++) {
  scratchBytes.copyChars(inputValues[i]);
  builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), outputValues[i]);
}
FST<Long> fst = builder.finish();
Long value = Util.get(fst, new BytesRef("figure"));
System.out.println(value);
{code}
 


> java.lang.AssertionError: inputs are added out of order lastInput=[f0 9d 9c 
> 8b] vs input=[ef ac 81 67 75 72 65]
> ---
>
> Key: LUCENE-9053
> URL: https://issues.apache.org/jira/browse/LUCENE-9053
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: gitesh
>Priority: Minor
>
> Even if the inputs are sorted in Unicode order, I get the following exception 
> while creating the FST:
>  
> {code:java}
> // Input values (keys). These must be provided to the Builder in Unicode sorted order!
> String[] inputValues = {"퐴", "figure", "flagship"};
> long[] outputValues = {5, 7, 12};
> PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
> Builder<Long> builder = new Builder<>(FST.INPUT_TYPE.BYTE1, outputs);
> BytesRefBuilder scratchBytes = new BytesRefBuilder();
> IntsRefBuilder scratchInts = new IntsRefBuilder();
> for (int i = 0; i < inputValues.length; i++) {
>   scratchBytes.copyChars(inputValues[i]);
>   builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), outputValues[i]);
> }
> FST<Long> fst = builder.finish();
> Long value = Util.get(fst, new BytesRef("figure"));
> System.out.println(value);
> {code}
>  Please note that figure and flagship above are using the ligature character fl.






[jira] [Created] (LUCENE-9053) java.lang.AssertionError: inputs are added out of order lastInput=[f0 9d 9c 8b] vs input=[ef ac 81 67 75 72 65]

2019-11-18 Thread gitesh (Jira)
gitesh created LUCENE-9053:
--

 Summary: java.lang.AssertionError: inputs are added out of order 
lastInput=[f0 9d 9c 8b] vs input=[ef ac 81 67 75 72 65]
 Key: LUCENE-9053
 URL: https://issues.apache.org/jira/browse/LUCENE-9053
 Project: Lucene - Core
  Issue Type: Bug
Reporter: gitesh


Even if the inputs are sorted in Unicode order, I get the following exception 
while creating the FST:

 
{code:java}
// Input values (keys). These must be provided to the Builder in Unicode sorted order!
String[] inputValues = {"퐴", "figure", "flagship"};
long[] outputValues = {5, 7, 12};
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder<Long> builder = new Builder<>(FST.INPUT_TYPE.BYTE1, outputs);
BytesRefBuilder scratchBytes = new BytesRefBuilder();
IntsRefBuilder scratchInts = new IntsRefBuilder();
for (int i = 0; i < inputValues.length; i++) {
  scratchBytes.copyChars(inputValues[i]);
  builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), outputValues[i]);
}
FST<Long> fst = builder.finish();
Long value = Util.get(fst, new BytesRef("figure"));
System.out.println(value);
{code}
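One possible explanation (an assumption on my part, not confirmed in this 
issue): sorting the keys as Java Strings orders them by UTF-16 code unit, 
while the FST Builder checks UTF-8 byte (i.e. code point) order, and the two 
disagree for supplementary characters versus the U+FB01 ligature. A minimal 
sketch:

```java
// Sketch: UTF-16 (char) order and code-point/UTF-8 order disagree here.
public class SortOrderSketch {
    public static void main(String[] args) {
        // A supplementary character, U+1D70B (the [f0 9d 9c 8b] key above),
        // which is stored as a surrogate pair in a Java String.
        String supp = new String(Character.toChars(0x1D70B));
        // "figure" spelled with the ligature U+FB01 ([ef ac 81 ...] above).
        String lig = "\uFB01gure";
        // String.compareTo compares UTF-16 code units: surrogates
        // (0xD800-0xDFFF) sort before U+FB01, so supp < lig here...
        System.out.println(supp.compareTo(lig) < 0);            // prints true
        // ...but by code point (and therefore UTF-8 byte order, which the
        // FST Builder enforces), U+1D70B > U+FB01.
        System.out.println(supp.codePointAt(0) < lig.codePointAt(0)); // prints false
    }
}
```

So input that looks sorted to String.compareTo can still violate the Builder's 
UTF-8 ordering check, producing exactly this AssertionError.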
 






[jira] [Commented] (SOLR-13943) TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race condition due to ZK assumptions

2019-11-18 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977035#comment-16977035
 ] 

Gus Heck commented on SOLR-13943:
-

Thanks Chris! I had noticed and been irritated by this but not yet had time to 
dig into it.  Your analysis is very helpful.  I'll try to work on it this 
weekend.

> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race 
> condition due to ZK assumptions
> ---
>
> Key: SOLR-13943
> URL: https://issues.apache.org/jira/browse/SOLR-13943
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Gus Heck
>Priority: Major
> Attachments: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt, 
> apache_Lucene-Solr-BadApples-Tests-master_533.log.txt, 
> apache_Lucene-Solr-repro-Java11_618.log.txt
>
>
> TimeRoutedAliasUpdateProcessorTest does not currently run in many jenkins 
> builds due to being marked BadApple(SOLR-13059) -- however when it does run, 
> the method {{testDateMathInStart}} frequently fails due to what appears to be 
> a multi-threaded race condition in the test logic...
> {noformat}
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TimeRoutedAliasUpdateProcessorTest 
> -Dtests.method=testDateMathInStart -Dtests.seed=8879E35521A4B9EA 
> -Dtests.multiplier=2 -Dtests.
> slow=true -Dtests.badapples=true -Dtests.locale=nl-BQ 
> -Dtests.timezone=America/Porto_Acre -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 6.96s J0 | 
> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart <<<
>[junit4]> Throwable #1: java.lang.AssertionError: router.start should 
> not have any date math by this point and parse as an instant. Using class 
> org.apache.solr.client.solrj.impl.ZkCl
> ientClusterStateProvider Found:2019-09-14T03:00:00Z/DAY
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([8879E35521A4B9EA:64FE3DD88112B802]:0)
>[junit4]>at 
> org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest.testDateMathInStart(TimeRoutedAliasUpdateProcessorTest.java:765)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
>[junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> I'll attach some logs from recent failures and my own quick analysis of the 
> problems of how the test appears to be asserting ZK updates.






[jira] [Assigned] (SOLR-13943) TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race condition due to ZK assumptions

2019-11-18 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter reassigned SOLR-13943:
-

Assignee: Gus Heck

Assigning to Gus in the hopes that he can take a look and refactor the test to 
remove the race condition.
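A common shape for such a refactoring (a generic sketch, not the actual fix) 
is to poll for the expected ZK-propagated state with a timeout instead of 
asserting immediately after the update:

```java
import java.util.function.Supplier;

public class WaitForSketch {
    // Poll until the condition holds or the timeout elapses; races on
    // asynchronously propagated state (e.g. ZK watches) become retries
    // instead of immediate assertion failures.
    static boolean waitFor(Supplier<Boolean> condition, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.get()) return true;
            Thread.sleep(50); // small back-off between checks
        }
        return condition.get(); // final check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // The condition becomes true ~200ms in; waitFor retries until then.
        System.out.println(waitFor(
                () -> System.currentTimeMillis() - start > 200, 2000));
    }
}
```

The test would then assert on the state observed after waitFor returns, rather 
than assuming the update is already visible.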

> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race 
> condition due to ZK assumptions
> ---
>
> Key: SOLR-13943
> URL: https://issues.apache.org/jira/browse/SOLR-13943
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Gus Heck
>Priority: Major
> Attachments: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt, 
> apache_Lucene-Solr-BadApples-Tests-master_533.log.txt, 
> apache_Lucene-Solr-repro-Java11_618.log.txt
>
>
> TimeRoutedAliasUpdateProcessorTest does not currently run in many jenkins 
> builds due to being marked BadApple(SOLR-13059) -- however when it does run, 
> the method {{testDateMathInStart}} frequently fails due to what appears to be 
> a multi-threaded race condition in the test logic...
> {noformat}
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TimeRoutedAliasUpdateProcessorTest 
> -Dtests.method=testDateMathInStart -Dtests.seed=8879E35521A4B9EA 
> -Dtests.multiplier=2 -Dtests.
> slow=true -Dtests.badapples=true -Dtests.locale=nl-BQ 
> -Dtests.timezone=America/Porto_Acre -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 6.96s J0 | 
> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart <<<
>[junit4]> Throwable #1: java.lang.AssertionError: router.start should 
> not have any date math by this point and parse as an instant. Using class 
> org.apache.solr.client.solrj.impl.ZkCl
> ientClusterStateProvider Found:2019-09-14T03:00:00Z/DAY
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([8879E35521A4B9EA:64FE3DD88112B802]:0)
>[junit4]>at 
> org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest.testDateMathInStart(TimeRoutedAliasUpdateProcessorTest.java:765)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
>[junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> I'll attach some logs from recent failures and my own quick analysis of the 
> problems of how the test appears to be asserting ZK updates.






[jira] [Commented] (SOLR-13943) TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race condition due to ZK assumptions

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976989#comment-16976989
 ] 

ASF subversion and git services commented on SOLR-13943:


Commit 8759dea69adfadfcfd448aeae2cafc8273f0912d in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8759dea ]

SOLR-13943: AwaitsFix TimeRoutedAliasUpdateProcessorTest.testDateMathInStart

(cherry picked from commit 59465c20c462147f0239449ea43f4844cfa585c2)


> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race 
> condition due to ZK assumptions
> ---
>
> Key: SOLR-13943
> URL: https://issues.apache.org/jira/browse/SOLR-13943
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt, 
> apache_Lucene-Solr-BadApples-Tests-master_533.log.txt, 
> apache_Lucene-Solr-repro-Java11_618.log.txt
>
>
> TimeRoutedAliasUpdateProcessorTest does not currently run in many jenkins 
> builds due to being marked BadApple(SOLR-13059) -- however when it does run, 
> the method {{testDateMathInStart}} frequently fails due to what appears to be 
> a multi-threaded race condition in the test logic...
> {noformat}
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TimeRoutedAliasUpdateProcessorTest 
> -Dtests.method=testDateMathInStart -Dtests.seed=8879E35521A4B9EA 
> -Dtests.multiplier=2 -Dtests.
> slow=true -Dtests.badapples=true -Dtests.locale=nl-BQ 
> -Dtests.timezone=America/Porto_Acre -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 6.96s J0 | 
> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart <<<
>[junit4]> Throwable #1: java.lang.AssertionError: router.start should 
> not have any date math by this point and parse as an instant. Using class 
> org.apache.solr.client.solrj.impl.ZkCl
> ientClusterStateProvider Found:2019-09-14T03:00:00Z/DAY
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([8879E35521A4B9EA:64FE3DD88112B802]:0)
>[junit4]>at 
> org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest.testDateMathInStart(TimeRoutedAliasUpdateProcessorTest.java:765)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
>[junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> I'll attach some logs from recent failures and my own quick analysis of the 
> problems of how the test appears to be asserting ZK updates.






[jira] [Commented] (SOLR-13943) TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race condition due to ZK assumptions

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976986#comment-16976986
 ] 

ASF subversion and git services commented on SOLR-13943:


Commit 59465c20c462147f0239449ea43f4844cfa585c2 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=59465c2 ]

SOLR-13943: AwaitsFix TimeRoutedAliasUpdateProcessorTest.testDateMathInStart


> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race 
> condition due to ZK assumptions
> ---
>
> Key: SOLR-13943
> URL: https://issues.apache.org/jira/browse/SOLR-13943
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: apache_Lucene-Solr-BadApples-Tests-master_531.log.txt, 
> apache_Lucene-Solr-BadApples-Tests-master_533.log.txt, 
> apache_Lucene-Solr-repro-Java11_618.log.txt
>
>
> TimeRoutedAliasUpdateProcessorTest does not currently run in many jenkins 
> builds due to being marked BadApple(SOLR-13059) -- however when it does run, 
> the method {{testDateMathInStart}} frequently fails due to what appears to be 
> a multi-threaded race condition in the test logic...
> {noformat}
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TimeRoutedAliasUpdateProcessorTest 
> -Dtests.method=testDateMathInStart -Dtests.seed=8879E35521A4B9EA 
> -Dtests.multiplier=2 -Dtests.
> slow=true -Dtests.badapples=true -Dtests.locale=nl-BQ 
> -Dtests.timezone=America/Porto_Acre -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>[junit4] FAILURE 6.96s J0 | 
> TimeRoutedAliasUpdateProcessorTest.testDateMathInStart <<<
>[junit4]> Throwable #1: java.lang.AssertionError: router.start should 
> not have any date math by this point and parse as an instant. Using class 
> org.apache.solr.client.solrj.impl.ZkCl
> ientClusterStateProvider Found:2019-09-14T03:00:00Z/DAY
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([8879E35521A4B9EA:64FE3DD88112B802]:0)
>[junit4]>at 
> org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest.testDateMathInStart(TimeRoutedAliasUpdateProcessorTest.java:765)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
>[junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> I'll attach some logs from recent failures and my own quick analysis of the 
> problems of how the test appears to be asserting ZK updates.






[jira] [Commented] (SOLR-13059) TimeRoutedAliasUpdateProcessorTest rarely fails to see collection just created

2019-11-18 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976985#comment-16976985
 ] 

Chris M. Hostetter commented on SOLR-13059:
---

FWIW: until SOLR-13943 became problematic when testDateMathInStart was added 
by SOLR-13760 (~2019-10-11), there hadn't been any jenkins failures of 
TimeRoutedAliasUpdateProcessorTest since 2019-06-06.

Perhaps something was changed in the underlying code (or test plumbing) 
on/around mid-June that fixed the underlying problem?

{noformat}
$ zgrep TimeRoutedAliasUpdateProcessorTest 2019-*method-failures.csv.gz
2019-01-19.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,apache/Lucene-Solr-BadApples-NightlyTests-8.x/1/
2019-02-16.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,apache/Lucene-Solr-BadApples-Tests-master/285/
2019-03-15.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,initializationError,apache/Lucene-Solr-NightlyTests-8.x/45/
2019-03-16.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,,apache/Lucene-Solr-BadApples-Tests-master/307/
2019-03-16.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,apache/Lucene-Solr-BadApples-Tests-master/307/
2019-03-17.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,apache/Lucene-Solr-repro/3034/
2019-03-19.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,thetaphi/Lucene-Solr-BadApples-master-Linux/179/
2019-03-20.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,apache/Lucene-Solr-BadApples-Tests-master/310/
2019-03-24.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,thetaphi/Lucene-Solr-BadApples-master-Linux/182/
2019-03-26.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,apache/Lucene-Solr-BadApples-Tests-8.x/55/
2019-04-01.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,apache/Lucene-Solr-BadApples-Tests-8.x/60/
2019-04-04.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,,thetaphi/Lucene-Solr-BadApples-master-Linux/186/
2019-04-04.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,thetaphi/Lucene-Solr-BadApples-master-Linux/186/
2019-04-29.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testSliceRouting,apache/Lucene-Solr-BadApples-Tests-8.x/89/
2019-05-17.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,apache/Lucene-Solr-BadApples-Tests-master/363/
2019-06-06.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testPreemptiveCreation,thetaphi/Lucene-Solr-BadApples-8.x-Linux/68/
2019-10-11.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-BadApples-Tests-master/501/
2019-11-08.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-BadApples-Tests-master/527/
2019-11-10.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-BadApples-Tests-8.x/270/
2019-11-12.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-BadApples-Tests-master/531/
2019-11-13.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-repro-Java11/618/
2019-11-14.method-failures.csv.gz:org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest,testDateMathInStart,apache/Lucene-Solr-BadApples-Tests-master/533/
{noformat}

> TimeRoutedAliasUpdateProcessorTest rarely fails to see collection just created
> --
>
> Key: SOLR-13059
> URL: https://issues.apache.org/jira/browse/SOLR-13059
> Project: Solr
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 8.0
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
>
> This issue is for tracking down and fixing this stack trace observed during 
> SOLR-13051:
> {code:java}
> [junit4] ERROR 11.2s | TimeRoutedAliasUpdateProcessorTest.testSliceRouting <<<
> [junit4] > 

[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976979#comment-16976979
 ] 

Uwe Schindler commented on SOLR-13941:
--

I commented on the PR; I was answering here before seeing the PR.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: SOLR-13941.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.
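One defensive pattern that sidesteps the discrepancy (a sketch under my own 
assumptions, not the actual SolrDispatchFilter code) is to derive the 
servlet-relative path from the full request URI and the context path, both of 
which behave consistently across containers:

```java
public class PathSketch {
    // Compute the servlet-relative path from the full request URI and the
    // context path, so the result does not depend on how the container split
    // servletPath vs pathInfo (the discrepancy described above).
    static String solrPath(String requestUri, String contextPath) {
        return requestUri.substring(contextPath.length());
    }

    public static void main(String[] args) {
        // Deployed under a "/solr" context path vs the root context.
        System.out.println(solrPath("/solr/admin/info", "/solr")); // prints /admin/info
        System.out.println(solrPath("/admin/info", ""));           // prints /admin/info
    }
}
```

In servlet terms this corresponds to using request.getRequestURI() and 
request.getContextPath() rather than getServletPath()/getPathInfo().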






[jira] [Created] (SOLR-13943) TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: multi-threaded race condition due to ZK assumptions

2019-11-18 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-13943:
-

 Summary: TimeRoutedAliasUpdateProcessorTest.testDateMathInStart: 
multi-threaded race condition due to ZK assumptions
 Key: SOLR-13943
 URL: https://issues.apache.org/jira/browse/SOLR-13943
 Project: Solr
  Issue Type: Test
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter


TimeRoutedAliasUpdateProcessorTest does not currently run in many jenkins 
builds due to being marked BadApple(SOLR-13059) -- however when it does run, 
the method {{testDateMathInStart}} frequently fails due to what appears to be a 
multi-threaded race condition in the test logic...

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TimeRoutedAliasUpdateProcessorTest 
-Dtests.method=testDateMathInStart -Dtests.seed=8879E35521A4B9EA 
-Dtests.multiplier=2 -Dtests.
slow=true -Dtests.badapples=true -Dtests.locale=nl-BQ 
-Dtests.timezone=America/Porto_Acre -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE 6.96s J0 | 
TimeRoutedAliasUpdateProcessorTest.testDateMathInStart <<<
   [junit4]> Throwable #1: java.lang.AssertionError: router.start should 
not have any date math by this point and parse as an instant. Using class 
org.apache.solr.client.solrj.impl.ZkCl
ientClusterStateProvider Found:2019-09-14T03:00:00Z/DAY
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([8879E35521A4B9EA:64FE3DD88112B802]:0)
   [junit4]>at 
org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest.testDateMathInStart(TimeRoutedAliasUpdateProcessorTest.java:765)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
   [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}

I'll attach some logs from recent failures and my own quick analysis of the 
problems of how the test appears to be asserting ZK updates.







[GitHub] [lucene-solr] uschindler commented on issue #1018: SOLR-13941: Configure JettySolrRunner same as in web.xml

2019-11-18 Thread GitBox
uschindler commented on issue #1018: SOLR-13941: Configure JettySolrRunner same 
as in web.xml
URL: https://github.com/apache/lucene-solr/pull/1018#issuecomment-555257355
 
 
   You can remove the dummy 404Servlet from source code. It's obsolete now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976967#comment-16976967
 ] 

Uwe Schindler commented on SOLR-13941:
--

Hi,
You can also remove the obsolete servlet from the source code.
The servlet was only needed because of this binding without a slash: the 
container sees no matching servlet on the root path, because * is an invalid 
mapping that should only be used for file extensions. The 404 servlet was a 
workaround to keep the container happy.

Could you please also fix the DebugFilter mapping a few lines above? It has the 
same issue.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: SOLR-13941.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976963#comment-16976963
 ] 

Jan Høydahl commented on SOLR-13941:


Took it a step further in [GitHub Pull Request 
#1018|https://github.com/apache/lucene-solr/pull/1018] where I added the 
utility method and simplified all code paths I could find using 
{{getPathInfo}}. There was a lot of duplication and several different ways of 
concatenating those paths.

I think it makes sense to commit this one first and then SOLR-13905 after.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: SOLR-13941.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[GitHub] [lucene-solr] janhoy opened a new pull request #1018: SOLR-13941: Configure JettySolrRunner same as in web.xml

2019-11-18 Thread GitBox
janhoy opened a new pull request #1018: SOLR-13941: Configure JettySolrRunner 
same as in web.xml
URL: https://github.com/apache/lucene-solr/pull/1018
 
 
   
   
   
   # Description
   
   Make sure tests have same servlet root for DispatchFilter as when running 
Solr from cmdline. This avoids some subtle confusion in tests.
   
   # Solution
   
Wire SolrDispatchFilter on `/*` instead of `*`, and make a new 
`ServletUtils` class with a method that concatenates servletPath and pathInfo 
for more readable and less error-prone code.
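As a rough illustration of the helper described above (the names here are hypothetical; the actual class and method in the PR may differ), joining servletPath and pathInfo could look like:

```java
// Hypothetical sketch of the path-joining helper described above; the real
// class and method names in the PR may differ.
public class ServletPathSketch {

    /** Joins servletPath and pathInfo, treating a null pathInfo as empty. */
    public static String fullPath(String servletPath, String pathInfo) {
        return servletPath + (pathInfo == null ? "" : pathInfo);
    }

    public static void main(String[] args) {
        // Mapped on "/*", the whole path arrives in pathInfo with an empty servletPath...
        if (!fullPath("", "/admin/info/system").equals("/admin/info/system"))
            throw new AssertionError();
        // ...while the old "*" mapping delivered it in servletPath with a null pathInfo.
        if (!fullPath("/admin/info/system", null).equals("/admin/info/system"))
            throw new AssertionError();
        // Either way, callers using the joined value see the same full path.
    }
}
```

Callers then depend on the joined value rather than on which of the two servlet API methods happens to carry the path in a given container configuration.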
   
   # Tests
   
   No new tests necessary
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I am authorized to contribute this code to the ASF and have removed 
any code I do not have a license to distribute.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   





[jira] [Created] (SOLR-13942) /solr/admin/zookeeper should have an option to get raw data

2019-11-18 Thread Noble Paul (Jira)
Noble Paul created SOLR-13942:
-

 Summary: /solr/admin/zookeeper should have an option to get raw 
data
 Key: SOLR-13942
 URL: https://issues.apache.org/jira/browse/SOLR-13942
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul


an extra parameter {{raw=true}} should just dump the content of the node






[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976944#comment-16976944
 ] 

Jan Høydahl commented on SOLR-13941:


I tested the attached patch. Had to remove the wiring of the dummy 404 servlet 
for it to work correctly. What happened then is that you suddenly got the path 
from the servletPath call instead of the pathInfo call - just like in production. 
I'll try to run the full test suite and see if something breaks :)

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: SOLR-13941.patch
>
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Moved] (LUCENE-9052) Deprecated method copyChars is used in example

2019-11-18 Thread Robert Scholte (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Scholte moved MNGSITE-382 to LUCENE-9052:


  Key: LUCENE-9052  (was: MNGSITE-382)
Lucene Fields: New
 Workflow: patch-available, re-open possible, new labels  (was: Default 
workflow, editable Closed status)
  Project: Lucene - Core  (was: Maven Project Web Site)

> Deprecated method copyChars is used in example
> --
>
> Key: LUCENE-9052
> URL: https://issues.apache.org/jira/browse/LUCENE-9052
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: gitesh
>Priority: Minor
>
> Following documentation page for FST class has some example code.
> [https://lucene.apache.org/core/8_3_0/core/org/apache/lucene/util/fst/package-summary.html]
> {code:java}
>  // Input values (keys). These must be provided to Builder in Unicode 
> sorted order!
>  String inputValues[] = {"cat", "dog", "dogs"};
>  long outputValues[] = {5, 7, 12};
>  
>  PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
>  Builder builder = new Builder(INPUT_TYPE.BYTE1, outputs);
>  BytesRef scratchBytes = new BytesRef();
>  IntsRefBuilder scratchInts = new IntsRefBuilder();
>  for (int i = 0; i < inputValues.length; i++) {
>scratchBytes.copyChars(inputValues[i]);
>builder.add(Util.toIntsRef(scratchBytes, scratchInts), 
> outputValues[i]);
>  }
>  FST fst = builder.finish();
> {code}
> Compiling the above code with the Solr 8.3 libraries fails because no 
> copyChars method is found; the copyChars method in BytesRef has been 
> deprecated for a long time. We should use the BytesRefBuilder class instead. 
> Here is the corrected code:
> {code:java}
> String inputValues[] = {"cat", "dog", "dogs"};
> long outputValues[] = {5, 7, 12};
> PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
> Builder builder = new Builder(FST.INPUT_TYPE.BYTE1, outputs);
> BytesRefBuilder scratchBytes = new BytesRefBuilder();
> IntsRefBuilder scratchInts = new IntsRefBuilder();
> for (int i = 0; i < inputValues.length; i++) {
> scratchBytes.copyChars(inputValues[i]);
> builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), 
> outputValues[i]);
> }
> FST fst = builder.finish();
> {code}
>  






[jira] [Commented] (SOLR-12028) BadApple and AwaitsFix annotations usage

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976903#comment-16976903
 ] 

ASF subversion and git services commented on SOLR-12028:


Commit cb72085ee8cc5fa9229424c101a744f758042153 in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cb72085 ]

HdfsRecoveryZkTest & HdfsNNFailoverTest: Remove @BadApple annotation

These tests were originally annotated @BadApple in early 2018 as part of 
SOLR-12028.

Neither test has failed since 2018-12-28.

Since we no longer have logs from those older jenkins builds, it's hard to be 
certain how/why this
test was failing, or why exactly it stopped failing – but it's possible the 
underlying issues were
addressed by general hardening of SolrCloud and the associated base test 
classes around the same time.

(cherry picked from commit 1411aaee94d49f26c55272f3876a4261357467c8)


> BadApple and AwaitsFix annotations usage
> 
>
> Key: SOLR-12028
> URL: https://issues.apache.org/jira/browse/SOLR-12028
> Project: Solr
>  Issue Type: Task
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-12016-buildsystem.patch, SOLR-12028-3-Mar.patch, 
> SOLR-12028-sysprops-reproduce.patch, SOLR-12028.patch, SOLR-12028.patch
>
>
> There's a long discussion of this topic at SOLR-12016. Here's a summary:
> - BadApple annotations are used for tests that intermittently fail, say < 30% 
> of the time. Tests that fail more often should be moved to AwaitsFix. This is, 
> of course, a judgement call
> - AwaitsFix annotations are used for tests that, for some reason, the problem 
> can't be fixed immediately. Likely reasons are third-party dependencies, 
> extreme difficulty tracking down, dependency on another JIRA etc.
> Jenkins jobs will typically run with BadApple disabled to cut down on noise. 
> Periodically Jenkins jobs will be run with BadApples enabled so BadApple 
> tests won't be lost and reports can be generated. Tests that run with 
> BadApples disabled that fail require _immediate_ attention.
> The default for developers is that BadApple is enabled.
> If you are working on one of these tests and cannot get the test to fail 
> locally, it is perfectly acceptable to comment the annotation out. You should 
> let the dev list know that this is deliberate.
> This JIRA is a placeholder for BadApple tests to point to between the times 
> they're identified as BadApple and they're either fixed or changed to 
> AwaitsFix or assigned their own JIRA.
> I've assigned this to myself to track so I don't lose track of it. No one 
> person will fix all of these issues, this will be an ongoing technical debt 
> cleanup effort.






[jira] [Commented] (LUCENE-9049) Remove FST cachedRootArcs now redundant with direct-addressing

2019-11-18 Thread Jack Conradson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976899#comment-16976899
 ] 

Jack Conradson commented on LUCENE-9049:


Hi [~bruno.roustant], I was wondering if you were actively working on this 
issue. If not, would you mind if I gave it a try?

> Remove FST cachedRootArcs now redundant with direct-addressing
> --
>
> Key: LUCENE-9049
> URL: https://issues.apache.org/jira/browse/LUCENE-9049
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Bruno Roustant
>Priority: Major
>
> With LUCENE-8920, the FST most often encodes top-level nodes with 
> direct addressing (instead of an array for binary search). This probably makes 
> the cachedRootArcs redundant, so they should be removed, which will also 
> reduce the code.






[jira] [Commented] (SOLR-13782) Make HTML Ref Guide the primary release vehicle instead of PDF

2019-11-18 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976897#comment-16976897
 ] 

Cassandra Targett commented on SOLR-13782:
--

Yes, it is going to be this week. I wasn't able to get to it before I had to 
travel and didn't want to do it while on the road.

> Make HTML Ref Guide the primary release vehicle instead of PDF
> --
>
> Key: SOLR-13782
> URL: https://issues.apache.org/jira/browse/SOLR-13782
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 8.4
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in a recent mail thread [1], we have agreed that it is time for 
> us to stop treating the PDF version of the Ref Guide as the "official" 
> version and instead emphasize the HTML version as the official version.
> The arguments for/against this decision are in the linked thread, but for the 
> purpose of this issue there are a couple of things to do:
> - Modify the publication process docs (under 
> {{solr/solr-ref-guide/src/meta-docs}}
> - Announce to the solr-user list that this is happening
> A separate issue will be created to automate parts of the publication 
> process, since they require some discussion and possibly coordination with 
> Infra on the options there.
> [1] 
> https://lists.apache.org/thread.html/f517b3b74a0a33e5e6fa87e888459fc007decc49d27a4f49822ca2ee@%3Cdev.lucene.apache.org%3E






[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976892#comment-16976892
 ] 

ASF subversion and git services commented on LUCENE-9036:
-

Commit 1c0c244129e9c8d8b926c27c7e81a299fd8b4ab0 in lucene-solr's branch 
refs/heads/branch_8x from Mikhail Khludnev
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1c0c244 ]

LUCENE-9036: ExitableDirectoryReader checks timeout on DocValues access.


> ExitableDirectoryReader to interrupt DocValues as well
> --
>
> Key: LUCENE-9036
> URL: https://issues.apache.org/jira/browse/LUCENE-9036
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch, 
> LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch
>
>
> This allows making AnalyticsComponent and json.facet sensitive to the time 
> allowed. 
> Does it make sense? Is it enough to check on DocValues creation, i.e. per 
> field/segment, or is it worth checking every Nth doc? 
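The "check every Nth doc" question in the quoted description can be sketched as a sampled timeout check (a hypothetical illustration, not Lucene's actual ExitableDirectoryReader code): the clock is consulted only once every N calls, so the per-document overhead is a counter increment and a mask.

```java
// Hypothetical sketch of a sampled timeout check, not Lucene's actual
// ExitableDirectoryReader code: the clock is read only once every SAMPLE
// calls, so the per-document cost is a counter increment plus a mask.
public class SampledTimeoutCheck {
    static final int SAMPLE = 1024;          // power of two, so the mask trick works

    private final long deadlineNanos;
    private int counter;

    public SampledTimeoutCheck(long timeoutNanos) {
        this.deadlineNanos = System.nanoTime() + timeoutNanos;
    }

    /** Returns true once the deadline has passed; samples the clock every SAMPLE calls. */
    public boolean shouldExit() {
        if ((++counter & (SAMPLE - 1)) != 0) {
            return false;                    // cheap path: no clock read
        }
        return System.nanoTime() > deadlineNanos;
    }

    public static void main(String[] args) {
        // A generous deadline never fires within one sampling window.
        SampledTimeoutCheck relaxed = new SampledTimeoutCheck(60_000_000_000L);
        for (int i = 0; i < SAMPLE - 1; i++) {
            if (relaxed.shouldExit()) throw new AssertionError();
        }
        // An already-expired deadline is noticed within SAMPLE calls.
        SampledTimeoutCheck expired = new SampledTimeoutCheck(-1L);
        boolean exited = false;
        for (int i = 0; i < SAMPLE; i++) {
            exited |= expired.shouldExit();
        }
        if (!exited) throw new AssertionError();
    }
}
```

Checking once per field/segment at DocValues creation catches slow setups, while a per-document sample like this also bounds long iterations; the two approaches are complementary.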






[jira] [Updated] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13941:
---
Attachment: SOLR-13941.patch

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: SOLR-13941.patch
>
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976871#comment-16976871
 ] 

ASF subversion and git services commented on LUCENE-9036:
-

Commit 51b1c5a023e587646f4d01ee38dfa3848faac91c in lucene-solr's branch 
refs/heads/master from Mikhail Khludnev
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=51b1c5a ]

LUCENE-9036: ExitableDirectoryReader checks timeout on DocValues access.


> ExitableDirectoryReader to interrupt DocValues as well
> --
>
> Key: LUCENE-9036
> URL: https://issues.apache.org/jira/browse/LUCENE-9036
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch, 
> LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch
>
>
> This allows making AnalyticsComponent and json.facet sensitive to the time 
> allowed. 
> Does it make sense? Is it enough to check on DocValues creation, i.e. per 
> field/segment, or is it worth checking every Nth doc? 






[jira] [Commented] (SOLR-12028) BadApple and AwaitsFix annotations usage

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976811#comment-16976811
 ] 

ASF subversion and git services commented on SOLR-12028:


Commit 1411aaee94d49f26c55272f3876a4261357467c8 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1411aae ]

HdfsRecoveryZkTest & HdfsNNFailoverTest: Remove @BadApple annotation

These tests were originally annotated @BadApple in early 2018 as part of 
SOLR-12028.

Neither test has failed since 2018-12-28.

Since we no longer have logs from those older jenkins builds, it's hard to be 
certain how/why this
test was failing, or why exactly it stopped failing – but it's possible the 
underlying issues were
addressed by general hardening of SolrCloud and the associated base test 
classes around the same time.


> BadApple and AwaitsFix annotations usage
> 
>
> Key: SOLR-12028
> URL: https://issues.apache.org/jira/browse/SOLR-12028
> Project: Solr
>  Issue Type: Task
>  Components: Tests
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-12016-buildsystem.patch, SOLR-12028-3-Mar.patch, 
> SOLR-12028-sysprops-reproduce.patch, SOLR-12028.patch, SOLR-12028.patch
>
>
> There's a long discussion of this topic at SOLR-12016. Here's a summary:
> - BadApple annotations are used for tests that intermittently fail, say < 30% 
> of the time. Tests that fail more often should be moved to AwaitsFix. This is, 
> of course, a judgement call
> - AwaitsFix annotations are used for tests that, for some reason, the problem 
> can't be fixed immediately. Likely reasons are third-party dependencies, 
> extreme difficulty tracking down, dependency on another JIRA etc.
> Jenkins jobs will typically run with BadApple disabled to cut down on noise. 
> Periodically Jenkins jobs will be run with BadApples enabled so BadApple 
> tests won't be lost and reports can be generated. Tests that run with 
> BadApples disabled that fail require _immediate_ attention.
> The default for developers is that BadApple is enabled.
> If you are working on one of these tests and cannot get the test to fail 
> locally, it is perfectly acceptable to comment the annotation out. You should 
> let the dev list know that this is deliberate.
> This JIRA is a placeholder for BadApple tests to point to between the times 
> they're identified as BadApple and they're either fixed or changed to 
> AwaitsFix or assigned their own JIRA.
> I've assigned this to myself to track so I don't lose track of it. No one 
> person will fix all of these issues, this will be an ongoing technical debt 
> cleanup effort.






[jira] [Commented] (LUCENE-9051) Implement random access seeks in IndexedDISI (DocValues)

2019-11-18 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976762#comment-16976762
 ] 

Adrien Grand commented on LUCENE-9051:
--

Code duplication might look unnecessary, but I also see benefits in having 
independent forks so that they can evolve according to their own constraints. 
For instance today's implementation of Lucene80's IndexedDISI might be close to 
your needs, but if we find a way to make it better for the access pattern that 
is typical to doc values, it would be a shame that it would slow down 
nearest-neighbor search or vice-versa. One could make the argument that we 
could delay the decision to fork until it's needed, but then it's an incentive 
against simple changes, e.g. reordering some loops or replacing a binary search 
with an exponential search would make the diff very large because of the need 
to duplicate IndexedDISI.
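The exponential-search alternative mentioned above can be sketched on a plain sorted int[] (a toy model, not IndexedDISI itself): probe forward by doubling steps from the current position, then binary-search the bracketed range. This stays cheap when the target is close to the current position, the common advance() access pattern for postings and doc values.

```java
import java.util.Arrays;

// Toy exponential-search sketch (not Lucene's IndexedDISI code): double the
// probe distance until the target is bracketed, then binary-search that range.
public class ExponentialSearchSketch {

    /** Returns the index of the first element >= target, searching from 'from'. */
    public static int advance(int[] docs, int from, int target) {
        int bound = 1;
        while (from + bound < docs.length && docs[from + bound] < target) {
            bound <<= 1;                        // double the probe distance
        }
        int lo = from + (bound >> 1);
        int hi = Math.min(from + bound, docs.length - 1);
        int idx = Arrays.binarySearch(docs, lo, hi + 1, target);
        return idx >= 0 ? idx : -idx - 1;       // insertion point if absent
    }

    public static void main(String[] args) {
        int[] docs = {1, 4, 6, 9, 13, 20, 21, 35, 50};
        if (advance(docs, 0, 9) != 3) throw new AssertionError();
        if (advance(docs, 3, 21) != 6) throw new AssertionError();
        if (advance(docs, 0, 2) != 1) throw new AssertionError();
    }
}
```

The work is O(log d) in the distance d actually traveled, rather than O(log n) in the block size, which is why small forward hops become cheap.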

> Implement random access seeks in IndexedDISI (DocValues)
> 
>
> Key: LUCENE-9051
> URL: https://issues.apache.org/jira/browse/LUCENE-9051
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In LUCENE-9004 we have a use case for random-access seeking in DocValues, 
> which currently only support forward-only iteration (with efficient 
> skipping). One idea there was to write an entirely new format to cover these 
> cases. While looking into that, I noticed that our current DocValues 
> addressing implementation, {{IndexedDISI}}, already has a pretty good basis 
> for providing random accesses. I worked up a patch that does that; we already 
> have the ability to jump to a block, thanks to the jump-tables added last 
> year by [~toke]; the patch uses that, and/or rewinds the iteration within 
> current block as needed.
> I did a very simple performance test, comparing forward-only iteration with 
> random seeks, and in my test I saw no difference, but that can't be right, so 
> I wonder if we have a more thorough performance test of DocValues somewhere 
> that I could repurpose. Probably I'll go back and dig into the issue where we 
> added the jump tables - I seem to recall some testing was done then.
> Aside from performance testing the implementation, there is the question 
> should we alter our API guarantees in this way. This might be controversial, 
> I don't know the history or all the reasoning behind the way it is today. We 
> provide {{advanceExact}} and some implementations support docids going 
> backwards, others don't.  {{AssertingNumericDocValues.advanceExact}} does  
> enforce forward-iteration (in tests); what would the consequence be of 
> relaxing that? We'd then open ourselves up to requiring all DV impls to 
> support random access. Are there other impls to worry about though? I'm not 
> sure. I'd appreciate y'all's input on this one.






[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976752#comment-16976752
 ] 

ASF subversion and git services commented on LUCENE-9027:
-

Commit 7755cdf03fc250e310c3b7d9b2e785f2939d3dc9 in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7755cdf ]

LUCENE-9027: Use SIMD instructions to decode postings. (#973)
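The "pack multiple integers into a long" emulation described in the quoted issue below can be illustrated with a toy shift (an illustration of the idea only, not the actual Lucene decoder): a single long shift plus a mask applies the same shift to both 32-bit lanes at once.

```java
// Toy illustration of the "2 ints in a long" trick (not Lucene's decoder):
// one long shift plus one mask shifts both 32-bit lanes in a single operation.
public class PackedShiftSketch {

    /** Packs two ints into one long: 'hi' in the upper lane, 'lo' in the lower. */
    public static long pack(int hi, int lo) {
        return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
    }

    /** Shifts both 32-bit lanes right by b bits with a single long shift. */
    public static long shiftBothLanes(long packed, int b) {
        long laneMask = 0xFFFFFFFFL >>> b;      // clears bits that bled across lanes
        long mask = (laneMask << 32) | laneMask;
        return (packed >>> b) & mask;
    }

    public static void main(String[] args) {
        long p = pack(0xF0F0F0F0, 0x0F0F0F0F);
        long shifted = shiftBothLanes(p, 4);
        // Each lane matches the result of shifting it individually.
        if ((int) (shifted >>> 32) != (0xF0F0F0F0 >>> 4)) throw new AssertionError();
        if ((int) shifted != (0x0F0F0F0F >>> 4)) throw new AssertionError();
    }
}
```

With 2 ints (or 4 shorts, or 8 bytes) per long, each shift/mask processes several values at once even before the JIT vectorizes the surrounding loop.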




> SIMD-based decoding of postings lists
> -
>
> Key: LUCENE-9027
> URL: https://issues.apache.org/jira/browse/LUCENE-9027
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> [~rcmuir] has been mentioning the idea for quite some time that we might be 
> able to write the decoding logic in such a way that Java would use SIMD 
> instructions. More recently [~paul.masurel] wrote a [blog 
> post|https://fulmicoton.com/posts/bitpacking/] that raises the point that 
> Lucene could still decode multiple ints at once in a single instruction by 
> packing two ints in a long and we've had some discussions about what we could 
> try in Lucene to speed up the decoding of postings. This made me want to look 
> a bit deeper at what we could do.
> Our current decoding logic reads data in a byte[] and decodes packed integers 
> from it. Unfortunately it doesn't make use of SIMD instructions and looks 
> like 
> [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java].
> I confirmed by looking at the generated assembly that if I take an array of 
> integers and shift them all by the same number of bits then Java will use 
> SIMD instructions to shift multiple integers at once. This led me to writing 
> this 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java]
>  that tries as much as possible to shift long sequences of ints by the same 
> number of bits to speed up decoding. It is indeed faster than the current 
> logic we have, up to about 2x faster for some numbers of bits per value.
> Currently the best 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java]
>  I've been able to come up with combines the above idea with the idea that 
> Paul mentioned in his blog that consists of emulating SIMD from Java by 
> packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a 
> bit harder to read but gives another speedup on top of the above 
> implementation.
> I have a [JMH 
> benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in 
> case someone would like to play with this and maybe even come up with an even 
> faster implementation. It is 2-2.5x faster than our current implementation 
> for most numbers of bits per value. I'm copying results here:
> {noformat}
>  * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer, it serves 
> as
>a baseline.
>  * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the
>current Lucene codec does.
>  * `decodeNaiveFromLongs` decodes from longs on the fly.
>  * `decodeSimpleSIMD` is a simple implementation that relies on how Java
>recognizes some patterns and uses SIMD instructions.
>  * `decodeSIMD` is a more complex implementation that both relies on the C2
>compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or
>2 ints in a long in order to decompress multiple values at once.
> Benchmark   (bitsPerValue)  (byteOrder)   
> Mode  Cnt   Score   Error   Units
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   1   LE  
> thrpt5  12.912 ± 0.393  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   1   BE  
> thrpt5  12.862 ± 0.395  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   2   LE  
> thrpt5  13.040 ± 1.162  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   2   BE  
> thrpt5  13.027 ± 0.270  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   3   LE  
> thrpt5  12.409 ± 0.637  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   3   BE  
> thrpt5  12.268 ± 0.947  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   4   LE  
> thrpt5  14.177 ± 2.263  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   4   BE  
> thrpt5  11.457 ± 0.150  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   5   LE  
> thrpt5  10.988 ± 1.179  ops/us
> 
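
The packed-lane trick described above, where two ints share one long so a single mask or shift operates on both "lanes" at once, can be sketched with a toy decoder. This is an illustration under an assumed layout (one 16-bit value in the low half of each 32-bit lane), not the benchmark code linked above:

```java
/**
 * Toy sketch of the "SIMD emulation" idea: pack two values into one long so
 * that a single mask-and-shift decodes both lanes at once. Layout assumed
 * here (not Lucene's actual format): each long holds one 16-bit value in
 * bits 0-15 and another in bits 32-47.
 */
public class PackedLaneDemo {

  /** Decode 2 * packed.length values into out. */
  static void decode16(long[] packed, int[] out) {
    final long mask = 0x0000FFFF_0000FFFFL; // keep low 16 bits of each lane
    for (int i = 0; i < packed.length; i++) {
      long w = packed[i] & mask;         // one AND clears junk in both lanes
      out[2 * i]     = (int) (w >>> 32); // upper lane
      out[2 * i + 1] = (int) w;          // lower lane
    }
  }

  public static void main(String[] args) {
    long[] packed = { (5L << 32) | 7L, (130L << 32) | 255L };
    int[] out = new int[4];
    decode16(packed, out);
    System.out.println(java.util.Arrays.toString(out)); // [5, 7, 130, 255]
  }
}
```

Applying the same mask and shift uniformly across the whole array is also the pattern that lets the C2 compiler auto-vectorize the "simple SIMD" variant mentioned above.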

[GitHub] [lucene-solr] jpountz merged pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
jpountz merged pull request #973: LUCENE-9027: Use SIMD instructions to decode 
postings.
URL: https://github.com/apache/lucene-solr/pull/973
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on issue #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
jpountz commented on issue #973: LUCENE-9027: Use SIMD instructions to decode 
postings.
URL: https://github.com/apache/lucene-solr/pull/973#issuecomment-555138178
 
 
   Thanks @mikemccand for taking the time to look at this large PR!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13924) MoveReplica failures when using HDFS (NullPointerException)

2019-11-18 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976716#comment-16976716
 ] 

Chris M. Hostetter commented on SOLR-13924:
---

SOLR-13924: AwaitsFix: MoveReplicaHDFSTest

I've AwaitsFix'ed the entire class due to this bug so it no longer triggers 
jenkins failures, but I should point out that some of the individual test 
methods have already been BadApple'ed with an annotation pointing at 
SOLR-12028, however:
 * the SOLR-12028 BadApple annotations were added ~2018-10
 * Prior to SOLR-13843 / SOLR-13924 failures, the last jenkins failure from 
this class was 2019-03-13
 ** [~krisden] did a lot of general HDFS test improvements, including 
modifications to this class, in SOLR-13330 ~2019-03, which were probably 
related to the drop-off in failures
 * So once SOLR-13924 is fixed, the entire test should be re-evaluated to 
confirm if all the SOLR-12028 annotations can be removed.

> MoveReplica failures when using HDFS (NullPointerException)
> ---
>
> Key: SOLR-13924
> URL: https://issues.apache.org/jira/browse/SOLR-13924
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.3
>Reporter: Chris M. Hostetter
>Assignee: Shalin Shekhar Mangar
>Priority: Major
>
> Based on recent jenkins test failures, it appears that attempting to use the 
> "MoveReplica" command on HDFS has a high chance of failure due to an 
> underlying NPE.
> I'm not sure if this bug *only* affects HDFS, or if it's just more likely to 
> occur when using HDFS due to some timing quirks.
> It's also possible that the bug impacts non-HDFS users just as much as HDFS 
> users, but only manifests in our tests due to some quirk of our 
> {{cloud-hdfs}} test configs.
> The problem appears to be new in 8.3 as a result of changes made in SOLR-13843



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976712#comment-16976712
 ] 

Uwe Schindler edited comment on SOLR-13941 at 11/18/19 5:16 PM:


Hi,
I do not see any serious problem in the embedded jetty, but there is actually 
one difference - which makes sense:

in web.xml the SolrDispatchFilter is mounted at path {{/\*}} while in the 
embedded jetty it's just {{\*}}. I don't have much time to test it now, but maybe 
adding a "/" in the class here is enough to fix this:

https://github.com/apache/lucene-solr/blob/cf21340294a52ad764deac7b9cdd38d06cfbc3da/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java#L386

The reason for this is explained here and has to do with the servlet spec: 
https://bluxte.net/musings/2006/03/29/servletpath-and-pathinfo-servlet-api-weirdness/

Same applies for the debug servlet.


was (Author: thetaphi):
Hi,
I do not see any serious problem in the embedded jetty, but there is actually 
one difference - which makes sense:

in web.xml the SolrDispatchFilter is mounted at path {{/*}} while in the embedded 
jetty it's just {{*}}. I don't have much time to test it now, but maybe adding a 
"/" in the class here is enough to fix this:

https://github.com/apache/lucene-solr/blob/cf21340294a52ad764deac7b9cdd38d06cfbc3da/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java#L386

The reason for this is explained here and has to do with the servlet spec: 
https://bluxte.net/musings/2006/03/29/servletpath-and-pathinfo-servlet-api-weirdness/

Same applies for the debug servlet.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Comment Edited] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976712#comment-16976712
 ] 

Uwe Schindler edited comment on SOLR-13941 at 11/18/19 5:16 PM:


Hi,
I do not see any serious problem in the embedded jetty, but there is actually 
one difference - which makes sense:

in web.xml the SolrDispatchFilter is mounted at path "{{/\*}}" while in the 
embedded jetty it's just "{{\*}}". I don't have much time to test it now, but 
maybe adding a "{{/}}" in the class here is enough to fix this:

https://github.com/apache/lucene-solr/blob/cf21340294a52ad764deac7b9cdd38d06cfbc3da/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java#L386

The reason for this is explained here and has to do with the servlet spec: 
https://bluxte.net/musings/2006/03/29/servletpath-and-pathinfo-servlet-api-weirdness/

Same applies for the debug servlet.


was (Author: thetaphi):
Hi,
I do not see any serious problem in the embedded jetty, but there is actually 
one difference - which makes sense:

in web.xml the SolrDispatchFilter is mounted at path {{/\*}} while in the 
embedded jetty it's just {{\*}}. I don't have much time to test it now, but maybe 
adding a "/" in the class here is enough to fix this:

https://github.com/apache/lucene-solr/blob/cf21340294a52ad764deac7b9cdd38d06cfbc3da/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java#L386

The reason for this is explained here and has to do with the servlet spec: 
https://bluxte.net/musings/2006/03/29/servletpath-and-pathinfo-servlet-api-weirdness/

Same applies for the debug servlet.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (SOLR-13924) MoveReplica failures when using HDFS (NullPointerException)

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976715#comment-16976715
 ] 

ASF subversion and git services commented on SOLR-13924:


Commit 3b7e33790a487026f590199efd148de011128a3b in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3b7e337 ]

SOLR-13924: AwaitsFix: MoveReplicaHDFSTest

(cherry picked from commit f9076d85cf4804db3eedb23f9ef616f050d328db)


> MoveReplica failures when using HDFS (NullPointerException)
> ---
>
> Key: SOLR-13924
> URL: https://issues.apache.org/jira/browse/SOLR-13924
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.3
>Reporter: Chris M. Hostetter
>Assignee: Shalin Shekhar Mangar
>Priority: Major
>
> Based on recent jenkins test failures, it appears that attempting to use the 
> "MoveReplica" command on HDFS has a high chance of failure due to an 
> underlying NPE.
> I'm not sure if this bug *only* affects HDFS, or if it's just more likely to 
> occur when using HDFS due to some timing quirks.
> It's also possible that the bug impacts non-HDFS users just as much as HDFS 
> users, but only manifests in our tests due to some quirk of our 
> {{cloud-hdfs}} test configs.
> The problem appears to be new in 8.3 as a result of changes made in SOLR-13843






[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976712#comment-16976712
 ] 

Uwe Schindler commented on SOLR-13941:
--

Hi,
I do not see any serious problem in the embedded jetty, but there is actually 
one difference - which makes sense:

in web.xml the SolrDispatchFilter is mounted at path {{/*}} while in the embedded 
jetty it's just {{*}}. I don't have much time to test it now, but maybe adding a 
"/" in the class here is enough to fix this:

https://github.com/apache/lucene-solr/blob/cf21340294a52ad764deac7b9cdd38d06cfbc3da/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java#L386

The reason for this is explained here and has to do with the servlet spec: 
https://bluxte.net/musings/2006/03/29/servletpath-and-pathinfo-servlet-api-weirdness/

Same applies for the debug servlet.
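
The servlet-spec rule at issue can be illustrated with a small stand-alone helper (hypothetical, not Solr code) that mimics how the container splits a request URI into servletPath and pathInfo depending on the mapping pattern:

```java
/**
 * Hypothetical sketch of the servlet-spec path split discussed above:
 * for a "/*" mapping, servletPath is "" and pathInfo carries the full path;
 * for a "/prefix/*" mapping, the prefix goes to servletPath and the rest to
 * pathInfo; for an exact mapping, pathInfo is null.
 */
public class PathSplitDemo {

  /** Returns { servletPath, pathInfo } for the given mapping and path. */
  static String[] split(String pattern, String path) {
    if (pattern.equals("/*")) {
      return new String[] { "", path };          // everything is pathInfo
    }
    if (pattern.endsWith("/*")) {
      String prefix = pattern.substring(0, pattern.length() - 2);
      String rest = path.substring(prefix.length());
      return new String[] { prefix, rest.isEmpty() ? null : rest };
    }
    return new String[] { path, null };          // exact mapping
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(split("/*", "/admin/info")));
    // [, /admin/info]
    System.out.println(java.util.Arrays.toString(split("/solr/*", "/solr/admin")));
    // [/solr, /admin]
  }
}
```

This is why code that reads getPathInfo() under one mapping can see null under the other.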

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (SOLR-13924) MoveReplica failures when using HDFS (NullPointerException)

2019-11-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976702#comment-16976702
 ] 

ASF subversion and git services commented on SOLR-13924:


Commit f9076d85cf4804db3eedb23f9ef616f050d328db in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f9076d8 ]

SOLR-13924: AwaitsFix: MoveReplicaHDFSTest


> MoveReplica failures when using HDFS (NullPointerException)
> ---
>
> Key: SOLR-13924
> URL: https://issues.apache.org/jira/browse/SOLR-13924
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.3
>Reporter: Chris M. Hostetter
>Assignee: Shalin Shekhar Mangar
>Priority: Major
>
> Based on recent jenkins test failures, it appears that attempting to use the 
> "MoveReplica" command on HDFS has a high chance of failure due to an 
> underlying NPE.
> I'm not sure if this bug *only* affects HDFS, or if it's just more likely to 
> occur when using HDFS due to some timing quirks.
> It's also possible that the bug impacts non-HDFS users just as much as HDFS 
> users, but only manifests in our tests due to some quirk of our 
> {{cloud-hdfs}} test configs.
> The problem appears to be new in 8.3 as a result of changes made in SOLR-13843






[jira] [Commented] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976699#comment-16976699
 ] 

Uwe Schindler commented on SOLR-13941:
--

OK, I see. I will check the test setup code to see how it binds the 
servlet/servlet filter and how the context path is set up.

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (SOLR-13905) Nullpointer exception in AuditEvent

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976629#comment-16976629
 ] 

Jan Høydahl commented on SOLR-13905:


I'll merge this PR tomorrow and defer to SOLR-13941 to fix the root cause of 
this confusion.

> Nullpointer exception in AuditEvent
> ---
>
> Key: SOLR-13905
> URL: https://issues.apache.org/jira/browse/SOLR-13905
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Auditlogging
>Affects Versions: 8.3
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.4, 8.3.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Nullpointer exception in AuditEvent for events with HttpServletRequest as 
> input. Happens when {{getPathInfo()}} returns null, which was not caught by 
> current tests. This causes the whole request to fail, rendering the audit 
> service unusable.
> The nullpointer is experienced in the {{findRequestType()}} method when 
> performing a pattern match on the resource (path).
> This is a regression from 8.3, caused by SOLR-13835 where we switched from 
> fetching the URL path from {{httpRequest.getContextPath()}} to 
> {{httpRequest.getPathInfo()}}. However while this method behaves well in 
> tests (JettyTestRunner) it returns {{null}} in production.
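
A null-safe guard of the kind this bug calls for can be sketched as follows. This is a hypothetical helper, not the actual Solr fix; the method and parameter names are assumptions:

```java
/**
 * Hypothetical sketch: build the resource path for audit pattern matching
 * without assuming getPathInfo() is non-null. Either piece may legally be
 * null depending on how the filter is mapped in the container.
 */
public class AuditPathDemo {

  static String resource(String servletPath, String pathInfo) {
    String path = (servletPath == null ? "" : servletPath)
                + (pathInfo == null ? "" : pathInfo);
    return path.isEmpty() ? "/" : path; // never null, safe to regex-match
  }

  public static void main(String[] args) {
    System.out.println(resource(null, "/solr/admin")); // /solr/admin
    System.out.println(resource("/solr/admin", null)); // /solr/admin
    System.out.println(resource(null, null));          // /
  }
}
```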






[jira] [Assigned] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-13941:
--

Assignee: Uwe Schindler

> Tests configure Jetty differently than when running via start.jar
> -
>
> Key: SOLR-13941
> URL: https://issues.apache.org/jira/browse/SOLR-13941
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Spinoff from SOLR-13905.
> There seems to be a slightly different configuration of our servlets when 
> Solr is run from command line and through Test-runner for our tests.
> This causes different behavior of {{httpRequest.getPathInfo}} and 
> {{httpRequest.getServletPath()}} in the two environments, making it hard to 
> depend on these in critical code paths, such as 
> [SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
>  and AuditEvent.






[jira] [Commented] (SOLR-13905) Nullpointer exception in AuditEvent

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976624#comment-16976624
 ] 

Jan Høydahl commented on SOLR-13905:


Spinning off SOLR-13941 to perhaps fix the discrepancy between cmdline and 
TestRunner servlets.

> Nullpointer exception in AuditEvent
> ---
>
> Key: SOLR-13905
> URL: https://issues.apache.org/jira/browse/SOLR-13905
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Auditlogging
>Affects Versions: 8.3
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.4, 8.3.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Nullpointer exception in AuditEvent for events with HttpServletRequest as 
> input. Happens when {{getPathInfo()}} returns null, which was not caught by 
> current tests. This causes the whole request to fail, rendering the audit 
> service unusable.
> The nullpointer is experienced in the {{findRequestType()}} method when 
> performing a pattern match on the resource (path).
> This is a regression from 8.3, caused by SOLR-13835 where we switched from 
> fetching the URL path from {{httpRequest.getContextPath()}} to 
> {{httpRequest.getPathInfo()}}. However while this method behaves well in 
> tests (JettyTestRunner) it returns {{null}} in production.






[jira] [Created] (SOLR-13941) Tests configure Jetty differently than when running via start.jar

2019-11-18 Thread Jira
Jan Høydahl created SOLR-13941:
--

 Summary: Tests configure Jetty differently than when running via 
start.jar
 Key: SOLR-13941
 URL: https://issues.apache.org/jira/browse/SOLR-13941
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Jan Høydahl


Spinoff from SOLR-13905.

There seems to be a slightly different configuration of our servlets when Solr 
is run from command line and through Test-runner for our tests.

This causes different behavior of {{httpRequest.getPathInfo}} and 
{{httpRequest.getServletPath()}} in the two environments, making it hard to 
depend on these in critical code paths, such as 
[SolrDispatchFilter|https://github.com/apache/lucene-solr/blob/f07998fc234c81ff956a84ee508b85f8d573ef38/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L494-L499]
 and AuditEvent.






[GitHub] [lucene-solr] mikemccand commented on a change in pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
mikemccand commented on a change in pull request #973: LUCENE-9027: Use SIMD 
instructions to decode postings.
URL: https://github.com/apache/lucene-solr/pull/973#discussion_r347415612
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/store/ByteBufferIndexInput.java
 ##
 @@ -107,6 +117,40 @@ public final void readBytes(byte[] b, int offset, int len) throws IOException {
 }
   }
 
+  @Override
+  public void readLELongs(long[] dst, int offset, int length) throws IOException {
+// ByteBuffer#getLong could work but it has some per-long overhead and there
+// is no ByteBuffer#getLongs to read multiple longs at once. So we use the
+// below trick in order to be able to leverage LongBuffer#get(long[]) to
+// read multiple longs at once with as little overhead as possible.
+if (curLongBufferViews == null) {
+  // readLELongs is only used for postings today, so we compute the long
+  // views lazily so that other data-structures don't have to pay for the
+  // associated initialization/memory overhead.
+  curLongBufferViews = new LongBuffer[Long.BYTES];
 
 Review comment:
   Thanks!





[GitHub] [lucene-solr] mikemccand commented on a change in pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
mikemccand commented on a change in pull request #973: LUCENE-9027: Use SIMD 
instructions to decode postings.
URL: https://github.com/apache/lucene-solr/pull/973#discussion_r347420183
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/codecs/lucene84/PForUtil.java
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.codecs.lucene84;
+
+import java.io.IOException;
+import java.util.Arrays;
+
+import org.apache.lucene.store.DataInput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.packed.PackedInts;
+
+/**
+ * Utility class to encode sequences of 128 small positive integers.
+ */
+final class PForUtil {
+
+  static boolean allEqual(long[] l) {
+for (int i = 1; i < ForUtil.BLOCK_SIZE; ++i) {
+  if (l[i] != l[0]) {
+return false;
+  }
+}
+return true;
+  }
+
+  private final ForUtil forUtil;
+
+  PForUtil(ForUtil forUtil) {
+this.forUtil = forUtil;
+  }
+
+  /**
+   * Encode 128 integers from {@code longs} into {@code out}.
+   */
+  void encode(long[] longs, DataOutput out) throws IOException {
+// At most 3 exceptions
+final long[] top4 = new long[4];
+Arrays.fill(top4, -1L);
+for (int i = 0; i < ForUtil.BLOCK_SIZE; ++i) {
+  if (longs[i] > top4[0]) {
+top4[0] = longs[i];
+Arrays.sort(top4); // For only 4 entries we just sort on every iteration instead of maintaining a PQ
 
 Review comment:
   Well I got too curious about this and played around with some silly 
micro-benchmarks.  I tested three approaches.
   
   First approach is to inline the PQ as a `long[4]`:
   
   ```
   private static long[] top4_a(long[] input) {
   long[] top4 = new long[4];
   Arrays.fill(top4, Long.MIN_VALUE);
   for (long elem : input) {
   if (elem > top4[3]) {
   if (elem > top4[1]) {
   if (elem > top4[0]) {
   top4[3] = top4[2];
   top4[2] = top4[1];
   top4[1] = top4[0];
   top4[0] = elem;
   } else {
   top4[3] = top4[2];
   top4[2] = top4[1];
   top4[1] = elem;
   }
   } else if (elem > top4[2]) {
   top4[3] = top4[2];
   top4[2] = elem;
   } else {
   top4[3] = elem;
   }
   }
   }
   
   return top4;
   }
   ```
   
   Second approach is the same thing, use local variables for the four slots 
instead of `long[]`:
   
   ```
   private static long[] top4_b(long[] input) {
   long first = Long.MIN_VALUE;
   long second = Long.MIN_VALUE;
   long third = Long.MIN_VALUE;
   long forth = Long.MIN_VALUE;
   for (long elem : input) {
   if (elem > forth) {
   if (elem > second) {
   if (elem > first) {
   forth = third;
   third = second;
   second = first;
   first = elem;
   } else {
   forth = third;
   third = second;
   second = elem;
   }
   } else if (elem > third) {
   forth = third;
   third = elem;
   } else {
   forth = elem;
   }
   }
   }
   
   return new long[] {first, second, third, forth};
   }
   ```
   
   Last approach just uses `Arrays.sort` (like here):
   
   ```
   private static long[] top4_c(long[] input) {
   long[] top4 = new long[4];
   Arrays.fill(top4, Long.MIN_VALUE);
   for (long elem : input) {
   if (elem > top4[0]) {
   top4[0] = elem;
   Arrays.sort(top4);
   }
   }
   
   for (int i = 0; i < 2; i++) {
   // swap   

[jira] [Commented] (LUCENE-9051) Implement random access seeks in IndexedDISI (DocValues)

2019-11-18 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976571#comment-16976571
 ] 

Michael Sokolov commented on LUCENE-9051:
-

Sure, but I'd like to understand the rationale for forking. It seems like we'd 
end up with a lot of unnecessary code duplication. Why do we see implementing 
{{DocIdSetIterator}} as preventing a class from *also* implementing random 
access, as {{DocValuesIterator}} seems to offer with its {{advanceExact}}?
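
The contract difference being debated can be made concrete with a toy sketch (class and method names are assumptions, not IndexedDISI code): a forward-only advanceExact next to an unconstrained random-access lookup over the same membership data:

```java
/**
 * Toy sketch of the API question above: advanceExact enforces the
 * forward-only contract that AssertingNumericDocValues checks in tests,
 * while randomAccessExact drops the ordering requirement and so permits
 * rewinding to earlier docids.
 */
public class DisiContractDemo {
  private final java.util.BitSet docsWithValue;
  private int doc = -1;

  DisiContractDemo(java.util.BitSet docsWithValue) {
    this.docsWithValue = docsWithValue;
  }

  /** Forward-only: callers must pass strictly increasing targets. */
  boolean advanceExact(int target) {
    if (target <= doc) {
      throw new IllegalArgumentException("backwards advance: " + target);
    }
    doc = target;
    return docsWithValue.get(target);
  }

  /** Random access: any target, including earlier ones, is allowed. */
  boolean randomAccessExact(int target) {
    doc = target; // no ordering requirement: rewinding is legal
    return docsWithValue.get(target);
  }

  public static void main(String[] args) {
    java.util.BitSet bits = new java.util.BitSet();
    bits.set(3);
    bits.set(7);
    DisiContractDemo it = new DisiContractDemo(bits);
    System.out.println(it.advanceExact(3));      // true
    System.out.println(it.randomAccessExact(7)); // true
    System.out.println(it.randomAccessExact(3)); // true: rewind allowed here
  }
}
```

Relaxing the API guarantee amounts to promising the second behavior from every implementation, which is the cost being weighed.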

> Implement random access seeks in IndexedDISI (DocValues)
> 
>
> Key: LUCENE-9051
> URL: https://issues.apache.org/jira/browse/LUCENE-9051
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In LUCENE-9004 we have a use case for random-access seeking in DocValues, 
> which currently only support forward-only iteration (with efficient 
> skipping). One idea there was to write an entirely new format to cover these 
> cases. While looking into that, I noticed that our current DocValues 
> addressing implementation, {{IndexedDISI}}, already has a pretty good basis 
> for providing random accesses. I worked up a patch that does that; we already 
> have the ability to jump to a block, thanks to the jump-tables added last 
> year by [~toke]; the patch uses that, and/or rewinds the iteration within 
> current block as needed.
> I did a very simple performance test, comparing forward-only iteration with 
> random seeks, and in my test I saw no difference, but that can't be right, so 
> I wonder if we have a more thorough performance test of DocValues somewhere 
> that I could repurpose. Probably I'll go back and dig into the issue where we 
> added the jump tables - I seem to recall some testing was done then.
> Aside from performance testing the implementation, there is the question of 
> whether we should alter our API guarantees in this way. This might be controversial, 
> I don't know the history or all the reasoning behind the way it is today. We 
> provide {{advanceExact}} and some implementations support docids going 
> backwards, others don't.  {{AssertingNumericDocValues.advanceExact}} does  
> enforce forward-iteration (in tests); what would the consequence be of 
> relaxing that? We'd then open ourselves up to requiring all DV impls to 
> support random access. Are there other impls to worry about though? I'm not 
> sure. I'd appreciate y'all's input on this one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2019-11-18 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976542#comment-16976542
 ] 

Tomoko Uchida commented on LUCENE-9004:
---

Thanks for mentioning, I have been working on this issue for a couple of weeks and 
here is my WIP/PoC branch (it's not a PR yet, because the "Query" part is still 
missing).
 [https://github.com/mocobeta/lucene-solr-mirror/commits/jira/LUCENE-9004-aknn]

I borrowed [~sokolov]'s idea but took a different implementation approach:
 - Introduce a new codec (Format, Writer, and Reader) for the graph part. The new 
{{GraphFormat}} can express a multi-level (document) graph.
 - Introduce a new doc values field type for the vector part. The new 
{{VectorDocValues}} shares the same codec as BinaryDocValues but provides 
special functionality for dense vector handling: encoding/decoding a float 
array to/from a binary value, keeping the number of dimensions and the distance 
function, and allowing random access to the underlying binary doc values. (For 
now I just reset IndexedDISI when seeking backwards.)

It works but there are indexing performance concerns (due to costly graph 
construction). Anyway I hope I can create a PR with working examples before 
long...
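
The float-array-to-binary conversion described above can be sketched in isolation. This is a hypothetical standalone codec for illustration only, not the code on the linked branch; the layout (big-endian floats, no header) is an assumption:

```java
import java.nio.ByteBuffer;

// Round-trip a dense float vector through a byte[] payload, the kind of
// conversion a BinaryDocValues-backed vector field needs. The layout here
// (big-endian, no dimension header) is an assumption for illustration.
final class VectorCodec {

  static byte[] encode(float[] vector) {
    ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES);
    for (float v : vector) {
      buf.putFloat(v);
    }
    return buf.array();
  }

  static float[] decode(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    float[] vector = new float[bytes.length / Float.BYTES];
    for (int i = 0; i < vector.length; i++) {
      vector[i] = buf.getFloat();
    }
    return vector;
  }
}
```

A real field type would also record the number of dimensions and the distance function, as described above.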

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
> Attachments: hnsw_layered_graph.png
>
>
> "Semantic" search based on machine-learned vector "embeddings" representing 
> terms, queries and documents is becoming a must-have feature for a modern 
> search engine. SOLR-12890 is exploring various approaches to this, including 
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers 
> have found an approach based on navigating a graph that partially encodes the 
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as 
> compared to exact nearest neighbor calculations) at a reasonable cost. This 
> issue will explore implementing HNSW (hierarchical navigable small-world) 
> graphs for the purpose of approximate nearest vector search (often referred 
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a 
> graph that has a partial encoding of the nearest neighbor relation, with some 
> short and some long-distance links. If this graph is built in the right way 
> (has the hierarchical navigable small world property), then you can 
> efficiently traverse it to find nearest neighbors (approximately) in log N 
> time where N is the number of nodes in the graph. I believe this idea was 
> pioneered in  [1]. The great insight in that paper is that if you use the 
> graph search algorithm to find the K nearest neighbors of a new document 
> while indexing, and then link those neighbors (undirectedly, ie both ways) to 
> the new document, then the graph that emerges will have the desired 
> properties.
> The implementation I propose for Lucene is as follows. We need two new data 
> structures to encode the vectors and the graph. We can encode vectors using a 
> light wrapper around {{BinaryDocValues}} (we also want to encode the vector 
> dimension and have efficient conversion from bytes to floats). For the graph 
> we can use {{SortedNumericDocValues}} where the values we encode are the 
> docids of the related documents. Encoding the interdocument relations using 
> docids directly will make it relatively fast to traverse the graph since we 
> won't need to lookup through an id-field indirection. This choice limits us 
> to building a graph-per-segment since it would be impractical to maintain a 
> global graph for the whole index in the face of segment merges. However 
> graph-per-segment is very natural at search time - we can traverse each 
> segments' graph independently and merge results as we do today for term-based 
> search.
> At index time, however, merging graphs is somewhat challenging. While 
> indexing we build a graph incrementally, performing searches to construct 
> links among neighbors. When merging segments we must construct a new graph 
> containing elements of all the merged segments. Ideally we would somehow 
> preserve the work done when building the initial graphs, but at least as a 
> start I'd propose we construct a new graph from scratch when merging. The 
> process is going to be  limited, at least initially, to graphs that can fit 
> in RAM since we require random access to the entire graph while constructing 
> it: In order to add links bidirectionally we must continually update existing 
> documents.
> I think we want to express this API to users as a single joint 
> 
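
The graph traversal sketched in the description amounts to a greedy descent: start at an entry node and repeatedly hop to whichever neighbor is closest to the query, until no neighbor improves on the current node. Below is a minimal single-layer illustration with squared-Euclidean distance; the class and method names are invented, and a real HNSW implementation layers this over multiple scales and keeps a beam of candidates rather than a single one:

```java
// Minimal single-layer greedy nearest-neighbor search over an adjacency
// list, in the spirit of the graph descent described above. The graph
// layout, distance function, and names are illustrative assumptions.
final class GreedyGraphSearch {

  // neighbors[i] holds the node ids linked to node i
  static int search(int entryPoint, float[] query, float[][] vectors, int[][] neighbors) {
    int current = entryPoint;
    float currentDist = distance(query, vectors[current]);
    boolean improved = true;
    while (improved) {
      improved = false;
      for (int cand : neighbors[current]) {
        float d = distance(query, vectors[cand]);
        if (d < currentDist) { // move to a strictly closer neighbor
          currentDist = d;
          current = cand;
          improved = true;
        }
      }
    }
    return current;
  }

  // squared Euclidean distance (monotonic in the true distance)
  static float distance(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }
}
```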

[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well

2019-11-18 Thread Lucene/Solr QA (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976541#comment-16976541
 ] 

Lucene/Solr QA commented on LUCENE-9036:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 32m 
44s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 84m 
59s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9036 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986052/LUCENE-9036.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP 
Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 0857bb6 |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
|  Test Results | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/235/testReport/ |
| modules | C: lucene/core solr/core U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/235/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> ExitableDirectoryReader to interrupt DocValues as well
> --
>
> Key: LUCENE-9036
> URL: https://issues.apache.org/jira/browse/LUCENE-9036
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch, 
> LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch
>
>
> This allows making AnalyticsComponent and json.facet sensitive to the time 
> allowed. 
> Does it make sense? Is it enough to check on DV creation, i.e. per 
> field/segment, or is it worth checking every Nth doc? 






[jira] [Updated] (SOLR-13892) Add postfilter support to {!join} queries

2019-11-18 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-13892:
---
Attachment: SOLR-13892.patch
Status: Open  (was: Open)

Latest patch refactors some code so that the Collector implementations used by 
the postfilter-join live as their own classes in the 
"org.apache.solr.search.join" package, and the remainder of the postfilter 
logic is moved into JoinQParserPlugin.  Also adds tests into the existing 
org.apache.solr.TestJoin.

Still todo:
* performance comparison between "score=none" method and join postfilter
* clarify ref-guide documentation on different join options and their 
limitations
* minor cleanup.

> Add postfilter support to {!join} queries
> -
>
> Key: SOLR-13892
> URL: https://issues.apache.org/jira/browse/SOLR-13892
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers
>Affects Versions: master (9.0)
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
> Attachments: SOLR-13892.patch, SOLR-13892.patch
>
>
> The JoinQParserPlugin would be a lot more performant in many use-cases if it could 
> operate as a post-filter, especially when doc-values for the involved fields 
> are available.
> With this issue, I'd like to propose a post-filter implementation for the 
> {{join}} qparser.






[jira] [Assigned] (SOLR-13892) Add postfilter support to {!join} queries

2019-11-18 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski reassigned SOLR-13892:
--

Assignee: Jason Gerlowski

> Add postfilter support to {!join} queries
> -
>
> Key: SOLR-13892
> URL: https://issues.apache.org/jira/browse/SOLR-13892
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers
>Affects Versions: master (9.0)
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
> Attachments: SOLR-13892.patch
>
>
> The JoinQParserPlugin would be a lot more performant in many use-cases if it could 
> operate as a post-filter, especially when doc-values for the involved fields 
> are available.
> With this issue, I'd like to propose a post-filter implementation for the 
> {{join}} qparser.






[jira] [Commented] (LUCENE-9051) Implement random access seeks in IndexedDISI (DocValues)

2019-11-18 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976529#comment-16976529
 ] 

Adrien Grand commented on LUCENE-9051:
--

I'm assuming we started by looking at doc-value formats in LUCENE-9004 because 
that was convenient, but eventually we'd have a separate XXXFormat abstraction 
instead? Then you could fork IndexedDISI to not implement DocIdSetIterator 
anymore (which is the reason why it enforces sequential access), support 
backward access too, and use it as an implementation detail of XXXFormat?

> Implement random access seeks in IndexedDISI (DocValues)
> 
>
> Key: LUCENE-9051
> URL: https://issues.apache.org/jira/browse/LUCENE-9051
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In LUCENE-9004 we have a use case for random-access seeking in DocValues, 
> which currently only support forward-only iteration (with efficient 
> skipping). One idea there was to write an entirely new format to cover these 
> cases. While looking into that, I noticed that our current DocValues 
> addressing implementation, {{IndexedDISI}}, already has a pretty good basis 
> for providing random accesses. I worked up a patch that does that; we already 
> have the ability to jump to a block, thanks to the jump-tables added last 
> year by [~toke]; the patch uses that, and/or rewinds the iteration within 
> current block as needed.
> I did a very simple performance test, comparing forward-only iteration with 
> random seeks, and in my test I saw no difference, but that can't be right, so 
> I wonder if we have a more thorough performance test of DocValues somewhere 
> that I could repurpose. Probably I'll go back and dig into the issue where we 
> added the jump tables - I seem to recall some testing was done then.
> Aside from performance testing the implementation, there is the question of 
> whether we should alter our API guarantees in this way. This might be controversial, 
> I don't know the history or all the reasoning behind the way it is today. We 
> provide {{advanceExact}} and some implementations support docids going 
> backwards, others don't.  {{AssertingNumericDocValues.advanceExact}} does  
> enforce forward-iteration (in tests); what would the consequence be of 
> relaxing that? We'd then open ourselves up to requiring all DV impls to 
> support random access. Are there other impls to worry about though? I'm not 
> sure. I'd appreciate y'all's input on this one.






[GitHub] [lucene-solr] jpountz commented on a change in pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
jpountz commented on a change in pull request #973: LUCENE-9027: Use SIMD 
instructions to decode postings.
URL: https://github.com/apache/lucene-solr/pull/973#discussion_r347372434
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/codecs/lucene84/PForUtil.java
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.codecs.lucene84;
+
+import java.io.IOException;
+import java.util.Arrays;
+
+import org.apache.lucene.store.DataInput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.packed.PackedInts;
+
+/**
+ * Utility class to encode sequences of 128 small positive integers.
+ */
+final class PForUtil {
+
+  static boolean allEqual(long[] l) {
+for (int i = 1; i < ForUtil.BLOCK_SIZE; ++i) {
+  if (l[i] != l[0]) {
+return false;
+  }
+}
+return true;
+  }
+
+  private final ForUtil forUtil;
+
+  PForUtil(ForUtil forUtil) {
+this.forUtil = forUtil;
+  }
+
+  /**
+   * Encode 128 integers from {@code longs} into {@code out}.
+   */
+  void encode(long[] longs, DataOutput out) throws IOException {
+// At most 3 exceptions
+final long[] top4 = new long[4];
+Arrays.fill(top4, -1L);
+for (int i = 0; i < ForUtil.BLOCK_SIZE; ++i) {
+  if (longs[i] > top4[0]) {
+top4[0] = longs[i];
+Arrays.sort(top4); // For only 4 entries we just sort on every 
iteration instead of maintaining a PQ
 
 Review comment:
   Yes, I would be surprised if that made a difference for 4 slots.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.

2019-11-18 Thread GitBox
jpountz commented on a change in pull request #973: LUCENE-9027: Use SIMD 
instructions to decode postings.
URL: https://github.com/apache/lucene-solr/pull/973#discussion_r347371932
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/store/ByteBufferIndexInput.java
 ##
 @@ -107,6 +118,28 @@ public final void readBytes(byte[] b, int offset, int 
len) throws IOException {
 }
   }
 
+  @Override
+  public void readLELongs(long[] dst, int offset, int length) throws 
IOException {
+if (curLongBufferViews == null) {
+  // Lazy init to avoid paying the memory and initialization cost if you 
don't need to read arrays of longs
 
 Review comment:
   > under the hood the CPU must still do unaligned long decodes (which I think 
modern X86-64 are good at)? Maybe add a comment about why this trick is 
worthwhile?
   
   This is mostly a workaround for the lack of ByteBuffer#getLongs(long[]). 
Avoiding the LongBuffer view would require calling ByteBuffer#getLong in a 
loop, which seems to have some per-long overhead according to the 
microbenchmarks I ran. I'll leave a comment.
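
The LongBuffer-view trick under discussion can be shown in isolation: rather than calling getLong once per value, take a little-endian LongBuffer view of the region and do one bulk get. This is an illustrative sketch, not the actual ByteBufferIndexInput code, and it assumes the source position is 8-byte aligned:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

// Decode little-endian longs through a LongBuffer view of the source buffer,
// working around the lack of a bulk ByteBuffer#getLong(long[]) method.
// Illustrative only; assumes the current position is 8-byte aligned.
final class LongDecode {

  static void readLELongs(ByteBuffer src, long[] dst, int offset, int length) {
    // The view starts at src's current position and covers its remaining bytes.
    LongBuffer view = src.order(ByteOrder.LITTLE_ENDIAN).asLongBuffer();
    view.get(dst, offset, length);                        // one bulk copy
    src.position(src.position() + length * Long.BYTES);   // keep src in sync
  }
}
```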





[jira] [Commented] (SOLR-13647) CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976524#comment-16976524
 ] 

Jan Høydahl commented on SOLR-13647:


Updated announcement with CVE number published to solr-user@ general@ and 
announce@apache lists.

> CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default
> ---
>
> Key: SOLR-13647
> URL: https://issues.apache.org/jira/browse/SOLR-13647
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 8.1.1
>Reporter: John
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This issue originally had the title "default solr.in.sh contains uncommented 
> lines" and told users to check their {{solr.in.sh}} file. Later we upgraded 
> the severity of this due to risk of remote code execution, acquired a CVE 
> number and issued the public announcement quoted below.
> {noformat}
> CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default
> Severity: High
> Vendor:
> The Apache Software Foundation
> Versions Affected:
> Solr 8.1.1 and 8.2.0 for Linux
> Description: The 8.1.1 and 8.2.0 releases of Apache Solr contain an
> insecure setting for the ENABLE_REMOTE_JMX_OPTS configuration option
> in the default solr.in.sh configuration file shipping with Solr.
> Windows users are not affected.
> If you use the default solr.in.sh file from the affected releases, then
> JMX monitoring will be enabled and exposed on RMI_PORT (default=18983),
> without any authentication. If this port is opened for inbound traffic
> in your firewall, then anyone with network access to your Solr nodes
> will be able to access JMX, which may in turn allow them to upload
> malicious code for execution on the Solr server.
> The vulnerability is already public [1] and mitigation steps were
> announced on project mailing lists and news page [3] on August 14th,
> without mentioning RCE at that time.
> Mitigation:
> Make sure your effective solr.in.sh file has ENABLE_REMOTE_JMX_OPTS set
> to 'false' on every Solr node and then restart Solr. Note that the
> effective solr.in.sh file may reside in /etc/defaults/ or another
> location depending on the install. You can then validate that the
> 'com.sun.management.jmxremote*' family of properties are not listed in
> the "Java Properties" section of the Solr Admin UI, or configured in a
> secure way.
> There is no need to upgrade or update any code.
> Remember to follow the Solr Documentation's advice to never expose Solr
> nodes directly in a hostile network environment.
> Credit:
> Matei "Mal" Badanoiu
> Solr JIRA user 'jnyryan' (John)
> References:
> [1] https://issues.apache.org/jira/browse/SOLR-13647
> [3] https://lucene.apache.org/solr/news.html
> {noformat}






[jira] [Updated] (SOLR-13647) CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default

2019-11-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13647:
---
Description: 
This issue originally had the title "default solr.in.sh contains uncommented 
lines" and told users to check their {{solr.in.sh}} file. Later we upgraded the 
severity of this due to risk of remote code execution, acquired a CVE number 
and issued the public announcement quoted below.

{noformat}
CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default

Severity: High

Vendor:
The Apache Software Foundation

Versions Affected:
Solr 8.1.1 and 8.2.0 for Linux

Description: The 8.1.1 and 8.2.0 releases of Apache Solr contain an
insecure setting for the ENABLE_REMOTE_JMX_OPTS configuration option
in the default solr.in.sh configuration file shipping with Solr.

Windows users are not affected.

If you use the default solr.in.sh file from the affected releases, then
JMX monitoring will be enabled and exposed on RMI_PORT (default=18983),
without any authentication. If this port is opened for inbound traffic
in your firewall, then anyone with network access to your Solr nodes
will be able to access JMX, which may in turn allow them to upload
malicious code for execution on the Solr server.

The vulnerability is already public [1] and mitigation steps were
announced on project mailing lists and news page [3] on August 14th,
without mentioning RCE at that time.

Mitigation:
Make sure your effective solr.in.sh file has ENABLE_REMOTE_JMX_OPTS set
to 'false' on every Solr node and then restart Solr. Note that the
effective solr.in.sh file may reside in /etc/defaults/ or another
location depending on the install. You can then validate that the
'com.sun.management.jmxremote*' family of properties are not listed in
the "Java Properties" section of the Solr Admin UI, or configured in a
secure way.

There is no need to upgrade or update any code.

Remember to follow the Solr Documentation's advice to never expose Solr
nodes directly in a hostile network environment.

Credit:
Matei "Mal" Badanoiu
Solr JIRA user 'jnyryan' (John)

References:
[1] https://issues.apache.org/jira/browse/SOLR-13647
[3] https://lucene.apache.org/solr/news.html
{noformat}


  was:
default version of this file should be completely commented

ENABLE_REMOTE_JMX_OPTS had defaults

   Priority: Major  (was: Trivial)
Summary: CVE-2019-12409: Apache Solr RCE vulnerability due to bad 
config default  (was: default solr.in.sh contains uncommented lines)

> CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default
> ---
>
> Key: SOLR-13647
> URL: https://issues.apache.org/jira/browse/SOLR-13647
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 8.1.1
>Reporter: John
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This issue originally had the title "default solr.in.sh contains uncommented 
> lines" and told users to check their {{solr.in.sh}} file. Later we upgraded 
> the severity of this due to risk of remote code execution, acquired a CVE 
> number and issued the public announcement quoted below.
> {noformat}
> CVE-2019-12409: Apache Solr RCE vulnerability due to bad config default
> Severity: High
> Vendor:
> The Apache Software Foundation
> Versions Affected:
> Solr 8.1.1 and 8.2.0 for Linux
> Description: The 8.1.1 and 8.2.0 releases of Apache Solr contain an
> insecure setting for the ENABLE_REMOTE_JMX_OPTS configuration option
> in the default solr.in.sh configuration file shipping with Solr.
> Windows users are not affected.
> If you use the default solr.in.sh file from the affected releases, then
> JMX monitoring will be enabled and exposed on RMI_PORT (default=18983),
> without any authentication. If this port is opened for inbound traffic
> in your firewall, then anyone with network access to your Solr nodes
> will be able to access JMX, which may in turn allow them to upload
> malicious code for execution on the Solr server.
> The vulnerability is already public [1] and mitigation steps were
> announced on project mailing lists and news page [3] on August 14th,
> without mentioning RCE at that time.
> Mitigation:
> Make sure your effective solr.in.sh file has ENABLE_REMOTE_JMX_OPTS set
> to 'false' on every Solr node and then restart Solr. Note that the
> effective solr.in.sh file may reside in /etc/defaults/ or another
> location depending on the install. You can then validate that the
> 'com.sun.management.jmxremote*' family of properties are not listed in
> the "Java Properties" section of the Solr Admin UI, or configured in a
> secure way.
> There is no need to upgrade or update any code.
> Remember to follow the Solr Documentation's advice to never expose 

[jira] [Resolved] (SOLR-13849) TestSnapshotCloudManager.testSimulatorFromSnapshot failure due to running trigger

2019-11-18 Thread Andrzej Bialecki (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki resolved SOLR-13849.
-
Resolution: Fixed

> TestSnapshotCloudManager.testSimulatorFromSnapshot failure due to running 
> trigger
> -
>
> Key: SOLR-13849
> URL: https://issues.apache.org/jira/browse/SOLR-13849
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Andrzej Bialecki
>Priority: Major
>
> recent jenkins failure from 
> TestSnapshotCloudManager.testSimulatorFromSnapshot suggests that the problem 
> is atemping to compare the ZK tree from the snapshot with the ZK tree of the 
> running system while a trigger has fired creating an ephemeral node...
> {noformat}
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestSnapshotCloudManager -Dtests.method=testSimulatorFromSnapshot 
> -Dtests.seed=275551978EB
> 3DD11 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=fr-MC 
> -Dtests.timezone=Asia/Jakarta -Dtests.asserts=true 
> -Dtests.file.encoding=ISO-8859-1
>[junit4] FAILURE 0.17s J2 | 
> TestSnapshotCloudManager.testSimulatorFromSnapshot <<<
>[junit4]> Throwable #1: java.lang.AssertionError: expected:<[ ... ]> 
> but was:<[ ... , /autoscaling/events/.scheduled_maintenance/qn-00, 
> ... ]>
>[junit4]>at 
> __randomizedtesting.SeedInfo.seed([275551978EB3DD11:491E5C4CE071E7FD]:0)
>[junit4]>at 
> org.apache.solr.cloud.autoscaling.sim.TestSnapshotCloudManager.assertDistribStateManager(TestSnapshotCloudManager.java:241)
>[junit4]>at 
> org.apache.solr.cloud.autoscaling.sim.TestSnapshotCloudManager.testSimulatorFromSnapshot(TestSnapshotCloudManager.java:157)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>at 
> java.base/java.lang.reflect.Method.invoke(Method.java:564)
>[junit4]>at java.base/java.lang.Thread.run(Thread.java:830)
> {noformat}






[GitHub] [lucene-solr] msokolov opened a new pull request #1017: LUCENE-9051: random access for IndexedDISI

2019-11-18 Thread GitBox
msokolov opened a new pull request #1017: LUCENE-9051: random access for 
IndexedDISI
URL: https://github.com/apache/lucene-solr/pull/1017
 
 
   # Description
   Enables random access over DocValues via its existing advanceExact method. 
With this change, callers may provide docids in any order rather than having to 
enforce that they never decrease over subsequent calls to the iterator API.
   





[jira] [Created] (LUCENE-9051) Implement random access seeks in IndexedDISI (DocValues)

2019-11-18 Thread Michael Sokolov (Jira)
Michael Sokolov created LUCENE-9051:
---

 Summary: Implement random access seeks in IndexedDISI (DocValues)
 Key: LUCENE-9051
 URL: https://issues.apache.org/jira/browse/LUCENE-9051
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael Sokolov


In LUCENE-9004 we have a use case for random-access seeking in DocValues, which 
currently only supports forward-only iteration (with efficient skipping). One 
idea there was to write an entirely new format to cover these cases. While 
looking into that, I noticed that our current DocValues addressing 
implementation, {{IndexedDISI}}, already has a pretty good basis for providing 
random access. I worked up a patch that does that; we already have the 
ability to jump to a block, thanks to the jump-tables added last year by 
[~toke]; the patch uses that, and/or rewinds the iteration within the current 
block as needed.

I did a very simple performance test, comparing forward-only iteration with 
random seeks, and in my test I saw no difference, but that can't be right, so I 
wonder if we have a more thorough performance test of DocValues somewhere that I 
could repurpose. Probably I'll go back and dig into the issue where we added 
the jump tables - I seem to recall some testing was done then.

Aside from performance testing the implementation, there is the question of 
whether we should alter our API guarantees in this way. This might be 
controversial; I don't know the history or all the reasoning behind the way it 
is today. We provide {{advanceExact}}, and some implementations support docids 
going backwards while others don't. {{AssertingNumericDocValues.advanceExact}} 
does enforce forward-iteration (in tests); what would the consequence be of 
relaxing that? We'd then open ourselves up to requiring all DV impls to support 
random access. Are there other impls to worry about though? I'm not sure. I'd 
appreciate y'all's input on this one.
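
As a rough illustration of the contract question, this is the shape of check an asserting test-time wrapper performs (names here are hypothetical, not the real {{AssertingNumericDocValues}}); relaxing the API guarantee would essentially amount to deleting this check:

```java
public class Main {
    /** Illustrative single-method stand-in for a DocValues-like iterator. */
    interface Values {
        boolean advanceExact(int target);
    }

    /** Test-time wrapper enforcing the forward-only contract on advanceExact targets. */
    static class AssertingValues implements Values {
        private final Values in;
        private int lastTarget = -1;
        AssertingValues(Values in) { this.in = in; }
        public boolean advanceExact(int target) {
            // Forward-only contract: targets must never decrease across calls.
            if (target < lastTarget) {
                throw new AssertionError("backwards advanceExact: " + target + " < " + lastTarget);
            }
            lastTarget = target;
            return in.advanceExact(target);
        }
    }

    public static void main(String[] args) {
        Values inner = target -> true; // trivial impl that "has" every doc
        Values checked = new AssertingValues(inner);
        assert checked.advanceExact(3);
        assert checked.advanceExact(7);
        boolean rejected = false;
        try {
            checked.advanceExact(5); // backwards target: rejected under the current contract
        } catch (AssertionError e) {
            rejected = true;
        }
        System.out.println(rejected ? "backwards call rejected" : "backwards call allowed");
    }
}
```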






[jira] [Commented] (SOLR-13934) Documentation on SimplePostTool for Windows users is pretty brief

2019-11-18 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976466#comment-16976466
 ] 

Noble Paul commented on SOLR-13934:
---

I agree that we should get rid of these things from Solr. We're managing too 
much code; we have to start removing stuff from our codebase. {{curl}} is what 
we should show in our examples.

> Documentation on SimplePostTool for Windows users is pretty brief
> -
>
> Key: SOLR-13934
> URL: https://issues.apache.org/jira/browse/SOLR-13934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SimplePostTool
>Affects Versions: 8.3
>Reporter: David Eric Pugh
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> SimplePostTool on Windows doesn't have enough documentation; you end up 
> googling to get it to work. Need to provide a better example.
> https://lucene.apache.org/solr/guide/8_3/post-tool.html#simpleposttool






[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2019-11-18 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976461#comment-16976461
 ] 

Michael Sokolov commented on LUCENE-9004:
-

In the meantime, [~tomoko] posted [this 
PR|https://github.com/mocobeta/lucene-solr-mirror/commit/5fb93287fe98f3e427e588f871e1f114d8da0dfa]
 adding HNSW!

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
> Attachments: hnsw_layered_graph.png
>
>
> "Semantic" search based on machine-learned vector "embeddings" representing 
> terms, queries and documents is becoming a must-have feature for a modern 
> search engine. SOLR-12890 is exploring various approaches to this, including 
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers 
> have found an approach based on navigating a graph that partially encodes the 
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as 
> compared to exact nearest neighbor calculations) at a reasonable cost. This 
> issue will explore implementing HNSW (hierarchical navigable small-world) 
> graphs for the purpose of approximate nearest vector search (often referred 
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a 
> graph that has a partial encoding of the nearest neighbor relation, with some 
> short and some long-distance links. If this graph is built in the right way 
> (has the hierarchical navigable small world property), then you can 
> efficiently traverse it to find nearest neighbors (approximately) in log N 
> time where N is the number of nodes in the graph. I believe this idea was 
> pioneered in  [1]. The great insight in that paper is that if you use the 
> graph search algorithm to find the K nearest neighbors of a new document 
> while indexing, and then link those neighbors (undirectedly, ie both ways) to 
> the new document, then the graph that emerges will have the desired 
> properties.
> The implementation I propose for Lucene is as follows. We need two new data 
> structures to encode the vectors and the graph. We can encode vectors using a 
> light wrapper around {{BinaryDocValues}} (we also want to encode the vector 
> dimension and have efficient conversion from bytes to floats). For the graph 
> we can use {{SortedNumericDocValues}} where the values we encode are the 
> docids of the related documents. Encoding the interdocument relations using 
> docids directly will make it relatively fast to traverse the graph since we 
> won't need to lookup through an id-field indirection. This choice limits us 
> to building a graph-per-segment since it would be impractical to maintain a 
> global graph for the whole index in the face of segment merges. However 
> graph-per-segment is a very natural at search time - we can traverse each 
> segments' graph independently and merge results as we do today for term-based 
> search.
> At index time, however, merging graphs is somewhat challenging. While 
> indexing we build a graph incrementally, performing searches to construct 
> links among neighbors. When merging segments we must construct a new graph 
> containing elements of all the merged segments. Ideally we would somehow 
> preserve the work done when building the initial graphs, but at least as a 
> start I'd propose we construct a new graph from scratch when merging. The 
> process is going to be  limited, at least initially, to graphs that can fit 
> in RAM since we require random access to the entire graph while constructing 
> it: In order to add links bidirectionally we must continually update existing 
> documents.
> I think we want to express this API to users as a single joint 
> {{KnnGraphField}} abstraction that joins together the vectors and the graph 
> as a single joint field type. Mostly it just looks like a vector-valued 
> field, but has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many 
> nocommits, basic design is not really set, there is no Query implementation 
> and no integration iwth IndexSearcher, but it does work by some measure using 
> a standalone test class. I've tested with uniform random vectors and on my 
> laptop indexed 10K documents in around 10 seconds and searched them at 95% 
> recall (compared with exact nearest-neighbor baseline) at around 250 QPS. I 
> haven't made any attempt to use multithreaded search for this, but it is 
> amenable to per-segment concurrency.
> [1] 
> 
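
The search-then-link construction quoted above can be sketched as a single-layer toy. Real HNSW adds the hierarchy of layers and neighbor pruning/diversification, both omitted here, and all names are illustrative rather than taken from the proposed patch:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

public class Main {
    /** Squared Euclidean distance (monotone in true distance, so fine for ranking). */
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    static class Graph {
        final List<double[]> vectors = new ArrayList<>();
        final List<Set<Integer>> neighbors = new ArrayList<>();

        /** Greedy best-first search from node 0 for (approximately) the k nearest nodes to q. */
        List<Integer> search(double[] q, int k) {
            List<Integer> out = new ArrayList<>();
            if (vectors.isEmpty()) return out;
            boolean[] visited = new boolean[vectors.size()];
            PriorityQueue<Integer> candidates =  // closest first
                new PriorityQueue<>(Comparator.comparingDouble(i -> dist(vectors.get(i), q)));
            PriorityQueue<Integer> best =        // farthest first, capped at k results
                new PriorityQueue<>(Comparator.comparingDouble((Integer i) -> dist(vectors.get(i), q)).reversed());
            candidates.add(0);
            best.add(0);
            visited[0] = true;
            while (!candidates.isEmpty()) {
                int c = candidates.poll();
                // Stop once the closest unexplored candidate is farther than our worst result.
                if (best.size() >= k && dist(vectors.get(c), q) > dist(vectors.get(best.peek()), q)) break;
                for (int nb : neighbors.get(c)) {
                    if (!visited[nb]) {
                        visited[nb] = true;
                        candidates.add(nb);
                        best.add(nb);
                        if (best.size() > k) best.poll();
                    }
                }
            }
            out.addAll(best);
            out.sort(Comparator.comparingDouble(i -> dist(vectors.get(i), q)));
            return out;
        }

        /** Index-time step: search for the new vector's neighbors, then link them bidirectionally. */
        void add(double[] v, int k) {
            List<Integer> nearest = search(v, k);
            int id = vectors.size();
            vectors.add(v);
            neighbors.add(new HashSet<>());
            for (int nb : nearest) {
                neighbors.get(id).add(nb);  // link both ways so the emerging graph stays navigable
                neighbors.get(nb).add(id);
            }
        }
    }

    public static void main(String[] args) {
        Graph g = new Graph();
        double[][] points = {{0}, {1}, {2}, {10}};
        for (double[] p : points) g.add(p, 2);
        assert g.search(new double[] {1.9}, 1).get(0) == 2;
        assert g.search(new double[] {9.0}, 1).get(0) == 3;
        System.out.println("nearest neighbors found");
    }
}
```

Note how indexing and querying share the same search routine; that reuse is the insight the description attributes to [1].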

[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2019-11-18 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976458#comment-16976458
 ] 

Michael Sokolov commented on LUCENE-9004:
-

So after I wrote this:

bq. The DocValues on-disk encoding is not the most efficient for random access; 
it is heavily optimized for efficient forward-only iteration.

I went and looked more closely at the {{Lucene80DocValuesFormat}}, and I think 
I spoke too soon - bad habit! The formats described there actually seem like a 
pretty reasonable basis for exposing random access. Sure, it's not free, but if 
you were to go about implementing a concise data structure for a random-access 
API over non-fully-occupied (ie dense or sparse) per-document data, I don't see 
that it would end up looking a whole lot different to this under the hood. E.g. 
we have jump tables for seeking to the block enclosing a doc, and then jump 
tables within the block (for DENSE encoding). I'll open a separate issue for 
enabling random access in {{IndexedDISI}}.
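
A toy model of the DENSE-style within-block addressing mentioned here (a simplification for illustration, not the actual {{Lucene80DocValuesFormat}} layout): a per-word rank table over the membership bitmap lets us compute, in near-constant time, how many documents with values precede a given docid, which is exactly the index of that doc's value in the block.

```java
public class Main {
    /** Simplified DENSE block: a bitmap of which docs have values, plus a rank table. */
    static class DenseBlock {
        final long[] bits;   // membership bitmap: bit d set => doc d has a value
        final int[] rank;    // rank[w] = number of set bits in words 0..w-1

        DenseBlock(long[] bits) {
            this.bits = bits;
            this.rank = new int[bits.length];
            int sum = 0;
            for (int w = 0; w < bits.length; w++) {
                rank[w] = sum;
                sum += Long.bitCount(bits[w]);
            }
        }

        /** Index of doc's value within the block, or -1 if the doc has no value. */
        int valueIndex(int doc) {
            int word = doc >>> 6;        // jump straight to the enclosing 64-bit word
            long bit = 1L << (doc & 63);
            if ((bits[word] & bit) == 0) return -1;
            // Count set bits strictly below `doc` within its word:
            int within = Long.bitCount(bits[word] & (bit - 1));
            return rank[word] + within;
        }
    }

    public static void main(String[] args) {
        long[] bits = new long[2];           // a 128-doc block
        int[] docs = {0, 3, 64, 70};         // docs that have values
        for (int d : docs) bits[d >>> 6] |= 1L << (d & 63);
        DenseBlock b = new DenseBlock(bits);
        assert b.valueIndex(0) == 0;
        assert b.valueIndex(3) == 1;
        assert b.valueIndex(64) == 2;
        assert b.valueIndex(70) == 3;
        assert b.valueIndex(5) == -1;        // doc 5 has no value
        System.out.println("dense rank lookups ok");
    }
}
```

Because every lookup is a word index plus a popcount, access order doesn't matter - which is what makes this layout a plausible basis for random access.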

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
> Attachments: hnsw_layered_graph.png
>
>

[jira] [Commented] (SOLR-13934) Documentation on SimplePostTool for Windows users is pretty brief

2019-11-18 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976451#comment-16976451
 ] 

Ishan Chattopadhyaya commented on SOLR-13934:
-

I think it is time to get rid of the post tool; it doesn't belong in the 
distribution. It could perhaps go into the proposed dev page (linked from 
somewhere on GitHub). It is important now to get rid of all the cruft that we 
have kept accumulating over the years. I feel similarly about all the examples 
etc., i.e. they don't belong in Solr.

> Documentation on SimplePostTool for Windows users is pretty brief
> -
>
> Key: SOLR-13934
> URL: https://issues.apache.org/jira/browse/SOLR-13934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SimplePostTool
>Affects Versions: 8.3
>Reporter: David Eric Pugh
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[GitHub] [lucene-solr] apoorvprecisely commented on a change in pull request #1015: SOLR-13936 : expose endpoint to support changing schema without collection

2019-11-18 Thread GitBox
apoorvprecisely commented on a change in pull request #1015: SOLR-13936 : 
expose endpoint to support changing schema without collection
URL: https://github.com/apache/lucene-solr/pull/1015#discussion_r347323892
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/handler/CoreLessSolrConfigHandler.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.handler;
+
+import javax.xml.parsers.ParserConfigurationException;
+import java.io.IOException;
+
+import org.apache.solr.api.Command;
+import org.apache.solr.api.EndPoint;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.SolrConfig;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.security.PermissionNameProvider;
+import org.xml.sax.SAXException;
+
+public class CoreLessSolrConfigHandler {
+  private static final String PATH_PREFIX = "/cluster/configset/";
+  private static final String PATH_POSTFIX_CONFIG = "/config";
+  private static final String CONFIG_PREFIX = "/configs/";
+  private final CoreContainer coreContainer;
+  public final Write write = new Write();
+  public final Read read = new Read();
+
+  public CoreLessSolrConfigHandler(CoreContainer coreContainer) {
 
 Review comment:
   changed





[GitHub] [lucene-solr] apoorvprecisely commented on a change in pull request #1015: SOLR-13936 : expose endpoint to support changing schema without collection

2019-11-18 Thread GitBox
apoorvprecisely commented on a change in pull request #1015: SOLR-13936 : 
expose endpoint to support changing schema without collection
URL: https://github.com/apache/lucene-solr/pull/1015#discussion_r347323909
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/handler/CoreLessSchemaHandler.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.handler;
+
+import javax.xml.parsers.ParserConfigurationException;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.solr.api.ApiBag;
+import org.apache.solr.api.Command;
+import org.apache.solr.api.EndPoint;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.ZkConfigManager;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.SolrConfig;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.ManagedIndexSchema;
+import org.apache.solr.schema.SchemaManager;
+import org.apache.solr.security.PermissionNameProvider;
+import org.apache.solr.util.ConfigSetResourceUtil;
+import org.apache.zookeeper.data.Stat;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.xml.sax.InputSource;
+import org.xml.sax.SAXException;
+
+public class CoreLessSchemaHandler {
 
 Review comment:
   changed





[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1011: LUCENE-9031: Just highlight term intervals and its' combinations.

2019-11-18 Thread GitBox
romseygeek commented on a change in pull request #1011: LUCENE-9031: Just 
highlight term intervals and its' combinations.
URL: https://github.com/apache/lucene-solr/pull/1011#discussion_r347312251
 
 

 ##
 File path: 
lucene/queries/src/test/org/apache/lucene/queries/intervals/TestIntervals.java
 ##
 @@ -393,11 +400,16 @@ public void testNesting2() throws IOException {
 assertNull(getMatches(source, 0, "field1"));
 MatchesIterator it = getMatches(source, 1, "field1");
 assertMatch(it, 6, 21, 41, 118);
+assertNull("Conjunction intervals don't yield query so far. dunno y",it.getQuery());
 
 Review comment:
   Actually I think it's fine, because `IntervalWeight.matches()` always wraps 
these to return the correct query - you can just remove this line.





[jira] [Commented] (SOLR-13934) Documentation on SimplePostTool for Windows users is pretty brief

2019-11-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976388#comment-16976388
 ] 

Jan Høydahl commented on SOLR-13934:


I agree with all your thoughts.

Funny, btw, that SimplePostTool was initially created just as a simple post 
tool :) for those without access to cURL etc. But it has gained weight year by 
year, and I'm guilty of adding the web-crawl capability to it. The vision 
behind it was that it should be a dependency-free, slim, easy-to-understand 
example; that's why we never use libraries like SolrJ, only plain JDK classes, 
so anyone could copy/paste code from it if they wish.

Perhaps the time has come to write something from scratch, based on Spring Boot 
and SolrJ or whatnot? A Swiss-army-knife set of production-ready code snippets 
on how to push data to Solr? Something that can be referenced from the new 
Dev-Guide as the proper way to integrate with Solr? Something with unit 
tests... Just some thoughts :)

> Documentation on SimplePostTool for Windows users is pretty brief
> -
>
> Key: SOLR-13934
> URL: https://issues.apache.org/jira/browse/SOLR-13934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SimplePostTool
>Affects Versions: 8.3
>Reporter: David Eric Pugh
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (SOLR-13940) Solr at http://localhost:8984/solr did not come online within 30 seconds!

2019-11-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-13940.

Resolution: Invalid

Hi and welcome to the community! In this project we use Jira only for reporting 
actual bugs, not as a support portal. Therefore I'm closing this Jira now.

You should sign up for the "solr-user" mailing list and ask your question 
there; see [https://lucene.apache.org/solr/community.html#mailing-lists-irc] 
You'll probably get some replies within a short time. Note that if you do not 
subscribe to the list first, you will not receive the replies to your question :)

> Solr at http://localhost:8984/solr did not come online within 30 seconds!
> -
>
> Key: SOLR-13940
> URL: https://issues.apache.org/jira/browse/SOLR-13940
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.2
>Reporter: Jatin Vyas
>Priority: Major
> Attachments: solr-8984-console.log, solr.log
>
>
> I am new to Solr. I downloaded Solr 8.2.0 and was playing with the cloud 
> example.
> Suddenly Solr stopped working and would not start.
> When I run the command on Windows: solr start -e cloud
> it asks for the number of nodes; after I give this information it asks for 
> port numbers, and then Solr does not start on the port and gives the 
> error:
> Solr at [http://localhost:8984/solr] did not come online within 30 seconds!
>  
> I have attached the log file in case it is useful.
>  
> Detail exception is as below.
>  
> h2. HTTP ERROR 404
> Problem accessing /solr/. Reason:
> Not Found
>  
> h3. Caused by:
> javax.servlet.ServletException: javax.servlet.UnavailableException: Error 
> processing the request. CoreContainer is either not initialized or shutting 
> down. at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:168)
>  at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at org.eclipse.jetty.server.Server.handle(Server.java:505) at 
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370) at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267) 
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at 
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>  at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>  at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>  at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>  at 
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>  at java.base/java.lang.Thread.run(Thread.java:830) Caused by: 
> javax.servlet.UnavailableException: Error processing the request. 
> CoreContainer is either not initialized or shutting down. at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369)
>  at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
>  at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540) at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) 
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) 
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
>  at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
>  at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>  at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
>  at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) 
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
>