[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9) - Build # 20721 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20721/
Java: 64bit/jdk-9 -XX:+UseCompressedOops -XX:+UseSerialGC --illegal-access=deny

3 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.TestTolerantUpdateProcessorCloud

Error Message:
Error from server at https://127.0.0.1:41799/solr: create the collection time out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://127.0.0.1:41799/solr: create the collection time out:180s
	at __randomizedtesting.SeedInfo.seed([5E73F2787F24CA88]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
	at org.apache.solr.cloud.TestTolerantUpdateProcessorCloud.createMiniSolrCloudCluster(TestTolerantUpdateProcessorCloud.java:121)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:874)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.base/java.lang.Thread.run(Thread.java:844)


FAILED:  org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test

Error Message:
Timeout occured while waiting response from server at: http://127.0.0.1:38671/f_k

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:38671/f_k
	at __randomizedtesting.SeedInfo.seed([5E73F2787F24CA88:D627CDA2D1D8A770]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:637)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
	at

[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 4246 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4246/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

6 tests failed.
FAILED:  org.apache.solr.cloud.TestDeleteCollectionOnDownNodes.deleteCollectionWithDownNodes

Error Message:
Timed out waiting for leader elections null Live Nodes: [127.0.0.1:56448_solr, 127.0.0.1:56450_solr] Last available state: null

Stack Trace:
java.lang.AssertionError: Timed out waiting for leader elections
null
Live Nodes: [127.0.0.1:56448_solr, 127.0.0.1:56450_solr]
Last available state: null
	at __randomizedtesting.SeedInfo.seed([BD3E31198AD0618A:230B55E1ACF32D02]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.apache.solr.cloud.SolrCloudTestCase.waitForState(SolrCloudTestCase.java:269)
	at org.apache.solr.cloud.TestDeleteCollectionOnDownNodes.deleteCollectionWithDownNodes(TestDeleteCollectionOnDownNodes.java:47)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at

[jira] [Updated] (SOLR-11532) Solr hits NPE when fl only contains DV fields and any of them is a spatial field

2017-10-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11532:

Description: 
Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
- when fl contains a spatial field that is DV + not stored
- when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.

Stacktrace
{code}
2017-10-23 10:28:52,528 [qtp1649011739-67] ERROR HttpSolrCall  - null:java.lang.NullPointerException
	at org.apache.solr.search.SolrDocumentFetcher.decorateDocValueFields(SolrDocumentFetcher.java:525)
	at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:108)
	at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:57)
	at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:126)
	at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:145)
	at org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:89)
	at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:239)
	at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:223)
	at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:330)
	at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
	at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:155)
{code}



  was:
Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
- when fl contains a spatial field that is DV + not stored
- when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.




> Solr hits NPE when fl only contains DV fields and any of them is a spatial field
> ---------------------------------------------------------------------------------
>
> Key: SOLR-11532
> URL: https://issues.apache.org/jira/browse/SOLR-11532
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: spatial
>Affects Versions: 7.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>
> Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
> - when fl contains a spatial field that is DV + not stored
> - when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.
> Stacktrace
> {code}
> 2017-10-23 10:28:52,528 [qtp1649011739-67] ERROR HttpSolrCall  - null:java.lang.NullPointerException
> 	at org.apache.solr.search.SolrDocumentFetcher.decorateDocValueFields(SolrDocumentFetcher.java:525)
> 	at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:108)
> 	at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:57)
> 	at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:126)
> 	at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:145)
> 	at org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:89)
> 	at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:239)
> 	at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:223)
> 	at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:330)
> 	at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
> 	at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:155)
> {code}






[jira] [Updated] (SOLR-11532) Solr hits NPE when fl only contains DV fields and any of them is a spatial field

2017-10-23 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-11532:

Affects Version/s: 7.1
  Component/s: spatial

> Solr hits NPE when fl only contains DV fields and any of them is a spatial field
> ---------------------------------------------------------------------------------
>
> Key: SOLR-11532
> URL: https://issues.apache.org/jira/browse/SOLR-11532
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: spatial
>Affects Versions: 7.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>
> Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
> - when fl contains a spatial field that is DV + not stored
> - when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.






[jira] [Commented] (SOLR-11532) Solr hits NPE when fl only contains DV fields and any of them is a spatial field

2017-10-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216295#comment-16216295
 ] 

David Smiley commented on SOLR-11532:
-

Ouch; this is a serious bug!  If we can have {{useDocValuesAsStored}}=false influence the optimization in SOLR-8344, then that would both fix this bug and be something that makes sense to do regardless.  LLPSF can then set useDocValuesAsStored=false as a default for itself, since the docValues data is not preferable to return -- it has less precision (~1cm).

If we can agree on the above, I think it'd be a separate issue to add a feature to actually allow LLPSF to return the lat & lon from its docValues.  You didn't suggest that, but I thought I'd mention it.

Can you post the NPE stacktrace?

> Solr hits NPE when fl only contains DV fields and any of them is a spatial field
> ---------------------------------------------------------------------------------
>
> Key: SOLR-11532
> URL: https://issues.apache.org/jira/browse/SOLR-11532
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>
> Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
> - when fl contains a spatial field that is DV + not stored
> - when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.






[jira] [Commented] (SOLR-11487) Collection Alias metadata for time partitioned collections

2017-10-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216287#comment-16216287
 ] 

David Smiley commented on SOLR-11487:
-

Fantastic response to my code review, by the way :-)

RE {{aliasMap}}: my suggestion was embarrassing; of course the value side should now be what you have -- just a Map!

bq. Based on your earlier response we won't have a collections API call to set metadata

What's this in reference to?  But you've got me thinking... what if this was a collection-or-alias metadata thing?  That sounds pretty useful/cool from a user/conceptual standpoint.  From a code-details standpoint... maybe this would be no change -- alias metadata goes in one place (aliases.json) whereas collections would theoretically have it in their state.json?  Anyway, I don't want to create extra work for hypothetical features that are not in scope.

RE the ZkStateReader field: naturally we need to save the data in ZK, but that doesn't require Aliases.java to have the field, and it hurts immutability (more on that in a sec).  Couldn't ZkStateReader make this happen (Law of Demeter, perhaps)?

RE immutability: I believe ZkStateReader keeps the Aliases instance up to date via a ZK watcher... so if code doesn't hold a durable reference to Aliases (outside of ZkStateReader) then we're good?  DocCollection is immutable; I think it's consistent for Aliases to follow the same approach; no?  I don't think we want to break with the trend here.  If it were mutable, the caller might not be sure when exactly the ZK interaction happens (hidden behind some innocent method call?).  I get that this is a trade-off, and I think you've articulated the other side well.

RE Collections.EMPTY_MAP:  Okay.

RE Aliases CRUD in ZkStateReader: I like it.

RE TimeOut: nice catch on finding the hot loop!  I recommend not copying 
TimeOut; just add some utility method if wanted.  Classes add more conceptual 
complexity than a static method for a case like this IMO.

> Collection Alias metadata for time partitioned collections
> -----------------------------------------------------------
>
> Key: SOLR-11487
> URL: https://issues.apache.org/jira/browse/SOLR-11487
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
> Attachments: SOLR_11487.patch, SOLR_11487.patch
>
>
> SOLR-11299 outlines an approach to using a collection Alias to refer to a 
> series of collections of a time series. We'll need to store some metadata 
> about these time series collections, such as which field of the document 
> contains the timestamp to route on.
> The current {{/aliases.json}} is a Map with a key {{collection}} which is in 
> turn a Map of alias name strings to a comma delimited list of the collections.
> _If we change the comma delimited list to be another Map to hold the existing 
> list and more stuff, older CloudSolrClient (configured to talk to ZooKeeper) 
> will break_.  Although if it's configured with an HTTP Solr URL then it would 
> not break.  There's also some read/write hassle to worry about -- we may need 
> to continue to read an aliases.json in the older format.
> Alternatively, we could add a new map entry to aliases.json, say, 
> {{collection_metadata}} keyed by alias name?
> Perhaps another very different approach is to attach metadata to the 
> configset in use?






[jira] [Created] (SOLR-11532) Solr hits NPE when fl only contains DV fields and any of them is a spatial field

2017-10-23 Thread Cao Manh Dat (JIRA)
Cao Manh Dat created SOLR-11532:
---

 Summary: Solr hits NPE when fl only contains DV fields and any of them is a spatial field
 Key: SOLR-11532
 URL: https://issues.apache.org/jira/browse/SOLR-11532
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Right now, Solr does not know how to decode the DV value of a LatLonPointSpatialField, so it will hit an NPE when it tries to, for example:
- when fl contains a spatial field that is DV + not stored
- when fl only contains DV fields and any of them is a spatial field (stored + DV), because SOLR-8344 will always use values from DV fields. This seems to be a common case.
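
For reference, here is a minimal SolrJ sketch of a query that would exercise this path. The collection name ("spatial") and the field name ("location", a LatLonPointSpatialField with docValues=true and stored=false) are assumptions for illustration, not details from the issue:

{code}
// Hypothetical reproduction sketch; collection and field names are assumptions.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class SpatialDvNpeRepro {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/spatial").build()) {
      SolrQuery q = new SolrQuery("*:*");
      // fl names only docValues fields, one of them spatial; DocsStreamer then
      // reads them via decorateDocValueFields, the code path that hits the NPE.
      q.setFields("id", "location");
      System.out.println(client.query(q).getResults());
    }
  }
}
{code}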








[JENKINS] Lucene-Solr-7.x-Linux (32bit/jdk1.8.0_144) - Build # 655 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/655/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseG1GC

5 tests failed.
FAILED:  org.apache.solr.cloud.ShardSplitTest.testSplitStaticIndexReplication

Error Message:
Timeout occured while waiting response from server at: http://127.0.0.1:43129

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:43129
	at __randomizedtesting.SeedInfo.seed([CEBEACB4DFAC86A9:84F438B44E04A9AC]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:637)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
	at org.apache.solr.cloud.AbstractFullDistribZkTestBase.createServers(AbstractFullDistribZkTestBase.java:315)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:991)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at

[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated patch that also tests floating-point tf values. We assume computeSlopFactor has the range {{(0 .. 1]}} for testing. This found a leftover buggy float cast in DFR {{I(F)}}, but also a new bug: the Axiomatic model F1 will most likely return NaN values if you use SloppyPhraseQuery! Frequency values < 1 cause its first log to go negative, then the next log to go NaN: the formula is {{1 + log(1 + log(freq))}}. Imagine freq=0.3: this is {{1 + log(1 + -1.2)}} = {{1 + log(-0.2)}} = NaN. If we alter the formula to use {{log(1 + freq)}} then tests pass, but that needs investigation and may not be an appropriate solution, so I marked it AwaitsFix for now.
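
A quick sketch of that arithmetic in plain Java (just the formula from this comment, not the Lucene implementation):

{code}
// Demonstrates the NaN described above for the F1 tf component 1 + ln(1 + ln(freq)).
public class F1NaNDemo {
  static double tf(double freq) {
    return 1 + Math.log(1 + Math.log(freq));
  }

  public static void main(String[] args) {
    System.out.println(tf(2.0)); // ~1.53: fine for freq >= 1
    System.out.println(tf(0.3)); // ln(0.3) ~ -1.2, so 1 + ln(-0.2) = NaN
    // The altered formula mentioned above stays defined for any freq > 0:
    System.out.println(1 + Math.log(1 + Math.log(1 + 0.3)));
  }
}
{code}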

> More sanity testing of similarities
> -----------------------------------
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the similarity is an increasing function of {{freq}} (all other things like DF and length being equal). This sounds like a very reasonable requirement for a similarity, so we should test it in the base similarity test case and maybe move broken similarities to the sandbox?






[JENKINS] Lucene-Solr-7.x-Windows (32bit/jdk1.8.0_144) - Build # 263 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Windows/263/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseG1GC

11 tests failed.
FAILED:  org.apache.solr.cloud.CollectionsAPISolrJTest.testBalanceShardUnique

Error Message:
Error from server at http://127.0.0.1:61769/solr: Timed out waiting to see all replicas: [balancedProperties_shard1_replica_n1, balancedProperties_shard1_replica_n2, balancedProperties_shard2_replica_n3, balancedProperties_shard2_replica_n4] in cluster state.

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:61769/solr: Timed out waiting to see all replicas: [balancedProperties_shard1_replica_n1, balancedProperties_shard1_replica_n2, balancedProperties_shard2_replica_n3, balancedProperties_shard2_replica_n4] in cluster state.
	at __randomizedtesting.SeedInfo.seed([CDDF174294045BD3:8567A10F2EA4F310]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
	at org.apache.solr.cloud.CollectionsAPISolrJTest.testBalanceShardUnique(CollectionsAPISolrJTest.java:389)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at

[jira] [Commented] (SOLR-11487) Collection Alias metadata for time partitioned collections

2017-10-23 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216223#comment-16216223
 ] 

Gus Heck commented on SOLR-11487:
-

Thx for the review Dave.

I'll start in on some of the fix-ups. Here are some of the whys for what I did; I can probably be talked out of anything here, but this is what I was thinking... Let me know what you think.

*Fix List*
* Will add a test for the alias-removal case and fix it.
* docs: yes of course, good point :)
* names: sure, sounds fine :)
* I had initially started supporting lists and then decided to axe that until discussion. I will move back to  from ..



*Warnings*: this gets difficult because we don't really *have* type safety anymore...

We have a Map with two keys, "collection" and "collection_metadata", and the value types for these two keys don't match. The former is a Map<String,String> and the latter is a Map<String,Map<String,String>>; String and Map are not convertible types, so one use case or the other won't compile... unless you back off from type safety. To achieve type safety we either need to keep two separate maps, or we need to serialize an actual object hierarchy rather than collection classes.
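
A tiny sketch of that shape problem (hypothetical types, mirroring the aliases.json layout described in this issue):

{code}
import java.util.Map;

// Illustrative only: the two aliases.json entries have incompatible value types.
class AliasesJsonShape {
  Map<String, String> collection;                       // alias -> comma-delimited collections
  Map<String, Map<String, String>> collectionMetadata;  // alias -> metadata map

  @SuppressWarnings("unchecked")
  void readBoth(Map<String, Object> aliasesJson) {
    // Holding both under Map<String, Object> forces unchecked casts,
    // so the compiler can no longer verify either side.
    Map<String, String> aliases =
        (Map<String, String>) aliasesJson.get("collection");
    Map<String, Map<String, String>> meta =
        (Map<String, Map<String, String>>) aliasesJson.get("collection_metadata");
  }
}
{code}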



*The ZkStateReader field* is necessary because we need to get the metadata back to ZooKeeper. Based on your earlier response we won't have a collections API call to set metadata, so we need to have a ZkStateReader somehow or nothing gets written to ZooKeeper. In the case of cloneWithCollectionAlias, yeah, that can be eliminated there; good catch. I can add an overload for the signature you suggest as well, for the case where both are to be updated at the same time, but WRT removing ZkStateReader entirely, see the comments about immutability below...



*Immutability* is nice of course, and great for things that are immutable or only held for a short duration, but once you have a long-held reference and the underlying data is actually mutable, it gets difficult to be sure nobody is retaining a reference to a stale copy every time a change is made... A little digging reveals that CoreContainer has a ZkContainer, which has a ZkController, which has a ZkStateReader, which therefore holds onto an immutable copy of Aliases for a difficult-to-determine time frame.

The existing cloning/immutability scheme therefore worries me. It seems like it would make more sense for Aliases to function as a wrapper around the (fundamentally mutable) JSON in ZooKeeper. If we never want to know whether there was a change after we get our initial copy, and we never give away a reference to the copy we got, and we never retain that copy after we make an update, we could have immutable copies... hard to make those stick, however. It might be that we want a snapshot at the start of a request that doesn't change for the duration of the request (I can imagine that getting funky fast), but the long-held versions need to be mutable, I think... maybe a mutable superclass and an immutable subclass that throws UnsupportedOperationException on mutators?
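
For contrast, here is a minimal sketch of the DocCollection-style copy-on-write approach David describes; the class and method names are made up for illustration and are not the actual Aliases API:

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative immutable snapshot; ZkStateReader would swap its reference
// under a ZK watch rather than mutating an instance in place.
final class AliasesSnapshot {
  private final Map<String, String> collectionAliases;

  AliasesSnapshot(Map<String, String> aliases) {
    this.collectionAliases = Collections.unmodifiableMap(new HashMap<>(aliases));
  }

  String resolve(String alias) {
    return collectionAliases.get(alias);
  }

  // "Mutation" returns a new snapshot, so a long-held reference simply goes
  // stale instead of observing concurrent changes.
  AliasesSnapshot cloneWithAlias(String alias, String collections) {
    Map<String, String> copy = new HashMap<>(collectionAliases);
    copy.put(alias, collections);
    return new AliasesSnapshot(copy);
  }
}
{code}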



The *Collections.EMPTY_MAP* came up because I can't put anything into an empty map, and I need to test whether that's what we currently have or whether we have a regular map that I can add stuff to. Collections.emptyMap() is not required to return the same instance, or any particular implementation class, so in order to test for it without being subject to breakage in odd VMs or future versions (or past versions?), I have to use the single unique instance in Collections.EMPTY_MAP. I've grown slightly unsure as to whether that if block is still necessary (a possible holdover from early versions of this code), so I might take a shot at eliminating it and go back to Collections.emptyMap().
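
The identity test in question, sketched below; the point is that {{Collections.emptyMap()}} is only specified to be immutable and empty, not to be the {{EMPTY_MAP}} singleton:

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class EmptyMapCheck {
  // Identity check against the one shared constant; equals() would also match
  // any other empty map, which is not what we want here.
  static Map<String, String> ensureMutable(Map<String, String> m) {
    return (m == Collections.EMPTY_MAP) ? new HashMap<>() : m;
  }
}
{code}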



I *moved the CRUD stuff* to ZkStateReader so I didn't have to duplicate it in Aliases to get metadata written back to ZooKeeper. Also, it feels reasonable to have a ZK class doing the ZK CRUD rather than having that code live in a command class that grabs ZkClient from ZkStateReader and writes the data directly itself... (Law of Demeter, etc.). This way the command does command-type stuff, like identifying the data to be written and validating that we really do want to write it, and then hands the data off to the thing that knows how to work with ZooKeeper data so it can do the actual writing... controller & service/DAO stuff. The present code seems like the controller/action in a web app firing up a JDBC connection directly...
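
Roughly the division of labor being described, as a sketch (class and method names are invented for illustration, not the actual patch):

{code}
// The command validates and decides what to write; the ZK-facing class owns
// the actual ZooKeeper read/modify/write. Names here are hypothetical.
class SetAliasMetadataCmd {
  private final AliasStore store; // stand-in for the ZkStateReader CRUD methods

  SetAliasMetadataCmd(AliasStore store) {
    this.store = store;
  }

  void call(String alias, String key, String value) {
    if (alias == null || alias.isEmpty()) {
      throw new IllegalArgumentException("alias is required");
    }
    store.setAliasMetadata(alias, key, value); // hand off; no ZkClient in here
  }
}

interface AliasStore {
  void setAliasMetadata(String alias, String key, String value);
}
{code}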



*TimeOut*... initially I was going to copy it over verbatim, since it was in core and core is not available in solrj (when I moved the CRUD to ZkStateReader), and then I realized it could be improved, so I improved it. I think perhaps this timeout and the one in core could be reconciled and moved to a commonly available location to facilitate re-use, but that seems like a
[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated to test all sims and parameters.

> More sanity testing of similarities
> -----------------------------------
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the similarity is an increasing function of {{freq}} (all other things like DF and length being equal). This sounds like a very reasonable requirement for a similarity, so we should test it in the base similarity test case and maybe move broken similarities to the sandbox?






[jira] [Updated] (SOLR-11409) A ref guide page on setting up solr on aws

2017-10-23 Thread Amrit Sarkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-11409:

Attachment: SOLR-11409.patch

Uploaded a first patch, without images and clickable links, to get the work going. Will upload the entire patch very soon.

> A ref guide page on setting up solr on aws
> -------------------------------------------
>
> Key: SOLR-11409
> URL: https://issues.apache.org/jira/browse/SOLR-11409
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Priority: Minor
> Attachments: SOLR-11409.patch
>
>
> It will be nice if we have a dedicated page on installing solr on aws . 
> At the end we could even link to 
> http://lucene.apache.org/solr/guide/6_6/taking-solr-to-production.html






[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Patch randomizing values of parameters and adding missing range checks/docs for these parameters. These are just the valid ranges documented by the formulas; for unbounded parameters (such as normalization c and smoothing parameter mu) we treat them the same as BM25's k1: just ensure non-negative/finite in the range check, and test the range 0..Integer.MAX_VALUE.

Still TODO are the axiomatic parameters; need to look at the paper and existing code (it has some range checks already, so it may be easy).
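
The range checks described are presumably of this shape; a sketch in the BM25 k1 style, not the actual patch code:

{code}
class SimilarityParams {
  // Reject NaN, infinities, and negatives; unbounded parameters such as
  // normalization c or smoothing parameter mu then accept anything in [0, +inf).
  static float checkNonNegativeFinite(String name, float value) {
    if (Float.isNaN(value) || Float.isInfinite(value) || value < 0) {
      throw new IllegalArgumentException(
          "illegal " + name + " value: " + value + ", must be non-negative and finite");
    }
    return value;
  }
}
{code}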

> More sanity testing of similarities
> -----------------------------------
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the similarity is an increasing function of {{freq}} (all other things like DF and length being equal). This sounds like a very reasonable requirement for a similarity, so we should test it in the base similarity test case and maybe move broken similarities to the sandbox?






[jira] [Resolved] (SOLR-11452) TestTlogReplica.testOnlyLeaderIndexes() failure

2017-10-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-11452.
-
Fix Version/s: master (8.0)
   7.2

> TestTlogReplica.testOnlyLeaderIndexes() failure
> -----------------------------------------------
>
> Key: SOLR-11452
> URL: https://issues.apache.org/jira/browse/SOLR-11452
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Cao Manh Dat
> Fix For: 7.2, master (8.0)
>
>
> Reproduces for me, from 
> [https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1398]:
> {noformat}
> Checking out Revision f0a4b2dafe13e2b372e33ce13d552f169187a44e (refs/remotes/origin/master)
> [...]
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestTlogReplica -Dtests.method=testOnlyLeaderIndexes -Dtests.seed=CCAC87827208491B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=el -Dtests.timezone=Australia/LHI -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
>    [junit4] FAILURE 29.5s J2 | TestTlogReplica.testOnlyLeaderIndexes <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: expected:<2> but was:<5>
>    [junit4]    > 	at __randomizedtesting.SeedInfo.seed([CCAC87827208491B:D0ADFA0F07AD3788]:0)
>    [junit4]    > 	at org.apache.solr.cloud.TestTlogReplica.assertCopyOverOldUpdates(TestTlogReplica.java:909)
>    [junit4]    > 	at org.apache.solr.cloud.TestTlogReplica.testOnlyLeaderIndexes(TestTlogReplica.java:501)
>    [junit4]    > 	at java.lang.Thread.run(Thread.java:748)
> [...]
>    [junit4]   2> NOTE: test params are: codec=CheapBastard, sim=RandomSimilarity(queryNorm=false): {}, locale=el, timezone=Australia/LHI
>    [junit4]   2> NOTE: Linux 3.13.0-88-generic amd64/Oracle Corporation 1.8.0_144 (64-bit)/cpus=4,threads=1,free=137513712,total=520093696
> {noformat}






[jira] [Resolved] (SOLR-11469) LeaderElectionContextKeyTest has flawed logic: 50% of the time it checks the wrong shard's elections

2017-10-23 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-11469.
-
   Resolution: Fixed
 Assignee: Cao Manh Dat
Fix Version/s: master (8.0)
   7.2

> LeaderElectionContextKeyTest has flawed logic: 50% of the time it checks the wrong shard's elections
> ----------------------------------------------------------------------------------------------------
>
> Key: SOLR-11469
> URL: https://issues.apache.org/jira/browse/SOLR-11469
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Cao Manh Dat
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11469.patch, SOLR-11469.patch, 
> SOLR-11469_incomplete_and_broken.patch
>
>
> LeaderElectionContextKeyTest is very flaky -- and on Miller's beast it reports a suspiciously close to "50%" failure rate.
> Digging into the test I realized that it creates a 2-shard index, then picks "a leader" to kill (arbitrarily), and then asserts that the leader election nodes for *shard1* are affected ... so ~50% of the time it kills the shard2 leader and then fails because it doesn't see an election in shard1.






[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_144) - Build # 20720 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20720/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseSerialGC

3 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation

Error Message:
1 thread leaked from SUITE scope at org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation:
   1) Thread[id=19514, name=jetty-launcher-5300-thread-1-EventThread, state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
        at org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
        at org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
        at org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
        at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation:
   1) Thread[id=19514, name=jetty-launcher-5300-thread-1-EventThread, state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
        at org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
        at org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
        at org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
        at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
        at __randomizedtesting.SeedInfo.seed([B470C9D01CC7A385]:0)


FAILED:  org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.test

Error Message:
Timeout occured while waiting response from server at: http://127.0.0.1:36297

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:36297
	at __randomizedtesting.SeedInfo.seed([B470C9D01CC7A385:3C24F60AB23BCE7D]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:637)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
	at

[jira] [Assigned] (SOLR-11531) ref-guide build tool assumptions mismatch with how "section" level anchor ids are actually generated in PDF

2017-10-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned SOLR-11531:
---

Assignee: Hoss Man

> ref-guide build tool assumptions mismatch with how "section" level anchor ids are actually generated in PDF
> ------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11531
> URL: https://issues.apache.org/jira/browse/SOLR-11531
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-11531.patch
>
>
> About a month ago, Cassandra noticed [some problems with a few links|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9beafb612fa22b747b8728d7d954ea6e2bd37844] in the ref-guide PDF which confused both of us.  Working through it to try and understand what was going on -- and why the existing custom link-checking we do of the html-site version of the guide wasn't adequate for spotting these kinds of problems -- I realized a few gaps in the rules our build tools are enforcing...
> * the link checker, sidebar builder, & jekyll templates all have various degrees of implicit/explicit assumptions that the {{page-shortname}} will match the filename for each {{*.adoc}} file
> ** but nothing actually enforces this as people edit pages & their titles
> * the jekyll templates use the {{page-shortname}} to create the anchor {{id}} attribute, and the link checker depends on that for validation of links -- but on the PDF side of things, the normal [asciidoctor rules for auto-generated ids from section headings|http://asciidoctor.org/docs/user-manual/#auto-generated-ids] are what determines the anchor for each (page) header.
> ** so even though our (html-based) link checker may be happy, mismatches between page titles and page-shortnames cause broken links in the PDF
> Furthermore: the entire {{page-shortname}} and {{page-permalink}} variables in all of our {{*.adoc}} files aren't really necessary -- they are a convention I introduced early on in the process of building out the sidebar & next/prev link generation logic, but are error-prone if/when people rename files.
> ----
> We should (and I intend to)...
> # eliminate our dependency on {{page-shortname}} & {{page-permalink}} attributes from all of our templates and nav-building code and use implicitly generated values (from the filenames) instead
> # beef up our nav-building and link-checking code to verify that the "page title" for each page matches the filename -- so we can be confident the per-page header anchors in our generated PDF are consistent with the per-page header anchors in our generated HTML
> # remove all (no longer useful) {{page-shortname}} & {{page-permalink}} attributes from all {{*.adoc}} files






[jira] [Updated] (SOLR-11531) ref-guide build tool assumptions mismatch with how "section" level anchor ids are actually generated in PDF

2017-10-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-11531:

Attachment: SOLR-11531.patch

The attached patch is a starting point that takes care of #1 & #2 ... but the build currently fails because we have existing {{*.adoc}} files with titles that don't match...

{noformat}
 [java] Building up tree of all known pages
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/solrcloud-autoscaling-overview.adoc has a mismatched title: Overview of SolrCloud Autoscaling => overview-of-solrcloud-autoscaling
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/the-extended-dismax-query-parser.adoc has a mismatched title: The Extended DisMax (eDismax) Query Parser => the-extended-dismax-edismax-query-parser
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/solrcloud-autoscaling-auto-add-replicas.adoc has a mismatched title: SolrCloud AutoScaling Automatically Adding Replicas => solrcloud-autoscaling-automatically-adding-replicas
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/how-to-contribute.adoc has a mismatched title: How to Contribute to Solr Documentation => how-to-contribute-to-solr-documentation
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/solrcloud-autoscaling-api.adoc has a mismatched title: Autoscaling API => autoscaling-api
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/index.adoc has a mismatched title: Apache Solr Reference Guide => apache-solr-reference-guide
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/solrcloud-autoscaling-policy-preferences.adoc has a mismatched title: Autoscaling Policy and Preferences => autoscaling-policy-and-preferences
 [java] /home/hossman/lucene/dev/solr/build/solr-ref-guide/content/cross-data-center-replication-cdcr.adoc has a mismatched title: Cross Data Center Replication (CDCR) => cross-data-center-replication-cdcr-
{noformat}

...with the patch applied, these mismatches currently cause {{BuildNavAndPDFBody}} to fail, but even w/o the patch, links to the "top" of these pages/sections currently fail in the PDF.

A few concrete examples that are easy to find in the PDF (see also the id-rule sketch below):
* all links with the text "The Extended DisMax Query Parser" from the sections generated by query-screen.adoc, query-syntax-and-parsing.adoc, and searching.adoc
* link text "Overview of Autoscaling in SolrCloud" from solrcloud-autoscaling.adoc



...I think what would probably make sense is to go ahead and fix all of these 
page titles/filenames/shortnames w/o modifying any of the build code, then move 
forward with committing the patch (roughly) as is, then separately remove all 
the {{page-shortname}} & {{page-permalink}} attributes from the source files.

One thing I'm not really sure about in this plan is how to deal with 
{{index.adoc}}, aka "Apache Solr Reference Guide"?

We can do whatever special casing we want of this in our tools to "allow" it to 
have a mismatch between the filename and the page title -- but I don't think 
that's a good idea unless we do it in some way that still ensures that any 
links back to this page from the body of other pages are actually validated 
properly such that they work in the PDF as well.

Perhaps we should rename it "apache-solr-reference-guide.adoc" and use 
.htaccess rules to redirect {{/}} to {{apache-solr-reference-guide.html}} (or 
declare it the {{DirectoryIndex}}?)
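
If we go the .htaccess route, a hypothetical sketch (standard Apache 
directives; the filename assumes the rename above):

{noformat}
# serve the guide's landing page when "/" is requested
DirectoryIndex apache-solr-reference-guide.html

# ...or redirect explicitly instead
RedirectMatch 302 ^/$ /apache-solr-reference-guide.html
{noformat}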


> ref-guide build tool assumptions mismatch with how "section" level anchor 
> ids are actually generated in PDF
> 
>
> Key: SOLR-11531
> URL: https://issues.apache.org/jira/browse/SOLR-11531
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Hoss Man
> Attachments: SOLR-11531.patch
>
>
> About a month ago, Cassandra noticed [some problems with a few 
> links|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9beafb612fa22b747b8728d7d954ea6e2bd37844]
>  in the ref-guide PDF which confused both of us.  Working through it to try 
> and understand what was going on -- and why the existing custom link-checking 
> we do of the html-site version of the guide wasn't adequate for spotting 
> these kinds of problems -- I realized a few gaps in the rules our build tools 
> are enforcing...
> * the link checker, sidebar builder, & jekyll templates all have various 
> degrees of implicit/explicit assumptions that the {{page-shortname}} will 
> match the filename for each {{*.adoc}} file
> ** but nothing actually enforces this as people edit pages & their titles
> * the jekyll templates use the 

[jira] [Updated] (SOLR-11531) ref-guide build tool assumptions mismatch with how "section" level anchor ids are actually generated in PDF

2017-10-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-11531:

Component/s: documentation

> ref-guide build tool assumptions mismatch with how "section" level anchor 
> ids are actually generated in PDF
> 
>
> Key: SOLR-11531
> URL: https://issues.apache.org/jira/browse/SOLR-11531
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-11531.patch
>
>
> About a month ago, Cassandra noticed [some problems with a few 
> links|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9beafb612fa22b747b8728d7d954ea6e2bd37844]
>  in the ref-guide PDF which confused both of us.  Working through it to try 
> and understand what was going on -- and why the existing custom link-checking 
> we do of the html-site version of the guide wasn't adequate for spotting 
> these kinds of problems -- I realized a few gaps in the rules our build tools 
> are enforcing...
> * the link checker, sidebar builder, & jekyll templates all have various 
> degrees of implicit/explicit assumptions that the {{page-shortname}} will 
> match the filename for each {{*.adoc}} file
> ** but nothing actually enforces this as people edit pages & their titles
> * the jekyll templates use the {{page-shortname}} to create the 
> {{id="..."}} attribute on each page's header element, and the link checker 
> depends on that for validation of links -- but on the PDF side of things, 
> the normal [asciidoctor rules for auto-generated ids from section 
> headings|http://asciidoctor.org/docs/user-manual/#auto-generated-ids] are 
> what determine the anchor for each (page) header.
> ** so even though our (html based) link checker may be happy, mismatches 
> between page titles and page-shortnames cause broken links in the PDF
> Furthermore: the entire {{page-shortname}} and {{page-permalink}} variables 
> in all of our {{*.adoc}} files aren't really necessary -- they are a 
> convention I introduced early on in the process of building out the sidebar & 
> next/pre link generation logic, but are error-prone if/when people rename 
> files.
> 
> We should (and I intend to)...
> # eliminate our dependency on {{page-shortname}} & {{page-permalink}} 
> attributes from all of our templates and nav-building code and use implicitly 
> generated values (from the filenames) instead
> # beef up our nav-building and link-checking code to verify that the "page 
> title" for each page matches the filename -- so we can be confident the 
> per-page header anchors in our generated PDF are consistent with the per-page 
> header anchors in our generated HTML 
> # remove all (no longer useful) {{page-shortname}} & {{page-permalink}} 
> attributes from all {{*.adoc}} files



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11531) ref-guide build tool assumptions mismatch with how "section" level anchor ids are actually generated in PDF

2017-10-23 Thread Hoss Man (JIRA)
Hoss Man created SOLR-11531:
---

 Summary: ref-guide build tool assumptions mismatch with how 
"section" level anchor ids are actually generated in PDF
 Key: SOLR-11531
 URL: https://issues.apache.org/jira/browse/SOLR-11531
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man



About a month ago, Cassandra noticed [some problems with a few 
links|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9beafb612fa22b747b8728d7d954ea6e2bd37844]
 in the ref-guide PDF which confused both of us.  Working through it to try and 
understand what was going on -- and why the existing custom link-checking we do 
of the html-site version of the guide wasn't adequate for spotting these kinds 
of problems -- I realized a few gaps in the rules our build tools are 
enforcing...

* the link checker, sidebar builder, & jekyll templates all have various 
degrees of implicit/explicit assumptions that the {{page-shortname}} will match 
the filename for each {{*.adoc}} file
** but nothing actually enforces this as people edit pages & their titles
* the jekyll templates use the {{page-shortname}} to create the {{id="..."}} 
attribute on each page's header element, and the link checker depends on that 
for validation of links -- but on the PDF side of things, the normal 
[asciidoctor rules for auto-generated ids from section 
headings|http://asciidoctor.org/docs/user-manual/#auto-generated-ids] are what 
determine the anchor for each (page) header.
** so even though our (html based) link checker may be happy, mismatches 
between page titles and page-shortnames cause broken links in the PDF

Furthermore: the entire {{page-shortname}} and {{page-permalink}} variables in 
all of our {{*.adoc}} files aren't really necessary -- they are a convention I 
introduced early on in the process of building out the sidebar & next/pre link 
generation logic, but are error-prone if/when people rename files.



We should (and I intend to)...

# eliminate our dependency on {{page-shortname}} & {{page-permalink}} 
attributes from all of our templates and nav-building code and use implicitly 
generated values (from the filenames) instead
# beef up our nav-building and link-checking code to verify that the "page 
title" for each page matches the filename -- so we can be confident the 
per-page header anchors in our generated PDF are consistent with the per-page 
header anchors in our generated HTML 
# remove all (no longer useful) {{page-shortname}} & {{page-permalink}} 
attributes from all {{*.adoc}} files




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated patch with all remaining sims (axiomatic and language models) now 
tested.
The axiomatic F3EXP and F3LOG fail due to their gamma function driving scores 
negative; I added a warning to their javadocs about this. Also note that these 
two models don't have default parameter-free ctors. The other 4 models (F1EXP, 
F1LOG, F2EXP, F2LOG) are all fine; they don't have this gamma function.

At least now we have the lay of the land, and it is as expected. 

Still need to deal with many parameters which aren't yet tested. In many cases 
these are also missing any range checks; we need to dig up/figure out the valid 
domain, randomize them, look for issues, etc. But the default values are tested.
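
For context, a hypothetical sketch of the monotonicity property these tests 
check, with a simplified {{score(freq, norm)}} signature (the real base test 
case differs):

{code}
// Holding every other statistic fixed, the score must not decrease as freq grows.
for (int freq = 1; freq < 64; freq++) {
  assertTrue("score decreased at freq=" + freq,
             score(freq + 1, norm) >= score(freq, norm));
}
{code}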


> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Linux (32bit/jdk1.8.0_144) - Build # 654 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/654/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.cloud.ShardRoutingCustomTest.test

Error Message:
Timeout occured while waiting response from server at: https://127.0.0.1:33887

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting 
response from server at: https://127.0.0.1:33887
at 
__randomizedtesting.SeedInfo.seed([56024026214261CA:DE567FFC8FBE0C32]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:637)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.createServers(AbstractFullDistribZkTestBase.java:315)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:991)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 

[JENKINS] Lucene-Solr-Tests-5.5 - Build # 38 - Still Failing

2017-10-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.5/38/

11 tests failed.
FAILED:  org.apache.solr.cloud.BasicDistributedZkTest.test

Error Message:
Could not load collection from ZK: collection1

Stack Trace:
org.apache.solr.common.SolrException: Could not load collection from ZK: 
collection1
at 
__randomizedtesting.SeedInfo.seed([11CAF945A9F406F8:999EC69F07086B00]:0)
at 
org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:986)
at 
org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:523)
at 
org.apache.solr.common.cloud.ClusterState.getCollectionOrNull(ClusterState.java:189)
at 
org.apache.solr.common.cloud.ClusterState.hasCollection(ClusterState.java:119)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.getCollectionNames(CloudSolrClient.java:1132)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:854)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:827)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:957)
at 
org.apache.solr.cloud.BasicDistributedZkTest.queryServer(BasicDistributedZkTest.java:1183)
at 
org.apache.solr.cloud.BasicDistributedZkTest.testUpdateProcessorsRunOnlyOnce(BasicDistributedZkTest.java:715)
at 
org.apache.solr.cloud.BasicDistributedZkTest.test(BasicDistributedZkTest.java:366)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:996)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:971)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 

[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated with the information-based models. LL passes the test, and SPL fails as 
expected; it has warnings in its javadocs.

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Patch with the 3 DFI models passing too.

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated patch with DFR passing/failing the new tests as expected:
* scoring models without warnings in the javadocs pass: models {{G}}, {{I(F)}}, 
{{I\(n)}}, {{I(ne)}} 
* ones with warnings in javadocs all fail: models {{BE}}, {{D}}, and {{P}}

I think this is a good sign that it does what we need. To make DFR pass at 
all, I changed SimilarityBase to use {{double}} everywhere internally, then 
cast to 32-bit float at the end. This fixed all the numerical errors. I think 
this makes sense, as this subclass is supposed to be simple and easy to use 
(separately, we should take another look at the whole thing now that a lot of 
ClassicSimilarity's complexity has been removed). It makes the formulas more 
elegant in many cases too, because constants like {{5.0}} are naturally doubles 
and all Java Math functions take doubles, so some casts etc. get removed.
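
As a schematic illustration of that change (the formula here is invented, not 
an actual DFR model):

{code}
float score(float freq, long docLen) {
  double tf = 1.0 + Math.log(freq);    // java.lang.Math takes doubles anyway
  double norm = 5.0 / (5.0 + docLen);  // constants like 5.0 are naturally doubles
  return (float) (tf * norm);          // single narrowing cast at the very end
}
{code}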

Will work through the other models and look at potential improvements to 
explain etc. here too for consistency.

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: API Docs Output Snippet Formats

2017-10-23 Thread Jason Gerlowski
Thanks for the feedback (and context/history) guys.  I've created
SOLR-11530 (https://issues.apache.org/jira/browse/SOLR-11530) to keep
track of this.

You're right Cassandra, switching the snippets over to JSON will
probably take a bit of time.  I'm going to give it a shot this
evening, in case it's easier than it first appears.  But assuming it's
not, I plan to upload a patch adding the appropriate "wt" params, as a
short-term fix.

Jason

On Mon, Oct 23, 2017 at 12:14 PM, Cassandra Targett
 wrote:
> I did a pass through the Ref Guide for SOLR-10494 and noted there [1]
> that I neglected to look for places where the output was XML but the
> sample request did not include "wt=xml". My intent was to look for
> those later, but then I forgot.
>
> It's likely easier to find where the request is missing "wt=xml" than
> to change the XML examples to JSON, although having them all in JSON
> is preferable. If you're willing to cook up a patch for either, it
> would be appreciated.
>
> If you think changing them to JSON will take you a while (and it
> might), I'd be happy to split the work and do the pass through for
> missing "wt=xml" params as a temporary measure.
>
> Cassandra
>
> [1] 
> https://issues.apache.org/jira/browse/SOLR-10494?focusedCommentId=16056403=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16056403
>
> On Sun, Oct 22, 2017 at 9:47 PM, Varun Thacker  wrote:
>> I'd prefer 1>
>>
>> On Sun, Oct 22, 2017 at 7:39 PM, Jason Gerlowski 
>> wrote:
>>>
>>> Hey all,
>>>
>>> Was doing some poking around the ref-guide this weekend.  I noticed
>>> that the output snippets given with the API documentation are split
>>> about 50/50 between XML and JSON.  Few of the examples contain an
>>> explicit "wt" parameter.  With the default "wt" format switching to
>>> JSON in 7.0, this means that any of the output snippets in XML format
>>> won't match what a user following along would see themselves.
>>>
>>> This won't trouble experienced users, but it could be a small
>>> speedbump for any new Solr adopters.  Making the snippets match the
>>> API calls would make the docs more correct, and more amateur-friendly.
>>>
>>> There's two approaches we could take to bring things into better
>>> alignment:
>>>
>>> 1. Change all API output snippets to JSON.
>>>
>>> 2. Don't change the format of any snippets.  Instead, add a "wt"
>>> parameter to the API call corresponding to any XML snippets, so that
>>> the input-call matches the output.
>>>
>>> Happy to create a JIRA and propose a patch for either approach if
>>> people think it's worth it, or have a particular preference on
>>> approach.  Anyone have any thoughts?
>>>
>>> Best,
>>>
>>> Jason
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11530) Update ref-guide output snippets to match new default wt

2017-10-23 Thread Jason Gerlowski (JIRA)
Jason Gerlowski created SOLR-11530:
--

 Summary: Update ref-guide output snippets to match new default wt
 Key: SOLR-11530
 URL: https://issues.apache.org/jira/browse/SOLR-11530
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: documentation
Affects Versions: master (8.0)
Reporter: Jason Gerlowski


Solr 7.0 changed the default wt format from XML to JSON.  Many of the 
API-output snippets from our documentation were never updated to accommodate 
this change.  So as-is, the output won't match what users following along at 
home would get from running the commands.  This could throw off some newer Solr 
users.

This JIRA is created to resolve this inconsistency, either by adding the 
"wt=xml" param to the API command corresponding to any XML snippets, or by 
switching the snippets over to all use JSON.
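
For example (hypothetical request; collection name illustrative), an XML 
snippet would pair with a call like {{/solr/techproducts/select?q=*:*&wt=xml}}, 
so the documented input and output stay consistent.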



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Solaris (64bit/jdk1.8.0) - Build # 258 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Solaris/258/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

4 tests failed.
FAILED:  
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI

Error Message:
Error from server at http://127.0.0.1:50131/solr/awhollynewcollection_0: 
{"awhollynewcollection_0":6}

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:50131/solr/awhollynewcollection_0: 
{"awhollynewcollection_0":6}
at 
__randomizedtesting.SeedInfo.seed([74114FB8046A30C9:3C643B0C02591F5C]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:972)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:972)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:972)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:972)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:972)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:460)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (LUCENE-7999) Invalid segment file name

2017-10-23 Thread Mykhailo Demianenko (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216025#comment-16216025
 ] 

Mykhailo Demianenko commented on LUCENE-7999:
-

Thank you, [~mikemccand]! Unfortunately I do not have precise information on how 
old this particular index was (most probably a couple of years). But yes, we 
refresh indexes quite a lot, since we need to ensure that readers always see the 
latest available snapshot of the data.

> Invalid segment file name
> -
>
> Key: LUCENE-7999
> URL: https://issues.apache.org/jira/browse/LUCENE-7999
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.x, 6.x, 7.0
>Reporter: Mykhailo Demianenko
>Priority: Minor
> Attachments: LUCENE-7999.patch, segmentName.patch
>
>
> After really long and intensive index usage it's possible to overflow the 
> counter used to generate new segment names, producing a name that will not 
> satisfy the validation criteria:
> Caused by: java.lang.IllegalArgumentException: invalid codec filename 
> '_-zik0zk_Lucene54_0.dvm', must match: _[a-z0-9]+(_.*)?\..*
> at 
> org.apache.lucene.index.SegmentInfo.checkFileNames(SegmentInfo.java:280)
> at org.apache.lucene.index.SegmentInfo.addFiles(SegmentInfo.java:262)
> at org.apache.lucene.index.SegmentInfo.setFiles(SegmentInfo.java:256)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4080)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
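
For illustration, the overflow reproduces exactly the name in the error above; 
this sketch assumes segment names are an int counter rendered in base 36:

{code}
int counter = Integer.MAX_VALUE;
System.out.println("_" + Integer.toString(counter, 36)); // prints _zik0zj
counter++;                                               // wraps to Integer.MIN_VALUE
System.out.println("_" + Integer.toString(counter, 36)); // prints _-zik0zk
// A filename built from it, e.g. "_-zik0zk_Lucene54_0.dvm", fails the
// _[a-z0-9]+(_.*)?\..* check because of the '-'.
{code}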



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11526) CollectionAdminResponse.isSuccess() incorrect for most admin collections APIs

2017-10-23 Thread Jason Gerlowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216016#comment-16216016
 ] 

Jason Gerlowski commented on SOLR-11526:


I'd thought the "status" field would be enough to solve this, but it looks like 
it's 0 even on many failures:

{code}
{
  "responseHeader": {
"status": 0,
"QTime": 13058
  },
  "failure": {
"127.0.0.1:45211_solr": 
"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error 
from server at http://127.0.0.1:45211/solr: Error CREATEing SolrCore 
'foo_shard1_replica_n1': Unable to create core [foo_shard1_replica_n1] Caused 
by: access denied (\"java.io.FilePermission\" \"/some_invalid_dir/foo/tlog\" 
\"write\")"
  }
}
{code}

I feel like I'm misinterpreting the intent of the status field.  The few times 
I see it set as non-zero, it holds the HTTP status code from the request to 
Solr.  This seems like it could be used for success-determination, though maybe 
Solr doesn't use HTTP status codes consistently enough for this to be reliable. 
 I'll take a look at the logic used to set the status field for these APIs and 
see if using "status" is as reasonable as it appears at first blush.

> CollectionAdminResponse.isSuccess() incorrect for most admin collections APIs
> -
>
> Key: SOLR-11526
> URL: https://issues.apache.org/jira/browse/SOLR-11526
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: master (8.0)
>Reporter: Jason Gerlowski
>Priority: Minor
>
> {{CollectionAdminResponse}} has a boolean {{isSuccess}} method which reports 
> whether the API was called successfully.  It returns true if it finds a 
> non-null NamedList element called "success".  It returns false otherwise.
> Unfortunately, only a handful of the Collection-Admin APIs have this element. 
>  APIs that don't contain this element in their response will always appear to 
> have failed (according to {{isSuccess()}}).
> The current implementation is correct for:
> - CREATECOLLECTION
> - RELOAD
> - SPLITSHARD
> - DELETESHARD
> - DELETECOLLECTION
> - ADDREPLICA
> - MIGRATE
> The current implementation is incorrect for:
> - CREATESHARD
> - CREATEALIAS
> - DELETEALIAS
> - LISTALIASES
> - CLUSTERPROP
> - ADDROLE
> - REMOVEROLE
> - OVERSEERSTATUS
> - CLUSTERSTATUS
> - REQUESTSTATUS
> - DELETESTATUS
> - LIST
> - ADDREPLICAPROP
> - DELETEREPLICAPROP
> - BALANCESHARDUNIQUE
> - REBALANCELEADERS
> (these lists are incomplete)
> A trivial fix for this would be to change the implementation to check the 
> "status" NamedList element (which is present in all Collection-Admin APIs).  
> My understanding is that the "status" field is always set to 0 on success.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-7999) Invalid segment file name

2017-10-23 Thread Mykhailo Demianenko (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215997#comment-16215997
 ] 

Mykhailo Demianenko edited comment on LUCENE-7999 at 10/23/17 10:57 PM:


Updated patch: add new _VERSION_72_ version field; write and read VLong(s); use 
_format_ to determine index format


was (Author: mikedemi):
Updated patch: add new LUCENE_72 version; write and read VLong(s); use 'format' 
to determine index format

> Invalid segment file name
> -
>
> Key: LUCENE-7999
> URL: https://issues.apache.org/jira/browse/LUCENE-7999
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.x, 6.x, 7.0
>Reporter: Mykhailo Demianenko
>Priority: Minor
> Attachments: LUCENE-7999.patch, segmentName.patch
>
>
> After really long and intensive index usage it's possible to overflow the 
> counter used to generate new segment names, producing a name that will not 
> satisfy the validation criteria:
> Caused by: java.lang.IllegalArgumentException: invalid codec filename 
> '_-zik0zk_Lucene54_0.dvm', must match: _[a-z0-9]+(_.*)?\..*
> at 
> org.apache.lucene.index.SegmentInfo.checkFileNames(SegmentInfo.java:280)
> at org.apache.lucene.index.SegmentInfo.addFiles(SegmentInfo.java:262)
> at org.apache.lucene.index.SegmentInfo.setFiles(SegmentInfo.java:256)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4080)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7999) Invalid segment file name

2017-10-23 Thread Mykhailo Demianenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mykhailo Demianenko updated LUCENE-7999:

Attachment: LUCENE-7999.patch

Updated patch: add new LUCENE_72 version; write and read VLong(s); use 'format' 
to determine index format

> Invalid segment file name
> -
>
> Key: LUCENE-7999
> URL: https://issues.apache.org/jira/browse/LUCENE-7999
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.x, 6.x, 7.0
>Reporter: Mykhailo Demianenko
>Priority: Minor
> Attachments: LUCENE-7999.patch, segmentName.patch
>
>
> After really long and intensive index usage it's possible to overflow the 
> counter used to generate new segment names, producing a name that will not 
> satisfy the validation criteria:
> Caused by: java.lang.IllegalArgumentException: invalid codec filename 
> '_-zik0zk_Lucene54_0.dvm', must match: _[a-z0-9]+(_.*)?\..*
> at 
> org.apache.lucene.index.SegmentInfo.checkFileNames(SegmentInfo.java:280)
> at org.apache.lucene.index.SegmentInfo.addFiles(SegmentInfo.java:262)
> at org.apache.lucene.index.SegmentInfo.setFiles(SegmentInfo.java:256)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4080)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_144) - Build # 20719 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20719/
Java: 64bit/jdk1.8.0_144 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.security.hadoop.TestImpersonationWithHadoopAuth

Error Message:
2 threads leaked from SUITE scope at 
org.apache.solr.security.hadoop.TestImpersonationWithHadoopAuth: 1) 
Thread[id=13219, name=jetty-launcher-1701-thread-2-EventThread, 
state=TIMED_WAITING, group=TGRP-TestImpersonationWithHadoopAuth] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
 at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)  
   at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
 at 
org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
 at 
org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
 at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)   
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)   
 2) Thread[id=13217, name=jetty-launcher-1701-thread-1-EventThread, 
state=TIMED_WAITING, group=TGRP-TestImpersonationWithHadoopAuth] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
 at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)  
   at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
 at 
org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
 at 
org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
 at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)   
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE 
scope at org.apache.solr.security.hadoop.TestImpersonationWithHadoopAuth: 
   1) Thread[id=13219, name=jetty-launcher-1701-thread-2-EventThread, 
state=TIMED_WAITING, group=TGRP-TestImpersonationWithHadoopAuth]
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at 

[jira] [Commented] (SOLR-11487) Collection Alias metadata for time partitioned collections

2017-10-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215893#comment-16215893
 ] 

David Smiley commented on SOLR-11487:
-

BTW FWIW RE TimeOut... IMO it'd be nicer to have a static utility method 
named something like callWithRetry(long intervalSleep, long timeout, TimeUnit 
unit, Callable callable) throws Exception
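
A hypothetical sketch of that utility (only the signature comes from the 
comment above; the retry semantics are invented for illustration):

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public static <T> T callWithRetry(long intervalSleep, long timeout,
    TimeUnit unit, Callable<T> callable) throws Exception {
  final long deadline = System.nanoTime() + unit.toNanos(timeout);
  Exception last = null;
  do {
    try {
      return callable.call();          // done on the first success
    } catch (Exception e) {
      last = e;                        // remember the failure and retry
      unit.sleep(intervalSleep);       // pause between attempts, same unit
    }
  } while (System.nanoTime() < deadline);
  throw (last != null) ? last : new TimeoutException("gave up after " + timeout);
}
{code}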

> Collection Alias metadata for time partitioned collections
> --
>
> Key: SOLR-11487
> URL: https://issues.apache.org/jira/browse/SOLR-11487
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
> Attachments: SOLR_11487.patch, SOLR_11487.patch
>
>
> SOLR-11299 outlines an approach to using a collection Alias to refer to a 
> series of collections of a time series. We'll need to store some metadata 
> about these time series collections, such as which field of the document 
> contains the timestamp to route on.
> The current {{/aliases.json}} is a Map with a key {{collection}} which is in 
> turn a Map of alias name strings to a comma-delimited list of the collections.
> _If we change the comma-delimited list to be another Map to hold the existing 
> list and more stuff, older CloudSolrClient (configured to talk to ZooKeeper) 
> will break_.  Although if it's configured with an HTTP Solr URL then it would 
> not break.  There's also some read/write hassle to worry about -- we may need 
> to continue to read an aliases.json in the older format.
> Alternatively, we could add a new map entry to aliases.json, say, 
> {{collection_metadata}} keyed by alias name?
> Perhaps another very different approach is to attach metadata to the 
> configset in use?
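
For illustration, a hypothetical sketch of the {{collection_metadata}} 
alternative described above (alias, collection, and field names invented):

{noformat}
{
  "collection": {
    "timeseries": "coll_2017_09,coll_2017_10"
  },
  "collection_metadata": {
    "timeseries": { "router.field": "timestamp_dt" }
  }
}
{noformat}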



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215856#comment-16215856
 ] 

Shawn Heisey edited comment on SOLR-11514 at 10/23/17 9:17 PM:
---

To check whether my thoughts about what's happening here are right, try this 
line in solr.in.sh instead of the one that you tried already:

SOLR_JAVA_MEM="-Xms1g -Xmx4g"

Note that we strongly recommend setting Xms and Xmx to the same value, so that 
Java doesn't need to allocate additional memory if it ends up needing more than 
the minimum.  General Java operation tends to allocate the maximum anyway, so 
it's generally better to allocate it all up front and save Java the effort of 
increasing its memory allocation.
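
For example (heap size illustrative), that recommendation would look like this 
in solr.in.sh:

SOLR_JAVA_MEM="-Xms4g -Xmx4g"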


was (Author: elyograg):
To check whether my thoughts about what's happening here are right, try this 
line in solr.in.sh instead of the one that you tried already:

SOLR_JAVA_MEM="-Xms1g -Xmx4g"


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs; logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215856#comment-16215856
 ] 

Shawn Heisey commented on SOLR-11514:
-

To check whether my thoughts about what's happening here are right, try this 
line in solr.in.sh instead of the one that you tried already:

SOLR_JAVA_MEM="-Xms1g -Xmx4g"


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs; logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215849#comment-16215849
 ] 

Robert Muir commented on LUCENE-8005:
-

Yeah, my main argument is just: let's get any discussion of performance out of 
the way; this should be a pure API-driven decision.

In this case it strongly appears to me that you see this in a sampling profiler 
just because of safepoint bias, since it calls a native method every time. 
Calling a non-native method is of course a little faster, but it was never a 
perf issue: you have to be careful to avoid these profiler "ghosts".



> Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy
> 
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch, LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215847#comment-16215847
 ] 

Shawn Heisey commented on SOLR-11514:
-

I have confirmed that the "status" output *does* only show the "total" memory.  
Right after startup with -Xms512m, if there is not a significant amount of data 
indexed to the Solr instance, that number will show 490.7 MB, even if the -Xmx 
setting is much higher.

This patch to the code should fix that output:

{noformat}
diff --git a/solr/core/src/java/org/apache/solr/util/SolrCLI.java 
b/solr/core/src/java/org/apache/solr/util/SolrCLI.java
index f4cd7c7..5cbcc13 100644
--- a/solr/core/src/java/org/apache/solr/util/SolrCLI.java
+++ b/solr/core/src/java/org/apache/solr/util/SolrCLI.java
@@ -942,7 +942,8 @@ public class SolrCLI {
   
   String usedMemory = asString("/jvm/memory/used", info);
   String totalMemory = asString("/jvm/memory/total", info);
-  status.put("memory", usedMemory+" of "+totalMemory);
+  String maxMemory = asString("/jvm/memory/max", info);
+  status.put("memory", usedMemory+" of "+totalMemory+", max="+maxMemory);
   
   // if this is a Solr in solrcloud mode, gather some basic cluster info
   if ("solrcloud".equals(info.get("mode"))) {
{noformat}
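
With that change, the status output would also surface the configured heap 
ceiling. For illustration (hypothetical values, not captured from a real run), 
the "memory" entry would then read something like:

{noformat}
"memory":"25.5 MB of 490.7 MB, max=5.9 GB"
{noformat}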


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh: SOLR_JAVA_MEM="-Xms512m -Xmx6144m", but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs: logs are still being written to 
> /opt/solr/server/logs.
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr






[jira] [Commented] (SOLR-11487) Collection Alias metadata for time partitioned collections

2017-10-23 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215845#comment-16215845
 ] 

David Smiley commented on SOLR-11487:
-

Thanks for the patch Gus!
* I think just String values is fine; it's like a Properties object.  Therefore 
you could change the type of the aliasMap field back to what it was 
({{Map<String,Map<String,String>>}})?
* Why change Collections.emptyMap() to Collections.EMPTY_MAP?  The latter 
results in Java unchecked-assignment warnings (see the sketch after this 
list).  Speaking of which, can you please address such warnings?
* getCollectionMetadataMap needs some docs.  Can result be null?
* setAliasMetadata:
** would you mind changing the "key" param name to "keyMetadata" and "metadata" 
param name to "valueMetadata" (or shorter "Meta" instead of "Metadata" if you 
choose)?  That would read clearer to me.
** setAliasMetadata doesn't have "collection" in its name.  Likewise the 
aliasMetadataField should be qualified as well.  Or... we stop pretending at 
the class level that we support other alias types, yet continue to read/write 
with the "collection" prefix in case we actually do add new types later?
** Oh... hey, this method makes Aliases not immutable anymore. Maybe change 
this to be something like cloneWithCollectionAliasMetadata?  Or we could make 
it immutable again; I admit immutability is a nice property here.
* cloneWithCollectionAlias seems off; I don't think it should be using 
zkStateReader.  I think this method now needs a metadata map parameter 
(optional).  Furthermore, if we remove an alias, remove the corresponding 
metadata too.
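
To make the emptyMap() point above concrete, here is a minimal standalone 
sketch (not code from the patch): {{Collections.EMPTY_MAP}} is declared with a 
raw type, so assigning it to a parameterized variable only compiles with an 
unchecked warning, while the generic factory method infers the type parameters 
cleanly.

{code}
import java.util.Collections;
import java.util.Map;

public class EmptyMapDemo {
  public static void main(String[] args) {
    // Raw-typed constant: compiles, but javac -Xlint:unchecked emits an
    // unchecked-assignment warning here.
    Map<String, String> raw = Collections.EMPTY_MAP;

    // Generic factory method: the compiler infers Map<String, String>,
    // no warning.
    Map<String, String> typed = Collections.emptyMap();

    System.out.println(raw.isEmpty() + " " + typed.isEmpty());
  }
}
{code}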

I see you moved some alias CRUD stuff to ZkStateReader.  Just curious; what 
drove that decision?

TimeOut: do you envision other uses of this utility; what in particular?

> Collection Alias metadata for time partitioned collections
> --
>
> Key: SOLR-11487
> URL: https://issues.apache.org/jira/browse/SOLR-11487
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
> Attachments: SOLR_11487.patch, SOLR_11487.patch
>
>
> SOLR-11299 outlines an approach to using a collection Alias to refer to a 
> series of collections of a time series. We'll need to store some metadata 
> about these time series collections, such as which field of the document 
> contains the timestamp to route on.
> The current {{/aliases.json}} is a Map with a key {{collection}} which is in 
> turn a Map of alias name strings to a comma delimited list of the collections.
> _If we change the comma delimited list to be another Map to hold the existing 
> list and more stuff, older CloudSolrClient (configured to talk to ZooKeeper) 
> will break_.  Although if it's configured with an HTTP Solr URL then it would 
> not break.  There's also some read/write hassle to worry about -- we may need 
> to continue to read an aliases.json in the older format.
> Alternatively, we could add a new map entry to aliases.json, say, 
> {{collection_metadata}} keyed by alias name?
> Perhaps another very different approach is to attach metadata to the 
> configset in use?
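
For illustration, an aliases.json following that second alternative might look 
like the sketch below. The {{collection_metadata}} key comes from the proposal 
above; the alias name, collection names, and the {{router.field}} property are 
invented, not a committed format:

{noformat}
{
  "collection": {
    "timeseries": "coll_2017_09,coll_2017_10"
  },
  "collection_metadata": {
    "timeseries": {
      "router.field": "timestamp_dt"
    }
  }
}
{noformat}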






[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215833#comment-16215833
 ] 

Scott Somerville commented on LUCENE-8005:
--

Robert: no disagreements here; I was just contrasting the different results - 
apologies for the poor wording. I agree that the performance boost here is 
sufficient for something that isn't on a critical code path.

Uwe: The problem is some of these classes are in different packages/modules. 
For example:
* org.apache.lucene.spatial3d.PointInGeo3DShapeQuery
* org.apache.lucene.search.join.PointInSetIncludingScoreQuery
* org.apache.lucene.search.PointRangeQuery

So you either have to move these queries to the same package or add something 
to the public API.

This seems to be the motivation for taking this approach 
(https://issues.apache.org/jira/browse/LUCENE-7050) rather than doing a series 
of instanceof checks.







[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215829#comment-16215829
 ] 

Michael McCandless commented on LUCENE-7976:


bq. forceMergeAndFreeze feels wrong to me. At that point the only option if 
they make a mistake is to re-index everything into another core/collection, 
right? 

Or IndexWriter's {{addIndexes(Directory[])}}, which is quite efficient.

But yeah I agree this is a separate issue...
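
For context, a minimal sketch of that recovery path (the directory paths are 
hypothetical): {{addIndexes}} bulk-copies the source segments into the 
destination index without re-analyzing documents, which is why it is efficient.

{code}
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class RebuildIndex {
  public static void main(String[] args) throws Exception {
    try (Directory src = FSDirectory.open(Paths.get("/path/to/frozen-index"));
         Directory dest = FSDirectory.open(Paths.get("/path/to/new-index"));
         IndexWriter writer =
             new IndexWriter(dest, new IndexWriterConfig(new StandardAnalyzer()))) {
      writer.addIndexes(src);  // copies segments; no re-indexing of documents
      writer.commit();
    }
  }
}
{code}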

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
> Attachments: LUCENE-7976.patch
>
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.






[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215812#comment-16215812
 ] 

Uwe Schindler commented on LUCENE-8005:
---

Because of that I'd add a package-private marker interface. Much cleaner, and 
invisible from the public API (a pkg-private superclass is invisible, too). The 
other idea would have been an annotation, but those are not inherited.
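
A self-contained sketch of that idea follows; all names here are hypothetical 
(not Lucene's actual API), and it assumes the point queries can live in, or be 
made visible to, the same package:

{code}
// Stand-in for the query base class, just to make the sketch compile.
abstract class Query {
}

// Package-private marker interface: invisible outside its package, so it
// adds nothing to the public API surface.
interface PointQueryMarker {
}

// A point query opts in simply by implementing the marker; subclasses
// (including anonymous ones) inherit it automatically.
class PointRangeQuery extends Query implements PointQueryMarker {
}

final class CachingPolicy {
  // Type-safe and cheap: one instanceof test, no reflection over class names.
  static boolean isPointQuery(Query query) {
    return query instanceof PointQueryMarker;
  }
}
{code}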







[jira] [Updated] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-7976:
---
Attachment: LUCENE-7976.patch

Very rough, untested patch showing how we could allow the "too big" segments 
into the eligible set of segments ... but we should test how this behaves 
around deletions once an index has too-big segments ... it could be that the 
deletion reclaim weight is then too high!







[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215799#comment-16215799
 ] 

Michael McCandless commented on LUCENE-7976:


I think we don't need to add another tunable to TMP; I think the existing 
{{reclaimDeletesWeight}} should suffice, as long as we:

Modify the logic around {{tooBigCount}}, so that even too big segments are 
added to the {{eligible}} set, but they are still not counted against the 
{{allowedSegCount}}.

This way TMP is able to choose to merge e.g. a too big segment with 20% 
deletions, with lots of smaller segments.  The thing is, this merge will be 
unappealing, since the sizes of the input segments are so different, but then 
the {{reclaimDeletesWeight}} can counteract that.

I'll attach a rough patch showing what I mean ...
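
Roughly, the existing scoring already has the needed lever. A simplified sketch 
of how {{reclaimDeletesWeight}} can rescue an otherwise skew-penalized merge 
(this is an approximation of TMP's scoring, not the exact code):

{code}
final class MergeScoreSketch {
  // Lower mergeScore = more attractive merge. "skew" penalizes merges whose
  // input segments differ wildly in size; "nonDelRatio" is the fraction of
  // live (non-deleted) docs in the merged result.
  static double mergeScore(double skew, double nonDelRatio,
                           double reclaimDeletesWeight) {
    return skew * Math.pow(nonDelRatio, reclaimDeletesWeight);
  }

  public static void main(String[] args) {
    // A too-big segment plus tiny ones is a skewed merge (say skew = 10.0),
    // but with 20% deletions (nonDelRatio = 0.8) a higher weight shrinks
    // the score, making the merge attractive again:
    System.out.println(mergeScore(10.0, 0.8, 0.0)); // 10.0 -- deletes ignored
    System.out.println(mergeScore(10.0, 0.8, 2.0)); // 6.4  -- deletes rewarded
  }
}
{code}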







[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215797#comment-16215797
 ] 

Robert Muir commented on LUCENE-8005:
-

{quote}
So a sizable improvement but not as fast as an instanceof check.
{quote}

But this is not the bottleneck for Lucene's processing, so we can't make 
decisions that way; otherwise we optimize the wrong stuff. I wish it were the 
bottleneck, so that 10 million queries could run in 20ms versus 900ms. But a 
~100 ns cost to check whether a query is cached seems reasonable to me, and the 
current speed of ~1 microsecond per query also seems just fine.

I am more concerned that we don't introduce bad APIs and complex abstractions. 
In this case, I'm sorry, I don't see a performance justification for that.







[jira] [Updated] (SOLR-10533) Improve checks for which fields can be returned

2017-10-23 Thread Amrit Sarkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-10533:

Attachment: (was: SOLR-10533.patch)

> Improve checks for which fields can be returned
> ---
>
> Key: SOLR-10533
> URL: https://issues.apache.org/jira/browse/SOLR-10533
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
> Attachments: SOLR-10533.patch, SOLR-10533.patch, SOLR-10533.patch
>
>
> I tried using {{DocBasedVersionConstraintsProcessorFactory}} on a field which 
> was defined as :
> {code}
> 
> {code}
> The long fieldType has docValues enabled and since useDocValuesAsStored is 
> true by default in the latest schema I can retrieve this field.
> But when I start Solr with this update processor I get the following error
> {code}
>  Caused by: field myVersionField must be defined in schema, be stored, and be 
> single valued.
> {code}
> Here's the following check in the update processor where the error originates 
> from:
> {code}
> if (userVersionField == null || !userVersionField.stored() || 
> userVersionField.multiValued()) {
>   throw new SolrException(SERVER_ERROR,
>   "field " + versionField + " must be defined in schema, be stored, 
> and be single valued.");
> }
> {code}
> We should improve the condition to also check whether the field has docValues 
> set to true and useDocValuesAsStored set to true; if so, don't throw this 
> error (see the sketch below).
> Hoss pointed out in an offline discussion that this issue could be there in 
> other places in the codebase so keep this issue broad and not just tackle 
> DocBasedVersionConstraintsProcessorFactory.
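
A sketch of the relaxed condition, building on the check quoted above 
(this assumes {{SchemaField}} exposes {{hasDocValues()}} and 
{{useDocValuesAsStored()}}; it is an illustration, not the committed fix):

{code}
if (userVersionField == null
    || !(userVersionField.stored()
         || (userVersionField.hasDocValues()
             && userVersionField.useDocValuesAsStored()))
    || userVersionField.multiValued()) {
  throw new SolrException(SERVER_ERROR,
      "field " + versionField + " must be defined in schema, be stored (or " +
      "have docValues with useDocValuesAsStored), and be single valued.");
}
{code}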






[jira] [Updated] (SOLR-10533) Improve checks for which fields can be returned

2017-10-23 Thread Amrit Sarkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-10533:

Attachment: SOLR-10533.patch







[JENKINS] Lucene-Solr-7.x-Linux (64bit/jdk-9) - Build # 653 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/653/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 
--illegal-access=deny

1 tests failed.
FAILED:  
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication

Error Message:
Index 0 out-of-bounds for length 0

Stack Trace:
java.lang.IndexOutOfBoundsException: Index 0 out-of-bounds for length 0
at 
__randomizedtesting.SeedInfo.seed([83C2A3591B5D5A40:978AF80C385AE75E]:0)
at 
java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at 
java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at 
java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
at java.base/java.util.Objects.checkIndex(Objects.java:372)
at java.base/java.util.ArrayList.get(ArrayList.java:439)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:561)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215790#comment-16215790
 ] 

Robert Muir commented on LUCENE-8005:
-

Hi Uwe, I agree with those concerns too. But we should separate the two issues:

1) Class.getSimpleName not being fast enough
2) doing it in a cleaner way.

It is possible to fix #1 without fixing #2. And it's far too easy to add 
abstractions to Lucene "because it's faster". In this case the two concerns are 
unrelated, so IMO we should fix #1 here and then open a separate issue about 
#2.







[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215786#comment-16215786
 ] 

Scott Somerville commented on LUCENE-8005:
--

Yes, an interface was the other way I would have gone, depending on the 
feedback here.

I added a patch to use Class.getName, and the timings from running my previous 
snippet became 867ms, 861ms, and 859ms.

So, a sizable improvement, but not as fast as an instanceof check.
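
The snippet itself appears earlier in the thread and is not reproduced in this 
digest. A JMH-style harness along these lines (a sketch under assumed setup; 
the class and method names are invented) would sidestep the safepoint-bias 
concern raised above, since JMH measures end-to-end cost rather than sampling 
stacks at safepoints:

{code}
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class NameCheckBench {

  // Any concrete class works for the comparison; only the relative cost matters.
  private final Class<?> clazz = Integer.class;

  @Benchmark
  public boolean viaGetSimpleName() {
    // Walks enclosing-class metadata through a native call on every invocation
    // (the getEnclosingMethod0 frames in the hot_threads dump above).
    return clazz.getSimpleName().startsWith("Point");
  }

  @Benchmark
  public boolean viaGetName() {
    // Reads a cached field after the first call: far cheaper.
    return clazz.getName().endsWith("Query");
  }
}
{code}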







[jira] [Commented] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215782#comment-16215782
 ] 

Uwe Schindler commented on LUCENE-8005:
---

IMHO, the current code looks horrible. On the other hand, I agree with Robert 
that adding a superclass makes no sense, as there is no functionality to share, 
and maybe we will refactor some subclasses to use another query type. How about 
adding a marker interface and doing an instanceof check on it?

I don't like the current code, as it is not type-safe. What happens if you 
create a new point query without "point" in the name? Or if you have your own 
class with a name like this? The full class name, as suggested by Robert, does 
not help with that; it just improves the performance of the buggy check.







[jira] [Updated] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Attachment: LUCENE-8005.patch







[jira] [Updated] (LUCENE-8005) Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Summary: Avoid Class.getSimpleName in UsageTrackingQueryCachingPolicy  
(was: Avoid Class.getSimplename in UsageTrackingQueryCachingPolicy)







[jira] [Updated] (LUCENE-8005) Avoid Class.getSimplename in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Summary: Avoid Class.getSimplename in UsageTrackingQueryCachingPolicy  
(was: Avoid Reflection in UsageTrackingQueryCachingPolicy)







[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-23 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215764#comment-16215764
 ] 

Varun Thacker commented on LUCENE-7976:
---

bq. Wouldn't this mean that the segment sizes keep growing over time well 
beyond the max limit

Looking at the code, this is not possible. I'll cook up a patch to make this 
check's cutoff configurable: {{if (segBytes < maxMergedSegmentBytes/2.0)}}.
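
Something like the following, perhaps (the field and setter names are 
hypothetical, mirroring TMP's setter style; today the /2.0 is effectively 
hard-coded):

{code}
// Hypothetical knob: how large a segment may be, as a fraction of
// maxMergedSegmentBytes, before it stops being merge-eligible.
private double bigSegmentCutoff = 0.5;

public TieredMergePolicy setBigSegmentCutoff(double cutoff) {
  this.bigSegmentCutoff = cutoff;
  return this;
}

// ...and the quoted check would then read:
// if (segBytes < maxMergedSegmentBytes * bigSegmentCutoff) {
{code}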







[jira] [Commented] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215747#comment-16215747
 ] 

Robert Muir commented on LUCENE-8005:
-

OK, but I think the issue should be restated (not avoid reflection, just avoid 
Class.getSimpleName).

If we care about this being faster, we should change to use Class.getName, 
which has the speed of an ordinary getter (the name is computed once and then 
cached): 
http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/tip/src/java.base/share/classes/java/lang/Class.java#l772

In our case we don't deal with arrays, so it should be very easy. I greatly 
prefer this to adding new abstractions (Lucene has too many of those).
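
Concretely, a sketch of what that substitution might look like inside 
{{isPointQuery}} (assuming the existing superclass-walking, name-matching 
logic; an illustration, not the attached patch):

{code}
private static boolean isPointQuery(Query query) {
  for (Class<?> clazz = query.getClass(); clazz != Query.class;
       clazz = clazz.getSuperclass()) {
    // Class.getName() is effectively an ordinary getter, unlike
    // getSimpleName(), which walks enclosing-class metadata via a native
    // call on every invocation.
    final String name = clazz.getName();
    // For the top-level query classes involved here this recovers the simple
    // name; nested classes (with '$' in the name) would need extra care.
    final String simpleName = name.substring(name.lastIndexOf('.') + 1);
    if (simpleName.startsWith("Point") && simpleName.endsWith("Query")) {
      return true;
    }
  }
  return false;
}
{code}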







[JENKINS] Lucene-Solr-5.5-Linux (64bit/jdk1.7.0_80) - Build # 500 - Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-5.5-Linux/500/
Java: 64bit/jdk1.7.0_80 -XX:+UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  
org.apache.solr.handler.TestReplicationHandler.doTestIndexFetchOnMasterRestart

Error Message:
expected:<1> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<1> but was:<2>
at 
__randomizedtesting.SeedInfo.seed([35C43BCD529869E7:ED33FF29F943ABBB]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexFetchOnMasterRestart(TestReplicationHandler.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 11500 lines...]
   [junit4] Suite: 

[jira] [Commented] (LUCENE-8004) IndexUpgraderTool should rewrite segments rather than forceMerge

2017-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215682#comment-16215682
 ] 

Michael McCandless commented on LUCENE-8004:


Merges are usually compute-bound, and a given merge is single-threaded ... if 
you look in IndexWriter's info stream log you'll see which parts take the most 
time; it's usually postings in my experience.

Especially if you are merging away deleted docs, we can't apply the bulk-copy 
optimizations for stored fields and term vectors.

We don't have a bulk-copy optimization for postings.
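
For anyone who wants to see those timings, a minimal sketch of turning the 
info stream on (standard {{IndexWriterConfig}} API; the {{dir}} and 
{{analyzer}} variables are assumed to exist elsewhere):

{code:java}
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.PrintStreamInfoStream;

// dir is a Directory and analyzer an Analyzer, set up elsewhere.
IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
iwc.setInfoStream(new PrintStreamInfoStream(System.out)); // per-phase merge timings land here
IndexWriter writer = new IndexWriter(dir, iwc);
{code}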

> IndexUpgraderTool should rewrite segments rather than forceMerge
> 
>
> Key: LUCENE-8004
> URL: https://issues.apache.org/jira/browse/LUCENE-8004
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> Spinoff from LUCENE-7976. We help users get themselves into a corner by using 
> forceMerge on an index to rewrite all segments in the current Lucene format. 
> We should rewrite each individual segment instead. This would also help with 
> upgrading X-2->X-1, then X-1->X.
> Of course the preferred method is to re-index from scratch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215680#comment-16215680
 ] 

Shawn Heisey commented on SOLR-11514:
-

After a closer look at the output: if you are basing the heap size on the 
status output from the init script, then I think it may actually be working 
and you're just looking in the wrong place.

Here's an excerpt from the system info dump when I started with min 512MB and 
max 2GB:

{code:none}
"memory":{
  "free":"451.3 MB",
  "total":"490.7 MB",
  "max":"1.9 GB",
  "used":"39.4 MB (%2)",
  "raw":{
"free":473181152,
"total":514523136,
"max":2058027008,
"used":41341984,
"used%":2.0088163974182405}},
{code}

Note that "total" in the main section is 490.7 MB ... but "max" is 1.9 GB.  I 
bet the status output you're looking at is showing you that "total" number 
rather than "max", and that max is probably increased to the 4GB that you have 
indicated.  Check the admin UI in a graphical browser, and see if the third 
number on the "JVM memory" graph is up around 4000.
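
For reference, those three numbers map directly onto the standard 
{{java.lang.Runtime}} calls; a minimal sketch:

{code:java}
Runtime rt = Runtime.getRuntime();
long free  = rt.freeMemory();  // "free":  unused part of the currently committed heap
long total = rt.totalMemory(); // "total": heap committed so far; grows from -Xms toward -Xmx
long max   = rt.maxMemory();   // "max":   the -Xmx ceiling -- the number to actually check
System.out.printf("free=%d total=%d max=%d%n", free, total, max);
{code}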


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215675#comment-16215675
 ] 

ASF subversion and git services commented on SOLR-11429:


Commit 2c54ee2d6da489b73cff4891f9a96765787925eb in lucene-solr's branch 
refs/heads/branch_7x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2c54ee2 ]

SOLR-11429: Add loess Stream Evaluator to support Local Regression interpolation
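
For anyone wanting to try it, a hypothetical usage sketch extrapolated from 
the {{yvalues = loess(xvec, yvec)}} syntax in the issue description below (the 
arrays here are made up):

{code}
let(x=array(0, 1, 2, 3, 4, 5),
    y=array(0, 1.2, 3.9, 9.1, 15.8, 25.2),
    yfit=loess(x, y),
    tuple(yfit=yfit))
{code}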


> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11429.patch, SOLR-11429.patch
>
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Jason Gerlowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215672#comment-16215672
 ] 

Jason Gerlowski commented on SOLR-11514:


FWIW, I was unable to reproduce this on my system.  It was a fresh install for 
me; I had never used the install_solr_service.sh script before.

Anyway, hence my (misguided) earlier suspicion that Howard was editing the 
wrong solr.in.sh file: I couldn't reproduce the problem locally on 7.1.

> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215671#comment-16215671
 ] 

Shawn Heisey commented on SOLR-11514:
-

Steps for reproduction (on Ubuntu 16)

{code:none}
root@smeagol:/usr/local/src# ./install_solr_service.sh solr-7.1.0.tgz
id: ‘solr’: no such user
Creating new user: solr
Adding system user `solr' (UID 141) ...
Adding new group `solr' (GID 147) ...
Adding new user `solr' (UID 141) with group `solr' ...
Creating home directory `/var/solr' ...

Extracting solr-7.1.0.tgz to /opt


Installing symlink /opt/solr -> /opt/solr-7.1.0 ...


Installing /etc/init.d/solr script ...


Installing /etc/default/solr.in.sh ...

Service solr installed.
Customize Solr startup configuration in /etc/default/solr.in.sh
● solr.service - LSB: Controls Apache Solr as a Service
   Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
   Active: active (exited) since Mon 2017-10-23 12:50:12 MDT; 5s ago
 Docs: man:systemd-sysv-generator(8)
  Process: 25798 ExecStart=/etc/init.d/solr start (code=exited, 
status=0/SUCCESS)

Oct 23 12:50:05 smeagol systemd[1]: Starting LSB: Controls Apache Solr as a 
Service...
Oct 23 12:50:05 smeagol su[25805]: Successful su for solr by root
Oct 23 12:50:05 smeagol su[25805]: + ??? root:solr
Oct 23 12:50:05 smeagol su[25805]: pam_unix(su:session): session opened for 
user solr by (uid=0)
Oct 23 12:50:12 smeagol solr[25798]: [146B blob data]
Oct 23 12:50:12 smeagol solr[25798]: Started Solr server on port 8983 
(pid=25945). Happy searching!
Oct 23 12:50:12 smeagol solr[25798]: [14B blob data]
Oct 23 12:50:12 smeagol systemd[1]: Started LSB: Controls Apache Solr as a 
Service.
{code}

Then I edited the include script and added a SOLR_HEAP="2g" line.

{code}
root@smeagol:/usr/local/src# vi /etc/default/solr.in.sh
root@smeagol:/usr/local/src# service solr stop
root@smeagol:/usr/local/src# service solr start
root@smeagol:/usr/local/src# service solr status
● solr.service - LSB: Controls Apache Solr as a Service
   Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
   Active: active (exited) since Mon 2017-10-23 12:53:24 MDT; 2s ago
 Docs: man:systemd-sysv-generator(8)
  Process: 26182 ExecStop=/etc/init.d/solr stop (code=exited, status=0/SUCCESS)
  Process: 26336 ExecStart=/etc/init.d/solr start (code=exited, 
status=0/SUCCESS)

Oct 23 12:53:18 smeagol systemd[1]: Starting LSB: Controls Apache Solr as a 
Service...
Oct 23 12:53:18 smeagol su[26339]: Successful su for solr by root
Oct 23 12:53:18 smeagol su[26339]: + ??? root:solr
Oct 23 12:53:18 smeagol su[26339]: pam_unix(su:session): session opened for 
user solr by (uid=0)
Oct 23 12:53:24 smeagol solr[26336]: [146B blob data]
Oct 23 12:53:24 smeagol solr[26336]: Started Solr server on port 8983 
(pid=26475). Happy searching!
Oct 23 12:53:24 smeagol solr[26336]: [14B blob data]
Oct 23 12:53:24 smeagol systemd[1]: Started LSB: Controls Apache Solr as a 
Service.
{code}

The service status output doesn't show me the memory details, so I asked the 
running Solr for a system info dump:

{code}
root@smeagol:/usr/local/src# curl http://localhost:8983/solr/admin/info/system
{
  "responseHeader":{
"status":0,
"QTime":91},
  "mode":"std",
  "solr_home":"/var/solr/data",
  "lucene":{
"solr-spec-version":"7.1.0",
"solr-impl-version":"7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - 
ubuntu - 2017-10-13 16:15:59",
"lucene-spec-version":"7.1.0",
"lucene-impl-version":"7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - 
ubuntu - 2017-10-13 16:12:42"},
  "jvm":{
"version":"1.8.0_144 25.144-b01",
"name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
"spec":{
  "vendor":"Oracle Corporation",
  "name":"Java Platform API Specification",
  "version":"1.8"},
"jre":{
  "vendor":"Oracle Corporation",
  "version":"1.8.0_144"},
"vm":{
  "vendor":"Oracle Corporation",
  "name":"Java HotSpot(TM) 64-Bit Server VM",
  "version":"25.144-b01"},
"processors":4,
"memory":{
  "free":"1.8 GB",
  "total":"1.9 GB",
  "max":"1.9 GB",
  "used":"102.6 MB (%5.2)",
  "raw":{
"free":1950475352,
"total":2058027008,
"max":2058027008,
"used":107551656,
"used%":5.225959405873842}},
"jmx":{
  
"bootclasspath":"/usr/lib/jvm/java-8-oracle/jre/lib/resources.jar:/usr/lib/jvm/java-8-oracle/jre/lib/rt.jar:/usr/lib/jvm/java-8-oracle/jre/lib/sunrsasign.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jsse.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jce.jar:/usr/lib/jvm/java-8-oracle/jre/lib/charsets.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jfr.jar:/usr/lib/jvm/java-8-oracle/jre/classes",
  

[jira] [Commented] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215659#comment-16215659
 ] 

Scott Somerville commented on LUCENE-8005:
--

In case the above was unclear, here is another snippet that actually 
exercises the change:

{code:java}
// This snippet lives in package org.apache.lucene.search, since isCostly is
// not public; IntPoint and JUnit's assertTrue are assumed to be imported.
long start = System.currentTimeMillis();

Query q = IntPoint.newRangeQuery("intField", 1, 1000);

for (int i = 0; i < 10_000_000; i++) {
  assertTrue(UsageTrackingQueryCachingPolicy.isCostly(q));
}

System.out.println("Took " + (System.currentTimeMillis() - start) + "ms");
{code}

I ran it before and after, 3 times each:

{noformat}
BEFORE: Took 7019ms, Took 7074ms, Took 8108ms
AFTER:  Took 17ms, Took 12ms, Took 17ms
{noformat}


> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-SmokeRelease-7.x - Build # 65 - Still Failing

2017-10-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-7.x/65/

No tests ran.

Build Log:
[...truncated 28016 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/lucene/build/smokeTestRelease/dist
 [copy] Copying 476 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/lucene/build/smokeTestRelease/dist/lucene
 [copy] Copying 215 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/lucene/build/smokeTestRelease/dist/solr
   [smoker] Java 1.8 JAVA_HOME=/home/jenkins/tools/java/latest1.8
   [smoker] NOTE: output encoding is UTF-8
   [smoker] 
   [smoker] Load release URL 
"file:/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/lucene/build/smokeTestRelease/dist/"...
   [smoker] 
   [smoker] Test Lucene...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.06 sec (3.8 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download lucene-7.2.0-src.tgz...
   [smoker] 31.1 MB in 0.07 sec (469.3 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-7.2.0.tgz...
   [smoker] 69.6 MB in 0.08 sec (869.1 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-7.2.0.zip...
   [smoker] 80.0 MB in 0.07 sec (1160.7 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   unpack lucene-7.2.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6221 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-7.2.0.zip...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6221 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-7.2.0-src.tgz...
   [smoker] make sure no JARs/WARs in src dist...
   [smoker] run "ant validate"
   [smoker] run tests w/ Java 8 and testArgs='-Dtests.slow=false'...
   [smoker] test demo with 1.8...
   [smoker]   got 213 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] generate javadocs w/ Java 8...
   [smoker] 
   [smoker] Crawl/parse...
   [smoker] 
   [smoker] Verify...
   [smoker]   confirm all releases have coverage in TestBackwardsCompatibility
   [smoker] find all past Lucene releases...
   [smoker] run TestBackwardsCompatibility..
   [smoker] Releases that don't seem to be tested:
   [smoker]   6.6.2
   [smoker] Traceback (most recent call last):
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 1484, in 
   [smoker] main()
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 1428, in main
   [smoker] smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, 
c.is_signed, ' '.join(c.test_args))
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 1466, in smokeTest
   [smoker] unpackAndVerify(java, 'lucene', tmpDir, 'lucene-%s-src.tgz' % 
version, gitRevision, version, testArgs, baseURL)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 622, in unpackAndVerify
   [smoker] verifyUnpacked(java, project, artifact, unpackPath, 
gitRevision, version, testArgs, tmpDir, baseURL)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 774, in verifyUnpacked
   [smoker] confirmAllReleasesAreTestedForBackCompat(version, unpackPath)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/dev-tools/scripts/smokeTestRelease.py",
 line 1404, in confirmAllReleasesAreTestedForBackCompat
   [smoker] raise RuntimeError('some releases are not tested by 
TestBackwardsCompatibility?')
   [smoker] RuntimeError: some releases are not tested by 
TestBackwardsCompatibility?

BUILD FAILED
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-7.x/build.xml:622: 
exec returned: 1

Total time: 191 minutes 29 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-7994) Use int/int hash map for int taxonomy facet counts

2017-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215652#comment-16215652
 ] 

Michael McCandless commented on LUCENE-7994:


Woops, I will upgrade to 0.7.3; thanks [~dweiss]!

> Use int/int hash map for int taxonomy facet counts
> --
>
> Key: LUCENE-7994
> URL: https://issues.apache.org/jira/browse/LUCENE-7994
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-7994.patch, LUCENE-7994.patch
>
>
> Int taxonomy facets today always count into a dense {{int[]}}, which is 
> wasteful in cases where the number of unique facet labels is high and the 
> size of the current result set is small.
> I factored the native hash map from LUCENE-7927 and use a simple heuristic 
> (customizable by the user by subclassing) to decide up front whether to count 
> sparse or dense.  I also made loading of the large children and siblings 
> {{int[]}} lazy, so that they are only instantiated if you really need them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7994) Use int/int hash map for int taxonomy facet counts

2017-10-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215640#comment-16215640
 ] 

Dawid Weiss commented on LUCENE-7994:
-

I don't mind if you just copy the code, Mike, but it's always a pleasant 
feeling to have the library used. :) Upgrade the version before you commit 
though -- there's a minor performance improvement in {{addTo}}, which I see is 
used.

https://github.com/carrotsearch/hppc/releases/tag/0.7.3

> Use int/int hash map for int taxonomy facet counts
> --
>
> Key: LUCENE-7994
> URL: https://issues.apache.org/jira/browse/LUCENE-7994
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-7994.patch, LUCENE-7994.patch
>
>
> Int taxonomy facets today always count into a dense {{int[]}}, which is 
> wasteful in cases where the number of unique facet labels is high and the 
> size of the current result set is small.
> I factored the native hash map from LUCENE-7927 and use a simple heuristic 
> (customizable by the user by subclassing) to decide up front whether to count 
> sparse or dense.  I also made loading of the large children and siblings 
> {{int[]}} lazy, so that they are only instantiated if you really need them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215636#comment-16215636
 ] 

ASF subversion and git services commented on SOLR-11429:


Commit a6e12374515daa8c0e93c221eabf233a4d97dea4 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a6e1237 ]

SOLR-11429: Add loess Stream Evaluator to support Local Regression interpolation


> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11429.patch, SOLR-11429.patch
>
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 4245 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4245/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

9 tests failed.
FAILED:  org.apache.solr.cloud.MultiThreadedOCPTest.test

Error Message:
Task 3001 did not complete, final state: FAILED expected same: was 
not:

Stack Trace:
java.lang.AssertionError: Task 3001 did not complete, final state: FAILED 
expected same: was not:
at 
__randomizedtesting.SeedInfo.seed([84ADE6BF34F0DD17:CF9D9659A0CB0EF]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotSame(Assert.java:641)
at org.junit.Assert.assertSame(Assert.java:580)
at 
org.apache.solr.cloud.MultiThreadedOCPTest.testDeduplicationOfSubmittedTasks(MultiThreadedOCPTest.java:227)
at 
org.apache.solr.cloud.MultiThreadedOCPTest.test(MultiThreadedOCPTest.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 

[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215630#comment-16215630
 ] 

Shawn Heisey commented on SOLR-11514:
-

bq. It was a fresh install; I wiped everything and started over.

The output you shared says it couldn't have been a fresh install.

{code}
/mnt/solr-home//data/solr.xml already exists. Skipping install ...

/mnt/solr-home//log4j.properties already exists. Skipping install ...
{code}

You might want to investigate the "-f" option on the service installer script, 
which forces the install even when existing config is detected.

I'm going to try this out and see if I can reproduce the problem.
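
If I remember the installer's options correctly, a forced re-run with Howard's 
original paths would look something like this (please double-check the flag 
against the script's usage output; it's an assumption on my part):

{code:none}
# -f forces the install even when existing config is detected (assumed flag).
./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d /mnt/solr-home -s solr -u solr -f
{code}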


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7994) Use int/int hash map for int taxonomy facet counts

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215621#comment-16215621
 ] 

Robert Muir commented on LUCENE-7994:
-

Thanks Mike, the heuristic makes sense now that I've read the code!

> Use int/int hash map for int taxonomy facet counts
> --
>
> Key: LUCENE-7994
> URL: https://issues.apache.org/jira/browse/LUCENE-7994
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-7994.patch, LUCENE-7994.patch
>
>
> Int taxonomy facets today always count into a dense {{int[]}}, which is 
> wasteful in cases where the number of unique facet labels is high and the 
> size of the current result set is small.
> I factored the native hash map from LUCENE-7927 and use a simple heuristic 
> (customizable by the user by subclassing) to decide up front whether to count 
> sparse or dense.  I also made loading of the large children and siblings 
> {{int[]}} lazy, so that they are only instantiated if you really need them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Howard Black (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215611#comment-16215611
 ] 

Howard Black commented on SOLR-11514:
-

It was a fresh install; I wiped everything and started over. I modified just 
the SOLR_JAVA_MEM="-Xms512m -Xmx4096m" line, saved the file, and then 
restarted.  Nothing got picked up.

[root@solr-node01 bin]#
[root@solr-node01 bin]# service solr stop
Sending stop command to Solr running on port 8983 ... waiting up to 180 
seconds to allow Jetty process 10171 to stop gracefully.
[root@solr-node01 bin]# cd /etc/default
[root@solr-node01 default]# nano solr.in.sh
[root@solr-node01 default]# service solr start
Waiting up to 180 seconds to see Solr running on port 8983 [\]
Started Solr server on port 8983 (pid=10637). Happy searching!

[root@solr-node01 default]# service solr status

Found 1 Solr nodes:

Solr process 10637 running on port 8983
{
   "solr_home":"/mnt/solr-home/data",
   "version":"7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 
2017-10-13 16:15:59",
   "startTime":"2017-10-23T18:09:18.322Z",
   "uptime":"0 days, 0 hours, 0 minutes, 10 seconds",
   "memory":"23.6 MB (%0.6) of 490.7 MB"}







> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7994) Use int/int hash map for int taxonomy facet counts

2017-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215610#comment-16215610
 ] 

Michael McCandless commented on LUCENE-7994:


bq. Those hash collisions, on the other hand, were possible to hit with trivial 
map-iteration-copying blocks and they were nasty (and people rightfully didn't 
and couldn't anticipate them to happen). So I went for "better slower than 
sorry" direction...

I agree that is the right tradeoff!  Reduce the chances of adversarial cases ...

Thanks [~dweiss].

> Use int/int hash map for int taxonomy facet counts
> --
>
> Key: LUCENE-7994
> URL: https://issues.apache.org/jira/browse/LUCENE-7994
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-7994.patch, LUCENE-7994.patch
>
>
> Int taxonomy facets today always count into a dense {{int[]}}, which is 
> wasteful in cases where the number of unique facet labels is high and the 
> size of the current result set is small.
> I factored the native hash map from LUCENE-7927 and use a simple heuristic 
> (customizable by the user by subclassing) to decide up front whether to count 
> sparse or dense.  I also made loading of the large children and siblings 
> {{int[]}} lazy, so that they are only instantiated if you really need them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7994) Use int/int hash map for int taxonomy facet counts

2017-10-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-7994:
---
Attachment: LUCENE-7994.patch

Another iteration; I think it's ready.

I decided to just add a dependency on HPPC rather than micro-fork; HPPC is fun 
to work with!  I used {{IntIntScatterMap}} and {{LongIntScatterMap}}, since I 
don't ever copy from one hashed structure to another.

I also folded in [~rcmuir]'s feedback.
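
For context, a minimal sketch (my illustration, not the patch) of the 
sparse-counting pattern with an HPPC scatter map, including the {{addTo}} call:

{code:java}
import com.carrotsearch.hppc.IntIntScatterMap;
import com.carrotsearch.hppc.cursors.IntIntCursor;

// matchingOrds is a hypothetical int[] of facet ordinals for the current result set.
IntIntScatterMap counts = new IntIntScatterMap();
for (int ord : matchingOrds) {
  counts.addTo(ord, 1);  // increment-or-insert in a single call
}
for (IntIntCursor c : counts) {
  System.out.println("ordinal " + c.key + " -> count " + c.value);
}
{code}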

> Use int/int hash map for int taxonomy facet counts
> --
>
> Key: LUCENE-7994
> URL: https://issues.apache.org/jira/browse/LUCENE-7994
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (8.0), 7.2
>
> Attachments: LUCENE-7994.patch, LUCENE-7994.patch
>
>
> Int taxonomy facets today always count into a dense {{int[]}}, which is 
> wasteful in cases where the number of unique facet labels is high and the 
> size of the current result set is small.
> I factored the native hash map from LUCENE-7927 and use a simple heuristic 
> (customizable by the user by subclassing) to decide up front whether to count 
> sparse or dense.  I also made loading of the large children and siblings 
> {{int[]}} lazy, so that they are only instantiated if you really need them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-11429:
--
Fix Version/s: master (8.0)
   7.2

> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11429.patch, SOLR-11429.patch
>
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-11429:
--
Attachment: SOLR-11429.patch

> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-11429.patch, SOLR-11429.patch
>
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215579#comment-16215579
 ] 

Shawn Heisey commented on SOLR-11514:
-

bq. If I modify the /etc/default/solr.in.sh file, nothing gets picked up.

This is a potentially basic question, but I think it must be asked:  Did you 
restart (or stop and then start) the Solr service after modifying the include 
script?


> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11326) CDCR bootstrap should not download tlog's from source

2017-10-23 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-11326:
-
Attachment: SOLR-11326.patch

Updated patch. The new test was just checking the number of tlogs that were 
downloaded, so I folded that condition into an existing test: 
{{testBootstrapWithSourceCluster}}.

> CDCR bootstrap should not download tlog's from source
> -
>
> Key: SOLR-11326
> URL: https://issues.apache.org/jira/browse/SOLR-11326
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Assignee: Varun Thacker
> Attachments: SOLR-11326.patch, SOLR-11326.patch, SOLR-11326.patch, 
> SOLR-11326.patch, SOLR-11326.patch, WITHOUT-FIX.patch
>
>
> While analyzing two separate failures on SOLR-11278 I see that during bootstrap 
> the tlogs from the source are getting downloaded
> snippet1:
> {code}
>[junit4]   2> 42931 INFO  (qtp1525032019-69) [n:127.0.0.1:53178_solr 
> c:cdcr-source s:shard1 r:core_node1 x:cdcr-source_shard1_replica1] 
> o.a.s.h.CdcrReplicatorManager Submitting bootstrap task to executor
>[junit4]   2> 42934 INFO  
> (cdcr-bootstrap-status-32-thread-1-processing-n:127.0.0.1:53178_solr 
> x:cdcr-source_shard1_replica1 s:shard1 c:cdcr-source r:core_node1) 
> [n:127.0.0.1:53178_solr c:cdcr-source s:shard1 r:core_node1 
> x:cdcr-source_shard1_replica1] o.a.s.h.CdcrReplicatorManager Attempting to 
> bootstrap target collection: cdcr-target shard: shard1 leader: 
> http://127.0.0.1:53170/solr/cdcr-target_shard1_replica1/
>[junit4]   2> 43003 INFO  (qtp1525032019-69) [n:127.0.0.1:53178_solr 
> c:cdcr-source s:shard1 r:core_node1 x:cdcr-source_shard1_replica1] 
> o.a.s.c.S.Request [cdcr-source_shard1_replica1]  webapp=/solr 
> path=/replication 
> params={qt=/replication=javabin=2=indexversion} status=0 
> QTime=0
>[junit4]   2> 43004 INFO  
> (recoveryExecutor-6-thread-1-processing-n:127.0.0.1:53170_solr 
> x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) 
> [n:127.0.0.1:53170_solr c:cdcr-target s:shard1 r:core_node1 
> x:cdcr-target_shard1_replica1] o.a.s.h.IndexFetcher Master's generation: 12
>[junit4]   2> 43004 INFO  
> (recoveryExecutor-6-thread-1-processing-n:127.0.0.1:53170_solr 
> x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) 
> [n:127.0.0.1:53170_solr c:cdcr-target s:shard1 r:core_node1 
> x:cdcr-target_shard1_replica1] o.a.s.h.IndexFetcher Master's version: 
> 1503514968639
>[junit4]   2> 43004 INFO  
> (recoveryExecutor-6-thread-1-processing-n:127.0.0.1:53170_solr 
> x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) 
> [n:127.0.0.1:53170_solr c:cdcr-target s:shard1 r:core_node1 
> x:cdcr-target_shard1_replica1] o.a.s.h.IndexFetcher Slave's generation: 1
>[junit4]   2> 43004 INFO  
> (recoveryExecutor-6-thread-1-processing-n:127.0.0.1:53170_solr 
> x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) 
> [n:127.0.0.1:53170_solr c:cdcr-target s:shard1 r:core_node1 
> x:cdcr-target_shard1_replica1] o.a.s.h.IndexFetcher Slave's version: 0
>[junit4]   2> 43004 INFO  
> (recoveryExecutor-6-thread-1-processing-n:127.0.0.1:53170_solr 
> x:cdcr-target_shard1_replica1 s:shard1 c:cdcr-target r:core_node1) 
> [n:127.0.0.1:53170_solr c:cdcr-target s:shard1 r:core_node1 
> x:cdcr-target_shard1_replica1] o.a.s.h.IndexFetcher Starting replication 
> process
>[junit4]   2> 43041 INFO  (qtp1525032019-71) [n:127.0.0.1:53178_solr 
> c:cdcr-source s:shard1 r:core_node1 x:cdcr-source_shard1_replica1] 
> o.a.s.h.ReplicationHandler Adding tlog files to list: [{size=4649, 
> name=tlog.000.1576549701811961856}, {size=4770, 
> name=tlog.001.1576549702515556352}, {size=4770, 
> name=tlog.002.1576549702628802560}, {size=4770, 
> name=tlog.003.1576549702720028672}, {size=4770, 
> name=tlog.004.1576549702799720448}, {size=4770, 
> name=tlog.005.1576549702894092288}, {size=4770, 
> name=tlog.006.1576549703029358592}, {size=4770, 
> name=tlog.007.1576549703126876160}, {size=4770, 
> name=tlog.008.1576549703208665088}, {size=4770, 
> name=tlog.009.1576549703295696896}
> {code}
> snippet2:
> {code}
>  17070[junit4]   2> 677606 INFO  (qtp22544544-5725) [] 
> o.a.s.h.CdcrReplicatorManager Attempting to bootstrap target collection: 
> cdcr-target, shard: shard1^M
>  17071[junit4]   2> 677608 INFO  (qtp22544544-5725) [] 
> o.a.s.h.CdcrReplicatorManager Submitting bootstrap task to executor^M
> 17091[junit4]   2> 677627 INFO  (qtp22544544-5724) [] 
> o.a.s.c.S.Request [cdcr-source_shard1_replica_n1]  webapp=/solr 
> path=/replication 
> 

[jira] [Commented] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215556#comment-16215556
 ] 

Scott Somerville commented on LUCENE-8005:
--

Here's a quick code snippet:

{code:java}
// Stopwatch here is Guava's com.google.common.base.Stopwatch.
Stopwatch sw = Stopwatch.createStarted();

for (int i = 0; i < 100_000_000; i++) {
  Integer.class.getSimpleName();
}

System.out.println("getSimpleName took " + sw);

sw.reset();
sw.start();

for (int i = 0; i < 100_000_000; i++) {
  Integer.class.getSuperclass();
}

System.out.println("getSuperclass took " + sw);

sw.reset();
sw.start();

Integer a = 42;

for (int i = 0; i < 100_000_000; i++) {
  boolean b = a instanceof Number;
}

System.out.println("instanceof took " + sw);
{code}

Output:

{noformat}
getSimpleName took 9.810 s
getSuperclass took 5.454 ms
instanceof took 4.348 ms
{noformat}


> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11514) Solr 7.1 does not honor values specified in solr.in.sh

2017-10-23 Thread Howard Black (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215553#comment-16215553
 ] 

Howard Black commented on SOLR-11514:
-

This is from the ./install_solr_service.sh script.

[root@solr-node01 bin]# ./install_solr_service.sh /home/hlblack/solr-7.1.0.tgz 
-i /opt -d /mnt/solr-home/ -s solr -u solr

Extracting /home/hlblack/solr-7.1.0.tgz to /opt


Installing symlink /opt/solr -> /opt/solr-7.1.0 ...


Installing /etc/init.d/solr script ...


Installing /etc/default/solr.in.sh ...


/mnt/solr-home//data/solr.xml already exists. Skipping install ...


/mnt/solr-home//log4j.properties already exists. Skipping install ...

Service solr installed.
Customize Solr startup configuration in /etc/default/solr.in.sh
Waiting up to 180 seconds to see Solr running on port 8983 [\]  
Started Solr server on port 8983 (pid=10171). Happy searching!


Found 1 Solr nodes: 

Solr process 10171 running on port 8983
{
  "solr_home":"/mnt/solr-home/data",
  "version":"7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 
2017-10-13 16:15:59",
  "startTime":"2017-10-23T18:06:14.533Z",
  "uptime":"0 days, 0 hours, 0 minutes, 8 seconds",
  "memory":"43 MB (%8.8) of 490.7 MB"}

[root@solr-node01 bin]# 


If I modify the /etc/default/solr.in.sh file, nothing gets picked up.

> Solr 7.1 does not honor values specified in solr.in.sh
> --
>
> Key: SOLR-11514
> URL: https://issues.apache.org/jira/browse/SOLR-11514
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: release-scripts, SolrCloud
>Affects Versions: 7.1
> Environment: Linux RHEL 7, 8GB RAM, 60GB HDD, Solr 7.1 in cloud mode 
> (zookeeper 3.4.10)
>Reporter: Howard Black
> Attachments: solr, solr.in.sh
>
>
> Just installed Solr 7.1 and zookeeper 3.4.10 into a test environment and it 
> seems that arguments in the solr.in.sh file in /etc/default are not getting 
> picked up when starting the server.
> I have this specified in solr.in.sh SOLR_JAVA_MEM="-Xms512m -Xmx6144m" but 
> the JVM shows -Xms512m -Xmx512m.
> Same goes for SOLR_LOGS_DIR=/mnt/logs  logs are still being written to 
> /opt/solr/server/logs
> The command I used to install Solr is this:
> ./install_solr_service.sh /home/hblack/solr-7.1.0.tgz -i /opt -d 
> /mnt/solr-home -s solr -u solr



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-11429:
--
Attachment: SOLR-11429.patch

> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-11429.patch
>
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-11429) Add loess Stream Evaluator to support Local Regression interpolation

2017-10-23 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-11429:
-

Assignee: Joel Bernstein

> Add loess Stream Evaluator to support Local Regression interpolation
> 
>
> Key: SOLR-11429
> URL: https://issues.apache.org/jira/browse/SOLR-11429
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
> The loess function will fit a curved line through a set of points using the 
> Local Regression Algorithm.
> Syntax:
> {code}
> yvalues = loess(xvec, yvec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_144) - Build # 20718 - Still Unstable!

2017-10-23 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20718/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  
org.apache.solr.cloud.TestCollectionsAPIViaSolrCloudCluster.testCollectionCreateSearchDelete

Error Message:
Error from server at 
https://127.0.0.1:39407/solr/testcollection_shard1_replica_n2: Expected mime 
type application/octet-stream but got text/html.

Error 404
HTTP ERROR: 404
Problem accessing /solr/testcollection_shard1_replica_n2/update. Reason:
Can not find: /solr/testcollection_shard1_replica_n2/update
Powered by Jetty:// 9.3.20.v20170531 (http://eclipse.org/jetty)

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from 
server at https://127.0.0.1:39407/solr/testcollection_shard1_replica_n2: 
Expected mime type application/octet-stream but got text/html. 


Error 404 


HTTP ERROR: 404
Problem accessing /solr/testcollection_shard1_replica_n2/update. Reason:
Can not find: /solr/testcollection_shard1_replica_n2/update
Powered by Jetty:// 9.3.20.v20170531 (http://eclipse.org/jetty)



at 
__randomizedtesting.SeedInfo.seed([4EC96A281BFF5385:ED33C48D9C17B920]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:541)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:998)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at 
org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:233)
at 
org.apache.solr.cloud.TestCollectionsAPIViaSolrCloudCluster.testCollectionCreateSearchDelete(TestCollectionsAPIViaSolrCloudCluster.java:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)

[jira] [Commented] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215539#comment-16215539
 ] 

Robert Muir commented on LUCENE-8005:
-

Are you sure there is a real performance problem? I don't believe that 
accessing a class's getSimpleName is expensive.

Such "sampling" profiling techniques are fundamentally broken with java: 
http://psy-lob-saw.blogspot.com/2016/02/why-most-sampling-java-profilers-are.html
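
One way to check the cost directly, rather than via a sampling profiler, is a 
small JMH microbenchmark. A minimal sketch follows (it assumes the JMH 
dependency; both classes are hypothetical stand-ins, with the anonymous class 
exercising the getEnclosingMethod0 path from the hot_threads excerpt):

{code}
import org.openjdk.jmh.annotations.Benchmark;

// Minimal JMH sketch for measuring Class.getSimpleName() directly instead of
// trusting a sampling profiler. Both classes are hypothetical stand-ins.
public class GetSimpleNameBenchmark {

  // An anonymous subclass: getSimpleName() on it goes through
  // getEnclosingClass()/getEnclosingMethod0(), as in the hot_threads excerpt.
  static final Object ANONYMOUS = new Object() {};

  static class Nested {}

  @Benchmark
  public String anonymousSimpleName() {
    // Returning the value lets JMH consume it, preventing dead-code elimination.
    return ANONYMOUS.getClass().getSimpleName();
  }

  @Benchmark
  public String nestedSimpleName() {
    return Nested.class.getSimpleName();
  }
}
{code}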


> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Attachment: (was: LUCENE-8005.patch)

> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Attachment: LUCENE-8005.patch

> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215533#comment-16215533
 ] 

Scott Somerville commented on LUCENE-8005:
--

Attached a patch of one possible fix. It makes all existing Point*Query classes 
that I could find extend a new abstract base class, PointQuery, which extends 
Query, so that we can do an instanceof check instead of reflection.

Alternatively, PointQuery could be an interface that each one implements.
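
A minimal sketch of that approach (illustrative only, not the attached patch; 
Query here is a stub standing in for org.apache.lucene.search.Query):

{code}
// Stub standing in for org.apache.lucene.search.Query.
abstract class Query {}

// Common supertype that every Point*Query would extend (or, alternatively,
// a marker interface that each one implements).
abstract class PointQuery extends Query {}

class PointRangeQuery extends PointQuery {}

public class InstanceOfSketch {
  // A single type check replaces walking the class hierarchy and
  // string-matching Class.getSimpleName().
  static boolean isPointQuery(Query query) {
    return query instanceof PointQuery;
  }

  public static void main(String[] args) {
    System.out.println(isPointQuery(new PointRangeQuery())); // true
  }
}
{code}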

> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Somerville updated LUCENE-8005:
-
Attachment: LUCENE-8005.patch

> Avoid Reflection in UsageTrackingQueryCachingPolicy
> ---
>
> Key: LUCENE-8005
> URL: https://issues.apache.org/jira/browse/LUCENE-8005
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Scott Somerville
> Attachments: LUCENE-8005.patch
>
>
> By profiling an Elasticsearch cluster, I found the private method 
> UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
> clazz.getSimpleName() call.
> Here is an excerpt from hot_threads:
> {noformat}
> java.lang.Class.getEnclosingMethod0(Native Method)
>java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
>java.lang.Class.getEnclosingClass(Class.java:1272)
>java.lang.Class.getSimpleBinaryName(Class.java:1443)
>java.lang.Class.getSimpleName(Class.java:1309)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
>
> org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
>
> org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
>
> org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
>
> org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
>org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
>org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-8005) Avoid Reflection in UsageTrackingQueryCachingPolicy

2017-10-23 Thread Scott Somerville (JIRA)
Scott Somerville created LUCENE-8005:


 Summary: Avoid Reflection in UsageTrackingQueryCachingPolicy
 Key: LUCENE-8005
 URL: https://issues.apache.org/jira/browse/LUCENE-8005
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Scott Somerville


By profiling an Elasticsearch cluster, I found the private method 
UsageTrackingQueryCachingPolicy.isPointQuery to be quite expensive due to the 
clazz.getSimpleName() call.

Here is an excerpt from hot_threads:

{noformat}
java.lang.Class.getEnclosingMethod0(Native Method)
   java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
   java.lang.Class.getEnclosingClass(Class.java:1272)
   java.lang.Class.getSimpleBinaryName(Class.java:1443)
   java.lang.Class.getSimpleName(Class.java:1309)
   
org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isPointQuery(UsageTrackingQueryCachingPolicy.java:39)
   
org.apache.lucene.search.UsageTrackingQueryCachingPolicy.isCostly(UsageTrackingQueryCachingPolicy.java:54)
   
org.apache.lucene.search.UsageTrackingQueryCachingPolicy.minFrequencyToCache(UsageTrackingQueryCachingPolicy.java:121)
   
org.apache.lucene.search.UsageTrackingQueryCachingPolicy.shouldCache(UsageTrackingQueryCachingPolicy.java:179)
   
org.elasticsearch.index.shard.ElasticsearchQueryCachingPolicy.shouldCache(ElasticsearchQueryCachingPolicy.java:53)
   
org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:806)
   
org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:168)
   org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
   org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
   org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:388)
   org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-9735) Umbrella JIRA for Auto Scaling and Cluster Management in SolrCloud

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-9735:

Component/s: AutoScaling

> Umbrella JIRA for Auto Scaling and Cluster Management in SolrCloud
> --
>
> Key: SOLR-9735
> URL: https://issues.apache.org/jira/browse/SOLR-9735
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Anshum Gupta
>Assignee: Shalin Shekhar Mangar
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> As SolrCloud is now used at fairly large scale, most users end up writing 
> their own cluster management tools. We should have a framework for cluster 
> management in Solr.
> In a discussion with [~noble.paul], we outlined the following steps w.r.t. 
> the approach to having this implemented:
> * *Basic API* calls for cluster management e.g. utilize added nodes, remove a 
> node etc. These calls would need explicit invocation by the users to begin 
> with. It would also specify the {{strategy}} to use. For instance, I could 
> have a strategy called {{optimizeCoreCount}} which would aim for an even 
> number of cores on each node. The strategy could optionally take parameters 
> as well.
> * *Metrics* and stats tracking e.g. qps, etc. These would be required for any 
> advanced cluster management tasks e.g. *maintain a qps of 'x'* by 
> *auto-adding a replica* (using a recipe) etc. We would need 
> collection/shard/node level views of metrics for this.
> * *Recipes*: combination of multiple sequential/parallel API calls based on 
> rules. This would be complicated, especially as most of these would be 
> long-running series of tasks which would either have to be rolled back or 
> resumed in case of a failure.
> * *Event based triggers* that would not require explicit cluster management 
> calls for end users.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11524) Create an autoscaling/suggestions API end-point

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11524:
-
Component/s: AutoScaling

> Create an autoscaling/suggestions API end-point
> --
>
> Key: SOLR-11524
> URL: https://issues.apache.org/jira/browse/SOLR-11524
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11519) Suggestions for replica count violations

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11519:
-
Component/s: AutoScaling

> Suggestions for replica count violations
> 
>
> Key: SOLR-11519
> URL: https://issues.apache.org/jira/browse/SOLR-11519
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11520) Suggestions for cores violations

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11520:
-
Component/s: AutoScaling

> Suggestions for cores violations
> 
>
> Key: SOLR-11520
> URL: https://issues.apache.org/jira/browse/SOLR-11520
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11521) suggestions to add more nodes

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11521:
-
Component/s: AutoScaling

> suggestions to add more nodes
> -
>
> Key: SOLR-11521
> URL: https://issues.apache.org/jira/browse/SOLR-11521
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11523) suggestions to remove nodes

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11523:
-
Component/s: AutoScaling

> suggestions to remove nodes
> ---
>
> Key: SOLR-11523
> URL: https://issues.apache.org/jira/browse/SOLR-11523
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11522) Suggestions/recommendations to rebalance replicas

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11522:
-
Component/s: AutoScaling

> Suggestions/recommendations to rebalance replicas
> -
>
> Key: SOLR-11522
> URL: https://issues.apache.org/jira/browse/SOLR-11522
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11518) Create suggestions for freedisk violations

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11518:
-
Component/s: AutoScaling

> Create suggestions for freedisk violations
> --
>
> Key: SOLR-11518
> URL: https://issues.apache.org/jira/browse/SOLR-11518
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11359) An autoscaling/suggestions endpoint to recommend operations

2017-10-23 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11359:
-
Component/s: AutoScaling

> An autoscaling/suggestions endpoint to recommend operations
> ---
>
> Key: SOLR-11359
> URL: https://issues.apache.org/jira/browse/SOLR-11359
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-11359.patch
>
>
> Autoscaling can make suggestions to users on what operations they can perform 
> to improve the health of the cluster.
> The suggestions will have the following information:
> * http end point
> * http method (POST,DELETE)
> * command payload
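
To make those three fields concrete, one suggestion entry could be modeled as 
below (a hypothetical illustration; the field names are not from this issue 
and do not reflect the actual API shape):

{code}
// Hypothetical model of a single suggestion entry; names and comments are
// illustrative only, not the actual API.
public class AutoscalingSuggestion {
  public final String endPoint;   // HTTP end point to call
  public final String httpMethod; // "POST" or "DELETE"
  public final String payload;    // command payload to send as the request body

  public AutoscalingSuggestion(String endPoint, String httpMethod, String payload) {
    this.endPoint = endPoint;
    this.httpMethod = httpMethod;
    this.payload = payload;
  }
}
{code}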



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-23 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215513#comment-16215513
 ] 

Varun Thacker commented on LUCENE-7976:
---

Hi Mike,

bq. 50% is too many deleted docs for some use cases; fixing TMP to let the 
large segments be eligible for merging, always, plus maybe tuning up the 
existing reclaimDeletesWeight, fixes that.

I'm interested in tackling this use-case. This is what you had stated in a 
previous reply as a potential solution:

 bq. Still, I think it's OK to relax TMP so it will allow max sized segments 
with less than 50% deletions to be eligible for merging, and users can tune the 
deletions weight to force TMP to aggressively merge such segments. This would 
be a tiny change in the loop that computes tooBigCount.

So you are proposing to change this statement, {{if (segBytes < 
maxMergedSegmentBytes/2.0)}}, and make the 2.0 (i.e. 50%) configurable? 
Wouldn't this mean that segment sizes keep growing over time, well beyond 
the max limit? Would that have downsides for the index in the long run in 
terms of performance?
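
For reference, a sketch of the relaxed check under discussion (names, the 
default, and the example numbers are all illustrative, not from an attached 
patch):

{code}
public class TmpEligibilitySketch {
  // TMP sizes segments by live data, so deletes shrink segBytes over time.
  // deletesPctAllowed == 50 reproduces the current maxMergedSegmentBytes/2.0
  // check; smaller values make large segments eligible for merging sooner.
  static boolean eligibleForMerging(long segBytes, long maxMergedSegmentBytes,
                                    double deletesPctAllowed) {
    double divisor = 100.0 / (100.0 - deletesPctAllowed); // 50 -> 2.0, 20 -> 1.25
    return segBytes < maxMergedSegmentBytes / divisor;
  }

  public static void main(String[] args) {
    long maxBytes = 5L << 30;        // 5G max merged segment size
    long liveBytes = (7L << 30) / 2; // 3.5G live in a 5G segment (30% deleted)
    System.out.println(eligibleForMerging(liveBytes, maxBytes, 50)); // false (today)
    System.out.println(eligibleForMerging(liveBytes, maxBytes, 20)); // true (relaxed)
  }
}
{code}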



 



> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


