Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand

2017-10-20 Thread Shalin Shekhar Mangar
Congratulations Adrien!

On Thu, Oct 19, 2017 at 12:49 PM, Tommaso Teofili
 wrote:
> Once a year the Lucene PMC rotates the PMC chair and Apache Vice President
> position.
> This year we have nominated and elected Adrien Grand as the chair and today
> the board just approved it, so now it's official.
>
> Congratulations Adrien!
> Regards,
> Tommaso



-- 
Regards,
Shalin Shekhar Mangar.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Added ClassicSimilarity and BooleanSimilarity to the testing, and randomized the 
BM25 parameters and boosts.
ClassicSimilarity was fine; it just needed explain() cleaned up to exactly match 
score().

Note that query boosts and BM25's k1 parameter are only tested within a 
"reasonable" range (0..Integer.MAX_VALUE) so we can fail the test if the sim 
has unexpected internal overflows; this is just trying to kick out the sim bugs.
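
As a rough illustration of the property being enforced, here is a
self-contained sketch (standalone arithmetic, not the attached patch and not
Lucene's Similarity API; loop bounds chosen arbitrarily) that draws random BM25
parameters in those ranges and checks that the score never decreases as freq
grows, the property from the issue description:

{noformat}
// Standalone sketch of the monotonicity/sanity property, not Lucene test code.
import java.util.Random;

public class SimilaritySanitySketch {
  static double bm25(double freq, double k1, double b, double dl, double avgdl, double idf) {
    double tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)); // term saturation
    return (k1 + 1) * idf * tf;
  }

  public static void main(String[] args) {
    Random r = new Random();
    for (int iter = 0; iter < 1000; iter++) {
      double k1 = r.nextDouble() * Integer.MAX_VALUE; // "reasonable" range (0..Integer.MAX_VALUE)
      double b = r.nextDouble();                      // b is always within [0,1]
      double idf = 1 + r.nextDouble() * 20;           // stands in for a random term/index
      double last = Double.NEGATIVE_INFINITY;
      for (double freq = 1; freq <= (1 << 20); freq *= 2) {
        double score = bm25(freq, k1, b, 1, 1, idf);
        if (score < last) throw new AssertionError("score decreased as freq grew");
        last = score;
      }
    }
  }
}
{noformat}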

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-5.5 - Build # 35 - Still Failing

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.5/35/

1 tests failed.
FAILED:  org.apache.solr.schema.TestCloudManagedSchemaConcurrent.test

Error Message:
QUERY FAILED: 
xpath=/response/lst[@name='responseHeader']/int[@name='status'][.='0']  
request=/schema/dynamicfields/?wt=xml=15  response=  500   15377  org.apache.solr.common.SolrException org.apache.solr.common.SolrException  
7 out of 8 replicas failed to update their schema to version 
108 within 15 seconds! Failed cores: [https://127.0.0.1:35935/collection1/, 
https://127.0.0.1:37622/collection1/, https://127.0.0.1:44703/collection1/, 
https://127.0.0.1:44668/collection1/, https://127.0.0.1:45547/collection1/, 
https://127.0.0.1:41451/collection1/, 
https://127.0.0.1:45957/collection1/]   org.apache.solr.common.SolrException: 7 out of 8 replicas failed 
to update their schema to version 108 within 15 seconds! Failed cores: 
[https://127.0.0.1:35935/collection1/, https://127.0.0.1:37622/collection1/, 
https://127.0.0.1:44703/collection1/, https://127.0.0.1:44668/collection1/, 
https://127.0.0.1:45547/collection1/, https://127.0.0.1:41451/collection1/, 
https://127.0.0.1:45957/collection1/]  at 
org.apache.solr.schema.ManagedIndexSchema.waitForSchemaZkVersionAgreement(ManagedIndexSchema.java:264)
  at 
org.apache.solr.rest.schema.BaseFieldResource.waitForSchemaUpdateToPropagate(BaseFieldResource.java:119)
  at 
org.apache.solr.rest.schema.DynamicFieldCollectionResource.post(DynamicFieldCollectionResource.java:195)
  at org.restlet.resource.ServerResource.doHandle(ServerResource.java:454)  at 
org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:359)
  at org.restlet.resource.ServerResource.handle(ServerResource.java:1044)  at 
org.restlet.resource.Finder.handle(Finder.java:236)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Router.doHandle(Router.java:422)  at 
org.restlet.routing.Router.handle(Router.java:639)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:140)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202)  at 
org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:75)
  at org.restlet.Application.handle(Application.java:385)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Router.doHandle(Router.java:422)  at 
org.restlet.routing.Router.handle(Router.java:639)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.routing.Router.doHandle(Router.java:422)  at 
org.restlet.routing.Router.handle(Router.java:639)  at 
org.restlet.routing.Filter.doHandle(Filter.java:150)  at 
org.restlet.routing.Filter.handle(Filter.java:197)  at 
org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202)  at 
org.restlet.Component.handle(Component.java:408)  at 
org.restlet.Server.handle(Server.java:507)  at 
org.restlet.engine.connector.ServerHelper.handle(ServerHelper.java:63)  at 
org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:143)  
at org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1117)  at 
javax.servlet.http.HttpServlet.service(HttpServlet.java:790)  at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)  at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)  at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)  
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
  at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)  
at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:191)  at 
org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:72)  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:266)
  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
  at 

[JENKINS] Lucene-Solr-7.x-MacOSX (64bit/jdk-9) - Build # 258 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-MacOSX/258/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseParallelGC 
--illegal-access=deny

1 tests failed.
FAILED:  org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration

Error Message:
Timed out waiting for replicas of collection to be 2 again null Live Nodes: 
[127.0.0.1:49460_solr] Last available state: 
DocCollection(testIntegration//collections/testIntegration/state.json/8)={   
"pullReplicas":"0",   "replicationFactor":"2",   "shards":{"shard1":{   
"range":"8000-7fff",   "state":"active",   "replicas":{ 
"core_node3":{   "core":"testIntegration_shard1_replica_n1",   
"base_url":"http://127.0.0.1:49460/solr;,   
"node_name":"127.0.0.1:49460_solr",   "state":"active",   
"type":"NRT",   "leader":"true"}, "core_node4":{   
"core":"testIntegration_shard1_replica_n2",   
"base_url":"http://127.0.0.1:49459/solr;,   
"node_name":"127.0.0.1:49459_solr",   "state":"down",   
"type":"NRT",   "router":{"name":"compositeId"},   "maxShardsPerNode":"1",  
 "autoAddReplicas":"false",   "nrtReplicas":"2",   "tlogReplicas":"0"}

Stack Trace:
java.lang.AssertionError: Timed out waiting for replicas of collection to be 2 
again
null
Live Nodes: [127.0.0.1:49460_solr]
Last available state: 
DocCollection(testIntegration//collections/testIntegration/state.json/8)={
  "pullReplicas":"0",
  "replicationFactor":"2",
  "shards":{"shard1":{
  "range":"8000-7fff",
  "state":"active",
  "replicas":{
"core_node3":{
  "core":"testIntegration_shard1_replica_n1",
  "base_url":"http://127.0.0.1:49460/solr;,
  "node_name":"127.0.0.1:49460_solr",
  "state":"active",
  "type":"NRT",
  "leader":"true"},
"core_node4":{
  "core":"testIntegration_shard1_replica_n2",
  "base_url":"http://127.0.0.1:49459/solr;,
  "node_name":"127.0.0.1:49459_solr",
  "state":"down",
  "type":"NRT",
  "router":{"name":"compositeId"},
  "maxShardsPerNode":"1",
  "autoAddReplicas":"false",
  "nrtReplicas":"2",
  "tlogReplicas":"0"}
at 
__randomizedtesting.SeedInfo.seed([C0C350E0FC2E6941:70A25ECCD911C864]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.SolrCloudTestCase.waitForState(SolrCloudTestCase.java:269)
at 
org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration(ExecutePlanActionTest.java:209)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)

[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated patch with more cleanups around explain. I tried to add descriptions 
for parts of the formula and also use standard nomenclature. I think it's better 
now; here is typical output:

{noformat}
20.629753 = score(doc=0,freq=979.0), product of:
  2.2 = scaling factor, k1 + 1
  9.388654 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
1.0 = n, number of documents containing term
17927.0 = N, total number of documents with field
  0.9987758 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) 
from:
979.0 = freq, occurrences of term within document
1.2 = k1, term saturation parameter
0.75 = b, length normalization parameter
1.0 = dl, length of field
1.0 = avgdl, average length of field
{noformat}

You can more easily see term frequency saturation, including extreme cases such 
as 1.0, where no more occurrences can help. You can also start to visualize how 
it can work for maxScore now :)

{noformat}
...
  1.0 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
5.9470048E8 = freq, occurrences of term within document
1.2 = k1, term saturation parameter
0.75 = b, length normalization parameter
40.0 = dl, length of field (approximate)
3.72180768E8 = avgdl, average length of field
...
{noformat}
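
As a quick cross-check of the first explanation above (standalone arithmetic, 
not from the patch), recomputing the score from the listed factors reproduces 
it:

{noformat}
// Recomputes score(doc=0,freq=979.0) from the factors in the first explain output.
public class Bm25ExplainCheck {
  public static void main(String[] args) {
    double k1 = 1.2, b = 0.75;                   // term saturation / length normalization
    double n = 1.0, N = 17927.0;                 // docs containing term / docs with field
    double freq = 979.0, dl = 1.0, avgdl = 1.0;
    double idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));      // 9.388654
    double tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)); // 0.9987758
    System.out.println((k1 + 1) * idf * tf);     // ~20.629753
  }
}
{noformat}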


> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-7.x - Build # 189 - Still Unstable

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-7.x/189/

6 tests failed.
FAILED:  org.apache.solr.cloud.CdcrBootstrapTest.testBootstrapWithSourceCluster

Error Message:
Captured an uncaught exception in thread: Thread[id=848, name=Thread-154, 
state=RUNNABLE, group=TGRP-CdcrBootstrapTest]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=848, name=Thread-154, state=RUNNABLE, 
group=TGRP-CdcrBootstrapTest]
at 
__randomizedtesting.SeedInfo.seed([3D43DFAB320A40C4:E4158E6F316E538E]:0)
Caused by: java.lang.AssertionError: 1
at __randomizedtesting.SeedInfo.seed([3D43DFAB320A40C4]:0)
at 
org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:192)
at org.apache.solr.core.SolrCore.close(SolrCore.java:1614)
at 
org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:870)
at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1285)
at 
org.apache.solr.handler.IndexFetcher.lambda$reloadCore$0(IndexFetcher.java:910)
at java.lang.Thread.run(Thread.java:748)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.CdcrBootstrapTest

Error Message:
ObjectTracker found 3 object(s) that were not released!!! [SolrCore, 
MockDirectoryWrapper, RawDirectoryWrapper] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.solr.core.SolrCore  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at org.apache.solr.core.SolrCore.(SolrCore.java:1020)  at 
org.apache.solr.core.SolrCore.reload(SolrCore.java:637)  at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1284)  at 
org.apache.solr.handler.IndexFetcher.lambda$reloadCore$0(IndexFetcher.java:910) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MockDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:494)  
at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:338) 
 at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:421) 
 at 
org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:768)
  at 
org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:723)
  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.RawDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:92)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:742)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:935)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:844)  at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1036)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:948)  at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:91)
  at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:384)
  at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:389)
  at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:174)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
  at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)  
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)  
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:139)
  at 

[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 4241 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4241/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

6 tests failed.
FAILED:  
org.apache.solr.cloud.CollectionsAPIAsyncDistributedZkTest.testSolrJAPICalls

Error Message:
org/apache/solr/client/solrj/request/CollectionAdminRequest$SplitShard

Stack Trace:
java.lang.NoClassDefFoundError: 
org/apache/solr/client/solrj/request/CollectionAdminRequest$SplitShard
at 
__randomizedtesting.SeedInfo.seed([EDD47ED5EB872B88:B5B0F2B4EDED835C]:0)
at 
org.apache.solr.client.solrj.request.CollectionAdminRequest.splitShard(CollectionAdminRequest.java:1040)
at 
org.apache.solr.cloud.CollectionsAPIAsyncDistributedZkTest.testSolrJAPICalls(CollectionsAPIAsyncDistributedZkTest.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.cloud.CollectionsAPISolrJTest.testSplitShard

Error Message:

[jira] [Created] (SOLR-11524) Create an autoscaling/suggestions API end-point

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11524:
-

 Summary: Create an autoscaling/suggestions API end-point
 Key: SOLR-11524
 URL: https://issues.apache.org/jira/browse/SOLR-11524
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11522) Suggestions/recommendations to rebalance replicas

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11522:
-

 Summary: Suggestions/recommendations to rebalance replicas
 Key: SOLR-11522
 URL: https://issues.apache.org/jira/browse/SOLR-11522
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11523) suggestions to remove nodes

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11523:
-

 Summary: suggestions to remove nodes
 Key: SOLR-11523
 URL: https://issues.apache.org/jira/browse/SOLR-11523
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11520) Suggestions for cores violations

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11520:
-

 Summary: Suggestions for cores violations
 Key: SOLR-11520
 URL: https://issues.apache.org/jira/browse/SOLR-11520
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11521) suggestions to add more nodes

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11521:
-

 Summary: suggestions to add more nodes
 Key: SOLR-11521
 URL: https://issues.apache.org/jira/browse/SOLR-11521
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11518) Create suggestions for freedisk violations

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11518:
-

 Summary: Create suggestions for freedisk violations
 Key: SOLR-11518
 URL: https://issues.apache.org/jira/browse/SOLR-11518
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11519) Suggestions for replica count violations

2017-10-20 Thread Noble Paul (JIRA)
Noble Paul created SOLR-11519:
-

 Summary: Suggestions for replica count violations
 Key: SOLR-11519
 URL: https://issues.apache.org/jira/browse/SOLR-11519
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11359) An autoscaling/suggestions endpoint to recommend operations

2017-10-20 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-11359:
--
Issue Type: New Feature  (was: Sub-task)
Parent: (was: SOLR-9735)

> An autoscaling/suggestions endpoint to recommend operations
> ---
>
> Key: SOLR-11359
> URL: https://issues.apache.org/jira/browse/SOLR-11359
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public (Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-11359.patch
>
>
> Autoscaling can make suggestions to users on what operations they can perform 
> to improve the health of the cluster.
> The suggestions will have the following information:
> * http end point
> * http method (POST,DELETE)
> * command payload



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-master - Build # 2127 - Still Unstable

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/2127/

5 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.CdcrBootstrapTest

Error Message:
ObjectTracker found 6 object(s) that were not released!!! 
[MockDirectoryWrapper, InternalHttpClient, MockDirectoryWrapper, 
MockDirectoryWrapper, MockDirectoryWrapper, SolrCore] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MockDirectoryWrapper  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:92)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:742)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:935)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:844)  at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1040)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:952)  at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:91)
  at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:384)
  at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:389)
  at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:174)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
  at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)  
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)  
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:139)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) 
 at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)  
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
  at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)  
at 
org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:426)  
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) 
 at org.eclipse.jetty.server.Server.handle(Server.java:534)  at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)  at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)  at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)  at 
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:251)  at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)  at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) 
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
  at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
  at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
  at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
  at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.http.impl.client.InternalHttpClient  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:289)
  at 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:298)
  at 
org.apache.solr.handler.IndexFetcher.createHttpClient(IndexFetcher.java:222)  
at org.apache.solr.handler.IndexFetcher.(IndexFetcher.java:260)  at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:417) 
 at 

[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_144) - Build # 20703 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20703/
Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.DistribJoinFromCollectionTest

Error Message:
Error from server at https://127.0.0.1:41107/solr: create the collection time 
out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://127.0.0.1:41107/solr: create the collection time out:180s
at __randomizedtesting.SeedInfo.seed([E48747E831BDFD1E]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
at 
org.apache.solr.cloud.DistribJoinFromCollectionTest.setupCluster(DistribJoinFromCollectionTest.java:88)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:874)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration

Error Message:
Timed out waiting for replicas of collection to be 2 again null Live Nodes: 
[127.0.0.1:38147_solr] Last available state: 
DocCollection(testIntegration//collections/testIntegration/state.json/7)={   
"pullReplicas":"0",   "replicationFactor":"2",   "shards":{"shard1":{   
"range":"8000-7fff",   "state":"active",   "replicas":{ 
"core_node3":{   "core":"testIntegration_shard1_replica_n1",   
"base_url":"http://127.0.0.1:33341/solr;,   
"node_name":"127.0.0.1:33341_solr",   "state":"down",   
"type":"NRT"}, "core_node4":{   
"core":"testIntegration_shard1_replica_n2",   
"base_url":"http://127.0.0.1:38147/solr;,   
"node_name":"127.0.0.1:38147_solr",   

[jira] [Updated] (LUCENE-7997) More sanity testing of similarities

2017-10-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:

Attachment: LUCENE-7997_wip.patch

Updated patch, also enforcing that explain == score (exactly, no floating point 
differences). 

I cleaned up the BM25 explain to be transparent and reflect how the calculation 
is done.
Most importantly, the explanation is now broken out as {{scaling * idf * tf}}, 
matching how we compute it and how it is described in 
http://kak.tx0.org/Information-Retrieval/TFxIDF, rather than displaying the 
"re-arranged formula" with tf including the {{k1 + 1}} scaling factor. Maybe 
it's an improvement for debugging, too, since it pulls out the independent 
scaling factor, making it easier to see the specifics of term frequency 
saturation and IDF across docs/terms?
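
For reference, the two presentations are algebraically identical (my rendering, 
not from the patch); the only difference is where the k1 + 1 scaling lives:

{noformat}
score = (k_1+1) \cdot \log\!\left(1+\frac{N-n+0.5}{n+0.5}\right)
                \cdot \frac{f}{f+k_1\left(1-b+b\,\frac{dl}{avgdl}\right)}
      = \mathrm{idf} \cdot \frac{f\,(k_1+1)}{f+k_1\left(1-b+b\,\frac{dl}{avgdl}\right)}
{noformat}

where the first (factored) form is what explain now prints and the second is 
the re-arranged formula with the scaling folded into tf.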

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213427#comment-16213427
 ] 

Yonik Seeley commented on LUCENE-7976:
--

bq. But how can that work?

It will work as defined.  For some, this will be worse and they should not have 
called forceMerge.  For others, they knew what they were doing and it's exactly 
what they wanted.
If you don't want 1 big segment, don't call forceMerge(1).

bq. Or, if we had two settings, we could insist that the 
maxForcedMergeSegmentSize is <= the maxSegmentSize but then what's the point 

See LogByteSizeMergePolicy, which already works correctly and defaults to 
maxSegmentSize=2GB, maxForcedMergeSegmentSize=Long.MAX_VALUE.
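
For the record, a minimal sketch of that two-setting arrangement using 
LogByteSizeMergePolicy's existing setters (values in MB, mirroring the defaults 
cited above):

{noformat}
import org.apache.lucene.index.LogByteSizeMergePolicy;

public class MergePolicyDefaults {
  public static void main(String[] args) {
    LogByteSizeMergePolicy mp = new LogByteSizeMergePolicy();
    mp.setMaxMergeMB(2048.0);                         // natural merges capped near 2GB
    mp.setMaxMergeMBForForcedMerge(Double.MAX_VALUE); // forceMerge output effectively unbounded
  }
}
{noformat}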



> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.
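
To make the proposal concrete, a hedged sketch of the eligibility check; the 
parameter name is invented here since the issue deliberately leaves it unnamed, 
and a default of 100 keeps today's behavior:

{noformat}
public class DeletesThresholdSketch {
  // Hypothetical check for the proposed TMP parameter (name invented).
  static boolean eligible(int maxDoc, int delCount, double maxAllowedPctDeletedDocs) {
    return 100.0 * delCount / maxDoc > maxAllowedPctDeletedDocs;
  }

  public static void main(String[] args) {
    // Threshold 20: a huge forceMerged segment with 21% deletes becomes
    // eligible for rewriting no matter how large it is.
    System.out.println(eligible(100_000_000, 21_000_000, 20.0));  // true
    System.out.println(eligible(100_000_000, 21_000_000, 100.0)); // false (current behavior)
  }
}
{noformat}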



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11032) Update solrj tutorial

2017-10-20 Thread Jason Gerlowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213306#comment-16213306
 ] 

Jason Gerlowski commented on SOLR-11032:


As far as failing the ref-guide build if a partial-include can't be found, 
there is a warning that gets spit out.  I took a quick look through 
asciidoctor's/jekyll's documentation for a way to turn this into a 
build-halting error, but haven't found anything yet.  I'll keep looking, but 
I'm not sure it's possible (at least not without doing something hacky like 
grepping the output).

> Update solrj tutorial
> -
>
> Key: SOLR-11032
> URL: https://issues.apache.org/jira/browse/SOLR-11032
> Project: Solr
>  Issue Type: Task
>  Security Level: Public (Default Security Level. Issues are Public) 
>  Components: documentation, SolrJ, website
>Reporter: Karl Richter
> Attachments: SOLR-11032.patch, SOLR-11032.patch, SOLR-11032.patch
>
>
> The [solrj tutorial](https://wiki.apache.org/solr/Solrj) has the following 
> issues:
>   * It refers to 1.4.0, whereas the current release is 6.x; some classes are 
> deprecated or no longer exist.
>   * Document-object-binding is a crucial feature [which should be working in 
> the meantime](https://issues.apache.org/jira/browse/SOLR-1945) and thus 
> should be covered in the tutorial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11517) ToParentBlockJoinQuery fails when the parents/child fall into different segments

2017-10-20 Thread ananthesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ananthesh updated SOLR-11517:
-
Description: 
We have a system where all the documents in the collections are nested child 
documents. We also have 'autoCommit' enabled for the collection. We also get a 
huge number of document updates. We found a scenario where 'child' documents 
were indexed in one segment while the 'parent' document got indexed in another 
segment. Here is what the docids look like:

0 = 95638
1 = 95639
2 = 95640
3 = 272190 \{parent}

Now if the solr request has been made using the "parent" query parser like the 
following

{noformat}
{!parent which=parent:true score=max}(...)
{noformat}

ToParentBlockJoinQuery, which handles the request, won't be able to find the 
parent for the searched child documents. But if we trigger `optimize` for the 
same index, which forces all the segments to merge into a single segment, the 
above request will be able to fetch the results.

  was:
We have a system where all the documents in the collections are nested child 
documents. We also have 'autoCommit' enabled for the collection. We also get a 
huge number of document updates. We found a scenario where 'child' documents 
were indexed in one segment while the 'parent' document got indexed in another 
segment. Here is what the docids look like:

0 = 95638
1 = 95639
2 = 95640
3 = 272190 {parent}

Now if the solr request has been made using the "parent" query parser like the 
following

{noformat}
{!parent which=parent:true score=max}(...)
{noformat}

ToParentBlockJoinQuery, which handles the request, won't be able to find the 
parent for the searched child documents. But if we trigger `optimize` for the 
same index, which forces all the segments to merge into a single segment, the 
above request will be able to fetch the results.


> ToParentBlockJoinQuery fails when the parents/child fall into different 
> segments
> -
>
> Key: SOLR-11517
> URL: https://issues.apache.org/jira/browse/SOLR-11517
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: ananthesh
>
> We have a system where all the documents in the collections are nested child 
> documents. We also have 'autoCommit' enabled for the collection. We also get a 
> huge number of document updates. We found a scenario where 'child' documents 
> were indexed in one segment while the 'parent' document got indexed in another 
> segment. Here is what the docids look like:
> 0 = 95638
> 1 = 95639
> 2 = 95640
> 3 = 272190 \{parent}
> Now if the solr request has been made using the "parent" query parser like the 
> following
> {noformat}
> {!parent which=parent:true score=max}(...)
> {noformat}
> ToParentBlockJoinQuery, which handles the request, won't be able to find the 
> parent for the searched child documents. But if we trigger `optimize` for the 
> same index, which forces all the segments to merge into a single segment, the 
> above request will be able to fetch the results.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11517) ToParentBlockJoinQuery fails when the parents/child fall into different segments

2017-10-20 Thread ananthesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ananthesh updated SOLR-11517:
-
Description: 
We have a system where all the documents in the collections are nested child 
documents. We also have 'autoCommit' enabled for the collection. We also get 
huge number of document updates. We found a scenario, where 'child' documents 
were indexed in one segment, while 'parent' document got indexed in the other 
segment. Here are the docid looks like


0 = 95638
1 = 95639
2 = 95640
3 = 272190 {parent}

Now if the solr request has been made using "parent" query parser like the 
following

{noformat}
{!parent which=parent:true score=max}(...)
{noformat}

ToParentBlockJoinQuery which handles the request wont be able to find the 
parent for the searched child documents. But if we trigger `optimize` for the 
same index which forces to merge all the segments to single index, the above 
request will be able to fetch the results. 

  was:
We have a system where all the documents in the collections are nested child 
documents. We also have 'autoCommit' enabled for the collection. We also get a 
huge number of document updates. We found a scenario where 'child' documents 
were indexed in one segment while the 'parent' document got indexed in another 
segment. Here is what the docids look like:

0 = 95638
1 = 95639
2 = 95640
3 = 272190 {parent}

Now if the solr request has been made using the "parent" query parser like the 
following

{!parent which=parent:true score=max}(...)

ToParentBlockJoinQuery, which handles the request, won't be able to find the 
parent for the searched child documents. But if we trigger `optimize` for the 
same index, which forces all the segments to merge into a single segment, the 
above request will be able to fetch the results.




> ToParentBlockJoinQuery fails when the parents/child fall into different 
> segments
> -
>
> Key: SOLR-11517
> URL: https://issues.apache.org/jira/browse/SOLR-11517
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public (Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: ananthesh
>
> We have a system where all the documents in the collections are nested child 
> documents. We also have 'autoCommit' enabled for the collection. We also get a 
> huge number of document updates. We found a scenario where 'child' documents 
> were indexed in one segment while the 'parent' document got indexed in another 
> segment. Here is what the docids look like:
> 0 = 95638
> 1 = 95639
> 2 = 95640
> 3 = 272190 {parent}
> Now if the solr request has been made using the "parent" query parser like the 
> following
> {noformat}
> {!parent which=parent:true score=max}(...)
> {noformat}
> ToParentBlockJoinQuery, which handles the request, won't be able to find the 
> parent for the searched child documents. But if we trigger `optimize` for the 
> same index, which forces all the segments to merge into a single segment, the 
> above request will be able to fetch the results.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11517) ToParentBlockJoinQuery fails when the parents/child fall into different segments

2017-10-20 Thread ananthesh (JIRA)
ananthesh created SOLR-11517:


 Summary: ToParentBlockJoinQuery fails when the parents/child fall 
into different segments
 Key: SOLR-11517
 URL: https://issues.apache.org/jira/browse/SOLR-11517
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.6.1
Reporter: ananthesh


We have a system where all the documents in the collections are nested child 
documents. We also have 'autoCommit' enabled for the collection. We also get a 
huge number of document updates. We found a scenario where 'child' documents 
were indexed in one segment while the 'parent' document got indexed in another 
segment. Here is what the docids look like:

0 = 95638
1 = 95639
2 = 95640
3 = 272190 {parent}

Now if the solr request has been made using the "parent" query parser like the 
following

{!parent which=parent:true score=max}(...)

ToParentBlockJoinQuery, which handles the request, won't be able to find the 
parent for the searched child documents. But if we trigger `optimize` for the 
same index, which forces all the segments to merge into a single segment, the 
above request will be able to fetch the results.
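
For anyone trying to reproduce this, a hedged SolrJ sketch of issuing the 
block-join query from the report; the URL, collection name, and child clause 
below are placeholders, not details from the issue:

{noformat}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ParentQueryRepro {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycoll").build()) {
      // child_field:foo stands in for the elided child query "(...)" above
      SolrQuery q = new SolrQuery("{!parent which=parent:true score=max}child_field:foo");
      QueryResponse rsp = client.query(q);
      // Per the report: returns no parents when parent and children land in
      // different segments, but works after an optimize/forceMerge.
      System.out.println(rsp.getResults().getNumFound());
    }
  }
}
{noformat}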





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-5.5-Windows (32bit/jdk1.7.0_80) - Build # 148 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-5.5-Windows/148/
Java: 32bit/jdk1.7.0_80 -server -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.handler.TestSolrConfigHandlerCloud

Error Message:
1 thread leaked from SUITE scope at 
org.apache.solr.handler.TestSolrConfigHandlerCloud: 1) Thread[id=23488, 
name=Thread-8464, state=TIMED_WAITING, group=TGRP-TestSolrConfigHandlerCloud]   
  at java.lang.Thread.sleep(Native Method) at 
org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:101)
 at 
org.apache.solr.core.SolrResourceLoader.openSchema(SolrResourceLoader.java:355) 
at 
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:48)
 at 
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:70)
 at 
org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:108)
 at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:79)   
  at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:920) 
at org.apache.solr.core.SolrCore$11.run(SolrCore.java:2623) at 
org.apache.solr.cloud.ZkController$5.run(ZkController.java:2480)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE 
scope at org.apache.solr.handler.TestSolrConfigHandlerCloud: 
   1) Thread[id=23488, name=Thread-8464, state=TIMED_WAITING, 
group=TGRP-TestSolrConfigHandlerCloud]
at java.lang.Thread.sleep(Native Method)
at 
org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:101)
at 
org.apache.solr.core.SolrResourceLoader.openSchema(SolrResourceLoader.java:355)
at 
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:48)
at 
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:70)
at 
org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:108)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:79)
at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:920)
at org.apache.solr.core.SolrCore$11.run(SolrCore.java:2623)
at org.apache.solr.cloud.ZkController$5.run(ZkController.java:2480)
at __randomizedtesting.SeedInfo.seed([6A210696BB2E619E]:0)




Build Log:
[...truncated 12225 lines...]
   [junit4] Suite: org.apache.solr.handler.TestSolrConfigHandlerCloud
   [junit4]   2> Creating dataDir: 
C:\Users\jenkins\workspace\Lucene-Solr-5.5-Windows\solr\build\solr-core\test\J0\temp\solr.handler.TestSolrConfigHandlerCloud_6A210696BB2E619E-001\init-core-data-001
   [junit4]   2> 2729505 INFO  
(SUITE-TestSolrConfigHandlerCloud-seed#[6A210696BB2E619E]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (false)
   [junit4]   2> 2729505 INFO  
(SUITE-TestSolrConfigHandlerCloud-seed#[6A210696BB2E619E]-worker) [] 
o.a.s.BaseDistributedSearchTestCase Setting hostContext system property: /
   [junit4]   2> 2729510 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
   [junit4]   2> 2729510 INFO  (Thread-8298) [] o.a.s.c.ZkTestServer client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 2729510 INFO  (Thread-8298) [] o.a.s.c.ZkTestServer 
Starting server
   [junit4]   2> 2729611 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.ZkTestServer start zk server on port:60158
   [junit4]   2> 2729611 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.SolrZkClient Using default ZkCredentialsProvider
   [junit4]   2> 2729612 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
   [junit4]   2> 2729619 INFO  (zkCallback-3018-thread-1) [] 
o.a.s.c.c.ConnectionManager Watcher 
org.apache.solr.common.cloud.ConnectionManager@53efdc name:ZooKeeperConnection 
Watcher:127.0.0.1:60158 got event WatchedEvent state:SyncConnected type:None 
path:null path:null type:None
   [junit4]   2> 2729619 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.ConnectionManager Client is connected to ZooKeeper
   [junit4]   2> 2729619 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.SolrZkClient Using default ZkACLProvider
   [junit4]   2> 2729620 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.SolrZkClient makePath: /solr
   [junit4]   2> 2729624 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.SolrZkClient Using default ZkCredentialsProvider
   [junit4]   2> 2729626 INFO  
(TEST-TestSolrConfigHandlerCloud.test-seed#[6A210696BB2E619E]) [] 
o.a.s.c.c.ConnectionManager Waiting for client to 

[jira] [Updated] (SOLR-11032) Update solrj tutorial

2017-10-20 Thread Jason Gerlowski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-11032:
---
Attachment: SOLR-11032.patch

The one/only advantage of having the examples live in 
{{solr/solr-ref-guide/src/java}} is that any attempt to build pages of the ref 
guide outside of ant will look for included files in descendants of 
{{solr/solr-ref-guide/src}}.  The main time this comes up is if you're using an 
editor that supports live-preview of your asciidoc pages.

That said, it's possible to avoid breaking these sorts of live-preview cases by 
having an ant task copy the examples to the expected place.  I've taken this 
approach in the attached patch.  The examples live in a package under 
{{solr/solrj/src/test}}.  They can be copied to 
{{solr/solr-ref-guide/src/example}} by the ref-guide target {{ant 
copy-examples}}.  This gets us all of the advantages that Hoss pointed out, 
while at least providing a workaround for people who like their live-preview.
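
Roughly, the copy step looks like this (simplified sketch; the real target in 
the patch may differ in details such as the fileset pattern):

{code}
<!-- copy the SolrJ example sources into the ref guide source tree -->
<target name="copy-examples">
  <copy todir="${basedir}/src/example">
    <fileset dir="${basedir}/../solrj/src/test"
             includes="**/ref_guide_examples/**"/>
  </copy>
</target>
{code}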

> Update solrj tutorial
> -
>
> Key: SOLR-11032
> URL: https://issues.apache.org/jira/browse/SOLR-11032
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, SolrJ, website
>Reporter: Karl Richter
> Attachments: SOLR-11032.patch, SOLR-11032.patch, SOLR-11032.patch
>
>
> The [solrj tutorial](https://wiki.apache.org/solr/Solrj) has the following 
> issues:
>   * It refers to 1.4.0 whereas the current release is 6.x, some classes are 
> deprecated or no longer exist.
>   * Document-object-binding is a crucial feature [which should be working in 
> the meantime](https://issues.apache.org/jira/browse/SOLR-1945) and thus 
> should be covered in the tutorial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Linux (32bit/jdk1.8.0_144) - Build # 638 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/638/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseSerialGC

6 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation

Error Message:
2 threads leaked from SUITE scope at 
org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation: 1) 
Thread[id=23696, name=jetty-launcher-2920-thread-1-EventThread, 
state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation] 
at sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
 at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)  
   at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
 at 
org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
 at 
org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
 at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)   
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)   
 2) Thread[id=23686, name=jetty-launcher-2920-thread-2-EventThread, 
state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation] 
at sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)  
   at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
 at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)  
   at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
 at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244)
 at 
org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44)
 at 
org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61)
 at 
org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
 at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)   
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE 
scope at org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation: 
   1) Thread[id=23696, name=jetty-launcher-2920-thread-1-EventThread, 
state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation]
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at 
org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at 

Re: Solr test framework - Locale randomization

2017-10-20 Thread Robert Muir
On Fri, Oct 20, 2017 at 5:29 PM, Hrishikesh Gadre  wrote:
> Yes, that seems to be the case. But in this case, we are using Derby for unit
> testing only. In production
> deployments we would be using a standard RDBMS (e.g. mysql, postgres etc.).
> So I am wondering
> if we should provide a knob to suppress the randomization in such cases
> (just to avoid such issues with third-party software).
> e.g. we already have an annotation to suppress SSL
> (@SolrTestCaseJ4.SuppressSSL).  Maybe we could do something similar
> for the Locale setting?
>
> To ensure that such setting is not abused inside lucene-solr repo, we can
> add it to the list of forbidden APIs.
>
> If this is acceptable, I will go ahead and create a jira.
>

I agree it's a good idea if we can better allow a test class to not
run under specific locales. This would have some use-cases, such as
temporarily disabling Turkish/Azeri locales when String.toLowerCase() is
used incorrectly, and other common problems.  I think a
SuppressLocaleRandomization annotation or similar is a good idea. Maybe it
should instead just "wire" the class to a specified locale parameter (e.g.
"en-US") to make it totally clear what will happen (tests are still
repeatable for developers in different countries and so on). +1 to
opening an issue.
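
To make the "wire" idea concrete, the annotation could be as small as
something like this (name and shape are purely illustrative, nothing like
it exists in the test framework today):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// hypothetical: pins a test class to one fixed locale instead of a
// randomized one
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface WireLocale {
  /** BCP 47 language tag to run the suite under, e.g. "en-US". */
  String value();
}

The runner would then install the parsed tag as the default locale before
the suite starts, the same place the randomized locale is installed today.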

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr test framework - Locale randomization

2017-10-20 Thread Hrishikesh Gadre
>>I think you're going to be stuck with doing something like
https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/
apache/solr/cloud/KerberosTestServices.java#L55 where there is a hard-coded
list of locales that we know are broken.

@Mike - yes that's what I am doing currently.

>>See the documentation for the java Locale class for what I mean, it is
very thorough. https://docs.oracle.com/javase/8/docs/api/java/util/
Locale.html

Thanks Robert. I will review the javadoc for better understanding.


>>Yes but these are not bugs in what lucene is doing.
>>These are bugs in third party software that completely break under
certain Locales, because they don't know how to handle some newer
locales (broken parsing and so on).

Yes, that seems to be the case. But in this case, we are using Derby for unit
testing only. In production
deployments we would be using a standard RDBMS (e.g. mysql, postgres etc.).
So I am wondering
if we should provide a knob to suppress the randomization in such cases
(just to avoid such issues with third-party software).
e.g. we already have an annotation to suppress SSL
(@SolrTestCaseJ4.SuppressSSL).  Maybe we could do something similar
for the Locale setting?

To ensure that such setting is not abused inside lucene-solr repo, we can
add it to the list of forbidden APIs.

If this is acceptable, I will go ahead and create a jira.

Thanks
Hrishikesh


On Fri, Oct 20, 2017 at 1:50 PM, Robert Muir  wrote:

> Yes but these are not bugs in what lucene is doing. It is just setting
> a random locale.
>
> These are bugs in third party software that completely break under
> certain Locales, because they don't know how to handle some newer
> locales (broken parsing and so on). Such code does not handle the fact
> that Locales can be more than language + country + variant. Since Java
> 1.7 they can also have script (seen in your example) and extensions
> and support BCP 47.
>
> See the documentation for the java Locale class for what I mean, it is
> very thorough. https://docs.oracle.com/javase/8/docs/api/java/util/
> Locale.html
>
>
> On Fri, Oct 20, 2017 at 4:40 PM, Mike Drob  wrote:
> > I think you're going to be stuck with doing something like
> > https://github.com/apache/lucene-solr/blob/master/solr/
> core/src/test/org/apache/solr/cloud/KerberosTestServices.java#L55
> > where there is a hard-coded list of locales that we know are broken.
> >
> > On Fri, Oct 20, 2017 at 3:31 PM, Hrishikesh Gadre 
> > wrote:
> >>
> >> Hi Dawid,
> >>
> >> Thanks for the feedback.
> >>
> >> Here is one failure scenario,
> >>
> >> Locale configured (via -Dtests.locale) -> sr-Latn
> >>
> >> The error message,
> >>
> >> ERROR XBM0X: Supplied locale description 'sr__#Latn' is invalid,
> expecting
> >> ln[_CO[_variant]]
> >>
> >> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
> >> ISO-3166 country codes, see java.util.Locale.
> >>
> >> Note that if I use "sr-Latn-BA" instead, the test passes. My gut feeling
> >> is that "sr-Latn" is not a valid locale string as it is not listed here,
> >> http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html
> >>
> >>
> >> Another failure is
> >>
> >> Locale configured (via -Dtests.locale) -> und
> >>
> >> The error message is
> >>
> >> Supplied locale description '' is invalid, expecting ln[_CO[_variant]]
> >>
> >> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
> >> ISO-3166 country codes, see java.util.Locale.
> >>
> >> For the time being, I am hard-coding these failure-causing locales in
> the
> >> junit assume(...) so that I can skip the execution. But this is not
> >> foolproof since there may be more locale configurations which may not
> work
> >> with Derby. So I wonder if there is any way to suppress this locale
> >> randomization altogether?
> >>
> >> Thanks
> >> Hrishikesh
> >>
> >>
> >>
> >> On Fri, Oct 20, 2017 at 12:39 PM, Dawid Weiss 
> >> wrote:
> >>>
> >>> Only valid locales (for your Java) are selected, so this has to be an
> >>> error. What failures do you see? Perhaps they should be reported to
> >>> Derby?
> >>>
> >>> Dawid
> >>>
> >>> On Fri, Oct 20, 2017 at 8:14 PM, Hrishikesh Gadre <
> gadre.s...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > I am currently implementing solr authorization plugin backed by
> Apache
> >>> > Sentry. For the unit tests, I am using Solr test framework
> >>> > (specifically
> >>> > SolrCloudTestCase class). Occasionally I see unit test failures since
> >>> > the
> >>> > sentry tests use Derby in-memory database and it doesn't work
> properly
> >>> > for
> >>> > some of the Locale(s) configured by the Solr test framework.
> >>> >
> >>> > Couple of questions
> >>> >
> >>> > (a) Does the Solr test framework generate only valid Locale(s) or a
> >>> > mix of
> >>> > valid/invalid Locale(s) ? The reason I am asking is that I have a
> test
> >>> > failure with Locale 

[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213266#comment-16213266
 ] 

Michael McCandless commented on LUCENE-7976:


 bq. This can cause jitter in results where the ordering will depend on which 
shard answered a query because the frequencies are off significantly enough. 

Segment-based replication 
(http://blog.mikemccandless.com/2017/09/lucenes-near-real-time-segment-index.html)
 would improve this situation, in that the jitter no longer varies by shard 
since all replicas search identical point-in-time views of the index.  It's 
also quite a bit more efficient if you need many replicas.

bq. I suspect that the current behavior, where a segment that's 20 times larger 
than the configured max segment size is ineligible for automatic merging until 
97.5 percent deleted docs, was not actually what was desired.

Right!  The designer didn't think about this case because he didn't call 
{{forceMerge}} so frequently :)

bq. Max segment sizes are a target, not a hard guarantee... Lucene doesn't know 
exactly how big the segment will be before it actually completes the merge, and 
it can end up going over the limit.

Right, it's only an estimate, but in my experience it's conservative, i.e. the 
resulting merged segment is usually smaller than the max segment size, but you 
cannot count on that.

bq. The downside to a max segment size is that one can start getting many more 
segments than anticipated or desired (and can impact performance in 
unpredictable ways, depending on the exact usage).

Right, but the proposed solution (TMP always respects the max segment size) 
would work well for such users: they just need to increase their max segment size 
if they need to get a 10 TB index down to 20 segments.

bq. So 50% deleted documents consumes a lot of resources, both disk and RAM 
when considered in aggregate at that scale.

Well, disks are cheap and getting cheaper.  And 50% is the worst case -- TMP 
merges those segments away once they hit 50%, so that the net across the index 
is less than 50% deletions.  Users already must have a lot of free disk space 
to accommodate running merges, pending refreshes, pending commits, etc.

Erick, are these timestamp'd documents?  It's better to index those into 
indices that rollover with time (see how Elasticsearch recommends it: 
https://www.elastic.co/blog/managing-time-based-indices-efficiently), where 
it's far more efficient to drop whole indices than delete documents in one 
index.

Still, I think it's OK to relax TMP so it will allow max sized segments with 
less than 50% deletions to be eligible for merging, and users can tune the 
deletions weight to force TMP to aggressively merge such segments.  This would 
be a tiny change in the loop that computes {{tooBigCount}}.
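
Roughly the shape of that change, as a sketch (names are assumed here; this 
is not the actual TieredMergePolicy source):

{code}
class TooBigSketch {
  // Today a segment over half the max merged size counts as "too big" and is
  // skipped; relaxed, it would only be skipped while its deletes stay under
  // the configured threshold.
  static boolean countsAsTooBig(long segBytes, int delCount, int maxDoc,
                                long maxMergedSegmentBytes,
                                double deletesPctAllowed) {
    double pctDeletes = 100.0 * delCount / (double) maxDoc;
    return segBytes > maxMergedSegmentBytes / 2
        && pctDeletes <= deletesPctAllowed;
  }
}
{code}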

bq. The root cause of the problem here seems to be that we have only one 
variable (maxSegmentSize) and multiple use-cases we're forcing on it:

But how can that work?

If you have two different max sizes, then how can natural merging work with the 
too-large segments in the index due to a past {{forceMerge}}?  It cannot merge 
them and produce a small enough segment until enough (too many) deletes 
accumulate on them.

Or, if we had two settings, we could insist that the 
{{maxForcedMergeSegmentSize}} is <= the {{maxSegmentSize}} but then what's the 
point :)

The problem here is {{forceMerge}} today sets up an index structure that 
natural merging is unable to cope with; having {{forceMerge}} respect the max 
segment size would fix that nicely.  Users can simply increase that size if 
they want massive segments.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are 

[jira] [Commented] (LUCENE-8000) Document Length Normalization in BM25Similarity correct?

2017-10-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213255#comment-16213255
 ] 

Robert Muir commented on LUCENE-8000:
-

{quote}
Robert Muir thanks for the further explanation. That helped clarify. It does 
seem the effect would be minor at best. It'd be an interesting experiment at 
some point, though. If I ever get to trying it, I'll post back.
{quote}

Thanks Timothy! Maybe if you get the chance to do the experiment, simply 
override the method {{protected float avgFieldLength(CollectionStatistics 
collectionStats)}} to return the alternative value. For experiments it can just 
be a hardcoded number you computed yourself in a different way.
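
For instance, a minimal sketch of such an experiment (the constant is a 
placeholder for whatever value you compute offline, not a recommendation):

{code}
import org.apache.lucene.search.CollectionStatistics;
import org.apache.lucene.search.similarities.BM25Similarity;

public class FixedAvgLengthBM25 extends BM25Similarity {
  @Override
  protected float avgFieldLength(CollectionStatistics collectionStats) {
    // e.g. an average that discounts overlaps, computed in a separate pass
    return 42.0f;
  }
}
{code}

As a worked example of the size of the mismatch: a document with 10 positions 
of which 4 are same-position overlaps gets norm length 6, while contributing 
10 to sumTotalTermFreq, so it looks shorter relative to the average than it 
really is.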

> Document Length Normalization in BM25Similarity correct?
> 
>
> Key: LUCENE-8000
> URL: https://issues.apache.org/jira/browse/LUCENE-8000
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Christoph Goller
>Priority: Minor
>
> The length of an individual document only counts the number of positions in 
> the document, since discountOverlaps defaults to true.
> {code}
>  @Override
>   public final long computeNorm(FieldInvertState state) {
> final int numTerms = discountOverlaps ? state.getLength() - 
> state.getNumOverlap() : state.getLength();
> int indexCreatedVersionMajor = state.getIndexCreatedVersionMajor();
> if (indexCreatedVersionMajor >= 7) {
>   return SmallFloat.intToByte4(numTerms);
> } else {
>   return SmallFloat.floatToByte315((float) (1 / Math.sqrt(numTerms)));
> }
>   }}
> {code}
> Measuring document length this way seems perfectly ok to me. What bothers 
> me is that
> the average document length is based on sumTotalTermFreq for a field. As far as I 
> understand, that sums up the totalTermFreqs of all terms of a field, therefore 
> counting positions of terms including those that overlap.
> {code}
>  protected float avgFieldLength(CollectionStatistics collectionStats) {
> final long sumTotalTermFreq = collectionStats.sumTotalTermFreq();
> if (sumTotalTermFreq <= 0) {
>   return 1f;   // field does not exist, or stat is unsupported
> } else {
>   final long docCount = collectionStats.docCount() == -1 ? 
> collectionStats.maxDoc() : collectionStats.docCount();
>   return (float) (sumTotalTermFreq / (double) docCount);
> }
>   }
> }
> {code}
> Are we comparing apples and oranges in the final scoring?
> I haven't run any benchmarks and I am not sure whether this has a serious 
> effect. It just means that documents that have synonyms (or, in my use case, 
> different normal forms of tokens on the same position) are counted as shorter 
> and therefore get higher scores than they should, and that we do not use the 
> whole spectrum of relative document length in BM25.
> I think for BM25  discountOverlaps  should default to false. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213253#comment-16213253
 ] 

Yonik Seeley commented on LUCENE-7976:
--

{quote}
The root cause of the problem here seems to be that we have only one variable 
(maxSegmentSize) and multiple use-cases we're forcing on it:
1) the max segment size that can be created automatically just by adding 
documents (this is maxSegmentSize currently)
2) the max segment size that can ever be created, even through explicit 
forceMerge (this is more for Tim's usecase... certain filesystems or transports 
may break if you go over certain limits)
{quote}

Actually, looking at the other merge policy, LogByteSizeMergePolicy, it 
*already* has different settings for these different concepts/use-cases:
 - setMaxMergeMB()
 - setMaxMergeMBForForcedMerge()
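
i.e. with that policy the two caps can already be tuned independently; a 
quick sketch (the sizes are arbitrary examples):

{code}
import org.apache.lucene.index.LogByteSizeMergePolicy;

class MergePolicySketch {
  static LogByteSizeMergePolicy twoCaps() {
    LogByteSizeMergePolicy mp = new LogByteSizeMergePolicy();
    mp.setMaxMergeMB(5 * 1024);                // cap on segments merged normally
    mp.setMaxMergeMBForForcedMerge(20 * 1024); // separate cap during forceMerge
    return mp;
  }
}
{code}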

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11505) solr.cmd start of solr7.0.1 can't working in win7-64

2017-10-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213250#comment-16213250
 ] 

Shawn Heisey commented on SOLR-11505:
-

This is what I see on my system, where solr.cmd works:

{code:none}
C:\Users\sheisey>where find
C:\Windows\System32\find.exe
C:\cygwin64\bin\find.exe
{code}

I have cygwin installed, but when I modified the global PATH, I added the 
cygwin bin directory more towards the end of the string, not the beginning, 
because I knew there were a number of built-in windows commands where the same 
command exists in cygwin.  Although I would most likely prefer the 
cygwin version in my own incidental usage, I couldn't be sure that I wouldn't 
be breaking other software, and clearly that *can* happen.

Using "%windir%\System32\find.exe" instead of just "find" seems to work.  Would 
that be a reasonable idea to use in the script?  I didn't see any default 
system environment variables pointing directly at the System32 subdirectory.  
If there are any other system utilities that the script uses without explicit 
paths, it might be a good idea to utilize environment variables to locate them 
in their expected places.
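
For example, something like this (sketch only; the variable name is made up):

{code:none}
rem use the built-in find.exe even if a cygwin find appears earlier on PATH
set "FIND_EXE=%windir%\System32\find.exe"
echo %PATH% | "%FIND_EXE%" /i "cygwin" >nul && echo cygwin is on the PATH
{code}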

This is my PATH variable:

{code:none}
C:\Users\sheisey>echo %path%
C:\ProgramData\Oracle\Java\javapath;C:\Program Files (x86)\AMD 
APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Program 
Files\Java\jdk1.6.0_27\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program
 Files (x86)\Common Files\Roxio 
Shared\9.0\DLLShared\;C:\Windows\System32\Windows System Resource 
Manager\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\ATI 
Technologies\ATI.ACE\Core-Static;C:\cygwin64\bin;C:\apache-ant-1.9.6\bin;C:\Program
 Files (x86)\GNU\GnuPG\pub;C:\Program Files (x86)\Git\cmd;C:\Program Files 
(x86)\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\;C:\Program Files 
(x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\;C:\Program Files 
(x86)\Microsoft SQL Server\120\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL 
Server\120\DTS\Binn\;C:\Program Files (x86)\QuickTime\QTSystem\;C:\Program 
Files\PuTTY\
{code}


> solr.cmd start of solr7.0.1 can't working in win7-64
> 
>
> Key: SOLR-11505
> URL: https://issues.apache.org/jira/browse/SOLR-11505
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCLI
>Affects Versions: 7.0.1
> Environment: windows 7
>Reporter: cloverliu
>Priority: Trivial
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> http://archive.apache.org/dist/lucene/solr/7.0.1/solr-7.0.1.zip   
>  solr.cmd start from this file does not work on my win7-64bit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213242#comment-16213242
 ] 

Yonik Seeley commented on LUCENE-7976:
--

bq.  The actual merges to be executed are determined by the MergePolicy.

And so then we go and look a the merge policy in question (TieredMergePolicy) 
which says:
{code}
 *  NOTE: This policy always merges by byte size
 *  of the segments, always pro-rates by percent deletes,
 *  and does not apply any maximum segment size during
 *  forceMerge (unlike {@link LogByteSizeMergePolicy}).
{code}

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213228#comment-16213228
 ] 

Robert Muir commented on LUCENE-7976:
-

I don't agree it's a feature. The documentation for {{IndexWriter.forceMerge}} 
states:

Forces merge policy to merge segments until there are <= maxNumSegments. *The 
actual merges to be executed are determined by the MergePolicy*.

I bolded sentence two just for emphasis.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213213#comment-16213213
 ] 

Yonik Seeley commented on LUCENE-7976:
--

It's not a bug, it's a feature.  It's an explicit request that *may* or *may 
not* be a mistake on the part of the user, and it can certainly be a judgement 
call.  Given that it's explicit and we don't know if it is advisable or not, we 
should do what is requested.

The root cause of the problem here seems to be that we have only one variable 
(maxSegmentSize) and multiple use-cases we're forcing on it:
1) the max segment size that can be created automatically just by adding 
documents (this is maxSegmentSize currently)
2) the max segment size that can *ever* be created, even through explicit 
forceMerge (this is more for Tim's usecase... certain filesystems or transports 
may break if you go over certain limits)

There is no variable/setting for #2 currently, but we should not re-use the 
current maxSegmentSize for this as it conflates the two use-cases.
Perhaps something like hardMaxSegmentSize?

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr test framework - Locale randomization

2017-10-20 Thread Robert Muir
Yes but these are not bugs in what lucene is doing. It is just setting
a random locale.

These are bugs in third party software that completely break under
certain Locales, because they don't know how to handle some newer
locales (broken parsing and so on). Such code does not handle the fact
that Locales can be more than language + country + variant. Since Java
1.7 they can also have script (seen in your example) and extensions
and support BCP 47.

See the documentation for the java Locale class for what I mean, it is
very thorough. https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html
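
The locale from this thread makes a handy demo: it round-trips cleanly
through the BCP 47 APIs, while its legacy toString() form is exactly the
string Derby chokes on:

import java.util.Locale;

public class LocaleDemo {
  public static void main(String[] args) {
    Locale srLatn = Locale.forLanguageTag("sr-Latn");
    System.out.println(srLatn.toLanguageTag()); // sr-Latn   (BCP 47 form)
    System.out.println(srLatn.getScript());     // Latn      (script, new in Java 7)
    System.out.println(srLatn.toString());      // sr__#Latn (what old parsers see)
  }
}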


On Fri, Oct 20, 2017 at 4:40 PM, Mike Drob  wrote:
> I think you're going to be stuck with doing something like
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/KerberosTestServices.java#L55
> where there is a hard-coded list of locales that we know are broken.
>
> On Fri, Oct 20, 2017 at 3:31 PM, Hrishikesh Gadre 
> wrote:
>>
>> Hi Dawid,
>>
>> Thanks for the feedback.
>>
>> Here is one failure scenario,
>>
>> Locale configured (via -Dtests.locale) -> sr-Latn
>>
>> The error message,
>>
>> ERROR XBM0X: Supplied locale description 'sr__#Latn' is invalid, expecting
>> ln[_CO[_variant]]
>>
>> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
>> ISO-3166 country codes, see java.util.Locale.
>>
>> Note that if I use "sr-Latn-BA" instead, the test passes. My gut feeling
>> is that "sr-Latn" is not a valid locale string as it is not listed here,
>> http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html
>>
>>
>> Another failure is
>>
>> Locale configured (via -Dtests.locale) -> und
>>
>> The error message is
>>
>> Supplied locale description '' is invalid, expecting ln[_CO[_variant]]
>>
>> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
>> ISO-3166 country codes, see java.util.Locale.
>>
>> For the time being, I am hard-coding these failure-causing locales in the
>> junit assume(...) so that I can skip the execution. But this is not
>> foolproof since there may be more locale configurations which may not work
>> with Derby. So I wonder if there is any way to suppress this locale
>> randomization altogether?
>>
>> Thanks
>> Hrishikesh
>>
>>
>>
>> On Fri, Oct 20, 2017 at 12:39 PM, Dawid Weiss 
>> wrote:
>>>
>>> Only valid locales (for your Java) are selected, so this has to be an
>>> error. What failures do you see? Perhaps they should be reported to
>>> Derby?
>>>
>>> Dawid
>>>
>>> On Fri, Oct 20, 2017 at 8:14 PM, Hrishikesh Gadre 
>>> wrote:
>>> > Hi,
>>> >
>>> > I am currently implementing solr authorization plugin backed by Apache
>>> > Sentry. For the unit tests, I am using Solr test framework
>>> > (specifically
>>> > SolrCloudTestCase class). Occasionally I see unit test failures since
>>> > the
>>> > sentry tests use Derby in-memory database and it doesn't work properly
>>> > for
>>> > some of the Locale(s) configured by the Solr test framework.
>>> >
>>> > Couple of questions
>>> >
>>> > (a) Does the Solr test framework generate only valid Locale(s) or a
>>> > mix of
>>> > valid/invalid Locale(s) ? The reason I am asking is that I have a test
>>> > failure with Locale as "sr-Latn". But it is not included in the list of
>>> > valid Locales supported by Java 8
>>> >
>>> > (http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html).
>>> >
>>> > (b) Is there a way to turn off this Locale randomization?
>>> >
>>> >
>>> > Thanks
>>> > Hrishikesh
>>> >
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr test framework - Locale randomization

2017-10-20 Thread Mike Drob
I think you're going to be stuck with doing something like
https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/KerberosTestServices.java#L55
where there is a hard-coded list of locales that we know are broken.

On Fri, Oct 20, 2017 at 3:31 PM, Hrishikesh Gadre 
wrote:

> Hi Dawid,
>
> Thanks for the feedback.
>
> Here is one failure scenario,
>
> Locale configured (via -Dtests.locale) -> sr-Latn
>
> The error message,
>
> ERROR XBM0X: Supplied locale description 'sr__#Latn' is invalid, expecting
> ln[_CO[_variant]]
>
> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
> ISO-3166 country codes, see java.util.Locale.
> Note that if I use "sr-Latn-BA" instead, the test passes. My gut feeling
> is that "sr-Latn" is not a valid locale string as it is not listed here,
> http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html
>
>
> Another failure is
>
> Locale configured (via -Dtests.locale) -> und
>
> The error message is
>
> Supplied locale description '' is invalid, expecting ln[_CO[_variant]]
>
> ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
> ISO-3166 country codes, see java.util.Locale.
>
> For the time being, I am hard-coding these failure-causing locales in the
> junit assume(...) so that I can skip the execution. But this is not
> foolproof since there may be more locale configurations which may not work
> with Derby. So I wonder if there is any way to suppress this locale
> randomization altogether?
> Thanks
> Hrishikesh
>
>
>
> On Fri, Oct 20, 2017 at 12:39 PM, Dawid Weiss 
> wrote:
>
>> Only valid locales (for your Java) are selected, so this has to be an
>> error. What failures do you see? Perhaps they should be reported to
>> Derby?
>>
>> Dawid
>>
>> On Fri, Oct 20, 2017 at 8:14 PM, Hrishikesh Gadre 
>> wrote:
>> > Hi,
>> >
>> > I am currently implementing solr authorization plugin backed by Apache
>> > Sentry. For the unit tests, I am using Solr test framework (specifically
>> > SolrCloudTestCase class). Occasionally I see unit test failures since
>> the
>> > sentry tests use Derby in-memory database and it doesn't work properly
>> for
>> > some of the Locale(s) configured by the Solr test framework.
>> >
>> > Couple of questions
>> >
>> > (a) Does the Solr test framework generate only valid Locale(s) or a
>> mix of
>> > valid/invalid Locale(s) ? The reason I am asking is that I have a test
>> > failure with Locale as "sr-Latn". But it is not included in the list of
>> > valid Locales supported by Java 8
>> > (http://www.oracle.com/technetwork/java/javase/java8locales-
>> 2095355.html).
>> >
>> > (b) Is there a way to turn off this Locale randomization?
>> >
>> >
>> > Thanks
>> > Hrishikesh
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>


[jira] [Commented] (SOLR-11516) Unified highlighter with word separator never gives context to the left

2017-10-20 Thread Tim Retout (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213196#comment-16213196
 ] 

Tim Retout commented on SOLR-11516:
---

Huh, okay, that's fair enough - it greatly surprises me that you don't want 
word separation!  This seems to be very common on, say, Google Search.  I'm not 
clear in what way the semantics are meant to be different here, but I've 
probably been lulled into this by the similarity with FVH.

However, I can either increase the fragsize and truncate sentences client-side, 
or use a different highlighter.  I'd agree with removing the option if it's not 
wanted.

Thanks!

> Unified highlighter with word separator never gives context to the left
> ---
>
> Key: SOLR-11516
> URL: https://issues.apache.org/jira/browse/SOLR-11516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 6.4, 7.1
>Reporter: Tim Retout
>
> When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
> context to the left of the matches returned; only words to the right of each 
> match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.
> Without context to the left of a match, the highlighted snippets are much 
> less useful for understanding where the match appears in a document.
> As an example, using the techproducts data with Solr 7.1, given a search for 
> "apple", highlighting the "features" field:
> http://localhost:8983/solr/techproducts/select?hl.fl=features=on=apple=WORD=30=unified
> I see this snippet:
> "Apple Lossless, H.264 video"
> Note that "Apple" is anchored to the left.  Compare with the original 
> highlighter:
> http://localhost:8983/solr/techproducts/select?hl.fl=features=on=apple=30
> And the match has context either side:
> ", Audible, Apple Lossless, H.264 video"
> (To complicate this, in general I am not sure that the unified highlighter is 
> respecting the hl.fragsize parameter, although [SOLR-9935] suggests support 
> was added.  I included the hl.fragsize param in the unified URL too, but it's 
> making no difference unless set to 0.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand

2017-10-20 Thread Otis Gospodnetic
Bravo!

Otis
--
http://sematext.com

> On Oct 20, 2017, at 13:04, Varun Thacker  wrote:
> 
> Congratulations Adrien!
> 
>> On Fri, Oct 20, 2017 at 9:45 AM, Tomas Fernandez Lobbe  
>> wrote:
>> Congratulations Adrien!
>> 
>>> On Oct 19, 2017, at 10:51 AM, Martin Gainty  wrote:
>>> 
>>> Félicitations Adrien!
>>> 
>>> Martin 
>>> __ 
>>> 
>>> From: ansh...@apple.com  on behalf of Anshum Gupta 
>>> 
>>> Sent: Thursday, October 19, 2017 11:52 AM
>>> To: dev@lucene.apache.org
>>> Subject: Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand
>>>  
>>> Congratulations Adrien!
>>> 
>>> -Anshum
>>> 
>>> 
>>> 
 On Oct 19, 2017, at 12:19 AM, Tommaso Teofili  
 wrote:
 
 Once a year the Lucene PMC rotates the PMC chair and Apache Vice President 
 position.
 This year we have nominated and elected Adrien Grand as the chair and 
 today the board just approved it, so now it's official.
 
 Congratulations Adrien!
 Regards,
 Tommaso
>> 
> 


[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213189#comment-16213189
 ] 

Robert Muir commented on LUCENE-7976:
-

There are more options, Shawn. It's a bug that we created this 20x-too-big 
segment to begin with. The configured merge policy is not configured to create 
a segment that big. [~msoko...@gmail.com]'s suggestion about fixing that seems 
like the correct fix.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr test framework - Locale randomization

2017-10-20 Thread Hrishikesh Gadre
Hi Dawid,

Thanks for the feedback.

Here is one failure scenario,

Locale configured (via -Dtests.locale) -> sr-Latn

The error message,

ERROR XBM0X: Supplied locale description 'sr__#Latn' is invalid, expecting
ln[_CO[_variant]]

ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
ISO-3166 country codes, see java.util.Locale.
Note that if I use "sr-Latn-BA" instead, the test passes. My gut feeling is
that "sr-Latn" is not a valid locale string as it is not listed here,
http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html


Another failure is

Locale configured (via -Dtests.locale) -> und

The error message is

Supplied locale description '' is invalid, expecting ln[_CO[_variant]]

ln=lower-case two-letter ISO-639 language code, CO=upper-case two-letter
ISO-3166 country codes, see java.util.Locale.

For the time being, I am hard-coding these failure-causing locales in the
junit assume(...) so that I can skip the execution. But this is not
foolproof since there may be more locale configurations which may not work
with Derby. So I wonder if there is any way to suppress this locale
randomization altogether?
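
The guard is roughly this (the tag list is illustrative, which is exactly
the not-foolproof part):

import java.util.Arrays;
import java.util.Locale;
import org.junit.Assume;

class DerbyLocaleGuard {
  // skip the test when the randomized default locale is one Derby rejects
  static void assumeDerbyFriendlyLocale() {
    Locale locale = Locale.getDefault();
    Assume.assumeFalse("Derby cannot parse locale " + locale,
        Arrays.asList("sr-Latn", "und").contains(locale.toLanguageTag()));
  }
}
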
Thanks
Hrishikesh



On Fri, Oct 20, 2017 at 12:39 PM, Dawid Weiss  wrote:

> Only valid locales (for your Java) are selected, so this has to be an
> error. What failures do you see? Perhaps they should be reported to
> Derby?
>
> Dawid
>
> On Fri, Oct 20, 2017 at 8:14 PM, Hrishikesh Gadre 
> wrote:
> > Hi,
> >
> > I am currently implementing solr authorization plugin backed by Apache
> > Sentry. For the unit tests, I am using Solr test framework (specifically
> > SolrCloudTestCase class). Occasionally I see unit test failures since the
> > sentry tests use Derby in-memory database and it doesn't work properly for
> > some of the Locale(s) configured by the Solr test framework.
> >
> > Couple of questions
> >
> > (a) Does the Solr test framework generate only valid Locale(s) or a mix of
> > valid/invalid Locale(s)? The reason I am asking is that I have a test
> > failure with Locale as "sr-Latn". But it is not included in the list of
> > valid Locales supported by Java 8
> > (http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html).
> >
> > (b) Is there a way to turn off this Locale randomization?
> >
> >
> > Thanks
> > Hrishikesh
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_144) - Build # 6969 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/6969/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseSerialGC

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.core.TestSolrConfigHandler

Error Message:
Could not remove the following files (in the order of attempts):
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores\core:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores\core

C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores

C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012
 

Stack Trace:
java.io.IOException: Could not remove the following files (in the order of 
attempts):
   
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores\core:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores\core
   
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012\cores
   
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler_365A47A31E663158-001\tempDir-012

at __randomizedtesting.SeedInfo.seed([365A47A31E663158]:0)
at org.apache.lucene.util.IOUtils.rm(IOUtils.java:329)
at 
org.apache.lucene.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:216)
at 
com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  
junit.framework.TestSuite.org.apache.solr.client.solrj.io.stream.StreamingTest

Error Message:
Error from server at http://127.0.0.1:49161/solr: create the collection time 
out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:49161/solr: create the collection time out:180s
at __randomizedtesting.SeedInfo.seed([C05B10D70F51BC77]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
   

[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213175#comment-16213175
 ] 

Shawn Heisey commented on LUCENE-7976:
--

Very interesting discussion and problem.

If we ignore for a moment what TMP actually does, and back up to the design 
intent when the policy was made ... what would the designer have wanted to 
happen in the case of a segment that's considerably larger than the configured 
max size?  Took me a while to find the right issue, which is LUCENE-854, work 
by [~mikemccand].

I suspect that the current behavior, where a segment that's 20 times larger 
than the configured max segment size is ineligible for automatic merging until 
it reaches 97.5 percent deleted docs, was not actually what was desired.  
Indexes with a segment like that might not have even been considered when TMP 
was new.  I don't see anything in LUCENE-854 that mentions it.  I haven't 
checked all the later issues where changes to TMP were made.
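For concreteness, the 97.5 percent figure follows directly from the sizes in Erick's example; a small arithmetic sketch (values from the example above):

{code}
// A 100GB segment under a 5GB max merged segment size only becomes
// merge-eligible once its live docs would fit under the cap.
double maxSegGB = 5.0;
double segGB = 100.0;
double requiredDeletedPct = 100.0 * (1.0 - maxSegGB / segGB);  // = 97.5
{code}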

So, how do we deal with this problem?  I see three options.  We can design an 
entirely new policy, and if its behavior becomes preferred, consider changing 
the default at a later date.  We can change TMP so it behaves better with very 
large segments with no change in user code or config.  We can add Erick's 
suggested option.  For any of these options, improved documentation is a must.

The second option (and the latter half of the first option) carries one risk 
factor I can think of -- users complaining about the new behavior, in a manner 
similar to the complaints I heard when the default directory was changed to MMAP.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-5.5-Linux (32bit/jdk1.7.0_80) - Build # 485 - Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-5.5-Linux/485/
Java: 32bit/jdk1.7.0_80 -client -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.schema.TestManagedSchemaAPI.test

Error Message:
Error from server at http://127.0.0.1:43023/solr/testschemaapi_shard1_replica2: 
ERROR: [doc=2] unknown field 'myNewField1'

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from 
server at http://127.0.0.1:43023/solr/testschemaapi_shard1_replica2: ERROR: 
[doc=2] unknown field 'myNewField1'
at 
__randomizedtesting.SeedInfo.seed([25E8F87ECE885D5B:ADBCC7A4607430A3]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:653)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1002)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:891)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:827)
at 
org.apache.solr.schema.TestManagedSchemaAPI.testAddFieldAndDocument(TestManagedSchemaAPI.java:101)
at 
org.apache.solr.schema.TestManagedSchemaAPI.test(TestManagedSchemaAPI.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 

[jira] [Updated] (SOLR-11446) Heavily edit the "near real time searching" page in the reference guide

2017-10-20 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11446:
-
Fix Version/s: (was: 7.2)
   7.1

> Heavily edit the "near real time searching" page in the reference guide
> ---
>
> Key: SOLR-11446
> URL: https://issues.apache.org/jira/browse/SOLR-11446
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 7.1, master (8.0)
>
> Attachments: SOLR-11446.patch
>
>
> That page needs some focus, I'll attach a draft in a second (edited late last 
> night and not proofread yet).
> Feedback welcome



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11446) Heavily edit the "near real time searching" page in the reference guide

2017-10-20 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213141#comment-16213141
 ] 

Cassandra Targett commented on SOLR-11446:
--

I decided to backport this to 7.1 since there's no reason to wait to publish it.

> Heavily edit the "near real time searching" page in the reference guide
> ---
>
> Key: SOLR-11446
> URL: https://issues.apache.org/jira/browse/SOLR-11446
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 7.1, master (8.0)
>
> Attachments: SOLR-11446.patch
>
>
> That page needs some focus, I'll attach a draft in a second (edited late last 
> night and not proofread yet).
> Feedback welcome



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11446) Heavily edit the "near real time searching" page in the reference guide

2017-10-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213138#comment-16213138
 ] 

ASF subversion and git services commented on SOLR-11446:


Commit 4d91e887a54a8c6b3bae0e71dbd6e741033b756e in lucene-solr's branch 
refs/heads/branch_7_1 from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4d91e88 ]

SOLR-11446: Heavily edit the 'near real time searching' page in the reference 
guide


> Heavily edit the "near real time searching" page in the reference guide
> ---
>
> Key: SOLR-11446
> URL: https://issues.apache.org/jira/browse/SOLR-11446
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11446.patch
>
>
> That page needs some focus, I'll attach a draft in a second (edited late last 
> night and not proofread yet).
> Feedback welcome



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11446) Heavily edit the "near real time searching" page in the reference guide

2017-10-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213139#comment-16213139
 ] 

ASF subversion and git services commented on SOLR-11446:


Commit 8bde1e33b80c264dfc0aa6f3771e742241494d18 in lucene-solr's branch 
refs/heads/branch_7_1 from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8bde1e3 ]

SOLR-11446: Heavily edit the 'near real time searching' page in the reference 
guide, fix doc build error


> Heavily edit the "near real time searching" page in the reference guide
> ---
>
> Key: SOLR-11446
> URL: https://issues.apache.org/jira/browse/SOLR-11446
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 7.2, master (8.0)
>
> Attachments: SOLR-11446.patch
>
>
> That page needs some focus, I'll attach a draft in a second (edited late last 
> night and not proofread yet).
> Feedback welcome



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr test framework - Locale randomization

2017-10-20 Thread Dawid Weiss
Only valid locales (for your Java) are selected, so this has to be an
error. What failures do you see? Perhaps they should be reported to
Derby?

Dawid

On Fri, Oct 20, 2017 at 8:14 PM, Hrishikesh Gadre  wrote:
> Hi,
>
> I am currently implementing solr authorization plugin backed by Apache
> Sentry. For the unit tests, I am using Solr test framework (specifically
> SolrCloudTestCase class). Occasionally I see unit test failures since the
> sentry tests use Derby in-memory database and it doesn't work properly for
> some of the Locale(s) configured by the Solr test framework.
>
> Couple of questions
>
> (a) Does the Solr test framework generate only valid Locale(s) or a mix of
> valid/invalid Locale(s)? The reason I am asking is that I have a test
> failure with Locale as "sr-Latn". But it is not included in the list of
> valid Locales supported by Java 8
> (http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html).
>
> (b) Is there a way to turn off this Locale randomization?
>
>
> Thanks
> Hrishikesh
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-7733) remove/rename "optimize" references in the UI.

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213048#comment-16213048
 ] 

Yonik Seeley edited comment on SOLR-7733 at 10/20/17 7:30 PM:
--

Coupling with the Lucene JIRA would seem to only make things more confusing... 
and my previous points in this JIRA stand.
edit: Actually, my opinion is shifting.  While a warning on the GUI for the 
rewrite-entire-index cost seemed sufficient, it's harder to succinctly explain 
that it will make it harder for deletes to be automatically merged away in the 
future.

bq. Suggestion: change the discussion in the ref guide for optimize to 
something like "controlling deleted docs percentage"

Not sure I understand... that's not the only thing optimize is for.  Some 
operations happen much more quickly on an optimized index.



was (Author: ysee...@gmail.com):
Coupling with the Lucene JIRA would seem to only make things more confusing... 
and my previous points in this JIRA stand.

bq. Suggestion: change the discussion in the ref guide for optimize to 
something like "controlling deleted docs percentage"

Not sure I understand... that's not the only thing optimize is for.  Some 
operations happen much more quickly on an optimized index.


> remove/rename "optimize" references in the UI.
> --
>
> Key: SOLR-7733
> URL: https://issues.apache.org/jira/browse/SOLR-7733
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI
>Affects Versions: 5.3, 6.0
>Reporter: Erick Erickson
>Assignee: Upayavira
>Priority: Minor
> Attachments: SOLR-7733.patch
>
>
> Since optimizing indexes is kind of a special circumstance thing, what do we 
> think about removing (or renaming) optimize-related stuff on the core admin 
> and core overview pages? The "optimize" button is already gone from the core 
> admin screen (was this intentional?).
> My personal feeling is that we should remove this entirely as it's too easy 
> to think "Of course I want my index optimized" and "look, this screen says my 
> index isn't optimized, that must mean I should optimize it".
> The core admin screen and the core overview page both have an "optimized" 
> checkmark, I propose just removing it from the "overview" page and on the 
> "core admin" page changing it to "Segment Count #". NOTE: the "overview" page 
> already has a "Segment Count" entry.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10299) Provide search for online Ref Guide

2017-10-20 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213087#comment-16213087
 ] 

Alexandre Rafalovitch commented on SOLR-10299:
--

Just to add this for reference: there is a Javascript search library called Lunr.js ("A 
bit like Solr, but much smaller and not as bright." - their words, not mine): 
https://lunrjs.com/

They do support [pre-built 
indexes|https://lunrjs.com/guides/index_prebuilding.html], but the pipeline is 
Node.js based and does have some dependencies. It also requires JSON documents 
to be generated for indexing (like Solr, I guess).

The other idea is to make the Ref Guide one of the shipped (or downloadable) 
examples, so it becomes bin/solr start -e refguide, possibly with a browse 
interface or what not. I'm not sure of the effort involved, though, or whether 
release timelines would line up.

> Provide search for online Ref Guide
> ---
>
> Key: SOLR-10299
> URL: https://issues.apache.org/jira/browse/SOLR-10299
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>
> The POC to move the Ref Guide off Confluence did not address providing 
> full-text search of the page content. Not because it's hard or impossible, 
> but because there were plenty of other issues to work on.
> The current HTML page design provides a title index, but to replicate the 
> current Confluence experience, the online version(s) need to provide a 
> full-text search experience.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213085#comment-16213085
 ] 

Yonik Seeley commented on LUCENE-7976:
--

The max segment size is great for a number of reasons:
  - By default, prevents an unpredictable huge cascading merge *when the user 
doesn't want it*
  - By default, prevents a huge segment if the user never wants huge segments

The downside to a max segment size is that one can start getting many more 
segments than anticipated or desired (which can impact performance in 
unpredictable ways, depending on the exact usage).
If a user *specifically* asks to forceMerge (i.e. they realized they have 200 
segments and they want to bring that down to 20), then that should be respected.
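That explicit request is a one-liner on the Lucene side; a minimal sketch, assuming 'writer' is an open org.apache.lucene.index.IndexWriter:

{code}
// User-initiated: merge the index down to at most 20 segments.
writer.forceMerge(20);  // maxNumSegments = 20
writer.commit();
{code}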

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11516) Unified highlighter with word separator never gives context to the left

2017-10-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213063#comment-16213063
 ] 

David Smiley commented on SOLR-11516:
-

Ok.  At least with respect to the issue title... this issue can probably be 
resolved won't-fix, or perhaps we should outright remove the WORD & CHARACTER 
options since they are bad options.  (FWIW I don't recall including them; they 
were probably inherited options from similar code for the FVH.)

Can you try simply increasing hl.fragsize a bunch more, and then, if the 
result is too long, trimming client-side?
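A sketch of that suggestion via SolrJ (the parameter names come from this thread; the value 150 is arbitrary, and the stock default for hl.fragsize is 100):

{code}
import org.apache.solr.client.solrj.SolrQuery;

// Illustrative only: ask the unified highlighter for larger snippets,
// then trim client-side if they come back too long.
SolrQuery q = new SolrQuery("apple");
q.set("hl", true);
q.set("hl.fl", "features");
q.set("hl.method", "unified");
q.set("hl.fragsize", 150);
{code}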

FWIW there is an already coded option on the Lucene end of this to have 
hl.fragsize be a target/average such that the snippet will break on the side 
closest to the target (either ahead or before).  There is no Solr option to 
enable this; it's a TODO.  The current setting always picks the earliest 
break, even if the next break is only a word beyond the target.

Snippeting is hard to satisfy everyone with.  There are many ways to skin this 
cat.

> Unified highlighter with word separator never gives context to the left
> ---
>
> Key: SOLR-11516
> URL: https://issues.apache.org/jira/browse/SOLR-11516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 6.4, 7.1
>Reporter: Tim Retout
>
> When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
> context to the left of the matches returned; only words to the right of each 
> match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.
> Without context to the left of a match, the highlighted snippets are much 
> less useful for understanding where the match appears in a document.
> As an example, using the techproducts data with Solr 7.1, given a search for 
> "apple", highlighting the "features" field:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified
> I see this snippet:
> "Apple Lossless, H.264 video"
> Note that "Apple" is anchored to the left.  Compare with the original 
> highlighter:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30
> And the match has context either side:
> ", Audible, Apple Lossless, H.264 video"
> (To complicate this, in general I am not sure that the unified highlighter is 
> respecting the hl.fragsize parameter, although [SOLR-9935] suggests support 
> was added.  I included the hl.fragsize param in the unified URL too, but it's 
> making no difference unless set to 0.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Timothy M. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213052#comment-16213052
 ] 

Timothy M. Rodriguez commented on LUCENE-7976:
--

I didn't know that! Thanks for pointing it out.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Solaris (64bit/jdk1.8.0) - Build # 254 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Solaris/254/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseSerialGC

2 tests failed.
FAILED:  org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration

Error Message:
Timed out waiting for replicas of collection to be 2 again null Live Nodes: 
[127.0.0.1:61145_solr] Last available state: 
DocCollection(testIntegration//collections/testIntegration/state.json/7)={   
"pullReplicas":"0",   "replicationFactor":"2",   "shards":{"shard1":{   
"range":"8000-7fff",   "state":"active",   "replicas":{ 
"core_node3":{   "core":"testIntegration_shard1_replica_n1",   
"base_url":"http://127.0.0.1:47014/solr;,   
"node_name":"127.0.0.1:47014_solr",   "state":"down",   
"type":"NRT"}, "core_node4":{   
"core":"testIntegration_shard1_replica_n2",   
"base_url":"http://127.0.0.1:61145/solr;,   
"node_name":"127.0.0.1:61145_solr",   "state":"active",   
"type":"NRT",   "leader":"true",   "router":{"name":"compositeId"}, 
  "maxShardsPerNode":"1",   "autoAddReplicas":"false",   "nrtReplicas":"2",   
"tlogReplicas":"0"}

Stack Trace:
java.lang.AssertionError: Timed out waiting for replicas of collection to be 2 
again
null
Live Nodes: [127.0.0.1:61145_solr]
Last available state: 
DocCollection(testIntegration//collections/testIntegration/state.json/7)={
  "pullReplicas":"0",
  "replicationFactor":"2",
  "shards":{"shard1":{
  "range":"8000-7fff",
  "state":"active",
  "replicas":{
"core_node3":{
  "core":"testIntegration_shard1_replica_n1",
  "base_url":"http://127.0.0.1:47014/solr;,
  "node_name":"127.0.0.1:47014_solr",
  "state":"down",
  "type":"NRT"},
"core_node4":{
  "core":"testIntegration_shard1_replica_n2",
  "base_url":"http://127.0.0.1:61145/solr;,
  "node_name":"127.0.0.1:61145_solr",
  "state":"active",
  "type":"NRT",
  "leader":"true",
  "router":{"name":"compositeId"},
  "maxShardsPerNode":"1",
  "autoAddReplicas":"false",
  "nrtReplicas":"2",
  "tlogReplicas":"0"}
at 
__randomizedtesting.SeedInfo.seed([7E47DCBFD04BDBC6:CE26D293F5747AE3]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.SolrCloudTestCase.waitForState(SolrCloudTestCase.java:269)
at 
org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration(ExecutePlanActionTest.java:209)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9) - Build # 20702 - Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20702/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseG1GC --illegal-access=deny

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.security.BasicAuthIntegrationTest

Error Message:
Error from server at https://127.0.0.1:35439/solr: create the collection time 
out:180s

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://127.0.0.1:35439/solr: create the collection time out:180s
at __randomizedtesting.SeedInfo.seed([A44609501A79B9FC]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
at 
org.apache.solr.security.BasicAuthIntegrationTest.setupCluster(BasicAuthIntegrationTest.java:81)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:874)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:844)




Build Log:
[...truncated 12290 lines...]
   [junit4] Suite: org.apache.solr.security.BasicAuthIntegrationTest
   [junit4]   2> Creating dataDir: 
/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-core/test/J1/temp/solr.security.BasicAuthIntegrationTest_A44609501A79B9FC-001/init-core-data-001
   [junit4]   2> 548958 WARN  
(SUITE-BasicAuthIntegrationTest-seed#[A44609501A79B9FC]-worker) [] 
o.a.s.SolrTestCaseJ4 startTrackingSearchers: numOpens=4 numCloses=4
   [junit4]   2> 548958 INFO  
(SUITE-BasicAuthIntegrationTest-seed#[A44609501A79B9FC]-worker) [] 
o.a.s.SolrTestCaseJ4 Using PointFields (NUMERIC_POINTS_SYSPROP=true) 
w/NUMERIC_DOCVALUES_SYSPROP=true
   [junit4]   2> 548959 INFO  
(SUITE-BasicAuthIntegrationTest-seed#[A44609501A79B9FC]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (false) via: 

[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213050#comment-16213050
 ] 

Yonik Seeley commented on LUCENE-7976:
--

bq. If we don't, other smaller bugs can come up for users, such as ulimits on 
file size, that they thought they were safely under.

Max segment sizes are a target, not a hard guarantee... Lucene doesn't know 
exactly how big the segment will be before it actually completes the merge, and 
it can end up going over the limit.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (i.e. the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7733) remove/rename "optimize" references in the UI.

2017-10-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213048#comment-16213048
 ] 

Yonik Seeley commented on SOLR-7733:


Coupling with the Lucene JIRA would seem to only make things more confusing... 
and my previous points in this JIRA stand.

bq. Suggestion: change the discussion in the ref guide for optimize to 
something like "controlling deleted docs percentage"

Not sure I understand... that's not the only thing optimize is for.  Some 
operations happen much more quickly on an optimized index.


> remove/rename "optimize" references in the UI.
> --
>
> Key: SOLR-7733
> URL: https://issues.apache.org/jira/browse/SOLR-7733
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI
>Affects Versions: 5.3, 6.0
>Reporter: Erick Erickson
>Assignee: Upayavira
>Priority: Minor
> Attachments: SOLR-7733.patch
>
>
> Since optimizing indexes is kind of a special circumstance thing, what do we 
> think about removing (or renaming) optimize-related stuff on the core admin 
> and core overview pages? The "optimize" button is already gone from the core 
> admin screen (was this intentional?).
> My personal feeling is that we should remove this entirely as it's too easy 
> to think "Of course I want my index optimized" and "look, this screen says my 
> index isn't optimized, that must mean I should optimize it".
> The core admin screen and the core overview page both have an "optimized" 
> checkmark, I propose just removing it from the "overview" page and on the 
> "core admin" page changing it to "Segment Count #". NOTE: the "overview" page 
> already has a "Segment Count" entry.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8000) Document Length Normalization in BM25Similarity correct?

2017-10-20 Thread Timothy M. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213042#comment-16213042
 ] 

Timothy M. Rodriguez commented on LUCENE-8000:
--

[~rcmuir] thanks for the further explanation.  That helped clarify. It does 
seem the effect would be minor at best.  It'd be an interesting experiment at 
some point, though.  If I ever get to trying it, I'll post back.

[~gol...@detego-software.de] As an additional point, advanced use cases often 
utilize token "stacking" as well, and those stacked tokens would distort 
length even further.  For example, some folks use analysis chains that stack 
variants of urls, currencies, etc.
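For anyone who wants to run that experiment, the knob under discussion is settable on the similarity; a minimal sketch (assuming 'analyzer' is the analysis chain under test):

{code}
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.similarities.BM25Similarity;

// Sketch: count overlapping positions (stacked tokens) toward document
// length, as the issue proposes for the BM25 default.
BM25Similarity sim = new BM25Similarity();  // default k1 = 1.2, b = 0.75
sim.setDiscountOverlaps(false);             // include overlaps in the norm
IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
iwc.setSimilarity(sim);
{code}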

> Document Length Normalization in BM25Similarity correct?
> 
>
> Key: LUCENE-8000
> URL: https://issues.apache.org/jira/browse/LUCENE-8000
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Christoph Goller
>Priority: Minor
>
> The length of an individual document only counts the number of positions in 
> the document, since discountOverlaps defaults to true.
> {code}
> @Override
> public final long computeNorm(FieldInvertState state) {
>   final int numTerms = discountOverlaps ? state.getLength() - state.getNumOverlap() : state.getLength();
>   int indexCreatedVersionMajor = state.getIndexCreatedVersionMajor();
>   if (indexCreatedVersionMajor >= 7) {
>     return SmallFloat.intToByte4(numTerms);
>   } else {
>     return SmallFloat.floatToByte315((float) (1 / Math.sqrt(numTerms)));
>   }
> }
> {code}
> Measuring document length this way seems perfectly ok to me. What bothers 
> me is that average document length is based on sumTotalTermFreq for a field. 
> As far as I understand, that sums up totalTermFreqs for all terms of a field, 
> therefore counting positions of terms including those that overlap.
> {code}
> protected float avgFieldLength(CollectionStatistics collectionStats) {
>   final long sumTotalTermFreq = collectionStats.sumTotalTermFreq();
>   if (sumTotalTermFreq <= 0) {
>     return 1f;   // field does not exist, or stat is unsupported
>   } else {
>     final long docCount = collectionStats.docCount() == -1 ? collectionStats.maxDoc() : collectionStats.docCount();
>     return (float) (sumTotalTermFreq / (double) docCount);
>   }
> }
> {code}
> Are we comparing apples and oranges in the final scoring?
> I haven't run any benchmarks and I am not sure whether this has a serious 
> effect. It just means that documents that have synonyms (or, in my use case, 
> different normal forms of tokens on the same position) are shorter, and 
> therefore get higher scores than they should, and that we do not use the 
> whole spectrum of relative document length in BM25.
> I think for BM25, discountOverlaps should default to false.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7733) remove/rename "optimize" references in the UI.

2017-10-20 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213025#comment-16213025
 ] 

Erick Erickson commented on SOLR-7733:
--

I think we should couple these together as the Lucene JIRA makes it much more 
palatable to remove this from the UI and de-emphasize it in the reference guide.

Suggestion: change the discussion in the ref guide for optimize to something 
like "controlling deleted docs percentage" (yuck, but you get the idea) 'cause 
that's what's really at issue here.

> remove/rename "optimize" references in the UI.
> --
>
> Key: SOLR-7733
> URL: https://issues.apache.org/jira/browse/SOLR-7733
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI
>Affects Versions: 5.3, 6.0
>Reporter: Erick Erickson
>Assignee: Upayavira
>Priority: Minor
> Attachments: SOLR-7733.patch
>
>
> Since optimizing indexes is kind of a special circumstance thing, what do we 
> think about removing (or renaming) optimize-related stuff on the core admin 
> and core overview pages? The "optimize" button is already gone from the core 
> admin screen (was this intentional?).
> My personal feeling is that we should remove this entirely as it's too easy 
> to think "Of course I want my index optimized" and "look, this screen says my 
> index isn't optimized, that must mean I should optimize it".
> The core admin screen and the core overview page both have an "optimized" 
> checkmark, I propose just removing it from the "overview" page and on the 
> "core admin" page changing it to "Segment Count #". NOTE: the "overview" page 
> already has a "Segment Count" entry.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Timothy M. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213024#comment-16213024
 ] 

Timothy M. Rodriguez commented on LUCENE-7976:
--

An additional place where deletions come up is in replica differences due to 
the way merging happened on a shard.  This can cause jitter in results where 
the ordering will depend on which shard answered a query because the 
frequencies are off significantly enough.  I know this problem will never go 
away completely as we can't flush away deletes immediately, but allowing some 
reclamation of deletes in large segments will help minimize the issue.

On max segment size, I also think the merge policy ought to dutifully respect 
maxSegmentSize.  If we don't, other smaller bugs can come up for users, such as 
hitting ulimits on file size that they thought they were safely under.
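
For reference, a minimal sketch of how the existing size cap is configured 
today (the wrapper class and method name are illustrative only; the 
TieredMergePolicy setters are the real Lucene API under discussion):

{code}
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

public class MaxSegmentSizeExample {
  public static IndexWriterConfig withSegmentCap(IndexWriterConfig iwc) {
    TieredMergePolicy tmp = new TieredMergePolicy();
    // Natural merges will not produce segments larger than this cap
    // (default is 5 GB); the complaint above is that forced merges
    // currently ignore it.
    tmp.setMaxMergedSegmentMB(5 * 1024);
    iwc.setMergePolicy(tmp);
    return iwc;
  }
}
{code}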

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213021#comment-16213021
 ] 

Erick Erickson commented on LUCENE-7976:


Mike:

bq: Lucene is quite good at skipping deleted docs during search

That's not the nub of the issue for me. I'm seeing very large indexes, 200-300G 
is quite common lately on a single core. We have customers approaching 100T 
indexes in aggregate in single Solr collections. And that problem is only going 
to get worse as hardware improves and super-especially if Java's GC algorithm 
evolves to work smoothly with larger heaps. BTW, this is not theoretical, I 
have a client using Azul's Zing with Java heaps approaching 80G. It's an edge 
case to be sure, but similar setups will become more common.

So 50% deleted documents consumes a _lot_ of resources, both disk and RAM when 
considered in aggregate at that scale. I realize that any of the options here 
will increase I/O, but that's preferable to having to provision a new data 
center because you're physically out of space and can't add more machines or 
even attach more storage to current machines.

bq: maybe we could simply relax TMP so that even max sized segments that have < 
50% deletions are eligible for merging

Just to be sure I understand this... Are you saying that we make it possible to 
merge, say, one segment with 3.5G and 5 other segments each 0.3G? That seems 
like it'd work.

That leaves finding a way out of what happens when someone actually does have a 
huge segment as a result of force merging. I know, I know, "don't do that" and 
"get rid of the big red optimize button in the Solr admin screen and stop 
talking about it!". I suppose your suggestion can tackle that too if we define 
an edge case in your "relax TMP so that" idea to include a "singleton 
merge" if the _result_ of the merge would be > max segment size.

Thanks for your input! Let's just say I have a lot more faith in your knowledge 
of this code than mine.

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Solr test framework - Locale randomization

2017-10-20 Thread Hrishikesh Gadre
Hi,

I am currently implementing solr authorization plugin backed by Apache
Sentry. For the unit tests, I am using Solr test framework (specifically
SolrCloudTestCase class). Occasionally I see unit test failures since the
sentry tests use Derby in-memory database and it doesn't work properly for
some of the Locale(s) configured by the Solr test framework.

Couple of questions

(a) Does the Solr test framework generate only valid Locales or a mix of
valid/invalid Locales? The reason I am asking is that I have a test
failure with Locale as "sr-Latn". But it is not included in the list of
valid Locales supported by Java 8 (
http://www.oracle.com/technetwork/java/javase/java8locales-2095355.html).

(b) Is there a way to turn off this Locale randomization?


Thanks
Hrishikesh
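
(A hedged note on (b): if memory serves, the randomizedtesting framework 
behind the Lucene/Solr tests honors the {{tests.locale}} and {{tests.timezone}} 
system properties, e.g. {{ant test -Dtests.locale=en-US}}, to pin the otherwise 
random choice. Where pinning isn't an option, a test can also skip the 
Derby-incompatible locales itself; a minimal JUnit 4 sketch follows, in which 
the helper name and the specific locale check are assumptions:)

{code}
import java.util.Locale;
import org.junit.Assume;

public class LocaleGuard {
  // Hypothetical helper: skip rather than fail when the randomized
  // default Locale is one Derby is known to mishandle.
  public static void assumeDerbyFriendlyLocale() {
    Locale l = Locale.getDefault();
    Assume.assumeFalse("Derby misbehaves under locale " + l,
        "sr-Latn".equals(l.toLanguageTag()));
  }
}
{code}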


[jira] [Commented] (SOLR-11032) Update solrj tutorial

2017-10-20 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212973#comment-16212973
 ] 

Cassandra Targett commented on SOLR-11032:
--

bq. of course: the ref-guide build can/should still fail if the "include" tags 
aren't found in the referenced java file (or if the java test file gets 
removed/moved)
bq. (I'm assuming asciidoctor gives an error on this? we should double check 
that)

We do need to check that it will throw such an error, but a broader issue is 
that in many cases (as many as possible) asciidoctor will throw an error but 
continue. From Ant's perspective, the process finished, so it must be good. 
IOW, unless the error is severe enough that asciidoctor's process fails 
entirely, the build doesn't fail due to errors thrown by the process as it 
goes. And it's very conditional when to fail - there are some errors that are 
fine, as we know, but there are others where we might want to fail the build 
even if the process completes.

> Update solrj tutorial
> -
>
> Key: SOLR-11032
> URL: https://issues.apache.org/jira/browse/SOLR-11032
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, SolrJ, website
>Reporter: Karl Richter
> Attachments: SOLR-11032.patch, SOLR-11032.patch
>
>
> The [solrj tutorial](https://wiki.apache.org/solr/Solrj) has the following 
> issues:
>   * It refers to 1.4.0 whereas the current release is 6.x, some classes are 
> deprecated or no longer exist.
>   * Document-object-binding is a crucial feature [which should be working in 
> the meantime](https://issues.apache.org/jira/browse/SOLR-1945) and thus 
> should be covered in the tutorial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-10-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212949#comment-16212949
 ] 

Michael McCandless commented on LUCENE-7976:


I don't think we can allow different max segment sizes for forced merges and 
natural merges; that's effectively the state we are in today and it causes the 
bug (case 1) we have here, because natural merging can't touch the too-big 
segments.  I think we need to fix {{forceMerge}}, and 
{{findForcedDeletesMerges}}, to respect the maximum segment size, and if you 
really want a single segment and your index is bigger than 5 GB (default max 
segment size), you need to increase that maximum.  This would solve case 1 (the 
"I ran {{forceMerge}} and yet continued updating my index" situation).

For case 2, if we also must solve the "even 50% deletions is too much for me" 
case (and I'm not yet sure we should... Lucene is quite good at skipping 
deleted docs during search), maybe we could simply relax TMP so that even max 
sized segments that have < 50% deletions are eligible for merging.  Then, they 
would be considered for natural merging right off, and users could always 
(carefully!) tune up the {{reclaimDeletesWeight}} to more aggressively target 
segments with deletions.
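
To make that last suggestion concrete, here is a minimal sketch of the knobs 
involved (the wrapper class is illustrative and the values are only examples, 
not recommendations):

{code}
import org.apache.lucene.index.TieredMergePolicy;

public class ReclaimDeletesTuning {
  public static TieredMergePolicy favorDeleteReclaim() {
    TieredMergePolicy tmp = new TieredMergePolicy();
    // Default is 2.0; a higher weight makes natural merges prefer
    // segments carrying deletions (tune carefully, per the caveat above).
    tmp.setReclaimDeletesWeight(3.0);
    // forceMergeDeletes() only rewrites segments whose deletion
    // percentage exceeds this threshold (default 10.0).
    tmp.setForceMergeDeletesPctAllowed(10.0);
    return tmp;
  }
}
{code}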

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11475) Endless loop and OOM in PeerSync

2017-10-20 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212935#comment-16212935
 ] 

Pushkar Raste commented on SOLR-11475:
--

Version numbers are monotonically increasing sequence numbers, and for
deletes the sequence number is multiplied by -1.

I don't think we would ever have version number X in a replica's tlog and -X
in the leader's (or any other replica's) tlog.

Can you provide a valid test case for your issue? I am not in front of a
computer right now; however, IIRC the tests have "PeerSync" in the name.


On Oct 20, 2017 5:54 AM, "Andrey Kudryavtsev (JIRA)" 
wrote:


 [ https://issues.apache.org/jira/browse/SOLR-11475?page=
com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Kudryavtsev mentioned you on SOLR-11475
--

I think throwing an exception in case of {{ourUpdates.get(ourUpdatesIndex) ==
-otherVersions.get(otherUpdatesIndex)}} is better than an OOM.

[~praste], [~shalinmangar] What do you think?


comment

Hint: You can mention someone in an issue description or comment by typing
"@" in front of their username.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


> Endless loop and OOM in PeerSync
> 
>
> Key: SOLR-11475
> URL: https://issues.apache.org/jira/browse/SOLR-11475
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrey Kudryavtsev
>
> After the problem described in SOLR-11459, I restarted the cluster and got an 
> OOM on start. 
> [PeerSync#handleVersionsWithRanges|https://github.com/apache/lucene-solr/blob/68bda0be421ce18811e03b229781fd6152fcc04a/solr/core/src/java/org/apache/solr/update/PeerSync.java#L539]
>  contains this logic: 
> {code}
> while (otherUpdatesIndex >= 0) {
>   // we have run out of ourUpdates, pick up all the remaining versions 
> from the other versions
>   if (ourUpdatesIndex < 0) {
> String range = otherVersions.get(otherUpdatesIndex) + "..." + 
> otherVersions.get(0);
> rangesToRequest.add(range);
> totalRequestedVersions += otherUpdatesIndex + 1;
> break;
>   }
>   // stop when the entries get old enough that reorders may lead us to 
> see updates we don't need
>   if (!completeList && Math.abs(otherVersions.get(otherUpdatesIndex)) < 
> ourLowThreshold) break;
>   if (ourUpdates.get(ourUpdatesIndex).longValue() == 
> otherVersions.get(otherUpdatesIndex).longValue()) {
> ourUpdatesIndex--;
> otherUpdatesIndex--;
>   } else if (Math.abs(ourUpdates.get(ourUpdatesIndex)) < 
> Math.abs(otherVersions.get(otherUpdatesIndex))) {
> ourUpdatesIndex--;
>   } else {
> long rangeStart = otherVersions.get(otherUpdatesIndex);
> while ((otherUpdatesIndex < otherVersions.size())
> && (Math.abs(otherVersions.get(otherUpdatesIndex)) < 
> Math.abs(ourUpdates.get(ourUpdatesIndex {
>   otherUpdatesIndex--;
>   totalRequestedVersions++;
> }
> // construct range here
> rangesToRequest.add(rangeStart + "..." + 
> otherVersions.get(otherUpdatesIndex + 1));
>   }
> }
> {code}
> If at some point
> {code} ourUpdates.get(ourUpdatesIndex) == 
> -otherVersions.get(otherUpdatesIndex) {code}
> holds, the loop will never end. It will add the same string again and again 
> into {{rangesToRequest}} until the process runs out of memory.
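
A minimal sketch of the fail-fast guard suggested in the comments above (the 
helper class, its name, and the error message are hypothetical, not the 
committed fix; the variable names follow the quoted snippet):

{code}
import java.util.List;
import org.apache.solr.common.SolrException;

class PeerSyncVersionGuard {
  // Hypothetical helper: abort the sync with a clear error when the two
  // tlogs hold the same version with opposite signs, instead of looping
  // until the process runs out of memory.
  static void checkConflictingVersions(List<Long> ourUpdates, int ourIdx,
                                       List<Long> otherVersions, int otherIdx) {
    if (ourUpdates.get(ourIdx).longValue()
        == -otherVersions.get(otherIdx).longValue()) {
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
          "PeerSync found conflicting versions " + ourUpdates.get(ourIdx)
              + " vs " + otherVersions.get(otherIdx) + "; aborting sync");
    }
  }
}
{code}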



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11516) Unified highlighter with word separator never gives context to the left

2017-10-20 Thread Tim Retout (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212919#comment-16212919
 ] 

Tim Retout commented on SOLR-11516:
---

[~dsmiley] yes, that makes sense. Thanks for the quick reply.

The domain I'm working in (recruitment) is quite similar to a general purpose 
search engine - we have documents of maybe 1000 words, and need to show the 
gist of where the matches appear.  We are happy with cutting off in the middle 
of a sentence, because well-known search engines do it.

When using hl.bs.type=SENTENCE, I have run into examples where the surrounding 
sentences were not pulled in within the fragsize that we had set - 
unfortunately I can't show a quick example of this on the techproducts 
collection, but I can confirm this (and file as a separate issue?) if needed. 
It was something like:

"Foo bar baz. Very long sentence starts here that goes on for several 
hundred chars."

Then a search for "foo" would bring back as a snippet:

"Foo bar baz."

This led to very short summaries of the document, where only one or two short 
"sentences" are provided that match the query, and the total summary was less 
than one line long.

What I was hoping for was a way to use the unified highlighter to produce 
similar summaries to the other highlighter options (i.e. cutting off at word 
boundaries, I think I mean), to take advantage of the performance and 
flexibility advantages described in the documentation.

> Unified highlighter with word separator never gives context to the left
> ---
>
> Key: SOLR-11516
> URL: https://issues.apache.org/jira/browse/SOLR-11516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 6.4, 7.1
>Reporter: Tim Retout
>
> When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
> context to the left of the matches returned; only words to the right of each 
> match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.
> Without context to the left of a match, the highlighted snippets are much 
> less useful for understanding where the match appears in a document.
> As an example, using the techproducts data with Solr 7.1, given a search for 
> "apple", highlighting the "features" field:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified
> I see this snippet:
> "Apple Lossless, H.264 video"
> Note that "Apple" is anchored to the left.  Compare with the original 
> highlighter:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30
> And the match has context either side:
> ", Audible, Apple Lossless, H.264 video"
> (To complicate this, in general I am not sure that the unified highlighter is 
> respecting the hl.fragsize parameter, although [SOLR-9935] suggests support 
> was added.  I included the hl.fragsize param in the unified URL too, but it's 
> making no difference unless set to 0.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Linux (64bit/jdk1.8.0_144) - Build # 637 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/637/
Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.test

Error Message:
Timeout occured while waiting response from server at: 
http://127.0.0.1:45257/_pg

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting 
response from server at: http://127.0.0.1:45257/_pg
at 
__randomizedtesting.SeedInfo.seed([3B1C40B3F78D:B3487F6959716DF8]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:637)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1086)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:867)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:800)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.createServers(AbstractFullDistribZkTestBase.java:315)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:991)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 

[jira] [Updated] (SOLR-10078) Highlighter error: Unknown query type:org.apache.lucene.search.MatchNoDocsQuery

2017-10-20 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-10078:

Component/s: highlighter
Summary: Highlighter error: Unknown query 
type:org.apache.lucene.search.MatchNoDocsQuery  (was: "Unknown query 
type:org.apache.lucene.search.MatchNoDocsQuery" error with Solr v6.3)

> Highlighter error: Unknown query 
> type:org.apache.lucene.search.MatchNoDocsQuery
> ---
>
> Key: SOLR-10078
> URL: https://issues.apache.org/jira/browse/SOLR-10078
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 6.3
>Reporter: Andy Tran
>Priority: Minor
>
> With Solr v6.3, when I issue this query:
> http://localhost:8983/solr/BestBuy/select?wt=json&rows=10&q={!complexphrase%20inOrder=false}_text_:%22maytag~%20(refri~%20OR%20refri*)%20%22&fl=id&hl=true&hl.preserveMulti=false&hl.fragsize=60&hl.fl=nameX,shortDescription,longDescription,artistName,type,manufacturer,department
> I get this error in the JSON response:
> *
> {
>   "responseHeader": {
> "zkConnected": true,
> "status": 500,
> "QTime": 8,
> "params": {
>   "q": "{!complexphrase inOrder=false}_text_:\"maytag~ (refri~ OR refri*) 
> \"",
>   "hl": "true",
>   "hl.preserveMulti": "false",
>   "fl": "id",
>   "hl.fragsize": "60",
>   "hl.fl": 
> "nameX,shortDescription,longDescription,artistName,type,manufacturer,department",
>   "rows": "10",
>   "wt": "json"
> }
>   },
>   "response": {
> "numFound": 2,
> "start": 0,
> "docs": [
>   {
> "id": "5411379"
>   },
>   {
> "id": "5411404"
>   }
> ]
>   },
>   "error": {
> "msg": "Unknown query type:org.apache.lucene.search.MatchNoDocsQuery",
> "trace": "java.lang.IllegalArgumentException: Unknown query 
> type:org.apache.lucene.search.MatchNoDocsQuery\n\tat 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser$ComplexPhraseQuery.addComplexPhraseClause(ComplexPhraseQueryParser.java:388)\n\tat
>  
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:289)\n\tat
>  
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:230)\n\tat
>  
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:522)\n\tat
>  
> org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:218)\n\tat
>  
> org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)\n\tat
>  
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:195)\n\tat
>  
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:602)\n\tat
>  
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingOfField(DefaultSolrHighlighter.java:448)\n\tat
>  
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:410)\n\tat
>  
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:141)\n\tat
>  
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)\n\tat
>  
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)\n\tat
>  org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)\n\tat 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)\n\tat
>  
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>  
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>  
> 

Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand

2017-10-20 Thread Varun Thacker
Congratulations Adrien!

On Fri, Oct 20, 2017 at 9:45 AM, Tomas Fernandez Lobbe 
wrote:

> Congratulations Adrien!
>
> On Oct 19, 2017, at 10:51 AM, Martin Gainty  wrote:
>
> Félicitations Adrien!
>
> Martin
> __
>
> --
> *From:* ansh...@apple.com  on behalf of Anshum Gupta <
> ansh...@apple.com>
> *Sent:* Thursday, October 19, 2017 11:52 AM
> *To:* dev@lucene.apache.org
> *Subject:* Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien
> Grand
>
> Congratulations Adrien!
>
> -Anshum
>
>
>
> On Oct 19, 2017, at 12:19 AM, Tommaso Teofili 
> wrote:
>
> Once a year the Lucene PMC rotates the PMC chair and Apache Vice President
> position.
> This year we have nominated and elected Adrien Grand as the chair and
> today the board just approved it, so now it's official.
>
> Congratulations Adrien!
> Regards,
> Tommaso
>
>
>


[jira] [Commented] (SOLR-11032) Update solrj tutorial

2017-10-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212881#comment-16212881
 ] 

Hoss Man commented on SOLR-11032:
-

First off: big +1 from me on the idea of moving to using include-partial to 
pull in solrj samples.  i love that this sort of thing is now possible with the 
ref-guide.

bq. The attached patch moves the Java snippets out from the using-solrj page, 
to build-able examples that live in {{solr/solr-ref-guide/src/java}}. ...

I haven't looked at the patch in depth, but I would suggest that _instead_ of 
adding new, isolated, examples that live in {{solr/solr-ref-guide/src/java}}, 
these new code snippets belong in tests that live in {{solr/solrj/src/test}}.

that would:
* keep the "PROS" listed (the solrj APIs can't be updated w/o the tests being 
updated)
* new PRO: pulling these snippets from actual junit tests means we can not only 
be confident that they compile, but that they actually do what the doc says 
they'll do -- by adding asserts after/outside the include tags verifying the 
expected outcome
* eliminate the only CONS: no need to change the build dependencies of the 
ref-guide
** building the ref guide won't _depend_ on building solrj, it will just trust 
that the snippets must be valid because the solrj tests (presumably) pass
*** down the road, once we can more easily integrate the ref guide build 
directly into the main build, this assumption can go away -- but in the 
meantime it doesn't have to be a concern/issue/question affecting these doc 
improvements
** of course: the ref-guide build can/should still fail if the "include" tags 
aren't found in the referenced java file (or if the java test file gets 
removed/moved)
*** (I'm assuming asciidoctor gives an error on this? we should double check 
that)
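
To sketch the idea: asciidoctor's include-by-tag mechanism pulls a region 
delimited by tag::/end:: comments out of a source file, via something like 
{{include::...UsingSolrJRefGuideExamplesTest.java[tag=solrj-simple-query]}} on 
the .adoc side. The test class, tag name, and asserts below are hypothetical; 
the SolrJ calls are the standard client API:

{code}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class UsingSolrJRefGuideExamplesTest {
  public void testSimpleQuery() throws Exception {
    // tag::solrj-simple-query[]
    SolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/techproducts").build();
    QueryResponse response = client.query(new SolrQuery("*:*"));
    // end::solrj-simple-query[]
    // Asserts stay outside the tagged region, so the ref guide includes
    // only the lines a reader needs while the test still verifies them,
    // e.g. assertEquals(0, response.getStatus());
    client.close();
  }
}
{code}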


> Update solrj tutorial
> -
>
> Key: SOLR-11032
> URL: https://issues.apache.org/jira/browse/SOLR-11032
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, SolrJ, website
>Reporter: Karl Richter
> Attachments: SOLR-11032.patch, SOLR-11032.patch
>
>
> The [solrj tutorial](https://wiki.apache.org/solr/Solrj) has the following 
> issues:
>   * It refers to 1.4.0 whereas the current release is 6.x, some classes are 
> deprecated or no longer exist.
>   * Document-object-binding is a crucial feature [which should be working in 
> the meantime](https://issues.apache.org/jira/browse/SOLR-1945) and thus 
> should be covered in the tutorial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-SmokeRelease-master - Build # 870 - Still Failing

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/870/

No tests ran.

Build Log:
[...truncated 28006 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist
 [copy] Copying 476 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/lucene
 [copy] Copying 215 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/solr
   [smoker] Java 1.8 JAVA_HOME=/home/jenkins/tools/java/latest1.8
   [smoker] NOTE: output encoding is UTF-8
   [smoker] 
   [smoker] Load release URL 
"file:/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/"...
   [smoker] 
   [smoker] Test Lucene...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.05 sec (4.6 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download lucene-8.0.0-src.tgz...
   [smoker] 29.7 MB in 0.18 sec (161.3 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-8.0.0.tgz...
   [smoker] 69.5 MB in 0.14 sec (496.7 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-8.0.0.zip...
   [smoker] 79.8 MB in 0.07 sec (1091.2 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   unpack lucene-8.0.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6184 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0.zip...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6184 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0-src.tgz...
   [smoker] make sure no JARs/WARs in src dist...
   [smoker] run "ant validate"
   [smoker] run tests w/ Java 8 and testArgs='-Dtests.slow=false'...
   [smoker] test demo with 1.8...
   [smoker]   got 213 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] generate javadocs w/ Java 8...
   [smoker] 
   [smoker] Crawl/parse...
   [smoker] 
   [smoker] Verify...
   [smoker]   confirm all releases have coverage in TestBackwardsCompatibility
   [smoker] find all past Lucene releases...
   [smoker] run TestBackwardsCompatibility..
   [smoker] Releases that don't seem to be tested:
   [smoker]   6.6.2
   [smoker] Traceback (most recent call last):
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 1484, in 
   [smoker] main()
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 1428, in main
   [smoker] smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, 
c.is_signed, ' '.join(c.test_args))
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 1466, in smokeTest
   [smoker] unpackAndVerify(java, 'lucene', tmpDir, 'lucene-%s-src.tgz' % 
version, gitRevision, version, testArgs, baseURL)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 622, in unpackAndVerify
   [smoker] verifyUnpacked(java, project, artifact, unpackPath, 
gitRevision, version, testArgs, tmpDir, baseURL)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 774, in verifyUnpacked
   [smoker] confirmAllReleasesAreTestedForBackCompat(version, unpackPath)
   [smoker]   File 
"/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/dev-tools/scripts/smokeTestRelease.py",
 line 1404, in confirmAllReleasesAreTestedForBackCompat
   [smoker] raise RuntimeError('some releases are not tested by 
TestBackwardsCompatibility?')
   [smoker] RuntimeError: some releases are not tested by 
TestBackwardsCompatibility?

BUILD FAILED
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/build.xml:622:
 exec returned: 1

Total time: 183 minutes 59 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand

2017-10-20 Thread Tomas Fernandez Lobbe
Congratulations Adrien!

> On Oct 19, 2017, at 10:51 AM, Martin Gainty  wrote:
> 
> Félicitations Adrien!
> 
> Martin 
> __ 
> 
> From: ansh...@apple.com  on behalf of Anshum Gupta 
> 
> Sent: Thursday, October 19, 2017 11:52 AM
> To: dev@lucene.apache.org
> Subject: Re: Congratulations to the new Lucene/Solr PMC Chair, Adrien Grand
>  
> Congratulations Adrien!
> 
> -Anshum
> 
> 
> 
>> On Oct 19, 2017, at 12:19 AM, Tommaso Teofili wrote:
>> 
>> Once a year the Lucene PMC rotates the PMC chair and Apache Vice President 
>> position.
>> This year we have nominated and elected Adrien Grand as the chair and today 
>> the board just approved it, so now it's official.
>> 
>> Congratulations Adrien!
>> Regards,
>> Tommaso



[jira] [Commented] (SOLR-11516) Unified highlighter with word separator never gives context to the left

2017-10-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212850#comment-16212850
 ] 

David Smiley commented on SOLR-11516:
-

Hi,
For the UnifiedHighlighter, the interpretation of hl.bs.type is essentially a 
pluggable way to establish a snippet boundary.  The values of WORD and CHARACTER 
are technically supported but probably make no sense. The default is SENTENCE.

Note that the FastVectorHighlighter uses the same parameter name and values but 
with a different semantic meaning -- and in its meaning, WORD is what you'd 
likely want it at, and it's the default for that highlighter.

When you use the UH with the default hl.bs.type, what snippeting challenges do 
you face?

hl.fragsize is supported but its fidelity is to the hl.bs.type unit -- 
generally a sentence boundary.  With the original Highlighter, it was to the 
word edge, which meant it very likely chopped off a sentence, which isn't great.
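
For example, a request that keeps the SENTENCE default but asks for larger 
fragments might look like this (a hypothetical query against the stock 
techproducts example):

http://localhost:8983/solr/techproducts/select?q=apple&hl=on&hl.fl=features&hl.method=unified&hl.bs.type=SENTENCE&hl.fragsize=100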

> Unified highlighter with word separator never gives context to the left
> ---
>
> Key: SOLR-11516
> URL: https://issues.apache.org/jira/browse/SOLR-11516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 6.4, 7.1
>Reporter: Tim Retout
>
> When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
> context to the left of the matches returned; only words to the right of each 
> match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.
> Without context to the left of a match, the highlighted snippets are much 
> less useful for understanding where the match appears in a document.
> As an example, using the techproducts data with Solr 7.1, given a search for 
> "apple", highlighting the "features" field:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified
> I see this snippet:
> "Apple Lossless, H.264 video"
> Note that "Apple" is anchored to the left.  Compare with the original 
> highlighter:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30
> And the match has context either side:
> ", Audible, Apple Lossless, H.264 video"
> (To complicate this, in general I am not sure that the unified highlighter is 
> respecting the hl.fragsize parameter, although [SOLR-9935] suggests support 
> was added.  I included the hl.fragsize param in the unified URL too, but it's 
> making no difference unless set to 0.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8000) Document Length Normalization in BM25Similarity correct?

2017-10-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212813#comment-16212813
 ] 

Robert Muir commented on LUCENE-8000:
-

{quote}
What benchmarks have you used for measuring performance?
{quote}

I use trec-like IR collections in different languages. The Lucene benchmark 
module has some support for running the queries and creating output that you 
can send to trec_eval. I just use its query-running support (QueryDriver), i 
don't use its indexing/parsing support although it has that too. Instead I 
index the test collections myself. That's because the 
collections/queries/judgements are always annoyingly in a slightly different 
non-standard format. I only look at measures which are generally the most 
stable like MAP and bpref. 

{quote}
Is your opinion based on tests with Lucene Classic Similarity (it also uses 
discountOverlaps = true) or also on tests with BM25.
{quote}

I can't remember which scoring systems I tested at the time we flipped the 
default, but I think we should keep the same default for all scoring functions. 
It is fairly easy, once you have everything set up, to test with a ton of 
similarities at once (or different parameters) by modifying the code to loop 
across a big list. That's one reason why it's valuable to try to keep any 
index-time logic consistent across all of them (such as the formula for encoding 
the norm). Otherwise it makes testing unnecessarily difficult. It's already 
painful enough. This is important for real users too; they shouldn't have to 
reindex to do parameter tuning.



> Document Length Normalization in BM25Similarity correct?
> 
>
> Key: LUCENE-8000
> URL: https://issues.apache.org/jira/browse/LUCENE-8000
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Christoph Goller
>Priority: Minor
>
> Length of individual documents only counts the number of positions of a 
> document since discountOverlaps defaults to true.
> {code}
>  @Override
>   public final long computeNorm(FieldInvertState state) {
> final int numTerms = discountOverlaps ? state.getLength() - 
> state.getNumOverlap() : state.getLength();
> int indexCreatedVersionMajor = state.getIndexCreatedVersionMajor();
> if (indexCreatedVersionMajor >= 7) {
>   return SmallFloat.intToByte4(numTerms);
> } else {
>   return SmallFloat.floatToByte315((float) (1 / Math.sqrt(numTerms)));
> }
>   }}
> {code}
> Measuring document length this way seems perfectly ok for me. What bothers 
> me is that
> average document length is based on sumTotalTermFreq for a field. As far as I 
> understand that sums up totalTermFreqs for all terms of a field, therefore 
> counting positions of terms including those that overlap.
> {code}
>  protected float avgFieldLength(CollectionStatistics collectionStats) {
> final long sumTotalTermFreq = collectionStats.sumTotalTermFreq();
> if (sumTotalTermFreq <= 0) {
>   return 1f;   // field does not exist, or stat is unsupported
> } else {
>   final long docCount = collectionStats.docCount() == -1 ? 
> collectionStats.maxDoc() : collectionStats.docCount();
>   return (float) (sumTotalTermFreq / (double) docCount);
> }
>   }
> }
> {code}
> Are we comparing apples and oranges in the final scoring?
> I haven't run any benchmarks and I am not sure whether this has a serious 
> effect. It just means that documents that have synonyms or, in my use case, 
> different normal forms of tokens on the same position are shorter and 
> therefore get higher scores than they should, and that we do not use the 
> whole spectrum of relative document length of BM25.
> I think for BM25, discountOverlaps should default to false. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[VOTE] Release Lucene/Solr 5.5.5 RC2

2017-10-20 Thread Steve Rowe
Please vote for release candidate 2 for Lucene/Solr 5.5.5 

The artifacts can be downloaded from: 
https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.5.5-RC2-revb3441673c21c83762035dc21d3827ad16aa17b68

You can run the smoke tester directly with this command: 

python3 -u dev-tools/scripts/smokeTestRelease.py \ 
https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.5.5-RC2-revb3441673c21c83762035dc21d3827ad16aa17b68

Here's my +1
SUCCESS! [0:53:51.570213]

--
Steve
www.lucidworks.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11032) Update solrj tutorial

2017-10-20 Thread Jason Gerlowski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-11032:
---
Attachment: SOLR-11032.patch

Attaching a ready-for-review patch which adds the buildable examples.

The attached patch moves the Java snippets out from the using-solrj page, to 
build-able examples that live in {{solr/solr-ref-guide/src/java}}.  These 
examples can be built explicitly using the {{ant build-examples}} target in the 
solr-ref-guide directory.  They're also built implicitly as a part of a normal 
{{ant compile}} from the project root.  (And of course they're built whenever 
you run {{ant build-site}} or {{ant build-pdf}} in the solr-ref-guide directory.)

*PROS*
- compilation will now fail if a SolrJ API is changed without updating any 
examples that use it.
- examples are compiled in builds by default.  No additional ant target to 
memorize, etc.

*CONS*
- the examples need Solr/SolrJ classes on their classpath to compile.  And the 
ref-guide build will fail if the examples don't compile.  So the ref-guide 
can't be built in isolation from the rest of Solr anymore.  This is likely 
expected, but I wanted to mention it anyways in case this was a big deal to 
anyone.

This should be ready for review, if anyone has time to take a look or critique 
the approach.

> Update solrj tutorial
> -
>
> Key: SOLR-11032
> URL: https://issues.apache.org/jira/browse/SOLR-11032
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, SolrJ, website
>Reporter: Karl Richter
> Attachments: SOLR-11032.patch, SOLR-11032.patch
>
>
> The [solrj tutorial](https://wiki.apache.org/solr/Solrj) has the following 
> issues:
>   * It refers to 1.4.0 whereas the current release is 6.x, some classes are 
> deprecated or no longer exist.
>   * Document-object-binding is a crucial feature [which should be working in 
> the meantime](https://issues.apache.org/jira/browse/SOLR-1945) and thus 
> should be covered in the tutorial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8000) Document Length Normalization in BM25Similarity correct?

2017-10-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212772#comment-16212772
 ] 

Robert Muir commented on LUCENE-8000:
-

Not sure how intuitive it is; I guess maybe it kinda is if you think on a 
case-by-case basis. Some examples:
* WDF splitting up "wi-fi", if those synonyms count towards doc's length, then 
we punish the doc because the author wrote a hyphen (vs writing "wi fi").
* if you have 1000 synonyms for hamburger and those count towards the length, 
then we punish a doc because the author wrote hamburger (versus writing 
"pizza").

Note that punishing a doc unfairly here punishes it for all queries. If I 
search on "joker", why should one doc get a very low ranking for that term just 
because the doc also happens to mention "hamburger" instead of "pizza"? In this 
case we have skewed length normalization in such a way that it doesn't properly 
reflect verbosity.
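
For anyone who wants to experiment with the trade-off, a minimal sketch of 
flipping the default (the wrapper class is illustrative; setDiscountOverlaps 
is the real BM25Similarity API):

{code}
import org.apache.lucene.search.similarities.BM25Similarity;

public class OverlapNormExperiment {
  public static BM25Similarity countOverlapsInNorm() {
    BM25Similarity bm25 = new BM25Similarity();  // default k1=1.2, b=0.75
    // Count stacked tokens (synonyms, alternate normal forms on the same
    // position) toward the field length norm. Note this changes an
    // index-time computation (computeNorm), so it requires reindexing.
    bm25.setDiscountOverlaps(false);
    return bm25;
  }
}
{code}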

> Document Length Normalization in BM25Similarity correct?
> 
>
> Key: LUCENE-8000
> URL: https://issues.apache.org/jira/browse/LUCENE-8000
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Christoph Goller
>Priority: Minor
>
> Length of individual documents only counts the number of positions of a 
> document since discountOverlaps defaults to true.
> {code}
>  @Override
>   public final long computeNorm(FieldInvertState state) {
> final int numTerms = discountOverlaps ? state.getLength() - 
> state.getNumOverlap() : state.getLength();
> int indexCreatedVersionMajor = state.getIndexCreatedVersionMajor();
> if (indexCreatedVersionMajor >= 7) {
>   return SmallFloat.intToByte4(numTerms);
> } else {
>   return SmallFloat.floatToByte315((float) (1 / Math.sqrt(numTerms)));
> }
>   }}
> {code}
> Measuring document length this way seems perfectly ok for me. What bothers 
> me is that
> average document length is based on sumTotalTermFreq for a field. As far as I 
> understand that sums up totalTermFreqs for all terms of a field, therefore 
> counting positions of terms including those that overlap.
> {code}
>  protected float avgFieldLength(CollectionStatistics collectionStats) {
> final long sumTotalTermFreq = collectionStats.sumTotalTermFreq();
> if (sumTotalTermFreq <= 0) {
>   return 1f;   // field does not exist, or stat is unsupported
> } else {
>   final long docCount = collectionStats.docCount() == -1 ? 
> collectionStats.maxDoc() : collectionStats.docCount();
>   return (float) (sumTotalTermFreq / (double) docCount);
> }
>   }
> }
> {code}
> Are we comparing apples and oranges in the final scoring?
> I haven't run any benchmarks and I am not sure whether this has a serious 
> effect. It just means that documents that have synonyms or, in my use case, 
> different normal forms of tokens on the same position are shorter and 
> therefore get higher scores than they should, and that we do not use the 
> whole spectrum of relative document length of BM25.
> I think for BM25, discountOverlaps should default to false. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-MacOSX (64bit/jdk-9) - Build # 257 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-MacOSX/257/
Java: 64bit/jdk-9 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 
--illegal-access=deny

2 tests failed.
FAILED:  org.apache.solr.cloud.ShardSplitTest.testSplitAfterFailedSplit

Error Message:
expected:<1> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<1> but was:<2>
at 
__randomizedtesting.SeedInfo.seed([3C79BDD2D611A463:C5342E7DEA64E9E9]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.cloud.ShardSplitTest.testSplitAfterFailedSplit(ShardSplitTest.java:279)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[JENKINS] Lucene-Solr-Tests-5.5 - Build # 34 - Still Failing

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.5/34/

2 tests failed.
FAILED:  org.apache.solr.schema.TestManagedSchemaAPI.test

Error Message:
Error from server at http://127.0.0.1:36899/solr/testschemaapi_shard1_replica1: 
ERROR: [doc=2] unknown field 'myNewField1'

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from 
server at http://127.0.0.1:36899/solr/testschemaapi_shard1_replica1: ERROR: 
[doc=2] unknown field 'myNewField1'
at 
__randomizedtesting.SeedInfo.seed([D0551FCB2D1565DF:5801201183E90827]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:653)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1002)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:891)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:827)
at 
org.apache.solr.schema.TestManagedSchemaAPI.testAddFieldAndDocument(TestManagedSchemaAPI.java:101)
at 
org.apache.solr.schema.TestManagedSchemaAPI.test(TestManagedSchemaAPI.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  

[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 4240 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4240/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.core.TestDynamicLoading.testDynamicLoading

Error Message:
Could not get expected value  
'org.apache.solr.core.BlobStoreTestRequestHandler' for path 
'overlay/requestHandler/\/test1/class' full output: {   "responseHeader":{ 
"status":0, "QTime":0},   "overlay":{ "znodeVersion":0, 
"runtimeLib":{"colltest":{ "name":"colltest", "version":1,  
from server:  null

Stack Trace:
java.lang.AssertionError: Could not get expected value  
'org.apache.solr.core.BlobStoreTestRequestHandler' for path 
'overlay/requestHandler/\/test1/class' full output: {
  "responseHeader":{
"status":0,
"QTime":0},
  "overlay":{
"znodeVersion":0,
"runtimeLib":{"colltest":{
"name":"colltest",
"version":1,  from server:  null
at 
__randomizedtesting.SeedInfo.seed([E1A73044852D3954:39EA1D1372F09CF4]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:557)
at 
org.apache.solr.core.TestDynamicLoading.testDynamicLoading(TestDynamicLoading.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-11505) solr.cmd start of solr7.0.1 can't working in win7-64

2017-10-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212642#comment-16212642
 ] 

Uwe Schindler commented on SOLR-11505:
--

I think the whole thing may come from the following fact: if you have Cygwin or 
some other alternative shell installed and its directory is on the PATH, you 
may be picking up the wrong instance of find.exe (e.g., the Cygwin one).

If you send us the contents of your PATH variable, we may be able to confirm 
this. With a plain Windows 7 it should work.
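
For reference, a quick way to check this from a plain cmd.exe prompt (the 
System32 path below is the usual default, stated as an assumption):

{noformat}
echo %PATH%
where find
{noformat}

"where find" lists every find.exe on the PATH in resolution order; the stock 
Windows one is C:\Windows\System32\find.exe, so anything listed before it 
(e.g. a Cygwin bin directory) will shadow it.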

> solr.cmd start of solr7.0.1 can't working in win7-64
> 
>
> Key: SOLR-11505
> URL: https://issues.apache.org/jira/browse/SOLR-11505
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCLI
>Affects Versions: 7.0.1
> Environment: windows 7
>Reporter: cloverliu
>Priority: Trivial
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> http://archive.apache.org/dist/lucene/solr/7.0.1/solr-7.0.1.zip   
>  solr.cmd start from this distribution does not work on my Windows 7 64-bit machine.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11505) solr.cmd start of solr7.0.1 can't working in win7-64

2017-10-20 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212622#comment-16212622
 ] 

Dawid Weiss commented on SOLR-11505:


Can you echo the path variable the way I did? 

> solr.cmd start of solr7.0.1 can't working in win7-64
> 
>
> Key: SOLR-11505
> URL: https://issues.apache.org/jira/browse/SOLR-11505
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCLI
>Affects Versions: 7.0.1
> Environment: windows 7
>Reporter: cloverliu
>Priority: Trivial
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> http://archive.apache.org/dist/lucene/solr/7.0.1/solr-7.0.1.zip   
>  solr.cmd start from this distribution does not work on my Windows 7 64-bit machine.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11505) solr.cmd start of solr7.0.1 can't working in win7-64

2017-10-20 Thread cloverliu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cloverliu updated SOLR-11505:
-
Attachment: screenshot-2.png

> solr.cmd start of solr7.0.1 can't working in win7-64
> 
>
> Key: SOLR-11505
> URL: https://issues.apache.org/jira/browse/SOLR-11505
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCLI
>Affects Versions: 7.0.1
> Environment: windows 7
>Reporter: cloverliu
>Priority: Trivial
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> http://archive.apache.org/dist/lucene/solr/7.0.1/solr-7.0.1.zip   
>  solr.cmd start from this distribution does not work on my Windows 7 64-bit machine.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11505) solr.cmd start of solr7.0.1 can't working in win7-64

2017-10-20 Thread cloverliu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212616#comment-16212616
 ] 

cloverliu commented on SOLR-11505:
--

[#comment-16210675]
I think my find.exe is different from yours.
!screenshot-2.png!

> solr.cmd start of solr7.0.1 can't working in win7-64
> 
>
> Key: SOLR-11505
> URL: https://issues.apache.org/jira/browse/SOLR-11505
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCLI
>Affects Versions: 7.0.1
> Environment: windows 7
>Reporter: cloverliu
>Priority: Trivial
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> http://archive.apache.org/dist/lucene/solr/7.0.1/solr-7.0.1.zip   
>  solr.cmd start from this distribution does not work on my Windows 7 64-bit machine.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Linux (32bit/jdk1.8.0_144) - Build # 636 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/636/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseSerialGC

2 tests failed.
FAILED:  
org.apache.solr.cloud.CollectionsAPIAsyncDistributedZkTest.testAsyncRequests

Error Message:
DeleteReplica did not complete expected same: was not:

Stack Trace:
java.lang.AssertionError: DeleteReplica did not complete expected 
same: was not:
at 
__randomizedtesting.SeedInfo.seed([8DF22A755AF602EB:69B616C2FC5E4C34]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotSame(Assert.java:641)
at org.junit.Assert.assertSame(Assert.java:580)
at 
org.apache.solr.cloud.CollectionsAPIAsyncDistributedZkTest.testAsyncRequests(CollectionsAPIAsyncDistributedZkTest.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  
org.apache.solr.cloud.CollectionsAPIAsyncDistributedZkTest.testSolrJAPICalls

Error Message:
CreateCollection task did not complete! expected 

Re: [VOTE] Release Lucene/Solr 5.5.5 RC1

2017-10-20 Thread Steve Rowe
This vote is cancelled.  I’ll be respinning shortly because of improvements Uwe 
made to the Tika Matlab file parsing problem - see comments on SOLR-8981.

--
Steve
www.lucidworks.com

> On Oct 20, 2017, at 5:09 AM, Dawid Weiss  wrote:
> 
> SUCCESS! [0:55:33.151716]
> 
> +1.
> 
> On Fri, Oct 20, 2017 at 3:50 AM, Steve Rowe  wrote:
>> Please vote for release candidate 1 for Lucene/Solr 5.5.5
>> 
>> The artifacts can be downloaded from:
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.5.5-RC1-revccaa541db9c8a5af9d273ef77b79c4356e6e4c2e
>> 
>> You can run the smoke tester directly with this command:
>> 
>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.5.5-RC1-revccaa541db9c8a5af9d273ef77b79c4356e6e4c2e
>> 
>> Here's my +1
>> SUCCESS! [0:51:56.623919]
>> 
>> --
>> Steve
>> www.lucidworks.com
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8981) Upgrade to Tika 1.13 when it is available

2017-10-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212548#comment-16212548
 ] 

Uwe Schindler commented on SOLR-8981:
-

[~steve_rowe]: I added a patch to the other issue. For safety, it sets the 
static "disable serialization" flag on the JMatIO parser in the plugin's 
init(). I later found, by reviewing [~talli...@mitre.org]'s fork, that the 
default is already "false", but safe is safe.

> Upgrade to Tika 1.13 when it is available
> -
>
> Key: SOLR-8981
> URL: https://issues.apache.org/jira/browse/SOLR-8981
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Reporter: Tim Allison
>Assignee: Uwe Schindler
> Fix For: 5.5.5, 6.2, 7.0
>
>
> Tika 1.13 should be out within a month.  This includes PDFBox 2.0.0 and a 
> number of other upgrades and improvements.  
> If there are any showstoppers in 1.13 from Solr's side or requests before we 
> roll 1.13, let us know.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8981) Upgrade to Tika 1.13 when it is available

2017-10-20 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212542#comment-16212542
 ] 

Steve Rowe commented on SOLR-8981:
--

bq. Hi Steve Rowe, did you try to remove the jmatio.jar file.

Yes, I did remove the jmatio.jar file, and started the 5.5.5 RC1 vote with it 
removed.

bq. How about trying to just update the JAR file by Tim Allison fork?  I added 
an alternative patch to SOLR-11486!

+1, I'll go respin.



> Upgrade to Tika 1.13 when it is available
> -
>
> Key: SOLR-8981
> URL: https://issues.apache.org/jira/browse/SOLR-8981
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Reporter: Tim Allison
>Assignee: Uwe Schindler
> Fix For: 5.5.5, 6.2, 7.0
>
>
> Tika 1.13 should be out within a month.  This includes PDFBox 2.0.0 and a 
> number of other upgrades and improvements.  
> If there are any showstoppers in 1.13 from Solr's side or requests before we 
> roll 1.13, let us know.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-7.x-Windows (32bit/jdk1.8.0_144) - Build # 257 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Windows/257/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseParallelGC

5 tests failed.
FAILED:  org.apache.solr.cloud.autoscaling.TriggerIntegrationTest.testEventQueue

Error Message:
action wasn't interrupted

Stack Trace:
java.lang.AssertionError: action wasn't interrupted
at 
__randomizedtesting.SeedInfo.seed([8D8C5AFB8278C078:443918558B1F068D]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.cloud.autoscaling.TriggerIntegrationTest.testEventQueue(TriggerIntegrationTest.java:684)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.core.TestJmxIntegration.testJmxRegistration

Error Message:
org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed

Stack Trace:
javax.management.RuntimeMBeanException: 
org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed

[JENKINS] Lucene-Solr-Tests-7.x - Build # 188 - Still Unstable

2017-10-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-7.x/188/

7 tests failed.
FAILED:  org.apache.solr.cloud.HttpPartitionTest.test

Error Message:
Could not load collection from ZK: c8n_1x2

Stack Trace:
org.apache.solr.common.SolrException: Could not load collection from ZK: c8n_1x2
at 
__randomizedtesting.SeedInfo.seed([5D3C760E1FD07E28:D56849D4B12C13D0]:0)
at 
org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1172)
at 
org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:692)
at 
org.apache.solr.common.cloud.ClusterState.getCollectionOrNull(ClusterState.java:130)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.ensureAllReplicasAreActive(AbstractFullDistribZkTestBase.java:1963)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:370)
at 
org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:132)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 

[jira] [Commented] (SOLR-8981) Upgrade to Tika 1.13 when it is available

2017-10-20 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212527#comment-16212527
 ] 

Tim Allison commented on SOLR-8981:
---

+1  Thank you, [~thetaphi]!

> Upgrade to Tika 1.13 when it is available
> -
>
> Key: SOLR-8981
> URL: https://issues.apache.org/jira/browse/SOLR-8981
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Reporter: Tim Allison
>Assignee: Uwe Schindler
> Fix For: 5.5.5, 6.2, 7.0
>
>
> Tika 1.13 should be out within a month.  This includes PDFBox 2.0.0 and a 
> number of other upgrades and improvements.  
> If there are any showstoppers in 1.13 from Solr's side or requests before we 
> roll 1.13, let us know.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11516) Unified highlighter with word separator never gives context to the left

2017-10-20 Thread Tim Retout (JIRA)
Tim Retout created SOLR-11516:
-

 Summary: Unified highlighter with word separator never gives 
context to the left
 Key: SOLR-11516
 URL: https://issues.apache.org/jira/browse/SOLR-11516
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: highlighter
Affects Versions: 7.1, 6.4
Reporter: Tim Retout


When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
context to the left of the matches returned; only words to the right of each 
match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.

Without context to the left of a match, the highlighted snippets are much less 
useful for understanding where the match appears in a document.

As an example, using the techproducts data with Solr 7.1, given a search for 
"apple", highlighting the "features" field:

http://localhost:8983/solr/techproducts/select?hl.fl=features=on=apple=WORD=30=unified

I see this snippet:

"Apple Lossless, H.264 video"

Note that "Apple" is anchored to the left.  Compare with the original 
highlighter:

http://localhost:8983/solr/techproducts/select?hl.fl=features=on=apple=30

And the match has context either side:

", Audible, Apple Lossless, H.264 video"

(To complicate this, in general I am not sure that the unified highlighter is 
respecting the hl.fragsize parameter, although [SOLR-9935] suggests support was 
added.  I included the hl.fragsize param in the unified URL too, but it's 
making no difference unless set to 0.)
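
For completeness, the same unified-highlighter request expressed through SolrJ 
(a sketch; it assumes the local techproducts example from the URLs above):

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class UnifiedHighlightRepro {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/techproducts").build();
    SolrQuery q = new SolrQuery("apple");
    q.setHighlight(true);
    q.set("hl.fl", "features");
    q.set("hl.method", "unified");
    q.set("hl.bs.type", "WORD");
    q.set("hl.fragsize", "30");
    // Snippets come back anchored at the match, with no context to the left.
    System.out.println(client.query(q).getHighlighting());
    client.close();
  }
}
{code}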



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-8001) UnescapedCharSequence Bugs

2017-10-20 Thread Shad Storhaug (JIRA)
Shad Storhaug created LUCENE-8001:
-

 Summary: UnescapedCharSequence Bugs
 Key: LUCENE-8001
 URL: https://issues.apache.org/jira/browse/LUCENE-8001
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 7.1
Reporter: Shad Storhaug
Priority: Minor


There are a couple of issues with UnescapedCharSequence:

1. The [private 
constructor|https://github.com/apache/lucene-solr/blob/32ed8520c706a865b94e03644ae4e4435e0f7d35/lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/core/util/UnescapedCharSequence.java#L52-L63]
 is not used anywhere (and if it were, it would throw exceptions)
2. The ToEscapedString() overload has an [invalid 
condition|https://github.com/apache/lucene-solr/blob/32ed8520c706a865b94e03644ae4e4435e0f7d35/lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/core/util/UnescapedCharSequence.java#L96]
 that will only evaluate to true if the string has a length of 0.

There are no tests for UnescapedCharSequence so these issues have gone 
unnoticed for quite some time.
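
A first test could be as simple as the following sketch (it assumes the public 
UnescapedCharSequence(CharSequence) constructor and the no-arg 
toStringEscaped() shown in the linked source; the expected output is my 
reading of that code, to be verified):

{code:java}
import org.apache.lucene.queryparser.flexible.core.util.UnescapedCharSequence;
import org.apache.lucene.util.LuceneTestCase;

public class TestUnescapedCharSequence extends LuceneTestCase {
  public void testToStringEscaped() {
    // No characters are flagged as "was escaped" here, so escaping the
    // sequence should only re-escape the literal backslash: a\b -> a\\b
    UnescapedCharSequence seq = new UnescapedCharSequence("a\\b");
    assertEquals("a\\\\b", seq.toStringEscaped());
  }
}
{code}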



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11452) TestTlogReplica.testOnlyLeaderIndexes() failure

2017-10-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212494#comment-16212494
 ] 

ASF subversion and git services commented on SOLR-11452:


Commit 32ed8520c706a865b94e03644ae4e4435e0f7d35 in lucene-solr's branch 
refs/heads/master from [~caomanhdat]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=32ed852 ]

SOLR-11452: Delay between open new searcher and copy over updates can cause the 
test to fail.


> TestTlogReplica.testOnlyLeaderIndexes() failure
> ---
>
> Key: SOLR-11452
> URL: https://issues.apache.org/jira/browse/SOLR-11452
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Cao Manh Dat
>
> Reproduces for me, from 
> [https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1398]:
> {noformat}
> Checking out Revision f0a4b2dafe13e2b372e33ce13d552f169187a44e 
> (refs/remotes/origin/master)
> [...]
>[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestTlogReplica 
> -Dtests.method=testOnlyLeaderIndexes -Dtests.seed=CCAC87827208491B 
> -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
>  -Dtests.locale=el -Dtests.timezone=Australia/LHI -Dtests.asserts=true 
> -Dtests.file.encoding=ISO-8859-1
>[junit4] FAILURE 29.5s J2 | TestTlogReplica.testOnlyLeaderIndexes <<<
>[junit4]> Throwable #1: java.lang.AssertionError: expected:<2> but 
> was:<5>
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([CCAC87827208491B:D0ADFA0F07AD3788]:0)
>[junit4]>  at 
> org.apache.solr.cloud.TestTlogReplica.assertCopyOverOldUpdates(TestTlogReplica.java:909)
>[junit4]>  at 
> org.apache.solr.cloud.TestTlogReplica.testOnlyLeaderIndexes(TestTlogReplica.java:501)
>[junit4]>  at java.lang.Thread.run(Thread.java:748)
> [...]
>[junit4]   2> NOTE: test params are: codec=CheapBastard, 
> sim=RandomSimilarity(queryNorm=false): {}, locale=el, timezone=Australia/LHI
>[junit4]   2> NOTE: Linux 3.13.0-88-generic amd64/Oracle Corporation 
> 1.8.0_144 (64-bit)/cpus=4,threads=1,free=137513712,total=520093696
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_144) - Build # 20700 - Failure!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20700/
Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

5 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest

Error Message:
64 threads leaked from SUITE scope at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest: 1) 
Thread[id=8347, 
name=org.eclipse.jetty.server.session.HashSessionManager@4cb1412Timer, 
state=TIMED_WAITING, group=TGRP-LeaderInitiatedRecoveryOnCommitTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
 at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)2) Thread[id=8454, 
name=zkCallback-967-thread-3, state=TIMED_WAITING, 
group=TGRP-LeaderInitiatedRecoveryOnCommitTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
 at 
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
 at 
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) 
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)3) Thread[id=8341, 
name=qtp1683637494-8341, state=TIMED_WAITING, 
group=TGRP-LeaderInitiatedRecoveryOnCommitTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392) 
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626) 
at java.lang.Thread.run(Thread.java:748)4) Thread[id=8356, 
name=TEST-LeaderInitiatedRecoveryOnCommitTest.test-seed#[9CBBC89541D9EF55]-SendThread(127.0.0.1:42075),
 state=TIMED_WAITING, group=TGRP-LeaderInitiatedRecoveryOnCommitTest] 
at java.lang.Thread.sleep(Native Method) at 
org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:101)
 at 
org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:997)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
5) Thread[id=8389, 
name=TEST-LeaderInitiatedRecoveryOnCommitTest.test-seed#[9CBBC89541D9EF55]-EventThread,
 state=WAITING, group=TGRP-LeaderInitiatedRecoveryOnCommitTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
6) Thread[id=8378, name=qtp1964563531-8378, state=TIMED_WAITING, 
group=TGRP-LeaderInitiatedRecoveryOnCommitTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392) 
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626) 
at java.lang.Thread.run(Thread.java:748)7) Thread[id=8412, 

[JENKINS] Lucene-Solr-5.5-Windows (64bit/jdk1.7.0_80) - Build # 147 - Still Unstable!

2017-10-20 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-5.5-Windows/147/
Java: 64bit/jdk1.7.0_80 -XX:+UseCompressedOops -XX:+UseSerialGC

2 tests failed.
FAILED:  org.apache.solr.cloud.PeerSyncReplicationTest.test

Error Message:
expected:<152> but was:<146>

Stack Trace:
java.lang.AssertionError: expected:<152> but was:<146>
at 
__randomizedtesting.SeedInfo.seed([FE4650164CE8996B:76126FCCE214F493]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.cloud.PeerSyncReplicationTest.bringUpDeadNodeAndEnsureNoReplication(PeerSyncReplicationTest.java:278)
at 
org.apache.solr.cloud.PeerSyncReplicationTest.forceNodeFailureAndDoPeerSync(PeerSyncReplicationTest.java:242)
at 
org.apache.solr.cloud.PeerSyncReplicationTest.test(PeerSyncReplicationTest.java:125)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:996)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:971)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 

[jira] [Updated] (SOLR-11279) It is necessary to specify how to generate a password when used Basic Authentication!

2017-10-20 Thread chenmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenmin updated SOLR-11279:
---
Fix Version/s: (was: master (8.0))

> It is necessary to specify how to generate a password when used  Basic 
> Authentication!
> --
>
> Key: SOLR-11279
> URL: https://issues.apache.org/jira/browse/SOLR-11279
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.1
>Reporter: chenmin
>
>  Following the documentation: "Usernames and passwords (as a 
> sha256(password+salt) hash) could be added when the file is created."
>  Actually, I do not know how to generate a password.
>  I found this code:
>   public static String getSaltedHashedValue(String pwd) {
>  final Random r = new SecureRandom();
>  byte[] salt = new byte[32];
>  r.nextBytes(salt);
>  String saltBase64 = Base64.encodeBase64String(salt);
>  String val = sha256(pwd, saltBase64) + " " + saltBase64;
>  return val;
>   }
>  I think we should give an example of how to generate a password in the ref guide!
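
For what it's worth, a self-contained sketch of my reading of that code (the 
double-SHA-256 scheme is an assumption to verify against 
Sha256AuthenticationProvider, not a spec): the stored credential appears to be 
base64(sha256(sha256(salt + password))) followed by a space and base64(salt).

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class SolrPasswordHashSketch {
  public static void main(String[] args) throws Exception {
    String password = "SolrRocks"; // hypothetical example password
    byte[] salt = new byte[32];
    new SecureRandom().nextBytes(salt);

    // sha256(pwd, saltBase64) above hashes salt + password, then hashes
    // the result a second time before base64-encoding it.
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    digest.update(salt);
    byte[] hash = digest.digest(password.getBytes(StandardCharsets.UTF_8));
    digest.reset();
    hash = digest.digest(hash);

    System.out.println(Base64.getEncoder().encodeToString(hash)
        + " " + Base64.getEncoder().encodeToString(salt));
  }
}
{code}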



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11279) It is necessary to specify how to generate a password when used Basic Authentication!

2017-10-20 Thread chenmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenmin updated SOLR-11279:
---
Fix Version/s: master (8.0)

> It is necessary to specify how to generate a password when used  Basic 
> Authentication!
> --
>
> Key: SOLR-11279
> URL: https://issues.apache.org/jira/browse/SOLR-11279
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.1
>Reporter: chenmin
> Fix For: master (8.0)
>
>
>  Following the documentation: "Usernames and passwords (as a 
> sha256(password+salt) hash) could be added when the file is created."
>  Actually, I do not know how to generate a password.
>  I found this code:
>   public static String getSaltedHashedValue(String pwd) {
>  final Random r = new SecureRandom();
>  byte[] salt = new byte[32];
>  r.nextBytes(salt);
>  String saltBase64 = Base64.encodeBase64String(salt);
>  String val = sha256(pwd, saltBase64) + " " + saltBase64;
>  return val;
>   }
>  I think we should give an example of how to generate a password in the ref guide!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11319) Update guide on client API using Python

2017-10-20 Thread Mario Corchero (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212447#comment-16212447
 ] 

Mario Corchero commented on SOLR-11319:
---

Thanks [~ctargett]

> Update guide on client API using Python
> ---
>
> Key: SOLR-11319
> URL: https://issues.apache.org/jira/browse/SOLR-11319
> Project: Solr
>  Issue Type: Task
>  Components: clients - python, documentation
>Reporter: Mario Corchero
>Assignee: Cassandra Targett
>Priority: Trivial
> Fix For: 7.1
>
>
> The guide on [the client API using 
> Python|https://lucene.apache.org/solr/guide/6_6/using-python.html] points 
> users to simplejson, claiming Python has no JSON support in the standard 
> library.
> That changed in 2008, and all supported versions include a json package.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


