[jira] [Commented] (LUCENE-5422) Postings lists deduplication
[ https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940214#comment-13940214 ] Vishmi Money commented on LUCENE-5422: -- [~dmitry_key], yes, I agree with you. Better scoping is needed, and expert ideas are needed for that. As [~mikemccand] said, a clearer use case may solve the problem. If we can come up with a clear use case that decides when deduplication should really happen, it will go a long way toward achieving this objective, so ideas are welcome. Since I haven't yet reported on the progress of my work: I am currently analyzing Lucene tokenizing and indexing, as that is the main area I will be working with, more so than searching. To maintain performance, I am also learning how to improve search queries in Lucene, in line with our objective here. For understanding and debugging purposes, I use Luke, an index browser tool for Lucene. Please let me know if there are other areas I should look into. I would also like to remind you to review my proposal; feedback from you would be a great support. Postings lists deduplication Key: LUCENE-5422 URL: https://issues.apache.org/jira/browse/LUCENE-5422 Project: Lucene - Core Issue Type: Improvement Components: core/codecs, core/index Reporter: Dmitry Kan Labels: gsoc2014 The context: http://markmail.org/thread/tywtrjjcfdbzww6f Robert Muir and I discussed what Robert eventually named postings lists deduplication at the Berlin Buzzwords 2013 conference. The idea is to allow multiple terms to point to the same postings list to save space. This can be achieved by a new index codec implementation, but this jira is open to other ideas as well. The application / impact of this is positive for synonyms, exact / inexact terms, leading wildcard support via storing reversed terms, etc.
For example, at the moment, when supporting exact (unstemmed) and inexact (stemmed) searches, we store both the unstemmed and the stemmed variant of a word form, and that leads to index bloat. For the same index-size reasons, we had to remove leading wildcard support via reversing a token at index and query time. Comment from Mike McCandless: Neat idea! Would this idea allow a single term to point to (the union of) N other postings lists? It seems like that's necessary e.g. to handle the exact/inexact case. And then, to produce the Docs/AndPositionsEnum you'd need to do the merge sort across those N postings lists? Such a thing might also be doable as a runtime-only wrapper around the postings API (FieldsProducer), if you could at runtime do the reverse expansion (e.g. stem -> all of its surface forms). Comment from Robert Muir: I think the exact/inexact case is trickier (detecting it would be the hard part), and you are right, another solution might work better. But for the reverse wildcard and synonyms situations, it seems we could even detect it on write if we created some hash of the previous term's postings. If the hash matches for the current term, we know it might be a duplicate and would have to actually do the costly check that they are the same. Maybe there are better ways to do it, but it might be a fun postings-format experiment to try. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
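Robert's write-time detection idea can be sketched with plain collections (all names here are illustrative, not the Lucene codec API; a real implementation would live in a PostingsFormat): hash each term's postings list as it is written, and on a hash match do the full, costly equality check before letting the new term share the existing list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of write-time postings deduplication.
public class PostingsDedup {
    // term -> postings list (doc ids); deduplicated terms share one list instance
    private final Map<String, List<Integer>> termToPostings = new HashMap<>();
    // hash of a postings list -> the first term written with that hash
    private final Map<Integer, String> hashToTerm = new HashMap<>();

    /** Returns true if the term was pointed at an existing identical list. */
    public boolean add(String term, List<Integer> postings) {
        int h = postings.hashCode();                 // cheap hash over the whole list
        String candidate = hashToTerm.get(h);
        if (candidate != null && termToPostings.get(candidate).equals(postings)) {
            // hash matched AND the costly element-by-element check passed:
            // share the existing list instead of storing a copy
            termToPostings.put(term, termToPostings.get(candidate));
            return true;
        }
        // first term seen with this hash keeps the slot (collisions just miss dedup)
        hashToTerm.putIfAbsent(h, term);
        termToPostings.put(term, new ArrayList<>(postings));
        return false;
    }

    public List<Integer> postings(String term) {
        return termToPostings.get(term);
    }
}
```

Note the trade-off Robert points out: the hash only flags *possible* duplicates, so the expensive comparison runs only on hash matches; a collision between genuinely different lists merely loses a dedup opportunity, never correctness.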
[jira] [Updated] (SOLR-5881) Upgrade Zookeeper to 3.4.6
[ https://issues.apache.org/jira/browse/SOLR-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-5881: --- Attachment: SOLR-5881-testlog.txt SOLR-5881.patch Patch against trunk, implementing the upgrade. Precommit passed, but I did have some test failures that might be related to the upgrade; a search of recent Jenkins emails does not turn up a match. I'm also attaching the full 'ant clean test' output from the solr directory. Upgrade Zookeeper to 3.4.6 -- Key: SOLR-5881 URL: https://issues.apache.org/jira/browse/SOLR-5881 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shawn Heisey Assignee: Shawn Heisey Fix For: 4.8, 5.0 Attachments: SOLR-5881-testlog.txt, SOLR-5881.patch A mailing list user has run into ZOOKEEPER-1513. The release notes for 3.4.6 list a lot of fixes since 3.4.5.
[jira] [Commented] (SOLR-5763) Upgrade to Tika 1.5
[ https://issues.apache.org/jira/browse/SOLR-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940242#comment-13940242 ] Shawn Heisey commented on SOLR-5763: I noticed while working on SOLR-5881 that the 5.0.0 and 4.8.0 section headers in CHANGES.txt still say Tika 1.4. Upgrade to Tika 1.5 --- Key: SOLR-5763 URL: https://issues.apache.org/jira/browse/SOLR-5763 Project: Solr Issue Type: Task Components: contrib - Solr Cell (Tika extraction) Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Fix For: 4.8 Attachments: SOLR-5763.patch, SOLR-5763.patch, SOLR-5763.patch Just released: http://www.apache.org/dist/tika/CHANGES-1.5.txt
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b10) - Build # 9736 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9736/ Java: 64bit/jdk1.7.0_60-ea-b10 -XX:-UseCompressedOops -XX:+UseSerialGC 4 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestRequestStatusCollectionAPI Error Message: ERROR: SolrZkClient opens=4 closes=3 Stack Trace: java.lang.AssertionError: ERROR: SolrZkClient opens=4 closes=3 at __randomizedtesting.SeedInfo.seed([BCC754BFE9236E64]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingZkClients(SolrTestCaseJ4.java:420) at org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:177) at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:789) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) 
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at java.lang.Thread.run(Thread.java:744) FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestRequestStatusCollectionAPI Error Message: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestRequestStatusCollectionAPI: 1) Thread[id=7343, name=TEST-TestRequestStatusCollectionAPI.testDistribSearch-seed#[BCC754BFE9236E64]-SendThread(localhost.localdomain:34283), state=TIMED_WAITING, group=TGRP-TestRequestStatusCollectionAPI] at java.lang.Thread.sleep(Native Method) at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86) at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993) 2) Thread[id=7344, name=TEST-TestRequestStatusCollectionAPI.testDistribSearch-seed#[BCC754BFE9236E64]-EventThread, state=WAITING, group=TGRP-TestRequestStatusCollectionAPI] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestRequestStatusCollectionAPI: 1) Thread[id=7343, 
name=TEST-TestRequestStatusCollectionAPI.testDistribSearch-seed#[BCC754BFE9236E64]-SendThread(localhost.localdomain:34283), state=TIMED_WAITING, group=TGRP-TestRequestStatusCollectionAPI] at java.lang.Thread.sleep(Native Method) at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86) at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993) 2) Thread[id=7344, name=TEST-TestRequestStatusCollectionAPI.testDistribSearch-seed#[BCC754BFE9236E64]-EventThread, state=WAITING, group=TGRP-TestRequestStatusCollectionAPI] at sun.misc.Unsafe.park(Native Method) at
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1388 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1388/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseSerialGC -XX:-UseSuperWord All tests passed Build Log: [...truncated 25955 lines...] [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/contrib/solr-map-reduce/test/temp/junit4-J0-20140319_055915_763.syserr [junit4] JVM J0: stderr (verbatim) [junit4] 2014-03-19 05:59:26.064 java[658:5d0b] Unable to load realm info from SCDynamicStore [junit4] JVM J0: EOF [...truncated 14004 lines...] -check-forbidden-tests: [forbidden-apis] Reading API signatures: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/forbiddenApis/tests.txt [forbidden-apis] Loading classes to check... [forbidden-apis] Scanning for API signatures and dependencies... [forbidden-apis] Forbidden method invocation: java.util.Random#<init>() [Use RandomizedRunner's random instead] [forbidden-apis] in org.apache.solr.cloud.TestMiniSolrCloudCluster (TestMiniSolrCloudCluster.java:67) [forbidden-apis] Scanned 632 (and 893 related) class file(s) for forbidden API invocations (in 0.62s), 1 error(s). BUILD FAILED /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:467: The following error occurred while executing this line: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:70: The following error occurred while executing this line: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build.xml:271: The following error occurred while executing this line: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/common-build.xml:2231: Check for forbidden API calls failed, see log.
Total time: 103 minutes 41 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseSerialGC -XX:-UseSuperWord Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0) - Build # 9843 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9843/ Java: 32bit/jdk1.8.0 -server -XX:+UseConcMarkSweepGC 4 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestModifyConfFiles Error Message: ERROR: SolrZkClient opens=4 closes=3 Stack Trace: java.lang.AssertionError: ERROR: SolrZkClient opens=4 closes=3 at __randomizedtesting.SeedInfo.seed([229540E9FC128DC3]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingZkClients(SolrTestCaseJ4.java:435) at org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:180) at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:789) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at java.lang.Thread.run(Thread.java:744) FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestModifyConfFiles Error Message: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestModifyConfFiles: 1) Thread[id=7319, name=TEST-TestModifyConfFiles.testDistribSearch-seed#[229540E9FC128DC3]-EventThread, state=WAITING, group=TGRP-TestModifyConfFiles] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491) 2) Thread[id=7318, name=TEST-TestModifyConfFiles.testDistribSearch-seed#[229540E9FC128DC3]-SendThread(localhost.localdomain:59210), state=TIMED_WAITING, group=TGRP-TestModifyConfFiles] at java.lang.Thread.sleep(Native Method) at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86) at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestModifyConfFiles: 1) Thread[id=7319, name=TEST-TestModifyConfFiles.testDistribSearch-seed#[229540E9FC128DC3]-EventThread, state=WAITING, group=TGRP-TestModifyConfFiles] at 
sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491) 2) Thread[id=7318, name=TEST-TestModifyConfFiles.testDistribSearch-seed#[229540E9FC128DC3]-SendThread(localhost.localdomain:59210), state=TIMED_WAITING, group=TGRP-TestModifyConfFiles] at java.lang.Thread.sleep(Native Method) at
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_51) - Build # 9737 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9737/ Java: 32bit/jdk1.7.0_51 -client -XX:+UseSerialGC 1 tests failed. FAILED: org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin {#3 seed=[7B07A9F0E9CF510E:3B9F34E4FAC7D6A]} Error Message: Shouldn't match I#0:Rect(minX=180.0,maxX=189.0,minY=-10.0,maxY=-4.0) Q:Pt(x=145.0,y=0.0) Stack Trace: java.lang.AssertionError: Shouldn't match I#0:Rect(minX=180.0,maxX=189.0,minY=-10.0,maxY=-4.0) Q:Pt(x=145.0,y=0.0) at __randomizedtesting.SeedInfo.seed([7B07A9F0E9CF510E:3B9F34E4FAC7D6A]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:355) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:335) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin(SpatialOpRecursivePrefixTreeTest.java:119) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at java.lang.Thread.run(Thread.java:744) Build Log: [...truncated 9243 lines...] [junit4] Suite: org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest [junit4] 1 Strategy:
[JENKINS] Lucene-trunk-Linux-java7-64-analyzers - Build # 62 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-java7-64-analyzers/62/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([8B587EAF1B1924E2:E103C1BE42570411]:0) at java.util.Arrays.copyOfRange(Arrays.java:2694) at java.lang.String.<init>(String.java:203) at org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:267) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:923) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) Build Log: [...truncated 679 lines...] 
[junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains [junit4] 2 TEST FAIL: useCharFilter=true text='udjvij jvmnqembqwru lezr ezgrcxud ;;\n1610431719 \ud83c\ude59\ud83c\udef1 \udb40\udc3b\udb40\udc58\udb40\udc04\udb40\udc5a\udb40\udc65 \ua964\ua962\ua97b\ua96b\ua962 [)tl{0,5} jmdgw hmwuwfiyr cgpkxvyosff dkjbbcw \u0620\udb17\udd39\u024a\ud0a4 dcs' [junit4] 2 Exception from random analyzer: [junit4] 2 charfilters= [junit4] 2 tokenizer= [junit4] 2 org.apache.lucene.analysis.ngram.NGramTokenizer(LUCENE_50, 20, 68) [junit4] 2 filters= [junit4] 2 org.apache.lucene.analysis.tr.TurkishLowerCaseFilter(ValidatingTokenFilter@5bf2a40a term=,bytes=[],positionIncrement=1,positionLength=1,startOffset=0,endOffset=0) [junit4] 2 org.apache.lucene.analysis.pt.PortugueseStemFilter(ValidatingTokenFilter@550ae9ec term=,bytes=[],positionIncrement=1,positionLength=1,startOffset=0,endOffset=0,keyword=false) [junit4] 2 org.apache.lucene.analysis.shingle.ShingleFilter(ValidatingTokenFilter@5ebd61d1 term=,bytes=[],positionIncrement=1,positionLength=1,startOffset=0,endOffset=0,keyword=false,type=word, 84, 100) [junit4] 2 org.apache.lucene.analysis.fa.PersianNormalizationFilter(ValidatingTokenFilter@799eac3a
Re: JDK 8 : Third Release Candidate - Build 132 is available on java.net
Hi Uwe, Thanks for downloading the GA version of JDK 8 and JDK 7u60 b10! JDK 8u20 b05 is NOW available on https://jdk8.java.net/download.html Rgds, Rory On 19/03/2014 00:17, Uwe Schindler wrote: Hi Rory, I installed the GA version of JDK 1.8.0 a minute ago. All working fine; it's reporting b132 as the build number. I think the tar.gz also contains additional stuff like Mission Control, not available in the preview builds (because it is about 40 MB larger)? I also installed 7u60-b10. If I have time in the next days, I will test it; we look forward to testing 8u20 builds in the future. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de *From:* Rory O'Donnell Oracle, Dublin Ireland [mailto:rory.odonn...@oracle.com] *Sent:* Friday, March 07, 2014 9:32 AM *To:* Uwe Schindler; dev@lucene.apache.org; 'Uwe Schindler'; 'Dawid Weiss' *Cc:* 'Dalibor Topic'; 'Cecilia Borg'; 'Balchandra Vaidya' *Subject:* Re: JDK 8 : Third Release Candidate - Build 132 is available on java.net Thanks Uwe! On 06/03/2014 23:59, Uwe Schindler wrote: Hi Rory, hi Lucene committers, Thanks for the info! I updated our Jenkins build server to use JDK 8 b132 and JDK 7u60 b07. In addition, the MacOSX virtual machine now also runs JDK 8 b132 builds (after I sorted out how to **not** make JDK 8 the default Java on OSX). Next to operating system upgrades, I also updated to the latest versions of IBM J9 6.0 and 7.1 (releases of January 29th).
Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de *From:* Rory O'Donnell Oracle, Dublin Ireland [mailto:rory.odonn...@oracle.com] *Sent:* Thursday, March 06, 2014 6:48 PM *To:* Uwe Schindler; Dawid Weiss *Cc:* dev@lucene.apache.org; Dalibor Topic; Cecilia Borg; Balchandra Vaidya *Subject:* JDK 8 : Third Release Candidate - Build 132 is available on java.net Hi Uwe, Dawid, JDK 8 Third Release Candidate, Build 132, is now available for download and test: http://jdk8.java.net/download.html Please log all show-stopper issues as soon as possible. Thanks for your support, Rory -- Rgds, Rory O'Donnell Quality Engineering Manager Oracle EMEA, Dublin, Ireland
[jira] [Updated] (LUCENE-5489) Add query rescoring API
[ https://issues.apache.org/jira/browse/LUCENE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5489: --- Attachment: LUCENE-5489.patch OK, new patch, folding in Simon's and Rob's feedback (thanks!) and adding more tests. I made an entirely abstract class Rescorer, and then a QueryRescorer subclass that uses a Query to get the 2nd-pass scores. In the future we can have other sources of scores, e.g. an ExpressionRescorer. Add query rescoring API --- Key: LUCENE-5489 URL: https://issues.apache.org/jira/browse/LUCENE-5489 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5489.patch, LUCENE-5489.patch, LUCENE-5489.patch When costly scoring factors are used during searching, a common approach is to do a cheaper / basic query first, collect the top few hundred hits, and then rescore those hits using the more costly query. It's not clear/simple to do this with Lucene today; I think we should make it easier.
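The two-pass shape the issue describes can be illustrated without any Lucene types (the method and parameter names below are made up for illustration, not the patch's Rescorer/QueryRescorer API): rank everything with the cheap score, keep only the top N, then re-sort just those N hits with the costly scoring function.

```java
import java.util.Arrays;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of two-pass rescoring: cheap pass over all docs,
// costly pass over only the top n survivors.
public class RescoreSketch {
    /** Returns doc ids of the top n hits, ordered by the costly score. */
    public static int[] rescoreTopN(double[] cheapScores, int n,
                                    IntToDoubleFunction costlyScore) {
        // first pass: rank all docs by the cheap score, descending
        Integer[] docs = new Integer[cheapScores.length];
        for (int i = 0; i < docs.length; i++) docs[i] = i;
        Arrays.sort(docs, (a, b) -> Double.compare(cheapScores[b], cheapScores[a]));

        // second pass: the expensive function runs on at most n docs
        Integer[] top = Arrays.copyOf(docs, Math.min(n, docs.length));
        Arrays.sort(top, (a, b) -> Double.compare(
                costlyScore.applyAsDouble(b), costlyScore.applyAsDouble(a)));

        int[] out = new int[top.length];
        for (int i = 0; i < top.length; i++) out[i] = top[i];
        return out;
    }
}
```

The point of the design is the cost asymmetry: if the costly function is 100x the cheap one, rescoring a few hundred hits out of millions keeps total work close to the cheap pass alone.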
Re: JDK 8 : Third Release Candidate - Build 132 is available on java.net
On 19.03.2014 01:17, Uwe Schindler wrote: I think the tar.gz also contains additional stuff like Mission Control That's correct. cheers, dalibor topic -- http://www.oracle.com Dalibor Topic | Principal Product Manager Phone: +494089091214 | Mobile: +491737185961 ORACLE Deutschland B.V. & Co. KG | Kühnehöfe 5 | 22761 Hamburg ORACLE Deutschland B.V. & Co. KG Hauptverwaltung: Riesstr. 25, D-80992 München Registergericht: Amtsgericht München, HRA 95603 Geschäftsführer: Jürgen Kunz Komplementärin: ORACLE Deutschland Verwaltung B.V. Hertogswetering 163/167, 3543 AS Utrecht, Niederlande Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697 Geschäftsführer: Alexander van der Ven, Astrid Kepper, Val Maher http://www.oracle.com/commitment Oracle is committed to developing practices and products that help protect the environment
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.8.0) - Build # 3874 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3874/ Java: 32bit/jdk1.8.0 -client -XX:+UseParallelGC 4 tests failed. REGRESSION: org.apache.solr.cloud.TestShortCircuitedRequests.testDistribSearch Error Message: Stack Trace: org.apache.solr.common.cloud.ZooKeeperException: at __randomizedtesting.SeedInfo.seed([7873BD3329B0256B:F995332B5EEF4557]:0) at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:249) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.initCloud(AbstractFullDistribZkTestBase.java:256) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.createServers(AbstractFullDistribZkTestBase.java:312) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:868) at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /live_nodes at
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940342#comment-13940342 ] Da Huang commented on LUCENE-4396: -- Hi, Mike. I have just finished revising my proposal. I'm not so sure about the description on this page: "unless the MUST clauses have very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2". In my opinion, even when the MUST clauses have a very low hit count compared to the other clauses, BooleanScorer is likely to perform better than BooleanScorer2, because the calls to .advance() when dealing with SHOULD clauses can skip as many documents as BooleanScorer2 does. The relevant ideas are described in the section "Improve the Rule for Choosing Scorer". As this is not entirely consistent with the description on this page, I'm not sure whether my idea makes sense. BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT clauses. If there are one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have a very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST, so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as a proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, e.g. if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near-term time to work on this, so feel free to take it if you are inspired!
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
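Da Huang's point about .advance() can be illustrated outside Lucene. The sketch below is not Lucene's actual Scorer API (the DocIdIterator class and matchCounts method are made-up names): the MUST clause drives iteration, and each SHOULD iterator is advance()d directly to the MUST doc instead of being scanned linearly, which is how the skipping BooleanScorer2 gets from ConjunctionScorer could also be had here.

```java
import java.util.Arrays;

// Minimal sketch of MUST-driven iteration: SHOULD clauses are advance()d to
// the MUST clause's current doc rather than walked one doc at a time.
// All names here are illustrative, not Lucene's real scorer classes.
public class AdvanceSketch {
    static class DocIdIterator {
        private final int[] docs; // sorted postings
        private int pos = 0;
        DocIdIterator(int... docs) { this.docs = docs; }
        // Skip to the first doc >= target; Integer.MAX_VALUE means exhausted.
        int advance(int target) {
            while (pos < docs.length && docs[pos] < target) pos++;
            return pos < docs.length ? docs[pos] : Integer.MAX_VALUE;
        }
    }

    // For each doc of the MUST clause, count how many SHOULD clauses match it.
    static int[] matchCounts(int[] mustDocs, DocIdIterator[] shoulds) {
        int[] counts = new int[mustDocs.length];
        for (int i = 0; i < mustDocs.length; i++) {
            for (DocIdIterator s : shoulds) {
                if (s.advance(mustDocs[i]) == mustDocs[i]) counts[i]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] mustDocs = {5, 100, 200};
        DocIdIterator[] shoulds = {
            new DocIdIterator(1, 5, 50, 200),  // matches docs 5 and 200
            new DocIdIterator(100, 150, 200),  // matches docs 100 and 200
        };
        System.out.println(Arrays.toString(matchCounts(mustDocs, shoulds))); // [1, 1, 2]
    }
}
```

When the MUST iterator jumps 100 docs ahead, each SHOULD iterator moves past the skipped range in one advance() call, which is the behavior Mike's issue description asks BooleanScorer to adopt.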
[jira] [Updated] (SOLR-5394) facet.method=fcs seems to be using threads when it shouldn't
[ https://issues.apache.org/jira/browse/SOLR-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaliy Zhovtyuk updated SOLR-5394: --- Attachment: SOLR-5394.patch The attached patch contains 3 tests reproducing the issues with the thread number. There are 2 unrelated usages of SimpleFacets.threads with different initialization: - facet.threads - pool size for getting the term count for each facet field; execution is synchronous if 0. - pool size of org.apache.solr.request.PerSegmentSingleValuedFaceting, from local parameters used in a query like {!prefix f=bla threads=3 ex=text:bla}signatureField. If a negative or zero thread number is passed, MAX_INT is used as the thread number: int threads = nThreads <= 0 ? Integer.MAX_VALUE : nThreads; The default value of -1 could be the issue. About the proposed fix: I don't see any good reason to keep a negative thread number by default; if a negative value is to mean anything, it should be limited to -1. I propose to set threads=1 by default, meaning single-threaded execution if unspecified. If a MAX_INT thread pool (i.e. an unlimited thread number) is really required, it can be specified in the query as -1. facet.method=fcs seems to be using threads when it shouldn't Key: SOLR-5394 URL: https://issues.apache.org/jira/browse/SOLR-5394 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Michael McCandless Attachments: SOLR-5394.patch, SOLR-5394_keep_threads_original_value.patch I built a wikipedia index, with multiple fields for faceting. When I do facet.method=fcs with facet.field=dateFacet and facet.field=userNameFacet, and then kill -QUIT the java process, I see a bunch (46, I think) of facetExecutor-7-thread-N threads had spun up. But I thought threads for each field is turned off by default? Even if I add facet.threads=0, it still spins up all the threads.
I think something is wrong in SimpleFacets.parseParams; somehow, that method returns early (because localParams is null), leaving threads=-1, and then the later code that would have set threads to 0 never runs.
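If the fix goes the way Vitaliy proposes, the resolution logic would look roughly like the following. This is a standalone sketch, not Solr's actual SimpleFacets code; the class and method names are made up, and clamping other non-positive values to 1 is my assumption, not part of the proposal.

```java
// Sketch of the proposed facet.threads semantics:
//   unspecified -> 1 (single-threaded), explicit -1 -> unbounded pool,
//   any other value -> that pool size (clamped to at least 1; an assumption).
// Illustrative only; not Solr's SimpleFacets implementation.
public class FacetThreadsSketch {
    static int resolvePoolSize(Integer requested) {
        if (requested == null) return 1;               // unspecified: single-threaded
        if (requested == -1) return Integer.MAX_VALUE; // explicit -1: unbounded pool
        return Math.max(1, requested);                 // otherwise: requested size, min 1
    }

    public static void main(String[] args) {
        System.out.println(resolvePoolSize(null)); // 1
        System.out.println(resolvePoolSize(-1));   // 2147483647
        System.out.println(resolvePoolSize(4));    // 4
    }
}
```

This removes the surprise in the bug report: a request that never mentions facet.threads can no longer end up with dozens of facetExecutor threads.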
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940357#comment-13940357 ] Michael McCandless commented on LUCENE-4396: Thanks Da, the new iteration on the proposal looks awesome; I left a comment there! You're right: if we do use .advance() on the sub-scorers, then even in the case where a conjunction clause has very low cost, BS should still be competitive. BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless
[jira] [Commented] (LUCENE-5422) Postings lists deduplication
[ https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940363#comment-13940363 ] Michael McCandless commented on LUCENE-5422: Can any Lucene/Solr committers volunteer to mentor this project? Postings lists deduplication Key: LUCENE-5422 URL: https://issues.apache.org/jira/browse/LUCENE-5422 Project: Lucene - Core Issue Type: Improvement Components: core/codecs, core/index Reporter: Dmitry Kan Labels: gsoc2014 The context: http://markmail.org/thread/tywtrjjcfdbzww6f Robert Muir and I discussed what Robert eventually named postings lists deduplication at the Berlin Buzzwords 2013 conference. The idea is to allow multiple terms to point to the same postings list to save space. This can be achieved by a new index codec implementation, but this jira is open to other ideas as well. The application / impact of this is positive for synonyms, exact / inexact terms, leading-wildcard support via storing the reversed term, etc. For example, at the moment, when supporting exact (unstemmed) and inexact (stemmed) searches, we store both the unstemmed and the stemmed variant of a word form, and that leads to index bloat. That is why we had to remove leading-wildcard support via reversing a token at index and query time, because of the same index-size considerations. Comment from Mike McCandless: Neat idea! Would this idea allow a single term to point to (the union of) N other postings lists? It seems like that's necessary e.g. to handle the exact/inexact case. And then, to produce the DocsAndPositionsEnum you'd need to do the merge sort across those N postings lists? Such a thing might also be do-able as a runtime-only wrapper around the postings API (FieldsProducer), if you could at runtime do the reverse expansion (e.g. stem -> all of its surface forms).
Comment from Robert Muir: I think the exact/inexact case is trickier (detecting it would be the hard part), and you are right, another solution might work better. But for the reverse-wildcard and synonyms situations, it seems we could even detect it on write if we created some hash of the previous term's postings. If the hash matches for the current term, we know it might be a duplicate and would have to actually do the costly check that they are the same. Maybe there are better ways to do it, but it might be a fun postings-format experiment to try.
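Robert's hash-then-verify idea can be sketched independently of any codec. Everything below is illustrative: plain int[] arrays stand in for postings lists, a HashMap stands in for real index structures, and keeping hashes for all prior terms (rather than just the previous one) is a simplification of what a write-time codec would do.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Write-time dedup sketch: hash each term's postings; on a hash match, do the
// full (costly) equality check before letting the new term share the existing
// list. Names and data structures are illustrative only.
public class PostingsDedupSketch {
    private final Map<Integer, int[]> byHash = new HashMap<>();
    private final Map<String, int[]> termToPostings = new HashMap<>();

    void addTerm(String term, int[] postings) {
        int h = Arrays.hashCode(postings);
        int[] candidate = byHash.get(h);
        if (candidate != null && Arrays.equals(candidate, postings)) {
            termToPostings.put(term, candidate); // true duplicate: share the list
        } else {
            byHash.put(h, postings);             // new, or hash collision on different data
            termToPostings.put(term, postings);
        }
    }

    boolean shared(String a, String b) {
        return termToPostings.get(a) == termToPostings.get(b); // same array instance
    }

    public static void main(String[] args) {
        PostingsDedupSketch dedup = new PostingsDedupSketch();
        dedup.addTerm("run", new int[]{3, 17, 42});
        dedup.addTerm("running", new int[]{3, 17, 42}); // stemmed variant, same docs
        dedup.addTerm("walk", new int[]{5, 9});
        System.out.println(dedup.shared("run", "running")); // true
        System.out.println(dedup.shared("run", "walk"));    // false
    }
}
```

The cheap hash is the filter; the expensive Arrays.equals only runs on hash matches, which is exactly the cost profile Robert describes.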
[jira] [Commented] (LUCENE-5539) Simplify IndexWriter.commitMergedDeletesAndUpdates
[ https://issues.apache.org/jira/browse/LUCENE-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940365#comment-13940365 ] Michael McCandless commented on LUCENE-5539: +1 Simplify IndexWriter.commitMergedDeletesAndUpdates -- Key: LUCENE-5539 URL: https://issues.apache.org/jira/browse/LUCENE-5539 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5539.patch IW.commitMergedDeletes could use some simplification. For example, if we factor out a holder class for {{mergedDeletesAndUpdates}} and {{docMap}}, we can factor out a lot of the duplicated logic into a single method. I'll attach a patch shortly.
[jira] [Commented] (LUCENE-466) Need QueryParser support for BooleanQuery.minNrShouldMatch
[ https://issues.apache.org/jira/browse/LUCENE-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940369#comment-13940369 ] Michael McCandless commented on LUCENE-466: --- Can any Lucene/Solr committer volunteer to mentor this project and/or LUCENE-4892? Need QueryParser support for BooleanQuery.minNrShouldMatch -- Key: LUCENE-466 URL: https://issues.apache.org/jira/browse/LUCENE-466 Project: Lucene - Core Issue Type: Improvement Components: core/search Environment: Operating System: other Platform: Other Reporter: Mark Harwood Priority: Minor Labels: gsoc2014 Attached are 2 new classes: 1) CoordConstrainedBooleanQuery A boolean query that only matches if a specified number of the contained clauses match. An example use might be a query that returns a list of books where ANY 2 people from a list of people were co-authors, e.g.: Lucene in Action would match (Erik Hatcher, Otis Gospodnetić, Mark Harwood, Doug Cutting) with a minRequiredOverlap of 2, because Otis and Erik wrote that book. The book Java Development with Ant would not match because only 1 element in the list (Erik) was selected. 2) CustomQueryParserExample A customised QueryParser that allows definition of CoordConstrainedBooleanQueries. The solution (mis)uses fieldnames to pass parameters to the custom query.
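Lucene's BooleanQuery has long since exposed setMinimumNumberShouldMatch, which covers the CoordConstrainedBooleanQuery use case at query-construction time; what this issue asks for is query-parser syntax for it. The semantics are easy to state standalone. The sketch below uses plain sets of doc IDs in place of Lucene clauses, and all names are illustrative:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// "Match a doc only if at least minShouldMatch of the clauses match it",
// i.e. the semantics of BooleanQuery.setMinimumNumberShouldMatch, stated over
// plain doc-ID sets. Names are illustrative; this is not Lucene code.
public class MinShouldMatchSketch {
    static Set<Integer> match(List<Set<Integer>> clauses, int minShouldMatch) {
        Set<Integer> universe = new TreeSet<>();
        for (Set<Integer> c : clauses) universe.addAll(c);
        Set<Integer> result = new TreeSet<>();
        for (int doc : universe) {
            int hits = 0;
            for (Set<Integer> c : clauses) if (c.contains(doc)) hits++;
            if (hits >= minShouldMatch) result.add(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        // Author clauses over book doc IDs: doc 1 has both Erik and Otis,
        // doc 2 has only Erik, doc 3 has only Mark.
        Set<Integer> erik = Set.of(1, 2), otis = Set.of(1), mark = Set.of(3);
        System.out.println(match(List.of(erik, otis, mark), 2)); // [1]
    }
}
```

With minShouldMatch=2, only the book with two matching co-authors survives, mirroring Mark Harwood's minRequiredOverlap example.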
[jira] [Commented] (LUCENE-5528) Add context to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940381#comment-13940381 ] Michael McCandless commented on LUCENE-5528: Hmm, I think I prefer the simpler SetBytesRef? And e.g. one problem with BytesRefIterator is you can only iterate it once (we'd sort of need a BytesRefIterable I guess), which might be a hassle for some suggesters? Add context to AnalyzingInfixSuggester -- Key: LUCENE-5528 URL: https://issues.apache.org/jira/browse/LUCENE-5528 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5528-1.patch, LUCENE-5528.patch, LUCENE-5528.patch, contextInputIteratImpl.patch Spinoff from LUCENE-5350.
[jira] [Assigned] (SOLR-5715) CloudSolrServer should choose URLs that match _route_
[ https://issues.apache.org/jira/browse/SOLR-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-5715: Assignee: Noble Paul CloudSolrServer should choose URLs that match _route_ - Key: SOLR-5715 URL: https://issues.apache.org/jira/browse/SOLR-5715 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 4.6.1 Reporter: Chase Bradford Assignee: Noble Paul Priority: Minor Fix For: 4.8, 5.0 When using CloudSolrServer to issue a request with a _route_ param, the URLs passed to LBHttpSolrServer should be filtered to include only hosts serving a slice. If there's a single shard listed, then the query can be served directly. Otherwise, the cluster services 3 /select requests for the query. As the host to replica ratio increases the probability of needing an extra hop goes to one, putting unnecessary strain on the cluster's network.
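The improvement is simple to state: given the collection's slice-to-replica mapping and a _route_ value, only the replicas of the slice owning that route should be handed to LBHttpSolrServer. A standalone sketch follows; the trivial hash router is a stand-in for Solr's real DocRouter, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of _route_-aware URL selection: resolve the route key to a slice and
// return only that slice's replica URLs, instead of every host in the cluster.
// The hash "router" here is a stand-in for Solr's DocRouter; names are made up.
public class RouteUrlsSketch {
    static List<String> urlsForRoute(Map<String, List<String>> sliceToReplicas,
                                     List<String> sliceNames, String route) {
        // Trivial stand-in router: pick a slice by hashing the route key.
        String slice = sliceNames.get(Math.floorMod(route.hashCode(), sliceNames.size()));
        return new ArrayList<>(sliceToReplicas.get(slice));
    }

    public static void main(String[] args) {
        Map<String, List<String>> cluster = Map.of(
            "shard1", List.of("http://host1:8983/solr", "http://host2:8983/solr"),
            "shard2", List.of("http://host3:8983/solr"));
        // Only the owning slice's replicas are returned; no extra proxy hop needed.
        System.out.println(urlsForRoute(cluster, List.of("shard1", "shard2"), "user42!"));
    }
}
```

Whichever slice the router picks, the returned list never mixes replicas from other slices, which is what removes the extra hop the issue describes.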
[jira] [Commented] (SOLR-5763) Upgrade to Tika 1.5
[ https://issues.apache.org/jira/browse/SOLR-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940411#comment-13940411 ] ASF subversion and git services commented on SOLR-5763: --- Commit 1579197 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1579197 ] SOLR-5763: Tika version 1.4 -> 1.5 in Versions of Major Components section Upgrade to Tika 1.5 --- Key: SOLR-5763 URL: https://issues.apache.org/jira/browse/SOLR-5763 Project: Solr Issue Type: Task Components: contrib - Solr Cell (Tika extraction) Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Fix For: 4.8 Attachments: SOLR-5763.patch, SOLR-5763.patch, SOLR-5763.patch Just released: http://www.apache.org/dist/tika/CHANGES-1.5.txt
[jira] [Commented] (SOLR-5763) Upgrade to Tika 1.5
[ https://issues.apache.org/jira/browse/SOLR-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940413#comment-13940413 ] ASF subversion and git services commented on SOLR-5763: --- Commit 1579198 from [~steve_rowe] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579198 ] SOLR-5763: Tika version 1.4 -> 1.5 in Versions of Major Components section (merged trunk r1579197) Upgrade to Tika 1.5 --- Key: SOLR-5763 URL: https://issues.apache.org/jira/browse/SOLR-5763 Project: Solr Issue Type: Task Components: contrib - Solr Cell (Tika extraction) Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Fix For: 4.8
[jira] [Commented] (SOLR-5763) Upgrade to Tika 1.5
[ https://issues.apache.org/jira/browse/SOLR-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940414#comment-13940414 ] Steve Rowe commented on SOLR-5763: -- Thanks [~elyograg], I've fixed the Tika version in the Versions of Major Components section on trunk and branch_4x. Upgrade to Tika 1.5 --- Key: SOLR-5763 URL: https://issues.apache.org/jira/browse/SOLR-5763 Project: Solr Issue Type: Task Components: contrib - Solr Cell (Tika extraction) Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Fix For: 4.8
[jira] [Updated] (SOLR-5715) CloudSolrServer should choose URLs that match _route_
[ https://issues.apache.org/jira/browse/SOLR-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5715: - Attachment: SOLR-5715.patch Quick fix; tests yet to be added. CloudSolrServer should choose URLs that match _route_ - Key: SOLR-5715 URL: https://issues.apache.org/jira/browse/SOLR-5715 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 4.6.1 Reporter: Chase Bradford Assignee: Noble Paul Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5715.patch
[jira] [Commented] (LUCENE-5539) Simplify IndexWriter.commitMergedDeletesAndUpdates
[ https://issues.apache.org/jira/browse/LUCENE-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940426#comment-13940426 ] ASF subversion and git services commented on LUCENE-5539: - Commit 1579204 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1579204 ] LUCENE-5539: simplify IndexWriter.commitMergedDeletesAndUpdates Simplify IndexWriter.commitMergedDeletesAndUpdates -- Key: LUCENE-5539 URL: https://issues.apache.org/jira/browse/LUCENE-5539 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5539.patch
[jira] [Commented] (LUCENE-5539) Simplify IndexWriter.commitMergedDeletesAndUpdates
[ https://issues.apache.org/jira/browse/LUCENE-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940432#comment-13940432 ] ASF subversion and git services commented on LUCENE-5539: - Commit 1579206 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579206 ] LUCENE-5539: simplify IndexWriter.commitMergedDeletesAndUpdates Simplify IndexWriter.commitMergedDeletesAndUpdates -- Key: LUCENE-5539 URL: https://issues.apache.org/jira/browse/LUCENE-5539 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5539.patch
[jira] [Resolved] (LUCENE-5539) Simplify IndexWriter.commitMergedDeletesAndUpdates
[ https://issues.apache.org/jira/browse/LUCENE-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5539. Resolution: Fixed Committed to trunk and 4x. Simplify IndexWriter.commitMergedDeletesAndUpdates -- Key: LUCENE-5539 URL: https://issues.apache.org/jira/browse/LUCENE-5539 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5539.patch
ensuring codec can index offsets in test framework
This is similar to David Smiley's question on Feb 16th, but SuppressCodecs would be too broad a solution, I think. I'm using LuceneTestCase's newIndexWriterConfig, and I have a test that requires IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS. The test passes quite often (famous last words), but I occasionally get an UnsupportedOperationException: "this codec cannot index offsets". Is there a way to have LuceneTestCase randomly select a codec (with particular subcomponents/configurations) that supports indexing offsets? The codec that fails: codec=Lucene46: {f1:MockVariableIntBlock(baseBlockSize=71)}, docValues:{}, ... Most often, however, Lucene46 does not fail.
Re: [JENKINS] Lucene-trunk-Linux-java7-64-analyzers - Build # 62 - Failure!
This is a new test I added yesterday that passes huge strings to TestRandomChains. The OOM is because: [junit4] 2 tokenizer= [junit4] 2 org.apache.lucene.analysis.ngram.NGramTokenizer(LUCENE_50, 20, 68) I'll shorten the length of the strings we use in this test. On Wed, Mar 19, 2014 at 4:40 AM, buil...@flonkings.com wrote: Build: builds.flonkings.com/job/Lucene-trunk-Linux-java7-64-analyzers/62/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([8B587EAF1B1924E2:E103C1BE42570411]:0) at java.util.Arrays.copyOfRange(Arrays.java:2694) at java.lang.String.<init>(String.java:203) at org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.toString(CharTermAttributeImpl.java:267) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:696) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:923) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) Build Log: [...truncated 679 lines...] 
[junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains [junit4] 2 TEST FAIL: useCharFilter=true text='udjvij jvmnqembqwru lezr ezgrcxud ;;\n1610431719 \ud83c\ude59\ud83c\udef1 \udb40\udc3b\udb40\udc58\udb40\udc04\udb40\udc5a\udb40\udc65 \ua964\ua962\ua97b\ua96b\ua962 [)tl{0,5} jmdgw hmwuwfiyr cgpkxvyosff dkjbbcw \u0620\udb17\udd39\u024a\ud0a4 dcs' [junit4] 2 Exception from random analyzer: [junit4] 2 charfilters= [junit4] 2 tokenizer= [junit4] 2 org.apache.lucene.analysis.ngram.NGramTokenizer(LUCENE_50, 20, 68) [junit4] 2 filters= [junit4] 2 org.apache.lucene.analysis.tr.TurkishLowerCaseFilter(ValidatingTokenFilter@5bf2a40a term=,bytes=[],positionIncrement=1,positionLength=1,startOffset=0,endOffset=0) [junit4] 2 org.apache.lucene.analysis.pt.PortugueseStemFilter(ValidatingTokenFilter@550ae9ec
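The OOM is unsurprising once you count the grams: with minGram=20 and maxGram=68, an ngram tokenizer emits roughly one token per (start offset, gram length) pair, so a string of length L produces about the sum over n from 20 to 68 of (L - n + 1) tokens, each up to 68 chars long. A quick check of that arithmetic (assuming the standard every-position, every-length ngram definition; the class name is made up):

```java
// Count the grams an NGramTokenizer-style configuration would emit for a
// string of the given length, assuming one gram per (start, length) pair.
public class NGramCountSketch {
    static long gramCount(int len, int minGram, int maxGram) {
        long total = 0;
        for (int n = minGram; n <= maxGram; n++) {
            total += Math.max(0, len - n + 1);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(gramCount(100, 20, 68));       // 2793 grams for a 100-char string
        System.out.println(gramCount(1_000_000, 20, 68)); // ~49 million grams for a huge string
    }
}
```

Multiplying tens of millions of grams by up to 68 chars each makes the heap-space failure on huge test strings easy to explain, and shortening the strings the expected fix.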
Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_60-ea-b10) - Build # 9841 - Failure!
Hmmm; I’ll fix this ASAP On 3/18/14, 11:53 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9841/ Java: 32bit/jdk1.7.0_60-ea-b10 -client -XX:+UseG1GC 1 tests failed. FAILED: org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin {#9 seed=[14A074E14CFDBEA3:F66F35851492BBDC]} Error Message: Shouldn't match I#3:Rect(minX=-166.0,maxX=-161.0,minY=87.0,maxY=90.0) Q:Pt(x=180.0,y=56.0) Stack Trace: java.lang.AssertionError: Shouldn't match I#3:Rect(minX=-166.0,maxX=-161.0,minY=87.0,maxY=90.0) Q:Pt(x=180.0,y=56.0) at __randomizedtesting.SeedInfo.seed([14A074E14CFDBEA3:F66F35851492BBDC]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:355) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:335) at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin(SpatialOpRecursivePrefixTreeTest.java:119) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at java.lang.Thread.run(Thread.java:744) Build Log: [...truncated 9265 lines...] [junit4] Suite:
Re: ensuring codec can index offsets in test framework
for now you can use an assume, there is a helper in LuceneTestCase: String pf = TestUtil.getPostingsFormat(dummy); boolean supportsOffsets = !doesntSupportOffsets.contains(pf); another option is to suppress the codecs that don't support it (anything using Sep layout). This is annoying though, maybe we should remove Sep layout? Realistically it was the precursor to the block layout that Lucene41 introduced, which was a big change, but I am unsure if it's really helping us anymore, because it just falls behind on things and I don't think it has any interesting qualities for real use or that would be useful in testing, either. On Wed, Mar 19, 2014 at 8:53 AM, Allison, Timothy B. talli...@mitre.org wrote: This is similar to David Smiley's question on Feb 16th, but SuppressCodecs would be too broad of a solution, I think. I'm using LuceneTestCase's newIndexWriterConfig, and I have a test that requires IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS. The test passes quite often (famous last words), but I occasionally get an UnsupportedOperationException: this codec cannot index offsets. Is there a way to have LuceneTestCase randomly select a codec (with particular subcomponents/configurations) that supports indexing offsets? The codec that fails: codec=Lucene46: {f1:MockVariableIntBlock(baseBlockSize=71)}, docValues:{}, ... Most often, however, Lucene46 does not fail. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
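The suggested guard can be sketched in plain Java. This is an illustrative stand-in: the set contents and helper name below mimic LuceneTestCase's doesntSupportOffsets / TestUtil.getPostingsFormat, but they are assumptions, not the framework's literal fields:

```java
import java.util.Set;

public class OffsetsAssume {
    // Stand-in for LuceneTestCase.doesntSupportOffsets: postings formats built
    // on the Sep layout cannot index offsets. Exact contents are an assumption.
    static final Set<String> DOESNT_SUPPORT_OFFSETS =
        Set.of("MockFixedIntBlock", "MockVariableIntBlock", "MockSep", "MockRandom");

    static boolean supportsOffsets(String postingsFormat) {
        return !DOESNT_SUPPORT_OFFSETS.contains(postingsFormat);
    }

    public static void main(String[] args) {
        // In a real test this would feed an assume, roughly:
        //   assumeTrue("codec cannot index offsets",
        //              supportsOffsets(TestUtil.getPostingsFormat("f1")));
        System.out.println(supportsOffsets("MockVariableIntBlock")); // false -> skip test
        System.out.println(supportsOffsets("Lucene41"));             // true  -> run test
    }
}
```

With this shape the randomized codec still gets picked, and the test simply skips itself on runs where the chosen format cannot index offsets.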
[jira] [Commented] (SOLR-5880) org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper wi
[ https://issues.apache.org/jira/browse/SOLR-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940517#comment-13940517 ] ASF subversion and git services commented on SOLR-5880: --- Commit 1579242 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579242 ] SOLR-5880: Stop using zookeeper.forceSync=false for now. org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper within the timeout. -- Key: SOLR-5880 URL: https://issues.apache.org/jira/browse/SOLR-5880 Project: Solr Issue Type: Test Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 This test is failing consistently, though currently only on Policeman Jenkins servers.
[jira] [Commented] (SOLR-5880) org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper wi
[ https://issues.apache.org/jira/browse/SOLR-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940520#comment-13940520 ] ASF subversion and git services commented on SOLR-5880: --- Commit 1579243 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579243 ] SOLR-5880: Stop using zookeeper.forceSync=false for now. org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper within the timeout. -- Key: SOLR-5880 URL: https://issues.apache.org/jira/browse/SOLR-5880 Project: Solr Issue Type: Test Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 This test is failing consistently, though currently only on Policeman Jenkins servers.
[jira] [Comment Edited] (SOLR-5878) Solr returns duplicates when using distributed search with group.format=simple
[ https://issues.apache.org/jira/browse/SOLR-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940522#comment-13940522 ] J.B. Langston edited comment on SOLR-5878 at 3/19/14 2:46 PM: -- Sorry for not following protocol. Do you want me to move to the list now or continue here since it's already open? I may have misstated the problem here. The duplicates aren't the problem; rather that it ignores the rows parameter when using sharding and group.format=simple at the same time. You'll notice that there is a rows=5 param in the url, but in the output there are 16 documents returned. This prevents the use of rows and start params to page through the data. You're right about the cont_stub field not being the unique key. id is the unique key and indeed there are multiple documents with the same value for cont_stub and different values for the unique key. I was filing this on behalf of a customer and as I was reproducing it, I noticed the duplicates and got distracted by those. Sorry for the confusion; I can update the description to reflect the true problem if you like, or I ask on the mailing list before continuing here. was (Author: jblangs...@datastax.com): Sorry for not following protocol. Do you want me to move to the list now or continue here since it's already open? I may have misstated the problem here. The duplicates aren't the problem; rather that it ignores the rows parameter when using sharding and group.format=simple at the same time. You'll notice that there is a rows=5 param in the url, but the output and there are 16 documents returned. This prevents the use of rows and start params to page through the data. You're right about the cont_stub field not being the unique key. id is the unique key and indeed there are multiple documents with the same value for cont_stub and different values for the unique key. 
I was filing this on behalf of a customer and as I was reproducing it, I noticed the duplicates and got distracted by those. Sorry for the confusion; I can update the description to reflect the true problem if you like, or I ask on the mailing list before continuing here. Solr returns duplicates when using distributed search with group.format=simple -- Key: SOLR-5878 URL: https://issues.apache.org/jira/browse/SOLR-5878 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: J.B. Langston Solr returns duplicate documents when group.format=simple is supplied on a distributed search. This does not happen on the standard group format or when not using distributed search. For example: {code} http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000 {code} Returns: {code}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">253</int>
  </lst>
  <lst name="grouped">
    <lst name="cont_stub">
      <int name="matches">56</int>
      <result name="doclist" numFound="56" start="0" maxScore="1.0">
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      </result>
    </lst>
  </lst>
</response>
{code} It should only return 5 documents. Removing the distributed search and searching on either core will return the requested number of rows. Removing group.format=simple will also return the requested number of rows. --
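The behavior the reporter expects can be sketched independently of Solr: whatever the coordinator merges from the shards' simple-format doclists, it should still apply start/rows before responding. This is an illustrative sketch of the desired behavior, not Solr's actual grouping code:

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleGroupMerge {
    // Merge the simple-format doclists returned by each shard, then apply
    // start/rows -- the step the bug report says is skipped when
    // group.format=simple is combined with sharding.
    static List<String> merge(List<List<String>> shardDoclists, int start, int rows) {
        List<String> merged = new ArrayList<>();
        for (List<String> docs : shardDoclists) {
            merged.addAll(docs);
        }
        int from = Math.min(start, merged.size());
        int to = Math.min(from + rows, merged.size());
        return merged.subList(from, to);
    }

    public static void main(String[] args) {
        // Two shards contribute 16 docs total, but rows=5 should cap the response.
        List<List<String>> shards = List.of(
            List.of("d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"),
            List.of("d9", "d10", "d11", "d12", "d13", "d14", "d15", "d16"));
        System.out.println(merge(shards, 0, 5).size()); // 5, not 16
    }
}
```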
[jira] [Commented] (SOLR-5865) Provide a MiniSolrCloudCluster to enable easier testing
[ https://issues.apache.org/jira/browse/SOLR-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940529#comment-13940529 ] ASF subversion and git services commented on SOLR-5865: --- Commit 1579247 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579247 ] SOLR-5865: Ignore test for the moment. Provide a MiniSolrCloudCluster to enable easier testing --- Key: SOLR-5865 URL: https://issues.apache.org/jira/browse/SOLR-5865 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.7, 5.0 Reporter: Gregory Chanan Assignee: Mark Miller Attachments: SOLR-5865.patch, SOLR-5865.patch Today, the SolrCloud tests are based on the LuceneTestCase class hierarchy, which has a couple of issues around support for downstream projects: - It's difficult to test SolrCloud support in a downstream project that may have its own test framework. For example, some projects have support for different storage backends (e.g. Solr/ElasticSearch/HBase) and want tests against each of the different backends. This is difficult to do cleanly, because the Solr tests require derivation from LuceneTestCase, while the others don't. - The LuceneTestCase class hierarchy is really designed for internal solr tests (e.g. it randomizes a lot of parameters to get test coverage, but a downstream project probably doesn't care about that). It's also quite complicated and dense, much more so than a downstream project would want. Given these reasons, it would be nice to provide a simple MiniSolrCloudCluster, similar to how HDFS provides a MiniHdfsCluster or HBase provides a MiniHBaseCluster.
[jira] [Commented] (LUCENE-5538) FastVectorHighlighter fails with booleans of phrases
[ https://issues.apache.org/jira/browse/LUCENE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940532#comment-13940532 ] Adrien Grand commented on LUCENE-5538: -- +1! FastVectorHighlighter fails with booleans of phrases Key: LUCENE-5538 URL: https://issues.apache.org/jira/browse/LUCENE-5538 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-5538.patch, LUCENE-5538.patch, LUCENE-5538_test.patch In some situations a query of (P1 OR P2) returns no results, even though individually, both P1 and P2 by themselves will highlight correctly.
[jira] [Commented] (SOLR-5880) org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper wi
[ https://issues.apache.org/jira/browse/SOLR-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940548#comment-13940548 ] ASF subversion and git services commented on SOLR-5880: --- Commit 1579252 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579252 ] SOLR-5880: Start using zookeeper.forceSync=false in tests. org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper within the timeout. -- Key: SOLR-5880 URL: https://issues.apache.org/jira/browse/SOLR-5880 Project: Solr Issue Type: Test Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 This test is failing consistently, though currently only on Policeman Jenkins servers.
[jira] [Commented] (SOLR-5880) org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper wi
[ https://issues.apache.org/jira/browse/SOLR-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940552#comment-13940552 ] ASF subversion and git services commented on SOLR-5880: --- Commit 1579253 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579253 ] SOLR-5880: Start using zookeeper.forceSync=false in tests. org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper within the timeout. -- Key: SOLR-5880 URL: https://issues.apache.org/jira/browse/SOLR-5880 Project: Solr Issue Type: Test Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 This test is failing consistently, though currently only on Policeman Jenkins servers.
[jira] [Commented] (LUCENE-5538) FastVectorHighlighter fails with booleans of phrases
[ https://issues.apache.org/jira/browse/LUCENE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940559#comment-13940559 ] ASF subversion and git services commented on LUCENE-5538: - Commit 1579255 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1579255 ] LUCENE-5538: FastVectorHighlighter fails with booleans of phrases FastVectorHighlighter fails with booleans of phrases Key: LUCENE-5538 URL: https://issues.apache.org/jira/browse/LUCENE-5538 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-5538.patch, LUCENE-5538.patch, LUCENE-5538_test.patch In some situations a query of (P1 OR P2) returns no results, even though individually, both P1 and P2 by themselves will highlight correctly.
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_51) - Build # 9741 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9741/ Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 Stack Trace: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 at __randomizedtesting.SeedInfo.seed([B5B16FF77330D152:885046963422CC92]:0) at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl.setOffset(OffsetAttributeImpl.java:45) at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:78) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940570#comment-13940570 ] Kranti Parisa commented on SOLR-4787: - can you post the query Join Contrib Key: SOLR-4787 URL: https://issues.apache.org/jira/browse/SOLR-4787 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.2.1 Reporter: Joel Bernstein Priority: Minor Fix For: 4.8 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above. *HashSetJoinQParserPlugin aka hjoin* The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin but the implementation differs in a couple of important ways. The first way is that the hjoin is designed to work with int and long join keys only. So, in order to use hjoin, int or long join keys must be included in both the to and from core. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys. So, the hjoin will need more memory than the JoinQParserPlugin to perform the join. The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time.
The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both lucene query and PostFilter implementations. A *cost* > 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading. For example *threads=6* will use 6 threads to build the filter. This will set up a fixed threadpool with six threads to handle all hjoin requests. Once the threadpool is created the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex then you can avoid hashset resizing, which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation because PostFilters are not cacheable in the filterCache. The syntax of the hjoin is similar to the JoinQParserPlugin except that the plugin is referenced by the string hjoin rather than join.
fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1 applying the local fq parameter to filter the results. The lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query where the to field is present in the from list will be included in the results. The solrconfig.xml in the main query core must contain the reference to the hjoin. <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/> And the join contrib lib jars must be registered in the solrconfig.xml. <lib dir="../../../contrib/joins/lib" regex=".*\.jar" /> After issuing the ant dist command from inside the solr directory the joins contrib jar will appear in the solr/dist directory. Place the solr-joins-4.*-.jar in the WEB-INF/lib directory of the solr
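The nested-filter-query feature (point 4 above) is easiest to see with a concrete request. The collection name collection3 and the field values below are made up for illustration; only the fq=$param indirection is taken from the description above:

```
fq={!hjoin fromIndex=collection2 from=id_i to=id_i fq=$level2}user:customer1
&level2={!hjoin fromIndex=collection3 from=id_i to=id_i}type:premium
```

Here the inner hjoin filters collection2's results before the join keys are gathered, so joins can be chained to arbitrary depth, subject to the memory cost of each hashset.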
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_51) - Build # 9741 - Still Failing!
this fail is just because i increased this test to try harder (it takes multiplier into account etc) thai + shingles looks suspicious. I'll take a look in a bit. On Wed, Mar 19, 2014 at 11:26 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9741/ Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 Stack Trace: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 at __randomizedtesting.SeedInfo.seed([B5B16FF77330D152:885046963422CC92]:0) at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl.setOffset(OffsetAttributeImpl.java:45) at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:78) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at
[jira] [Commented] (LUCENE-5538) FastVectorHighlighter fails with booleans of phrases
[ https://issues.apache.org/jira/browse/LUCENE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940579#comment-13940579 ] ASF subversion and git services commented on LUCENE-5538: - Commit 1579264 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579264 ] LUCENE-5538: FastVectorHighlighter fails with booleans of phrases FastVectorHighlighter fails with booleans of phrases Key: LUCENE-5538 URL: https://issues.apache.org/jira/browse/LUCENE-5538 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-5538.patch, LUCENE-5538.patch, LUCENE-5538_test.patch in some situations a query of (P1 OR P2) returns no results, even though individually, both P1 or P2 by themselves will highlight correctly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5878) Solr returns duplicates when using distributed search with group.format=simple
[ https://issues.apache.org/jira/browse/SOLR-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940590#comment-13940590 ] Erick Erickson commented on SOLR-5878: -- bq: Sorry for not following protocol NP, we all gotta start somewhere. You're right, on the surface of it, it looks like this is just not working according to what I would expect from the Wiki page, let's keep this open. Solr returns duplicates when using distributed search with group.format=simple -- Key: SOLR-5878 URL: https://issues.apache.org/jira/browse/SOLR-5878 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: J.B. Langston Solr returns duplicate documents when group.format=simple is supplied on a distributed search. This does not happen on the standard group format or when not using distributed search. For example: {code} http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000 {code} Returns: {code} <?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">253</int>
  </lst>
  <lst name="grouped">
    <lst name="cont_stub">
      <int name="matches">56</int>
      <result name="doclist" numFound="56" start="0" maxScore="1.0">
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      </result>
    </lst>
  </lst>
</response>
{code} It should only return 5 documents. Removing the distributed search and searching on either core will return the requested number of rows. Removing group.format=simple will also return the requested number of rows. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: ensuring codec can index offsets in test framework
Perfect. As always, thank you. From: Robert Muir [rcm...@gmail.com] Sent: Wednesday, March 19, 2014 10:29 AM To: dev@lucene.apache.org Subject: Re: ensuring codec can index offsets in test framework for now you can use an assume, there is a helper in LuceneTestCase: String pf = TestUtil.getPostingsFormat("dummy"); boolean supportsOffsets = !doesntSupportOffsets.contains(pf); another option is to suppress the codecs that don't support it (anything using Sep layout). This is annoying though, maybe we should remove Sep layout? Realistically it was the precursor to the block layout that Lucene41 introduced, which was a big change, but i am unsure if its really helping us anymore, because it just falls behind on things and i dont think it has any interesting qualities for real use or that would be useful in testing, either.. On Wed, Mar 19, 2014 at 8:53 AM, Allison, Timothy B. talli...@mitre.org wrote: This is similar to David Smiley's question on Feb 16th, but SuppressCodecs would be too broad of a solution, I think. I'm using LuceneTestCase's newIndexWriterConfig, and I have a test that requires IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS. The test passes quite often (famous last words), but I occasionally get an UnsupportedOperationException: this codec cannot index offsets. Is there a way to have LuceneTestCase randomly select a codec (with particular subcomponents/configurations) that supports indexing offsets? The codec that fails: codec=Lucene46: {f1:MockVariableIntBlock(baseBlockSize=71)}, docValues:{}, ... Most often, however Lucene46 does not fail. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
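The assume-based skip Robert describes can be mirrored in plain Java. The set contents below are an assumption standing in for LuceneTestCase's doesntSupportOffsets (the real list lives in the test framework); only the containment check is from the snippet above:

```java
import java.util.Set;

public class OffsetsSupportCheck {
    // Assumed contents -- stand-in for LuceneTestCase's doesntSupportOffsets set
    // (mock formats using the Sep layout, per Robert's note above).
    static final Set<String> DOESNT_SUPPORT_OFFSETS =
            Set.of("MockFixedIntBlock", "MockVariableIntBlock", "MockSep", "MockRandom");

    // Mirrors the guard from the email: only proceed when the randomly
    // chosen postings format can index offsets.
    static boolean supportsOffsets(String postingsFormat) {
        return !DOESNT_SUPPORT_OFFSETS.contains(postingsFormat);
    }

    public static void main(String[] args) {
        // A test would call assumeTrue(supportsOffsets(pf)) so the run is
        // skipped, not failed, when the codec can't index offsets.
        System.out.println(supportsOffsets("Lucene41"));              // true
        System.out.println(supportsOffsets("MockVariableIntBlock"));  // false
    }
}
```

With assumeTrue, the seed is still reported, so a skipped combination can be reproduced by pinning the codec rather than suppressing it globally.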
[jira] [Commented] (SOLR-5881) Upgrade Zookeeper to 3.4.6
[ https://issues.apache.org/jira/browse/SOLR-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940582#comment-13940582 ] Mark Miller commented on SOLR-5881: --- That was likely unrelated fails caused by SOLR-5865. Things look good to me, commit away! Upgrade Zookeeper to 3.4.6 -- Key: SOLR-5881 URL: https://issues.apache.org/jira/browse/SOLR-5881 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shawn Heisey Assignee: Shawn Heisey Fix For: 4.8, 5.0 Attachments: SOLR-5881-testlog.txt, SOLR-5881.patch A mailing list user has run into ZOOKEEPER-1513. The release notes for 3.4.6 list a lot of fixes since 3.4.5. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5878) Incorrect number of rows returned in distributed search with group.format=simple
[ https://issues.apache.org/jira/browse/SOLR-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-5878: - Description: The original description (left in below) is something of a red herring. The URL has rows=5 and group.format=simple, yet a bunch more rows are returned. This doesn't seem right given the Wiki description of format=simple, either the code is a problem or the Wiki needs updating. Original description: Solr returns duplicate documents when group.format=simple is supplied on a distributed search. This does not happen on the standard group format or when not using distributed search. For example: {code} http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000 {code} Returns: {code} <?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">253</int>
  </lst>
  <lst name="grouped">
    <lst name="cont_stub">
      <int name="matches">56</int>
      <result name="doclist" numFound="56" start="0" maxScore="1.0">
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      </result>
    </lst>
  </lst>
</response>
{code} It should only return 5 documents. Removing the distributed search and searching on either core will return the requested number of rows. Removing group.format=simple will also return the requested number of rows. was: Solr returns duplicate documents when group.format=simple is supplied on a distributed search. This does not happen on the standard group format or when not using distributed search. For example: {code} http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000 {code} Returns: {code} <?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">253</int>
  </lst>
  <lst name="grouped">
    <lst name="cont_stub">
      <int name="matches">56</int>
      <result name="doclist" numFound="56" start="0" maxScore="1.0">
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
        <doc><str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      </result>
    </lst>
  </lst>
</response>
{code} It should only return 5 documents. Removing the distributed search and searching on either core will
[jira] [Updated] (LUCENE-5476) Facet sampling
[ https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-5476: --- Attachment: LUCENE-5476.patch I reviewed the patch more closely before I commit: * Modified few javadocs * Removed needsSampling() since we don't offer an extension any access to e.g. totalHits. We can add it when there's demand. * Fixed a bug in how carryOver was implemented -- replaced by two members {{leftoverBin}} and {{leftoverIndex}}. So now if {{leftoverBin != -1}} we know to skip the first such documents in the next segment and depending on {{leftoverIndex}}, whether we need to sample any of them. Before that, we didn't really skip over the leftover docs in the bin, but started to count a new bin. * Added a CHANGES entry. I reviewed the test - would be good if we can write a unit test which specifically matches only few documents in one segment compared to the rest. I will look into it perhaps later. I think it's ready, but if anyone wants to give createSample() another look, to make sure this time leftover works well, I won't commit it by tomorrow anyway. Facet sampling -- Key: LUCENE-5476 URL: https://issues.apache.org/jira/browse/LUCENE-5476 Project: Lucene - Core Issue Type: Improvement Reporter: Rob Audenaerde Attachments: LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, SamplingComparison_SamplingFacetsCollector.java, SamplingFacetsCollector.java With LUCENE-5339 facet sampling disappeared. When trying to display facet counts on large datasets (10M documents) counting facets is rather expensive, as all the hits are collected and processed. Sampling greatly reduced this and thus provided a nice speedup. Could it be brought back? 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
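The leftover-bin fix Shai describes is easier to see with a toy model. This sketch is not the patch's code; it only illustrates the idea that the in-bin sampling position must be carried across segment boundaries rather than reset per segment (the bug the patch fixes). The bin sizes and doc ids are invented:

```java
import java.util.ArrayList;
import java.util.List;

public class BinSampling {
    // Sample the first document of every bin of binSize hits, carrying
    // the in-bin position across segments instead of starting a fresh
    // bin at each segment boundary.
    static List<Integer> sample(List<int[]> segments, int binSize) {
        List<Integer> sampled = new ArrayList<>();
        int posInBin = 0; // the "leftover" state carried between segments
        for (int[] segment : segments) {
            for (int doc : segment) {
                if (posInBin == 0) sampled.add(doc);
                posInBin = (posInBin + 1) % binSize;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        // Two segments of 3 hits each, bin size 4: the second bin starts
        // inside segment 1 and ends inside segment 2.
        List<int[]> segments = List.of(new int[]{0, 1, 2}, new int[]{100, 101, 102});
        System.out.println(sample(segments, 4)); // [0, 101]
    }
}
```

Without the carried state, the second segment would restart a bin and sample doc 100, over-sampling small trailing segments, which is the behavior the leftoverBin/leftoverIndex members are meant to prevent.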
[jira] [Commented] (SOLR-5881) Upgrade Zookeeper to 3.4.6
[ https://issues.apache.org/jira/browse/SOLR-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940618#comment-13940618 ] ASF subversion and git services commented on SOLR-5881: --- Commit 1579275 from [~elyograg] in branch 'dev/trunk' [ https://svn.apache.org/r1579275 ] SOLR-5881: Upgrade zookeeper to 3.4.6. Upgrade Zookeeper to 3.4.6 -- Key: SOLR-5881 URL: https://issues.apache.org/jira/browse/SOLR-5881 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shawn Heisey Assignee: Shawn Heisey Fix For: 4.8, 5.0 Attachments: SOLR-5881-testlog.txt, SOLR-5881.patch A mailing list user has run into ZOOKEEPER-1513. The release notes for 3.4.6 list a lot of fixes since 3.4.5. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5538) FastVectorHighlighter fails with booleans of phrases
[ https://issues.apache.org/jira/browse/LUCENE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5538. - Resolution: Fixed Fix Version/s: 4.7.1 5.0 4.8 FastVectorHighlighter fails with booleans of phrases Key: LUCENE-5538 URL: https://issues.apache.org/jira/browse/LUCENE-5538 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Robert Muir Fix For: 4.8, 5.0, 4.7.1 Attachments: LUCENE-5538.patch, LUCENE-5538.patch, LUCENE-5538_test.patch in some situations a query of (P1 OR P2) returns no results, even though individually, both P1 or P2 by themselves will highlight correctly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_51) - Build # 9741 - Still Failing!
I see the bug and can reproduce it. The problem is there is some thai text, and it runs through KeywordRepeatFilter first (adding a synonym of itself to every token). Like this: ab -> [ab, ab] Then thaiwordfilter comes along and splits both these tokens: [a0, b0, a1, b1]. This makes offsets go backwards. When shinglefilter jumps in, then b0 and a1 are shingled, the offsets are senseless because endOffset < startOffset. I don't think we should hack around this: This thai filter is really a tokenizer and should not be a tokenfilter. There is an issue for that, I will take it. On Wed, Mar 19, 2014 at 11:31 AM, Robert Muir rm...@apache.org wrote: this fail is just because i increased this test to try harder (it takes multiplier into account etc) thai + shingles looks suspicious. I'll take a look in a bit. On Wed, Mar 19, 2014 at 11:26 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9741/ Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord 1 tests failed.
REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 Stack Trace: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 at __randomizedtesting.SeedInfo.seed([B5B16FF77330D152:885046963422CC92]:0) at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl.setOffset(OffsetAttributeImpl.java:45) at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:78) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at
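A stripped-down model of the offset arithmetic in Robert's explanation shows exactly where startOffset=2,endOffset=1 comes from. This is illustrative stdlib code, not the Lucene filters; the token texts and offsets are invented:

```java
public class BackwardsOffsets {
    record Token(String text, int start, int end) {}

    // A shingle's offsets span from its first token's start to its
    // last token's end.
    static int[] shingleOffsets(Token first, Token last) {
        return new int[] { first.start(), last.end() };
    }

    public static void main(String[] args) {
        // KeywordRepeatFilter duplicates the token, then the word filter
        // splits BOTH copies of "abc"[0,3], so the stream walks the same
        // offsets twice: a[0,1] b[1,2] c[2,3] a[0,1] b[1,2] c[2,3].
        Token lastOfFirstCopy = new Token("c", 2, 3);
        Token firstOfRepeat   = new Token("a", 0, 1);
        int[] off = shingleOffsets(lastOfFirstCopy, firstOfRepeat);
        // Shingling across the copy boundary yields startOffset=2, endOffset=1,
        // triggering the IllegalArgumentException in the stack trace above.
        System.out.println("startOffset=" + off[0] + ",endOffset=" + off[1]);
    }
}
```

This is why the fix belongs in the Thai analysis component (a tokenizer should not masquerade as a token filter) rather than in ShingleFilter.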
[jira] [Commented] (SOLR-2412) Multipath hierarchical faceting
[ https://issues.apache.org/jira/browse/SOLR-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940626#comment-13940626 ] J.L. Hill commented on SOLR-2412: - ant run-example fails for me using solr-4.5.1-src.tgz patched with 29/Oct/13 SOLR-2412.patch It fails with: /usr/local/src/solr/solr-4.5.1/solr/build.xml:373: The following error occurred while executing this line: /usr/local/src/solr/solr-4.5.1/solr/common-build.xml:425: The following error occurred while executing this line: Target jar-exposed does not exist in the project solr-exposed. It is used from target module-jars-to-solr. The error is perhaps mine, but the test instructions seemed rather simple. The patch applied with no warnings. If I have made an error in posting here, my apologies; this is my first post. Multipath hierarchical faceting --- Key: SOLR-2412 URL: https://issues.apache.org/jira/browse/SOLR-2412 Project: Solr Issue Type: New Feature Components: SearchComponents - other Affects Versions: 4.0 Environment: Fast IO when huge hierarchies are used Reporter: Toke Eskildsen Labels: contrib, patch Attachments: SOLR-2412.patch, SOLR-2412.patch, SOLR-2412.patch, SOLR-2412.patch, SOLR-2412.patch, SOLR-2412.patch Hierarchical faceting with slow startup, low memory overhead and fast response. Distinguishing features as compared to SOLR-64 and SOLR-792 are * Multiple paths per document * Query-time analysis of the facet-field; no special requirements for indexing besides retaining separator characters in the terms used for faceting * Optional custom sorting of tag values * Recursive counting of references to tags at all levels of the output This is a shell around LUCENE-2369, making it work with the Solr API. The underlying principle is to reference terms by their ordinals and create an index wide documents to tags map, augmented with a compressed representation of hierarchical levels. 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5749) Implement an Overseer status API
[ https://issues.apache.org/jira/browse/SOLR-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-5749: Attachment: SOLR-5749.patch Added a very basic test. Implement an Overseer status API Key: SOLR-5749 URL: https://issues.apache.org/jira/browse/SOLR-5749 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 5.0 Attachments: SOLR-5749.patch, SOLR-5749.patch, SOLR-5749.patch, SOLR-5749.patch Right now there is little to no information exposed about the overseer from SolrCloud. I propose that we have an API for overseer status which can return: # Past N commands executed (grouped by command type) # Status (queue-size, current overseer leader node) # Overseer log -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5880) org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper wi
[ https://issues.apache.org/jira/browse/SOLR-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940677#comment-13940677 ] Shalin Shekhar Mangar commented on SOLR-5880: - Hi Mark, did you find the root cause of the failures? org.apache.solr.client.solrj.impl.CloudSolrServerTest is failing pretty much every time for a long time with an exception about not being able to connect to ZooKeeper within the timeout. -- Key: SOLR-5880 URL: https://issues.apache.org/jira/browse/SOLR-5880 Project: Solr Issue Type: Test Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 This test is failing consistently, though currently only on Policeman Jenkins servers. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5882) Support scoreMode parameter for BlockJoinParentQParser
Andrey Kudryavtsev created SOLR-5882: Summary: Support scoreMode parameter for BlockJoinParentQParser Key: SOLR-5882 URL: https://issues.apache.org/jira/browse/SOLR-5882 Project: Solr Issue Type: New Feature Affects Versions: 4.8 Reporter: Andrey Kudryavtsev Today BlockJoinParentQParser creates queries with hardcoded _scoring mode_ None:
{code:borderStyle=solid}
protected Query createQuery(Query parentList, Query query) {
  return new ToParentBlockJoinQuery(query, getFilter(parentList), ScoreMode.None);
}
{code}
Analogously, BlockJoinChildQParser creates queries with hardcoded _doScores_ false:
{code:borderStyle=solid}
protected Query createQuery(Query parentListQuery, Query query) {
  return new ToChildBlockJoinQuery(query, getFilter(parentListQuery), false);
}
{code}
I propose to have the ability to configure these scoring options via query syntax. Syntax for parent queries can be like:
{code:borderStyle=solid}
{!parent which=type:parent scoreMode=None|Avg|Max|Total}
{code}
For child query:
{code:borderStyle=solid}
{!child of=type:parent doScores=true|false}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
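A minimal sketch of the proposed parsing step, assuming the parser would map the scoreMode local param onto Lucene's ScoreMode enum. The enum values and the default of None come from the description above; the helper name and the stand-in enum are hypothetical:

```java
public class ScoreModeParam {
    // Stand-in for org.apache.lucene.search.join.ScoreMode.
    enum ScoreMode { None, Avg, Max, Total }

    // Hypothetical helper: parse {!parent ... scoreMode=...}, defaulting
    // to None so existing queries keep today's hardcoded behavior.
    static ScoreMode parseScoreMode(String localParam) {
        return localParam == null ? ScoreMode.None : ScoreMode.valueOf(localParam);
    }

    public static void main(String[] args) {
        System.out.println(parseScoreMode(null));   // None
        System.out.println(parseScoreMode("Max"));  // Max
    }
}
```

Defaulting to None keeps the change backward compatible; an unrecognized value would fail fast in valueOf, which is reasonable for a user-supplied local param.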
[jira] [Updated] (SOLR-5882) Support scoreMode parameter for BlockJoinParentQParser
[ https://issues.apache.org/jira/browse/SOLR-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Kudryavtsev updated SOLR-5882: - Attachment: SOLR-5882.patch Initial patch Support scoreMode parameter for BlockJoinParentQParser -- Key: SOLR-5882 URL: https://issues.apache.org/jira/browse/SOLR-5882 Project: Solr Issue Type: New Feature Affects Versions: 4.8 Reporter: Andrey Kudryavtsev Attachments: SOLR-5882.patch Today BlockJoinParentQParser creates queries with hardcoded _scoring mode_ None:
{code:borderStyle=solid}
protected Query createQuery(Query parentList, Query query) {
  return new ToParentBlockJoinQuery(query, getFilter(parentList), ScoreMode.None);
}
{code}
Analogously, BlockJoinChildQParser creates queries with hardcoded _doScores_ false:
{code:borderStyle=solid}
protected Query createQuery(Query parentListQuery, Query query) {
  return new ToChildBlockJoinQuery(query, getFilter(parentListQuery), false);
}
{code}
I propose to have the ability to configure these scoring options via query syntax. Syntax for parent queries can be like:
{code:borderStyle=solid}
{!parent which=type:parent scoreMode=None|Avg|Max|Total}
{code}
For child query:
{code:borderStyle=solid}
{!child of=type:parent doScores=true|false}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5782) The full MapReduceIndexer help text does not display when using --help.
[ https://issues.apache.org/jira/browse/SOLR-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940734#comment-13940734 ] ASF subversion and git services commented on SOLR-5782: --- Commit 1579307 from [~markrmil...@gmail.com] in branch 'dev/branches/lucene_solr_4_7' [ https://svn.apache.org/r1579307 ] SOLR-5782: The full MapReduceIndexer help text does not display when using --help. The full MapReduceIndexer help text does not display when using --help. --- Key: SOLR-5782 URL: https://issues.apache.org/jira/browse/SOLR-5782 Project: Solr Issue Type: Bug Components: contrib - MapReduce Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5782) The full MapReduceIndexer help text does not display when using --help.
[ https://issues.apache.org/jira/browse/SOLR-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940735#comment-13940735 ] ASF subversion and git services commented on SOLR-5782: --- Commit 1579308 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579308 ] SOLR-5782: Move CHANGES entry to 4.7.1 The full MapReduceIndexer help text does not display when using --help. --- Key: SOLR-5782 URL: https://issues.apache.org/jira/browse/SOLR-5782 Project: Solr Issue Type: Bug Components: contrib - MapReduce Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5782) The full MapReduceIndexer help text does not display when using --help.
[ https://issues.apache.org/jira/browse/SOLR-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940738#comment-13940738 ] ASF subversion and git services commented on SOLR-5782: --- Commit 1579311 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579311 ] SOLR-5782: Move CHANGES entry to 4.7.1 The full MapReduceIndexer help text does not display when using --help. --- Key: SOLR-5782 URL: https://issues.apache.org/jira/browse/SOLR-5782 Project: Solr Issue Type: Bug Components: contrib - MapReduce Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5782) The full MapReduceIndexer help text does not display when using --help.
[ https://issues.apache.org/jira/browse/SOLR-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5782. --- Resolution: Fixed The full MapReduceIndexer help text does not display when using --help. --- Key: SOLR-5782 URL: https://issues.apache.org/jira/browse/SOLR-5782 Project: Solr Issue Type: Bug Components: contrib - MapReduce Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5824) Merge up Solr MapReduce contrib code to latest external changes.
[ https://issues.apache.org/jira/browse/SOLR-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940744#comment-13940744 ] Mark Miller commented on SOLR-5824: --- I looked through the tests and didn't find any pertinent changes a while back. I'm going to commit these (generally minor bug fixes) so that I can get them in 4.7.1. I'll create a new issue to check the dependencies. Merge up Solr MapReduce contrib code to latest external changes. Key: SOLR-5824 URL: https://issues.apache.org/jira/browse/SOLR-5824 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0 Attachments: SOLR-5824.patch There are a variety of changes in the mapreduce contrib code that have occurred while getting the initial stuff committed - they need to be merged in. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5824) Merge up Solr MapReduce contrib code to latest external changes.
[ https://issues.apache.org/jira/browse/SOLR-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5824: -- Fix Version/s: 4.7.1 Merge up Solr MapReduce contrib code to latest external changes. Key: SOLR-5824 URL: https://issues.apache.org/jira/browse/SOLR-5824 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 Attachments: SOLR-5824.patch There are a variety of changes in the mapreduce contrib code that have occurred while getting the initial stuff committed - they need to be merged in. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5881) Upgrade Zookeeper to 3.4.6
[ https://issues.apache.org/jira/browse/SOLR-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940743#comment-13940743 ] ASF subversion and git services commented on SOLR-5881: --- Commit 1579316 from [~elyograg] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579316 ] SOLR-5881: Upgrade zookeeper to 3.4.6 (merge trunk r1579275) Upgrade Zookeeper to 3.4.6 -- Key: SOLR-5881 URL: https://issues.apache.org/jira/browse/SOLR-5881 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shawn Heisey Assignee: Shawn Heisey Fix For: 4.8, 5.0 Attachments: SOLR-5881-testlog.txt, SOLR-5881.patch A mailing list user has run into ZOOKEEPER-1513. The release notes for 3.4.6 list a lot of fixes since 3.4.5. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940740#comment-13940740 ] Alexander S. commented on SOLR-4787: Any query fails, seems I am doing something wrong (perhaps the patch was applied incorrectly). I see this error: {quote} SolrCore Initialization Failures crm-dev: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.search.joins.HashSetJoinQParserPlugin' {quote} when trying to access the web interface. Join Contrib Key: SOLR-4787 URL: https://issues.apache.org/jira/browse/SOLR-4787 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.2.1 Reporter: Joel Bernstein Priority: Minor Fix For: 4.8 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above. *HashSetJoinQParserPlugin aka hjoin* The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin but the implementation differs in a couple of important ways. The first way is that the hjoin is designed to work with int and long join keys only. So, in order to use hjoin, int or long join keys must be included in both the to and from core. 
The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys. So, the hjoin will need more memory than the JoinQParserPlugin to perform the join. The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time. The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both lucene query and PostFilter implementations. A *cost* > 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading. For example *threads=6* will use 6 threads to build the filter. This will set up a fixed threadpool with six threads to handle all hjoin requests. Once the threadpool is created the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex then you can avoid hashset resizing, which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation because PostFilters are not cacheable in the filterCache. 
The syntax of the hjoin is similar to the JoinQParserPlugin except that the plugin is referenced by the string hjoin rather than join. fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq\}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1 applying the local fq parameter to filter the results. The lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query where the to field is present in the from list will be included in the results. The solrconfig.xml in the main query core must contain the reference to the hjoin. queryParser name=hjoin
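The core hjoin idea described above - collect the from-side join keys into a hash set and keep only main-query docs whose to-field value is in that set - can be sketched in plain Java. This is a simplified in-memory illustration with hypothetical names, not the contrib implementation:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HashJoinSketch {
    // Each doc is modeled as {docId, joinKey}. The expectedSize parameter
    // plays the role of the hjoin `size` local param: presizing the set
    // avoids rehashing while the from-side keys are collected.
    static List<long[]> join(long[] fromKeys, List<long[]> toDocs, int expectedSize) {
        Set<Long> keys = new HashSet<>(Math.max(16, expectedSize * 2));
        for (long k : fromKeys) {
            keys.add(k);
        }
        List<long[]> results = new ArrayList<>();
        for (long[] doc : toDocs) {
            if (keys.contains(doc[1])) { // to-field value present in from list
                results.add(doc);
            }
        }
        return results;
    }
}
```

This also shows why the feature is restricted to int/long keys: boxed numeric keys in a hash set stay compact and fast to probe, which is what lets the join scale to millions of keys.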
[jira] [Resolved] (SOLR-5881) Upgrade Zookeeper to 3.4.6
[ https://issues.apache.org/jira/browse/SOLR-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey resolved SOLR-5881. Resolution: Fixed Upgrade Zookeeper to 3.4.6 -- Key: SOLR-5881 URL: https://issues.apache.org/jira/browse/SOLR-5881 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shawn Heisey Assignee: Shawn Heisey Fix For: 4.8, 5.0 Attachments: SOLR-5881-testlog.txt, SOLR-5881.patch A mailing list user has run into ZOOKEEPER-1513. The release notes for 3.4.6 list a lot of fixes since 3.4.5. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_51) - Build # 9828 - Failure!
I'll disable SSL for that test for now. SSL in general has been hard to get working smoothly with tests unfortunately. I've got a JIRA issue to look at improving it, but not likely I'll look into it for some time, so until then, tests having issues with SSL should likely simply disable SSL for now. - Mark On Tue, Mar 18, 2014 at 4:54 AM, Dawid Weiss dawid.we...@cs.put.poznan.plwrote: It's a lot of error messages like this one. I have the full syserr dump if needed. D. 2773140 T6223 oasc.ChaosMonkeyNothingIsSafeTest$FullThrottleStopableIndexingThread$1.handleError WARN suss error java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:618) at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:522) at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:401) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:178) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445) at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106) at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) On Tue, Mar 18, 2014 at 9:46 AM, Uwe Schindler u...@thetaphi.de wrote: I did tail -1 to extract the last lines. The file is also in the archive at the same place. It is indeed a loop. The code loops endlessly in a Connection Refused loop, without any delay between the events. After approx. 2:50 hours this hit the limits of the SSD file system. This test has failed so often since it was fixed that we should revert to @BadApple. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss Sent: Tuesday, March 18, 2014 9:16 AM To: dev@lucene.apache.org Subject: Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_51) - Build # 9828 - Failure! junit4-J0-20140317_230107_233.events8.17 GB [fingerprint] view This build created an 8.17 GB events file and failed with out of space. How can this happen? Can you peek at it? It's probably something that logs in a loop or something. I'm fetching it right now, let's see if I can figure it out. D. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- - Mark http://about.me/markrmiller
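The failure Uwe describes - a connection-refused loop with no delay between attempts, flooding the logs until disk fills - is the classic case for a bounded retry with backoff. The sketch below is a generic illustration of that pattern under assumed names; it is not a patch to ConcurrentUpdateSolrServer:

```java
import java.util.function.Supplier;

public class RetryWithBackoff {
    // Retry `task` up to maxAttempts times, sleeping with exponential
    // backoff between failures, so a refused connection cannot spin in a
    // tight loop and flood the logs.
    static <T> T retry(Supplier<T> task, int maxAttempts, long baseDelayMs) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    try {
                        Thread.sleep(baseDelayMs << attempt); // 1x, 2x, 4x, ...
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw last;
                    }
                }
            }
        }
        throw last; // give up after maxAttempts failures
    }
}
```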
[jira] [Created] (SOLR-5883) Shutdown SolrServer on tests
Tomás Fernández Löbbe created SOLR-5883: --- Summary: Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Priority: Minor I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably not blocking or failing the test if open/close counts don’t match, but maybe outputting a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
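The open/close bookkeeping Tomás suggests could be as simple as a pair of counters checked at teardown, warning rather than failing on a mismatch. A sketch of the idea with hypothetical names, not an actual test-framework hook:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ServerTracker {
    private final AtomicInteger opened = new AtomicInteger();
    private final AtomicInteger closed = new AtomicInteger();

    void onCreate()   { opened.incrementAndGet(); }  // call when a server is created
    void onShutdown() { closed.incrementAndGet(); }  // call from shutdown()

    // Called from test teardown: warn to the logs on a mismatch instead of
    // failing the test, as the issue proposes.
    boolean check() {
        int o = opened.get(), c = closed.get();
        if (o != c) {
            System.err.println("WARN: " + (o - c) + " SolrServer instance(s) not shut down");
            return false;
        }
        return true;
    }
}
```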
[jira] [Commented] (SOLR-5824) Merge up Solr MapReduce contrib code to latest external changes.
[ https://issues.apache.org/jira/browse/SOLR-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940752#comment-13940752 ] ASF subversion and git services commented on SOLR-5824: --- Commit 1579318 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579318 ] SOLR-5824: Merge up Solr MapReduce contrib code to latest external changes. Includes a few minor bug fixes. Merge up Solr MapReduce contrib code to latest external changes. Key: SOLR-5824 URL: https://issues.apache.org/jira/browse/SOLR-5824 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 Attachments: SOLR-5824.patch There are a variety of changes in the mapreduce contrib code that have occurred while getting the initial stuff committed - they need to be merged in. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe updated SOLR-5883: Attachment: SOLR-5883.patch Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Priority: Minor Attachments: SOLR-5883.patch I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably, not blocking or falling the test if open/close doesn’t much, but maybe to output a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5824) Merge up Solr MapReduce contrib code to latest external changes.
[ https://issues.apache.org/jira/browse/SOLR-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940756#comment-13940756 ] ASF subversion and git services commented on SOLR-5824: --- Commit 1579322 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579322 ] SOLR-5824: Merge up Solr MapReduce contrib code to latest external changes. Includes a few minor bug fixes. Merge up Solr MapReduce contrib code to latest external changes. Key: SOLR-5824 URL: https://issues.apache.org/jira/browse/SOLR-5824 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.8, 5.0, 4.7.1 Attachments: SOLR-5824.patch There are a variety of changes in the mapreduce contrib code that have occurred while getting the initial stuff committed - they need to be merged in. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4984) Fix ThaiWordFilter
[ https://issues.apache.org/jira/browse/LUCENE-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4984: Attachment: LUCENE-4984.patch I cut this over to ThaiTokenizer with that base class restored from Kuromoji. The tokenizer itself is simpler now. I think we can use the same approach with SmartChinese. Fix ThaiWordFilter -- Key: LUCENE-4984 URL: https://issues.apache.org/jira/browse/LUCENE-4984 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Attachments: LUCENE-4984.patch, LUCENE-4984.patch ThaiWordFilter is an offender in TestRandomChains because it creates positions and updates offsets. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5883: -- Fix Version/s: 5.0 4.8 Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Assignee: Mark Miller Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5883.patch I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably, not blocking or falling the test if open/close doesn’t much, but maybe to output a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5883: - Assignee: Mark Miller Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Assignee: Mark Miller Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5883.patch I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably, not blocking or falling the test if open/close doesn’t much, but maybe to output a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5715) CloudSolrServer should choose URLs that match _route_
[ https://issues.apache.org/jira/browse/SOLR-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940767#comment-13940767 ] Chase Bradford commented on SOLR-5715: -- I don't see how this uses the route parameter. The list of slices that the LB server can query is still all active slices for the collection, and not just those that match the route. +DocCollection coll = clusterState.getCollection(collectionsList.iterator().next()); +Collection<Slice> filteredSlices = coll.getRouter().getSearchSlices(route, reqParams, coll); +ClientUtils.addSlices(slices, coll.getName(), filteredSlices, true); CloudSolrServer should choose URLs that match _route_ - Key: SOLR-5715 URL: https://issues.apache.org/jira/browse/SOLR-5715 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 4.6.1 Reporter: Chase Bradford Assignee: Noble Paul Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5715.patch When using CloudSolrServer to issue a request with a _route_ param, the URLs passed to LBHttpSolrServer should be filtered to include only hosts serving a slice. If there's a single shard listed, then the query can be served directly. Otherwise, the cluster services 3 /select requests for the query. As the host to replica ratio increases the probability of needing an extra hop goes to one, putting unnecessary strain on the cluster's network. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
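The missing step Chase describes - restricting the LB server's URL list to replicas of the slices returned for the route - amounts to a membership filter. A simplified illustration with plain collections and hypothetical names, not the CloudSolrServer internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RouteFilterSketch {
    // Keep only the replica URLs whose slice matched the _route_ param,
    // instead of handing every active slice's replicas to the LB server.
    static List<String> filterUrls(Map<String, List<String>> replicasBySlice,
                                   Set<String> matchingSlices) {
        List<String> urls = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : replicasBySlice.entrySet()) {
            if (matchingSlices.contains(entry.getKey())) {
                urls.addAll(entry.getValue());
            }
        }
        return urls;
    }
}
```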
[jira] [Commented] (LUCENE-5538) FastVectorHighlighter fails with booleans of phrases
[ https://issues.apache.org/jira/browse/LUCENE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940594#comment-13940594 ] ASF subversion and git services commented on LUCENE-5538: - Commit 1579269 from [~rcmuir] in branch 'dev/branches/lucene_solr_4_7' [ https://svn.apache.org/r1579269 ] LUCENE-5538: FastVectorHighlighter fails with booleans of phrases FastVectorHighlighter fails with booleans of phrases Key: LUCENE-5538 URL: https://issues.apache.org/jira/browse/LUCENE-5538 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Robert Muir Fix For: 4.8, 5.0, 4.7.1 Attachments: LUCENE-5538.patch, LUCENE-5538.patch, LUCENE-5538_test.patch in some situations a query of (P1 OR PQ) returns no results, even though individually, both P1 or P2 by themselves will highlight correctly.. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940787#comment-13940787 ] Mark Miller commented on SOLR-5872: --- bq. Is that dead in the water now? No. It's got its own issue, and it seems likely to happen to me. Even this issue is not dead in the water. Things are generally determined via discussion and consensus. I'm arguing that we should look at simple performance bottlenecks and improvements to the current system - there seems to be a lot of low hanging fruit. {noformat} Can you throw some light on how was the ZK schema for your initial impl? If all nodes of a given slice is under one zk directory , one watch on the parent should be fine, right? {noformat} It's been a long time and we had a few variations, so I'd have to go back in the code to refresh my memory. For now, from my memory: Initially I had it so that we simply watched the parent - Loggly ran into performance issues with this - even when only one entry changed, they had so many entries that, with so many nodes reading so many entries on each update, performance was a big problem for them. They hacked around it initially, and then we eventually moved to watching each entry - this made updating state for small changes very efficient. But then another big early user was still hitting performance issues simply from having to read so many entries on startup and such. This is what prompted the move to a single clusterstate.json. It's hard to remember it all perfectly - the info is spread across and around a lot of old JIRAs. None of the changes were taken lightly, and a variety of developers and contributors were generally involved in the discussion or motivating changes via their needs. There are tradeoffs with all of these approaches. 
Eliminate overseer queue - Key: SOLR-5872 URL: https://issues.apache.org/jira/browse/SOLR-5872 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul The overseer queue is one of the busiest points in the entire system. The raison d'être of the queue is * Provide batching of operations for the main clusterstate.json so that state updates are minimized * Avoid race conditions and ensure order Now, as we move the individual collection states out of the main clusterstate.json, the batching is not useful anymore. Race conditions can easily be solved by using a compare and set in Zookeeper. The proposed solution is: whenever an operation is required to be performed on the clusterstate, the same thread (and of course the same JVM) # reads the fresh state and version of the zk node # constructs the new state # performs a compare and set # if the compare and set fails, goes back to step 1 This should be limited to operations performed on external collections, because batching would still be required for the others -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
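The read/construct/compare-and-set loop proposed in the issue maps onto ZooKeeper's versioned setData. Below is a self-contained simulation of that retry loop against an in-memory versioned store; VersionedStore is a stand-in for the ZK client (the real call would be setData(path, data, expectedVersion)), not a real API:

```java
import java.util.function.UnaryOperator;

public class CasLoopSketch {
    // In-memory stand-in for a ZK node: data plus a version bumped on each
    // write. ZooKeeper's setData fails with BadVersionException when the
    // expected version is stale; here compareAndSet returns false instead.
    static class VersionedStore {
        private String data;
        private int version;
        VersionedStore(String data) { this.data = data; }
        synchronized int version() { return version; }
        synchronized String read() { return data; }
        synchronized boolean compareAndSet(int expectedVersion, String newData) {
            if (expectedVersion != version) return false; // someone else won the race
            data = newData;
            version++;
            return true;
        }
    }

    // The loop from the issue: (1) read fresh state and version, (2) construct
    // the new state, (3) compare and set, (4) on failure start over at (1).
    static void update(VersionedStore store, UnaryOperator<String> transform) {
        while (true) {
            int v;
            String current;
            synchronized (store) { // read state and version atomically
                v = store.version();
                current = store.read();
            }
            if (store.compareAndSet(v, transform.apply(current))) {
                return;
            }
        }
    }
}
```

Under contention a writer simply loses the race, re-reads, and retries, which is what lets the batching queue be dropped without reintroducing lost updates.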
[jira] [Commented] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940793#comment-13940793 ] ASF subversion and git services commented on SOLR-5883: --- Commit 1579336 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1579336 ] SOLR-5883: Many tests do not shutdown SolrServer. Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Assignee: Mark Miller Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5883.patch I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably, not blocking or falling the test if open/close doesn’t much, but maybe to output a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5883. --- Resolution: Fixed Thanks Tomás! Shutdown SolrServer on tests Key: SOLR-5883 URL: https://issues.apache.org/jira/browse/SOLR-5883 Project: Solr Issue Type: Bug Affects Versions: 4.7, 5.0 Reporter: Tomás Fernández Löbbe Assignee: Mark Miller Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5883.patch I noticed that many tests create multiple HttpSolrServer but never call the shutdown. I created a Jira for BasicDistributedZk2Test and BasicDistributedZkTest some time ago but didn’t check for all tests (SOLR-5684). I added the missing shutdowns that I found in other tests now. I’m also wondering if it makes sense to add some kind of check in the tests, in a similar way that open/close of IndexSearchers is checked (probably, not blocking or falling the test if open/close doesn’t much, but maybe to output a warning to the logs?) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5883) Shutdown SolrServer on tests
[ https://issues.apache.org/jira/browse/SOLR-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940795#comment-13940795 ] ASF subversion and git services commented on SOLR-5883: --- Commit 1579338 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579338 ] SOLR-5883: Many tests do not shutdown SolrServer.
[jira] [Commented] (SOLR-5865) Provide a MiniSolrCloudCluster to enable easier testing
[ https://issues.apache.org/jira/browse/SOLR-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940886#comment-13940886 ] Gregory Chanan commented on SOLR-5865: -- Did the test fail? Need me to look at anything, Mark? Provide a MiniSolrCloudCluster to enable easier testing --- Key: SOLR-5865 URL: https://issues.apache.org/jira/browse/SOLR-5865 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.7, 5.0 Reporter: Gregory Chanan Assignee: Mark Miller Attachments: SOLR-5865.patch, SOLR-5865.patch Today, the SolrCloud tests are based on the LuceneTestCase class hierarchy, which has a couple of issues around support for downstream projects: - It's difficult to test SolrCloud support in a downstream project that may have its own test framework. For example, some projects have support for different storage backends (e.g. Solr/ElasticSearch/HBase) and want tests against each of the different backends. This is difficult to do cleanly, because the Solr tests require derivation from LuceneTestCase, while the others don't. - The LuceneTestCase class hierarchy is really designed for internal Solr tests (e.g. it randomizes a lot of parameters to get test coverage, but a downstream project probably doesn't care about that). It's also quite complicated and dense, much more so than a downstream project would want. Given these reasons, it would be nice to provide a simple MiniSolrCloudCluster, similar to how HDFS provides a MiniHdfsCluster or HBase provides a MiniHBaseCluster.
[JENKINS] Lucene-trunk-Linux-java7-64-analyzers - Build # 105 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-java7-64-analyzers/105/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=15,endOffset=14 Stack Trace: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=15,endOffset=14 at __randomizedtesting.SeedInfo.seed([9FF114F0742BB467:A2103D913339A9A7]:0) at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl.setOffset(OffsetAttributeImpl.java:45) at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:78) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:904) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
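The failing invariant is the one OffsetAttributeImpl enforces: a token's start offset must be non-negative and its end offset must not precede its start. A minimal standalone illustration (not Lucene code) of why startOffset=15, endOffset=14 is rejected:

```java
// Standalone illustration of the offset invariant from the failure above.
public class OffsetCheckDemo {
    static void setOffset(int startOffset, int endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException(
                "startOffset must be non-negative, and endOffset must be >= startOffset, "
                + "startOffset=" + startOffset + ",endOffset=" + endOffset);
        }
    }

    public static void main(String[] args) {
        setOffset(0, 5);            // fine: non-negative, non-decreasing
        try {
            setOffset(15, 14);      // the case ShingleFilter produced
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```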
Re: [JENKINS] Lucene-trunk-Linux-java7-64-analyzers - Build # 105 - Failure!
This is a variant of the Thai one. I will fix it after Thai (and probably smartcn) is fixed; I am just waiting for a review of that one so I can reuse its logic. On Wed, Mar 19, 2014 at 3:19 PM, buil...@flonkings.com wrote: Build: builds.flonkings.com/job/Lucene-trunk-Linux-java7-64-analyzers/105/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=15,endOffset=14
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940689#comment-13940689 ] ASF subversion and git services commented on LUCENE-5376: - Commit 1579299 from [~mikemccand] in branch 'dev/branches/lucene5376_2' [ https://svn.apache.org/r1579299 ] LUCENE-5376: add missing cause for some exceptions Add a demo search server Key: LUCENE-5376 URL: https://issues.apache.org/jira/browse/LUCENE-5376 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: lucene-demo-server.tgz I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the eating-your-own-dog-food search app for Lucene's and Solr's jira issues, http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing and searching APIs via JSON, but it's very rough (lots of nocommits).
Re: [jira] [Commented] (SOLR-4787) Join Contrib
I'd guess you have an older jar in your classpath somewhere. Did you apply the patch to a fresh checkout of the code? Best, Erick On Wed, Mar 19, 2014 at 10:45 AM, Alexander S. (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940740#comment-13940740 ] Alexander S. commented on SOLR-4787: Any query fails, seems I am doing something wrong (perhaps the patch was applied incorrectly). I see this error: {quote} SolrCore Initialization Failures crm-dev: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.search.joins.HashSetJoinQParserPlugin' {quote} when trying to access the web interface. Join Contrib Key: SOLR-4787 URL: https://issues.apache.org/jira/browse/SOLR-4787 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.2.1 Reporter: Joel Bernstein Priority: Minor Fix For: 4.8 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above. *HashSetJoinQParserPlugin aka hjoin* The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin but the implementation differs in a couple of important ways. 
The first way is that the hjoin is designed to work with int and long join keys only. So, in order to use hjoin, int or long join keys must be included in both the to and from core. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys. So, the hjoin will need more memory than the JoinQParserPlugin to perform the join. The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time. The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both lucene query and PostFilter implementations. A *cost* > 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading. For example *threads=6* will use 6 threads to build the filter. This will set up a fixed threadpool with six threads to handle all hjoin requests. Once the threadpool is created the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex then you can avoid hashset resizing, which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins.
Only the queryResultCache comes into play with the PostFilter implementation because PostFilters are not cacheable in the filterCache. The syntax of the hjoin is similar to the JoinQParserPlugin except that the plugin is referenced by the string hjoin rather than join. fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq\}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1 applying the local fq parameter to filter the results. The lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query, where the to
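The hash-set join described above boils down to the following pattern. This is an illustrative standalone sketch, not Solr's actual hjoin code; the arrays stand in for the fromIndex results and the main-query join-key values.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the hjoin idea: gather the long join keys from the fromIndex side
// into a hash set, then keep only main-query values whose key is in the set.
public class HashJoinSketch {
    public static void main(String[] args) {
        long[] fromKeys = {3L, 7L, 42L};       // join keys matched in the fromIndex
        long[] mainKeys = {1L, 7L, 9L, 42L};   // join-key values from the main query
        // Presizing mirrors the *size* local parameter: it avoids hash-set resizing.
        Set<Long> keys = new HashSet<>(2 * fromKeys.length);
        for (long k : fromKeys) keys.add(k);
        for (long k : mainKeys) {
            if (keys.contains(k)) System.out.println(k);
        }
    }
}
```

Restricting keys to int/long keeps this structure compact, which is why the hjoin requires numeric join keys on both the to and from side.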
[jira] [Commented] (LUCENE-5130) fail the build on compilation warnings
[ https://issues.apache.org/jira/browse/LUCENE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940959#comment-13940959 ] Shawn Heisey commented on LUCENE-5130: -- I'm trying to coordinate an effort to clean up warnings in the IDEs that we support. Before putting serious effort into that, I am tackling Uwe's recommendation to clean up warnings in the actual build first. I think this may require its own issue rather than doing it on this issue, but I thought I would ask here first. fail the build on compilation warnings -- Key: LUCENE-5130 URL: https://issues.apache.org/jira/browse/LUCENE-5130 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Fix For: 4.8 Attachments: LUCENE-5130.patch, LUCENE-5130.patch Many modules compile w/o warnings ... we should lock this in and fail the build if warnings are ever added, and try to fix the warnings in existing modules.
[jira] [Comment Edited] (LUCENE-5130) fail the build on compilation warnings
[ https://issues.apache.org/jira/browse/LUCENE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940979#comment-13940979 ] Uwe Schindler edited comment on LUCENE-5130 at 3/19/14 9:01 PM: Hi, one important thing: We should *not* use something like +all warnings -some warnings. In that case we would prevent later Java versions with more possible warnings (covered by +all) from succeeding the build. Failing on warnings should be done with an explicit set of warnings given. This is just a warning about the warnings :-) was (Author: thetaphi): Hi, one important thing: We should i*not* use something like +all warning -some warnings. In that case we would prevent later java versions with more possible warnings (covered by +all) from suceeding the build. Failing on warnings should be done with an explicit set of warnings given. This is just a warning about the warnings :-)
[jira] [Commented] (LUCENE-5130) fail the build on compilation warnings
[ https://issues.apache.org/jira/browse/LUCENE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940979#comment-13940979 ] Uwe Schindler commented on LUCENE-5130: --- Hi, one important thing: We should *not* use something like +all warnings -some warnings. In that case we would prevent later Java versions with more possible warnings (covered by +all) from succeeding the build. Failing on warnings should be done with an explicit set of warnings given. This is just a warning about the warnings :-)
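An explicit lint set, as suggested above, might look like the following Ant fragment. This is illustrative only — the directory names and the particular warning categories are examples, not the project's actual build configuration; the point is that naming categories explicitly (instead of -Xlint:all) keeps a future JDK's new warning categories from breaking the build.

```
<!-- Illustrative Ant fragment: fail on an explicit set of javac warnings. -->
<javac srcdir="src" destdir="build" includeantruntime="false">
  <compilerarg value="-Xlint:cast,deprecation,fallthrough,unchecked"/>
  <compilerarg value="-Werror"/>
</javac>
```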
[jira] [Created] (LUCENE-5540) Update forbidden-apis and other tools
Uwe Schindler created LUCENE-5540: - Summary: Update forbidden-apis and other tools Key: LUCENE-5540 URL: https://issues.apache.org/jira/browse/LUCENE-5540 Project: Lucene - Core Issue Type: Task Reporter: Uwe Schindler Fix For: 4.8, 5.0 forbidden-apis 1.4.1 was released (with support for final Java 8). It also contains some additional unsafe signatures (added by [~rcmuir]) that we need to fix. This issue also updates some other tool versions: groovy, pegdown.
[jira] [Assigned] (LUCENE-5130) fail the build on compilation warnings
[ https://issues.apache.org/jira/browse/LUCENE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-5130: - Assignee: Uwe Schindler
[jira] [Updated] (LUCENE-5130) fail the build on compilation warnings
[ https://issues.apache.org/jira/browse/LUCENE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5130: -- Assignee: (was: Uwe Schindler)
[jira] [Assigned] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-5540: - Assignee: Uwe Schindler
[jira] [Updated] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5540: -- Attachment: LUCENE-5540.patch Patch. There is one crazy violation found: in solr-dataimporthandler there is a crazy way to look up a locale from a string (the locale is found by looping through all available locales and comparing display names!). This should be fixed separately.
[jira] [Commented] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940994#comment-13940994 ] ASF subversion and git services commented on LUCENE-5540: - Commit 1579399 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1579399 ] LUCENE-5540: Update forbidden-apis, pegdown, groovy
[jira] [Commented] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941002#comment-13941002 ] ASF subversion and git services commented on LUCENE-5540: - Commit 1579402 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1579402 ] Merged revision(s) 1579399 from lucene/dev/trunk: LUCENE-5540: Update forbidden-apis, pegdown, groovy
[jira] [Updated] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5540: -- Fix Version/s: 4.7.1
[jira] [Created] (SOLR-5884) UnloadDistributedZkTest is slower than it should be.
Mark Miller created SOLR-5884: - Summary: UnloadDistributedZkTest is slower than it should be. Key: SOLR-5884 URL: https://issues.apache.org/jira/browse/SOLR-5884 Project: Solr Issue Type: Test Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.8, 5.0 This test ends up waiting a long time to cancel a recovery because the prep recovery command is stuck while the remote node waits to time out on a state it will never see.
[jira] [Resolved] (LUCENE-5540) Update forbidden-apis and other tools
[ https://issues.apache.org/jira/browse/LUCENE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-5540. --- Resolution: Fixed
[jira] [Created] (SOLR-5885) solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SimplePropertiesWriter.java has crazy locale lookup
Uwe Schindler created SOLR-5885: --- Summary: solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SimplePropertiesWriter.java has crazy locale lookup Key: SOLR-5885 URL: https://issues.apache.org/jira/browse/SOLR-5885 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.7 Reporter: Uwe Schindler Fix For: 4.8, 5.0 SimplePropertiesWriter uses the following code to convert a string to a java.util.Locale:
{code:java}
if (params.get(LOCALE) != null) {
  String localeStr = params.get(LOCALE);
  for (Locale l : Locale.getAvailableLocales()) {
    if (localeStr.equals(l.getDisplayName(Locale.ROOT))) {
      locale = l;
      break;
    }
  }
  if (locale == null) {
    throw new DataImportHandlerException(SEVERE, "Unsupported locale for PropertWriter: " + localeStr);
  }
} else {
  locale = Locale.ROOT;
}
{code}
This makes no sense to me. Before I fixed that in LUCENE-5540, it was using the localized display name of the locale for lookup. As we are on Java 7 in trunk and 4.x, we can use the new API {{Locale#forLanguageTag(String languageTag)}}. This one does not depend on the current locale, does not use display names, and follows standards (IETF BCP 47).
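The suggested replacement, Locale.forLanguageTag, needs no scan over the available locales. A minimal standalone demo — the input string here is an example, not the value DIH would actually receive from its config:

```java
import java.util.Locale;

public class LocaleLookupDemo {
    public static void main(String[] args) {
        // Example input; in SimplePropertiesWriter the string would come
        // from the "locale" parameter. BCP 47 lookup is locale-independent.
        String localeStr = "de-DE";
        Locale locale = (localeStr != null) ? Locale.forLanguageTag(localeStr) : Locale.ROOT;
        System.out.println(locale.getLanguage() + "/" + locale.getCountry());
    }
}
```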