[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084374#comment-14084374 ] Shai Erera commented on SOLR-6315: -- OK, so I think if the jdocs said exactly (and only) what you wrote above, that would be fine. It would definitely remove the confusion around this class, since, as you state, it's only used for JSON formatting. I would even call it a MapNamedList or JSONMapNamedList, to make it more clear.

I still don't like it that whoever creates a NamedList has any say about, or needs to be aware of, how it will eventually be output. I don't understand why the different RequestHandlers decide to create a SimpleOrderedMap and not a NamedList -- why do they care how the Java class will be written as a response? It should be the job of the response writer, and then e.g. users who care about different output formats should worry about it, not the developer who codes the RequestHandler.

So maybe we could decouple NamedList and SimpleOrderedMap into two separate implementations (with the same interface, though), such that the RequestHandler's decision is more intelligent and intentional: SimpleOrderedMap would use a Map<String,Object> internally and forbid null keys or multi-valued keys, while NamedList would allow anything. Then, whoever initializes it must decide in advance if it's going to need null or multi-valued keys; if it doesn't, it uses the simple implementation, otherwise it uses the more generic NamedList. And JSONResponseWriter deciding to write a SimpleOrderedMap as a JSON map would make sense. Defaulting to flat output for NamedList would also make sense, since by definition it allows null keys, multi-valued keys etc.

Is that something you think we could explore? Do you think that if I did that, existing handlers would break (because today SimpleOrderedMap does not limit you like that)? I don't mind doing this work, but would like to get your feedback first.
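To make the proposal concrete, here is a minimal sketch of what the decoupling could look like. All names and shapes here are hypothetical illustrations, not Solr's actual API: both variants share one interface, the map-backed one forbids null and repeated keys so a JSON writer could safely emit it as a map, and the list-backed one allows anything and would get the flat representation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical shared interface: handlers pick an implementation intentionally.
interface NamedValues {
    void add(String name, Object value);
    Object get(String name);
}

// Map-backed variant: rejects null and duplicate keys, so it is safe
// for a response writer to emit it as a JSON map.
class MapBackedNamedValues implements NamedValues {
    private final Map<String, Object> map = new LinkedHashMap<>(); // keeps insertion order
    public void add(String name, Object value) {
        if (name == null) throw new IllegalArgumentException("null key not allowed");
        if (map.containsKey(name)) throw new IllegalArgumentException("duplicate key: " + name);
        map.put(name, value);
    }
    public Object get(String name) { return map.get(name); }
}

// List-backed variant: null and repeated keys are allowed, so a writer
// must fall back to a flat (name, value, name, value, ...) representation.
class ListBackedNamedValues implements NamedValues {
    private final List<String> names = new ArrayList<>();
    private final List<Object> values = new ArrayList<>();
    public void add(String name, Object value) { names.add(name); values.add(value); }
    public Object get(String name) { // first match wins
        for (int i = 0; i < names.size(); i++) {
            if (Objects.equals(name, names.get(i))) return values.get(i);
        }
        return null;
    }
}
```

Under this split, a handler that knows it never needs null or repeated keys would construct the map-backed variant, and the response writer's choice of output shape would follow directly from the type rather than from a convention.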
Remove SimpleOrderedMap --- Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084384#comment-14084384 ] Noble Paul commented on SOLR-5810: -- There are multiple parts to this ticket. # Get all collections in the cloud panel # Display collections in pages rather than one large single page # A way to 'search' for a particular collection #1 is a must for SOLR-5473. I have tested the other two and they are working. Just not sure if the latest patch is attached to this jira.

State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud - graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online.
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084386#comment-14084386 ] Noble Paul commented on SOLR-5473: -- bq. I think if we are transitioning to this, on 5x at least, it should default to true. Yes, absolutely. bq. I don't really see the need to support a mixed install ... If I already have a cluster with branch_4x and I wish to move to a newer build, don't we need to support both formats? The next question is: if I have many collections already in my cluster, how/when should the split happen?

Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, 
ec2-50-16-38-73_solr.log As defined in the parent issue, store the state of each collection under the /collections/collectionname/state.json node and watch state changes selectively. https://reviews.apache.org/r/24220/
[jira] [Assigned] (LUCENE-5698) Evaluate Lucene classification on publicly available datasets
[ https://issues.apache.org/jira/browse/LUCENE-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili reassigned LUCENE-5698: --- Assignee: Tommaso Teofili Evaluate Lucene classification on publicly available datasets - Key: LUCENE-5698 URL: https://issues.apache.org/jira/browse/LUCENE-5698 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Assignee: Tommaso Teofili Attachments: 0803-test.patch The Lucene classification module needs some publicly available datasets to keep track of its development. It would be nice to have some fast generated test sets, and some bigger real-world datasets too.
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4789 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4789/ 1 tests failed. FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=17287, name=qtp1607451782-17287, state=RUNNABLE, group=TGRP-MultiThreadedOCPTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=17287, name=qtp1607451782-17287, state=RUNNABLE, group=TGRP-MultiThreadedOCPTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([EBA9F5A770D74A40]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:714) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 11273 lines...] 
[junit4] Suite: org.apache.solr.cloud.MultiThreadedOCPTest [junit4] 2 Creating dataDir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/solr/build/solr-core/test/J1/./temp/solr.cloud.MultiThreadedOCPTest-EBA9F5A770D74A40-001/init-core-data-001 [junit4] 2 1272769 T15606 oas.SolrTestCaseJ4.buildSSLConfig Randomized ssl (true) and clientAuth (true) [junit4] 2 1272770 T15606 oas.BaseDistributedSearchTestCase.initHostContext Setting hostContext system property: /_ssn/xx [junit4] 2 1272775 T15606 oas.SolrTestCaseJ4.setUp ###Starting testDistribSearch [junit4] 2 1272776 T15606 oasc.ZkTestServer.run STARTING ZK TEST SERVER [junit4] 1 client port:0.0.0.0/0.0.0.0:0 [junit4] 2 1272777 T15607 oasc.ZkTestServer$ZKServerMain.runFromConfig Starting server [junit4] 2 1272877 T15606 oasc.ZkTestServer.run start zk server on port:45003 [junit4] 2 1272879 T15606 oascc.ConnectionManager.waitForConnected Waiting for client to connect to ZooKeeper [junit4] 2 1272882 T15613 oascc.ConnectionManager.process Watcher org.apache.solr.common.cloud.ConnectionManager@3a593714 name:ZooKeeperConnection Watcher:127.0.0.1:45003 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None [junit4] 2 1272883 T15606 oascc.ConnectionManager.waitForConnected Client is connected to ZooKeeper [junit4] 2 1272883 T15606 oascc.SolrZkClient.makePath makePath: /solr [junit4] 2 1272887 T15606 oascc.ConnectionManager.waitForConnected Waiting for client to connect to ZooKeeper [junit4] 2 1272888 T15615 oascc.ConnectionManager.process Watcher org.apache.solr.common.cloud.ConnectionManager@6ae8a460 name:ZooKeeperConnection Watcher:127.0.0.1:45003/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None [junit4] 2 1272888 T15606 oascc.ConnectionManager.waitForConnected Client is connected to ZooKeeper [junit4] 2 1272889 T15606 oascc.SolrZkClient.makePath makePath: /collections/collection1 [junit4] 2 1272891 T15606 oascc.SolrZkClient.makePath 
makePath: /collections/collection1/shards [junit4] 2 1272893 T15606 oascc.SolrZkClient.makePath makePath: /collections/control_collection [junit4] 2 1272895 T15606 oascc.SolrZkClient.makePath makePath: /collections/control_collection/shards [junit4] 2 1272897 T15606 oasc.AbstractZkTestCase.putConfig put /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/solr/core/src/test-files/solr/collection1/conf/solrconfig-tlog.xml to /configs/conf1/solrconfig.xml [junit4] 2 1272897 T15606 oascc.SolrZkClient.makePath makePath: /configs/conf1/solrconfig.xml [junit4] 2 1272900 T15606 oasc.AbstractZkTestCase.putConfig put /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/solr/core/src/test-files/solr/collection1/conf/schema.xml to /configs/conf1/schema.xml [junit4] 2 1272901 T15606 oascc.SolrZkClient.makePath makePath: /configs/conf1/schema.xml [junit4] 2 1273003 T15606 oasc.AbstractZkTestCase.putConfig put /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/solr/core/src/test-files/solr/collection1/conf/solrconfig.snippet.randomindexconfig.xml to /configs/conf1/solrconfig.snippet.randomindexconfig.xml [junit4] 2 1273004 T15606 oascc.SolrZkClient.makePath makePath: /configs/conf1/solrconfig.snippet.randomindexconfig.xml [junit4] 2 1273006 T15606 oasc.AbstractZkTestCase.putConfig put
[JENKINS] Lucene-Solr-SmokeRelease-trunk - Build # 189 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-trunk/189/ No tests ran. Build Log: [...truncated 50755 lines...] prepare-release-no-sign: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease [copy] Copying 431 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/lucene [copy] Copying 230 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/solr [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7 [exec] NOTE: output encoding is US-ASCII [exec] [exec] Load release URL file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/... [exec] [exec] Test Lucene... [exec] test basics... [exec] get KEYS [exec] 0.1 MB in 0.01 sec (9.2 MB/sec) [exec] check changes HTML... [exec] download lucene-5.0.0-src.tgz... [exec] 27.3 MB in 0.04 sec (653.0 MB/sec) [exec] verify md5/sha1 digests [exec] download lucene-5.0.0.tgz... [exec] 61.0 MB in 0.15 sec (398.2 MB/sec) [exec] verify md5/sha1 digests [exec] download lucene-5.0.0.zip... [exec] 70.4 MB in 0.11 sec (640.1 MB/sec) [exec] verify md5/sha1 digests [exec] unpack lucene-5.0.0.tgz... [exec] verify JAR metadata/identity/no javax.* or java.* classes... [exec] test demo with 1.7... [exec] got 5549 hits for query lucene [exec] checkindex with 1.7... [exec] check Lucene's javadoc JAR [exec] unpack lucene-5.0.0.zip... [exec] verify JAR metadata/identity/no javax.* or java.* classes... [exec] test demo with 1.7... [exec] got 5549 hits for query lucene [exec] checkindex with 1.7... [exec] check Lucene's javadoc JAR [exec] unpack lucene-5.0.0-src.tgz... [exec] make sure no JARs/WARs in src dist... [exec] run ant validate [exec] run tests w/ Java 7 and testArgs='-Dtests.jettyConnector=Socket -Dtests.disableHdfs=true'... [exec] test demo with 1.7... [exec] got 243 hits for query lucene [exec] checkindex with 1.7... 
[exec] generate javadocs w/ Java 7... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [exec] [exec] Test Solr... [exec] test basics... [exec] get KEYS [exec] 0.1 MB in 0.02 sec (5.7 MB/sec) [exec] check changes HTML... [exec] download solr-5.0.0-src.tgz... [exec] 33.2 MB in 0.30 sec (109.5 MB/sec) [exec] verify md5/sha1 digests [exec] download solr-5.0.0.tgz... [exec] 118.3 MB in 0.52 sec (227.1 MB/sec) [exec] verify md5/sha1 digests [exec] download solr-5.0.0.zip... [exec] 124.2 MB in 0.78 sec (159.4 MB/sec) [exec] verify md5/sha1 digests [exec] unpack solr-5.0.0.tgz... [exec] verify JAR metadata/identity/no javax.* or java.* classes... [exec] unpack lucene-5.0.0.tgz... [exec] **WARNING**: skipping check of /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar: it has javax.* classes [exec] **WARNING**: skipping check of /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar: it has javax.* classes [exec] verify WAR metadata/contained JAR identity/no javax.* or java.* classes... [exec] unpack lucene-5.0.0.tgz... [exec] copying unpacked distribution for Java 7 ... [exec] test solr example w/ Java 7... [exec] start Solr instance (log=/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0-java7/solr-example.log)... [exec] startup done [exec] test utf8... [exec] index example docs... [exec] run query... [exec] stop server (SIGINT)... [exec] unpack solr-5.0.0.zip... [exec] verify JAR metadata/identity/no javax.* or java.* classes... [exec] unpack lucene-5.0.0.tgz... 
[exec] **WARNING**: skipping check of /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar: it has javax.* classes [exec] **WARNING**: skipping check of /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar: it has javax.* classes [exec] verify WAR metadata/contained JAR
RE: Lucene versioning logic
I think this proposal is what makes the most sense since this discussion started. As for making an instance not modifiable, the setVersion could check if it was already called and error out if it was. Then you could go from default to version 123, but at least you couldn't hop from version to version while the analyzer is live. This should mostly be handled by factories anyhow, so if factories explicitly call setVersion all the time before returning an instance, the instances wouldn't be modifiable. My 2 cents. :) Steve

From: Shai Erera [ser...@gmail.com] Sent: August 3, 2014 8:51 AM To: dev@lucene.apache.org Subject: Re: Lucene versioning logic

OK I see, it's the Tokenizers and Filters that usually change; the Analyzer only needs Version to determine which TokenStream chain to return, and so we achieve that w/ Ryan's proposal of setVersion(). I'd still feel better if Version was a final setting on an Analyzer, i.e. that a single Analyzer instance always behaves consistently, and cannot alternate behavior down-stream if someone called setVersion(). But this is a really stupid thing to do. Maybe setVersion() can return an Analyzer, so you're sure that instance is not modifiable. But maybe this is just over-engineering... I'm +0.5 to that. I prefer Version to be mandatory somehow (class name, ctor argument), but I can live with setVersion as well... Shai

On Sun, Aug 3, 2014 at 3:30 PM, Robert Muir rcm...@gmail.com wrote: No, I didn't say any of this, please read it again :) Old back-compat *Tokenizer/TokenFilter*s are named this way. Only the old ones. Just to be clear: their Factories still use Version to produce the right one (as we can't remove version from there, or we will have complaints from solr developers). So users who want the version-style back compat can just use the factories. really. On the other hand new users can do 'new LowerCaseFilter()' without the bullshit. For Analyzers, there is a setter. 
Users who want to use *OUR ANALYZERS* with back compat, call the setter. But it's not mandatory-in-your-face-ctor. I am +1 to Ryan's proposal, so please look for more elaboration there. I am -1 to putting Versions in the name of Analyzers.

On Sun, Aug 3, 2014 at 8:21 AM, Shai Erera ser...@gmail.com wrote: Oh, I misread this part I do think its ok to name ... -- replaced do with don't :). So you say that if we have a FooAnalyzer in 4.5 and change its behavior in 4.9, then we add a Foo45Analyzer as back-compat support, and FooAnalyzer in 4.9 keeps its name, but with different behavior? That means that an app that didn't read CHANGES will be broken upon upgrade, but if it does read CHANGES, it at least has a way to retain the desired behavior. So the thing now is whether FooAnalyzer is always _current_ and an app should choose a backwards version of it (if it wants to), vs if FooAnalyzer is _always the same_, and if you want to move forward you have to explicitly use a NewFooAnalyzer? Of course, when FooAnalyzer takes a Version, then an app only needs to change its Version CONSTANT to get the best behavior ... but as you point out, it seems like we failed to implement that approach in our code already, which suggests this approach is not intuitive to our committers, so why do we expect our users to understand it ... I am +1 on either of the approaches (both get rid of Version.java). I don't feel bad with asking users to read CHANGES before they upgrade, and it does mean that FooAnalyzer always gives you the best behavior, which is important for new users or if you always re-index. Vs the second approach, which always prefers backwards compatibility, and telling users to read the javadocs (and CHANGES) in order to find the best version of FooAnalyzer. 
There is another issue w/ a global Version CONSTANT, which today we encourage apps to use -- if you use two analyzers, but you want to work with a different Version of each (because of all sorts of reasons), having a global constant is bad. The explicit Foo45Analyzer (or Foo49Analyzer, whichever) lets you mix whichever versions that you want. Shai On Sun, Aug 3, 2014 at 3:02 PM, Robert Muir rcm...@gmail.com wrote: You don't read what i wrote. Read it again. On Sun, Aug 3, 2014 at 7:49 AM, Shai Erera ser...@gmail.com wrote: Yes, I agree that Foo49Analyzer is an odd name. Better if it was named FooAnalyzerWithNoApostrophe, and I'm fine if that Analyzer chose to name its different versions like that. But in the absence of better naming ideas, I proposed the Foo49Analyzer. If we already have such Analyzers, then we are in fact implementing that approach, only didn't make that decision globally. So whether it's odd or not, let's first agree if we are willing to have these analyzers in our code base (i.e. w/ the back-compat support). If we do, we can let each Analyzer decide on its naming. Analyzers aren't Codecs, I
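The "set once, then immutable" behavior discussed in this thread can be sketched in a few lines. This is illustrative only: MyAnalyzer and the plain int version are stand-ins, not Lucene's real Analyzer or Version API. The setter throws if invoked a second time, so a factory can fix the version before handing out the instance, and a live analyzer can no longer hop between versions.

```java
// Illustrative stand-in for the setVersion() discussion: a guard that
// lets the version be set exactly once, mirroring a SetOnce-style wrapper.
class MyAnalyzer {
    private int version = Integer.MAX_VALUE; // default: latest ("current") behavior
    private boolean versionSet = false;

    void setVersion(int version) {
        if (versionSet) {
            throw new IllegalStateException("version already set to " + this.version);
        }
        this.version = version;
        this.versionSet = true;
    }

    int getVersion() {
        return version;
    }
}
```

A factory would call setVersion(...) once before returning the instance; any later attempt to change it fails fast instead of silently altering behavior down-stream.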
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 597 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/597/ 4 tests failed. REGRESSION: org.apache.solr.cloud.OverseerTest.testOverseerFailure Error Message: Could not register as the leader because creating the ephemeral registration node in ZooKeeper failed Stack Trace: org.apache.solr.common.SolrException: Could not register as the leader because creating the ephemeral registration node in ZooKeeper failed at __randomizedtesting.SeedInfo.seed([C5A20DAE37201620:C1AA825D2585F901]:0) at org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:144) at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:155) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:314) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221) at org.apache.solr.cloud.OverseerTest$MockZKController.publishState(OverseerTest.java:155) at org.apache.solr.cloud.OverseerTest.testOverseerFailure(OverseerTest.java:660) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at
Re: Lucene versioning logic
Right, we can also use a SetOnce wrapper to restrict that. Shai

On Mon, Aug 4, 2014 at 4:05 PM, Steve Molloy smol...@opentext.com wrote: I think this proposal is what makes the most sense since this discussion started. As for making an instance not modifiable, the setVersion could check if it was already called and error out if it was. Then you could go from default to version 123, but at least you couldn't hop from version to version while the analyzer is live. This should mostly be handled by factories anyhow, so if factories explicitly call setVersion all the time before returning an instance, the instances wouldn't be modifiable. My 2 cents. :) Steve
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084692#comment-14084692 ] Michael McCandless commented on LUCENE-4396: Thanks Da, this looks like great progress. Just to sum things up a bit here: * Both BooleanArrayScorer and BooleanLinkedScorer (which are Scorers, not BulkScorers) can only be used when there's at least one MUST clause in the BooleanQuery. * BooleanArrayScorer grabs the next SIZE (256 now) hits from the MUST clauses, and then folds in the MUST_NOT and SHOULD. * BooleanLinkedScorer, like BooleanScorer, matches/scores in windows of 2048 docIDs at once, but it uses a bitSet (and also the linked list) to track filled bucket slots. * BooleanScorer now can also handle MUST clauses. It's nice that you're careful to do the math and double/float casting in the same order as BS2 so the scores match. It's a bit spooky that collectMore recurses on itself; in theory there's an adversary that could consume quite a bit of stack, right? Can we refactor that into the equivalent while loop (it's just tail recursion)? Unfortunately the logic for picking which scorer to use looks really complex; hopefully we can simplify it. Also, do we really need 3 scorer classes (BS, BAS, BLS) for the non-DAAT case? Ie, does each really provide a compelling situation where it's better than the others? It's not great adding so much complexity for performance gains on unusual (so many clauses) boolean queries... 
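The tail-recursion-to-loop refactor suggested above is mechanical. Here is a generic sketch (hypothetical class and method names, not the actual patch's code) of how a collectMore-style method that recurses once at its end becomes an equivalent while loop with constant stack usage:

```java
// Hypothetical window collector illustrating the refactor: a method whose
// only recursive call is the last thing it does (tail call) can be
// rewritten as a loop with identical behavior.
class WindowCollector {
    static final int SIZE = 256; // hits gathered per window, as in the discussion
    private final int totalHits;
    private int collected = 0;

    WindowCollector(int totalHits) { this.totalHits = totalHits; }

    // Stand-in for scoring one window; returns true while more hits remain.
    private boolean fillWindow() {
        collected += Math.min(SIZE, totalHits - collected);
        return collected < totalHits;
    }

    // Tail-recursive form: each window costs a stack frame, so an adversarial
    // result set can drive the stack arbitrarily deep.
    void collectMoreRecursive() {
        if (fillWindow()) collectMoreRecursive();
    }

    // Equivalent iterative form: same behavior, constant stack.
    void collectMoreIterative() {
        while (fillWindow()) { /* next window */ }
    }

    int collected() { return collected; }
}
```

Since the JVM does not eliminate tail calls, the loop form is the only way to get the constant-stack guarantee in Java.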
BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp, tasks.cpp Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there is one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, that BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, eg if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near term time to work on this so feel free to take it if you are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
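The ".advance() the SHOULD clauses when the MUST clause skips ahead" idea in the description can be sketched with a stand-in cursor API. `DocIt` and `ArrayIt` below are hypothetical stand-ins for illustration, not Lucene's DocIdSetIterator:

```java
// When the required (MUST) clause jumps far ahead, scanning the optional
// (SHOULD) clauses doc-by-doc is wasted work: docs lacking the MUST clause
// can never match, so the SHOULDs can leap directly to the MUST's docID.
class LeapFrog {
  interface DocIt { // stand-in for a doc-ID cursor
    int docID();
    int advance(int target); // move to the first doc >= target
  }

  static class ArrayIt implements DocIt {
    private final int[] docs;
    private int i = -1, doc = -1;
    ArrayIt(int... docs) { this.docs = docs; }
    public int docID() { return doc; }
    public int advance(int target) {
      while (++i < docs.length) {
        if (docs[i] >= target) return doc = docs[i];
      }
      return doc = Integer.MAX_VALUE; // exhausted
    }
  }

  // Advance the MUST clause, then skip every SHOULD clause straight to it.
  static int alignShoulds(DocIt must, DocIt[] shoulds, int target) {
    int doc = must.advance(target);
    for (DocIt s : shoulds) {
      if (s.docID() < doc) {
        s.advance(doc); // leap, don't scan
      }
    }
    return doc;
  }
}
```

In the sketch, if the MUST clause skips from doc 5 to doc 100, each SHOULD cursor jumps over its intervening candidates (which could not have matched anyway) in a single advance call.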
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1715 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1715/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseSerialGC 1 tests failed. REGRESSION: org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch Error Message: commitWithin did not work on node: http://127.0.0.1:55284/ra/y/collection1 expected:<68> but was:<67> Stack Trace: java.lang.AssertionError: commitWithin did not work on node: http://127.0.0.1:55284/ra/y/collection1 expected:<68> but was:<67> at __randomizedtesting.SeedInfo.seed([282191860CB32A6C:A9C71F9E7BEC4A50]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:356) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) 
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084746#comment-14084746 ] Timothy Potter commented on SOLR-5810: -- I need to create an updated patch out of our internal repo where we originally developed this solution. I can get to this later today. State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud -> graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online.
Performance issue with initialSize parameter for SolrCache
SolrCache implementations LRUCache, FastLRUCache and LFUCache have a parameter initialSize that is passed directly to the HashMap initialCapacity constructor argument. For LFUCache this is the code line: map = new LinkedHashMap<K,V>(initialSize, 0.75f, true). In solrconfig.xml I tried to set initialSize to size/0.75 + some_more, but it's impossible to set a value greater than size because of this line: final int initialSize = Math.min(str==null ? 1024 : Integer.parseInt(str), limit); For FastLRUCache there is no such limitation, but I still think the documentation and default values in solrconfig.xml should not use equal values for initialSize and size.
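For reference, the arithmetic behind the report: a HashMap with load factor f rehashes once its size exceeds capacity * f, so to hold `size` entries without ever rehashing, the initial capacity needs to be at least size/f plus a little slack. A small illustrative sketch -- `CacheSizing` and its methods are hypothetical helpers, not Solr's code:

```java
import java.util.LinkedHashMap;

// Sizing a LinkedHashMap so a cache holding `size` entries never rehashes:
// with load factor f, the map resizes when entries exceed capacity * f,
// so pick an initial capacity of at least size / f (+1 for slack).
class CacheSizing {
  static int initialCapacityFor(int size, float loadFactor) {
    return (int) Math.ceil(size / loadFactor) + 1;
  }

  static <K, V> LinkedHashMap<K, V> newLruMap(int size) {
    float f = 0.75f;
    // access-order LinkedHashMap, the structure LFUCache's code line uses
    return new LinkedHashMap<>(initialCapacityFor(size, f), f, true);
  }
}
```

With size=1024 this yields a capacity of 1367 rather than 1024, which is exactly the kind of value the clamp `Math.min(..., limit)` described above makes impossible to configure.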
[jira] [Commented] (LUCENE-5699) Lucene classification score calculation normalize and return lists
[ https://issues.apache.org/jira/browse/LUCENE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084775#comment-14084775 ] Gergő Törcsvári commented on LUCENE-5699: - So what are the normalized and normalizedList functions good for? First, why normalized? When I first tried to use Lucene classification, one of the bigger problems was that the scores that come back mean nothing. Basically the classifier returns the class, plus a number with no obvious meaning. If you have 2 texts and you push them through the classifier, the scores don't help you figure out which result is more trustworthy. Normalized values give you that option: if you want to tell the user how confident you are, the normalized values help you out. Second, why lists? Once you can tell the user how confident you are, it's a small step to also want to tell them what the other options are -- the 3 or 5 next most relevant classes. Most classification algorithms have those numbers available anyway. The problem with the normalization and the lists: sadly, not all classification algorithms produce lists; some just drop classes. So this can't go straight into the API, because some classification methods never have a list or a score. I have 2 API suggestions: either the Classifier interface gets the normalized and normalizedList functions, and some implementations throw an exception if somebody tries to use them; or the Classifier interface doesn't get them, but individual classifiers can provide these functions. Lucene classification score calculation normalize and return lists -- Key: LUCENE-5699 URL: https://issues.apache.org/jira/browse/LUCENE-5699 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Assignee: Tommaso Teofili Attachments: 06-06-5699.patch, 0730.patch, 0803-base.patch Now the classifiers can return only the best matching classes. 
If somebody wants to use them for more complex tasks, they need to modify these classes to get the second and third results too. If it is possible to return a list and it doesn't cost many resources, why don't we do that? (We iterate over a list anyway.) The Bayes classifier returned very small values, and there was a bug with the zero floats; it was fixed with logarithms. It would be nice to scale the sum of the class scores to one, so we could compare the returned score and relevance of two documents. (If we don't do this, the word count in the test documents affects the result score.) With bulletpoints: * In the Bayes classification, normalize score values and return result lists. * In the KNN classifier, the possibility to return a result list. * Make ClassificationResult Comparable for list sorting.
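The normalization being proposed is just scaling the per-class scores so they sum to 1 and returning them sorted best-first. A minimal sketch of the idea, assuming a simplified result type -- `ScoreNormalizer` and `ClassScore` are hypothetical, not Lucene's ClassificationResult API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Scale raw per-class scores so they sum to 1, then sort best-first.
// After this, the top score is directly comparable across two different
// input documents, which raw (word-count-sensitive) sums are not.
class ScoreNormalizer {
  static class ClassScore {
    final String clazz;
    final double score;
    ClassScore(String clazz, double score) { this.clazz = clazz; this.score = score; }
  }

  static List<ClassScore> normalized(List<ClassScore> raw) {
    double sum = 0;
    for (ClassScore c : raw) {
      sum += c.score;
    }
    List<ClassScore> out = new ArrayList<>();
    for (ClassScore c : raw) {
      // guard against an all-zero score vector
      out.add(new ClassScore(c.clazz, sum == 0 ? 0 : c.score / sum));
    }
    out.sort(Comparator.comparingDouble((ClassScore c) -> c.score).reversed());
    return out;
  }
}
```

Returning the whole sorted list also covers the "3 or 5 next most relevant classes" use case from the comment: callers just take a prefix of the list.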
[jira] [Updated] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2894: --- Attachment: SOLR-2894.patch After working through the fix to the refinement logic in PivotFacetField.queuePivotRefinementRequests, the previously failing seed for TestCloudPivotFacet started to pass, but some sort=index tests still weren't working, which led me to realize 2 things: * some of my tests were absurd -- i've gotten used to using overrequest=0 as a way to force refinement, but with facet.sort=index combined with limit (and offset) and mincount it meant that it was impossible for the sort=index facet logic to ever find the results we're looking for. We *have* to allow some overrequest when mincount > 1 or the initial shard requests won't find the values (that will ultimately have a cumulative mincount high enough) in order to even try refining them. * offset wasn't being added to the limit in the per-shard requests, so w/o overrequest enabled you would never get the values you needed even in ideal situations * the shard query logic in FacetComponent was ignoring overrequest when sort=index ... this seems broken to me, but from what i can tell, it comes straight from the existing facet.field logic as well. I'll open a bug to track the existing broken overrequest logic in facet.field -- even though i hope that once we're done with this issue, it may be fixed via refactoring and shared code with pivots (i'm not 100% certain: the FacetComponent diff is the bulk of what i still need to review more closely on this issue) There's still a failure in DistributedFacetPivotLargeTest (mismatch compared to control) when i tried using mincount=0 that i'm not certain if/how we can solve... {code} // :nocommit: broken honda? 
rsp = query( params( "q", "*:*", "rows", "0", "facet", "true", "facet.sort", "index", "f.place_s.facet.limit", "20", "f.place_s.facet.offset", "40", FacetParams.FACET_PIVOT_MINCOUNT, "0", "facet.pivot", "place_s,company_t") ); {code} From what I can tell, the gist of the issue is that when dealing with sub-fields of the pivot, the coordination code doesn't know about some of the 0 values if no shard which has the value for the parent field even knows about the existence of the term. The simplest example of this discrepancy (compared to single node pivots) is to consider an index with only 2 docs... {noformat} [{id:1,top_s:foo,sub_s:bar} {id:2,top_s:xxx,sub_s:yyy}] {noformat} If those two docs exist in a single node index, and you pivot on {{top_s,sub_s}} using mincount=0 you get a response like this... {noformat} $ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.pivot.mincount=0&facet.pivot=top_s,sub_s&omitHeader=true&wt=json&indent=true' { "response":{"numFound":2,"start":0,"docs":[] }, "facet_counts":{ "facet_queries":{}, "facet_fields":{}, "facet_dates":{}, "facet_ranges":{}, "facet_intervals":{}, "facet_pivot":{ "top_s,sub_s":[{ "field":"top_s", "value":"foo", "count":1, "pivot":[{ "field":"sub_s", "value":"bar", "count":1}, { "field":"sub_s", "value":"yyy", "count":0}]}, { "field":"top_s", "value":"xxx", "count":1, "pivot":[{ "field":"sub_s", "value":"yyy", "count":1}, { "field":"sub_s", "value":"bar", "count":0}]}]}}} {noformat} If however you index each of those docs on a separate shard, the response comes back like this... 
{noformat} $ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.pivot.mincount=0&facet.pivot=top_s,sub_s&omitHeader=true&wt=json&indent=true&shards=localhost:8881/solr,localhost:8882/solr' { "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[] }, "facet_counts":{ "facet_queries":{}, "facet_fields":{}, "facet_dates":{}, "facet_ranges":{}, "facet_intervals":{}, "facet_pivot":{ "top_s,sub_s":[{ "field":"top_s", "value":"foo", "count":1, "pivot":[{ "field":"sub_s", "value":"bar", "count":1}]}, { "field":"top_s", "value":"xxx", "count":1, "pivot":[{ "field":"sub_s", "value":"yyy", "count":1}]}]}}} {noformat} The only solution i can think of would be an extra (special to mincount=0) stage of logic, after each PivotFacetField is refined, that would: * iterate over all the values of the current pivot * build up a Set of all the known values for the child-pivots of those values * iterate over all the values again, merging in a 0-count child value for every value in the set ...ie: At least one shard knows about value 'v_x' in
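The three bullets above amount to a set-union pass followed by a fill-in pass. A simplified sketch with plain maps -- the Map<parent, Map<child, count>> shape is a stand-in for illustration, not the real PivotFacetField structure:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Extra mincount=0 merge stage: after refinement, collect every child value
// seen under any parent, then give each parent a 0-count entry for the
// child values it is missing.
class ZeroCountMerge {
  static void mergeZeroCounts(Map<String, Map<String, Integer>> pivot) {
    // pass 1: build the set of all known child values across all parents
    Set<String> allChildren = new TreeSet<>();
    for (Map<String, Integer> children : pivot.values()) {
      allChildren.addAll(children.keySet());
    }
    // pass 2: merge a 0-count child into every parent that is missing it
    for (Map<String, Integer> children : pivot.values()) {
      for (String child : allChildren) {
        children.putIfAbsent(child, 0);
      }
    }
  }
}
```

On the two-doc example above, this turns the distributed result {foo:{bar:1}, xxx:{yyy:1}} into {foo:{bar:1, yyy:0}, xxx:{bar:0, yyy:1}}, matching the single-node response.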
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084844#comment-14084844 ] Erick Erickson commented on SOLR-5810: -- I'm coming late to the party, but wanted to point out this umbrella JIRA: https://issues.apache.org/jira/browse/SOLR-6082. The current admin UI carries a bunch of historical design from a very long time ago. It seems to me that SolrCloud admin could be made much more user-friendly if we moved all the SolrCloud stuff to its own page or something. Having a select core dropdown as a menu choice on a page that displays the state of your SolrCloud is...wrong. Ideally, I'd like to see all the nodes in my cluster (whether they host collections or not). I'd like to ctrl-click on some number of them and be able to create a collection on the selected nodes. I'd like to be able to ctrl-click on a node and add a replica on that node to a collection. I'd like to... you get the idea. All without having to drop into the shell prompt and use command-line scripts, or type in a Collections API call. We have the infrastructure in place; much of this would be a UI for the Collections API. Note, I am _not_ advocating we delay these issues waiting for some grand new design. Mostly I'm wondering if there's enough interest in this kind of thing to start designing a SolrCloud admin interface. We can use SOLR-6082 as a basis for the discussion if so. State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud -> graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online. 
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084867#comment-14084867 ] Erick Erickson commented on SOLR-2894: -- [~hossman_luc...@fucit.org] I confess I'm barely skimming this (it's big, as you are more aware than I am!). But there were two recent JIRAs, SOLR-6300 and SOLR-6314 (facet mincount fails if distrib=true; multi-threaded facet count returns different results if shards > 1), that sure seem like they could be related. Does that seem plausible? I realize this is pivot faceting, but... So I'm thinking that if I can get repeatable test case failures for these two JIRAs, I should apply this patch and see if it fixes them. Thoughts? Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Assignee: Hoss Man Fix For: 4.9, 5.0 Attachments: SOLR-2894-mincount-minification.patch, SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894_cloud_test.patch, dateToObject.patch, pivot_mincount_problem.sh Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented. 
Jenkins hangs in check-lib-versions
check-lib-versions: [echo] Lib versions check under: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/.. [libversions] :: loading settings :: file = /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/ivy-settings.xml [libversions] [libversions] :: problems summary :: [libversions] WARNINGS [libversions] circular dependency found: dom4j#dom4j;1.6.1 -> jaxen#jaxen;1.1-beta-6 -> dom4j#dom4j;1.5.2 [libversions] [libversions] circular dependency found: jaxen#jaxen;1.1-beta-6 -> jdom#jdom;1.0 -> jaxen#jaxen;1.0-FCS [libversions] [libversions] [libversions] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS I will kill the build; a stack trace is not available (IBM J9, kill -QUIT did not help). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
[JENKINS] Lucene-Solr-trunk-Linux (64bit/ibm-j9-jdk7) - Build # 10953 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10953/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} All tests passed Build Log: [...truncated 40920 lines...]
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084879#comment-14084879 ] Timothy Potter commented on SOLR-5810: -- Thanks for the pointer to SOLR-6082. Definitely interested in starting to design a SolrCloud admin interface. The work being done in this ticket is mainly for supporting 100s to 1000s of collections from the existing Cloud panel, as without some basic nav controls that panel is unusable when you have many collections. In other words, this will serve as an interim solution for users that have many collections while we work on the design and develop an overhauled SolrCloud Admin UI. State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud -> graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online.
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084887#comment-14084887 ] Hoss Man commented on SOLR-2894: Erick: * SOLR-6300: appears to be specific to date/range faceting - almost certainly not related to the problem i found, since there's no overrequesting logic with range faceting. * SOLR-6314 seems unrelated given how it ties into the threading code, which is above the layer of changes i'm talking about ... but anything is possible. Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Assignee: Hoss Man Fix For: 4.9, 5.0 Attachments: SOLR-2894-mincount-minification.patch, SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894_cloud_test.patch, dateToObject.patch, pivot_mincount_problem.sh Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084890#comment-14084890 ] Erick Erickson commented on SOLR-2894: -- Rats! And here I was hoping you'd do the work for me ;) Good to know though, it'll keep me from putting this off. Thanks! Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Assignee: Hoss Man Fix For: 4.9, 5.0 Attachments: SOLR-2894-mincount-minification.patch, SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894_cloud_test.patch, dateToObject.patch, pivot_mincount_problem.sh Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_20-ea-b23) - Build # 4229 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4229/ Java: 64bit/jdk1.8.0_20-ea-b23 -XX:+UseCompressedOops -XX:+UseG1GC 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest Error Message: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog\tlog.002 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog\tlog.003 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001 Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog\tlog.002 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog\tlog.003 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001\tlog 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\tempDir-001 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001 at __randomizedtesting.SeedInfo.seed([34DFC31D7319D0E3]:0) at org.apache.lucene.util.TestUtil.rm(TestUtil.java:117) at org.apache.lucene.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:125) at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 12618 lines...] 
[junit4] Suite: org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest [junit4] 2> Creating dataDir: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-solrj\test\J0\.\temp\solr.client.solrj.embedded.LargeVolumeJettyTest-34DFC31D7319D0E3-001\init-core-data-001 [junit4] 2> 267993 T564 oas.SolrTestCaseJ4.buildSSLConfig Randomized ssl (true) and clientAuth (false) [junit4] 2> 267993 T564 oas.SolrTestCaseJ4.initCore initCore [junit4] 2> 267993 T564 oas.SolrTestCaseJ4.initCore initCore end [junit4] 2> 267996 T564 oejs.Server.doStart jetty-8.1.10.v20130312 [junit4] 2> 268013 T564 oejus.SslContextFactory.doStart Enabled Protocols [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2] of [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2] [junit4] 2> 268065 T564 oejs.AbstractConnector.doStart Started SslSelectChannelConnector@127.0.0.1:61666 [junit4] 2> 268077 T564 oass.SolrDispatchFilter.init SolrDispatchFilter.init() [junit4] 2> 268077 T564 oasc.SolrResourceLoader.locateSolrHome JNDI not configured for solr (NoInitialContextEx) [junit4] 2> 268077 T564 oasc.SolrResourceLoader.locateSolrHome using system property solr.solr.home: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\solr [junit4] 2> 268077 T564 oasc.SolrResourceLoader.init new
[jira] [Assigned] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-4316: --- Assignee: Shalin Shekhar Mangar Admin UI - SolrCloud - extend core options to collections - Key: SOLR-4316 URL: https://issues.apache.org/jira/browse/SOLR-4316 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.1 Reporter: Shawn Heisey Assignee: Shalin Shekhar Mangar Fix For: 4.9, 5.0 There are a number of sections available when you are looking at a core in the UI - Ping, Query, Schema, Config, Replication, Analysis, Schema Browser, Plugins / Stats, and Dataimport are the ones that I can see. A list of collections should be available, with as many of those options as can apply to a collection. If options specific to collections/SolrCloud can be implemented, those should be there too.
[jira] [Assigned] (SOLR-6082) Umbrella JIRA for Admin UI and SolrCloud.
[ https://issues.apache.org/jira/browse/SOLR-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-6082: --- Assignee: Shalin Shekhar Mangar (was: Stefan Matheis (steffkes)) Umbrella JIRA for Admin UI and SolrCloud. - Key: SOLR-6082 URL: https://issues.apache.org/jira/browse/SOLR-6082 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.9, 5.0 Reporter: Erick Erickson Assignee: Shalin Shekhar Mangar It would be very helpful if the admin UI were more cloud friendly. This is an umbrella JIRA so we can collect sub-tasks as necessary. I think there might be scattered JIRAs about this; let's link them in as we find them. [~steffkes] - I've taken the liberty of assigning it to you since you expressed some interest. Feel free to assign it back if you want... Let's imagine that a user has a cluster with _no_ collections assigned and start from there. Here's a simple way to set this up. Basically you follow the reference guide tutorial but _don't_ define a collection:
1. completely delete the collection1 directory from example
2. cp -r example example2
3. in example, execute java -DzkRun -jar start.jar
4. in example2, execute java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
Now the cloud link appears. If you expand the tree view, you see the two live nodes. But there's nothing in the graph view, no cores are selectable, etc. First problem (need to solve before any sub-jiras, so including it here): You have to push a configuration directory to ZK. [~thetapi] The _last_ time Stefan and I started allowing files to be written to Solr from the UI it was...unfortunate. I'm assuming that there's something similar here. That is, we shouldn't allow pushing the Solr config _to_ ZooKeeper through the Admin UI, where they'd be distributed to all the Solr nodes. Is that true? If this is a security issue, we can keep pushing the config dirs to ZK a manual step for now...
Once we determine how to get configurations up, we can work on the various sub-jiras.
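The four setup steps quoted in the issue can be sketched as a shell session. This is a sketch, not an official runbook: it assumes the stock 4.x example/ directory layout from the Solr download (with the predefined collection at example/solr/collection1) and the ports given in the steps.

```shell
# 1. Remove the predefined collection so the cluster starts with no collections
#    (path assumes the stock example layout; adjust if yours differs).
rm -r example/solr/collection1

# 2. Clone the node directory for a second node.
cp -r example example2

# 3. Node 1: embedded ZooKeeper (-DzkRun listens on jetty port + 1000, i.e. 9983).
(cd example && java -DzkRun -jar start.jar) &

# 4. Node 2: different jetty port, pointed at node 1's ZooKeeper.
(cd example2 && java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar) &
```

After both nodes are up, the Cloud link in the admin UI shows two live nodes in the tree view but an empty graph, which is the starting state the issue describes.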
[jira] [Commented] (SOLR-6082) Umbrella JIRA for Admin UI and SolrCloud.
[ https://issues.apache.org/jira/browse/SOLR-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084985#comment-14084985 ] Shalin Shekhar Mangar commented on SOLR-6082: - I am working on this.
[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084996#comment-14084996 ] Shalin Shekhar Mangar commented on SOLR-4316: - I am going to put up a patch shortly which will: # Refactor the UI into core-specific and collection-specific features, # Provide a drop-down for available collections and local cores, # Add a simple collection overview page
[jira] [Updated] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-4316: Attachment: solrcloud-admin-ui-menu.png Here's how the new menu looks in SolrCloud. In non-SolrCloud installations, the admin menu would look the same as it does today.
[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085009#comment-14085009 ] Mark Miller commented on SOLR-4316: --- Sweet! Long overdue change.
[jira] [Commented] (SOLR-2304) MoreLikeThis: Apply field level boosts before query terms are selected
[ https://issues.apache.org/jira/browse/SOLR-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085012#comment-14085012 ] Brian commented on SOLR-2304: - I'm not sure this should be changed - I think the current behavior is expected. That is, qf with its dismax origins implies only query-time boosting - making it change which terms are selected would, I think, be more surprising and unexpected than its current behavior. I think that instead another parameter should be added giving the option of applying the field boosts prior to building the query as well. I.e., I think the following use case could be common. We want to get interesting terms for building the MoreLikeThis query from across the whole document (across multiple fields) - we don't want terms showing up in specific fields to be weighted higher than others. We then use these interesting terms to build a query. However, at query time we do want to weight different fields more highly - which is what the qf parameter is used for in dismax - but using the same set of terms. (I realize in this case this also would require changing how MoreLikeThis builds a query since it does not currently support cross-field queries, but I wouldn't want this change to prevent that possibility.) I think it would be better to allow keeping the old behavior by either:
- Adding a single boolean parameter specifying whether or not to apply the qf field boosts prior to selecting terms as well
- Creating a new parameter specifically for interesting-term field boosts
  - This arguably is easier to understand, plus provides the most flexibility, because then we could have different boosts for generating the terms and then using those terms in the query. However, it introduces greater complexity.
MoreLikeThis: Apply field level boosts before query terms are selected -- Key: SOLR-2304 URL: https://issues.apache.org/jira/browse/SOLR-2304 Project: Solr Issue Type: Improvement Components: MoreLikeThis Affects Versions: 1.4.2 Reporter: Mike Mattozzi Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-2304.patch MoreLikeThis provides the ability to set field level boosts to weight the importance of fields in selecting similar documents. Currently, in trunk, these field level boosts are applied after the query terms have been selected from the priority queue of interesting terms in MoreLikeThis. This can give unexpected results when used in combination with mlt.maxqt to limit the number of query terms. For example, if you use fields fieldA and fieldB and boost them fieldA^0.5 fieldB^2.0 with a maxqt parameter of 20, if the terms in fieldA have relatively higher tf-idf scores than fieldB, only 20 fieldA terms will be selected as the basis for the MoreLikeThis query... even if, after boosting, there are terms in fieldB with a higher overall score. I encountered this while using document descriptive text and document tags (comedy, action, etc.) as the basis for MoreLikeThis. I wanted to boost the tags higher; however, the less common document text terms were always selected as the query terms while the more common tag terms were eliminated by the maxqt parameter before their scores were boosted. I believe the code was originally written as it was so that the bulk of the work could be done in the MoreLikeThisHandler without modifying the MoreLikeThis class in the lucene project. Now that the projects are merged, I think this modification makes sense. I will be attaching a simple patch to trunk.
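The reporter's text-vs-tags example can be illustrated with a small, self-contained sketch. All class, field, and term names below are invented for illustration (this is not Solr's MoreLikeThis code); it only models the ordering question: boost after the top-maxqt cut versus boost before it.

```java
import java.util.*;
import java.util.stream.*;

// Simplified model of SOLR-2304: interesting terms keyed as "field:term" with
// raw tf-idf scores. Applying field boosts only AFTER the top-maxqt selection
// can exclude boosted tag terms that boosting BEFORE the cut would keep.
public class MltBoostOrder {

    static List<String> selectTerms(Map<String, Double> rawScores,
                                    Map<String, Double> fieldBoosts,
                                    int maxqt,
                                    boolean boostBeforeSelect) {
        Map<String, Double> effective = new HashMap<>();
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            String field = e.getKey().split(":")[0];
            // Current trunk behavior corresponds to boostBeforeSelect == false.
            double boost = boostBeforeSelect ? fieldBoosts.getOrDefault(field, 1.0) : 1.0;
            effective.put(e.getKey(), e.getValue() * boost);
        }
        return effective.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(maxqt)                       // the mlt.maxqt cut
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Rare description words get high tf-idf; the common tag term scores low.
        Map<String, Double> raw = Map.of(
                "text:obscureword", 5.0,
                "text:rareword", 4.0,
                "tags:comedy", 1.2);
        Map<String, Double> boosts = Map.of("text", 0.5, "tags", 2.0);

        // Boost after the cut: the tag term never makes the top 2.
        System.out.println(selectTerms(raw, boosts, 2, false));
        // Boost before the cut: the boosted tag term wins a slot.
        System.out.println(selectTerms(raw, boosts, 2, true));
    }
}
```

With the boost applied after selection the top 2 are both text terms; applied before, `tags:comedy` (1.2 × 2.0 = 2.4) displaces `text:rareword` (4.0 × 0.5 = 2.0), which is exactly the behavior change the patch proposes.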
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085035#comment-14085035 ] Mark Miller commented on SOLR-5810: --- So is this only an issue if you have 100+ collections? If so, we should probably update the title / description to be more specific. State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud - graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online.
[jira] [Created] (SOLR-6317) MoreLikeThis should allow more flexible query building
Brian created SOLR-6317: --- Summary: MoreLikeThis should allow more flexible query building Key: SOLR-6317 URL: https://issues.apache.org/jira/browse/SOLR-6317 Project: Solr Issue Type: Improvement Components: MoreLikeThis Affects Versions: 4.8.1 Reporter: Brian Priority: Minor It would be better if MoreLikeThis had more flexible query-building options. There are two main abilities that would be helpful: 1. Allowing a DisjunctionMax query to be built instead of the plain BooleanQuery. 2. Applying the interesting terms to all specified query fields, instead of just the field each happened to be found in. #1 is important because we generally find disjunction max to work better than plain boolean queries (adding the field scores together). At the very least, the MoreLikeThis class should be made extendable, so that we can easily add this capability with a custom plugin, without having to rewrite all of the class. I.e., at its most basic, this could be resolved by making both createQuery methods protected instead of private so that we could then just override this method if we wanted to implement different ways of building the query, without changing anything else. At the other extreme, potentially the more complex resolution, it might be nice to be able to specify a search handler to use, and then MoreLikeThis would create the query from that search handler. For example, if the search handler used the eDisMax parser and specified a number of qf, pf, and boosts, then all those would be used to construct the mlt query, using the interesting terms generated.
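For point 1, the difference between the plain boolean combination (per-field scores summed) and a dismax-style combination can be sketched numerically. This is a simplified model of the scoring formulas, not the actual Lucene Query classes; the class and method names are invented.

```java
// Toy comparison of the two score-combination strategies discussed in the issue:
// boolean SHOULD clauses sum the matching field scores, while disjunction max
// takes the best field's score plus a small tie-breaker fraction of the rest.
public class DismaxVsBoolean {

    // BooleanQuery-style: sum of all matching per-field scores.
    static double booleanSum(double[] fieldScores) {
        double sum = 0;
        for (double s : fieldScores) sum += s;
        return sum;
    }

    // DisjunctionMaxQuery-style: max + tieBreaker * (sum of the other scores).
    static double disjunctionMax(double[] fieldScores, double tieBreaker) {
        double max = 0, sum = 0;
        for (double s : fieldScores) { max = Math.max(max, s); sum += s; }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // One strong field match plus two weak echoes of the same term.
        double[] scores = {3.0, 1.0, 1.0};
        System.out.println(booleanSum(scores));          // weak echoes inflate the score
        System.out.println(disjunctionMax(scores, 0.1)); // dominated by the best field
    }
}
```

The point of the dismax shape is that a term matching weakly in several fields no longer outscores a term matching strongly in one, which is why the issue prefers it for cross-field MoreLikeThis queries.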
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085045#comment-14085045 ] Timothy Potter commented on SOLR-5810: -- No, the status and nav controls work no matter how many collections you have.
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085080#comment-14085080 ] Mark Miller commented on SOLR-5810: --- bq. #1 is a must for 5473 . Right, I'm also only concerned about #1 for SOLR-5473, though I've got nothing against taking it all in if it's ready.
[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085114#comment-14085114 ] Shawn Heisey commented on SOLR-4316: The UI looks pretty good. I have one concern. Because I nearly always have a hi-res display, this won't really affect me, but I thought it worthwhile to mention: Users of low-res displays might appreciate having one dropdown with both collections and cores, similar to what we have on the schema browser for Fields, DynamicFields, and Types. This would be particularly important when we make the UI compatible with mobile browsers -- SOLR-4794.
[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085129#comment-14085129 ] Shalin Shekhar Mangar commented on SOLR-4316: - bq. Users of low-res displays might appreciate having one dropdown with both collections and cores, similar to what we have on the schema browser for Fields, DynamicFields, and Types. I know, Shawn, but right now you can have a collection and a core with the same name, e.g. collection1. How would you choose one over the other? I think, eventually, we want to move away from the local core concept in the UI and navigate to individual cores from a collection view, but I want to take small baby steps right now. That reminds me: the current menu has a fixed height, I guess, because on lower resolutions one cannot scroll to the bottom of the menu. Instead, part of the menu is just cut off below the viewport and becomes unreachable. This is something that we should fix anyway. It is especially a problem during workshops/demos because the projectors have low resolution.
Distributed search on non-default handler
I've recently entered SOLR-6311 to try to allow requests on handlers other than the default /select to not have to specify shards.qt. This affects /suggest, for instance. Anyhow, I was wondering if anyone had feedback on the approach of using the path, or suggestions for something better. Thanks, Steve
Re: Performance issue with initialSize parameter for SolrCache
On 8/4/2014 9:08 AM, Simo Simov wrote: SolrCache implementations LRUCache, FastLRUCache and LFUCache have a parameter initialSize that is passed directly to the HashMap initialCapacity constructor argument. For LFUCache this is the code line: map = new LinkedHashMap<K,V>(initialSize, 0.75f, true) … In solrconfig.xml I tried to set initialSize to be size/0.75 + some_more, but it’s impossible to set a value greater than size because of this line: final int initialSize = Math.min(str==null ? 1024 : Integer.parseInt(str), limit); For FastLRUCache there is no such limitation, but still I think documentation and default values in solrconfig.xml should not use equal values for initialSize and size. Simo, I wrote LFUCache, with FastLRUCache as a model. I'm having a hard time grasping exactly what you've written above, which is probably my failing. Based on what I think I understand, here's what I can say: This is the line I can find in LFUCache.java that's similar to (but not the same as) the line you mentioned. The actual line you mentioned is in LRUCache, not LFUCache: final int initialSize = str == null ? limit : Integer.parseInt(str); The initialSize value should never be greater than size, but that line will not prevent you from setting it larger -- it only determines what happens if the initialSize parameter is missing from the config, which is to set it the same as size. I believe this is the correct action. The similar line in LRUCache (the one that you quoted) could produce unexpected behavior, because if initialSize is missing, it will be limited to 1024, even if size is many times larger than that. The workaround is to explicitly set initialSize, so I don't think it's actually a bug. Setting initialSize and size to the same value is a perfectly acceptable and logical thing to do.
You're telling Solr that it should allocate all of the slots in the underlying Collection object up front, so it won't be forced to dynamically expand it later -- an operation that could take time. On a related front, I actually would like to completely replace the LFUCache implementation with a new one, because the current implementation is a very naive and unoptimized way of doing LFU. I just haven't found time to work on it. See SOLR-2906 and SOLR-3393. Thanks, Shawn
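The LinkedHashMap constructor quoted in this thread is the core of the pattern. Below is a minimal LRU-cache sketch built on it -- the real Solr LRUCache/LFUCache add synchronization, statistics, and autowarming, so treat this only as an illustration of how accessOrder and eviction interact, and of why initialCapacity should be roughly size/0.75 to avoid rehashes while the cache warms up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: a LinkedHashMap in access order (third constructor
// argument = true) that evicts the eldest (least-recently-used) entry once
// `size` is exceeded. `initialSize` presizes the hash table; since the default
// load factor is 0.75, an initialSize of at least size/0.75 means the table
// never rehashes before the cache is full -- the point raised in the thread.
public class TinyLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int size;

    public TinyLruCache(int size, int initialSize) {
        super(initialSize, 0.75f, true);   // accessOrder=true => LRU iteration order
        this.size = size;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > size;              // drop the least-recently-used entry
    }

    public static void main(String[] args) {
        TinyLruCache<String, Integer> cache = new TinyLruCache<>(2, 4);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");                    // touch "a", so "b" is now eldest
        cache.put("c", 3);                 // evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Sizing the table to size/0.75 (plus a little headroom) is exactly the advice in HashMap's own documentation; the discussion above is about whether Solr's config plumbing lets you express that.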
[jira] [Resolved] (SOLR-3067) Missing Velocity Template for /browse request handler.
[ https://issues.apache.org/jira/browse/SOLR-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher resolved SOLR-3067. Resolution: Fixed Fix Version/s: (was: 4.10) (was: 5.0) 4.4 This was fixed in SOLR-4759 for Solr 4.4. Missing Velocity Template for /browse request handler. -- Key: SOLR-3067 URL: https://issues.apache.org/jira/browse/SOLR-3067 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 3.5 Reporter: Tom Hill Assignee: Erik Hatcher Priority: Trivial Fix For: 4.4 If you add group=on&group.field=inStock to the URL in the /browse request handler, it throws a 500 error, due to a missing hitGrouped.vm file. This works correctly in trunk. Copying hitGrouped.vm from 4.0 to 3.5 prevents the error, although some of the other grouping support still isn't present. One could just remove the #parse(hitGrouped.vm) from browse.vm and avoid the error, but it's probably about as easy to backport it.
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085234#comment-14085234 ] David Smiley commented on SOLR-6311: Strong +1 from me; I have no idea why it wasn't this way from the beginning. SearchHandler should use path when no qt or shard.qt parameter is specified --- Key: SOLR-6311 URL: https://issues.apache.org/jira/browse/SOLR-6311 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Steve Molloy Attachments: SOLR-6311.patch When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. As this is configurable, even the default search handler could be on another path. The shard requests should thus default to the path if no shards.qt was specified.
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085242#comment-14085242 ] Paul Elschot commented on LUCENE-4396: -- bq. The commit hash code mentioned here just indicates which commit the patch should apply on. I missed that. Also I mistook the date order of the patches posted here, so I thought there were only old patches. And I did not think of looking for the commit in my own repo, I thought the commit was a commit for the patch. bq. btw, there is a repo where I'm maintaining the code, but the repo is on the server in my lab. My situation is very similar. The latest patch applies nicely to the indicated commit with the commands given above, thanks. BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp, tasks.cpp Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there is one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, that BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as proxy for total hit count).
Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, eg if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near term time to work on this so feel free to take it if you are inspired!
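The .advance() idea in the last paragraph can be sketched over plain sorted doc-id arrays. This is a simplified model, not Lucene code: real scorers advance via skip lists rather than the linear scan used here, and the names are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of conjunction-by-advancing: when the sparse MUST clause jumps
// ahead (e.g. skips 100 docs), the other clause's iterator is advanced past
// the gap instead of every candidate in between being scored.
public class LeapfrogConjunction {

    // Advance `pos` to the first index in `docs` whose doc id is >= target
    // (Lucene does this with skip lists; a linear scan keeps the sketch simple).
    static int advance(int[] docs, int pos, int target) {
        while (pos < docs.length && docs[pos] < target) pos++;
        return pos;
    }

    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) { hits.add(a[i]); i++; j++; }
            else if (a[i] < b[j]) i = advance(a, i, b[j]);  // leapfrog a past the gap
            else j = advance(b, j, a[i]);                   // leapfrog b past the gap
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] must   = {5, 250, 900};                       // sparse MUST clause
        int[] should = {1, 2, 3, 5, 100, 250, 800, 900};    // denser clause
        System.out.println(intersect(must, should));        // prints [5, 250, 900]
    }
}
```

The heuristic the issue asks about is essentially when this advancing style beats BooleanScorer's bucket-at-a-time scoring, which is cheaper when all clauses are dense.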
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085258#comment-14085258 ] Yonik Seeley commented on SOLR-6311: bq. When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. This was by design, and we should consider very carefully before changing it. Most of the time, other handlers were used to add default parameters. When this is the case, it's preferable to hit a bare /select handler for sub-requests, as hitting the same handler again and adding defaults again will have a lot of side effects and sometimes produce incorrect distributed results. The worst is when a handler specifies shards or something, and this causes an endless loop.
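For reference, the shards.qt parameter under discussion is normally pinned in the handler's defaults in solrconfig.xml, so shard sub-requests come back to the same handler. A sketch (the handler name and defaults here are illustrative, not from the issue):

```xml
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- route the sub-requests of a distributed search back to this path;
         without this they go to /select, which lacks this handler's defaults -->
    <str name="shards.qt">/suggest</str>
  </lst>
</requestHandler>
```

Yonik's caution applies here too: because the sub-request hits the same handler, its defaults are applied a second time, so a handler that sets distributed parameters (such as shards) in its defaults can loop.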
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085276#comment-14085276 ] Paul Elschot commented on LUCENE-4396: -- bq. By the perf. table of BQ, it looks that BQ perfs low on the first 4 cases. However, when I run these cases one by one, they're just worse than the trunk within 2%. The only reason I can think of that might cause this is that in the mixed case there is less effective use of caches (L1, L2, L3) during the test. When this is the case the max difference of 20% should go down when running for example twice as many of the same queries in one go during the mixed test. Btw. BooleanScorer2 is now almost ten years old, see LUCENE-294. Lots of improvements were made since then and I'm happy to see some further possible performance improvements here. BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp, tasks.cpp Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there is one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, that BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST so it shouldn't be hard to bring back this capability ... 
I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, eg if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near term time to work on this so feel free to take it if you are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
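The .advance() idea above can be sketched in a few lines. This is a hypothetical stand-in, not Lucene's actual BooleanScorer: ArrayIterator mimics the DocIdSetIterator contract locally, iteration is driven by the MUST clause, and each SHOULD clause is advanced straight to the MUST clause's current doc, so a long skip in the MUST clause skips the SHOULD clauses too.

```java
import java.util.*;

// Hypothetical sketch, not Lucene code: MUST-driven iteration where SHOULD
// clauses are advance()d to the MUST doc instead of being scored in between.
public class MustDriven {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Local stand-in for Lucene's DocIdSetIterator over a sorted doc array.
    static class ArrayIterator {
        final int[] docs;
        int pos = -1;
        ArrayIterator(int... docs) { this.docs = docs; }
        int next() { pos++; return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }
        int advance(int target) {                 // first doc >= target
            while (pos + 1 < docs.length && docs[pos + 1] < target) pos++;
            return next();
        }
    }

    /** For each doc matching the MUST clause, count matching SHOULD clauses. */
    static Map<Integer, Integer> score(ArrayIterator must, List<ArrayIterator> shoulds) {
        Map<Integer, Integer> hits = new LinkedHashMap<>();
        int[] current = new int[shoulds.size()];
        Arrays.fill(current, -1);
        for (int doc = must.next(); doc != NO_MORE_DOCS; doc = must.next()) {
            int matched = 0;
            for (int i = 0; i < shoulds.size(); i++) {
                // Only move a SHOULD iterator when it is behind the MUST doc.
                if (current[i] < doc) current[i] = shoulds.get(i).advance(doc);
                if (current[i] == doc) matched++;
            }
            hits.put(doc, matched);
        }
        return hits;
    }
}
```

If the MUST clause jumps from doc 5 to doc 200, each SHOULD iterator makes a single advance(200) call rather than walking every intervening doc, which is the behavior the comment above argues for.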
[jira] [Commented] (SOLR-6261) Run ZK watch event callbacks in parallel to the event thread
[ https://issues.apache.org/jira/browse/SOLR-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085282#comment-14085282 ] Ramkumar Aiyengar commented on SOLR-6261: - Mark, I realized I missed a case, the session watcher doesn't use this threadpool. In theory it should be fine as of now, as we already made it async with SOLR-5615, but it might be worth keeping it in sync and avoiding any future code in {{ConnectionManager}} falling into the same trap. I have a patch, which I'm running through tests and can post when done; should I create a new issue or put the patch here? Run ZK watch event callbacks in parallel to the event thread Key: SOLR-6261 URL: https://issues.apache.org/jira/browse/SOLR-6261 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.9 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Fix For: 5.0, 4.10 Currently checking for leadership (due to the leader's ephemeral node going away) happens in ZK's event thread. If there are many cores and all of them are due leadership, then they would have to serially go through the two-way sync and leadership takeover. For tens of cores, this could mean 30-40s without leadership before the last in the list even gets to start the leadership process. If the leadership process happens in a separate thread, then the cores could all take over in parallel.
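The hand-off the issue describes can be sketched generically. This is not Solr's ConnectionManager, and Watcher below is a local stand-in for org.apache.zookeeper.Watcher; the point is only the pattern: the event thread does nothing but enqueue, while the expensive work (two-way sync, leadership takeover) runs on a pool so many cores can proceed in parallel.

```java
import java.util.concurrent.*;

// Sketch only: decouple event delivery from event processing so a slow
// callback never blocks the single event-dispatch thread.
public class AsyncWatcher {
    interface Watcher { void process(String event); }   // stand-in interface

    /** Wrap a watcher so its callback runs on the pool, not the event thread. */
    static Watcher async(Watcher inner, ExecutorService pool) {
        return event -> pool.submit(() -> inner.process(event));
    }

    /** await() without the checked exception, for simple demos. */
    static boolean awaitQuietly(CountDownLatch latch) {
        try { return latch.await(5, TimeUnit.SECONDS); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); return false; }
    }
}
```

With a pool of N threads, N leadership takeovers can run concurrently instead of serially on the event thread, which is the 30-40s serialization the issue reports.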
[jira] [Commented] (SOLR-6311) SearchHandler should use path when no qt or shard.qt parameter is specified
[ https://issues.apache.org/jira/browse/SOLR-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085293#comment-14085293 ] David Smiley commented on SOLR-6311: Good point Yonik; I forgot about that. It's not this simple then. In general, the need to specify shards.qt is to ensure one's customized components (e.g. spellcheck) are registered; it's not for request parameters. Perhaps the solution is to have a shard request ignore parameters specified in the request handler? SearchHandler should use path when no qt or shard.qt parameter is specified --- Key: SOLR-6311 URL: https://issues.apache.org/jira/browse/SOLR-6311 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Steve Molloy Attachments: SOLR-6311.patch When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. As this is configurable, even the default search handler could be on another path. The shard requests should thus default to the path if no shards.qt was specified.
[jira] [Created] (LUCENE-5868) JoinUtil support for NUMERIC docValues fields
Mikhail Khludnev created LUCENE-5868: Summary: JoinUtil support for NUMERIC docValues fields Key: LUCENE-5868 URL: https://issues.apache.org/jira/browse/LUCENE-5868 Project: Lucene - Core Issue Type: New Feature Reporter: Mikhail Khludnev Priority: Minor While polishing SOLR-6234 I found that JoinUtil can't join int docValues fields, at least. I plan to provide a test/patch. It might be important, because Solr's join can do that. Please vote if you care!
[jira] [Created] (SOLR-6318) QParser for TermsFilter
David Smiley created SOLR-6318: -- Summary: QParser for TermsFilter Key: SOLR-6318 URL: https://issues.apache.org/jira/browse/SOLR-6318 Project: Solr Issue Type: New Feature Components: query parsers Reporter: David Smiley Assignee: David Smiley Fix For: 4.10 Some applications require filtering documents by a large number of terms. It's often related to security filtering. Naively this is done this way: {noformat} fq={!df=myfield q.op=OR}code1 code2 code3 code4 code5... {noformat} And this ends up being a BooleanQuery. Users then wind up hitting BooleanQuery.maxClauseCount (sometimes in production, sadly) and they up it to a huge number to get the job done. Solr should offer a QParser based on TermsFilter. I propose it be named terms (plural of term), and have a separator option defaulting to a space. When it's a space, the values also get trimmed, which wouldn't otherwise happen. The analysis logic should be the same as that for the term QParser, which is to call FieldType.readableToIndexed.
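The proposed separator/trim behavior could look roughly like this. A hypothetical helper, not the actual SOLR-6318 patch: split the value on the separator, and when the separator is a space also trim tokens and drop empties so runs of whitespace are harmless.

```java
import java.util.*;
import java.util.regex.Pattern;

// Hypothetical sketch of the parsing rule described in the proposal.
public class TermsSplit {
    static List<String> parseTerms(String input, String separator) {
        List<String> terms = new ArrayList<>();
        for (String token : input.split(Pattern.quote(separator), -1)) {
            // Trimming only applies for the default (space) separator,
            // matching the proposal; other separators keep values verbatim.
            if (" ".equals(separator)) token = token.trim();
            if (!token.isEmpty()) terms.add(token);
        }
        return terms;
    }
}
```

In the real parser each term would then go through FieldType.readableToIndexed before being handed to a TermsFilter, as the proposal states.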
[jira] [Resolved] (LUCENE-4920) CLONE - TermsFilter should use AutomatonQuery
[ https://issues.apache.org/jira/browse/LUCENE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved LUCENE-4920. -- Resolution: Duplicate CLONE - TermsFilter should use AutomatonQuery - Key: LUCENE-4920 URL: https://issues.apache.org/jira/browse/LUCENE-4920 Project: Lucene - Core Issue Type: Improvement Reporter: sani kumar Labels: gsoc2014 I think we could see perf gains if TermsFilter sorted the terms, built a minimal automaton, and used TermsEnum.intersect to visit the terms... This idea came up on the dev list recently.
[jira] [Commented] (SOLR-6318) QParser for TermsFilter
[ https://issues.apache.org/jira/browse/SOLR-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085349#comment-14085349 ] David Smiley commented on SOLR-6318: I've read somewhere (in the Lucene source, I forget) that BooleanQuery was shown to be faster than TermsFilter when the number of terms is less than some number, based on a bunch of assumptions of course. It would be nice to have a threshold option to switch between BooleanQuery and TermsFilter. I've also seen a suggestion that TermsFilter should use or be replaced by AutomatonQuery (LUCENE-3893). It would be easy to use any of these options. QParser for TermsFilter --- Key: SOLR-6318 URL: https://issues.apache.org/jira/browse/SOLR-6318 Project: Solr Issue Type: New Feature Components: query parsers Reporter: David Smiley Assignee: David Smiley Fix For: 4.10 Some applications require filtering documents by a large number of terms. It's often related to security filtering. Naively this is done this way: {noformat} fq={!df=myfield q.op=OR}code1 code2 code3 code4 code5... {noformat} And this ends up being a BooleanQuery. Users then wind up hitting BooleanQuery.maxClauseCount (sometimes in production, sadly) and they up it to a huge number to get the job done. Solr should offer a QParser based on TermsFilter. I propose it be named terms (plural of term), and have a separator option defaulting to a space. When it's a space, the values also get trimmed, which wouldn't otherwise happen. The analysis logic should be the same as that for the term QParser, which is to call FieldType.readableToIndexed.
[jira] [Created] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
Hoss Man created SOLR-6319: -- Summary: if mincount > 1, facet.field needs to overrequest even if facet.sort=index Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085353#comment-14085353 ] Hoss Man commented on SOLR-6319: Consider the following sample data... {code:title=1.csv}
foo_t
a b c d e f g h
a a a a a a a a a
b b b b b b b b b
g g g g
{code} {code:title=2.csv}
foo_t
a b c d e f g h b
f f f f f f f f f
g g g g g g g g g g g
h h h h h h h h h h h h
{code} If you index this data in a single node solr setup, the following queries produce the results you expect... {noformat}
$ curl "http://localhost:8983/solr/update?rowidOffset=100&rowid=id&commit=true" -H 'Content-type:application/csv; charset=utf-8' --data-binary @1.csv
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">522</int></lst>
</response>
$ curl "http://localhost:8983/solr/update?rowidOffset=200&rowid=id&commit=true" -H 'Content-type:application/csv; charset=utf-8' --data-binary @2.csv
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">435</int></lst>
</response>
$ curl -sS 'http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=foo_t&facet.sort=index&omitHeader=true&wt=json&indent=true'
{
  "response":{"numFound":57,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "foo_t":[
        "a",11,
        "b",12,
        "c",2,
        "d",2,
        "e",2,
        "f",11,
        "g",17,
        "h",14]},
    "facet_dates":{},
    "facet_ranges":{},
    "facet_intervals":{}}}
$ curl -sS 'http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=foo_t&facet.limit=1&facet.mincount=13&facet.sort=index&omitHeader=true&wt=json&indent=true'
{
  "response":{"numFound":57,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "foo_t":[
        "g",17]},
    "facet_dates":{},
    "facet_ranges":{},
    "facet_intervals":{}}}
{noformat} But in a simple 2 node distributed setup...
{noformat}
$ curl "http://localhost:8881/solr/update?rowidOffset=100&rowid=id&commit=true" -H 'Content-type:application/csv; charset=utf-8' --data-binary @1.csv
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">483</int></lst>
</response>
$ curl "http://localhost:8882/solr/update?rowidOffset=200&rowid=id&commit=true" -H 'Content-type:application/csv; charset=utf-8' --data-binary @2.csv
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">456</int></lst>
</response>
$ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.field=foo_t&facet.sort=index&omitHeader=true&wt=json&indent=true&shards=localhost:8881/solr,localhost:8882/solr'
{
  "response":{"numFound":57,"start":0,"maxScore":1.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "foo_t":[
        "a",11,
        "b",12,
        "c",2,
        "d",2,
        "e",2,
        "f",11,
        "g",17,
        "h",14]},
    "facet_dates":{},
    "facet_ranges":{},
    "facet_intervals":{}}}
$ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.field=foo_t&facet.limit=1&facet.mincount=13&facet.sort=index&omitHeader=true&wt=json&indent=true&shards=localhost:8881/solr,localhost:8882/solr'
{
  "response":{"numFound":57,"start":0,"maxScore":1.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "foo_t":[]},
    "facet_dates":{},
    "facet_ranges":{},
    "facet_intervals":{}}}
{noformat} Bottom Line: we should be overrequesting when facet.sort=index is combined with facet.mincount > 0 if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894.
the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1
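The shortfall in the distributed output above can be reduced to a small simulation. This is plain Java with hardcoded per-shard counts matching the 1.csv/2.csv data, not Solr code, and it assumes each shard returns at most facet.limit terms in index order that clear initialMincount = ceil(mincount/nShards), with no overrequest:

```java
import java.util.*;

// Simulation of distributed facet merging without overrequest.
public class FacetMergeSim {
    static final int MINCOUNT = 13, LIMIT = 1;

    // A shard's response: the first LIMIT terms, in index (lexical) order,
    // whose local count clears the shard-level floor. No overrequest.
    static List<String> shardTop(Map<String, Integer> counts, int initialMincount) {
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> e : new TreeMap<>(counts).entrySet()) {
            if (e.getValue() >= initialMincount) {
                top.add(e.getKey());
                if (top.size() == LIMIT) break;
            }
        }
        return top;
    }

    static List<String> simulate() {
        Map<String, Integer> shard1 = Map.of("a", 10, "b", 10, "c", 1, "d", 1,
                                             "e", 1, "f", 1, "g", 5, "h", 1);
        Map<String, Integer> shard2 = Map.of("a", 1, "b", 2, "c", 1, "d", 1,
                                             "e", 1, "f", 10, "g", 12, "h", 13);
        int initialMincount = (MINCOUNT + 2 - 1) / 2;            // ceil(13/2) = 7

        Set<String> candidates = new TreeSet<>();
        candidates.addAll(shardTop(shard1, initialMincount));    // [a]
        candidates.addAll(shardTop(shard2, initialMincount));    // [f]

        List<String> merged = new ArrayList<>();
        for (String term : candidates) {
            int total = shard1.getOrDefault(term, 0) + shard2.getOrDefault(term, 0);
            if (total >= MINCOUNT) merged.add(term);             // a=11 and f=11 both fail
        }
        return merged;                                           // empty, though g totals 17
    }
}
```

The merged list comes back empty even though g totals 17 >= 13, reproducing the empty foo_t list in the two-shard output above: each shard's limit=1 cutoff drops g before the coordinator ever sees it.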
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085355#comment-14085355 ] Hoss Man commented on SOLR-6319: I suspect some refactoring being done in SOLR-2894 will fix this automatically, but i wanted to file it as a distinct issue so we know the bug has existed in past releases and is being fixed. if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1
[jira] [Commented] (SOLR-6300) facet.mincount fails to work if SolrCloud distrib=true is set
[ https://issues.apache.org/jira/browse/SOLR-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085367#comment-14085367 ] Hoss Man commented on SOLR-6300: Erick... * This is a possible dup of SOLR-6187 (hard to tell from that issue's formatting) * Possibly related: SOLR-6154 facet.mincount fails to work if SolrCloud distrib=true is set - Key: SOLR-6300 URL: https://issues.apache.org/jira/browse/SOLR-6300 Project: Solr Issue Type: Bug Components: SearchComponents - other, SolrCloud Affects Versions: 5.0 Reporter: Vamsee Yarlagadda Assignee: Erick Erickson I notice that using facet.mincount in SolrCloud mode with distrib=true fails to filter the facets based on the count. However, the same query with distrib=false works as expected. * Indexed some data as provided by the upstream test. https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L633 * Test being run: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L657 * Running in SolrCloud mode with distrib=false (facet.mincount works as expected) {code} $ curl "http://search-testing-c5-3.ent.cloudera.com:8983/solr/simple_faceting_coll/select?facet.date.start=1976-07-01T00%3A00%3A00.000Z&facet=true&facet.mincount=1&q=*%3A*&facet.date=bday&facet.date.other=all&facet.date.gap=%2B1DAY&facet.date.end=1976-07-01T00%3A00%3A00.000Z%2B1MONTH&rows=0&indent=true&wt=xml&distrib=false"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">3</int>
  <lst name="params">
    <str name="facet.date.start">1976-07-01T00:00:00.000Z</str>
    <str name="facet">true</str>
    <str name="indent">true</str>
    <str name="facet.mincount">1</str>
    <str name="q">*:*</str>
    <str name="facet.date">bday</str>
    <str name="distrib">false</str>
    <str name="facet.date.gap">+1DAY</str>
    <str name="facet.date.other">all</str>
    <str name="wt">xml</str>
    <str name="facet.date.end">1976-07-01T00:00:00.000Z+1MONTH</str>
    <str name="rows">0</str>
  </lst>
</lst>
<result name="response" numFound="33" start="0"></result>
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields"/>
  <lst name="facet_dates">
    <lst name="bday">
      <int name="1976-07-03T00:00:00Z">1</int>
      <int name="1976-07-04T00:00:00Z">1</int>
      <int name="1976-07-05T00:00:00Z">1</int>
      <int name="1976-07-13T00:00:00Z">1</int>
      <int name="1976-07-15T00:00:00Z">1</int>
      <int name="1976-07-21T00:00:00Z">1</int>
      <int name="1976-07-30T00:00:00Z">1</int>
      <str name="gap">+1DAY</str>
      <date name="start">1976-07-01T00:00:00Z</date>
      <date name="end">1976-08-01T00:00:00Z</date>
      <int name="before">2</int>
      <int name="after">0</int>
      <int name="between">6</int>
    </lst>
  </lst>
  <lst name="facet_ranges"/>
</lst>
</response>
{code} * SolrCloud mode with distrib=true (facet.mincount fails to show effect) {code} $ curl "http://search-testing-c5-3.ent.cloudera.com:8983/solr/simple_faceting_coll/select?facet.date.start=1976-07-01T00%3A00%3A00.000Z&facet=true&facet.mincount=1&q=*%3A*&facet.date=bday&facet.date.other=all&facet.date.gap=%2B1DAY&facet.date.end=1976-07-01T00%3A00%3A00.000Z%2B1MONTH&rows=0&indent=true&wt=xml&distrib=true"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">12</int>
  <lst name="params">
    <str name="facet.date.start">1976-07-01T00:00:00.000Z</str>
    <str name="facet">true</str>
    <str name="indent">true</str>
    <str name="facet.mincount">1</str>
    <str name="q">*:*</str>
    <str name="facet.date">bday</str>
    <str name="distrib">true</str>
    <str name="facet.date.gap">+1DAY</str>
    <str name="facet.date.other">all</str>
    <str name="wt">xml</str>
    <str name="facet.date.end">1976-07-01T00:00:00.000Z+1MONTH</str>
    <str name="rows">0</str>
  </lst>
</lst>
<result name="response" numFound="63" start="0" maxScore="1.0"></result>
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields"/>
  <lst name="facet_dates">
    <lst name="bday">
      <int name="1976-07-01T00:00:00Z">0</int>
      <int name="1976-07-02T00:00:00Z">0</int>
      <int name="1976-07-03T00:00:00Z">2</int>
      <int name="1976-07-04T00:00:00Z">2</int>
      <int name="1976-07-05T00:00:00Z">2</int>
      <int name="1976-07-06T00:00:00Z">0</int>
      <int name="1976-07-07T00:00:00Z">0</int>
      <int name="1976-07-08T00:00:00Z">0</int>
      <int name="1976-07-09T00:00:00Z">0</int>
      <int name="1976-07-10T00:00:00Z">0</int>
      <int name="1976-07-11T00:00:00Z">0</int>
      <int name="1976-07-12T00:00:00Z">1</int>
      <int name="1976-07-13T00:00:00Z">1</int>
      <int name="1976-07-14T00:00:00Z">0</int>
      <int name="1976-07-15T00:00:00Z">2</int>
      <int name="1976-07-16T00:00:00Z">0</int>
      <int name="1976-07-17T00:00:00Z">0</int>
      <int name="1976-07-18T00:00:00Z">0</int>
      <int name="1976-07-19T00:00:00Z">0</int>
      <int
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085406#comment-14085406 ] Yonik Seeley commented on SOLR-6319: It's not clear to me if you've hit something new, or something that has already been considered/documented. Here's the current comments for reference: {code}
// we're sorting by index order.
// if minCount==0, we should always be able to get accurate results w/o over-requesting or refining
// if minCount==1, we should be able to get accurate results w/o over-requesting, but we'll need to refine
// if minCount==n (> 1), we can set the initialMincount to minCount/nShards, rounded up.
// For example, we know that if minCount=10 and we have 3 shards, then at least one shard must have a count of 4 for the term
// For the minCount>1 case, we can generate too short of a list (miss terms at the end of the list) unless limit==-1
// For example: each shard could produce a list of top 10, but some of those could fail to make it into the combined list (i.e.
// we needed to go beyond the top 10 to generate the top 10 combined). Overrequesting can help a little here, but not as
// much as when sorting by count.
{code} It's been years since I wrote that, but IIRC the thinking was that over requesting when sorting by index was probably not worth it. It's a judgement call, and shouldn't be categorized as a bug (if I'm understanding this issue correctly). if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894.
the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1
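The shard-level floor in the comment quoted above follows from the pigeonhole principle: if every one of nShards shards saw fewer than ceil(minCount/nShards) occurrences of a term, the combined count could not reach minCount, so each shard can safely prune below that floor. A quick check:

```java
// Ceiling division gives the per-shard floor described in the comment:
// with minCount=10 and 3 shards, at least one shard must count >= 4.
public class InitialMincount {
    static int initialMincount(int minCount, int nShards) {
        return (minCount + nShards - 1) / nShards;   // ceil(minCount / nShards)
    }
}
```

Note this floor is about which terms a shard may drop entirely; it says nothing about how many terms each shard must return, which is where the facet.limit truncation in this issue bites.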
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085441#comment-14085441 ] Hoss Man commented on SOLR-6319: bq. It's been years since I wrote that, but IIRC the thinking was that over requesting when sorting by index was probably not worth it. It's a judgement call, and shouldn't be categorized as a bug (if I'm understanding this issue correctly). I don't know how it could be considered not a bug ... did you look at the steps to reproduce that i posted? it's trivial to generate queries where facet.mincount > 1 combined with small facet.limit values won't produce results even when results exist. the comment "Overrequesting can help a little here, but not as much as when sorting by count." seems like an understatement. if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1
[jira] [Created] (SOLR-6320) ExtendedDismaxQParser (edismax) parser fails on some queries
Esteban D created SOLR-6320: --- Summary: ExtendedDismaxQParser (edismax) parser fails on some queries Key: SOLR-6320 URL: https://issues.apache.org/jira/browse/SOLR-6320 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.5 Environment: Sun Microsystems Inc. OpenJDK 64-Bit Server VM (1.6.0_31 23.25-b01) Apache Tomcat/6.0.35 Java 1.6.0_31-b31 Reporter: Esteban D When querying for some specific attributes parsing fails and it falls back to use the default search field *aggregatefield*. h3. This works as expected {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], querystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], parsedquery: (+(+MatchAllDocsQuery(*:*) +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]))/no_coord, parsedquery_toString: +(+*:* +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]), explain: { 71f3083d53e356b810cf31224f5de00e: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34264) [DefaultSimilarity], result of: 2.1114364 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34264) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34264) [DefaultSimilarity], result of: 2.8322654 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 
= idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34264) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm , b36156ebd7204985f752f1c191ea4d18: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34265) [DefaultSimilarity], result of: 2.1114364 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34265) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34265) [DefaultSimilarity], result of: 2.8322654 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34265) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm }, QParser: ExtendedDismaxQParser, altquerystring: null, boost_queries: null, parsed_boost_queries: [], boostfuncs: null, {code} h3. 
When adding additional conditions to org_name_html it fails {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education AND center ) AND volunteersslots:[1 TO *], querystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education AND center ) AND volunteersslots:[1 TO *], parsedquery: (+((org_name_html:militari affiliateorganizationid:0087 DisjunctionMaxQuery((aggregatefield:militari)) DisjunctionMaxQuery((aggregatefield:veteran)) DisjunctionMaxQuery((aggregatefield:museum)) DisjunctionMaxQuery((aggregatefield:educ)) DisjunctionMaxQuery((aggregatefield:center)) DisjunctionMaxQuery((aggregatefield:to)))~8))/no_coord, parsedquery_toString: +((org_name_html:militari affiliateorganizationid:0087 (aggregatefield:militari) (aggregatefield:veteran) (aggregatefield:museum) (aggregatefield:educ) (aggregatefield:center)
[jira] [Updated] (SOLR-6320) ExtendedDismaxQParser (edismax) parser fails on some queries
[ https://issues.apache.org/jira/browse/SOLR-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban D updated SOLR-6320: Description: When querying for some specific attributes parsing fails and it falls back to use the default search field *aggregatefield*. h3. This works as expected {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], querystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], parsedquery: (+(+MatchAllDocsQuery(*:*) +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]))/no_coord, parsedquery_toString: +(+*:* +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]), explain: { 71f3083d53e356b810cf31224f5de00e: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34264) [DefaultSimilarity], result of: 2.1114364 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34264) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34264) [DefaultSimilarity], result of: 2.8322654 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34264) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm , b36156ebd7204985f752f1c191ea4d18: 5.118334 = (MATCH) 
sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34265) [DefaultSimilarity], result of: 2.1114364 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34265) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34265) [DefaultSimilarity], result of: 2.8322654 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34265) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm }, QParser: ExtendedDismaxQParser, altquerystring: null, boost_queries: null, parsed_boost_queries: [], boostfuncs: null, {code} h3. 
When adding additional conditions to org_name_html it fails {code:title=debug|borderStyle=solid} rawquerystring: *:* AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education AND center ) AND volunteersslots:[1 TO *], querystring: *:* AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education AND center ) AND volunteersslots:[1 TO *], parsedquery: (+((affiliateorganizationid:0087 DisjunctionMaxQuery((aggregatefield:militari)) DisjunctionMaxQuery((aggregatefield:veteran)) DisjunctionMaxQuery((aggregatefield:museum)) DisjunctionMaxQuery((aggregatefield:educ)) DisjunctionMaxQuery((aggregatefield:center)) DisjunctionMaxQuery((aggregatefield:to)))~7))/no_coord, parsedquery_toString: +((affiliateorganizationid:0087 (aggregatefield:militari) (aggregatefield:veteran) (aggregatefield:museum) (aggregatefield:educ) (aggregatefield:center) (aggregatefield:to))~7), explain: {}, QParser: ExtendedDismaxQParser, altquerystring: null, boost_queries: null, parsed_boost_queries: [], boostfuncs: null, {code} was: When querying for some specific attributes parsing fails and it falls back to use the default search field *aggregatefield*. h3. This works as expected {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND
[jira] [Commented] (SOLR-6320) ExtendedDismaxQParser (edismax) parser fails on some queries
[ https://issues.apache.org/jira/browse/SOLR-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085491#comment-14085491 ] Esteban D commented on SOLR-6320: - Maybe this is not a bug. I noticed that "and" is declared twice. The logic tokenizes a sentence into words concatenated with *AND*. h4. This breaks: {code}org_name_html:( military AND veterans AND museum AND and AND education AND center ){code} h4. This works: {code}org_name_html:( military AND veterans AND museum AND education AND center ){code} ExtendedDismaxQParser (edismax) parser fails on some queries Key: SOLR-6320 URL: https://issues.apache.org/jira/browse/SOLR-6320 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.5 Environment: Sun Microsystems Inc. OpenJDK 64-Bit Server VM (1.6.0_31 23.25-b01) Apache Tomcat/6.0.35 Java 1.6.0_31-b31 Reporter: Esteban D When querying for some specific attributes parsing fails and it falls back to use the default search field *aggregatefield*. h3.
This works as expected {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], querystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], parsedquery: (+(+MatchAllDocsQuery(*:*) +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]))/no_coord, parsedquery_toString: +(+*:* +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]), explain: { 71f3083d53e356b810cf31224f5de00e: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34264) [DefaultSimilarity], result of: 2.1114364 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34264) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34264) [DefaultSimilarity], result of: 2.8322654 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34264) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm , b36156ebd7204985f752f1c191ea4d18: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34265) [DefaultSimilarity], result of: 2.1114364 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = 
idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34265) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34265) [DefaultSimilarity], result of: 2.8322654 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34265) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm }, QParser: ExtendedDismaxQParser, altquerystring: null, boost_queries: null, parsed_boost_queries: [], boostfuncs: null, {code} h3. When adding additional conditions to org_name_html it fails {code:title=debug|borderStyle=solid} rawquerystring: *:* AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education AND center ) AND volunteersslots:[1 TO *], querystring: *:* AND affiliateorganizationid:\0087\ AND org_name_html:( military AND veterans AND museum AND and AND education
[jira] [Comment Edited] (SOLR-6320) ExtendedDismaxQParser (edismax) parser fails on some queries
[ https://issues.apache.org/jira/browse/SOLR-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085491#comment-14085491 ] Esteban D edited comment on SOLR-6320 at 8/4/14 11:30 PM: -- Maybe this is not a bug. I noticed that the "and" is declared twice: the logic tokenizes a sentence into words concatenated with *AND*. In this case it causes: *and AND* h4. This breaks: {code}org_name_html:( military AND veterans AND museum AND and AND education AND center ){code} h4. This works: {code}org_name_html:( military AND veterans AND museum AND education AND center ){code} was (Author: esteband): Maybe this is not a bug. I noticed that the "and" is declared twice: the logic tokenizes a sentence into words concatenated with *AND*. h4. This breaks: {code}org_name_html:( military AND veterans AND museum AND and AND education AND center ){code} h4. This works: {code}org_name_html:( military AND veterans AND museum AND education AND center ){code} ExtendedDismaxQParser (edismax) parser fails on some queries Key: SOLR-6320 URL: https://issues.apache.org/jira/browse/SOLR-6320 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.5 Environment: Sun Microsystems Inc. OpenJDK 64-Bit Server VM (1.6.0_31 23.25-b01) Apache Tomcat/6.0.35 Java 1.6.0_31-b31 Reporter: Esteban D When querying for some specific attributes parsing fails and it falls back to using the default search field *aggregatefield*. h3.
This works as expected {code:title=debug|borderStyle=solid} rawquerystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], querystring: *:* AND org_name_html:military AND affiliateorganizationid:\0087\ AND volunteersslots:[1 TO *], parsedquery: (+(+MatchAllDocsQuery(*:*) +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]))/no_coord, parsedquery_toString: +(+*:* +org_name_html:militari +affiliateorganizationid:0087 +volunteersslots:[1 TO *]), explain: { 71f3083d53e356b810cf31224f5de00e: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34264) [DefaultSimilarity], result of: 2.1114364 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34264) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34264) [DefaultSimilarity], result of: 2.8322654 = score(doc=34264,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34264, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34264) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm , b36156ebd7204985f752f1c191ea4d18: 5.118334 = (MATCH) sum of: 0.08731608 = (MATCH) MatchAllDocsQuery, product of: 0.08731608 = queryNorm 2.1114364 = (MATCH) weight(org_name_html:militari in 34265) [DefaultSimilarity], result of: 2.1114364 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.85874873 = queryWeight, product of: 9.834944 = 
idf(docFreq=17, maxDocs=123663) 0.08731608 = queryNorm 2.458736 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 9.834944 = idf(docFreq=17, maxDocs=123663) 0.25 = fieldNorm(doc=34265) 2.8322654 = (MATCH) weight(affiliateorganizationid:0087 in 34265) [DefaultSimilarity], result of: 2.8322654 = score(doc=34265,freq=1.0 = termFreq=1.0 ), product of: 0.497295 = queryWeight, product of: 5.6953425 = idf(docFreq=1129, maxDocs=123663) 0.08731608 = queryNorm 5.6953425 = fieldWeight in 34265, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.6953425 = idf(docFreq=1129, maxDocs=123663) 1.0 = fieldNorm(doc=34265) 0.08731608 = (MATCH) ConstantScore(volunteersslots:[1 TO *]), product of: 1.0 = boost 0.08731608 = queryNorm }, QParser: ExtendedDismaxQParser, altquerystring: null, boost_queries: null,
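The workaround Esteban describes (the literal word "and" colliding with the AND operator when the client joins tokens) can be sketched client-side. This is a hypothetical illustration, not Solr code: the helper name `build_fielded_query` and the operator-word list are assumptions made for the example.

```python
# Hypothetical client-side fix: drop bare boolean operator words from user
# text before joining the remaining terms with AND, so a literal "and" in the
# input cannot collide with the edismax AND operator.
OPERATOR_WORDS = {"and", "or", "not"}

def build_fielded_query(field, text):
    terms = [t for t in text.split() if t.lower() not in OPERATOR_WORDS]
    return "%s:( %s )" % (field, " AND ".join(terms))

print(build_fielded_query("org_name_html",
                          "military veterans museum and education center"))
# org_name_html:( military AND veterans AND museum AND education AND center )
```

Escaping or quoting the offending token would be an alternative to dropping it, if a literal "and" must remain searchable.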
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085508#comment-14085508 ] Yonik Seeley commented on SOLR-6319: It would also be trivial to produce examples where over-requesting by any given amount would also fail to produce correct results. Over-requesting does not fix the problem, it's simply a trade-off... increased cost for decreased chance of incorrect results. It was initially my judgement that over-requesting was worth it for sorting by count, but probably not worth it for sorting by index. Even in that case, the degree that we over-request by was a guess. If you have a different intuition about the best amount of over-requesting, I'm all ears. if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-6187) facet.mincount ignored in range date faceting using distributed search
[ https://issues.apache.org/jira/browse/SOLR-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-6187: Assignee: Erick Erickson facet.mincount ignored in range date faceting using distributed search -- Key: SOLR-6187 URL: https://issues.apache.org/jira/browse/SOLR-6187 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.8, 4.8.1 Reporter: Zaccheo Bagnati Assignee: Erick Erickson While I was trying to do range faceting with gap +1YEAR using shards, I noticed that the facet.mincount parameter seems to be ignored. Issue can be reproduced in this way: Create 2 cores testshard1 and testshard2 with:
solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_41</luceneMatchVersion>
  <lib dir="/opt/solr/dist" regex="solr-cell-.*\.jar"/>
  <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
  <updateHandler class="solr.DirectUpdateHandler2" />
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="df">id</str>
    </lst>
  </requestHandler>
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="q">solrpingquery</str>
    </lst>
    <lst name="defaults">
      <str name="echoParams">all</str>
    </lst>
  </requestHandler>
</config>
schema.xml
<?xml version="1.0" ?>
<schema name="${solr.core.name}" version="1.5" xmlns:xi="http://www.w3.org/2001/XInclude">
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="id" type="int" indexed="true" stored="true" multiValued="false" />
  <field name="date" type="date" indexed="true" stored="true" multiValued="false" />
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>id</defaultSearchField>
</schema>
Insert in testshard1:
<add>
  <doc>
    <field name="id">1</field>
    <field name="date">2014-06-20T12:51:00Z</field>
  </doc>
</add>
Insert into testshard2:
<add>
  <doc>
    <field name="id">2</field>
    <field name="date">2013-06-20T12:51:00Z</field>
  </doc>
</add>
Now if I execute:
curl "http://localhost:8983/solr/testshard1/select?q=id:1&facet=true&facet.mincount=1&facet.range=date&f.date.facet.range.start=1900-01-01T00:00:00Z&f.date.facet.range.end=NOW&f.date.facet.range.gap=%2B1YEAR&shards=localhost%3A8983%2Fsolr%2Ftestshard1%2Clocalhost%3A8983%2Fsolr%2Ftestshard2&shards.info=true&wt=json"
I obtain:
[jira] [Commented] (SOLR-6300) facet.mincount fails to work if SolrCloud distrib=true is set
[ https://issues.apache.org/jira/browse/SOLR-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085542#comment-14085542 ] Erick Erickson commented on SOLR-6300: -- Hoss: Thanks, I've assigned them to myself so I'll take a look. facet.mincount fails to work if SolrCloud distrib=true is set - Key: SOLR-6300 URL: https://issues.apache.org/jira/browse/SOLR-6300 Project: Solr Issue Type: Bug Components: SearchComponents - other, SolrCloud Affects Versions: 5.0 Reporter: Vamsee Yarlagadda Assignee: Erick Erickson I notice that using facet.mincount in SolrCloud mode with distrib=true fails to filter the facets based on the count. However, the same query with distrib=false works as expected. * Indexed some data as provided by the upstream test. https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L633 * Test being run: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L657 * Running in SolrCloud mode with distrib=false (facet.mincount works as expected) {code} $ curl "http://search-testing-c5-3.ent.cloudera.com:8983/solr/simple_faceting_coll/select?facet.date.start=1976-07-01T00%3A00%3A00.000Z&facet=true&facet.mincount=1&q=*%3A*&facet.date=bday&facet.date.other=all&facet.date.gap=%2B1DAY&facet.date.end=1976-07-01T00%3A00%3A00.000Z%2B1MONTH&rows=0&indent=true&wt=xml&distrib=false"
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">3</int>
    <lst name="params">
      <str name="facet.date.start">1976-07-01T00:00:00.000Z</str>
      <str name="facet">true</str>
      <str name="indent">true</str>
      <str name="facet.mincount">1</str>
      <str name="q">*:*</str>
      <str name="facet.date">bday</str>
      <str name="distrib">false</str>
      <str name="facet.date.gap">+1DAY</str>
      <str name="facet.date.other">all</str>
      <str name="wt">xml</str>
      <str name="facet.date.end">1976-07-01T00:00:00.000Z+1MONTH</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="33" start="0"></result>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields"/>
    <lst name="facet_dates">
      <lst name="bday">
        <int name="1976-07-03T00:00:00Z">1</int>
        <int name="1976-07-04T00:00:00Z">1</int>
        <int name="1976-07-05T00:00:00Z">1</int>
        <int name="1976-07-13T00:00:00Z">1</int>
        <int name="1976-07-15T00:00:00Z">1</int>
        <int name="1976-07-21T00:00:00Z">1</int>
        <int name="1976-07-30T00:00:00Z">1</int>
        <str name="gap">+1DAY</str>
        <date name="start">1976-07-01T00:00:00Z</date>
        <date name="end">1976-08-01T00:00:00Z</date>
        <int name="before">2</int>
        <int name="after">0</int>
        <int name="between">6</int>
      </lst>
    </lst>
    <lst name="facet_ranges"/>
  </lst>
</response>
{code} * SolrCloud mode with distrib=true (facet.mincount fails to show effect) {code} $ curl "http://search-testing-c5-3.ent.cloudera.com:8983/solr/simple_faceting_coll/select?facet.date.start=1976-07-01T00%3A00%3A00.000Z&facet=true&facet.mincount=1&q=*%3A*&facet.date=bday&facet.date.other=all&facet.date.gap=%2B1DAY&facet.date.end=1976-07-01T00%3A00%3A00.000Z%2B1MONTH&rows=0&indent=true&wt=xml&distrib=true"
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">12</int>
    <lst name="params">
      <str name="facet.date.start">1976-07-01T00:00:00.000Z</str>
      <str name="facet">true</str>
      <str name="indent">true</str>
      <str name="facet.mincount">1</str>
      <str name="q">*:*</str>
      <str name="facet.date">bday</str>
      <str name="distrib">true</str>
      <str name="facet.date.gap">+1DAY</str>
      <str name="facet.date.other">all</str>
      <str name="wt">xml</str>
      <str name="facet.date.end">1976-07-01T00:00:00.000Z+1MONTH</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="63" start="0" maxScore="1.0"></result>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields"/>
    <lst name="facet_dates">
      <lst name="bday">
        <int name="1976-07-01T00:00:00Z">0</int>
        <int name="1976-07-02T00:00:00Z">0</int>
        <int name="1976-07-03T00:00:00Z">2</int>
        <int name="1976-07-04T00:00:00Z">2</int>
        <int name="1976-07-05T00:00:00Z">2</int>
        <int name="1976-07-06T00:00:00Z">0</int>
        <int name="1976-07-07T00:00:00Z">0</int>
        <int name="1976-07-08T00:00:00Z">0</int>
        <int name="1976-07-09T00:00:00Z">0</int>
        <int name="1976-07-10T00:00:00Z">0</int>
        <int name="1976-07-11T00:00:00Z">0</int>
        <int name="1976-07-12T00:00:00Z">1</int>
        <int name="1976-07-13T00:00:00Z">1</int>
        <int name="1976-07-14T00:00:00Z">0</int>
        <int name="1976-07-15T00:00:00Z">2</int>
        <int name="1976-07-16T00:00:00Z">0</int>
        <int name="1976-07-17T00:00:00Z">0</int>
        <int name="1976-07-18T00:00:00Z">0</int>
        <int name="1976-07-19T00:00:00Z">0</int>
        <int name="1976-07-20T00:00:00Z">0</int>
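The distrib=true output above shows zero-count buckets surviving despite facet.mincount=1, which suggests the mincount filter is not being re-applied after per-shard counts are merged. A minimal sketch of the correct merge order, as a language-agnostic illustration (Solr's FacetComponent is Java; `merge_facets` is a hypothetical name):

```python
# Sketch: in distributed faceting, each shard reports raw bucket counts and
# facet.mincount must be applied to the *merged* totals, not per shard (and
# not skipped after the merge, which is what the distrib=true output suggests).
from collections import Counter

def merge_facets(shard_counts, mincount=0):
    merged = Counter()
    for counts in shard_counts:
        merged.update(counts)  # sum each bucket across shards
    # filter by mincount only after all shard contributions are summed
    return {k: v for k, v in sorted(merged.items()) if v >= mincount}

shard1 = {"1976-07-03": 1, "1976-07-06": 0}
shard2 = {"1976-07-03": 1, "1976-07-06": 0}
print(merge_facets([shard1, shard2], mincount=1))
# {'1976-07-03': 2}
```

With mincount applied after the merge, the zero-count 1976-07-06 bucket is dropped while the cross-shard total for 1976-07-03 is preserved.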
[jira] [Assigned] (SOLR-6154) SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response
[ https://issues.apache.org/jira/browse/SOLR-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-6154: Assignee: Erick Erickson SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response -- Key: SOLR-6154 URL: https://issues.apache.org/jira/browse/SOLR-6154 Project: Solr Issue Type: Bug Affects Versions: 4.5.1, 4.8.1 Environment: Solr 4.5.1 under Linux - explicit id routing Indexed 400,000+ Documents explicit routing custom schema.xml Solr 4.8.1 under Windows+Cygwin Indexed 6 Documents implicit id routing out of the box schema Reporter: Ronald Matamoros Assignee: Erick Erickson Attachments: HowToReplicate.pdf, data.xml Attached - PDF with instructions on how to replicate. - data.xml to replicate index The f.field.facet.mincount option on a distributed search gives an inconsistent list of buckets on a range facet. Experiencing that some buckets are ignored when using the option f.field.facet.mincount=1. The Solr logs do not indicate any error or warning during execution. The debug=true option and increasing the log levels on the FacetComponent do not provide any hints to the behaviour. Replicated the issue on both Solr 4.5.1 & 4.8.1. Example: removing the f.field.facet.mincount=1 option gives the expected list of buckets for the 6 documents matched.
<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="0.0">0</int>
      <int name="50.0">1</int>
      <int name="100.0">0</int>
      <int name="150.0">3</int>
      <int name="200.0">0</int>
      <int name="250.0">1</int>
      <int name="300.0">0</int>
      <int name="350.0">0</int>
      <int name="400.0">0</int>
      <int name="450.0">0</int>
      <int name="500.0">0</int>
      <int name="550.0">0</int>
      <int name="600.0">0</int>
      <int name="650.0">0</int>
      <int name="700.0">0</int>
      <int name="750.0">1</int>
      <int name="800.0">0</int>
      <int name="850.0">0</int>
      <int name="900.0">0</int>
      <int name="950.0">0</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">2</int>
  </lst>
</lst>
Using the f.field.facet.mincount=1 option removes the 0 count buckets but will also omit bucket <int name="250.0">
<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="50.0">1</int>
      <int name="150.0">3</int>
      <int name="750.0">1</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">4</int>
  </lst>
</lst>
Resubmitting the query renders a different bucket list (May need to resubmit a couple times)
<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="150.0">3</int>
      <int name="250.0">1</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">2</int>
  </lst>
</lst>
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5859) Remove Version from Analyzer constructors
[ https://issues.apache.org/jira/browse/LUCENE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst updated LUCENE-5859: --- Summary: Remove Version from Analyzer constructors (was: Remove Version.java completely) Remove Version from Analyzer constructors - Key: LUCENE-5859 URL: https://issues.apache.org/jira/browse/LUCENE-5859 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Fix For: 5.0 Attachments: LUCENE-5859.patch, LUCENE-5859_dead_code.patch This has always been a mess: analyzers are easy enough to make on your own, we don't need to take responsibility for the users analysis chain for 2 major releases. The code maintenance is horrible here. This creates a huge usability issue too, and as seen from numerous mailing list issues, users don't even understand how this versioning works anyway. I'm sure someone will whine if i try to remove these constants, but we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system. If you want to make the argument that doing this is trappy (i heard this before), i think thats bogus, and ill counter by trying to remove them. Either way, I'm personally not going to add any of this kind of back compat logic myself ever again. Updated: description of the issue updated as expected. We should remove this API completely. No one else on the planet has APIs that require a mandatory version parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5859) Remove Version.java completely
[ https://issues.apache.org/jira/browse/LUCENE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst updated LUCENE-5859: --- Attachment: LUCENE-5859.patch I modified [~rcmuir]'s patch, per my recent proposal for this issue on the dev@ email list. This adds setVersion/getVersion to Analyzer, and removes Version constructors from any concrete Analyzer constructors, as well as from TokenFilters, etc. Remove Version.java completely -- Key: LUCENE-5859 URL: https://issues.apache.org/jira/browse/LUCENE-5859 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Fix For: 5.0 Attachments: LUCENE-5859.patch, LUCENE-5859_dead_code.patch This has always been a mess: analyzers are easy enough to make on your own, we don't need to take responsibility for the users analysis chain for 2 major releases. The code maintenance is horrible here. This creates a huge usability issue too, and as seen from numerous mailing list issues, users don't even understand how this versioning works anyway. I'm sure someone will whine if i try to remove these constants, but we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system. If you want to make the argument that doing this is trappy (i heard this before), i think thats bogus, and ill counter by trying to remove them. Either way, I'm personally not going to add any of this kind of back compat logic myself ever again. Updated: description of the issue updated as expected. We should remove this API completely. No one else on the planet has APIs that require a mandatory version parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
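The "no-arg ctors forwarding to VERSION_CURRENT" idea discussed in this issue can be shown with a toy sketch. This is a language-agnostic illustration with hypothetical names, not Lucene's real Java API: the point is only that a default version keeps prototyping simple while an explicit setter/argument preserves back-compat opt-in.

```python
# Sketch of the proposal: version defaults to the current release, so code
# that does not care about back-compat never touches versioning at all.
VERSION_CURRENT = "5.0"  # hypothetical stand-in for Lucene's latest version

class Analyzer:
    def __init__(self, version=VERSION_CURRENT):
        # no-arg construction "just works" for prototyping;
        # back-compat users pass (or set) an older version explicitly
        self.version = version

a = Analyzer()        # prototyping: no version ceremony
b = Analyzer("4.8")   # back-compat: explicit opt-in
print(a.version, b.version)
# 5.0 4.8
```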
[jira] [Commented] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085546#comment-14085546 ] Hoss Man commented on SOLR-6319: Let's put it this way... * the comment that you quoted explicitly says "Overrequesting can help a little here, but not as much as when sorting by count" but then does no overrequesting at all -- which looks like a bug to me, and is at best a disconnect between the comment and the code: ** if overrequesting can help, then why wasn't it done, or at least an option added to do it? ** so what if it doesn't help _as much_ as with sorting by count, if it still helps? If there was _any_ overrequesting here (even if it had just been a smaller amount than when sort=count) then the comment would match the code, and i would have assumed it was intentional. * i can not find any tests of distributed field faceting that combine sort=index + mincount > 1 + limit > 0, which suggests to me that the code (if not the comment) wasn't thought through completely. (this was based on some creative grepping - if i missed an assert i'm happy to be corrected here) * SOLR-2894 introduces new request params to allow users fine grained control over the amount of overrequest involved if they so choose (facet.overrequest.ratio & facet.overrequest.count) with default values that match the current behavior of the code when facet.sort=count (limit * 1.5 + 10) * i plan to change the existing facet.field logic such that the default behavior for sort=index + mincount > 1 + limit > 0 _will_ overrequest by default, and respects those params (and their defaults) to decide how much over requesting if an expert user doesn't explicitly set them. * if you (still) don't like me calling this a bug then feel free to edit the jira and call it WTF you want.
There aren't enough hours in the day for me to care about arguing that nit -- what i care about is making the code behavior better by default, and having a record in jira moving forward of when the behavior changed. if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
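Hoss's description of the SOLR-2894 defaults (per-shard over-request of "limit * 1.5 + 10", tunable via facet.overrequest.ratio and facet.overrequest.count) can be sketched as a small sizing function. The exact semantics here (how a negative/unlimited limit is handled, integer truncation) are assumptions for illustration, not the patch's actual code:

```python
# Sketch of per-shard over-request sizing, assuming the SOLR-2894 parameter
# semantics described above: effective shard limit = limit * ratio + count,
# with defaults matching the current sort=count behavior (1.5 and 10).
def shard_facet_limit(limit, ratio=1.5, count=10):
    if limit < 0:        # unlimited facet.limit: shards return everything anyway
        return limit
    return int(limit * ratio) + count

print(shard_facet_limit(20))   # 40 -> each shard asked for 40, not 20
print(shard_facet_limit(0))    # 10 -> the additive term covers tiny limits
```

The additive `count` term is what handles the very-low-limit cases Yonik mentions below a multiplier's reach; the `ratio` term scales with larger limits.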
[jira] [Commented] (LUCENE-5859) Remove Version from Analyzer constructors
[ https://issues.apache.org/jira/browse/LUCENE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1408#comment-1408 ] Ryan Ernst commented on LUCENE-5859: Note this is only the Analyzer side of that proposal. I will work on consolidating Version and LUCENE_MAIN_VERSION separately. Remove Version from Analyzer constructors - Key: LUCENE-5859 URL: https://issues.apache.org/jira/browse/LUCENE-5859 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Fix For: 5.0 Attachments: LUCENE-5859.patch, LUCENE-5859_dead_code.patch This has always been a mess: analyzers are easy enough to make on your own, we don't need to take responsibility for the users analysis chain for 2 major releases. The code maintenance is horrible here. This creates a huge usability issue too, and as seen from numerous mailing list issues, users don't even understand how this versioning works anyway. I'm sure someone will whine if i try to remove these constants, but we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system. If you want to make the argument that doing this is trappy (i heard this before), i think thats bogus, and ill counter by trying to remove them. Either way, I'm personally not going to add any of this kind of back compat logic myself ever again. Updated: description of the issue updated as expected. We should remove this API completely. No one else on the planet has APIs that require a mandatory version parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6319) consider increasing over-request amount when sorting by index with mincount > 1
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-6319: --- Summary: consider increasing over-request amount when sorting by index with mincount > 1 (was: if mincount > 1, facet.field needs to overrequest even if facet.sort=index) consider increasing over-request amount when sorting by index with mincount > 1 --- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-6319: --- Priority: Minor (was: Major) if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6319) if mincount > 1, facet.field needs to overrequest even if facet.sort=index
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-6319: --- Issue Type: Improvement (was: Bug) if mincount > 1, facet.field needs to overrequest even if facet.sort=index -- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6319) consider increasing over-request amount when sorting by index with mincount > 1
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085557#comment-14085557 ] Yonik Seeley commented on SOLR-6319: bq. if you (still) don't like me calling this a bug then feel free to edit the jira and call it WTF you want. Done. I do feel the distinction is important since adding more over-request still won't fix the bug if it is categorized as such. consider increasing over-request amount when sorting by index with mincount > 1 --- Key: SOLR-6319 URL: https://issues.apache.org/jira/browse/SOLR-6319 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Discovered this while working on SOLR-2894. the logic for distributed faceting ignores over requesting (beyond the user specified facet.limit) if the facet.sort is index order -- but the rationale for doing this falls apart if the user has specified a facet.mincount > 1 -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6319) consider increasing over-request amount when sorting by index with mincount > 1
[ https://issues.apache.org/jira/browse/SOLR-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085580#comment-14085580 ]

Yonik Seeley commented on SOLR-6319:
------------------------------------

bq. the comment that you quoted explicitly says "Overrequesting can help a little here, but not as much as when sorting by count" but then does no overrequesting at all

Actually, a type of over-requesting is already built in. Say you have 10 shards and request the top 20. If the stars align, one can get correct results by requesting 2 items per shard. There is obviously a high chance of error (but it depends on the data). As one requests more data per shard, the chance of error decreases. I'm not sure there's anything magic about 20, except that if all results are on one shard then we are still OK. In the general case of data randomized across shards, though, there doesn't seem to be anything special about 20. So we request a total of 200 and select the top 20... there's your built-in over-request.

And even if there were no built-in over-request, just because it "can help a little here" says nothing about whether it's worth the cost. Looking at your example, I might be convinced of an over-request of the form of +10 or something to handle the very low limit cases, but I don't think we should apply a multiplier by default, as is done with sort-by-count.

Anyway, if you are still asserting that the lack of over-requesting *is* a bug... please post a patch that attempts to fix things via over-requesting only, and then I'll show you an example that still breaks :-)
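The per-shard arithmetic in the comment above can be sketched as follows. This is a toy illustration of the "built-in over-request" idea, not Solr's actual distributed-faceting code; the class and method names are made up. Each of N shards is asked for the full user-level facet.limit, so up to N * limit candidates are gathered, merged, and trimmed back to the global top `limit`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: with 10 shards and facet.limit=20, each shard returns
// up to 20 terms, so up to 200 candidates are merged before picking the top 20.
public class BuiltInOverRequest {

    // Merge per-shard term counts and return the top `limit` terms by count.
    static List<Map.Entry<String, Integer>> mergeTop(List<Map<String, Integer>> shardCounts, int limit) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> shard : shardCounts) {
            // Sum counts for terms that appear on more than one shard.
            shard.forEach((term, count) -> merged.merge(term, count, Integer::sum));
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(merged.entrySet());
        sorted.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        int shards = 10, limit = 20;
        // Each shard is asked for the full limit, not limit/shards -- this is
        // where the implicit over-request comes from.
        System.out.println("candidates gathered: " + shards * limit);
    }
}
```

The error Yonik describes comes from terms that rank just below a shard's local cutoff: their counts are missing from the merge, so the global top-20 can be wrong, and requesting more per shard only shrinks (never eliminates) that window.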
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #670: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/670/

1 tests failed.
FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch

Error Message:
We have a failed SPLITSHARD task

Stack Trace:
java.lang.AssertionError: We have a failed SPLITSHARD task
	at __randomizedtesting.SeedInfo.seed([57E1EBBBE6C942C6:D60765A3919622FA]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.apache.solr.cloud.MultiThreadedOCPTest.testTaskExclusivity(MultiThreadedOCPTest.java:125)
	at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:71)

Build Log:
[...truncated 53069 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:490: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:182: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1

Total time: 206 minutes 27 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure
[jira] [Created] (LUCENE-5869) FuzzyQuery should require positive maxExpansions
Ryan Ernst created LUCENE-5869:
-------------------------------
    Summary: FuzzyQuery should require positive maxExpansions
    Key: LUCENE-5869
    URL: https://issues.apache.org/jira/browse/LUCENE-5869
    Project: Lucene - Core
    Issue Type: Bug
    Reporter: Ryan Ernst

FuzzyQuery currently only disallows negative values of {{maxExpansions}}. However, passing {{0}} causes an NPE when the underlying {{TopTermsRewrite}} does a {{peek()}} on an empty queue, which returns {{null}}, and then goes on using it unknowingly.
[jira] [Updated] (LUCENE-5869) FuzzyQuery should require positive maxExpansions
[ https://issues.apache.org/jira/browse/LUCENE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Ernst updated LUCENE-5869:
-------------------------------
    Attachment: LUCENE-5869.patch

Simple patch with test.
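The fix described in LUCENE-5869 amounts to tightening the constructor's argument check. The sketch below is hypothetical (not the actual FuzzyQuery source, whose constructor takes several more parameters): rejecting {{maxExpansions <= 0}} up front fails fast with a clear message instead of letting {{TopTermsRewrite}} dereference the {{null}} that {{peek()}} returns on an empty queue.

```java
// Hypothetical sketch of the argument check the issue proposes. Previously
// only negative values were rejected, so 0 slipped through and caused an NPE
// much later, inside the rewrite.
public class FuzzyQuerySketch {
    final int maxExpansions;

    FuzzyQuerySketch(int maxExpansions) {
        if (maxExpansions <= 0) {
            throw new IllegalArgumentException(
                "maxExpansions must be positive, got " + maxExpansions);
        }
        this.maxExpansions = maxExpansions;
    }
}
```

Failing in the constructor surfaces the caller's bug at the call site rather than as a confusing NPE deep in the query-rewrite machinery.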
[jira] [Updated] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Potter updated SOLR-5810:
---------------------------------
    Attachment: SOLR-5810.patch

Here's an updated patch that works with the latest patch for SOLR-5473 applied to trunk. One thing that's easy to change is that I didn't activate the nav controls if you have fewer than 10 collections total. If we want the nav controls active at all times, that's fine too; I just didn't want to introduce more complexity to the existing UI if it's not needed, and I don't think you need paging or filtering if you only have 10 collections (one-liner at ZookeeperInfoServlet line 257).

State of external collections not displayed in cloud graph panel
----------------------------------------------------------------
    Key: SOLR-5810
    URL: https://issues.apache.org/jira/browse/SOLR-5810
    Project: Solr
    Issue Type: Improvement
    Components: SolrCloud, web gui
    Reporter: Timothy Potter
    Assignee: Timothy Potter
    Attachments: SOLR-5810-prelim.patch, SOLR-5810.patch, SOLR-5810.prelim2.patch

External collections (SOLR-5473) are not displayed in the Cloud > graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online.
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085801#comment-14085801 ]

Timothy Potter commented on SOLR-5810:
--------------------------------------

Also, I had to clear the cache to get the changes to the UI files (cloud.js / .css / .html) to take effect.
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1716 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1716/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.
REGRESSION: org.apache.solr.schema.TestCloudSchemaless.testDistribSearch

Error Message:
Timeout occured while waiting response from server at: https://127.0.0.1:55708/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: https://127.0.0.1:55708/collection1
	at __randomizedtesting.SeedInfo.seed([941B05E32BE07E45:15FD8BFB5CBF1E79]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:559)
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at org.apache.solr.schema.TestCloudSchemaless.doTest(TestCloudSchemaless.java:140)
	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867)
	at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)