Re: Problems installing Pylucene on Ubuntu 12.04
Thanks for the quick reply. The tests work fine with that patch. Uwe

On Thu, 6 Mar 2014, Ritzschke, Uwe wrote: Hello, I'm facing problems with installing Pylucene on an Ubuntu 12.04 server (32-bit). Perhaps someone can give me some helpful advice? I've followed the official installation instructions [1]. It seems that building and installing JCC works fine. Also, running make to build Pylucene seems to succeed. But if I run make test, I get the errors attached below.

It looks like there is a left-over 'import pdb; pdb.set_trace()' statement in the test_PythonDirectory.py test, at line 260. Please remove it and re-run the tests. Thanks! Andi
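A leftover debugger statement like the one Andi found will halt an unattended test run at an interactive prompt. A simple way to catch such leftovers before committing is to scan the test sources for them; the helper below is a hypothetical sketch, not part of PyLucene's build:

```python
# Hypothetical helper (not part of PyLucene) that scans a source tree for
# leftover debugger statements such as "import pdb; pdb.set_trace()".
import os
import re

DEBUGGER_PATTERN = re.compile(r"\bpdb\.set_trace\(\)")

def find_debugger_leftovers(root):
    """Return (path, line_number) pairs for every pdb.set_trace() call found."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if DEBUGGER_PATTERN.search(line):
                        hits.append((path, lineno))
    return hits
```

Running this over the pylucene test directory would have flagged test_PythonDirectory.py before make test stalled.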
Re: Suggestions about writing / extending QueryParsers
Thanks Tim and Upayavira for your replies. I still need to decide what the final syntax could be; generally speaking, the ideal would be that I am able to extend the current Lucene syntax with a new expression which triggers the creation of a more-like-this query, with something like +title:foo +text:"for similar docs"%2, where the phrase between quotes will generate a MoreLikeThisQuery on that text if it's followed by the % character (and the number 2 may control the MLT configuration, e.g. min document freq = min term freq = 2), similarly to what is done for proximity search (not sure about using %, it's just a syntax example). I guess then I'd need to extend the classic query parser, as per Tim's suggestions, and I'd assume that if this goes into the classic QP it should be a no-brainer on the Solr side. Does it sound correct / feasible? Regards, Tommaso

2014-03-06 15:08 GMT+01:00 Upayavira u...@odoko.co.uk: Tommaso, Do say more about what you're thinking of. I'm currently getting my dev environment up to look into enhancing the MoreLikeThisHandler to be able to handle function query boosts. This should be eminently possible from my initial research. However, if you're thinking of something more powerful, perhaps we can work together. Upayavira

On Thu, Mar 6, 2014, at 11:23 AM, Tommaso Teofili wrote: Hi all, I'm thinking about writing/extending a QueryParser for MLT queries; I've never really looked into that code too much. While I'm doing that now, I'm wondering if anyone has suggestions on how to start with such a topic. Should I write a new grammar for that? Or can I just extend an existing grammar / class? Thanks in advance, Tommaso
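The syntax Tommaso proposes (a quoted phrase followed by %N triggering an MLT query, with N feeding the configuration) can be illustrated with a toy recognizer. This is only a sketch of the proposed surface syntax, not Lucene's query parser; the % marker and the parse result shape are illustrative:

```python
# Toy recognizer for the proposed syntax: "some phrase"%N triggers a
# more-like-this style query, where N would feed the MLT configuration
# (e.g. min document freq = min term freq = N). Illustrative only.
import re

MLT_PATTERN = re.compile(r'"([^"]+)"%(\d+)')

def parse_mlt_clause(clause):
    """Return (like_text, min_freq) if the clause uses the %N suffix, else None."""
    m = MLT_PATTERN.fullmatch(clause.strip())
    if m is None:
        return None
    return m.group(1), int(m.group(2))
```

For example, parse_mlt_clause('"for similar docs"%2') would recognize the phrase and the configuration number 2, while a plain clause like 'title:foo' would fall through to normal parsing.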
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #606: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/606/ 1 tests failed. REGRESSION: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: some core start times did not change on reload Stack Trace: java.lang.AssertionError: some core start times did not change on reload at __randomizedtesting.SeedInfo.seed([F401181A3936ADA2:75E796024E69CD9E]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:835) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:202) Build Log: [...truncated 52622 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:494: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:176: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1 Total time: 141 minutes 34 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: JDK 8 : Third Release Candidate - Build 132 is available on java.net
Thanks Uwe! On 06/03/2014 23:59, Uwe Schindler wrote: Hi Rory, hi Lucene committers, Thanks for the info! I updated our Jenkins build server to use JDK 8 b132 and JDK 7u60 b07. In addition, the MacOSX virtual machine now also runs JDK 8 b132 builds (after I sorted out how to **not** make JDK 8 the default Java on OSX). Next to operating system upgrades I also updated to the latest versions of IBM J9 v6.0 and 7.1 (releases of January 29th). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

*From:* Rory O'Donnell Oracle, Dublin Ireland [mailto:rory.odonn...@oracle.com] *Sent:* Thursday, March 06, 2014 6:48 PM *To:* Uwe Schindler; Dawid Weiss *Cc:* dev@lucene.apache.org; Dalibor Topic; Cecilia Borg; Balchandra Vaidya *Subject:* JDK 8 : Third Release Candidate - Build 132 is available on java.net

Hi Uwe, Dawid, JDK 8 Third Release Candidate, Build 132, is now available for download and test: http://jdk8.java.net/download.html Please log all show stopper issues as soon as possible. Thanks for your support, Rory -- Rgds, Rory O'Donnell Quality Engineering Manager Oracle EMEA, Dublin, Ireland
[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923694#comment-13923694 ] Adrien Grand commented on LUCENE-5493: -- This is a very nice cleanup, and the ability to use any Sort object including expressions makes it very flexible. +1 to commit

Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch It's not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Fwd: Am I allowed to generate, enhance and republish a JavaDoc of an Apache project?
On 3/6/2014 8:42 PM, Alexandre Rafalovitch wrote: I asked this on the Apache legal list but got no reply. So I thought I'd try again with the group it will affect directly (the project not mentioned below is Solr). Any opinion on the legality, usefulness or possible underlying causes of the original problem would be appreciated. Regards, Alex. -- Forwarded message -- Date: Thu, Feb 27, 2014 at 4:41 PM Subject: Am I allowed to generate, enhance and republish a JavaDoc of an Apache project? To: legal-disc...@apache.org Hello, For one (of many) of the Apache projects that I use, I am very frustrated that Google cannot find the officially-hosted Javadocs.

I'm not going to try to comment about the legal issues, but I will tell you that I can very often find javadocs for a very specific class by searching for it along with a specific recent version number. So I will google for 'HttpSolrServer 4.7.0' and I have what I need. Finding related things is normally pretty easy, because there are clickable links for related classes buried in any given javadoc page. When searching for recent docs for SolrQuery, if I leave out the version number, I only get 4.2.1 and 3.6.0 near the top of the results. If I add a version number, the top results are actually kinda useless. A search for 'SolrQuery 4.6.1 API' did the trick. It's simply too common a phrase, especially when broken apart into Solr and Query. There are very likely things that we can do to improve our search engine results. I'm not well versed in SEO myself. Thanks, Shawn
Re: Fwd: Am I allowed to generate, enhance and republish a JavaDoc of an Apache project?
Thanks Shawn, these are neat tricks. I did find a couple of similar tricks with the versions, but I keep forgetting them, and sometimes even that does not help. Additionally, the way the Javadocs are built, the cross-links do not work too well. The classes are split between build modules, and if you want to go up and down an inheritance hierarchy that spans multiple modules (or the Solr/Lucene divide), you do not even get told those classes exist. So it becomes a case of needing to know something exists in order to look for it. I am not saying it is terrible, just that perhaps it can be made better. And I want to experiment with making it better, so I need the freedom to experiment faster than the official release policy allows. Regards, Alex.

P.S. I am not an SEO expert either. Once I learn to be one with this and/or other projects, I would be more than happy to contribute my skills back to the official documentation.

Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9703/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 43612 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/demo/org/apache/lucene/demo/xmlparser/FormBasedXmlQueryDemo.html [exec] missing Methods: doPost(javax.servlet.http.HttpServletRequest,%20javax.servlet.http.HttpServletResponse) [exec] [exec] Missing javadocs were found! BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:57: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:241: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2321: exec returned: 1 Total time: 54 minutes 47 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5826) Request caching SolrServer
Tommaso Teofili created SOLR-5826: - Summary: Request caching SolrServer Key: SOLR-5826 URL: https://issues.apache.org/jira/browse/SOLR-5826 Project: Solr Issue Type: New Feature Components: clients - java Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 5.0 As stated in http://markmail.org/thread/a477kyxsp5xrusdu there are scenarios where an application communicating with Solr needs to not lose requests (especially update/indexing requests) that may fail because the Solr instance / cluster is not reachable for some time. For such scenarios it may be helpful to have a wrapping SolrServer which can cache requests (in a FIFO queue, so that they get executed in order) when the Solr endpoint is not reachable, and execute them as soon as it's reachable again. -- This message was sent by Atlassian JIRA (v6.2#6252)
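The wrapping behaviour described in the issue can be sketched in a few lines: failed requests go into a FIFO queue, and each later request first tries to replay the queue so ordering is preserved. This is a minimal Python analogue with a hypothetical client interface, not SolrJ's SolrServer API:

```python
# Minimal analogue of the wrapper proposed in SOLR-5826: requests that
# fail because the endpoint is unreachable are cached in a FIFO queue
# and replayed, in order, once the endpoint comes back. The interface
# (a send callable raising ConnectionError when down) is hypothetical.
from collections import deque

class RequestCachingClient:
    def __init__(self, send):
        self._send = send          # callable; raises ConnectionError when the endpoint is down
        self._pending = deque()    # FIFO of requests awaiting replay

    def request(self, payload):
        self._flush()              # replay older cached requests first, preserving order
        try:
            self._send(payload)
        except ConnectionError:
            self._pending.append(payload)

    def _flush(self):
        while self._pending:
            head = self._pending[0]
            try:
                self._send(head)
            except ConnectionError:
                return             # still down; keep the queue intact
            self._pending.popleft()
```

The key design point is that a request is only removed from the queue after it has been sent successfully, so a failure mid-replay leaves the remaining requests queued in their original order.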
Group-ignored tests and @Before/After class hooks.
Robert pointed out this: [junit4] Suite: org.apache.solr.cloud.BasicZkTest [junit4] IGNOR/A 0.00s J2 | BasicZkTest.testBasic [junit4] Assumption #1: 'slow' test group is disabled (@Slow) [junit4] Completed on J2 in 42.45s, 1 test, 1 skipped Bug? Like it must be running @BeforeClass etc. even though no tests are enabled... Indeed, this is currently the case. The problem is that, the way JUnit works (or rather: the way the various tooling environments expect it to work), one has a choice of: 1) ignoring/filtering certain tests or classes up front; then they will not show up in IDEs at all, 2) ignoring/filtering certain tests *at evaluation time*; this unfortunately means @BeforeClass and @AfterClass will run (and so will static class initializers). This has the benefit that all ignored methods are reported properly. I'll see what I can do about it, but it's not a trivial bug/change. Dawid
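The evaluation-time behaviour Dawid describes has a direct analogue in Python's unittest, which makes it easy to demonstrate: when individual tests are skipped at evaluation time, the class-level fixture still runs, just as JUnit's @BeforeClass does for group-ignored tests.

```python
# Demonstrates option 2 from the message: skipping a test "at evaluation
# time" still runs the class-level setup fixture, because the skip is only
# detected when the test is about to execute.
import unittest

class EvaluationTimeSkip(unittest.TestCase):
    before_class_ran = False

    @classmethod
    def setUpClass(cls):           # counterpart of JUnit's @BeforeClass
        cls.before_class_ran = True

    @unittest.skip("'slow' test group is disabled")
    def test_basic(self):
        self.fail("never reached")

def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(EvaluationTimeSkip)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Running the suite reports the test as skipped, yet setUpClass has executed; skipping the whole class up front (unittest.skip on the class) would avoid that, at the cost of the test not being reported per-method.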
[jira] [Created] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler
Upayavira created SOLR-5827: --- Summary: Add boosting functionality to MoreLikeThisHandler Key: SOLR-5827 URL: https://issues.apache.org/jira/browse/SOLR-5827 Project: Solr Issue Type: Improvement Components: MoreLikeThis Reporter: Upayavira Fix For: 4.8 The MoreLikeThisHandler facilitates the creation of a very simple yet powerful recommendation engine. It is possible to constrain the result set using filter queries; however, it isn't possible to influence the scoring using function queries. Adding function query boosting would allow for including such things as recency in the relevancy calculations. Unfortunately, the boost= parameter is already in use, meaning we cannot replicate the edismax boost/bf parameters for multiplicative/additive boosts. My patch only touches the MoreLikeThisHandler, so the only really contentious thing is deciding the parameters to configure it. I have a prototype working, and will upload a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
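For readers unfamiliar with the edismax distinction referenced in the issue: a bf-style boost adds the function's value to the score, while a boost-style boost multiplies the score by it. A toy sketch (the function names and the recency formula are illustrative, not Solr APIs):

```python
# Toy illustration of the two boost combination modes mentioned in the
# issue. Function names are illustrative, not Solr parameters.
def additive_boost(score, boost_value):
    """edismax bf semantics: boost value is added to the score."""
    return score + boost_value

def multiplicative_boost(score, boost_value):
    """edismax boost semantics: score is multiplied by the boost value."""
    return score * boost_value

def recency_boost(age_days, half_life_days=30.0):
    """A hypothetical recency function: 1.0 for a new doc, decaying with age."""
    return half_life_days / (half_life_days + age_days)
```

With such a recency function, an additive boost shifts all recent documents up by a bounded amount, while a multiplicative boost scales scores, which can reorder results more aggressively.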
[jira] [Updated] (SOLR-5826) Request caching SolrServer
[ https://issues.apache.org/jira/browse/SOLR-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili updated SOLR-5826: -- Attachment: SOLR-5826.patch Attached a first draft patch which introduces a RequestCachingSolrServer. It still needs to be improved to switch from an active to a passive wait when consuming cached requests. The test case also needs to be adjusted, as it only works for me from the IDE (strangely, it fails from ant due to file permissions on the index ..).

Request caching SolrServer -- Key: SOLR-5826 URL: https://issues.apache.org/jira/browse/SOLR-5826 Project: Solr Issue Type: New Feature Components: clients - java Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 5.0 Attachments: SOLR-5826.patch As stated in http://markmail.org/thread/a477kyxsp5xrusdu there are scenarios where an application communicating with Solr needs to not lose requests (especially update/indexing requests) that may fail because the Solr instance / cluster is not reachable for some time. For such scenarios it may be helpful to have a wrapping SolrServer which can cache requests (in a FIFO queue, so that they get executed in order) when the Solr endpoint is not reachable, and execute them as soon as it's reachable again. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (LUCENE-3178) Native MMapDir
[ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923769#comment-13923769 ] Michael McCandless commented on LUCENE-3178: bq. ... and suddenly I got good results, this is idiopathic :S Lovely :) It is odd, because we do relatively few IO ops, since we read a big byte[] blob and then do all decoding from that (128 packed ints) in RAM. I do think it'd be interesting to pair up a NativeMMapDir with a custom postings format that instead uses IndexInput.readLong (via Unsafe.getLong) to pull longs from disk; this should save some cost we now have in packed ints to reconstitute longs from byte[] in Java. But we'd need to fix the byte order in the index to match the CPU used at search time. Or maybe we could use a DirectByteBuffer and set the byte order (but this may mean byte swapping for every access, which maybe is not so bad).

Native MMapDir -- Key: LUCENE-3178 URL: https://issues.apache.org/jira/browse/LUCENE-3178 Project: Lucene - Core Issue Type: Improvement Components: core/store Reporter: Michael McCandless Labels: gsoc2014 Attachments: LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch Spinoff from LUCENE-2793. Just like we will create a native Dir impl (UnixDirectory) to pass the right OS-level IO flags depending on the IOContext, we could in theory do something similar with MMapDir. The problem is MMap is apparently quite hairy... and to pass the flags the native code would need to invoke mmap (I think?), unlike UnixDir where the code only has to open the file handle. -- This message was sent by Atlassian JIRA (v6.2#6252)
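The byte-order concern Michael raises (the on-disk long layout must match the CPU's endianness, or every access pays a byte swap) can be shown concretely. A minimal sketch using Python's struct module, purely to illustrate the mismatch, not Lucene's encoding:

```python
# Illustrates the byte-order issue: a 64-bit long written little-endian
# must be decoded with the matching order, otherwise the bytes are
# reinterpreted in reverse.
import struct

def encode_long(value, little_endian=True):
    return struct.pack("<q" if little_endian else ">q", value)

def decode_long(data, little_endian=True):
    return struct.unpack("<q" if little_endian else ">q", data)[0]
```

Decoding with the wrong order silently yields a byte-reversed value, which is exactly why an index written for one byte order would need swapping (or rewriting) on a CPU of the opposite order.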
[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923772#comment-13923772 ] Michael McCandless commented on LUCENE-5493: bq. What is meant by impact sorted postings? It's when you sort your documents according to biggest impact, which is your own measure and which you intend to sort by at search time. AnalyzingInfixSuggester uses this, to sort the suggestions by their weight. This way if you are looking for 5 suggestions, you can stop searching after collecting 5 hits, which is an enormous speedup when the query would have otherwise matched many documents. See e.g. http://nlp.stanford.edu/IR-book/html/htmledition/impact-ordering-1.html

Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch It's not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252)
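The early termination Michael describes falls out naturally once documents are stored in descending weight order: a top-k search can stop the moment k matches are collected, because everything later has a lower weight. A minimal sketch (the data layout and predicate interface are illustrative, not Lucene's):

```python
# Minimal sketch of early termination over impact-sorted documents:
# docs are pre-sorted by weight descending, so a top-k lookup stops as
# soon as k matches are found.
def top_k_suggestions(docs_by_weight_desc, matches, k):
    """docs_by_weight_desc: (doc_id, weight) pairs sorted by weight, descending.
    matches: predicate on doc_id. Returns the doc ids of the k best matches."""
    hits = []
    for doc_id, _weight in docs_by_weight_desc:
        if matches(doc_id):
            hits.append(doc_id)
            if len(hits) == k:
                break  # everything after this point has a lower weight
    return hits
```

Without the pre-sort, the search would have to score every matching document and then sort, which is where the "enormous speedup" on broad queries comes from.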
Re: Stalled unit tests
Unfortunately, some tests take a very long time, and the test infra will print these HEARTBEAT messages notifying you that they are still running. They should eventually finish? Mike McCandless http://blog.mikemccandless.com

On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote: I'm sure that I'm just missing something obvious, but I'm having trouble getting the unit tests to run to completion on my laptop and was hoping that someone would be kind enough to point me in the right direction. I've cloned the repository from GitHub (http://git.apache.org/lucene-solr.git) and checked out the latest commit on branch_4x. commit 6e06247cec1410f32592bfd307c1020b814def06 Author: Robert Muir rm...@apache.org Date: Thu Mar 6 19:54:07 2014 + disable slow solr tests in smoketester git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025 13f79535-47bb-0310-9956-ffa450edef68 Executing ant clean test from the top level directory of the project shows the tests running, but they seem to get stuck in a loop with some stalled heartbeat messages. If I run the tests directly from lucene/ then they complete successfully after about 10 minutes. I'm using Java 6 under OS X (10.9.2). $ java -version java version 1.6.0_65 Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609) Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode) My terminal lists repeating stalled heartbeat messages like so: HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for 2111s at: HdfsLockFactoryTest.testBasic HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for 2108s at: TestSurroundQueryParser.testQueryParser HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for 2167s at: TestRecoveryHdfs.testBuffering HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for 2165s at: HdfsDirectoryTest.testEOF My machine does have 3 java processes chewing CPU, see attached jstack dumps for more information.
Should I expect the tests to complete on my platform? Do I need to specify any special flags to give them more memory or to avoid any bad apples? Thanks in advance, --Terry
Re: Group-ignored tests and @Before/After class hooks.
Ok, I've fixed it. https://github.com/carrotsearch/randomizedtesting/issues/158 I'll include it in the next release. Dawid
[jira] [Commented] (LUCENE-5487) Can we separate top scorer from sub scorer?
[ https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923781#comment-13923781 ] ASF subversion and git services commented on LUCENE-5487: - Commit 1575234 from [~mikemccand] in branch 'dev/branches/lucene5487' [ https://svn.apache.org/r1575234 ] LUCENE-5487: rename TopScorer - BulkScorer

Can we separate top scorer from sub scorer? --- Key: LUCENE-5487 URL: https://issues.apache.org/jira/browse/LUCENE-5487 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-5487.patch, LUCENE-5487.patch This is just an exploratory patch ... still many nocommits, but I think it may be promising. I find the two booleans we pass to Weight.scorer confusing, because they really only apply to whoever will call score(Collector) (just IndexSearcher and BooleanScorer). The params are pointless for the vast majority of scorers, because very, very few query scorers really need to change how top-scoring is done, and those scorers can *only* score top-level (throw UOE from nextDoc/advance). It seems like these two types of scorers should be separately typed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923793#comment-13923793 ] Uwe Schindler commented on LUCENE-5493: --- Beautiful! I like the lines starting with - :-)

Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch It's not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler
[ https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Upayavira updated SOLR-5827: Attachment: SOLR-5827.patch First pass. Supports additive boosting with the mlt.bf parameter. No support for multiplicative boost, pending a choice of parameter name!

Add boosting functionality to MoreLikeThisHandler - Key: SOLR-5827 URL: https://issues.apache.org/jira/browse/SOLR-5827 Project: Solr Issue Type: Improvement Components: MoreLikeThis Reporter: Upayavira Fix For: 4.8 Attachments: SOLR-5827.patch The MoreLikeThisHandler facilitates the creation of a very simple yet powerful recommendation engine. It is possible to constrain the result set using filter queries; however, it isn't possible to influence the scoring using function queries. Adding function query boosting would allow for including such things as recency in the relevancy calculations. Unfortunately, the boost= parameter is already in use, meaning we cannot replicate the edismax boost/bf parameters for multiplicative/additive boosts. My patch only touches the MoreLikeThisHandler, so the only really contentious thing is deciding the parameters to configure it. I have a prototype working, and will upload a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
RE: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!
Hi, I have no idea why this error appears. This class has been unchanged for months. Maybe this is something new, only appearing with the latest JDK 7u60 build? But there are no javadocs changes in the whole series of 7u60 updates. I see that the class mentioned here has no Javadocs at all, because it's a demo class. What's wrong? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

-Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Friday, March 07, 2014 10:50 AM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing! Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9703/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 43612 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] Crawl/parse... [exec] Verify... [echo] Checking for missing docs... [exec] build/docs/demo/org/apache/lucene/demo/xmlparser/FormBasedXmlQueryDemo.html [exec] missing Methods: doPost(javax.servlet.http.HttpServletRequest,%20javax.servlet.http.HttpServletResponse) [exec] Missing javadocs were found!
BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:57: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:241: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2321: exec returned: 1 Total time: 54 minutes 47 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Updated] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler
[ https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Upayavira updated SOLR-5827: Attachment: SOLR-5827.patch Updated version with a minor tweak to get rid of a compile error.

Add boosting functionality to MoreLikeThisHandler - Key: SOLR-5827 URL: https://issues.apache.org/jira/browse/SOLR-5827 Project: Solr Issue Type: Improvement Components: MoreLikeThis Reporter: Upayavira Fix For: 4.8 Attachments: SOLR-5827.patch, SOLR-5827.patch The MoreLikeThisHandler facilitates the creation of a very simple yet powerful recommendation engine. It is possible to constrain the result set using filter queries; however, it isn't possible to influence the scoring using function queries. Adding function query boosting would allow for including such things as recency in the relevancy calculations. Unfortunately, the boost= parameter is already in use, meaning we cannot replicate the edismax boost/bf parameters for multiplicative/additive boosts. My patch only touches the MoreLikeThisHandler, so the only really contentious thing is deciding the parameters to configure it. I have a prototype working, and will upload a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!
I really don't like that %20 in there! Maybe a minor change in the recent JDK7 build caused it to escape space with %20 instead of + or maybe where it wasn't escaping before ... I'll try to repro/fix. Seems like we just need to make the linter unescape somewhere. Mike McCandless http://blog.mikemccandless.com On Fri, Mar 7, 2014 at 6:41 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, I have no idea, why this error appears. This class is unchanged since months. Maybe this is something new, only appearing with latest JDK 7u60 build? But, there are no javadocs changes in the whole series of 7u60 updates. I see that the class mentioned here has no Javadocs at all, because it's a demo class. What's wrong? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Friday, March 07, 2014 10:50 AM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing! Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9703/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops - XX:+UseParallelGC All tests passed Build Log: [...truncated 43612 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /mnt/ssd/jenkins/workspace/Lucene-Solr- trunk-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/demo/org/apache/lucene/demo/xmlparser/FormBasedXmlQuer yDemo.html [exec] missing Methods: doPost(javax.servlet.http.HttpServletRequest,%20javax.servlet.http.HttpSer vletResponse) [exec] [exec] Missing javadocs were found! 
BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:57: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:241: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common- build.xml:2321: exec returned: 1 Total time: 54 minutes 47 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler
[ https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923808#comment-13923808 ] Upayavira commented on SOLR-5827: - It is perhaps worth noting that these are made against the 4x branch. Add boosting functionality to MoreLikeThisHandler - Key: SOLR-5827 URL: https://issues.apache.org/jira/browse/SOLR-5827 Project: Solr Issue Type: Improvement Components: MoreLikeThis Reporter: Upayavira Fix For: 4.8 Attachments: SOLR-5827.patch, SOLR-5827.patch The MoreLikeThisHandler facilitates the creation of a very simple yet powerful recommendation engine. It is possible to constrain the result set using filter queries. However, it isn't possible to influence the scoring using function queries. Adding function query boosting would allow for including such things as recency in the relevancy calculations. Unfortunately, the boost= parameter is already in use, meaning we cannot replicate the edismax boost/bf for additive/multiplicative boostings. My patch only touches the MoreLikeThisHandler, so the only really contentious thing is to decide the parameters to configure it. I have a prototype working, and will upload a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files
[ https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-5492: -- Assignee: Michael McCandless IndexFileDeleter AssertionError in presence of *_upgraded.si files -- Key: LUCENE-5492 URL: https://issues.apache.org/jira/browse/LUCENE-5492 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.7 Reporter: Tim Smith Assignee: Michael McCandless When calling IndexWriter.deleteUnusedFiles against an index that contains 3.x segments, i am seeing the following exception: {code} java.lang.AssertionError: failAndDumpStackJunitStatment: RefCount is 0 pre-decrement for file _0_upgraded.si at org.apache.lucene.index.IndexFileDeleter$RefCount.DecRef(IndexFileDeleter.java:630) at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:514) at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:286) at org.apache.lucene.index.IndexFileDeleter.revisitPolicy(IndexFileDeleter.java:393) at org.apache.lucene.index.IndexWriter.deleteUnusedFiles(IndexWriter.java:4617) {code} I believe this is caused by IndexFileDeleter not being aware of the Lucene3x Segment Infos Format (notably the _upgraded.si files created to upgrade an old index) This is new in 4.7 and did not occur in 4.6.1 Still trying to track down a workaround/fix -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files
[ https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923813#comment-13923813 ] Michael McCandless commented on LUCENE-5492: Hmm, not good. Can you describe what you are doing / boil it down to a test case? IndexFileDeleter AssertionError in presence of *_upgraded.si files -- Key: LUCENE-5492 URL: https://issues.apache.org/jira/browse/LUCENE-5492 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.7 Reporter: Tim Smith When calling IndexWriter.deleteUnusedFiles against an index that contains 3.x segments, i am seeing the following exception: {code} java.lang.AssertionError: failAndDumpStackJunitStatment: RefCount is 0 pre-decrement for file _0_upgraded.si at org.apache.lucene.index.IndexFileDeleter$RefCount.DecRef(IndexFileDeleter.java:630) at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:514) at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:286) at org.apache.lucene.index.IndexFileDeleter.revisitPolicy(IndexFileDeleter.java:393) at org.apache.lucene.index.IndexWriter.deleteUnusedFiles(IndexWriter.java:4617) {code} I believe this is caused by IndexFileDeleter not being aware of the Lucene3x Segment Infos Format (notably the _upgraded.si files created to upgrade an old index) This is new in 4.7 and did not occur in 4.6.1 Still trying to track down a workaround/fix -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!
Thanks Mike. Maybe this is the difference! The only strange thing is the fact that it only happens on this single file. I can confirm: 7u60 b04 does not trigger this bug, but 7u60 b07 does. This is definitely not a bug in the JDK; it's just more correct behaviour (because whitespace must be escaped in URIs). Please note that + is not a valid replacement for whitespace in URI components. Only form-url-encoding in the query string may use +, but not the default encoding used in path names or fragments. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, March 07, 2014 12:45 PM To: Lucene/Solr dev Subject: Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing! I really don't like that %20 in there! Maybe a minor change in the recent JDK7 build caused it to escape space with %20 instead of + or maybe where it wasn't escaping before ... I'll try to repro/fix. Seems like we just need to make the linter unescape somewhere. Mike McCandless http://blog.mikemccandless.com On Fri, Mar 7, 2014 at 6:41 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, I have no idea, why this error appears. This class is unchanged since months. Maybe this is something new, only appearing with latest JDK 7u60 build? But, there are no javadocs changes in the whole series of 7u60 updates. I see that the class mentioned here has no Javadocs at all, because it's a demo class. What's wrong? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de] Sent: Friday, March 07, 2014 10:50 AM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing! 
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9703/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops - XX:+UseParallelGC All tests passed Build Log: [...truncated 43612 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /mnt/ssd/jenkins/workspace/Lucene-Solr- trunk-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/demo/org/apache/lucene/demo/xmlparser/FormBasedXmlQuer yDemo.html [exec] missing Methods: doPost(javax.servlet.http.HttpServletRequest,%20javax.servlet.http.Ht tpSer vletResponse) [exec] [exec] Missing javadocs were found! BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:57: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk- Linux/lucene/build.xml:208: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk- Linux/lucene/build.xml:241: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common- build.xml:2321: exec returned: 1 Total time: 54 minutes 47 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For 
additional commands, e-mail: dev-h...@lucene.apache.org
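[Editor's note] Uwe's distinction between the two escaping schemes can be demonstrated with the JDK itself. The sketch below is illustrative and not part of the thread; the class name and sample strings are invented, and the last line shows the kind of unescaping Mike suggests adding to the linter:

```java
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class EscapingDemo {
    public static void main(String[] args) throws Exception {
        // Form-url-encoding (query strings only) encodes a space as '+':
        String form = URLEncoder.encode("a b", "UTF-8");
        System.out.println(form); // a+b

        // Generic URI encoding (paths, fragments) must use %20 instead;
        // the multi-argument URI constructor applies it automatically:
        String path = new URI("http", "host", "/a b", null).toASCIIString();
        System.out.println(path); // http://host/a%20b

        // A linter that wants the raw method signature back can simply
        // decode the %20 sequences again (hypothetical input string):
        System.out.println(URLDecoder.decode("doPost(a,%20b)", "UTF-8")); // doPost(a, b)
    }
}
```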
[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923852#comment-13923852 ] ASF subversion and git services commented on LUCENE-5493: - Commit 1575248 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1575248 ] LUCENE-5493: cut over index sorting to use Sort api for specifying the order Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch Its not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5493. - Resolution: Fixed Fix Version/s: 5.0 4.8 Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch Its not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs
[ https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923864#comment-13923864 ] ASF subversion and git services commented on LUCENE-5493: - Commit 1575253 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1575253 ] LUCENE-5493: cut over index sorting to use Sort api for specifying the order Rename Sorter, NumericDocValuesSorter, and fix javadocs --- Key: LUCENE-5493 URL: https://issues.apache.org/jira/browse/LUCENE-5493 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch Its not clear to users that these are for this super-expert thing of pre-sorting the index. From the names and documentation they think they should use them instead of Sort/SortField. These need to be renamed or, even better, the API fixed so they aren't public classes. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5498) SortingAtomicReader should be package private
Robert Muir created LUCENE-5498: --- Summary: SortingAtomicReader should be package private Key: LUCENE-5498 URL: https://issues.apache.org/jira/browse/LUCENE-5498 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir The intended purpose of this reader is to allow you to sort your entire index with IW.addIndexes(IR). Perhaps we should supply some kind of tool to do this and hide the reader. It's scary to think of someone using this for searching (based on its name and docs, it's probably not clear that it would be ridiculously slow) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5828) Support for multiple wildcard highlight fields
Daniel Debray created SOLR-5828: --- Summary: Support for multiple wildcard highlight fields Key: SOLR-5828 URL: https://issues.apache.org/jira/browse/SOLR-5828 Project: Solr Issue Type: Improvement Components: highlighter Affects Versions: 5.0 Reporter: Daniel Debray Priority: Minor Hey guys, is there a reason why we don't support multiple wildcard queries for highlighting? Something like hl.fl=foo.*hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it I would like to provide a patch for this issue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match
Robert Muir created LUCENE-5499: --- Summary: EarlyTerminatingSortingCollector shouldnt require exact Sort match Key: LUCENE-5499 URL: https://issues.apache.org/jira/browse/LUCENE-5499 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Today EarlyTerminatingSortingCollector requires that the Sort match exactly at query and at index time. However, now that you can use any Sort (e.g. with multiple sortfields), this should be improved. For example, early termination is fine in the following case: * index-time: popularity desc, time desc * query-time: popularity desc -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match
[ https://issues.apache.org/jira/browse/LUCENE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923872#comment-13923872 ] Robert Muir commented on LUCENE-5499: - The basics are: right now we just encode Sort.toString() in the index. But a Sort is just a collection of SortFields. So if we encode it differently (e.g. each SortField.toString() separated by INFORMATION_SEPARATOR_ONE, escaping the former in case someone is crazy...) we can easily have logic like this. EarlyTerminatingSortingCollector shouldnt require exact Sort match -- Key: LUCENE-5499 URL: https://issues.apache.org/jira/browse/LUCENE-5499 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Today EarlyTerminatingSortingCollector requires that the Sort match exactly at query and at index time. However, now that you can use any Sort (e.g. with multiple sortfields), this should be improved. For example, early termination is fine in the following case: * index-time: popularity desc, time desc * query-time: popularity desc -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
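[Editor's note] Robert's two-part idea — join the per-field sort keys with an unlikely separator, and allow early termination whenever the query-time sort is a prefix of the index-time sort — can be sketched roughly as follows. Plain strings stand in for SortField.toString() values, the helper names are invented, and escaping of the separator inside field strings is omitted:

```java
import java.util.Arrays;

public class SortKeyDemo {
    // U+001F INFORMATION SEPARATOR ONE, as suggested in the comment.
    private static final String SEP = "\u001F";

    // Encode a Sort as one string per SortField, joined by the separator.
    // (Real code would also escape SEP occurring inside a field string.)
    static String encode(String[] sortFields) {
        return String.join(SEP, sortFields);
    }

    // Early termination is safe when the query-time sort is a
    // field-by-field prefix of the index-time sort.
    static boolean canEarlyTerminate(String[] querySort, String[] indexSort) {
        if (querySort.length > indexSort.length) return false;
        return Arrays.equals(querySort, Arrays.copyOf(indexSort, querySort.length));
    }

    public static void main(String[] args) {
        String[] indexSort = {"popularity desc", "time desc"};
        String[] querySort = {"popularity desc"};
        System.out.println(canEarlyTerminate(querySort, indexSort)); // true

        // Round-trip: the encoded form splits back into the original fields.
        String[] decoded = encode(indexSort).split(SEP);
        System.out.println(Arrays.equals(decoded, indexSort)); // true
    }
}
```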
[jira] [Updated] (SOLR-5828) Support for multiple wildcard highlight fields
[ https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Debray updated SOLR-5828: Description: Hey guys, is there a reason why we don't support multiple wildcard querys for highlighting? Something like hl.fl=foo.* hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it i would like to provide a patch for this issue. was: Hey guys, is there a reason why we don't support multiple wildcard querys for highlighting? Something like hl.fl=foo.*hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it i would like to provide a patch for this issue. Support for multiple wildcard highlight fields -- Key: SOLR-5828 URL: https://issues.apache.org/jira/browse/SOLR-5828 Project: Solr Issue Type: Improvement Components: highlighter Affects Versions: 5.0 Reporter: Daniel Debray Priority: Minor Hey guys, is there a reason why we don't support multiple wildcard querys for highlighting? Something like hl.fl=foo.* hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it i would like to provide a patch for this issue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5498) SortingAtomicReader should be package private
[ https://issues.apache.org/jira/browse/LUCENE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923878#comment-13923878 ] Robert Muir commented on LUCENE-5498: - FWIW the other tools in lucene/misc seem to take a similar approach: e.g. PKIndexSplitter hides its FilterReader SortingAtomicReader should be package private - Key: LUCENE-5498 URL: https://issues.apache.org/jira/browse/LUCENE-5498 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir The intended purpose of this reader is to allow you to sort your entire index with IW.addIndexes(IR). Perhaps we should supply some kind of tool to do this and hide the reader. Its scary to think of someone using this for searching (based on its name and docs, its probably not clear that it would be ridiculously slow) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-5828) Support for multiple wildcard highlight fields
[ https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Debray closed SOLR-5828. --- Resolution: Duplicate Support for multiple wildcard highlight fields -- Key: SOLR-5828 URL: https://issues.apache.org/jira/browse/SOLR-5828 Project: Solr Issue Type: Improvement Components: highlighter Affects Versions: 5.0 Reporter: Daniel Debray Priority: Minor Hey guys, is there a reason why we don't support multiple wildcard querys for highlighting? Something like hl.fl=foo.* hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it i would like to provide a patch for this issue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5828) Support for multiple wildcard highlight fields
[ https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923881#comment-13923881 ] Daniel Debray commented on SOLR-5828: - Duplicate of SOLR-5127. Support for multiple wildcard highlight fields -- Key: SOLR-5828 URL: https://issues.apache.org/jira/browse/SOLR-5828 Project: Solr Issue Type: Improvement Components: highlighter Affects Versions: 5.0 Reporter: Daniel Debray Priority: Minor Hey guys, is there a reason why we don't support multiple wildcard querys for highlighting? Something like hl.fl=foo.* hl.fl=bar.* or hl.fl=foo.* bar.*. If nothing speaks against it i would like to provide a patch for this issue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1121: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1121/ 1 tests failed. REGRESSION: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch Error Message: There were too many update fails - we expect it can happen, but shouldn't easily Stack Trace: java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily at __randomizedtesting.SeedInfo.seed([C6C5F3B33C48A73B:47237DAB4B17C707]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:212) Build Log: [...truncated 53142 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:488: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:176: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77: Java returned: 1 Total time: 140 minutes 36 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head
[ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5773: - Attachment: SOLR-5773.patch CollapsingQParserPlugin should make elevated documents the group head - Key: SOLR-5773 URL: https://issues.apache.org/jira/browse/SOLR-5773 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.6.1 Reporter: David Assignee: Joel Bernstein Labels: collapse, solr Fix For: 4.8 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch Original Estimate: 8h Remaining Estimate: 8h Hi Joel, I sent you an email but I'm not sure if you received it or not. I ran into a bit of trouble using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to exclude grouped documents when one of the members of the group are contained in the elevated document set. I'm not sure this is possible currently because as you explain above elevated documents are added to the request context after the original query is constructed. To try to better illustrate the problem. If I have 2 documents docid=1 and docid=2 and both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have elevated docid 1 then both documents are shown in the results when I really only want the elevated document to be shown in the results. Is this something that would be difficult to implement? Any help is appreciated. I think the solution would be to remove the documents from liveDocs that share the same groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue working towards a solution in the meantime. 
{code}
private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
  IntOpenHashSet boostDocs = null;
  if (boosted != null) {
    SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
    String fieldName = idField.getName();
    HashSet<BytesRef> localBoosts = new HashSet(boosted.size() * 2);
    Iterator<String> boostedIt = boosted.iterator();
    while (boostedIt.hasNext()) {
      localBoosts.add(new BytesRef(boostedIt.next()));
    }
    boostDocs = new IntOpenHashSet(boosted.size() * 2);
    List<AtomicReaderContext> leaves = indexSearcher.getTopReaderContext().leaves();
    TermsEnum termsEnum = null;
    DocsEnum docsEnum = null;
    for (AtomicReaderContext leaf : leaves) {
      AtomicReader reader = leaf.reader();
      int docBase = leaf.docBase;
      Bits liveDocs = reader.getLiveDocs();
      Terms terms = reader.terms(fieldName);
      termsEnum = terms.iterator(termsEnum);
      Iterator<BytesRef> it = localBoosts.iterator();
      while (it.hasNext()) {
        BytesRef ref = it.next();
        if (termsEnum.seekExact(ref)) {
          docsEnum = termsEnum.docs(liveDocs, docsEnum);
          int doc = docsEnum.nextDoc();
          if (doc != -1) {
            // Found the document.
            boostDocs.add(doc + docBase);
            // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
            it.remove();
          }
        }
      }
    }
  }
  return boostDocs;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head
[ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923885#comment-13923885 ] Joel Bernstein commented on SOLR-5773: -- Tested this at scale and it seems to be functioning properly. David, let me know when you've had a chance to test out the patch. Thanks, Joel CollapsingQParserPlugin should make elevated documents the group head - Key: SOLR-5773 URL: https://issues.apache.org/jira/browse/SOLR-5773 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.6.1 Reporter: David Assignee: Joel Bernstein Labels: collapse, solr Fix For: 4.8 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch Original Estimate: 8h Remaining Estimate: 8h Hi Joel, I sent you an email but I'm not sure if you received it or not. I ran into a bit of trouble using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to exclude grouped documents when one of the members of the group are contained in the elevated document set. I'm not sure this is possible currently because as you explain above elevated documents are added to the request context after the original query is constructed. To try to better illustrate the problem. If I have 2 documents docid=1 and docid=2 and both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have elevated docid 1 then both documents are shown in the results when I really only want the elevated document to be shown in the results. Is this something that would be difficult to implement? Any help is appreciated. I think the solution would be to remove the documents from liveDocs that share the same groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue working towards a solution in the meantime. 
{code}
private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
  IntOpenHashSet boostDocs = null;
  if (boosted != null) {
    SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
    String fieldName = idField.getName();
    HashSet<BytesRef> localBoosts = new HashSet(boosted.size() * 2);
    Iterator<String> boostedIt = boosted.iterator();
    while (boostedIt.hasNext()) {
      localBoosts.add(new BytesRef(boostedIt.next()));
    }
    boostDocs = new IntOpenHashSet(boosted.size() * 2);
    List<AtomicReaderContext> leaves = indexSearcher.getTopReaderContext().leaves();
    TermsEnum termsEnum = null;
    DocsEnum docsEnum = null;
    for (AtomicReaderContext leaf : leaves) {
      AtomicReader reader = leaf.reader();
      int docBase = leaf.docBase;
      Bits liveDocs = reader.getLiveDocs();
      Terms terms = reader.terms(fieldName);
      termsEnum = terms.iterator(termsEnum);
      Iterator<BytesRef> it = localBoosts.iterator();
      while (it.hasNext()) {
        BytesRef ref = it.next();
        if (termsEnum.seekExact(ref)) {
          docsEnum = termsEnum.docs(liveDocs, docsEnum);
          int doc = docsEnum.nextDoc();
          if (doc != -1) {
            // Found the document.
            boostDocs.add(doc + docBase);
            // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
            it.remove();
          }
        }
      }
    }
  }
  return boostDocs;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Stalled unit tests
Mike, Fair enough. I'll let them run for more than 30 minutes and see what happens. How long does it take on your machine? I'm happy to signup for the wiki and add some extra information to http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to tinker with Lucene. Do the Lucene developers typically run a subset of the test suite to make committing cheaper? Thanks, --Terry On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless luc...@mikemccandless.com wrote: Unfortunately, some tests take a very long time, and the test infra will print these HEARTBEAT messages notifying you that they are still running. They should eventually finish? Mike McCandless http://blog.mikemccandless.com On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote: I'm sure that I'm just missing something obvious but I'm having trouble getting the unit tests to run to completion on my laptop and was hoping that someone would be kind enough to point me in the right direction. I've cloned the repository from GitHub (http://git.apache.org/lucene-solr.git) and checked out the latest commit on branch_4x. commit 6e06247cec1410f32592bfd307c1020b814def06 Author: Robert Muir rm...@apache.org Date: Thu Mar 6 19:54:07 2014 + disable slow solr tests in smoketester git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025 13f79535-47bb-0310-9956-ffa450edef68 Executing ant clean test from the top level directory of the project shows the tests running but they seems to get stuck in loop with some stalled heartbeat messages. If I run the tests directly from lucene/ then they complete successfully after about 10 minutes. I'm using Java 6 under OS X (10.9.2). 
$ java -version java version 1.6.0_65 Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609) Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode) My terminal lists repeating stalled heartbeat messages like so: HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for 2111s at: HdfsLockFactoryTest.testBasic HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for 2108s at: TestSurroundQueryParser.testQueryParser HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for 2167s at: TestRecoveryHdfs.testBuffering HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for 2165s at: HdfsDirectoryTest.testEOF My machine does have 3 java processes chewing CPU, see attached jstack dumps for more information. Should I expect the tests to complete on my platform? Do I need to specify any special flags to give them more memory or to avoid any bad apples? Thanks in advance, --Terry - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Stalled unit tests
How long does it take on your machine? It really depends... check out the limit on some heavy nightly tests, like this one: @TimeoutSuite(millis = 80 * TimeUnits.HOUR) @Ignore("takes ~ 45 minutes") (Somebody should really inspect this inconsistency :). Or this one: @Ignore("Requires tons of heap to run (420G works)") @TimeoutSuite(millis = 100 * TimeUnits.HOUR) Wait... how many Gs? :) And seriously, the top parent class of all tests declares: @TimeoutSuite(millis = 2 * TimeUnits.HOUR) And this unfortunately means that a test class will time out after 2 hours of inactivity. To me, it's absurdly high, but in the past tests ran on very slow virtualized machines and were actually hitting these limits. Dawid
RE: Suggestions about writing / extending QueryParsers
Tommaso, Ah, now I see. If you want to add new operators, you'll have to modify the javacc files. For the SpanQueryParser, I added a handful of new operators and chose to go with regexes instead of javacc... not sure that was the right decision, but given my lack of knowledge of javacc, it was expedient. If you have time or already know javacc, it shouldn't be difficult. As for it being a no-brainer on the Solr side, yes, it shouldn't be a problem. However, as of now the basic query parser is a copy-and-paste job between Lucene and Solr, so you'll just have to redo your code in Solr unless you do something smarter. If you'd be willing to wait for LUCENE-5205 to be brought into Lucene, I'd consider adding this functionality into the SpanQueryParser as a later step. Cheers, Tim From: Tommaso Teofili [mailto:tommaso.teof...@gmail.com] Sent: Friday, March 07, 2014 3:17 AM To: dev@lucene.apache.org Subject: Re: Suggestions about writing / extending QueryParsers
[jira] [Commented] (SOLR-5720) Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923900#comment-13923900 ] ASF subversion and git services commented on SOLR-5720: --- Commit 1575266 from [~joel.bernstein] in branch 'dev/trunk' [ https://svn.apache.org/r1575266 ] SOLR-5720: Updated CHANGES.txt Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin -- Key: SOLR-5720 URL: https://issues.apache.org/jira/browse/SOLR-5720 Project: Solr Issue Type: New Feature Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.8, 5.0 Attachments: SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch This ticket introduces a new search component called the ExpandComponent. The expand component expands a single page of results collapsed by the CollapsingQParserPlugin. Sample syntax: {code} q=*:*&fq={!collapse field=fieldA}&expand=true&expand.sort=fieldB+asc&expand.rows=10 {code} In the above query the results are collapsed on fieldA with the CollapsingQParserPlugin. The expand component expands the current page of collapsed results. The initial implementation of the ExpandComponent takes three parameters: *expand=true* (turns on the ExpandComponent) *expand.sort=fieldB+asc,fieldC+desc* (sorts the expanded documents based on a sort spec; if none is specified, the documents are sorted by relevance based on the main query) *expand.rows=10* (sets the number of rows that groups are expanded to)
[jira] [Commented] (SOLR-5720) Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923904#comment-13923904 ] ASF subversion and git services commented on SOLR-5720: --- Commit 1575267 from [~joel.bernstein] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1575267 ] SOLR-5720: Updated CHANGES.txt
[jira] [Commented] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files
[ https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923910#comment-13923910 ] Tim Smith commented on LUCENE-5492: --- Here's what my test is doing:
1. unpacks a Lucene 3.x era index (has one segment in it)
2. opens an IndexWriter on the 3.x index
3. opens a DirectoryReader using the IndexWriter
4. adds 1 new document
5. commits the IndexWriter
6. reopens the DirectoryReader using the IndexWriter
7. optimizes the IndexWriter
8. commits the optimized index
9. reopens the DirectoryReader using the IndexWriter
One thing of note is that I have a custom IndexDeletionPolicy. This policy will hold onto named commit points; I hold onto the previous commit point at commit time, and then release it shortly after the commit is finished, once I have persisted my acceptance of the new commit point (calling deleteUnusedFiles() to purge it). IndexFileDeleter AssertionError in presence of *_upgraded.si files -- Key: LUCENE-5492 URL: https://issues.apache.org/jira/browse/LUCENE-5492 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.7 Reporter: Tim Smith Assignee: Michael McCandless When calling IndexWriter.deleteUnusedFiles against an index that contains 3.x segments, I am seeing the following exception: {code}
java.lang.AssertionError: failAndDumpStackJunitStatment: RefCount is 0 pre-decrement for file _0_upgraded.si
at org.apache.lucene.index.IndexFileDeleter$RefCount.DecRef(IndexFileDeleter.java:630)
at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:514)
at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:286)
at org.apache.lucene.index.IndexFileDeleter.revisitPolicy(IndexFileDeleter.java:393)
at org.apache.lucene.index.IndexWriter.deleteUnusedFiles(IndexWriter.java:4617)
{code} I believe this is caused by IndexFileDeleter not being aware of the Lucene3x SegmentInfos format (notably the _upgraded.si files created to upgrade an old index). This is new in 4.7 and did not occur in
4.6.1. Still trying to track down a workaround/fix.
[jira] [Commented] (SOLR-5825) Separate http request creation and execution in SolrJ
[ https://issues.apache.org/jira/browse/SOLR-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923913#comment-13923913 ] Mark Miller commented on SOLR-5825: --- +1 Separate http request creation and execution in SolrJ - Key: SOLR-5825 URL: https://issues.apache.org/jira/browse/SOLR-5825 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Steven Bower Attachments: SOLR-5825.patch In order to implement some custom behaviors, I split the request() method in HttpSolrServer into 2 distinct methods, createMethod() and executeMethod(). This allows for customization of either/both of these phases vs having it in a single function. In my use case I extended HttpSolrServer to support client-side timeouts (so_timeout, connectTimeout and request timeout), which I couldn't accomplish without duplicating the code in request().
[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923915#comment-13923915 ] Tim Allison commented on LUCENE-5205: - The root of this problem is that SpanNearQuery has no good way to handle stopwords in a way analogous to PhraseQuery. In SpanQueryParser, this limitation should be well described in the javadocs to SpanQueryParser and in the test cases. Let me know if it isn't. You have the option of throwing an exception when a stopword is found to notify the user about stopwords, but that's exceedingly unsatisfactory. Without digging into the internals of SpanNearQuery, we can still do better on this. One proposal is to do what the basic highlighter does and risk false positives... behind the scenes, modify "calculator for evaluating" to "calculator evaluating"~1. This would then falsely match "calculator zebra evaluating". PhraseQuery can have false positives, too, but it guarantees that the false hit has to be a stop word. This solution would not do that. So, is this better than no matches at all? [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.7 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix) * AnalyzingQueryParser: has an option to analyze multiterms. 
At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). 
Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance = 1, prefix = 2) * Can specify Optimal String Alignment (OSA) vs Levenshtein for edit distance = 2: jakarta~1(OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you.
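The stopword false-positive risk discussed in the comment above can be seen with a toy positional check (a self-contained sketch, not Lucene's SpanNearQuery): once the stopword "for" is dropped and the phrase is rewritten to "calculator evaluating"~1, one arbitrary token is allowed between the terms, so any word — not just a stopword — may sit there.

```java
import java.util.List;

// Toy in-order proximity check: do terms a and b occur in order with at
// most `slop` other tokens between them?
public class SlopSketch {
    static boolean nearInOrder(List<String> tokens, String a, String b, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) continue;
            // j - i - 1 is the number of tokens strictly between a and b.
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (tokens.get(j).equals(b)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The intended match, with the stopword still in the document:
        List<String> intended = List.of("calculator", "for", "evaluating");
        // The false positive: "zebra" is no stopword, but slop 1 admits it.
        List<String> falseHit = List.of("calculator", "zebra", "evaluating");
        System.out.println(nearInOrder(intended, "calculator", "evaluating", 1)); // true
        System.out.println(nearInOrder(falseHit, "calculator", "evaluating", 1)); // true
    }
}
```

This is exactly the trade-off weighed above: PhraseQuery's false hits can only be stopwords, while the slop rewrite admits any intervening token.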
[jira] [Comment Edited] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923915#comment-13923915 ] Tim Allison edited comment on LUCENE-5205 at 3/7/14 2:36 PM: - The root of this problem is that SpanNearQuery has no good way to handle stopwords in a way analagous to PhraseQuery. In SpanQueryParser, this limitation should be well described in the javadocs to SpanQueryParser and in the test cases. Let me know if it isn't. You have the option of throwing an exception when a stopword is found to notify the user about stopwords, but that's exceedingly unsatisfactory. Without digging into the internals of SpanNearQuery, we can still do better on this. One proposal is to do what the basic highlighter does and risk false positives...behind the scenes modify calculator for evaluating to calculator evaluating~1. This would then falsely match calculator zebra evaluating. PhraseQuery can have false positives, too, but it guarantees that the false hit has to be a stop word. This solution would not do that. So, is this better than no matches at all? was (Author: talli...@mitre.org): The root of this problem is that SpanNearIQuery has no good way to handle stopwords in a way analagous to PhraseQuery. In SpanQueryParser, this limitation should be well described in the javadocs to SpanQueryParser and in the test cases. Let me know if it isn't. You have the option of throwing an exception when a stopword is found to notify the user about stopwords, but that's exceedingly unsatisfactory. Without digging into the internals of SpanNearQuery, we can still do better on this. One proposal is to do what the basic highlighter does and risk false positives...behind the scenes modify calculator for evaluating to calculator evaluating~1. This would then falsely match calculator zebra evaluating. PhraseQuery can have false positives, too, but it guarantees that the false hit has to be a stop word. 
This solution would not do that. So, is this better than no matches at all?
[jira] [Commented] (SOLR-5818) distrib search with custom comparator does not quite work correctly
[ https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923919#comment-13923919 ] Mark Miller commented on SOLR-5818: --- +1 - LGTM. distrib search with custom comparator does not quite work correctly --- Key: SOLR-5818 URL: https://issues.apache.org/jira/browse/SOLR-5818 Project: Solr Issue Type: Bug Reporter: Ryan Ernst Attachments: SOLR-5818.patch In QueryComponent.doFieldSortValues, a scorer is never set on a custom comparator. We just need to add a fake scorer that can pass through the score from the DocList. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5730) make Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector configurable in Solr
[ https://issues.apache.org/jira/browse/SOLR-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923920#comment-13923920 ] Robert Muir commented on SOLR-5730: --- Hello, one thing that might simplify some of the TODOs is that we changed the SortingMergePolicy API in LUCENE-5493 to just take a Sort. This means you can have multiple fields, they don't have to be numeric docvalues, and so on. So I think this can simplify the configuration of this thing too, e.g. you could just take a standard sort spec string and parse it with QueryParsing.getSort or whatever (some refactoring might be needed here). It would be good, though, to check that Sort.needsScores() == false, as that makes no sense at index time... I'll open an issue to add this check to SortingMergePolicy itself in Lucene. The other difference is that EarlyTerminatingSortingCollector now also takes a Sort, except really you should just pass the Sort being used for the query (it does the proper checking against the segments to see if the segment was sorted in a compatible way, and if so, will optimize with early termination). Today this just checks that they are exactly equal, but in the future it can be smarter (LUCENE-5499). Hopefully this makes the integration easier. 
make Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector configurable in Solr -- Key: SOLR-5730 URL: https://issues.apache.org/jira/browse/SOLR-5730 Project: Solr Issue Type: New Feature Reporter: Christine Poerschke Priority: Minor Example configuration: solrconfig.xml {noformat}
<mergeSorter class="org.apache.solr.update.DefaultMergeSorterFactory"/>
{noformat} schema.xml {noformat}
<mergeSorterKey class="org.apache.solr.schema.SingleFieldSorterFactory">
  <str name="fieldName">timestamp</str>
  <bool name="ascending">false</bool>
</mergeSorterKey>
{noformat}
[jira] [Created] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
Robert Muir created LUCENE-5500: --- Summary: SortingMergePolicy should error if the Sort refers to the score Key: LUCENE-5500 URL: https://issues.apache.org/jira/browse/LUCENE-5500 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir It should throw an exception if Sort.needsScores() == true. This does not make sense at index time. I think there is no reason for this method to be package-private either (as it's just useful sugar: it loops over each SortField and checks needsScores).
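The proposed check can be sketched with stand-in types (hypothetical names, not Lucene's classes): needsScores() loops over the sort fields and reports whether any of them sorts by relevance score, and the guard rejects such a Sort when the merge policy is configured rather than at the first merge.

```java
import java.util.List;

// Stand-in sketch of the proposed index-time guard; SortFieldType and the
// method names here are illustrative, not Lucene's API.
public class SortingGuardSketch {
    enum SortFieldType { SCORE, LONG, STRING }

    // Analogue of Sort.needsScores(): true if any field sorts by score.
    static boolean needsScores(List<SortFieldType> sortFields) {
        for (SortFieldType t : sortFields) {
            if (t == SortFieldType.SCORE) return true;
        }
        return false;
    }

    // The proposed constructor-time check: fail fast when the policy is
    // configured on IndexWriter, not later during an actual merge.
    static void checkIndexTimeSort(List<SortFieldType> sortFields) {
        if (needsScores(sortFields)) {
            throw new IllegalArgumentException("index-time sort cannot use the relevance score");
        }
    }

    public static void main(String[] args) {
        checkIndexTimeSort(List.of(SortFieldType.LONG)); // fine
        try {
            checkIndexTimeSort(List.of(SortFieldType.SCORE, SortFieldType.LONG));
            System.out.println("not reached");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at configuration time rather than merge time is the whole point of the issue: a misconfigured policy surfaces immediately instead of on the first background merge.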
[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923924#comment-13923924 ] Robert Muir commented on LUCENE-5500: - Note you will get an exception today: but not until the actual merge. The idea here is to fail when you configure the thing on IndexWriter!
[jira] [Updated] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5500: Attachment: LUCENE-5500.patch Simple patch. I added tests for both the MP and the FilterReader (as it's public today).
[jira] [Commented] (SOLR-5823) Add utility function for internal code to know if it is currently the overseer
[ https://issues.apache.org/jira/browse/SOLR-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923934#comment-13923934 ] Mark Miller commented on SOLR-5823: --- Cool. Couple of comments: bq. Looks like the ZooKeeper folks are planning to introduce a Path class to help with parsing in 3.5 Seems we should pull it out into its own method or static utility in the meantime? I'd also add a warning about using it to the javadoc - most of this type of info is essentially free because it's in the cluster state, but this calls ZK, and we try to do that sparingly. I'm still not sure why this can't just be a thread in the Overseer class though, and avoid this call altogether? That already would fail over as you need, right? Add utility function for internal code to know if it is currently the overseer -- Key: SOLR-5823 URL: https://issues.apache.org/jira/browse/SOLR-5823 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-5823.patch It would be useful if there was some Overseer equivalent to CloudDescriptor.isLeader() that plugins running in Solr could use to know: At this moment, am I the leader?
[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks
[ https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923942#comment-13923942 ] Mark Miller commented on SOLR-5477: --- bq. SolrJ calls you mean methods like CollectionAdminRequest.createCollection(). Right - it can def come in a second issue, but it seems like just at least adding the async param is pretty low-hanging fruit. Async execution of OverseerCollectionProcessor tasks Key: SOLR-5477 URL: https://issues.apache.org/jira/browse/SOLR-5477 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Anshum Gupta Attachments: SOLR-5477-CoreAdminStatus.patch, SOLR-5477-updated.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch Typical collection admin commands are long running and it is very common to have the requests get timed out. It is more of a problem if the cluster is very large. Add an option to run these commands asynchronously: add an extra param async=true for all collection commands; the task is written to ZK and the caller is returned a task id. A separate collection admin command will be added to poll the status of the task: command=status&id=7657668909. If id is not passed, all running async tasks should be listed. A separate queue is created to store in-process tasks. After the tasks are completed, the queue entry is removed. OverseerCollectionProcessor will perform these tasks in multiple threads
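The async flow described in the ticket — submit with async=true, receive a task id at once, poll a status command until the task leaves the in-process queue — can be modeled with a small self-contained sketch (hypothetical class and method names; a toy in-process executor stands in for the ZK-backed queue and the OverseerCollectionProcessor threads):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of the proposed async collection-command flow (not Solr's code).
public class AsyncTaskSketch {
    final Map<String, String> taskStatus = new ConcurrentHashMap<>();
    final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Returns a task id immediately instead of blocking until the command
    // finishes, so the HTTP request can never time out on a slow command.
    String submit(Runnable command) {
        String id = UUID.randomUUID().toString();
        taskStatus.put(id, "running");
        pool.submit(() -> { command.run(); taskStatus.put(id, "completed"); });
        return id;
    }

    // Analogue of command=status&id=...: poll the task by id.
    String status(String id) { return taskStatus.getOrDefault(id, "notfound"); }

    // Demo convenience: submit, then wait for the worker pool to drain.
    String runToCompletion(Runnable command) {
        String id = submit(command);
        pool.shutdown();
        try { pool.awaitTermination(5, TimeUnit.SECONDS); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return id;
    }

    public static void main(String[] args) {
        AsyncTaskSketch overseer = new AsyncTaskSketch();
        String id = overseer.runToCompletion(() -> { /* pretend: create a collection */ });
        System.out.println(overseer.status(id)); // prints "completed"
    }
}
```

In the real design the task record would live in ZooKeeper rather than an in-memory map, which is what lets the status survive node failover.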
[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks
[ https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923960#comment-13923960 ] Anshum Gupta commented on SOLR-5477: [~markrmil...@gmail.com] Sure, I'll add that and put up another patch. It's just that I wanted to get it into trunk sooner rather than later, considering that the patch touches a reasonable number of points in the code, which makes it tricky to forward-port every time.
[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923990#comment-13923990 ] Adrien Grand commented on LUCENE-5500: -- +1 SortingMergePolicy should error if the Sort refers to the score --- Key: LUCENE-5500 URL: https://issues.apache.org/jira/browse/LUCENE-5500 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5500.patch It should throw an exception if Sort.needsScores() == true. This does not make sense at index-time. I think there is no reason for this method to be package-private either (as it's just useful sugar, it loops over each SortField and checks needsScores). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
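The check the issue proposes is easy to sketch. The SortField below is a minimal stand-in, not the Lucene class; it exists only to show the "loop over each SortField and check needsScores" sugar and the index-time rejection:

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of the check LUCENE-5500 asks for, with a minimal stand-in for
 * SortField (not the Lucene class): index-time sorting cannot use relevance
 * scores, so a Sort that needs scores should be rejected up front.
 */
public class SortScoreCheck {
    public static final class SortField {
        final String field;
        final boolean needsScores;
        public SortField(String field, boolean needsScores) {
            this.field = field;
            this.needsScores = needsScores;
        }
    }

    /** The "loop over each SortField and check needsScores" sugar. */
    public static boolean needsScores(List<SortField> sort) {
        for (SortField sf : sort) {
            if (sf.needsScores) return true;
        }
        return false;
    }

    /** Throws up front instead of failing obscurely during merging. */
    public static void validateForIndexSorting(List<SortField> sort) {
        if (needsScores(sort)) {
            throw new IllegalArgumentException("index-time sort cannot refer to the score");
        }
    }

    public static void main(String[] args) {
        validateForIndexSorting(Arrays.asList(new SortField("timestamp", false))); // fine
        try {
            validateForIndexSorting(Arrays.asList(new SortField("relevance", true)));
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected score-based sort as expected");
        }
    }
}
```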
[jira] [Updated] (LUCENE-5491) Flexible StandardQueryParser fails on boost field
[ https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André updated LUCENE-5491: --- Fix Version/s: 4.8 Flexible StandardQueryParser fails on boost field - Key: LUCENE-5491 URL: https://issues.apache.org/jira/browse/LUCENE-5491 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.6, 4.7 Reporter: André Fix For: 4.8 The following exception {noformat} java.lang.IllegalArgumentException: field name should not be null! at org.apache.lucene.queryparser.flexible.core.config.FieldConfig.init(FieldConfig.java:36) at org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59) at org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90) at org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255) at org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168) {noformat} is caused by boosting a tokenizable phrase field within a group. 
{code:java}
public void testFail() throws Exception { test("(mimeType:\"text-html\")"); }

public void testOkay() throws Exception { test("mimeType:\"text-html\""); }

static void test(String qs) throws Exception {
  Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
  StandardQueryParser qp = new StandardQueryParser(sa);
  qp.getFieldsBoost().put("mimeType", 1f);
  qp.parse(qs, "content");
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5491) NPE in Flexible StandardQueryParser on boosting
[ https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André updated LUCENE-5491: --- Summary: NPE in Flexible StandardQueryParser on boosting (was: NPE in Flexible StandardQueryParser when boosting) NPE in Flexible StandardQueryParser on boosting --- Key: LUCENE-5491 URL: https://issues.apache.org/jira/browse/LUCENE-5491 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.6, 4.7 Reporter: André Fix For: 4.8 The following exception {noformat} java.lang.IllegalArgumentException: field name should not be null! at org.apache.lucene.queryparser.flexible.core.config.FieldConfig.init(FieldConfig.java:36) at org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59) at org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90) at org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255) at org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168) {noformat} is caused by boosting a tokenizable phrase field within a group. 
{code:java}
public void testFail() throws Exception { test("(mimeType:\"text-html\")"); }

public void testOkay() throws Exception { test("mimeType:\"text-html\""); }

static void test(String qs) throws Exception {
  Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
  StandardQueryParser qp = new StandardQueryParser(sa);
  qp.getFieldsBoost().put("mimeType", 1f);
  qp.parse(qs, "content");
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5491) NPE in Flexible StandardQueryParser when boosting
[ https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André updated LUCENE-5491: --- Summary: NPE in Flexible StandardQueryParser when boosting (was: Flexible StandardQueryParser fails on boost field) NPE in Flexible StandardQueryParser when boosting - Key: LUCENE-5491 URL: https://issues.apache.org/jira/browse/LUCENE-5491 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.6, 4.7 Reporter: André Fix For: 4.8 The following exception {noformat} java.lang.IllegalArgumentException: field name should not be null! at org.apache.lucene.queryparser.flexible.core.config.FieldConfig.init(FieldConfig.java:36) at org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59) at org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90) at org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255) at org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168) {noformat} is caused by boosting a tokenizable phrase field within a group. 
{code:java}
public void testFail() throws Exception { test("(mimeType:\"text-html\")"); }

public void testOkay() throws Exception { test("mimeType:\"text-html\""); }

static void test(String qs) throws Exception {
  Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
  StandardQueryParser qp = new StandardQueryParser(sa);
  qp.getFieldsBoost().put("mimeType", 1f);
  qp.parse(qs, "content");
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5491) NPE in Flexible StandardQueryParser on boosting
[ https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923999#comment-13923999 ] André commented on LUCENE-5491: @[~adriano_crestani] I added a null check and it works fine. NPE in Flexible StandardQueryParser on boosting --- Key: LUCENE-5491 URL: https://issues.apache.org/jira/browse/LUCENE-5491 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.6, 4.7 Reporter: André Fix For: 4.8 The following exception {noformat} java.lang.IllegalArgumentException: field name should not be null! at org.apache.lucene.queryparser.flexible.core.config.FieldConfig.init(FieldConfig.java:36) at org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59) at org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90) at org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90) at org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255) at org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168) {noformat} is caused by boosting a tokenizable phrase field within a group. 
{code:java}
public void testFail() throws Exception { test("(mimeType:\"text-html\")"); }

public void testOkay() throws Exception { test("mimeType:\"text-html\""); }

static void test(String qs) throws Exception {
  Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
  StandardQueryParser qp = new StandardQueryParser(sa);
  qp.getFieldsBoost().put("mimeType", 1f);
  qp.parse(qs, "content");
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
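For illustration, here is the kind of null check described in the comment above, as a self-contained sketch (hypothetical names, not the actual Lucene patch): a grouped clause carries no field name, so a per-field boost lookup must tolerate null instead of hitting the "field name should not be null!" precondition.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustration of the null check described above (hypothetical names, not
 * the actual patch): a group node has no field name, so the boost lookup
 * skips the per-field config when the field is null.
 */
public class BoostNullGuardSketch {
    private static final Map<String, Float> fieldBoosts = new HashMap<>();
    static { fieldBoosts.put("mimeType", 2f); }

    /** Mimics FieldConfig's "field name should not be null!" precondition. */
    static Float configFor(String field) {
        if (field == null) throw new IllegalArgumentException("field name should not be null!");
        return fieldBoosts.get(field);
    }

    /** Guarded variant: group nodes (null field) keep the default boost. */
    public static float boostFor(String field) {
        if (field == null) return 1f;   // the null check: skip the config lookup
        Float b = configFor(field);
        return b == null ? 1f : b;
    }
}
```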
[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923998#comment-13923998 ] ASF subversion and git services commented on LUCENE-5500: - Commit 1575306 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1575306 ] LUCENE-5500: SortingMergePolicy should error if the Sort refers to the score SortingMergePolicy should error if the Sort refers to the score --- Key: LUCENE-5500 URL: https://issues.apache.org/jira/browse/LUCENE-5500 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5500.patch It should throw an exception if Sort.needsScores() == true. This does not make sense at index-time. I think there is no reason for this method to be package-private either (as it's just useful sugar, it loops over each SortField and checks needsScores). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9600 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9600/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseG1GC All tests passed Build Log: [...truncated 42217 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.compressing... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene3x... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene42... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene45... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene46... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... 
[javadoc] Loading source files for package org.apache.lucene.util.fst... [javadoc] Loading source files for package org.apache.lucene.util.mutable... [javadoc] Loading source files for package org.apache.lucene.util.packed... [javadoc] Constructing Javadoc information... [javadoc] Standard Doclet version 1.7.0_60-ea [javadoc] Building tree for all the packages and classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html... [javadoc] 1 warning [...truncated 27 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene.analysis.ar... [javadoc] Loading source files for package org.apache.lucene.analysis.bg... [javadoc] Loading source files for package org.apache.lucene.analysis.br... [javadoc] Loading source files for package org.apache.lucene.analysis.ca... [javadoc] Loading source files for package org.apache.lucene.analysis.charfilter... [javadoc] Loading source files for package org.apache.lucene.analysis.cjk... 
[javadoc] Loading source files for package org.apache.lucene.analysis.ckb... [javadoc] Loading source files for package org.apache.lucene.analysis.cn... [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams... [javadoc] Loading source files for package org.apache.lucene.analysis.compound... [javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation... [javadoc] Loading source files for package org.apache.lucene.analysis.core... [javadoc] Loading source files for package org.apache.lucene.analysis.cz... [javadoc] Loading source files for package org.apache.lucene.analysis.da... [javadoc] Loading source files for package org.apache.lucene.analysis.de... [javadoc] Loading source files for package org.apache.lucene.analysis.el... [javadoc] Loading source files for package org.apache.lucene.analysis.en... [javadoc] Loading source files for package org.apache.lucene.analysis.es... [javadoc] Loading
[jira] [Resolved] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5500. - Resolution: Fixed Fix Version/s: 5.0 4.8 SortingMergePolicy should error if the Sort refers to the score --- Key: LUCENE-5500 URL: https://issues.apache.org/jira/browse/LUCENE-5500 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5500.patch It should throw an exception if Sort.needsScores() == true. This does not make sense at index-time. I think there is no reason for this method to be package-private either (as it's just useful sugar, it loops over each SortField and checks needsScores). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score
[ https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924000#comment-13924000 ] ASF subversion and git services commented on LUCENE-5500: - Commit 1575307 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1575307 ] LUCENE-5500: SortingMergePolicy should error if the Sort refers to the score SortingMergePolicy should error if the Sort refers to the score --- Key: LUCENE-5500 URL: https://issues.apache.org/jira/browse/LUCENE-5500 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5500.patch It should throw an exception if Sort.needsScores() == true. This does not make sense at index-time. I think there is no reason for this method to be package-private either (as it's just useful sugar, it loops over each SortField and checks needsScores). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5501) Out-of-order collection testing
Adrien Grand created LUCENE-5501: Summary: Out-of-order collection testing Key: LUCENE-5501 URL: https://issues.apache.org/jira/browse/LUCENE-5501 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Collectors have the ability to declare whether or not they support out-of-order collection, but since most scorers score in order this is not well tested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent
Joel Bernstein created SOLR-5829: Summary: Add tag/exclude functionality to the ExpandComponent Key: SOLR-5829 URL: https://issues.apache.org/jira/browse/SOLR-5829 Project: Solr Issue Type: New Feature Components: SearchComponents - other Reporter: Joel Bernstein Fix For: 4.8 Adding tag/exclude functionality to the ExpandComponent would allow it to operate independently of the CollapsingQParserPlugin. For example: q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent The query above searches all documents limiting the results to type=parent. So the main result would contain only parent documents. The expand component then excludes the type=parent filter and expands the groups based on the group_id field. Using this approach the main search result will contain only documents with type=parent and the expanded results will display the child documents for the group. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5829: - Description: Adding tag/exclude functionality to the ExpandComponent would allow it to operate independently of the CollapsingQParserPlugin. For example: {code} q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent {code} The query above searches all documents limiting the results to type=parent. So the main result would contain only parent documents. The expand component then excludes the type=parent filter and expands the groups based on the group_id field. Using this approach the main search result will contain only documents with type=parent and the expanded results will display the child documents for the group. was: Adding tag/exclude functionality to the ExpandComponent would allow it to operate independently of the CollapsingQParserPlugin. For example: q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent The query above searches all documents limiting the results to type=parent. So the main result would contain only parent documents. The expand component then excludes the type=parent filter and expands the groups based on the group_id field. Using this approach the main search result will contain only documents with type=parent and the expanded results will display the child documents for the group. Add tag/exclude functionality to the ExpandComponent Key: SOLR-5829 URL: https://issues.apache.org/jira/browse/SOLR-5829 Project: Solr Issue Type: New Feature Components: SearchComponents - other Reporter: Joel Bernstein Fix For: 4.8 Adding tag/exclude functionality to the ExpandComponent would allow it to operate independently of the CollapsingQParserPlugin. For example: {code} q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent {code} The query above searches all documents limiting the results to type=parent. 
So the main result would contain only parent documents. The expand component then excludes the type=parent filter and expands the groups based on the group_id field. Using this approach the main search result will contain only documents with type=parent and the expanded results will display the child documents for the group. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5829: - Attachment: SOLR-5829.patch Initial patch leverages the existing tag/exclude framework for tag/exclude faceting. Runs but needs tests. Add tag/exclude functionality to the ExpandComponent Key: SOLR-5829 URL: https://issues.apache.org/jira/browse/SOLR-5829 Project: Solr Issue Type: New Feature Components: SearchComponents - other Reporter: Joel Bernstein Fix For: 4.8 Attachments: SOLR-5829.patch Adding tag/exclude functionality to the ExpandComponent would allow it to operate independently of the CollapsingQParserPlugin. For example: {code} q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent {code} The query above searches all documents limiting the results to type=parent. So the main result would contain only parent documents. The expand component then excludes the type=parent filter and expands the groups based on the group_id field. Using this approach the main search result will contain only documents with type=parent and the expanded results will display the child documents for the group. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
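The example request in the issue is easier to read when assembled parameter by parameter. The sketch below just builds the string (no URL-encoding, not Solr client code) to make the tag/exclude pairing explicit:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/**
 * Assembles the example request from the issue parameter by parameter.
 * A readability aid only: no URL-encoding, not a Solr client.
 */
public class ExpandRequestSketch {
    public static String buildQueryString() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("fq", "{!tag=parent}type=parent"); // tagged filter, excluded below
        params.put("expand", "true");
        params.put("expand.field", "group_id");
        params.put("expand.exclude", "parent");       // drop the tagged fq when expanding
        StringJoiner qs = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            qs.add(e.getKey() + "=" + e.getValue());
        }
        return qs.toString();
    }
}
```

The {!tag=parent} local param names the filter, and expand.exclude=parent tells the expand stage to ignore it, which is what lets the main result stay parents-only while the expanded groups show the children.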
[jira] [Updated] (LUCENE-5501) Out-of-order collection testing
[ https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5501: - Attachment: LUCENE-5501.patch Here is a simple patch I've been playing with: - AssertingWeight.scoresDocsOutOfOrder randomly returns true in order to trigger the use of our top docs collectors that tie-break on doc id, - AssertingScorer randomly scores out of order when the collector says it supports it. It found a bug in the grouping collector, whose acceptDocsOutOfOrder method returns true although the collect method has a comment that explicitly says that the comparison works because doc IDs come in order. Out-of-order collection testing --- Key: LUCENE-5501 URL: https://issues.apache.org/jira/browse/LUCENE-5501 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Attachments: LUCENE-5501.patch Collectors have the ability to declare whether or not they support out-of-order collection, but since most scorers score in order this is not well tested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5501) Out-of-order collection testing
[ https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924011#comment-13924011 ] Robert Muir commented on LUCENE-5501: - Are you sure? I think it's ok overall, but of course could be better: from a line coverage perspective, most of these are at or very close to 100%. Looking at test contributions against each, all the collectors in TopScore* look like they get beat up pretty well. TopField* is not so great, but several tests try to explicitly iterate over all of them (TestSearchAfter, TestExpressionSorts, etc). More tests for these sorting ones might be good, but I don't think the situation is so bad? https://builds.apache.org/job/Lucene-Solr-Clover-trunk/clover/org/apache/lucene/search/TopScoreDocCollector.html#TopScoreDocCollector https://builds.apache.org/job/Lucene-Solr-Clover-trunk/clover/org/apache/lucene/search/TopFieldCollector.html#TopFieldCollector Out-of-order collection testing --- Key: LUCENE-5501 URL: https://issues.apache.org/jira/browse/LUCENE-5501 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Attachments: LUCENE-5501.patch Collectors have the ability to declare whether or not they support out-of-order collection, but since most scorers score in order this is not well tested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5501) Out-of-order collection testing
[ https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924032#comment-13924032 ] Adrien Grand commented on LUCENE-5501: -- I was not thinking about these collectors at all, I think they are very well tested indeed! I was thinking more about more exotic collectors, like those that are used for grouping or joins, which won't get out-of-order testing unless they are used with a boolean query. Out-of-order collection testing --- Key: LUCENE-5501 URL: https://issues.apache.org/jira/browse/LUCENE-5501 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Attachments: LUCENE-5501.patch Collectors have the ability to declare whether or not they support out-of-order collection, but since most scorers score in order this is not well tested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
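The property the randomized patch exercises can be stated directly: a top-k collector that breaks score ties by ascending doc id must return the same top docs whether hits arrive in doc-id order or shuffled. A self-contained model of that invariant (not Lucene's collectors):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/**
 * Sketch of the invariant the randomized test checks: a top-k collector
 * that tie-breaks equal scores by ascending doc id produces the same top
 * docs regardless of the order in which hits are collected.
 */
public class OutOfOrderSketch {
    public static final class Hit {
        final int doc;
        final float score;
        public Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Collects all hits, sorts by (score desc, doc asc), truncates to k. */
    public static List<Integer> topDocs(List<Hit> hits, int k) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort((a, b) -> {
            int c = Float.compare(b.score, a.score);            // higher score first
            return c != 0 ? c : Integer.compare(a.doc, b.doc);  // doc-id tiebreak
        });
        List<Integer> top = new ArrayList<>();
        for (Hit h : copy.subList(0, Math.min(k, copy.size()))) top.add(h.doc);
        return top;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        Random r = new Random(42);
        for (int doc = 0; doc < 100; doc++) hits.add(new Hit(doc, r.nextInt(5)));
        List<Integer> inOrder = topDocs(hits, 10);
        Collections.shuffle(hits, r);                  // simulate out-of-order scoring
        System.out.println(inOrder.equals(topDocs(hits, 10))); // must match
    }
}
```

A collector without the doc-id tiebreak would break this invariant on tied scores, which is exactly the class of bug the AssertingScorer randomization is meant to surface.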
[jira] [Commented] (LUCENE-5422) Postings lists deduplication
[ https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924036#comment-13924036 ] Vishmi Money commented on LUCENE-5422: -- [~otis], thank you. [~mikemccand], yes I agree with you. As you said, if cost is added for merging of postings lists, despite the server space saved, it will affect performance. Then we have to think about how we can achieve the desired performance while trying to save server space. I will keep this in mind when I look further into this. Postings lists deduplication Key: LUCENE-5422 URL: https://issues.apache.org/jira/browse/LUCENE-5422 Project: Lucene - Core Issue Type: Improvement Components: core/codecs, core/index Reporter: Dmitry Kan Labels: gsoc2014 The context: http://markmail.org/thread/tywtrjjcfdbzww6f Robert Muir and I discussed what Robert eventually named postings lists deduplication at the Berlin Buzzwords 2013 conference. The idea is to allow multiple terms to point to the same postings list to save space. This can be achieved by a new index codec implementation, but this jira is open to other ideas as well. The application / impact of this is positive for synonyms, exact / inexact terms, leading wildcard support via storing reversed terms, etc. For example, at the moment, when supporting exact (unstemmed) and inexact (stemmed) searches, we store both the unstemmed and stemmed variants of a word form, and that leads to index bloating. That is why we had to remove the leading wildcard support via reversing a token at index and query time, because of the same index size considerations. Comment from Mike McCandless: Neat idea! Would this idea allow a single term to point to (the union of) N other posting lists? It seems like that's necessary e.g. to handle the exact/inexact case. And then, to produce the Docs/AndPositionsEnum you'd need to do the merge sort across those N posting lists? 
Such a thing might also be do-able as a runtime-only wrapper around the postings API (FieldsProducer), if you could at runtime do the reverse expansion (e.g. stem -> all of its surface forms). Comment from Robert Muir: I think the exact/inexact is trickier (detecting it would be the hard part), and you are right, another solution might work better. But for the reverse wildcard and synonyms situation, it seems we could even detect it on write if we created some hash of the previous term's postings. If the hash matches for the current term, we know it might be a duplicate and would have to actually do the costly check that they are the same. Maybe there are better ways to do it, but it might be a fun postings format experiment to try. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
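The "hash, then verify" idea in the last comment can be modeled with a map keyed by the postings list itself: the hash probe is the cheap candidate check, and equals() on collision is the costly confirmation that two lists really are identical. A toy sketch, not a real postings format:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model of write-time postings deduplication: keying a map by the
 * postings list gives a cheap hash probe plus an equals() confirmation,
 * so terms with identical postings end up sharing one list object.
 * Not a real postings format.
 */
public class PostingsDedupSketch {
    public static Map<String, List<Integer>> dedupe(Map<String, List<Integer>> termPostings) {
        Map<List<Integer>, List<Integer>> canonical = new HashMap<>();
        Map<String, List<Integer>> shared = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : termPostings.entrySet()) {
            // hash lookup, then equals() on collision == "hash then verify"
            List<Integer> one = canonical.computeIfAbsent(e.getValue(), k -> k);
            shared.put(e.getKey(), one);
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = new LinkedHashMap<>();
        postings.put("run",  Arrays.asList(1, 4, 9));  // stemmed form
        postings.put("runs", Arrays.asList(1, 4, 9));  // surface form, same docs
        postings.put("walk", Arrays.asList(2, 3));
        Map<String, List<Integer>> out = dedupe(postings);
        System.out.println(out.get("run") == out.get("runs")); // one shared list
    }
}
```

In a real codec the map value would be a pointer into the postings file rather than an in-memory list, but the detection logic is the same.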
[jira] [Commented] (LUCENE-5476) Facet sampling
[ https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924037#comment-13924037 ] Rob Audenaerde commented on LUCENE-5476: {quote} ...Given our test framework, randomness is not a big deal at all, since once we get a test failure, we can deterministically reproduce the failure (when there is no multi-threading)... {quote} Ok, this makes sense to me. {quote} It looks like it hasn't changed? I mean besides the rename. So if I set sampleSize=100K, it's 100K whether there are 101K docs or 100M docs, right? Is that your intention? {quote} Correct, it is my intention. I actually prefer not to increase the {{sampleSize}} with more hits, as bigger samples are slower: 100K is a nice sample size anyway, and more hits means more time. I adjust the sampleRatio so that the resulting set of documents is (close to) the {{sampleSize}}. {quote} I find this assert just redundant – if we always expect 5, we shouldn't assert that we received 5. If we say that very infrequently we might get 5 and we're OK with it .. what's the point of asserting that at all? {quote} Agreed with the 5. Asserting seems redundant, but is that not the point of unit tests? The trick is that the assertion should still hold if you change the implementation. I will add more next week. Btw, is there an easy way to retrieve the total facet counts for an ordinal? When correcting facet counts, it would be a quick win to limit the number of estimated documents to the actual number of documents in the index that match that facet. 
(And maybe use the distribution as well, to make better estimates) Facet sampling -- Key: LUCENE-5476 URL: https://issues.apache.org/jira/browse/LUCENE-5476 Project: Lucene - Core Issue Type: Improvement Reporter: Rob Audenaerde Attachments: LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, SamplingComparison_SamplingFacetsCollector.java, SamplingFacetsCollector.java With LUCENE-5339 facet sampling disappeared. When trying to display facet counts on large datasets (10M documents) counting facets is rather expensive, as all the hits are collected and processed. Sampling greatly reduced this and thus provided a nice speedup. Could it be brought back?
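Rob's scheme (fixed sampleSize, adaptive sampleRatio) plus the count correction he asks about can be sketched as follows (a hypothetical illustration of the semantics described in the comment, not the attached patch's code):

```java
import java.util.*;

// Sketch of the sampling semantics from the discussion: the target sample
// size is fixed (e.g. 100K) regardless of hit count, and the ratio adapts so
// that totalHits * ratio is close to sampleSize. A sampled facet count is
// scaled back up, but capped at the facet's actual total count in the index
// (the "quick win" correction asked about above).
class FacetSamplingSketch {
    // ratio such that totalHits * ratio ~= sampleSize (capped at 1.0)
    static double sampleRatio(long totalHits, long sampleSize) {
        return totalHits <= sampleSize ? 1.0 : (double) sampleSize / totalHits;
    }

    // estimate = sampledCount / ratio, but never more than the true total
    static long correctedCount(long sampledCount, double ratio, long actualTotal) {
        return Math.min(Math.round(sampledCount / ratio), actualTotal);
    }
}
```

With 1M hits and a 100K target, the ratio comes out to 0.1, and a facet seen 10 times in the sample is estimated at 100 documents unless the index holds fewer.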
[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 2330 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/2330/ All tests passed Build Log: [...truncated 3742 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/misc/src/java/org/apache/lucene/index/sorter/Sorter.java:238: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Integer [javac] return Integer.compare(docID1, docID2); // docid order tiebreak [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 1 error BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:471: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:451: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:39: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/extra-targets.xml:37: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/build.xml:534: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:1998: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/module-build.xml:57: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:471: The following error occurred while executing this line: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:1736: Compile failed; see the compiler error output for details. Total time: 12 minutes 23 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure
Re: [JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 2330 - Still Failing
Sorry, I'll fix. On Fri, Mar 7, 2014 at 11:41 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/2330/ All tests passed Build Log: [quoted compile error omitted; identical to the failure message above]
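For context on the failure above: `Integer.compare(int, int)` only exists since Java 7, which is why the Java 6 build breaks on that line. An equivalent Java-6-compatible tiebreak (a sketch of the obvious workaround, not necessarily the committed fix) looks like:

```java
// Java-6-compatible equivalent of Integer.compare(docID1, docID2):
// returns a negative value, zero, or a positive value for <, ==, >.
class DocIdCompare {
    static int compare(int docID1, int docID2) {
        return docID1 < docID2 ? -1 : (docID1 == docID2 ? 0 : 1);
    }
}
```

The ternary form also avoids the classic `docID1 - docID2` subtraction trick, which can overflow for extreme values.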
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9594 - Still Failing!
I committed a fix. Mike McCandless http://blog.mikemccandless.com On Fri, Mar 7, 2014 at 12:01 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9594/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 42183 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.compressing... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene3x... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene42... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene45... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene46... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... 
[javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst... [javadoc] Loading source files for package org.apache.lucene.util.mutable... [javadoc] Loading source files for package org.apache.lucene.util.packed... [javadoc] Constructing Javadoc information... [javadoc] Standard Doclet version 1.7.0_60-ea [javadoc] Building tree for all the packages and classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html... [javadoc] 1 warning [...truncated 27 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene.analysis.ar... [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene.analysis.bg... [javadoc] Loading source files for package org.apache.lucene.analysis.br... [javadoc] Loading source files for package org.apache.lucene.analysis.ca... 
[javadoc] Loading source files for package org.apache.lucene.analysis.charfilter... [javadoc] Loading source files for package org.apache.lucene.analysis.cjk... [javadoc] Loading source files for package org.apache.lucene.analysis.ckb... [javadoc] Loading source files for package org.apache.lucene.analysis.cn... [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams... [javadoc] Loading source files for package org.apache.lucene.analysis.compound... [javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation... [javadoc] Loading source files for package org.apache.lucene.analysis.core... [javadoc] Loading source files for package org.apache.lucene.analysis.cz... [javadoc] Loading source files for package org.apache.lucene.analysis.da... [javadoc] Loading source files for package org.apache.lucene.analysis.de...
[jira] [Created] (SOLR-5830) Elevate file hardcoded to load from either conf or data directory
David Stuart created SOLR-5830: -- Summary: Elevate file hardcoded to load from either conf or data directory Key: SOLR-5830 URL: https://issues.apache.org/jira/browse/SOLR-5830 Project: Solr Issue Type: Bug Affects Versions: 3.6.3, 4.7, 4.8, 5.0 Reporter: David Stuart When loading the elevate.xml from the solrconfig, the QueryElevationComponent class is hard-coded to look in either the conf directory or the data directory. If an absolute path is defined, it errors out with a file-not-found error because it prepends the conf and data directories in its check.
[jira] [Commented] (SOLR-4654) Integrate Lucene's sorting and early query termination capabilities into Solr
[ https://issues.apache.org/jira/browse/SOLR-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924081#comment-13924081 ] Furkan KAMACI commented on SOLR-4654: - I volunteer to work on this issue as part of GSoC 2014. Integrate Lucene's sorting and early query termination capabilities into Solr - Key: SOLR-4654 URL: https://issues.apache.org/jira/browse/SOLR-4654 Project: Solr Issue Type: Improvement Reporter: Adrien Grand Priority: Trivial Labels: gsoc2014 I think there would be some interesting work to do to integrate Lucene's sorting and early query termination capabilities into Solr, in particular (just ideas, maybe they're not all interesting/useful): - configuring a SortingMergePolicy, - figuring out when the sort order of queries matches the sort order of the index segments, - giving the ability to get approximated results when the query is not sorted but only boosted by the sort order of the index, - integration with TimeLimitingCollector: maybe it's better to collect only half of all segments than to fully collect half of the segments, - approximation of the number of matches based on the ratio of collected documents, - ...
[jira] [Commented] (LUCENE-5501) Out-of-order collection testing
[ https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924088#comment-13924088 ] Michael McCandless commented on LUCENE-5501: +1, I love this patch. You shuffle the docIDs from the scorer before delivering to the collector, if the collector claims it can accept out-of-order hits. LUCENE-4950 is related: I tried to fix AssertingIndexSearcher to use AssertingCollector but hit strange exceptions with ConstantScoreQuery that I never explained. AssertingCollector would verify that if the collector said it could not accept docs out of order, then the scorer does not in fact deliver docs out of order. Also, LUCENE-5487 will increase how often out-of-order scoring is allowed, because BooleanScorer will now allow the sub-scorers to score out of order. Out-of-order collection testing --- Key: LUCENE-5501 URL: https://issues.apache.org/jira/browse/LUCENE-5501 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Attachments: LUCENE-5501.patch Collectors have the ability to declare whether or not they support out-of-order collection, but since most scorers score in order this is not well tested.
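The shuffle-before-delivery idea Mike describes can be sketched as a toy (a simplified stand-in interface, not Lucene's actual Collector API): only when a collector declares it accepts out-of-order hits do we randomize the delivery order.

```java
import java.util.*;

// Toy sketch of the test idea: buffer the docIDs a scorer produced and
// shuffle them before handing them to the collector, but only when the
// collector declares that it accepts out-of-order hits.
class OutOfOrderDelivery {
    interface Collector {
        boolean acceptsDocsOutOfOrder();
        void collect(int doc);
    }

    static void deliver(int[] scoredDocs, Collector c, Random rnd) {
        int[] docs = scoredDocs.clone();
        if (c.acceptsDocsOutOfOrder()) {
            // Fisher-Yates shuffle: exercise the collector with a random order
            for (int i = docs.length - 1; i > 0; i--) {
                int j = rnd.nextInt(i + 1);
                int tmp = docs[i]; docs[i] = docs[j]; docs[j] = tmp;
            }
        }
        for (int d : docs) {
            c.collect(d);
        }
    }
}
```

A collector that claims in-order-only always sees ascending docIDs; an out-of-order collector gets randomly permuted ones, flushing out ordering assumptions in its implementation.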
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9600 - Still Failing!
Same issue, I committed a fix earlier. Mike McCandless http://blog.mikemccandless.com On Fri, Mar 7, 2014 at 11:11 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9600/ Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseG1GC All tests passed Build Log: [quoted javadoc build log omitted; near-identical to the build 9594 log above]
[jira] [Updated] (SOLR-5830) Elevate file hardcoded to load from either conf or data directory
[ https://issues.apache.org/jira/browse/SOLR-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Stuart updated SOLR-5830: --- Attachment: SOLR-5830.patch First pass on a patch. Elevate file hardcoded to load from either conf or data directory - Key: SOLR-5830 URL: https://issues.apache.org/jira/browse/SOLR-5830 Project: Solr Issue Type: Bug Affects Versions: 3.6.3, 4.7, 4.8, 5.0 Reporter: David Stuart Attachments: SOLR-5830.patch Original Estimate: 2h Remaining Estimate: 2h When loading the elevate.xml from the solrconfig, the QueryElevationComponent class is hard-coded to look in either the conf directory or the data directory. If an absolute path is defined, it errors out with a file-not-found error because it prepends the conf and data directories in its check.
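The direction such a fix could take can be sketched like this (hypothetical code, not the attached SOLR-5830.patch; `ElevateFileResolver` and `resolve` are invented names): honor an absolute path as-is, and only fall back to the conf and data directories for relative paths.

```java
import java.io.File;

// Hypothetical sketch of the fix direction: use an absolute elevate.xml path
// directly instead of always prepending the conf/data directories.
class ElevateFileResolver {
    static String resolve(String configured, String confDir, String dataDir) {
        File f = new File(configured);
        if (f.isAbsolute()) {
            return f.getPath();                         // absolute: use directly
        }
        File inConf = new File(confDir, configured);
        if (inConf.exists()) {
            return inConf.getPath();                    // prefer conf/ when present
        }
        return new File(dataDir, configured).getPath(); // else fall back to data/
    }
}
```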
[jira] [Resolved] (SOLR-5821) Search inconsistency on SolrCloud replicas
[ https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5821. -- Resolution: Invalid First, please raise issues like this on the user's list before raising a JIRA, to be sure you are really seeing a bug rather than simply misunderstanding. If your hypothesis is true, try specifying a secondary known ordering. If scores are tied, then Solr/Lucene will return the documents in internal Lucene ID order, and you're quite correct that the internal order may be different in different shards. Testing this should be as simple as specifying something similar to sort=score desc, id asc Search inconsistency on SolrCloud replicas -- Key: SOLR-5821 URL: https://issues.apache.org/jira/browse/SOLR-5821 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: SolrCloud: 1 shard, 2 replicas Both instances/replicas have identical hardware/software: CPU(s): 4 RAM: 8Gb HDD: 100Gb OS: CentOS 6.5 ZooKeeper 3.4.5 Tomcat 8.0.3 Solr 4.6.1 Servers are utilized to run Solr only. Reporter: Maxim Novikov Priority: Critical Labels: cloud, inconsistency, replica, search We use the following infrastructure: SolrCloud with 1 shard and 2 replicas. The index is built using DataImportHandler (importing data from the database). The number of items in the index can vary from 100 to 100,000,000. After indexing part of the data (not necessarily all the data, it is enough to have a small number of items in the search index), we can observe that Solr instances (replicas) return different results for the same search queries. I believe it happens because some of the results have the same scores, and Solr instances return those in a random order. PS This is a critical issue for us as we use a load balancer to scale Solr through replicas, and as a result of this issue, we retrieve various results for the same queries all the time.
They are not necessarily completely different, but even a couple of items that differ is a deal breaker. The expected behaviour would be to always get identical results for the same search queries from all replicas. Otherwise, this cloud thing works just unreliably.
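To make the suggested fix concrete, here is a plain-Java toy (not Solr code; the `Hit` class is invented for illustration) of why `sort=score desc, id asc` yields a deterministic order when scores tie:

```java
import java.util.*;

// Illustration of the suggested fix: with sort=score desc, id asc, equal
// scores are broken deterministically by the unique id, so every replica
// returns the same order regardless of its internal Lucene docID order.
class TieBreak {
    static final class Hit {
        final float score;
        final String id;
        Hit(float score, String id) { this.score = score; this.id = id; }
    }

    // score descending, then id ascending
    static void sortDeterministically(List<Hit> hits) {
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed()
                            .thenComparing(h -> h.id));
    }
}
```

Without the secondary key, the two score-1.0 hits below could legitimately come back in either order depending on the replica's internal docIDs.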
[jira] [Commented] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match
[ https://issues.apache.org/jira/browse/LUCENE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924098#comment-13924098 ] Michael McCandless commented on LUCENE-5499: +1 The search-time sort just has to be congruent with the index-time one. EarlyTerminatingSortingCollector shouldnt require exact Sort match -- Key: LUCENE-5499 URL: https://issues.apache.org/jira/browse/LUCENE-5499 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Today EarlyTerminatingSortingCollector requires that the Sort match exactly at query and at index time. However, now that you can use any Sort (e.g. with multiple sortfields), this should be improved. For example, early termination is fine in the following case: * index-time: popularity desc, time desc * query-time: popularity desc
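The "congruent" relaxation can be stated simply: early termination stays safe whenever the query-time sort fields form a prefix of the index-time sort fields. A hypothetical sketch of that check (not the actual Lucene patch; sort fields reduced to strings for illustration):

```java
import java.util.*;

// Early termination is safe when the query-time sort is a prefix of the
// index-time sort (e.g. query [popularity desc] against an index sorted by
// [popularity desc, time desc]).
class SortPrefixCheck {
    static boolean canEarlyTerminate(List<String> querySort, List<String> indexSort) {
        return querySort.size() <= indexSort.size()
            && indexSort.subList(0, querySort.size()).equals(querySort);
    }
}
```

Robert's example passes this check: index-time `popularity desc, time desc` with query-time `popularity desc`, while a query sorted only by `time desc` would not.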
[jira] [Commented] (LUCENE-5498) SortingAtomicReader should be package private
[ https://issues.apache.org/jira/browse/LUCENE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924093#comment-13924093 ] Michael McCandless commented on LUCENE-5498: +1 SortingAtomicReader should be package private - Key: LUCENE-5498 URL: https://issues.apache.org/jira/browse/LUCENE-5498 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir The intended purpose of this reader is to allow you to sort your entire index with IW.addIndexes(IR). Perhaps we should supply some kind of tool to do this and hide the reader. It's scary to think of someone using this for searching (based on its name and docs, it's probably not clear that it would be ridiculously slow)
Re: Stalled unit tests
I just ran ant test under Solr; it took 4 minutes 25 seconds. But, in my ~/build.properties I have: tests.disableHdfs=true tests.slow=false Which makes things substantially faster, and also [seems to] sidestep the Solr tests that false fail. Mike McCandless http://blog.mikemccandless.com On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith sheb...@gmail.com wrote: Mike, Fair enough. I'll let them run for more than 30 minutes and see what happens. How long does it take on your machine? I'm happy to signup for the wiki and add some extra information to http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to tinker with Lucene. Do the Lucene developers typically run a subset of the test suite to make committing cheaper? Thanks, --Terry On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless luc...@mikemccandless.com wrote: Unfortunately, some tests take a very long time, and the test infra will print these HEARTBEAT messages notifying you that they are still running. They should eventually finish? Mike McCandless http://blog.mikemccandless.com On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote: I'm sure that I'm just missing something obvious but I'm having trouble getting the unit tests to run to completion on my laptop and was hoping that someone would be kind enough to point me in the right direction. I've cloned the repository from GitHub (http://git.apache.org/lucene-solr.git) and checked out the latest commit on branch_4x. commit 6e06247cec1410f32592bfd307c1020b814def06 Author: Robert Muir rm...@apache.org Date: Thu Mar 6 19:54:07 2014 + disable slow solr tests in smoketester git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025 13f79535-47bb-0310-9956-ffa450edef68 Executing ant clean test from the top level directory of the project shows the tests running but they seems to get stuck in loop with some stalled heartbeat messages. 
If I run the tests directly from lucene/ then they complete successfully after about 10 minutes. I'm using Java 6 under OS X (10.9.2). $ java -version java version 1.6.0_65 Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609) Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode) My terminal lists repeating stalled heartbeat messages like so: HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for 2111s at: HdfsLockFactoryTest.testBasic HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for 2108s at: TestSurroundQueryParser.testQueryParser HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for 2167s at: TestRecoveryHdfs.testBuffering HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for 2165s at: HdfsDirectoryTest.testEOF My machine does have 3 java processes chewing CPU, see attached jstack dumps for more information. Should I expect the tests to complete on my platform? Do I need to specify any special flags to give them more memory or to avoid any bad apples? Thanks in advance, --Terry - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head
[ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924108#comment-13924108 ] David commented on SOLR-5773: - What did you change in your latest patch? CollapsingQParserPlugin should make elevated documents the group head - Key: SOLR-5773 URL: https://issues.apache.org/jira/browse/SOLR-5773 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.6.1 Reporter: David Assignee: Joel Bernstein Labels: collapse, solr Fix For: 4.8 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch Original Estimate: 8h Remaining Estimate: 8h Hi Joel, I sent you an email but I'm not sure if you received it or not. I ran into a bit of trouble using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to exclude grouped documents when one of the members of the group are contained in the elevated document set. I'm not sure this is possible currently because as you explain above elevated documents are added to the request context after the original query is constructed. To try to better illustrate the problem. If I have 2 documents docid=1 and docid=2 and both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have elevated docid 1 then both documents are shown in the results when I really only want the elevated document to be shown in the results. Is this something that would be difficult to implement? Any help is appreciated. I think the solution would be to remove the documents from liveDocs that share the same groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue working towards a solution in the meantime. 
{code}
private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
  IntOpenHashSet boostDocs = null;
  if (boosted != null) {
    SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
    String fieldName = idField.getName();
    HashSet<BytesRef> localBoosts = new HashSet<BytesRef>(boosted.size() * 2);
    Iterator<String> boostedIt = boosted.iterator();
    while (boostedIt.hasNext()) {
      localBoosts.add(new BytesRef(boostedIt.next()));
    }
    boostDocs = new IntOpenHashSet(boosted.size() * 2);
    List<AtomicReaderContext> leaves = indexSearcher.getTopReaderContext().leaves();
    TermsEnum termsEnum = null;
    DocsEnum docsEnum = null;
    for (AtomicReaderContext leaf : leaves) {
      AtomicReader reader = leaf.reader();
      int docBase = leaf.docBase;
      Bits liveDocs = reader.getLiveDocs();
      Terms terms = reader.terms(fieldName);
      termsEnum = terms.iterator(termsEnum);
      Iterator<BytesRef> it = localBoosts.iterator();
      while (it.hasNext()) {
        BytesRef ref = it.next();
        if (termsEnum.seekExact(ref)) {
          docsEnum = termsEnum.docs(liveDocs, docsEnum);
          int doc = docsEnum.nextDoc();
          if (doc != -1) {
            // Found the document.
            boostDocs.add(doc + docBase);
            // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
            it.remove();
          }
        }
      }
    }
  }
  return boostDocs;
}
{code}
[jira] [Commented] (SOLR-5821) Search inconsistency on SolrCloud replicas
[ https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924107#comment-13924107 ] Maxim Novikov commented on SOLR-5821: - Will this additional ordering not impact the performance of search? Considering 100,000,000 records indexed from the database, and having about 400 search requests per second per 1 Solr instance. Search inconsistency on SolrCloud replicas -- Key: SOLR-5821 URL: https://issues.apache.org/jira/browse/SOLR-5821 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: SolrCloud: 1 shard, 2 replicas Both instances/replicas have identical hardware/software: CPU(s): 4 RAM: 8Gb HDD: 100Gb OS: CentOS 6.5 ZooKeeper 3.4.5 Tomcat 8.0.3 Solr 4.6.1 Servers are utilized to run Solr only. Reporter: Maxim Novikov Priority: Critical Labels: cloud, inconsistency, replica, search We use the following infrastructure: SolrCloud with 1 shard and 2 replicas. The index is built using DataImportHandler (importing data from the database). The number of items in the index can vary from 100 to 100,000,000. After indexing part of the data (not necessarily all the data, it is enough to have a small number of items in the search index), we can observe that Solr instances (replicas) return different results for the same search queries. I believe it happens because some of the results have the same scores, and Solr instances return those in a random order. PS This is a critical issue for us as we use a load balancer to scale Solr through replicas, and as a result of this issue, we retrieve various results for the same queries all the time. They are not necessarily completely different, but even a couple of items that differ is a deal breaker. The expected behaviour would be to always get identical results for the same search queries from all replicas. Otherwise, this cloud thing works just unreliably. 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head
[ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924106#comment-13924106 ] David commented on SOLR-5773: - I've got it running in a sandbox environment. Seems to be functioning without error under load of up to 3000 requests per minute, though most of these queries don't have elevated documents in their result set. But I haven't seen any errors so far. CollapsingQParserPlugin should make elevated documents the group head - Key: SOLR-5773 URL: https://issues.apache.org/jira/browse/SOLR-5773 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.6.1 Reporter: David Assignee: Joel Bernstein Labels: collapse, solr Fix For: 4.8 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch Original Estimate: 8h Remaining Estimate: 8h Hi Joel, I sent you an email but I'm not sure if you received it or not. I ran into a bit of trouble using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to exclude grouped documents when one of the members of the group are contained in the elevated document set. I'm not sure this is possible currently because as you explain above elevated documents are added to the request context after the original query is constructed. To try to better illustrate the problem. If I have 2 documents docid=1 and docid=2 and both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have elevated docid 1 then both documents are shown in the results when I really only want the elevated document to be shown in the results. Is this something that would be difficult to implement? Any help is appreciated. I think the solution would be to remove the documents from liveDocs that share the same groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue working towards a solution in the meantime. 
{code}
private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
  IntOpenHashSet boostDocs = null;
  if (boosted != null) {
    SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
    String fieldName = idField.getName();
    HashSet<BytesRef> localBoosts = new HashSet<>(boosted.size() * 2);
    Iterator<String> boostedIt = boosted.iterator();
    while (boostedIt.hasNext()) {
      localBoosts.add(new BytesRef(boostedIt.next()));
    }
    boostDocs = new IntOpenHashSet(boosted.size() * 2);
    List<AtomicReaderContext> leaves = indexSearcher.getTopReaderContext().leaves();
    TermsEnum termsEnum = null;
    DocsEnum docsEnum = null;
    for (AtomicReaderContext leaf : leaves) {
      AtomicReader reader = leaf.reader();
      int docBase = leaf.docBase;
      Bits liveDocs = reader.getLiveDocs();
      Terms terms = reader.terms(fieldName);
      termsEnum = terms.iterator(termsEnum);
      Iterator<BytesRef> it = localBoosts.iterator();
      while (it.hasNext()) {
        BytesRef ref = it.next();
        if (termsEnum.seekExact(ref)) {
          docsEnum = termsEnum.docs(liveDocs, docsEnum);
          int doc = docsEnum.nextDoc();
          if (doc != -1) {
            // Found the document.
            boostDocs.add(doc + docBase);
            // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
            it.remove();
          }
        }
      }
    }
  }
  return boostDocs;
}
{code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
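[Editor's note] The behavior David asks for above — suppress every member of a group when one member is elevated — can be sketched outside the plugin as a two-pass collapse. This is a hypothetical illustration, not the CollapsingQParserPlugin internals; the doc-id/group-id shapes are simplified stand-ins.

```java
import java.util.*;

public class ElevatedGroupFilterSketch {

    // Hypothetical sketch: when any member of a group is elevated, the
    // elevated document becomes the group head and the rest of the group is
    // suppressed; groups without an elevated member keep their top-scoring
    // member as usual. Names and shapes are illustrative only.
    public static List<Integer> collapse(List<Integer> docsByScore,
                                         Map<Integer, String> groupOf,
                                         Set<Integer> elevated) {
        Set<String> taken = new HashSet<>();
        List<Integer> out = new ArrayList<>();
        // First pass: elevated docs claim their groups.
        for (int doc : docsByScore) {
            if (elevated.contains(doc) && taken.add(groupOf.get(doc))) {
                out.add(doc);
            }
        }
        // Second pass: unclaimed groups keep their best-scoring member.
        for (int doc : docsByScore) {
            if (!elevated.contains(doc) && taken.add(groupOf.get(doc))) {
                out.add(doc);
            }
        }
        return out;
    }
}
```

With docs scored [2, 1], both in group 'a', and doc 1 elevated, only doc 1 comes back — the outcome requested in the report above.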
Re: Under what circumstances is termsEnum's next(), seekExact(), or seekCeil() more efficient?
On Wed, Mar 5, 2014 at 4:34 PM, hao yan hyan2...@gmail.com wrote: Hi, Michael 1. We find actually both are costly. I am not sure what the difference is between "first next() only once + seekExact from then on" and "always seekExact". I mean, the first call of next() and the first call of seekExact(), are they different? If what next() does is to load a block of data and position to the beginning of the block, and seekExact() is to load a block and position to the target, then next() should be more efficient, right? The first next() call is not that different from seekExact: it must load the block containing the first term and read bytes from it. After that, next() should be cheaper than seekExact. 2. Is multiFields/multiTerms/multiTermsEnum efficient? We always have a fixed number (three) of segments. We want to search on the three segments for each query. Therefore we borrowed most of the code of multixxx. Is there any way to optimize this? They are relatively efficient? I mean, they must merge-sort the terms, and manage N segments that might have a term under the hood, but it's the best we can do (unless you can forceMerge). But it's better to operate per-segment if you care about performance. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
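[Editor's note] The merge Mike describes — MultiTermsEnum merge-sorting the sorted term dictionaries of N segments via a priority queue of sub-enums — can be sketched on plain sorted lists. This is an illustration of the idea, not the actual Lucene classes; all names here are stand-ins.

```java
import java.util.*;

public class MultiTermsMergeSketch {

    // K-way merge over per-segment sorted term lists, analogous to what
    // MultiTermsEnum does with a priority queue of sub-enums. A term that
    // appears in several segments is emitted only once.
    public static List<String> mergeTerms(List<List<String>> segmentTerms) {
        // Each queue entry is {segmentIndex, positionInThatSegment},
        // ordered by the term it currently points at.
        PriorityQueue<int[]> pq = new PriorityQueue<>(
                Comparator.comparing((int[] e) -> segmentTerms.get(e[0]).get(e[1])));
        for (int seg = 0; seg < segmentTerms.size(); seg++) {
            if (!segmentTerms.get(seg).isEmpty()) {
                pq.add(new int[] {seg, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            String term = segmentTerms.get(top[0]).get(top[1]);
            // Skip duplicates across segments.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(term)) {
                merged.add(term);
            }
            // Advance this segment's cursor and re-enter the queue.
            if (top[1] + 1 < segmentTerms.get(top[0]).size()) {
                pq.add(new int[] {top[0], top[1] + 1});
            }
        }
        return merged;
    }
}
```

This is why operating per-segment is cheaper: each per-segment enum walks its dictionary sequentially, while the multi view pays the queue maintenance on every step.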
[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head
[ https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924123#comment-13924123 ] David commented on SOLR-5773: - oh I see it looks like you just added another test CollapsingQParserPlugin should make elevated documents the group head - Key: SOLR-5773 URL: https://issues.apache.org/jira/browse/SOLR-5773 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.6.1 Reporter: David Assignee: Joel Bernstein Labels: collapse, solr Fix For: 4.8 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch Original Estimate: 8h Remaining Estimate: 8h Hi Joel, I sent you an email but I'm not sure if you received it or not. I ran into a bit of trouble using the CollapsingQParserPlugin with elevated documents. To explain it simply, I want to exclude grouped documents when one of the members of the group are contained in the elevated document set. I'm not sure this is possible currently because as you explain above elevated documents are added to the request context after the original query is constructed. To try to better illustrate the problem. If I have 2 documents docid=1 and docid=2 and both have a groupid of 'a'. If a grouped query scores docid 2 first in the results but I have elevated docid 1 then both documents are shown in the results when I really only want the elevated document to be shown in the results. Is this something that would be difficult to implement? Any help is appreciated. I think the solution would be to remove the documents from liveDocs that share the same groupid in the getBoostDocs() function. Let me know if this makes any sense. I'll continue working towards a solution in the meantime. 
{code}
private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, Set<String> boosted) throws IOException {
  IntOpenHashSet boostDocs = null;
  if (boosted != null) {
    SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
    String fieldName = idField.getName();
    HashSet<BytesRef> localBoosts = new HashSet<>(boosted.size() * 2);
    Iterator<String> boostedIt = boosted.iterator();
    while (boostedIt.hasNext()) {
      localBoosts.add(new BytesRef(boostedIt.next()));
    }
    boostDocs = new IntOpenHashSet(boosted.size() * 2);
    List<AtomicReaderContext> leaves = indexSearcher.getTopReaderContext().leaves();
    TermsEnum termsEnum = null;
    DocsEnum docsEnum = null;
    for (AtomicReaderContext leaf : leaves) {
      AtomicReader reader = leaf.reader();
      int docBase = leaf.docBase;
      Bits liveDocs = reader.getLiveDocs();
      Terms terms = reader.terms(fieldName);
      termsEnum = terms.iterator(termsEnum);
      Iterator<BytesRef> it = localBoosts.iterator();
      while (it.hasNext()) {
        BytesRef ref = it.next();
        if (termsEnum.seekExact(ref)) {
          docsEnum = termsEnum.docs(liveDocs, docsEnum);
          int doc = docsEnum.nextDoc();
          if (doc != -1) {
            // Found the document.
            boostDocs.add(doc + docBase);
            // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
            it.remove();
          }
        }
      }
    }
  }
  return boostDocs;
}
{code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5818) distrib search with custom comparator does not quite work correctly
[ https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924124#comment-13924124 ] ASF subversion and git services commented on SOLR-5818: --- Commit 1575344 from [~rjernst] in branch 'dev/trunk' [ https://svn.apache.org/r1575344 ] SOLR-5818: distrib search with custom comparator does not quite work correctly distrib search with custom comparator does not quite work correctly --- Key: SOLR-5818 URL: https://issues.apache.org/jira/browse/SOLR-5818 Project: Solr Issue Type: Bug Reporter: Ryan Ernst Fix For: 4.8, 5.0 Attachments: SOLR-5818.patch In QueryComponent.doFieldSortValues, a scorer is never set on a custom comparator. We just need to add a fake scorer that can pass through the score from the DocList. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
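[Editor's note] The "fake scorer that can pass through the score from the DocList" might look like the following. This is a hypothetical illustration of the idea, not the actual SOLR-5818 patch, and it omits Lucene's real Scorer superclass.

```java
// Illustrative stand-in: instead of computing a score, the fake scorer hands
// back the score already recorded in the DocList, so a custom FieldComparator
// that calls setScorer()/score() during doFieldSortValues sees the right value.
class PassThroughScorer {
    private float current;

    // Called for each document as the DocList is walked.
    void setCurrentScore(float score) {
        this.current = score;
    }

    // What the custom comparator reads back.
    float score() {
        return current;
    }
}
```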
[jira] [Updated] (SOLR-5818) distrib search with custom comparator does not quite work correctly
[ https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst updated SOLR-5818: - Fix Version/s: 5.0 4.8 distrib search with custom comparator does not quite work correctly --- Key: SOLR-5818 URL: https://issues.apache.org/jira/browse/SOLR-5818 Project: Solr Issue Type: Bug Reporter: Ryan Ernst Assignee: Ryan Ernst Fix For: 4.8, 5.0 Attachments: SOLR-5818.patch In QueryComponent.doFieldSortValues, a scorer is never set on a custom comparator. We just need to add a fake scorer that can pass through the score from the DocList. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5818) distrib search with custom comparator does not quite work correctly
[ https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst reassigned SOLR-5818: Assignee: Ryan Ernst distrib search with custom comparator does not quite work correctly --- Key: SOLR-5818 URL: https://issues.apache.org/jira/browse/SOLR-5818 Project: Solr Issue Type: Bug Reporter: Ryan Ernst Assignee: Ryan Ernst Fix For: 4.8, 5.0 Attachments: SOLR-5818.patch In QueryComponent.doFieldSortValues, a scorer is never set on a custom comparator. We just need to add a fake scorer that can pass through the score from the DocList. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5821) Search inconsistency on SolrCloud replicas
[ https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924127#comment-13924127 ] Maxim Novikov commented on SOLR-5821: - PS Regarding misunderstanding and stuff like that... This behavior is unexpected for me. As I wrote, I have a load balancer that redirects queries to Solr's replicas having the only shard, and running the same query (even not specifying any additional parameters), I expect to retrieve the same results. You can tell anything about how Solr is implemented internally, but from the perspective of Solr's user (search's user) I should not care about that at all. That was the point. If you disagree and think that this is sort of a feature, not a bug/issue, that is still good to keep this stuff in JIRA. The other people who face the same issue will be able to find it, read Solr developers' responses, and judge for themselves whether this feature fits the search solution they want to get or not. Search inconsistency on SolrCloud replicas -- Key: SOLR-5821 URL: https://issues.apache.org/jira/browse/SOLR-5821 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: SolrCloud: 1 shard, 2 replicas Both instances/replicas have identical hardware/software: CPU(s): 4 RAM: 8Gb HDD: 100Gb OS: CentOS 6.5 ZooKeeper 3.4.5 Tomcat 8.0.3 Solr 4.6.1 Servers are utilized to run Solr only. Reporter: Maxim Novikov Priority: Critical Labels: cloud, inconsistency, replica, search We use the following infrastructure: SolrCloud with 1 shard and 2 replicas. The index is built using DataImportHandler (importing data from the database). The number of items in the index can vary from 100 to 100,000,000. After indexing part of the data (not necessarily all the data, it is enough to have a small number of items in the search index), we can observe that Solr instances (replicas) return different results for the same search queries. 
I believe it happens because some of the results have the same scores, and Solr instances return those in a random order. PS This is a critical issue for us as we use a load balancer to scale Solr through replicas, and as a result of this issue, we retrieve various results for the same queries all the time. They are not necessarily completely different, but even a couple of items that differ is a deal breaker. The expected behaviour would be to always get identical results for the same search queries from all replicas. Otherwise, this cloud thing works just unreliably. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
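[Editor's note] The usual remedy for equal-score ties, hedged here as a suggestion rather than an official fix, is a deterministic secondary sort on the unique key (e.g. `sort=score desc, id asc`), so every replica orders tied documents identically. A minimal sketch of that comparator:

```java
import java.util.*;

public class TieBreakSortSketch {

    // Sort result ids by score descending, breaking ties by id ascending.
    // The score map is an illustrative stand-in for per-document scores.
    public static List<String> sortResults(Map<String, Double> scoreById) {
        List<String> ids = new ArrayList<>(scoreById.keySet());
        ids.sort((a, b) -> {
            int byScore = Double.compare(scoreById.get(b), scoreById.get(a)); // score desc
            return byScore != 0 ? byScore : a.compareTo(b);                   // id asc tie-break
        });
        return ids;
    }
}
```

The extra comparison only runs on ties, which speaks to the performance question raised in the first comment: the added cost is one key comparison per tied pair, not a re-scoring of the result set.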
[jira] [Assigned] (LUCENE-5488) FilteredQuery.explain does not honor FilterStrategy
[ https://issues.apache.org/jira/browse/LUCENE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch reassigned LUCENE-5488: - Assignee: Michael Busch FilteredQuery.explain does not honor FilterStrategy --- Key: LUCENE-5488 URL: https://issues.apache.org/jira/browse/LUCENE-5488 Project: Lucene - Core Issue Type: Bug Components: core/search Affects Versions: 4.6.1 Reporter: John Wang Assignee: Michael Busch Attachments: LUCENE-5488.patch, LUCENE-5488.patch Some Filter implementations produce DocIdSets without the iterator() implementation, such as o.a.l.facet.range.Range.getFilter(). It is done with the intention to be used in conjunction with FilteredQuery with FilterStrategy set to be QUERY_FIRST_FILTER_STRATEGY for performance reasons. However, this behavior is not honored by FilteredQuery.explain where docidset.iterator is called regardless and causing such valid usages of above filter types to fail. The fix is to check bits() first and and fall back to iterator if bits is null. In which case, the input Filter is indeed bad. See attached unit test, which fails without this patch. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
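[Editor's note] The fix described above — consult bits() first and fall back to the iterator only when bits() is null — can be sketched against simplified stand-ins for the DocIdSet API. The real classes in FilteredQuery.explain differ; this only illustrates the control flow.

```java
// Simplified stand-ins for Lucene's Bits and DocIdSet; illustrative only.
interface SimpleBits {
    boolean get(int doc);
}

class SimpleDocIdSet {
    private final SimpleBits bits;   // random-access view; may be null
    private final int[] sortedDocs;  // iterator-style view; may be null

    SimpleDocIdSet(SimpleBits bits, int[] sortedDocs) {
        this.bits = bits;
        this.sortedDocs = sortedDocs;
    }

    // Check bits() first, fall back to the iterator, and fail only when the
    // set supports neither -- in which case the input Filter is indeed bad.
    boolean matches(int doc) {
        if (bits != null) {
            return bits.get(doc);
        }
        if (sortedDocs != null) {
            return java.util.Arrays.binarySearch(sortedDocs, doc) >= 0;
        }
        throw new IllegalStateException("DocIdSet supports neither bits() nor iterator()");
    }
}
```

A bits-only set (like the one o.a.l.facet.range.Range.getFilter() produces) now explains correctly instead of failing on a missing iterator.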
[jira] [Commented] (SOLR-5823) Add utility function for internal code to know if it is currently the overseer
[ https://issues.apache.org/jira/browse/SOLR-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924138#comment-13924138 ] Hoss Man commented on SOLR-5823: miller i talked a bit about this on IRC this morning, a few summary points... * the reasons i'm looking for a general am i the leader type method, that can be run as part of a scheduled executor -- instead of adding a new processing thread to the Overseer class is twofold: ** i want the logic to be usable even if we aren't in cloud mode ** i'm trying to think about how other people who write plugins/components would be able to do the same thing w/o needing to modify the overseer. * Tim's patch goes the route of ensuring every node can ask what is the name of the overseer node and then implements am i the overseer node by comparing our name with the overseer name ** in the case of who is the shard leader and am i the shard leader that info is cached in the cluster state info, so calling those methods doesn't hit ZK every time ** we don't want to cache the overseer info in a similar way, because it's risky and 99% of the time, nodes don't care who is the overseer Which brought me to the key question where miller i realized we had gotten sidetracked... * i don't really care about the what is the name of the overseer node case -- and most people shouldn't -- i'm really just looking for the am i currently the overseer? 
part of the equation ** this as a simple boolean should be a much easier question to answer efficiently, because of how the overseer election works -- if a node is the overseer, it's running the overseer processing threads ** part of my confusion was the terminology: the idea of Leader is used a lot in the overseer code, but that's not referring to shard leader in the solr context, it's referring to the ZK jargon of leader election, in many cases (in the overseer classes) it refers to who is the (zk leader in charge of being the) overseer At this point, miller got disconnected from IRC ... but digging in a bit and thinking about what he was telling me, it seems like we should be able to add an efficient ZkController.isOverseer() method (that doesn't have to hit Zk directly), by checking if the Overseer object is active or closed -- either with a new state boolean, or maybe just by checking the threads it manages for null Add utility function for internal code to know if it is currently the overseer -- Key: SOLR-5823 URL: https://issues.apache.org/jira/browse/SOLR-5823 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-5823.patch It would be useful if there was some Overseer equivalent to CloudDescriptor.isLeader() that plugins running in solr could use to know At this moment, am i the leader? -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
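[Editor's note] The closing suggestion — answer "am i the overseer?" from a local state flag flipped when the overseer processing threads start and stop, rather than hitting ZooKeeper — might look like this. All names are hypothetical, not the actual ZkController/Overseer API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the suggested ZkController.isOverseer() approach:
// the node flips a local flag when its overseer processing threads start or
// stop, so callers (including plugins outside the Overseer class) get a
// cheap answer without a ZooKeeper round trip.
class OverseerStateSketch {
    private final AtomicBoolean amOverseer = new AtomicBoolean(false);

    void onOverseerStart() { amOverseer.set(true); }   // won the election
    void onOverseerClose() { amOverseer.set(false); }  // closed or lost it

    boolean isOverseer() { return amOverseer.get(); }
}
```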
[jira] [Updated] (LUCENE-5496) Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends
[ https://issues.apache.org/jira/browse/LUCENE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5496: Attachment: LUCENE-5496-lucene_core_sandbox_v1.patch This is a first pass at nuking minsims in Lucene core and sandbox in trunk. More work remains in queryparser and in Solr. I've Ignored the test in TestSlowFuzzyQuery2 for now... Will continue work if anyone has an interest. If not, this will go on hold. Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends - Key: LUCENE-5496 URL: https://issues.apache.org/jira/browse/LUCENE-5496 Project: Lucene - Core Issue Type: Task Components: core/queryparser, core/search Affects Versions: 4.8, 5.0 Reporter: Tim Allison Priority: Minor Attachments: LUCENE-5496-lucene_core_sandbox_v1.patch, LUCENE-5496_4x_deprecations.patch As we get closer to 5.0, I propose adding some deprecations in the queryparsers realm of 4.x. Are we ready to get rid of all fuzzyMinSims in trunk? -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5818) distrib search with custom comparator does not quite work correctly
[ https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst resolved SOLR-5818. -- Resolution: Fixed distrib search with custom comparator does not quite work correctly --- Key: SOLR-5818 URL: https://issues.apache.org/jira/browse/SOLR-5818 Project: Solr Issue Type: Bug Reporter: Ryan Ernst Assignee: Ryan Ernst Fix For: 4.8, 5.0 Attachments: SOLR-5818.patch In QueryComponent.doFieldSortValues, a scorer is never set on a custom comparator. We just need to add a fake scorer that can pass through the score from the DocList. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org