[jira] [Comment Edited] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer
[ https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984076#comment-13984076 ] Uwe Schindler edited comment on SOLR-6022 at 4/29/14 6:52 AM: -- [~tomasflobbe]: The patch for 4.x has to be applied after merge from trunk, so first apply patch to trunk, merge that to 4.x and then apply the 4.x patch on-top (which restores the deprecated method): bq. The second patch shows exactly how I will do the deprecation in branch_4x (I believe I should be able to just apply the patch after doing a merge back from trunk). To test that everything works without merging/committing, apply both patches to the same 4.x checkout, the general first, then the 4.x one. was (Author: thetaphi): [~tomasflobbe]: The patch for 4.x has to be applied after merge from trunk, so first apply patch to trunk, merge that to 4.x and then apply the 4.x patch on-top (which restores the deprecated method): bq. The second patch shows exactly how I will do the deprecation in branch_4x (I believe I should be able to just apply the patch after doing a merge back from trunk). > Rename getAnalyzer to getIndexAnalyzer > -- > > Key: SOLR-6022 > URL: https://issues.apache.org/jira/browse/SOLR-6022 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, > SOLR-6022.patch > > > We have separate index/query analyzer chains, but the access methods for the > analyzers do not match up with the names. This can lead to unknowingly using > the wrong analyzer chain (as it did in SOLR-6017). We should do this > renaming in trunk, and deprecate the old getAnalyzer function in 4x. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer
[ https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984076#comment-13984076 ] Uwe Schindler commented on SOLR-6022: - [~tomasflobbe]: The patch for 4.x has to be applied after merge from trunk, so first apply patch to trunk, merge that to 4.x and then apply the 4.x patch on-top (which restores the deprecated method): bq. The second patch shows exactly how I will do the deprecation in branch_4x (I believe I should be able to just apply the patch after doing a merge back from trunk). > Rename getAnalyzer to getIndexAnalyzer > -- > > Key: SOLR-6022 > URL: https://issues.apache.org/jira/browse/SOLR-6022 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, > SOLR-6022.patch > > > We have separate index/query analyzer chains, but the access methods for the > analyzers do not match up with the names. This can lead to unknowingly using > the wrong analyzer chain (as it did in SOLR-6017). We should do this > renaming in trunk, and deprecate the old getAnalyzer function in 4x. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6031) Getting Cannot find symbol while Compiling the java file.
Vikash Kumar Singh created SOLR-6031: Summary: Getting Cannot find symbol while Compiling the java file. Key: SOLR-6031 URL: https://issues.apache.org/jira/browse/SOLR-6031 Project: Solr Issue Type: Task Components: clients - java Environment: CentOS 6.5, Solr 4.7.1, java version "1.7.0_51" Reporter: Vikash Kumar Singh Priority: Minor

Here is the code, which I am using just for testing purposes first on the console:

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;
import java.net.MalformedURLException;

public class SolrJSearcher {
    public static void main(String[] args) throws MalformedURLException, SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery();
        query.setQuery("sony digital camera");
        query.addFilterQuery("cat:electronics", "store:amazon.com");
        query.setFields("id", "price", "merchant", "cat", "store");
        query.setStart(0);
        query.set("defType", "edismax");
        QueryResponse response = solr.query(query);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); ++i) {
            System.out.println(results.get(i));
        }
    }
}

I have also set the classpath as

export CLASSPATH=/home/vikash/solr-4.7.1/dist/*.jar:/home/vikash/solr-4.7.1/dist/solrj-lib/*.jar

but still, while compiling, I get these errors. I don't know what to do now, please help:

[root@localhost vikash]# javac SolrJSearcher.java
SolrJSearcher.java:1: package org.apache.solr.client.solrj does not exist
import org.apache.solr.client.solrj.SolrServerException;
^
SolrJSearcher.java:2: package org.apache.solr.client.solrj.impl does not exist
import org.apache.solr.client.solrj.impl.HttpSolrServer;
^
SolrJSearcher.java:3: package org.apache.solr.client.solrj does not exist
import org.apache.solr.client.solrj.SolrQuery;
^
SolrJSearcher.java:4: package org.apache.solr.client.solrj.response does not exist
import org.apache.solr.client.solrj.response.QueryResponse;
^
SolrJSearcher.java:5: package org.apache.solr.common does not exist
import org.apache.solr.common.SolrDocumentList;
^
SolrJSearcher.java:10: cannot find symbol
symbol  : class SolrServerException
location: class SolrJSearcher
public static void main(String[] args) throws MalformedURLException,SolrServerException
^
SolrJSearcher.java:12: cannot find symbol
symbol  : class HttpSolrServer
location: class SolrJSearcher
HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
^
SolrJSearcher.java:12: cannot find symbol
symbol  : class HttpSolrServer
location: class SolrJSearcher
HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
^
SolrJSearcher.java:13: cannot find symbol
symbol  : class SolrQuery
location: class SolrJSearcher
SolrQuery query = new SolrQuery();
^
SolrJSearcher.java:13: cannot find symbol
symbol  : class SolrQuery
location: class SolrJSearcher
SolrQuery query = new SolrQuery();
^
SolrJSearcher.java:19: cannot find symbol
symbol  : class QueryResponse
location: class SolrJSearcher
QueryResponse response = solr.query(query);
^
SolrJSearcher.java:20: cannot find symbol
symbol  : class SolrDocumentList
location: class SolrJSearcher
SolrDocumentList results = response.getResults();
^
12 errors
[root@localhost vikash]#
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
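The compile failure above comes from the classpath, not the code: javac never sees the SolrJ jars. Java's classpath wildcard is a bare "*" (which expands to the .jar files in that directory); "*.jar" is not understood by javac/java, so the export in the report matches nothing. A hedged sketch of a working invocation, using the paths from the report (jar layout may differ per release):

```shell
# Use a bare '*' per directory; quote it so the shell does not expand it first.
CP='/home/vikash/solr-4.7.1/dist/*:/home/vikash/solr-4.7.1/dist/solrj-lib/*'
echo "$CP"
# Then (not run here, requires the Solr install):
# javac -cp "$CP" SolrJSearcher.java
# java  -cp "$CP:." SolrJSearcher
```

The same form works via `export CLASSPATH="$CP"`, as long as the wildcard stays a bare `*`.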
[jira] [Updated] (SOLR-6030) Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm
[ https://issues.apache.org/jira/browse/SOLR-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe updated SOLR-6030: Attachment: SOLR-6030.patch > Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm > -- > > Key: SOLR-6030 > URL: https://issues.apache.org/jira/browse/SOLR-6030 > Project: Solr > Issue Type: Improvement >Reporter: Tomás Fernández Löbbe >Priority: Trivial > Attachments: SOLR-6030.patch > > > Most of these cases were addressed in SOLR-5734, but it looks like LRUCache > wasn't. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6030) Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm
Tomás Fernández Löbbe created SOLR-6030: --- Summary: Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm Key: SOLR-6030 URL: https://issues.apache.org/jira/browse/SOLR-6030 Project: Solr Issue Type: Improvement Reporter: Tomás Fernández Löbbe Priority: Trivial Most of these cases were addressed in SOLR-5734, but it looks like LRUCache wasn't. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
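The point of SOLR-6030 (and the earlier SOLR-5734) is that elapsed-time measurement should use a monotonic clock. A minimal sketch of the pattern, with hypothetical names (this is not Solr's LRUCache.warm code): System.nanoTime() never goes backward, while System.currentTimeMillis() can jump when the wall clock is adjusted, producing negative or wildly wrong warm times.

```java
import java.util.concurrent.TimeUnit;

public class WarmTimer {
    // Measure how long a task takes using the monotonic clock.
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();                 // monotonic start mark
        task.run();
        long elapsedNanos = System.nanoTime() - start;  // always >= 0
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        long ms = elapsedMillis(() -> { /* cache warming would happen here */ });
        System.out.println(ms >= 0);                    // true: nanoTime never goes backward
    }
}
```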
[jira] [Resolved] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-6029. -- Resolution: Fixed Fix Version/s: 4.9 4.8.1 > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Fix For: 4.8.1, 4.9 > > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies a document as found in a segment when > the docid previously existed in that segment but has since been deleted. > The relevant code in CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. This > would then get set in the fq bitSet array as the doc location, causing an > ArrayIndexOutOfBoundsException, as the array is only as big as maxDoc. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
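The failure mode in SOLR-6029 can be sketched in a few lines. This is not Solr's actual code (names here are hypothetical); it only illustrates why both sentinels matter: "not found" can come back as NO_MORE_DOCS, which in Lucene's DocIdSetIterator is Integer.MAX_VALUE, so using it as an index into a maxDoc-sized array throws ArrayIndexOutOfBoundsException.

```java
public class CollapseGuard {
    // Mirrors DocIdSetIterator.NO_MORE_DOCS in Lucene.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Sets the bit only when doc is a real document id; this is the shape of
    // the fix in the patch: check -1 AND NO_MORE_DOCS before indexing.
    static boolean markIfFound(boolean[] bits, int doc) {
        if ((doc != -1) && (doc != NO_MORE_DOCS)) {
            bits[doc] = true;
            return true;
        }
        return false;   // without the NO_MORE_DOCS check, bits[doc] would throw AIOOBE
    }

    public static void main(String[] args) {
        boolean[] bits = new boolean[8];  // stands in for the maxDoc-sized fq bitset
        System.out.println(markIfFound(bits, 3));             // true
        System.out.println(markIfFound(bits, NO_MORE_DOCS));  // false
    }
}
```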
[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984006#comment-13984006 ] ASF subversion and git services commented on SOLR-6029: --- Commit 1590868 from [~joel.bernstein] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1590868 ] SOLR-6029: Updated CHANGES.txt > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984004#comment-13984004 ] ASF subversion and git services commented on SOLR-6029: --- Commit 1590867 from [~joel.bernstein] in branch 'dev/trunk' [ https://svn.apache.org/r1590867 ] SOLR-6029: Updated CHANGES.txt > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983999#comment-13983999 ] ASF subversion and git services commented on SOLR-6029: --- Commit 1590866 from [~joel.bernstein] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1590866 ] SOLR-6029: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983992#comment-13983992 ] ASF subversion and git services commented on SOLR-6029: --- Commit 1590865 from [~joel.bernstein] in branch 'dev/trunk' [ https://svn.apache.org/r1590865 ] SOLR-6029: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5963) Finalize interface and backport analytics component to 4x
[ https://issues.apache.org/jira/browse/SOLR-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983987#comment-13983987 ] Erick Erickson commented on SOLR-5963: -- Hmmm, there have been several discussions around this, and the question now is whether this should be back-ported or not. Given that the current stats component doesn't support distributed Solr, one suggestion is to move this to a contrib for the time being and then put distributed statistics into the mainline code as we can. This may mean there are fewer capabilities. If that's acceptable, I'll start working toward that goal. So that would mean: 1> pull this out of trunk 2> put this into a contrib on trunk 3> backport the contrib to 4x. > Finalize interface and backport analytics component to 4x > - > > Key: SOLR-5963 > URL: https://issues.apache.org/jira/browse/SOLR-5963 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.9, 5.0 >Reporter: Erick Erickson >Assignee: Erick Erickson > Attachments: SOLR-5963.patch, SOLR-5963.patch > > > Now that we seem to have fixed up the test failures for trunk for the > analytics component, we need to solidify the API and back-port it to 4x. For > history, see SOLR-5302 and SOLR-5488. 
> As far as I know, these are the merges that need to occur to do this (plus > any that this JIRA brings up) > svn merge -c 1543651 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545009 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545053 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545054 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545080 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545143 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545417 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545514 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1545650 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1546074 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1546263 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1559770 https://svn.apache.org/repos/asf/lucene/dev/trunk > svn merge -c 1583636 https://svn.apache.org/repos/asf/lucene/dev/trunk > The only remaining thing I think needs to be done is to solidify the > interface, see comments from [~yo...@apache.org] on the two JIRAs mentioned, > although SOLR-5488 is the most relevant one. > [~sbower], [~houstonputman] and [~yo...@apache.org] might be particularly > interested here. > I really want to put this to bed, so if we can get agreement on this soon I > can make it march. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983972#comment-13983972 ] Joel Bernstein commented on SOLR-6029: -- Thanks Greg, this is a nasty bug. > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-6029: Assignee: Joel Bernstein > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Assignee: Joel Bernstein >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Simple unit test doco?
Many times cloud-related/distrib tests fail due to timeouts; this could be related to the overall load of your computer (probably generated by the tests themselves). I don't know if this is the correct way, but I found that they are much less likely to fail if I use fewer JVMs to run the tests (by default my Mac would use 4, but I set it to 2 if I see failures; you can use the JVM parameter "tests.jvms" when running ant test). If you are working on some specific component you can filter which tests to run in many ways, see "ant test-help". It may be useful to use tests.slow=false to skip the slow tests in most of your runs. "do I need to turn on a ZK server for integration testing?" No, you don't. Solr will start an embedded ZooKeeper for the tests. "I've tried running those tests in isolation via IntelliJ and they all report as passing" Most probably it's not related to this, but just in case: make sure, when you try to reproduce a test failure you saw, to use the same seed (-Dtests.seed). The seed should be in the output of the run where you saw the failure. [junit4] Tests with failures: [junit4] - org.apache.solr.hadoop.MorphlineMapperTest (suite) [junit4] Sorry, no idea about this one. On Mon, Apr 28, 2014 at 7:47 PM, Greg Pendlebury wrote: > Heyo, > > I'm wondering if there is any additional doco and/or tricks to unit > testing solr than this wiki page? http://wiki.apache.org/solr/TestingSolr > > Some details about my troubles are below if anyone cares to read, but I'm > not so much looking for specific responses to why individual tests are > failing. I'm more trying to work out whether I'm on the right track or > missing some key information... like do I need to turn on a ZK server for > integration testing? > > Or do I need to accept failed unit tests as a baseline before applying our > patch? 
I don't typically like that, but this is an enormous test suite and > I'd be happy just to get a pass up to the same level that 4.7.2 had prior > to release. > > Ta, > Greg > > > Details > == > I downloaded the tagged 4.7.2 release Yesterday to apply a patch our team > wants to test, but even before touching the codebase at all I cannot get > the unit tests to pass. I'm struggling to even get consistent results. > > The most useful two end points I reach are: >[junit4] Tests with failures: >[junit4] - > org.apache.solr.cloud.CustomCollectionTest.testDistribSearch >[junit4] - > org.apache.solr.cloud.DistribCursorPagingTest.testDistribSearch >[junit4] - org.apache.solr.cloud.DistribCursorPagingTest (suite) >[junit4] > ... >[junit4] Execution time total: 2 hours 6 minutes 50 seconds >[junit4] Tests summary: 365 suites, 1570 tests, 1 suite-level error, 2 > errors, 187 ignored (12 assumptions) > > And another one (don't have the terminal output on hand unfortunately) in > the cloudera morphline suite. It is the same error as this though and fails > after around an hour: > http://mail-archives.apache.org/mod_mbox/flume-dev/201310.mbox/%3ccac6yyrj2cv89hntdeel7t0qlq8zjbwjynbtcveucxlzdmyv...@mail.gmail.com%3E > > I've tried running those tests in isolation via IntelliJ and they all > report as passing... the logs show exceptions about ZK session expiry for > some (not all) but I assume those are trapped expected exceptions since > JUnit is passing them? > > Given the response in the message I linked just above re: windows support > I tried shifting the build up to a RHEL6 server this morning but I've tried > two runs now and both failed with this odd error: >[junit4] Tests with failures: >[junit4] - org.apache.solr.hadoop.MorphlineMapperTest (suite) >[junit4] > ... >[junit4] Execution time total: 42 seconds >[junit4] Tests summary: 7 suites, 35 tests, 2 suite-level errors, 5 > ignored > > I only say odd because they run for half an hour and then report 42 > seconds. 
> > Thanks again if you've read all this. >
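Putting the advice from this thread together, the ant invocations could look like the following. These are hypothetical examples (they assume a Lucene/Solr checkout with ant on the PATH, and the suite name and seed are illustrative), so they are shown as strings here rather than executed:

```shell
# Run with fewer forked JVMs and skip the slow tests.
RUN_FAST='ant test -Dtests.jvms=2 -Dtests.slow=false'
# Re-run one suite with the seed reported in a failing run's output.
REPRO='ant test -Dtestcase=DistribCursorPagingTest -Dtests.seed=DEADBEEF'
echo "$RUN_FAST"
echo "$REPRO"
```

Run `ant test-help` in the checkout for the full list of filtering properties.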
Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf
+1 On Apr 25, 2014, at 5:38 PM, Chris Hostetter wrote: > > (Note: cross posted to general, please confine replies to dev@lucene) > > Please VOTE to release the following RC1 as apache-solr-ref-guide-4.8.pdf ... > > https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.8-RC1 > > > The notes I previously mentioned regarding RC0 apply to this RC as well... > > 1) Due to a known bug in confluence, the PDFs it generates are much bigger > than they should be. This bug has been fixed in the latest version of > confluence, but cwiki.apache.org has not yet been updated. For that reason, > I have manually run a small tool against the PDF to "fix" the size (see > SOLR-5819). The first time I tried this approach, it inadvertently removed > the "Index" (aka: Table of Contents, or Bookmarks, depending on what PDF > reader client you use). I've already fixed this, but if you notice anything > else unusual about this PDF compared to previous versions please speak up so > we can see if it's a result of this post-processing and try to fix it. > > 2) This is the first ref guide release where we've started using a special > confluence macro for any lucene/solr javadoc links. The up side is that all > javadoc links in this 4.8 ref guide will now correctly point to the 4.8 > javadocs on lucene.apache.org -- the down side is that this means none of > those links currently work, since the 4.8 code release is still ongoing and > the website has not yet been updated. > > Because of #2, I intend to leave this ref guide vote open until the 4.8 code > release is final - that way we won't officially be releasing this doc until > the 4.8 javadocs are uploaded and all the links work properly. > > > > -Hoss > http://www.lucidworks.com/ Grant Ingersoll | @gsingers http://www.lucidworks.com
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983928#comment-13983928 ] Noble Paul commented on SOLR-5473: -- I'm almost there . probably today or in the worst case, tomorrow > Make one state.json per collection > -- > > Key: SOLR-5473 > URL: https://issues.apache.org/jira/browse/SOLR-5473 > Project: Solr > Issue Type: Sub-task > Components: SolrCloud >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 5.0 > > Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, > ec2-50-16-38-73_solr.log > > > As defined in the parent issue, store the states of each collection under > /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Simple unit test doco?
Heyo, I'm wondering if there is any additional doco and/or tricks to unit testing solr than this wiki page? http://wiki.apache.org/solr/TestingSolr Some details about my troubles are below if anyone cares to read, but I'm not so much looking for specific responses to why individual tests are failing. I'm more trying to work out whether I'm on the right track or missing some key information... like do I need to turn on a ZK server for integration testing? Or do I need to accept failed unit tests as a baseline before applying our patch? I don't typically like that, but this is an enormous test suite and I'd be happy just to get a pass up to the same level that 4.7.2 had prior to release. Ta, Greg Details == I downloaded the tagged 4.7.2 release Yesterday to apply a patch our team wants to test, but even before touching the codebase at all I cannot get the unit tests to pass. I'm struggling to even get consistent results. The most useful two end points I reach are: [junit4] Tests with failures: [junit4] - org.apache.solr.cloud.CustomCollectionTest.testDistribSearch [junit4] - org.apache.solr.cloud.DistribCursorPagingTest.testDistribSearch [junit4] - org.apache.solr.cloud.DistribCursorPagingTest (suite) [junit4] ... [junit4] Execution time total: 2 hours 6 minutes 50 seconds [junit4] Tests summary: 365 suites, 1570 tests, 1 suite-level error, 2 errors, 187 ignored (12 assumptions) And another one (don't have the terminal output on hand unfortunately) in the cloudera morphline suite. It is the same error as this though and fails after around an hour: http://mail-archives.apache.org/mod_mbox/flume-dev/201310.mbox/%3ccac6yyrj2cv89hntdeel7t0qlq8zjbwjynbtcveucxlzdmyv...@mail.gmail.com%3E I've tried running those tests in isolation via IntelliJ and they all report as passing... the logs show exceptions about ZK session expiry for some (not all) but I assume those are trapped expected exceptions since JUnit is passing them? 
Given the response in the message I linked just above re: windows support I tried shifting the build up to a RHEL6 server this morning but I've tried two runs now and both failed with this odd error: [junit4] Tests with failures: [junit4] - org.apache.solr.hadoop.MorphlineMapperTest (suite) [junit4] ... [junit4] Execution time total: 42 seconds [junit4] Tests summary: 7 suites, 35 tests, 2 suite-level errors, 5 ignored I only say odd because they run for half an hour and then report 42 seconds. Thanks again if you've read all this.
Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf
+1 On Fri, Apr 25, 2014 at 2:38 PM, Chris Hostetter wrote: > > (Note: cross posted to general, please confine replies to dev@lucene) > > Please VOTE to release the following RC1 as apache-solr-ref-guide-4.8.pdf > ... > > https://dist.apache.org/repos/dist/dev/lucene/solr/ref- > guide/apache-solr-ref-guide-4.8-RC1 > > > The notes I previously mentioned regarding RC0 apply to this RC as well... > > 1) Due to a known bug in confluence, the PDFs it generates are much bigger > than they should be. This bug has been fixed in the latest version of > confluence, but cwiki.apache.org has not yet been updated. For that > reason, I have manually run a small tool against the PDF to "fix" the size > (see SOLR-5819). The first time I tried this approach, it inadvertently > removed the "Index" (aka: Table of Contents, or Bookmarks depending on what > PDF reader client you use). I've already fixed this, but if you notice > anything else unusual about this PDF compared to previous versions please > speak up so we can see if it's a result of this post-processing and try to > fix it. > > 2) This is the first ref guide release where we've started using a special > confluence macro for any lucene/solr javadoc links. The up side is that > all javadoc links in this 4.8 ref guide will now correctly point to the 4.8 > javadocs on lucene.apache.org -- the down side is that this means none of > those links currently work, since the 4.8 code release is still ongoing and > the website has not yet been updated. > > Because of #2, I intend to leave this ref guide vote open until the 4.8 > code release is final - that way we won't officially be releasing this doc > until the 4.8 javadocs are uploaded and all the links work properly. > > > > -Hoss > http://www.lucidworks.com/ > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > -- Anshum Gupta http://www.anshumgupta.net
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983867#comment-13983867 ] ASF subversion and git services commented on LUCENE-5611: - Commit 1590858 from [~rcmuir] in branch 'dev/branches/lucene5611' [ https://svn.apache.org/r1590858 ] LUCENE-5611: indexing optimizations, dont compute CRC for internal-use of RAMOutputStream, dont do heavy per-term stuff in skipper until we actually must buffer skipdata > Simplify the default indexing chain > --- > > Key: LUCENE-5611 > URL: https://issues.apache.org/jira/browse/LUCENE-5611 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.9, 5.0 > > Attachments: LUCENE-5611.patch, LUCENE-5611.patch > > > I think Lucene's current indexing chain has too many classes / > hierarchy / abstractions, making it look much more complex than it > really should be, and discouraging users from experimenting/innovating > with their own indexing chains. > Also, if it were easier to understand/approach, then new developers > would more likely try to improve it ... it really should be simpler. > So I'm exploring a pared back indexing chain, and have a starting patch > that I think is looking ok: it seems more approachable than the > current indexing chain, or at least has fewer strange classes. > I also thought this could give some speedup for tiny documents (a more > common use of Lucene lately), and it looks like, with the evil > optimizations, this is a ~25% speedup for Geonames docs. Even without > those evil optos it's a bit faster. > This is very much a work in progress / nocommits, and there are some > behavior changes e.g. the new chain requires all fields to have the > same TV options (rather than auto-upgrading all fields by the same > name that the current chain does)... 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer
[ https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983829#comment-13983829 ] Tomás Fernández Löbbe commented on SOLR-6022: - I don't see why in 4x calls to getAnalyzer() can't be changed to getIndexAnalyzer(). It wouldn't break compatibility and it would avoid creating many warnings. > Rename getAnalyzer to getIndexAnalyzer > -- > > Key: SOLR-6022 > URL: https://issues.apache.org/jira/browse/SOLR-6022 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, > SOLR-6022.patch > > > We have separate index/query analyzer chains, but the access methods for the > analyzers do not match up with the names. This can lead to unknowingly using > the wrong analyzer chain (as it did in SOLR-6017). We should do this > renaming in trunk, and deprecate the old getAnalyzer function in 4x. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
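The deprecation pattern under discussion can be sketched as follows. `FieldTypeSketch` and its `Object`-typed analyzer field are hypothetical stand-ins for Solr's `FieldType` and `Analyzer` (this is not the actual patch); the point is that the old accessor can survive in 4.x as a deprecated delegate, so existing callers keep compiling while internal code switches to the new name:

```java
// Hypothetical stand-in for Solr's FieldType: the old accessor is kept in
// 4.x as a deprecated delegate to the renamed method, so the rename breaks
// neither source nor binary compatibility for subclasses and callers.
public class FieldTypeSketch {
    private final Object indexAnalyzer = new Object(); // stand-in for an Analyzer

    /** @deprecated Use {@link #getIndexAnalyzer()} instead. */
    @Deprecated
    public Object getAnalyzer() {
        return getIndexAnalyzer(); // same object, just the clearer name
    }

    public Object getIndexAnalyzer() {
        return indexAnalyzer;
    }

    public static void main(String[] args) {
        FieldTypeSketch ft = new FieldTypeSketch();
        System.out.println(ft.getAnalyzer() == ft.getIndexAnalyzer()); // true
    }
}
```

Callers compiled against the old method keep working; they merely pick up a deprecation warning pointing them at `getIndexAnalyzer()`.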
[jira] [Updated] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
[ https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Harris updated SOLR-6029: -- Attachment: SOLR-6029.patch Patch with test for 4.7 > CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc > has been deleted from a segment > - > > Key: SOLR-6029 > URL: https://issues.apache.org/jira/browse/SOLR-6029 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.1 >Reporter: Greg Harris >Priority: Minor > Attachments: SOLR-6029.patch > > > CollapsingQParserPlugin misidentifies if a document is not found in a segment > if the docid previously existed in a segment ie was deleted. > Relevant code bit from CollapsingQParserPlugin needs to be changed from: > -if(doc != -1) { > +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { > What happens is if the doc is not found the returned value is > DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the > doc location causing an ArrayIndexOutOfBoundsException as the array is only > as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
Greg Harris created SOLR-6029: - Summary: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment Key: SOLR-6029 URL: https://issues.apache.org/jira/browse/SOLR-6029 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.7.1 Reporter: Greg Harris Priority: Minor Attachments: SOLR-6029.patch CollapsingQParserPlugin misidentifies if a document is not found in a segment if the docid previously existed in a segment ie was deleted. Relevant code bit from CollapsingQParserPlugin needs to be changed from: -if(doc != -1) { +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) { What happens is if the doc is not found the returned value is DocsEnum.NO_MORE_DOCS. This would then get set in the fq bitSet array as the doc location causing an ArrayIndexOutOfBoundsException as the array is only as big as maxDocs. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
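The failure mode described in SOLR-6029 can be reproduced in miniature. This sketch is not the plugin's actual code; only the sentinel value is real (it matches Lucene's `DocIdSetIterator.NO_MORE_DOCS`, i.e. `Integer.MAX_VALUE`), which is why using it to index a `maxDoc`-sized array blows up:

```java
// Sketch of the guard from the SOLR-6029 patch. Class and method names are
// illustrative, not the actual CollapsingQParserPlugin source; the sentinel
// value matches Lucene's DocIdSetIterator.NO_MORE_DOCS.
public class NoMoreDocsGuard {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // A doc id may only index a maxDoc-sized bitset if it is a real hit.
    // The original check 'doc != -1' lets NO_MORE_DOCS through, and
    // bitset[Integer.MAX_VALUE] then throws ArrayIndexOutOfBoundsException.
    static boolean isRealDoc(int doc) {
        return doc != -1 && doc != NO_MORE_DOCS;
    }

    public static void main(String[] args) {
        boolean[] collapsed = new boolean[100]; // pretend maxDoc == 100
        // Possible results of advancing the enum: a hit, "absent", "deleted".
        for (int doc : new int[] {5, -1, NO_MORE_DOCS}) {
            if (isRealDoc(doc)) {
                collapsed[doc] = true; // safe: doc < maxDoc
            }
        }
        System.out.println(collapsed[5]); // true
    }
}
```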
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983733#comment-13983733 ] Mark Miller commented on SOLR-5473: --- Any help needed on pulling this out? I think if we take too long, it's likely to get quite tricky fast. > Make one state.json per collection > -- > > Key: SOLR-5473 > URL: https://issues.apache.org/jira/browse/SOLR-5473 > Project: Solr > Issue Type: Sub-task > Components: SolrCloud >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 5.0 > > Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, > ec2-50-16-38-73_solr.log > > > As defined in the parent issue, store the states of each collection under > /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983726#comment-13983726 ] Hoss Man commented on LUCENE-5632: -- bq. So I would suggest to fix this for 4.x in the way like the attached patch and remove in trunk all deprecated constants completely (so simply do rename in trunk). I think it would probably be best to keep the (new) parseLeniently on trunk as well though (not just on 4x) so that _strings_ like "LUCENE_47" continue to work on trunk. > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Reporter: Robert Muir >Assignee: Uwe Schindler > Fix For: 4.9 > > Attachments: LUCENE-5632.patch, LUCENE-5632.patch > > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
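The lenient parsing Hoss mentions could look roughly like the sketch below: normalize old-style strings such as "LUCENE_47" onto the new "LUCENE_4_7" form before resolving the constant. This is a simplified illustration, not Lucene's actual `Version.parseLeniently` implementation:

```java
import java.util.Locale;

// Simplified illustration of lenient version-string handling: old-style
// "LUCENE_MN" names are rewritten to the new "LUCENE_M_N" form so that
// existing config strings keep working. Not Lucene's real parser.
public class LenientVersionParse {
    static String normalize(String version) {
        String v = version.toUpperCase(Locale.ROOT);
        // Old style "LUCENE_MN": insert the underscore between major/minor.
        if (v.matches("LUCENE_\\d\\d")) {
            return v.substring(0, 8) + "_" + v.substring(8);
        }
        return v; // already new style (a real parser would validate further)
    }

    public static void main(String[] args) {
        System.out.println(normalize("LUCENE_47"));  // LUCENE_4_7
        System.out.println(normalize("LUCENE_4_7")); // LUCENE_4_7
    }
}
```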
[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5632: -- Component/s: core/other Fix Version/s: 4.9 Assignee: Uwe Schindler Issue Type: Improvement (was: Bug) > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Reporter: Robert Muir >Assignee: Uwe Schindler > Fix For: 4.9 > > Attachments: LUCENE-5632.patch, LUCENE-5632.patch > > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5632: -- Attachment: LUCENE-5632.patch New patch, which passes all tests (unmodified). Solr tests also pass with unmodified config files. In fact, in branch_4x there are more occurrences, but the whole patch is more or less an Eclipse refactor rename. So I would suggest to fix this for 4.x in the way like the attached patch and remove in trunk all deprecated constants completely (so simply do rename in trunk). > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > Attachments: LUCENE-5632.patch, LUCENE-5632.patch > > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5632: -- Attachment: LUCENE-5632.patch Patch, interestingly not so many things changed. I did not yet run tests, but I also fixed the parser. > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > Attachments: LUCENE-5632.patch > > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983644#comment-13983644 ] Uwe Schindler commented on LUCENE-5632: --- In fact it is possible to add "deprecated" old constants somewhere at the end of the enum. Those are no real enum constants (they dont work in switch statements), but for the general use case of matchVersion parameters, this is fine: {code:java} @Deprecated public static final Version LUCENE_41 = LUCENE_4_1; {code} > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
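The "alias static" idea from the comment above can be demonstrated with a toy enum (names here are illustrative, not Lucene's actual `Version`). The alias is the very same instance as the real constant, so identity and ordinal comparisons like `onOrAfter()` behave identically; the trade-off is that, not being a true enum constant, it cannot appear as a `case` label:

```java
// Toy demonstration of a deprecated alias static inside an enum: usable
// wherever a matchVersion argument is expected, but not as a switch case
// label. The names are illustrative, not Lucene's actual Version enum.
enum ToyVersion {
    LUCENE_4_0, LUCENE_4_1;

    // Not a true enum constant; it simply points at an existing one.
    @Deprecated
    public static final ToyVersion LUCENE_41 = LUCENE_4_1;
}

public class VersionAliasDemo {
    public static void main(String[] args) {
        // Same instance, so equality and ordinal-based comparisons are
        // indistinguishable from the new-style constant.
        System.out.println(ToyVersion.LUCENE_41 == ToyVersion.LUCENE_4_1); // true
        System.out.println(ToyVersion.LUCENE_41.ordinal()); // 1
    }
}
```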
[jira] [Comment Edited] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983631#comment-13983631 ] Uwe Schindler edited comment on LUCENE-5632 at 4/28/14 9:55 PM: One idea. We redefine the enum with the new syntax. In Lucene 4.x we may additionally define the old constants as additional alias statics (+ deprecated) inside the enum? Java does not allow additional constants in an enum that are identical to others, but we can maybe move the deprecated ones between the real ones (like: {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, @Deprecated LUCENE_41}}, although I am not sure if they should come before or after, but we can add magic to the version comparison functions: {{onOrAfter()}} can accept both). We must also fix the parseVersionLenient for use by Solr + ElasticSearch. was (Author: thetaphi): One idea. We redefine the enum with the new syntax. In Lucene 4.x we may additionally define the old constants as additional alias statics (+ deprecated) inside the enum? Java does not allow additional constants in an enum that are identical to others, but we can maybe move the deprecated ones between the real ones (like: {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, @Deprecated LUCENE_41}}, although I am not sure if they should come before or after). We must also fix the parseVersionLenient for use by Solr + ElasticSearch. > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). 
> I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
[ https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983631#comment-13983631 ] Uwe Schindler commented on LUCENE-5632: --- One idea. We redefine the enum with the new syntax. In Lucene 4.x we may additionally define the old constants as additional alias statics (+ deprecated) inside the enum? Java does not allow additional constants in an enum that are identical to others, but we can maybe move the deprecated ones between the real ones (like: {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, @Deprecated LUCENE_41}}, although I am not sure if they should come before or after). We must also fix the parseVersionLenient for use by Solr + ElasticSearch. > transition Version constants from LUCENE_MN to LUCENE_M_N > - > > Key: LUCENE-5632 > URL: https://issues.apache.org/jira/browse/LUCENE-5632 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > We should fix this, otherwise the constants will be hard to read (e.g. > Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). > I do not want this to be an excuse for an arbitrary 5.0 release that does not > have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6009) edismax mis-parsing RegexpQuery
[ https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983586#comment-13983586 ] Hoss Man commented on SOLR-6009: I did a quick skim of ExtendedDismaxQParser and from what i can tell nothing was _ever_ done to support regex syntax. It also scares the hee-bee-jee-bees out of me that the (erroneous) behavior of edismax is different depending on whether the field exists in the schema, or is matched because of a "\*" dynamicField ... particularly since I can't reproduce the same IMPOSSIBLE_FIELD_NAME leakage when using an existing text_general dynamicField like qf=foo_txt. smells like 2 interconnected bugs: one is just that needs regex support added to the parser, the other is that while regex support is missing, something is getting tickled that causes *really* bad behavior when the fields in use exist because of "\*" dynamicField > edismax mis-parsing RegexpQuery > --- > > Key: SOLR-6009 > URL: https://issues.apache.org/jira/browse/SOLR-6009 > Project: Solr > Issue Type: Bug > Components: query parsers >Affects Versions: 4.7.2 >Reporter: Evan Sayer > > edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries > involving a RegexpQuery. Steps to reproduce on 4.7.2: > 1) remove the explicit definition for 'text' > 2) add a catch-all '*' dynamic field of type text_general > {code} > stored="true" /> > {code} > 3) index the exampledocs/ data > 4) run a query like the following: > {code} > http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/&debugQuery=true > {code} > The debugQuery output will look like this: > {code} > > {!edismax qf='text'} /.*elec.*/ > {!edismax qf='text'} /.*elec.*/ > (+RegexpQuery(:/.*elec.*/))/no_coord > +:/.*elec.*/ > {code} > If you copy/paste the parsed-query into a text editor or something, you can > see that the field-name isn't actually blank. The IMPOSSIBLE_FIELD_NAME ends > up in there. 
> I haven't been able to reproduce this behavior on 4.7.2 without getting rid > of the explicit field definition for 'text' and using a dynamicField, which > is how things are setup on the machine where this issue was discovered. The > query isn't quite right with the explicit field definition in place either, > though: > {code} > > {!edismax qf='text'} /.*elec.*/ > {!edismax qf='text'} /.*elec.*/ > (+DisjunctionMaxQuery((text:elec)))/no_coord > +(text:elec) > {code} > numFound=0 for both of these. This site is useful for looking at the > characters in the first variant: > http://rishida.net/tools/conversion/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N
Robert Muir created LUCENE-5632: --- Summary: transition Version constants from LUCENE_MN to LUCENE_M_N Key: LUCENE-5632 URL: https://issues.apache.org/jira/browse/LUCENE-5632 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir We should fix this, otherwise the constants will be hard to read (e.g. Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever). I do not want this to be an excuse for an arbitrary 5.0 release that does not have the features expected of a major release :) -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer
[ https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983456#comment-13983456 ] Uwe Schindler commented on SOLR-6022: - I think that looks good. > Rename getAnalyzer to getIndexAnalyzer > -- > > Key: SOLR-6022 > URL: https://issues.apache.org/jira/browse/SOLR-6022 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, > SOLR-6022.patch > > > We have separate index/query analyzer chains, but the access methods for the > analyzers do not match up with the names. This can lead to unknowingly using > the wrong analyzer chain (as it did in SOLR-6017). We should do this > renaming in trunk, and deprecate the old getAnalyzer function in 4x. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5331) nested SpanNearQuery with repeating groups does not find match
[ https://issues.apache.org/jira/browse/LUCENE-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983432#comment-13983432 ] Michael Sander commented on LUCENE-5331: I will look into this a bit more deeply. > nested SpanNearQuery with repeating groups does not find match > -- > > Key: LUCENE-5331 > URL: https://issues.apache.org/jira/browse/LUCENE-5331 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jerry Zhou > Attachments: NestedSpanNearTest.java > > > Nested spanNear queries do not work in some cases when repeating groups are > in the query. > Test case is attached ... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5331) nested SpanNearQuery with repeating groups does not find match
[ https://issues.apache.org/jira/browse/LUCENE-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983421#comment-13983421 ] Tim Allison commented on LUCENE-5331: - [~speedplane], I was trying to figure out if what you're seeing is the same as the original issue or the one that I raised. The example that you posted on the google groups seems to work in pure Lucene both 4.7 and trunk: {noformat} private final static String FIELD = "f"; @Test public void testSimpleBizBuz() throws Exception { Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47); IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer); RAMDirectory d = new RAMDirectory(); IndexWriter writer = new IndexWriter(d, config); Document doc = new Document(); doc.add(new TextField(FIELD, "foo biz buz", Store.YES)); writer.addDocument(doc); doc = new Document(); doc.add(new TextField(FIELD, "foo biz and biz buz", Store.YES)); writer.addDocument(doc); writer.close(); SpanQuery foo = new SpanTermQuery(new Term(FIELD, "foo")); SpanQuery biz = new SpanTermQuery(new Term(FIELD, "biz")); SpanQuery buz = new SpanTermQuery(new Term(FIELD, "buz")); SpanQuery bizbuz = new SpanNearQuery(new SpanQuery[]{biz, buz}, 0, false); SpanQuery foobizbuz = new SpanNearQuery(new SpanQuery[]{foo, bizbuz}, 0, false); IndexReader reader = DirectoryReader.open(d); IndexSearcher searcher = new IndexSearcher(reader); TopScoreDocCollector coll = TopScoreDocCollector.create(100, true); searcher.search(foobizbuz, coll); ScoreDoc[] scoreDocs = coll.topDocs().scoreDocs; assertEquals(1, scoreDocs.length); } {noformat} Are you sure the issue that you reported is the same as one of the ones in this issue? Is the above test case right? 
> nested SpanNearQuery with repeating groups does not find match > -- > > Key: LUCENE-5331 > URL: https://issues.apache.org/jira/browse/LUCENE-5331 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jerry Zhou > Attachments: NestedSpanNearTest.java > > > Nested spanNear queries do not work in some cases when repeating groups are > in the query. > Test case is attached ... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
[ https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983405#comment-13983405 ] Shawn Heisey commented on LUCENE-5631: -- {quote} https://lucene.apache.org/core/downloads.html https://lucene.apache.org/solr/downloads.html Both of these links are prominent in the top nav of the main pages for Lucene-Core & Solr, just to the right of the "news" tabs... {quote} This is not the first time I've been completely blind. I did not notice those links. Apparently I'm not the only one, though. I've seen the question come up in the IRC channel regularly. Now that they've been pointed out, I can guide people in the right direction much easier than providing a link or telling them that they just have to click fast. :) bq. The project has taken several steps in the opposite direction, intentionally making it harder to access releases (and docs) for older versions, to encourage people to choose the most recent version. This is understandable, but people who are explicitly looking for an older version are asking about it. Hoss has pointed out where to go. I thought those links weren't there, and it turns out that it was me, not the website. > Improve access to archived versions of Lucene and Solr > -- > > Key: LUCENE-5631 > URL: https://issues.apache.org/jira/browse/LUCENE-5631 > Project: Lucene - Core > Issue Type: Improvement > Components: general/website >Reporter: Shawn Heisey > > When visiting the website to download Lucene or Solr, it is very difficult > for people to locate where to download previous versions. The archive link > does show up when you click the download link, but the page where it lives is > replaced in less than a second by the CGI for picking a download mirror for > the current release. There's nothing there for previous versions. > At a minimum, we need a link to the download archive that's right below the > main Download link. 
Something else I think we should do (which might > actually be an INFRA issue, as this problem exists for other projects too) > would be to have the "closer.cgi" page include a link to the archives. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer
[ https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst updated SOLR-6022: - Attachment: SOLR-6022.branch_4x-deprecation.patch SOLR-6022.patch Here are two more patches: # The first is still for trunk, and changes the indexAnalyzer/queryAnalyzer members in FieldType to private scope. This will be a "hard fail" for anyone that is subclassing FieldType and using these, but they should be using get/setAnalyzer anyways. It also adds CHANGES.txt entries for review. # The second patch shows exactly how I will do the deprecation in branch_4x (I believe I should be able to just apply the patch after doing a merge back from trunk). > Rename getAnalyzer to getIndexAnalyzer > -- > > Key: SOLR-6022 > URL: https://issues.apache.org/jira/browse/SOLR-6022 > Project: Solr > Issue Type: Improvement >Reporter: Ryan Ernst > Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, > SOLR-6022.patch > > > We have separate index/query analyzer chains, but the access methods for the > analyzers do not match up with the names. This can lead to unknowingly using > the wrong analyzer chain (as it did in SOLR-6017). We should do this > renaming in trunk, and deprecate the old getAnalyzer function in 4x. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983390#comment-13983390 ] ASF subversion and git services commented on LUCENE-5611: - Commit 1590747 from [~rcmuir] in branch 'dev/branches/lucene5611' [ https://svn.apache.org/r1590747 ] LUCENE-5611: move attribute juggling to a fieldinvertstate setter > Simplify the default indexing chain > --- > > Key: LUCENE-5611 > URL: https://issues.apache.org/jira/browse/LUCENE-5611 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.9, 5.0 > > Attachments: LUCENE-5611.patch, LUCENE-5611.patch > > > I think Lucene's current indexing chain has too many classes / > hierarchy / abstractions, making it look much more complex than it > really should be, and discouraging users from experimenting/innovating > with their own indexing chains. > Also, if it were easier to understand/approach, then new developers > would more likely try to improve it ... it really should be simpler. > So I'm exploring a pared back indexing chain, and have a starting patch > that I think is looking ok: it seems more approachable than the > current indexing chain, or at least has fewer strange classes. > I also thought this could give some speedup for tiny documents (a more > common use of Lucene lately), and it looks like, with the evil > optimizations, this is a ~25% speedup for Geonames docs. Even without > those evil optos it's a bit faster. > This is very much a work in progress / nocommits, and there are some > behavior changes e.g. the new chain requires all fields to have the > same TV options (rather than auto-upgrading all fields by the same > name that the current chain does)... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983341#comment-13983341 ] ASF subversion and git services commented on LUCENE-5611: - Commit 1590731 from [~rcmuir] in branch 'dev/branches/lucene5611' [ https://svn.apache.org/r1590731 ] LUCENE-5611: fix the crazy getAttribute API to prevent double lookups and extra code everywhere
[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
[ https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983338#comment-13983338 ] Hoss Man commented on LUCENE-5631: -- I'm not really understanding this issue. In the right nav of the site are links that are very explicitly about downloading the *latest* release. These load pages that do an auto-redirect, but as a convenience also include some static text for people who might not use javascript. This doesn't change the fact that we *also* already have a main "Download" page that links to those download redirectors and has details about archived releases... https://lucene.apache.org/core/downloads.html https://lucene.apache.org/solr/downloads.html Both of these links are prominent in the top nav of the main pages for Lucene-Core & Solr, just to the right of the "news" tabs... https://lucene.apache.org/core/ https://lucene.apache.org/solr/ > Improve access to archived versions of Lucene and Solr > -- > > Key: LUCENE-5631 > URL: https://issues.apache.org/jira/browse/LUCENE-5631 > Project: Lucene - Core > Issue Type: Improvement > Components: general/website >Reporter: Shawn Heisey > > When visiting the website to download Lucene or Solr, it is very difficult > for people to locate where to download previous versions. The archive link > does show up when you click the download link, but the page where it lives is > replaced in less than a second by the CGI for picking a download mirror for > the current release. There's nothing there for previous versions. > At a minimum, we need a link to the download archive that's right below the > main Download link. Something else I think we should do (which might > actually be an INFRA issue, as this problem exists for other projects too) > would be to have the "closer.cgi" page include a link to the archives.
[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host
[ https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983300#comment-13983300 ] Jessica Cheng commented on SOLR-6027: - {quote}It would be nice to make this decision configurable/pluggable.{quote} +1. It would also be nice to take something like "rack awareness" into account. > Replica assignments should try to take the host name into account so all > replicas don't end up on the same host > --- > > Key: SOLR-6027 > URL: https://issues.apache.org/jira/browse/SOLR-6027 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Timothy Potter >Priority: Minor > > I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per > instance. One of my collections was created with all replicas landing on > different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a > little smarter and ensure that at least one of the replicas was on one of the > other hosts. > shard4: { > > http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/ > LEADER > > http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/ > > > http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/ > > } > I marked this as minor for now as it could be argued that I shouldn't be > running that many Solr nodes per instance, but I'm seeing plenty of installs > that are using higher-end instance types / server hardware and then running > multiple Solr nodes per host.
[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
[ https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983291#comment-13983291 ] Shawn Heisey commented on LUCENE-5631: -- I asked on #asfinfra whether a general solution with the mirror CGI would be a good idea. {noformat} 11:39 <@gmcdonald> I do not think download.cgi is the place to put an archives link no {noformat} {noformat} 11:41 <@gmcdonald> elyograg: elsewhere on your website you should link to downloads.cgi or the archives. The download.cgi picks a mirror, archives do not live on mirrors, change your way of thinking is my reply. If you insist on following up, talk to the site-dev@ list instead of an INFRA ticket. {noformat} I will follow up with the site-dev list to see if they have any interest in a general solution. Regardless of what happens there, I do think we need to improve our own project pages. When I have a moment, I will pull the site down from svn and see if I can cook up a patch.
[jira] [Updated] (SOLR-5969) Enable distributed tracing of requests
[ https://issues.apache.org/jira/browse/SOLR-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregg Donovan updated SOLR-5969: Attachment: SOLR-5969.diff Patch updated for Lucene/Solr 4.8. > Enable distributed tracing of requests > -- > > Key: SOLR-5969 > URL: https://issues.apache.org/jira/browse/SOLR-5969 > Project: Solr > Issue Type: Improvement >Reporter: Gregg Donovan > Attachments: SOLR-5969.diff, SOLR-5969.diff > > > Enable users to add diagnostic information to requests and trace them in the > logs across servers. > We have some metadata -- e.g. a request UUID -- that we log to every log line > using [Log4J's > MDC|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html]. > The UUID logging allows us to connect any log lines we have for a given > request across servers. Sort of like Twitter's > [Zipkin|http://twitter.github.io/zipkin/]. > Currently we're using EmbeddedSolrServer without sharding, so adding the UUID > is fairly simple, since everything is in one process and one thread. But, > we're testing a sharded HTTP implementation and running into some > difficulties getting this data passed around in a way that lets us trace all > log lines generated by a request to its UUID. > The first thing I tried was to add the UUID by adding it to the SolrParams. > This achieves the goal of getting those values logged on the shards if a > request is successful, but we miss having those values in the MDC if there > are other log lines before the final log line. E.g. an Exception in a custom > component. > My current thought is that sending HTTP headers with diagnostic information > would be very useful. Those could be placed in the MDC even before handing > off to work to SolrDispatchFilter, so that any Solr problem will have the > proper logging. > I.e. every additional header added to a Solr request gets a "Solr-" prefix. 
> On the server, we look for those headers and add them to the [SLF4J > MDC|http://www.slf4j.org/manual.html#mdc].
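The header-based scheme described above (copy every "Solr-"-prefixed request header into the logging MDC before dispatching the request) can be sketched as follows. The class and method names are hypothetical, and a plain Map stands in for both the servlet request headers and the SLF4J MDC so the sketch stays self-contained:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the header-to-MDC step: every request header carrying the
// "Solr-" prefix is copied (prefix stripped) into the per-thread logging
// context before the request is handled, so all log lines for the request
// carry the caller's diagnostic metadata (e.g. a request UUID).
public class TraceHeaders {
    static final String PREFIX = "Solr-";

    // Returns the entries that would be pushed into the MDC for this request.
    static Map<String, String> mdcEntries(Map<String, String> headers) {
        Map<String, String> mdc = new LinkedHashMap<>();
        for (Map.Entry<String, String> h : headers.entrySet()) {
            if (h.getKey().startsWith(PREFIX)) {
                mdc.put(h.getKey().substring(PREFIX.length()), h.getValue());
            }
        }
        return mdc;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Solr-Request-UUID", "3f1c");
        headers.put("Accept", "*/*");
        System.out.println(mdcEntries(headers)); // only the Solr-* header survives
    }
}
```

In a real servlet filter, each returned entry would be installed with `MDC.put(...)` before chaining to SolrDispatchFilter and cleared in a finally block so entries never leak across pooled threads.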
[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
[ https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983280#comment-13983280 ] Steve Rowe commented on LUCENE-5631: The project has taken several steps in the opposite direction, intentionally making it *harder* to access releases (and docs) for older versions, to encourage people to choose the most recent version.
[jira] [Created] (SOLR-6028) SOLR returns 500 error code for query /<,/
Kingston Duffie created SOLR-6028: - Summary: SOLR returns 500 error code for query /<,/ Key: SOLR-6028 URL: https://issues.apache.org/jira/browse/SOLR-6028 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.7.1 Reporter: Kingston Duffie Priority: Minor If you enter the following query string into the SOLR admin console to execute a query, you will get a 500 error: /<,/ This is an invalid query -- in the sense that the text between the slashes is not a valid regex. Nevertheless, I would have expected to get a 400 error rather than a 500.
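For illustration, the 400-vs-500 distinction the reporter asks for comes down to catching the parser's syntax exception and treating it as a client error. Lucene's regexp syntax (where `<` can open a numeric interval, which is why `<,` fails to parse) is not the JDK's, so this hypothetical sketch uses java.util.regex purely to show the mapping:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustrative only: map a pattern-syntax failure to HTTP 400 (client
// error) instead of letting it escape as a 500. The JDK regex parser is
// used here for self-containment; Lucene's RegExp class fails the same
// way, with a runtime exception at parse time.
public class RegexStatus {
    static int statusForRegex(String pattern) {
        try {
            Pattern.compile(pattern);
            return 200;
        } catch (PatternSyntaxException e) {
            return 400; // a bad pattern is the client's mistake, not the server's
        }
    }

    public static void main(String[] args) {
        System.out.println(statusForRegex("["));  // unclosed class: 400
        System.out.println(statusForRegex("a+")); // valid pattern: 200
    }
}
```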
[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host
[ https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983264#comment-13983264 ] Tomás Fernández Löbbe commented on SOLR-6027: - It would be nice to make this decision configurable/pluggable. One could then select the nodes for a collection depending on case-specific context.
[jira] [Updated] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
[ https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated LUCENE-5631: - Description: When visiting the website to download Lucene or Solr, it is very difficult for people to locate where to download previous versions. The archive link does show up when you click the download link, but the page where it lives is replaced in less than a second by the CGI for picking a download mirror for the current release. There's nothing there for previous versions. At a minimum, we need a link to the download archive that's right below the main Download link. Something else I think we should do (which might actually be an INFRA issue, as this problem exists for other projects too) would be to have the "closer.cgi" page include a link to the archives. was: When visiting the website to download Lucene or Solr, it is very difficult for people to locate where to download previous versions. The archive link does show up when you click the download link, but the page where it lives is replaced in less than a second by the CGI for picking a download mirror for the current release. There's nothing there for previous versions. At a minimum, we need a link to the download archive that's right below the main Download link. Something else I think we should do (which might actually be an INFRA issue) would be to have the "closer.cgi" page include a link to the archives. > Improve access to archived versions of Lucene and Solr > -- > > Key: LUCENE-5631 > URL: https://issues.apache.org/jira/browse/LUCENE-5631 > Project: Lucene - Core > Issue Type: Improvement > Components: general/website >Reporter: Shawn Heisey > > When visiting the website to download Lucene or Solr, it is very difficult > for people to locate where to download previous versions. 
The archive link > does show up when you click the download link, but the page where it lives is > replaced in less than a second by the CGI for picking a download mirror for > the current release. There's nothing there for previous versions. > At a minimum, we need a link to the download archive that's right below the > main Download link. Something else I think we should do (which might > actually be an INFRA issue, as this problem exists for other projects too) > would be to have the "closer.cgi" page include a link to the archives. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5631) Improve access to archived versions of Lucene and Solr
Shawn Heisey created LUCENE-5631: Summary: Improve access to archived versions of Lucene and Solr Key: LUCENE-5631 URL: https://issues.apache.org/jira/browse/LUCENE-5631 Project: Lucene - Core Issue Type: Improvement Components: general/website Reporter: Shawn Heisey When visiting the website to download Lucene or Solr, it is very difficult for people to locate where to download previous versions. The archive link does show up when you click the download link, but the page where it lives is replaced in less than a second by the CGI for picking a download mirror for the current release. There's nothing there for previous versions. At a minimum, we need a link to the download archive that's right below the main Download link. Something else I think we should do (which might actually be an INFRA issue) would be to have the "closer.cgi" page include a link to the archives. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983256#comment-13983256 ] ASF subversion and git services commented on LUCENE-5611: - Commit 1590721 from [~mikemccand] in branch 'dev/branches/lucene5611' [ https://svn.apache.org/r1590721 ] LUCENE-5611: put current patch on branch
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983254#comment-13983254 ] ASF subversion and git services commented on LUCENE-5611: - Commit 1590720 from [~mikemccand] in branch 'dev/branches/lucene5611' [ https://svn.apache.org/r1590720 ] LUCENE-5611: make branch
[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host
[ https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983230#comment-13983230 ] Mark Miller commented on SOLR-6027: --- bq. it could be argued that I shouldn't be running that many Solr nodes per instance I think we want to make things as smart as we can! Everything is very basic right now and the intention has always been to make it much better - we want at least the option to take into account as much info as we can when choosing hosts (eventually, even hardware, avg load, whatever!).
[jira] [Created] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host
Timothy Potter created SOLR-6027: Summary: Replica assignments should try to take the host name into account so all replicas don't end up on the same host Key: SOLR-6027 URL: https://issues.apache.org/jira/browse/SOLR-6027 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Timothy Potter Priority: Minor I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per instance. One of my collections was created with all replicas landing on different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a little smarter and ensure that at least one of the replicas was on one of the other hosts. shard4: { http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/ LEADER http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/ http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/ } I marked this as minor for now as it could be argued that I shouldn't be running that many Solr nodes per instance, but I'm seeing plenty of installs that are using higher-end instance types / server hardware and then running multiple Solr nodes per host. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
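A minimal sketch of the host-aware placement being requested: group candidate nodes by host and give each new replica to the host holding the fewest replicas so far, so a shard's replicas spread across hosts whenever enough hosts exist. This illustrates the policy only; it is not SolrCloud's actual assignment code:

```java
import java.util.*;

// Host-aware replica placement sketch: each replica of a shard goes to
// the node whose host currently holds the fewest replicas of that shard,
// so three replicas on a three-host cluster land on three distinct hosts
// instead of three ports of the same EC2 instance.
public class SpreadAssign {
    // nodes are "host:port" strings; returns the nodes chosen for one shard
    static List<String> assign(List<String> nodes, int replicas) {
        Map<String, Integer> perHost = new HashMap<>(); // host -> replicas placed
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            String best = null;
            for (String node : nodes) {
                if (chosen.contains(node)) continue; // one replica per node
                int load = perHost.getOrDefault(node.split(":")[0], 0);
                if (best == null || load < perHost.getOrDefault(best.split(":")[0], 0)) {
                    best = node;
                }
            }
            chosen.add(best);
            perHost.merge(best.split(":")[0], 1, Integer::sum);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList(
            "hostA:8983", "hostA:8984", "hostB:8983",
            "hostB:8984", "hostC:8983", "hostC:8984");
        System.out.println(assign(nodes, 3)); // one replica per host
    }
}
```

A pluggable version, as suggested in the comments, would let the "load" function consider rack, hardware, or average load instead of a simple replica count.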
[ANNOUNCE] Apache Solr 4.8.0 released
28 April 2014, Apache Solr™ 4.8.0 available The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites. Solr 4.8.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html See the CHANGES.txt file included with the release for a full list of details. Solr 4.8.0 Release Highlights: * Apache Solr now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Solr). * Apache Solr is fully compatible with Java 8. * <fields> and <types> tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file, they may be safely removed. This allows intermixing of <fieldType>, <field> and <copyField> definitions if desired. * The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries. * New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties. * Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API. * JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries. * Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents. * Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status.
* Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries. * In Solr single-node mode, cores can now be created using named configsets. * New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the "TTL" expression, as well as automatically deleting expired documents on a periodic basis. Solr 4.8.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release. Please report any feedback to the mailing lists (http://lucene.apache.org/solr/discussion.html) Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access. - Uwe Schindler uschind...@apache.org Apache Lucene PMC Chair / Committer Bremen, Germany http://lucene.apache.org/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
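Of the highlights above, the document-expiration feature is easy to illustrate: an absolute expiration timestamp is derived from a per-document TTL at index time, and a periodic job later deletes anything past it. Solr's real TTL values use its date-math syntax (e.g. "+1DAY"); the hypothetical sketch below uses ISO-8601 durations to stay JDK-only:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the TTL idea behind DocExpirationUpdateProcessorFactory:
// compute an expiration instant when the document is indexed, then let a
// periodic sweep remove documents whose expiration has passed. The TTL
// format here (ISO-8601, parsed by java.time) is an assumption for the
// sketch, not Solr's date-math syntax.
public class TtlExpiration {
    static Instant expirationFor(Instant indexedAt, String isoTtl) {
        return indexedAt.plus(Duration.parse(isoTtl));
    }

    static boolean isExpired(Instant expiration, Instant now) {
        return now.isAfter(expiration);
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2014-04-28T00:00:00Z");
        Instant exp = expirationFor(t0, "P1D"); // one-day TTL
        System.out.println(exp);                               // 2014-04-29T00:00:00Z
        System.out.println(isExpired(exp, Instant.parse("2014-04-30T00:00:00Z"))); // true
    }
}
```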
[ANNOUNCE] Apache Lucene 4.8.0 released
28 April 2014, Apache Lucene™ 4.8.0 available The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.0 Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html See the CHANGES.txt file included with the release for a full list of details. Lucene 4.8.0 Release Highlights: * Apache Lucene now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene). * Apache Lucene is fully compatible with Java 8. * All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection be sure to enable all checksums during merging (it's disabled by default). * Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection. * AnalyzingInfixSuggester now supports near-real-time autosuggest. * Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene's Sort class to express the sort order. * Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively. * Switched to MurmurHash3 to hash terms during indexing. * IndexWriter now supports updating of binary doc value fields. * HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error. 
* Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux, MacOSX are known to work). * Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open. * A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held. * Various bugfixes and optimizations since the 4.7.2 release. Please read CHANGES.txt for a full list of new features. Please report any feedback to the mailing lists (http://lucene.apache.org/core/discussion.html) Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access. - Uwe Schindler uschind...@apache.org Apache Lucene PMC Chair / Committer Bremen, Germany http://lucene.apache.org/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
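The end-to-end checksum highlight can be illustrated with the JDK's CRC32 (the checksum family Lucene's codec footers use): a checksum computed when bytes are written and re-verified when they are read back catches a single flipped bit. This is a toy model of the idea, not Lucene's actual CodecUtil API:

```java
import java.util.zip.CRC32;

// Toy model of end-to-end index checksums: store a CRC32 alongside the
// data at write time, recompute and compare at read (or merge) time, so
// corruption from bit-flipping hardware or JVM bugs is detected instead
// of silently propagating into the index.
public class ChecksumDemo {
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    static boolean verify(byte[] data, long expected) {
        return checksum(data) == expected;
    }

    public static void main(String[] args) {
        byte[] file = "index segment bytes".getBytes();
        long stored = checksum(file);             // written alongside the data
        System.out.println(verify(file, stored)); // true: intact

        file[0] ^= 0x01;                          // simulate a single bit flip
        System.out.println(verify(file, stored)); // false: corruption caught
    }
}
```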
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregg Donovan updated SOLR-5637: Attachment: SOLR-5637.patch > Per-request cache statistics > > > Key: SOLR-5637 > URL: https://issues.apache.org/jira/browse/SOLR-5637 > Project: Solr > Issue Type: New Feature >Reporter: Shikhar Bhushan >Priority: Minor > Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch, > SOLR-5637.patch > > > We have found it very useful to have information on the number of cache hits > and misses for key Solr caches (filterCache, documentCache, etc.) at the > request level. > This is currently implemented in our codebase using custom {{SolrCache}} > implementations. > I am working on moving to maintaining stats in the {{SolrRequestInfo}} > thread-local, and adding hooks in get() methods of SolrCache implementations. > This will be glued up using the {{DebugComponent}} and can be requested using > a "debug.cache" parameter. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
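The approach described in the issue — per-request hit/miss counts kept in a thread-local and updated from hooks in the cache get() methods — can be sketched with plain JDK types. All names below are invented for illustration; they are not Solr's actual SolrRequestInfo or SolrCache classes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: per-request cache statistics held in a thread-local,
// mirroring the SolrRequestInfo idea described in SOLR-5637.
public class RequestCacheStats {
    static final class Stats {
        final Map<String, long[]> byCache = new HashMap<>(); // value = {hits, misses}
        void record(String cache, boolean hit) {
            long[] c = byCache.computeIfAbsent(cache, k -> new long[2]);
            c[hit ? 0 : 1]++;
        }
    }

    // One Stats object per request-handling thread; a real implementation
    // would reset it at the start of each request.
    static final ThreadLocal<Stats> CURRENT = ThreadLocal.withInitial(Stats::new);

    // A cache get() hook would call this on every lookup.
    static void onLookup(String cacheName, boolean hit) {
        CURRENT.get().record(cacheName, hit);
    }

    public static void main(String[] args) {
        onLookup("filterCache", true);
        onLookup("filterCache", false);
        onLookup("documentCache", true);
        long[] fc = CURRENT.get().byCache.get("filterCache");
        System.out.println(fc[0] + " hits, " + fc[1] + " misses");
    }
}
```

A debug component would then read CURRENT at the end of the request and attach the per-cache counts to the response when the hypothetical "debug.cache" parameter is set.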
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregg Donovan updated SOLR-5637: Attachment: (was: SOLR-5637.diff)
[jira] [Updated] (SOLR-5637) Per-request cache statistics
[ https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregg Donovan updated SOLR-5637: Attachment: SOLR-5637.diff Patch updated for Lucene/Solr 4.8 and JDK 7.
[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983153#comment-13983153 ] Mark Miller commented on SOLR-5468: --- bq. Do think this should be an update request parameter or collection level setting? Yeah, I think it's common to allow passing this per request so the client can vary it depending on the data. I'm sure configurable defaults are probably worth looking at too though. > Option to enforce a majority quorum approach to accepting updates in SolrCloud > -- > > Key: SOLR-5468 > URL: https://issues.apache.org/jira/browse/SOLR-5468 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Affects Versions: 4.5 > Environment: All >Reporter: Timothy Potter >Assignee: Timothy Potter >Priority: Minor > > I've been thinking about how SolrCloud deals with write-availability using > in-sync replica sets, in which writes will continue to be accepted so long as > there is at least one healthy node per shard. > For a little background (and to verify my understanding of the process is > correct), SolrCloud only considers active/healthy replicas when acknowledging > a write. Specifically, when a shard leader accepts an update request, it > forwards the request to all active/healthy replicas and only considers the > write successful if all active/healthy replicas ack the write. Any down / > gone replicas are not considered and will sync up with the leader when they > come back online using peer sync or snapshot replication. For instance, if a > shard has 3 nodes, A, B, C with A being the current leader, then writes to > the shard will continue to succeed even if B & C are down. > The issue is that if a shard leader continues to accept updates even if it > loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it > B, comes back online before A does, then any writes that A accepted while the > other replicas were offline are at risk to being lost. > SolrCloud does provide a safe-guard mechanism for this problem with the > leaderVoteWait setting, which puts any replicas that come back online before > node A into a temporary wait state. If A comes back online within the wait > period, then all is well as it will become the leader again and no writes > will be lost. As a side note, sys admins definitely need to be made more > aware of this situation as when I first encountered it in my cluster, I had > no idea what it meant. > My question is whether we want to consider an approach where SolrCloud will > not accept writes unless there is a majority of replicas available to accept > the write? For my example, under this approach, we wouldn't accept writes if > both B&C failed, but would if only C did, leaving A & B online. Admittedly, > this lowers the write-availability of the system, so may be something that > should be tunable? > From Mark M: Yeah, this is kind of like one of many little features that we > have just not gotten to yet. I’ve always planned for a param that let’s you > say how many replicas an update must be verified on before responding > success. Seems to make sense to fail that type of request early if you notice > there are not enough replicas up to satisfy the param to begin with. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
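The parameter Mark describes — fail the request early if too few replicas are up, and only report success once a minimum number of replicas have verified the update — can be sketched as follows. The parameter name and classes are invented, and real SolrCloud forwarding is asynchronous/streaming rather than this synchronous simplification.

```java
import java.util.List;

// Illustrative sketch (names invented): accept a write only if enough replicas
// are up to possibly satisfy the threshold, then count acknowledgements.
public class QuorumWriteSketch {
    // Fail fast before forwarding: not enough live replicas to satisfy the param.
    static void checkEnoughLive(int liveReplicas, int minReplicas) {
        if (liveReplicas < minReplicas)
            throw new IllegalStateException(
                "only " + liveReplicas + " replicas up, need " + minReplicas);
    }

    // After forwarding: the leader counts as one ack, plus each acking replica.
    static boolean durable(List<Boolean> replicaAcks, int minReplicas) {
        long acks = 1 + replicaAcks.stream().filter(a -> a).count();
        return acks >= minReplicas;
    }

    public static void main(String[] args) {
        checkEnoughLive(3, 2); // 3 live, need 2: proceed
        System.out.println(durable(List.of(true, false), 2)); // leader + 1 ack meets 2
    }
}
```

This captures the fail-faster semantics discussed above: the hard, unsolved part (backing out an update already applied on the leader when replicas fail mid-request) is deliberately not modeled.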
Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf
Whoops, I forgot my own vote... : https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.8-RC1 +1 to RC1 with this SHA... 9904feefcdbad85eea1a81fe531f37df22ca134f apache-solr-ref-guide-4.8.pdf (Note: we still need at least 1 more binding +1) -Hoss http://www.lucidworks.com/
[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983138#comment-13983138 ] Timothy Potter commented on SOLR-5468: -- Thanks for the quick feedback. Do think this should be an update request parameter or collection level setting? Just re-read your original comment about this and sounds like you were thinking a parameter with each request. I like that since it gives the option to by-pass this checking when doing large bulk loads of the collection and only apply it when it makes sense. In terms of fine-grained error response handling, looks like this is captured in: https://issues.apache.org/jira/browse/SOLR-3382
[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983071#comment-13983071 ] Robert Muir commented on LUCENE-5611: - In StoredFieldsWriter: {noformat} - * For every document, {@link #startDocument(int)} is called, + * For every document, {@link #startDocument()} is called, * informing the Codec how many fields will be written. {noformat} This javadoc "compiles" but no longer makes sense, because we don't pass numFields as a parameter anymore. The attribute handling in the indexing chain got more confusing and complicated. Can we factor this into FieldInvertState? It's bogus that we call hasAttribute + getAttribute: besides making the code more complicated, it's two hashmap lookups for 2 atts. We should add a method to AttributeSource that acts like map.get (returns an attribute, or null if it doesn't exist), or simply change the semantics of getAttribute to do that. This can be a followup issue. I will keep reviewing; I only got through the first 3 or 4 files in the patch. > Simplify the default indexing chain > --- > > Key: LUCENE-5611 > URL: https://issues.apache.org/jira/browse/LUCENE-5611 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.9, 5.0 > > Attachments: LUCENE-5611.patch, LUCENE-5611.patch > > > I think Lucene's current indexing chain has too many classes / > hierarchy / abstractions, making it look much more complex than it > really should be, and discouraging users from experimenting/innovating > with their own indexing chains. > Also, if it were easier to understand/approach, then new developers > would more likely try to improve it ... it really should be simpler. > So I'm exploring a pared back indexing chain, and have a starting patch > that I think is looking ok: it seems more approachable than the > current indexing chain, or at least has fewer strange classes. 
> I also thought this could give some speedup for tiny documents (a more > common use of Lucene lately), and it looks like, with the evil > optimizations, this is a ~25% speedup for Geonames docs. Even without > those evil optos it's a bit faster. > This is very much a work in progress / nocommits, and there are some > behavior changes e.g. the new chain requires all fields to have the > same TV options (rather than auto-upgrading all fields by the same > name that the current chain does)... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
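Robert's point that hasAttribute + getAttribute costs two hashmap lookups, while a get() returning null for a missing key needs only one, can be demonstrated with an instrumented map. This is a plain-JDK illustration, not Lucene's AttributeSource.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: count map lookups for the two access patterns.
public class AttributeLookupSketch {
    static int lookups = 0;

    // A HashMap that counts every get()/containsKey() call.
    static final Map<String, Object> atts = new HashMap<>() {
        @Override public Object get(Object key) { lookups++; return super.get(key); }
        @Override public boolean containsKey(Object key) { lookups++; return super.containsKey(key); }
    };

    public static void main(String[] args) {
        atts.put("PayloadAttribute", new Object());

        lookups = 0;
        // Pattern criticized in the comment: check, then fetch.
        Object a = atts.containsKey("PayloadAttribute") ? atts.get("PayloadAttribute") : null;
        System.out.println(lookups); // two lookups

        lookups = 0;
        // Proposed semantics: one get() where null means "not present".
        Object b = atts.get("PayloadAttribute");
        System.out.println(lookups); // one lookup
    }
}
```

Per-document and per-field hot paths multiply this difference, which is why collapsing the two calls into one matters in an indexing chain.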
[jira] [Comment Edited] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983035#comment-13983035 ] Mark Miller edited comment on SOLR-5468 at 4/28/14 2:29 PM: bq. for now it seems sufficient to let users decide how many replicas a write must succeed on to be considered successful. I agree that that is the low hanging fruit. We just have to let the user know exactly what we are trying to promise. bq. there would need to be some "backing out" work to remove an update that succeeded on the leader but failed on the replicas. Yup - that will be the hardest part of doing this how we would really like and a large reason it was punted on in all the initial work. Even if the leader didn't process the doc first (which is likely a doable optimization at some point), I still think it's really hard. bq. Lastly, batches! What happens if half of a batch (sent by a client) succeeds and the other half fails (due to losing a replica in the middle of processing the batch)? Batches and streaming really don't make sense yet in SolrCloud other than for batch loading. We need to implement better, fine grained +error+ responses first. When that happens, it should all operate the same as single update per request. was (Author: markrmil...@gmail.com): bq. for now it seems sufficient to let users decide how many replicas a write must succeed on to be considered successful. I agree that that is the low hanging fruit. We just have to let the user know exactly what we are trying to promise. bq. there would need to be some "backing out" work to remove an update that succeeded on the leader but failed on the replicas. Yup - that will be the hardest part of doing this how we would really like and a large reason it was punted on in all the initial work. Even if the leader didn't process the doc first (which is likely a doable optimization at some point), I still think it's really hard. bq. Lastly, batches! 
What happens if half of a batch (sent by a client) succeeds and the other half fails (due to losing a replica in the middle of processing the batch)? Batches and streaming really don't make sense yet in SolrCloud other than for batch loading. We need to implement better, fine grained responses first. When that happens, it should all operate the same as single update per request.
RE: [VOTE] Lucene/Solr 4.8.0 RC2
Thanks for the release notes editing! I will now start to publish the web page. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Monday, April 28, 2014 12:01 AM > To: dev@lucene.apache.org > Subject: RE: [VOTE] Lucene/Solr 4.8.0 RC2 > > Hi, > > the vote succeeded. I will now start to push the artifacts and sill send the > release announcement tomorrow. It would be good to review the release > notes before: > => https://wiki.apache.org/lucene-java/ReleaseNote48 > => https://wiki.apache.org/solr/ReleaseNote48 > > Thanks to all for voting! > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Uwe Schindler [mailto:u...@thetaphi.de] > > Sent: Thursday, April 24, 2014 11:54 PM > > To: dev@lucene.apache.org > > Subject: [VOTE] Lucene/Solr 4.8.0 RC2 > > > > Hi, > > > > I prepared a second release candidate of Lucene and Solr 4.8.0. The > > artifacts can be found here: > > => > > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- > > RC2-rev1589874/ > > > > This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and > > LUCENE-5630. > > > > Please check the artifacts and give your vote in the next 72 hrs. 
> > > > Uwe > > > > P.S.: Here's my smoker command line: > > $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 > > python3.2 -u smokeTestRelease.py ' > > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC > > 2- > > rev1589874/' 1589874 4.8.0 tmp > > > > - > > Uwe Schindler > > H.-H.-Meier-Allee 63, D-28213 Bremen > > http://www.thetaphi.de > > eMail: u...@thetaphi.de > > > > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For > > additional commands, e-mail: dev-h...@lucene.apache.org > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional > commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983035#comment-13983035 ] Mark Miller commented on SOLR-5468: --- bq. for now it seems sufficient to let users decide how many replicas a write must succeed on to be considered successful. I agree that that is the low hanging fruit. We just have to let the user know exactly what we are trying to promise. bq. there would need to be some "backing out" work to remove an update that succeeded on the leader but failed on the replicas. Yup - that will be the hardest part of doing this how we would really like and a large reason it was punted on in all the initial work. Even if the leader didn't process the doc first (which is likely a doable optimization at some point), I still think it's really hard. bq. Lastly, batches! What happens if half of a batch (sent by a client) succeeds and the other half fails (due to losing a replica in the middle of processing the batch)? Batches and streaming really don't make sense yet in SolrCloud other than for batch loading. We need to implement better, fine grained responses first. When that happens, it should all operate the same as single update per request.
[jira] [Commented] (SOLR-5981) Please change method visibility of getSolrWriter in DataImportHandler to public (or at least protected)
[ https://issues.apache.org/jira/browse/SOLR-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983033#comment-13983033 ] James Dyer commented on SOLR-5981: -- Shawn, I think it's ok to commit, but fully implementing the DIHWriter and letting the writers be truly pluggable is probably the best situation. This patch is easier to do, and what's the harm? Should a future maintainer want to do it differently, it might not be backwards-compatible. DIH is perpetually "experimental, subject to change" and I think the bar is low in this case. And giving it a new use-case, indexing a no-sql db, might make it more attractive for someone to maintain in the future. > Please change method visibility of getSolrWriter in DataImportHandler to > public (or at least protected) > --- > > Key: SOLR-5981 > URL: https://issues.apache.org/jira/browse/SOLR-5981 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 4.0 > Environment: Linux 3.13.9-200.fc20.x86_64 > Solr 4.6.0 >Reporter: Aaron LaBella >Assignee: Shawn Heisey >Priority: Minor > Fix For: 4.9, 5.0 > > Attachments: SOLR-5981.patch > > Original Estimate: 1h > Remaining Estimate: 1h > > I've been using the org.apache.solr.handler.dataimport.DataImportHandler for > a bit and it's an excellent model and architecture. I'd like to extend the > usage of it to plugin my own DIHWriter, but, the code doesn't allow for it. > Please change ~line 227 in the DataImportHander class to be: > public SolrWriter getSolrWriter > instead of: > private SolrWriter getSolrWriter > or, at a minimum, protected, so that I can extend DataImportHandler and > override this method. > Thank you *sincerely* in advance for the quick turn-around on this. If the > change can be made in 4.6.0 and upstream, that'd be ideal. > Thanks! 
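The requested change is just a visibility widening, and its effect can be sketched with invented stand-in classes: a protected factory method lets a subclass supply its own writer (e.g. for a no-sql target), while a private one cannot be overridden at all.

```java
// Illustrative sketch (class names invented, not Solr's DataImportHandler):
// widening a factory method from private to protected enables the override below.
public class FactoryVisibilitySketch {
    interface Writer { String target(); }

    static class Handler {
        // Was effectively 'private' in the scenario; 'protected' permits overriding.
        protected Writer getWriter() { return () -> "solr"; }
        String run() { return "wrote to " + getWriter().target(); }
    }

    static class NoSqlHandler extends Handler {
        @Override protected Writer getWriter() { return () -> "nosql-db"; }
    }

    public static void main(String[] args) {
        System.out.println(new Handler().run());      // wrote to solr
        System.out.println(new NoSqlHandler().run()); // wrote to nosql-db
    }
}
```

The base class's run() logic is reused unchanged; only the writer creation point is swapped, which is exactly why a protected factory method is the low-cost fix discussed in the issue.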
[jira] [Assigned] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-5468: Assignee: Timothy Potter
[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983016#comment-13983016 ] Timothy Potter commented on SOLR-5468: -- Starting to work on this ... First, I think "majority quorum" is too strong for what we really need at the moment; for now it seems sufficient to let users decide how many replicas a write must succeed on to be considered successful. In other words, we can introduce a new, optional integer property when creating a new collection - minActiveReplicas (need a better name), which defaults to 1 (current behavior). If >1, then an update won't succeed unless it is ack'd by at least that many replicas. Activating this feature doesn't make much sense unless a collection has RF > 2. The biggest hurdle to adding this behavior is the asynchronous / streaming based approach leaders use to forward updates on to replicas. The current implementation uses a callback error handler to deal with failed update requests (from leader to replica) and simply considers an update successful if it works on the leader. Part of the complexity is that the leader processes the update before even attempting to forward on to the replica so there would need to be some "backing out" work to remove an update that succeeded on the leader but failed on the replicas. This is starting to get messy ;-) Another key point here is this feature simply moves the problem from the Solr server to the client application, i.e. it's a fail-faster approach where a client indexing app gets notified that writes are not succeeding on enough replicas to meet the desired threshold. The client application still has to decide what to do when writes fail. Lastly, batches! What happens if half of a batch (sent by a client) succeeds and the other half fails (due to losing a replica in the middle of processing the batch)? 
Another idea I had is maybe this isn't a collection-level property, maybe it is set on a per-request basis? > Option to enforce a majority quorum approach to accepting updates in SolrCloud > -- > > Key: SOLR-5468 > URL: https://issues.apache.org/jira/browse/SOLR-5468 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Affects Versions: 4.5 > Environment: All >Reporter: Timothy Potter >Priority: Minor > > I've been thinking about how SolrCloud deals with write-availability using > in-sync replica sets, in which writes will continue to be accepted so long as > there is at least one healthy node per shard. > For a little background (and to verify my understanding of the process is > correct), SolrCloud only considers active/healthy replicas when acknowledging > a write. Specifically, when a shard leader accepts an update request, it > forwards the request to all active/healthy replicas and only considers the > write successful if all active/healthy replicas ack the write. Any down / > gone replicas are not considered and will sync up with the leader when they > come back online using peer sync or snapshot replication. For instance, if a > shard has 3 nodes, A, B, C with A being the current leader, then writes to > the shard will continue to succeed even if B & C are down. > The issue is that if a shard leader continues to accept updates even if it > loses all of its replicas, then we have acknowledged updates on only 1 node. > If that node, call it A, then fails and one of the previous replicas, call it > B, comes back online before A does, then any writes that A accepted while the > other replicas were offline are at risk to being lost. > SolrCloud does provide a safe-guard mechanism for this problem with the > leaderVoteWait setting, which puts any replicas that come back online before > node A into a temporary wait state. If A comes back online within the wait > period, then all is well as it will become the leader again and no writes > will be lost. 
As a side note, sys admins definitely need to be made more > aware of this situation as when I first encountered it in my cluster, I had > no idea what it meant. > My question is whether we want to consider an approach where SolrCloud will > not accept writes unless there is a majority of replicas available to accept > the write? For my example, under this approach, we wouldn't accept writes if > both B&C failed, but would if only C did, leaving A & B online. Admittedly, > this lowers the write-availability of the system, so may be something that > should be tunable? > From Mark M: Yeah, this is kind of like one of many little features that we > have just not gotten to yet. I’ve always planned for a param that let’s you > say how many replicas an update must be verified on before responding > success. Seems to make se
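The threshold check Tim proposes can be sketched as below. This is only an illustration: "minActiveReplicas" is the provisional name from the comment (explicitly flagged there as needing a better one), not a real Solr parameter, and the real leader-to-replica path is asynchronous rather than a simple list of booleans.

```java
import java.util.List;

// Sketch only: the leader's own successful write counts as one ack, and the
// update fails fast unless the configured threshold is met.
public class QuorumCheck {
    /** True when enough replicas (leader included) acknowledged the update. */
    public static boolean updateSucceeded(boolean leaderOk, List<Boolean> replicaAcks,
                                          int minActiveReplicas) {
        long acks = (leaderOk ? 1 : 0) + replicaAcks.stream().filter(a -> a).count();
        return acks >= minActiveReplicas;
    }
}
```

With minActiveReplicas = 1 (the proposed default) this reduces to today's behavior: a write that succeeds on the leader alone is accepted.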
[jira] [Commented] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
[ https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982999#comment-13982999 ] Aaron LaBella commented on SOLR-6013: - NOTE: the patches can be applied from oldest to newest > Fix method visibility of Evaluator, refactor DateFormatEvaluator for > extensibility > -- > > Key: SOLR-6013 > URL: https://issues.apache.org/jira/browse/SOLR-6013 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 4.7 >Reporter: Aaron LaBella > Fix For: 4.9 > > Attachments: 0001-add-getters-for-datemathparser.patch, > 0001-change-method-access-to-protected.patch, > 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch > > Original Estimate: 1h > Remaining Estimate: 1h > > This is similar to issue 5981, the Evaluator class is declared as abstract, > yet the parseParams method is package private? Surely this is an oversight, > as I wouldn't expect everyone writing their own evaluators to have to deal > with parsing the parameters. > Similarly, I needed to refactor DateFormatEvaluator because I need to do some > custom date math/parsing and it wasn't written in a way that I can extend it. > Please review/apply my attached patch to the next version of Solr, ie: 4.8 or > 4.9 if I must wait. > Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
[ https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron LaBella updated SOLR-6013: Attachment: 0001-change-method-access-to-protected.patch Thanks Shalin, I'm attaching another patch to change the method accessors to protected (instead of public) and marked the methods as lucene.experimental. Let me know if there's anything else. Otherwise, can you, or someone else commit/push these patches into the 4.x branch so it makes the next release? Thanks > Fix method visibility of Evaluator, refactor DateFormatEvaluator for > extensibility > -- > > Key: SOLR-6013 > URL: https://issues.apache.org/jira/browse/SOLR-6013 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 4.7 >Reporter: Aaron LaBella > Fix For: 4.9 > > Attachments: 0001-add-getters-for-datemathparser.patch, > 0001-change-method-access-to-protected.patch, > 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch > > Original Estimate: 1h > Remaining Estimate: 1h > > This is similar to issue 5981, the Evaluator class is declared as abstract, > yet the parseParams method is package private? Surely this is an oversight, > as I wouldn't expect everyone writing their own evaluators to have to deal > with parsing the parameters. > Similarly, I needed to refactor DateFormatEvaluator because I need to do some > custom date math/parsing and it wasn't written in a way that I can extend it. > Please review/apply my attached patch to the next version of Solr, ie: 4.8 or > 4.9 if I must wait. > Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [VOTE] Lucene/Solr 4.8.0 RC2
Hi, May anybody help with the Solr release notes? To me there are only minimal new features listed, because I don’t have the full insight what changed. It might be good to add one or 2 more features. https://wiki.apache.org/solr/ReleaseNote48 Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Monday, April 28, 2014 12:01 AM > To: dev@lucene.apache.org > Subject: RE: [VOTE] Lucene/Solr 4.8.0 RC2 > > Hi, > > the vote succeeded. I will now start to push the artifacts and sill send the > release announcement tomorrow. It would be good to review the release > notes before: > => https://wiki.apache.org/lucene-java/ReleaseNote48 > => https://wiki.apache.org/solr/ReleaseNote48 > > Thanks to all for voting! > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Uwe Schindler [mailto:u...@thetaphi.de] > > Sent: Thursday, April 24, 2014 11:54 PM > > To: dev@lucene.apache.org > > Subject: [VOTE] Lucene/Solr 4.8.0 RC2 > > > > Hi, > > > > I prepared a second release candidate of Lucene and Solr 4.8.0. The > > artifacts can be found here: > > => > > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- > > RC2-rev1589874/ > > > > This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and > > LUCENE-5630. > > > > Please check the artifacts and give your vote in the next 72 hrs. 
> > > > Uwe > > > > P.S.: Here's my smoker command line: > > $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 > > python3.2 -u smokeTestRelease.py ' > > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC > > 2- > > rev1589874/' 1589874 4.8.0 tmp > > > > - > > Uwe Schindler > > H.-H.-Meier-Allee 63, D-28213 Bremen > > http://www.thetaphi.de > > eMail: u...@thetaphi.de > > > > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For > > additional commands, e-mail: dev-h...@lucene.apache.org > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional > commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers
[ https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982931#comment-13982931 ] Shai Erera commented on LUCENE-5618: If we separate each DV update into its own file, I think we will need to track another gen in SegmentCommitInfo: deletes, fieldInfos and dvUpdates. Though each FI writes its dvGen in the FIS file, we need to know from where to increment the gen for the next update. This isn't a big deal, just adds complexity to SCI (4 methods and index format change). But why do you think that it's wrong to write 2 fields and then at read time ask to provide only 1 field? I.e. what if the Codecs API was more "lazy", or a Codec wants to implement lazy loading of even just the metadata? Passing all the fields a Codec wrote, e.g. in the {{gen=-1}} case, even though none of them is not going to be used because they were all updated in later gens, seems awkward to me as well. What sort of index corruption does this check detect? As I see it, the Codec gets a subset of the fields that it already wrote. It's worse if it gets a superset of those fields, because you don't know e.g. if there are perhaps missing fields that disappeared from the file system. > DocValues updates send wrong fieldinfos to codec producers > -- > > Key: LUCENE-5618 > URL: https://issues.apache.org/jira/browse/LUCENE-5618 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > Spinoff from LUCENE-5616. > See the example there, docvalues readers get a fieldinfos, but it doesn't > contain the correct ones, so they have invalid field numbers at read time. > This should really be fixed. Maybe a simple solution is to not write > "batches" of fields in updates but just have only one field per gen? > This removes many-many relationships and would make things easy to understand. 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
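The extra gen bookkeeping Shai describes (knowing from where to increment for the next update, alongside the existing delete and fieldInfos gens) could look roughly like this. This is a hypothetical sketch, not the actual SegmentCommitInfo API:

```java
// Hypothetical per-segment docValues-gen tracker; -1 follows the Lucene
// convention of "no gen'd files written yet" used by other gens.
public class SegmentGens {
    private long docValuesGen = -1; // no docValues updates written yet

    /** Gen to assign to the next docValues update (first update gets gen 1). */
    public synchronized long nextDocValuesGen() {
        docValuesGen = (docValuesGen == -1) ? 1 : docValuesGen + 1;
        return docValuesGen;
    }
}
```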
Re: [VOTE] Lucene/Solr 4.8.0 RC2
OK I made some small edits to Lucene's release notes. Mike McCandless http://blog.mikemccandless.com On Mon, Apr 28, 2014 at 6:20 AM, Michael McCandless wrote: > I'd like to make some minor edits to the Lucene release notes ... but > I can't login (http://status.apache.org shows some problem). I'll try > a bit later ... > > Mike McCandless > > http://blog.mikemccandless.com > > > On Sun, Apr 27, 2014 at 6:00 PM, Uwe Schindler wrote: >> Hi, >> >> the vote succeeded. I will now start to push the artifacts and sill send the >> release announcement tomorrow. It would be good to review the release notes >> before: >> => https://wiki.apache.org/lucene-java/ReleaseNote48 >> => https://wiki.apache.org/solr/ReleaseNote48 >> >> Thanks to all for voting! >> Uwe >> >> - >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >>> -Original Message- >>> From: Uwe Schindler [mailto:u...@thetaphi.de] >>> Sent: Thursday, April 24, 2014 11:54 PM >>> To: dev@lucene.apache.org >>> Subject: [VOTE] Lucene/Solr 4.8.0 RC2 >>> >>> Hi, >>> >>> I prepared a second release candidate of Lucene and Solr 4.8.0. The >>> artifacts >>> can be found here: >>> => http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- >>> RC2-rev1589874/ >>> >>> This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and >>> LUCENE-5630. >>> >>> Please check the artifacts and give your vote in the next 72 hrs.
>>> >>> Uwe >>> >>> P.S.: Here's my smoker command line: >>> $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 >>> python3.2 -u smokeTestRelease.py ' >>> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC2- >>> rev1589874/' 1589874 4.8.0 tmp >>> >>> - >>> Uwe Schindler >>> H.-H.-Meier-Allee 63, D-28213 Bremen >>> http://www.thetaphi.de >>> eMail: u...@thetaphi.de >>> >>> >>> >>> >>> - >>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional >>> commands, e-mail: dev-h...@lucene.apache.org >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene/Solr 4.8.0 RC2
I'd like to make some minor edits to the Lucene release notes ... but I can't login (http://status.apache.org shows some problem). I'll try a bit later ... Mike McCandless http://blog.mikemccandless.com On Sun, Apr 27, 2014 at 6:00 PM, Uwe Schindler wrote: > Hi, > > the vote succeeded. I will now start to push the artifacts and sill send the > release announcement tomorrow. It would be good to review the release notes > before: > => https://wiki.apache.org/lucene-java/ReleaseNote48 > => https://wiki.apache.org/solr/ReleaseNote48 > > Thanks to all for voting! > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > >> -Original Message- >> From: Uwe Schindler [mailto:u...@thetaphi.de] >> Sent: Thursday, April 24, 2014 11:54 PM >> To: dev@lucene.apache.org >> Subject: [VOTE] Lucene/Solr 4.8.0 RC2 >> >> Hi, >> >> I prepared a second release candidate of Lucene and Solr 4.8.0. The artifacts >> can be found here: >> => http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- >> RC2-rev1589874/ >> >> This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and >> LUCENE-5630. >> >> Please check the artifacts and give your vote in the next 72 hrs. 
>> >> Uwe >> >> P.S.: Here's my smoker command line: >> $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 >> python3.2 -u smokeTestRelease.py ' >> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC2- >> rev1589874/' 1589874 4.8.0 tmp >> >> - >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional >> commands, e-mail: dev-h...@lucene.apache.org > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers
[ https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982888#comment-13982888 ] Robert Muir commented on LUCENE-5618: - {quote} Write each updated field in its own gen – if you update many fields, many times, this will create many files in the index directory. Technically it's not "wrong", it just looks weird {quote} Why? This is how separate norms worked. Its the obvious solution. The current behavior is broken: lets fix the bug. This optimization is what is to blame. The optimization is invalid. {quote} Anyway, I think the issue's title is wrong – DocValues updates do pass the correct fieldInfos to the producers. They pass only the infos that the producer should care about, and we see that passing too many is wrong (PerFieldDVF). {quote} Absolutely not! You get a different fieldinfos at _read_ time than you get at _write_. This is broken! > DocValues updates send wrong fieldinfos to codec producers > -- > > Key: LUCENE-5618 > URL: https://issues.apache.org/jira/browse/LUCENE-5618 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > Spinoff from LUCENE-5616. > See the example there, docvalues readers get a fieldinfos, but it doesn't > contain the correct ones, so they have invalid field numbers at read time. > This should really be fixed. Maybe a simple solution is to not write > "batches" of fields in updates but just have only one field per gen? > This removes many-many relationships and would make things easy to understand. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Posting list
Postings are more like SortedMap, ie terms are binary, and are in sorted order. Are you referring to the indexing chain classes, e.g. FreqProxTermsWriterPerField.FreqProxPostingsArray? That class holds the postings in IndexWriter's RAM buffer until it's time to write them to disk as a new segment. Those data structures are somewhat confusing, but once they are written to disk and opened for reading they are exposed via the FieldsProducer API. Mike McCandless http://blog.mikemccandless.com On Mon, Apr 28, 2014 at 12:33 AM, fabric fabricio wrote: > Can you explain how dictionary are linked with this implementation of > posting lists. In traditional case we have dictionary like > hashmap[String,List(int,int)] //word -> docid, termfreq. In this case > dictionary points to "parallel arrays" slots and in the "poitner array" > points to most recent docid in the posting list what means "to search the > posting list" in other words how this maps to List(int,int) part - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
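As an illustration of the traditional "dictionary -> postings list" view the question contrasts with Lucene's parallel arrays, a minimal version might look like this (a sketch for explanation only, not Lucene's actual in-memory layout):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Minimal sketch of "dictionary -> list of (docID, termFreq)". Lucene's
// indexing chain instead packs this data into parallel int arrays
// (e.g. FreqProxPostingsArray) inside IndexWriter's RAM buffer.
public class SimplePostings {
    // TreeMap keeps terms in sorted order, matching the SortedMap analogy.
    private final TreeMap<String, List<int[]>> index = new TreeMap<>();

    public void add(String term, int docId, int termFreq) {
        index.computeIfAbsent(term, t -> new ArrayList<>())
             .add(new int[] { docId, termFreq });
    }

    /** Postings for a term as (docID, termFreq) pairs; empty if absent. */
    public List<int[]> postings(String term) {
        return index.getOrDefault(term, Collections.emptyList());
    }
}
```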
[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers
[ https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982853#comment-13982853 ] Shai Erera commented on LUCENE-5618: I modified the code to pass all the FIs to the codec, no matter the gen, and tests fail with FileNotFoundException. The reason is that PerFieldDVF tries to open DVPs e.g. of {{gen=1}} of all fields, whether they were written in that gen or not, which leads to the FNFE. I am not sure that we can pass all FIs to the Codec that way ... so our options are: * Pass all the fields that were written in a gen (whether we need them or not) -- this does not make sense to me, as we'll need to track it somewhere, and it seems a waste * Add leniency in the form of "here are the fields you should care about" -- this makes the codec partially updates aware, but I don't think it's a bad idea * Write each updated field in its own gen -- if you update many fields, many times, this will create many files in the index directory. Technically it's not "wrong", it just looks weird * Remain w/ the current code's corruption detection if the read fieldNumber < 0 Anyway, I think the issue's title is wrong -- DocValues updates *do* pass the correct fieldInfos to the producers. They pass only the infos that the producer should care about, and we see that passing too many is wrong (PerFieldDVF). I will think about it more. If you see other alternatives, feel free to propose them. > DocValues updates send wrong fieldinfos to codec producers > -- > > Key: LUCENE-5618 > URL: https://issues.apache.org/jira/browse/LUCENE-5618 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir > > Spinoff from LUCENE-5616. > See the example there, docvalues readers get a fieldinfos, but it doesn't > contain the correct ones, so they have invalid field numbers at read time. > This should really be fixed. 
Maybe a simple solution is to not write > "batches" of fields in updates but just have only one field per gen? > This removes many-many relationships and would make things easy to understand. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
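The "which fields belong to which gen" bookkeeping under discussion amounts to a simple grouping: a producer opened for gen N should only ever be asked about fields actually written in gen N. A hypothetical helper (not Lucene code) to make that concrete:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: group updated fields by their docValues generation, so the
// PerFieldDVF-style producer for each gen sees exactly its own fields.
public class DvGenGroups {
    public static Map<Long, List<String>> byGen(Map<String, Long> fieldToGen) {
        Map<Long, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, Long> e : fieldToGen.entrySet()) {
            groups.computeIfAbsent(e.getValue(), g -> new ArrayList<>())
                  .add(e.getKey());
        }
        return groups;
    }
}
```

Under the "one field per gen" alternative, every list in the resulting map would have exactly one entry.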
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5681: --- Attachment: SOLR-5681.patch Added a test for running parallel tasks (multiple collection creation and split). Seems like there's some issue fetching new tasks from the queue. Working on resolving the issue. > Make the OverseerCollectionProcessor multi-threaded > --- > > Key: SOLR-5681 > URL: https://issues.apache.org/jira/browse/SOLR-5681 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Anshum Gupta >Assignee: Anshum Gupta > Attachments: SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch > > > Right now, the OverseerCollectionProcessor is single threaded i.e submitting > anything long running would have it block processing of other mutually > exclusive tasks. > When OCP tasks become optionally async (SOLR-5477), it'd be good to have > truly non-blocking behavior by multi-threading the OCP itself. > For example, a ShardSplit call on Collection1 would block the thread and > thereby, not processing a create collection task (which would stay queued in > zk) though both the tasks are mutually exclusive. > Here are a few of the challenges: > * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An > easy way to handle that is to only let 1 task per collection run at a time. > * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. > The task from the workQueue is only removed on completion so that in case of > a failure, the new Overseer can re-consume the same task and retry. A queue > is not the right data structure in the first place to look ahead i.e. get the > 2nd task from the queue when the 1st one is in process. Also, deleting tasks > which are not at the head of a queue is not really an 'intuitive' thing. 
> Proposed solutions for task management: > * Task funnel and peekAfter(): The parent thread is responsible for getting > and passing the request to a new thread (or one from the pool). The parent > method uses a peekAfter(last element) instead of a peek(). The peekAfter > returns the task after the 'last element'. Maintain this request information > and use it for deleting/cleaning up the workQueue. > * Another (almost duplicate) queue: While offering tasks to workQueue, also > offer them to a new queue (call it volatileWorkQueue?). The difference is, as > soon as a task from this is picked up for processing by the thread, it's > removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
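The "only 1 task per collection at a time" rule from the challenge list could be enforced with a per-collection permit, roughly as below. This is a sketch under assumed names, not the actual OverseerCollectionProcessor code, and it ignores the ZK work-queue cleanup problem described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Sketch: run OCP tasks in parallel, but at most one in flight per collection.
public class CollectionTaskGate {
    private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();

    /** Claim the collection; false if a task for it is already running. */
    public boolean tryStart(String collection) {
        return permits.computeIfAbsent(collection, c -> new Semaphore(1)).tryAcquire();
    }

    public void finish(String collection) {
        permits.get(collection).release();
    }
}
```

A shard split on Collection1 would hold Collection1's permit for its whole duration, while a create on Collection2 acquires a different permit and proceeds immediately.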
[jira] [Created] (SOLR-6026) Also check work-queue while processing a REQUESTSTATUS Collection API Call
Anshum Gupta created SOLR-6026: -- Summary: Also check work-queue while processing a REQUESTSTATUS Collection API Call Key: SOLR-6026 URL: https://issues.apache.org/jira/browse/SOLR-6026 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.8 Reporter: Anshum Gupta Fix For: 4.8.1 REQUESTSTATUS API call should check for the following: * work-queue (submitted task) * running-map (running task/in progress) * completed-map * failure-map Right now it checks everything but the work-queue. Add that. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
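The lookup order once the work-queue is included can be sketched as follows (hypothetical method and state names; the real handler consults ZK-backed queues and maps rather than in-memory sets):

```java
import java.util.Set;

// Sketch: REQUESTSTATUS resolution across the four places a task can live.
public class RequestStatusCheck {
    public static String statusOf(String taskId, Set<String> workQueue,
            Set<String> runningMap, Set<String> completedMap, Set<String> failureMap) {
        if (completedMap.contains(taskId)) return "completed";
        if (failureMap.contains(taskId))   return "failed";
        if (runningMap.contains(taskId))   return "running";
        if (workQueue.contains(taskId))    return "submitted"; // the missing check
        return "notfound";
    }
}
```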