Re: [Lucene.Net] fw: resolving github mirror issues
https://github.com/apache/lucene.net

The GitHub mirror is now up to date.

On Mon, May 2, 2011 at 8:05 PM, Michael Herndon mhern...@wickedsoftware.net wrote:
> Apache's git mirror is now aimed at the incubation repo. Now just waiting to see whether the GitHub mirror script will pick up the changes on its own.
>
> On Mon, May 2, 2011 at 1:24 PM, Prescott Nasser geobmx...@hotmail.com wrote:
>> I don't think so
>>
>>> Date: Mon, 2 May 2011 11:18:12 -0400
>>> From: mhern...@wickedsoftware.net
>>> To: lucene-net-dev@lucene.apache.org
>>> Subject: [Lucene.Net] fw: resolving github mirror issues
>>>
>>> Is there any reason not to replace the old mirror with the newly created one?
>>>
>>> - Michael
>>>
>>> --
>>> Hi,
>>>
>>> On Tue, Apr 26, 2011 at 7:51 PM, Michael Herndon mhern...@wickedsoftware.net wrote:
>>>> Would it be possible to get the git mirror to reflect that or at least create a new mirror for the lucene.net repo that is under incubator?
>>>
>>> Unfortunately our mirroring scripts can't handle an svn move that wasn't done as a single commit (svn move .../lucene/lucene.net .../incubator/lucene.net), so I'll need to recreate the mirror. If and when you move back to Lucene or to a TLP, I suggest you move the full svn tree in a single commit.
>>>
>>> Do you still need the old mirror repository, or is it OK if I simply replace it with the newly created one?
>>>
>>> BR,
>>> Jukka Zitting
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 7674 - Still Failing
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/7674/

No tests ran.

Build Log (for compile errors):
[...truncated 3742 lines...]
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:233: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = nameTextType.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:250: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = nameTextType.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:256: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> fieldNames = result.get("field_names");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:259: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> whitetok = fieldNames.get("whitetok");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:262: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = whitetok.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:279: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = whitetok.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:288: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> keywordtok = fieldNames.get("keywordtok");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:291: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = keywordtok.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:299: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = keywordtok.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:320: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> fieldTypes = result.get("field_types");
[jira] [Created] (SOLR-2486) org.apache.solr.common.SolrException: Service Unavailable
org.apache.solr.common.SolrException: Service Unavailable
---------------------------------------------------------

Key: SOLR-2486
URL: https://issues.apache.org/jira/browse/SOLR-2486
Project: Solr
Issue Type: Bug
Components: update
Affects Versions: 3.1
Reporter: Jayesh K Rajpurohit

While doing an update on Solr master core, I am getting this exception. It is more frequent after I added the 'Do You Mean' functionality. I have 'buildOnCommit' in my solrConfig.xml for building the spellcheck index. Below is the exception trace.

org.apache.solr.common.SolrException: Service Unavailable

Service Unavailable

request: http://qa-agile-iproxy-mgmt.idefense.vrsn.com/solrmaster/CoreNAME/update
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 7675 - Still Failing
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/7675/

No tests ran.

Build Log (for compile errors):
[...truncated 3741 lines...]
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:233: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = nameTextType.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:250: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = nameTextType.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:256: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> fieldNames = result.get("field_names");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:259: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> whitetok = fieldNames.get("whitetok");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:262: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = whitetok.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:279: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = whitetok.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:288: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> keywordtok = fieldNames.get("keywordtok");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:291: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] indexPart = keywordtok.get("index");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:299: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
[javac] queryPart = keywordtok.get("query");
[javac] ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:320: warning: [unchecked] unchecked conversion
[javac] found   : org.apache.solr.common.util.NamedList
[javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
[javac] NamedList<NamedList> fieldTypes = result.get("field_types");
[jira] [Commented] (SOLR-2486) org.apache.solr.common.SolrException: Service Unavailable
[ https://issues.apache.org/jira/browse/SOLR-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028078#comment-13028078 ]

Jayesh K Rajpurohit commented on SOLR-2486:
-------------------------------------------

Can you suggest some configuration tuning to avoid this issue? Thanks!
RE: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 7675 - Still Failing
Fixed interface @Override

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Apache Jenkins Server [mailto:hud...@hudson.apache.org]
> Sent: Tuesday, May 03, 2011 8:44 AM
> To: dev@lucene.apache.org
> Subject: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 7675 - Still Failing
>
> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/7675/
>
> No tests ran.
>
> Build Log (for compile errors):
> [...truncated 3741 lines...]
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:233: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] indexPart = nameTextType.get("index");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:250: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] queryPart = nameTextType.get("query");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:256: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
> [javac] NamedList<NamedList> fieldNames = result.get("field_names");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:259: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
> [javac] NamedList<NamedList> whitetok = fieldNames.get("whitetok");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:262: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] indexPart = whitetok.get("index");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:279: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] queryPart = whitetok.get("query");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:288: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<org.apache.solr.common.util.NamedList>
> [javac] NamedList<NamedList> keywordtok = fieldNames.get("keywordtok");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:291: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] indexPart = keywordtok.get("index");
> [javac] ^
> [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/test/org/apache/solr/handler/FieldAnalysisRequestHandlerTest.java:299: warning: [unchecked] unchecked conversion
> [javac] found   : org.apache.solr.common.util.NamedList
> [javac] required: org.apache.solr.common.util.NamedList<java.util.List<org.apache.solr.common.util.NamedList>>
> [javac] queryPart = keywordtok.get("query");
> [javac]
[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input
[ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028104#comment-13028104 ]

Dawid Weiss commented on LUCENE-3058:
-------------------------------------

Looks good to me. One note: possible NPE here (null passes all instanceofs):

{code}
+@Override
+public boolean equals(Object _other) {
+  if (_other instanceof TwoLongs) {
+    final TwoLongs other = (TwoLongs) _other;
+    return first == other.first && second == other.second;
+  } else {
+    return false;
+  }
+}
{code}

> FST should allow more than one output for the same input
> --------------------------------------------------------
>
> Key: LUCENE-3058
> URL: https://issues.apache.org/jira/browse/LUCENE-3058
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.0
> Attachments: LUCENE-3058.patch, LUCENE-3058.patch
>
> For the block tree terms dict, it turns out I need this case.
[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input
[ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028105#comment-13028105 ]

Uwe Schindler commented on LUCENE-3058:
---------------------------------------

bq. null passes all instanceofs

Definitely NOT! [http://stackoverflow.com/questions/2950319/is-null-check-needed-before-calling-instanceof]
[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input
[ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028108#comment-13028108 ]

Dawid Weiss commented on LUCENE-3058:
-------------------------------------

Handslap! And this is why you should always refresh your memory before posting something that lasts for millennia... Crawling back to my cave right now.
[jira] [Issue Comment Edited] (LUCENE-3058) FST should allow more than one output for the same input
[ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028104#comment-13028104 ]

Dawid Weiss edited comment on LUCENE-3058 at 5/3/11 8:35 AM:
-------------------------------------------------------------

Looks good to me. One note: possible NPE here (-null passes all instanceofs-):

{code}
+@Override
+public boolean equals(Object _other) {
+  if (_other instanceof TwoLongs) {
+    final TwoLongs other = (TwoLongs) _other;
+    return first == other.first && second == other.second;
+  } else {
+    return false;
+  }
+}
{code}

was (Author: dweiss):

Looks good to me. One note: possible NPE here (null passes all instanceofs):

{code}
+@Override
+public boolean equals(Object _other) {
+  if (_other instanceof TwoLongs) {
+    final TwoLongs other = (TwoLongs) _other;
+    return first == other.first && second == other.second;
+  } else {
+    return false;
+  }
+}
{code}
[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input
[ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028109#comment-13028109 ]

Uwe Schindler commented on LUCENE-3058:
---------------------------------------

:-) It always confuses me, too. But if you think more about it, it makes sense for instanceof to return false for null. It's always the same for me: whenever I write equals() methods, this question pops up. Now I mostly copy code like the one above from other classes.

But note: the above equals() code is only 100% suitable for final classes; otherwise a subclass that adds fields could still compare as equal. But that's more of a theoretical discussion. E.g. Lucene's Queries always check this.getClass()==other.getClass().
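The two points made in this exchange (instanceof is false for null, and an instanceof-based equals() is only fully safe on final classes) can be illustrated with a small sketch. This is not the code from the LUCENE-3058 patch: the TwoLongs class below is a hypothetical stand-in modelled on the quoted snippet, and Coord, LabeledCoord and EqualsDemo are made-up names used only for illustration.

```java
import java.util.Objects;

/** Hypothetical stand-in for the TwoLongs class in the quoted patch; final, so instanceof is safe. */
final class TwoLongs {
  final long first;
  final long second;

  TwoLongs(long first, long second) {
    this.first = first;
    this.second = second;
  }

  @Override
  public boolean equals(Object _other) {
    // instanceof evaluates to false for null, so no separate null check is needed.
    if (_other instanceof TwoLongs) {
      final TwoLongs other = (TwoLongs) _other;
      return first == other.first && second == other.second;
    }
    return false;
  }

  @Override
  public int hashCode() {
    return Objects.hash(first, second);
  }
}

/** Hypothetical non-final class that compares runtime classes instead of using instanceof. */
class Coord {
  final int x;

  Coord(int x) {
    this.x = x;
  }

  @Override
  public boolean equals(Object o) {
    // The explicit null check is required here because o.getClass() would throw on null.
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    return x == ((Coord) o).x;
  }

  @Override
  public int hashCode() {
    return Integer.hashCode(x);
  }
}

/** Hypothetical subclass that adds a field; it never equals a plain Coord under the getClass() check. */
class LabeledCoord extends Coord {
  final String label;

  LabeledCoord(int x, String label) {
    super(x);
    this.label = label;
  }
}

class EqualsDemo {
  public static void main(String[] args) {
    Object nothing = null;
    System.out.println(nothing instanceof TwoLongs);
    System.out.println(new TwoLongs(1, 2).equals(null));
    System.out.println(new TwoLongs(1, 2).equals(new TwoLongs(1, 2)));
    System.out.println(new Coord(1).equals(new LabeledCoord(1, "a")));
  }
}
```

With instanceof in a non-final class, `new Coord(1).equals(new LabeledCoord(1, "a"))` would be true while the reverse comparison could be false once the subclass overrides equals(), which breaks symmetry; the getClass() comparison avoids that at the cost of never treating subclasses as equal.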
Re: MergePolicy Thresholds
Hi

I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes.

Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. Do you remember whether, apart from the changes I've included in the patch, there were other important changes w.r.t. this code?

As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x.

Shai

On Mon, May 2, 2011 at 9:41 PM, Burton-West, Tom tburt...@umich.edu wrote:
> Hi Shai and Mike,
>
> Testing the TieredMP on our large indexes has been on my todo list since I read Mike's blog post http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html .
>
> If you port it to the 3.x branch Shai, I'll be more than happy to test it with our very large (300GB+) indexes. Besides being able to set the max merged segment size, I'm especially interested in using the maxSegmentsPerTier parameter.
>
> From Mike's blog post:
> ...maxSegmentsPerTier that lets you set the allowed width (number of segments) of each stair in the staircase. This is nice because it decouples how many segments to merge at a time from how wide the staircase can be.
>
> Tom Burton-West
> http://www.hathitrust.org/blogs/large-scale-search
>
> -----Original Message-----
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Monday, May 02, 2011 2:19 PM
> To: dev@lucene.apache.org
> Subject: Re: MergePolicy Thresholds
>
> I think it should be an easy port...
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Mon, May 2, 2011 at 2:16 PM, Shai Erera ser...@gmail.com wrote:
>> Thanks Mike. I'll take a look at TieredMP. Does it depend on trunk in any way, or do you think it can easily be ported to 3x?
>>
>> Shai

tieredmp.patch
Description: Binary data
[jira] [Commented] (LUCENE-3054) SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays
[ https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028125#comment-13028125 ]

Michael McCandless commented on LUCENE-3054:
--------------------------------------------

Patch looks good!

I like the 2*log_2(N) dynamic cutover; this means we can tolerate somewhat lopsided QS recursion and remain using QS.

> SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays
> --------------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-3054
> URL: https://issues.apache.org/jira/browse/LUCENE-3054
> Project: Lucene - Java
> Issue Type: Task
> Affects Versions: 3.1
> Reporter: Robert Muir
> Assignee: Uwe Schindler
> Priority: Critical
> Fix For: 3.1.1, 3.2, 4.0
> Attachments: LUCENE-3054-dynamic.patch, LUCENE-3054-stackoverflow.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch
>
> Looking at Otis's sort problem on the mailing list, he said:
> {noformat}
> * looked for other places where this call is made - found it in MultiPhraseQuery$MultiPhraseWeight and changed that call from ArrayUtil.quickSort to ArrayUtil.mergeSort
> * now we no longer see SorterTemplate.quickSort in deep recursion when we do a thread dump
> {noformat}
> I thought this was interesting because PostingsAndFreq's comparator looks like it needs a tiebreaker.
> I think in our sorts we should add some asserts to try to catch some of these broken comparators.
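The "2*log_2(N) dynamic cutover" mentioned in the comment can be pictured roughly as follows. The patch itself is not shown in this thread, so this is only a sketch under assumptions: the class name DepthLimitedSort, the int[] signature, the Lomuto partition and the simple top-down merge sort fallback are all hypothetical and are not Lucene's SorterTemplate code. The idea it illustrates is that quicksort gets a recursion budget of 2*log2(N) levels and hands the current range to merge sort once the budget is used up.

```java
import java.util.Arrays;

/** Hypothetical sketch of a depth-limited quicksort; not Lucene's SorterTemplate. */
class DepthLimitedSort {

  static void sort(int[] a) {
    if (a.length < 2) {
      return;
    }
    // floor(log2(N)), computed from the position of the highest set bit.
    int log2 = 31 - Integer.numberOfLeadingZeros(a.length);
    quickSort(a, 0, a.length - 1, 2 * log2);
  }

  private static void quickSort(int[] a, int lo, int hi, int depthLeft) {
    if (lo >= hi) {
      return;
    }
    if (depthLeft == 0) {
      // Recursion budget exhausted (very lopsided partitions): cut over to merge sort.
      mergeSort(a, lo, hi);
      return;
    }
    int p = partition(a, lo, hi);
    quickSort(a, lo, p - 1, depthLeft - 1);
    quickSort(a, p + 1, hi, depthLeft - 1);
  }

  /** Lomuto partition around the last element; returns the pivot's final index. */
  private static int partition(int[] a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; j++) {
      if (a[j] < pivot) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
        i++;
      }
    }
    int t = a[i];
    a[i] = a[hi];
    a[hi] = t;
    return i;
  }

  /** Plain top-down merge sort on the inclusive range [lo, hi]. */
  private static void mergeSort(int[] a, int lo, int hi) {
    if (lo >= hi) {
      return;
    }
    int mid = (lo + hi) >>> 1;
    mergeSort(a, lo, mid);
    mergeSort(a, mid + 1, hi);
    int[] tmp = new int[hi - lo + 1];
    int i = lo, j = mid + 1, k = 0;
    while (i <= mid && j <= hi) {
      tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i <= mid) {
      tmp[k++] = a[i++];
    }
    while (j <= hi) {
      tmp[k++] = a[j++];
    }
    System.arraycopy(tmp, 0, a, lo, tmp.length);
  }

  public static void main(String[] args) {
    // Large array with only three distinct values, similar to the case in the issue title.
    int[] data = new int[200_000];
    for (int i = 0; i < data.length; i++) {
      data[i] = i % 3;
    }
    int[] expected = data.clone();
    Arrays.sort(expected);
    sort(data);
    System.out.println(Arrays.equals(data, expected));
  }
}
```

Without the depth budget, this partition scheme would recurse roughly once per element on inputs with few distinct values, which is the kind of deep recursion the issue describes. The factor of 2 leaves room for moderately unbalanced partitions before the cutover happens.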
Re: MergePolicy Thresholds
Looks good Shai! Comments below too:

On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
> Hi
>
> I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes.
>
> Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. Do you remember whether, apart from the changes I've included in the patch, there were other important changes w.r.t. this code?

The only other changes I can think of were some verbosity improvements to IndexWriter, to support the python script that can make a merge movie from an infoStream output; but that can wait for when I back-port to 3.x...

> As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x.

Right, I think.

Mike
Re: MergePolicy Thresholds
Mike, if you want, I can back-port it, as I've already started this when preparing the patch. I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on 3x too? It'll be a backwards change. Maybe we should iterate on the issue? I can reopen.

Shai

On Tue, May 3, 2011 at 12:36 PM, Michael McCandless luc...@mikemccandless.com wrote:
> Looks good Shai! Comments below too:
>
> On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
>> Hi
>>
>> I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes.
>>
>> Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. Do you remember whether, apart from the changes I've included in the patch, there were other important changes w.r.t. this code?
>
> The only other changes I can think of were some verbosity improvements to IndexWriter, to support the python script that can make a merge movie from an infoStream output; but that can wait for when I back-port to 3.x...
>
>> As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x.
>
> Right, I think.
>
> Mike
Re: MergePolicy Thresholds
That'd be great, thanks :)

Yes, let's iterate on the issue! But: it should still be open, I hope (I didn't mean to close it yet, since it's not back ported)...

Mike

http://blog.mikemccandless.com

On Tue, May 3, 2011 at 5:51 AM, Shai Erera ser...@gmail.com wrote:
> Mike, if you want, I can back-port it, as I've already started this when preparing the patch. I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on 3x too? It'll be a backwards change. Maybe we should iterate on the issue? I can reopen.
>
> Shai
>
> On Tue, May 3, 2011 at 12:36 PM, Michael McCandless luc...@mikemccandless.com wrote:
>> Looks good Shai! Comments below too:
>>
>> On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
>>> Hi
>>>
>>> I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes.
>>>
>>> Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. Do you remember whether, apart from the changes I've included in the patch, there were other important changes w.r.t. this code?
>>
>> The only other changes I can think of were some verbosity improvements to IndexWriter, to support the python script that can make a merge movie from an infoStream output; but that can wait for when I back-port to 3.x...
>>
>>> As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x.
>>
>> Right, I think.
>>
>> Mike
[jira] [Created] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR
Do not include slf4j-jdk14 jar in WAR - Key: SOLR-2487 URL: https://issues.apache.org/jira/browse/SOLR-2487 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help newbies get up and running. But I find myself re-packaging the war for every customer when adapting to their choice of logger framework, which is counter-productive. It would be sufficient to have the jdk-logging binding in example/lib to let the example and tutorial still work OOTB but as soon as you deploy solr.war to production you're forced to explicitly decide what logging to use. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2488) README.TXT mixes Unix and Windows path styles
README.TXT mixes Unix and Windows path styles - Key: SOLR-2488 URL: https://issues.apache.org/jira/browse/SOLR-2488 Project: Solr Issue Type: Improvement Components: documentation Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Priority: Minor README.TXT mixes Unix- and Windows-style syntaxes without further comments. Propose to change e.g. %JAVA_HOME%\bin -> $JAVA_HOME/bin to be consistent and add a comment about Windows elsewhere -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2489) Remove old lucene.apache.org/solr/who page
Remove old lucene.apache.org/solr/who page -- Key: SOLR-2489 URL: https://issues.apache.org/jira/browse/SOLR-2489 Project: Solr Issue Type: Bug Affects Versions: 3.1, 3.2 Reporter: Jan Høydahl Priority: Minor In the distribution, docs/who.html is old - refers to the old Solr committers list at http://lucene.apache.org/solr/who Fix would be to simply delete the old page -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #111: POMs out of sync
Build: https://builds.apache.org/hudson/job/Lucene-Solr-Maven-3.x/111/ No tests ran. Build Log (for compile errors): [...truncated 13339 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-1076) Allow MergePolicy to select non-contiguous merges
[ https://issues.apache.org/jira/browse/LUCENE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1076: --- Attachment: LUCENE-1076-3x.patch Patch against 3x. This is not ready to commit yet, as many tests fail on exceptions like this: {noformat} [junit] java.lang.IndexOutOfBoundsException [junit] at java.util.AbstractList.subList(AbstractList.java:763) [junit] at java.util.Vector.subList(Vector.java:975) [junit] at org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3550) [junit] at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4057) [junit] at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3631) {noformat} Mike says there was an earlier commit (which handled how deletes are flushed) that is a dependency of this patch, and that I can continue only after he back-ports it. In the meantime, I've fixed tests that assumed LogMP (for setting compound and mergeFactor) by adding LTC.setUseCompoundFile and LTC.setMergeFactor as utility methods. Will continue after Mike back-ports the dependencies. Allow MergePolicy to select non-contiguous merges - Key: LUCENE-1076 URL: https://issues.apache.org/jira/browse/LUCENE-1076 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.3 Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-1076-3x.patch, LUCENE-1076.patch, LUCENE-1076.patch, LUCENE-1076.patch I started work on this but with LUCENE-1044 I won't make much progress on it for a while, so I want to checkpoint my current state/patch. For backwards compatibility we must leave the default MergePolicy as selecting contiguous merges. This is necessary because some applications rely on temporal monotonicity of doc IDs, which means even though merges can re-number documents, the renumbering will always reflect the order in which the documents were added to the index. 
Still, for those apps that do not rely on this, we should offer a MergePolicy that is free to select the best merges regardless of whether they are contiguous. This requires fixing IndexWriter to accept such a merge, and fixing LogMergePolicy to optionally allow it the freedom to do so. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2488) README.TXT mixes Unix and Windows path styles
[ https://issues.apache.org/jira/browse/SOLR-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-2488: -- Attachment: SOLR-2488.patch Proposed changes README.TXT mixes Unix and Windows path styles - Key: SOLR-2488 URL: https://issues.apache.org/jira/browse/SOLR-2488 Project: Solr Issue Type: Improvement Components: documentation Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Priority: Minor Attachments: SOLR-2488.patch README.TXT mixes Unix- and Windows-style syntaxes without further comments. Propose to change e.g. %JAVA_HOME%\bin -> $JAVA_HOME/bin to be consistent and add a comment about Windows elsewhere -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: (LUCENE-3058) FST should allow more than one output for the same input
I usually do an explicit check for nulls and that's why I allowed myself to bring the issue up. It's similar to operator priorities -- I just like to have explicit brackets instead of relying on my degenerating memory... As for sorting, I don't like to rely on the default hashCode/equals exactly for the reasons you mentioned and prefer explicit comparators. It's really a pity there is no full hashcode/equals delegation model in java util collections, it would be a nice addition. On Tue, May 3, 2011 at 10:41 AM, Uwe Schindler (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028109#comment-13028109 ] Uwe Schindler commented on LUCENE-3058: --- :-) It always confuses me, too. But if you think more about it, it makes sense to return false. But it's the same always for me: Whenever I write equals() methods, this question pops up. But now I mostly copy code like the one above from other classes. But you have to note: The above equals() code is only 100% suitable for final classes, else it could happen that a subclass that extends some fields is equal. But that's more of a theoretical discussion. E.g. Lucene's Queries always check this.getClass()==other.getClass(). FST should allow more than one output for the same input Key: LUCENE-3058 URL: https://issues.apache.org/jira/browse/LUCENE-3058 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-3058.patch, LUCENE-3058.patch For the block tree terms dict, it turns out I need this case. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
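The equals() points raised in this thread (an explicit null check, and comparing getClass() so that a subclass adding fields never compares equal) can be sketched roughly as below. This is only an illustration: `Pair` and `WeightedPair` are hypothetical classes invented for the example, not code from the LUCENE-3058 patch.

```java
import java.util.Objects;

// Hypothetical value class, used only to illustrate the equals() pattern discussed above.
class Pair {
    private final String input;
    private final long output;

    Pair(String input, long output) {
        this.input = input;
        this.output = output;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        // Explicit null check: equals(null) returns false instead of throwing.
        if (other == null) return false;
        // Exact class comparison, so a subclass that adds fields is never equal to a Pair.
        if (getClass() != other.getClass()) return false;
        Pair that = (Pair) other;
        return output == that.output && Objects.equals(input, that.input);
    }

    @Override
    public int hashCode() {
        return Objects.hash(input, output);
    }
}

// Hypothetical subclass with an extra field, to show why the getClass() check matters.
class WeightedPair extends Pair {
    final float weight;

    WeightedPair(String input, long output, float weight) {
        super(input, output);
        this.weight = weight;
    }
}
```

With an instanceof check instead of getClass(), `new Pair("foo", 1).equals(new WeightedPair("foo", 1, 2.0f))` would be true even though the subclass carries extra state, which is the non-final-class caveat mentioned above.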
[jira] [Commented] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028189#comment-13028189 ] Robert Muir commented on LUCENE-3055: - {quote} Also, if reusableTokenStream is the only method left standing, isn't it wise to hide actual reuse somewhere in Lucene internals and turn Analyzer into plain and dumb factory interface? {quote} Hi Earwin: I completely agree that somehow Analyzer should be a plain and dumb interface, but are you suggesting we should move the responsibility of reuse onto the consumer? I think this could be challenging; alternatively, there might be a way to present a plain and dumb API with the reuse guts buried inside Analyzer itself (like ReusableAnalyzerBase), and reuse enforced (e.g. the tokenStream() is final and you cannot disable reuse). The trick would be handling the special cases such as AnalyzerWrappers but I feel like we could still do this. Either way, I really think we should try to do this for 4.0. Though I think to get there it would be safest if we addressed a few issues first: * LUCENE-2788: make charfilters reusable, otherwise we will make the same mistake again! * LUCENE-3064: ensure consumers are properly using the API e.g. calling reset() * LUCENE-3040: cut all consumers over to reusable API, so it's really the one left standing LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers -- Key: LUCENE-3055 URL: https://issues.apache.org/jira/browse/LUCENE-3055 Project: Lucene - Java Issue Type: Bug Components: Analysis Affects Versions: 3.1 Reporter: Ian Soboroff LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes ReusableAnalyzerBase useless, and makes it impossible to subclass e.g. StandardAnalyzer to make a small modification e.g. to tokenStream(). These issues don't indicate a new method of doing this. 
The issues don't give a reason except for design considerations, which seems a poor reason to make a backward-incompatible change -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
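The "plain and dumb API with the reuse guts buried inside" idea from the comment above might look something like the following sketch. Everything here is an assumption made for illustration: `SketchAnalyzer`, `SketchTokenStream`, and `createComponents()` are hypothetical names, and the per-thread caching is just one possible way to enforce reuse, not what Lucene actually does.

```java
// Hypothetical stand-in for a token stream: it can be pointed at new text and then consumed.
class SketchTokenStream {
    private String[] tokens = new String[0];
    private int pos;

    // Re-targets this (reused) stream at new text.
    void setText(String text) {
        String trimmed = text.trim();
        tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        pos = 0;
    }

    // Returns the next whitespace-separated token, or null when exhausted.
    String next() {
        return pos < tokens.length ? tokens[pos++] : null;
    }
}

// Hypothetical "dumb factory" analyzer: subclasses only describe how to build
// the components; reuse lives here and cannot be switched off.
abstract class SketchAnalyzer {
    private final ThreadLocal<SketchTokenStream> cached = new ThreadLocal<SketchTokenStream>();

    // The only extension point; invoked at most once per thread.
    protected abstract SketchTokenStream createComponents();

    // final, so a subclass cannot bypass reuse by overriding it.
    public final SketchTokenStream tokenStream(String text) {
        SketchTokenStream ts = cached.get();
        if (ts == null) {
            ts = createComponents();
            cached.set(ts);
        }
        ts.setText(text);
        return ts;
    }
}
```

In this shape a subclass can still customize analysis by overriding `createComponents()`, which is roughly the kind of small modification the issue description says became impossible once the core analyzers were made final.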
[jira] [Resolved] (SOLR-2488) README.TXT mixes Unix and Windows path styles
[ https://issues.apache.org/jira/browse/SOLR-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-2488. Resolution: Fixed Committed. Thanks Jan! README.TXT mixes Unix and Windows path styles - Key: SOLR-2488 URL: https://issues.apache.org/jira/browse/SOLR-2488 Project: Solr Issue Type: Improvement Components: documentation Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Priority: Minor Attachments: SOLR-2488.patch README.TXT mixes Unix- and Windows-style syntaxes without further comments. Propose to change e.g. %JAVA_HOME%\bin -> $JAVA_HOME/bin to be consistent and add a comment about Windows elsewhere -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3064) add checks to MockTokenizer to enforce proper consumption
[ https://issues.apache.org/jira/browse/LUCENE-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3064: Attachment: LUCENE-3064.patch Updated patch with fixes for contrib, though highlighter still remains, and some TODOs are not resolved. add checks to MockTokenizer to enforce proper consumption - Key: LUCENE-3064 URL: https://issues.apache.org/jira/browse/LUCENE-3064 Project: Lucene - Java Issue Type: Test Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3064.patch, LUCENE-3064.patch We can use MockTokenizer to enforce things like the consumer properly iterating through the tokenstream lifecycle. This could catch bugs in consumers that don't call reset(), etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
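A rough idea of what "enforcing proper consumption" could mean is sketched below. This is not the MockTokenizer code from the patch; `CheckedTokenStream` is a hypothetical class, and the assumed contract (reset(), then incrementToken() until it returns false, then end(), then close()) is stated here as an assumption of the sketch.

```java
// Hypothetical sketch of lifecycle enforcement; not the actual MockTokenizer implementation.
class CheckedTokenStream {
    private enum State { CREATED, RESET, EXHAUSTED, ENDED, CLOSED }

    private final String[] tokens;
    private int pos;
    private State state = State.CREATED;

    CheckedTokenStream(String... tokens) {
        this.tokens = tokens;
    }

    // Must be called before the first incrementToken(); also allows reuse after close().
    void reset() {
        pos = 0;
        state = State.RESET;
    }

    // Fails fast if the consumer skipped reset() or keeps pulling after exhaustion.
    boolean incrementToken() {
        if (state != State.RESET) {
            throw new IllegalStateException("incrementToken() called in state " + state);
        }
        if (pos < tokens.length) {
            pos++;
            return true;
        }
        state = State.EXHAUSTED;
        return false;
    }

    // Only legal once the consumer has iterated to the end of the stream.
    void end() {
        if (state != State.EXHAUSTED) {
            throw new IllegalStateException("end() called in state " + state);
        }
        state = State.ENDED;
    }

    void close() {
        state = State.CLOSED;
    }
}
```

A test that feeds such a stream to a consumer would then fail loudly, rather than silently, when the consumer forgets reset() or stops iterating early.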
[jira] [Resolved] (LUCENE-3054) SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays
[ https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-3054. --- Resolution: Fixed Committed trunk revision: 1099041 Merged 3.x revision: 1099045 Merged 3.1 revision: 1099046 SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays Key: LUCENE-3054 URL: https://issues.apache.org/jira/browse/LUCENE-3054 Project: Lucene - Java Issue Type: Task Affects Versions: 3.1 Reporter: Robert Muir Assignee: Uwe Schindler Priority: Critical Fix For: 3.1.1, 3.2, 4.0 Attachments: LUCENE-3054-dynamic.patch, LUCENE-3054-stackoverflow.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch Looking at Otis's sort problem on the mailing list, he said: {noformat} * looked for other places where this call is made - found it in MultiPhraseQuery$MultiPhraseWeight and changed that call from ArrayUtil.quickSort to ArrayUtil.mergeSort * now we no longer see SorterTemplate.quickSort in deep recursion when we do a thread dump {noformat} I thought this was interesting because PostingsAndFreq's comparator looks like it needs a tiebreaker. I think in our sorts we should add some asserts to try to catch some of these broken comparators. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
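The remark that the comparator "looks like it needs a tiebreaker" can be illustrated with a small sketch. The class and field names below (`Posting`, `docFreq`, `position`) are illustrative assumptions rather than the real PostingsAndFreq definition, and the sketch does not claim to be the fix that was committed for this issue.

```java
import java.util.Comparator;

// Hypothetical record; the fields are illustrative only.
class Posting {
    final int docFreq;
    final int position;

    Posting(int docFreq, int position) {
        this.docFreq = docFreq;
        this.position = position;
    }
}

class PostingOrder {
    // Primary key only: when docFreq takes few distinct values in a large array,
    // most pairs compare as equal.
    static final Comparator<Posting> BY_FREQ_ONLY = new Comparator<Posting>() {
        @Override
        public int compare(Posting a, Posting b) {
            return Integer.compare(a.docFreq, b.docFreq);
        }
    };

    // Same primary key plus a tiebreaker, so elements with equal docFreq are still ordered.
    static final Comparator<Posting> BY_FREQ_THEN_POSITION = new Comparator<Posting>() {
        @Override
        public int compare(Posting a, Posting b) {
            int byFreq = Integer.compare(a.docFreq, b.docFreq);
            return byFreq != 0 ? byFreq : Integer.compare(a.position, b.position);
        }
    };
}
```

Using `Integer.compare` rather than subtraction also avoids overflow, which is another common way for a comparator to end up broken.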
[jira] [Commented] (SOLR-2191) Change SolrException cstrs that take Throwable to default to alreadyLogged=false
[ https://issues.apache.org/jira/browse/SOLR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028240#comment-13028240 ] David Smiley commented on SOLR-2191: Ehh... I kind of like the notion but whether it is kept or not, I think a general error/exception strategy needs to be devised. In code I write, I tend to almost never log exceptions; I let them get to the highest possible point to ensure they are logged there once, which is usually one place. Beforehand I might catch an exception to do a log.error() to provide some context and then rethrow the exception. I also wrap with RuntimeExceptions. An alternative is to log exceptions early (with contextual error message), and then rethrow but don't log it higher up (e.g. earlier up) the stack. But how can that early point know the exception has been handled? It can't generically know, which makes your suggestion to "fix those code paths to be less chatty" problematic. Perhaps our code will always assume that we logged an exception before wrapping it in SolrException right before we throw them. I think that's a reasonable policy and wouldn't require an alreadyLogged flag. Change SolrException cstrs that take Throwable to default to alreadyLogged=false Key: SOLR-2191 URL: https://issues.apache.org/jira/browse/SOLR-2191 Project: Solr Issue Type: Bug Reporter: Mark Miller Fix For: Next Attachments: SOLR-2191.patch Because of misuse, many exceptions are now not logged at all - can be painful when doing dev. I think we should flip this setting and work at removing any double logging - losing logging is worse (and it almost looks like we lose more logging than we would get in double logging) - and bad solrexception/logging patterns are proliferating. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
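The "add context low down, log once at the top" strategy described in this comment can be sketched as follows. The names are hypothetical (`LogOnceSketch`, `parseRows`, `handleRequest`), and an in-memory list stands in for a real logger so the sketch stays self-contained; it is not Solr code and does not use SolrException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a log-once policy; not taken from Solr.
class LogOnceSketch {
    // Stand-in for a real logger, so the example needs no logging library.
    static final List<String> LOG = new ArrayList<String>();

    // Low level: adds context by wrapping the exception, but does not log it.
    static int parseRows(String value) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new RuntimeException("invalid 'rows' parameter: " + value, e);
        }
    }

    // Top level: the single place where the exception is logged.
    static int handleRequest(String rowsParam) {
        try {
            return parseRows(rowsParam);
        } catch (RuntimeException e) {
            LOG.add("ERROR " + e.getMessage() + " (cause: " + e.getCause() + ")");
            return -1;
        }
    }
}
```

Because only the outermost handler logs, no flag is needed to record whether an exception was already logged, which is the property argued for above.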
[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR
[ https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028243#comment-13028243 ] David Smiley commented on SOLR-2487: I like it Jan! JDK14 logging sucks, anyway. Do not include slf4j-jdk14 jar in WAR - Key: SOLR-2487 URL: https://issues.apache.org/jira/browse/SOLR-2487 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Labels: logging, slf4j I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help newbies get up and running. But I find myself re-packaging the war for every customer when adapting to their choice of logger framework, which is counter-productive. It would be sufficient to have the jdk-logging binding in example/lib to let the example and tutorial still work OOTB but as soon as you deploy solr.war to production you're forced to explicitly decide what logging to use. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2191) Change SolrException cstrs that take Throwable to default to alreadyLogged=false
[ https://issues.apache.org/jira/browse/SOLR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028246#comment-13028246 ] Yonik Seeley commented on SOLR-2191: bq. In code I write, I tend to almost never log exceptions; I let them get to the highest possible point to ensure they are logged there once, which is usually one place. Beforehand I might catch an exception to do a log.error() to provide some context and then rethrow the exception. Right. And logging immediately can be problematic since one may not know if it's really an error that should be logged since Exceptions can sometimes be handled (dismax is one example). Anyway, certainly a +1 from me for changing the default of alreadyLogged and improving the strategy in general. Change SolrException cstrs that take Throwable to default to alreadyLogged=false Key: SOLR-2191 URL: https://issues.apache.org/jira/browse/SOLR-2191 Project: Solr Issue Type: Bug Reporter: Mark Miller Fix For: Next Attachments: SOLR-2191.patch Because of misuse, many exceptions are now not logged at all - can be painful when doing dev. I think we should flip this setting and work at removing any double logging - losing logging is worse (and it almost looks like we lose more logging than we would get in double logging) - and bad solrexception/logging patterns are proliferating. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028252#comment-13028252 ] Stephen Weiss commented on SOLR-236: Yes, I've had this too: https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12655750page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12655750 I'm pretty sure I know the reason for it, but I don't know how to fix it... to the best of my knowledge no one on the ticket really said if the problem could be fixed or not yet either. At the moment we just use facet.before and explain to our users that the facets are for unfiltered results... almost no one complains once we explain it to them. However, a fix would be *wonderful*... people ask about it often enough that clearly it's not very intuitive. Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: Next Attachments: DocSetScoreCollector.java, NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java, SOLR-236-1_4_1-NPEfix.patch, SOLR-236-1_4_1-paging-totals-working.patch, SOLR-236-1_4_1.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-branch_3x.patch, SOLR-236-distinctFacet.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch, collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, 
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, solr-236.patch This patch include a new feature called Field collapsing. Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated more documents from this site link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48amid=299 The implementation add 3 new query parameters (SolrParams): collapse.field to choose the field used to group results collapse.type normal (default value) or adjacent collapse.max to select how many continuous results are allowed before collapsing TODO (in progress): - More documentation (on source code) - Test cases Two patches: - field_collapsing.patch for current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
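The "adjacent" collapsing mode described in the issue text can be pictured with the sketch below. It is only one reading of the description: `AdjacentCollapseSketch` and `collapseAdjacent` are hypothetical names, the `max` argument is assumed to play the role of collapse.max (how many consecutive results with the same field value are kept), and none of this is taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of adjacent collapsing; not the SOLR-236 implementation.
class AdjacentCollapseSketch {
    // fieldValues holds the collapse-field value of each result, in ranked order
    // (values are assumed non-null). At most `max` consecutive results sharing a
    // value are kept; the rest of each run is dropped (collapsed).
    static List<String> collapseAdjacent(List<String> fieldValues, int max) {
        List<String> kept = new ArrayList<String>();
        String previous = null;
        int run = 0;
        for (String value : fieldValues) {
            if (value.equals(previous)) {
                run++;
            } else {
                run = 1;
                previous = value;
            }
            if (run <= max) {
                kept.add(value);
            }
        }
        return kept;
    }
}
```

For the ranked values a, a, a, b, a with max set to 1, this keeps a, b, a: the trailing a survives because it is not adjacent to the first run, which is what distinguishes adjacent collapsing from collapsing across the whole result set.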
[jira] [Commented] (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028256#comment-13028256 ] Yuriy Akopov commented on SOLR-236: --- Thanks, Stephen. So it isn't just me doing something else wrong. I'm thinking of displaying not the actual figures against the facet items but something like 100+, 200+, 300+ etc. Should be okay as the difference is not dramatic but seems to remain within the relatively narrow interval. Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: Next Attachments: DocSetScoreCollector.java, NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java, SOLR-236-1_4_1-NPEfix.patch, SOLR-236-1_4_1-paging-totals-working.patch, SOLR-236-1_4_1.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-branch_3x.patch, SOLR-236-distinctFacet.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch, collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, 
field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, solr-236.patch This patch include a new feature called Field collapsing. Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated more documents from this site link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48amid=299 The implementation add 3 new query parameters (SolrParams): collapse.field to choose the field used to group results collapse.type normal (default value) or adjacent collapse.max to select how many continuous results are allowed before collapsing TODO (in progress): - More documentation (on source code) - Test cases Two patches: - field_collapsing.patch for current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: MergePolicy Thresholds
Thanks Shai! I'm way behind on my 3.x backports -- I'll try to do this soon. Mike http://blog.mikemccandless.com On Tue, May 3, 2011 at 8:10 AM, Shai Erera ser...@gmail.com wrote: I uploaded a patch to LUCENE-1076. Tom, apparently the patch I've attached before cannot be used, because there are dependencies (in earlier commits on LUCENE-1076) that need to be back-ported as well. So stay tuned on LUCENE-1076 for when it is safe to use this new MP. Shai On Tue, May 3, 2011 at 1:00 PM, Michael McCandless luc...@mikemccandless.com wrote: That'd be great, thanks :) Yes, let's iterate on the issue! But: it should still be open, I hope (I didn't mean to close it yet, since it's not back ported)... Mike http://blog.mikemccandless.com On Tue, May 3, 2011 at 5:51 AM, Shai Erera ser...@gmail.com wrote: Mike, if you want, I can back-port it, as I've already started this when preparing the patch. I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on 3x too? It'll be a backwards change. Maybe we should iterate on the issue? I can reopen. Shai On Tue, May 3, 2011 at 12:36 PM, Michael McCandless luc...@mikemccandless.com wrote: Looks good Shai! Comments below too: On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote: Hi I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes. Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. Do you remember, if apart from the changes I've included in the patch, other important changes w.r.t. this code? 
The only other changes I can think of were some verbosity improvements to IndexWriter, to support the python script that can make a merge movie from an infoStream output; but that can wait for when I back-port to 3.x... As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x. Right, I think. Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3028) IW.getReader() returns inconsistent reader on RT Branch
[ https://issues.apache.org/jira/browse/LUCENE-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3028. - Resolution: Fixed fixed in RT IW.getReader() returns inconsistent reader on RT Branch --- Key: LUCENE-3028 URL: https://issues.apache.org/jira/browse/LUCENE-3028 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Realtime Branch Attachments: LUCENE-3028.patch, LUCENE-3028.patch, realtime-1.txt I extended the testcase TestRollingUpdates#testUpdateSameDoc to pull a NRT reader after each update and asserted that it always sees only one document. Yet, this fails with the current branch since there is a problem in how we flush in the getReader() case. What happens here is that we flush all threads and then release the lock (letting other flushes which came in after we entered the flushAllThread context continue), so that we could concurrently get a new segment that transports global deletes without the corresponding add. They sneak in while we continue to open the NRT reader, which in turn sees inconsistent results. I will upload a patch soon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR
[ https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028283#comment-13028283 ] Hoss Man commented on SOLR-2487: bq. It would be sufficient to have the jdk-logging binding in example/lib to let the example and tutorial still work OOTB but as soon as you deploy solr.war to production you're forced to explicitly decide what logging to use. Personally, that sounds like a terrible idea to me. Novice users would try the demo, see that it works, then try deploying to some other servlet container and suddenly get errors unless the servlet container had already explicitly loaded some slf4j binding jar? we already have plenty of users who get confused about how (and even *why*) they configure the solr home dir when deploying solr to a servlet container -- this would make it even harder for beginners. simple things should be simple -- novice users should be able to copy a jar, and copy configs, and be good to go. for a user who cares about jdk14 logging vs log4j vs whatever, the task of customizing the war is simple and straightforward to understand -- but for a solr user who doesn't know anything about java, picking an slf4j binding and configuring their servlet container to load it could easily appear like a daunting burden that will make them turn away from even using solr past the tutorial stage. this really seems like a no brainer to me Do not include slf4j-jdk14 jar in WAR - Key: SOLR-2487 URL: https://issues.apache.org/jira/browse/SOLR-2487 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Labels: logging, slf4j I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help newbies get up and running. But I find myself re-packaging the war for every customer when adapting to their choice of logger framework, which is counter-productive. 
It would be sufficient to have the jdk-logging binding in example/lib to let the example and tutorial still work OOTB but as soon as you deploy solr.war to production you're forced to explicitly decide what logging to use. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: modularization discussion
Isn't our end goal here a bunch of well factored search modules? Ie, fast forward a year or two and I think we should have modules like these:

* Faceting
* Highlighting
* Suggest (good patch is on LUCENE-2995)
* Schema
* Query impls
* Query parsers
* Analyzers (good progress here already, thanks Robert!), incl. factories/XML configuration (still need this)
* Database import (DIH)
* Web app
* Distribution/replication
* Doc set representations
* Collapse/grouping
* Caches
* Similarity/scoring impls (BM25, etc.)
* Codecs
* Joins
* Lucene core

In this future, much of this code came from what is now Solr and Lucene, but we should freely and aggressively poach from other projects when appropriate (and license/provenance is OK). I keep seeing all these cool compressed int set projects popping up... surely these are useful for us. Solr poached a doc set impl from Nutch; probably there's other stuff to poach from Nutch, Mahout, etc. Katta's doing something sweet with distribution/replication; let's poach and merge w/ Solr's approach. There are various facet impls out there (Bobo browse/Zoie; Toke's; Elastic Search); let's poach and merge with Solr's. Elastic Search has lots of cool stuff, too, under ASL2. All these external open-source projects are fair game for poaching and refactoring into shared modules, along with what is now Solr and Lucene sources.

In this ideal future, Solr becomes the bundling and default/example configuration of the Web App and other modules, much like how the various Linux distros bundle different stuff together around the Linux kernel. And if you are an advanced app and don't need the webapp part, you can cherry pick the super duper modules you do need and embed them directly into your app. Isn't this the future we are working towards?

Mike http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: modularization discussion
On the namespace, since Yonik seems concerned about it, and others aren't (I think?), why don't we leave everything factored out of Solr under the org.apache.solr namespace? Anyone object to that approach? My only concern is that this sends the message that the module depends on Solr, but this turns into a non-issue once Solr is well factored into modules, because by the time we arrive at that future, depending on Solr just means depending on Solr modules, which resolves my concern! Mike http://blog.mikemccandless.com On Mon, May 2, 2011 at 6:11 PM, Grant Ingersoll gsing...@apache.org wrote: On Apr 27, 2011, at 11:45 PM, Greg Stein wrote: On Wed, Apr 27, 2011 at 09:25:14AM -0400, Yonik Seeley wrote: ... But as I said... it seems only fair to meet half way and use the solr namespace for some modules and the lucene namespace for others. Please explain this part to me... I really don't understand. At the risk of speaking for someone else, I think it has to do w/ wanting to maintain brand awareness for Solr. We, as the PMC, currently produce two products: Apache Lucene and Apache Solr. I believe Yonik's concern is that if everything is just labeled Lucene, then Solr is just seen as a very thin shell around Lucene (which, IMO, would still not be the case, since wiring together a server app like Solr is non-trivial, but that is my opinion and I'm not sure if Yonik shares it). Solr has never been a thin shell around Lucene and never will be. However, in some ways, this gets at why I believe Yonik was interested in a Solr TLP: so that Solr could stand on its own as a brand and as a first class Apache product steered by a PMC that is aligned solely w/ producing the Solr (i.e. as a TLP) product as opposed to the two products we produce now. 
(Note, my vote on such a TLP was -1, so please don't confuse me as arguing for the point, I'm just trying to, hopefully, explain it) That being said, 99% of consumers of Solr never even know what is in the underlying namespace b/c they only ever interact w/ Solr via HTTP (which has solr in the namespace by default) at the server API level, so at least in my mind, I don't care what the namespace used underneath is. Call it lusolr for all I care. What does fairness have to do with the codebase? I can't speak to this, but perhaps it's just the wrong choice of words and would have been better said: please don't take this as a reason to gut Solr and call everything Lucene. Isn't the whole point of the Lucene project to create the best code possible, for the benefit of our worldwide users? It is. We do that primarily through the release of two products: Lucene and Solr. Lucene is a Java class library. A good deal of programming is required to create anything meaningful in terms of a production ready search server. Solr is a server that takes and makes most things that are programming tasks in Lucene configuration tasks as well as adds a fair bit of functionality (distributed search, replication, faceting, auto-suggest, etc.) and is thus that much easier to put in production (I've seen people be in production on Solr in a matter of days/weeks, I've never seen that in Lucene) The crux of this debate is whether these additional pieces are better served as modules (I think they are) or tightly coupled inside of Solr (which does have a few benefits from a dev. point of view, even though I firmly believe they are outweighed by the positives of modularization.) And, while I think most of us agree that modularization makes sense, that doesn't mean there aren't reasons against it. I also believe we need to take it on a case by case basis. I also don't think every patch has to be in it's final place on first commit. As Otis so often says, it's just software. 
If it doesn't work, change it. Thus, if people contribute and it lands in Solr, the committer who commits it need not immediately move it (although, hopefully they will) or ask the contributor to do so, as that will likely dampen contributions. Likewise for Lucene. Along with that, if and when others wish to refactor, then they should by all means be allowed to do so assuming of course, all tests across both products still pass. In short, I believe people should still contribute where they see they can add the most value and according to their time schedules. Additionally, others who have more time or the ability to refactor for reusability should be free to do so as well. I don't know what the outcome of this thread should be, so I guess we need to just move forward and keep coding away and working to make things better. Do others see anything broader here? A vote? That would be symbolic, I guess, but doesn't force anyone to do anything since there isn't a specific issue at hand other than a broad concept that is seen as good. -Grant
[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR
[ https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028316#comment-13028316 ] Uwe Schindler commented on SOLR-2487: - +1 +1 +1 +1 ... Do not include slf4j-jdk14 jar in WAR - Key: SOLR-2487 URL: https://issues.apache.org/jira/browse/SOLR-2487 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.2, 4.0 Reporter: Jan Høydahl Labels: logging, slf4j I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help newbies get up and running. But I find myself re-packaging the war for every customer when adapting to their choice of logger framework, which is counter-productive. It would be sufficient to have the jdk-logging binding in example/lib to let the example and tutorial still work OOTB but as soon as you deploy solr.war to production you're forced to explicitly decide what logging to use. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: modularization discussion
On May 3, 2011, at 12:49 PM, Michael McCandless wrote: Isn't this the future we are working towards? No, not really. Others perhaps, but not me. I'm on board with some modules. I do think there are tradeoffs when considering them and considering Lucene and Solr. I'm happy to take everything one issue at a time. When I voted to merge, no, I certainly was not thinking, I hope in a year or two we have taken everything from Solr and made it a module. I did it for a few specific things to start - analyzers for sure, perhaps some other things as people did something that made sense. I did it so we could share some code more easily - not all code. Others did it for their own reasons I assume. But no - I'm not sure I have ever fully subscribed to what you are saying. - Mark Miller lucidimagination.com Lucene/Solr User Conference May 25-26, San Francisco www.lucenerevolution.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028327#comment-13028327 ] Michael McCandless commented on LUCENE-3023: This cutover to concurrent flushing (DWPT) produces astounding increases in indexing throughput: http://people.apache.org/~mikemccand/lucenebench/indexing.html 186 GB plain text per hour (from 101 GB/hour the day before)!!! It's not every day you see an 84% jump in indexing throughput! Wow. This is on a machine that has substantial CPU+IO concurrency, ie, it was bottlenecked by our non-concurrent flush. Also, I can now tune up the IW settings I use in those nightly benchmarks; it's now 6 threads and only 512 MB RAM buffer. I'll wait a few days and then do that. Looks like a few queries got a bit slower... I suspect this is because the index segment count has changed. Before concurrent flushing it was this: {noformat} 36(4.0):C4977400 _69(4.0):C4977400 _9c(4.0):C4977400 _cf(4.0):C4977400 _fi(4.0):C4977400 _fq(4.0):C497740 _g1(4.0):C497740 _gc(4.0):C497740 _gn(4.0):C497740 _gy(4.0):C497740 _gx(4.0):C49774 _gz(4.0):C49774 _h0(4.0):C49774 _h1(4.0):C49774 _h2(4.0):C49774 _h3(4.0):C468 {noformat} After concurrent flushing: {noformat} _3d(4.0):C4977400 _6h(4.0):C4977400 _9j(4.0):C4977400 _cn(4.0):C4977400 _fq(4.0):C4977400 _fu(4.0):C497740 _g6(4.0):C497740 _gh(4.0):C497740 _gs(4.0):C497740 _h2(4.0):C497740 _gy(4.0):C49774 _gz(4.0):C49774 _h0(4.0):C49774 _h5(4.0):C4105 _1(4.0):C2627 _h4(4.0):C16331 _h3(4.0):C28728 _h1(4.0):C48225 {noformat} So we have 2 extra segments... it's interesting how this affects some queries but not others. 
Land DWPT on trunk -- Key: LUCENE-3023 URL: https://issues.apache.org/jira/browse/LUCENE-3023 Project: Lucene - Java Issue Type: Task Affects Versions: CSF branch, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3023-quicksort-reincarnation.patch, LUCENE-3023-svn-diff.patch, LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, diffSources.patch, realtime-TestAddIndexes-3.txt, realtime-TestAddIndexes-5.txt, realtime-TestIndexWriterExceptions-assert-6.txt, realtime-TestIndexWriterExceptions-npe-1.txt, realtime-TestIndexWriterExceptions-npe-2.txt, realtime-TestIndexWriterExceptions-npe-4.txt, realtime-TestOmitTf-corrupt-0.txt With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so we can proceed landing the DWPT development on trunk soon. I think one of the bigger issues here is to make sure that all JavaDocs for IW etc. are still correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
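The 84% figure quoted in the LUCENE-3023 comment above follows directly from the two throughput numbers given there (101 GB/hour before, 186 GB/hour after). A minimal sketch of the arithmetic, where the class and method names are made up for illustration:

```java
// Illustrative only: ThroughputJump and percentIncrease are hypothetical names.
final class ThroughputJump {

    // Relative increase from 'before' to 'after', expressed in percent.
    static double percentIncrease(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Numbers taken from the comment: 101 GB/hour -> 186 GB/hour.
        double jump = percentIncrease(101.0, 186.0);
        System.out.println(Math.round(jump) + "% increase in indexing throughput");
    }
}
```

(186 - 101) / 101 is roughly 0.84, which matches the quoted jump.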
Re: modularization discussion
On Tue, May 3, 2011 at 1:11 PM, Mark Miller markrmil...@gmail.com wrote: On May 3, 2011, at 12:49 PM, Michael McCandless wrote: Isn't this the future we are working towards? No, not really. Others perhaps, but not me. I'm on board with some modules. I do think there are tradeoffs when considering them and considering Lucene and Solr. I'm happy to take everything one issue at a time. I hope the outcome of this discussion is a shared sense of the relationship between lucene, solr, and modules -- we need some general guidelines so that every time this comes up we don't have to have the same discussion over and over. Mike, I agree with the general vision -- the details on how it would actually work suggest that we may have to fast forward more than a year or two for most of these things -- but who knows. ryan - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
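The idea in LUCENE-3065 of using spare bits in the stored-field flags to record that a field is numeric can be sketched roughly as follows. This is only an illustration under stated assumptions: the bit positions, the constant names, and the NumericType enum are hypothetical and are not taken from the actual Lucene file format or from the patch.

```java
// Hypothetical sketch: none of these names or bit positions come from Lucene itself.
final class StoredFieldFlags {

    // Assumed pre-existing flag bit (position chosen arbitrarily for the example).
    static final int FLAG_BINARY = 1 << 1;

    // Assumed "spare" bits: three bits starting at bit 3 hold the numeric type.
    static final int NUMERIC_SHIFT = 3;
    static final int NUMERIC_MASK = 0x07 << NUMERIC_SHIFT;

    // NONE must stay at ordinal 0 so that flags written without these bits
    // decode as "not numeric".
    enum NumericType { NONE, INT, LONG, FLOAT, DOUBLE }

    // Returns flags with the numeric-type bits replaced; other bits are untouched.
    static int withNumericType(int flags, NumericType type) {
        return (flags & ~NUMERIC_MASK) | (type.ordinal() << NUMERIC_SHIFT);
    }

    // Reads the numeric type back out of the flags.
    static NumericType numericType(int flags) {
        return NumericType.values()[(flags & NUMERIC_MASK) >>> NUMERIC_SHIFT];
    }

    public static void main(String[] args) {
        int flags = withNumericType(FLAG_BINARY, NumericType.LONG);
        System.out.println(numericType(flags));                 // type recorded at write time
        System.out.println((flags & FLAG_BINARY) != 0);         // existing bit is preserved
        System.out.println(numericType(0));                     // flags without the new bits
    }
}
```

One consequence of a layout like this is that flags written before the numeric bits existed decode as non-numeric, so a reader could keep returning a plain field for them.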
Re: modularization discussion
On May 3, 2011, at 1:29 PM, Shai Erera wrote: I don't like that approach. Two years from now, if indeed your vision becomes the reality (obviously, not everyone thinks like you), what would o.a.solr mean? Who will remember that 'suggest' (just picking an example) came from Solr? Who'd care? Why, when I will integrate several modules together, will I need to see o.a.lucene on some, and o.a.solr on others, when both come from the same distro (even same tar.gz file, e.g. modules)? What makes sense, at least to me, is that either we call everything o.a.lucene and solr becomes o.a.lucene.solr (I know I've probably pissed off some people with that, sorry), or we come up w/ a new namespace (proposed by Grant I think) o.a.lusolr. If we go with the second, then we'll have 3 namespaces: * o.a.lucene for core Lucene stuff (e.g. Lucene core, benchmark?) * o.a.solr for pure/core Solr stuff * o.a.lusolr for shared modules. Honestly, I could go for any of those. I can't bring myself to get caught up caring long term what the package names are. You can't even make rules about that - they won't and shouldn't stand over time. Picking a good package name is important. And deciding to call everything that came from Solr o.a.solr, just to not offend someone, is not the right way to do things, at least IMO. Yeah, it's just not a sustainable idea for an open source project anyway. Mike, I do share with you the vision you outline, and I believe many of us do. It will become a reality if we factor out modules from Solr and Lucene under /modules. It can also become a reality if someone simply contributes under /modules alternative packages for e.g. faceting, suggest, spellcheck etc. If those are good packages, I doubt Solr would be reluctant to adopt them. Either way, it's the community that will dictate the future of itself, and not individuals. Perhaps we should stop discussing what can possibly happen, and start doing things. Actions get more results than endless threads. 
This has been stated on this thread numerous times -- if a contribution is good, well coded, designed, thought of, it will go in, whether it's a refactoring of something or completely new code. I doubt there are people in this community that can stand in the way of it. This is really the crux of it. IMO, people should be much less concerned with how they perceive others, and more concerned with just doing things. The Apache rules are set up to deal with this type of thing. Those rules can get tricky, and nobody likes to fall back on them - but when you have strong disagreement, that is what they are there for. Not everyone on a project has to agree - nor do they have to have pure open source motives. That's just normal and expected. We are a very varied group. The more differences the better IMHO. Just as a reminder - a couple things I see repeatedly at Apache: "community over code" and "merit does not expire". Other than that, the doers do, occasionally we vote, and in general things move along. Shai - Mark Miller lucidimagination.com Lucene/Solr User Conference May 25-26, San Francisco www.lucenerevolution.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028354#comment-13028354 ] Uwe Schindler commented on LUCENE-3065: --- Ideally this could be done with the schema-like approach of one of the GSoC projects? We already discussed about that: We can use the FieldsReader/FieldsWriter type flag (which currently says, binary/text and compressed (unused now)) in the index file format to mark a field as NumericField. In that case, Document.getField() would return the NumericField instance. For Lucene backwards we should still support creating text-only fields. The new binary format would also be compatible with solr, as on getField, Solr would get a NumericField and can decide using instanceof what to do. Old Solr indexes without the NumericField marker flag would return as byte[], in which case, solr would do the decoding. For storing on index side, Solr could move to NumericField completely (I dont like the current approach using NumericTokenStream and to/fromInternal wrappers around conventional Field). NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. 
A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: MergePolicy Thresholds
Thanks Shai and Mike! I'll keep an eye on LUCENE-1076. Tom -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Tuesday, May 03, 2011 11:15 AM To: dev@lucene.apache.org Subject: Re: MergePolicy Thresholds Thanks Shai! I'm way behind on my 3.x backports -- I'll try to do this soon. Mike http://blog.mikemccandless.com On Tue, May 3, 2011 at 8:10 AM, Shai Erera ser...@gmail.com wrote: I uploaded a patch to LUCENE-1076. Tom, apparently the patch I've attached before cannot be used, because there are dependencies (in earlier commits on LUCENE-1076) that need to be back-ported as well. So stay tuned on LUCENE-1076 for when it is safe to use this new MP. Shai On Tue, May 3, 2011 at 1:00 PM, Michael McCandless luc...@mikemccandless.com wrote: That'd be great, thanks :) Yes, let's iterate on the issue! But: it should still be open, I hope (I didn't mean to close it yet, since it's not back ported)... Mike http://blog.mikemccandless.com On Tue, May 3, 2011 at 5:51 AM, Shai Erera ser...@gmail.com wrote: Mike, if you want, I can back-port it, as I've already started this when preparing the patch. I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on 3x too? It'll be a backwards change. Maybe we should iterate on the issue? I can reopen. Shai On Tue, May 3, 2011 at 12:36 PM, Michael McCandless luc...@mikemccandless.com wrote: Looks good Shai! Comments below too: On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote: Hi I looked into porting it to 3x, and prepared the attached patch. It only contains the new TieredMP and Test, as well as the necessary changes to LuceneTestCase and IndexWriter. I guess you can start with it (even just the MP and IW changes) to test it on your indexes. Mike, I saw that there were many more changes, as part of LUCENE-1076, done to the code. In particular, this MP is now the default (on trunk), so I guess many changes (to tests) were needed because of that. 
Do you remember, if apart from the changes I've included in the patch, other important changes w.r.t. this code? The only other changes I can think of were some verbosity improvements to IndexWriter, to support the python script that can make a merge movie from an infoStream output; but that can wait for when I back-port to 3.x... As we won't change the default MP on 3x, I'm guessing I don't need to port all the changes to 3x. Right, I think. Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Created] (LUCENENET-413) Medium trust security issue
Medium trust security issue - Key: LUCENENET-413 URL: https://issues.apache.org/jira/browse/LUCENENET-413 Project: Lucene.Net Issue Type: Improvement Affects Versions: Lucene.Net 2.9.4 Environment: Lucene.Net 2.9.4, Lucene.Net 2.9.4g , .Net 4.0 Reporter: Digy Priority: Minor Fix For: Lucene.Net 2.9.4 On behalf of Richard Wilde: Exceptions in Medium Trust(.NET 4.0) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Updated] (LUCENENET-413) Medium trust security issue
[ https://issues.apache.org/jira/browse/LUCENENET-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-413: --- Attachment: MediumTrust.2.9.4g.patch MediumTrust.2.9.4.patch Medium trust security issue - Key: LUCENENET-413 URL: https://issues.apache.org/jira/browse/LUCENENET-413 Project: Lucene.Net Issue Type: Improvement Affects Versions: Lucene.Net 2.9.4 Environment: Lucene.Net 2.9.4, Lucene.Net 2.9.4g , .Net 4.0 Reporter: Digy Priority: Minor Fix For: Lucene.Net 2.9.4 Attachments: MediumTrust.2.9.4.patch, MediumTrust.2.9.4g.patch On behalf of Richard Wilde: Exceptions in Medium Trust(.NET 4.0) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3065: --- Attachment: LUCENE-3065.patch Patch against 3.x. I moved the to/from byte[] methods from Solr's TrieField into Lucene's NumericUtils, and fixed FieldsWriter/Reader to use free bits in the field's flags to know if the field is Numeric, and which type. I added a random test case to verify we now get the right NumericField back, when we stored NumericField during indexing. Old indices are handled fine (you'll get a String-ified Field back like you did before). Spookily, nothing failed in Solr... I assume there's somewhere in Solr that must now be fixed to handle the fact that a field can come back as NumericField? Anyone know where...? NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
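The patch comment above mentions to/from byte[] methods for numeric values and a random test that checks the stored number comes back intact. The sketch below shows the general shape of such a round trip. It is an assumption-laden illustration: the class and method names are invented, and the fixed-width big-endian layout is chosen for simplicity rather than copied from Solr's TrieField or Lucene's NumericUtils.

```java
import java.util.Random;

// Hypothetical helper: names and byte layout are assumptions, not the patch's code.
final class NumericBytes {

    // Encodes an int as 4 bytes, most significant byte first.
    static byte[] intToBytes(int v) {
        return new byte[] {
            (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
        };
    }

    // Decodes 4 big-endian bytes back into an int.
    static int bytesToInt(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    // Encodes a long as 8 bytes, most significant byte first.
    static byte[] longToBytes(long v) {
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) v;
            v >>>= 8;
        }
        return out;
    }

    // Decodes 8 big-endian bytes back into a long.
    static long bytesToLong(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[i] & 0xFFL);
        }
        return v;
    }

    public static void main(String[] args) {
        // Randomized round-trip check, in the spirit of the random test case.
        Random random = new Random(42);
        for (int i = 0; i < 1000; i++) {
            int x = random.nextInt();
            long y = random.nextLong();
            if (bytesToInt(intToBytes(x)) != x || bytesToLong(longToBytes(y)) != y) {
                throw new AssertionError("round trip failed");
            }
        }
        System.out.println("round trip ok");
    }
}
```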
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028395#comment-13028395 ] Uwe Schindler commented on LUCENE-3065: --- {quote} Spookily, nothing failed in Solr... I assume there's somewhere in Solr that must now be fixed to handle the fact that a field can come back as NumericField? Anyone know where...? {quote} Thats easy to understand: Solr does not use NumericField at all. It produces a NumericTokenStream and indexes it like any other analyzer. The byte[] field is indexed as a separate Field with only store=true and binary. This is what I wanted to say with my last comment. NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2490) PropertiesRequestHandler; encode line.separator
PropertiesRequestHandler; encode line.separator --- Key: SOLR-2490 URL: https://issues.apache.org/jira/browse/SOLR-2490 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Priority: Trivial

Currently, the XML looks like this:

{code}
<!-- .. -->
<str name="java.io.tmpdir">/tmp</str>
<str name="line.separator">
</str>
<str name="java.vm.specification.vendor">Sun Microsystems Inc.</str>
<!-- .. -->
{code}

It would be good to have this instead:

{code}
<!-- .. -->
<str name="java.io.tmpdir">/tmp</str>
<str name="line.separator">\n</str>
<str name="java.vm.specification.vendor">Sun Microsystems Inc.</str>
<!-- .. -->
{code}

Afterwards we will be able to display the line separator that is in use.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
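One way the encoding requested in SOLR-2490 could work is to replace control characters in a property value with their visible escape sequences before the value is written to the response. The following is a minimal sketch under that assumption; SeparatorEscaper and escape are hypothetical names and are not part of Solr's PropertiesRequestHandler.

```java
// Hypothetical helper, not Solr code: makes line separators visible in output.
final class SeparatorEscaper {

    // Replaces newline, carriage return and tab with two-character escapes.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 4);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The JVM's own separator, shown in escaped form.
        System.out.println(escape(System.getProperty("line.separator")));
    }
}
```

With this approach a Unix-style separator would be shown as a backslash followed by n, and a Windows-style one as the two escapes for carriage return and newline in sequence.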
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028398#comment-13028398 ] Michael McCandless commented on LUCENE-3065: {quote} Thats easy to understand: Solr does not use NumericField at all. It produces a NumericTokenStream and indexes it like any other analyzer. The byte[] field is indexed as a separate Field with only store=true and binary. This is what I wanted to say with my last comment. {quote} A, OK. So, not spooky. We should eventually fix that; shouldn't Solr just use NumericField instead of doing this encode/decode itself? Is there some reason...? NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028399#comment-13028399 ] Ryan McKinley commented on LUCENE-3065: --- bq. Is there some reason...? Solr did its own encoding/decoding so that it could store a binary field -- with this patch, that is not necessary anymore.
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028404#comment-13028404 ] Michael McCandless commented on LUCENE-3065: Uwe: I agree, I'll use BytesRef in trunk. Ryan: OK. Should we try to fix that w/ this issue? If so, can you take a crack at it? Thanks. Or, we can postpone... not necessary for this initial cutover.
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028402#comment-13028402 ] Stefan Matheis (steffkes) commented on SOLR-2399: - bq. Thanks for doing all this, Stefan! I'm happy to contribute :) bq. I looked at the Analysis screenshot and found it a bit hard to eyeball quickly because the whole thing feels very pale, which makes it hard for an eye to quickly jump from tokenizer, to token filter, to next token filter, etc. It's also not immediately obvious what left side vs. right side are, so maybe a more visible Index-time Analysis and Query-time Analysis may help. Thanks for the feedback, really appreciated. I tried to focus on the text .. maybe there is too much gray around, yes :/ Maybe a vertical divider (from top to bottom) would help to distinguish index vs. query? More whitespace between both columns perhaps? What was the text you used for analysis? (Just to get a feeling for how your page looks :) Solr Admin Interface, reworked -- Key: SOLR-2399 URL: https://issues.apache.org/jira/browse/SOLR-2399 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Priority: Minor Fix For: 4.0 *The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]] I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], [Query-Form|http://files.mathe.is/solr-admin/02_query.png], [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], [Logging|http://files.mathe.is/solr-admin/07_logging.png], [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png] Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028406#comment-13028406 ] Stefan Matheis (steffkes) commented on SOLR-2399: - {quote}Rather than use: java-properties.jsp can the JS hit: http://localhost:8983/solr/admin/properties{quote} Ha, nice -- [already integrated|https://github.com/steffkes/solr-admin/commit/04af2c51b9f5f364cbbc79d09e42530213c8fb02], dropped the .jsp. But I noticed that line.separator is not 'encoded'; I have already opened a ticket for this: SOLR-2490 {quote}I like the landing dashboard you have, but it would be nice to have a big (optional) link to: http://localhost:8983/solr/browse so that people starting with solr can see solr in action easily{quote} Hmm, would it be useful to have another admin-extra.html file for the global Dashboard as well, not only at core level? We could point to the Velocity-Thingy by default, and everybody would be able to extend this for their own needs.
[jira] [Created] (SOLR-2491) spellcheck.maxCollationTries breaks when using FieldCollapsing
spellcheck.maxCollationTries breaks when using FieldCollapsing -- Key: SOLR-2491 URL: https://issues.apache.org/jira/browse/SOLR-2491 Project: Solr Issue Type: Bug Components: spellchecker Affects Versions: 4.0 Reporter: James Dyer Priority: Minor Fix For: 4.0 If specifying spellcheck.maxCollationTries and group=true on the same query, you never get any Spell Check Collations back. The problem is that SpellCheckCollator relies on ResponseBuilder.getToLog().get("hits") to see how many results each test query returns. When group=true, the toLog isn't populated, so SpellCheckCollator is unable to find a collation that can return results.
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028408#comment-13028408 ] Ryan McKinley commented on LUCENE-3065: --- bq. If so, can you take a crack at it? Thanks. Or, we can postpone... not necessary for this initial cutover. I'll take a crack at it... but I don't think it's necessary in the first pass.
[jira] [Updated] (SOLR-2491) spellcheck.maxCollationTries breaks when using FieldCollapsing
[ https://issues.apache.org/jira/browse/SOLR-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2491: - Attachment: SOLR-2491.patch This patch fixes the problem and includes a unit test. It simply removes the group parameter from any test queries prior to running them. Note that the # of hits for each collation returned will always be the # of _ungrouped_ hits. This is consistent with the fact that FieldCollapsing is unable to tell us the number of grouped hits. It is a bit disturbing to me how brittle getting the # of hits back via toLog has proven to be. If someone can point to a less breakable way to do this, it would be appreciated.
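The approach described in this update (drop the group parameter from each collation test query before running it) can be sketched as follows. This is an illustration only, not the attached patch: it uses a plain Map in place of Solr's parameter classes, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the request parameters of a collation test query.
final class CollationTestQuery {

    /** Returns a copy of the parameters with the "group" key removed; the input is left untouched. */
    static Map<String, String> withoutGroup(Map<String, String> params) {
        Map<String, String> copy = new LinkedHashMap<String, String>(params);
        copy.remove("group"); // the test query then runs ungrouped, so a hit count is available
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "teh");
        params.put("group", "true");
        params.put("spellcheck.maxCollationTries", "5");
        System.out.println(withoutGroup(params)); // same parameters, minus "group"
    }
}
```

Working on a copy keeps the user's original request grouped; only the internal test queries are changed, which matches the note above that the reported hit counts are ungrouped.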
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028412#comment-13028412 ] Yonik Seeley commented on LUCENE-3065: -- bq. I'll take a crack at it... but I don't think its necessary in the first pass Should we try to accept both (binary or numeric field coming back) so this isn't a needless index format break, or is there another lucene index format break in the cards soon anyway?
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028416#comment-13028416 ] Uwe Schindler commented on LUCENE-3065: --- Mike: One thing about the bitmask and the 4 values. There is also an issue open to extend NumericField by byte and short. Maybe we should reserve 3 bits instead of 2 for the numeric field type - so 0x70 instead of 0x30 as mask? I just want to reserve this one extra bit, so we don't need to do any dumb masks and values later, if we extend. About the index format change: As described above, for Solr it's not a problem. New fields are always indexed using NumericField. On the query side, when Document.getField is called, it could simply check the return value with instanceof. If the getter does not return a NumericField, Solr knows that it's binary and can decode manually. This would keep backwards compatibility. Else it's no break at all if we support both stored field formats during indexing somehow (in Lucene it's string, returning a String Field or new binary NumericField). The index format itself does not change generally (no need to bump version numbers, as we only use unused bits?)
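To make the 0x30 versus 0x70 suggestion concrete, here is a small sketch of packing a numeric type code into spare flag bits. Only the two mask values come from the comment above; the shift, the type codes and all names are assumptions for illustration and are not the constants used in the patch.

```java
// Hypothetical flag layout: the low four bits are left for other per-field flags,
// bits 4..6 (mask 0x70) carry the numeric type.
final class NumericTypeBits {
    static final int NUMERIC_SHIFT = 4;
    static final int NUMERIC_MASK = 0x70; // 0111 0000: three bits, type codes 0..7
    // A two-bit mask (0x30, 0011 0000) would only allow type codes 0..3.

    static final int NOT_NUMERIC = 0;
    static final int INT = 1;
    static final int LONG = 2;
    static final int FLOAT = 3;
    static final int DOUBLE = 4; // codes 5..7 remain free, e.g. for byte and short

    /** Stores a type code in the numeric bits, leaving every other bit unchanged. */
    static int withType(int flags, int type) {
        if (type < 0 || type > 7) {
            throw new IllegalArgumentException("type code out of range: " + type);
        }
        return (flags & ~NUMERIC_MASK) | (type << NUMERIC_SHIFT);
    }

    /** Reads the type code back; 0 means the field is not numeric. */
    static int typeOf(int flags) {
        return (flags & NUMERIC_MASK) >>> NUMERIC_SHIFT;
    }

    public static void main(String[] args) {
        int flags = withType(0x03, LONG);
        System.out.println(Integer.toHexString(flags) + " -> type " + typeOf(flags));
    }
}
```

Under this layout, a flags byte written before the change has zeros in the numeric bits and therefore decodes as NOT_NUMERIC, which is the sense in which only unused bits are consumed.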
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028417#comment-13028417 ] Otis Gospodnetic commented on SOLR-2399: Stefan - I only looked at the screenshot you provided, and now I can't find the link; I thought it was in this issue.
[jira] [Updated] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.
[ https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Nistor updated LUCENE-3066: -- Attachment: Test.java
[jira] [Created] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.
NullPointerException when calling sizeInBytes and setHasVectors concurrently. - Key: LUCENE-3066 URL: https://issues.apache.org/jira/browse/LUCENE-3066 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java 1.6.0_24 Ubuntu 10.10 Reporter: Adrian Nistor Attachments: Test.java Hi, I am encountering a NullPointerException when using org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in revision 1099085 (May 3rd 2011). The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in parallel. When these methods are called sequentially, they do not throw any exception. I have attached a test that exposes this problem. If you set ExposeBug = true, the methods are called concurrently and you get the NullPointerException. If you set ExposeBug = false, the methods are called sequentially, and there is no exception. Note that, in the sequential version, the methods are called many times (just like in the parallel version), and in different orders (just like in the parallel version). The concurrent test (ExposeBug = true) always throws NullPointerException under heavy load (ManyIterations = 1). Under small load (e.g., if you set ManyIterations = 10), the NullPointerException will not manifest. I suppose you need a certain thread interleaving for the NullPointerException to happen, and thus you need the heavy load. Is this a bug? Is there a patch for it? Thanks! Adrian
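The report does not say where inside SegmentInfo the exception arises. One pattern that commonly produces an interleaving-dependent NullPointerException of this kind is a lazily built cache that another thread can reset between the null check and the use. The sketch below illustrates that pattern with a hypothetical class; it is an assumption for discussion, not SegmentInfo's code and not a confirmed diagnosis.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class: a lazily computed list that another thread may invalidate.
final class CachedFileList {
    private volatile List<String> files; // null means "not computed yet"

    private List<String> compute() {
        return Arrays.asList("_0.fdt", "_0.fdx");
    }

    /** Stand-in for a setter that resets the cache; may run on another thread. */
    void invalidate() {
        files = null;
    }

    /** Racy: reads the field twice, so invalidate() can null it between the check and the loop. */
    int totalNameLengthRacy() {
        if (files == null) {
            files = compute();
        }
        int total = 0;
        for (String f : files) { // NullPointerException if files was reset in between
            total += f.length();
        }
        return total;
    }

    /** Safer: works on a local snapshot that no other thread can reset. */
    int totalNameLengthSafe() {
        List<String> snapshot = files;
        if (snapshot == null) {
            snapshot = compute();
            files = snapshot;
        }
        int total = 0;
        for (String f : snapshot) {
            total += f.length();
        }
        return total;
    }

    public static void main(String[] args) {
        CachedFileList c = new CachedFileList();
        System.out.println(c.totalNameLengthSafe());
    }
}
```

In a sequential run both methods behave identically, which would be consistent with the observation that the exception only shows up when the calls overlap.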
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028419#comment-13028419 ] Stefan Matheis (steffkes) commented on SOLR-2399: - bq. Stefan - I only looked at the screenshot you provided. and now I can't find the link, I thought it was in this issue. Ah okay, I thought you had already played around with it. Here is the screen again: http://files.mathe.is/solr-admin/04_analysis.png
[jira] [Created] (SOLR-2492) DIH does not commit if only Deletes are processed
DIH does not commit if only Deletes are processed - Key: SOLR-2492 URL: https://issues.apache.org/jira/browse/SOLR-2492 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 3.1, 1.4.1, 4.0 Reporter: James Dyer Priority: Minor Fix For: 3.2, 4.0 If a DIH run processes deletes using the $deleteDocById and/or $deleteDocByQuery special commands, and if no adds or updates get processed in the same run, then commit is never called. Also, the # of deleted documents does not get incremented.
[jira] [Updated] (SOLR-2492) DIH does not commit if only Deletes are processed
[ https://issues.apache.org/jira/browse/SOLR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2492: - Attachment: SOLR-2492.patch This patch increments the # deleted documents once for each call to $deleteDocById and/or $deleteDocByQuery. Note that it would be even better (especially with ..byQuery) to get the actual # of deleted documents and increment by that many. By incrementing the # deleted documents, commit is called at the end of the run as expected. This fixes the issue of commit not being called and also causes the # of deleted documents to be reported back to the user. While this is better than current behavior, the actual # of reported deletions may not be accurate because a call to $deleteDocById may not actually delete a document. Likewise a call to $deleteDocByQuery could delete more than 1 document (or none). A unit test is provided.
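The behaviour described in this update can be modelled with a small counter sketch. How DataImportHandler tracks its statistics internally is not shown in this thread, so the class, fields and commit rule below are assumptions chosen only to illustrate the idea of counting one deletion per delete command.

```java
// Hypothetical model of per-run import statistics; not DataImportHandler's classes.
final class ImportRunStats {
    long docsProcessed; // adds and updates seen in this run
    long docsDeleted;   // one per delete command, which may over- or under-count real deletions

    void onAddOrUpdate() {
        docsProcessed++;
    }

    /** Called once per $deleteDocById or $deleteDocByQuery command. */
    void onDeleteCommand() {
        docsDeleted++;
    }

    /** Assumed rule: commit at the end of the run if anything at all was changed. */
    boolean shouldCommit() {
        return docsProcessed > 0 || docsDeleted > 0;
    }

    public static void main(String[] args) {
        ImportRunStats stats = new ImportRunStats();
        stats.onDeleteCommand(); // a run that contains only a delete
        System.out.println("commit? " + stats.shouldCommit() + ", deleted=" + stats.docsDeleted);
    }
}
```

As the update itself notes, counting commands rather than documents is approximate: a delete by id may match nothing, and a delete by query may match many documents.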
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028429#comment-13028429 ] Otis Gospodnetic commented on SOLR-2399: Right, so if you look at the names of tokenizers and filters there, they are super light, almost like the background. I think making them stand out more would be better - darker font, bolder...
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028439#comment-13028439 ] Stefan Matheis (steffkes) commented on SOLR-2399: - bq. Right, so if you look at the names of tokenizers and filters there, they are super light, almost like the background. I think making them stand out more would be better - darker font, bolder... Correct - I just tried it on another monitor .. it depends heavily on the settings, brightness/contrast -- I will put that screen back on the todo-list and try a few things. Will let you know when it's done. Any additional thoughts, Otis? It does not matter whether they relate to existing screens or to other features!
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028454#comment-13028454 ] Uwe Schindler commented on LUCENE-3065: --- There is still a problem - first the good news: - If user calls Document.get(field), the returned string is as before, so there is no break at all. The reason is the implementation of NumericField.stringValue(), it returns what the user is used to from 3.0 - If a user calls getFieldable(field) all is fine, too. The only change is that it not could return NumericField. If the user simply calls stringValue() all is identical to 3.0 Problems start with: - If user calls Document.getField(name) it returns Field (internally it casts the getFieldable()) result to Field. But NumericField does not subclass Field - ClassCastException. How to handle this? - Maybe change those methods to return AbstractField, but thats a binary break and users will complain, because not everything works as expected - Make NumericField subclass Field (and Field is unfinalized) - thats a bad idea, because Field has too many methods / members that are out of scope - Deprecate Document.getField() and make it internally do an instanceof check, if it gets NumericField transform to a backwards-compatible Field? - This method is already broken. If you request Lazy field loading it also throws ClassCastEx (e.g. LUCENE-609). Not sure how to proceed. Else the patch looks fine. 
I think simply ignoring LazyField loading is fine, as numeric fields are a maximum of 8 bytes Else we would need LazyNumericField :( NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
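To make the Document.getField() problem from this comment concrete, here is a minimal sketch. It assumes a stand-in hierarchy that only mimics the shape of Fieldable / AbstractField / Field / NumericField; none of these are the real Lucene classes, and getFieldUnchecked / getFieldCompat are hypothetical names. It shows the blind cast failing and the "instanceof check plus conversion" option discussed above.

```java
public class GetFieldSketch {

    // Stand-ins only: these mimic the shape of the Lucene 3.x hierarchy,
    // they are NOT the real classes.
    interface Fieldable {
        String stringValue();
    }

    static abstract class AbstractField implements Fieldable {
    }

    static final class Field extends AbstractField {
        private final String value;

        Field(String value) {
            this.value = value;
        }

        public String stringValue() {
            return value;
        }
    }

    static final class NumericField extends AbstractField {
        private final long value;

        NumericField(long value) {
            this.value = value;
        }

        public String stringValue() {
            return Long.toString(value);
        }
    }

    // Mirrors the behaviour described in the comment: a blind cast.
    static Field getFieldUnchecked(Fieldable f) {
        return (Field) f; // throws ClassCastException for a NumericField
    }

    // One of the options floated above (hypothetical): check the type and
    // hand back a backwards-compatible, string-valued Field.
    static Field getFieldCompat(Fieldable f) {
        if (f instanceof NumericField) {
            return new Field(f.stringValue());
        }
        return (Field) f;
    }

    public static void main(String[] args) {
        Fieldable stored = new NumericField(42L);
        System.out.println(getFieldCompat(stored).stringValue());
        try {
            getFieldUnchecked(stored);
        } catch (ClassCastException e) {
            System.out.println("blind cast failed: ClassCastException");
        }
    }
}
```

The conversion keeps old callers working but loses the numeric type again, which is presumably why the comment treats it as a compatibility shim rather than a fix.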
[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Matheis (steffkes) updated SOLR-2399: Attachment: SOLR-2399.patch Solr Admin Interface, reworked -- Key: SOLR-2399 URL: https://issues.apache.org/jira/browse/SOLR-2399 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Priority: Minor Fix For: 4.0 Attachments: SOLR-2399.patch *The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]] I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], [Query-Form|http://files.mathe.is/solr-admin/02_query.png], [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], [Logging|http://files.mathe.is/solr-admin/07_logging.png], [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png] Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028454#comment-13028454 ] Uwe Schindler edited comment on LUCENE-3065 at 5/3/11 10:01 PM: There is still a problem - first the good news: - If user calls Document.get(field), the returned string is as before, so there is no break at all. The reason is the implementation of NumericField.stringValue(), it returns what the user is used to from 3.0 - If a user calls getFieldable(field) all is fine, too. The only change is that it could return NumericField now. If the user simply calls stringValue() all is identical to 3.0 Problems start with: - If user calls Document.getField(name) it returns Field (internally it casts the getFieldable()) result to Field. But NumericField does not subclass Field - ClassCastException. How to handle this? - Maybe change those methods to return AbstractField, but thats a binary break and users will complain, because not everything works as expected - Make NumericField subclass Field (and Field is unfinalized) - thats a bad idea, because Field has too many methods / members that are out of scope - Deprecate Document.getField() and make it internally do an instanceof check, if it gets NumericField transform to a backwards-compatible Field? - This method is already broken. If you request Lazy field loading it also throws ClassCastEx (e.g. LUCENE-609). Not sure how to proceed. Else the patch looks fine. I think simply ignoring LazyField loading is fine, as numeric fields are a maximum of 8 bytes Else we would need LazyNumericField :( was (Author: thetaphi): There is still a problem - first the good news: - If user calls Document.get(field), the returned string is as before, so there is no break at all. The reason is the implementation of NumericField.stringValue(), it returns what the user is used to from 3.0 - If a user calls getFieldable(field) all is fine, too. 
The only change is that it not could return NumericField. If the user simply calls stringValue() all is identical to 3.0 Problems start with: - If user calls Document.getField(name) it returns Field (internally it casts the getFieldable()) result to Field. But NumericField does not subclass Field - ClassCastException. How to handle this? - Maybe change those methods to return AbstractField, but thats a binary break and users will complain, because not everything works as expected - Make NumericField subclass Field (and Field is unfinalized) - thats a bad idea, because Field has too many methods / members that are out of scope - Deprecate Document.getField() and make it internally do an instanceof check, if it gets NumericField transform to a backwards-compatible Field? - This method is already broken. If you request Lazy field loading it also throws ClassCastEx (e.g. LUCENE-609). Not sure how to proceed. Else the patch looks fine. I think simply ignoring LazyField loading is fine, as numeric fields are a maximum of 8 bytes Else we would need LazyNumericField :( NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. 
A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document.
[jira] [Commented] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.
[ https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028480#comment-13028480 ] Michael McCandless commented on LUCENE-3066: SegmentInfo is not actually thread safe; access to it inside Lucene is supposed to be guarded by IndexWriter's monitor lock. That said, this issue looks a lot like LUCENE-3051 -- is that where/how you hit a problem here? Or something else...? NullPointerException when calling sizeInBytes and setHasVectors concurrently. - Key: LUCENE-3066 URL: https://issues.apache.org/jira/browse/LUCENE-3066 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java 1.6.0_24 Ubuntu 10.10 Reporter: Adrian Nistor Attachments: Test.java Hi, I am encountering a NullPointerException when using org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in revision 1099085 (May 3rd 2011). The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in parallel. When these methods are called sequentially, they do not throw any exception. I have attached a test that exposes this problem. If you set ExposeBug = true, the methods are called concurrently and you get the NullPointerException. If you set ExposeBug = false, the methods are called sequentially, and there is no exception. Note that, in the sequential version, the methods are called many times (just like in the parallel version), and in different orders (just like in the parallel version). The concurrent test (ExposeBug = true) always throws NullPointerException under heavy load (ManyIterations = 1). Under small load (e.g., if you set ManyIterations = 10), the NullPointerException will not manifest. I suppose you need a certain thread interleaving for the NullPointerException to happen, and thus you need the heavy load. Is this a bug? Is there a patch for it? Thanks!
Adrian
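As an illustration of the "guarded by the owner's monitor lock" point in the reply above, here is a small sketch. Info and Owner are hypothetical stand-ins, not SegmentInfo or IndexWriter: Info keeps a lazily built cache that a setter resets to null, which is the kind of state that can produce a NullPointerException under an unlucky interleaving, and Owner serialises every access through its own monitor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedInfoSketch {

    // Hypothetical stand-in for a metadata holder that is not thread safe.
    static final class Info {
        private List<String> files; // lazily built cache, reset by setters
        private boolean hasVectors;

        long sizeInBytes() {
            if (files == null) {
                files = new ArrayList<String>();
                files.add("segment.data");
                if (hasVectors) {
                    files.add("segment.vectors");
                }
            }
            long sum = 0;
            // Unguarded, another thread could null out `files` right here.
            for (String f : files) {
                sum += f.length();
            }
            return sum;
        }

        void setHasVectors(boolean v) {
            hasVectors = v;
            files = null; // invalidate the cache
        }
    }

    // Plays the role of the owning writer: all access goes through its monitor.
    static final class Owner {
        private final Info info = new Info();

        synchronized long size() {
            return info.sizeInBytes();
        }

        synchronized void setHasVectors(boolean v) {
            info.setHasVectors(v);
        }
    }

    // Runs one reader and one writer thread against a single Owner and
    // returns the number of runtime exceptions observed.
    static int runGuarded(final int iterations) {
        final Owner owner = new Owner();
        final AtomicInteger failures = new AtomicInteger();
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                try {
                    owner.size();
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            }
        });
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                try {
                    owner.setHasVectors(i % 2 == 0);
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            }
        });
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("failures with guarded access: " + runGuarded(100000));
    }
}
```

Calling Info directly from both threads, without going through Owner, removes the guarantee; whether an exception then actually shows up depends on the interleaving, as the report describes.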
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028483#comment-13028483 ] Michael McCandless commented on LUCENE-3065: Ugh! Field/Fieldable/AbstractField strikes again hmm not sure what to do. NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
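The idea in the issue description (use a spare bit in the per-field flags to mark a stored value as numeric, then store it in binary) can be sketched as follows. This is only an illustration under assumptions: the flag value, the byte layout and the method names are invented here and are not Lucene's stored-fields format or Solr's encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StoredNumericSketch {

    // Hypothetical flag bit; the real stored-fields format defines its own.
    static final int FLAG_NUMERIC_LONG = 0x08;

    // One flags byte followed by the 8-byte binary value.
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(1 + 8)
                .put((byte) FLAG_NUMERIC_LONG)
                .putLong(v)
                .array();
    }

    // One flags byte (no numeric bit), a length, then UTF-8 bytes.
    static byte[] encodeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + 4 + utf8.length)
                .put((byte) 0)
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    // The reader inspects the flag, so a number comes back as a number
    // instead of silently turning into a string.
    static Object decode(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        int flags = in.get();
        if ((flags & FLAG_NUMERIC_LONG) != 0) {
            return in.getLong();
        }
        byte[] utf8 = new byte[in.getInt()];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Object n = decode(encodeLong(1234L));
        Object s = decode(encodeString("1234"));
        System.out.println(n.getClass().getSimpleName() + " " + n);
        System.out.println(s.getClass().getSimpleName() + " " + s);
    }
}
```

With the flag recorded at write time, the type survives the round trip, which is the side effect the description is after.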
[jira] [Closed] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.
[ https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Nistor closed LUCENE-3066. - Resolution: Not A Problem NullPointerException when calling sizeInBytes and setHasVectors concurrently. - Key: LUCENE-3066 URL: https://issues.apache.org/jira/browse/LUCENE-3066 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java 1.6.0_24 Ubuntu 10.10 Reporter: Adrian Nistor Attachments: Test.java Hi, I am encountering a NullPointerException when using org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in revision 1099085 (May 3rd 2011). The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in parallel. When these methods are called sequentially, they do not throw any exception. I have attached a test that exposes this problem. If you set ExposeBug = true, the methods are called concurrently and you get the NullPointerException. If you set ExposeBug = false, the methods are called sequentially, and there is no exception. Note that, in the sequential version, the methods are called many times (just like in the parallel version), and in different orders (just like in the parallel version). The concurrent test (ExposeBug = true) always throws NullPointerException under heavy load (ManyIterations = 1). Under small load (e.g., if you set ManyIterations = 10), the NullPointerException will not manifest. I suppose you need a certain thread interleaving for the NullPointerException to happen, and thus you need the heavy load. Is this a bug? Is there a patch for it? Thanks! Adrian -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.
[ https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028496#comment-13028496 ] Adrian Nistor commented on LUCENE-3066: ---

Hi Michael,

Thanks for the super fast reply!

bq. SegmentInfo is not actually thread safe; access to it inside Lucene is supposed to be guarded by IndexWriter's monitor lock.

Ah, very sorry, I did not realize this.

bq. That said, this issue looks a lot like LUCENE-3051 - is that where/how you hit a problem here? Or something else...?

No, totally unrelated. I am testing a tool for testing concurrent code. I assumed that SegmentInfo is supposed to be thread safe and thus a good candidate for testing.

Thanks again for your reply, and very sorry for the trouble!

Thanks!

Adrian

NullPointerException when calling sizeInBytes and setHasVectors concurrently. - Key: LUCENE-3066 URL: https://issues.apache.org/jira/browse/LUCENE-3066 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java 1.6.0_24 Ubuntu 10.10 Reporter: Adrian Nistor Attachments: Test.java Hi, I am encountering a NullPointerException when using org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in revision 1099085 (May 3rd 2011). The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in parallel. When these methods are called sequentially, they do not throw any exception. I have attached a test that exposes this problem. If you set ExposeBug = true, the methods are called concurrently and you get the NullPointerException. If you set ExposeBug = false, the methods are called sequentially, and there is no exception. Note that, in the sequential version, the methods are called many times (just like in the parallel version), and in different orders (just like in the parallel version).
The concurrent test (ExposeBug = true) always throws NullPointerException under heavy load (ManyIterations = 1). Under small load (e.g., if you set ManyIterations = 10), the NullPointerException will not manifest. I suppose you need a certain thread interleaving for the NullPointerException to happen, and thus you need the heavy load. Is this a bug? Is there a patch for it? Thanks! Adrian
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028526#comment-13028526 ] Chris Male commented on LUCENE-3065: The Field/Fieldable/AbstractField problem is what I've been addressing in LUCENE-2310. There I took the step of making NumericField extend Field, with a series of unsupported fields. This seemed easiest to do particularly with FieldType in mind. I then deprecated all the Fieldable methods in Document. NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028540#comment-13028540 ] Yonik Seeley commented on LUCENE-3065: -- bq. I then deprecated all the Fieldable methods in Document. Hmmm, I thought Fieldable was a step forward. The Field class is the worst of the bunch! NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3067) Lucene test cases do not properly close input and output instances
Lucene test cases do not properly close input and output instances -- Key: LUCENE-3067 URL: https://issues.apache.org/jira/browse/LUCENE-3067 Project: Lucene - Java Issue Type: Bug Components: Build, Tests Affects Versions: 3.1, 4.0 Reporter: Robert Ragno Priority: Minor The Lucene tests do not take care to close all file handles. Unless I am missing something, every single instance of Directory, IndexReader, IndexWriter, IndexSearcher, TermPositions, etc. should be wrapped with a try-finally pattern, such that the instance is always closed. Not doing so risks leaving files open, depending on the GC behavior. I believe this causes tests to fail, inconsistently, with a "could not delete" exception. I at least observe this on a fast machine with Windows, where deletion is a little more sensitive to open handles. It seems dangerous and undesirable, anyway (again, unless I am missing something). I don't know of another pattern in Java that would actually be safe. Some of these objects may just happen to be safe to let dangle in the wind, until the GC reaps, but by the contracts that really can't be allowed. The close methods need to be called to release resources. Fixing this *appears* to alleviate the test failures, but it is hard to tell due to the nondeterministic behavior. I am reluctant to make up the whole patch if this is inaccurate - it is somewhat tedious. The classes involved can be instrumented to expose this problem. (In particular, I would imagine that the finalizer should never be reached without the close() methods being previously invoked.)
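The try-finally pattern the report asks for can be sketched like this. Resource is a hypothetical stand-in for anything with a close() method (a Directory, a reader, a writer); the sketch only shows that nested try-finally blocks release resources in reverse order even when the test body throws.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

public class CloseSketch {

    // Records open/close events so the ordering can be inspected.
    static final List<String> log = new ArrayList<String>();

    // Hypothetical stand-in for a resource that holds a file handle.
    static final class Resource implements Closeable {
        private final String name;

        Resource(String name) {
            this.name = name;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Each resource gets its own try-finally, so it is closed no matter
    // how the body exits.
    static void useResources(boolean fail) {
        Resource dir = new Resource("dir");
        try {
            Resource reader = new Resource("reader");
            try {
                if (fail) {
                    throw new IllegalStateException("test body failed");
                }
            } finally {
                reader.close();
            }
        } finally {
            dir.close();
        }
    }

    public static void main(String[] args) {
        try {
            useResources(true);
        } catch (IllegalStateException e) {
            // expected: the body failed, but both resources were closed
        }
        System.out.println(log);
    }
}
```

On Java 7 and later, try-with-resources expresses the same guarantee more compactly; the environment discussed in these threads is Java 1.6, where the explicit form above is the available option.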
[jira] [Commented] (LUCENE-3067) Lucene test cases do not properly close input and output instances
[ https://issues.apache.org/jira/browse/LUCENE-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028545#comment-13028545 ] Robert Muir commented on LUCENE-3067: - Hi Robert, can you provide more information on exactly which tests you are having a problem with? All tests wrap Directory instances via MockDirectoryWrapper. These Directory instances are themselves registered with LuceneTestCase, and the test will fail if you do not close the Directory. Furthermore, you cannot close these Directory instances themselves until open files are closed, as they track open files for this purpose. If you don't close all IndexReaders etc the test will fail. Lucene test cases do not properly close input and output instances -- Key: LUCENE-3067 URL: https://issues.apache.org/jira/browse/LUCENE-3067 Project: Lucene - Java Issue Type: Bug Components: Build, Tests Affects Versions: 3.1, 4.0 Reporter: Robert Ragno Priority: Minor Original Estimate: 4h Remaining Estimate: 4h The Lucene tests do not take care to close all file handles. Unless I am missing something, every single instance of Directory, IndexReader, IndexWriter, IndexSearcher, TermPositions, etc. should be wrapped with a try-finally pattern, such that the instance is always closed. Not doing so risks leaving files open, depending on the GC behavior. I believe this causes tests to fail with a could not delete exception, inconsistently. I at least observe this on a fast machine with Windows, where deletion is a little more sensitive to open handles. It seems dangerous and undesirable, anyway (again, unless I am missing something). I don't know of another pattern in Java that would actually be safe. Some of these objects may just happen to be safe to let dangle in the wind, until the GC reaps, but by the contracts that really can't be allowed. The close methods need to be called to release resources. 
Fixing this *appears* to alleviate the test failures, but it is hard to tell due to the nondeterministic behavior. I am reluctant to make up the whole patch if this is inaccurate - it is somewhat tedious. The classes involved can be instrumented to expose this problem. (In particular, I would imagine that the finalizer should never be reached without the close() methods being previously invoked.)
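The mechanism described in the reply above (a wrapping directory that tracks open files and refuses to close while any remain open) can be sketched roughly as below. TrackingDir and Handle are hypothetical names for illustration; this is not MockDirectoryWrapper's implementation.

```java
import java.util.Set;
import java.util.TreeSet;

public class TrackingDirSketch {

    // Hypothetical wrapper that remembers which files are currently open.
    static final class TrackingDir {
        private final Set<String> openFiles = new TreeSet<String>();

        Handle open(String name) {
            openFiles.add(name);
            return new Handle(this, name);
        }

        // Closing with files still open is reported as a failure, which is
        // how a leaked reader would surface in a test.
        void close() {
            if (!openFiles.isEmpty()) {
                throw new IllegalStateException(
                        "cannot close: still open files: " + openFiles);
            }
        }
    }

    // Hypothetical handle that unregisters itself on close.
    static final class Handle {
        private final TrackingDir dir;
        private final String name;

        Handle(TrackingDir dir, String name) {
            this.dir = dir;
            this.name = name;
        }

        void close() {
            dir.openFiles.remove(name);
        }
    }

    public static void main(String[] args) {
        TrackingDir dir = new TrackingDir();
        Handle h = dir.open("_0.fdt");
        try {
            dir.close();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        h.close();
        dir.close(); // succeeds once every handle is closed
        System.out.println("closed cleanly");
    }
}
```

A wrapper like this turns a leaked handle into a deterministic test failure instead of an intermittent deletion error that depends on GC timing.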
[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028570#comment-13028570 ] Chris Male commented on LUCENE-3065: Yeah there is an element of truth to that except I'm not convinced we need to have such a complicated hierarchy (although I've since been thinking about field definitions coming from different sources, so maybe an interface is best). But yes, Field is a mess and I've been trying to clean that out too. NumericField should be stored in binary format in index (matching Solr's format) Key: LUCENE-3065 URL: https://issues.apache.org/jira/browse/LUCENE-3065 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3065.patch (Spinoff of LUCENE-3001) Today when writing stored fields we don't record that the field was a NumericField, and so at IndexReader time you get back an ordinary Field and your number has turned into a string. See https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972 We have spare bits already in stored fields, so, we should use one to record that the field is numeric, and then encode the numeric field in Solr's more-compact binary format. A nice side-effect is we fix the long standing issue that you don't get a NumericField back when loading your document. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2493) SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.
SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit. - Key: SOLR-2493 URL: https://issues.apache.org/jira/browse/SOLR-2493 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.1 Reporter: Stephane Bailliez Priority: Blocker

I'm filing this as a blocker because I think it is a serious issue that should be addressed asap with a release. With the current code, this is nowhere near suitable for production use.

For each instance created, SolrQueryParser calls

getSchema().getSolrConfig().getLuceneVersion("luceneMatchVersion", Version.LUCENE_24)

instead of using

getSchema().getSolrConfig().luceneMatchVersion

This creates a massive performance hit. For each request, there are generally 3 query parsers created, and each of them parses the XML node in the config, which involves creating an instance of XPath; behind the scenes, the usual factory finder pattern kicks in within the XML parser and does a loadClass.

The stack is typically:

at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
at com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
at com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
at com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
at com.sun.org.apache.xpath.internal.XPathContext.<init>(XPathContext.java:100)
at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
at org.apache.solr.core.Config.getNode(Config.java:230)
at org.apache.solr.core.Config.getVal(Config.java:256)
at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
at org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:76)
at org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)

With the current 3.1 code, I get barely 250 qps with 16 concurrent users on a nearly empty index. After switching SolrQueryParser to use getSchema().getSolrConfig().luceneMatchVersion and running a quick bench test, performance becomes reasonable, beyond 2000 qps.
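The difference between the two access paths in this report can be sketched with plain JDK XML APIs. Config, SlowParser and FastParser below are hypothetical stand-ins, not Solr classes; the sketch only counts how often the XPath lookup runs when each parser instance re-reads the value versus when the value is parsed once and kept in a field.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ConfigCacheSketch {

    // Hypothetical config holder backed by a parsed XML document.
    static final class Config {
        private final Document doc;
        int xpathEvaluations = 0;
        final String luceneMatchVersion; // parsed once, at load time

        Config(String xml) {
            this.doc = parse(xml);
            this.luceneMatchVersion = getVal("/config/luceneMatchVersion");
        }

        private static Document parse(String xml) {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }

        // Every call builds a fresh XPath and evaluates it: the costly path.
        String getVal(String path) {
            xpathEvaluations++;
            try {
                return XPathFactory.newInstance().newXPath().evaluate(path, doc).trim();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Re-reads the config value for every parser instance.
    static final class SlowParser {
        final String version;

        SlowParser(Config c) {
            this.version = c.getVal("/config/luceneMatchVersion");
        }
    }

    // Uses the value that was already parsed when the config was loaded.
    static final class FastParser {
        final String version;

        FastParser(Config c) {
            this.version = c.luceneMatchVersion;
        }
    }

    public static void main(String[] args) {
        Config c = new Config(
                "<config><luceneMatchVersion>LUCENE_31</luceneMatchVersion></config>");
        for (int i = 0; i < 3; i++) {
            new FastParser(c);
        }
        System.out.println("evaluations after 3 fast parsers: " + c.xpathEvaluations);
        for (int i = 0; i < 3; i++) {
            new SlowParser(c);
        }
        System.out.println("evaluations after 3 slow parsers: " + c.xpathEvaluations);
    }
}
```

The sketch says nothing about absolute throughput; it only shows that the cached field keeps the XPath machinery off the per-request path.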
RE: I was accepted in GSoC!!!
Hi Uwe,

Sorry, I only saw your email today; I have been busy lately with college homework. I was planning to submit patches to Lucene (through JIRA/email?). Do you have something else in mind?

Regards,
Vinicius Barros

--- On Sun, 1/5/11, Uwe Schindler u...@thetaphi.de wrote:

From: Uwe Schindler u...@thetaphi.de
Subject: RE: I was accepted in GSoC!!!
To: dev@lucene.apache.org
Date: Sunday, May 1, 2011, 7:36

Welcome Vinicius,

I am glad to hear that you (my mentee) are one of the 5 students that are working for Apache Lucene/Solr this year. Until the coding officially starts, we should also sort out the infrastructure things like where to put the code and make a plan how to start. We should keep in close contact.

Uwe

- Uwe Schindler, H.-H.-Meier-Allee 63, D-28213 Bremen, http://www.thetaphi.de, eMail: u...@thetaphi.de

From: Vinicius Barros [mailto:viniciusbarros.g...@yahoo.com.br]
Sent: Sunday, May 01, 2011 3:18 AM
To: dev@lucene.apache.org; uschind...@apache.org
Subject: I was accepted in GSoC!!!

Hi,

That's great. I am waiting for the next instructions from Google; it seems there is some paperwork to do.

Regards,
Vinicius Barros

--- On Mon, 25/4/11, no-re...@socghop.appspotmail.com no-re...@socghop.appspotmail.com wrote:

From: no-re...@socghop.appspotmail.com no-re...@socghop.appspotmail.com
Subject: Congratulations!
To: viniciusbarros.g...@yahoo.com.br
Date: Monday, April 25, 2011, 15:48

Dear Vinicius,

Congratulations! Your proposal "LUCENE-1768: NumericRange support for new query parser", as submitted to Apache Software Foundation, has been accepted for Google Summer of Code 2011. Over the next few days, we will add you to the private Google Summer of Code Student Discussion List. Over the next few weeks, we will send instructions to this list regarding turning in proof of enrollment, tax forms, etc.
Now that you've been accepted, please take the opportunity to speak with your mentors about plans for the Community Bonding Period: what documentation should you be reading, what version control system will you need to set up, etc., before start of coding begins on May 23rd. Welcome to Google Summer of Code 2011! We look forward to having you with us. With best regards, The Google Summer of Code Program Administration Team