[jira] [Comment Edited] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984076#comment-13984076
 ] 

Uwe Schindler edited comment on SOLR-6022 at 4/29/14 6:52 AM:
--

[~tomasflobbe]: The patch for 4.x has to be applied after the merge from trunk: 
first apply the patch to trunk, merge that to 4.x, and then apply the 4.x patch 
on top (which restores the deprecated method):

bq. The second patch shows exactly how I will do the deprecation in branch_4x 
(I believe I should be able to just apply the patch after doing a merge back 
from trunk).

To test that everything works without merging/committing, apply both patches to 
the same 4.x checkout: the general one first, then the 4.x one.


was (Author: thetaphi):
[~tomasflobbe]: The patch for 4.x has to be applied after merge from trunk, so 
first apply patch to trunk, merge that to 4.x and then apply the 4.x patch 
on-top (which restores the deprecated method):

bq. The second patch shows exactly how I will do the deprecation in branch_4x 
(I believe I should be able to just apply the patch after doing a merge back 
from trunk).

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.
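The branch_4x deprecation described above can be sketched as follows. This is a hypothetical illustration with stand-in types (`String` instead of `Analyzer`, a class name that is not the real Solr one), not the actual patch:

```java
// Hypothetical sketch of the deprecation pattern: the old accessor survives
// on branch_4x but delegates to the renamed method, so existing callers keep
// compiling while seeing a deprecation warning.
public class FieldTypeSketch {
    private final String indexAnalyzer = "index-analyzer"; // stand-in for Analyzer

    /** @deprecated use {@link #getIndexAnalyzer()} instead. */
    @Deprecated
    public String getAnalyzer() {
        return getIndexAnalyzer();
    }

    public String getIndexAnalyzer() {
        return indexAnalyzer;
    }

    public static void main(String[] args) {
        FieldTypeSketch ft = new FieldTypeSketch();
        // Old and new accessors return the same analyzer.
        System.out.println(ft.getAnalyzer().equals(ft.getIndexAnalyzer())); // prints "true"
    }
}
```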



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984076#comment-13984076
 ] 

Uwe Schindler commented on SOLR-6022:
-

[~tomasflobbe]: The patch for 4.x has to be applied after the merge from trunk: 
first apply the patch to trunk, merge that to 4.x, and then apply the 4.x patch 
on top (which restores the deprecated method):

bq. The second patch shows exactly how I will do the deprecation in branch_4x 
(I believe I should be able to just apply the patch after doing a merge back 
from trunk).

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






[jira] [Created] (SOLR-6031) Getting Cannot find symbol while Compiling the java file.

2014-04-28 Thread Vikash Kumar Singh (JIRA)
Vikash Kumar Singh created SOLR-6031:


 Summary: Getting Cannot find symbol while Compiling the java file.
 Key: SOLR-6031
 URL: https://issues.apache.org/jira/browse/SOLR-6031
 Project: Solr
  Issue Type: Task
  Components: clients - java
 Environment: Centos6.5, Solr-4.7.1, java version "1.7.0_51"
Reporter: Vikash Kumar Singh
Priority: Minor


Here is the code I am using, just for testing purposes first on the console:
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;
import java.net.MalformedURLException;

public class SolrJSearcher
{
    public static void main(String[] args) throws MalformedURLException, SolrServerException
    {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery();
        query.setQuery("sony digital camera");
        query.addFilterQuery("cat:electronics", "store:amazon.com");
        query.setFields("id", "price", "merchant", "cat", "store");
        query.setStart(0);
        query.set("defType", "edismax");
        QueryResponse response = solr.query(query);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); ++i)
        {
            System.out.println(results.get(i));
        }
    }
}


I have also set the classpath as

export 
CLASSPATH=/home/vikash/solr-4.7.1/dist/*.jar:/home/vikash/solr-4.7.1/dist/solrj-lib/*.jar


But while compiling I still get the errors below; I don't know what to do now. 
Please help.


[root@localhost vikash]# javac SolrJSearcher.java 
SolrJSearcher.java:1: package org.apache.solr.client.solrj does not exist
import org.apache.solr.client.solrj.SolrServerException;
   ^
SolrJSearcher.java:2: package org.apache.solr.client.solrj.impl does not exist
import org.apache.solr.client.solrj.impl.HttpSolrServer;
^
SolrJSearcher.java:3: package org.apache.solr.client.solrj does not exist
import org.apache.solr.client.solrj.SolrQuery;
   ^
SolrJSearcher.java:4: package org.apache.solr.client.solrj.response does not 
exist
import org.apache.solr.client.solrj.response.QueryResponse;
^
SolrJSearcher.java:5: package org.apache.solr.common does not exist
import org.apache.solr.common.SolrDocumentList;
 ^
SolrJSearcher.java:10: cannot find symbol
symbol  : class SolrServerException
location: class SolrJSearcher
 public static void main(String[] args) throws 
MalformedURLException,SolrServerException
 ^
SolrJSearcher.java:12: cannot find symbol
symbol  : class HttpSolrServer
location: class SolrJSearcher
HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
^
SolrJSearcher.java:12: cannot find symbol
symbol  : class HttpSolrServer
location: class SolrJSearcher
HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
  ^
SolrJSearcher.java:13: cannot find symbol
symbol  : class SolrQuery
location: class SolrJSearcher
SolrQuery query = new SolrQuery();
^
SolrJSearcher.java:13: cannot find symbol
symbol  : class SolrQuery
location: class SolrJSearcher
SolrQuery query = new SolrQuery();
  ^
SolrJSearcher.java:19: cannot find symbol
symbol  : class QueryResponse
location: class SolrJSearcher
QueryResponse response = solr.query(query);
^
SolrJSearcher.java:20: cannot find symbol
symbol  : class SolrDocumentList
location: class SolrJSearcher
SolrDocumentList results = response.getResults();
^
12 errors
[root@localhost vikash]# 
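The likely cause of these errors is the classpath setting rather than the code: the JVM expands a class-path wildcard only when it is the literal token "dir/*"; a pattern like "dir/*.jar" is not expanded at all, so none of the SolrJ jars end up on the classpath. A hedged sketch of a working invocation, using the reporter's paths (adjust to your install):

```shell
# Wildcard entries must end in "/*", not "/*.jar" (JDK class-path rules).
CP="/home/vikash/solr-4.7.1/dist/*:/home/vikash/solr-4.7.1/dist/solrj-lib/*"
javac -cp "$CP" SolrJSearcher.java
java -cp "$CP:." SolrJSearcher
```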







[jira] [Updated] (SOLR-6030) Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm

2014-04-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-6030:


Attachment: SOLR-6030.patch

> Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm
> --
>
> Key: SOLR-6030
> URL: https://issues.apache.org/jira/browse/SOLR-6030
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomás Fernández Löbbe
>Priority: Trivial
> Attachments: SOLR-6030.patch
>
>
> Most of these cases were addressed in SOLR-5734, but it looks like LRUCache 
> was missed. 
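The change the issue proposes can be illustrated with a minimal sketch: time an elapsed interval with the monotonic System.nanoTime() instead of System.currentTimeMillis(), which tracks wall-clock time and can jump when the system clock is adjusted. The warm() body below is a stand-in, not the real LRUCache code:

```java
// Illustrative only: measure elapsed time with System.nanoTime(), which is
// monotonic, rather than currentTimeMillis(), which can go backwards if the
// system clock is adjusted mid-measurement.
public class WarmTimer {
    static void warm() throws InterruptedException {
        Thread.sleep(10); // pretend cache-warming work
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        warm();
        // Differences of nanoTime values are meaningful; absolute values are not.
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("warm() took " + elapsedMs + "ms");
    }
}
```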






[jira] [Created] (SOLR-6030) Use System.nanoTime() instead of currentTimeInMills() in LRUCache.warm

2014-04-28 Thread JIRA
Tomás Fernández Löbbe created SOLR-6030:
---

 Summary: Use System.nanoTime() instead of currentTimeInMills() in 
LRUCache.warm
 Key: SOLR-6030
 URL: https://issues.apache.org/jira/browse/SOLR-6030
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
Priority: Trivial


Most of these cases were addressed in SOLR-5734, but it looks like LRUCache was missed. 






[jira] [Resolved] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein resolved SOLR-6029.
--

   Resolution: Fixed
Fix Version/s: 4.9
   4.8.1

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Fix For: 4.8.1, 4.9
>
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 
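A minimal illustration of the fix, not the actual plugin code: a per-segment lookup that misses returns NO_MORE_DOCS (Integer.MAX_VALUE in Lucene), which must never be used to index an array sized to the segment's maxDoc. Names below are illustrative stand-ins:

```java
// Sketch of the guard: reject both "never existed" (-1) and
// "deleted" (NO_MORE_DOCS sentinel) before indexing the bitset.
public class CollapseGuard {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // matches DocsEnum.NO_MORE_DOCS

    static boolean found(int doc) {
        // Before the fix, the check was only (doc != -1), letting the
        // sentinel through and overrunning the maxDoc-sized array.
        return doc != -1 && doc != NO_MORE_DOCS;
    }

    public static void main(String[] args) {
        int maxDoc = 100;
        boolean[] bits = new boolean[maxDoc];          // stand-in for the fq bitset
        int[] lookups = {5, -1, NO_MORE_DOCS};         // live doc, absent doc, deleted doc
        for (int doc : lookups) {
            if (found(doc)) {
                bits[doc] = true;                      // safe: doc < maxDoc here
            }
        }
        System.out.println(bits[5]); // prints "true"; no AIOOBE for the deleted doc
    }
}
```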






[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984006#comment-13984006
 ] 

ASF subversion and git services commented on SOLR-6029:
---

Commit 1590868 from [~joel.bernstein] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1590868 ]

SOLR-6029: Updated CHANGES.txt

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984004#comment-13984004
 ] 

ASF subversion and git services commented on SOLR-6029:
---

Commit 1590867 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1590867 ]

SOLR-6029: Updated CHANGES.txt

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983999#comment-13983999
 ] 

ASF subversion and git services commented on SOLR-6029:
---

Commit 1590866 from [~joel.bernstein] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1590866 ]

SOLR-6029: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if 
elevated doc has been deleted from a segment

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983992#comment-13983992
 ] 

ASF subversion and git services commented on SOLR-6029:
---

Commit 1590865 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1590865 ]

SOLR-6029: CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if 
elevated doc has been deleted from a segment

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






[jira] [Commented] (SOLR-5963) Finalize interface and backport analytics component to 4x

2014-04-28 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983987#comment-13983987
 ] 

Erick Erickson commented on SOLR-5963:
--

Hmmm, there have been several discussions around this, and the question now is 
whether this should be back-ported or not.

Given that the current stats component doesn't support distributed Solr, one 
suggestion is to move this into a contrib for the time being and then put 
distributed statistics into the main-line code as we can. This may mean there 
are fewer capabilities. If that's acceptable, I'll start working toward that 
goal.

So that would mean:
1> pull this out of trunk
2> put this into a contrib on trunk
3> backport the contrib to 4x.

> Finalize interface and backport analytics component to 4x
> -
>
> Key: SOLR-5963
> URL: https://issues.apache.org/jira/browse/SOLR-5963
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.9, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5963.patch, SOLR-5963.patch
>
>
> Now that we seem to have fixed up the test failures for trunk for the 
> analytics component, we need to solidify the API and back-port it to 4x. For 
> history, see SOLR-5302 and SOLR-5488.
> As far as I know, these are the merges that need to occur to do this (plus 
> any that this JIRA brings up)
> svn merge -c 1543651 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545009 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545053 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545054 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545080 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545143 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545417 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545514 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1545650 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1546074 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1546263 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1559770 https://svn.apache.org/repos/asf/lucene/dev/trunk
> svn merge -c 1583636 https://svn.apache.org/repos/asf/lucene/dev/trunk
> The only remaining thing I think needs to be done is to solidify the 
> interface; see the comments from [~yo...@apache.org] on the two JIRAs 
> mentioned, although SOLR-5488 is the most relevant one.
> [~sbower], [~houstonputman] and [~yo...@apache.org] might be particularly 
> interested here.
> I really want to put this to bed, so if we can get agreement on this soon I 
> can make it march.
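The merge list above can be applied in one pass; a hedged sketch, assuming it is run from a branch_4x working copy (revisions are taken from the issue description; verify each merge before committing):

```shell
# Apply each listed trunk revision to the current 4x checkout, in order.
for r in 1543651 1545009 1545053 1545054 1545080 1545143 1545417 \
         1545514 1545650 1546074 1546263 1559770 1583636; do
  svn merge -c "$r" https://svn.apache.org/repos/asf/lucene/dev/trunk || break
done
```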






[jira] [Commented] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983972#comment-13983972
 ] 

Joel Bernstein commented on SOLR-6029:
--

Thanks Greg, this is a nasty bug.

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






[jira] [Assigned] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-6029:


Assignee: Joel Bernstein

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin misidentifies whether a document is present in a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS. That 
> value would then be set in the fq bitSet array as the doc location, causing 
> an ArrayIndexOutOfBoundsException because the array is only as big as maxDoc. 






Re: Simple unit test doco?

2014-04-28 Thread Tomás Fernández Löbbe
Many times cloud-related/distributed tests fail due to timeouts; this can be
related to the overall load on your computer (probably generated by the tests
themselves). I don't know if this is the correct way, but I have found the
tests are much less likely to fail if I use fewer JVMs to run them (by default
my Mac would use 4, but I set it to 2 if I see failures; you can set the
"tests.jvms" JVM parameter when running ant test).

If you are working on some specific component you can filter which tests to
run in many ways, see “ant test-help”. It may be useful to use
tests.slow=false to skip the slow tests in most of your runs.
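A sketch of the knobs mentioned above, run from a lucene-solr checkout (the test-class name is an example; see "ant test-help" for the full list of filters):

```shell
# Fewer forked JVMs, and skip the slow tests for quicker iteration.
ant test -Dtests.jvms=2 -Dtests.slow=false

# Re-run a single suite with the seed printed by a failing run.
ant test -Dtestcase=DistribCursorPagingTest -Dtests.seed=DEADBEEF
```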

"do I need to turn on a ZK server for integration testing?”
No, you don’t. Solr will start an embedded Zookeeper for the tests.

"I've tried running those tests in isolation via IntelliJ and they all
report as passing”
It is most probably not related to this, but just in case: when you try to
reproduce a failure you saw on a test, make sure to use the same seed
(-Dtests.seed). The seed used should be in the output of the run where you
saw the failure.


   [junit4] Tests with failures:
   [junit4]   - org.apache.solr.hadoop.MorphlineMapperTest (suite)
   [junit4]

Sorry, no idea about this one.


On Mon, Apr 28, 2014 at 7:47 PM, Greg Pendlebury
wrote:

> Heyo,
>
> I'm wondering if there is any additional doco and/or tricks to unit
> testing solr than this wiki page? http://wiki.apache.org/solr/TestingSolr
>
> Some details about my troubles are below if anyone cares to read, but I'm
> not so much looking for specific responses to why individual tests are
> failing. I'm more trying to work out whether I'm on the right track or
> missing some key information... like do I need to turn on a ZK server for
> integration testing?
>
> Or do I need to accept failed unit tests as a baseline before applying our
> patch? I don't typically like that, but this is an enormous test suite and
> I'd be happy just to get a pass up to the same level that 4.7.2 had prior
> to release.
>
> Ta,
> Greg
>
>
> Details
> ==
> I downloaded the tagged 4.7.2 release yesterday to apply a patch our team
> wants to test, but even before touching the codebase at all I cannot get
> the unit tests to pass. I'm struggling to even get consistent results.
>
> The two most useful end points I reach are:
>[junit4] Tests with failures:
>[junit4]   -
> org.apache.solr.cloud.CustomCollectionTest.testDistribSearch
>[junit4]   -
> org.apache.solr.cloud.DistribCursorPagingTest.testDistribSearch
>[junit4]   - org.apache.solr.cloud.DistribCursorPagingTest (suite)
>[junit4]
> ...
>[junit4] Execution time total: 2 hours 6 minutes 50 seconds
>[junit4] Tests summary: 365 suites, 1570 tests, 1 suite-level error, 2
> errors, 187 ignored (12 assumptions)
>
> And another one (don't have the terminal output on hand unfortunately) in
> the cloudera morphline suite. It is the same error as this though and fails
> after around an hour:
> http://mail-archives.apache.org/mod_mbox/flume-dev/201310.mbox/%3ccac6yyrj2cv89hntdeel7t0qlq8zjbwjynbtcveucxlzdmyv...@mail.gmail.com%3E
>
> I've tried running those tests in isolation via IntelliJ and they all
> report as passing... the logs show exceptions about ZK session expiry for
> some (not all) but I assume those are trapped expected exceptions since
> JUnit is passing them?
>
> Given the response in the message I linked just above re: windows support
> I tried shifting the build up to a RHEL6 server this morning but I've tried
> two runs now and both failed with this odd error:
>[junit4] Tests with failures:
>[junit4]   - org.apache.solr.hadoop.MorphlineMapperTest (suite)
>[junit4]
> ...
>[junit4] Execution time total: 42 seconds
>[junit4] Tests summary: 7 suites, 35 tests, 2 suite-level errors, 5
> ignored
>
> I only say odd because they run for half an hour and then report 42
> seconds.
>
> Thanks again if you've read all this.
>


Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf

2014-04-28 Thread Grant Ingersoll
+1

On Apr 25, 2014, at 5:38 PM, Chris Hostetter  wrote:

> 
> (Note: cross posted to general, please confine replies to dev@lucene)
> 
> Please VOTE to release the following RC1 as apache-solr-ref-guide-4.8.pdf ...
> 
> https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.8-RC1
> 
> 
> The notes I previously mentioned regarding RC0 apply to this RC as well...
> 
> 1) Due to a known bug in Confluence, the PDFs it generates are much bigger 
> than they should be.  This bug has been fixed in the latest version of 
> Confluence, but cwiki.apache.org has not yet been updated.  For that reason, 
> I have manually run a small tool against the PDF to "fix" the size (see 
> SOLR-5819).  The first time I tried this approach, it inadvertently removed 
> the "Index" (aka: Table of Contents, or Bookmarks, depending on what PDF 
> reader client you use).  I've already fixed this, but if you notice anything 
> else unusual about this PDF compared to previous versions please speak up so 
> we can see if it's a result of this post-processing and try to fix it.
> 
> 2) This is the first ref guide release where we've started using a special 
> confluence macro for any lucene/solr javadoc links.  The up side is that all 
> javadoc links in this 4.8 ref guide will now correctly point to the 4.8 
> javadocs on lucene.apache.org -- the down side is that this means none of 
> those links currently work, since the 4.8 code release is still ongoing and 
> the website has not yet been updated.
> 
> Because of #2, I intend to leave this ref guide vote open until the 4.8 code 
> release is final - that way we won't officially be releasing this doc until 
> the 4.8 javadocs are uploaded and all the links work properly.
> 
> 
> 
> -Hoss
> http://www.lucidworks.com/


Grant Ingersoll | @gsingers
http://www.lucidworks.com







[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-04-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983928#comment-13983928
 ] 

Noble Paul commented on SOLR-5473:
--

I'm almost there. Probably today or, in the worst case, tomorrow.

> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
> ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node






Simple unit test doco?

2014-04-28 Thread Greg Pendlebury
Heyo,

I'm wondering if there is any additional doco and/or tricks to unit testing
solr than this wiki page? http://wiki.apache.org/solr/TestingSolr

Some details about my troubles are below if anyone cares to read, but I'm
not so much looking for specific responses to why individual tests are
failing. I'm more trying to work out whether I'm on the right track or
missing some key information... like do I need to turn on a ZK server for
integration testing?

Or do I need to accept failed unit tests as a baseline before applying our
patch? I don't typically like that, but this is an enormous test suite and
I'd be happy just to get a pass up to the same level that 4.7.2 had prior
to release.

Ta,
Greg


Details
==
I downloaded the tagged 4.7.2 release yesterday to apply a patch our team
wants to test, but even before touching the codebase at all I cannot get
the unit tests to pass. I'm struggling to even get consistent results.

The two most useful end points I reach are:
   [junit4] Tests with failures:
   [junit4]   - org.apache.solr.cloud.CustomCollectionTest.testDistribSearch
   [junit4]   -
org.apache.solr.cloud.DistribCursorPagingTest.testDistribSearch
   [junit4]   - org.apache.solr.cloud.DistribCursorPagingTest (suite)
   [junit4]
...
   [junit4] Execution time total: 2 hours 6 minutes 50 seconds
   [junit4] Tests summary: 365 suites, 1570 tests, 1 suite-level error, 2
errors, 187 ignored (12 assumptions)

And another one (don't have the terminal output on hand unfortunately) in
the cloudera morphline suite. It is the same error as this though and fails
after around an hour:
http://mail-archives.apache.org/mod_mbox/flume-dev/201310.mbox/%3ccac6yyrj2cv89hntdeel7t0qlq8zjbwjynbtcveucxlzdmyv...@mail.gmail.com%3E

I've tried running those tests in isolation via IntelliJ and they all
report as passing... the logs show exceptions about ZK session expiry for
some (not all) but I assume those are trapped expected exceptions since
JUnit is passing them?

Given the response in the message I linked just above re: windows support I
tried shifting the build up to a RHEL6 server this morning but I've tried
two runs now and both failed with this odd error:
   [junit4] Tests with failures:
   [junit4]   - org.apache.solr.hadoop.MorphlineMapperTest (suite)
   [junit4]
...
   [junit4] Execution time total: 42 seconds
   [junit4] Tests summary: 7 suites, 35 tests, 2 suite-level errors, 5
ignored

I only say odd because they run for half an hour and then report 42 seconds.

Thanks again if you've read all this.


Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf

2014-04-28 Thread Anshum Gupta
+1


On Fri, Apr 25, 2014 at 2:38 PM, Chris Hostetter
wrote:

>
> (Note: cross posted to general, please confine replies to dev@lucene)
>
> Please VOTE to release the following RC1 as apache-solr-ref-guide-4.8.pdf
> ...
>
> https://dist.apache.org/repos/dist/dev/lucene/solr/ref-
> guide/apache-solr-ref-guide-4.8-RC1
>
>
> The notes I previously mentioned regarding RC0 apply to this RC as well...
>
> 1) Due to a known bug in confluence, the PDFs it generates are much bigger
> than they should be.  This bug has been fixed in the latest version of
> confluence, but cwiki.apache.org has not yet been updated.  For that
> reason, I have manually run a small tool against the PDF to "fix" the size
> (see SOLR-5819).  The first time I tried this approach, it inadvertently
> removed the "Index" (aka: Table of Contents, or Bookmarks depending on what
> PDF reader client you use).  I've already fixed this, but if you notice
> anything else unusual about this PDF compared to previous versions please
> speak up so we can see if it's a result of this post-processing and try to
> fix it.
>
> 2) This is the first ref guide release where we've started using a special
> confluence macro for any lucene/solr javadoc links.  The up side is that
> all javadoc links in this 4.8 ref guide will now correctly point to the 4.8
> javadocs on lucene.apache.org -- the down side is that this means none of
> those links currently work, since the 4.8 code release is still ongoing and
> the website has not yet been updated.
>
> Because of #2, I intend to leave this ref guide vote open until the 4.8
> code release is final - that way we won't officially be releasing this doc
> until the 4.8 javadocs are uploaded and all the links work properly.
>
>
>
> -Hoss
> http://www.lucidworks.com/
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


-- 

Anshum Gupta
http://www.anshumgupta.net


[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983867#comment-13983867
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1590858 from [~rcmuir] in branch 'dev/branches/lucene5611'
[ https://svn.apache.org/r1590858 ]

LUCENE-5611: indexing optimizations, dont compute CRC for internal-use of 
RAMOutputStream, dont do heavy per-term stuff in skipper until we actually must 
buffer skipdata

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983829#comment-13983829
 ] 

Tomás Fernández Löbbe commented on SOLR-6022:
-

I don't see why calls to getAnalyzer() in 4x can't be changed to 
getIndexAnalyzer(). It wouldn't break compatibility, and it would avoid 
creating many warnings. 
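The deprecation pattern being discussed (the old accessor delegating to the renamed one) can be sketched as follows. This is a simplified stand-in, not the actual Solr FieldType source: String replaces Lucene's Analyzer type so the sketch stays self-contained.

```java
// Simplified sketch of the branch_4x deprecation pattern: the renamed
// accessor is canonical, and the old name just delegates to it.
// String stands in for org.apache.lucene.analysis.Analyzer.
public class FieldTypeSketch {
    private String indexAnalyzer = "index-chain";

    public String getIndexAnalyzer() {
        return indexAnalyzer;
    }

    /** @deprecated use {@link #getIndexAnalyzer()} instead */
    @Deprecated
    public String getAnalyzer() {
        // Internal callers should migrate to getIndexAnalyzer()
        // to avoid emitting deprecation warnings.
        return getIndexAnalyzer();
    }

    public static void main(String[] args) {
        FieldTypeSketch ft = new FieldTypeSketch();
        // The deprecated method returns exactly what the new one returns.
        System.out.println(ft.getAnalyzer().equals(ft.getIndexAnalyzer()));
    }
}
```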

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






[jira] [Updated] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6029:
--

Attachment: SOLR-6029.patch

Patch with test for 4.7

> CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
> has been deleted from a segment
> -
>
> Key: SOLR-6029
> URL: https://issues.apache.org/jira/browse/SOLR-6029
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.1
>Reporter: Greg Harris
>Priority: Minor
> Attachments: SOLR-6029.patch
>
>
> CollapsingQParserPlugin fails to detect that a document is absent from a 
> segment when the docid previously existed in that segment, i.e. was deleted. 
> The relevant code in CollapsingQParserPlugin needs to be changed from:
> -if(doc != -1) {
> +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
> If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS, which 
> would then be set in the fq bitSet array as the doc location, causing an 
> ArrayIndexOutOfBoundsException because the array is only as big as maxDocs. 






[jira] [Created] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6029:
-

 Summary: CollapsingQParserPlugin throws 
ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
 Key: SOLR-6029
 URL: https://issues.apache.org/jira/browse/SOLR-6029
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.1
Reporter: Greg Harris
Priority: Minor
 Attachments: SOLR-6029.patch

CollapsingQParserPlugin fails to detect that a document is absent from a 
segment when the docid previously existed in that segment, i.e. was deleted. 

The relevant code in CollapsingQParserPlugin needs to be changed from:
-if(doc != -1) {
+if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {

If the doc is not found, the returned value is DocsEnum.NO_MORE_DOCS, which 
would then be set in the fq bitSet array as the doc location, causing an 
ArrayIndexOutOfBoundsException because the array is only as big as maxDocs. 
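The failure mode can be illustrated with a small self-contained sketch. This simulates the lookup with plain arrays rather than the actual plugin code; it only relies on the fact that DocsEnum.NO_MORE_DOCS is Integer.MAX_VALUE in Lucene, and the guard mirrors the one-line fix above.

```java
public class NoMoreDocsGuard {
    // DocIdSetIterator.NO_MORE_DOCS is Integer.MAX_VALUE in Lucene.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Simulates advancing a DocsEnum over the live docs of a segment:
    // returns NO_MORE_DOCS when the target doc was deleted/absent.
    static int advanceTo(int[] liveDocs, int target) {
        for (int d : liveDocs) {
            if (d >= target) return d;
        }
        return NO_MORE_DOCS;
    }

    // The guarded version of the fix: only set bits for real doc ids.
    static boolean safeSet(boolean[] bits, int doc) {
        if (doc != -1 && doc != NO_MORE_DOCS) {
            bits[doc] = true;
            return true;
        }
        return false; // deleted/absent doc: skip instead of overflowing
    }

    public static void main(String[] args) {
        boolean[] bits = new boolean[10]; // sized to maxDoc, like the fq bitSet
        int doc = advanceTo(new int[]{1, 3, 7}, 8); // target past last live doc
        // Without the guard, bits[Integer.MAX_VALUE] would throw
        // ArrayIndexOutOfBoundsException; with it, nothing is set.
        System.out.println(safeSet(bits, doc));
    }
}
```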







[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-04-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983733#comment-13983733
 ] 

Mark Miller commented on SOLR-5473:
---

Any help needed on pulling this out? I think if we take too long, it's likely 
to get quite tricky fast.

> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
> ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983726#comment-13983726
 ] 

Hoss Man commented on LUCENE-5632:
--


bq. So I would suggest to fix this for 4.x in the way like the attached patch 
and remove in trunk all deprecated constants completely (so simply do rename in 
trunk).

I think it would probably be best to keep the (new) parseLeniently on trunk as 
well though (not just on 4x) so that _strings_ like "LUCENE_47" continue to 
work on trunk.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5632:
--

  Component/s: core/other
Fix Version/s: 4.9
 Assignee: Uwe Schindler
   Issue Type: Improvement  (was: Bug)

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5632:
--

Attachment: LUCENE-5632.patch

New patch, which passes all tests (unmodified). Solr tests also pass with 
unmodified config files.

In fact, in branch_4x there are more occurrences, but the whole patch is more 
or less an Eclipse refactoring rename. So I would suggest fixing this for 4.x 
as in the attached patch, and removing all deprecated constants completely in 
trunk (i.e., simply doing the rename in trunk).

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5632:
--

Attachment: LUCENE-5632.patch

Patch; interestingly, not that many things changed.

I have not run tests yet, but I also fixed the parser.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983644#comment-13983644
 ] 

Uwe Schindler commented on LUCENE-5632:
---

In fact, it is possible to add "deprecated" old constants somewhere at the end 
of the enum. These are not real enum constants (they don't work in switch 
statements), but for the general use case of matchVersion parameters, this is 
fine:

{code:java}
@Deprecated
public static final Version LUCENE_41 = LUCENE_4_1;
{code}
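A runnable illustration of that snippet (using a toy enum, not the real org.apache.lucene.util.Version): the alias is the same object as the canonical constant, but being a plain static field rather than an enum constant, it cannot appear as a switch case label.

```java
public class AliasDemo {
    enum Version {
        LUCENE_4_1, LUCENE_4_2;

        // Deprecated alias: a plain static field, not an enum constant,
        // so it cannot be used as a case label in a switch statement.
        @Deprecated
        public static final Version LUCENE_41 = LUCENE_4_1;
    }

    public static void main(String[] args) {
        // The alias and the canonical constant are the same instance,
        // so existing matchVersion comparisons keep working.
        System.out.println(Version.LUCENE_41 == Version.LUCENE_4_1);
    }
}
```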

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Comment Edited] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983631#comment-13983631
 ] 

Uwe Schindler edited comment on LUCENE-5632 at 4/28/14 9:55 PM:


One idea: we redefine the enum with the new syntax. In Lucene 4.x we might 
additionally define the old constants as additional alias statics (+ 
deprecated) inside the enum. Java does not allow additional constants in an 
enum that are identical to others, but we can maybe interleave the deprecated 
ones with the real ones (like {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, 
@Deprecated LUCENE_41}}; I am not sure whether they should come before or 
after, but we can add magic to the version comparison functions: 
{{onOrAfter()}} can accept both).
We must also fix the parseVersionLenient for use by Solr + ElasticSearch.


was (Author: thetaphi):
One idea. We redfine the enum with the new syntax. In Lucene 4.x we may 
additionally define the old constants as additional alias statics (+ 
deprecated) inside the enum? Java does not allow additional constants in an 
enum that are identical to others, but we can maybe move the deprecated ones 
between the real ones (like: {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, 
@Deprecated LUCENE_41}}, although I am not sure if they should come before or 
after).
We must also fix the parseVersionLenient for use by Solr + ElasticSearch.
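The interleaving idea with "magic" in the comparison function could look roughly like the following hypothetical sketch. The ordinal-halving trick is an assumption about one way onOrAfter() might "accept both"; it is not the actual Lucene implementation, and the constant set is truncated for brevity.

```java
// Hypothetical sketch: each version gets a canonical constant plus a
// deprecated alias right after it, and onOrAfter() divides ordinals by
// two so an alias ranks the same as its canonical constant.  This is an
// illustration of the proposal, not org.apache.lucene.util.Version.
enum Version {
    LUCENE_4_0, @Deprecated LUCENE_40,
    LUCENE_4_1, @Deprecated LUCENE_41;

    public boolean onOrAfter(Version other) {
        // Pairs (canonical, alias) share the same effective rank.
        return ordinal() / 2 >= other.ordinal() / 2;
    }
}

public class OnOrAfterDemo {
    public static void main(String[] args) {
        // Alias and canonical constant compare as the same version.
        System.out.println(Version.LUCENE_41.onOrAfter(Version.LUCENE_4_1));
        // An older version is not on-or-after a newer alias.
        System.out.println(Version.LUCENE_4_0.onOrAfter(Version.LUCENE_41));
    }
}
```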

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983631#comment-13983631
 ] 

Uwe Schindler commented on LUCENE-5632:
---

One idea: we redefine the enum with the new syntax. In Lucene 4.x we might 
additionally define the old constants as additional alias statics (+ 
deprecated) inside the enum. Java does not allow additional constants in an 
enum that are identical to others, but we can maybe interleave the deprecated 
ones with the real ones (like {{LUCENE_4_0, @Deprecated LUCENE_40, LUCENE_4_1, 
@Deprecated LUCENE_41}}, although I am not sure if they should come before or 
after).
We must also fix the parseVersionLenient for use by Solr + ElasticSearch.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (SOLR-6009) edismax mis-parsing RegexpQuery

2014-04-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983586#comment-13983586
 ] 

Hoss Man commented on SOLR-6009:


I did a quick skim of ExtendedDismaxQParser and from what I can tell nothing 
was _ever_ done to support regex syntax.

It also scares the heebie-jeebies out of me that the (erroneous) behavior of 
edismax is different depending on whether the field exists in the schema, or is 
matched because of a "\*" dynamicField ... particularly since I can't reproduce 
the same IMPOSSIBLE_FIELD_NAME leakage when using an existing text_general 
dynamicField like qf=foo_txt.

This smells like two interconnected bugs: one is simply that regex support 
needs to be added to the parser; the other is that, while regex support is 
missing, something is getting tickled that causes *really* bad behavior when 
the fields in use exist because of a "\*" dynamicField.

> edismax mis-parsing RegexpQuery
> ---
>
> Key: SOLR-6009
> URL: https://issues.apache.org/jira/browse/SOLR-6009
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.7.2
>Reporter: Evan Sayer
>
> edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries 
> involving a RegexpQuery.  Steps to reproduce on 4.7.2:
> 1) remove the explicit field definition for 'text'
> 2) add a catch-all '*' dynamic field of type text_general
> {code}
> <dynamicField name="*" type="text_general" indexed="true" stored="true" />
> {code}
> 3) index the exampledocs/ data
> 4) run a query like the following:
> {code}
> http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/&debugQuery=true
> {code}
> The debugQuery output will look like this:
> {code}
> 
> {!edismax qf='text'} /.*elec.*/
> {!edismax qf='text'} /.*elec.*/
> (+RegexpQuery(:/.*elec.*/))/no_coord
> +:/.*elec.*/
> {code}
> If you copy/paste the parsed-query into a text editor or something, you can 
> see that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends 
> up in there.
> I haven't been able to reproduce this behavior on 4.7.2 without getting rid 
> of the explicit field definition for 'text' and using a dynamicField, which 
> is how things are setup on the machine where this issue was discovered.  The 
> query isn't quite right with the explicit field definition in place either, 
> though:
> {code}
> 
> {!edismax qf='text'} /.*elec.*/
> {!edismax qf='text'} /.*elec.*/
> (+DisjunctionMaxQuery((text:elec)))/no_coord
> +(text:elec)
> {code}
> numFound=0 for both of these.  This site is useful for looking at the 
> characters in the first variant:
> http://rishida.net/tools/conversion/
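As an alternative to an external converter, a hidden field name can be revealed by dumping the code points of the pasted parsed-query string. The control character below is purely illustrative; it stands in for the leaked impossible-field-name character, which is not reproduced here.

```java
public class CodePointDump {
    public static void main(String[] args) {
        // Illustrative input: a non-printing character hiding before the
        // colon, standing in for the leaked IMPOSSIBLE_FIELD_NAME.
        String parsed = "\u0001:/.*elec.*/";
        // Print each code point so invisible characters become obvious.
        parsed.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s%n", cp,
                        Character.isISOControl(cp)
                                ? "<control>"
                                : String.valueOf((char) cp)));
    }
}
```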






[jira] [Created] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-28 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5632:
---

 Summary: transition Version constants from LUCENE_MN to LUCENE_M_N
 Key: LUCENE-5632
 URL: https://issues.apache.org/jira/browse/LUCENE-5632
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


We should fix this, otherwise the constants will be hard to read (e.g. 
Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).

I do not want this to be an excuse for an arbitrary 5.0 release that does not 
have the features expected of a major release :)







[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983456#comment-13983456
 ] 

Uwe Schindler commented on SOLR-6022:
-

I think that looks good.

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






[jira] [Commented] (LUCENE-5331) nested SpanNearQuery with repeating groups does not find match

2014-04-28 Thread Michael Sander (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983432#comment-13983432
 ] 

Michael Sander commented on LUCENE-5331:


I will look into this a bit more deeply.

> nested SpanNearQuery with repeating groups does not find match
> --
>
> Key: LUCENE-5331
> URL: https://issues.apache.org/jira/browse/LUCENE-5331
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jerry Zhou
> Attachments: NestedSpanNearTest.java
>
>
> Nested spanNear queries do not work in some cases when repeating groups are 
> in the query.
> Test case is attached ...






[jira] [Commented] (LUCENE-5331) nested SpanNearQuery with repeating groups does not find match

2014-04-28 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983421#comment-13983421
 ] 

Tim Allison commented on LUCENE-5331:
-

[~speedplane], I was trying to figure out if what you're seeing is the same as 
the original issue or the one that I raised.

The example that you posted on the Google Groups seems to work in pure Lucene, 
in both 4.7 and trunk:

{noformat}
private final static String FIELD = "f";

@Test
public void testSimpleBizBuz() throws Exception {
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer);
    RAMDirectory d = new RAMDirectory();
    IndexWriter writer = new IndexWriter(d, config);
    Document doc = new Document();
    doc.add(new TextField(FIELD, "foo biz buz", Store.YES));
    writer.addDocument(doc);

    doc = new Document();
    doc.add(new TextField(FIELD, "foo biz and biz buz", Store.YES));
    writer.addDocument(doc);
    writer.close();

    SpanQuery foo = new SpanTermQuery(new Term(FIELD, "foo"));
    SpanQuery biz = new SpanTermQuery(new Term(FIELD, "biz"));
    SpanQuery buz = new SpanTermQuery(new Term(FIELD, "buz"));

    SpanQuery bizbuz = new SpanNearQuery(new SpanQuery[]{biz, buz}, 0, false);
    SpanQuery foobizbuz = new SpanNearQuery(new SpanQuery[]{foo, bizbuz}, 0, false);

    IndexReader reader = DirectoryReader.open(d);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector coll = TopScoreDocCollector.create(100, true);
    searcher.search(foobizbuz, coll);
    ScoreDoc[] scoreDocs = coll.topDocs().scoreDocs;
    assertEquals(1, scoreDocs.length);
}
{noformat}

Are you sure the issue that you reported is the same as one of the ones in this 
issue?  Is the above test case right?

> nested SpanNearQuery with repeating groups does not find match
> --
>
> Key: LUCENE-5331
> URL: https://issues.apache.org/jira/browse/LUCENE-5331
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jerry Zhou
> Attachments: NestedSpanNearTest.java
>
>
> Nested spanNear queries do not work in some cases when repeating groups are 
> in the query.
> Test case is attached ...






[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983405#comment-13983405
 ] 

Shawn Heisey commented on LUCENE-5631:
--

{quote}
https://lucene.apache.org/core/downloads.html
https://lucene.apache.org/solr/downloads.html

Both of these links are prominent in the top nav of the main pages for 
Lucene-Core & Solr, just to the right of the "news" tabs...
{quote}

This is not the first time I've been completely blind.  I did not notice those 
links.  Apparently I'm not the only one, though; I've seen the question come 
up in the IRC channel regularly.  Now that they've been pointed out, I can 
guide people in the right direction much more easily than by providing a link 
or telling them that they just have to click fast. :)

bq. The project has taken several steps in the opposite direction, 
intentionally making it harder to access releases (and docs) for older 
versions, to encourage people to choose the most recent version. 

This is understandable, but people who are explicitly looking for an older 
version are asking about it.  Hoss has pointed out where to go.  I thought 
those links weren't there, and it turns out that it was me, not the website.


> Improve access to archived versions of Lucene and Solr
> --
>
> Key: LUCENE-5631
> URL: https://issues.apache.org/jira/browse/LUCENE-5631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/website
>Reporter: Shawn Heisey
>
> When visiting the website to download Lucene or Solr, it is very difficult 
> for people to locate where to download previous versions.  The archive link 
> does show up when you click the download link, but the page where it lives is 
> replaced in less than a second by the CGI for picking a download mirror for 
> the current release.  There's nothing there for previous versions.
> At a minimum, we need a link to the download archive that's right below the 
> main Download link.  Something else I think we should do (which might 
> actually be an INFRA issue, as this problem exists for other projects too) 
> would be to have the "closer.cgi" page include a link to the archives.






[jira] [Updated] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-28 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated SOLR-6022:
-

Attachment: SOLR-6022.branch_4x-deprecation.patch
SOLR-6022.patch

Here are two more patches:
# The first is still for trunk, and changes the indexAnalyzer/queryAnalyzer 
members in FieldType to private scope.  This will be a "hard fail" for anyone 
that is subclassing FieldType and using these, but they should be using 
get/setAnalyzer anyways.  It also adds CHANGES.txt entries for review.
# The second patch shows exactly how I will do the deprecation in branch_4x (I 
believe I should be able to just apply the patch after doing a merge back from 
trunk).
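
The rename-plus-deprecation pattern described above can be sketched roughly as 
follows (a minimal illustration, not the actual patch; the Analyzer stub and 
field initialization are placeholders):

```java
// Hedged sketch of the deprecation approach: trunk renames the accessor and
// narrows the members to private; the 4.x-only patch restores the old method
// as a deprecated delegate so existing callers keep compiling.
class Analyzer {}

abstract class FieldType {
    // Trunk patch makes these private so subclasses must use the accessors.
    private final Analyzer indexAnalyzer = new Analyzer();
    private final Analyzer queryAnalyzer = new Analyzer();

    Analyzer getIndexAnalyzer() { return indexAnalyzer; }
    Analyzer getQueryAnalyzer() { return queryAnalyzer; }

    // Restored only in branch_4x so callers migrate gradually.
    @Deprecated
    Analyzer getAnalyzer() { return getIndexAnalyzer(); }
}

public class DeprecationSketch {
    public static void main(String[] args) {
        FieldType ft = new FieldType() {};
        // The deprecated method delegates to the new one.
        System.out.println(ft.getAnalyzer() == ft.getIndexAnalyzer());
    }
}
```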

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983390#comment-13983390
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1590747 from [~rcmuir] in branch 'dev/branches/lucene5611'
[ https://svn.apache.org/r1590747 ]

LUCENE-5611: move attribute juggling to a fieldinvertstate setter

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983341#comment-13983341
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1590731 from [~rcmuir] in branch 'dev/branches/lucene5611'
[ https://svn.apache.org/r1590731 ]

LUCENE-5611: fix the crazy getAttribute API to prevent double lookups and extra 
code everywhere

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983338#comment-13983338
 ] 

Hoss Man commented on LUCENE-5631:
--

I'm not really understanding this issue.

In the right nav of the site are links that are very explicitly about 
downloading the *latest* release.  These load pages that do an auto-redirect, 
but as a convenience also include some static text for people who might not 
use javascript.

This doesn't change the fact that we *also* already have a main "Download" page 
that also links to those download redirectors, and has details about archived 
releases...

https://lucene.apache.org/core/downloads.html
https://lucene.apache.org/solr/downloads.html

Both of these links are prominent in the top nav of the main pages for 
Lucene-Core & Solr, just to the right of the "news" tabs...

https://lucene.apache.org/core/
https://lucene.apache.org/solr/


> Improve access to archived versions of Lucene and Solr
> --
>
> Key: LUCENE-5631
> URL: https://issues.apache.org/jira/browse/LUCENE-5631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/website
>Reporter: Shawn Heisey
>
> When visiting the website to download Lucene or Solr, it is very difficult 
> for people to locate where to download previous versions.  The archive link 
> does show up when you click the download link, but the page where it lives is 
> replaced in less than a second by the CGI for picking a download mirror for 
> the current release.  There's nothing there for previous versions.
> At a minimum, we need a link to the download archive that's right below the 
> main Download link.  Something else I think we should do (which might 
> actually be an INFRA issue, as this problem exists for other projects too) 
> would be to have the "closer.cgi" page include a link to the archives.






[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host

2014-04-28 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983300#comment-13983300
 ] 

Jessica Cheng commented on SOLR-6027:
-

{quote}It would be nice to make this decision configurable/pluggable.{quote}
+1. For example, something like "rack awareness" would be nice to be taken into 
account as well.

> Replica assignments should try to take the host name into account so all 
> replicas don't end up on the same host
> ---
>
> Key: SOLR-6027
> URL: https://issues.apache.org/jira/browse/SOLR-6027
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Timothy Potter
>Priority: Minor
>
> I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per 
> instance. One of my collections was created with all replicas landing on 
> different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a 
> little smarter and ensure that at least one of the replicas was on one of the 
> other hosts.
> shard4: {
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/
>  LEADER
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/
>  
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/
>  
> }
> I marked this as minor for now as it could be argued that I shouldn't be 
> running that many Solr nodes per instance, but I'm seeing plenty of installs 
> that are using higher-end instance types / server hardware and then running 
> multiple Solr nodes per host.






[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983291#comment-13983291
 ] 

Shawn Heisey commented on LUCENE-5631:
--

I asked on #asfinfra whether a general solution with the mirror CGI would be a 
good idea.

{noformat}
11:39 <@gmcdonald> I do not think download.cgi is the place to put an archives
   link no
{noformat}

{noformat}
11:41 <@gmcdonald> elyograg: elsewhere on your website you should link to
   downloads.cgi or the archives. The download.cgi picks a
   mirror, archives do not live on mirrors, change your way of
   thinking is my reply. If you insist on following up, talk to
   the site-dev@ list instead of an INFRA ticket.
{noformat}

I will follow up with the site-dev list to see if they have any interest in a 
general solution.  Regardless of what happens there, I do think we need to 
improve our own project pages.  When I have a moment, I will pull the site down 
from svn and see if I can cook up a patch.


> Improve access to archived versions of Lucene and Solr
> --
>
> Key: LUCENE-5631
> URL: https://issues.apache.org/jira/browse/LUCENE-5631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/website
>Reporter: Shawn Heisey
>
> When visiting the website to download Lucene or Solr, it is very difficult 
> for people to locate where to download previous versions.  The archive link 
> does show up when you click the download link, but the page where it lives is 
> replaced in less than a second by the CGI for picking a download mirror for 
> the current release.  There's nothing there for previous versions.
> At a minimum, we need a link to the download archive that's right below the 
> main Download link.  Something else I think we should do (which might 
> actually be an INFRA issue, as this problem exists for other projects too) 
> would be to have the "closer.cgi" page include a link to the archives.






[jira] [Updated] (SOLR-5969) Enable distributed tracing of requests

2014-04-28 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5969:


Attachment: SOLR-5969.diff

Patch updated for Lucene/Solr 4.8.

> Enable distributed tracing of requests
> --
>
> Key: SOLR-5969
> URL: https://issues.apache.org/jira/browse/SOLR-5969
> Project: Solr
>  Issue Type: Improvement
>Reporter: Gregg Donovan
> Attachments: SOLR-5969.diff, SOLR-5969.diff
>
>
> Enable users to add diagnostic information to requests and trace them in the 
> logs across servers.
> We have some metadata -- e.g. a request UUID -- that we log to every log line 
> using [Log4J's 
> MDC|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html]. 
> The UUID logging allows us to connect any log lines we have for a given 
> request across servers. Sort of like Twitter's 
> [Zipkin|http://twitter.github.io/zipkin/].
> Currently we're using EmbeddedSolrServer without sharding, so adding the UUID 
> is fairly simple, since everything is in one process and one thread. But, 
> we're testing a sharded HTTP implementation and running into some 
> difficulties getting this data passed around in a way that lets us trace all 
> log lines generated by a request to its UUID.
> The first thing I tried was to add the UUID by adding it to the SolrParams. 
> This achieves the goal of getting those values logged on the shards if a 
> request is successful, but we miss having those values in the MDC if there 
> are other log lines before the final log line. E.g. an Exception in a custom 
> component.
> My current thought is that sending HTTP headers with diagnostic information 
> would be very useful. Those could be placed in the MDC even before handing 
> off to work to SolrDispatchFilter, so that any Solr problem will have the 
> proper logging.
> I.e. every additional header added to a Solr request gets a "Solr-" prefix. 
> On the server, we look for those headers and add them to the [SLF4J 
> MDC|http://www.slf4j.org/manual.html#mdc].
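
The header-to-MDC idea can be sketched as follows (a minimal illustration with 
a plain thread-local map standing in for SLF4J's MDC; the "Solr-" prefix is 
the proposal's convention, everything else here is hypothetical):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch: copy every request header carrying a "Solr-" prefix into a
// per-thread diagnostic context before dispatching the request, so any log
// line emitted while handling it can include the trace metadata.
public class TracePropagationSketch {
    // Stand-in for org.slf4j.MDC, which is also keyed per thread.
    static final ThreadLocal<Map<String, String>> MDC =
        ThreadLocal.withInitial(HashMap::new);

    static void captureTraceHeaders(Map<String, String> headers) {
        for (Map.Entry<String, String> e : headers.entrySet()) {
            if (e.getKey().startsWith("Solr-")) {
                // Strip the prefix; store e.g. "Request-UUID" -> value.
                MDC.get().put(e.getKey().substring("Solr-".length()),
                              e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Solr-Request-UUID", "abc-123");
        headers.put("Accept", "application/json"); // ignored: no prefix
        captureTraceHeaders(headers);
        System.out.println(MDC.get());
    }
}
```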






[jira] [Commented] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983280#comment-13983280
 ] 

Steve Rowe commented on LUCENE-5631:


The project has taken several steps in the opposite direction, intentionally 
making it *harder* to access releases (and docs) for older versions, to 
encourage people to choose the most recent version.  


> Improve access to archived versions of Lucene and Solr
> --
>
> Key: LUCENE-5631
> URL: https://issues.apache.org/jira/browse/LUCENE-5631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/website
>Reporter: Shawn Heisey
>
> When visiting the website to download Lucene or Solr, it is very difficult 
> for people to locate where to download previous versions.  The archive link 
> does show up when you click the download link, but the page where it lives is 
> replaced in less than a second by the CGI for picking a download mirror for 
> the current release.  There's nothing there for previous versions.
> At a minimum, we need a link to the download archive that's right below the 
> main Download link.  Something else I think we should do (which might 
> actually be an INFRA issue, as this problem exists for other projects too) 
> would be to have the "closer.cgi" page include a link to the archives.






[jira] [Created] (SOLR-6028) SOLR returns 500 error code for query /<,/

2014-04-28 Thread Kingston Duffie (JIRA)
Kingston Duffie created SOLR-6028:
-

 Summary: SOLR returns 500 error code for query /<,/
 Key: SOLR-6028
 URL: https://issues.apache.org/jira/browse/SOLR-6028
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.1
Reporter: Kingston Duffie
Priority: Minor


If you enter the following query string into the SOLR admin console to execute 
a query, you will get a 500 error:

/<,/

This is an invalid query -- in the sense that the field between the slashes is 
not a valid regex.  Nevertheless, I would have expected to get a 400 error 
rather than 500.
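
The requested behavior amounts to catching the syntax error at parse time and 
mapping it to a client error. A rough illustration (using java.util.regex, 
whose syntax differs from Lucene's regexp parser, so the invalid input here is 
"(" rather than "<,"):

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hedged sketch: a malformed regex is the client's mistake, so the parse
// failure should surface as 400 (Bad Request) instead of an unhandled
// exception becoming a 500 (Internal Server Error).
public class RegexStatusSketch {
    static int statusFor(String regexBody) {
        try {
            Pattern.compile(regexBody);
            return 200; // well-formed regex: proceed with the query
        } catch (PatternSyntaxException e) {
            return 400; // malformed input: report a client error
        }
    }

    public static void main(String[] args) {
        System.out.println(statusFor("("));    // unclosed group: invalid
        System.out.println(statusFor("fo.*")); // valid regex
    }
}
```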






[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host

2014-04-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983264#comment-13983264
 ] 

Tomás Fernández Löbbe commented on SOLR-6027:
-

It would be nice to make this decision configurable/pluggable. One could 
select the nodes for a collection depending on case-specific context. 

> Replica assignments should try to take the host name into account so all 
> replicas don't end up on the same host
> ---
>
> Key: SOLR-6027
> URL: https://issues.apache.org/jira/browse/SOLR-6027
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Timothy Potter
>Priority: Minor
>
> I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per 
> instance. One of my collections was created with all replicas landing on 
> different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a 
> little smarter and ensure that at least one of the replicas was on one of the 
> other hosts.
> shard4: {
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/
>  LEADER
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/
>  
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/
>  
> }
> I marked this as minor for now as it could be argued that I shouldn't be 
> running that many Solr nodes per instance, but I'm seeing plenty of installs 
> that are using higher-end instance types / server hardware and then running 
> multiple Solr nodes per host.






[jira] [Updated] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated LUCENE-5631:
-

Description: 
When visiting the website to download Lucene or Solr, it is very difficult for 
people to locate where to download previous versions.  The archive link does 
show up when you click the download link, but the page where it lives is 
replaced in less than a second by the CGI for picking a download mirror for the 
current release.  There's nothing there for previous versions.

At a minimum, we need a link to the download archive that's right below the 
main Download link.  Something else I think we should do (which might actually 
be an INFRA issue, as this problem exists for other projects too) would be to 
have the "closer.cgi" page include a link to the archives.


  was:
When visiting the website to download Lucene or Solr, it is very difficult for 
people to locate where to download previous versions.  The archive link does 
show up when you click the download link, but the page where it lives is 
replaced in less than a second by the CGI for picking a download mirror for the 
current release.  There's nothing there for previous versions.

At a minimum, we need a link to the download archive that's right below the 
main Download link.  Something else I think we should do (which might actually 
be an INFRA issue) would be to have the "closer.cgi" page include a link to the 
archives.



> Improve access to archived versions of Lucene and Solr
> --
>
> Key: LUCENE-5631
> URL: https://issues.apache.org/jira/browse/LUCENE-5631
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/website
>Reporter: Shawn Heisey
>
> When visiting the website to download Lucene or Solr, it is very difficult 
> for people to locate where to download previous versions.  The archive link 
> does show up when you click the download link, but the page where it lives is 
> replaced in less than a second by the CGI for picking a download mirror for 
> the current release.  There's nothing there for previous versions.
> At a minimum, we need a link to the download archive that's right below the 
> main Download link.  Something else I think we should do (which might 
> actually be an INFRA issue, as this problem exists for other projects too) 
> would be to have the "closer.cgi" page include a link to the archives.






[jira] [Created] (LUCENE-5631) Improve access to archived versions of Lucene and Solr

2014-04-28 Thread Shawn Heisey (JIRA)
Shawn Heisey created LUCENE-5631:


 Summary: Improve access to archived versions of Lucene and Solr
 Key: LUCENE-5631
 URL: https://issues.apache.org/jira/browse/LUCENE-5631
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/website
Reporter: Shawn Heisey


When visiting the website to download Lucene or Solr, it is very difficult for 
people to locate where to download previous versions.  The archive link does 
show up when you click the download link, but the page where it lives is 
replaced in less than a second by the CGI for picking a download mirror for the 
current release.  There's nothing there for previous versions.

At a minimum, we need a link to the download archive that's right below the 
main Download link.  Something else I think we should do (which might actually 
be an INFRA issue) would be to have the "closer.cgi" page include a link to the 
archives.







[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983256#comment-13983256
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1590721 from [~mikemccand] in branch 'dev/branches/lucene5611'
[ https://svn.apache.org/r1590721 ]

LUCENE-5611: put current patch on branch

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983254#comment-13983254
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1590720 from [~mikemccand] in branch 'dev/branches/lucene5611'
[ https://svn.apache.org/r1590720 ]

LUCENE-5611: make branch

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host

2014-04-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983230#comment-13983230
 ] 

Mark Miller commented on SOLR-6027:
---

bq. it could be argued that I shouldn't be running that many Solr nodes per 
instance

I think we want to make things as smart as we can! Everything is very basic 
right now and the intention has always been to make it much better - we want at 
least the option to take into account as much info as we can when choosing 
hosts (eventually, even hardware, avg load, whatever!).

> Replica assignments should try to take the host name into account so all 
> replicas don't end up on the same host
> ---
>
> Key: SOLR-6027
> URL: https://issues.apache.org/jira/browse/SOLR-6027
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Timothy Potter
>Priority: Minor
>
> I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per 
> instance. One of my collections was created with all replicas landing on 
> different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a 
> little smarter and ensure that at least one of the replicas was on one of the 
> other hosts.
> shard4: {
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/
>  LEADER
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/
>  
>   
> http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/
>  
> }
> I marked this as minor for now as it could be argued that I shouldn't be 
> running that many Solr nodes per instance, but I'm seeing plenty of installs 
> that are using higher-end instance types / server hardware and then running 
> multiple Solr nodes per host.






[jira] [Created] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host

2014-04-28 Thread Timothy Potter (JIRA)
Timothy Potter created SOLR-6027:


 Summary: Replica assignments should try to take the host name into 
account so all replicas don't end up on the same host
 Key: SOLR-6027
 URL: https://issues.apache.org/jira/browse/SOLR-6027
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Timothy Potter
Priority: Minor


I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per 
instance. One of my collections was created with all replicas landing on 
different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a 
little smarter and ensure that at least one of the replicas was on one of the 
other hosts.

shard4: {

http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/
 LEADER

http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/
 

http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/
 
}

I marked this as minor for now as it could be argued that I shouldn't be 
running that many Solr nodes per instance, but I'm seeing plenty of installs 
that are using higher-end instance types / server hardware and then running 
multiple Solr nodes per host.
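
The requested placement logic can be sketched as follows (a minimal 
illustration, not Solr's actual assignment code; the "host:port" node naming 
is assumed for the example):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of host-aware replica placement: round-robin across distinct
// hosts first, so a shard only reuses a host once every host holds a replica.
public class HostAwarePlacement {
    static List<String> pickReplicas(List<String> liveNodes, int replicationFactor) {
        // Group nodes by host ("host:port" naming assumed for this sketch).
        Map<String, Deque<String>> byHost = new LinkedHashMap<>();
        for (String node : liveNodes) {
            String host = node.split(":")[0];
            byHost.computeIfAbsent(host, h -> new ArrayDeque<>()).add(node);
        }
        List<String> picked = new ArrayList<>();
        while (picked.size() < replicationFactor) {
            boolean progress = false;
            // Take at most one node per host per pass.
            for (Deque<String> nodes : byHost.values()) {
                if (!nodes.isEmpty() && picked.size() < replicationFactor) {
                    picked.add(nodes.poll());
                    progress = true;
                }
            }
            if (!progress) break; // fewer live nodes than replicas requested
        }
        return picked;
    }

    public static void main(String[] args) {
        // Three nodes on hostA, one on hostB: the second replica should land
        // on hostB before hostA is reused.
        List<String> nodes = Arrays.asList(
            "hostA:8988", "hostA:8984", "hostA:8985", "hostB:8983");
        System.out.println(pickReplicas(nodes, 3));
    }
}
```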






[ANNOUNCE] Apache Solr 4.8.0 released

2014-04-28 Thread Uwe Schindler
28 April 2014, Apache Solr™ 4.8.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search.  Solr is highly scalable, providing
fault tolerant distributed search and indexing, and powers the search
and navigation features of many of the world's largest internet sites.

Solr 4.8.0 is available for immediate download at:
  http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of
details.

Solr 4.8.0 Release Highlights:

* Apache Solr now requires Java 7 or greater (recommended is
  Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions
  have known JVM bugs affecting Solr).

* Apache Solr is fully compatible with Java 8.

* <fields> and <types> tags have been deprecated from schema.xml.
  There is no longer any reason to keep them in the schema file;
  they may be safely removed. This allows intermixing of <fieldType>,
  <field> and <copyField> definitions if desired.

* The new {!complexphrase} query parser supports wildcards, ORs etc.
  inside Phrase Queries. 

* New Collections API CLUSTERSTATUS action reports the status of
  collections, shards, and replicas, and also lists collection
  aliases and cluster properties.
 
* Added managed synonym and stopword filter factories, which enable
  synonym and stopword lists to be dynamically managed via REST API.

* JSON updates now support nested child documents, enabling {!child}
  and {!parent} block join queries. 

* Added ExpandComponent to expand results collapsed by the
  CollapsingQParserPlugin, as well as the parent/child relationship
  of nested child documents.

* Long-running Collections API tasks can now be executed
  asynchronously; the new REQUESTSTATUS action provides status.

* Added a hl.qparser parameter to allow you to define a query parser
  for hl.q highlight queries.

* In Solr single-node mode, cores can now be created using named
  configsets.

* New DocExpirationUpdateProcessorFactory supports computing an
  expiration date for documents from the "TTL" expression, as well as
  automatically deleting expired documents on a periodic basis. 
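The TTL idea behind DocExpirationUpdateProcessorFactory can be illustrated with a toy sketch. This is not Solr's implementation; it only shows the underlying mechanism: derive an absolute expiration instant from a per-document TTL, so a periodic sweep can delete documents whose expiration has passed.

```java
import java.time.Duration;
import java.time.Instant;

// Toy illustration of the TTL concept (not Solr's implementation):
// an expiration timestamp is computed at index time from the document's
// TTL, and a periodic deletion pass removes documents past that instant.
public class TtlExpiration {
    public static Instant expireAt(Instant indexedAt, Duration ttl) {
        return indexedAt.plus(ttl);
    }

    public static boolean isExpired(Instant expireAt, Instant now) {
        // Expired once "now" reaches or passes the expiration instant.
        return !now.isBefore(expireAt);
    }
}
```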

Solr 4.8.0 also includes many other new features as well as numerous
optimizations and bugfixes of the corresponding Apache Lucene release.

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases.  It is possible that the mirror you are using
may not have replicated the release yet.  If that is the case, please
try another mirror.  This also goes for Maven access.

-
Uwe Schindler
uschind...@apache.org 
Apache Lucene PMC Chair / Committer
Bremen, Germany
http://lucene.apache.org/






[ANNOUNCE] Apache Lucene 4.8.0 released

2014-04-28 Thread Uwe Schindler
28 April 2014, Apache Lucene™ 4.8.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.0

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and
improvements, some of which are highlighted below. The release
is available for immediate download at:
  http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of
details.

Lucene 4.8.0 Release Highlights:

* Apache Lucene now requires Java 7 or greater (recommended is
  Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions
  have known JVM bugs affecting Lucene).

* Apache Lucene is fully compatible with Java 8.

* All index files now store end-to-end checksums, which are
  now validated during merging and reading. This ensures that
  corruptions caused by any bit-flipping hardware problems or bugs
  in the JVM can be detected earlier.  For full detection be sure
  to enable all checksums during merging (it's disabled by default).

* Lucene has a new Rescorer/QueryRescorer API to perform second-pass
  rescoring or reranking of search results using more expensive scoring
  functions after first-pass hit collection.

* AnalyzingInfixSuggester now supports near-real-time autosuggest.

* Simplified impact-sorted postings (using SortingMergePolicy and
  EarlyTerminatingCollector) to use Lucene's Sort class
  to express the sort order.

* Bulk scoring and normal iterator-based scoring were separated,
  so some queries can do bulk scoring more effectively.

* Switched to MurmurHash3 to hash terms during indexing.

* IndexWriter now supports updating of binary doc value fields.

* HunspellStemFilter now uses 10 to 100x less RAM. It also loads
  all known OpenOffice dictionaries without error.

* Lucene now also fsyncs the directory metadata on commits, if the
  operating system and file system allow it (Linux, MacOSX are
  known to work).

* Lucene now uses Java 7 file system functions under the hood,
  so index files can be deleted on Windows, even when readers are
  still open.

* A serious bug in NativeFSLockFactory was fixed, which could
  allow multiple IndexWriters to acquire the same lock.  The
  lock file is no longer deleted from the index directory
  even when the lock is not held.

* Various bugfixes and optimizations since the 4.7.2 release.
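The end-to-end checksum idea from the list above can be shown with a toy sketch. This is not Lucene's CodecUtil implementation; it only demonstrates the principle: the writer appends a CRC32 footer to the payload, and the reader recomputes the checksum on the same bytes, so a bit flip anywhere in between is detected.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Toy illustration of end-to-end checksums (not Lucene's CodecUtil):
// the writer appends a CRC32 footer; the reader recomputes and compares,
// so corruption introduced by hardware or JVM bugs is detected on read.
public class ChecksumFooter {
    public static byte[] writeWithFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(payload.length + 8);
        buf.put(payload);
        buf.putLong(crc.getValue());  // 8-byte checksum footer
        return buf.array();
    }

    public static boolean verify(byte[] fileBytes) {
        ByteBuffer buf = ByteBuffer.wrap(fileBytes);
        byte[] payload = new byte[fileBytes.length - 8];
        buf.get(payload);
        long stored = buf.getLong();
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue() == stored;
    }
}
```

Flipping a single bit in the payload makes verification fail, which is exactly what merge-time and read-time validation relies on.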

Please read CHANGES.txt for a full list of new features.

Please report any feedback to the mailing lists
(http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases.  It is possible that the mirror you are using
may not have replicated the release yet.  If that is the case, please
try another mirror.  This also goes for Maven access.

-
Uwe Schindler
uschind...@apache.org 
Apache Lucene PMC Chair / Committer
Bremen, Germany
http://lucene.apache.org/







[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-04-28 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5637:


Attachment: SOLR-5637.patch

> Per-request cache statistics
> 
>
> Key: SOLR-5637
> URL: https://issues.apache.org/jira/browse/SOLR-5637
> Project: Solr
>  Issue Type: New Feature
>Reporter: Shikhar Bhushan
>Priority: Minor
> Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch, 
> SOLR-5637.patch
>
>
> We have found it very useful to have information on the number of cache hits 
> and misses for key Solr caches (filterCache, documentCache, etc.) at the 
> request level.
> This is currently implemented in our codebase using custom {{SolrCache}} 
> implementations.
> I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
> thread-local, and adding hooks in get() methods of SolrCache implementations. 
> This will be glued up using the {{DebugComponent}} and can be requested using 
> a "debug.cache" parameter.
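The approach described above might look roughly like this. It is a hypothetical sketch; the class and method names are illustrative, not Solr's actual API: hooks in SolrCache.get() record hits and misses into a thread-local, which a debug component reads out when the request finishes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-request cache stats kept in a thread-local,
// as described above; names are illustrative, not Solr's actual API.
public class RequestCacheStats {
    // Per-thread map: cache name -> {hits, misses}.
    private static final ThreadLocal<Map<String, long[]>> STATS =
            ThreadLocal.withInitial(HashMap::new);

    // Called from a SolrCache.get() hook.
    public static void record(String cacheName, boolean hit) {
        long[] counters = STATS.get().computeIfAbsent(cacheName, k -> new long[2]);
        counters[hit ? 0 : 1]++;
    }

    public static long hits(String cacheName)   { return get(cacheName)[0]; }
    public static long misses(String cacheName) { return get(cacheName)[1]; }

    private static long[] get(String cacheName) {
        return STATS.get().getOrDefault(cacheName, new long[2]);
    }

    // A debug component would read the stats when debug.cache is set,
    // then clear them at the end of the request.
    public static void clear() { STATS.remove(); }
}
```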






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-04-28 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5637:


Attachment: (was: SOLR-5637.diff)

> Per-request cache statistics
> 
>
> Key: SOLR-5637
> URL: https://issues.apache.org/jira/browse/SOLR-5637
> Project: Solr
>  Issue Type: New Feature
>Reporter: Shikhar Bhushan
>Priority: Minor
> Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch, 
> SOLR-5637.patch
>
>
> We have found it very useful to have information on the number of cache hits 
> and misses for key Solr caches (filterCache, documentCache, etc.) at the 
> request level.
> This is currently implemented in our codebase using custom {{SolrCache}} 
> implementations.
> I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
> thread-local, and adding hooks in get() methods of SolrCache implementations. 
> This will be glued up using the {{DebugComponent}} and can be requested using 
> a "debug.cache" parameter.






[jira] [Updated] (SOLR-5637) Per-request cache statistics

2014-04-28 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5637:


Attachment: SOLR-5637.diff

Patch updated for Lucene/Solr 4.8 and JDK 7.

> Per-request cache statistics
> 
>
> Key: SOLR-5637
> URL: https://issues.apache.org/jira/browse/SOLR-5637
> Project: Solr
>  Issue Type: New Feature
>Reporter: Shikhar Bhushan
>Priority: Minor
> Attachments: SOLR-5367.patch, SOLR-5367.patch, SOLR-5637.patch, 
> SOLR-5637.patch
>
>
> We have found it very useful to have information on the number of cache hits 
> and misses for key Solr caches (filterCache, documentCache, etc.) at the 
> request level.
> This is currently implemented in our codebase using custom {{SolrCache}} 
> implementations.
> I am working on moving to maintaining stats in the {{SolrRequestInfo}} 
> thread-local, and adding hooks in get() methods of SolrCache implementations. 
> This will be glued up using the {{DebugComponent}} and can be requested using 
> a "debug.cache" parameter.






[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983153#comment-13983153
 ] 

Mark Miller commented on SOLR-5468:
---

bq. Do think this should be an update request parameter or collection level 
setting?

Yeah, I think it's common to allow passing this per request so the client can 
vary it depending on the data. I'm sure configurable defaults are probably 
worth looking at too though.

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk to being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C failed, but would if only C did, leaving A & B online. Admittedly, 
> this lowers the write-availability of the system, so may be something that 
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we 
> have just not gotten to yet. I’ve always planned for a param that let’s you 
> say how many replicas an update must be verified on before responding 
> success. Seems to make sense to fail that type of request early if you notice 
> there are not enough replicas up to satisfy the param to begin with.
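The per-request minimum-replication-factor check discussed in this issue could be sketched as follows. This is hypothetical; the parameter and method names are illustrative, not what was ultimately committed: fail the request early if too few replicas are up, and otherwise report how many acks were achieved so the client can decide whether to retry.

```java
// Hypothetical sketch of the "minimum replicas" check discussed above:
// fail fast if fewer than minRf replicas are up, and after forwarding,
// compare the achieved replication factor against the requested minimum.
public class QuorumCheck {
    // Early check before accepting the write at all.
    public static boolean enoughReplicasUp(int liveReplicas, int minRf) {
        return liveReplicas >= minRf;
    }

    // Achieved replication factor: leader ack plus successful replica acks.
    public static int achievedRf(boolean leaderOk, int replicaAcks) {
        return (leaderOk ? 1 : 0) + replicaAcks;
    }

    public static boolean writeSatisfiesQuorum(boolean leaderOk, int replicaAcks, int minRf) {
        return achievedRf(leaderOk, replicaAcks) >= minRf;
    }
}
```

In the A/B/C example above with minRf=2, the write succeeds when only C is down (A and B ack) but is rejected up front when both B and C are down.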






Re: VOTE: RC1 Release apache-solr-ref-guide-4.8.pdf

2014-04-28 Thread Chris Hostetter

Whoops, I forgot my own vote...

: 
https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.8-RC1

+1 to RC1 with this SHA...

9904feefcdbad85eea1a81fe531f37df22ca134f  apache-solr-ref-guide-4.8.pdf


(Note: we still need at least 1 more binding +1)


-Hoss
http://www.lucidworks.com/




[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983138#comment-13983138
 ] 

Timothy Potter commented on SOLR-5468:
--

Thanks for the quick feedback. Do you think this should be an update request 
parameter or a collection-level setting? I just re-read your original comment 
about this, and it sounds like you were thinking of a parameter with each 
request. I like that, since it gives the option to bypass this check when doing 
large bulk loads of the collection and only apply it when it makes sense.

In terms of fine-grained error response handling, looks like this is captured 
in: https://issues.apache.org/jira/browse/SOLR-3382

 

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk to being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C failed, but would if only C did, leaving A & B online. Admittedly, 
> this lowers the write-availability of the system, so may be something that 
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we 
> have just not gotten to yet. I’ve always planned for a param that let’s you 
> say how many replicas an update must be verified on before responding 
> success. Seems to make sense to fail that type of request early if you notice 
> there are not enough replicas up to satisfy the param to begin with.






[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-28 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983071#comment-13983071
 ] 

Robert Muir commented on LUCENE-5611:
-

In StoredFieldsWriter:

{noformat}
- *   For every document, {@link #startDocument(int)} is called,
+ *   For every document, {@link #startDocument()} is called,
  *   informing the Codec how many fields will be written.
{noformat}

This javadoc "compiles" but now does not make sense because we don't pass 
numFields as a parameter anymore.

The attribute handling in the indexing chain got more confusing and 
complicated. Can we factor this into FieldInvertState?

It's bogus that we call hasAttribute + getAttribute: besides making the code 
more complicated, that's two hashmap lookups for two attributes. We should add 
a method to AttributeSource that acts like map.get (returns an attribute, or 
null if it doesn't exist), or simply change the semantics of getAttribute to do 
that. This can be a followup issue.

I will keep reviewing; I only got through the first 3 or 4 files in the patch.
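The proposed map.get-style lookup might look like this. It is a simplified sketch, not Lucene's actual AttributeSource: a single map lookup that returns the attribute or null, replacing the hasAttribute + getAttribute pair.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposal above (not Lucene's AttributeSource): one
// map.get-style lookup returning the attribute or null, so callers
// null-check instead of paying for hasAttribute plus getAttribute.
public class SimpleAttributeSource {
    private final Map<Class<?>, Object> attributes = new HashMap<>();

    public <T> void addAttribute(Class<T> type, T impl) {
        attributes.put(type, impl);
    }

    // Single hash lookup; null means the attribute was never added.
    @SuppressWarnings("unchecked")
    public <T> T getAttributeOrNull(Class<T> type) {
        return (T) attributes.get(type);
    }
}
```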

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Comment Edited] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983035#comment-13983035
 ] 

Mark Miller edited comment on SOLR-5468 at 4/28/14 2:29 PM:


bq. for now it seems sufficient to let users decide how many replicas a write 
must succeed on to be considered successful.

I agree that that is the low hanging fruit. We just have to let the user know 
exactly what we are trying to promise.

bq. there would need to be some "backing out" work to remove an update that 
succeeded on the leader but failed on the replicas. 

Yup - that will be the hardest part of doing this how we would really like and 
a large reason it was punted on in all the initial work. Even if the leader 
didn't process the doc first (which is likely a doable optimization at some 
point), I still think it's really hard.

bq. Lastly, batches! What happens if half of a batch (sent by a client) 
succeeds and the other half fails (due to losing a replica in the middle of 
processing the batch)? 

Batches and streaming really don't make sense yet in SolrCloud other than for 
batch loading. We need to implement better, fine grained +error+ responses 
first. When that happens, it should all operate the same as single update per 
request.


was (Author: markrmil...@gmail.com):
bq. for now it seems sufficient to let users decide how many replicas a write 
must succeed on to be considered successful.

I agree that that is the low hanging fruit. We just have to let the user know 
exactly what we are trying to promise.

bq. there would need to be some "backing out" work to remove an update that 
succeeded on the leader but failed on the replicas. 

Yup - that will be the hardest part of doing this how we would really like and 
a large reason it was punted on in all the initial work. Even if the leader 
didn't process the doc first (which is likely a doable optimization at some 
point), I still think it's really hard.

bq. Lastly, batches! What happens if half of a batch (sent by a client) 
succeeds and the other half fails (due to losing a replica in the middle of 
processing the batch)? 

Batches and streaming really don't make sense yet in SolrCloud other than for 
batch loading. We need to implement better, fine grained responses first. When 
that happens, it should all operate the same as single update per request.

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk to being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C

RE: [VOTE] Lucene/Solr 4.8.0 RC2

2014-04-28 Thread Uwe Schindler
Thanks for the release notes editing! I will now start to publish the web page.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Monday, April 28, 2014 12:01 AM
> To: dev@lucene.apache.org
> Subject: RE: [VOTE] Lucene/Solr 4.8.0 RC2
> 
> Hi,
> 
> the vote succeeded. I will now start to push the artifacts and sill send the
> release announcement tomorrow. It would be good to review the release
> notes before:
> => https://wiki.apache.org/lucene-java/ReleaseNote48
> => https://wiki.apache.org/solr/ReleaseNote48
> 
> Thanks to all for voting!
> Uwe
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> > -Original Message-
> > From: Uwe Schindler [mailto:u...@thetaphi.de]
> > Sent: Thursday, April 24, 2014 11:54 PM
> > To: dev@lucene.apache.org
> > Subject: [VOTE] Lucene/Solr 4.8.0 RC2
> >
> > Hi,
> >
> > I prepared a second release candidate of Lucene and Solr 4.8.0. The
> > artifacts can be found here:
> > =>
> > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-
> > RC2-rev1589874/
> >
> > This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and
> > LUCENE-5630.
> >
> > Please check the artifacts and give your vote in the next 72 hrs.
> >
> > Uwe
> >
> > P.S.: Here's my smoker command line:
> > $  JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55
> > python3.2 -u smokeTestRelease.py '
> > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC
> > 2-
> > rev1589874/' 1589874 4.8.0 tmp
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
> > additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983035#comment-13983035
 ] 

Mark Miller commented on SOLR-5468:
---

bq. for now it seems sufficient to let users decide how many replicas a write 
must succeed on to be considered successful.

I agree that that is the low hanging fruit. We just have to let the user know 
exactly what we are trying to promise.

bq. there would need to be some "backing out" work to remove an update that 
succeeded on the leader but failed on the replicas. 

Yup - that will be the hardest part of doing this how we would really like and 
a large reason it was punted on in all the initial work. Even if the leader 
didn't process the doc first (which is likely a doable optimization at some 
point), I still think it's really hard.

bq. Lastly, batches! What happens if half of a batch (sent by a client) 
succeeds and the other half fails (due to losing a replica in the middle of 
processing the batch)? 

Batches and streaming really don't make sense yet in SolrCloud other than for 
batch loading. We need to implement better, fine grained responses first. When 
that happens, it should all operate the same as single update per request.

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk to being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C failed, but would if only C did, leaving A & B online. Admittedly, 
> this lowers the write-availability of the system, so may be something that 
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we 
> have just not gotten to yet. I’ve always planned for a param that lets you 
> say how many replicas an update must be verified on before responding 
> success. Seems to make sense to fail that type of request early if you notice 
> there are not enough replicas up to satisfy the param to begin with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5981) Please change method visibility of getSolrWriter in DataImportHandler to public (or at least protected)

2014-04-28 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983033#comment-13983033
 ] 

James Dyer commented on SOLR-5981:
--

Shawn,

I think it's OK to commit, but fully implementing DIHWriter and letting the 
writers be truly pluggable is probably the best outcome.  This patch is 
easier to do, and what's the harm?  Should a future maintainer want to do it 
differently, it might not be backwards-compatible.  DIH is perpetually 
"experimental, subject to change", and I think the bar is low in this case.  And 
giving it a new use case, indexing a NoSQL db, might make it more attractive 
for someone to maintain this in the future.

> Please change method visibility of getSolrWriter in DataImportHandler to 
> public (or at least protected)
> ---
>
> Key: SOLR-5981
> URL: https://issues.apache.org/jira/browse/SOLR-5981
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 4.0
> Environment: Linux 3.13.9-200.fc20.x86_64
> Solr 4.6.0
>Reporter: Aaron LaBella
>Assignee: Shawn Heisey
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-5981.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I've been using the org.apache.solr.handler.dataimport.DataImportHandler for 
> a bit and it's an excellent model and architecture.  I'd like to extend the 
> usage of it to plugin my own DIHWriter, but, the code doesn't allow for it.  
> Please change ~line 227 in the DataImportHander class to be:
> public SolrWriter getSolrWriter
> instead of:
> private SolrWriter getSolrWriter
> or, at a minimum, protected, so that I can extend DataImportHandler and 
> override this method.
> Thank you *sincerely* in advance for the quick turn-around on this.  If the 
> change can be made in 4.6.0 and upstream, that'd be ideal.
> Thanks!
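The request above is the classic protected factory-method override. A minimal,
self-contained sketch of the pattern, using hypothetical stand-in classes (the
real DataImportHandler and SolrWriter have different signatures):

```java
// Hypothetical stand-ins; this illustrates only the visibility pattern,
// not the real Solr DataImportHandler API.
class Writer {
    String name() { return "solr"; }
}

class Handler {
    // SOLR-5981 asks for protected (today: private) so subclasses can override.
    protected Writer getWriter() {
        return new Writer();
    }

    final String describeWriter() {
        return getWriter().name();
    }
}

// With protected visibility, a user extension can swap in its own writer.
class NoSqlHandler extends Handler {
    @Override
    protected Writer getWriter() {
        return new Writer() {
            @Override String name() { return "nosql"; }
        };
    }
}
```

With private visibility the subclass method is not an override at all, so the
base class keeps calling its own writer; protected makes the hook part of the
extension surface.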






[jira] [Assigned] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter reassigned SOLR-5468:


Assignee: Timothy Potter

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk of being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C failed, but would if only C did, leaving A & B online. Admittedly, 
> this lowers the write-availability of the system, so may be something that 
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we 
> have just not gotten to yet. I’ve always planned for a param that lets you 
> say how many replicas an update must be verified on before responding 
> success. Seems to make sense to fail that type of request early if you notice 
> there are not enough replicas up to satisfy the param to begin with.






[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-04-28 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983016#comment-13983016
 ] 

Timothy Potter commented on SOLR-5468:
--

Starting to work on this ...

First, I think "majority quorum" is too strong for what we really need at the 
moment; for now it seems sufficient to let users decide how many replicas a 
write must succeed on to be considered successful. In other words, we can 
introduce a new, optional integer property when creating a new collection - 
minActiveReplicas (need a better name), which defaults to 1 (current behavior). 
If >1, then an update won't succeed unless it is ack'd by at least that many 
replicas. Activating this feature doesn't make much sense unless a collection 
has RF > 2.

The biggest hurdle to adding this behavior is the asynchronous / streaming 
based approach leaders use to forward updates on to replicas. The current 
implementation uses a callback error handler to deal with failed update 
requests (from leader to replica) and simply considers an update successful if 
it works on the leader. Part of the complexity is that the leader processes the 
update before even attempting to forward on to the replica so there would need 
to be some "backing out" work to remove an update that succeeded on the leader 
but failed on the replicas. This is starting to get messy ;-)

Another key point here is this feature simply moves the problem from the Solr 
server to the client application, i.e. it's a fail-faster approach where a 
client indexing app gets notified that writes are not succeeding on enough 
replicas to meet the desired threshold. The client application still has to 
decide what to do when writes fail. 

Lastly, batches! What happens if half of a batch (sent by a client) succeeds 
and the other half fails (due to losing a replica in the middle of processing 
the batch)? Another idea I had is maybe this isn't a collection-level property, 
maybe it is set on a per-request basis?

> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> --
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.5
> Environment: All
>Reporter: Timothy Potter
>Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using 
> in-sync replica sets, in which writes will continue to be accepted so long as 
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is 
> correct), SolrCloud only considers active/healthy replicas when acknowledging 
> a write. Specifically, when a shard leader accepts an update request, it 
> forwards the request to all active/healthy replicas and only considers the 
> write successful if all active/healthy replicas ack the write. Any down / 
> gone replicas are not considered and will sync up with the leader when they 
> come back online using peer sync or snapshot replication. For instance, if a 
> shard has 3 nodes, A, B, C with A being the current leader, then writes to 
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it 
> loses all of its replicas, then we have acknowledged updates on only 1 node. 
> If that node, call it A, then fails and one of the previous replicas, call it 
> B, comes back online before A does, then any writes that A accepted while the 
> other replicas were offline are at risk of being lost. 
> SolrCloud does provide a safe-guard mechanism for this problem with the 
> leaderVoteWait setting, which puts any replicas that come back online before 
> node A into a temporary wait state. If A comes back online within the wait 
> period, then all is well as it will become the leader again and no writes 
> will be lost. As a side note, sys admins definitely need to be made more 
> aware of this situation as when I first encountered it in my cluster, I had 
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will 
> not accept writes unless there is a majority of replicas available to accept 
> the write? For my example, under this approach, we wouldn't accept writes if 
> both B&C failed, but would if only C did, leaving A & B online. Admittedly, 
> this lowers the write-availability of the system, so may be something that 
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we 
> have just not gotten to yet. I’ve always planned for a param that lets you 
> say how many replicas an update must be verified on before responding 
> success. Seems to make sense to fail that type of request early if you notice 
> there are not enough replicas up to satisfy the param to begin with.

[jira] [Commented] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility

2014-04-28 Thread Aaron LaBella (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982999#comment-13982999
 ] 

Aaron LaBella commented on SOLR-6013:
-

NOTE: the patches can be applied from oldest to newest

> Fix method visibility of Evaluator, refactor DateFormatEvaluator for 
> extensibility
> --
>
> Key: SOLR-6013
> URL: https://issues.apache.org/jira/browse/SOLR-6013
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 4.7
>Reporter: Aaron LaBella
> Fix For: 4.9
>
> Attachments: 0001-add-getters-for-datemathparser.patch, 
> 0001-change-method-access-to-protected.patch, 
> 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> This is similar to issue 5981, the Evaluator class is declared as abstract, 
> yet the parseParams method is package private?  Surely this is an oversight, 
> as I wouldn't expect everyone writing their own evaluators to have to deal 
> with parsing the parameters.
> Similarly, I needed to refactor DateFormatEvaluator because I need to do some 
> custom date math/parsing and it wasn't written in a way that I can extend it.
> Please review/apply my attached patch to the next version of Solr, ie: 4.8 or 
> 4.9 if I must wait.
> Thanks!






[jira] [Updated] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility

2014-04-28 Thread Aaron LaBella (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron LaBella updated SOLR-6013:


Attachment: 0001-change-method-access-to-protected.patch

Thanks Shalin,

I'm attaching another patch that changes the method accessors to protected 
(instead of public) and marks the methods as lucene.experimental.  Let me know 
if there's anything else.  Otherwise, can you, or someone else, commit/push 
these patches into the 4.x branch so they make the next release?

Thanks

> Fix method visibility of Evaluator, refactor DateFormatEvaluator for 
> extensibility
> --
>
> Key: SOLR-6013
> URL: https://issues.apache.org/jira/browse/SOLR-6013
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 4.7
>Reporter: Aaron LaBella
> Fix For: 4.9
>
> Attachments: 0001-add-getters-for-datemathparser.patch, 
> 0001-change-method-access-to-protected.patch, 
> 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> This is similar to issue 5981, the Evaluator class is declared as abstract, 
> yet the parseParams method is package private?  Surely this is an oversight, 
> as I wouldn't expect everyone writing their own evaluators to have to deal 
> with parsing the parameters.
> Similarly, I needed to refactor DateFormatEvaluator because I need to do some 
> custom date math/parsing and it wasn't written in a way that I can extend it.
> Please review/apply my attached patch to the next version of Solr, ie: 4.8 or 
> 4.9 if I must wait.
> Thanks!






RE: [VOTE] Lucene/Solr 4.8.0 RC2

2014-04-28 Thread Uwe Schindler
Hi,

Could anybody help with the Solr release notes? Only minimal new features are 
listed at the moment, because I don’t have full insight into what changed. It 
might be good to add one or two more features.
https://wiki.apache.org/solr/ReleaseNote48

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Monday, April 28, 2014 12:01 AM
> To: dev@lucene.apache.org
> Subject: RE: [VOTE] Lucene/Solr 4.8.0 RC2
> 
> Hi,
> 
> the vote succeeded. I will now start to push the artifacts and will send the
> release announcement tomorrow. It would be good to review the release
> notes before:
> => https://wiki.apache.org/lucene-java/ReleaseNote48
> => https://wiki.apache.org/solr/ReleaseNote48
> 
> Thanks to all for voting!
> Uwe
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> > -Original Message-
> > From: Uwe Schindler [mailto:u...@thetaphi.de]
> > Sent: Thursday, April 24, 2014 11:54 PM
> > To: dev@lucene.apache.org
> > Subject: [VOTE] Lucene/Solr 4.8.0 RC2
> >
> > Hi,
> >
> > I prepared a second release candidate of Lucene and Solr 4.8.0. The
> > artifacts can be found here:
> > =>
> > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-
> > RC2-rev1589874/
> >
> > This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and
> > LUCENE-5630.
> >
> > Please check the artifacts and give your vote in the next 72 hrs.
> >
> > Uwe
> >
> > P.S.: Here's my smoker command line:
> > $  JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55
> > python3.2 -u smokeTestRelease.py '
> > http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC
> > 2-
> > rev1589874/' 1589874 4.8.0 tmp
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
> > additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-04-28 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982931#comment-13982931
 ] 

Shai Erera commented on LUCENE-5618:


If we separate each DV update into its own file, I think we will need to track 
another gen in SegmentCommitInfo: deletes, fieldInfos and dvUpdates. Though 
each FI writes its dvGen in the FIS file, we need to know from where to 
increment the gen for the next update. This isn't a big deal, just adds 
complexity to SCI (4 methods and index format change).

But why do you think that it's wrong to write 2 fields and then at read time 
ask to provide only 1 field? I.e. what if the Codecs API was more "lazy", or a 
Codec wants to implement lazy loading of even just the metadata?

Passing all the fields a Codec wrote, e.g. in the {{gen=-1}} case, even though 
none of them is going to be used because they were all updated in later 
gens, seems awkward to me as well.

What sort of index corruption does this check detect? As I see it, the Codec 
gets a subset of the fields that it already wrote. It's worse if it gets a 
superset of those fields, because you don't know e.g. if there are perhaps 
missing fields that disappeared from the file system.

> DocValues updates send wrong fieldinfos to codec producers
> --
>
> Key: LUCENE-5618
> URL: https://issues.apache.org/jira/browse/LUCENE-5618
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't 
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write 
> "batches" of fields in updates but just have only one field per gen? 
> This removes many-many relationships and would make things easy to understand.






Re: [VOTE] Lucene/Solr 4.8.0 RC2

2014-04-28 Thread Michael McCandless
OK, I made some small edits to Lucene's release notes.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Apr 28, 2014 at 6:20 AM, Michael McCandless
 wrote:
> I'd like to make some minor edits to the Lucene release notes ... but
> I can't login (http://status.apache.org shows some problem).  I'll try
> a bit later ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sun, Apr 27, 2014 at 6:00 PM, Uwe Schindler  wrote:
>> Hi,
>>
>> the vote succeeded. I will now start to push the artifacts and will send the 
>> release announcement tomorrow. It would be good to review the release notes 
>> before:
>> => https://wiki.apache.org/lucene-java/ReleaseNote48
>> => https://wiki.apache.org/solr/ReleaseNote48
>>
>> Thanks to all for voting!
>> Uwe
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>>> -Original Message-
>>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>>> Sent: Thursday, April 24, 2014 11:54 PM
>>> To: dev@lucene.apache.org
>>> Subject: [VOTE] Lucene/Solr 4.8.0 RC2
>>>
>>> Hi,
>>>
>>> I prepared a second release candidate of Lucene and Solr 4.8.0. The 
>>> artifacts
>>> can be found here:
>>> => http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-
>>> RC2-rev1589874/
>>>
>>> This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and
>>> LUCENE-5630.
>>>
>>> Please check the artifacts and give your vote in the next 72 hrs.
>>>
>>> Uwe
>>>
>>> P.S.: Here's my smoker command line:
>>> $  JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55
>>> python3.2 -u smokeTestRelease.py '
>>> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC2-
>>> rev1589874/' 1589874 4.8.0 tmp
>>>
>>> -
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>>> commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>




Re: [VOTE] Lucene/Solr 4.8.0 RC2

2014-04-28 Thread Michael McCandless
I'd like to make some minor edits to the Lucene release notes ... but
I can't login (http://status.apache.org shows some problem).  I'll try
a bit later ...

Mike McCandless

http://blog.mikemccandless.com


On Sun, Apr 27, 2014 at 6:00 PM, Uwe Schindler  wrote:
> Hi,
>
> the vote succeeded. I will now start to push the artifacts and will send the 
> release announcement tomorrow. It would be good to review the release notes 
> before:
> => https://wiki.apache.org/lucene-java/ReleaseNote48
> => https://wiki.apache.org/solr/ReleaseNote48
>
> Thanks to all for voting!
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>> Sent: Thursday, April 24, 2014 11:54 PM
>> To: dev@lucene.apache.org
>> Subject: [VOTE] Lucene/Solr 4.8.0 RC2
>>
>> Hi,
>>
>> I prepared a second release candidate of Lucene and Solr 4.8.0. The artifacts
>> can be found here:
>> => http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-
>> RC2-rev1589874/
>>
>> This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and
>> LUCENE-5630.
>>
>> Please check the artifacts and give your vote in the next 72 hrs.
>>
>> Uwe
>>
>> P.S.: Here's my smoker command line:
>> $  JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55
>> python3.2 -u smokeTestRelease.py '
>> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC2-
>> rev1589874/' 1589874 4.8.0 tmp
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>> commands, e-mail: dev-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-04-28 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982888#comment-13982888
 ] 

Robert Muir commented on LUCENE-5618:
-

{quote}
Write each updated field in its own gen – if you update many fields, many 
times, this will create many files in the index directory. Technically it's not 
"wrong", it just looks weird
{quote}

Why? This is how separate norms worked. It's the obvious solution. The current 
behavior is broken: let's fix the bug. This optimization is what is to blame. 
The optimization is invalid.

{quote}
Anyway, I think the issue's title is wrong – DocValues updates do pass the 
correct fieldInfos to the producers. They pass only the infos that the producer 
should care about, and we see that passing too many is wrong (PerFieldDVF).
{quote}

Absolutely not! You get a different fieldinfos at _read_ time than you get at 
_write_. This is broken!

> DocValues updates send wrong fieldinfos to codec producers
> --
>
> Key: LUCENE-5618
> URL: https://issues.apache.org/jira/browse/LUCENE-5618
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't 
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write 
> "batches" of fields in updates but just have only one field per gen? 
> This removes many-many relationships and would make things easy to understand.






Re: Posting list

2014-04-28 Thread Michael McCandless
Postings are more like a SortedMap, i.e. terms are
binary, and are in sorted order.

Are you referring to the indexing chain classes, e.g.
FreqProxTermsWriterPerField.FreqProxPostingsArray?  That class holds
the postings in IndexWriter's RAM buffer until it's time to write them
to disk as a new segment.  Those data structures are somewhat
confusing, but once they are written to disk and opened for reading
they are exposed via the FieldsProducer API.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Apr 28, 2014 at 12:33 AM, fabric fabricio
 wrote:
> Can you explain how the dictionary is linked with this implementation of
> posting lists? In the traditional case we have a dictionary like
> hashmap[String, List(int,int)] // word -> docid, termfreq. In this case the
> dictionary points to "parallel arrays" slots, and the "pointer array"
> points to the most recent docid in the posting list. What does "to search the
> posting list" mean here, i.e. how does this map to the List(int,int) part?
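To make the parallel-arrays idea concrete: the dictionary maps a term to a slot
id, and per-slot arrays hold that term's most recent docID and in-doc
frequency. A toy sketch only, nothing like the real FreqProxPostingsArray
internals (which grow their arrays and chain byte slices holding the full
posting list):

```java
import java.util.HashMap;
import java.util.Map;

// Toy dictionary -> parallel-array postings; illustrative only.
class ToyPostings {
    final Map<String, Integer> termToSlot = new HashMap<>(); // the "dictionary"
    final int[] lastDocId = new int[16]; // per-slot: most recent docID for the term
    final int[] termFreq  = new int[16]; // per-slot: term frequency in that doc
    private int nextSlot = 0;

    void add(String term, int docId) {
        Integer slot = termToSlot.get(term);
        if (slot == null) {                    // first time we see the term
            slot = nextSlot++;
            termToSlot.put(term, slot);
            lastDocId[slot] = docId;
            termFreq[slot] = 1;
        } else if (lastDocId[slot] == docId) { // same doc again: bump freq
            termFreq[slot]++;
        } else {                               // term appears in a new doc
            lastDocId[slot] = docId;
            termFreq[slot] = 1;
        }
    }
}
```

The List(int,int) part of the traditional model is what eventually gets written
to disk per term; in the RAM buffer only the head (most recent docID) lives in
the pointer array.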




[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-04-28 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982853#comment-13982853
 ] 

Shai Erera commented on LUCENE-5618:


I modified the code to pass all the FIs to the codec, no matter the gen, and 
tests fail with FileNotFoundException. The reason is that PerFieldDVF tries to 
open DVPs e.g. of {{gen=1}} of all fields, whether they were written in that 
gen or not, which leads to the FNFE. I am not sure that we can pass all FIs to 
the Codec that way ... so our options are:

* Pass all the fields that were written in a gen (whether we need them or not) 
-- this does not make sense to me, as we'll need to track it somewhere, and it 
seems a waste
* Add leniency in the form of "here are the fields you should care about" -- 
this makes the codec partially updates aware, but I don't think it's a bad idea
* Write each updated field in its own gen -- if you update many fields, many 
times, this will create many files in the index directory. Technically it's not 
"wrong", it just looks weird
* Remain w/ the current code's corruption detection if the read fieldNumber < 0

Anyway, I think the issue's title is wrong -- DocValues updates *do* pass the 
correct fieldInfos to the producers. They pass only the infos that the producer 
should care about, and we see that passing too many is wrong (PerFieldDVF).

I will think about it more. If you see other alternatives, feel free to propose 
them.
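For the "each updated field in its own gen" option, the bookkeeping reduces to
a per-field gen map plus a counter for the next gen. A hypothetical sketch
(names invented, not Lucene's SegmentCommitInfo API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-field docvalues generation tracking; illustrative only.
class DvGenTracker {
    private final Map<String, Long> fieldToGen = new HashMap<>();
    private long nextGen = 1; // the extra gen counter SCI would have to track

    /** Each field update gets its own gen, i.e. its own file. */
    long recordUpdate(String field) {
        long gen = nextGen++;
        fieldToGen.put(field, gen);
        return gen;
    }

    /** A reader resolves exactly one file per field, by its latest gen. */
    String fileFor(String segment, String field) {
        Long gen = fieldToGen.get(field);
        return gen == null ? null : segment + "_" + field + "_" + gen + ".dvd";
    }
}
```

The cost is many small files after many updates; the benefit is that a reader
never tries to open a gen that a field was not written in, which is the
FileNotFoundException scenario described above.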

> DocValues updates send wrong fieldinfos to codec producers
> --
>
> Key: LUCENE-5618
> URL: https://issues.apache.org/jira/browse/LUCENE-5618
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't 
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write 
> "batches" of fields in updates but just have only one field per gen? 
> This removes many-many relationships and would make things easy to understand.






[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

2014-04-28 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5681:
---

Attachment: SOLR-5681.patch

Added a test for running parallel tasks (multiple collection creation and 
split). Seems like there's some issue fetching new tasks from the queue.
Working on resolving the issue.

> Make the OverseerCollectionProcessor multi-threaded
> ---
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
> Attachments: SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single threaded i.e submitting 
> anything long running would have it block processing of other mutually 
> exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have 
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and 
> thereby, not processing a create collection task (which would stay queued in 
> zk) though both the tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An 
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
> The task from the workQueue is only removed on completion so that in case of 
> a failure, the new Overseer can re-consume the same task and retry. A queue 
> is not the right data structure in the first place to look ahead i.e. get the 
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks 
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting 
> and passing the request to a new thread (or one from the pool). The parent 
> method uses a peekAfter(last element) instead of a peek(). The peekAfter 
> returns the task after the 'last element'. Maintain this request information 
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also 
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as 
> soon as a task from this is picked up for processing by the thread, it's 
> removed from the queue. At the end, the cleanup is done from the workQueue.
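The "one task per collection at a time" rule from the description fits in a few
lines. This is a hypothetical sketch of the exclusivity check only, with none
of the ZK queue handling:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of per-collection mutual exclusivity for parallel OCP tasks.
class CollectionTaskGate {
    private final Set<String> inFlight = new HashSet<>();

    /** A task may start only if its collection has no task running. */
    synchronized boolean tryStart(String collection) {
        return inFlight.add(collection);
    }

    /** Called when the task completes (success or failure). */
    synchronized void finish(String collection) {
        inFlight.remove(collection);
    }
}
```

A task that fails tryStart stays queued, which is where the peekAfter()-style
look-ahead on the work queue comes in.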






[jira] [Created] (SOLR-6026) Also check work-queue while processing a REQUESTSTATUS Collection API Call

2014-04-28 Thread Anshum Gupta (JIRA)
Anshum Gupta created SOLR-6026:
--

 Summary: Also check work-queue while processing a REQUESTSTATUS 
Collection API Call
 Key: SOLR-6026
 URL: https://issues.apache.org/jira/browse/SOLR-6026
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.8
Reporter: Anshum Gupta
 Fix For: 4.8.1


REQUESTSTATUS API call should check for the following:
* work-queue (submitted task)
* running-map (running task/in progress)
* completed-map
* failure-map

Right now it checks everything but the work-queue. Add that.
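The full lookup order would then look roughly like this; a hypothetical sketch
with plain maps standing in for the real ZK-backed structures:

```java
import java.util.Map;

// Hypothetical sketch of the REQUESTSTATUS lookup order; not actual Solr code.
class RequestStatusLookup {
    static String lookup(String taskId,
                         Map<String, ?> workQueue,    // the check this issue adds
                         Map<String, ?> runningMap,
                         Map<String, ?> completedMap,
                         Map<String, ?> failureMap) {
        if (workQueue.containsKey(taskId))    return "submitted";
        if (runningMap.containsKey(taskId))   return "running";
        if (completedMap.containsKey(taskId)) return "completed";
        if (failureMap.containsKey(taskId))   return "failed";
        return "notfound";
    }
}
```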


