[jira] [Commented] (LUCENE-1853) SubPhraseQuery for matching and scoring sub phrase matches.

2013-04-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631277#comment-13631277
 ] 

Shalin Shekhar Mangar commented on LUCENE-1853:
---

Erick, SubPhraseQuery was written by Preetam for AOL Real Estate search. AFAIK, 
no one is working actively on it.

 SubPhraseQuery for matching and scoring sub phrase matches.
 ---

 Key: LUCENE-1853
 URL: https://issues.apache.org/jira/browse/LUCENE-1853
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
 Environment: Lucene/Java
Reporter: Preetam Rao
Priority: Minor
 Attachments: LUCENE-1853.patch, LUCENE-1853.patch


 The goal is to give more control via configuration when searching using 
 user-entered queries against multiple fields where sub phrases have special 
 significance.
 For a query like "homes in new york with swimming pool", if a document's 
 field matches only "new york", it should get scored, and it should get scored 
 higher than two separate matches on "new" and "york". Also, a 3-word sub 
 phrase match must get scored considerably higher than a 2-word sub phrase 
 match (the boost factor should be configurable).
 Using shingles for this use case means each field of each document needs to 
 be indexed as shingles of all (1..N)-grams, as does the query. (Please 
 correct me if I am wrong.)
 The query could also support:
 - ignoring idf and/or field norms (so that factors outside the document 
 don't influence scoring)
 - considering only the longest match (for example, a match on "new york" is 
 scored and considered rather than "new furniture" and "york city")
 - ignoring duplicates ("new york" appearing twice or thrice does not make any 
 difference)
 This kind of query could be combined with a DisMax query. For example, 
 something like Solr's dismax request handler can be made to use this query, 
 where we run a user query as is against all fields and configure each 
 field with the above configurations.
 I have also attached a patch with comments and test cases in case my 
 description is not clear enough. I would appreciate alternatives or feedback. 
 Example Usage:
 {code}
 // sub phrase config
 SubPhraseQuery.SubPhraseConfig conf = new SubPhraseQuery.SubPhraseConfig();
 conf.ignoreIdf = true;
 conf.ignoreFieldNorms = true;
 conf.matchOnlyLongest = true;
 conf.ignoreDuplicates = true;
 conf.phraseBoost = 2;
 // phrase query as usual
 SubPhraseQuery pq = new SubPhraseQuery();
 pq.add(new Term(f, term));
 pq.add(new Term(f, term));
 pq.setSubPhraseConf(conf);
 Hits hits = searcher.search(pq);
 {code}
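 The scoring behaviour described above can be illustrated without Lucene. The following is only a sketch under stated assumptions: the class SubPhraseSketch, its methods, and the rule "score = phraseBoost^(length - 1)" are hypothetical and are not taken from the attached patch. It finds the longest contiguous run of query terms in a field, so duplicates and shorter matches do not add to the score.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not the code from LUCENE-1853.patch.
final class SubPhraseSketch {

    /** Length (in terms) of the longest run of consecutive query terms that also occurs consecutively in the field. */
    static int longestSubPhrase(List<String> query, List<String> field) {
        int best = 0;
        for (int q = 0; q < query.size(); q++) {
            for (int f = 0; f < field.size(); f++) {
                int len = 0;
                while (q + len < query.size() && f + len < field.size()
                        && query.get(q + len).equals(field.get(f + len))) {
                    len++;
                }
                best = Math.max(best, len);
            }
        }
        return best;
    }

    /** Assumed scoring rule: 0 for no match, otherwise phraseBoost^(length - 1) for the longest match only. */
    static double score(List<String> query, List<String> field, double phraseBoost) {
        int len = longestSubPhrase(query, field);
        return len == 0 ? 0.0 : Math.pow(phraseBoost, len - 1);
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("homes", "in", "new", "york", "with", "swimming", "pool");
        // A field containing the sub phrase "new york".
        System.out.println(score(query, Arrays.asList("new", "york"), 2.0));
        // A field containing "new" and "york" only as separate terms.
        System.out.println(score(query, Arrays.asList("new", "furniture", "york", "city"), 2.0));
    }
}
```

 With a boost of 2, the field containing "new york" scores higher than the field where "new" and "york" only appear apart, which is the ordering the description asks for.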

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (64bit/jrockit-jdk1.6.0_37-R28.2.5-4.1.0) - Build # 5107 - Failure!

2013-04-14 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5107/
Java: 64bit/jrockit-jdk1.6.0_37-R28.2.5-4.1.0 

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestPostingsOffsets.testBackwardsOffsets

Error Message:


Stack Trace:
java.lang.AssertionError
    at __randomizedtesting.SeedInfo.seed([67F0CFA017E4A011:12E027E7A202D16C]:0)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.writeOffsets(FreqProxTermsWriterPerField.java:158)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm(FreqProxTermsWriterPerField.java:242)
    at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:235)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:165)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:252)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:256)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1486)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1161)
    at org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:152)
    at org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:115)
    at org.apache.lucene.index.TestPostingsOffsets.checkTokens(TestPostingsOffsets.java:492)
    at org.apache.lucene.index.TestPostingsOffsets.testBackwardsOffsets(TestPostingsOffsets.java:437)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:738)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:774)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:683)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:44)
    at 

RE: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jrockit-jdk1.6.0_37-R28.2.5-4.1.0) - Build # 5107 - Failure!

2013-04-14 Thread Uwe Schindler
It looks like the latest JRockit is still buggy. I re-added the -XnoOpt JVM 
setting.

This finally tells us that one should never use JRockit together with Lucene, 
otherwise corrupt indexes will be your daily business.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



[jira] [Resolved] (LUCENE-1853) SubPhraseQuery for matching and scoring sub phrase matches.

2013-04-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-1853.


Resolution: Won't Fix

SPRING_CLEANING_2013 JIRA.

OK, we'll close this given Shalin's comment.




[jira] [Commented] (LUCENE-4933) SweetSpotSimilarity doesnt override tf(float)

2013-04-14 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631308#comment-13631308
 ] 

Robert Muir commented on LUCENE-4933:
-

{quote}
Historically, the value of having both tf(int) and tf(float) was that people 
could choose to implement alternative functions for dealing with phrase 
frequency (using tf(float)) vs. single-term queries (using tf(int)).
{quote}

There is no value in having different functions here: only the possibility of 
bugs.

 SweetSpotSimilarity doesnt override tf(float)
 -

 Key: LUCENE-4933
 URL: https://issues.apache.org/jira/browse/LUCENE-4933
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/query/scoring
Affects Versions: 2.0.0
Reporter: Robert Muir
 Attachments: LUCENE-4933.patch


 This means its scoring is not really right: it only applies to term queries 
 and exact phrase queries, but not, e.g., sloppy phrase queries and spans.
 As far as I can tell, it has had this bug all along.
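 The failure mode can be shown with two toy classes. This is a sketch only: BaseTfSketch and IntOnlyTfSketch are hypothetical names and do not reproduce Lucene's similarity classes. It assumes a base class whose tf(int) delegates to tf(float), and a subclass that overrides only the int overload, so callers that pass a float frequency silently get the base behaviour.

```java
// Hypothetical stand-ins; these are not Lucene's Similarity classes.
class BaseTfSketch {
    /** Integer frequencies delegate to the float overload by default. */
    float tf(int freq) {
        return tf((float) freq);
    }

    /** Default term-frequency curve, assumed here to be sqrt(freq). */
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

class IntOnlyTfSketch extends BaseTfSketch {
    /** Custom flat curve, applied only when the caller passes an int. */
    @Override
    float tf(int freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
    // tf(float) is NOT overridden, so float callers still get sqrt(freq).

    public static void main(String[] args) {
        IntOnlyTfSketch sim = new IntOnlyTfSketch();
        System.out.println(sim.tf(4));    // custom curve
        System.out.println(sim.tf(4.0f)); // inherited default curve
    }
}
```

 The same frequency yields two different values depending on the overload the caller happens to use, which is the kind of inconsistency a single tf function would rule out.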




[jira] [Resolved] (LUCENE-1372) Proposal: introduce more sensible sorting when a doc has multiple values for a term

2013-04-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-1372.


Resolution: Won't Fix

SPRING_CLEANING_2013 JIRAS. Defining the sort order on MV fields has always 
seemed like one of those features that is more trouble than it's worth. One can 
define a predictable order, but its usefulness to the user is questionable.

 Proposal: introduce more sensible sorting when a doc has multiple values for 
 a term
 ---

 Key: LUCENE-1372
 URL: https://issues.apache.org/jira/browse/LUCENE-1372
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 2.3.2
Reporter: Paul Cowan
Priority: Minor
 Attachments: LUCENE-1372-MultiValueSorters.patch, 
 lucene-multisort.patch


 At the moment, FieldCacheImpl produces somewhat disconcerting values when 
 sorting on a field for which multiple values exist for one document. For 
 example, imagine a field "fruit" which is added to a document multiple times, 
 with the values as follows:
 doc 1: {apple}
 doc 2: {banana}
 doc 3: {apple, banana}
 doc 4: {apple, zebra}
 If one sorts on the field "fruit", the loop in 
 FieldCacheImpl.stringsIndexCache.createValue() (and similarly for the other 
 methods in the various FieldCacheImpl caches) does the following:
   while (termDocs.next()) {
     retArray[termDocs.doc()] = t;
   }
 which means that we loop over the terms in their natural order and, on each 
 one, overwrite retArray[doc] with the value for each document with that term. 
 Effectively, this overwriting means that a string sort in this circumstance 
 will sort by the LAST term lexicographically, so the docs above will 
 effectively be sorted as if they had the single values (apple, banana, 
 banana, zebra), which is nonintuitive. Changing this to sort on the first 
 term in the TermEnum seems relatively trivial and low-overhead; while it's 
 not perfect (it's not locale-aware, for example), the behaviour seems much 
 more sensible to me. Interested to see what people think.
 Patch to follow.
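 The overwrite behaviour and the proposed alternative can be reproduced with plain collections. This is a simplified simulation, not Lucene code: MultiValueSortSketch, sortKeys and the lastWins flag are hypothetical, a TreeMap stands in for the sorted term enumeration, and document ids are zero-based (doc 1 above is index 0).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for the FieldCacheImpl loop; not Lucene code.
final class MultiValueSortSketch {

    /**
     * Walks terms in sorted order and fills one sort key per document.
     * lastWins=true mimics the unconditional overwrite; false keeps the first term seen.
     */
    static String[] sortKeys(TreeMap<String, List<Integer>> postings, int maxDoc, boolean lastWins) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            for (int doc : e.getValue()) {
                if (lastWins || retArray[doc] == null) {
                    retArray[doc] = e.getKey();
                }
            }
        }
        return retArray;
    }

    /** The four example documents, expressed as term -> document ids. */
    static TreeMap<String, List<Integer>> examplePostings() {
        TreeMap<String, List<Integer>> postings = new TreeMap<>();
        postings.put("apple", Arrays.asList(0, 2, 3));
        postings.put("banana", Arrays.asList(1, 2));
        postings.put("zebra", Arrays.asList(3));
        return postings;
    }

    public static void main(String[] args) {
        // Overwriting: prints [apple, banana, banana, zebra]
        System.out.println(Arrays.toString(sortKeys(examplePostings(), 4, true)));
        // First term kept: prints [apple, banana, apple, apple]
        System.out.println(Arrays.toString(sortKeys(examplePostings(), 4, false)));
    }
}
```

 With the first term kept, docs 1, 3 and 4 all sort under "apple", which matches the intuition that a document containing "apple" should not sort after one containing only "banana".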




[jira] [Commented] (SOLR-4451) Upgrade to httpclient 4.2.x and take advantage of SystemDefaultHttpClient

2013-04-14 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631321#comment-13631321
 ] 

Ken Krugler commented on SOLR-4451:
---

One of my developers also ran into what seems like the same issue when trying 
to use the embedded Solr server. It worked fine in unit tests but gets a 
NoSuchMethodError when running on the Hadoop cluster, whereas 4.1 works fine. 
I'm assuming a classpath issue with some Hadoop jars, so more research is 
needed, but I wanted to add to the discussion above.

 Upgrade to httpclient 4.2.x and take advantage of SystemDefaultHttpClient
 -

 Key: SOLR-4451
 URL: https://issues.apache.org/jira/browse/SOLR-4451
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: SOLR-4451.patch


 HttpComponents is up to version 4.2, and included in 4.2 is a new subclass of 
 DefaultHttpClient named SystemDefaultHttpClient, which automatically 
 configures itself using the standard Java system properties...
 http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/client/SystemDefaultHttpClient.html
 ...I think we should upgrade and start using this new class in place of 
 DefaultHttpClient, so that SolrJ clients (and implicitly SolrCloud) can 
 automatically leverage system properties users may expect to work.




[jira] [Commented] (SOLR-4662) Finalize what we're going to do with solr.xml, auto-discovery, config sets.

2013-04-14 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631323#comment-13631323
 ] 

Erick Erickson commented on SOLR-4662:
--

[~markrmil...@gmail.com] Never mind. I'm not quite sure why I'm seeing what I'm 
seeing, but it's pretty clear that I screwed up the code. Told you it would be 
clear after sleeping. Looking into it.

 Finalize what we're going to do with solr.xml, auto-discovery, config sets.
 ---

 Key: SOLR-4662
 URL: https://issues.apache.org/jira/browse/SOLR-4662
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
 Attachments: SOLR-4662.patch


 Spinoff from SOLR-4615, breaking it out here so we can address the changes in 
 pieces.




[jira] [Commented] (SOLR-4451) Upgrade to httpclient 4.2.x and take advantage of SystemDefaultHttpClient

2013-04-14 Thread Ken Krugler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631326#comment-13631326
 ] 

Ken Krugler commented on SOLR-4451:
---

One related question - if I'm using embedded Solr, is there some way to load a 
core without initShardHandler() being called? From looking at the code, that's 
not possible, but seems like something that should be configurable (similar to 
how ZooKeeper support is conditional on some system properties). 




[jira] [Resolved] (SOLR-1416) reduce contention in CoreContainer#getCore()

2013-04-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-1416.
--

Resolution: Won't Fix

SPRING_CLEANING_2013 JIRA.
We haven't really seen any evidence that this is a problem. As it happens once 
per search request I'm going to defer.

 reduce contention in CoreContainer#getCore()
 

 Key: SOLR-1416
 URL: https://issues.apache.org/jira/browse/SOLR-1416
 Project: Solr
  Issue Type: Improvement
  Components: multicore
Reporter: Noble Paul
Priority: Minor
 Attachments: SOLR-1416.patch


 Every call to CoreContainer#getCore() is synchronized. We should reduce the 
 contention. The writes are very infrequent and reads are frequent. How 
 about using a ReadWriteLock?
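 The idea can be sketched with java.util.concurrent.locks.ReentrantReadWriteLock. This is a minimal illustration, not the attached patch and not Solr's CoreContainer: CoreRegistrySketch and its methods are hypothetical, and it assumes that looking up a core only needs a map read, which may not hold for the real method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical registry illustrating the locking pattern; not Solr's CoreContainer.
final class CoreRegistrySketch<C> {
    private final Map<String, C> cores = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Frequent path: many readers may hold the read lock at the same time. */
    C getCore(String name) {
        lock.readLock().lock();
        try {
            return cores.get(name);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Rare path: the write lock excludes all readers and other writers. */
    void register(String name, C core) {
        lock.writeLock().lock();
        try {
            cores.put(name, core);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

 Whether this beats a plain synchronized block depends on how long the lock is held and how many threads contend for it, so it would need benchmarking under a realistic load.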




New-style solr.xml

2013-04-14 Thread Erick Erickson
I've started a new page here:

It's completely rudimentary, but I wanted to get it started so as many
eyes as possible can get on it. See:
http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond

Question:
With the new style, does allowing cores.properties to specify an
alternate instanceDir make any sense? I don't think so.

What about the sub-properties file (the 'properties' property)?

Erick




[jira] [Commented] (SOLR-1416) reduce contention in CoreContainer#getCore()

2013-04-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631334#comment-13631334
 ] 

Shalin Shekhar Mangar commented on SOLR-1416:
-

Noble and I worked on LotsOfCores for AOL which had more than 20K cores per 
Solr instance. The top three factors slowing down solr for such use-cases were:
# Opening IndexSearcher (our use-case had a 10:1 write/read ratio) - solved by 
opening searcher lazily
# Loading/parsing IndexSchema and SolrConfig objects - solved by caching these 
objects
# Contention in CoreContainer#getCore - solved by the attached patch

At this later date, I don't have the data to support this change but it is 
worth benchmarking if someone is up for it.




[jira] [Created] (SOLR-4711) Fix Java 8 bugs in Solr Cloud tests

2013-04-14 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-4711:
---

 Summary: Fix Java 8 bugs in Solr Cloud tests
 Key: SOLR-4711
 URL: https://issues.apache.org/jira/browse/SOLR-4711
 Project: Solr
  Issue Type: Bug
Reporter: Uwe Schindler
Priority: Critical


Some tests in Solr's cloud package always fail on Java 8. We should fix them 
before Java 8 comes out because it's not a good idea to release a Solr 
distribution with a test that always fails.




[jira] [Updated] (SOLR-4553) BasicDistributedZk2Test common failuire on jenkins cluster: status:404, message:Can not find: /onenodecollectioncore/update

2013-04-14 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-4553:


Issue Type: Sub-task  (was: Bug)
Parent: SOLR-4711

 BasicDistributedZk2Test common failuire on jenkins cluster: status:404, 
 message:Can not find: /onenodecollectioncore/update
 -

 Key: SOLR-4553
 URL: https://issues.apache.org/jira/browse/SOLR-4553
 Project: Solr
  Issue Type: Sub-task
  Components: Tests
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0







[jira] [Commented] (SOLR-4553) BasicDistributedZk2Test common failuire on jenkins cluster: status:404, message:Can not find: /onenodecollectioncore/update

2013-04-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631353#comment-13631353
 ] 

Mark Miller commented on SOLR-4553:
---

I have actually seen this at least once not on java 8 - so I think some java 8 
changes are just making some issue *much* more noticeable.

I did some investigation, but I guess I responded on the dev list and forgot to 
update this ticket.

I'll have to dig up what I wrote, but basically it seems like sometimes the 
SolrDispatchFilter is either not getting started properly, or is taking forever 
to get set properly (adding additional sleeps as a hack didn't seem to help).




[jira] [Created] (SOLR-4712) org.apache.solr.cloud.UnloadDistributedZkTest.testDistribSearch fails in Java 8 almost always

2013-04-14 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-4712:
---

 Summary: 
org.apache.solr.cloud.UnloadDistributedZkTest.testDistribSearch fails in Java 8 
almost always
 Key: SOLR-4712
 URL: https://issues.apache.org/jira/browse/SOLR-4712
 Project: Solr
  Issue Type: Sub-task
Reporter: Uwe Schindler


see parent issue




[jira] [Created] (SOLR-4713) org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch fails in Java 8 almost always

2013-04-14 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-4713:
---

 Summary: 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch fails 
in Java 8 almost always
 Key: SOLR-4713
 URL: https://issues.apache.org/jira/browse/SOLR-4713
 Project: Solr
  Issue Type: Sub-task
Reporter: Uwe Schindler


see parent issue




Re: New-style solr.xml

2013-04-14 Thread Mark Miller

On Apr 14, 2013, at 11:42 AM, Erick Erickson erickerick...@gmail.com wrote:

 I've started a new page here:
 
 It's completely rudimentary, but I wanted to get it started so as many
 eyes as possible can get on it. See:
 http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond

What happened to the SolrCloud section? I think that's a very helpful division.

 
 Question:
 With the new style, does allowing cores.properties to specify an
 alternate instanceDir make any sense? I don't think so.

Right, I don't either. These will be autodiscovered from a root location, not 
specified or overridable individually.

 
 What about the sub-properties file ('properties' property).

I'd have to look to comment. Not familiar with that one.

- Mark

 
 Erick
 



[jira] [Reopened] (SOLR-1416) reduce contention in CoreContainer#getCore()

2013-04-14 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reopened SOLR-1416:
---


 reduce contention in CoreContainer#getCore()
 

 Key: SOLR-1416
 URL: https://issues.apache.org/jira/browse/SOLR-1416
 Project: Solr
  Issue Type: Improvement
  Components: multicore
Reporter: Noble Paul
Priority: Minor
 Attachments: SOLR-1416.patch


 Every call to CoreContainer#getCore() is synchronized. We should reduce the 
 contention. The writes are very infrequent and reads are frequent. How 
 about using a ReadWriteLock?
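
A minimal sketch of the locking shape suggested above, assuming a plain map-backed registry. CoreRegistry and its methods are hypothetical stand-ins for illustration, not the real CoreContainer and not the attached patch; ReentrantReadWriteLock is the standard JDK implementation of ReadWriteLock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a core registry; not the real CoreContainer.
class CoreRegistry<C> {
  private final Map<String, C> cores = new HashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  // Frequent path: any number of readers may hold the read lock at once.
  C getCore(String name) {
    lock.readLock().lock();
    try {
      return cores.get(name);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Infrequent path: the write lock excludes all readers and other writers.
  C register(String name, C core) {
    lock.writeLock().lock();
    try {
      return cores.put(name, core);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

The real getCore() may do more than a map lookup, so this only shows where the read and write locks would sit. Whether it beats a synchronized block depends on how long the lock is held and how many threads contend for it.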




[jira] [Commented] (SOLR-1416) reduce contention in CoreContainer#getCore()

2013-04-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631359#comment-13631359
 ] 

Mark Miller commented on SOLR-1416:
---

I reopened. This remains a good issue.

 reduce contention in CoreContainer#getCore()
 

 Key: SOLR-1416
 URL: https://issues.apache.org/jira/browse/SOLR-1416
 Project: Solr
  Issue Type: Improvement
  Components: multicore
Reporter: Noble Paul
Priority: Minor
 Attachments: SOLR-1416.patch


 Every call to CoreContainer#getCore() is synchronized. We should reduce the 
 contention. The writes are very infrequent and reads are frequent. How 
 about using a ReadWriteLock?




[jira] [Commented] (SOLR-4708) Enable ClusteringComponent by default

2013-04-14 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631382#comment-13631382
 ] 

Dawid Weiss commented on SOLR-4708:
---

The patch looks good to me. Thanks Erik.

 Enable ClusteringComponent by default
 -

 Key: SOLR-4708
 URL: https://issues.apache.org/jira/browse/SOLR-4708
 Project: Solr
  Issue Type: Bug
Reporter: Erik Hatcher
Priority: Minor
 Attachments: SOLR-4708.patch, SOLR-4708.patch


 In the past, the ClusteringComponent used to rely on 3rd party JARs not 
 available from a Solr distro.  This is no longer the case, but the /browse UI 
 and other references still had the clustering component disabled in the 
 example with an awkward system property way to enable it.  Let's remove all 
 of that unnecessary stuff and just enable it as it works out of the box now.




[jira] [Commented] (SOLR-3755) shard splitting

2013-04-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631384#comment-13631384
 ] 

Shalin Shekhar Mangar commented on SOLR-3755:
-

bq. Anshum suggested over chat that we should think about combining 
ShardSplitTest and ChaosMonkeyShardSplit tests into one to avoid code 
duplication. I'll try to see if we can do that.

I've changed ChaosMonkeyShardSplitTest to extend ShardSplitTest so that we can 
share most of the code. The ChaosMonkey test is not completely correct and I 
intend to improve it.

bq. The original change around this made preRegister start taking a core rather 
than a core descriptor. I'd like to work that out so it doesn't need to be the 
case.

I'll revert the change to the preRegister method signature and find another way.

I've found two kinds of test failures of (ChaosMonkey)ShardSplitTest.

The first is because of the following sequence of events:

# A doc addition fails (because of the kill leader jetty command), the client 
throws an exception, and therefore the docCount variable is not incremented 
inside the index thread.
# However, the doc addition is recorded in the update logs (of the proxy node?) 
and replayed on the new leader, so in reality the doc does get added to the 
shard.
# The split happens and we assert that the docCounts are equal in the server, 
which fails because the server has the document that we have not counted.

This happens mostly with Lucene-Solr-Tests-4.x-Java6 builds. The bug is in the 
tests and not in the split code. Following is the stack trace:

{code}
[junit4:junit4]   1 ERROR - 2013-04-14 14:24:27.697; org.apache.solr.cloud.ChaosMonkeyShardSplitTest$1; Exception while adding doc
[junit4:junit4]   1 org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://127.0.0.1:34203/h/y/collection1, http://127.0.0.1:34304/h/y/collection1, http://127.0.0.1:34311/h/y/collection1, http://127.0.0.1:34270/h/y/collection1]
[junit4:junit4]   1     at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:333)
[junit4:junit4]   1     at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:306)
[junit4:junit4]   1     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
[junit4:junit4]   1     at org.apache.solr.cloud.AbstractFullDistribZkTestBase.indexDoc(AbstractFullDistribZkTestBase.java:561)
[junit4:junit4]   1     at org.apache.solr.cloud.ChaosMonkeyShardSplitTest.indexr(ChaosMonkeyShardSplitTest.java:434)
[junit4:junit4]   1     at org.apache.solr.cloud.ChaosMonkeyShardSplitTest$1.run(ChaosMonkeyShardSplitTest.java:158)
[junit4:junit4]   1 Caused by: org.apache.solr.common.SolrException: Server at http://127.0.0.1:34311/h/y/collection1 returned non ok status:503, message:Service Unavailable
[junit4:junit4]   1     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:373)
[junit4:junit4]   1     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
[junit4:junit4]   1     at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:264)
[junit4:junit4]   1     ... 5 more
{code}

Perhaps we should check the exception message and continue to count such a 
document?
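
A rough sketch of what that check could look like in the indexing thread. TolerantDocCounter, DocAdder and mayHaveBeenApplied are hypothetical names for illustration, not code from ShardSplitTest; the two message fragments are taken from the stack trace above, and treating them as "the update may still have been applied" is an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for a test indexing thread; not taken from the Solr tests.
class TolerantDocCounter {
  // Hypothetical callback that performs the actual document add.
  interface DocAdder {
    void add(int docId) throws Exception;
  }

  private final AtomicInteger docCount = new AtomicInteger();

  // Assumption: these message fragments mark failures where the update may
  // still have been logged and later replayed on the new leader.
  static boolean mayHaveBeenApplied(Exception e) {
    for (Throwable t = e; t != null; t = t.getCause()) {
      String msg = t.getMessage();
      if (msg != null
          && (msg.contains("No live SolrServers") || msg.contains("status:503"))) {
        return true;
      }
    }
    return false;
  }

  void index(int docId, DocAdder adder) {
    try {
      adder.add(docId);
      docCount.incrementAndGet();
    } catch (Exception e) {
      if (mayHaveBeenApplied(e)) {
        // Count the doc anyway, since the server may end up holding it.
        docCount.incrementAndGet();
      }
      // Any other failure is treated as "not added" and is not counted.
    }
  }

  int count() {
    return docCount.get();
  }
}
```

Counting unconditionally can over-count when the update really was lost, so a stricter variant could remember the ambiguous ids and verify them against the server before asserting.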

The second kind of test failure is where a document add fails due to a version 
conflict. This exception is always seen just after updateshardstate is called 
to switch the shard states. The relevant log follows:

{code}
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state invoked for collection: collection1
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 to inactive
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_0 to active
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_1 to active
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.873; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={wt=javabin&version=2} {add=[169 (1432319507166134272)]} 0 2
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
[junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
[junit4:junit4]   1 INFO  - 

[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #825: POMs out of sync

2013-04-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/825/

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeyShardSplitTest.testDistribSearch

Error Message:
Wrong doc count on shard1_1 expected:109 but was:110

Stack Trace:
java.lang.AssertionError: Wrong doc count on shard1_1 expected:109 but was:110
at __randomizedtesting.SeedInfo.seed([72D2B25F6BB18882:F3343C471CEEE8BE]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.apache.solr.cloud.ShardSplitTest.checkDocCountsAndShardStates(ShardSplitTest.java:155)
at org.apache.solr.cloud.ChaosMonkeyShardSplitTest.doTest(ChaosMonkeyShardSplitTest.java:154)




Build Log:
[...truncated 23734 lines...]




[jira] [Updated] (SOLR-4662) Finalize what we're going to do with solr.xml, auto-discovery, config sets.

2013-04-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4662:
-

Attachment: SOLR-4662.patch

This may be ready. I'll let it sit for a day or two, go over it again, and 
commit it unless there are objections.

All tests pass, I'll run nightly tests on it tonight.

This patch and JIRA will NOT implement config sets, see SOLR-4478 for that. 
While I'd like to get 4478 into 4.3, it can wait until 4.4 if I don't have the 
time.

 Finalize what we're going to do with solr.xml, auto-discovery, config sets.
 ---

 Key: SOLR-4662
 URL: https://issues.apache.org/jira/browse/SOLR-4662
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
 Attachments: SOLR-4662.patch, SOLR-4662.patch


 Spinoff from SOLR-4615, breaking it out here so we can address the changes in 
 pieces.




[jira] [Commented] (SOLR-4546) Separate global/zookeeper info in solr.properties / solr.xml into its own config file

2013-04-14 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631393#comment-13631393
 ] 

Erick Erickson commented on SOLR-4546:
--

[~elyograg]] [~markrmil...@gmail.com] Since all of the parameters in solr.xml 
should be overridable via system properties, it seems like there's already a 
way to do this without the added complication of another config file. I'm 
inclined to close this won't fix.

I'm not dead set against this, but I don't see the need either and am trying to 
wrap up this migration. If someone wants to pick this up they should feel free. 
Mostly I wanted Shawn to assign it to me so it didn't get lost, but upon 
further reflection I don't think I have any investment in carrying this forward.

 Separate global/zookeeper info in solr.properties / solr.xml into its own 
 config file
 -

 Key: SOLR-4546
 URL: https://issues.apache.org/jira/browse/SOLR-4546
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: Shawn Heisey
Assignee: Erick Erickson
 Fix For: 4.3, 5.0


 I know that solr.xml is due to be replaced by solr.properties soon, so I will 
 say solr.\* and you can use whatever extension makes sense.
 There is a small but very important amount of information in solr.* that 
 doesn't specifically have to do with the cores that are local to that server. 
  With the advent of SolrCloud, the amount of such global information has 
 grown, though it is still relatively small.
 If you want to change these global options (or you have config files in 
 git/svn), you can't just copy solr.\* from one system to another, because 
 that's where cores specific to that server are defined.
 I would like to continue to have these options work if they are in solr.\*, 
 but have an additional file for global options, with a filename prefix like 
 global, solrglobal, globalsolr, solrcommon, ... whatever bikeshedding comes 
 up with.  That way you could put zkHost, lib, and other things that will be 
 common to all servers in the new file, and put machine-specific things like 
 host and port in solr.\*.  Any setting in solr.\* would replace the global 
 setting, so you could put port in either file.
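
A small sketch of the layering described above, assuming both files have already been loaded into java.util.Properties objects. LayeredSolrConfig and resolve are hypothetical names, and putting JVM system properties on top is an assumption based on the comment in this thread that solr.xml parameters should be overridable via system properties; this is not how Solr actually loads its configuration.

```java
import java.util.Properties;

// Hypothetical resolver; the lookup order is an assumption drawn from this thread.
class LayeredSolrConfig {
  // Lookup order: JVM system property, then the machine-specific solr.* file,
  // then the shared global file.
  static String resolve(String key, Properties global, Properties local) {
    String sys = System.getProperty(key);
    if (sys != null) {
      return sys;
    }
    // java.util.Properties falls back to its defaults when a key is absent,
    // so a machine-specific value replaces the global one.
    Properties layered = new Properties(global);
    layered.putAll(local);
    return layered.getProperty(key);
  }
}
```

With zkHost only in the global file and port in both, resolve returns the global zkHost and the local port, which matches the "put port in either file" behaviour described above.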




[jira] [Comment Edited] (SOLR-4546) Separate global/zookeeper info in solr.properties / solr.xml into its own config file

2013-04-14 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631393#comment-13631393
 ] 

Erick Erickson edited comment on SOLR-4546 at 4/14/13 9:09 PM:
---

[~elyograg] [~markrmil...@gmail.com] Since all of the parameters in solr.xml 
should be overridable via system properties, it seems like there's already a 
way to do this without the added complication of another config file. I'm 
inclined to close this won't fix.

I'm not dead set against this, but I don't see the need either and am trying to 
wrap up this migration. If someone wants to pick this up they should feel free. 
Mostly I wanted Shawn to assign it to me so it didn't get lost, but upon 
further reflection I don't think I have any investment in carrying this forward.

  was (Author: erickerickson):
[~elyograg]] [~markrmil...@gmail.com] Since all of the parameters in 
solr.xml should be overridable via system properties, it seems like there's 
already a way to do this without the added complication of another config file. 
I'm inclined to close this won't fix.

I'm not dead set against this, but I don't see the need either and am trying to 
wrap up this migration. If someone wants to pick this up they should feel free. 
Mostly I wanted Shawn to assign it to me so it didn't get lost, but upon 
further reflection I don't think I have any investment in carrying this forward.
  
 Separate global/zookeeper info in solr.properties / solr.xml into its own 
 config file
 -

 Key: SOLR-4546
 URL: https://issues.apache.org/jira/browse/SOLR-4546
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: Shawn Heisey
Assignee: Erick Erickson
 Fix For: 4.3, 5.0


 I know that solr.xml is due to be replaced by solr.properties soon, so I will 
 say solr.\* and you can use whatever extension makes sense.
 There is a small but very important amount of information in solr.* that 
 doesn't specifically have to do with the cores that are local to that server. 
  With the advent of SolrCloud, the amount of such global information has 
 grown, though it is still relatively small.
 If you want to change these global options (or you have config files in 
 git/svn), you can't just copy solr.\* from one system to another, because 
 that's where cores specific to that server are defined.
 I would like to continue to have these options work if they are in solr.\*, 
 but have an additional file for global options, with a filename prefix like 
 global, solrglobal, globalsolr, solrcommon, ... whatever bikeshedding comes 
 up with.  That way you could put zkHost, lib, and other things that will be 
 common to all servers in the new file, and put machine-specific things like 
 host and port in solr.\*.  Any setting in solr.\* would replace the global 
 setting, so you could put port in either file.




[jira] [Commented] (SOLR-4546) Separate global/zookeeper info in solr.properties / solr.xml into its own config file

2013-04-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631403#comment-13631403
 ] 

Mark Miller commented on SOLR-4546:
---

If it's just system properties, I think there is room for improvement in 
accepting a known properties file (much like core.properties?) or does that 
work already, like solrconfig.xml does?

Not only does it let you keep config out of your scripts, it lets you specify 
this stuff in a non-global way.

 Separate global/zookeeper info in solr.properties / solr.xml into its own 
 config file
 -

 Key: SOLR-4546
 URL: https://issues.apache.org/jira/browse/SOLR-4546
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.1
Reporter: Shawn Heisey
Assignee: Erick Erickson
 Fix For: 4.3, 5.0


 I know that solr.xml is due to be replaced by solr.properties soon, so I will 
 say solr.\* and you can use whatever extension makes sense.
 There is a small but very important amount of information in solr.* that 
 doesn't specifically have to do with the cores that are local to that server. 
  With the advent of SolrCloud, the amount of such global information has 
 grown, though it is still relatively small.
 If you want to change these global options (or you have config files in 
 git/svn), you can't just copy solr.\* from one system to another, because 
 that's where cores specific to that server are defined.
 I would like to continue to have these options work if they are in solr.\*, 
 but have an additional file for global options, with a filename prefix like 
 global, solrglobal, globalsolr, solrcommon, ... whatever bikeshedding comes 
 up with.  That way you could put zkHost, lib, and other things that will be 
 common to all servers in the new file, and put machine-specific things like 
 host and port in solr.\*.  Any setting in solr.\* would replace the global 
 setting, so you could put port in either file.




[jira] [Commented] (SOLR-3755) shard splitting

2013-04-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631405#comment-13631405
 ] 

Mark Miller commented on SOLR-3755:
---

bq. I'll revert the change to the preRegister method signature and find another 
way.

I'm happy to help on this - it might be easier to just create a new issue 
rather than reverting, and work on getting it nicer from there, up to you 
though.

 shard splitting
 ---

 Key: SOLR-3755
 URL: https://issues.apache.org/jira/browse/SOLR-3755
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Yonik Seeley
Assignee: Shalin Shekhar Mangar
 Fix For: 4.3, 5.0

 Attachments: SOLR-3755-combined.patch, 
 SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, 
 SOLR-3755-testSplitter.patch


 We can currently easily add replicas to handle increases in query volume, but 
 we should also add a way to add additional shards dynamically by splitting 
 existing shards.




Re: New-style solr.xml

2013-04-14 Thread Erick Erickson
bq: What happened to the SolrCloud section?

It wasn't in the bits I cut-n-pasted, so I didn't include it at all.
Sheeesh! I'll fix, thanks for looking. It's amazing what I can fail to
see.

bq: sub-properties file

Sounds like this is also really SOLR-4546, so maybe that's coming back?

Erick

On Sun, Apr 14, 2013 at 1:28 PM, Mark Miller markrmil...@gmail.com wrote:

 On Apr 14, 2013, at 11:42 AM, Erick Erickson erickerick...@gmail.com wrote:

 I've started a new page here:

 It's completely rudimentary, but I wanted to get it started so as many
 eyes as possible can get on it. See:
 http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond

 What happened to the SolrCloud section? I think that's a very helpful 
 division.


 Question:
 With the new style, does allowing cores.properties to specify an
 alternate instanceDir make any sense? I don't think so.

 Right, I don't either. These will be autodiscovered from a root location, not 
 specified or overridable individually.


 What about the sub-properties file ('properties' property).

 I'd have to look to comment. Not familiar with that one.

 - Mark


 Erick




Re: New-style solr.xml

2013-04-14 Thread Erick Erickson
Mark:

OK, I've updated the example. I'd appreciate a quick glance to see if I put
the right properties in the solrcloud tag. Code fixes coming.

On Sun, Apr 14, 2013 at 8:57 PM, Erick Erickson erickerick...@gmail.com wrote:
 bq: What happened to the SolrCloud section?

 It wasn't in the bits I cut-n-pasted, so I didn't include it at all.
 Sheeesh! I'll fix, thanks for looking. It's amazing what I can fail to
 see.

 bq: sub-properties file

 Sounds like this is also really SOLR-4546, so maybe that's coming back?

 Erick

 On Sun, Apr 14, 2013 at 1:28 PM, Mark Miller markrmil...@gmail.com wrote:

 On Apr 14, 2013, at 11:42 AM, Erick Erickson erickerick...@gmail.com wrote:

 I've started a new page here:

 It's completely rudimentary, but I wanted to get it started so as many
 eyes as possible can get on it. See:
 http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond

 What happened to the SolrCloud section? I think that's a very helpful 
 division.


 Question:
 With the new style, does allowing cores.properties to specify an
 alternate instanceDir make any sense? I don't think so.

 Right, I don't either. These will be autodiscovered from a root location, 
 not specified or overridable individually.


 What about the sub-properties file ('properties' property).

 I'd have to look to comment. Not familiar with that one.

 - Mark


 Erick




[jira] [Updated] (SOLR-4662) Finalize what we're going to do with solr.xml, auto-discovery, config sets.

2013-04-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4662:
-

Attachment: SOLR-4662.patch

Put solrcloud section in solr.xml

 Finalize what we're going to do with solr.xml, auto-discovery, config sets.
 ---

 Key: SOLR-4662
 URL: https://issues.apache.org/jira/browse/SOLR-4662
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Blocker
 Attachments: SOLR-4662.patch, SOLR-4662.patch, SOLR-4662.patch


 Spinoff from SOLR-4615, breaking it out here so we can address the changes in 
 pieces.




[jira] [Commented] (SOLR-3755) shard splitting

2013-04-14 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631490#comment-13631490
 ] 

Anshum Gupta commented on SOLR-3755:


bq. This happens mostly with Lucene-Solr-Tests-4.x-Java6 builds.

Is this true for all the exceptions or just the one that follows this line? I 
wasn't able to reproduce this on my system running Java7.
Also, are these consistent failures?

 shard splitting
 ---

 Key: SOLR-3755
 URL: https://issues.apache.org/jira/browse/SOLR-3755
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Yonik Seeley
Assignee: Shalin Shekhar Mangar
 Fix For: 4.3, 5.0

 Attachments: SOLR-3755-combined.patch, 
 SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, 
 SOLR-3755-testSplitter.patch


 We can currently easily add replicas to handle increases in query volume, but 
 we should also add a way to add additional shards dynamically by splitting 
 existing shards.




[jira] [Commented] (SOLR-3755) shard splitting

2013-04-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631511#comment-13631511
 ] 

Shalin Shekhar Mangar commented on SOLR-3755:
-

bq. Is this true for all the exceptions or just the one that follows this line? 
I wasn't able to reproduce this on my system running Java7.

The error with the failing add doc happens with Java6 -- haven't seen it with 
any other version. I've seen the version conflict exception on java7 and java8.

bq. Also, are these consistent failures?

Yes but only on jenkins! I've had ec2 boxes running these tests all night and I 
haven't seen a failure in over 500 runs. These failures are very environment 
and timing dependent.

 shard splitting
 ---

 Key: SOLR-3755
 URL: https://issues.apache.org/jira/browse/SOLR-3755
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Yonik Seeley
Assignee: Shalin Shekhar Mangar
 Fix For: 4.3, 5.0

 Attachments: SOLR-3755-combined.patch, 
 SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, 
 SOLR-3755-testSplitter.patch


 We can currently easily add replicas to handle increases in query volume, but 
 we should also add a way to add additional shards dynamically by splitting 
 existing shards.




[jira] [Commented] (SOLR-4242) A better spatial query parser

2013-04-14 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631513#comment-13631513
 ] 

Bill Bell commented on SOLR-4242:
-

spatialdist() ? 

We really need this.

 A better spatial query parser
 -

 Key: SOLR-4242
 URL: https://issues.apache.org/jira/browse/SOLR-4242
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley
 Fix For: 4.3


 I've been thinking about how spatial support is exposed to Solr users. 
 Presently there's the older Solr 3 stuff, most prominently seen via 
 \{!geofilt} and \{!bbox} done by [~gsingers] (I think), and then there's the 
 Solr 4 fields using a special syntax parsed by Lucene 4 spatial that looks 
 like mygeofield:Intersects(Circle(1 2 d=3)). What's inside the outer 
 parentheses is parsed by Spatial4j as a shape, and it has a special 
 (non-standard) syntax for points, rects, and circles, and then there's WKT.  
 I believe this scheme was devised by [~ryantxu].
 I'd like to devise something that is both comprehensive and is aligned with 
 standards to the extent that it's prudent.  The old Solr 3 stuff is not 
 comprehensive and not standardized.  The newer stuff is comprehensive but 
 only a little based on standards. And I think it'd be nicer to implement it 
 as a Solr query parser.  I'll say more in the comments.
