[jira] [Updated] (SOLR-3925) Expose SpanFirst in eDismax

2012-11-07 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3925:


Attachment: SOLR-3925-trunk-2.patch

Updated patch for today's trunk.

 Expose SpanFirst in eDismax
 ---

 Key: SOLR-3925
 URL: https://issues.apache.org/jira/browse/SOLR-3925
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.0-BETA
 Environment: solr-spec 5.0.0.2012.10.09.19.29.59
 solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0

 Attachments: SOLR-3925-trunk-1.patch, SOLR-3925-trunk-2.patch


 Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
 This issue adds the sf parameter (SpanFirst), which takes a 
 FIELD~DISTANCE^BOOST formatted value.
 For example, sf=title~5^2 will give a boost of 2 if one of the normal 
 clauses, originally generated for automatic phrase queries, is located within 
 five positions of the field's start.
 A unit test is included and all tests pass.
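For reference, the FIELD~DISTANCE^BOOST value could be parsed along these lines. This is an illustrative, self-contained sketch; the class and method names (`SfParam`, `parse`) are hypothetical and not taken from the attached patch.

```java
// Illustrative parser for an sf=FIELD~DISTANCE^BOOST value such as
// "title~5^2". Hypothetical names; not from the SOLR-3925 patch.
public class SfParam {
    public final String field;
    public final int distance;
    public final float boost;

    private SfParam(String field, int distance, float boost) {
        this.field = field;
        this.distance = distance;
        this.boost = boost;
    }

    public static SfParam parse(String value) {
        int tilde = value.indexOf('~');
        int caret = value.indexOf('^');
        if (tilde < 0) {
            throw new IllegalArgumentException("expected FIELD~DISTANCE[^BOOST]: " + value);
        }
        String field = value.substring(0, tilde);
        // the boost part is optional; default it to 1
        String distPart = caret < 0 ? value.substring(tilde + 1)
                                    : value.substring(tilde + 1, caret);
        int distance = Integer.parseInt(distPart);
        float boost = caret < 0 ? 1.0f : Float.parseFloat(value.substring(caret + 1));
        return new SfParam(field, distance, boost);
    }

    public static void main(String[] args) {
        SfParam p = parse("title~5^2");
        System.out.println(p.field + " " + p.distance + " " + p.boost);
    }
}
```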

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3970) Admin dashboard shows incomplete java version

2012-11-07 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492244#comment-13492244
 ] 

Stefan Matheis (steffkes) commented on SOLR-3970:
-

Shawn, nothing easier than this: 
{{solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java}} on 
Line 214:

{code}jvm.add( "version", System.getProperty("java.vm.version") );{code}

But I don't know what makes more sense to show as information on the dashboard?
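For comparison, the candidate system properties can be inspected with a small stand-alone program (a sketch; which property the dashboard should ultimately display is the open question):

```java
// Prints the JVM-related system properties the dashboard could show.
// "java.vm.version" is the HotSpot build string (e.g. 23.3-b01), while
// "java.version" / "java.runtime.version" carry the Java release number.
public class JvmVersionInfo {
    public static void main(String[] args) {
        System.out.println("java.version         = " + System.getProperty("java.version"));
        System.out.println("java.runtime.version = " + System.getProperty("java.runtime.version"));
        System.out.println("java.vm.version      = " + System.getProperty("java.vm.version"));
        System.out.println("java.vm.name         = " + System.getProperty("java.vm.name"));
    }
}
```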

 Admin dashboard shows incomplete java version
 -

 Key: SOLR-3970
 URL: https://issues.apache.org/jira/browse/SOLR-3970
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 4.0
 Environment: Linux bigindy5 2.6.32-279.9.1.el6.centos.plus.x86_64 #1 
 SMP Wed Sep 26 03:52:55 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.7.0_07
 Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
 Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
Reporter: Shawn Heisey
Priority: Minor
 Fix For: 4.1


 The admin dashboard shows the following for Runtime under JVM, but it is 
 incomplete.  Unless you are intimately familiar with the correlation between 
 HotSpot version numbers and Java version numbers, you can't look at this and 
 know what version of Oracle Java is being used.
 Java HotSpot(TM) 64-Bit Server VM (23.3-b01)
 The complete version output (from java -version) on this system is:
 java version "1.7.0_07"
 Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
 Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)




[jira] [Updated] (SOLR-4031) Rare mixup of request content

2012-11-07 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-4031:
---

Fix Version/s: 4.1

  Rare mixup of request content
 --

 Key: SOLR-4031
 URL: https://issues.apache.org/jira/browse/SOLR-4031
 Project: Solr
  Issue Type: Bug
  Components: multicore, search, SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: bug, data-integrity, mixup, request, security
 Fix For: 4.1


 We are using Solr 4.0 and run intensive performance/data-integrity/endurance 
 tests on it. On very rare occasions the content of two concurrent requests to 
 Solr gets mixed up. We have spent a lot of time narrowing down this issue and 
 found that it is a bug in Jetty 8.1.2. Therefore, of course, we have filed it 
 as a bug with Jetty.
 Official bugzilla: https://bugs.eclipse.org/bugs/show_bug.cgi?id=392936
 Mailing list thread: 
 http://dev.eclipse.org/mhonarc/lists/jetty-dev/threads.html#01530
 The reports to Jetty are very detailed, so you can go and read about it there. 
 We have found that the problem seems to be solved in Jetty 8.1.7. Therefore 
 we are now running Solr 4.0 (plus our additional changes) on top of Jetty 
 8.1.7 instead of 8.1.2. You probably want to do the same upgrade on the 
 Apache side sometime soon.
 At least now you know what to tell people if they start complaining about 
 mixed-up requests in Solr 4.0: upgrade the underlying Jetty to 8.1.7 (or run 
 Tomcat or something).




[jira] [Commented] (SOLR-4031) Rare mixup of request content

2012-11-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492247#comment-13492247
 ] 

Yonik Seeley commented on SOLR-4031:


Thanks for tracking that down Per!
Sounds like we should definitely upgrade to the latest Jetty 8. I've marked 
this for 4.1.

  Rare mixup of request content
 --

 Key: SOLR-4031
 URL: https://issues.apache.org/jira/browse/SOLR-4031
 Project: Solr
  Issue Type: Bug
  Components: multicore, search, SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
  Labels: bug, data-integrity, mixup, request, security
 Fix For: 4.1






[jira] [Resolved] (LUCENE-4532) TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy failure

2012-11-07 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-4532.


   Resolution: Fixed
Fix Version/s: 5.0
   4.1
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4.x. I didn't commit to 4.0.x because it seems we're not 
going to have a 4.0.1, but rather focus on 4.1.

 TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy failure
 

 Key: LUCENE-4532
 URL: https://issues.apache.org/jira/browse/LUCENE-4532
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4532.patch, LUCENE-4532.patch


 The following failure on Jenkins:
 {noformat}
  Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1404/
  Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC
 
  1 tests failed.
  REGRESSION:  
  org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy
 
  Error Message:
 
 
  Stack Trace:
  java.lang.ArrayIndexOutOfBoundsException
  at 
  __randomizedtesting.SeedInfo.seed([6AB10D3E4E956CFA:BFB2863DB7E077E0]:0)
  at java.lang.System.arraycopy(Native Method)
  at 
  org.apache.lucene.facet.taxonomy.directory.ParentArray.refresh(ParentArray.java:99)
  at 
  org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader.refresh(DirectoryTaxonomyReader.java:407)
  at 
  org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.doTestReadRecreatedTaxono(TestDirectoryTaxonomyReader.java:167)
  at 
  org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy(TestDirectoryTaxonomyReader.java:130)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
  sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
  sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
  at 
  org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at 
  org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
  at 
  org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at 
  com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at 
  org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at 
  org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
  at 
  org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at 
  com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at 
  com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
  at 
  com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
  at 
  com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
  at 
  com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
  at 
  org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at 
  org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
  at 
  com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at 
  

[jira] [Created] (SOLR-4044) CloudSolrServer early connect problems

2012-11-07 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created SOLR-4044:
-

 Summary: CloudSolrServer early connect problems
 Key: SOLR-4044
 URL: https://issues.apache.org/jira/browse/SOLR-4044
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Grant Ingersoll


If you call CloudSolrServer.connect() after ZooKeeper is up, but before 
clusterstate, etc. is populated, you will get No live SolrServers exceptions 
(line 322 in LBHttpSolrServer):
{code}
throw new SolrServerException("No live SolrServers available to handle this request");
{code}

for all requests made, even though all the Solr nodes are coming up just fine.
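Until this is addressed, a client can work around it by retrying early requests with a bounded wait. The following is a generic, self-contained sketch of that idea, not SolrJ API; the `BoundedRetry` helper and its names are hypothetical:

```java
import java.util.concurrent.Callable;

// Generic bounded-retry helper illustrating a client-side workaround:
// keep retrying an action (e.g. a first request made right after
// CloudSolrServer.connect()) until it succeeds or a deadline passes.
// Hypothetical sketch, not SolrJ API.
public class BoundedRetry {
    public static <T> T retry(Callable<T> action, long timeoutMs, long sleepMs) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            try {
                return action.call();
            } catch (Exception e) {
                if (System.currentTimeMillis() >= deadline) throw e; // give up
                Thread.sleep(sleepMs); // cluster state may not be populated yet
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo: an action that fails twice, then succeeds.
        final int[] attempts = {0};
        String result = retry(() -> {
            if (attempts[0]++ < 2) throw new RuntimeException("No live SolrServers available");
            return "ok";
        }, 10_000, 100);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```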




[jira] [Assigned] (SOLR-4031) Rare mixup of request content

2012-11-07 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley reassigned SOLR-4031:
--

Assignee: Yonik Seeley

  Rare mixup of request content
 --

 Key: SOLR-4031
 URL: https://issues.apache.org/jira/browse/SOLR-4031
 Project: Solr
  Issue Type: Bug
  Components: multicore, search, SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Yonik Seeley
  Labels: bug, data-integrity, mixup, request, security
 Fix For: 4.1






[jira] [Commented] (SOLR-3931) Turn off coord() factor for scoring

2012-11-07 Thread Joel Nothman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492300#comment-13492300
 ] 

Joel Nothman commented on SOLR-3931:


Version 4.0.0 allows the specification of a custom similarity factory for each 
field in schema.xml (see SOLR-2338; it seems the documentation is a bit 
lacking). So these options are not per-query, but per-core.

It would be possible to copy or patch Lucene's {{DefaultSimilarity}} and Solr's 
{{DefaultSimilarityFactory}} to take `useCoord` and `useQueryNorm` parameters.
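Conceptually, the coord factor scales a document's score by the fraction of query terms it matched, and disabling it amounts to returning 1. A self-contained toy illustration of that effect (this is not Lucene's actual Similarity class):

```java
// Illustrates what a coord(q, d) factor does and what overriding it to
// return 1.0f would change. A toy model, not Lucene code.
public class CoordDemo {
    public static float coord(int overlap, int maxOverlap, boolean useCoord) {
        return useCoord ? (float) overlap / maxOverlap : 1.0f;
    }

    public static void main(String[] args) {
        float rawScore = 4.0f;            // sum of per-term scores (toy value)
        int overlap = 2, maxOverlap = 3;  // document matched 2 of 3 query terms
        System.out.println(rawScore * coord(overlap, maxOverlap, true));   // coord on: score shrinks
        System.out.println(rawScore * coord(overlap, maxOverlap, false));  // coord off: raw score kept
    }
}
```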

 Turn off coord() factor for scoring
 ---

 Key: SOLR-3931
 URL: https://issues.apache.org/jira/browse/SOLR-3931
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Bill Bell

 We would like to remove the coordination factor from scoring.
 For small fields (like the name of a doctor), we do not want to score higher 
 if the same term is in the field more than once. Makes sense for books, not 
 so much for formal names.
 /solr/select?q=*:*&coordFactor=false
 Default is true.
 (Note: we might want to make each of these optional - tf, idf, coord, 
 queryNorm.)
 coord(q,d) is a score factor based on how many of the query terms are found 
 in the specified document. Typically, a document that contains more of the 
 query's terms will receive a higher score than another document with fewer 
 query terms. This is a search-time factor computed in coord(q,d) by the 
 Similarity in effect at search time.




[jira] [Comment Edited] (SOLR-3931) Turn off coord() factor for scoring

2012-11-07 Thread Joel Nothman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492300#comment-13492300
 ] 

Joel Nothman edited comment on SOLR-3931 at 11/7/12 12:41 PM:
--

Version 4.0.0 allows the specification of a custom similarity factory for each 
field in schema.xml (see SOLR-2338; it seems the documentation is a bit 
lacking). So these options are not per-query, but per-core.

It would be possible to copy or patch Lucene's {{DefaultSimilarity}} and Solr's 
{{DefaultSimilarityFactory}} to take {{useCoord}} and {{useQueryNorm}} 
parameters.

  was (Author: jnothman):
Version 4.0.0 allows the specification of a custom similarity factory for 
each field in schema.xml (see SOLR-2338; it seems documentation is a bit 
lacking). So these options are not per-query, but per-core.

It would be possible to copy or patch Lucene's {{DefaultSimilarity}} and Solr's 
{{DefaultSimilarityFactory}} to take `useCoord` and `useQueryNorm` parameters.
  
 Turn off coord() factor for scoring
 ---

 Key: SOLR-3931
 URL: https://issues.apache.org/jira/browse/SOLR-3931
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Bill Bell





[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable

2012-11-07 Thread Piotr (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492307#comment-13492307
 ] 

Piotr commented on LUCENE-4542:
---

I'd prefer not to create a patch; I don't feel comfortable enough with the 
Lucene code. 

 Make RECURSION_CAP in HunspellStemmer configurable
 --

 Key: LUCENE-4542
 URL: https://issues.apache.org/jira/browse/LUCENE-4542
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Piotr
Assignee: Chris Male

 Currently there is 
 private static final int RECURSION_CAP = 2;
 in the code of the class HunspellStemmer. It makes using Hunspell with 
 several dictionaries almost unusable due to bad performance (e.g. it costs 
 36 ms to stem a long sentence in Latvian with recursion_cap=2 and 5 ms with 
 recursion_cap=1). It would be nice to be able to tune this number as needed.
 AFAIK this number (2) was chosen arbitrarily.
 (It's the first issue I've ever filed, so please forgive any mistakes.)




[jira] [Comment Edited] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable

2012-11-07 Thread Piotr (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492307#comment-13492307
 ] 

Piotr edited comment on LUCENE-4542 at 11/7/12 12:57 PM:
-

I'd prefer not to create a patch myself; I don't feel comfortable enough with 
the Lucene code. 

  was (Author: zasnuty):
I'd prefer not to create a patch, I don't feel so comfortable with lucene 
code. 
  
 Make RECURSION_CAP in HunspellStemmer configurable
 --

 Key: LUCENE-4542
 URL: https://issues.apache.org/jira/browse/LUCENE-4542
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Piotr
Assignee: Chris Male





[jira] [Resolved] (SOLR-4031) Rare mixup of request content

2012-11-07 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-4031.


Resolution: Fixed

Upgraded to Jetty 8.1.7, committed to trunk, 4x.

  Rare mixup of request content
 --

 Key: SOLR-4031
 URL: https://issues.apache.org/jira/browse/SOLR-4031
 Project: Solr
  Issue Type: Bug
  Components: multicore, search, SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Yonik Seeley
  Labels: bug, data-integrity, mixup, request, security
 Fix For: 4.1






[jira] [Updated] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable

2012-11-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafał Kuć updated LUCENE-4542:
--

Attachment: LUCENE-4542.patch

As Piotr doesn't want to provide the patch, I'll do it for him :) A simple 
patch adding a new constructor that allows passing an additional parameter, 
the recursion cap. The old constructor is still there, and the default 
recursion cap remains 2. 
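The shape of such a change is a plain constructor overload. A simplified, self-contained sketch of the pattern (not the actual HunspellStemmer source or the attached patch):

```java
// Simplified sketch of making a hard-coded cap configurable via an
// overloaded constructor, in the spirit of the LUCENE-4542 patch.
// Not the actual Lucene HunspellStemmer class.
public class Stemmer {
    private static final int DEFAULT_RECURSION_CAP = 2;
    private final int recursionCap;

    public Stemmer() {
        this(DEFAULT_RECURSION_CAP); // old behavior preserved
    }

    public Stemmer(int recursionCap) {
        this.recursionCap = recursionCap;
    }

    public int getRecursionCap() {
        return recursionCap;
    }

    public static void main(String[] args) {
        System.out.println(new Stemmer().getRecursionCap());   // 2
        System.out.println(new Stemmer(1).getRecursionCap());  // 1
    }
}
```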

 Make RECURSION_CAP in HunspellStemmer configurable
 --

 Key: LUCENE-4542
 URL: https://issues.apache.org/jira/browse/LUCENE-4542
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Piotr
Assignee: Chris Male
 Attachments: LUCENE-4542.patch






[jira] [Created] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory

2012-11-07 Thread Markus Jelsma (JIRA)
Markus Jelsma created LUCENE-4545:
-

 Summary: Better error reporting StemmerOverrideFilterFactory
 Key: LUCENE-4545
 URL: https://issues.apache.org/jira/browse/LUCENE-4545
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0


If the dictionary contains an error, such as a space instead of a tab, it is 
hard to find in a long dictionary. This patch includes the file name and line 
number in the exception, helping to debug it quickly.
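The idea can be sketched as follows: count lines while parsing and report the file name and line number on a bad entry. This is an illustrative, self-contained example (the `DictParser` class, its tab-separated entry format, and the message wording are assumptions, not the actual StemmerOverrideFilterFactory code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Sketch of dictionary parsing that reports the offending file and line
// number, in the spirit of the LUCENE-4545 patch. Hypothetical code.
public class DictParser {
    public static void parse(String fileName, Reader input) throws IOException {
        BufferedReader reader = new BufferedReader(input);
        String line;
        int lineNo = 0;
        while ((line = reader.readLine()) != null) {
            lineNo++;
            if (line.isEmpty() || line.startsWith("#")) continue;
            // entries are expected as "input<TAB>stem"
            if (line.split("\t").length != 2) {
                throw new IllegalArgumentException(
                    "Invalid entry in " + fileName + " at line " + lineNo + ": " + line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        try {
            parse("override.dic", new StringReader("walked\twalk\nbad entry\n"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // names the file and line 2
        }
    }
}
```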




[jira] [Updated] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory

2012-11-07 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated LUCENE-4545:
--

Attachment: LUCENE-4545-trunk-1.patch

Patch for trunk.

 Better error reporting StemmerOverrideFilterFactory
 ---

 Key: LUCENE-4545
 URL: https://issues.apache.org/jira/browse/LUCENE-4545
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4545-trunk-1.patch






[jira] [Updated] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory

2012-11-07 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated LUCENE-4545:
--

Priority: Trivial  (was: Major)

 Better error reporting StemmerOverrideFilterFactory
 ---

 Key: LUCENE-4545
 URL: https://issues.apache.org/jira/browse/LUCENE-4545
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Markus Jelsma
Priority: Trivial
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4545-trunk-1.patch






[jira] [Updated] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable

2012-11-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafał Kuć updated LUCENE-4542:
--

Attachment: LUCENE-4542-with-solr.patch

Chris, I've attached a second patch which includes changes to the Solr 
HunspellFilter and its factory. Please review it and say if you want any 
changes made to it; I'll be glad to do them.

 Make RECURSION_CAP in HunspellStemmer configurable
 --

 Key: LUCENE-4542
 URL: https://issues.apache.org/jira/browse/LUCENE-4542
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Piotr
Assignee: Chris Male
 Attachments: LUCENE-4542.patch, LUCENE-4542-with-solr.patch






[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b58) - Build # 2257 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2257/
Java: 32bit/jdk1.8.0-ea-b58 -server -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 19613 lines...]
check-licenses:
 [echo] License check under: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:67: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:223: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 28 minutes 3 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.8.0-ea-b58 -server -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.6.0_37) - Build # 1476 - Failure!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1476/
Java: 32bit/jdk1.6.0_37 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 18989 lines...]
check-licenses:
 [echo] License check under: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:67: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build.xml:223: 
The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\tools\custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 47 minutes 56 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory

2012-11-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492366#comment-13492366
 ] 

Robert Muir commented on LUCENE-4545:
-

I'm for the idea, but not for the logic being confined to this specific factory.

Instead of tracking our own line numbers, we should use LineNumberReader and so 
on.

WordListLoader.getStemDict should be changed to take a generic map (not a 
CharArrayMap), so that it can be used by this method.
In fact, since nothing at all is using this method, we can do whatever we want 
with it.

Also, the logic should not use split(s, 2); I think it should just use 
split(s). That way we also detect the situation where a line unexpectedly 
contains multiple tabs.
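A minimal sketch of the suggestion, assuming a simple `word<TAB>stem` line format (class and method names here are hypothetical, not the actual WordListLoader API): LineNumberReader keeps the line count for us, and split without a limit rejects lines with zero or multiple tabs.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.text.ParseException;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: hypothetical loader, not the Lucene WordListLoader code.
public class StemDictLoader {

    public static Map<String, String> load(Reader in) throws IOException, ParseException {
        LineNumberReader reader = new LineNumberReader(in);
        Map<String, String> dict = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // split without a limit: a line with zero tabs or several tabs
            // yields != 2 parts, so both kinds of malformed entry are caught.
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                throw new ParseException(
                        "invalid entry (expected exactly one tab): " + line,
                        reader.getLineNumber()); // 1-based line of the bad entry
            }
            dict.put(parts[0], parts[1]);
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        try {
            load(new StringReader("cats\tcat\nbad entry with no tab"));
        } catch (ParseException e) {
            // The exception now pinpoints the offending line.
            System.out.println(e.getMessage() + " at line " + e.getErrorOffset());
        }
    }
}
```

LineNumberReader.readLine() increments the counter for every line returned, so getLineNumber() already points at the entry that just failed to parse, with no hand-maintained counter to get out of sync.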

 Better error reporting StemmerOverrideFilterFactory
 ---

 Key: LUCENE-4545
 URL: https://issues.apache.org/jira/browse/LUCENE-4545
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Markus Jelsma
Priority: Trivial
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4545-trunk-1.patch


 If the dictionary contains an error, such as a space instead of a tab, the 
 mistake is hard to find in a long file. This patch includes the file name and 
 line number in the exception, helping to debug it quickly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Stefan Pohl (JIRA)
Stefan Pohl created LUCENE-4546:
---

 Summary: SorterTemplate.quicksort incorrect
 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 4.0, 3.6.1, 4.1
Reporter: Stefan Pohl
 Fix For: 4.1, 4.0, 3.6.1


On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
inconsistent sorting behaviour; of course, only a randomized test caught this ;)

Because SorterTemplate.quicksort is used in several places in the code 
(directly in BytesRefList, ArrayUtil, BytesRefHash and CollectionUtil, and 
transitively in index and search), I'm a bit puzzled: either this hasn't been 
caught by any higher-level test, or my test and my understanding of the 
insufficiency in the code are invalid ;)
If the former holds, and given that the same code is released in 3.6 and 4.0, 
this might be a more critical issue requiring a higher priority than 'major'.
So, can a second pair of eyes please have a timely look at the attached test 
and patch?

Basically, the current quicksort implementation seems to assume that the mid 
element it grabs as pivot is always, luckily, the median; it does not handle 
the case where the initially chosen pivot ends up somewhere other than the 
middle. Hope this and the test help to understand the issue.

A reproducible, currently failing test and a patch are attached.
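To illustrate the distinction (a generic sketch, not the SorterTemplate code): after partitioning, the recursion must split at the pivot's final position returned by the partition step, which in general is not the middle index the pivot was taken from.

```java
import java.util.Arrays;

// Sketch only: a plain int[] quicksort, not the SorterTemplate abstraction.
public class PivotSafeQuicksort {

    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);   // pivot is now fixed at index p, wherever that is
        sort(a, p + 1, hi);
    }

    // Lomuto partition with the middle element as pivot: the pivot's final
    // index depends on the data and is generally NOT (lo + hi) / 2.
    static int partition(int[] a, int lo, int hi) {
        swap(a, (lo + hi) >>> 1, hi);       // move pivot out of the way
        int pivot = a[hi], store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, hi);                 // pivot lands at its sorted slot
        return store;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 1, 5, 9, 2, 6, 0, 3};
        sort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // prints [0, 1, 1, 2, 3, 4, 5, 5, 6, 9]
    }
}
```

Recursing on fixed halves around the mid index instead of around `p` is only correct when the mid element happens to be the median, which is exactly the assumption described above.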

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Stefan Pohl (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Pohl updated LUCENE-4546:


Attachment: SorterTemplate.java.patch
TestSorterTemplate.java

Test and patch file now attached.

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour; of course, only a randomized test caught 
 this ;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly in BytesRefList, ArrayUtil, BytesRefHash and CollectionUtil, and 
 transitively in index and search), I'm a bit puzzled: either this hasn't been 
 caught by any higher-level test, or my test and my understanding of the 
 insufficiency in the code are invalid ;)
 If the former holds, and given that the same code is released in 3.6 and 4.0, 
 this might be a more critical issue requiring a higher priority than 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically, the current quicksort implementation seems to assume that the mid 
 element it grabs as pivot is always, luckily, the median; it does not handle 
 the case where the initially chosen pivot ends up somewhere other than the 
 middle. Hope this and the test help to understand the issue.
 A reproducible, currently failing test and a patch are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2248 - Failure!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2248/
Java: 32bit/jdk1.7.0_09 -client -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 19574 lines...]
check-licenses:
 [echo] License check under: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:67: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:223: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 34 minutes 18 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-4536) Make PackedInts byte-aligned?

2012-11-07 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4536.
--

Resolution: Fixed

Committed:
 - trunk: r1406651
 - branch 4.x: r1406660

 Make PackedInts byte-aligned?
 -

 Key: LUCENE-4536
 URL: https://issues.apache.org/jira/browse/LUCENE-4536
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1

 Attachments: LUCENE-4536.patch, LUCENE-4536.patch


 PackedInts are more and more used to save/restore small arrays, but given 
 that they are long-aligned, up to 63 bits are wasted per array. We should try 
 to make PackedInts storage byte-aligned so that only 7 bits are wasted in the 
 worst case.
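The savings are easy to check with a little arithmetic (a toy calculation, not the PackedInts implementation): round the payload bit count up to the alignment unit and subtract.

```java
// Toy calculation of alignment waste for n packed values of b bits each.
public class AlignmentWaste {

    static long wastedBits(long n, int bitsPerValue, int alignmentBits) {
        long used = n * bitsPerValue;
        // Round the payload up to the next multiple of the alignment unit.
        long allocated = ((used + alignmentBits - 1) / alignmentBits) * alignmentBits;
        return allocated - used;
    }

    public static void main(String[] args) {
        // 3 values of 7 bits = 21 bits of payload.
        System.out.println(wastedBits(3, 7, 64)); // long-aligned: 64 - 21 = 43
        System.out.println(wastedBits(3, 7, 8));  // byte-aligned: 24 - 21 = 3
    }
}
```

For small arrays the long-aligned padding (up to 63 wasted bits) can dominate the payload itself, which is why byte alignment (at most 7 wasted bits) matters here.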

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4547) PackedIntsDocValue field broken on large indexes

2012-11-07 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4547:
---

 Summary: PackedIntsDocValue field broken on large indexes
 Key: LUCENE-4547
 URL: https://issues.apache.org/jira/browse/LUCENE-4547
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.1


I tried to write a test to sanity check LUCENE-4536 (first running against svn 
revision 1406416, before the change).

But I found docvalues is already broken here for large indexes that have a 
PackedLongDocValues field:

{code}
final int numDocs = 5;
for (int i = 0; i < numDocs; ++i) {
  if (i == 0) {
field.setLongValue(0L); // force  32bit deltas
  } else {
field.setLongValue(133L); 
  }
  w.addDocument(doc);
}
w.forceMerge(1);
w.close();
dir.close(); // checkindex
{code}

{noformat}
[junit4:junit4]   2 WARNING: Uncaught exception in thread: Thread[Lucene Merge 
Thread #0,6,TGRP-Test2GBDocValues]
[junit4:junit4]   2 org.apache.lucene.index.MergePolicy$MergeException: 
java.lang.ArrayIndexOutOfBoundsException: -65536
[junit4:junit4]   2at 
__randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
[junit4:junit4]   2at 
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
[junit4:junit4]   2at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
[junit4:junit4]   2 Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
[junit4:junit4]   2at 
org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
[junit4:junit4]   2at 
org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
[junit4:junit4]   2at 
org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
[junit4:junit4]   2at 
org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
[junit4:junit4]   2at 
org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
[junit4:junit4]   2at 
org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
{noformat}



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_37) - Build # 1471 - Failure!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1471/
Java: 64bit/jdk1.6.0_37 -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 18872 lines...]
check-licenses:
 [echo] License check under: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:67: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\build.xml:223: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\tools\custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 49 minutes 19 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0_37 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-4547) PackedIntsDocValue field broken on large indexes

2012-11-07 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4547:


Attachment: test.patch

Here was my initial test, just screwing around.

I ran with 'ant test -Dtestcase=Test2GBDocValues -Dtests.nightly=true 
-Dtests.heapsize=5G'

 PackedIntsDocValue field broken on large indexes
 

 Key: LUCENE-4547
 URL: https://issues.apache.org/jira/browse/LUCENE-4547
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.1

 Attachments: test.patch


 I tried to write a test to sanity check LUCENE-4536 (first running against 
 svn revision 1406416, before the change).
 But I found docvalues is already broken here for large indexes that have a 
 PackedLongDocValues field:
 {code}
 final int numDocs = 5;
 for (int i = 0; i < numDocs; ++i) {
   if (i == 0) {
 field.setLongValue(0L); // force  32bit deltas
   } else {
 field.setLongValue(133L); 
   }
   w.addDocument(doc);
 }
 w.forceMerge(1);
 w.close();
 dir.close(); // checkindex
 {code}
 {noformat}
 [junit4:junit4]   2 WARNING: Uncaught exception in thread: Thread[Lucene 
 Merge Thread #0,6,TGRP-Test2GBDocValues]
 [junit4:junit4]   2 org.apache.lucene.index.MergePolicy$MergeException: 
 java.lang.ArrayIndexOutOfBoundsException: -65536
 [junit4:junit4]   2  at 
 __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
 [junit4:junit4]   2 Caused by: java.lang.ArrayIndexOutOfBoundsException: 
 -65536
 [junit4:junit4]   2  at 
 org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4547) PackedIntsDocValue field broken on large indexes

2012-11-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492405#comment-13492405
 ] 

Robert Muir commented on LUCENE-4547:
-

There is even an out-of-coffee bug in the test: it's only using about 2 bits 
per value :)
So this is really even worse.

I'm not sure we should be using ByteBlockPool etc. here. I think it shouldn't 
be used outside of the indexer.

 PackedIntsDocValue field broken on large indexes
 

 Key: LUCENE-4547
 URL: https://issues.apache.org/jira/browse/LUCENE-4547
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.1

 Attachments: test.patch


 I tried to write a test to sanity check LUCENE-4536 (first running against 
 svn revision 1406416, before the change).
 But I found docvalues is already broken here for large indexes that have a 
 PackedLongDocValues field:
 {code}
 final int numDocs = 5;
 for (int i = 0; i < numDocs; ++i) {
   if (i == 0) {
 field.setLongValue(0L); // force  32bit deltas
   } else {
 field.setLongValue(133L); 
   }
   w.addDocument(doc);
 }
 w.forceMerge(1);
 w.close();
 dir.close(); // checkindex
 {code}
 {noformat}
 [junit4:junit4]   2 WARNING: Uncaught exception in thread: Thread[Lucene 
 Merge Thread #0,6,TGRP-Test2GBDocValues]
 [junit4:junit4]   2 org.apache.lucene.index.MergePolicy$MergeException: 
 java.lang.ArrayIndexOutOfBoundsException: -65536
 [junit4:junit4]   2  at 
 __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
 [junit4:junit4]   2 Caused by: java.lang.ArrayIndexOutOfBoundsException: 
 -65536
 [junit4:junit4]   2  at 
 org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b58) - Build # 2258 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2258/
Java: 64bit/jdk1.8.0-ea-b58 -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 19639 lines...]
check-licenses:
 [echo] License check under: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:67: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:223: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 27 minutes 8 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.8.0-ea-b58 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-4547) DocValues field broken on large indexes

2012-11-07 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4547:


Summary: DocValues field broken on large indexes  (was: PackedIntsDocValue 
field broken on large indexes)

Editing description: I think it actually affects more than PackedIntValues?

I think the bug is in how FixedStraightBytesImpl uses ByteBlockPool.

So the problem should be far more widespread: e.g. with lots of documents in 
general I think you are in trouble (norms should trip it too).

 DocValues field broken on large indexes
 ---

 Key: LUCENE-4547
 URL: https://issues.apache.org/jira/browse/LUCENE-4547
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.1

 Attachments: test.patch


 I tried to write a test to sanity check LUCENE-4536 (first running against 
 svn revision 1406416, before the change).
 But I found docvalues is already broken here for large indexes that have a 
 PackedLongDocValues field:
 {code}
 final int numDocs = 5;
 for (int i = 0; i < numDocs; ++i) {
   if (i == 0) {
 field.setLongValue(0L); // force  32bit deltas
   } else {
 field.setLongValue(133L); 
   }
   w.addDocument(doc);
 }
 w.forceMerge(1);
 w.close();
 dir.close(); // checkindex
 {code}
 {noformat}
 [junit4:junit4]   2 WARNING: Uncaught exception in thread: Thread[Lucene 
 Merge Thread #0,6,TGRP-Test2GBDocValues]
 [junit4:junit4]   2 org.apache.lucene.index.MergePolicy$MergeException: 
 java.lang.ArrayIndexOutOfBoundsException: -65536
 [junit4:junit4]   2  at 
 __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
 [junit4:junit4]   2 Caused by: java.lang.ArrayIndexOutOfBoundsException: 
 -65536
 [junit4:junit4]   2  at 
 org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)

2012-11-07 Thread Alessandro Tommasi (JIRA)
Alessandro Tommasi created SOLR-4045:


 Summary: SOLR admin page returns HTTP 404 on core names containing 
a '.' (dot)
 Key: SOLR-4045
 URL: https://issues.apache.org/jira/browse/SOLR-4045
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Priority: Minor


When Solr is configured in multicore mode, cores with '.' (dot) in their names 
are inaccessible via the admin web GUI (localhost:8983/solr). The page shows 
an alert with the following message ("test.test" was my core name):

404 Not Found get #/test.test

To replicate: start Solr in multicore mode, go to localhost:8983/solr, and via 
core admin create a new core named "test.test", then refresh the page. 
"test.test" will show under the menu at the bottom left. Clicking on it 
triggers the message, while no core menu appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4547) DocValues field broken on large indexes

2012-11-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492421#comment-13492421
 ] 

Robert Muir commented on LUCENE-4547:
-

Another bug is that I had to pass tests.heapsize at all.

I think its bad that docvalues gobbles up so much ram when merging.
Cant we merge this stuff from disk?

 DocValues field broken on large indexes
 ---

 Key: LUCENE-4547
 URL: https://issues.apache.org/jira/browse/LUCENE-4547
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.1

 Attachments: test.patch


 I tried to write a test to sanity check LUCENE-4536 (first running against 
 svn revision 1406416, before the change).
 But i found docvalues is already broken here for large indexes that have a 
 PackedLongDocValues field:
 {code}
 final int numDocs = 5;
 for (int i = 0; i < numDocs; ++i) {
   if (i == 0) {
 field.setLongValue(0L); // force  32bit deltas
   } else {
 field.setLongValue(133L); 
   }
   w.addDocument(doc);
 }
 w.forceMerge(1);
 w.close();
 dir.close(); // checkindex
 {code}
 {noformat}
 [junit4:junit4]   2 WARNING: Uncaught exception in thread: Thread[Lucene 
 Merge Thread #0,6,TGRP-Test2GBDocValues]
 [junit4:junit4]   2 org.apache.lucene.index.MergePolicy$MergeException: 
 java.lang.ArrayIndexOutOfBoundsException: -65536
 [junit4:junit4]   2  at 
 __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
 [junit4:junit4]   2  at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
 [junit4:junit4]   2 Caused by: java.lang.ArrayIndexOutOfBoundsException: 
 -65536
 [junit4:junit4]   2  at 
 org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
 [junit4:junit4]   2  at 
 org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Per Steffensen (JIRA)
Per Steffensen created SOLR-4046:


 Summary: An instance of CloudSolrServer is not able to handle 
consecutive request on different collections o.a.
 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical


CloudSolrServer saves urlList, leaderUrlList and replicasList on instance 
level, and only recalculates those lists in case of clusterState changes. The 
values calculated for the lists will be different for different 
target-collections. Therefore they also ought to be recalculated for a request R, 
if the target-collection for R is different from the target-collection for the 
request handled just before R by the same CloudSolrServer instance.

Another problem with the implementation in CloudSolrServer is with the 
lastClusterStateHashCode. lastClusterStateHashCode is updated when the first 
request after a clusterState-change is handled. Before the 
lastClusterStateHashCode is updated one of the following two sets of lists are 
updated:
* In case sendToLeader==true for the request: leaderUrlList and replicasList  
are updated, but not urlList
* In case sendToLeader==false for the request: urlList is updated, but not 
leaderUrlList and replicasList
But the lastClusterStateHashCode is always updated. So even though there was 
just one collection in the world there is a problem: If the first request after 
a clusterState-change is a sendToLeader==true-request urlList will 
(potentially) be wrong (and will not be recalculated) for the next 
sendToLeader==false-request to the same CloudSolrServer instance. If the first 
request after a clusterState-change is a sendToLeader==false-request 
leaderUrlList and replicasList will (potentially) be wrong (and will not be 
recalculated) for the next sendToLeader==true-request to the same 
CloudSolrServer instance.
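The staleness pattern described above boils down to one version stamp guarding two derived lists, while each refresh path only rebuilds one of them. A simplified stand-alone sketch (field and method names are illustrative, not the actual CloudSolrServer members):

```java
import java.util.ArrayList;
import java.util.List;

public class StaleCacheDemo {
    // Simplified stand-ins for the clusterState-derived caches.
    static int clusterStateHash = 1;   // pretend the cluster state just changed
    static int lastSeenHash = 0;
    static List<String> urlList = new ArrayList<>();
    static List<String> leaderUrlList = new ArrayList<>();

    // Mirrors the flawed logic: only one list is rebuilt per request,
    // yet the shared version stamp is advanced unconditionally.
    static void handleRequest(boolean sendToLeader) {
        if (lastSeenHash != clusterStateHash) {
            if (sendToLeader) {
                leaderUrlList = List.of("leader-fresh");
            } else {
                urlList = List.of("url-fresh");
            }
            lastSeenHash = clusterStateHash; // stamp updated regardless
        }
    }

    public static void main(String[] args) {
        handleRequest(true);   // rebuilds leaderUrlList only
        handleRequest(false);  // stamp looks current, so urlList is never rebuilt
        System.out.println(urlList.isEmpty()); // true: urlList left stale
    }
}
```

Either each cached list needs its own stamp, or the stamp must only advance once every dependent list has been refreshed.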

Besides that it is a very bad idea to have instance- and local-method-variables 
with the same name. CloudSolrServer has an instance variable called urlList and 
method CloudSolrServer.request has a local-method-variable called urlList and 
the method also operates on instance variable urlList. This makes the code hard 
to read.

I haven't made a test within the Apache Solr codebase to reproduce the main 
error (the one mentioned at the top), but I guess you can easily do it yourself:
Make a setup with two collections, "collection1" and "collection2" - no default 
collection. Add some documents to collection2 (without any autocommit). Then 
do cloudSolrServer.commit("collection1") and afterwards 
cloudSolrServer.commit("collection2") (using the same instance of 
CloudSolrServer). Then try to search collection2 for the documents you 
inserted into it. They ought to be found, but are not, because 
cloudSolrServer.commit("collection2") will not do a commit of collection2 - it 
will actually do a commit of collection1.
Well, actually you can't do cloudSolrServer.commit("collection-name") (the 
method doesn't exist), but that ought to be corrected too. But you can do the 
following instead:
{code}
UpdateRequest req = new UpdateRequest();
req.setAction(UpdateRequest.ACTION.COMMIT, true, true);
req.setParam(CoreAdminParams.COLLECTION, "collection-name");
req.process(cloudSolrServer);
{code}

In general I think you should add misc tests to your test-suite - tests that 
run Solr-clusters with more than one collection and make clever tests on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2249 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2249/
Java: 32bit/jdk1.7.0_09 -client -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 19521 lines...]
check-licenses:
 [echo] License check under: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:67: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:223: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 28 minutes 24 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseG1GC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492430#comment-13492430
 ] 

Mark Miller commented on SOLR-4046:
---

I think this is a dupe of SOLR-3920?

 An instance of CloudSolrServer is not able to handle consecutive request on 
 different collections o.a.
 --

 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
 lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4047) dataimporter.functions.encodeUrl throws Unable to encode expression: field.name with value: null

2012-11-07 Thread Igor Dobritskiy (JIRA)
Igor Dobritskiy created SOLR-4047:
-

 Summary: dataimporter.functions.encodeUrl throws Unable to 
encode expression: field.name with value: null
 Key: SOLR-4047
 URL: https://issues.apache.org/jira/browse/SOLR-4047
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: Windows 7
Reporter: Igor Dobritskiy


For some reason dataimporter.functions.encodeUrl stopped working after updating 
to Solr 4.0 from 3.5.
Here is the error
{code}
Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to encode 
expression: attach.name with value: null Processing Document # 1
{code}

Here is the data import config snippet:
{code}
...
<entity name="account"
        query="select name from accounts where account_id = '${attach.account_id}'">

    <entity name="img_index" processor="TikaEntityProcessor"
            dataSource="bin"
            format="text"
            url="http://example.com/data/${account.name}/attaches/${attach.item_id}/${dataimporter.functions.encodeUrl(attach.name)}">

        <field column="text" name="body" />
    </entity>
</entity>
...
{code}
When I change it to *not* use dataimporter.functions.encodeUrl it works, but I 
need to URL-encode the file names as they have special chars in their names.
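Encoding a single filename segment before embedding it in a URL can be sketched with the JDK's URLEncoder. Note that URLEncoder implements form encoding, so a space becomes '+' rather than '%20'; this is an illustration of the general idea, not the DataImportHandler implementation:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeFileName {
    // Encode one path segment so special characters survive inside a URL.
    static String encodeSegment(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeSegment("my file#1.pdf")); // my+file%231.pdf
    }
}
```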

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.6.0_37) - Build # 1477 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1477/
Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 18960 lines...]
check-licenses:
 [echo] License check under: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-deploy-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-security-8.1.7.v20120910.jar

 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-webapp-8.1.7.v20120910.jar

[...truncated 2 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:67: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build.xml:223: 
The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\tools\custom-tasks.xml:44:
 License check failed. Check the logs.

Total time: 45 minutes 53 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Per Steffensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Per Steffensen updated SOLR-4046:
-

Attachment: SOLR-4046.patch

I have made the following patch in our local version of Solr.

The patch could be done in various ways, but I decided to get rid of 
unnecessary code-complexity at the expense of negligible performance 
optimizations. So the idea of calculating and caching the different lists 
and only recalculating them on clusterState-change is gone. The lists are 
calculated from the in-memory clusterState, and it cannot take many ms to 
calculate the lists for every request - and the additional GC that comes out 
of it should also be negligible. The good thing is that the code becomes 
easier to read and understand.

Well, of course you can choose a different approach.
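Recomputing on every request removes the version-stamp bookkeeping entirely; the simplified shape might look like this (an illustrative sketch, not the actual patch):

```java
import java.util.List;
import java.util.stream.Collectors;

public class RecomputePerRequest {
    // Stand-in for the live, in-memory cluster state.
    static List<String> clusterNodes = List.of("node1:8983", "node2:8983");

    // Derive the URL list fresh for each request: no cache, no stale state,
    // and the result is always specific to the requested collection.
    static List<String> urlsFor(String collection) {
        return clusterNodes.stream()
                .map(node -> "http://" + node + "/solr/" + collection)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(urlsFor("collection1"));
        System.out.println(urlsFor("collection2"));
    }
}
```

The trade-off is exactly as described above: a small per-request recomputation cost in exchange for removing the invalidation logic.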

 An instance of CloudSolrServer is not able to handle consecutive request on 
 different collections o.a.
 --

 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
 lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical
 Attachments: SOLR-4046.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: concurrentmergescheduller

2012-11-07 Thread Michael McCandless
On Tue, Nov 6, 2012 at 10:43 PM, Robert Muir rcm...@gmail.com wrote:
 On Tue, Nov 6, 2012 at 6:32 AM, Michael McCandless
 luc...@mikemccandless.com wrote:

 While confusing, I think the code is actually nearly correct...

 My question is, who is going to create the MikeSays account?

LOL :)

Mike McCandless

http://blog.mikemccandless.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492437#comment-13492437
 ] 

Per Steffensen commented on SOLR-4046:
--

Yes, Mark, that seems like a dupe of SOLR-3920, but the patch is very 
different. First I thought about making the same patch, where the idea is to 
keep maps of the lists, but I just think that if all this is only for 
performance reasons (not having to recalculate the lists every time) it is not 
worth the complexity in the code. Such an operation on in-memory data is 
negligible compared to what really uses time and resources in Solr, like 
storing to disk, sending stuff over the network, etc.

But anyways, you can use the patch if you want. I will consider if we will use 
your solution on our side or stay with our own.

Thanks a lot for responding.

Regards, Per Steffensen

 An instance of CloudSolrServer is not able to handle consecutive request on 
 different collections o.a.
 --

 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
 lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical
 Attachments: SOLR-4046.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (SOLR-4048) Add a getRecursive method to NamedList

2012-11-07 Thread Shawn Heisey (JIRA)
Shawn Heisey created SOLR-4048:
--

 Summary: Add a getRecursive method to NamedList
 Key: SOLR-4048
 URL: https://issues.apache.org/jira/browse/SOLR-4048
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.0
Reporter: Shawn Heisey
Priority: Minor
 Fix For: 4.1


Most of the time when accessing data from a NamedList, what you'll be doing is 
using get() to retrieve another NamedList, and doing so over and over until you 
reach the final level, where you'll actually retrieve the value you want.

I propose adding a method to NamedList which would do all that heavy lifting 
for you.  I created the following method for my own code.  It could be adapted 
fairly easily for inclusion into NamedList itself.  The only reason I did not 
include it as a patch is because I figure you'll want to ensure it meets all 
your particular coding guidelines, and that the JavaDoc is much better than I 
have done here:

{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if "response" is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 * String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(
        NamedList<Object> namedList, String[] args)
        throws SolrServerException
{
    NamedList<Object> list = null;
    Object value = null;
    try
    {
        for (String key : args)
        {
            if (list == null)
            {
                list = namedList;
            }
            else
            {
                list = (NamedList<Object>) value;
            }
            value = list.get(key);
        }
        return value;
    }
    catch (Exception e)
    {
        throw new SolrServerException(
                "Failed to recursively parse NamedList", e);
    }
}
{code}
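The same drill-down idea works over any nested key-value structure; here is a self-contained analogue using java.util.Map in place of NamedList (a hypothetical helper, for illustration only, not the proposed Solr API):

```java
import java.util.HashMap;
import java.util.Map;

public class RecursiveGet {
    // Walk a chain of keys through nested maps, returning the final value.
    @SuppressWarnings("unchecked")
    static Object getRecursive(Map<String, Object> map, String... keys) {
        Object value = map;
        for (String key : keys) {
            value = ((Map<String, Object>) value).get(key);
        }
        return value;
    }

    public static void main(String[] args) {
        // Build a nested structure resembling the mbean response shape.
        Map<String, Object> stats = new HashMap<>();
        stats.put("coreName", "collection1");
        Map<String, Object> core = new HashMap<>();
        core.put("stats", stats);
        Map<String, Object> root = new HashMap<>();
        root.put("core", core);

        System.out.println(getRecursive(root, "core", "stats", "coreName"));
    }
}
```

The one-call drill-down replaces a chain of casts and get() calls at each nesting level, which is exactly the convenience the issue asks for.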


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4048) Add a getRecursive method to NamedList

2012-11-07 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4048:
---

Description: 
Most of the time when accessing data from a NamedList, what you'll be doing is 
using get() to retrieve another NamedList, and doing so over and over until you 
reach the final level, where you'll actually retrieve the value you want.

I propose adding a method to NamedList which would do all that heavy lifting 
for you.  I created the following method for my own code.  It could be adapted 
fairly easily for inclusion into NamedList itself.  The only reason I did not 
include it as a patch is because I figure you'll want to ensure it meets all 
your particular coding guidelines, and that the JavaDoc is much better than I 
have done here:

{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if response is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 * String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(
    NamedList<Object> namedList, String[] args)
    throws SolrServerException
{
  NamedList<Object> list = null;
  Object value = null;
  try
  {
    for (String key : args)
    {
      if (list == null)
      {
        list = namedList;
      }
      else
      {
        list = (NamedList<Object>) value;
      }
      value = list.get(key);
    }
    return value;
  }
  catch (Exception e)
  {
    throw new SolrServerException(
        "Failed to recursively parse NamedList", e);
  }
}
{code}


  was:
Most of the time when accessing data from a NamedList, what you'll be doing is 
using get() to retrieve another NamedList, and doing so over and over until you 
reach the final level, where you'll actually retrieve the value you want.

I propose adding a method to NamedList which would do all that heavy lifting 
for you.  I created the following method for my own code.  It could be adapted 
fairly easily for inclusion into NamedList itself.  The only reason I did not 
include it as a patch is because I figure you'll want to ensure it meets all 
your particular coding guidelines, and that the JavaDoc is much better than I 
have done here:

{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if response is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 * String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(
    NamedList<Object> namedList, String[] args)
    throws CommonSolrException
{
  NamedList<Object> list = null;
  Object value = null;
  try
  {
    for (String key : args)
    {
      if (list == null)
      {
        list = namedList;
      }
      else
      {
        list = (NamedList<Object>) value;
      }
      value = list.get(key);
    }
    return value;
  }
  catch (Exception e
  

[jira] [Commented] (SOLR-4048) Add a getRecursive method to NamedList

2012-11-07 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492441#comment-13492441
 ] 

Shawn Heisey commented on SOLR-4048:


Had to edit that.  I have my own Exception type, I forgot to change one of the 
lines to SolrServerException.

 Add a getRecursive method to NamedList
 

 Key: SOLR-4048
 URL: https://issues.apache.org/jira/browse/SOLR-4048
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.0
Reporter: Shawn Heisey
Priority: Minor
 Fix For: 4.1


 Most of the time when accessing data from a NamedList, what you'll be doing 
 is using get() to retrieve another NamedList, and doing so over and over 
 until you reach the final level, where you'll actually retrieve the value you 
 want.
 I propose adding a method to NamedList which would do all that heavy lifting 
 for you.  I created the following method for my own code.  It could be 
 adapted fairly easily for inclusion into NamedList itself.  The only reason I 
 did not include it as a patch is because I figure you'll want to ensure it 
 meets all your particular coding guidelines, and that the JavaDoc is much 
 better than I have done here:
 {code}
 /**
  * Recursively parse a NamedList and return the value at the last level,
  * assuming that the object found at each level is also a NamedList. For
  * example, if response is the NamedList response from the Solr4 mbean
  * handler, the following code makes sense:
  *
  * String coreName = (String) getRecursiveFromResponse(response, new
  * String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
  *
  * @param namedList the NamedList to parse
  * @param args A list of values to recursively request
  * @return the object at the last level.
  * @throws SolrServerException
  */
 @SuppressWarnings("unchecked")
 private final Object getRecursiveFromResponse(
     NamedList<Object> namedList, String[] args)
     throws SolrServerException
 {
   NamedList<Object> list = null;
   Object value = null;
   try
   {
     for (String key : args)
     {
       if (list == null)
       {
         list = namedList;
       }
       else
       {
         list = (NamedList<Object>) value;
       }
       value = list.get(key);
     }
     return value;
   }
   catch (Exception e)
   {
     throw new SolrServerException(
         "Failed to recursively parse NamedList", e);
   }
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)

2012-11-07 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-4045:


Fix Version/s: 5.0
 Assignee: Stefan Matheis (steffkes)

 SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
 -

 Key: SOLR-4045
 URL: https://issues.apache.org/jira/browse/SOLR-4045
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Assignee: Stefan Matheis (steffkes)
Priority: Minor
  Labels: admin, solr, webgui
 Fix For: 5.0


 When SOLR is configured in multicore mode, cores with '.' (dot) in their 
 names are inaccessible via the admin web GUI (localhost:8983/solr). The page 
 shows an alert with the message (test.test was my core name):
 404 Not Found get #/test.test
 To replicate: start solr in multicore mode, go to localhost:8983/solr, via 
 core admin create a new core test.test, then refresh the page. test.test 
 will show under the menu at the bottom left. Clicking on it causes the 
 message, while no core menu appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)

2012-11-07 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-4045:


Attachment: SOLR-4045.patch

[~alley] would you mind verifying this patch? Just to be sure that I didn't 
miss one Controller.

While changing all those files, I already thought that it would be good to have 
one central place holding kind of a core-pattern .. will try to change that as 
well, if that patch is okay

 SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
 -

 Key: SOLR-4045
 URL: https://issues.apache.org/jira/browse/SOLR-4045
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Assignee: Stefan Matheis (steffkes)
Priority: Minor
  Labels: admin, solr, webgui
 Fix For: 5.0

 Attachments: SOLR-4045.patch


 When SOLR is configured in multicore mode, cores with '.' (dot) in their 
 names are inaccessible via the admin web GUI (localhost:8983/solr). The page 
 shows an alert with the message (test.test was my core name):
 404 Not Found get #/test.test
 To replicate: start solr in multicore mode, go to localhost:8983/solr, via 
 core admin create a new core test.test, then refresh the page. test.test 
 will show under the menu at the bottom left. Clicking on it causes the 
 message, while no core menu appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4544) possible bug in ConcurrentMergeScheduler.merge(IndexWriter)

2012-11-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492446#comment-13492446
 ] 

Michael McCandless commented on LUCENE-4544:


I think it needs more than cutting over to a thread pool to clean it up :)

We've actually looked at using a thread pool (see LUCENE-2063) but it 
apparently wasn't straightforward ... if you can see a way that'd be nice :)

But I think we should do that under a separate issue ... leave this one focused 
on the off-by-one on maxMergeCount.


 possible bug in ConcurrentMergeScheduler.merge(IndexWriter) 
 

 Key: LUCENE-4544
 URL: https://issues.apache.org/jira/browse/LUCENE-4544
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 5.0
Reporter: Radim Kolar
Assignee: Michael McCandless
 Attachments: LUCENE-4544.patch


 from dev list:
 ¨i suspect that this code is broken. Lines 331 - 343 in 
 org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter)
 mergeThreadCount() are currently active merges, they can be at most 
 maxThreadCount, maxMergeCount is number of queued merges defaulted with 
 maxThreadCount+2 and it can never be lower then maxThreadCount, which means 
 that condition in while can never become true.
   synchronized(this) {
 long startStallTime = 0;
 while (mergeThreadCount() = 1+maxMergeCount) {
   startStallTime = System.currentTimeMillis();
   if (verbose()) {
 message(too many merges; stalling...);
   }
   try {
 wait();
   } catch (InterruptedException ie) {
 throw new ThreadInterruptedException(ie);
   }
 } 
 While confusing, I think the code is actually nearly correct... but I
 would love to find some simplifications of CMS's logic (it's really
 hairy).
 It turns out mergeThreadCount() is allowed to go higher than
 maxThreadCount; when this happens, Lucene pauses
 mergeThreadCount()-maxThreadCount of those merge threads, and resumes
 them once threads finish (see updateMergeThreads).  Ie, CMS will
 accept up to maxMergeCount merges (and launch threads for them), but
 will only allow maxThreadCount of those threads to be running at once.
 So what that while loop is doing is preventing more than
 maxMergeCount+1 threads from starting, and then pausing the incoming
 thread to slow down the rate of segment creation (since merging cannot
 keep up).
 But ... I think the 1+ is wrong ... it seems like it should just be
 mergeThreadCount() >= maxMergeCount().
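To illustrate the off-by-one being discussed, here is a small hedged sketch. These two methods are hypothetical stand-ins for the stall check in ConcurrentMergeScheduler.merge, not the real class; they only show when each condition would stall the incoming thread:

```java
public class StallConditionSketch {
    // Current code: stall only once maxMergeCount is exceeded by one.
    static boolean stallsWithOffByOne(int activeMerges, int maxMergeCount) {
        return activeMerges >= 1 + maxMergeCount;
    }

    // Proposed fix: stall as soon as maxMergeCount merges are active.
    static boolean stallsAsProposed(int activeMerges, int maxMergeCount) {
        return activeMerges >= maxMergeCount;
    }

    public static void main(String[] args) {
        int maxMergeCount = 4; // e.g. maxThreadCount=2 plus the +2 default
        // With the 1+, a fourth concurrent merge does not stall the caller,
        // so up to maxMergeCount+1 merge threads can be started.
        System.out.println(stallsWithOffByOne(4, maxMergeCount)); // prints false
        System.out.println(stallsAsProposed(4, maxMergeCount));   // prints true
    }
}
```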

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492445#comment-13492445
 ] 

Uwe Schindler commented on LUCENE-4546:
---

Hi,
I think the problem is your test case:
The SorterTemplate in your test does not handle the pivot value correctly. The 
setPivot() and comparePivot() methods get the index to compare with, but 
setPivot must store the actual value of the pivot, not the index of the pivot. 
Your code just stores the pivot index. You can fix this by correctly 
implementing setPivot (this.pivot = x[i]) and implementing comparePivot accordingly.

See ArrayUtil for an example.
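As a hedged illustration of the contract Uwe describes, here is a minimal, self-contained stand-in for SorterTemplate over an int[] (the real abstract class lives in org.apache.lucene.util; names here are illustrative only). The key point is that setPivot copies the pivot *value* before partitioning moves elements around:

```java
import java.util.Arrays;

public class PivotSketch {
    /** Simplified stand-in for SorterTemplate's pivot contract. */
    static class IntSorter {
        final int[] x;
        int pivot; // the pivot VALUE, not its index

        IntSorter(int[] x) { this.x = x; }

        void setPivot(int i)    { pivot = x[i]; }               // correct: copy the value
        int comparePivot(int j) { return Integer.compare(pivot, x[j]); }
        void swap(int i, int j) { int t = x[i]; x[i] = x[j]; x[j] = t; }

        void quicksort(int lo, int hi) {
            if (hi <= lo) return;
            setPivot((lo + hi) >>> 1); // pivot value survives later swaps
            int i = lo, j = hi;
            while (i <= j) {
                while (comparePivot(i) > 0) i++; // x[i] < pivot
                while (comparePivot(j) < 0) j--; // x[j] > pivot
                if (i <= j) { swap(i, j); i++; j--; }
            }
            quicksort(lo, j);
            quicksort(i, hi);
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 1, 3};
        new IntSorter(data).quicksort(0, data.length - 1);
        System.out.println(Arrays.toString(data)); // prints [1, 1, 3, 4, 5]
    }
}
```

Had setPivot stored the index instead, later swaps could move a different element to that index and the partition comparisons would silently use the wrong value.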

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492447#comment-13492447
 ] 

Mark Miller commented on SOLR-4046:
---

Yeah, I honestly had the same thought when I was fixing - I almost just dropped 
the caching completely - it didn't seem like the perf would be much different 
and the code is complicated. It's mostly a random dice roll that I ended up 
keeping the caching. Mostly, I was too lazy to test if it mattered (even though 
intuitively, I doubt it would).

I'll keep this open until I'm home from Germany and can take another look at it.

 An instance of CloudSolrServer is not able to handle consecutive request on 
 different collections o.a.
 --

 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
 lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical
 Attachments: SOLR-4046.patch


 CloudSolrServer saves urlList, leaderUrlList and replicasList on instance 
 level, and only recalculates those lists in case of clusterState changes. The 
 values calculated for the lists will be different for different 
 target-collections. Therefore they also ought to be recalculated for a request 
 R, if the target-collection for R is different from the target-collection for 
 the request handled just before R by the same CloudSolrServer instance.
 Another problem with the implementation in CloudSolrServer is with the 
 lastClusterStateHashCode. lastClusterStateHashCode is updated when the first 
 request after a clusterState-change is handled. Before the 
 lastClusterStateHashCode is updated one of the following two sets of lists 
 are updated:
 * In case sendToLeader==true for the request: leaderUrlList and replicasList  
 are updated, but not urlList
 * In case sendToLeader==false for the request: urlList is updated, but not 
 leaderUrlList and replicasList
 But the lastClusterStateHashCode is always updated. So even though there was 
 just one collection in the world there is a problem: If the first request 
 after a clusterState-change is a sendToLeader==true-request urlList will 
 (potentially) be wrong (and will not be recalculated) for the next 
 sendToLeader==false-request to the same CloudSolrServer instance. If the 
 first request after a clusterState-change is a sendToLeader==false-request 
 leaderUrlList and replicasList will (potentially) be wrong (and will not be 
 recalculated) for the next sendToLeader==true-request to the same 
 CloudSolrServer instance.
 Besides that it is a very bad idea to have instance- and 
 local-method-variables with the same name. CloudSolrServer has an instance 
 variable called urlList and method CloudSolrServer.request has a 
 local-method-variable called urlList and the method also operates on instance 
 variable urlList. This makes the code hard to read.
 Haven't made a test in the Apache Solr codebase to reproduce the main error (the one 
 mentioned at the top above) but I guess you can easily do it yourself:
 Make a setup with two collections "collection1" and "collection2" - no 
 default collection. Add some documents to collection2 (without any 
 autocommit). Then do cloudSolrServer.commit("collection1") and afterwards 
 cloudSolrServer.commit("collection2") (use the same instance of CloudSolrServer). 
 Then try to search collection2 for the documents you inserted into it. They 
 ought to be found, but are not, because the 
 cloudSolrServer.commit("collection2") will not do a commit of collection2 - 
 it will actually do a commit of collection1.
 Well, actually you can't do cloudSolrServer.commit("collection-name") (the 
 method doesn't exist), but that ought to be corrected too. But you can do the 
 following instead:
 {code}
 UpdateRequest req = new UpdateRequest();
 req.setAction(UpdateRequest.ACTION.COMMIT, true, true);
 req.setParam(CoreAdminParams.COLLECTION, "collection-name");
 req.process(cloudSolrServer);
 {code}
 In general I think you should add misc tests to your test-suite - tests that 
 run Solr-clusters with more than one collection and makes clever tests on 
 that.
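One way to make the cached lists collection-aware, sketched here with hypothetical names (this is not the actual CloudSolrServer code, and the real class also has to walk slices and replicas from the live cluster state), is to key the cache on both the collection and the cluster-state hash, so a request against a different collection never reuses a stale list:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UrlListCacheSketch {
    // Hypothetical cache keyed on collection name AND clusterState hash.
    private final Map<String, List<String>> urlLists = new HashMap<>();

    /** Recomputes the URL list only when the collection or cluster state changed. */
    List<String> urlListFor(String collection, int clusterStateHashCode) {
        String key = collection + "#" + clusterStateHashCode;
        return urlLists.computeIfAbsent(key, k -> computeUrlList(collection));
    }

    // Stand-in for the real walk over slices/replicas in the cluster state.
    private List<String> computeUrlList(String collection) {
        return List.of("http://host1:8983/solr/" + collection,
                       "http://host2:8983/solr/" + collection);
    }

    public static void main(String[] args) {
        UrlListCacheSketch cache = new UrlListCacheSketch();
        // Consecutive requests against different collections now get
        // different URL lists even without a clusterState change.
        System.out.println(cache.urlListFor("collection1", 42));
        System.out.println(cache.urlListFor("collection2", 42));
    }
}
```

The same keying would apply to the leaderUrlList/replicasList pair, which removes the problem of lastClusterStateHashCode being bumped before all three lists are refreshed.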

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492450#comment-13492450
 ] 

Uwe Schindler commented on LUCENE-4546:
---

By the way, we have a random test (TestArrayUtil) that does exactly the same, 
but the tested ArrayUtil handles the pivot value correctly, so it works 
correctly. If you sort your failing example array with ArrayUtil, it 
passes.

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build # 2259 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2259/
Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 28385 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 34 minutes 10 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Assigned] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-4546:
-

Assignee: Uwe Schindler

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Assignee: Uwe Schindler
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.

2012-11-07 Thread Nagendra Nagarajayya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492457#comment-13492457
 ] 

Nagendra Nagarajayya commented on SOLR-3816:


@Otis:

Regarding the performance improvement: apart from the performance improvement, 
realtime-search makes available a realtime (NRT) view of the index, as opposed 
to the current Solr implementation's point-in-time snapshots of the index. So 
each search may return new results ... 

 Need a more granular nrt system that is close to a realtime system.
 ---

 Key: SOLR-3816
 URL: https://issues.apache.org/jira/browse/SOLR-3816
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, replication (java), search, 
 SearchComponents - other, SolrCloud, update
Affects Versions: 4.0
Reporter: Nagendra Nagarajayya
  Labels: nrt, realtime, replication, search, solrcloud, update
 Attachments: alltests_passed_with_realtime_turnedoff.log, 
 SOLR-3816_4.0_branch.patch, SOLR-3816-4.x.trunk.patch, 
 solr-3816-realtime_nrt.patch


 Need a more granular NRT system that is close to a realtime system. A 
 realtime system should be able to reflect changes to the index as and when 
 docs are added/updated to the index. soft-commit offers NRT and is more 
 realtime friendly than hard commit but is limited by the dependency on the 
 SolrIndexSearcher being closed and reopened and offers a coarse granular NRT. 
 Closing and reopening of the SolrIndexSearcher may impact performance also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.

2012-11-07 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492459#comment-13492459
 ] 

Per Steffensen commented on SOLR-4046:
--

Well, the entire Apache Solr test-suite is still green with my fix - not that 
it makes any guarantee that the simplification does not matter :-)

 An instance of CloudSolrServer is not able to handle consecutive request on 
 different collections o.a.
 --

 Key: SOLR-4046
 URL: https://issues.apache.org/jira/browse/SOLR-4046
 Project: Solr
  Issue Type: Bug
  Components: clients - java, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0. Actually revision 1394844 on branch 
 lucene_solr_4_0 but I believe that is the same
Reporter: Per Steffensen
Priority: Critical
 Attachments: SOLR-4046.patch


 CloudSolrServer saves urlList, leaderUrlList and replicasList on instance 
 level, and only recalculates those lists in case of clusterState changes. The 
 values calculated for the lists will be different for different 
 target-collections. Therefore they also ought to be recalculated for a request 
 R, if the target-collection for R is different from the target-collection for 
 the request handled just before R by the same CloudSolrServer instance.
 Another problem with the implementation in CloudSolrServer is with the 
 lastClusterStateHashCode. lastClusterStateHashCode is updated when the first 
 request after a clusterState-change is handled. Before the 
 lastClusterStateHashCode is updated one of the following two sets of lists 
 are updated:
 * In case sendToLeader==true for the request: leaderUrlList and replicasList  
 are updated, but not urlList
 * In case sendToLeader==false for the request: urlList is updated, but not 
 leaderUrlList and replicasList
 But the lastClusterStateHashCode is always updated. So even though there was 
 just one collection in the world there is a problem: If the first request 
 after a clusterState-change is a sendToLeader==true-request urlList will 
 (potentially) be wrong (and will not be recalculated) for the next 
 sendToLeader==false-request to the same CloudSolrServer instance. If the 
 first request after a clusterState-change is a sendToLeader==false-request 
 leaderUrlList and replicasList will (potentially) be wrong (and will not be 
 recalculated) for the next sendToLeader==true-request to the same 
 CloudSolrServer instance.
 Besides that it is a very bad idea to have instance- and 
 local-method-variables with the same name. CloudSolrServer has an instance 
 variable called urlList and method CloudSolrServer.request has a 
 local-method-variable called urlList and the method also operates on instance 
 variable urlList. This makes the code hard to read.
 Havnt made a test in Apache Solr regi to reproduce the main error (the one 
 mentioned at the top above) but I guess you can easily do it yourself:
 Make a setup with two collections, "collection1" and "collection2" - no 
 default collection. Add some documents to "collection2" (without any 
 autocommit). Then do cloudSolrServer.commit("collection1") and afterwards 
 cloudSolrServer.commit("collection2") (using the same instance of 
 CloudSolrServer). Then try to search "collection2" for the documents you 
 inserted into it. They ought to be found, but are not, because 
 cloudSolrServer.commit("collection2") will not do a commit of "collection2" - 
 it will actually do a commit of "collection1".
 Well, actually you can't do cloudSolrServer.commit("collection-name") (the 
 method doesn't exist), but that ought to be corrected too. But you can do the 
 following instead:
 {code}
 UpdateRequest req = new UpdateRequest();
 req.setAction(UpdateRequest.ACTION.COMMIT, true, true);
 req.setParam(CoreAdminParams.COLLECTION, "collection-name");
 req.process(cloudSolrServer);
 {code}
 In general I think you should add more tests to your test-suite - tests that 
 run Solr clusters with more than one collection and make targeted assertions 
 on that setup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4546:
--

Attachment: TestSorterTemplate.java

Attached the corrected testcase, which passes.

BTW: Your SorterTemplate implementation fails with mergeSort completely :-)

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Assignee: Uwe Schindler
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, 
 TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.
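 For reference, the partition-boundary point above can be illustrated with a 
 minimal standalone quicksort (this is a generic sketch, not Lucene's 
 SorterTemplate): after partitioning, the recursion must follow the partition's 
 actual boundaries rather than a fixed midpoint.

```java
import java.util.Arrays;
import java.util.Random;

public class QuicksortSketch {
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo + (hi - lo) / 2];   // pivot VALUE grabbed from the middle
        int i = lo, j = hi;
        while (i <= j) {                     // Hoare-style partition
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; j--; }
        }
        quicksort(a, lo, j);                 // recurse on the actual partition
        quicksort(a, i, hi);                 // bounds, not on a fixed midpoint
    }

    public static void main(String[] args) {
        Random r = new Random(42);
        int[] a = new int[200];
        for (int k = 0; k < a.length; k++) a[k] = r.nextInt(50);
        int[] b = a.clone();
        quicksort(a, 0, a.length - 1);
        Arrays.sort(b);
        System.out.println(Arrays.equals(a, b));  // prints true
    }
}
```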

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)

2012-11-07 Thread Alessandro Tommasi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492465#comment-13492465
 ] 

Alessandro Tommasi commented on SOLR-4045:
--

Thank you for your prompt action on this. I have tried the patch, but patching 
my existing 4.0 installation (as downloaded from the website) was a little 
troublesome, as those files that the patch indicated as being in:

solr/webapp/web/js/scripts

are actually under:

solr-webapp/webapp/js/scripts

in my installation. Replacing the paths in the patch and applying it, however, 
worked, and the web gui seems to work w/o issues. (I had however to open the 
web gui in another browser, as mine seemed to have cached all those js and 
refused to reload them unless I refreshed them one by one).

 SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
 -

 Key: SOLR-4045
 URL: https://issues.apache.org/jira/browse/SOLR-4045
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Assignee: Stefan Matheis (steffkes)
Priority: Minor
  Labels: admin, solr, webgui
 Fix For: 5.0

 Attachments: SOLR-4045.patch


 When SOLR is configured in multicore mode, cores with '.' (dot) in their 
 names are inaccessible via the admin web GUI (localhost:8983/solr). The page 
 shows an alert with the message ("test.test" was my core name):
 404 Not Found get #/test.test
 To replicate: start Solr in multicore mode, go to localhost:8983/solr, and via 
 core admin create a new core "test.test", then refresh the page. "test.test" 
 will show under the menu at the bottom left. Clicking on it causes the 
 message, while no core menu appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-4546.
---

Resolution: Not A Problem

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Assignee: Uwe Schindler
  Labels: patch
 Fix For: 4.1, 4.0, 3.6.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, 
 TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)

2012-11-07 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492475#comment-13492475
 ] 

Stefan Matheis (steffkes) commented on SOLR-4045:
-

{quote}as those files that the patch indicated as being in:
solr/webapp/web/js/scripts
are actually under:
solr-webapp/webapp/js/scripts{quote}

In {{solr/webapp/web}} the source files are located, whereas 
{{example/solr-webapp/webapp}} is where your running instance holds its copies 
of the source files.

But anyway, fine that it works -- I will work on the second version with a global 
corename-pattern to make changes like these a bit easier in the future.

Thanks Alessandro!

 SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
 -

 Key: SOLR-4045
 URL: https://issues.apache.org/jira/browse/SOLR-4045
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0
 Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Assignee: Stefan Matheis (steffkes)
Priority: Minor
  Labels: admin, solr, webgui
 Fix For: 5.0

 Attachments: SOLR-4045.patch


 When SOLR is configured in multicore mode, cores with '.' (dot) in their 
 names are inaccessible via the admin web GUI (localhost:8983/solr). The page 
 shows an alert with the message ("test.test" was my core name):
 404 Not Found get #/test.test
 To replicate: start Solr in multicore mode, go to localhost:8983/solr, and via 
 core admin create a new core "test.test", then refresh the page. "test.test" 
 will show under the menu at the bottom left. Clicking on it causes the 
 message, while no core menu appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Stefan Pohl (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492480#comment-13492480
 ] 

Stefan Pohl commented on LUCENE-4546:
-

Thanks for the clarification, Uwe!

Out of curiosity and for reference, are there any reasons for the abstraction 
having to overwrite setPivot/comparePivot? Using the implementation in my patch 
would actually allow to get rid of having to overwrite these methods at all, 
possibly being faster due to removal of some calls depending on JVM 
optimization and possibly being slower due to a few more swaps and branches in 
the code. Pure speculation.

 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Assignee: Uwe Schindler
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, 
 TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.7.0_09) - Build # 1472 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1472/
Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 28989 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: 
The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 54 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token

2012-11-07 Thread Tom Burton-West (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492487#comment-13492487
 ] 

Tom Burton-West commented on SOLR-3589:
---

Forgot to work from your latest patch with the synonyms test. I'll post a new 
backport of the patch, with the synonyms test, against the latest 3.6.x in svn 
shortly.

 Edismax parser does not honor mm parameter if analyzer splits a token
 -

 Key: SOLR-3589
 URL: https://issues.apache.org/jira/browse/SOLR-3589
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0-BETA
Reporter: Tom Burton-West
Assignee: Robert Muir
 Attachments: SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, 
 SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, testSolr3589.xml.gz, 
 testSolr3589.xml.gz


 With edismax mm set to 100%, if one of the tokens is split into two tokens by 
 the analyzer chain (i.e. fire-fly => fire fly), the mm parameter is 
 ignored and the equivalent of an OR query for (fire OR fly) is produced.
 This is particularly a problem for languages that do not use white space to 
 separate words, such as Chinese or Japanese.
 See these messages for more discussion:
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html
 http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html
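 The mm arithmetic at issue can be sketched as follows (an illustrative sketch, 
 not Solr's actual edismax code; the rounding choice here is an assumption): 
 with mm=100%, the minimum-should-match must be computed over the clause count 
 *after* the analyzer split, otherwise the split clauses degenerate to a plain OR.

```java
// Hypothetical sketch: mapping a percentage mm spec to Lucene's
// minimumNumberShouldMatch over the generated SHOULD clauses.
public class MmSketch {
    static int minShouldMatch(int clauseCount, int percent) {
        // one plausible rounding for positive percentages: truncate toward zero
        return (int) Math.floor(clauseCount * percent / 100.0);
    }

    public static void main(String[] args) {
        // "fire-fly" analyzed to [fire, fly]: two SHOULD clauses.
        // With mm=100%, BOTH must match -- not either one.
        System.out.println(minShouldMatch(2, 100));  // prints 2
        // The bug: if mm is applied to the pre-split token count (1),
        // the requirement collapses to matching a single clause:
        System.out.println(minShouldMatch(1, 100));  // prints 1
    }
}
```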

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_09) - Build # 2250 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2250/
Java: 64bit/jdk1.7.0_09 -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 28980 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 37 minutes 8 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.7.0_09 -XX:+UseG1GC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_09) - Build # 2260 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2260/
Java: 32bit/jdk1.7.0_09 -client -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 29062 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 32 minutes 47 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_09) - Build # 1478 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1478/
Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 29089 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:117:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 53 minutes 25 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token

2012-11-07 Thread Tom Burton-West (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Burton-West updated SOLR-3589:
--

Attachment: SOLR-3589-3.6.PATCH

Backport to 3.6 r1406713. Includes synonyms test.

Will test it against production later today.

 Edismax parser does not honor mm parameter if analyzer splits a token
 -

 Key: SOLR-3589
 URL: https://issues.apache.org/jira/browse/SOLR-3589
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0-BETA
Reporter: Tom Burton-West
Assignee: Robert Muir
 Attachments: SOLR-3589-3.6.PATCH, SOLR-3589.patch, SOLR-3589.patch, 
 SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, 
 testSolr3589.xml.gz, testSolr3589.xml.gz


 With edismax mm set to 100%, if one of the tokens is split into two tokens by 
 the analyzer chain (i.e. fire-fly => fire fly), the mm parameter is 
 ignored and the equivalent of an OR query for (fire OR fly) is produced.
 This is particularly a problem for languages that do not use white space to 
 separate words, such as Chinese or Japanese.
 See these messages for more discussion:
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html
 http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.

2012-11-07 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492551#comment-13492551
 ] 

Otis Gospodnetic commented on SOLR-3816:


[~nnagarajayya] Hmmm, maybe I'm missing something, but if you set the 
soft-commit interval in Solr to something very, very low, then yes, while it is 
still technically a point-in-time view, that point in time is shifted so 
frequently that it looks like RT search to a human - new results can show up 
with every new search. So the effect can be as (N)RT as you choose via the 
soft-commit frequency. I think the only question is whether that approach vs. 
the approach in your patch yields better performance, and it looks like [~hsn] 
will test that soon, and we're all anxiously waiting to see the results! :)


 Need a more granular nrt system that is close to a realtime system.
 ---

 Key: SOLR-3816
 URL: https://issues.apache.org/jira/browse/SOLR-3816
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, replication (java), search, 
 SearchComponents - other, SolrCloud, update
Affects Versions: 4.0
Reporter: Nagendra Nagarajayya
  Labels: nrt, realtime, replication, search, solrcloud, update
 Attachments: alltests_passed_with_realtime_turnedoff.log, 
 SOLR-3816_4.0_branch.patch, SOLR-3816-4.x.trunk.patch, 
 solr-3816-realtime_nrt.patch


 Need a more granular NRT system that is close to a realtime system. A 
 realtime system should be able to reflect changes to the index as and when 
 docs are added/updated to the index. soft-commit offers NRT and is more 
 realtime friendly than hard commit but is limited by the dependency on the 
 SolrIndexSearcher being closed and reopened and offers a coarse granular NRT. 
 Closing and reopening of the SolrIndexSearcher may impact performance also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0) - Build # 2251 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2251/
Java: 64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt

All tests passed

Build Log:
[...truncated 28172 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 43 minutes 44 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-4544) possible bug in ConcurrentMergeScheduler.merge(IndexWriter)

2012-11-07 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4544:
---

Attachment: LUCENE-4544.patch

Added test case ... I think it's ready.

 possible bug in ConcurrentMergeScheduler.merge(IndexWriter) 
 

 Key: LUCENE-4544
 URL: https://issues.apache.org/jira/browse/LUCENE-4544
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 5.0
Reporter: Radim Kolar
Assignee: Michael McCandless
 Attachments: LUCENE-4544.patch, LUCENE-4544.patch


 from dev list:
  "I suspect that this code is broken. Lines 331 - 343 in 
  org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter): 
  mergeThreadCount() is the number of currently active merges; it can be at most 
  maxThreadCount. maxMergeCount is the number of queued merges, defaulted to 
  maxThreadCount+2, and it can never be lower than maxThreadCount, which means 
  that the condition in the while loop can never become true.
   synchronized(this) {
 long startStallTime = 0;
  while (mergeThreadCount() >= 1+maxMergeCount) {
   startStallTime = System.currentTimeMillis();
   if (verbose()) {
  message("too many merges; stalling...");
   }
   try {
 wait();
   } catch (InterruptedException ie) {
 throw new ThreadInterruptedException(ie);
   }
 } 
 While confusing, I think the code is actually nearly correct... but I
 would love to find some simplifications of CMS's logic (it's really
 hairy).
 It turns out mergeThreadCount() is allowed to go higher than
 maxThreadCount; when this happens, Lucene pauses
 mergeThreadCount()-maxThreadCount of those merge threads, and resumes
 them once threads finish (see updateMergeThreads).  Ie, CMS will
 accept up to maxMergeCount merges (and launch threads for them), but
 will only allow maxThreadCount of those threads to be running at once.
 So what that while loop is doing is preventing more than
 maxMergeCount+1 threads from starting, and then pausing the incoming
 thread to slow down the rate of segment creation (since merging cannot
 keep up).
  But ... I think the 1+ is wrong ... it seems like it should just be
  mergeThreadCount() >= maxMergeCount().
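 The stall pattern discussed above can be sketched in isolation: an incoming 
 thread is paused in a while/wait loop until the number of in-flight merges 
 drops below the cap. All names below are illustrative, not Lucene's actual 
 fields:

```java
// Hypothetical model of CMS's "too many merges; stalling..." loop.
public class StallDemo {
    final int maxMergeCount = 3;
    int activeMerges = 0;

    synchronized void startMerge() {
        // Mirror of the while-loop in ConcurrentMergeScheduler.merge:
        // stall the caller until a merge slot frees up.
        while (activeMerges >= maxMergeCount) {
            try { wait(); }
            catch (InterruptedException ie) { throw new RuntimeException(ie); }
        }
        activeMerges++;
    }

    synchronized void finishMerge() {
        activeMerges--;
        notifyAll();  // wake any stalled caller
    }

    public static void main(String[] args) {
        StallDemo d = new StallDemo();
        for (int i = 0; i < 3; i++) d.startMerge();  // fills all slots
        // A 4th startMerge() would now stall; finish one first to free a slot.
        d.finishMerge();
        d.startMerge();  // proceeds without blocking
        System.out.println("active=" + d.activeMerges);  // prints active=3
    }
}
```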

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1473/
Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 28176 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: 
The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 55 minutes 56 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token

2012-11-07 Thread Tom Burton-West (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492591#comment-13492591
 ] 

Tom Burton-West commented on SOLR-3589:
---

Hi Robert,

I just put the backport to 3.6 up on our test server and pointed it to one of 
our production shards. The improvement for Chinese queries is dramatic, 
especially for longer queries like the TREC 5 queries (see examples below).

When you have time, please look over the backport of the patch.  I think it is 
fine but I would appreciate you looking it over.  My understanding of your 
patch is that it just affects a small portion of the edismax logic, but I don't 
understand the edismax parser well enough to be sure there isn't some 
difference between 3.6 and 4.0 that I didn't account for in the patch.

Thanks for working on this.   Naomi and I are both very excited about this bug 
finally being fixed and want to put the fix into production soon.
---
Example TREC 5 Chinese queries:

num Number: CH4
E-title The newly discovered oil fields in China.
C-title 中国大陆新发现的油田   
40,135 items found for 中国大陆新发现的油田 with current implementation (due to dismax 
bug)
78 items found for 中国大陆新发现的油田 with patch

num Number: CH10
E-title Border Trade in Xinjiang
C-title 新疆的边境贸易  
20,249 items found for 新疆的边境贸易  current implementation (with bug)
243 items found for 新疆的边境贸易  with patch.
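The effect Tom measured can be reduced to a toy example. This is a minimal sketch, not edismax itself; the class and method names are hypothetical. It models mm=100% as a minimum-should-match count over the query's clauses, and shows how counting the pre-analysis clause (1) instead of the post-analysis terms (2) turns the query into a plain OR:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MmDemo {
    // Toy matcher: a document matches if at least minShouldMatch of the
    // query's SHOULD clauses occur in it.
    static boolean matches(Set<String> doc, List<String> clauses, int minShouldMatch) {
        int hits = 0;
        for (String c : clauses) {
            if (doc.contains(c)) hits++;
        }
        return hits >= minShouldMatch;
    }

    public static void main(String[] args) {
        // Document contains "fire" but not "fly".
        Set<String> doc = new HashSet<>(Arrays.asList("fire"));

        // Query "fire-fly" with mm=100%: the analyzer splits it into two terms.
        // Buggy behavior: mm still counts the single pre-analysis clause,
        // so the query degenerates to (fire OR fly).
        System.out.println(matches(doc, Arrays.asList("fire", "fly"), 1)); // true: wrongly matches
        // Fixed behavior: mm=100% is applied to the post-analysis clause count.
        System.out.println(matches(doc, Arrays.asList("fire", "fly"), 2)); // false: correctly rejected
    }
}
```

For CJK text, where every character can become its own token, nearly any document sharing one character with the query matches under the buggy counting, which is why the hit counts above drop so sharply with the patch.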


 Edismax parser does not honor mm parameter if analyzer splits a token
 -

 Key: SOLR-3589
 URL: https://issues.apache.org/jira/browse/SOLR-3589
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0-BETA
Reporter: Tom Burton-West
Assignee: Robert Muir
 Attachments: SOLR-3589-3.6.PATCH, SOLR-3589.patch, SOLR-3589.patch, 
 SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, 
 testSolr3589.xml.gz, testSolr3589.xml.gz


 With edismax mm set to 100%  if one of the tokens is split into two tokens by 
 the analyzer chain (i.e. fire-fly = fire fly), the mm parameter is 
 ignored and the equivalent of an OR query for fire OR fly is produced.
 This is particularly a problem for languages that do not use white space to 
 separate words, such as Chinese or Japanese.
 See these messages for more discussion:
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html
 http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html
 http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!

2012-11-07 Thread Michael McCandless
Looks like Yonik's Jetty upgrade caused these build failures
... Yonik, can you fix?  Thanks.

And we all should try to remember to run ant precommit before committing ...

Mike McCandless

http://blog.mikemccandless.com

On Wed, Nov 7, 2012 at 1:28 PM, Policeman Jenkins Server
jenk...@sd-datasolutions.de wrote:
 Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1473/
 Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC





BooleanFilter MUST clauses and getDocIdSet(acceptDocs)

2012-11-07 Thread david.w.smi...@gmail.com
I am about to write a Filter that only operates on a set of documents that
have already passed other filter(s).  It's rather expensive, since it has
to use DocValues to examine a value and then determine if it's a match.  So
it scales O(n) where n is the number of documents it must see.  The 2nd arg
of getDocIdSet is Bits acceptDocs.  Unfortunately Bits doesn't have an int
iterator, but I can deal with that by seeing if it extends DocIdSet.

I'm looking at BooleanFilter which I want to use and I notice that it
passes null to filter.getDocIdSet for acceptDocs, and it justifies this
with the following comment:
// we dont pass acceptDocs, we will filter at the end using an additional
filter
Uwe wrote this comment in relation to LUCENE-1536 (r1188624).
For the MUST clause loop, couldn't it give it the accumulated bits of the
MUST clauses?

~ David
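The idea David is asking about can be illustrated with a minimal stand-alone sketch. It uses java.util.BitSet rather than Lucene's Bits/DocIdSet API, and all names are hypothetical: each MUST clause receives the bits accumulated by the previous clauses, so an expensive O(n) filter placed later in the chain only examines the survivors.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class MustChain {
    // Cheap stand-in for Filter.getDocIdSet(context, acceptDocs): only the
    // documents already accepted are examined by the predicate.
    static BitSet filter(BitSet acceptDocs, IntPredicate pred) {
        BitSet out = new BitSet();
        for (int d = acceptDocs.nextSetBit(0); d >= 0; d = acceptDocs.nextSetBit(d + 1)) {
            if (pred.test(d)) out.set(d);
        }
        return out;
    }

    public static void main(String[] args) {
        int maxDoc = 10;
        BitSet acc = new BitSet();
        acc.set(0, maxDoc);                    // start with every doc accepted

        acc = filter(acc, d -> d % 2 == 0);    // cheap MUST clause runs first
        // An expensive per-doc clause now sees only the 5 survivors, not all 10 docs.
        acc = filter(acc, d -> d < 6);

        System.out.println(acc);               // {0, 2, 4}
    }
}
```

Whether BooleanFilter can safely do this depends on the semantics of its SHOULD and MUST_NOT clauses relative to acceptDocs, which is presumably why the current code defers filtering to the end.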


[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_09) - Build # 1479 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1479/
Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 29069 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:117:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 52 minutes 25 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!

2012-11-07 Thread Robert Muir
I committed a fix

On Wed, Nov 7, 2012 at 2:06 PM, Michael McCandless
luc...@mikemccandless.com wrote:
 Looks like Yonik's Jetty upgrade upgrade caused these build failures
 ... Yonik can you fix?  Thanks.

 And we all should try to remember to run ant precommit before committing ...

 Mike McCandless

 http://blog.mikemccandless.com






[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2012-11-07 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492619#comment-13492619
 ] 

Chris Russell commented on SOLR-2894:
-

Regarding my comment above, I have determined that the cause is this: if you 
specify a limit for a field that you are not requesting facet counts for, Solr 
will not automatically over-request on that field.  
i.e.
facet.pivot=somefield
f.somefield.facet.limit=10

This will make your pivots weird because the limit of 10 will not be 
over-requested unless you add this line:
facet.field=somefield

Since solr does not do distributed pivoting yet, this has not been an issue yet.
I am working on an update to the patch that will correct this issue.
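For context, here is a sketch of the over-request arithmetic involved. The formula is my reading of the 4.x FacetComponent source and may differ in detail; treat it as an assumption rather than a specification:

```java
public class OverRequest {
    // Shard-level facet limit: Solr 4.x's distributed facet code asks each
    // shard for roughly (limit * 1.5) + 10 values, so a term that narrowly
    // misses the cut on one shard can still be counted globally. This
    // formula is an assumption based on the 4.x source, for illustration.
    static int shardLimit(int limit) {
        return (int) (limit * 1.5) + 10;
    }

    public static void main(String[] args) {
        // With f.somefield.facet.limit=10, each shard would be asked for the
        // top 25 values, but only when over-requesting actually applies.
        System.out.println(shardLimit(10)); // 25
    }
}
```

The problem Chris describes is that this over-request is skipped for fields that appear only in facet.pivot, so per-shard truncation at the raw limit skews the merged pivot counts.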

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.1

 Attachments: distributed_pivot.patch, distributed_pivot.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.





Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!

2012-11-07 Thread Michael McCandless
Thanks Robert!

Mike McCandless

http://blog.mikemccandless.com

On Wed, Nov 7, 2012 at 2:27 PM, Robert Muir rcm...@gmail.com wrote:
 I committed a fix






[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.6.0_37) - Build # 2262 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2262/
Java: 64bit/jdk1.6.0_37 -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 28365 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 31 minutes 26 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0_37 -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure




[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1474 - Still Failing!

2012-11-07 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1474/
Java: 32bit/jdk1.6.0_37 -server -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 28179 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: 
The following files are missing svn:eol-style (or binary svn:mime-type):
* solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1
* solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1

Total time: 59 minutes 7 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -server -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Resolved] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery

2012-11-07 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4482.


Resolution: Fixed

The new Zing 5.5 release looks to have fixed this issue!  I can now pass all 
Lucene/Solr tests with Zing ... at least twice :)

 Likely Zing JVM bug causes failures in TestPayloadNearQuery
 ---

 Key: LUCENE-4482
 URL: https://issues.apache.org/jira/browse/LUCENE-4482
 Project: Lucene - Core
  Issue Type: Bug
 Environment: Lucene trunk, rev 1397735
 Zing:
 {noformat}
   java version 1.6.0_31
   Java(TM) SE Runtime Environment (build 1.6.0_31-6)
   Java HotSpot(TM) 64-Bit Tiered VM (build 
 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode)
 {noformat}
 Ubuntu 12.04 LTS 3.2.0-23-generic kernel
Reporter: Michael McCandless
 Attachments: LUCENE-4482.patch


 I dug into one of the Lucene test failures when running with Zing JVM
 (available free for open source devs...).  At least one other test
 sometimes fails but I haven't dug into that yet.
 I managed to get the failure easily reproduced: with the attached
 patch, on rev 1397735 checkout, if you cd to lucene/core and run:
 {noformat}
   ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 
 -Dtests.showSuccess=true
 {noformat}
 Then you'll hit several failures in TestPayloadNearQuery, eg:
 {noformat}
 Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
   1 FAILED
   2 NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery 
 -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true 
 -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
 ERROR   0.01s | TestPayloadNearQuery.test 
 Throwable #1: java.lang.RuntimeException: overridden idfExplain method 
 in TestPayloadNearQuery.BoostingSimilarity was not called
  at 
 __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
  at 
 org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
  at org.apache.lucene.search.spans.SpanWeight.init(SpanWeight.java:62)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.init(PayloadNearQuery.java:147)
  at 
 org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
  at 
 org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
  at 
 org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
  at 
 org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
  at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
  at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
  at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
  at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
  at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
  at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
  at 
 

[jira] [Comment Edited] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492464#comment-13492464
 ] 

Uwe Schindler edited comment on LUCENE-4546 at 11/7/12 9:25 PM:


Attached the corrected testcase, which passes.

-BTW: Your SorterTemplate implementation fails with with mergeSort completely- 
:-) _(this was incorrect, mergeSort does not use the pivot methods)_

  was (Author: thetaphi):
Attached the corrected testcase, which passes.

BTW: Your SorterTemplate implementation fails with with mergeSort completely :-)
  
 SorterTemplate.quicksort incorrect
 --

 Key: LUCENE-4546
 URL: https://issues.apache.org/jira/browse/LUCENE-4546
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Assignee: Uwe Schindler
  Labels: patch
 Fix For: 3.6.1, 4.0, 4.1

 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, 
 TestSorterTemplate.java


 On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon 
 inconsistent sorting behaviour, of course, only a randomized test caught 
 this;)
 Because SorterTemplate.quicksort is used in several places in the code 
 (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and 
 transitively index and search), I'm a bit puzzled that this either hasn't 
 been caught by another higher-level test or that neither my test nor my 
 understanding of an insufficiency in the code is valid;)
 If the former holds and given that the same code is released in 3.6 and 4.0, 
 this might even be a more critical issue requiring a higher priority than 
 'major'.
 So, can a second pair of eyes please have a timely look at the attached test 
 and patch?
 Basically the current quicksort implementation seems to assume that luckily 
 always the median is chosen as pivot element by grabbing the mid element, not 
 handling the case where the initially chosen pivot ends up not in the middle. 
 Hope this and the test helps to understand the issue.
 Reproducible, currently failing test and a patch attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect

2012-11-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492720#comment-13492720
 ] 

Uwe Schindler commented on LUCENE-4546:
---

Hi Stefan,
it has been some time since I worked on SorterTemplate, so I don't have all the 
facts in mind. There are different implementations of quicksort available, 
including some that work without the explicit pivot methods (like yours). But as 
far as I remember, the performance tests showed that the additional swaps and 
compares added some slowdown (depending on the order of the input data), so the 
explicit pivot methods helped. The SorterTemplate quicksort implementation is 
also the one that has been used in Lucene from the beginning, so I did not want 
to change the algorithm in a minor release. We could add some new performance 
tests with your implementation and compare the speed, but I think that, e.g., 
CollectionUtil, which uses Collections.swap(), would get much slower.

I agree, the class is very nice for sorting non-array data, but it is 
currently marked as @lucene.internal, so usability for non-Lucene code was 
never a design goal; performance was the only driving force :-) But I checked 
the javadocs: it is clearly documented that setPivot(i) has to store the value 
of slot i for later comparison with comparePivot(j).
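The setPivot/comparePivot contract Uwe describes can be sketched in a small stand-alone class (hypothetical names, not the actual SorterTemplate): because setPivot saves the pivot's *value* up front, the partition remains correct even when the element grabbed from the middle does not end up in the middle after partitioning.

```java
import java.util.Arrays;

public class PivotSorter {
    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2, 7};
        new IntSorter(a).quicksort(0, a.length - 1);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 5, 7, 8, 9]
    }
}

abstract class Sorter {
    protected abstract void swap(int i, int j);
    // Contract from the javadocs: setPivot(i) saves the value of slot i;
    // comparePivot(j) compares that saved value against slot j.
    protected abstract void setPivot(int i);
    protected abstract int comparePivot(int j);

    public void quicksort(int lo, int hi) {
        if (hi <= lo) return;
        setPivot((lo + hi) >>> 1);            // save the mid element's VALUE
        int left = lo, right = hi;
        while (left <= right) {
            while (comparePivot(left) > 0) left++;    // pivot > slot[left]
            while (comparePivot(right) < 0) right--;  // pivot < slot[right]
            if (left <= right) {
                swap(left, right);
                left++;
                right--;
            }
        }
        quicksort(lo, right);
        quicksort(left, hi);
    }
}

class IntSorter extends Sorter {
    private final int[] a;
    private int pivot;
    IntSorter(int[] a) { this.a = a; }
    protected void swap(int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
    protected void setPivot(int i) { pivot = a[i]; }
    protected int comparePivot(int j) { return Integer.compare(pivot, a[j]); }
}
```

An implementation whose setPivot merely remembers the index i (and whose comparePivot reads a[i] lazily) breaks as soon as a swap moves the element at that index, which is the kind of contract violation the javadoc wording guards against.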






[jira] [Commented] (SOLR-1916) investigate DIH use of default locale

2012-11-07 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492769#comment-13492769
 ] 

James Dyer commented on SOLR-1916:
--

Robert,

I'm having a hard time finding a seed, locale, or timezone for which 
TestEvaluatorBag#testGetDateFormatEvaluator will fail.  Can you provide more 
info?  (Maybe my jvm doesn't support enough locales for me to get a 
failure-prone one?)

 investigate DIH use of default locale
 -

 Key: SOLR-1916
 URL: https://issues.apache.org/jira/browse/SOLR-1916
 Project: Solr
  Issue Type: Task
  Components: contrib - DataImportHandler
Affects Versions: 3.1, 4.0-ALPHA
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.1

 Attachments: SOLR-1916.patch


 This is a spinoff from LUCENE-2466.
 In this issue I changed my locale to various locales and found some problems 
 in Lucene/Solr triggered by use of the default Locale.
 I noticed some use of the default-locale for Date operations in DIH 
 (TimeZone.getDefault/Locale.getDefault) and, while no tests fail, I think it 
 might be better to support a locale parameter for this.
 The wiki documents that numeric parsing can support localized numerics 
 formats: http://wiki.apache.org/solr/DataImportHandler#NumberFormatTransformer
 In both cases, I don't think we should ever use the default Locale. If no 
 Locale is provided, I find that new Locale() -- Unicode Root Locale, is a 
 better default for a server situation in a lot of cases, as it won't change 
 depending on the computer, or perhaps we just make Locale params mandatory 
 for this.
 Finally, in both cases, if localized numbers/dates are explicitly supported, 
 I think we should come up with a test strategy to ensure everything is 
 working. One idea is to do something similar to or make use of Lucene's 
 LocalizedTestCase.
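 The danger of relying on the default locale can be demonstrated with plain JDK formatting. This is a self-contained demo, not DIH code: the same number formats, and parses, differently depending on the locale in effect, so an import job's output silently changes with the server's default.

{code}
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class LocaleDemo {
    public static void main(String[] args) throws ParseException {
        double n = 1234.5;
        // The same value formats differently depending on the locale:
        System.out.println(NumberFormat.getInstance(Locale.US).format(n));      // 1,234.5
        System.out.println(NumberFormat.getInstance(Locale.GERMANY).format(n)); // 1.234,5

        // Parsing is just as locale-sensitive: "1,234" is one thousand two
        // hundred thirty-four in the US, but 1.234 in Germany.
        System.out.println(NumberFormat.getInstance(Locale.US).parse("1,234"));      // 1234
        System.out.println(NumberFormat.getInstance(Locale.GERMANY).parse("1,234")); // 1.234
    }
}
{code}

 This is why the Root locale (or a mandatory locale parameter) is a safer default for server code than Locale.getDefault().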





[jira] [Commented] (SOLR-1916) investigate DIH use of default locale

2012-11-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492775#comment-13492775
 ] 

Robert Muir commented on SOLR-1916:
---

James, thanks for looking at this!!!

It may not be a locale issue but rather a time zone issue (or both). But this 
test definitely failed intermittently in the past.

For example, it failed during a daylight saving time window (but only for 
developers in Europe!) and Chris Male addressed
some of the issues in SOLR-1821.

Fortunately, Uwe Schindler has made it dead easy to identify most of these 
issues: we no longer have to rely on unit tests alone.

http://blog.thetaphi.de/2012/07/default-locales-default-charsets-and.html

DIH currently has 40 violations!

Try this:
{noformat}
Index: build.xml
===
--- build.xml   (revision 1406757)
+++ build.xml   (working copy)
@@ -250,8 +250,6 @@
   /apiFileSet
   fileset dir=${basedir}/build
 include name=**/*.class /
-!-- exclude DIH for now as it is broken with Locales and Encodings: 
SOLR-1916 --
-exclude name=contrib/solr-dataimporthandler*/** /
   /fileset
 /forbidden-apis
   /target
{noformat}

Then run
{noformat}
rmuir@beast:~/workspace/lucene-trunk/solr$ ant check-forbidden-apis
...
-check-forbidden-java-apis:
[forbidden-apis] Reading API signatures: 
/home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/commons-io.txt
[forbidden-apis] Reading API signatures: 
/home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/executors.txt
[forbidden-apis] Reading API signatures: 
/home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/jdk-deprecated.txt
[forbidden-apis] Reading API signatures: 
/home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/jdk.txt
[forbidden-apis] Loading classes to check...
[forbidden-apis] Scanning for API signatures and dependencies...
[forbidden-apis] Forbidden method invocation: 
java.text.DecimalFormatSymbols#init()
[forbidden-apis]   in 
org.apache.solr.handler.dataimport.TestNumberFormatTransformer 
(TestNumberFormatTransformer.java:36)
[forbidden-apis] Forbidden method invocation: 
java.text.DecimalFormatSymbols#init()
[forbidden-apis]   in 
org.apache.solr.handler.dataimport.TestNumberFormatTransformer 
(TestNumberFormatTransformer.java:37)
[forbidden-apis] Forbidden method invocation: 
java.text.MessageFormat#init(java.lang.String)
[forbidden-apis]   in org.apache.solr.handler.dataimport.DebugLogger 
(DebugLogger.java:52)
[forbidden-apis] Forbidden method invocation: 
java.text.SimpleDateFormat#init(java.lang.String)
[forbidden-apis]   in 
org.apache.solr.handler.dataimport.TestDateFormatTransformer 
(TestDateFormatTransformer.java:43)
[forbidden-apis] Forbidden method invocation: 
java.text.SimpleDateFormat#init(java.lang.String)
[forbidden-apis]   in 
org.apache.solr.handler.dataimport.TestDateFormatTransformer 
(TestDateFormatTransformer.java:66)
[forbidden-apis] Forbidden method invocation: 
java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis]   in org.apache.solr.handler.dataimport.MailEntityProcessor 
(MailEntityProcessor.java:88)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis]   in org.apache.solr.handler.dataimport.TestDocBuilder2 
(TestDocBuilder2.java:250)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis]   in org.apache.solr.handler.dataimport.TestDocBuilder2 
(TestDocBuilder2.java:251)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis]   in org.apache.solr.handler.dataimport.TestDocBuilder2 
(TestDocBuilder2.java:252)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis]   in org.apache.solr.handler.dataimport.TestDocBuilder2 
(TestDocBuilder2.java:257)
[forbidden-apis] Forbidden method invocation: 
java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis]   in org.apache.solr.handler.dataimport.DataImporter$3 
(DataImporter.java:490)
[forbidden-apis] Forbidden method invocation: 
java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis]   in org.apache.solr.handler.dataimport.DocBuilder 
(DocBuilder.java:711)
[forbidden-apis] Forbidden method invocation: 
java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis]   in org.apache.solr.handler.dataimport.DocBuilder 
(DocBuilder.java:717)
[forbidden-apis] Forbidden method invocation: 
java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis]   in org.apache.solr.handler.dataimport.DocBuilder 
(DocBuilder.java:725)
[forbidden-apis] Forbidden method invocation: 
java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis]   in org.apache.solr.handler.dataimport.DocBuilder 
(DocBuilder.java:727)
[forbidden-apis] Forbidden method invocation: 

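For reference, these forbidden calls are typically fixed by passing an explicit Locale or Charset instead of relying on JVM defaults. A minimal illustration (the class name and literal values are mine, not from the patch):

```java
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class LocaleSafe {
    public static void main(String[] args) {
        // Instead of new SimpleDateFormat("yyyy-MM-dd") (default locale):
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        // Instead of s.getBytes() (default charset):
        byte[] bytes = "solr".getBytes(StandardCharsets.UTF_8);
        // Instead of String.format(fmt, args) (default locale):
        String s = String.format(Locale.ROOT, "%d docs", 42);
        System.out.println(df.toPattern() + " " + bytes.length + " " + s);
    }
}
```
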
[jira] [Resolved] (LUCENE-4527) CompressingStoredFieldsFormat: encode numStoredFields more efficiently

2012-11-07 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4527.
--

Resolution: Fixed

Committed
 - trunk: r1406704
 - branch 4.x: r1406712

 CompressingStoredFieldsFormat: encode numStoredFields more efficiently
 --

 Key: LUCENE-4527
 URL: https://issues.apache.org/jira/browse/LUCENE-4527
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1

 Attachments: LUCENE-4527.patch, LUCENE-4527.patch


 Another interesting idea from Robert: many applications have a schema and all 
 documents are likely to have the same number of stored fields. We could save 
 space by using packed ints and the same kind of optimization as {{ForUtil}} 
 (requiring only one VInt if all values are equal).

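A minimal sketch of the idea in plain Java (the class and method names are made up, not Lucene's actual ForUtil/PackedInts code): when every document in a chunk has the same number of stored fields, a single value is enough.

```java
import java.util.Arrays;

public class NumStoredFieldsSketch {

    // Returns how many values actually need to be written for a chunk:
    // if all per-doc counts are equal, a single VInt (plus a flag) suffices;
    // otherwise fall back to one value per document.
    static int encodedValueCount(int[] counts) {
        boolean allEqual = true;
        for (int c : counts) {
            if (c != counts[0]) { allEqual = false; break; }
        }
        return allEqual ? 1 : counts.length;
    }

    public static void main(String[] args) {
        int[] uniform = new int[128];
        Arrays.fill(uniform, 7);      // rigid schema: 7 stored fields per doc
        int[] mixed = {7, 7, 3, 7};   // varying counts

        System.out.println(encodedValueCount(uniform)); // 1
        System.out.println(encodedValueCount(mixed));   // 4
    }
}
```
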
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3855) DocValues support

2012-11-07 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated SOLR-3855:
---

Attachment: SOLR-3855.patch

New patch:
 - ability to have direct doc values,
 - doc values are not fetched by default, you need to explicitly add their 
name to the fl parameter to load them,
 - all tests pass except BasicDistributedZkTest.testDistribSearch, but it 
doesn't pass either without the patch applied on my (very slow...) laptop.

This patch is not perfect... for example, I am not happy that I had to add a 
new createDocValuesFields method in FieldType. The reason is that only poly 
fields are allowed to return several fields in createFields, but I think fixing 
this would require a more global refactoring and should not block this issue?

If you want to play with doc values and Solr, I modified the example schema.xml 
so that popularity and inStock have doc values enabled. You can try to display 
their values, sort on them and/or use function queries on them.

When a field is indexed and has doc values, the patch always tries to use doc 
values instead of the field cache.

 DocValues support
 -

 Key: SOLR-3855
 URL: https://issues.apache.org/jira/browse/SOLR-3855
 Project: Solr
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: SOLR-3855.patch, SOLR-3855.patch


 It would be nice if Solr supported DocValues:
  - for ID fields (fewer disk seeks when running distributed search),
  - for sorting/faceting/function queries (faster warmup time than fieldcache),
  - better on-disk and in-memory efficiency (you can use packed impls).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable

2012-11-07 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492892#comment-13492892
 ] 

Chris Male commented on LUCENE-4542:


Rafał,

Thanks for creating the patches, they are looking great.  A couple of very 
small improvements:

- Can we mark recursionCap as final?
- Can we improve the javadoc for the recursionCap parameter so it's clear what 
purpose it serves?
- Maybe also drop in a comment at the field about how the recursion cap of 2 is 
the default value based on documentation about Hunspell (as opposed to 
something we arbitrarily chose).

 Make RECURSION_CAP in HunspellStemmer configurable
 --

 Key: LUCENE-4542
 URL: https://issues.apache.org/jira/browse/LUCENE-4542
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Piotr
Assignee: Chris Male
 Attachments: LUCENE-4542.patch, LUCENE-4542-with-solr.patch


 Currently there is 
 private static final int RECURSION_CAP = 2;
 in the code of the class HunspellStemmer. It makes using Hunspell with
 several dictionaries almost unusable due to bad performance (for example, it
 costs 36 ms to stem a long sentence in Latvian with recursion_cap=2 and 5 ms
 with recursion_cap=1). It would be nice to be able to tune this number as
 needed. AFAIK this number (2) was chosen arbitrarily.
 (It's my first issue ever, so please forgive me any mistakes.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: BooleanFilter MUST clauses and getDocIdSet(acceptDocs)

2012-11-07 Thread Uwe Schindler
Hi David,

 

The idea of passing the already-built bits for the MUST clauses is a good one 
and can be implemented easily.

 

The reason why the acceptDocs were not passed down is the new way filters work 
in Lucene 4.0, and to optimize caching. AcceptDocs are the only thing that 
changes when deletions are applied, and filters are required to handle them 
separately: whenever something is able to cache (e.g. CachingWrapperFilter), 
the acceptDocs are not cached, so the underlying filters get null acceptDocs 
to produce the full bitset, and the filtering is done when 
CachingWrapperFilter gets the up-to-date acceptDocs. For this case it does not 
matter that the first filter clause does not get acceptDocs, but later MUST 
clauses of course can get them (they are not deletion-specific)!
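The caching contract described above can be illustrated with plain java.util.BitSet (these are not Lucene classes, just a sketch of "cache the filter without acceptDocs, intersect with the live docs at the end"):

```java
import java.util.BitSet;

public class CachingSketch {
    static BitSet cached;  // cached filter result, WITHOUT acceptDocs baked in

    // Stand-in for an expensive filter; computed only once.
    static BitSet filter(int maxDoc) {
        BitSet b = new BitSet(maxDoc);
        b.set(0); b.set(2); b.set(4);
        return b;
    }

    static BitSet getDocIdSet(int maxDoc, BitSet acceptDocs) {
        if (cached == null) cached = filter(maxDoc); // cache stays valid across deletes
        BitSet result = (BitSet) cached.clone();
        result.and(acceptDocs); // apply the up-to-date deletions at the end
        return result;
    }

    public static void main(String[] args) {
        BitSet live = new BitSet(5);
        live.set(0, 5);
        live.clear(2); // doc 2 was deleted after the cache entry was built
        System.out.println(getDocIdSet(5, live)); // {0, 4}
    }
}
```
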

 

Can you open an issue to optimize the MUST case (possibly MUST_NOT, too)?

 

Another thing that could help here: you can stop using BooleanFilter if you can 
apply the filters sequentially (only MUST clauses) by wrapping with multiple 
FilteredQuery instances: new FilteredQuery(new FilteredQuery(originalQuery, 
clause1), clause2). If the DocIdSets enable bits() and the FilteredQuery 
autodetection decides to use random-access filters, the acceptDocs are also 
passed down from the outside to the inner query, removing the documents 
filtered out.

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: david.w.smi...@gmail.com [mailto:david.w.smi...@gmail.com] 
Sent: Wednesday, November 07, 2012 8:23 PM
To: dev@lucene.apache.org
Subject: BooleanFilter MUST clauses and getDocIdSet(acceptDocs)

 

I am about to write a Filter that only operates on a set of documents that have 
already passed other filter(s).  It's rather expensive, since it has to use 
DocValues to examine a value and then determine if it's a match.  So it scales 
O(n) where n is the number of documents it must see.  The 2nd arg of 
getDocIdSet is Bits acceptDocs.  Unfortunately Bits doesn't have an int 
iterator, but I can deal with that by seeing if it extends DocIdSet.

 

I'm looking at BooleanFilter which I want to use and I notice that it passes 
null to filter.getDocIdSet for acceptDocs, and it justifies this with the 
following comment:

// we dont pass acceptDocs, we will filter at the end using an additional filter

Uwe wrote this comment in relation to LUCENE-1536 (r1188624).

For the MUST clause loop, couldn't it give it the accumulated bits of the MUST 
clauses?  

 

~ David



Fwd: [concurrency-interest] _interrupted field visibility bug in OpenJDK 7+

2012-11-07 Thread Dawid Weiss
Thought you'd be interested. I don't think it affects us but it's good
to know about it. Reproduces for me all the time on newer hotspots.

New (invisible) bug entry is at:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8003135

Dawid


-- Forwarded message --
From: Dr Heinz M. Kabutz he...@javaspecialists.eu
Date: Wed, Nov 7, 2012 at 11:00 PM
Subject: [concurrency-interest] _interrupted field visibility bug in OpenJDK 7+
To: concurrency-inter...@cs.oswego.edu


During a hands-on session today of my new Concurrency Specialist
Course, one of my students discovered what we think might be an
interesting and potentially serious bug in the JVM.  It seems that the
Server HotSpot in OpenJDK 7 may sometimes hoist the value of the
_interrupted field.  This is interesting, since the value is not
stored in Java, but rather in the OSThread.hpp file in the jint
_interrupted field.  It is also pretty serious, because it means we
cannot rely on the interrupted status in order to shut down threads.
This will affect Future.cancel(), ExecutorService.shutdownNow() and a
whole bunch of other mechanisms that use interruptions to
cooperatively cancel tasks.  (Obviously the exercise was more involved
than the code presented in this email, after all the course is aimed
at intermediate to advanced Java developers.  So please don't expect
that this won't happen in your code - I've just taken away unnecessary
code until we can see the bug without any of the paraphernalia that
might distract.)

First off, some code that works as expected.  As soon as you interrupt
the thread, it breaks out of the while() loop and exits:

    public void think() {
        while (true) {
            if (Thread.currentThread().isInterrupted()) break;
        }
        System.out.println("We're done thinking");
    }

However, if you extract the Thread.currentThread().isInterrupted()
into a separate method, then that might be optimized by HotSpot to
always return false and the code then never ends:

    public void think() {
        while (true) {
            if (checkInterruptedStatus()) break;
        }
        System.out.println("We're done thinking");
    }

    private boolean checkInterruptedStatus() {
        return Thread.currentThread().isInterrupted();
    }

My assumption is that the checkInterruptedStatus() method is
aggressively optimized and then the actual status is not read again.
This does not happen with the client hotspot and also not with Java
1.6.0_37.  It does happen with the 1.8 EA that I've got on my MacBook
Pro.  The student was using a Windows machine, so this is not just a Mac
problem.
Here is the complete code:

public class InterruptedVisibilityTest {
    public void think() {
        while (true) {
            if (checkInterruptedStatus()) break;
        }
        System.out.println("We're done thinking");
    }

    private boolean checkInterruptedStatus() {
        return Thread.currentThread().isInterrupted();
    }

    public static void main(String[] args) throws InterruptedException {
        final InterruptedVisibilityTest test =
            new InterruptedVisibilityTest();
        Thread thinkerThread = new Thread("Thinker") {
            public void run() {
                test.think();
            }
        };
        thinkerThread.start();
        Thread.sleep(500);
        thinkerThread.interrupt();
        long timeOfInterruption = System.currentTimeMillis();
        thinkerThread.join(500);
        if (thinkerThread.isAlive()) {
            System.err.println("Thinker did not shut down within 500ms");
            System.err.println("Error in Java Virtual Machine!");
            System.err.println("Interrupted: " + thinkerThread.isInterrupted());
            System.err.println();
            System.err.println("(Let's see if the thread ever dies and " +
                "how long it takes)");
            while (thinkerThread.isAlive()) {
                thinkerThread.join(1000);
                if (thinkerThread.isAlive()) {
                    System.err.println("  ... still waiting");
                }
            }
        }
        System.err.println("Finally, the thread has died - that took " +
            (System.currentTimeMillis() - timeOfInterruption) + "ms");
    }
}

As I said, the original code was more involved, but this demonstrates
the essentials.  I hope some of you might be able to take a look at
what's going on.

Regards

Heinz
--
Dr Heinz M. Kabutz (PhD CompSci)
Author of The Java(tm) Specialists' Newsletter
Sun Java Champion
IEEE Certified Software Development Professional
http://www.javaspecialists.eu
Tel: +30 69 75 595 262
Skype: kabutz


___
Concurrency-interest mailing list
concurrency-inter...@cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Compressed stored fields and multiGet(sorted luceneId[])?

2012-11-07 Thread eksdev
Just a theoretical question, would it make sense to add some sort of 
StoredDocument[] bulkGet(int[] docId) to fetch multiple stored documents in one 
go? 

The reasoning behind it is that now, with compressed blocks, random access gets 
more expensive, and in some cases a user needs to fetch many documents in one 
go. If it happens that several documents come from one block, it is a win. I 
would also assume that, even without compression, bulk access on sorted docIds 
could be a win (sequential access)?

Does that make sense, is it doable? Or even worse, does it already exist :)
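One way such a bulkGet could exploit block locality: sort the ids and group them by compressed block, so each block is located and decompressed only once per request. A plain-Java sketch (CHUNK_DOCS and groupByBlock are made-up names, not the codec's API):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class BulkGetSketch {
    static final int CHUNK_DOCS = 4; // hypothetical docs per compressed block

    // Group sorted docIds by the block they fall into, so the caller can
    // decompress each block once and pull all its requested docs from it.
    static Map<Integer, int[]> groupByBlock(int[] docIds) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);
        Map<Integer, int[]> blocks = new LinkedHashMap<>();
        int start = 0;
        for (int i = 1; i <= sorted.length; i++) {
            if (i == sorted.length
                    || sorted[i] / CHUNK_DOCS != sorted[start] / CHUNK_DOCS) {
                blocks.put(sorted[start] / CHUNK_DOCS,
                           Arrays.copyOfRange(sorted, start, i));
                start = i;
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // docs 5, 6, 7 share block 1; doc 13 is alone in block 3
        for (Map.Entry<Integer, int[]> e
                : groupByBlock(new int[]{13, 5, 7, 6}).entrySet()) {
            System.out.println("block " + e.getKey() + " -> "
                + Arrays.toString(e.getValue()));
        }
    }
}
```
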

By the way, I am impressed how well compression does, even on really short 
stored documents; at approx. 150 bytes we observe a 35% reduction. Fetching 
1000 short documents on a fully cached index is observably slower (2-3 times), 
but as soon as your memory gets low, compression wins quickly. I did not test 
it thoroughly, but it looks good so far. Great job!


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org