[jira] [Commented] (LUCENENET-484) Some possibly major tests intermittently fail
[ https://issues.apache.org/jira/browse/LUCENENET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286433#comment-13286433 ] Luc Vanlerberghe commented on LUCENENET-484: --- The failures in the TestSanity test cases are due to a bug in Cleanup, which is called whenever a GC is detected in CleanIfNeeded (itself called from several places). Cleanup actually drops all cache entries that have live keys instead of the other way around! I also corrected a race condition in WeakKey<T>.Equals (one that will probably only happen under heavy load, when you least expect it). I'll post patches with the corrections and updated test cases in a minute... Some possibly major tests intermittently fail -- Key: LUCENENET-484 URL: https://issues.apache.org/jira/browse/LUCENENET-484 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Reporter: Christopher Currens Fix For: Lucene.Net 3.0.3 These tests will fail intermittently in Debug or Release mode, in the core test suite: # -Lucene.Net.Index:- #- -TestConcurrentMergeScheduler.TestFlushExceptions- # Lucene.Net.Store: #- TestLockFactory.TestStressLocks # Lucene.Net.Search: #- TestSort.TestParallelMultiSort # Lucene.Net.Util: #- TestFieldCacheSanityChecker.TestInsanity1 #- TestFieldCacheSanityChecker.TestInsanity2 #- (It's possible all of the insanity tests fail at one point or another) # Lucene.Net.Support #- TestWeakHashTableMultiThreadAccess.Test TestWeakHashTableMultiThreadAccess should be fine to remove along with the WeakHashTable in the Support namespace, since it's been replaced with WeakDictionary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
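The inverted-condition bug described in the comment can be sketched with a minimal weak-keyed cache. This is an illustrative Java analogue (the Lucene.Net source is C#, and its WeakDictionary differs in detail): cleanup must evict entries whose key has been garbage-collected, whereas the buggy version evicted exactly the live ones.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch, not the Lucene.Net code: a cache keyed by weak
// references, with a cleanup pass that drops dead entries.
class WeakCache<K, V> {
    private final Map<WeakReference<K>, V> map = new HashMap<>();

    void put(K key, V value) {
        map.put(new WeakReference<>(key), value);
    }

    void cleanup() {
        for (Iterator<Map.Entry<WeakReference<K>, V>> it = map.entrySet().iterator(); it.hasNext(); ) {
            // Correct: remove when the referent is gone (key was collected).
            // The bug described above is the inverted test, i.e. removing
            // when get() != null, which drops entries with live keys.
            if (it.next().getKey().get() == null) {
                it.remove();
            }
        }
    }

    int size() {
        return map.size();
    }
}
```

With the condition the right way around, an entry whose key is still strongly reachable survives any number of cleanup passes.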
[jira] [Updated] (LUCENENET-484) Some possibly major tests intermittently fail
[ https://issues.apache.org/jira/browse/LUCENENET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luc Vanlerberghe updated LUCENENET-484: --- Attachment: Lucenenet-484-WeakDictionaryTests.patch This patch removes WeakHashtable and uses its tests for WeakDictionary instead (I actually renamed the test files and updated the tests so subversion would keep the history, but the .patch format apparently doesn't keep that info...) Some possibly major tests intermittently fail -- Key: LUCENENET-484 URL: https://issues.apache.org/jira/browse/LUCENENET-484 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Reporter: Christopher Currens Fix For: Lucene.Net 3.0.3 Attachments: Lucenenet-484-WeakDictionary.patch, Lucenenet-484-WeakDictionaryTests.patch
[jira] [Created] (LUCENENET-493) Make lucene.net culture insensitive (like the java version)
Luc Vanlerberghe created LUCENENET-493: -- Summary: Make lucene.net culture insensitive (like the java version) Key: LUCENENET-493 URL: https://issues.apache.org/jira/browse/LUCENENET-493 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Reporter: Luc Vanlerberghe Fix For: Lucene.Net 3.0.3 In Java, conversion of the basic types to and from strings is locale (culture) independent. For localized input/output one needs to use the classes in the java.text package. In .Net, conversion of the basic types to and from strings depends on the default Culture, unless you specify CultureInfo.InvariantCulture explicitly. Some of the test cases in lucene.net fail if they are not run on a machine with the culture set to US. In the current version of lucene.net there are patches here and there that try to correct for some specific cases by using string replacement (like System.Double.Parse(s.Replace(".", CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator))), but that seems really ugly. I am submitting a patch here that removes the old workarounds and replaces them with calls to classes in the Lucene.Net.Support namespace that try to handle the conversions in a compatible way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
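For reference, this is the Java behavior the issue wants to mirror: basic-type conversions ignore the default locale, and localized formatting is an explicit opt-in via java.text. A small demonstration (on the .NET side, the equivalent of the invariant path is passing CultureInfo.InvariantCulture):

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocaleDemo {
    public static void main(String[] args) {
        // Double.parseDouble / Double.toString are locale-independent:
        // they always use '.' as the decimal separator, whatever the
        // default locale is.
        double d = Double.parseDouble("3.14");
        System.out.println(Double.toString(d)); // "3.14"

        // Localized conversion must go through java.text explicitly.
        NumberFormat de = NumberFormat.getInstance(Locale.GERMANY);
        System.out.println(de.format(3.14)); // "3,14"
    }
}
```

Because the locale-sensitive path is opt-in in Java, ported code that calls the .NET default (culture-sensitive) conversions silently changes behavior on non-US machines, which is exactly what the failing test cases show.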
[jira] [Updated] (LUCENENET-493) Make lucene.net culture insensitive (like the java version)
[ https://issues.apache.org/jira/browse/LUCENENET-493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luc Vanlerberghe updated LUCENENET-493: --- Attachment: Lucenenet-493.patch Makes lucene.net locale/culture independent (like the java version). Solves a few test cases that fail when run on a machine with a non-US culture. Make lucene.net culture insensitive (like the java version) --- Key: LUCENENET-493 URL: https://issues.apache.org/jira/browse/LUCENENET-493 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Reporter: Luc Vanlerberghe Labels: patch Fix For: Lucene.Net 3.0.3 Attachments: Lucenenet-493.patch
[jira] [Commented] (LUCENENET-484) Some possibly major tests intermittently fail
[ https://issues.apache.org/jira/browse/LUCENENET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286699#comment-13286699 ] Christopher Currens commented on LUCENENET-484: --- Thanks Luc. This is great stuff. I'll run the patch on my local box and double-check everything. Your help with this is appreciated by all of us! Some possibly major tests intermittently fail -- Key: LUCENENET-484 URL: https://issues.apache.org/jira/browse/LUCENENET-484 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Reporter: Christopher Currens Fix For: Lucene.Net 3.0.3 Attachments: Lucenenet-484-WeakDictionary.patch, Lucenenet-484-WeakDictionaryTests.patch
[jira] [Updated] (LUCENENET-484) Some possibly major tests intermittently fail
[ https://issues.apache.org/jira/browse/LUCENENET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christopher Currens updated LUCENENET-484: -- Description: These tests will fail intermittently in Debug or Release mode, in the core test suite: # -Lucene.Net.Index:- #- -TestConcurrentMergeScheduler.TestFlushExceptions- # Lucene.Net.Store: #- TestLockFactory.TestStressLocks # Lucene.Net.Search: #- TestSort.TestParallelMultiSort # -Lucene.Net.Util:- #- -TestFieldCacheSanityChecker.TestInsanity1- #- -TestFieldCacheSanityChecker.TestInsanity2- #- -(It's possible all of the insanity tests fail at one point or another)- # -Lucene.Net.Support- #- -TestWeakHashTableMultiThreadAccess.Test- TestWeakHashTableMultiThreadAccess should be fine to remove along with the WeakHashTable in the Support namespace, since it's been replaced with WeakDictionary. was: These tests will fail intermittently in Debug or Release mode, in the core test suite: # -Lucene.Net.Index:- #- -TestConcurrentMergeScheduler.TestFlushExceptions- # Lucene.Net.Store: #- TestLockFactory.TestStressLocks # Lucene.Net.Search: #- TestSort.TestParallelMultiSort # Lucene.Net.Util: #- TestFieldCacheSanityChecker.TestInsanity1 #- TestFieldCacheSanityChecker.TestInsanity2 #- (It's possible all of the insanity tests fail at one point or another) # Lucene.Net.Support #- TestWeakHashTableMultiThreadAccess.Test TestWeakHashTableMultiThreadAccess should be fine to remove along with the WeakHashTable in the Support namespace, since it's been replaced with WeakDictionary. Environment: All Applied the patches. Getting closer to resolving this issue. 
Some possibly major tests intermittently fail -- Key: LUCENENET-484 URL: https://issues.apache.org/jira/browse/LUCENENET-484 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core, Lucene.Net Test Affects Versions: Lucene.Net 3.0.3 Environment: All Reporter: Christopher Currens Fix For: Lucene.Net 3.0.3 Attachments: Lucenenet-484-WeakDictionary.patch, Lucenenet-484-WeakDictionaryTests.patch
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #191
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/191/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4077) ToParentBlockJoinCollector provides no way to access computed scores and the maxScore
[ https://issues.apache.org/jira/browse/LUCENE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286389#comment-13286389 ] Christoph Kaser commented on LUCENE-4077: - Thank you, now it works perfectly! ToParentBlockJoinCollector provides no way to access computed scores and the maxScore - Key: LUCENE-4077 URL: https://issues.apache.org/jira/browse/LUCENE-4077 Project: Lucene - Java Issue Type: Bug Components: modules/join Affects Versions: 3.4, 3.5, 3.6 Reporter: Christoph Kaser Assignee: Michael McCandless Attachments: LUCENE-4077.patch, LUCENE-4077.patch, LUCENE-4077.patch, LUCENE-4077.patch The constructor of ToParentBlockJoinCollector allows turning on the tracking of parent scores and the maximum parent score, but there is no way to access those scores because: * maxScore is a private field, and there is no getter * TopGroups / GroupDocs does not provide access to the scores for the parent documents, only the children -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #337
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/337/ -- [...truncated 10767 lines...] [junit4] 2 1081 T3671 oashc.HttpShardHandlerFactory.getParameter Setting maximumPoolSize to: 2147483647 [junit4] 2 1081 T3671 oashc.HttpShardHandlerFactory.getParameter Setting maxThreadIdleTime to: 5 [junit4] 2 1081 T3671 oashc.HttpShardHandlerFactory.getParameter Setting sizeOfQueue to: -1 [junit4] 2 1082 T3671 oashc.HttpShardHandlerFactory.getParameter Setting fairnessPolicy to: false [junit4] 2 1082 T3671 oascsi.HttpClientUtil.createClient Creating new http client, config:maxConnectionsPerHost=20maxConnections=1socketTimeout=0connTimeout=0retry=false [junit4] 2 1086 T3673 oasc.SolrCore.registerSearcher [collection1] Registered new searcher Searcher@77668827 main{StandardDirectoryReader(segments_2:3 _0(5.0):C3)} [junit4] 2 1086 T3671 oasc.CoreContainer.register registering core: collection1 [junit4] 2 1087 T3671 oas.SolrTestCaseJ4.initCore initCore end [junit4] 2 ASYNC NEW_CORE C220 name=collection1 org.apache.solr.core.SolrCore@107ede1 [junit4] 2 1087 T3671 C220 REQ [collection1] webapp=null path=null params={q=acspellcheck.count=2qt=/suggest_tstspellcheck.onlyMorePopular=true} status=0 QTime=0 [junit4] 2 1091 T3671 oas.SolrTestCaseJ4.assertQ SEVERE REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='ac']/int[@name='numFound'][.='2'] [junit4] 2xml response was: ?xml version=1.0 encoding=UTF-8? 
[junit4] 2response [junit4] 2lst name=responseHeaderint name=status0/intint name=QTime0/int/lstlst name=spellchecklst name=suggestions//lst [junit4] 2/response [junit4] 2 [junit4] 2request was:q=acspellcheck.count=2qt=/suggest_tstspellcheck.onlyMorePopular=true [junit4] 2 1091 T3671 oasc.SolrException.log SEVERE REQUEST FAILED: q=acspellcheck.count=2qt=/suggest_tstspellcheck.onlyMorePopular=true:java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='ac']/int[@name='numFound'][.='2'] [junit4] 2xml response was: ?xml version=1.0 encoding=UTF-8? [junit4] 2response [junit4] 2lst name=responseHeaderint name=status0/intint name=QTime0/int/lstlst name=spellchecklst name=suggestions//lst [junit4] 2/response [junit4] 2 [junit4] 2request was:q=acspellcheck.count=2qt=/suggest_tstspellcheck.onlyMorePopular=true [junit4] 2at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:452) [junit4] 2at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) [junit4] 2at org.apache.solr.spelling.suggest.SuggesterTest.testReload(SuggesterTest.java:91) [junit4] 2at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit4] 2at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit4] 2at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit4] 2at java.lang.reflect.Method.invoke(Method.java:597) [junit4] 2at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) [junit4] 2at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) [junit4] 2at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) [junit4] 2at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) [junit4] 2at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) [junit4] 2at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) [junit4] 2at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) [junit4] 2at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) [junit4] 2at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) [junit4] 2at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) [junit4] 2at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) [junit4] 2at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) [junit4] 2
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #338
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/338/ -- [...truncated 10266 lines...] [junit4] Completed in 0.93s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.TermVectorComponentTest [junit4] Completed in 1.19s, 4 tests [junit4] [junit4] Suite: org.apache.solr.core.RAMDirectoryFactoryTest [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.analysis.TestItalianLightStemFilterFactory [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.RequestHandlersTest [junit4] Completed in 1.23s, 3 tests [junit4] [junit4] Suite: org.apache.solr.search.TestSolrQueryParser [junit4] Completed in 0.92s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestSort [junit4] Completed in 3.97s, 2 tests [junit4] [junit4] Suite: org.apache.solr.search.function.TestFunctionQuery [junit4] Completed in 3.11s, 14 tests [junit4] [junit4] Suite: org.apache.solr.search.TestRealTimeGet [junit4] IGNOR/A 0.00s | TestRealTimeGet.testStressRecovery [junit4] Assumption #1: FIXME: This test is horribly slow sometimes on Windows! 
[junit4] 2 28508 T2206 oas.SolrTestCaseJ4.setUp ###Starting testStressRecovery [junit4] 2 28508 T2206 oas.SolrTestCaseJ4.tearDown ###Ending testStressRecovery [junit4] 2 [junit4] Completed in 28.65s, 8 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.cloud.OverseerTest [junit4] Completed in 48.54s, 7 tests [junit4] [junit4] Suite: org.apache.solr.cloud.LeaderElectionTest [junit4] Completed in 20.93s, 4 tests [junit4] [junit4] Suite: org.apache.solr.cloud.RecoveryZkTest [junit4] Completed in 35.65s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.LeaderElectionIntegrationTest [junit4] Completed in 29.88s, 2 tests [junit4] [junit4] Suite: org.apache.solr.request.TestFaceting [junit4] Completed in 12.36s, 3 tests [junit4] [junit4] Suite: org.apache.solr.update.DirectUpdateHandlerTest [junit4] Completed in 2.77s, 6 tests [junit4] [junit4] Suite: org.apache.solr.update.PeerSyncTest [junit4] Completed in 4.51s, 1 test [junit4] [junit4] Suite: org.apache.solr.ConvertedLegacyTest [junit4] Completed in 3.20s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest [junit4] Completed in 0.94s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest [junit4] Completed in 1.87s, 1 test [junit4] [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest [junit4] Completed in 1.33s, 5 tests [junit4] [junit4] Suite: org.apache.solr.request.TestWriterPerf [junit4] Completed in 1.12s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.similarities.TestLMDirichletSimilarityFactory [junit4] Completed in 0.17s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.TermsComponentTest [junit4] Completed in 1.19s, 13 tests [junit4] [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest [junit4] Completed in 2.18s, 2 tests [junit4] [junit4] Suite: org.apache.solr.spelling.SpellCheckCollatorTest [junit4] Completed in 2.23s, 6 tests [junit4] [junit4] Suite: 
org.apache.solr.search.SpatialFilterTest [junit4] Completed in 1.80s, 3 tests [junit4] [junit4] Suite: org.apache.solr.schema.PolyFieldTest [junit4] Completed in 1.39s, 4 tests [junit4] [junit4] Suite: org.apache.solr.schema.CopyFieldTest [junit4] Completed in 0.67s, 6 tests [junit4] [junit4] Suite: org.apache.solr.update.processor.FieldMutatingUpdateProcessorTest [junit4] Completed in 0.90s, 20 tests [junit4] [junit4] Suite: org.apache.solr.search.TestDocSet [junit4] Completed in 0.70s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.XmlUpdateRequestHandlerTest [junit4] Completed in 0.93s, 3 tests [junit4] [junit4] Suite: org.apache.solr.handler.TestCSVLoader [junit4] Completed in 1.24s, 5 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.DebugComponentTest [junit4] Completed in 1.05s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.JsonLoaderTest [junit4] Completed in 0.94s, 5 tests [junit4] [junit4] Suite: org.apache.solr.response.TestCSVResponseWriter [junit4] Completed in 0.86s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.QueryParsingTest [junit4] Completed in 0.91s, 3 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.SearchHandlerTest [junit4] Completed in 0.91s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.UpdateParamsTest [junit4] Completed in 0.92s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.ReturnFieldsTest
[jira] [Commented] (LUCENE-4090) PerFieldPostingsFormat cannot use name as suffix
[ https://issues.apache.org/jira/browse/LUCENE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286411#comment-13286411 ] Mark Harwood commented on LUCENE-4090: -- Thanks for the quick fix, Rob :) Working fine for me here now. PerFieldPostingsFormat cannot use name as suffix Key: LUCENE-4090 URL: https://issues.apache.org/jira/browse/LUCENE-4090 Project: Lucene - Java Issue Type: Bug Components: core/index Affects Versions: 4.0 Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.0, 5.0 Attachments: LUCENE-4090.patch, LUCENE-4090.patch Currently PFPF just records the name in the metadata, which matches up to the segment suffix. But this isn't enough: e.g. someone can use Pulsing(1) on one field and Pulsing(2) on another field. See Mark Harwood's examples struggling with this on LUCENE-4069. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
Thanks Robert for the answers, I'll investigate this approach. -- Renaud Delbru On 28/05/12 21:59, Robert Muir (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284553#comment-13284553 ] Robert Muir commented on LUCENE-4055: - Well, you can do postingsFormat instanceof PerFieldPostingsFormat + postingsFormat.getPostingsFormatForField if you really want. But keep in mind PerFieldPostingsFormat is not really special, just one we provide for convenience; obviously one could write their own PostingsFormat that implements the same thing in a different way. Refactor SegmentInfo / FieldInfo to make them extensible Key: LUCENE-4055 URL: https://issues.apache.org/jira/browse/LUCENE-4055 Project: Lucene - Java Issue Type: Improvement Components: core/codecs Reporter: Andrzej Bialecki Assignee: Robert Muir Fix For: 4.0 Attachments: LUCENE-4055.patch After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes should be made abstract so that they can be extended by Codec-s. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
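The instanceof-then-delegate check Robert mentions can be sketched with stand-in classes. These are minimal analogues with the same shape as the real Lucene types, not the actual API (in particular, the name strings and the resolution logic here are placeholders):

```java
// Stand-ins only: minimal analogues of the Lucene classes.
abstract class PostingsFormat {
    private final String name;
    PostingsFormat(String name) { this.name = name; }
    String getName() { return name; }
}

class PulsingPostingsFormat extends PostingsFormat {
    PulsingPostingsFormat() { super("Pulsing"); }
}

class PerFieldPostingsFormat extends PostingsFormat {
    PerFieldPostingsFormat() { super("PerField"); }
    // In real Lucene this looks up the concrete format configured
    // for the given field; here it is a fixed placeholder.
    PostingsFormat getPostingsFormatForField(String field) {
        return new PulsingPostingsFormat();
    }
}

class FormatInspector {
    // The check described above: if the codec's format is per-field,
    // delegate to it to find the concrete format for one field.
    static String formatNameForField(PostingsFormat pf, String field) {
        if (pf instanceof PerFieldPostingsFormat) {
            return ((PerFieldPostingsFormat) pf).getPostingsFormatForField(field).getName();
        }
        return pf.getName();
    }
}
```

As Robert notes, this only works for the per-field wrapper Lucene happens to ship; a custom PostingsFormat with its own per-field scheme would not be caught by the instanceof test.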
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #192
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/192/ -- [...truncated 14362 lines...] [junit4] 2 26209 T3142 oasc.RequestHandlers.initHandlersFromConfig adding lazy requestHandler: solr.ReplicationHandler [junit4] 2 26209 T3142 oasc.RequestHandlers.initHandlersFromConfig created /replication: solr.ReplicationHandler [junit4] 2 26209 T3142 oasc.RequestHandlers.initHandlersFromConfig created standard: solr.StandardRequestHandler [junit4] 2 26209 T3142 oasc.RequestHandlers.initHandlersFromConfig created /get: solr.RealTimeGetHandler [junit4] 2 26210 T3142 oasc.RequestHandlers.initHandlersFromConfig created dismax: solr.SearchHandler [junit4] 2 26210 T3142 oasc.RequestHandlers.initHandlersFromConfig created dismaxNoDefaults: solr.SearchHandler [junit4] 2 26210 T3142 oasc.RequestHandlers.initHandlersFromConfig created mock: org.apache.solr.core.MockQuerySenderListenerReqHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created /admin/: org.apache.solr.handler.admin.AdminHandlers [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created defaults: solr.StandardRequestHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig adding lazy requestHandler: solr.StandardRequestHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created lazy: solr.StandardRequestHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created /update: solr.UpdateRequestHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created /terms: org.apache.solr.handler.component.SearchHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH: org.apache.solr.handler.component.SearchHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler [junit4] 2 26211 T3142 oasc.RequestHandlers.initHandlersFromConfig created 
spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler [junit4] 2 26212 T3142 oasc.RequestHandlers.initHandlersFromConfig created tvrh: org.apache.solr.handler.component.SearchHandler [junit4] 2 26213 T3142 oasc.RequestHandlers.initHandlersFromConfig created /mlt: solr.MoreLikeThisHandler [junit4] 2 26213 T3142 oasc.RequestHandlers.initHandlersFromConfig created /debug/dump: solr.DumpRequestHandler [junit4] 2 26214 T3142 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60 [junit4] 2 26216 T3142 oasc.SolrCore.initDeprecatedSupport WARNING solrconfig.xml uses deprecated admin/gettableFiles, Please update your config to use the ShowFileRequestHandler. [junit4] 2 26217 T3142 oasc.SolrCore.initDeprecatedSupport WARNING adding ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, SCHEMA-PHRASESUGGEST.XML, STEMDICT.TXT, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, 
BAD-SCHEMA-DUP-FIELDTYPE.XML, SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT,
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #339
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/339/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3498) ContentStreamUpdateRequest doesn't seem to respect setCommitWithin()
Christian Moen created SOLR-3498: Summary: ContentStreamUpdateRequest doesn't seem to respect setCommitWithin() Key: SOLR-3498 URL: https://issues.apache.org/jira/browse/SOLR-3498 Project: Solr Issue Type: Bug Components: update Affects Versions: 3.6 Reporter: Christian Moen I'm using the below code to post some office-format files to Solr using SolrJ. It seems like {{setCommitWithin()}} is ignored in my {{ContentStreamUpdateRequest}} request, and that I need to use {{setParam(UpdateParams.COMMIT_WITHIN, ...)}} instead to get the desired effect. {code} SolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr"); ContentStreamUpdateRequest updateRequest = new ContentStreamUpdateRequest("/update/extract"); updateRequest.addFile(file); updateRequest.setParam("literal.id", file.getName()); updateRequest.setCommitWithin(1); // Does not work //updateRequest.setParam(UpdateParams.COMMIT_WITHIN, "1"); // Works updateRequest.process(solrServer); {code} If I use the below {code} ... //updateRequest.setCommitWithin(1); // Does not work updateRequest.setParam(UpdateParams.COMMIT_WITHIN, "1"); // Works ... {code} I get the desired result and a commit is being done. I'm doing this on 3.x, but I believe this issue could apply to 4.x as well (from quickly glancing over the code with tired eyes); I haven't verified this yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENENET-484) Some possibly major tests intermittently fail
[ https://issues.apache.org/jira/browse/LUCENENET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luc Vanlerberghe updated LUCENENET-484: Attachment: Lucenenet-484-WeakDictionary.patch. Corrects both Clean() and WeakKey<T>.Equals.
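The inverted cleanup predicate described in the comment above is easy to see in miniature. The following is an illustrative Java sketch only (the actual Lucene.Net WeakDictionary is C# and its code differs): a cleanup pass over a weak-keyed map must drop entries whose keys have been garbage-collected and keep entries whose keys are still alive; swapping that test evicts the live entries instead.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch, not the Lucene.Net source: cleanup must remove
// entries whose weak keys have been collected and KEEP the live ones.
// Inverting the null test is exactly the bug described above.
public class WeakCleanupSketch {
    static <V> void cleanup(Map<WeakReference<Object>, V> map) {
        Iterator<Map.Entry<WeakReference<Object>, V>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getKey().get() == null) { // key collected -> drop entry
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<WeakReference<Object>, String> map = new HashMap<>();
        Object live = new Object();                 // strongly referenced: stays alive
        map.put(new WeakReference<>(live), "live");
        map.put(new WeakReference<>(null), "dead"); // simulates an already-collected key
        cleanup(map);
        System.out.println(map.size());             // only the live entry remains
    }
}
```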
[jira] [Updated] (LUCENE-4019) Parsing Hunspell affix rules without regexp condition
[ https://issues.apache.org/jira/browse/LUCENE-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Cavanna updated LUCENE-4019: - Attachment: LUCENE-4019.patch

Hi Chris, thanks for your feedback. Here is a new patch containing a new option to enable/disable the strict affix parsing; by default it is enabled. I also updated the HunspellStemFilterFactory to expose the new option to Solr.

Parsing Hunspell affix rules without regexp condition - Key: LUCENE-4019 URL: https://issues.apache.org/jira/browse/LUCENE-4019 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Affects Versions: 3.6 Reporter: Luca Cavanna Assignee: Chris Male Attachments: LUCENE-4019.patch, LUCENE-4019.patch

We found out that some recent Dutch hunspell dictionaries contain suffix or prefix rules like the following:

{code}
SFX Na N 1
SFX Na 0 ste
{code}

The rule on the second line doesn't contain the 5th parameter, which should be the condition (usually a regexp). You usually see a '.' as the condition, meaning always (for every character). As explained in LUCENE-3976, the readAffix method throws an error. I wonder if we should treat the missing value as a kind of default value, like '.'. On the other hand, I haven't found any information about this within the spec. Any thoughts?
[jira] [Commented] (LUCENE-4019) Parsing Hunspell affix rules without regexp condition
[ https://issues.apache.org/jira/browse/LUCENE-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286474#comment-13286474 ] Chris Male commented on LUCENE-4019: Hi Luca, Thanks for taking a shot at this. I wonder whether we can improve the ParseException message? At the very least it should include the line that is causing the problem, so people can find it. Even better would be to also include the line number. The latter is probably not so urgent, but it would be handy to have for other parsing errors too. Also, I think the changes to the Factory are wrong:

{code}
+ if (strictAffixParsing.equalsIgnoreCase("true")) ignoreCase = true;
+ else if (strictAffixParsing.equalsIgnoreCase("false")) ignoreCase = false;
{code}
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #193
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/193/
[jira] [Created] (LUCENE-4097) index was locked because of InterruptedException
wang created LUCENE-4097: Summary: index was locked because of InterruptedException Key: LUCENE-4097 URL: https://issues.apache.org/jira/browse/LUCENE-4097 Project: Lucene - Java Issue Type: Bug Reporter: wang

The index was locked because of an InterruptedException, and I could do nothing but restart Tomcat. How can I avoid this happening again? Thanks. This is the stack trace:

{code}
org.apache.lucene.util.ThreadInterruptedException: java.lang.InterruptedException
	at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4118)
	at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2836)
	at org.apache.lucene.index.IndexWriter.finishMerges(IndexWriter.java:2821)
	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1847)
	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1800)
	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1764)
	at org.opencms.search.CmsSearchManager.updateIndexIncremental(CmsSearchManager.java:2262)
	at org.opencms.search.CmsSearchManager.updateIndexOffline(CmsSearchManager.java:2306)
	at org.opencms.search.CmsSearchManager$CmsSearchOfflineIndexThread.run(CmsSearchManager.java:327)
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4116)
	... 8 more
{code}
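The failure mode in the report can be reproduced in miniature. This is a hypothetical sketch, with no Lucene classes involved and all names made up: a close() that wait()s internally is aborted by an InterruptedException, so its final cleanup step (think: releasing write.lock) never runs unless the caller restores the interrupt status and performs fallback cleanup.

```java
// Hypothetical sketch of the failure mode: an interrupt during an internal
// wait() aborts close() before its cleanup step runs, so the caller must
// restore the interrupt flag and do fallback cleanup itself.
public class InterruptedCloseSketch {
    static boolean lockReleased = false;
    static final Object monitor = new Object();

    static void close() throws InterruptedException {
        synchronized (monitor) {
            monitor.wait(1); // like IndexWriter.doWait(): throws if interrupted
        }
        lockReleased = true; // never reached when the wait is interrupted
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate an external interrupt
        try {
            close();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt status
            lockReleased = true;                // fallback cleanup step
        }
        System.out.println("lock released: " + lockReleased);
    }
}
```

In real code the fallback cleanup would be something like calling IndexWriter.rollback() (or retrying close) in a finally block, so the write lock is not left held when a background thread is interrupted.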
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2719 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2719/ 1 tests failed.

REGRESSION: org.apache.lucene.util.packed.TestPackedInts.testIntOverflow

Error Message: Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
	at __randomizedtesting.SeedInfo.seed([26B73FF6A7A21CED:83602BDCBFF8E4B8]:0)
	at org.apache.lucene.util.packed.Packed64SingleBlock.<init>(Packed64SingleBlock.java:115)
	at org.apache.lucene.util.packed.Packed64SingleBlock$Packed64SingleBlock5.<init>(Packed64SingleBlock.java:279)
	at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:68)
	at org.apache.lucene.util.packed.TestPackedInts.testIntOverflow(TestPackedInts.java:303)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)

Build Log (for compile errors): [...truncated 1559 lines...]
[jira] [Updated] (LUCENE-4019) Parsing Hunspell affix rules without regexp condition
[ https://issues.apache.org/jira/browse/LUCENE-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Cavanna updated LUCENE-4019: - Attachment: LUCENE-4019.patch

Yeah, sorry for my mistakes, I corrected them. And I added the line number to the ParseException. Let me know if there's something more I can do!
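The line-number idea can be sketched quickly. The following is hypothetical code, not the patch itself (the method name, message text, and header-detection trick are all made up for illustration): count lines while reading affix rules and store the count in the ParseException's error offset, so a rule with a missing condition field is reported together with its position; lenient mode would instead fall back to '.' as the condition.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.text.ParseException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the idea in the patch: track the current line
// number while reading affix rules so a malformed rule can be reported
// with both its text and its position.
public class AffixParseSketch {
    static void readAffixes(BufferedReader reader, boolean strict) throws IOException, ParseException {
        Set<String> seenFlags = new HashSet<>();
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && (parts[0].equals("SFX") || parts[0].equals("PFX"))) {
                if (seenFlags.add(parts[1])) {
                    continue; // first line for a flag is the header, not a rule
                }
                if (parts.length == 4) { // rule without the 5th (condition) field
                    if (strict) {
                        throw new ParseException("Affix rule has no condition: " + line, lineNumber);
                    }
                    // lenient mode would default the condition to "." (match always)
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String rules = "SFX Na N 1\nSFX Na 0 ste";
        try {
            readAffixes(new BufferedReader(new StringReader(rules)), true);
        } catch (ParseException e) {
            System.out.println(e.getMessage() + " (line " + e.getErrorOffset() + ")");
        }
    }
}
```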
[jira] [Commented] (SOLR-3350) TextField's parseFieldQuery method not using analyzer's enablePosIncr parameter
[ https://issues.apache.org/jira/browse/SOLR-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286531#comment-13286531 ] Tommaso Teofili commented on SOLR-3350: --- For now I think we can at least remove the useless switches inside the code, since the broader discussion about enablePositionIncrements overall isn't trivial.

TextField's parseFieldQuery method not using analyzer's enablePosIncr parameter --- Key: SOLR-3350 URL: https://issues.apache.org/jira/browse/SOLR-3350 Project: Solr Issue Type: Bug Components: Schema and Analysis Affects Versions: 3.5, 4.0 Reporter: Tommaso Teofili Priority: Minor

The parseFieldQuery method of the TextField class just sets

{code}
...
boolean enablePositionIncrements = true;
...
{code}

while that should be taken from the Analyzer's configuration. The above condition is evaluated afterwards in two points:

{code}
...
if (enablePositionIncrements) {
  mpq.add((Term[]) multiTerms.toArray(new Term[0]), position);
} else {
  mpq.add((Term[]) multiTerms.toArray(new Term[0]));
}
return mpq;
...
...
if (enablePositionIncrements) {
  position += positionIncrement;
  pq.add(new Term(field, term), position);
} else {
  pq.add(new Term(field, term));
}
...
{code}
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: BloomFilterPostings40.patch

This is looking more promising. Running ant test-core -Dtests.postingsformat=TestBloomFilteredLucene40Postings now passes all tests but causes an OOM exception in 3 tests:
* TestConsistentFieldNumbers.testManyFields
* TestIndexableField.testArbitraryFields
* TestIndexWriter.testManyFields

Any pointers on how to annotate or otherwise avoid the BloomFilter class for many-field tests would be welcome. These are not realistic tests for this class (we don't expect indexes with hundreds of primary-key-like fields). In this patch I've:
* added an SPI lookup mechanism for pluggable hash algorithms
* documented the file format
* fixed issues with TermVector tests
* changed the API

To use: BloomFilteringPostingsFormat now takes a delegate PostingsFormat and a set of field names that are to have Bloom filters created. Fields that are not listed in the filter set can safely be indexed as normal, and doing so is beneficial because it allows filtered and non-filtered field data to co-exist in the same physical files created by the delegate PostingsFormat.

Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: BloomFilterCodec40.patch, BloomFilterPostings40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip

An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields, e.g. primary keys on big indexes with many segments, but it also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on the 3.6 codebase attached. There are no 3.6 API changes currently; to play, just add a field with _blm on the end of the name to invoke the special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to the APIs to configure the service properly. Also attached: a patch for the Lucene 4.0 codebase introducing a new PostingsFormat.
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: BloomFilterCodec40.patch)
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: BloomFilterPostings40.patch)
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: BloomFilterPostings40.patch Added missing class.
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286555#comment-13286555 ] Robert Muir commented on LUCENE-4069: - I don't think the abstract class should be registered in the SPI. Instead I think the concrete Bloom+Lucene40 that you have in tests should be moved into src/java and registered there; just call it Bloom40 or something. The abstract API is still available for someone who wants to do something more specialized. This is just like how Pulsing (another wrapper) is implemented. As for disabling this for certain tests, import o.a.l.util.LuceneTestCase.SuppressCodecs and put something like this at class level:

{code}
@SuppressCodecs("Bloom40")
public class TestFoo...

@SuppressCodecs({"Bloom40", "Memory"})
public class TestBar...
{code}

The strings in here can be codecs or postings formats.
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286562#comment-13286562 ] Robert Muir commented on LUCENE-4069: - Seeing the tests in question though, I don't think you want to disable this for those entire test classes. We don't have a way to disable this on a per-method basis, and I think it's generally not possible because many classes create indexes in @BeforeClass etc. An alternative would be to just pick this less often in RandomCodec: see the SimpleText hack :)
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286598#comment-13286598 ] Mark Harwood commented on LUCENE-4069: -- bq. Instead i think the concrete Bloom+Lucene40 that you have in tests should be moved into src/java and registered there

What problem would that be trying to solve? Registration (or creation) of any BloomFilteringPostingsFormat subclasses is not necessary to decode index contents. Offering a Bloom40 would only buy users a pairing of Lucene40Postings and Bloom filtering, but they would still have to declare which fields they want Bloom filtering on at write time. This isn't too hard using the code in the existing patch:

{code:title=ThisWorks.java}
final Set<String> bloomFilteredFields = new HashSet<String>();
bloomFilteredFields.add(PRIMARY_KEY_FIELD_NAME);
iwc.setCodec(new Lucene40Codec() {
  BloomFilteringPostingsFormat postingOptions =
      new BloomFilteringPostingsFormat(new Lucene40PostingsFormat(), bloomFilteredFields);

  @Override
  public PostingsFormat getPostingsFormatForField(String field) {
    return postingOptions;
  }
});
{code}

No extra subclasses/registration required here to read the index built with the above setup.
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286599#comment-13286599 ] Robert Muir commented on LUCENE-4069: - I dont understand why this handles fields. Someone should just pick that with perfieldpostingsformat. So you have the abstract wrapper(takes the wrapped postings format, and a String name), not registered. And you have a concrete impl registered that is just abstractWrapper(lucene40, Bloom40): done. Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: BloomFilterPostings40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields e.g. primary keys on big indexes with many segments but also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no 3.6 API changes currently - to play just add a field with _blm on the end of the name to invoke special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to APIs to configure the service properly. Also, a patch for Lucene4.0 codebase introducing a new PostingsFormat -- This message is automatically generated by JIRA. 
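Robert's suggestion above can be sketched in plain Java. This is purely illustrative: the class and method names below are invented for the sketch, and Lucene's real PostingsFormat API differs.

```java
// Sketch of "abstract wrapper (takes the wrapped postings format and a String
// name), not registered; concrete impl registered that is just
// abstractWrapper(lucene40, Bloom40)". All names here are hypothetical.
abstract class PostingsFormatSketch {
    final String name;
    PostingsFormatSketch(String name) { this.name = name; }
    abstract String writeTerm(String term);  // stand-in for the real consumer API
}

abstract class AbstractBloomWrapper extends PostingsFormatSketch {
    final PostingsFormatSketch delegate;
    AbstractBloomWrapper(PostingsFormatSketch delegate, String name) {
        super(name);
        this.delegate = delegate;
    }
    @Override
    String writeTerm(String term) {
        // record the term in a Bloom filter, then pass it straight through
        return "bloom(" + delegate.writeTerm(term) + ")";
    }
}

class Lucene40Sketch extends PostingsFormatSketch {
    Lucene40Sketch() { super("Lucene40"); }
    @Override
    String writeTerm(String term) { return "lucene40:" + term; }
}

// The only registered concrete impl: the wrapper bound to the default format.
class Bloom40 extends AbstractBloomWrapper {
    Bloom40() { super(new Lucene40Sketch(), "Bloom40"); }
}

public class WrapperDemo {
    public static void main(String[] args) {
        PostingsFormatSketch pf = new Bloom40();
        System.out.println(pf.name + " -> " + pf.writeTerm("id:42"));
        // prints: Bloom40 -> bloom(lucene40:id:42)
    }
}
```

The point of the shape is that only `Bloom40` is registered by name; the abstract wrapper itself never appears in an index's metadata.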
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286600#comment-13286600 ] Mark Harwood commented on LUCENE-4069: -- bq. An alternative would be to just pick this less often in RandomCodec: see the SimpleText hack Another option might be to make TestBloomFilteredLucene40Postings pick a ludicrously small bitset sizing option for each field, so that we can accommodate tests that create silly numbers of fields. Being so small, the bitsets will quickly reach saturation and force all reads to hit the underlying FieldsProducer.
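The saturation behaviour Mark describes can be shown with a toy Bloom filter. This is a from-scratch sketch, not Lucene's implementation: once every bit in a tiny bitset is set, no lookup can ever fast-fail, so every read falls through to the underlying producer.

```java
import java.util.BitSet;

// Minimal single-hash Bloom filter sketch (hypothetical, not Lucene's code).
class TinyBloom {
    private final BitSet bits;
    private final int size;
    TinyBloom(int size) { this.size = size; this.bits = new BitSet(size); }
    void add(String term) { bits.set(Math.floorMod(term.hashCode(), size)); }
    // false = definitely absent (fast-fail); true = maybe present, so the
    // caller must consult the underlying FieldsProducer anyway
    boolean mightContain(String term) {
        return bits.get(Math.floorMod(term.hashCode(), size));
    }
    boolean saturated() { return bits.cardinality() == size; }
}

public class SaturationDemo {
    public static void main(String[] args) {
        TinyBloom bloom = new TinyBloom(8);  // ludicrously small, as in the test idea
        for (int i = 0; i < 100; i++) bloom.add("field" + i);
        // once saturated, every lookup degrades to a pass-through
        System.out.println("saturated=" + bloom.saturated());
        System.out.println("mightContain(neverAdded)=" + bloom.mightContain("neverAdded"));
        // prints: saturated=true and mightContain(neverAdded)=true
    }
}
```

A saturated filter is harmless for correctness (Bloom filters never produce false negatives), which is why this trick works for tests that create huge numbers of fields.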
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #344
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/344/ -- [...truncated 10471 lines...] [junit4] 2 18390 T2929 oasc.RequestHandlers.initHandlersFromConfig created mock: org.apache.solr.core.MockQuerySenderListenerReqHandler [junit4] 2 18391 T2929 oasc.RequestHandlers.initHandlersFromConfig created /admin/: org.apache.solr.handler.admin.AdminHandlers [junit4] 2 18391 T2929 oasc.RequestHandlers.initHandlersFromConfig created defaults: solr.StandardRequestHandler [junit4] 2 18391 T2929 oasc.RequestHandlers.initHandlersFromConfig adding lazy requestHandler: solr.StandardRequestHandler [junit4] 2 18391 T2929 oasc.RequestHandlers.initHandlersFromConfig created lazy: solr.StandardRequestHandler [junit4] 2 18392 T2929 oasc.RequestHandlers.initHandlersFromConfig created /update: solr.UpdateRequestHandler [junit4] 2 18392 T2929 oasc.RequestHandlers.initHandlersFromConfig created /terms: org.apache.solr.handler.component.SearchHandler [junit4] 2 18392 T2929 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH: org.apache.solr.handler.component.SearchHandler [junit4] 2 18393 T2929 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler [junit4] 2 18393 T2929 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler [junit4] 2 18393 T2929 oasc.RequestHandlers.initHandlersFromConfig created tvrh: org.apache.solr.handler.component.SearchHandler [junit4] 2 18393 T2929 oasc.RequestHandlers.initHandlersFromConfig created /mlt: solr.MoreLikeThisHandler [junit4] 2 18394 T2929 oasc.RequestHandlers.initHandlersFromConfig created /debug/dump: solr.DumpRequestHandler [junit4] 2 18395 T2929 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60 [junit4] 2 18397 T2929 oasc.SolrCore.initDeprecatedSupport WARNING solrconfig.xml uses deprecated admin/gettableFiles, Please update your config to use the ShowFileRequestHandler. 
[junit4] 2 18398 T2929 oasc.SolrCore.initDeprecatedSupport WARNING adding ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, SCHEMA-PHRASESUGGEST.XML, STEMDICT.TXT, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, OLD_SYNONYMS.TXT, SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, BAD-SCHEMA-EXTERNAL-FILEFIELD.XML] [junit4] 2 18401 T2929 oass.SolrIndexSearcher.init Opening Searcher@48370187 main [junit4] 2 18401 T2929 oass.SolrIndexSearcher.init WARNING WARNING: Directory impl does not support
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2719 - Failure
Is this related to the recent packed ints changes? This test historically required quite a lot of ram, maybe that sent it over the edge? On Thu, May 31, 2012 at 7:17 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2719/ 1 tests failed. REGRESSION: org.apache.lucene.util.packed.TestPackedInts.testIntOverflow Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([26B73FF6A7A21CED:83602BDCBFF8E4B8]:0) at org.apache.lucene.util.packed.Packed64SingleBlock.init(Packed64SingleBlock.java:115) at org.apache.lucene.util.packed.Packed64SingleBlock$Packed64SingleBlock5.init(Packed64SingleBlock.java:279) at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:68) at org.apache.lucene.util.packed.TestPackedInts.testIntOverflow(TestPackedInts.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) Build Log (for compile errors): [...truncated 1559 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
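Some back-of-envelope arithmetic supports Robert's RAM hypothesis. The stack trace shows Packed64SingleBlock5, which (assuming 5 bits per value, with whole values packed into 64-bit blocks) needs well over a gigabyte for a value count near Integer.MAX_VALUE:

```java
// Rough heap estimate for the failing testIntOverflow allocation. The 5-bit
// and 12-values-per-long figures are inferred from the Packed64SingleBlock5
// name in the stack trace, not taken from the source.
public class HeapMath {
    public static void main(String[] args) {
        long valueCount = Integer.MAX_VALUE;     // ~2^31 values, per the test's name
        int bitsPerValue = 5;                    // Packed64SingleBlock5
        int valuesPerLong = 64 / bitsPerValue;   // 12 whole values per 64-bit block
        long longsNeeded = (valueCount + valuesPerLong - 1) / valuesPerLong;
        long bytes = longsNeeded * 8;
        System.out.println(bytes / (1024 * 1024) + " MiB");
        // prints: 1365 MiB
    }
}
```

A single ~1.3 GiB array would plausibly exceed the Jenkins JVM's heap, so the recent packed-ints changes may simply have pushed an already memory-hungry test over the limit.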
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286616#comment-13286616 ] Mark Harwood commented on LUCENE-4069: -- bq. I dont understand why this handles fields. Someone should just pick that with perfieldpostingsformat. That would be inefficient, because your PFPF will see BloomFilteringPostingsFormat(field1 + Lucene40) and BloomFilteringPostingsFormat(field2 + Lucene40) as fundamentally different PostingsFormat instances and consequently create multiple, differently named files, because it assumes these instances may be capable of using radically different file structures. In reality, the choice of BloomFilter with field 1, BloomFilter with field 2, or indeed no BloomFilter at all does not fundamentally alter the underlying delegate PostingsFormat's file format - it only adds a supplementary .blm file on the side with the field summaries. For this reason it is a mistake to configure separate BloomFilterPostingsFormat instances on a per-field basis if they can share a common delegate.
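Mark's configuration argument can be illustrated with a toy model of file creation. This is an invented sketch (class names and file names are hypothetical): one format instance shared across fields writes a single delegate file plus one supplementary .blm, rather than a fresh file set per field.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: a single Bloom-wrapping format armed with the subset of
// field names it should filter, sharing one delegate across all fields.
class SubsetBloomFormat {
    private final Set<String> bloomFields;
    private final Set<String> filesWritten = new LinkedHashSet<>();

    SubsetBloomFormat(Set<String> bloomFields) { this.bloomFields = bloomFields; }

    void writeField(String field) {
        filesWritten.add("segment.pos");      // the shared delegate's file
        if (bloomFields.contains(field)) {
            filesWritten.add("segment.blm");  // one supplementary file on the side
        }
    }

    Set<String> filesWritten() { return filesWritten; }
}

public class SharedDelegateDemo {
    public static void main(String[] args) {
        // Bloom-filter only the primary key; all three fields share the delegate.
        SubsetBloomFormat fmt = new SubsetBloomFormat(Set.of("id"));
        for (String field : new String[] {"id", "title", "body"}) {
            fmt.writeField(field);
        }
        System.out.println(fmt.filesWritten());
        // prints: [segment.pos, segment.blm]
    }
}
```

With one instance per field instead, each instance would believe it owns a distinct format and write its own file set, which is the inefficiency Mark objects to.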
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286619#comment-13286619 ] Robert Muir commented on LUCENE-4069: - {quote} That would be inefficient because your PFPF will see BloomFilteringPostingsFormat(field1 + Lucene40) and BloomFilteringPostingsFormat(field2 + Lucene40) as fundamentally different PostingsFormat instances and consequently create multiple files named differently because it assumes these instances may be capable of using radically different file structures. {quote} But adding per-field handling here is not the way to solve this: it's messy. Per-field handling should all be handled at a level above, in PerFieldPostingsFormat. To solve what you speak of we just need to resolve LUCENE-4093. Then multiple postings format instances that are 'the same' will be deduplicated correctly.
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #345
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/345/ -- [...truncated 13489 lines...] [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.ResourceLoaderTest [junit4] Completed in 0.02s, 4 tests [junit4] [junit4] Suite: org.apache.solr.internal.csv.ExtendedBufferedReaderTest [junit4] Completed in 0.02s, 8 tests [junit4] [junit4] Suite: org.apache.solr.search.ReturnFieldsTest [junit4] Completed in 0.93s, 10 tests [junit4] [junit4] Suite: org.apache.solr.update.DirectUpdateHandlerOptimizeTest [junit4] Completed in 0.86s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestLFUCache [junit4] Completed in 0.84s, 5 tests [junit4] [junit4] Suite: org.apache.solr.handler.admin.MBeansHandlerTest [junit4] Completed in 0.91s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.DirectUpdateHandlerTest [junit4] Completed in 3.21s, 6 tests [junit4] [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTest [junit4] Completed in 1.37s, 4 tests [junit4] [junit4] Suite: org.apache.solr.cloud.FullSolrCloudDistribCmdsTest [junit4] Completed in 39.24s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.OverseerTest [junit4] Completed in 46.76s, 7 tests [junit4] [junit4] Suite: org.apache.solr.analysis.TestPhoneticFilterFactory [junit4] Completed in 9.59s, 5 tests [junit4] [junit4] Suite: org.apache.solr.TestDistributedGrouping [junit4] Completed in 20.37s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.TestHashPartitioner [junit4] Completed in 4.11s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.TestMultiCoreConfBootstrap [junit4] Completed in 3.85s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.PeerSyncTest [junit4] Completed in 3.87s, 1 test [junit4] [junit4] Suite: org.apache.solr.ConvertedLegacyTest [junit4] Completed in 3.03s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestFiltering [junit4] Completed in 3.25s, 2 tests [junit4] [junit4] Suite: 
org.apache.solr.core.SolrCoreTest [junit4] Completed in 5.75s, 5 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.StatsComponentTest [junit4] Completed in 5.73s, 6 tests [junit4] [junit4] Suite: org.apache.solr.SolrInfoMBeanTest [junit4] Completed in 1.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest [junit4] Completed in 2.18s, 1 test [junit4] [junit4] Suite: org.apache.solr.request.TestWriterPerf [junit4] Completed in 1.23s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields [junit4] Completed in 1.59s, 13 tests [junit4] [junit4] Suite: org.apache.solr.handler.admin.ShowFileRequestHandlerTest [junit4] Completed in 1.35s, 2 tests [junit4] [junit4] Suite: org.apache.solr.search.TestSurroundQueryParser [junit4] Completed in 1.03s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest [junit4] Completed in 2.31s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.admin.CoreAdminHandlerTest [junit4] Completed in 2.35s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.DocumentBuilderTest [junit4] Completed in 1.28s, 11 tests [junit4] [junit4] Suite: org.apache.solr.search.function.distance.DistanceFunctionTest [junit4] Completed in 1.36s, 3 tests [junit4] [junit4] Suite: org.apache.solr.search.SpatialFilterTest [junit4] Completed in 2.09s, 3 tests [junit4] [junit4] Suite: org.apache.solr.handler.DocumentAnalysisRequestHandlerTest [junit4] Completed in 1.17s, 4 tests [junit4] [junit4] Suite: org.apache.solr.search.TestFoldingMultitermQuery [junit4] Completed in 1.53s, 18 tests [junit4] [junit4] Suite: org.apache.solr.schema.CurrencyFieldTest [junit4] IGNORED 0.00s | CurrencyFieldTest.testPerformance [junit4] Cause: Annotated @Ignore() [junit4] Completed in 1.49s, 8 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.core.RequestHandlersTest [junit4] Completed in 1.14s, 3 tests [junit4] [junit4] Suite: 
org.apache.solr.handler.component.DebugComponentTest [junit4] Completed in 1.22s, 2 tests [junit4] [junit4] Suite: org.apache.solr.spelling.FileBasedSpellCheckerTest [junit4] Completed in 1.29s, 3 tests [junit4] [junit4] Suite: org.apache.solr.schema.PrimitiveFieldTypeTest [junit4] Completed in 1.54s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestValueSourceCache [junit4] Completed in 1.09s, 2 tests [junit4] [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest [junit4] Completed in 1.24s, 3 tests [junit4] [junit4] Suite:
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286707#comment-13286707 ] Mark Harwood commented on LUCENE-4069: -- bq. To solve what you speak of we just need to resolve LUCENE-4093. Presumably the main objective here is that, in order to cut down on the number of files we store, content consumers of various types should aim to consolidate multiple fields' contents into a single file (if they share common config choices). bq. Then multiple postings format instances that are 'the same' will be deduplicated correctly. The complication in this case is that we essentially have 2 consumers (Bloom and Lucene40), one wrapped in the other, with different but overlapping choices of fields, e.g. we want a single Lucene40 to process all fields but we want Bloom to handle only a subset of those fields. This will be a tough one for PFPF to untangle while we are stuck with a delegating model for composing consumers. This may be made easier if, instead of delegating a single stream, we have a *stream-splitting* capability via a multicast subscription, e.g. the Bloom filtering consumer registers interest in content streams for fields A and B while Lucene40 is consolidating content from fields A, B, C and D. A broadcast mechanism feeds each consumer a copy of the relevant stream, and each consumer is responsible for inventing its own file-naming convention that avoids muddling files. While that may help for writing streams, it doesn't solve the re-assembly of producer streams at read time, where BloomFilter absolutely has to position itself in front of the standard Lucene40 producer in order to offer fast-fail lookups. In the absence of a fancy optimised routing mechanism (this all may be overkill), my current solution was to put BloomFilter in the delegate chain, armed with a subset of field names to observe as a larger array of fields flows past to a common delegate. I added some Javadocs to describe the need to do it this way for an efficient configuration. You are right that this is messy (i.e. open to bad configuration), but operating this deep down in Lucene that's always a possibility regardless of what we put in place.
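The stream-splitting idea Mark floats above can be sketched as a simple publish/subscribe bus. This is purely conceptual (no such mechanism exists in Lucene's codec API, and all names below are invented): Bloom subscribes to fields A and B while Lucene40 subscribes to A, B, C and D, and each consumer receives its own copy of the relevant stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Conceptual multicast-subscription sketch; not Lucene's actual architecture.
interface FieldConsumer {
    Set<String> interestedFields();
    void accept(String field, String term);
}

class Broadcaster {
    private final List<FieldConsumer> consumers = new ArrayList<>();
    void register(FieldConsumer c) { consumers.add(c); }
    void publish(String field, String term) {
        // feed each consumer a copy of the streams it registered interest in
        for (FieldConsumer c : consumers) {
            if (c.interestedFields().contains(field)) c.accept(field, term);
        }
    }
}

public class MulticastDemo {
    public static void main(String[] args) {
        List<String> bloomSaw = new ArrayList<>();
        List<String> lucene40Saw = new ArrayList<>();
        Broadcaster bus = new Broadcaster();
        bus.register(new FieldConsumer() {  // Bloom: fields A and B only
            public Set<String> interestedFields() { return Set.of("A", "B"); }
            public void accept(String f, String t) { bloomSaw.add(f + ":" + t); }
        });
        bus.register(new FieldConsumer() {  // Lucene40: consolidates everything
            public Set<String> interestedFields() { return Set.of("A", "B", "C", "D"); }
            public void accept(String f, String t) { lucene40Saw.add(f + ":" + t); }
        });
        for (String f : new String[] {"A", "B", "C", "D"}) bus.publish(f, "term1");
        System.out.println("bloom=" + bloomSaw.size() + " lucene40=" + lucene40Saw.size());
        // prints: bloom=2 lucene40=4
    }
}
```

As Mark notes, a write-side broadcast like this still leaves the read-side problem open: the Bloom producer must sit in front of the delegate producer to deliver its fast-fail behaviour.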
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286712#comment-13286712 ] Robert Muir commented on LUCENE-4069: - {quote} but overlapping choices of fields e.g we want a single Lucene40 to process all fields but we want Bloom to handle only a subset of these fields. {quote} That's not true: I disagree. It's an implementation detail that Bloom as a postings format wraps another one (that's just the abstract implementation), and the file formats should not expose this in general for any format. This is true for a number of reasons: e.g. in the pulsing case the wrapped writer only gets a subset of the postings, therefore the wrapped writer's files are incomplete and an implementation detail. It's enough here that if you have 5 fields, 2 bloom and 3 not, we detect that there are only two postings formats in use, regardless of whether you have 2 or 5 actual object instances.
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286717#comment-13286717 ] Robert Muir commented on LUCENE-4069: - And separately, you can always contain the number of files even today by: * using only unique instances yourself when writing (rather than waiting on LUCENE-4093) * using the compound file format. The purpose of LUCENE-4093 is just to make this simpler, but I opened it as a separate issue because it's really solely an optimization, and only for a pretty rare case where people are customizing the index format for different fields.
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #346
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/346/changes
[jira] [Commented] (LUCENE-4096) impossible to CheckIndex if you use norms other than byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286739#comment-13286739 ] Michael McCandless commented on LUCENE-4096: +1 Not sure why I originally used TermQuery in CheckIndex... I think switching to DocsEnum is fine...

impossible to CheckIndex if you use norms other than byte[] --- Key: LUCENE-4096 URL: https://issues.apache.org/jira/browse/LUCENE-4096 Project: Lucene - Java Issue Type: Task Components: core/index Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-4096.patch I noticed TestCustomNorms had checkIndexOnClose disabled, but I think this is a real problem. If someone wants to use e.g. float[] norms, they should be able to run CheckIndex. CheckIndex is fine with validating any norm type; the problem is that it sometimes creates an IndexSearcher and fires off TermQueries for some calculations. This causes it to (wrongly) fail, because DefaultSimilarity expects single-byte norms. I don't think CheckIndex needs to use TermQuery here; we can do this differently so it doesn't use IndexSearcher or TermQuery but just the postings APIs.
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286744#comment-13286744 ] Steven Rowe commented on LUCENE-4092: - I plan on adding the following (as suggested by Robert) as alternations to the BUILD_LOG_REGEX for all non-Maven Jenkins jobs (some of these things don't run under the Maven jobs, and Maven's output is different enough that it'll require separate treatment): bq. the javadocs warnings task {noformat} (?:[^\r\n]*\[javadoc\].*\r?\n)*[^\r\n]*\[javadoc\]\s*[1-9]\d*\s+warnings.*\r?\n {noformat} bq. two javadocs checkers in javadocs-lint Output from javadocs-lint seems to show up only when there's a problem, so any output from it will always be extracted by the following regex: {noformat} [^\r\n]*javadocs-lint:.*\r?\n(?:[^\r\n]*\[echo\].*\r?\n)* {noformat} bq. and the rat-checker {noformat} [^\r\n]*rat-sources:\s+\[echo\].*(?:\r?\n[^\r\n]*\[echo\].*)*\s*[1-9]\d*\s+Unknown\s+Licenses.*\r?\n(?:[^\r\n]*\[echo\].*\r?\n)* {noformat} Along with two others: # Compilation failures: {noformat} (?:[^\r\n]*\[javac\].*\r?\n)*[^\r\n]*\[javac\]\s*[1-9]\d*\s*error.*\r?\n {noformat} # Jenkins FATAL errors: {noformat} [^\r\n]*FATAL:(?s:.*) {noformat} Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures). Key: LUCENE-4092 URL: https://issues.apache.org/jira/browse/LUCENE-4092 Project: Lucene - Java Issue Type: Sub-task Components: general/test Reporter: Dawid Weiss Priority: Trivial
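The javadoc-warnings alternation above can be exercised directly with java.util.regex. The following sketch runs it against a fabricated log fragment (the sample lines and class name are illustrative, not taken from a real Jenkins build):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: exercising the javadoc-warnings alternation from the comment
// above against a made-up log fragment. The leading (?:...)* pulls in
// the contiguous [javadoc] lines preceding the warnings-count line.
public class JavadocWarningsRegexDemo {
    static final Pattern JAVADOC_WARNINGS = Pattern.compile(
        "(?:[^\r\n]*\\[javadoc\\].*\r?\n)*[^\r\n]*\\[javadoc\\]\\s*[1-9]\\d*\\s+warnings.*\r?\n");

    public static void main(String[] args) {
        String log = "compile:\n"
            + "  [javadoc] Note: some note\n"
            + "  [javadoc] 3 warnings\n"
            + "BUILD SUCCESSFUL\n";
        Matcher m = JAVADOC_WARNINGS.matcher(log);
        if (!m.find()) throw new AssertionError("expected a match");
        // The extracted fragment ends with the warnings-count line.
        if (!m.group().contains("3 warnings")) throw new AssertionError(m.group());
        System.out.println("matched:\n" + m.group());
    }
}
```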
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286754#comment-13286754 ] Mark Harwood commented on LUCENE-4069: -- It's true to say that Bloom is a different case from Pulsing - Bloom does not interfere in any way with the normal recording of content in the wrapped delegate, whereas Pulsing does. It may prove useful for us to mark a formal distinction between these mutating/non-mutating types so we can treat them differently and provide optimisations? bq. And separately, you can always contain the number of files even today by using only unique instances yourself when writing Contained but not optimal - roughly double the number of required files if I want the common case of a primary key indexed with Bloom. I can't see a way of indexing with Bloom-plus-Lucene40 on field A and indexing with just Lucene40 on fields B, C and D and winding up with only one Lucene40 set of files with a common segment suffix. The way I did find of achieving this was to add a bloomFilteredFields set into my single Bloom+Lucene40 instance used for all fields. Is there any other option here currently? Looking to the future, 4093 may have more capabilities at optimising if it understands the distinction between mutating wrappers and non-mutating ones and how they are composed? Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: BloomFilterPostings40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields, e.g. primary keys on big indexes with many segments, but it also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no 3.6 API changes currently - to play, just add a field with _blm on the end of the name to invoke the special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to the APIs to configure the service properly. Also, a patch for the Lucene 4.0 codebase introducing a new PostingsFormat is attached.
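The fast-fail idea behind these patches can be sketched with a toy segment-level filter (a stdlib-only illustration under my own assumptions; the class, method names, and the cheap second hash are invented and are not the API of the attached BloomFilterPostings40 patch):

```java
import java.util.BitSet;

// Minimal sketch of a segment-level Bloom filter's fast-fail behavior.
// Every indexed term is hashed into a bit set; at query time a negative
// answer is definitive, so the segment can be skipped without touching
// the terms dictionary on disk, while a positive answer still requires
// a real lookup (false positives are possible).
public class SegmentBloomSketch {
    private final BitSet bits;
    private final int size;

    SegmentBloomSketch(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    private int[] hashes(String term) {
        int h1 = term.hashCode();
        int h2 = h1 >>> 16 | h1 << 16;   // cheap second hash (assumption)
        return new int[] { Math.floorMod(h1, size), Math.floorMod(h1 + h2, size) };
    }

    void add(String term) {
        for (int h : hashes(term)) bits.set(h);
    }

    /** false => the term is definitely absent from this segment. */
    boolean mightContain(String term) {
        for (int h : hashes(term)) if (!bits.get(h)) return false;
        return true;
    }

    public static void main(String[] args) {
        SegmentBloomSketch bloom = new SegmentBloomSketch(1 << 16);
        bloom.add("doc-42");
        if (!bloom.mightContain("doc-42")) throw new AssertionError();
        // A miss here lets a primary-key lookup skip this segment entirely.
        System.out.println("might contain no-such-key: " + bloom.mightContain("no-such-key"));
    }
}
```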
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286756#comment-13286756 ] Robert Muir commented on LUCENE-4069: - {quote} Contained but not optimal - roughly double the number of required files if I want the common case of a primary key indexed with Bloom. {quote} Then use CFS; it's always optimal (1). I really don't think we should make this complex to save 2 or 3 files total (even in a complex config with many fields). It's not worth the complexity.
Re: Documenting document limits for Lucene and Solr
Deleted documents use IDs, so you may run out of doc IDs with fewer than 2^31 searchable documents. I recommend designing with a lot of slack, maybe using only 75% of IDs. Solr might alert when 90% of the space is used. If you want to delete everything, then re-add everything without a commit, you will use 2X the doc IDs. That isn't even worst case. If you reduce or black-out merging, you can end up with serious doc ID consumption. With no merges, if you find lots of near-dupes and routinely replace documents with a better version, you can have many deleted documents for each searchable one. This can happen with web spidering. If you find five mirrors of a million-document site, and find the best one last, you can use five million doc IDs for those million docs. wunder On May 30, 2012, at 8:52 AM, Jack Krupansky wrote: AFAICT, there is no clear documentation of the maximum number of documents that can be stored in a Lucene or Solr Index (single core/shard). It appears to be 2^31 since a Lucene document number and the value returned from IW.maxDoc is a Java “int”. Lucene users have that “hint” to guide them, but that hint is never surfaced for Solr users, AFAICT. A few years ago nobody in their right mind would imagine indexing 2 billion documents in a single machine/core, but now people are at least tempted to try. So, it is now more important for people to know about it, up front, not hidden down in the fine print of Lucene file formats. I wanted to file a Jira on this, but I wanted to check first if anybody knows of an existing Jira for it that maybe was worded in a way that it escaped my semi-diligent searches. I was also thinking of filing it as two Jiras, one for Lucene and one for Solr since the doc would be in different places. Or, should there be one combined “Lucene/Solr Capacity Limits/Planning” wiki? Unless somebody objects, I’ll file as two separate (but linked) issues. 
And, I was also thinking of filing two Jiras for Lucene and Solr to each have a robust check for exceeding the underlying Lucene limit and reporting this exception in a well-defined manner rather than “numFound” or “maxDoc” going negative. But this is separate from the documentation issue, I think. Unless somebody objects, I’ll file these as two separate issues. Any objection to me filing these four issues? -- Jack Krupansky
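The headroom arithmetic in this advice can be made concrete (a sketch using the numbers from the thread; the 75% figure is the posted rule of thumb, not a Lucene constant, and the class name is invented):

```java
// Sketch of the doc-ID headroom arithmetic from the message above:
// maxDoc is a Java int, so the hard ceiling per core/shard is
// 2^31 - 1 doc IDs, and deleted (not-yet-merged) documents consume
// IDs too. The 75% design target is the rule of thumb from the thread.
public class DocIdHeadroom {
    public static void main(String[] args) {
        long hardLimit = Integer.MAX_VALUE;       // 2^31 - 1 = 2147483647
        long comfortable = hardLimit * 75 / 100;  // suggested design target
        // Deleting everything then re-adding without a commit doubles ID use:
        long reindexCost = 2L * comfortable;
        System.out.println("hard limit   = " + hardLimit);
        System.out.println("75% headroom = " + comfortable);
        System.out.println("2x reindex   = " + reindexCost
            + (reindexCost > hardLimit ? "  (would exceed the limit)" : ""));
        if (reindexCost <= hardLimit) throw new AssertionError();
    }
}
```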
[jira] [Created] (LUCENE-4098) Efficient bulk operations for packed integer arrays
Adrien Grand created LUCENE-4098: Summary: Efficient bulk operations for packed integer arrays Key: LUCENE-4098 URL: https://issues.apache.org/jira/browse/LUCENE-4098 Project: Lucene - Java Issue Type: Improvement Components: core/other Reporter: Adrien Grand Priority: Minor Fix For: 4.1 There are some places in Lucene code that {iterate over,set} ranges of values of a packed integer array. Because bit-packing implementations (Packed*) tend to be slower than direct implementations, this can take a lot of time. For example, under some scenarios, GrowableWriter can take most of its (averaged) {{set}} time in resizing operations. However, some bit-packing schemes, such as the one that is used by {{Packed64SingleBlock*}}, allow implementing efficient bulk operations such as get/set/fill. Implementing these bulk operations in {{PackedInts.{Reader,Mutable}}} and using them across other components instead of their single-value counterparts could help improve performance.
[jira] [Resolved] (LUCENE-4096) impossible to CheckIndex if you use norms other than byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4096. - Resolution: Fixed Fix Version/s: 5.0
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286771#comment-13286771 ] Steven Rowe commented on LUCENE-4092: - I'm going to add one more to the regex: {noformat} # Third-party dependency license/notice problems |[^\\r\\n]*validate:.*\\r?\\n[^\\r\\n]*\\[echo\\].*\\r?\\n(?:[^\\r\\n]*\\[licenses\\].*\\r?\\n)*[^\\r\\n]*\\[licenses\\].*[1-9]\\d*\\s+error.*\\r?\\n {noformat}
[jira] [Commented] (LUCENE-4077) ToParentBlockJoinCollector provides no way to access computed scores and the maxScore
[ https://issues.apache.org/jira/browse/LUCENE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286770#comment-13286770 ] Michael McCandless commented on LUCENE-4077: Super, thanks Christoph, I'll commit shortly... ToParentBlockJoinCollector provides no way to access computed scores and the maxScore - Key: LUCENE-4077 URL: https://issues.apache.org/jira/browse/LUCENE-4077 Project: Lucene - Java Issue Type: Bug Components: modules/join Affects Versions: 3.4, 3.5, 3.6 Reporter: Christoph Kaser Assignee: Michael McCandless Attachments: LUCENE-4077.patch, LUCENE-4077.patch, LUCENE-4077.patch, LUCENE-4077.patch The constructor of ToParentBlockJoinCollector allows to turn on the tracking of parent scores and the maximum parent score, however there is no way to access those scores because: * maxScore is a private field, and there is no getter * TopGroups / GroupDocs does not provide access to the scores for the parent documents, only the children
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286774#comment-13286774 ] Steven Rowe commented on LUCENE-4092: - bq. I'm going to add one more to the regex Done - added to the configuration on all non-Maven Jenkins jobs
[jira] [Updated] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikola Tankovic updated LUCENE-3312: Attachment: lucene-3312-patch-04.patch Patch 04: Status: core compiles. This is an attempt to separate IndexableFields and StorableFields in indexing. I introduced oal.index.Document, which holds both types of fields. I also introduced a StorableFieldType interface and a StoredFieldType class. Let me know what you think! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch In the field type branch we have strongly decoupled the Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up to use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is a possible perf hit for fields that are both indexed and stored (ie, we visit them twice, look up their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in the API
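The proposed split can be sketched as two narrow views over one user-space field class, with a Document exposing separate iterables for indexing and storage (all names here are illustrative assumptions, not the branch's actual API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the IndexableField/StorableField separation:
// the indexer consumes one narrow view, stored-field writing consumes
// another, and a user-space Document exposes both as separate lists
// (either may be empty). A field that is both indexed and stored shows
// up in both lists -- the perf downside mentioned in the issue.
public class FieldSeparationSketch {
    interface IndexableFieldView { String name(); String tokenValue(); }
    interface StorableFieldView  { String name(); String storedValue(); }

    static class Field implements IndexableFieldView, StorableFieldView {
        final String name, value;
        final boolean indexed, stored;
        Field(String name, String value, boolean indexed, boolean stored) {
            this.name = name; this.value = value;
            this.indexed = indexed; this.stored = stored;
        }
        public String name() { return name; }
        public String tokenValue() { return value; }
        public String storedValue() { return value; }
    }

    static class Document {
        final List<Field> fields = new ArrayList<>();
        void add(Field f) { fields.add(f); }
        List<IndexableFieldView> indexableFields() {
            List<IndexableFieldView> out = new ArrayList<>();
            for (Field f : fields) if (f.indexed) out.add(f);
            return out;
        }
        List<StorableFieldView> storableFields() {
            List<StorableFieldView> out = new ArrayList<>();
            for (Field f : fields) if (f.stored) out.add(f);
            return out;
        }
    }

    public static void main(String[] args) {
        Document doc = new Document();
        doc.add(new Field("title", "lucene in action", true, true));
        doc.add(new Field("body",  "long body text",   true, false));
        if (doc.indexableFields().size() != 2) throw new AssertionError();
        if (doc.storableFields().size() != 1) throw new AssertionError();
        System.out.println("indexable=" + doc.indexableFields().size()
            + " storable=" + doc.storableFields().size());
    }
}
```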
[jira] [Updated] (LUCENE-4098) Efficient bulk operations for packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4098: - Attachment: LUCENE-4098.patch Here is the patch for the proposed modifications. All {{Mutable}} implementations have a new efficient {{fill}} method, and Packed64SingleBlock* classes also have efficient bulk get and set. For example, the execution time of the following (unrealistic) microbenchmark is more than twice better with the patch applied on my computer, thanks to the use of {{PackedInts.copy}} instead of a naive copy (see {{GrowableWriter#ensureCapacity}}). {code}
for (int k = 0; k < 50; ++k) {
  long start = System.nanoTime();
  GrowableWriter wrt = new GrowableWriter(1, 1 << 22, PackedInts.DEFAULT);
  for (int i = 0; i < 1 << 22; ++i) {
    wrt.set(i, i);
  }
  long end = System.nanoTime();
  System.out.println((end - start) / 100);
  long sum = 0;
  for (int i = 0; i < wrt.size(); ++i) {
    sum += wrt.get(i);
  }
  System.out.println(sum);
}
{code}
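The reason a bulk fill can beat per-element set can be shown with a toy fixed-width packed array (a sketch of the concept only; this is not the PackedInts/Packed64SingleBlock implementation, and the class name is invented):

```java
import java.util.Arrays;

// Toy illustration of the bulk-operation idea: values packed at a fixed
// bit width into longs, with a fill() that writes whole 64-bit words via
// Arrays.fill instead of doing a read-modify-write per element.
public class PackedBulkSketch {
    final long[] blocks;
    final int bitsPerValue;
    final int valuesPerBlock;

    PackedBulkSketch(int valueCount, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.valuesPerBlock = 64 / bitsPerValue;  // unused high bits are wasted
        this.blocks = new long[(valueCount + valuesPerBlock - 1) / valuesPerBlock];
    }

    void set(int index, long value) {
        int b = index / valuesPerBlock, shift = (index % valuesPerBlock) * bitsPerValue;
        long mask = ((1L << bitsPerValue) - 1) << shift;
        blocks[b] = (blocks[b] & ~mask) | (value << shift);
    }

    long get(int index) {
        int b = index / valuesPerBlock, shift = (index % valuesPerBlock) * bitsPerValue;
        return (blocks[b] >>> shift) & ((1L << bitsPerValue) - 1);
    }

    /** Bulk fill: build one block's bit pattern once, then blast it. */
    void fillAll(long value) {
        long block = 0;
        for (int i = 0; i < valuesPerBlock; ++i) block |= value << (i * bitsPerValue);
        Arrays.fill(blocks, block);   // one write per 64 bits, not per value
    }

    public static void main(String[] args) {
        PackedBulkSketch p = new PackedBulkSketch(100, 3);
        p.fillAll(5);
        for (int i = 0; i < 100; ++i)
            if (p.get(i) != 5) throw new AssertionError("at " + i);
        p.set(7, 2);  // single-element updates still work after a bulk fill
        if (p.get(7) != 2 || p.get(6) != 5 || p.get(8) != 5) throw new AssertionError();
        System.out.println("ok");
    }
}
```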
Re: Documenting document limits for Lucene and Solr
Thanks. That’s all good info to be documented for users to be aware of when they start pushing the limits. -- Jack Krupansky From: Walter Underwood Sent: Thursday, May 31, 2012 1:30 PM To: dev@lucene.apache.org Subject: Re: Documenting document limits for Lucene and Solr
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286794#comment-13286794 ] Robert Muir commented on LUCENE-4092: - awesome! thank you!
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #348
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/348/changes Changes: [rmuir] LUCENE-4096: impossible to checkindex if you use norms other than byte[] -- [...truncated 10535 lines...] [junit4] 2 79515 T3115 oasc.RequestHandlers.initHandlersFromConfig created /update: solr.UpdateRequestHandler [junit4] 2 79515 T3115 oasc.RequestHandlers.initHandlersFromConfig created /terms: org.apache.solr.handler.component.SearchHandler [junit4] 2 79515 T3115 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH: org.apache.solr.handler.component.SearchHandler [junit4] 2 79515 T3115 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler [junit4] 2 79516 T3115 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler [junit4] 2 79516 T3115 oasc.RequestHandlers.initHandlersFromConfig created tvrh: org.apache.solr.handler.component.SearchHandler [junit4] 2 79516 T3115 oasc.RequestHandlers.initHandlersFromConfig created /mlt: solr.MoreLikeThisHandler [junit4] 2 79517 T3115 oasc.RequestHandlers.initHandlersFromConfig created /debug/dump: solr.DumpRequestHandler [junit4] 2 79517 T3115 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60 [junit4] 2 79520 T3115 oasc.SolrCore.initDeprecatedSupport WARNING solrconfig.xml uses deprecated admin/gettableFiles, Please update your config to use the ShowFileRequestHandler. 
[junit4] 2 79522 T3115 oasc.SolrCore.initDeprecatedSupport WARNING adding ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, SCHEMA-PHRASESUGGEST.XML, STEMDICT.TXT, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, OLD_SYNONYMS.TXT, SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, BAD-SCHEMA-EXTERNAL-FILEFIELD.XML] [junit4] 2 79525 T3115 oass.SolrIndexSearcher.init Opening Searcher@3de7e517 main [junit4] 2 79525 T3115 oass.SolrIndexSearcher.init WARNING WARNING: Directory impl does not support setting indexDir: org.apache.lucene.store.MockDirectoryWrapper [junit4] 2 79525 T3115 oasu.CommitTracker.init Hard AutoCommit: disabled [junit4] 2 79526 T3115 oasu.CommitTracker.init Soft AutoCommit: disabled [junit4] 2 79526 T3115 oashc.SpellCheckComponent.inform Initializing spell checkers [junit4] 2 79534 T3115 oass.DirectSolrSpellChecker.init init: {name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3} [junit4] 2 79574 T3115 oashc.HttpShardHandlerFactory.getParameter Setting
[jira] [Resolved] (LUCENE-4077) ToParentBlockJoinCollector provides no way to access computed scores and the maxScore
[ https://issues.apache.org/jira/browse/LUCENE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4077. Resolution: Fixed Fix Version/s: 5.0 4.0
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286815#comment-13286815 ] Mark Harwood commented on LUCENE-4069: -- bq. Its not worth the complexity There's no real added complexity in BloomFilterPostingsFormat - it has to be capable of storing blooms for 1 field anyway and using the fieldname set is roughly 2 extra lines of code to see if a TermsConsumer needs wrapping or not. From a client side you don't have to use this feature - the fieldname set can be null in which case it will wrap all fields sent its way. If you do chose to supply a set the wrapped PostingsFormat will have the advantage of being shared for bloomed and non-bloomed fields. We could add a constructor that removes the set and mark the others expert. For me this falls into one of the many faster-if-you-know-about-it optimisations like FieldSelectors or recycling certain objects. Basically a useful hint to Lucene to save some extra effort but one which you dont *need* to use. Lucene-4093 may in future resolve the multi-file issue but I'm not sure it will do so without significant complication. Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: BloomFilterPostings40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields e.g. primary keys on big indexes with many segments but also speeds up general searching in my tests. 
Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no 3.6 API changes currently - to play just add a field with _blm on the end of the name to invoke special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to APIs to configure the service properly. Also, a patch for Lucene4.0 codebase introducing a new PostingsFormat -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
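The fast-fail idea behind the patch can be sketched as follows. This is an illustrative toy, not the attached patch's code: all class and method names here are hypothetical, and the hash scheme is deliberately simplistic. The one property it shares with a real Bloom filter is the one that matters for term lookups: no false negatives, so a "definitely absent" answer can skip the terms dictionary (and its disk access) entirely.

```java
import java.util.BitSet;

// Toy Bloom filter sketch (hypothetical names, not the patch's code).
// A per-segment filter answers "definitely absent" cheaply; only a
// "maybe present" answer falls through to the real terms dictionary.
public class BloomSketch {
    private static final int NUM_BITS = 1 << 16;
    private final BitSet bits = new BitSet(NUM_BITS);

    // Two cheap seeded hashes; real implementations use better hash families.
    private static int hash(String term, int seed) {
        int h = seed;
        for (int i = 0; i < term.length(); i++) {
            h = h * 31 + term.charAt(i);
        }
        return Math.floorMod(h, NUM_BITS);
    }

    public void add(String term) {
        bits.set(hash(term, 17));
        bits.set(hash(term, 101));
    }

    /** False means the term is definitely not in the segment (no false negatives). */
    public boolean mightContain(String term) {
        return bits.get(hash(term, 17)) && bits.get(hash(term, 101));
    }

    public static void main(String[] args) {
        BloomSketch bloom = new BloomSketch();
        bloom.add("doc-00042");
        bloom.add("doc-00043");
        System.out.println(bloom.mightContain("doc-00042")); // true: must consult the terms dict
        System.out.println(bloom.mightContain("no-such-key")); // usually false: fast-fail, no disk access
    }
}
```

This is why the approach pays off most for primary-key-style fields on many-segment indexes: most segments do not contain the key, and each of those lookups short-circuits before touching the terms dictionary.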
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2723/ 1 tests failed. REGRESSION: org.apache.lucene.util.packed.TestPackedInts.testIntOverflow Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([44E7903FBFDCF43:A1996D29E3A73716]:0) at org.apache.lucene.util.packed.Packed64SingleBlock.init(Packed64SingleBlock.java:115) at org.apache.lucene.util.packed.Packed64SingleBlock$Packed64SingleBlock3.init(Packed64SingleBlock.java:315) at org.apache.lucene.util.packed.Packed64SingleBlock.create(Packed64SingleBlock.java:64) at org.apache.lucene.util.packed.TestPackedInts.testIntOverflow(TestPackedInts.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) Build Log: ${BUILD_LOG_REGEX,regex=(?x: # Compilation failures (?:[^\\r\\n]*\\[javac\\].*\\r?\\n)*[^\\r\\n]*\\[javac\\]\\s*[1-9]\\d*\\s*error.*\\r?\\n # Test failures |[^\\r\\n]*\\[junit4\\]\\s*Suite:.*[\\r\\n]+[^\\r\\n]*\\[junit4\\]\\s*(?!Completed)(?!IGNOR)\\S(?s:.*?)\\s*FAILURES! 
# License problems |[^\\r\\n]*rat-sources:\\s+\\[echo\\].*(?:\\r?\\n[^\\r\\n]*\\[echo\\].*)*\\s*[1-9]\\d*\\s+Unknown\\s+Licenses.*\\r?\\n(?:[^\\r\\n]*\\[echo\\].*\\r?\\n)* # Javadocs warnings |(?:[^\\r\\n]*\\[javadoc\\].*\\r?\\n)*[^\\r\\n]*\\[javadoc\\]\\s*[1-9]\\d*\\s+warnings.*\\r?\\n # Other javadocs problems (broken links and missing javadocs) |[^\\r\\n]*javadocs-lint:.*\\r?\\n(?:[^\\r\\n]*\\[echo\\].*\\r?\\n)* # Third-party dependency license/notice problems |[^\\r\\n]*validate:.*\\r?\\n[^\\r\\n]*\\[echo\\].*\\r?\\n(?:[^\\r\\n]*\\[licenses\\].*\\r?\\n)*[^\\r\\n]*\\[licenses\\].*[1-9]\\d*\\s+error.*\\r?\\n # Jenkins problems |[^\\r\\n]*FATAL:(?s:.*) )} - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
Hmm, looks like spreading the BUILD_LOG_REGEX across multiple lines caused it not to be recognized. Jenkins's email templating functionality is provided by the Jenkins Email Extension Plugin (email-ext): https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin. The token parsing is done by hudson.plugins.emailext.plugins.ContentBuilder.Tokenizer: https://github.com/jenkinsci/email-ext-plugin/blob/master/src/main/java/hudson/plugins/emailext/plugins/ContentBuilder.java#L134 Here's the relevant argument-value regex (used to parse the value of the regex argument to the BUILD_LOG_REGEX token): private static final String stringRegex = "\"([^\"\\r\\n]|(.))*\""; So I *think* if I put a backslash (escaped with another backslash) at the end of each line, I can keep the multiple lines (and comments). I'll give it a try. Steve -Original Message- From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] Sent: Thursday, May 31, 2012 2:55 PM To: dev@lucene.apache.org Subject: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2723/ 1 tests failed.
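Assuming the argument-value regex above is essentially a double-quoted-string matcher whose body excludes raw CR/LF characters (the quote characters appear to have been mangled in transit, so this is a reading, not a certainty), a raw line break inside the value ends the match, which would explain why the multi-line token went unrecognized. A small sketch of that behavior, with an illustrative pattern rather than the plugin's exact literal:

```java
import java.util.regex.Pattern;

// Sketch of the argument-value parsing behavior under the stated assumption:
// the value must be a double-quoted string whose body is any character except
// quote/CR/LF, or a backslash escape. A raw newline therefore breaks the
// match. The pattern below is illustrative, not the plugin's exact source.
public class TokenValueRegexDemo {
    static final Pattern STRING_VALUE =
        Pattern.compile("\"([^\"\r\n]|\\\\.)*\"");

    public static boolean matches(String candidate) {
        return STRING_VALUE.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("\"(?x: pattern )\""));   // true: single-line value
        System.out.println(matches("\"(?x:\n pattern )\"")); // false: raw newline kills the match
    }
}
```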
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286868#comment-13286868 ] Simon Willnauer commented on LUCENE-4069: - bq. I really don't think we should make this complex to save 2 or 3 files total (even in a complex config with many fields). It's not worth the complexity. I agree. I think those postings formats should only deal with encoding and not with handling certain fields differently. A user / app should handle this in the codec. Ideally you don't have any conditions in the relevant methods like termsConsumer etc. bq. For me this falls into one of the many faster-if-you-know-about-it optimisations like FieldSelectors or recycling certain objects. Basically a useful hint to Lucene to save some extra effort but one which you don't need to use. why is this a speed improvement? reading from one file vs. multiple is not really faster though. Anyway, I think we should make this patch as simple as possible and not handle fields in the PF. We can still open another issue or wait until LUCENE-4093 is in to discuss this issue?
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286882#comment-13286882 ] Robert Muir commented on LUCENE-4069: - {quote} For me this falls into one of the many faster-if-you-know-about-it optimisations like FieldSelectors or recycling certain objects. Basically a useful hint to Lucene to save some extra effort but one which you don't need to use. {quote} I agree with Simon, it's not going to be faster. Worse, it creates a situation from the per-field perspective where multiple postings formats are sharing the same files for a segment. This would make it harder to do things like refactorings of codec apis in the future. So where is the benefit?
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #349
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/349/changes - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286887#comment-13286887 ] Dawid Weiss commented on LUCENE-4092: - Thanks for working on this, Steve. It'll really be useful. Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures). Key: LUCENE-4092 URL: https://issues.apache.org/jira/browse/LUCENE-4092 Project: Lucene - Java Issue Type: Sub-task Components: general/test Reporter: Dawid Weiss Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286909#comment-13286909 ] Andrzej Bialecki commented on LUCENE-3312: --- Comments to patch 04:
* index.Document is an interface; I think for better extensibility in the future it could be an abstract class - who knows what we will want to put there in addition to the iterators...
* as noted on IRC, this strong decoupling of stored and indexed content poses some interesting questions:
** since you can add multiple fields with the same name, you can now add an arbitrary sequence of Stored and Indexed fields (all with the same name). This means that you can now store parts of a field that are not indexed, and parts of a field that are indexed but not stored.
** previously, if a field was flagged as indexed but didn't have a tokenStream, its String or Reader value would be used to create a token stream. Now if you want a value to be stored and indexed you have to add two fields with the same name - one StoredField and the other an IndexedField for which you create a token stream from the value. My assumption is that StoredField-s will never be used anymore as potential sources of token streams?
* maybe this is a good moment to change all getters that return arrays of fields or values to return List-s, since all the code is doing underneath is collecting them into lists and then converting to arrays?
* previously we allowed one to remove fields from a document by name; are we going to allow this now separately for indexed and stored fields?
* minor nit: there's a grammar mistake in Field.setTokenStream(..): TokenStream fields tokenized. 
Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch In the field type branch we have strongly decoupled the Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up to use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is a possible perf hit for fields that are both indexed and stored (ie, we visit them twice, look up their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in the API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
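The proposed split (two iterables, either possibly null, with an indexed-and-stored field appearing in both) can be sketched roughly like this. All interface and method names below are hypothetical stand-ins, not the field-type branch's actual API:

```java
import java.util.Arrays;

// Rough sketch of the proposed indexing entry point: separate iterables for
// indexable and storable fields, either of which may be null. A field that is
// both indexed and stored simply appears in both iterables. Names here are
// hypothetical, not the field-type branch's actual API.
interface IndexableFieldSketch { String name(); String stringValue(); }
interface StorableFieldSketch { String name(); String stringValue(); }

class SimpleField implements IndexableFieldSketch, StorableFieldSketch {
    private final String name, value;
    SimpleField(String name, String value) { this.name = name; this.value = value; }
    public String name() { return name; }
    public String stringValue() { return value; }
}

public class SplitIndexingSketch {
    /** Returns how many field instances the "writer" visited, for demonstration. */
    public static int addDocument(Iterable<? extends IndexableFieldSketch> indexed,
                                  Iterable<? extends StorableFieldSketch> stored) {
        int visited = 0;
        if (indexed != null) for (IndexableFieldSketch f : indexed) visited++; // inverted
        if (stored != null) for (StorableFieldSketch f : stored) visited++;    // stored
        return visited;
    }

    public static void main(String[] args) {
        SimpleField title = new SimpleField("title", "Break out StorableField");
        SimpleField body = new SimpleField("body", "indexed only, not stored");
        // title is indexed *and* stored, so it appears in both iterables:
        // the "visit them twice" cost the proposal mentions.
        System.out.println(addDocument(Arrays.asList(title, body), Arrays.asList(title))); // 3
    }
}
```

The double visit of `title` in the example is exactly the perf downside the proposal flags, traded for the cleaner separation of concerns.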
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286916#comment-13286916 ] Mark Harwood commented on LUCENE-4069: -- bq. why is this a speed improvement? Sorry - misleading. Replace the word faster in my comment with better and that makes more sense - I mean better in terms of resource usage and reduced open file handles. This seemed relevant given the earlier comments about Solr's use of non-compound files: bq. [Solr] create massive amounts of files if we did so (add to the fact it disables compound files by default and its a disaster...) I can see there is a useful simplification being sought here if PerFieldPF can consider each of the unique top-level PFs presented to it as looking after an exclusive set of files. As the centralised allocator of file names it can then simply call each unique PF with a choice of segment suffix to name its various files without conflicting with other PFs. LUCENE-4093 is all about better determining which PF is unique using .equals(). Unfortunately I don't think this approach is sufficient. In order to avoid allocating unnecessary file names, PerFieldPF would have to further understand the nuances of which PFs were being wrapped by other PFs and which wrapped PFs would be reusable outside of their wrapping PF (as is the case with BloomPF's wrapped PF). That seems a more complex task than implementing equals(). So it seems we have 3 options: 1) Ignore the problems of creating too many files in the case of BloomPF and any other examples of wrapping PFs 2) Create a PerFieldPF implementation that reuses wrapped PFs using some generic means of discovering recyclable wrapped PFs (i.e. go further than what 4093 currently proposes in adding .equals support) 3) Retain my BloomPF-specific solution to the problem for those prepared to use lower-level APIs. Am I missing any other options, and which one do you want to go for? 
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286926#comment-13286926 ] Simon Willnauer commented on LUCENE-4069: - bq. Create a PerFieldPF implementation that reuses wrapped PFs using some generic means of discovering recyclable wrapped PFs (i.e. go further than what 4093 currently proposes in adding .equals support) I think we should investigate this further. Let's keep this issue simple, remove the field handling, and fix this on a higher level!
[jira] [Comment Edited] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286926#comment-13286926 ] Simon Willnauer edited comment on LUCENE-4069 at 5/31/12 8:57 PM: -- bq. This seemed relevant given the earlier comments about Solr's use of non-compound files: We can't make wrong decisions just because higher-level apps make wrong decisions. The dependency goes Solr -> Lucene, not the other way around. We provide fine-grained control over when to use CFS, i.e. for smallish segments etc. If you have hundreds of fields all using different PFs etc. you have to deal with tons of files, but that is, to be honest, not very likely to be the common case. bq. Create a PerFieldPF implementation that reuses wrapped PFs using some generic means of discovering recyclable wrapped PFs (i.e. go further than what 4093 currently proposes in adding .equals support) I think we should investigate this further. Let's keep this issue simple and remove the field handling and fix this on a higher level! was (Author: simonw): bq. Create a PerFieldPF implementation that reuses wrapped PFs using some generic means of discovering recyclable wrapped PFs (i.e. go further than what 4093 currently proposes in adding .equals support) I think we should investigate this further. Let's keep this issue simple and remove the field handling and fix this on a higher level!
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
This test intentionally allocates ~256 MB of packed ints ... the seed doesn't fail in isolation, but I think the test fails if it's run with other tests that leave too much uncollectible stuff allocated in the heap ... Can we somehow mark that a test should be run in isolation (its own new JVM)...? Another option ... would be to ignore the OOME ... but the risk there is we suppress a real OOME from a sudden bug in the packed ints. Though it's unlikely such a breakage would escape our usages of packed ints... so maybe it's fine. Mike McCandless http://blog.mikemccandless.com On Thu, May 31, 2012 at 2:54 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2723/ 1 tests failed.
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
This test intentionally allocates ~256 MB packed ints ... the seed doesn't fail in isolation, but I think the test fails if it's run with other tests that leave too much uncollectible stuff allocated in the heap ... It doesn't need to be hard refs. With parallel garbage collectors (with various staged memory pools) and fast allocation rate a thread may fail with an OOM even if there is theoretically enough space for a new allocated block. Running with SerialGC typically fixes the problem but then -- this isn't realistic :) Can we somehow mark that a test should be run in isolation (its own new JVM)...? Technically this is possible I think (can't tell how large a refactoring it would be). But something in me objects to this idea. On the one hand this is ideal test isolation; on the other hand I bet with time all tests would just require a forked VM because it's simpler this way. Good tests should clean up after themselves. I'm idealistic but I believe tests should be fixed if they don't follow this rule. Another option ... would be to ignore the OOME ... but the risk there is we suppress a real OOME from a sudden bug in the packed ints. Though it's unlikely such a breakage would escape our usages of packed ints... so maybe it's fine. How close are we to the memory limit if run in isolation (as a stand-alone test case)? We can probably measure this by allocating a byte[] before the test and doing binary search on its size depending on if it OOMs or not? Maybe it's just really close to the memory limit? Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
On Thu, May 31, 2012 at 5:16 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: This test intentionally allocates ~256 MB packed ints ... the seed doesn't fail in isolation, but I think the test fails if it's run with other tests that leave too much uncollectible stuff allocated in the heap ... It doesn't need to be hard refs. With parallel garbage collectors (with various staged memory pools) and fast allocation rate a thread may fail with an OOM even if there is theoretically enough space for a new allocated block. Running with SerialGC typically fixes the problem but then -- this isn't realistic :) Got it. Can we somehow mark that a test should be run in isolation (its own new JVM)...? Technically this is possible I think (can't tell how large a refactoring it would be). But something in me objects to this idea. On the one hand this is ideal test isolation; on the other hand I bet with time all tests would just require a forked VM because it's simpler this way. Good tests should clean up after themselves. I'm idealistic but I believe tests should be fixed if they don't follow this rule. Yeah I hear you... hmm do we forcefully clear the FieldCache after tests...? Though, in theory once the AtomicReader is collectible the FC's entries should be too... Another option ... would be to ignore the OOME ... but the risk there is we suppress a real OOME from a sudden bug in the packed ints. Though it's unlikely such a breakage would escape our usages of packed ints... so maybe it's fine. How close are we to the memory limit if run in isolation (as a stand-alone test case)? We can probably measure this by allocating a byte[] before the test and doing binary search on its size depending on if it OOMs or not? Maybe it's just really close to the memory limit? OK I did that: if I alloc 68 MB byte[] up front we OOME, but 67 MB byte[] and the test passes (run in isolation). That's closer than I expected: the max long[] we alloc in the test is 273 MB. 
So 512 - 273 - 68 = 171 MB unexplained hmm I think this is because large arrays are alloc'd directly from the old generation: http://stackoverflow.com/questions/9738911/javas-serial-garbage-collector-performing-far-better-than-other-garbage-collect When I run with -XX:NewRatio=10 then I can pre-alloc 191 MB byte[] and the test still passes ... I think the best option is to ignore the OOME from this test case...? Mike McCandless http://blog.mikemccandless.com
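The binary-search probe Mike describes is easy to script. A minimal diagnostic sketch (the class and method names are hypothetical, not part of the Lucene test suite): bisect on the size of a single byte[] allocation to estimate how much headroom remains before an OOME.

```java
// Hypothetical diagnostic: binary search for the largest single byte[]
// that can currently be allocated. Results depend on the collector and
// generation sizing (cf. the -XX:NewRatio observation above).
public class HeapHeadroom {
    // Largest allocatable size in [lo, hi], to within `step` bytes.
    public static long probe(long lo, long hi, long step) {
        while (hi - lo > step) {
            long mid = (lo + hi) / 2;
            try {
                // A single array is capped near Integer.MAX_VALUE elements.
                byte[] block = new byte[(int) Math.min(mid, Integer.MAX_VALUE - 8)];
                block[0] = 1;   // touch the block so it is really committed
                lo = mid;       // allocation succeeded: search upward
            } catch (OutOfMemoryError e) {
                hi = mid;       // too big: search downward
            }
        }
        return lo;
    }
}
```

Running a probe along these lines once before the test's big allocation shows how close it sits to the limit; the 67 MB passes / 68 MB fails boundary reported above came from exactly this kind of bisection.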
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #351
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/351/ -- [...truncated 10516 lines...] [junit4] 2at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) [junit4] 2at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146) [junit4] 2 [junit4] 2 168401 T3160 oas.SolrTestCaseJ4.endTrackingSearchers SEVERE ERROR: SolrIndexSearcher opens=430 closes=429 [junit4] 2 NOTE: test params are: codec=Lucene40: {}, sim=RandomSimilarityProvider(queryNorm=false,coord=true): {}, locale=is, timezone=Europe/Vienna [junit4] 2 NOTE: Windows 7 6.1 amd64/Sun Microsystems Inc. 1.6.0_32 (64-bit)/cpus=2,threads=2,free=161686656,total=259457024 [junit4] 2 NOTE: All tests run in this JVM: [SpellCheckCollatorTest, TestIndexingPerformance, CloudStateUpdateTest, TestRussianLightStemFilterFactory, TestValueSourceCache, IndexBasedSpellCheckerTest, OutputWriterTest, TestQueryUtils, TestRecovery, TestHindiFilters, TestPropInjectDefaults, TestPseudoReturnFields, TermVectorComponentTest, ConvertedLegacyTest, PrimUtilsTest, TestIndonesianStemFilterFactory, TestDelimitedPayloadTokenFilterFactory, NoCacheHeaderTest, TestCJKWidthFilterFactory, TestNorwegianLightStemFilterFactory, TestHTMLStripCharFilterFactory, TestThaiWordFilterFactory, TestGroupingSearch, TestHungarianLightStemFilterFactory, BasicZkTest, TestPatternTokenizerFactory, BadComponentTest, TestFunctionQuery, TestJmxIntegration, QueryParsingTest, TestRemoveDuplicatesTokenFilterFactory, TestPhraseSuggestions, TestDocSet, DirectSolrConnectionTest, TestSwedishLightStemFilterFactory, TestConfig, TestNGramFilters, TestJoin, TestSolrCoreProperties, LukeRequestHandlerTest, TestSolrDeletionPolicy1, PolyFieldTest, NotRequiredUniqueKeyTest, TestTypeTokenFilterFactory, TestQueryTypes, XsltUpdateRequestHandlerTest, TestMappingCharFilterFactory, TestElisionFilterFactory, TestFoldingMultitermQuery, CommonGramsFilterFactoryTest, QueryElevationComponentTest, LengthFilterTest, TestMergePolicyConfig, 
SolrCoreTest, ShowFileRequestHandlerTest, TestKStemFilterFactory, SuggesterWFSTTest, TestItalianLightStemFilterFactory, TestLuceneMatchVersion, LeaderElectionTest, LegacyHTMLStripCharFilterTest, TestSuggestSpellingConverter, TestIndexSearcher, TestSolrXMLSerializer, DateFieldTest, TestMultiCoreConfBootstrap, TestBinaryResponseWriter, TestSolrQueryParser, SystemInfoHandlerTest, TestFastLRUCache, TestGreekStemFilterFactory, TestPersianNormalizationFilterFactory, TestCapitalizationFilterFactory, TestCollationField, TestDistributedGrouping, TestShingleFilterFactory, TestDefaultSimilarityFactory, TestFiltering, TestLRUCache, TestSolrDeletionPolicy2, DoubleMetaphoneFilterFactoryTest, MultiTermTest, TestStopFilterFactory, ZkControllerTest, TestLMJelinekMercerSimilarityFactory, TestCodecSupport, FieldMutatingUpdateProcessorTest, StatsComponentTest, FullSolrCloudTest, TestGalicianStemFilterFactory, SimpleFacetsTest, TestJapaneseBaseFormFilterFactory, TestPatternReplaceCharFilterFactory, TestPorterStemFilterFactory, TestPropInject, TestBadConfig, PrimitiveFieldTypeTest, TestPortugueseStemFilterFactory, EchoParamsTest, UpdateRequestProcessorFactoryTest, SignatureUpdateProcessorFactoryTest, DocumentAnalysisRequestHandlerTest, TestPortugueseMinimalStemFilterFactory, TestBeiderMorseFilterFactory, TestWriterPerf, UpdateParamsTest, TestQuerySenderNoQuery, SolrCoreCheckLockOnStartupTest, TestXIncludeConfig, PluginInfoTest, TestBM25SimilarityFactory, DocumentBuilderTest, ZkSolrClientTest, ZkNodePropsTest, SnowballPorterFilterFactoryTest, TestJapanesePartOfSpeechStopFilterFactory, SuggesterFSTTest, TestFrenchLightStemFilterFactory, FileBasedSpellCheckerTest, CloudStateTest, TestSearchPerf, TestPortugueseLightStemFilterFactory, TestHunspellStemFilterFactory, TestHashPartitioner, LeaderElectionIntegrationTest, MBeansHandlerTest, TestSurroundQueryParser, TermsComponentTest, TestEnglishMinimalStemFilterFactory, TestDFRSimilarityFactory, AlternateDirectoryTest, FastVectorHighlighterTest, 
OverseerTest, TestJmxMonitoredMap, JSONWriterTest, PingRequestHandlerTest, TestSynonymFilterFactory, TestHyphenationCompoundWordTokenFilterFactory, TestOmitPositions, CoreAdminHandlerTest, TestPhoneticFilterFactory, DirectUpdateHandlerTest, TestWordDelimiterFilterFactory, CacheHeaderTest, SoftAutoCommitTest, DistributedSpellCheckComponentTest, TestGermanStemFilterFactory, TestReplicationHandler, TestPerFieldSimilarity, MoreLikeThisHandlerTest, TestQuerySenderListener, TestCoreContainer, TestKeywordMarkerFilterFactory, SpellPossibilityIteratorTest, TestGreekLowerCaseFilterFactory, NumericFieldsTest, TestLFUCache, JsonLoaderTest, TestStandardFactories, BadIndexSchemaTest, TestReverseStringFilterFactory, SolrRequestParserTest, UniqFieldsUpdateProcessorFactoryTest, SolrPluginUtilsTest, TestWikipediaTokenizerFactory, SOLR749Test,
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286994#comment-13286994 ] Steven Rowe commented on LUCENE-4092: - Two problems: # Spreading the BUILD_LOG_REGEX regex value over multiple lines is not supported by Jenkins's email templating functionality, which is provided by the Jenkins Email Extension Plugin (email-ext) [https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin]. See [the configuration token parsing regexes in ContentBuilder.Tokenizer|https://github.com/jenkinsci/email-ext-plugin/blob/master/src/main/java/hudson/plugins/emailext/plugins/ContentBuilder.java#L134], in particular the comment over the {{stringRegex}} field:{code:java}// Sequence of (1) not \ " CR LF and (2) \ followed by non line terminator private static final String stringRegex = "\"([^\"\\\\\r\n]|(\\\\.))*\"";{code} This could be fixed by allowing line terminators to be escaped:{code:java}// Sequence of (1) not \ " CR LF and (2) \ followed by any non-CR/LF character or (CR)LF private static final String stringRegex = "\"([^\"\\\\\r\n]|(\\\\(?:.|\r?\n)))*\"";{code} I submitted a Jenkins JIRA issue for this: [https://issues.jenkins-ci.org/browse/JENKINS-13976]. # [BuildLogRegexContent, the content parser for BUILD_LOG_REGEX, matches line-by-line|https://github.com/jenkinsci/email-ext-plugin/blob/master/src/main/java/hudson/plugins/emailext/plugins/content/BuildLogRegexContent.java#L213], so regexes targeting multiple lines will fail. I can see two possible routes to address this: ## The BUILD_LOG_EXCERPT token allows specification of begin/end line regexes, and includes everything in between matches. I'm doubtful this will enable capture of the stuff we want, though. ## I'll try to add an argument to BUILD_LOG_REGEX to enable multi-line content matching, and make a pull request/JIRA issue to get it included in the next release of the plugin. 
In the mean time, I'll switch the configuration in our Jenkins jobs to the following: {noformat} Build: ${BUILD_URL} ${FAILED_TESTS} Build Log: ${BUILD_LOG_REGEX,regex=[ \\t]*(?:\\[javac\\]\\s+[1-9]\\d*\\s+error|\\[junit4\\].*\\s+FAILURES!|\\[javadoc\\]\\s+[1-9]\\d*\\s+warning).*,linesBefore=100} ${BUILD_LOG_REGEX,regex=[ \\t]*\\[echo\\].*)*\\s*[1-9]\\d*\\s+Unknown\\s+Licenses.*,linesBefore=17,linesAfter=20} ${BUILD_LOG_REGEX,regex=[ \\t]*javadocs-lint:.*,linesBefore=0,linesAfter=75} ${BUILD_LOG_REGEX,regex=.*FATAL:.*,linesBefore=0,linesAfter=100} {noformat} Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures). Key: LUCENE-4092 URL: https://issues.apache.org/jira/browse/LUCENE-4092 Project: Lucene - Java Issue Type: Sub-task Components: general/test Reporter: Dawid Weiss Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
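The effect of the proposed tokenizer change can be checked directly with java.util.regex. The two patterns below are reconstructions (the archive stripped the quote characters from the JIRA comment's code blocks), so treat them as illustrative rather than the exact email-ext source:

```java
import java.util.regex.Pattern;

public class StringTokenDemo {
    // Current behavior (reconstructed): a quoted string is " followed by
    // any number of (a) a character other than ", \, CR, LF or
    // (b) \ followed by a non-line-terminator, then a closing ".
    static final Pattern CURRENT =
            Pattern.compile("\"([^\"\\\\\r\n]|(\\\\.))*\"");
    // Proposed fix: additionally allow \ followed by (CR)LF, so a regex
    // value can be spread over multiple lines with escaped line breaks.
    static final Pattern FIXED =
            Pattern.compile("\"([^\"\\\\\r\n]|(\\\\(?:.|\r?\n)))*\"");
}
```

With these patterns, a literal containing a backslash-escaped newline matches FIXED but not CURRENT, which is why multi-line BUILD_LOG_REGEX values were rejected by the template parser.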
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #352
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/352/
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287026#comment-13287026 ] Michael McCandless commented on LUCENE-4092: Steve you are a regexp God. Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures). Key: LUCENE-4092 URL: https://issues.apache.org/jira/browse/LUCENE-4092 Project: Lucene - Java Issue Type: Sub-task Components: general/test Reporter: Dawid Weiss Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
On Thu, May 31, 2012 at 5:51 PM, Michael McCandless luc...@mikemccandless.com wrote: I think the best option is to ignore the OOME from this test case...? Mike McCandless I think that's fine for now, but I'm not convinced there is no problem at all. However, it's not obvious the problem is us, either. It's easy to see this OOM is related to the G1 garbage collector. This test has failed 3 times in the past couple days (before, it never failed: I suspect packed ints changes sent it over the edge). https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2707/ https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2719/ https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2723/ All 3 cases are java 7, and all 3 cases use -XX:+UseG1GC. (Uwe turned on GC randomization at Lucene Revolution) -- lucidimagination.com
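For the "ignore the OOME" option, the guard can be confined to the one oversized allocation. A dependency-free sketch of the idea (in a real JUnit test one would more likely call Assume.assumeNoException so the run is reported as skipped rather than passed; the class and method names here are illustrative):

```java
// Sketch of ignoring OOME for a single known-huge allocation. Catching
// an Error is normally bad practice; it is defensible only because the
// guarded body performs the dominant allocation in the test, so an OOME
// there is almost certainly environmental (GC/heap sizing), not a bug.
public class OomeGuard {
    // Runs the body; true if it completed, false if it hit an OOME.
    public static boolean runIgnoringOome(Runnable body) {
        try {
            body.run();
            return true;
        } catch (OutOfMemoryError e) {
            // Treat as an environmental condition and let the caller skip.
            return false;
        }
    }
}
```

The residual risk discussed above remains: a genuine allocation bug in packed ints would be silently skipped on memory-constrained runs, which is why the guard should wrap only the intentionally huge allocation.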
[jira] [Resolved] (SOLR-3495) UUID and Timestamp Update Processors
[ https://issues.apache.org/jira/browse/SOLR-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-3495. Resolution: Fixed Committed revision 1344946. - trunk Committed revision 1344947. - 4x UUID and Timestamp Update Processors Key: SOLR-3495 URL: https://issues.apache.org/jira/browse/SOLR-3495 Project: Solr Issue Type: New Feature Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-3495.patch new UpdateProcessors to automatically add fields with new UUIDs and Timestamps to SolrInputDocuments, leveraging SOLR-2802. Both processors should default to selecting the uniqueKey field if it is the appropriate type. This is necessary for 4.0 because of SOLR-2796
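A sketch of how these processors might be wired into solrconfig.xml. The chain name and fieldName values are illustrative, not taken from the committed patch; per the issue description, each processor should fall back to the uniqueKey field when fieldName is omitted and the field type is appropriate:

```xml
<updateRequestProcessorChain name="uuid-timestamp">
  <!-- Adds a generated UUID to documents missing the configured field -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <!-- Adds the current date/time to documents missing the configured field -->
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">timestamp</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```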
Apache Icon in JIRA ?
Hello everyone, When I tried to work on LUCENE-4091.patch, I realized that the Apache icon doesn't appear next to the patch file. If I remember correctly, the Apache icon is displayed as long as the contributor checks Grant License to ASF when attaching patch files. But I couldn't see any Apache icons in the past issues in JIRA. Does anyone know how I can see whether the attached files are granted or not? koji -- Query Log Visualizer for Apache Solr http://soleami.com/
[jira] [Created] (SOLR-3499) Attachment Test Issue - IGNORE
Hoss Man created SOLR-3499: -- Summary: Attachment Test Issue - IGNORE Key: SOLR-3499 URL: https://issues.apache.org/jira/browse/SOLR-3499 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man sanity checking attachment licensing indicator in Jira
Re: Apache Icon in JIRA ?
Click the down arrow (options, to the far right side of the attachments section), then choose manage attachments and you can see the apache icon beside all attachments on the issue. On Thu, May 31, 2012 at 8:33 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: Hello everyone, When I tried to work on LUCENE-4091.patch, I realized that the Apache icon doesn't appear next to the patch file. If I remember correctly, the Apache icon is displayed as long as the contributor checks Grant License to ASF when attaching patch files. But I couldn't see any Apache icons in the past issues in JIRA. Does anyone know how I can see whether the attached files are granted or not? koji -- Query Log Visualizer for Apache Solr http://soleami.com/ -- lucidimagination.com
[jira] [Updated] (SOLR-3499) Attachment Test Issue - IGNORE
[ https://issues.apache.org/jira/browse/SOLR-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3499: --- Attachment: empty_file_not_intended_for_inclusion.txt attaching empty file and selecting Attachment not intended for inclusion Attachment Test Issue - IGNORE -- Key: SOLR-3499 URL: https://issues.apache.org/jira/browse/SOLR-3499 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Attachments: empty_file_grant_license.txt, empty_file_not_intended_for_inclusion.txt sanity checking attachment licensing indicator in Jira -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3499) Attachment Test Issue - IGNORE
[ https://issues.apache.org/jira/browse/SOLR-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3499: --- Attachment: empty_file_grant_license.txt Attaching empty file and selecting Grant license to ASF for inclusion in ASF works (as per the Apache License §5) Attachment Test Issue - IGNORE -- Key: SOLR-3499 URL: https://issues.apache.org/jira/browse/SOLR-3499 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Attachments: empty_file_grant_license.txt, empty_file_not_intended_for_inclusion.txt sanity checking attachment licensing indicator in Jira -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Apache Icon in JIRA ?
: : Click the down arrow (options, to the far right side of the : attachments section), then choose manage attachments and you can see : the apache icon beside all attachments on the issue. For quick comparison... https://issues.apache.org/jira/browse/SOLR-3499 https://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12558890 ...i'll file an INFRA Jira to see if we can get this back on the main issue screen. -Hoss
Re: Apache Icon in JIRA ?
: ...i'll file an INFRA Jira to see if we can get this back on the main : issue screen. Scratch that ... It was already reported and Infra evidently considers the matter resolved... https://issues.apache.org/jira/browse/INFRA-4842 -Hoss
Re: Apache Icon in JIRA ?
Robert, Hoss - Thanks! :) (12/06/01 9:42), Chris Hostetter wrote: : : Click the down arrow (options, to the far right side of the : attachments section), then choose manage attachments and you can see : the apache icon beside all attachments on the issue. For quick comparison... https://issues.apache.org/jira/browse/SOLR-3499 https://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12558890 ...i'll file an INFRA Jira to see if we can get this back on the main issue screen. -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Query Log Visualizer for Apache Solr http://soleami.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4091) FastVectorHighlighter: Getter for FieldFragList.WeightedFragInfo and FieldPhraseList.WeightedPhraseInfo
[ https://issues.apache.org/jira/browse/LUCENE-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved LUCENE-4091. Resolution: Fixed Fix Version/s: 5.0 4.0 committed in trunk and 4x. FastVectorHighlighter: Getter for FieldFragList.WeightedFragInfo and FieldPhraseList.WeightedPhraseInfo --- Key: LUCENE-4091 URL: https://issues.apache.org/jira/browse/LUCENE-4091 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.0 Reporter: sebastian L. Assignee: Koji Sekiguchi Priority: Minor Labels: patch Fix For: 4.0, 5.0 Attachments: LUCENE-4091.patch This patch introduces getter methods for * FieldFragList.WeightedFragInfo and * FieldPhraseList.WeightedPhraseInfo in order to make FieldFragList pluggable (see LUCENE-3440).
[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287088#comment-13287088 ] Koji Sekiguchi commented on LUCENE-3440: Hi sebastian, I committed LUCENE-4091 in trunk and branch_4x. For the credit, I will give it in CHANGES.txt when committing the main body (LUCENE-3440) patch. FastVectorHighlighter: IDF-weighted terms for ordered fragments Key: LUCENE-3440 URL: https://issues.apache.org/jira/browse/LUCENE-3440 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Reporter: sebastian L. Priority: Minor Labels: FastVectorHighlighter Fix For: 4.0 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, weight-vs-boost_table01.html, weight-vs-boost_table02.html The FastVectorHighlighter uses an equal weight for every term found in a fragment, which causes fragments with a high number of words or, in the worst case, a high number of very common words, to rank higher than fragments that contain *all* of the terms used in the original query. This patch provides ordered fragments with IDF-weighted terms: total weight = total weight + IDF for unique term per fragment * boost of query; The ranking formula should be the same, or at least similar, to the one used in org.apache.lucene.search.highlight.QueryTermScorer. The patch is simple, but it works for us. Some ideas: - A better approach would be moving the whole fragment scoring into a separate class. - Switch scoring via parameter - Exact phrases should be given an even better score, regardless of whether a phrase query was executed or not - edismax/dismax parameters pf, ps and pf^boost should be observed and corresponding fragments should be ranked higher
[jira] [Updated] (SOLR-3498) ContentStreamUpdateRequest doesn't seem to respect setCommitWithin()
[ https://issues.apache.org/jira/browse/SOLR-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen updated SOLR-3498: - Affects Version/s: 4.0 ContentStreamUpdateRequest doesn't seem to respect setCommitWithin() Key: SOLR-3498 URL: https://issues.apache.org/jira/browse/SOLR-3498 Project: Solr Issue Type: Bug Components: update Affects Versions: 3.6, 4.0 Reporter: Christian Moen I'm using the below code to post some office format files to Solr using SolrJ. It seems like {{setCommitWithin()}} is ignored in my {{ContentStreamUpdateRequest}} request, and that I need to use {{setParam(UpdateParams.COMMIT_WITHIN, ...)}} instead to get the desired effect. {code} SolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr"); ContentStreamUpdateRequest updateRequest = new ContentStreamUpdateRequest("/update/extract"); updateRequest.addFile(file); updateRequest.setParam("literal.id", file.getName()); updateRequest.setCommitWithin(1); // Does not work //updateRequest.setParam(UpdateParams.COMMIT_WITHIN, "1"); // Works updateRequest.process(solrServer); {code} If I use the below {code} ... //updateRequest.setCommitWithin(1); // Does not work updateRequest.setParam(UpdateParams.COMMIT_WITHIN, "1"); // Works ... {code} I get the desired result and a commit is being done. I'm doing this on 3.x, but I believe this issue could apply to 4.x as well (by quickly glancing over the code with tired eyes), but I haven't verified this, yet.
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #355
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/355/changes Changes: [koji] LUCENE-4091: add getter methods to FVH, part of LUCENE-3440 [hossman] SOLR-3495: new UpdateProcessors to add default values (constant, UUID, or Date) to documents w/o field values -- [...truncated 10447 lines...] [junit4] Completed in 0.17s, 1 test [junit4] [junit4] Suite: org.apache.solr.spelling.SpellPossibilityIteratorTest [junit4] Completed in 0.06s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.XsltUpdateRequestHandlerTest [junit4] Completed in 1.16s, 1 test [junit4] [junit4] Suite: org.apache.solr.analysis.TestWikipediaTokenizerFactory [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.analysis.TestElisionFilterFactory [junit4] Completed in 0.03s, 3 tests [junit4] [junit4] Suite: org.apache.solr.cloud.TestMultiCoreConfBootstrap [junit4] Completed in 4.51s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.RecoveryZkTest [junit4] Completed in 35.07s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.TestReplicationHandler [junit4] Completed in 28.80s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.ZkSolrClientTest [junit4] Completed in 15.81s, 4 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.DistributedSpellCheckComponentTest [junit4] Completed in 18.97s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.component.QueryElevationComponentTest [junit4] Completed in 6.06s, 7 tests [junit4] [junit4] Suite: org.apache.solr.ConvertedLegacyTest [junit4] Completed in 3.26s, 1 test [junit4] [junit4] Suite: org.apache.solr.TestTrie [junit4] Completed in 1.55s, 8 tests [junit4] [junit4] Suite: org.apache.solr.schema.BadIndexSchemaTest [junit4] Completed in 1.21s, 6 tests [junit4] [junit4] Suite: org.apache.solr.core.TestJmxIntegration [junit4] IGNORED 0.00s | TestJmxIntegration.testJmxOnCoreReload [junit4] Cause: Annotated @Ignore(timing problem? 
https://issues.apache.org/jira/browse/SOLR-2715) [junit4] Completed in 1.60s, 3 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.servlet.SolrRequestParserTest [junit4] Completed in 1.32s, 4 tests [junit4] [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest [junit4] Completed in 1.05s, 5 tests [junit4] [junit4] Suite: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest [junit4] Completed in 1.19s, 6 tests [junit4] [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2 [junit4] Completed in 0.74s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.component.TermsComponentTest [junit4] Completed in 0.95s, 13 tests [junit4] [junit4] Suite: org.apache.solr.handler.admin.ShowFileRequestHandlerTest [junit4] Completed in 1.04s, 2 tests [junit4] [junit4] Suite: org.apache.solr.search.TestSurroundQueryParser [junit4] Completed in 0.92s, 1 test [junit4] [junit4] Suite: org.apache.solr.highlight.HighlighterTest [junit4] Completed in 2.06s, 27 tests [junit4] [junit4] Suite: org.apache.solr.update.DocumentBuilderTest [junit4] Completed in 0.96s, 11 tests [junit4] [junit4] Suite: org.apache.solr.search.function.distance.DistanceFunctionTest [junit4] Completed in 1.06s, 3 tests [junit4] [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTSTTest [junit4] Completed in 1.28s, 4 tests [junit4] [junit4] Suite: org.apache.solr.schema.PolyFieldTest [junit4] Completed in 1.29s, 4 tests [junit4] [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest [junit4] Completed in 1.28s, 4 tests [junit4] [junit4] Suite: org.apache.solr.schema.TestOmitPositions [junit4] Completed in 0.98s, 2 tests [junit4] [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy1 [junit4] IGNOR/A 0.01s | TestSolrDeletionPolicy1.testCommitAge [junit4] Assumption #1: This test is not working on Windows (or maybe machines with only 2 CPUs) [junit4] 2 1167 T3512 oas.SolrTestCaseJ4.setUp ###Starting testCommitAge [junit4] 2 1172 T3512 C217 
oasu.DirectUpdateHandler2.deleteAll [collection1] REMOVING ALL DOCUMENTS FROM INDEX [junit4] 2 1172 T3512 C217 UPDATE [collection1] webapp=null path=null params={} {deleteByQuery=*:*} 0 0 [junit4] 2 1174 T3512 oas.SolrTestCaseJ4.tearDown ###Ending testCommitAge [junit4] 2 [junit4] Completed in 1.19s, 3 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.analysis.TestReversedWildcardFilterFactory [junit4] Completed in 0.75s, 4 tests [junit4] [junit4] Suite: org.apache.solr.schema.RequiredFieldsTest [junit4] Completed in 0.83s, 3 tests [junit4] [junit4] Suite:
[jira] [Commented] (LUCENE-4092) Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures).
[ https://issues.apache.org/jira/browse/LUCENE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287097#comment-13287097 ] Steven Rowe commented on LUCENE-4092: - bq. I'll switch the configuration in our Jenkins jobs to the following ... Done. Check what's Jenkins pattern for e-mailing log fragments (so that it includes failures). Key: LUCENE-4092 URL: https://issues.apache.org/jira/browse/LUCENE-4092 Project: Lucene - Java Issue Type: Sub-task Components: general/test Reporter: Dawid Weiss Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4097) index was locked because of InterruptedException
[ https://issues.apache.org/jira/browse/LUCENE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wang updated LUCENE-4097: Component/s: core/index Affects Version/s: 3.1

index was locked because of InterruptedException
Key: LUCENE-4097
URL: https://issues.apache.org/jira/browse/LUCENE-4097
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Affects Versions: 3.1
Reporter: wang

The index was locked because of an InterruptedException, and I could do nothing but restart Tomcat. How can I avoid this happening again? Thanks. This is the stack trace:

org.apache.lucene.util.ThreadInterruptedException: java.lang.InterruptedException
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4118)
    at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2836)
    at org.apache.lucene.index.IndexWriter.finishMerges(IndexWriter.java:2821)
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1847)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1800)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1764)
    at org.opencms.search.CmsSearchManager.updateIndexIncremental(CmsSearchManager.java:2262)
    at org.opencms.search.CmsSearchManager.updateIndexOffline(CmsSearchManager.java:2306)
    at org.opencms.search.CmsSearchManager$CmsSearchOfflineIndexThread.run(CmsSearchManager.java:327)
Caused by: java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4116)
    ... 8 more
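The trace shows IndexWriter.doWait() converting a checked InterruptedException into an unchecked ThreadInterruptedException while close() is waiting for merges. The defensive pattern a caller can apply is sketched below with a hypothetical Indexer class (this is not Lucene's actual code): release the lock in a finally block, and restore the thread's interrupt flag after catching the wrapped exception.

```java
// Hypothetical sketch -- Indexer stands in for Lucene internals. The two
// key points are the finally block and restoring the interrupt status.
class Indexer {
    private boolean merging = true;
    private boolean lockReleased = false;

    // Mirrors the doWait() idiom: wrap InterruptedException unchecked,
    // the way ThreadInterruptedException does in the trace above.
    private synchronized void doWait() {
        try {
            wait(100);
        } catch (InterruptedException ie) {
            throw new RuntimeException(ie);
        }
    }

    public synchronized void close() {
        try {
            while (merging) {
                doWait();
                merging = false; // pretend merges finish after one wait
            }
        } finally {
            lockReleased = true; // always release the write lock
        }
    }

    public boolean isLockReleased() { return lockReleased; }

    public static void main(String[] args) {
        Indexer w = new Indexer();
        Thread.currentThread().interrupt(); // simulate the interrupt
        try {
            w.close();
        } catch (RuntimeException expected) {
            // Restore the interrupt status so later blocking calls see it.
            Thread.currentThread().interrupt();
        }
        System.out.println("lock released: " + w.isLockReleased()); // prints: lock released: true
        Thread.interrupted(); // clear the flag before exiting cleanly
    }
}
```

Without the finally block the write lock can stay held after the interrupt, which would leave the index locked as reported here; without re-interrupting, subsequent blocking calls in the same thread never learn an interrupt was requested.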
Re: Typo in UIMAUpdateRequestProcessor: Analazying text
(12/06/01 5:25), Jack Krupansky wrote: A typo at line 146 in UIMAUpdateRequestProcessor.java: log.info(new StringBuffer("Analazying text").toString()); “Analazying” s.b. “Analyzing” Thanks Jack! Committed the fix. koji -- Query Log Visualizer for Apache Solr http://soleami.com/
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #356
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/356/ -- [...truncated 16518 lines...] [junit4] 2 58777 T3191 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler [junit4] 2 58777 T3191 oasc.RequestHandlers.initHandlersFromConfig created spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler [junit4] 2 58777 T3191 oasc.RequestHandlers.initHandlersFromConfig created tvrh: org.apache.solr.handler.component.SearchHandler [junit4] 2 58778 T3191 oasc.RequestHandlers.initHandlersFromConfig created /mlt: solr.MoreLikeThisHandler [junit4] 2 58778 T3191 oasc.RequestHandlers.initHandlersFromConfig created /debug/dump: solr.DumpRequestHandler [junit4] 2 58779 T3191 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60 [junit4] 2 58780 T3191 oasc.SolrCore.initDeprecatedSupport WARNING solrconfig.xml uses deprecated admin/gettableFiles, Please update your config to use the ShowFileRequestHandler. 
[junit4] 2 58781 T3191 oasc.SolrCore.initDeprecatedSupport WARNING adding ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, SCHEMA-PHRASESUGGEST.XML, STEMDICT.TXT, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, OLD_SYNONYMS.TXT, SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, BAD-SCHEMA-EXTERNAL-FILEFIELD.XML] [junit4] 2 58784 T3191 oass.SolrIndexSearcher.init Opening Searcher@132bfc00 main [junit4] 2 58785 T3191 oass.SolrIndexSearcher.init WARNING WARNING: Directory impl does not support setting indexDir: org.apache.lucene.store.MockDirectoryWrapper [junit4] 2 58785 T3191 oasu.CommitTracker.init Hard AutoCommit: disabled [junit4] 2 58785 T3191 oasu.CommitTracker.init Soft AutoCommit: disabled [junit4] 2 58786 T3191 oashc.SpellCheckComponent.inform Initializing spell checkers [junit4] 2 58793 T3191 oass.DirectSolrSpellChecker.init init: {name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3} [junit4] 2 58823 T208 oaz.ClientCnxn$SendThread.startConnect Opening socket connection to server 127.0.0.1/127.0.0.1:60602 [junit4] 2 58825 T3191 oashc.HttpShardHandlerFactory.getParameter Setting socketTimeout to: 0 [junit4] 2 58825 T3191 oashc.HttpShardHandlerFactory.getParameter Setting urlScheme to: http:// [junit4] 2 58826 T3191 oashc.HttpShardHandlerFactory.getParameter Setting connTimeout to: 0 [junit4] 2 58826 T3191 oashc.HttpShardHandlerFactory.getParameter Setting maxConnectionsPerHost to: 20 [junit4] 2 58826 T3191
Re: CHANGES.txt for highlighter?
trunk/branch4x only have a single consolidated lucene/CHANGES.txt. So a highlighter change would just go there!

On Thu, May 31, 2012 at 10:15 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: Hi, sorry again. I cannot find CHANGES.txt files anymore for the (ancient?) contrib packages, e.g. highlighter, under the lucene directory:

$ find . -name CHANGES.txt
./lucene/CHANGES.txt
./solr/CHANGES.txt
./solr/contrib/analysis-extras/CHANGES.txt
./solr/contrib/clustering/CHANGES.txt
./solr/contrib/dataimporthandler/CHANGES.txt
./solr/contrib/extraction/CHANGES.txt
./solr/contrib/langid/CHANGES.txt
./solr/contrib/uima/CHANGES.txt

Where should I give credit to a contributor for FVH? koji -- Query Log Visualizer for Apache Solr http://soleami.com/ -- lucidimagination.com
Re: CHANGES.txt for highlighter?
(12/06/01 11:28), Robert Muir wrote: trunk/branch4x only have a single consolidated lucene/CHANGES.txt. So a highlighter change would just go there! Got it. Thank you again! koji -- Query Log Visualizer for Apache Solr http://soleami.com/
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #201
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/201/changes Changes: [koji] fix typo in uima contrib [koji] LUCENE-4091: add getter methods to FVH, part of LUCENE-3440 -- [...truncated 11372 lines...] [junit4] Suite: org.apache.solr.util.TimeZoneUtilsTest [junit4] Completed in 0.16s, 5 tests [junit4] [junit4] Suite: org.apache.solr.schema.NumericFieldsTest [junit4] Completed in 0.88s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.TestQuerySenderNoQuery [junit4] Completed in 0.84s, 3 tests [junit4] [junit4] Suite: org.apache.solr.analysis.TestGermanMinimalStemFilterFactory [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.OverseerTest [junit4] Completed in 46.50s, 7 tests [junit4] [junit4] Suite: org.apache.solr.TestDistributedSearch [junit4] Completed in 27.63s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.CloudStateUpdateTest [junit4] Completed in 12.74s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.component.DistributedSpellCheckComponentTest [junit4] Completed in 15.45s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.component.DistributedTermsComponentTest [junit4] Completed in 13.24s, 1 test [junit4] [junit4] Suite: org.apache.solr.cloud.BasicZkTest [junit4] Completed in 9.06s, 1 test [junit4] [junit4] Suite: org.apache.solr.TestJoin [junit4] Completed in 10.19s, 2 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.SpellCheckComponentTest [junit4] Completed in 7.31s, 9 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.QueryElevationComponentTest [junit4] Completed in 5.33s, 7 tests [junit4] [junit4] Suite: org.apache.solr.cloud.TestMultiCoreConfBootstrap [junit4] Completed in 4.47s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestRangeQuery [junit4] Completed in 8.50s, 2 tests [junit4] [junit4] Suite: org.apache.solr.update.PeerSyncTest [junit4] Completed in 4.24s, 1 test [junit4] [junit4] Suite: 
org.apache.solr.spelling.suggest.SuggesterFSTTest [junit4] Completed in 1.25s, 4 tests [junit4] [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest [junit4] Completed in 0.92s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.SolrCoreTest [junit4] Completed in 5.18s, 5 tests [junit4] [junit4] Suite: org.apache.solr.core.TestJmxIntegration [junit4] IGNORED 0.00s | TestJmxIntegration.testJmxOnCoreReload [junit4] Cause: Annotated @Ignore(timing problem? https://issues.apache.org/jira/browse/SOLR-2715) [junit4] Completed in 1.74s, 3 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields [junit4] Completed in 1.40s, 13 tests [junit4] [junit4] Suite: org.apache.solr.search.similarities.TestLMDirichletSimilarityFactory [junit4] Completed in 0.15s, 2 tests [junit4] [junit4] Suite: org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest [junit4] Completed in 0.81s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.admin.CoreAdminHandlerTest [junit4] Completed in 1.78s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.SpatialFilterTest [junit4] Completed in 1.53s, 3 tests [junit4] [junit4] Suite: org.apache.solr.core.SolrCoreCheckLockOnStartupTest [junit4] Completed in 1.53s, 2 tests [junit4] [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest [junit4] Completed in 1.26s, 4 tests [junit4] [junit4] Suite: org.apache.solr.schema.CurrencyFieldTest [junit4] IGNORED 0.00s | CurrencyFieldTest.testPerformance [junit4] Cause: Annotated @Ignore() [junit4] Completed in 1.10s, 8 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.schema.TestOmitPositions [junit4] Completed in 0.88s, 2 tests [junit4] [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy1 [junit4] IGNOR/A 0.02s | TestSolrDeletionPolicy1.testCommitAge [junit4] Assumption #1: This test is not working on Windows (or maybe machines with only 2 CPUs) [junit4] 2 749 T3389 oas.SolrTestCaseJ4.setUp ###Starting testCommitAge 
[junit4] 2 ASYNC NEW_CORE C224 name=collection1 org.apache.solr.core.SolrCore@26d904a1 [junit4] 2 753 T3389 C224 oasu.DirectUpdateHandler2.deleteAll [collection1] REMOVING ALL DOCUMENTS FROM INDEX [junit4] 2 754 T3390 oasc.SolrCore.registerSearcher [collection1] Registered new searcher Searcher@ab34164 main{StandardDirectoryReader(segments_1:1)} [junit4] 2 754 T3389 C224 oasc.SolrDeletionPolicy.onInit SolrDeletionPolicy.onInit: commits:num=1 [junit4] 2 commit{dir=MockDirWrapper(org.apache.lucene.store.RAMDirectory@7a766fb4
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287136#comment-13287136 ] Chris Male commented on LUCENE-3312: bq. index.Document is an interface, I think for better extensibility in the future it could be an abstract class - who knows what we will want to put there in addition to the iterators... I'm not sure that is such a big deal. But I do think should think about the name here. We already have Document and it's going to become confusing with two different Document classes kind of doing the same thing and with document.Document implementing index.Document as well. bq. previously we allowed one to remove fields from document by name, are we going to allow this now separately for indexed and stored fields? I think we need to simplify the document.Document API. I don't think it should hold Indexable/StorableField instances but instead should just hold Field instances. It is a userland kind of class and so is Field. We should make it easy for people to add the Fields that they want. If they want to have a Field which is both indexed and stored, then they can create it once and add it to Document. If they want to do it separately, then they can do that too. Since Field implements both IndexableField and StorableField, it can serve the dual purpose. That way the API in document.Document is pretty simple and you can add and remove things as done in the past. 
Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch In the field type branch we have strongly decoupled the Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up to use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is a possible perf hit for fields that are both indexed and stored (i.e., we visit them twice, look up their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in the API.
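Chris's suggestion can be sketched in a few lines. Everything below is hypothetical and heavily simplified (the names IndexableField, StorableField, Field, and Document come from the discussion, but these signatures are not the real Lucene API): Field implements both interfaces, Document holds plain Fields, and the two views the indexer would consume are just filtered projections.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of the proposed split -- not the real Lucene API.
interface IndexableField { String name(); }
interface StorableField { String name(); String stringValue(); }

// A user-land Field serves both purposes at once, as Chris suggests.
class Field implements IndexableField, StorableField {
    private final String name, value;
    private final boolean indexed, stored;
    Field(String name, String value, boolean indexed, boolean stored) {
        this.name = name; this.value = value;
        this.indexed = indexed; this.stored = stored;
    }
    public String name() { return name; }
    public String stringValue() { return value; }
    boolean isIndexed() { return indexed; }
    boolean isStored() { return stored; }
}

// Document stays a simple user-land container: add/remove by name works
// as before, and the indexer consumes two filtered views.
class Document {
    private final List<Field> fields = new ArrayList<Field>();
    void add(Field f) { fields.add(f); }
    void removeFields(String name) {
        for (Iterator<Field> it = fields.iterator(); it.hasNext();) {
            if (it.next().name().equals(name)) it.remove();
        }
    }
    List<IndexableField> indexableFields() {
        List<IndexableField> out = new ArrayList<IndexableField>();
        for (Field f : fields) if (f.isIndexed()) out.add(f);
        return out;
    }
    List<StorableField> storableFields() {
        List<StorableField> out = new ArrayList<StorableField>();
        for (Field f : fields) if (f.isStored()) out.add(f);
        return out;
    }
}

class FieldSplitDemo {
    public static void main(String[] args) {
        Document doc = new Document();
        doc.add(new Field("title", "Lucene in Action", true, true));
        doc.add(new Field("body", "full text here", true, false));
        // The indexed view sees both fields; the stored view only "title".
        System.out.println(doc.indexableFields().size() + " indexed, "
                + doc.storableFields().size() + " stored"); // prints: 2 indexed, 1 stored
    }
}
```

The downside Michael mentions is visible here: a Field that is both indexed and stored appears in both views, so the indexer would visit it (and hash its name) twice.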
[jira] [Created] (LUCENE-4099) Remove generics from SpatialStrategy and remove SpatialFieldInfo
Chris Male created LUCENE-4099: -- Summary: Remove generics from SpatialStrategy and remove SpatialFieldInfo Key: LUCENE-4099 URL: https://issues.apache.org/jira/browse/LUCENE-4099 Project: Lucene - Java Issue Type: Improvement Components: modules/spatial Reporter: Chris Male Priority: Minor Some time ago I added SpatialFieldInfo as a way for SpatialStrategys to declare what information they needed per request. This meant that a Strategy could be used across multiple requests. However, it doesn't really need to be that way any more: Strategies are light to instantiate, and the generics are just clumsy and annoying. Instead, Strategies should just define what they need in their constructor.
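A before/after sketch of what the issue proposes (hypothetical, heavily simplified names -- not the actual spatial module classes): the per-request SpatialFieldInfo parameter and its type parameter disappear, and the field name moves into the strategy's constructor.

```java
// BEFORE (sketch): field info is threaded through a type parameter and
// passed on every call, so one strategy instance can serve many fields.
interface SpatialFieldInfo { String getFieldName(); }
abstract class OldSpatialStrategy<T extends SpatialFieldInfo> {
    abstract String makeQuery(String shape, T fieldInfo);
}

// AFTER (sketch): strategies are cheap to instantiate, so each one is
// constructed with the field it targets and the generics go away.
abstract class SpatialStrategy {
    protected final String fieldName;
    SpatialStrategy(String fieldName) { this.fieldName = fieldName; }
    abstract String makeQuery(String shape);
}

class PointStrategy extends SpatialStrategy {
    PointStrategy(String fieldName) { super(fieldName); }
    String makeQuery(String shape) {
        // Hypothetical query syntax, for illustration only.
        return fieldName + ":" + shape;
    }
}

class SpatialDemo {
    public static void main(String[] args) {
        SpatialStrategy strategy = new PointStrategy("geo");
        System.out.println(strategy.makeQuery("Pt(1,2)")); // prints: geo:Pt(1,2)
    }
}
```

The trade-off is the one the issue accepts: you now build one strategy per field instead of reusing a single instance, which is fine once construction is cheap.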
Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2723 - Failure
Aaahhh... I thought G1 would start causing issues at some point. Good catch, Robert. Dawid

On Fri, Jun 1, 2012 at 2:05 AM, Robert Muir rcm...@gmail.com wrote: On Thu, May 31, 2012 at 5:51 PM, Michael McCandless luc...@mikemccandless.com wrote: I think the best option is to ignore the OOME from this test case...? Mike McCandless

I think that's fine for now, but I'm not convinced there is no problem at all. However, it's not obvious the problem is us, either. It's easy to see this OOM is related to the G1 garbage collector. This test has failed 3 times in the past couple of days (before, it never failed; I suspect the packed ints changes sent it over the edge). https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2707/ https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2719/ https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2723/ All 3 cases are Java 7, and all 3 cases use -XX:+UseG1GC. (Uwe turned on GC randomization at Lucene Revolution.) -- lucidimagination.com
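Mike's suggestion ("ignore the OOME from this test case") can be sketched as follows. This is a hypothetical harness, not the actual Lucene test: the idea is to treat an OutOfMemoryError from this one stress test as a skip rather than a failure while the G1 interaction is unresolved.

```java
// Hypothetical sketch of ignoring an OOME that only reproduces under
// -XX:+UseG1GC, as discussed in the thread above. Not the actual test code.
class IgnoreOomeDemo {
    // Returns true if the stress body completed, false if it was skipped.
    static boolean runStressTest() {
        try {
            stress();
            return true;
        } catch (OutOfMemoryError oome) {
            // Known, suspected-G1 failure: record a skip instead of failing.
            System.err.println("ignoring OOME (suspected G1 GC issue): " + oome);
            return false;
        }
    }

    // Stand-in for the real allocation-heavy test body; the failure is
    // simulated so the sketch is runnable without exhausting the heap.
    static void stress() {
        throw new OutOfMemoryError("simulated G1 allocation failure");
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runStressTest()); // prints: completed: false
    }
}
```

Catching OutOfMemoryError is normally a bad idea, since the heap may be in an arbitrary state afterwards, which is why Robert's caveat matters: this hides the symptom rather than fixing whatever sent the test over the edge.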
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #202
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/202/