[jira] [Updated] (SOLR-3161) Use of 'qt' should be restricted to searching and should not start with a '/'
[ https://issues.apache.org/jira/browse/SOLR-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-3161: --- Attachment: SOLR-3161_handleSelect=false_and_register_slash-select_handler.patch Attached is a patch to solrconfig.xml with comments based on what Hoss suggested. I'll commit these to 4x and trunk in ~24 hours if no further comment is given. Use of 'qt' should be restricted to searching and should not start with a '/' - Key: SOLR-3161 URL: https://issues.apache.org/jira/browse/SOLR-3161 Project: Solr Issue Type: Improvement Components: search, web gui Reporter: David Smiley Assignee: David Smiley Fix For: 3.6, 4.0 Attachments: SOLR-3161-disable-qt-by-default.patch, SOLR-3161-dispatching-request-handler.patch, SOLR-3161-dispatching-request-handler.patch, SOLR-3161_handleSelect=false_and_register_slash-select_handler.patch, SOLR-3161_limit_qt=_to_refer_to_SearchHandlers,_and_shards_qt_likewise.patch, SOLR-3161_make_the_slash-select_request_handler_the_default.patch I haven't yet looked at the code involved for suggestions here; I'm speaking based on how I think things should work and not work, based on intuitiveness and security. In general I feel it is best practice to use '/'-leading request handler names and not use qt, but I don't hate it enough when used in limited (search-only) circumstances to propose its demise. But if someone proposes its deprecation, then I am +1 for that. Here is my proposal:
* Solr should error if the parameter qt is supplied with a leading '/'. (trunk only)
* Solr should only honor qt if the target request handler extends solr.SearchHandler.
* The new admin UI should only use 'qt' when it has to. For the query screen, it could present a little pop-up menu of handlers to choose from, including /select?qt=mycustom for handlers that aren't named with a leading '/'. This choice should be positioned at the top.
And before I forget, I or someone else should investigate whether there are any similar security problems with the shards.qt parameter. Perhaps shards.qt can abide by the same rules outlined above. Does anyone foresee any problems with this proposal? On a related subject, I think the notion of a default request handler is bad - the default=true thing. Honestly I'm not sure what it does, since I noticed Solr trunk redirects '/solr/' to the new admin UI at '/solr/#/'. Assuming it doesn't do anything useful anymore, I think it would be clearer to use <requestHandler name="/select" class="solr.SearchHandler"> instead of what's there now. The delta is to put the leading '/' on this request handler name, and remove the default attribute. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
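The dispatch rules proposed above can be sketched as a small predicate. Everything here is a hypothetical illustration of the proposal, not Solr's actual dispatch code; the class and method names are invented.

```java
// Hypothetical sketch of the proposed 'qt' rules: reject a leading '/',
// and only honor qt when the target handler is a SearchHandler.
public class QtValidator {

    /** Returns true if the given qt value should be honored under the proposal. */
    public static boolean isAllowed(String qt, boolean targetIsSearchHandler) {
        if (qt == null || qt.isEmpty()) {
            return true; // no qt supplied: nothing to restrict
        }
        if (qt.startsWith("/")) {
            return false; // proposal: error out on a leading '/'
        }
        return targetIsSearchHandler; // only honor qt for SearchHandler subclasses
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("spellcheck", true));  // allowed
        System.out.println(isAllowed("/update", true));     // rejected: leading '/'
        System.out.println(isAllowed("custom", false));     // rejected: not a SearchHandler
    }
}
```

Under these rules, /select?qt=mycustom still works for search handlers registered without a leading '/', while qt can no longer reach update or admin handlers.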
Re: Welcome Adrien Grand as a new Lucene/Solr committer
Welcome! On Fri, Jun 8, 2012 at 5:47 PM, Sami Siren ssi...@gmail.com wrote: Welcome Adrien! -- Sami Siren On Thu, Jun 7, 2012 at 9:11 PM, Michael McCandless luc...@mikemccandless.com wrote: I'm pleased to announce that Adrien Grand has joined our ranks as a committer. He has been contributing various patches to Lucene/Solr, recently to Lucene's packed ints implementation, giving a nice performance gain in some cases. For example check out http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html (look for annotation U). Adrien, it's tradition that you introduce yourself with a brief bio. As soon as your SVN access is set up, you should then be able to add yourself to the committers list on the website as well. Congratulations! Mike McCandless http://blog.mikemccandless.com -- Chris Male | Software Developer | DutchWorks | www.dutchworks.nl
[jira] [Updated] (SOLR-3177) Excluding tagged filter in StatsComponent
[ https://issues.apache.org/jira/browse/SOLR-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mathias H. updated SOLR-3177: - Affects Version/s: 4.0 Excluding tagged filter in StatsComponent - Key: SOLR-3177 URL: https://issues.apache.org/jira/browse/SOLR-3177 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 3.5, 3.6, 4.0 Reporter: Mathias H. Priority: Minor Labels: localparams, stats, statscomponent It would be useful to exclude the effects of some fq params from the set of documents used to compute stats -- similar to how you can exclude tagged filters when generating facet counts: https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters So that it's possible to do something like this: http://localhost:8983/solr/select?fq={!tag=priceFilter}price:[1 TO 20]&q=*:*&stats=true&stats.field={!ex=priceFilter}price If you want to create a price slider this is very useful, because then you can filter the price ([1 TO 20]) and nevertheless get the lower and upper bound of the unfiltered price (min=0, max=100): {noformat} |-[---]--| $0 $1 $20 $100 {noformat}
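As a rough illustration of the request being proposed, here is how such a URL could be assembled. The helper class and method are made up for illustration; only the {!tag=...}/{!ex=...} local-params syntax and the parameter names come from the issue.

```java
// Illustrative only: builds the stats-with-filter-exclusion request described
// above. StatsExcludeUrl is an invented name, not part of Solr or SolrJ.
public class StatsExcludeUrl {

    public static String build(String host, String filterTag, String field, String range) {
        return host + "/solr/select"
            + "?q=*:*"
            + "&fq={!tag=" + filterTag + "}" + field + ":" + range   // tag the filter
            + "&stats=true"
            + "&stats.field={!ex=" + filterTag + "}" + field;        // exclude it for stats
    }

    public static void main(String[] args) {
        // Stats for 'price' would be computed as if the price filter were absent:
        System.out.println(build("http://localhost:8983", "priceFilter", "price", "[1 TO 20]"));
    }
}
```

(In a real request the local-params braces and spaces would be URL-encoded; they are left readable here.)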
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291600#comment-13291600 ] Simon Willnauer commented on LUCENE-4087: - After talking about this to other committers during the conference, I think this is really a bit more controversial than it seemed. Except for the DocValues behavior, all of this is pre-existing behavior. The discussion is similar to changing norms through IR, and removing that capability did bring up some hard discussions. Yet, I think we should only solve the DocValues issue in the least intrusive way and discuss the omitNorms/IndexOptions behavior in a different issue. If we make all of this throw exceptions, we almost introduce a schema here, which makes Lucene 4.0 very different in terms of RT behavior compared to previous versions. Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Currently IW fails late and inconsistently on illegal field metadata changes, like an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and:
* throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge
* same with omitNorms
* same with norms types and docvalues types
* still keeping field numbers consistent
This way we could eliminate all these traps and just give an exception instead.
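The fail-fast idea in the issue can be sketched in miniature: record a field's options the first time it is declared, and throw on any later conflicting declaration instead of silently merging it away. The registry class and its Options enum are invented stand-ins, not Lucene's FieldInfos.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of "fail fast on conflicting field metadata". Names are
// illustrative; Lucene's real bookkeeping lives in FieldInfos/IndexWriter.
public class FieldMetadataRegistry {
    public enum Options { DOCS_ONLY, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

    private final Map<String, Options> seen = new HashMap<>();

    public void declare(String field, Options options) {
        Options previous = seen.putIfAbsent(field, options);
        if (previous != null && previous != options) {
            // Instead of dropping positions on merge, reject the change up front:
            throw new IllegalArgumentException("field \"" + field + "\" already declared with "
                + previous + ", cannot redeclare as " + options);
        }
    }

    public static void main(String[] args) {
        FieldMetadataRegistry reg = new FieldMetadataRegistry();
        reg.declare("body", Options.DOCS_AND_FREQS_AND_POSITIONS);
        reg.declare("body", Options.DOCS_AND_FREQS_AND_POSITIONS); // same options: fine
        try {
            reg.declare("body", Options.DOCS_ONLY); // conflict: throws
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```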
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291635#comment-13291635 ] Christian Moen commented on SOLR-3524: -- Hiraga-san, there are different views on how punctuation characters are best handled by tokenizers. Punctuation characters generally don't convey much meaning useful for text search, so they are generally removed in Lucene. (A different point of view is that tokenizers shouldn't remove punctuation and that filters should do this.) The ability to keep punctuation was left as an expert feature in JapaneseTokenizer, and I think we can expose this as an expert feature in Solr as well. Could you share some details on your use-case, just so that I get a better idea of the background and importance of this? Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option so that I can configure this behavior in the fieldtype definition in schema.xml.
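What the issue asks for amounts to reading one optional boolean attribute in the factory and defaulting to the current behavior. A minimal sketch, assuming a schema.xml attribute named discardPunctuation (the attribute name and helper class are illustrative, not necessarily what the patch uses):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of exposing discard-punctuation as a factory init argument. The real
// JapaneseTokenizerFactory receives its args from the fieldtype definition in
// schema.xml; this standalone helper only demonstrates the lookup and default.
public class DiscardPunctuationArg {

    /** Default is true, matching the current behavior: punctuation is removed. */
    public static boolean discardPunctuation(Map<String, String> args) {
        String v = args.get("discardPunctuation");
        return v == null ? true : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> cfg = new HashMap<>();
        System.out.println(discardPunctuation(cfg));  // default: discard
        cfg.put("discardPunctuation", "false");
        System.out.println(discardPunctuation(cfg));  // expert option: keep punctuation
    }
}
```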
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291637#comment-13291637 ] Robert Muir commented on SOLR-3520: --- +1, it's dangerous when these don't have tests: there could be very simple bugs or patches in the future that break things and we won't notice. We should also keep an eye on https://builds.apache.org/job/Solr-trunk/clover/org/apache/solr/analysis/pkg-summary.html which makes it very easy to see which ones are missing tests. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} don't have any tests and it would be good to have some.
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291639#comment-13291639 ] Christian Moen commented on SOLR-3520: -- Thanks, Robert. We have them in Lucene, but not adding some for Solr was an oversight on my part. Very good idea to keep an eye on the Clover reports. I'll commit this shortly. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} don't have any tests and it would be good to have some.
[jira] [Created] (LUCENE-4120) FST should use packed integer arrays
Adrien Grand created LUCENE-4120: Summary: FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
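To see why replacing an int[] with packed storage helps, here is a minimal fixed-bit packed array: n values of b bits each are stored in ceil(n*b/64) longs instead of n ints. This is a simplified stand-in written for illustration, not Lucene's actual PackedInts implementation.

```java
// Minimal packed integer array: values of bitsPerValue bits (1..63) are laid
// out back-to-back in long blocks; a value may straddle two adjacent longs.
public class PackedIntArray {
    private final long[] blocks;
    private final int bitsPerValue;

    public PackedIntArray(int size, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.blocks = new long[(size * bitsPerValue + 63) / 64];
    }

    public void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long mask = (1L << bitsPerValue) - 1;
        blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
        int spill = shift + bitsPerValue - 64;       // bits that overflow into the next long
        if (spill > 0) {
            long highMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~highMask)
                | ((value & mask) >>> (bitsPerValue - spill));
        }
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long mask = (1L << bitsPerValue) - 1;
        long value = (blocks[block] >>> shift) & mask;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= (blocks[block + 1] & ((1L << spill) - 1)) << (bitsPerValue - spill);
        }
        return value;
    }

    public static void main(String[] args) {
        PackedIntArray arr = new PackedIntArray(100, 20); // 100 values of 20 bits each
        for (int i = 0; i < 100; i++) arr.set(i, i * 7919L);
        for (int i = 0; i < 100; i++) {
            if (arr.get(i) != i * 7919L) throw new AssertionError("mismatch at " + i);
        }
        System.out.println("100 x 20-bit values stored in " + arr.blocks.length + " longs");
    }
}
```

For 20-bit values this uses 32 longs (256 bytes) where an int[100] needs 400 bytes, and the gap widens as values shrink relative to 32 bits.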
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291643#comment-13291643 ] Christian Moen commented on SOLR-3524: -- Ohtani-san, thanks for the patch! I've tried it on {{trunk}} and applying it fails because an {{InitializationException}} is thrown instead of a {{SolrException}}. I'll correct this shortly. We also need some tests here... Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option so that I can configure this behavior in the fieldtype definition in schema.xml.
[jira] [Created] (SOLR-3525) Per-field similarity should display used impl. in debug output broken
Markus Jelsma created SOLR-3525: --- Summary: Per-field similarity should display used impl. in debug output broken Key: SOLR-3525 URL: https://issues.apache.org/jira/browse/SOLR-3525 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0 Reporter: Markus Jelsma Priority: Minor Fix For: 4.0 When using per-field similarity, debugQuery should display the used similarity implementation for each match. Right now it's broken and displays empty brackets: 112.33515 = (MATCH) weight(content:blah in 273) [], result of:
[jira] [Created] (SOLR-3526) Remove classfile dependency on ZooKeeper from CoreContainer
Michael Froh created SOLR-3526: -- Summary: Remove classfile dependency on ZooKeeper from CoreContainer Key: SOLR-3526 URL: https://issues.apache.org/jira/browse/SOLR-3526 Project: Solr Issue Type: Wish Components: SolrCloud Affects Versions: 4.0 Reporter: Michael Froh We are using Solr as a library embedded within an existing application, and are currently developing toward using 4.0 when it is released. We are currently instantiating SolrCores with null CoreDescriptors (and hence no CoreContainer), since we don't need SolrCloud functionality (and do not want to depend on ZooKeeper). A couple of months ago, SearchHandler was modified to try to retrieve a ShardHandlerFactory from the CoreContainer. I was able to work around this by specifying a dummy ShardHandlerFactory in the config. Now UpdateRequestProcessorChain is inserting a DistributedUpdateProcessor into my chains, again triggering an NPE when trying to dereference the CoreDescriptor. I would happily place the SolrCores in CoreContainers, except that CoreContainer imports and references org.apache.zookeeper.KeeperException, which we do not have (and do not want) in our classpath. Therefore, I get a ClassNotFoundException when loading the CoreContainer class. Ideally (IMHO), ZkController should isolate the ZooKeeper dependency, and simply rethrow KeeperExceptions as org.apache.solr.common.cloud.ZooKeeperException (or some Solr-hosted checked exception). Then CoreContainer could remove the offending import/references.
[jira] [Commented] (SOLR-3525) Per-field similarity should display used impl. in debug output broken
[ https://issues.apache.org/jira/browse/SOLR-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291651#comment-13291651 ] Robert Muir commented on SOLR-3525: --- note: it's an impl detail of PerFieldSimilarityWrapper that it does different things for different fields. The reason you probably get blank brackets is because the weight uses "[" + similarity.getClass().getSimpleName() + "]". In the Solr case this is an anonymous class. If we want to keep this (I just added it for debugging; we could also just remove it), it would probably be better to instead print the class of what's scoring the documents: so you would see ExactBM25DocScorer or SloppyTFIDFDocScorer. Per-field similarity should display used impl. in debug output broken - Key: SOLR-3525 URL: https://issues.apache.org/jira/browse/SOLR-3525 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0 Reporter: Markus Jelsma Priority: Minor Fix For: 4.0 When using per-field similarity, debugQuery should display the used similarity implementation for each match. Right now it's broken and displays empty brackets: 112.33515 = (MATCH) weight(content:blah in 273) [], result of:
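The empty "[]" falls out of a Java corner case: Class.getSimpleName() returns the empty string for anonymous classes. A tiny demo of the behavior Robert describes (the Similarity interface here is a stand-in, not Lucene's actual class):

```java
// Demonstrates why the debug tag collapses to "[]" when the similarity is an
// anonymous class: getSimpleName() of an anonymous class is the empty string.
public class AnonymousClassName {
    public interface Similarity { float score(); }

    /** Mimics the tag the weight builds: "[" + simple class name + "]". */
    public static String explainTag(Similarity sim) {
        return "[" + sim.getClass().getSimpleName() + "]";
    }

    public static void main(String[] args) {
        Similarity anonymous = new Similarity() {
            public float score() { return 1f; }
        };
        // Anonymous classes have no simple name, so this prints "[]":
        System.out.println(explainTag(anonymous));
    }
}
```

Printing the name of the concrete scorer class instead, as suggested, sidesteps the problem because those scorers are ordinary named classes.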
[jira] [Assigned] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-4120: Assignee: Adrien Grand FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
[JENKINS] Solr-trunk - Build # 1879 - Failure
Build: https://builds.apache.org/job/Solr-trunk/1879/ 3 tests failed. REGRESSION: org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch Error Message: Timeout occured while waiting response from server at: http://localhost:13109/solr/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://localhost:13109/solr/collection1 at __randomizedtesting.SeedInfo.seed([CCF7D390CC98B64:8D29F3217B96EB58]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:405) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117) at org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:498) at org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:713) at org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:550) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:680) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at
Packaging dependencies for Ivy and license validation
Hi- I would like to get the OpenNLP project's packages in shape so that the Lucene build will accept them. What has to be done in a third-party package for the license validation to pass? Is there a Maven cheatsheet somewhere for the right pom.xml snippet? -- Lance Norskog goks...@gmail.com
[jira] [Commented] (SOLR-3178) Versioning - optimistic locking
[ https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291668#comment-13291668 ] Per Steffensen commented on SOLR-3178: -- {quote} Regarding error handling, I tracked down the original issue: SOLR-445 {quote} Yes, SOLR-445 is solved by my patch - the nice way. On certain kinds of errors (PartialError subclasses) during the handling of a particular document in a multi-document/batch update, the processing of subsequent documents will continue. The client will receive a response describing all errors (wrapped in PartialErrors) that happened during the processing of the entire update request (multi-document/batch). Please have a look at http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics#Multi_document_updates {quote} It's just a guess, but I think it unlikely any committers would feel comfortable tackling this big patch, or even have time to understand all of the different aspects. They may agree with some parts but disagree with other parts {quote} Of course that is up to you, but I believe Solr has a problem being a real Open Source project receiving contributions from many semi-related organisations around the world if you do not trust your test suite. Basically, when taking in a patch a committer does not need to understand everything down to every detail.
It should be enough (if you trust your test suite) to:
* Verify that all existing tests are still green - and haven't been hacked
* Verify that all new tests seem to be meaningful and cover the features described in the corresponding Jira (and in my case the associated Wiki page), indicating that the new features are useful and well tested
* Scan through the new code to see if it is breaking any design principles etc., and in general if it seems to be doing the right thing the right way
As long as a patch does not break any existing functionality, and seems to bring nice new functionality (you should be able to see that from the added tests), a patch cannot be that harmful - you can always refactor if you realize that you disagree with some parts. It all depends on trusting your test suite. Don't you agree, in principle at least? Regards, Per Steffensen Versioning - optimistic locking --- Key: SOLR-3178 URL: https://issues.apache.org/jira/browse/SOLR-3178 Project: Solr Issue Type: New Feature Components: update Affects Versions: 3.5 Environment: All Reporter: Per Steffensen Assignee: Per Steffensen Labels: RDBMS, insert, locking, nosql, optimistic, uniqueKey, update, versioning Fix For: 4.0 Attachments: SOLR-3173_3178_3382_3428_plus.patch, SOLR-3178.patch, SOLR_3173_3178_3382_plus.patch Original Estimate: 168h Remaining Estimate: 168h In order to increase the ability of Solr to be used as a NoSql database (lots of concurrent inserts, updates, deletes and queries in the entire lifetime of the index) instead of just a search index (first: everything indexed (in one thread), after: only queries), I would like Solr to support versioning to be used for optimistic locking. When my intent (see SOLR-3173) is to update an existing document, I will need to provide a version-number equal to the version number I got when I fetched the existing document for update plus one.
If this provided version-number does not correspond to the newest version-number of that document at the time of update plus one, I will get a VersionConflict error. If it does correspond, the document will be updated with the new one, so that the newest version-number of that document is NOW one higher than before the update. Correct but efficient concurrency handling. When my intent (see SOLR-3173) is to insert a new document, the version number provided will not be used - instead a version-number 0 will be used. According to SOLR-3173, insert will only succeed if a document with the same value on the uniqueKey-field does not already exist. In general, when talking about different versions of the same document, of course we need to be able to identify when a document is the same - that, per definition, is when the values of the uniqueKey-fields are equal. The functionality provided by this issue is only really meaningful when you run with updateLog activated. This issue might be solved more or less at the same time as SOLR-3173, and only one single SVN patch might be given to cover both issues.
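The version-check semantics described above can be sketched in a few lines: an update must carry the last-seen version plus one, and an insert only succeeds for a previously unseen uniqueKey. The store and method names below are invented for illustration; Solr's real implementation lives in the update processor chain and the updateLog.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposed optimistic-locking rules, keyed by uniqueKey.
public class OptimisticStore {
    private final Map<String, Long> versions = new HashMap<>();

    /** Insert: succeeds only if no document with this uniqueKey exists yet. */
    public synchronized boolean insert(String uniqueKey) {
        return versions.putIfAbsent(uniqueKey, 1L) == null;
    }

    /** Update: succeeds only if providedVersion == currentVersion + 1,
     *  i.e. the client saw the newest version when it fetched the document. */
    public synchronized boolean update(String uniqueKey, long providedVersion) {
        Long current = versions.get(uniqueKey);
        if (current == null || providedVersion != current + 1) {
            return false; // the proposal's VersionConflict error
        }
        versions.put(uniqueKey, providedVersion);
        return true;
    }

    public synchronized long version(String uniqueKey) {
        return versions.getOrDefault(uniqueKey, 0L);
    }

    public static void main(String[] args) {
        OptimisticStore store = new OptimisticStore();
        System.out.println(store.insert("doc1"));    // new document: succeeds
        System.out.println(store.insert("doc1"));    // duplicate key: fails
        System.out.println(store.update("doc1", 2)); // 1 + 1: succeeds
        System.out.println(store.update("doc1", 2)); // stale version: conflict
    }
}
```

Two clients that fetched the same version race to update; exactly one wins and the other gets a conflict it must resolve by re-fetching.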
Confusing method names to get the size of objects
Hi, Lucene and Solr have a few classes that expose the size of their instances, but with different method names. There are at least ramBytesUsed (packed ints), sizeInBytes (FST, RamDirectory) and memSize (Solr DocSets) that provide an estimation of the memory used in bytes. The confusing thing is that sizeInBytes is sometimes also used for on-disk sizes (SegmentInfo for example). I think it would improve readability to stick to only two method names, one for the in-memory size and one for the on-disk size. Or maybe these methods have different meanings that I am missing? What do you think? -- Adrien
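The suggestion amounts to putting the two size concepts behind two named contracts. A minimal sketch of what that could look like; the interface names and the 16-byte header estimate are invented for illustration, not what Lucene actually adopted.

```java
// Sketch: one interface per size concept, so a class can never accidentally
// reuse the same method name for heap bytes and on-disk bytes.
public class Accounting {
    interface MemoryAccountable { long ramBytesUsed(); }
    interface DiskAccountable { long diskBytesUsed(); }

    /** Example implementor: a long[] wrapper reporting its estimated heap cost. */
    static class LongBuffer implements MemoryAccountable {
        private final long[] data;
        LongBuffer(int size) { this.data = new long[size]; }
        public long ramBytesUsed() {
            return 16 /* rough object/array header estimate */ + 8L * data.length;
        }
    }

    public static void main(String[] args) {
        MemoryAccountable buf = new LongBuffer(1024);
        System.out.println(buf.ramBytesUsed() + " bytes on heap");
    }
}
```

With such contracts, callers summing cache sizes can accept MemoryAccountable and never pick up an on-disk figure by mistake.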
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291670#comment-13291670 ] Robert Muir commented on LUCENE-4115: - Another possibility (didn't investigate if it has options that would work for us) is the sync=true option for retrieve: http://ant.apache.org/ivy/history/trunk/use/retrieve.html Just at a glance there could be some problems: sha1/license/notice files, and solr/lib which is 'shared' across solrj and core dependencies. But maybe we could still utilize this... JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve. - Key: LUCENE-4115 URL: https://issues.apache.org/jira/browse/LUCENE-4115 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0, 5.0 Attachments: LUCENE-4111.patch I think we should add the following target deps: ant clean [depends on] clean-jars ant resolve [depends on] clean-jars ant eclipse [depends on] resolve, clean-jars ant idea [depends on] resolve, clean-jars This eliminates the need to remember about cleaning up stale jars, which users complain about (and I think they're right about it). The overhead will be minimal since resolve is only going to copy jars from the cache. Eclipse won't have a problem with updated JARs if they end up at the same location. If there are no objections I will fix this in a few hours.
[jira] [Commented] (LUCENE-4119) SegmentInfoFormat.getSegmentInfos{Reader,Writer} should be singular
[ https://issues.apache.org/jira/browse/LUCENE-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291671#comment-13291671 ] Robert Muir commented on LUCENE-4119: - Thanks for cleaning this up! SegmentInfoFormat.getSegmentInfos{Reader,Writer} should be singular --- Key: LUCENE-4119 URL: https://issues.apache.org/jira/browse/LUCENE-4119 Project: Lucene - Java Issue Type: Bug Components: core/codecs Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4119.patch Left-over from SegmentInfos refactoring. The name should be singular, we don't have SegmentInfosWriter/Reader anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-861) SOLRJ Client does not release connections 'nicely' by default
[ https://issues.apache.org/jira/browse/SOLR-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291678#comment-13291678 ] Sami Siren commented on SOLR-861: - bq. it's not clear to me if this has already been addressed by the new client in SOLR-2020 - can you please triage for 4.0? I have not done anything specific to address this issue. Since opening this issue a shutdown() method was added in HttpSolrServer that should take care of releasing the resources, if that's not working then there's a bug. SOLRJ Client does not release connections 'nicely' by default - Key: SOLR-861 URL: https://issues.apache.org/jira/browse/SOLR-861 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.3 Environment: linux Reporter: Ian Holsman Assignee: Sami Siren Fix For: 4.0 Attachments: SimpleClient.patch as-is the SolrJ Commons HttpServer uses the multi-threaded http connection manager. This manager seems to keep the connection alive for the client and does not close it when the object is dereferenced. When you keep on opening new CommonsHttpSolrServer instances it results in a socket that is stuck in the CLOSE_WAIT state. Eventually this will use up all your available file handles, causing your client to die a painful death. The solution I propose is that it uses a 'Simple' HttpConnectionManager which is set to not reuse connections if you don't specify a HttpClient. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291681#comment-13291681 ] Dawid Weiss commented on LUCENE-4115: - I've checked that -- sync on retrieve deletes everything from a folder (there is no exclusion pattern to be applied). Besides, it won't solve the locking problem on Windows (assuming something keeps a lock on a jar to be deleted, it'd fail anyway). A truly nice solution would be to revisit the issue so that classpaths are constructed against the ivy cache directly (they're always correct then) and copying is only used for packaging.
[jira] [Commented] (SOLR-2526) Grouping on multiple fields
[ https://issues.apache.org/jira/browse/SOLR-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291698#comment-13291698 ] Olav Frengstad commented on SOLR-2526: -- What's the status of this? As [LUCENE-3099] and [LUCENE-2883] are fixed, what would it take to fix this? I would gladly try implementing this; any pointers on where to start would be appreciated. Grouping on multiple fields --- Key: SOLR-2526 URL: https://issues.apache.org/jira/browse/SOLR-2526 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.0 Reporter: Arian Karbasi Priority: Minor Grouping on multiple fields and/or ranges should be an option, e.g. (X,Y) groupings.
[jira] [Comment Edited] (SOLR-3178) Versioning - optimistic locking
[ https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291668#comment-13291668 ] Per Steffensen edited comment on SOLR-3178 at 6/8/12 11:13 AM: --- {quote} Regarding error handling, I tracked down the original issue: SOLR-445 {quote} Yes, SOLR-445 is solved by my patch - the nice way. On certain kinds of errors (PartialError subclasses) during the handling of a particular document in a multidocument/batch update, the processing of subsequent documents will continue. The client will receive a response describing all errors (wrapped in PartialErrors) that happened during the processing of the entire update request (multidocument/batch). Please have a look at http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics#Multi_document_updates {quote} It's just a guess, but I think it unlikely any committers would feel comfortable tackling this big patch, or even have time to understand all of the different aspects. They may agree with some parts but disagree with other parts {quote} Of course that is up to you, but I believe Solr has a problem being a real open-source project receiving contributions from many semi-related organisations around the world if you do not trust your test suite. Basically, when taking in a patch, a committer does not need to understand everything down to every detail. It should be enough (if you trust your test suite) to
* Verify that all existing tests are still green - and haven't been hacked
* Verify that all new tests seem to be meaningful and cover the features described in the corresponding Jira (and in my case the associated Wiki page), indicating that the new features are useful and well tested (in order to be able to trust that the test suite will reveal if future commits ruin this new feature)
* Scan through the new code to see if it breaks any design principles etc., and in general whether it seems to be doing the right thing the right way
As long as a patch does not break any existing functionality, and seems to bring nice new functionality (you should be able to see that from the added tests), a patch cannot be that harmful - you can always refactor if you realize that you disagree with some parts. It all depends on trusting your test suite. Don't you agree, in principle at least? Regards, Per Steffensen Versioning - optimistic locking --- Key: SOLR-3178 URL: https://issues.apache.org/jira/browse/SOLR-3178 Project: Solr Issue Type: New Feature Components: update Affects Versions: 3.5 Environment: All Reporter: Per Steffensen Assignee: Per Steffensen Labels:
[jira] [Commented] (SOLR-2352) TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param.
[ https://issues.apache.org/jira/browse/SOLR-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291726#comment-13291726 ] Robert Muir commented on SOLR-2352: --- {quote} ...the last item seemingly a relic from when the code used to use the TermVectorMapper interface to walk the vectors of the various fields, and used different code paths depending on whether all fields were requested, or just specific ones. {quote} I didn't look at the patch, or the issue, but maybe in the case where only specific fields are returned you could just wrap the Fields returned by getTermVectors with a FilteredFields so you only have one codepath: http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene/index/FieldFilterAtomicReader.java TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param. -- Key: SOLR-2352 URL: https://issues.apache.org/jira/browse/SOLR-2352 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 3.1 Environment: Ubuntu 10.04/Arch solr 3.x branch r1058326 Reporter: Jed Glazner Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-2352.patch When searching using the term vector component and setting fl=*,score the result is an HTTP 400 error 'undefined field: *'. If you disable the TVC the search works properly. Example bad request... {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*{code} 3.1 stack trace: {noformat} SEVERE: org.apache.solr.common.SolrException: undefined field: * at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142) ... {noformat} The workaround is to explicitly use the tv.fl param when using pseudo-fields in the fl... {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*&tv.fl=includes{code} -- This message is automatically generated by JIRA.
Re: Confusing method names to get the size of objects
+1 to standardize on two names. It is confusing now! Mike McCandless http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291742#comment-13291742 ] Michael McCandless commented on LUCENE-4087: I think this is a good baby step for 4.0. But I think it's important the javadocs make it clear that if you change up the DV type for a given field, the behavior is undefined and we are free to improve it in the future. Ideally I think apps should get clear exceptions on attempting to index a doc with an incompatible change to anything that is our effective schema (omitNorms, indexOptions, DV types, etc.). For example, if a given field already omits norms and you try to add a doc with that field not omitting norms, you should get a clear exception (it can only be an app bug, because on merge the norms will silently go away). But let's open a separate issue for that... Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like redefining an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent This way we could eliminate all these traps and just give an exception instead. -- This message is automatically generated by JIRA.
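The fail-fast behavior discussed in LUCENE-4087 can be illustrated with a small self-contained sketch: record each field's options the first time the field is seen, and throw a clear exception when a later document disagrees, instead of silently losing data on merge. The class and method names here are hypothetical, not Lucene's actual FieldInfos code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of "throw an exception on conflicting field metadata".
// Hypothetical names; not Lucene's implementation.
class FieldSchemaTracker {
    static final class Options {
        final boolean omitNorms;
        final String docValuesType; // null = no doc values
        Options(boolean omitNorms, String docValuesType) {
            this.omitNorms = omitNorms;
            this.docValuesType = docValuesType;
        }
        boolean conflictsWith(Options other) {
            return omitNorms != other.omitNorms
                || (docValuesType == null ? other.docValuesType != null
                                          : !docValuesType.equals(other.docValuesType));
        }
    }

    private final Map<String, Options> seen = new HashMap<>();

    /** Called once per indexed field occurrence; fails fast on an incompatible change. */
    void check(String field, Options options) {
        Options prior = seen.putIfAbsent(field, options);
        if (prior != null && prior.conflictsWith(options)) {
            throw new IllegalArgumentException("inconsistent options for field: " + field);
        }
    }
}
```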
[JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 18 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/18/ 1 tests failed. REGRESSION: org.apache.solr.spelling.suggest.SuggesterTSTTest.testReload Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([476D2EB48E0F2244:809D56B7444CDA56]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:459) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) at org.apache.solr.spelling.suggest.SuggesterTest.testReload(SuggesterTest.java:91) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='ac']/int[@name='numFound'][.='2'] xml response was:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><lst name="spellcheck"><lst name="suggestions"/></lst>
</response>
request was: q=ac&spellcheck.count=2&qt=/suggest_tst&spellcheck.onlyMorePopular=true at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:452) ...
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291747#comment-13291747 ] Michael McCandless commented on LUCENE-4120: +1 FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
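To illustrate why replacing an int[]/long[] with a packed integer array saves memory, here is a toy fixed-bit-width packer that stores n-bit values back to back inside a long[]. This is a simplified sketch of the idea only; Lucene's PackedInts is considerably more elaborate (multiple storage strategies, serialization support, etc.):

```java
// Toy fixed-bit-width packed array: valueCount values of bitsPerValue bits
// each, stored contiguously in a long[] (so 20-bit values cost 20 bits, not 64).
// Illustration only - not Lucene's PackedInts implementation.
class PackedLongArray {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    PackedLongArray(int valueCount, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        // round up to the number of 64-bit blocks needed
        this.blocks = new long[(valueCount * bitsPerValue + 63) / 64];
    }

    void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);  // which long
        int shift = (int) (bitPos & 63);   // offset within that long
        blocks[block] |= (value & mask) << shift;
        int spill = shift + bitsPerValue - 64; // bits overflowing into the next block
        if (spill > 0) {
            blocks[block + 1] |= (value & mask) >>> (bitsPerValue - spill);
        }
    }

    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return value & mask;
    }
}
```

For 20-bit values this needs roughly a third of the memory of a long[] per value, which is the kind of saving LUCENE-4120 targets for FST.nodeAddress and friends.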
[jira] [Updated] (LUCENE-4101) Remove XXXField.TYPE_STORED
[ https://issues.apache.org/jira/browse/LUCENE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4101: --- Attachment: LUCENE-4101.patch New patch, adding StoredStringField and StoredTextField (instead of StringField.TYPE_STORED / TextField.TYPE_STORED). I think it's ready. Remove XXXField.TYPE_STORED --- Key: LUCENE-4101 URL: https://issues.apache.org/jira/browse/LUCENE-4101 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 4.0, 5.0 Attachments: LUCENE-4101.patch, LUCENE-4101.patch Spinoff from LUCENE-3312. For 4.0 I think we should simplify the sugar field APIs by requiring that you add a StoredField if you want to store the field. Expert users can still make a custom FieldType that both stores and indexes... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291787#comment-13291787 ] Jun Ohtani commented on SOLR-3524: -- Hi Christian, Sorry, I created the patch based on ver. 3.6.0. Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option so I can configure this behavior in the fieldtype definition in schema.xml.
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291792#comment-13291792 ] Christian Moen commented on SOLR-3524: -- No trouble. I'll provide a new patch shortly for {{trunk}} and {{branch_4x}} with a test as well.
[jira] [Commented] (LUCENE-4101) Remove XXXField.TYPE_STORED
[ https://issues.apache.org/jira/browse/LUCENE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291797#comment-13291797 ] Robert Muir commented on LUCENE-4101: - Thinking about this issue a bit, I think it's bad if you have to use the Field/FieldType API just to store a field. So I agree this should be fixed. Separately, we should also make it easy to have a stored-only (not indexed) field. I felt like both of these things were easy with the old document API. {quote} A third option is to add boolean isStored to each of XXXFields? So, it's not stored by default, but then you can do: {quote} I don't like that we are making our APIs hard to use just because Java doesn't have named parameter passing or something. I think the old API was great here: it had an enum for Stored so it was totally obvious from your code if it was stored or not, or indexed or not. I think if we don't like booleans for this silly reason, then we should just use an enum like the old API! Extra Stored* classes for each field are just overwhelming. {quote} I can't see a situation where having to add the same field twice with different flags is good from a usability standpoint. {quote} We can never force that. People who are experts or committers are free to add the field twice if they want to (nothing stops them), but I don't want to see this forced in our APIs; it's too difficult. -- This message is automatically generated by JIRA.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291803#comment-13291803 ] Jason Rutherglen commented on SOLR-2242: Terrance, can you post a patch to the Jira? It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue... Get distinct count of names for a facet field - Key: SOLR-2242 URL: https://issues.apache.org/jira/browse/SOLR-2242 Project: Solr Issue Type: New Feature Components: Response Writers Affects Versions: 4.0 Reporter: Bill Bell Priority: Minor Fix For: 4.0 Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch When returning facet.field=name of field you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called namedistinct. Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price This currently only works on facet.field.
{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}
Several people use this to get the group.field count (the # of groups). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
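The semantics of numFacetTerms can be sketched independently of Solr: it is simply the number of distinct facet terms whose count passes facet.mincount. A minimal illustration in plain Java (the map below is a hypothetical term-to-count map, not real Solr API calls):

```java
import java.util.Map;

public class NumFacetTerms {
    // Sketch: the patch's numFacetTerms is just the number of distinct
    // facet terms whose count passes facet.mincount.
    public static long numFacetTerms(Map<String, Integer> facetCounts, int mincount) {
        return facetCounts.values().stream().filter(c -> c >= mincount).count();
    }

    public static void main(String[] args) {
        // Hypothetical price facet: three docs at 0.0, one each at 11.5 and 19.95.
        Map<String, Integer> price = Map.of("0.0", 3, "11.5", 1, "19.95", 1);
        System.out.println(numFacetTerms(price, 1)); // prints 3
    }
}
```

With facet.limit=-1 and facet.mincount=1, as the description recommends, this counts every distinct value that occurs at least once.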
[jira] [Updated] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen updated SOLR-3524: - Attachment: SOLR-3524.patch Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option to configure this behavior in the fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291807#comment-13291807 ] Christian Moen commented on SOLR-3524: -- New patch with tests and documentation changes attached. Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option to configure this behavior in the fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
Adrien Grand created LUCENE-4121: Summary: Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291819#comment-13291819 ] Adrien Grand commented on LUCENE-4121: -- I am currently thinking of {{memSize}} for the in-memory size and {{diskSize}} for the on-disk size. Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
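The naming question can be made concrete with a small interface sketch. The names below ({{memSize}}/{{diskSize}}) are only the candidates floated in this comment, not a committed Lucene API:

```java
// Hypothetical unified size-accounting interface; the method names follow the
// memSize/diskSize proposal in the comment above, not any released Lucene API.
interface SizeEstimator {
    long memSize();   // estimated in-memory footprint, in bytes
    long diskSize();  // estimated on-disk footprint, in bytes
}

// Trivial implementation used only to demonstrate the contract.
class FixedSize implements SizeEstimator {
    private final long mem, disk;
    FixedSize(long mem, long disk) { this.mem = mem; this.disk = disk; }
    public long memSize()  { return mem; }
    public long diskSize() { return disk; }
}
```

The point of the standardization is that every size-reporting class would implement the same two methods instead of the current mix of ramBytesUsed/sizeInBytes/memSize.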
Re: Confusing method names to get the size of objects
On Fri, Jun 8, 2012 at 2:12 PM, Michael McCandless luc...@mikemccandless.com wrote: +1 to standardize on two names. It is confusing now! Thanks Mike for your feedback, I created LUCENE-4121 [1] to address this issue. [1] https://issues.apache.org/jira/browse/LUCENE-4121 -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4087: Attachment: LUCENE-4087.patch here is a new patch adding documentation to DocValues.java and a reference to all DV Fields. I added a bunch of tests, including a verification test that changing norms types actually fails. I extended the type promoter a little to actually promote INT_16 / INT_8 to FLOAT_32 if needed, as well as INT_32 to FLOAT_64. I think it's ready Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like changing an already defined DocValues type or un-omitting norms. We can approach this similar to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent this way we could eliminate all these traps and just give an exception instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
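The promotion rule Simon describes (INT_8/INT_16 promote to FLOAT_32, INT_32 to FLOAT_64) can be sketched as a small lookup. This illustrates the rule as stated in the comment, not the actual TypePromoter code; the INT_64 case is my assumption, since the comment doesn't cover it:

```java
// Hypothetical sketch of the int-to-float docvalues promotion rule.
enum DVType { INT_8, INT_16, INT_32, INT_64, FLOAT_32, FLOAT_64 }

class Promote {
    // Narrow ints fit losslessly in a float's 24-bit mantissa;
    // INT_32 needs a double (53-bit mantissa) to avoid losing precision.
    static DVType promoteIntToFloat(DVType type) {
        switch (type) {
            case INT_8:
            case INT_16:
                return DVType.FLOAT_32;
            case INT_32:
            case INT_64: // assumption: not stated in the comment
                return DVType.FLOAT_64;
            default:
                return type; // already a float type
        }
    }
}
```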
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291831#comment-13291831 ] Simon Willnauer commented on LUCENE-4121: - Adrien I think we can do that for 4.0 too though Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4121: - Fix Version/s: 4.0 Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0, 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291834#comment-13291834 ] Adrien Grand commented on LUCENE-4121: -- Updated fix version. Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0, 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291836#comment-13291836 ] Kazuaki Hiraga commented on SOLR-3524: -- Thank you guys! Christian, Since some documents have keywords that consists of alphabet and punctuation such as c++, c# and so on, We want to match those keywords with the keyword that unchanged form. Of course, we will discard punctuation in many cases but some cases, especially short text, we want to preserve punctuation. Therefore, I want to have an option that I can control this behaviour. Ohtani-san, thank you for your early reply and patch! Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer, Kuromoji doesn't provide configuration option to preserve punctuation in Japanese text, although It has a parameter to change this behavior. JapaneseTokenizerFactory always set third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option I can configure this behavior by fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen updated SOLR-3520: - Attachment: SOLR-3520.patch Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291837#comment-13291837 ] Christian Moen commented on SOLR-3520: -- Updated patch with a case that also deals with short katakana terms that shouldn't be stemmed by default. Will commit shortly. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291839#comment-13291839 ] Christian Moen commented on SOLR-3520: -- Committed r1348134 on {{trunk}}. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2357) Reduce transient RAM usage while merging by using packed ints array for docID re-mapping
[ https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291840#comment-13291840 ] Adrien Grand commented on LUCENE-2357: -- I am going to commit this change next week unless someone objects. Reduce transient RAM usage while merging by using packed ints array for docID re-mapping Key: LUCENE-2357 URL: https://issues.apache.org/jira/browse/LUCENE-2357 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 4.0, 5.0 Attachments: LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch We allocate this int[] to remap docIDs due to compaction of deleted ones. This uses a lot of RAM for large segment merges, and can fail to allocate due to fragmentation on 32-bit JREs. Now that we have packed ints, a simple fix would be to use a packed int array... and maybe instead of storing the abs docID in the mapping, we could store the number of del docs seen so far (so the remap would do a lookup then a subtract). This may add some CPU cost to merging but should bring down transient RAM usage quite a bit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
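The "lookup then subtract" remap described in the issue can be sketched with a plain int[]; the actual patch's point is to back this array with packed ints (since the stored values are small del-counts rather than full docIDs, they need far fewer bits per entry). Class and field names here are illustrative:

```java
public class DocMap {
    // delCountBefore[d] = number of deleted docs with old docID <= d.
    // In the real patch this would be a packed ints array, not an int[].
    private final int[] delCountBefore;

    public DocMap(boolean[] deleted) {
        delCountBefore = new int[deleted.length];
        int del = 0;
        for (int d = 0; d < deleted.length; d++) {
            if (deleted[d]) del++;
            delCountBefore[d] = del;
        }
    }

    // Remap a surviving old docID to its compacted new docID:
    // one lookup, one subtract.
    public int newDocID(int oldDocID) {
        return oldDocID - delCountBefore[oldDocID];
    }
}
```

For example, with deletions at old docIDs 1 and 3, the survivors 0, 2, 4 remap to 0, 1, 2.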
[jira] [Created] (LUCENE-4122) Replace Payload with BytesRef
Andrzej Bialecki created LUCENE-4122: - Summary: Replace Payload with BytesRef Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 The Payload class offers very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoiding this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated LUCENE-4122: -- Attachment: LUCENE-4122.patch Patch for trunk. All tests pass. Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers a very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoid this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
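The redundancy the issue targets is easy to see: both Payload and BytesRef are, at heart, a byte[] plus offset/length slice, so wrapping one in the other on every index/retrieve round-trip buys nothing. A minimal slice type in that spirit (an illustration, not Lucene's actual BytesRef):

```java
// Minimal byte-slice in the spirit of BytesRef: a byte[] plus offset and length.
// Illustrates why a second Payload wrapper around the same shape is redundant.
public class ByteSlice {
    public final byte[] bytes;
    public final int offset, length;

    public ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Read the i-th byte of the slice (0 <= i < length).
    public byte byteAt(int i) {
        return bytes[offset + i];
    }
}
```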
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291853#comment-13291853 ] Christian Moen commented on SOLR-3520: -- Committed r1348148 on {{branch_4x}} Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291853#comment-13291853 ] Christian Moen edited comment on SOLR-3520 at 6/8/12 4:42 PM: -- Committed r1348148 on {{branch_4x}}. was (Author: cm): Committed r1348148 on {{branch_4x}} Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen reassigned SOLR-3520: Assignee: Christian Moen Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Assignee: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen resolved SOLR-3520. -- Resolution: Fixed Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Assignee: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4123) Add CachingRAMDirectory
Michael McCandless created LUCENE-4123: -- Summary: Add CachingRAMDirectory Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. I tested with a 1M Wikipedia English index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup:
{noformat}
Task                QPS base   StdDev base   QPS cached   StdDev cached    Pct diff
Respell               197.00          7.27       203.19            8.17   -4% - 11%
PKLookup              121.12          2.80       125.46            3.20    -1% - 8%
Fuzzy2                 66.62          2.62        69.91            2.85   -3% - 13%
Fuzzy1                206.20          6.47       222.21            6.52    1% - 14%
TermGroup100K         160.14          6.62       175.71            3.79    3% - 16%
Phrase                 34.85          0.40        38.75            0.61    8% - 14%
TermBGroup100K        363.75         15.74       406.98           13.23    3% - 20%
SpanNear               53.08          1.11        59.53            2.94    4% - 20%
TermBGroup100K1P      222.53          9.78       252.86            5.96    6% - 21%
SloppyPhrase           70.36          2.05        79.95            4.48    4% - 23%
Wildcard              238.10          4.29       272.78            4.97   10% - 18%
OrHighMed             123.49          4.85       149.32            4.66   12% - 29%
Prefix3               288.46          8.10       350.40            5.38   16% - 26%
OrHighHigh             76.46          3.27        93.13            2.96   13% - 31%
IntNRQ                 92.25          2.12       113.47            5.74   14% - 32%
Term                  757.12         39.03       958.62           22.68   17% - 36%
AndHighHigh           103.03          4.48       133.89            3.76   21% - 39%
AndHighMed            376.36         16.58       493.99           10.00   23% - 40%
{noformat}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
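The core trick described above (on open, read the whole file into one byte[], then serve all reads from that single array) can be sketched with plain java.io. The real patch wraps Lucene's Directory/IndexInput; that plumbing is omitted here and the class name is illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the per-file caching idea behind CachingRAMDirectory:
// one up-front read into a single byte[], then array reads only.
public class CachedFile {
    private final byte[] data; // whole file in one array: GC-friendlier than many small blocks
    private int pos;

    public CachedFile(Path path) throws IOException {
        this.data = Files.readAllBytes(path); // the single up-front read
    }

    public byte readByte() { return data[pos++]; }

    public void seek(int newPos) { pos = newPos; }

    public int length() { return data.length; }
}
```

A RAMDirectory-style store chops files into many small buffers; holding each file as one array is what makes this variant cheaper for the garbage collector.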
[jira] [Updated] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4123: --- Attachment: LUCENE-4123.patch Add CachingRAMDirectory --- Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4123.patch The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. 
I tested with 1M Wikipedia english index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup: {noformat} TaskQPS base StdDev base QPS cachedStdDev cached Pct diff Respell 197.007.27 203.198.17 -4% - 11% PKLookup 121.122.80 125.463.20 -1% - 8% Fuzzy2 66.622.62 69.912.85 -3% - 13% Fuzzy1 206.206.47 222.216.521% - 14% TermGroup100K 160.146.62 175.713.793% - 16% Phrase 34.850.40 38.750.618% - 14% TermBGroup100K 363.75 15.74 406.98 13.233% - 20% SpanNear 53.081.11 59.532.944% - 20% TermBGroup100K1P 222.539.78 252.865.966% - 21% SloppyPhrase 70.362.05 79.954.484% - 23% Wildcard 238.104.29 272.784.97 10% - 18% OrHighMed 123.494.85 149.324.66 12% - 29% Prefix3 288.468.10 350.405.38 16% - 26% OrHighHigh 76.463.27 93.132.96 13% - 31% IntNRQ 92.252.12 113.475.74 14% - 32% Term 757.12 39.03 958.62 22.68 17% - 36% AndHighHigh 103.034.48 133.893.76 21% - 39% AndHighMed 376.36 16.58 493.99 10.00 23% - 40% {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291857#comment-13291857 ] Simon Willnauer commented on LUCENE-4123: - bq.I tested with 1M Wikipedia english index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup: #fail! :) Add CachingRAMDirectory --- Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4123.patch The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. 
in case you run into Author: [name] not defined in .git/authors.txt file
In case you are using git-svn and see the following error after running git svn rebase on a branch: "Author: [name] not defined in .git/authors.txt file", cd into the .git directory and add the user to the authors.txt file, where [name] is the Apache user name: [name] = Full Name <[name]@apache.org>. Save the file, then run git svn rebase again.
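For example, the entry might look like this ("jdoe" and "John Doe" are placeholders for the Apache user name and full name from the error message; the authors file maps one Subversion committer name to a Git author per line):

{noformat}
jdoe = John Doe <jdoe@apache.org>
{noformat}

If git-svn is not already pointed at the file, git config svn.authorsfile .git/authors.txt (or the --authors-file option) tells it where to look.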
[jira] [Created] (SOLR-3527) Optimize ignores maxSegments in distributed environment
Andy Laird created SOLR-3527: Summary: Optimize ignores maxSegments in distributed environment Key: SOLR-3527 URL: https://issues.apache.org/jira/browse/SOLR-3527 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 4.0 Reporter: Andy Laird Send the following command to a Solr server with many segments in a multi-shard, multi-server environment: curl "http://localhost:8080/solr/update?optimize=true&waitFlush=true&maxSegments=6&distrib=false" The local server will end up with the number of segments at 6, as requested, but all other shards in the index will be optimized with maxSegments=1, which takes far longer to complete. All shards should be optimized to the requested value of 6. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291860#comment-13291860 ] Robert Muir commented on LUCENE-4123: - I don't think it buys anything to duplicate the readVInt/readVLong code here; it should be compiled to the same code. E.g. MMapDirectory doesn't do this.
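Robert's point is that the generic VInt loop, written once against readByte(), should JIT to the same machine code as a hand-copied version. For reference, a self-contained sketch of the standard VInt encoding (7 bits per byte, high bit as continuation flag), written against a plain byte[] rather than Lucene's DataInput API; the names here are illustrative:

```java
import java.util.Arrays;

public class VIntDemo {
    // Decode a variable-length int: 7 payload bits per byte,
    // high bit set on every byte except the last.
    static int readVInt(byte[] buf, int[] pos) {
        byte b = buf[pos[0]++];
        int value = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {  // continuation bit set -> more bytes follow
            b = buf[pos[0]++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        }
        return value;
    }

    // Encode, for round-trip testing.
    static byte[] writeVInt(int value) {
        byte[] tmp = new byte[5];
        int i = 0;
        while ((value & ~0x7F) != 0) {
            tmp[i++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        tmp[i++] = (byte) value;
        return Arrays.copyOf(tmp, i);
    }

    public static void main(String[] args) {
        byte[] encoded = writeVInt(16385);
        if (readVInt(encoded, new int[]{0}) != 16385) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Whether the byte source is a heap array, a mapped buffer, or a cached copy, only readByte differs; the decoding loop itself is identical, which is why duplicating it buys nothing.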
[jira] [Commented] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291862#comment-13291862 ] Robert Muir commented on LUCENE-4122: - +1 Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers a very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoid this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3527) Optimize ignores maxSegments in distributed environment
[ https://issues.apache.org/jira/browse/SOLR-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291864#comment-13291864 ] Andy Laird commented on SOLR-3527: -- One additional data point: distrib=false makes no difference with the current behavior. It seems that with distrib=false only the local server should be optimized (to the requested value), and with distrib=true (the default) all shards in the index should be optimized to N max segments.
[jira] [Commented] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291865#comment-13291865 ] Michael McCandless commented on LUCENE-4122: +1
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: BloomFilterPostings40.patch) Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields e.g. primary keys on big indexes with many segments but also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no 3.6 API changes currently - to play just add a field with _blm on the end of the name to invoke special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to APIs to configure the service properly. Also, a patch for Lucene4.0 codebase introducing a new PostingsFormat -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: BloomFilterPostingsBranch4x.patch Updated as follows:
* Extracted the Bloom filter functionality into a new oal.util.FuzzySet class. The name is changed because Bloom filtering is one application of a FuzzySet, fuzzy count-distincts being another.
* BloomFilterPostingsFormat now takes a factory that can tailor the choice of Bloom filter per field (bitset size/saturation settings and choice of hash algorithm). A default factory implementation is provided.
* All unit tests pass now that I have a test PostingsFormat class that uses very small bitsets, where before the many-field unit tests would cause OOM.
Will follow up with benchmarks when I have more time to run and document them. Initial results from my large-scale tests on growing indexes show a nice flat line in the face of a growing index, whereas a non-Bloomed index saw-tooths upwards as segments grow/merge.
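The fast-fail behavior described in this issue is the classic Bloom filter property: a bitset plus a few hash functions can answer "definitely absent" or "maybe present", so a primary-key lookup for a term that was never indexed can skip the terms-dictionary and disk access entirely. A toy illustration (the hash choices, sizes, and names below are arbitrary, not those of the FuzzySet in the patch):

```java
import java.util.BitSet;

public class BloomSketch {
    private final BitSet bits;
    private final int size;

    BloomSketch(int size) { this.size = size; this.bits = new BitSet(size); }

    // Two cheap hash positions per key; real implementations choose the
    // number of hashes and the bitset size from a target saturation.
    private int h1(String key) { return Math.floorMod(key.hashCode(), size); }
    private int h2(String key) { return Math.floorMod(key.hashCode() * 31 + key.length(), size); }

    void add(String key) { bits.set(h1(key)); bits.set(h2(key)); }

    // false -> key was definitely never added (fast-fail: skip the disk seek)
    // true  -> key may have been added (must consult the real terms dictionary)
    boolean mightContain(String key) { return bits.get(h1(key)) && bits.get(h2(key)); }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1 << 16);
        filter.add("doc-42");
        System.out.println(filter.mightContain("doc-42"));  // always true for an added key
    }
}
```

False positives are possible but bounded by the bitset saturation, which is why a per-field factory controlling size and saturation (as in the update above) matters: low-frequency fields like primary keys get the biggest win.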
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: PrimaryKey40PerformanceTestSrc.zip)
Spurious JFlex warning from build
I happened to notice the following message in an ant test today: compile-test: [echo] Building analyzers-common... jflex-uptodate-check: jflex-notice: [echo] One or more of the JFlex .jflex files is newer than its corresponding [echo] .java file. Run the jflex target to regenerate the artifacts. It is a spurious warning/directive because HTMLCharacterEntities.jflex doesn’t have a matching .java file since it is a “macro” referenced by HTMLStripCharFilter.jflex. I am wondering if it makes sense to rename HTMLCharacterEntities.jflex to HTMLCharacterEntities.jflex-macro (like HTMLStripCharFilter.SUPPLEMENTARY.jflex-macro) to avoid the misleading build warning/directive. -- Jack Krupansky
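One way the check could work after the rename, sketched here as a generic Ant <uptodate> pattern rather than the actual Lucene build file: with a glob mapper from *.jflex to *.java, a macro-only file renamed to .jflex-macro no longer matches the includes pattern and so can never trigger the notice.

{noformat}
<!-- Sketch only: compare each .jflex file against the .java file it generates.
     Macro files renamed to .jflex-macro are skipped by the includes pattern. -->
<uptodate property="jflex.files.uptodate">
  <srcfiles dir="src/java" includes="**/*.jflex"/>
  <mapper type="glob" from="*.jflex" to="*.java"/>
</uptodate>
{noformat}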
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: PrimaryKeyPerfTest40.java Benchmark tool adapted from Mike's original Pulsing codec benchmark. Now includes Bloom postings example.
[jira] [Created] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
Robert Muir created LUCENE-4124: --- Summary: factor ByteBufferIndexInput out of MMapDirectory Key: LUCENE-4124 URL: https://issues.apache.org/jira/browse/LUCENE-4124 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler I think we should factor a ByteBufferIndexInput out of MMapDir, leaving only the mmap/unmapping in mmapdir. Its a cleaner separation and would allow it to be used for other purposes (e.g. direct or array-backed buffers) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
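The point of the proposed factoring is that reads expressed against java.nio.ByteBuffer behave identically whether the buffer is memory-mapped, direct, or heap/array-backed; only the buffer's origin differs. A minimal illustration with a heap buffer (class and method names here are hypothetical, not the patch's API):

```java
import java.nio.ByteBuffer;

public class BufferInputSketch {
    private final ByteBuffer buffer;

    // Any ByteBuffer works: ByteBuffer.wrap(byte[]), allocateDirect(),
    // or a MappedByteBuffer from FileChannel.map() as MMapDirectory uses.
    BufferInputSketch(ByteBuffer buffer) { this.buffer = buffer; }

    byte readByte() { return buffer.get(); }

    int readInt() { return buffer.getInt(); }  // big-endian by default

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[]{0, 0, 0, 42});
        BufferInputSketch in = new BufferInputSketch(heap);
        System.out.println(in.readInt());
    }
}
```

Keeping only the map/unmap logic in MMapDirectory and moving the read logic behind this abstraction is what lets the same IndexInput serve direct or array-backed buffers.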
[jira] [Updated] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4124: Attachment: LUCENE-4124.patch
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291884#comment-13291884 ] Michael McCandless commented on LUCENE-4123: Results for 5M doc index: {noformat}
Task               QPS base  StdDev base  QPS cached  StdDev cached   Pct diff
Respell              104.06         7.63      108.59           7.55  -9% - 20%
TermGroup1M           57.94         1.59       60.70           0.30   1% -  8%
TermBGroup1M         103.28         2.54      108.51           2.54   0% - 10%
Fuzzy2                43.07         2.96       45.32           3.06  -8% - 20%
Fuzzy1                72.64         4.73       76.92           4.38  -6% - 19%
TermBGroup1M1P        90.14         3.03       95.95           3.81  -1% - 14%
IntNRQ                16.01         0.95       17.17           0.33   0% - 16%
PKLookup              86.21         2.51       92.55           2.59   1% - 13%
Wildcard              65.51         3.13       71.00           1.45   1% - 16%
OrHighMed             21.64         1.83       23.56           1.24  -4% - 25%
Prefix3              105.33         4.94      114.75           2.46   1% - 16%
OrHighHigh            17.39         1.45       18.97           0.95  -4% - 24%
AndHighHigh           30.05         1.14       33.42           0.88   4% - 18%
Term                 243.13         9.03      273.92           8.26   5% - 20%
SloppyPhrase          15.80         0.28       17.84           0.78   6% - 19%
SpanNear              10.52         0.14       11.97           0.25   9% - 17%
AndHighMed           117.60         3.54      135.91           2.49  10% - 21%
Phrase                20.15         0.78       24.22           0.26  14% - 26%
{noformat}
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291886#comment-13291886 ] Michael McCandless commented on LUCENE-4123: bq. I dont think it buys anything to code dup the readVint/vlong here. it should be compiled to the same code. e.g. mmapdir doesnt do this.
I think you're right! Here are the results w/ the code dup removed (same static seed as the previous 5M doc results): {noformat}
Task               QPS base  StdDev base  QPS cached  StdDev cached   Pct diff
IntNRQ                16.36         0.86       16.92           0.75  -6% - 14%
TermBGroup1M1P        91.71         3.03       95.07           3.94  -3% - 11%
TermGroup1M           58.14         1.00       60.38           1.53   0% -  8%
TermBGroup1M         103.11         1.76      108.14           2.63   0% -  9%
Prefix3              108.83         0.97      115.05           2.89   2% -  9%
Wildcard              67.27         0.72       71.22           1.71   2% -  9%
Respell              102.29         7.78      109.08           7.22  -7% - 23%
Fuzzy2                42.46         2.95       45.51           3.31  -7% - 23%
Fuzzy1                72.46         3.55       77.96           4.51  -3% - 19%
Term                 247.45        17.73      268.17          12.28  -3% - 22%
OrHighMed             22.38         1.19       24.47           1.64  -3% - 23%
OrHighHigh            18.01         0.92       19.71           1.20  -2% - 22%
AndHighHigh           30.79         0.35       33.80           0.37   7% - 12%
PKLookup              84.71         2.40       93.95           2.32   5% - 16%
SpanNear              10.54         0.13       12.02           0.13  11% - 16%
AndHighMed           119.18         1.05      136.64           1.80  12% - 17%
SloppyPhrase          15.50         0.15       18.26           0.30  14% - 20%
Phrase                20.64         0.12       24.94           0.48  17% - 23%
{noformat} So I'll remove the code dup.
RE: Spurious JFlex warning from build
+1 From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, June 08, 2012 1:48 PM To: Lucene/Solr Dev Subject: Spurious JFlex warning from build
[jira] [Commented] (SOLR-3527) Optimize ignores maxSegments in distributed environment
[ https://issues.apache.org/jira/browse/SOLR-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291890#comment-13291890 ] Mark Miller commented on SOLR-3527: --- Sounds right Andy - thanks for the report.
[jira] [Commented] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291899#comment-13291899 ] Uwe Schindler commented on LUCENE-4124: --- Thanks for assigning me. The patch looks good as a first step. The hashCode and equals methods in the (now abstract) base class must be final; this was not done before, as the class itself was final.
lucene-highlighter 3.6 No highlight for 3 letter words
Hi, How can I highlight 3-letter words? Everything is working except for this; what setting do I need to change? I'm using lucene-highlighter-3.6.0.jar and lucene-core-3.6.0.jar.
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
QueryParser parser = new QueryParser(Version.LUCENE_30, , analyzer);
parser.setAllowLeadingWildcard(true);
SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(,);
Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(parser.parse(pQuery)));
highlighter.setTextFragmenter(new NullFragmenter());
highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
String text = highlighter.getBestFragment(analyzer, , pText);
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/lucene-highlighter-3-6-No-highlight-for-3-letter-words-tp3988464.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14636 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14636/ All tests passed Build Log: [...truncated 21120 lines...] rat-sources-typedef: rat-sources: [echo] [echo] * [echo] Summary [echo] --- [echo] Generated at: 2012-06-08T19:01:23+00:00 [echo] Notes: 0 [echo] Binaries: 0 [echo] Archives: 0 [echo] Standards: 6 [echo] [echo] Apache Licensed: 6 [echo] Generated Documents: 0 [echo] [echo] JavaDocs are generated and so license header is optional [echo] Generated files do not required license headers [echo] [echo] 0 Unknown Licenses [echo] [echo] *** [echo] [echo] Unapproved licenses: [echo] [echo] [echo] *** [echo] [echo] Archives: [echo] [echo] * [echo] Files with Apache License headers will be marked AL [echo] Binary files (which do not require AL headers) will be marked B [echo] Compressed archives will be marked A [echo] Notices, licenses etc will be marked N [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/PageTool.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrParamResourceLoader.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/overview.html [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/test/org/apache/solr/velocity/VelocityResponseWriterTest.java [echo] [echo] * [echo] Printing headers for files without AL header... 
[echo] [echo] BUILD SUCCESSFUL Total time: 8 seconds + [ -z '' ] + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + /home/hudson/tools/ant/supported18/bin/ant -Djavadoc.link=/home/hudson/tools/java/api6 javadocs-lint Buildfile: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build.xml check-lucene-core-javadocs-uptodate: javadocs-lucene-core: javadocs: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/docs/core download-java6-javadoc-packagelist: [copy] Copying 1 file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/docs/core [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.appending... [javadoc] Loading source files for package org.apache.lucene.codecs.intblock... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.memory... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.codecs.pulsing... [javadoc] Loading source files for package org.apache.lucene.codecs.sep... [javadoc] Loading source files for package org.apache.lucene.codecs.simpletext... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... 
[javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc]
[jira] [Created] (SOLR-3528) Analysis UI should stack tokens at the same position
Ryan McKinley created SOLR-3528: --- Summary: Analysis UI should stack tokens at the same position Key: SOLR-3528 URL: https://issues.apache.org/jira/browse/SOLR-3528 Project: Solr Issue Type: Improvement Components: web gui Reporter: Ryan McKinley Attachments: position-stach.png The old UI would display tokens that had the same position in the same column. The new one adds a new column for each position, making it less clear what is happening with position offsets (especially in the non-verbose output). I think it should be reworked as:
{code}
<tr>
  <td>Tokenizer</td>
  <td>
    <div>stuff at pos 0</div>
    <div>stuff at pos 0</div>
    <div>stuff at pos 0</div>
  </td>
  <td>
    <div>stuff at pos 1</div>
  </td>
</tr>
{code}
Using a table would also force the layout wide rather than wrapping.
[jira] [Updated] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-3528: Attachment: position-stach.png Current view: !position-stach.png!
[jira] [Commented] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291917#comment-13291917 ] Ryan McKinley commented on SOLR-3528: - synonyms and path hierarchy are good examples of tokenizers/filters that stack positions
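[Editor's note] The stacking Ryan describes boils down to grouping tokens by position and rendering one table column per position. A minimal sketch of that grouping step (Python for brevity; the actual new admin UI is JavaScript, and the token dicts here are illustrative, not Solr's real analysis response format):

```python
from collections import defaultdict

def stack_by_position(tokens):
    """Group analysis tokens into columns, one column per position.

    Tokens that share a position (e.g. synonyms, path-hierarchy
    output) end up stacked in the same column instead of each
    getting a column of its own.
    """
    columns = defaultdict(list)
    for tok in tokens:
        columns[tok["position"]].append(tok["text"])
    # One list of texts per position, in position order.
    return [columns[pos] for pos in sorted(columns)]

# Synonym-style stream: "fast" and "quick" share position 1.
tokens = [
    {"text": "the", "position": 0},
    {"text": "fast", "position": 1},
    {"text": "quick", "position": 1},
    {"text": "fox", "position": 2},
]
```

Rendering would then emit one `<td>` per inner list, with a `<div>` per stacked token, matching the {code} markup in the issue description.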
[JENKINS] Lucene-Solr-trunk-Linux-Java6-64 - Build # 818 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/818/ All tests passed Build Log: [...truncated 21105 lines...] rat-sources-typedef: rat-sources: [echo] [echo] * [echo] Summary [echo] --- [echo] Generated at: 2012-06-08T19:18:24+00:00 [echo] Notes: 0 [echo] Binaries: 0 [echo] Archives: 0 [echo] Standards: 6 [echo] [echo] Apache Licensed: 6 [echo] Generated Documents: 0 [echo] [echo] JavaDocs are generated and so license header is optional [echo] Generated files do not required license headers [echo] [echo] 0 Unknown Licenses [echo] [echo] *** [echo] [echo] Unapproved licenses: [echo] [echo] [echo] *** [echo] [echo] Archives: [echo] [echo] * [echo] Files with Apache License headers will be marked AL [echo] Binary files (which do not require AL headers) will be marked B [echo] Compressed archives will be marked A [echo] Notices, licenses etc will be marked N [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/PageTool.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrParamResourceLoader.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/overview.html [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/test/org/apache/solr/velocity/VelocityResponseWriterTest.java [echo] [echo] * [echo] Printing headers for files without AL header... 
[echo] [echo] BUILD SUCCESSFUL Total time: 7 seconds + [ -z ] + cd /var/lib/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene + /var/lib/jenkins/tools/ant/supported18/bin/ant -Djavadoc.link=/var/lib/jenkins/tools/java/api6 javadocs-lint Buildfile: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build.xml check-lucene-core-javadocs-uptodate: javadocs-lucene-core: javadocs: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build/docs/core download-java6-javadoc-packagelist: [copy] Copying 1 file to /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build/docs/core [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.appending... [javadoc] Loading source files for package org.apache.lucene.codecs.intblock... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.memory... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.codecs.pulsing... [javadoc] Loading source files for package org.apache.lucene.codecs.sep... [javadoc] Loading source files for package org.apache.lucene.codecs.simpletext... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... 
[javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst...
[jira] [Commented] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291921#comment-13291921 ] Christian Moen commented on SOLR-3528: -- I agree. It's great if the new analysis UI can stack tokens.
[jira] [Resolved] (SOLR-2569) Enable facile moving of cores
[ https://issues.apache.org/jira/browse/SOLR-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen resolved SOLR-2569. Resolution: Won't Fix Enable facile moving of cores - Key: SOLR-2569 URL: https://issues.apache.org/jira/browse/SOLR-2569 Project: Solr Issue Type: Improvement Components: multicore, replication (java) Affects Versions: 4.0 Reporter: Jason Rutherglen Spin-off from this thread: http://search-lucene.com/m/5CO7Z1oOrh6/elastic+searchsubj=Solr+vs+ElasticSearch
Re: lucene-highlighter 3.6 No highlight for 3 letter words
You might ask this question on the user list to get a better response. Can you also provide the text and query you want to highlight? simon On Fri, Jun 8, 2012 at 1:17 PM, gerryjun gerry...@yahoo.com wrote: Hi, How can I highlight 3 letter words? Everything is working except for this; what setting do I need to change? I'm using lucene-highlighter-3.6.0.jar and lucene-core-3.6.0.jar. Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30); QueryParser parser = new QueryParser(Version.LUCENE_30, , analyzer); parser.setAllowLeadingWildcard(true); SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(,); Highlighter highlighter = new Highlighter(htmlFormatter,new QueryScorer(parser.parse(pQuery))); highlighter.setTextFragmenter(new NullFragmenter()); highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE); String text = highlighter.getBestFragment(analyzer, , pText); Thanks
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291937#comment-13291937 ] Michael McCandless commented on LUCENE-4087: Patch looks good. +1 I think this patch now means that, depending on when flushes kick in, you can sometimes apparently succeed in changing DV type for a field (though on merge something strange can happen, eg suddenly upgrading to a BYTES_XXX type) and other times hit an exception? Like the error checking is now intermittent as seen from the app? You might think everything is OK, push to production, and later (in production) hit a new exception... I think that's actually OK for now (this is all best effort)... but I think we should clean this up (can come after 4.0) so that the checking is consistent. Can we shorten the javadoc to simply state "Changing the DocValue type for a given field is not supported"? Sure we make a best effort to recover today but I don't think we should detail particulars of the specific best effort we're doing in 4.0? Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like an already defined DocValues type or un-omitting norms. We can approach this similar to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent This way we could eliminate all these traps and just give an exception instead.
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291940#comment-13291940 ] Robert Muir commented on LUCENE-4087: - nit: loosing -> losing in DocValues.java javadocs
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291946#comment-13291946 ] Simon Willnauer commented on LUCENE-4087: - {quote} Can we shorten the javadoc to simply state "Changing the DocValue type for a given field is not supported"? Sure we make a best effort to recover today but I don't think we should detail particulars of the specific best effort we're doing in 4.0? {quote} I am not sure we should say that, since it's not true. You can safely change a float into a double, and if you reindex all documents you will eventually converge to double. The same is true for Sorted, going from fixed to variable, or extending the precision of an integer. I think it's only fair to document that. Whether we can change it in future releases is a different thing.
[jira] [Resolved] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki resolved LUCENE-4122. --- Resolution: Fixed Committed in rev. 1348171 to trunk and in rev. 1348227 to branch_4x. Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoiding this re-wrapping and reducing the API footprint.
[jira] [Commented] (LUCENE-2357) Reduce transient RAM usage while merging by using packed ints array for docID re-mapping
[ https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291967#comment-13291967 ] Simon Willnauer commented on LUCENE-2357: - s/(Adrien Grand via Mike McCandless)/(Adrien Grand) otherwise +1 Reduce transient RAM usage while merging by using packed ints array for docID re-mapping Key: LUCENE-2357 URL: https://issues.apache.org/jira/browse/LUCENE-2357 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 4.0, 5.0 Attachments: LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch We allocate this int[] to remap docIDs due to compaction of deleted ones. This uses a lot of RAM for large segment merges, and can fail to allocate due to fragmentation on 32 bit JREs. Now that we have packed ints, a simple fix would be to use a packed int array... and maybe instead of storing abs docID in the mapping, we could store the number of del docs seen so far (so the remap would do a lookup then a subtract). This may add some CPU cost to merging but should bring down transient RAM usage quite a bit.
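[Editor's note] The "number of del docs seen so far" idea in the issue description can be illustrated with a small sketch (Python here; Lucene's real implementation uses a packed-ints array, and all names below are illustrative): instead of storing the new docID for every old docID, store the running count of deleted docs, so that remapping a live doc is a lookup followed by a subtraction.

```python
def build_del_counts(deleted, max_doc):
    """counts[d] = number of deleted docIDs <= d.

    In Lucene this would be a packed ints array sized by the
    number of deletes, not a plain Python list.
    """
    deleted = set(deleted)
    counts, seen = [], 0
    for d in range(max_doc):
        if d in deleted:
            seen += 1
        counts.append(seen)
    return counts

def remap(old_doc, counts):
    # Lookup then subtract: the docID after deleted docs are
    # compacted out. Only meaningful for live (non-deleted) docs.
    return old_doc - counts[old_doc]

# Segment of 5 docs with docs 1 and 3 deleted: live docs
# 0, 2, 4 compact down to new IDs 0, 1, 2.
counts = build_del_counts({1, 3}, 5)
```

The CPU cost Mike mentions is the extra array read per remapped docID, traded against the much smaller transient allocation.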
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291972#comment-13291972 ] Michael McCandless commented on LUCENE-4087: OK I guess that makes sense. Basically we sign up, now, to allow certain DV type changes in the schema, just like how we allow omitNorms to change from false to true, but not vice/versa.
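[Editor's note] The one-way rule Michael describes (omitNorms may flip from false to true, never back) can be modeled as a tiny transition check; this is a toy model of the discussion, not Lucene's actual API:

```python
def omit_norms_change_ok(old_omit: bool, new_omit: bool) -> bool:
    # Norms can be dropped (False -> True) but never resurrected
    # (True -> False): once omitted, the per-document norm data
    # is gone, so the only illegal transition is True -> False.
    return new_omit or not old_omit
```

An analogous check per field, run when a document declares a conflicting type, is what turns the "silent drop on merge" behavior into an up-front exception.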
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14637 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14637/ All tests passed Build Log: [...truncated 24254 lines...] [...truncated 24254 lines...] [...truncated 24254 lines...] [...truncated 24201 lines...] javadocs-lint: [exec] [exec] Crawl/parse... [exec] [exec] build/docs/core/org/apache/lucene/store/package-use.html [exec] WARNING: anchor ../../../../org/apache/lucene/store/subclasses appears more than once [exec] [exec] Verify... [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/IdentityEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/package-summary.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/class-use/AbstractEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/IntegerEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/FloatEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/PayloadEncoder.html [exec] BROKEN LINK: 
build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/class-use/PayloadEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] Broken javadocs links were found! BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build.xml:194: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/common-build.xml:1613: exec returned: 1 Total time: 3 minutes 16 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Publishing Clover coverage report... No Clover report will be published due to a Build Failure Email was triggered for: Failure Sending email for trigger: Failure [...truncated 24254 lines...] [...truncated 24254 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291980#comment-13291980 ] Hoss Man commented on LUCENE-4115: -- bq. A true nice solution would be to revisit the issue where classpaths are constructed to ivy cache directly (they're always correct then) and just use copying for packaging. seems like that might introduce some risk of the classpath(s) used by developers/jenkins for running tests deviating from the ones people would get if they use the binary distributions (particularly solr users who don't know/understand java classpaths and just copy the example lib dirs as a starting point). JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve. - Key: LUCENE-4115 URL: https://issues.apache.org/jira/browse/LUCENE-4115 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0, 5.0 Attachments: LUCENE-4111.patch I think we should add the following target deps: ant clean [depends on] clean-jars ant resolve [depends on] clean-jars ant eclipse [depends on] resolve, clean-jars ant idea [depends on] resolve, clean-jars This eliminates the need to remember about cleaning up stale jars, which users complain about (and I think they're right about it). The overhead will be minimal since resolve is only going to copy jars from cache. Eclipse won't have a problem with updated JARs if they end up at the same location. If there are no objections I will fix this in a few hours.
[JENKINS] Lucene-Solr-tests-only-4.x - Build # 31 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/31/ All tests passed Build Log: [...truncated 9163 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hudson/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/build/solr-core/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: cannot find symbol [javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload 
[javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 18
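All ten of these "cannot find symbol: class Payload" errors trace back to one cause: org.apache.lucene.index.Payload no longer exists on the 4.x branch, where payloads are carried as BytesRef via PayloadAttribute. A rough before/after sketch of the kind of change the failing Solr callers needed (illustrative only, not the actual fix commit; p and att stand for the token and attribute variables from the files above):

```java
// before (pre-4.x API, now removed):
//   import org.apache.lucene.index.Payload;
//   p.setPayload(new Payload(data));
//   Payload payload = ((PayloadAttribute) att).getPayload();

// after (4.x API, payloads represented as BytesRef):
import org.apache.lucene.util.BytesRef;

p.setPayload(new BytesRef(data));
BytesRef payload = ((PayloadAttribute) att).getPayload();
```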
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291989#comment-13291989 ] Dawid Weiss commented on LUCENE-4115: - Why would this be so? I mean -- isn't the risk of users messing up their classpath with lib/*.jar pretty much the same whether the build uses an ivy classpath straight from the cache, or an ivy classpath copied from the cache to lib/ at distribution time?
[jira] [Resolved] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-4087. - Resolution: Fixed Assignee: Simon Willnauer Lucene Fields: New,Patch Available (was: New) committed to trunk and 4x Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, such as redefining an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keep field numbers consistent This way we could eliminate all these traps and just give an exception instead.
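The fail-fast policy described in LUCENE-4087 -- record a field's metadata the first time the field is seen, then throw on any later conflicting declaration instead of silently resolving the conflict at merge time -- can be sketched in isolation. FieldMetadataRegistry and its string-valued options below are illustrative stand-ins, not Lucene's actual FieldInfos API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of fail-fast field-metadata enforcement (hypothetical names,
// not Lucene's real classes): the first declaration of a field pins its
// indexOptions; any later conflicting declaration throws immediately.
public class FieldMetadataRegistry {
    private final Map<String, String> indexOptionsByField = new HashMap<>();

    public void declare(String field, String indexOptions) {
        // putIfAbsent returns the previously recorded value, or null on first use.
        String previous = indexOptionsByField.putIfAbsent(field, indexOptions);
        if (previous != null && !previous.equals(indexOptions)) {
            throw new IllegalArgumentException(
                "field \"" + field + "\" was indexed with " + previous
                + " but is now declared with " + indexOptions);
        }
    }
}
```

Re-declaring a field with the same options is a no-op; only a genuine conflict fails, which is the "give an exception instead of a trap" behavior the issue asks for.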
[jira] [Commented] (LUCENE-4078) PatternReplaceCharFilter assertion error
[ https://issues.apache.org/jira/browse/LUCENE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291996#comment-13291996 ] Dawid Weiss commented on LUCENE-4078: - Follow-up discussion on core-libs-dev. The bottom line: this is the expected behavior... http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-June/010455.html PatternReplaceCharFilter assertion error Key: LUCENE-4078 URL: https://issues.apache.org/jira/browse/LUCENE-4078 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0 Attachments: LUCENE-4078.patch Build: https://builds.apache.org/job/Lucene-trunk/1942/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([8E91A6AC395FEED9:618A6129A5BB9EC]:0) at org.apache.lucene.analysis.MockTokenizer.readCodePoint(MockTokenizer.java:153) at org.apache.lucene.analysis.MockTokenizer.incrementToken(MockTokenizer.java:123) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:558) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:488) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:430) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:424) at org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings(TestPatternReplaceCharFilter.java:323) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(Randomized -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux-Java7-64 - Build # 29 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java7-64/29/ All tests passed Build Log: [...truncated 10051 lines...] check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/build/solr-core/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: error: cannot find symbol [javac] if (value instanceof Payload) { [javac]^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: error: cannot find symbol [javac] final Payload p = (Payload) value; [javac] ^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: error: cannot find symbol [javac] final Payload p = (Payload) value; [javac]^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: error: cannot find symbol [javac] p.setPayload(new Payload(data)); [javac]^ [javac] symbol: class Payload [javac] location: class JsonPreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: error: cannot find symbol [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] symbol: class Payload [javac] location: class JsonPreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: error: cannot find symbol [javac] p.setPayload(new Payload(data)); [javac]^ [javac] symbol: class Payload [javac] location: class SimplePreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: error: cannot find symbol [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] symbol: class Payload [javac] location: 
class SimplePreAnalyzedParser [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 17
Re: [JENKINS] Lucene-Solr-tests-only-4.x - Build # 31 - Failure
I think the merge done here in 1348227 must have been done from the lucene/ directory. I merged the rest (r1348248) On Fri, Jun 8, 2012 at 4:58 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/31/ All tests passed Build Log: [...quoted build log truncated; identical to the Build # 31 failure output above...]
[JENKINS] Lucene-Solr-4.x-Windows-Java6-64 - Build # 25 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java6-64/25/ All tests passed Build Log: [...truncated 9091 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\build\solr-core\classes\java [javac] Compiling 572 source files to C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\build\solr-core\classes\java [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:276: cannot find symbol 
[javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
[javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 15 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] - To
[JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 36 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java6-64/36/ All tests passed Build Log: [...truncated 9223 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/build/solr-core/classes/java [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: cannot find symbol [javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload [javac] location: class 
org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 16 lines...] [...truncated 9340 lines...] [...truncated 9340 lines...] [...truncated 9340
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292004#comment-13292004 ] Hoss Man commented on LUCENE-4115: -- I'm not sure i'm following you. right now, we know that when you run ant test (or java -jar start.jar in solr) you're getting the exact same classpath and set of jars as someone who downloads a binary dist you might build from your checkout -- because the classpath you are using comes from the lib dirs, built by ivy. there is only one place jars are copied from the ivy cache to the lib dir(s). if instead we have ant classpath/ declarations that use ivy features to build up classpaths pointed directly at jars in the ivy cache, and independently we have copy directives building up the lib/ dirs that make it into the binary dists, then isn't there a risk that (over time) those will diverge in a way we might not notice, because all the testing will only be done with the ivy-generated classpaths? (maybe i'm wrong ... maybe the ivy classpath stuff works differently than i understand ... i'm just raising it as a potential concern)
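The contrast Hoss raises -- classpaths resolved straight from the ivy cache versus jars retrieved into lib/ and shipped -- can be sketched with the two standard Ivy Ant tasks. This is an illustrative fragment under assumed path ids and confs, not either project's actual build files:

```xml
<!-- (a) a classpath pointed directly at jars in the ivy cache: -->
<ivy:cachepath pathid="classpath.from.cache" conf="compile"/>

<!-- (b) jars copied out of the cache into lib/, the dirs that ship in the binary dist: -->
<ivy:retrieve conf="compile" pattern="lib/[artifact]-[revision].[ext]"/>
<path id="classpath.from.lib">
  <fileset dir="lib" includes="*.jar"/>
</path>
```

The divergence risk is that tests would run against (a) while binary-dist users run against (b): if the retrieve pattern or conf ever drifted from the cachepath settings, CI would keep passing and nothing would notice.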