Sponsoring porting work
Hi devs,

We are looking to sponsor porting work, to help keep up the pace of development and bring Lucene.Net closer to Java Lucene. Unfortunately the amount of work I can put into this myself is very limited, and being up to speed with Lucene is important to us, hence the idea of offering sponsorship. I'm not entirely sure how these things work under the Apache umbrella, but I'd imagine there isn't a real issue doing that. All work will be handed back to the project under the ASL, of course. I'd appreciate any guidance if needed. In the meantime, interested parties are welcome to contact me privately.

Itamar.
[jira] [Commented] (LUCENENET-493) Make lucene.net culture insensitive (like the java version)
[ https://issues.apache.org/jira/browse/LUCENENET-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293189#comment-13293189 ]

Christopher Currens commented on LUCENENET-493:
-----------------------------------------------

This is rather annoying, actually. Java has tests for different cultures wired into the test suite. Interestingly enough, so do we, but because of the differences between JUnit and NUnit (namely attribute-based test discovery), we can't override the test-running implementation the same way Java does. So the code we've ported for testing cultures does not work...period. NUnit supports changing the culture via attributes, but only a single culture. MbUnit allows multiple cultures and will run the test once in each of them. We should find a workaround.

Make lucene.net culture insensitive (like the java version)
-----------------------------------------------------------
Key: LUCENENET-493
URL: https://issues.apache.org/jira/browse/LUCENENET-493
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Core, Lucene.Net Test
Affects Versions: Lucene.Net 3.0.3
Reporter: Luc Vanlerberghe
Labels: patch
Fix For: Lucene.Net 3.0.3
Attachments: Lucenenet-493.patch

In Java, conversion of the basic types to and from strings is locale (culture) independent. For localized input/output one needs to use the classes in the java.text package. In .Net, conversion of the basic types to and from strings depends on the default Culture; otherwise you have to specify CultureInfo.InvariantCulture explicitly. Some of the test cases in lucene.net fail if they are not run on a machine with the culture set to US. In the current version of lucene.net there are patches here and there that try to correct some specific cases by using string replacement (like System.Double.Parse(s.Replace(".", CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator))), but that seems really ugly.
I submit a patch here that removes the old workarounds and replaces them with calls to classes in the Lucene.Net.Support namespace that try to handle the conversions in a compatible way.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
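As background for the port, the Java behavior being matched can be demonstrated directly: primitive conversions ignore the default locale, and only java.text is locale-aware. A minimal sketch (the class name LocaleParsing is mine, not from the patch):

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class LocaleParsing {
    public static void main(String[] args) throws ParseException {
        // Double.parseDouble ignores the default locale entirely:
        // "3.14" parses identically on a US JVM and a German JVM.
        double invariant = Double.parseDouble("3.14");

        // The locale-aware path lives in java.text; German locales
        // use ',' as the decimal separator.
        NumberFormat german = NumberFormat.getInstance(Locale.GERMANY);
        double localized = german.parse("3,14").doubleValue();

        System.out.println(invariant + " " + localized);
    }
}
```

In .NET the default is inverted (basic-type conversions honor the current culture unless CultureInfo.InvariantCulture is passed), which is why the patch routes conversions through Lucene.Net.Support helpers instead of sprinkling invariant-culture calls everywhere.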
[jira] [Assigned] (LUCENE-4118) FastVectorHighlighter fail to highlight taking in input some proximity query.
[ https://issues.apache.org/jira/browse/LUCENE-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi reassigned LUCENE-4118:
--------------------------------------
Assignee: Koji Sekiguchi

FastVectorHighlighter fail to highlight taking in input some proximity query.
-----------------------------------------------------------------------------
Key: LUCENE-4118
URL: https://issues.apache.org/jira/browse/LUCENE-4118
Project: Lucene - Java
Issue Type: Bug
Components: modules/highlighter
Affects Versions: 3.4, 5.0
Reporter: Emanuele Lombardi
Assignee: Koji Sekiguchi
Labels: FastVectorHighlighter
Attachments: FVHPatch.txt

There are two related bugs with proximity queries:
1) If a phrase contains n repeated terms, the FVH module fails to highlight it. See testRepeatedTermsWithSlop.
2) If you search the terms reversed, the FVH module fails to highlight them. See testReversedTermsWithSlop.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1348623 - in /lucene/dev/branches/branch_4x: ./ dev-tools/ lucene/ lucene/analysis/ lucene/analysis/common/ lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/
fixed, thanks Dawid!

On Sun, Jun 10, 2012 at 7:12 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
> Synchonizer -> Synchronizer?
> D.

On Sun, Jun 10, 2012 at 6:42 PM, <sim...@apache.org> wrote:
Author: simonw
Date: Sun Jun 10 16:42:55 2012
New Revision: 1348623
URL: http://svn.apache.org/viewvc?rev=1348623&view=rev
Log: LUCENE-4116: fix concurrency test for DWPTStallControl

Modified:
  lucene/dev/branches/branch_4x/ (props changed)
  lucene/dev/branches/branch_4x/dev-tools/ (props changed)
  lucene/dev/branches/branch_4x/lucene/ (props changed)
  lucene/dev/branches/branch_4x/lucene/BUILD.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/CHANGES.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/JRE_VERSION_MIGRATION.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/LICENSE.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/MIGRATE.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/NOTICE.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/README.txt (props changed)
  lucene/dev/branches/branch_4x/lucene/analysis/ (props changed)
  lucene/dev/branches/branch_4x/lucene/analysis/common/ (props changed)
  lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/package.html (props changed)
  lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std34/package.html (props changed)
  lucene/dev/branches/branch_4x/lucene/backwards/ (props changed)
  lucene/dev/branches/branch_4x/lucene/benchmark/ (props changed)
  lucene/dev/branches/branch_4x/lucene/build.xml (props changed)
  lucene/dev/branches/branch_4x/lucene/common-build.xml (props changed)
  lucene/dev/branches/branch_4x/lucene/core/ (props changed)
  lucene/dev/branches/branch_4x/lucene/core/src/java/org/apache/lucene/index/DocumentsWriterStallControl.java
  lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestDocumentsWriterStallControl.java
  lucene/dev/branches/branch_4x/lucene/demo/ (props changed)
  lucene/dev/branches/branch_4x/lucene/facet/ (props changed)
  lucene/dev/branches/branch_4x/lucene/grouping/ (props changed)
  lucene/dev/branches/branch_4x/lucene/highlighter/ (props changed)
  lucene/dev/branches/branch_4x/lucene/ivy-settings.xml (props changed)
  lucene/dev/branches/branch_4x/lucene/join/ (props changed)
  lucene/dev/branches/branch_4x/lucene/memory/ (props changed)
  lucene/dev/branches/branch_4x/lucene/misc/ (props changed)
  lucene/dev/branches/branch_4x/lucene/module-build.xml (props changed)
  lucene/dev/branches/branch_4x/lucene/queries/ (props changed)
  lucene/dev/branches/branch_4x/lucene/queryparser/ (props changed)
  lucene/dev/branches/branch_4x/lucene/sandbox/ (props changed)
  lucene/dev/branches/branch_4x/lucene/site/ (props changed)
  lucene/dev/branches/branch_4x/lucene/spatial/ (props changed)
  lucene/dev/branches/branch_4x/lucene/suggest/ (props changed)
  lucene/dev/branches/branch_4x/lucene/test-framework/ (props changed)
  lucene/dev/branches/branch_4x/lucene/tools/ (props changed)
  lucene/dev/branches/branch_4x/solr/ (props changed)
  lucene/dev/branches/branch_4x/solr/CHANGES.txt (props changed)
  lucene/dev/branches/branch_4x/solr/LICENSE.txt (props changed)
  lucene/dev/branches/branch_4x/solr/NOTICE.txt (props changed)
  lucene/dev/branches/branch_4x/solr/README.txt (props changed)
  lucene/dev/branches/branch_4x/solr/build.xml (props changed)
  lucene/dev/branches/branch_4x/solr/cloud-dev/ (props changed)
  lucene/dev/branches/branch_4x/solr/common-build.xml (props changed)
  lucene/dev/branches/branch_4x/solr/contrib/ (props changed)
  lucene/dev/branches/branch_4x/solr/core/ (props changed)
  lucene/dev/branches/branch_4x/solr/dev-tools/ (props changed)
  lucene/dev/branches/branch_4x/solr/example/ (props changed)
  lucene/dev/branches/branch_4x/solr/lib/ (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpclient-LICENSE-ASL.txt (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpclient-NOTICE.txt (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpcore-LICENSE-ASL.txt (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpcore-NOTICE.txt (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpmime-LICENSE-ASL.txt (props changed)
  lucene/dev/branches/branch_4x/solr/lib/httpmime-NOTICE.txt (props changed)
  lucene/dev/branches/branch_4x/solr/scripts/ (props changed)
  lucene/dev/branches/branch_4x/solr/solrj/ (props changed)
  lucene/dev/branches/branch_4x/solr/test-framework/ (props changed)
  lucene/dev/branches/branch_4x/solr/testlogging.properties (props changed)
  lucene/dev/branches/branch_4x/solr/webapp/ (props changed)
[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 291 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/291/

1 tests failed.

REGRESSION: org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<494> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<494> but was:<0>
  at __randomizedtesting.SeedInfo.seed([B032809CBA0B0DAB:3866BF4614F76053]:0)
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716)
  at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:601)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
  at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log:
[...truncated 17404 lines...]
[junit4]   2> 56759 T3040 C189 REQ [collection1] webapp=/solr path=/replication params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9.fnm} status=0 QTime=0
[junit4]   2> 56764 T3040 C189 REQ
[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml
[ https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292688#comment-13292688 ]

Bernd Fehling commented on SOLR-2724:
-------------------------------------

This issue has been stale since March, and Solr now hangs between heaven and hell: defaultSearchField is disabled in schema.xml and marked as deprecated, but the method getDefaultSearchFieldName() still exists and now gives no fallback to a default. This is bad and breaks pieces of Solr, like edismax and several more. And the solution with df in the defaults of the RequestHandler is also not working. Now what: revert and release a fixed 3.6.2, or fix getDefaultSearchFieldName() and release a 3.6.2? What about defaultOperator: is it having/producing the same kind of problems as defaultSearchField?

Deprecate defaultSearchField and defaultOperator defined in schema.xml
----------------------------------------------------------------------
Key: SOLR-2724
URL: https://issues.apache.org/jira/browse/SOLR-2724
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis, search
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
Fix For: 3.6, 4.0
Attachments: SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch
Original Estimate: 2h
Remaining Estimate: 2h

I've always been surprised to see the <defaultSearchField> element and <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file since the first time I saw them. They just seem out of place to me, since they are more query-parser related than schema related. But not only are they misplaced, I feel they shouldn't exist. For query parsers, we already have a df parameter that works just fine, and explicit field references. And the default Lucene query operator should stay at OR: if a particular query wants different behavior, then use q.op or simply use OR. <similarity> seems like something better placed in solrconfig.xml than in the schema.
In my opinion, the defaultSearchField and defaultOperator configuration elements should be deprecated in Solr 3.x and removed in Solr 4, and similarity should move to solrconfig.xml. I am willing to do it, provided there is consensus on it, of course.
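For reference, the df-based replacement under discussion lives in solrconfig.xml rather than schema.xml. A sketch of the request-handler form (the handler and field names are illustrative; note Bernd reports above that this path misbehaves in 3.6.x):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- replaces <defaultSearchField> from schema.xml -->
    <str name="df">text</str>
    <!-- replaces the defaultOperator of <solrQueryParser> -->
    <str name="q.op">OR</str>
  </lst>
</requestHandler>
```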
[jira] [Commented] (SOLR-3526) Remove classfile dependency on ZooKeeper from CoreContainer
[ https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292689#comment-13292689 ]

Michael Froh commented on SOLR-3526:
------------------------------------

Oh, thanks a lot for pointing that out, Hoss! I had completely missed that part. My wish for the removal of the KeeperException reference from CoreContainer still stands, but using NoOpDistributingUpdateProcessorFactory lets me remove my current hacky solution (adding a dummy org.apache.zookeeper.KeeperException in one of my libraries).

Remove classfile dependency on ZooKeeper from CoreContainer
-----------------------------------------------------------
Key: SOLR-3526
URL: https://issues.apache.org/jira/browse/SOLR-3526
Project: Solr
Issue Type: Wish
Components: SolrCloud
Affects Versions: 4.0
Reporter: Michael Froh

We are using Solr as a library embedded within an existing application, and are currently developing toward using 4.0 when it is released. We are currently instantiating SolrCores with null CoreDescriptors (and hence no CoreContainer), since we don't need SolrCloud functionality (and do not want to depend on ZooKeeper). A couple of months ago, SearchHandler was modified to try to retrieve a ShardHandlerFactory from the CoreContainer. I was able to work around this by specifying a dummy ShardHandlerFactory in the config. Now UpdateRequestProcessorChain is inserting a DistributedUpdateProcessor into my chains, again triggering an NPE when trying to dereference the CoreDescriptor. I would happily place the SolrCores in CoreContainers, except that CoreContainer imports and references org.apache.zookeeper.KeeperException, which we do not have (and do not want) in our classpath; therefore, I get a ClassNotFoundException when loading the CoreContainer class. Ideally (IMHO), ZkController should isolate the ZooKeeper dependency and simply rethrow KeeperExceptions as org.apache.solr.common.cloud.ZooKeeperException (or some Solr-hosted checked exception). Then CoreContainer could remove the offending import/references.
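The isolation Michael asks for is the usual wrap-and-rethrow pattern: only one layer touches the third-party checked exception and rethrows a host-owned one, so other classes never reference the foreign type and can load without its jar. A self-contained toy sketch, with stand-in names (ThirdPartyException and HostException are illustrative, not Solr or ZooKeeper classes):

```java
public class ExceptionIsolation {
    // Stand-in for a checked exception from an optional dependency
    // (analogous to org.apache.zookeeper.KeeperException).
    static class ThirdPartyException extends Exception {
        ThirdPartyException(String msg) { super(msg); }
    }

    // Host-owned unchecked exception; callers can catch this without
    // the third-party classes being on their classpath.
    static class HostException extends RuntimeException {
        HostException(String msg, Throwable cause) { super(msg, cause); }
    }

    // The only method that names the third-party type: it catches the
    // checked exception and rethrows the host-owned wrapper.
    static String riskyOperation(boolean fail) {
        try {
            if (fail) throw new ThirdPartyException("connection lost");
            return "ok";
        } catch (ThirdPartyException e) {
            throw new HostException("zk layer failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(riskyOperation(false));
        try {
            riskyOperation(true);
        } catch (HostException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In Solr terms, ZkController would play the role of riskyOperation's enclosing class, and CoreContainer would catch only the Solr-owned exception.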
[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292697#comment-13292697 ]

Tommaso Teofili commented on SOLR-3488:
---------------------------------------

bq. I think though, that we should really change how things work - so that you just pass the number of shards and the number of replicas, and the overseer just ensures the collection is on the right number of nodes. Then we don't have to have this 'template' collection to figure out what nodes to create on - or explicitly pass the nodes.

sure, +1.

bq. Sami has a distributed work queue for the overseer setup now, and I'm working on integrating this with that.

that looks great. By the way, I think it would be good if that could also (optionally) be used for indexing in SolrCloud.

Create a Collections API for SolrCloud
--------------------------------------
Key: SOLR-3488
URL: https://issues.apache.org/jira/browse/SOLR-3488
Project: Solr
Issue Type: New Feature
Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Attachments: SOLR-3488.patch, SOLR-3488_2.patch
[jira] [Updated] (SOLR-3532) Promote shutdown method to SolrServer
[ https://issues.apache.org/jira/browse/SOLR-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren updated SOLR-3532:
-----------------------------
Attachment: SOLR-3532.patch

trivial patch

Promote shutdown method to SolrServer
-------------------------------------
Key: SOLR-3532
URL: https://issues.apache.org/jira/browse/SOLR-3532
Project: Solr
Issue Type: Improvement
Components: clients - java
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3532.patch

Currently every java client implements shutdown (LBHttpSolrServer has close). I think it makes sense to promote the #shutdown method to SolrServer.
[jira] [Created] (SOLR-3532) Promote shutdown method to SolrServer
Sami Siren created SOLR-3532:
-----------------------------
Summary: Promote shutdown method to SolrServer
Key: SOLR-3532
URL: https://issues.apache.org/jira/browse/SOLR-3532
Project: Solr
Issue Type: Improvement
Components: clients - java
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3532.patch

Currently every java client implements shutdown (LBHttpSolrServer has close). I think it makes sense to promote the #shutdown method to SolrServer.
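The refactoring proposed here amounts to pulling the method up into the abstract base so callers can shut down any client through the SolrServer type. A toy sketch with stand-in names (these are simplified classes of mine, not the real SolrJ signatures):

```java
// Stand-in for the abstract org.apache.solr.client.solrj.SolrServer.
abstract class SolrServerSketch {
    // Promoted from the concrete clients (LBHttpSolrServer currently
    // has close() instead): every implementation releases its
    // resources here.
    public abstract void shutdown();
}

public class ShutdownPromotion {
    // Stand-in for one concrete client implementation.
    static class HttpSolrServerSketch extends SolrServerSketch {
        boolean closed = false;
        @Override public void shutdown() { closed = true; }
    }

    public static void main(String[] args) {
        HttpSolrServerSketch client = new HttpSolrServerSketch();
        SolrServerSketch server = client; // callers see only the base type
        server.shutdown();
        System.out.println(client.closed); // prints "true"
    }
}
```

The design point is that callers holding a SolrServer reference no longer need to know which concrete client they were given in order to clean up.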
[jira] [Created] (SOLR-3533) Show CharFilters in Schema Browser
Erik Hatcher created SOLR-3533:
------------------------------
Summary: Show CharFilters in Schema Browser
Key: SOLR-3533
URL: https://issues.apache.org/jira/browse/SOLR-3533
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Priority: Minor
Fix For: 4.0

Schema Browser (on trunk) currently does not show CharFilters. The example/ schema has this definition that can be used to demonstrate, though it needs to be uncommented:

{code}
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
[jira] [Updated] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher updated SOLR-3533:
-------------------------------
Attachment: SOLR-3533.patch

This patch adds char filters to the index and query analysis sections.

Show CharFilters in Schema Browser
----------------------------------
Key: SOLR-3533
URL: https://issues.apache.org/jira/browse/SOLR-3533
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3533.patch
[jira] [Assigned] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher reassigned SOLR-3533:
----------------------------------
Assignee: Stefan Matheis (steffkes)

I'm going to assign to Stefan to vet/commit, to ensure my first foray into the new UI is on track.

Show CharFilters in Schema Browser
----------------------------------
Key: SOLR-3533
URL: https://issues.apache.org/jira/browse/SOLR-3533
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Stefan Matheis (steffkes)
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3533.patch
[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292738#comment-13292738 ]

Erik Hatcher commented on SOLR-3533:
------------------------------------

Another nicety would be to make the mapping parameter special, like synonyms and words are now, in order to (eventually, I know it's not enabled on trunk at the moment) link them.

Show CharFilters in Schema Browser
----------------------------------
Key: SOLR-3533
URL: https://issues.apache.org/jira/browse/SOLR-3533
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Stefan Matheis (steffkes)
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3533.patch
[JENKINS] Lucene-Solr-4.x-Windows-Java6-64 - Build # 46 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java6-64/46/

1 tests failed.

REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

Error Message:
Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #0,6,]

Stack Trace:
java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #0,6,]
  at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
  at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
  at __randomizedtesting.SeedInfo.seed([918444B633412A98]:0)
  at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
  at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
  at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244)
  at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241)
  at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:321)
  at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3149)
  at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
  at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)

Build Log:
[...truncated 22375 lines...]
[junit4]   2> 50960 T349 oascc.ZkStateReader.updateCloudState Manual update of cluster state initiated
[junit4]   2> 50960 T349 oascc.ZkStateReader.updateCloudState Updating cloud state from ZooKeeper...
[junit4]   2> 50961 T349 oasc.Overseer$CloudStateUpdater.run Announcing new cluster state
[junit4]   2> 50963 T209 oascc.ZkStateReader$2.process A cluster state change has occurred
[junit4]   2> 50963 T205 oascc.ZkStateReader$2.process A cluster state change has occurred
[junit4]   2> 50965 T250 oascc.ZkStateReader$2.process A cluster state change has occurred
[junit4]   2> 50986 T155 oasc.CoreContainer.shutdown Shutting down CoreContainer instance=1794423774
[junit4]   2> 50986 T155 oasc.RecoveryStrategy.close WARNING Stopping recovery for core collection1 zkNodeName=127.0.0.1:61728_solr_collection1
[junit4]   2> 50986 T155 oasc.SolrCore.close [collection1] CLOSING SolrCore org.apache.solr.core.SolrCore@4b919723
[junit4]   2> 50991 T155 oasc.SolrCore.closeSearcher [collection1] Closing main searcher on request.
[junit4]   2> 50991 T155 oasu.DirectUpdateHandler2.close closing DirectUpdateHandler2{commits=6,autocommits=0,soft
[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292740#comment-13292740 ] Erik Hatcher commented on SOLR-3533:

bq. in order to (eventually, I know it's not enabled on trunk at the moment) link them

FYI - the link to the mapping file in this example is this: http://localhost:8983/solr/admin/file?file=mapping-ISOLatin1Accent.txt (optionally with the core name in the URL too, of course), so maybe we can spin off another issue to add these links to those files now?

Show CharFilters in Schema Browser -- Key: SOLR-3533 URL: https://issues.apache.org/jira/browse/SOLR-3533 Project: Solr Issue Type: Improvement Affects Versions: 4.0 Reporter: Erik Hatcher Assignee: Stefan Matheis (steffkes) Priority: Minor Fix For: 4.0 Attachments: SOLR-3533.patch

Schema Browser (on trunk) currently does not show CharFilters. The example/ schema has this definition that can be used to demonstrate, though it needs to be uncommented:
{code}
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292741#comment-13292741 ] Erik Hatcher commented on SOLR-3533: oh, and my patch also changes Filters to Token Filters to make it labeled a little differently than Char Filters.
[jira] [Created] (LUCENE-4129) add CodecHeader to .frq and .prx
Robert Muir created LUCENE-4129: --- Summary: add CodecHeader to .frq and .prx Key: LUCENE-4129 URL: https://issues.apache.org/jira/browse/LUCENE-4129 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Robert Muir

We did this for all other files, but not .frq/.prx. Currently the postings writer only records itself in the blocktree terms dictionary, which is fine, but thats really documenting the .tim itself, that it is Blocktree with Lucene40Postings metadata. I think we should put headers in .frq/.prx as well: e.g. it could detect file jumbling.
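The header pattern being proposed (a magic number, the codec name, and a version written at the front of each file, so a reader can detect jumbled or truncated files up front) can be sketched roughly as follows. This is only a toy illustration with an assumed magic constant and string encoding, not Lucene's actual CodecUtil:

```java
import java.io.*;

// Toy sketch of a codec header: magic + codec name + version at the start
// of a file. A reader that expects a .prx header but is handed .frq bytes
// fails the check immediately ("file jumbling" detection).
public class CodecHeaderDemo {
    static final int MAGIC = 0x3fd76c17; // assumed constant, for illustration only

    static byte[] writeHeader(String codec, int version) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(MAGIC);
        out.writeUTF(codec);   // real CodecUtil uses its own string encoding
        out.writeInt(version);
        out.flush();
        return bytes.toByteArray();
    }

    static boolean checkHeader(byte[] data, String expectedCodec, int expectedVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readInt() == MAGIC
                && in.readUTF().equals(expectedCodec)
                && in.readInt() == expectedVersion;
        } catch (IOException e) {
            return false; // truncated or foreign file
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] frq = writeHeader("Lucene40PostingsFrq", 0);
        System.out.println(checkHeader(frq, "Lucene40PostingsFrq", 0)); // true
        System.out.println(checkHeader(frq, "Lucene40PostingsPrx", 0)); // false: wrong file
    }
}
```

The codec names above are hypothetical labels; the point is only that a per-file header makes a swapped or corrupted file fail fast instead of being misread.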
[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4129: Attachment: LUCENE-4129.patch

patch: this found a bug in NestedPulsing in the disk full tests. I also changed pulsing to be more clear that the inner postings reader/writer is being closed here: theoretically its possible the pulsingreader/writer ctor could throw an exception and we would have a leak.
[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4129: Attachment: LUCENE-4129.patch

updated patch to actually check the header :)
LBHttpSolrServer doc needs a little improvement
The wiki for LBHttpSolrServer is a little out of date. It says the feature is “experimental” and “currently being developed” even though SOLR-844 is closed. There are a couple of “LB!HttpSolrServer” links that point to nonexistent pages. The class javadoc has half, but not all, of the doc from the wiki. The simplest solution may be to move the rest of the wiki doc into the javadoc. I’m not sure what should be done with the wiki, though. How can a wiki link to javadoc when the link depends on the Solr release? Or maybe just make the wiki and the javadoc the same. Also, the SolrJ wiki makes no mention of LBHttpSolrServer. -- Jack Krupansky
[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4128: --- Fix Version/s: 4.0

add safety to preflex segmentinfo upgrade --- Key: LUCENE-4128 URL: https://issues.apache.org/jira/browse/LUCENE-4128 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-4128.patch

Currently the one-time-upgrade depends on whether the upgraded .si file exists. And the writing is done in a try/finally so its removed if ioexception happens. but I think there could be a power-loss or something else in the middle of this, the upgraded .si file could be bogus, then the user would have to manually remove it (they probably wouldnt know). i think instead we should just have a marker file on completion, that we create after we successfully fsync the upgraded .si file. this way if something happens we just rewrite the thing.
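The marker-file idea in this issue can be sketched with plain java.nio: write the upgraded file, fsync it, and only then create the marker, so a power loss mid-upgrade simply leaves the marker absent and the upgrade is redone. All file names and method names below are illustrative, not Lucene's actual implementation:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

// Sketch of "marker file on completion": the marker is written strictly
// after the upgraded file is durable, so its presence implies a complete
// upgrade. (A real implementation would also sync the directory entry.)
public class UpgradeMarkerDemo {
    static void upgrade(Path si, Path marker, String contents) throws IOException {
        Files.write(si, contents.getBytes(StandardCharsets.UTF_8));
        try (FileChannel ch = FileChannel.open(si, StandardOpenOption.WRITE)) {
            ch.force(true); // fsync the upgraded file first
        }
        Files.write(marker, new byte[0]); // marker last
    }

    static boolean wasUpgraded(Path marker) {
        // No marker means the upgrade either never ran or was interrupted;
        // either way it is safe to just redo it.
        return Files.exists(marker);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upgrade");
        Path si = dir.resolve("_0.si"), marker = dir.resolve("_0.si.done");
        System.out.println(wasUpgraded(marker)); // false: upgrade (re)runs
        upgrade(si, marker, "segment info");
        System.out.println(wasUpgraded(marker)); // true
    }
}
```

Robert's follow-up refinement (checking a codec header inside the marker rather than bare File.exists) hardens this further against a marker that exists but was never fully written.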
[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4128: --- Attachment: LUCENE-4128.patch

Patch.
[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292765#comment-13292765 ] Robert Muir commented on LUCENE-4128: - We write a codec header for the upgraded marker file, so instead of relying upon File.exists we could add a deprecated method to SegmentInfos, hasMarkerFile, that just opens it and does CheckHeader, returning false if there is any exception?
[jira] [Updated] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)
[ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4061: --- Attachment: LUCENE-4061.patch

Fixing the concurrency issue was hairy, and required lots of changes to DirTaxoWriter:
* Needed a ReaderManager, so added one in core under o.a.l.index. Separately, I think that we should move RefManager to o.a.l.util instead of o.a.l.search.
* DirTaxoWriter was not very well built for concurrency :), so many changes had to be made to it.
* TaxoWriterCache.hasRoom(int) has been replaced by isFull().
* TestDirTaxoWriter has been enhanced to sometimes, during nightly builds, use a NoOpCache, as that uncovered some bugs too! (It makes the test horribly slow, hence the nightly criterion, and even then with very low probability.)

I ran DirTaxoWriter.testConcurrency over 1000 times with no failures, so I'm inclined to believe the concurrency issues are now resolved. Still, a second (and third, and even a fourth) look by someone else would be appreciated. I'll commit it tomorrow if no one objects, and port to 4x.

Improvements to DirectoryTaxonomyWriter (synchronization and others) Key: LUCENE-4061 URL: https://issues.apache.org/jira/browse/LUCENE-4061 Project: Lucene - Java Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.0 Attachments: LUCENE-4061.patch, LUCENE-4061.patch

DirTaxoWriter synchronizes in too many places. For instance, addCategory() is fully synchronized, while only a small part of it needs to be. Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it. If the changes will be simple, I'll port them to 3.6.1 as well.
[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292770#comment-13292770 ] Michael McCandless commented on LUCENE-4128: Good idea, I'll do that!
[jira] [Created] (LUCENE-4130) CompoundFileDirectory.listAll is broken
Robert Muir created LUCENE-4130: --- Summary: CompoundFileDirectory.listAll is broken Key: LUCENE-4130 URL: https://issues.apache.org/jira/browse/LUCENE-4130 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir

The files returned by listAll are not actually the files in the CFS.
[jira] [Updated] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4120: - Affects Version/s: (was: 5.0) Fix Version/s: (was: 5.0) 4.0

FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0

There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of:
* FST.nodeAddress (GrowableWriter)
* FST.inCounts (GrowableWriter)
* FST.nodeRefToAddress (read-only Reader)
The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken
[ https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4130: Attachment: LUCENE-4130_test.patch

test case
[jira] [Created] (LUCENE-4131) .cfs/.cfe should have a codecheader
Robert Muir created LUCENE-4131: --- Summary: .cfs/.cfe should have a codecheader Key: LUCENE-4131 URL: https://issues.apache.org/jira/browse/LUCENE-4131 Project: Lucene - Java Issue Type: Improvement Affects Versions: 4.0 Reporter: Robert Muir

The new .cfs is more tricky, but I still think we can do it. We should definitely fix this for .cfe.
[jira] [Updated] (SOLR-3518) No `hits` in SolrResp. NamedList if distrib=true
[ https://issues.apache.org/jira/browse/SOLR-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated SOLR-3518: Attachment: SOLR-3518-4.0-1.patch

Patch for trunk adding the `hits` field to the SolrQueryResponse's NamedList. It's only returned in the final response, not in intermediate shard requests in a distributed search. Most likely not a good solution, but it seems to work fine for now. Please improve.

No `hits` in SolrResp. NamedList if distrib=true Key: SOLR-3518 URL: https://issues.apache.org/jira/browse/SOLR-3518 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Environment: 5.0-SNAPSHOT 1346798 - markus - 2012-06-06 11:38:15 Reporter: Markus Jelsma Priority: Minor Fix For: 4.0 Attachments: SOLR-3518-4.0-1.patch

The hits field in the NamedList obtained from SolrQueryResponse.toLog() is not available for distrib=true requests. The hits field is also not written to the log. See also: http://lucene.472066.n3.nabble.com/SolrDispatchFilter-no-hits-in-response-NamedList-if-distrib-true-td3987751.html
[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4129: Attachment: LUCENE-4129.patch

also added a test, TestAllFilesHaveCodecHeader. It currently has to ignore .cfs/.cfe and also not recurse into them until we fix LUCENE-4130 and LUCENE-4131.
[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4128: --- Attachment: LUCENE-4128.patch

New patch w/ separate segmentWasUpgraded method checking the codec header.
Re: Issues with whitespace tokenization in QueryParser
Welcome John! Basically the tricky part about this issue is how the Analyzer integrates into the parsing workflow. It is as hossman says on the issue: you can edit the .jflex file so that _TERM_CHAR is defined differently and regenerate, and you will see what I mean by the tests that fail.

The crux of the problem is that currently, if you have +foo bar -baz, we split on whitespace, applying operators, and then run the analyzer on each portion. So you get +foo, bar, -baz, and then we analyze foo, bar, and baz respectively. But if you just remove the whitespace tokenization, you will get +foo bar, -baz, which is different. So to make this kind of thing work as expected, I think the analyzer would have to be integrated at an earlier stage, before the operators are applied, e.g. as part of the lexing process.

NOTE: I definitely don't want to discourage you from tackling this issue, but I think it's fair to mention there is a workaround: if you can preprocess your queries yourself (maybe you don't allow all the Lucene syntax to your users, or something like that), you can escape the whitespace yourself, such as rain\ coat, and I think your synonyms will work as expected.

On Sun, Jun 10, 2012 at 11:03 PM, John Berryman jfberry...@gmail.com wrote: According to https://issues.apache.org/jira/browse/LUCENE-2605, the Lucene QueryParser tokenizes on whitespace before giving any text to the Analyzer. This makes it impossible to use multi-term synonyms because the SynonymFilter only receives one word at a time. Resolution to this would really help with my current project. My project client sells clothing and accessories online. They have plenty of examples of compound words, e.g. rain coat. But some of these compound words are really tripping them up. A prime example is that a search for dress shoes returns a list of dresses and random shoes (not necessarily dress shoes). I wish that I was able to synonym compound words to single tokens (e.g. dress shoes = dress_shoes), but with this whitespace tokenization issue, it's impossible. Has anything happened with this bug recently? For a short time I've got a client that would be willing to pay for this issue to be fixed if it's not too much of a rabbit hole. Anyone care to catch me up with what this might entail?
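The ordering problem in this thread can be modeled with a toy stand-in analyzer. Everything below is illustrative (it is not the real QueryParser or SynonymFilter); it only shows why splitting on whitespace before analysis defeats a multi-word synonym rule:

```java
import java.util.*;

// Toy model: the parser splits "+foo bar -baz" on whitespace first, then
// analyzes each piece, so a multi-word synonym like
// "dress shoes" -> "dress_shoes" never sees both words together.
public class WhitespaceSplitDemo {
    static final Map<String, String> SYNONYMS = Map.of("dress shoes", "dress_shoes");

    // Stand-in "analyzer": applies the multi-word synonym map to its input.
    static String analyze(String text) {
        return SYNONYMS.getOrDefault(text, text);
    }

    // Current behavior: whitespace split happens before analysis.
    static List<String> splitThenAnalyze(String query) {
        List<String> out = new ArrayList<>();
        for (String clause : query.split("\\s+")) out.add(analyze(clause));
        return out;
    }

    public static void main(String[] args) {
        // The synonym never fires: "dress" and "shoes" are analyzed separately.
        System.out.println(splitThenAnalyze("dress shoes")); // [dress, shoes]
        // Analyzing before splitting (what the thread proposes) would fire it.
        System.out.println(analyze("dress shoes")); // dress_shoes
    }
}
```

This is also why the escaping workaround helps: `rain\ coat` keeps the two words in one clause, so the analyzer sees them together.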
[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292784#comment-13292784 ] Robert Muir commented on LUCENE-4128: - +1, thanks!
[jira] [Created] (LUCENE-4132) IndexWriterConfig live settings
Shai Erera created LUCENE-4132: -- Summary: IndexWriterConfig live settings Key: LUCENE-4132 URL: https://issues.apache.org/jira/browse/LUCENE-4132 Project: Lucene - Java Issue Type: Improvement Reporter: Shai Erera Priority: Minor Fix For: 4.0, 5.0

A while ago there was a discussion about making some IW settings live, and I remember that RAM buffer size was one of them. Judging from IW code, I see that the RAM buffer can be changed live, as IW never caches it. However, I don't remember which other settings were decided to be live, and I don't see any documentation in IW nor IWC for that. IW.getConfig mentions:
{code}
 * <b>NOTE:</b> some settings may be changed on the
 * returned {@link IndexWriterConfig}, and will take
 * effect in the current IndexWriter instance. See the
 * javadocs for the specific setters in {@link
 * IndexWriterConfig} for details.
{code}
But there's no text on e.g. IWC.setRAMBuffer mentioning that. I think it'd be good if we made it easier for users to tell which of the settings are live ones. There are a few possible ways to do it:
* Introduce a custom @live.setting tag on the relevant IWC.set methods, and add special text for them in build.xml
** Or, drop the tag and just document it clearly.
* Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name proposals are welcome!), have IWC implement both, and introduce another IW.getLiveConfig which will return that interface, thereby clearly letting the user know which of the settings are live.
It'd be good if IWC itself could only expose setXYZ methods for the live settings, though. So perhaps, off the top of my head, we can do something like this:
* Introduce a Config object, which is essentially what IWC is today, and pass it to IW.
* IW will create a different object, IWC, from that Config, and IW.getConfig will return IWC.
* IWC itself will only have setXYZ methods for the live settings.
It adds another object, but user code doesn't change - it still creates a Config object when initializing IW, and only needs to handle a different type if it ever calls IW.getConfig. Maybe that's not such a bad idea?
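The interface split proposed in this mail might look roughly like the sketch below. All names here are hypothetical, derived only from the proposal itself; the idea is that the type returned by getConfig() exposes setters only for the settings an open writer actually re-reads:

```java
// Sketch of the LiveConfig proposal: callers holding an open writer can
// only reach the "live" setters, so the type system documents which
// changes take effect. Hypothetical names; not Lucene's real API.
public class LiveConfigDemo {
    // Live settings: may be changed on an open writer.
    interface LiveConfig {
        LiveConfig setRAMBufferSizeMB(double mb);
        double getRAMBufferSizeMB();
    }

    // Full config: live settings plus one-time settings fixed at construction.
    static class Config implements LiveConfig {
        private double ramBufferMB = 16.0;
        final int exampleOneTimeSetting; // fixed once the writer exists
        Config(int exampleOneTimeSetting) { this.exampleOneTimeSetting = exampleOneTimeSetting; }
        public LiveConfig setRAMBufferSizeMB(double mb) { ramBufferMB = mb; return this; }
        public double getRAMBufferSizeMB() { return ramBufferMB; }
    }

    // Stand-in writer: getConfig() returns only the live view.
    static class Writer {
        private final Config config;
        Writer(Config config) { this.config = config; }
        LiveConfig getConfig() { return config; }
    }

    public static void main(String[] args) {
        Writer w = new Writer(new Config(8));
        w.getConfig().setRAMBufferSizeMB(64.0); // compiles: a live setting
        // One-time settings are simply not visible through LiveConfig.
        System.out.println(w.getConfig().getRAMBufferSizeMB()); // 64.0
    }
}
```

User code still builds a Config up front; only code that calls getConfig() afterwards sees the narrower live view, which is exactly the "different type" trade-off the mail describes.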
[JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 41 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/41/

1 tests failed.

REGRESSION: org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<494> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<494> but was:<0>
	at __randomizedtesting.SeedInfo.seed([5CB24A4021D11B91:D4E6759A8F2D7669]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716)
	at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
	at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log:
[...truncated 17470 lines...]
[junit4]   2> 63741 T2697 C164 REQ [collection1] webapp=/solr path=/replication params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9_Lucene40_0.tim} status=0 QTime=0
[junit4]   2> 63746 T2697 C164
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292786#comment-13292786 ]

Robert Muir commented on LUCENE-4132:

I don't think we should add another Config object, making things complicated for such a very very expert use case. Even ordinary users need to use IWC, and 99% of them don't care about changing things live. I'm also nervous about documenting which things can/cannot be changed live unless there are unit tests for each one. If we want to refactor IndexWriter in some way that really cleans it up, but makes something un-live, then I think that's totally fair game and we should be able to do it, but the docs shouldn't be wrong.

IndexWriterConfig live settings
-------------------------------
Key: LUCENE-4132
URL: https://issues.apache.org/jira/browse/LUCENE-4132
Project: Lucene - Java
Issue Type: Improvement
Reporter: Shai Erera
Priority: Minor
Fix For: 4.0, 5.0

A while ago there was a discussion about making some IW settings live, and I remember that RAM buffer size was one of them. Judging from the IW code, I see that the RAM buffer can be changed live, as IW never caches it. However, I don't remember which other settings were decided to be live, and I don't see any documentation in IW or IWC for that. IW.getConfig mentions:

{code}
 * <b>NOTE:</b> some settings may be changed on the
 * returned {@link IndexWriterConfig}, and will take
 * effect in the current IndexWriter instance. See the
 * javadocs for the specific setters in {@link
 * IndexWriterConfig} for details.
{code}

But there's no such text on e.g. IWC.setRAMBuffer. I think it would be good if we made it easier for users to tell which of the settings are live. There are a few possible ways to do it:
* Introduce a custom @live.setting tag on the relevant IWC.set methods, and add special text for them in build.xml
** Or, drop the tag and just document it clearly.
* Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name proposals are welcome!), have IWC implement both, and introduce another IW.getLiveConfig which will return that interface, thereby clearly letting the user know which of the settings are live.

It'd be good if IWC itself could only expose setXYZ methods for the live settings, though. So perhaps, off the top of my head, we can do something like this:
* Introduce a Config object, which is essentially what IWC is today, and pass it to IW.
* IW will create a different object, IWC, from that Config, and IW.getConfig will return IWC.
* IWC itself will only have setXYZ methods for the live settings.

It adds another object, but user code doesn't change - it still creates a Config object when initializing IW, and only needs to handle a different type if it ever calls IW.getConfig. Maybe that's not such a bad idea?

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
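The interface split proposed above can be sketched roughly as follows. This is only an illustration of the idea, not Lucene's actual API: the names (`LiveConfig`, `IndexingConfig`, `DemoWriter`) and the choice of RAM buffer size as the sole live setting are made up for the example.

```java
// A narrow LiveConfig interface exposes only the setters that take effect on
// a running writer; the full config type keeps everything else one-time.
interface LiveConfig {
    double getRAMBufferSizeMB();
    LiveConfig setRAMBufferSizeMB(double mb); // live: the writer re-reads it
}

class IndexingConfig implements LiveConfig {
    private double ramBufferSizeMB = 16.0;
    private final String codecName; // one-time: fixed once the writer exists

    IndexingConfig(String codecName) { this.codecName = codecName; }

    public double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    public LiveConfig setRAMBufferSizeMB(double mb) { ramBufferSizeMB = mb; return this; }
    String getCodecName() { return codecName; }
}

class DemoWriter {
    private final IndexingConfig config;

    DemoWriter(IndexingConfig config) { this.config = config; }

    // Callers get only the narrow live view; the one-time settings are
    // simply not reachable through this type, so they cannot be misused.
    LiveConfig getLiveConfig() { return config; }

    boolean shouldFlush(double usedMB) {
        // The writer never caches the live value; it reads it on each check,
        // which is exactly what makes the setting "live".
        return usedMB >= config.getRAMBufferSizeMB();
    }
}

public class LiveConfigSketch {
    public static void main(String[] args) {
        DemoWriter writer = new DemoWriter(new IndexingConfig("Lucene40"));
        System.out.println(writer.shouldFlush(32.0)); // true: default buffer is 16 MB

        writer.getLiveConfig().setRAMBufferSizeMB(64.0); // changed live, no new writer
        System.out.println(writer.shouldFlush(32.0)); // false: threshold is now 64 MB
    }
}
```

The key property is that the compiler, not the javadoc, tells the user which settings are live: a setter that does not appear on `LiveConfig` simply cannot be called on a running writer.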
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292788#comment-13292788 ]

Shai Erera commented on LUCENE-4132:

{quote}
I don't think we should add another Config object, making things complicated for such a very very expert use case. Even ordinary users need to use IWC, and 99% of them don't care about changing things live.
{quote}

I'm not proposing to complicate matters for 99.9% of the users. On the contrary -- users will still do:

{code}
IndexWriterConfig config = new IndexWriterConfig(...);
// configure it
IndexWriter writer = new IndexWriter(dir, config);
{code}

Only the expert users who want to change some settings live will do:

{code}
Config conf = writer.getConfig(); // NOTE: it's a different type
conf.setSomething();
{code}

Config can be an IW-internal type, and most users won't even be aware of it. Today we document that the IWC given to the IW ctor is cloned, and it will remain so. Only instead of being cloned to an IWC type, it will be cloned to a Config (or LiveConfig) type. The IWC documentation isn't changed, IW.getConfig changes by removing that NOTE, and if you care about configuring IW live, you can do so through LiveConfig. And we can test that type too!
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292789#comment-13292789 ]

Robert Muir commented on LUCENE-4132:

Right, but I suppose changing live settings isn't necessarily the only use case for writer.getConfig()? Today someone can take the config off of there and set it on another writer (it will be privately cloned). So I think if we want to do it this way, we could just keep getConfig as is, and add getLiveConfig which actually returns the same object, just cast through that interface.
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292792#comment-13292792 ]

Robert Muir commented on LUCENE-4132:

OK, actually I was partially wrong: one can no longer actually use the IWC from a writer, since it's marked as owned. But they can still grab it and look at stuff like getIndexDeletionPolicy, even though that's not live. I guess to be less confusing we should add getLiveConfig and just remove getConfig completely?
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292793#comment-13292793 ]

Shai Erera commented on LUCENE-4132:

bq. Today someone can take the config off of there and set it on another writer (it will be privately cloned)

True, but I'm not aware of such use, and still someone can cache the IWC himself and pass it to multiple writers? If getConfig() returns an IWC which has setters, that will confuse the user for sure, because those settings won't take effect. I prefer that getConfig return the new LiveConfig type, with a few setters and all the getters (i.e. all getXYZ from IWC), and let whoever wants to pass the same IWC instance to other writers handle it himself. Alternatively, we can add another ctor which takes a LiveConfig object, the one returned from getConfig(), but I prefer to avoid that until someone actually tells us that he shares the same IWC with other writers and cannot cache it himself.
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292794#comment-13292794 ]

Robert Muir commented on LUCENE-4132:

Sorry, instead of nuking getConfig, make it pkg-private. Things like RandomIndexWriter want to peek into some un-live settings (like codec); I think we should still be able to look at these things for tests :)
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292795#comment-13292795 ]

Shai Erera commented on LUCENE-4132:

bq. I guess to be less confusing we should add getLiveConfig and just remove getConfig completely?

Yes, that's the proposal - either getConfig or getLiveConfig, but return a LiveConfig object with all the getters of IWC, and only the setters that we want to support.
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292797#comment-13292797 ]

Robert Muir commented on LUCENE-4132:

{quote}
True, but I'm not aware of such use, and still someone can cache the IWC himself and pass it to multiple writers?
{quote}

I'm just talking about the general issue that IW.getConfig is not only used to change settings live. Today our tests use this to peek at the settings on the IW (see my RandomIndexWriter example)...
[jira] [Updated] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4120:

Attachment: LUCENE-4120.patch

Patch. I don't fully understand how FST packing works, so I would appreciate it if someone familiar with it could review this patch.

FST should use packed integer arrays
------------------------------------
Key: LUCENE-4120
URL: https://issues.apache.org/jira/browse/LUCENE-4120
Project: Lucene - Java
Issue Type: Improvement
Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.0
Attachments: LUCENE-4120.patch

There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of:
* FST.nodeAddress (GrowableWriter)
* FST.inCounts (GrowableWriter)
* FST.nodeRefToAddress (read-only Reader)

The serialization/deserialization methods should be modified too, in order to take advantage of PackedInts.get{Reader,Writer}.
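For context, the core idea behind a packed integer array is to store n values of b bits each back-to-back in a long[], instead of spending 32 or 64 bits per value. The following is a minimal fixed-bit-width sketch of that technique, not Lucene's actual PackedInts implementation; the class name and API are made up, and it assumes 1 <= bitsPerValue <= 63.

```java
// Minimal fixed-width packed integer array: values of bitsPerValue bits each,
// stored contiguously in a long[]. A value may straddle two adjacent longs.
public class PackedArraySketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    public PackedArraySketch(int size, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;             // assumed 1..63
        this.mask = (1L << bitsPerValue) - 1;
        this.blocks = new long[(size * bitsPerValue + 63) / 64];
    }

    public void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);             // which long
        int shift = (int) (bitPos & 63);              // offset inside it
        blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
        if (shift + bitsPerValue > 64) {              // spills into the next long
            int spill = shift + bitsPerValue - 64;
            long hi = (value & mask) >>> (bitsPerValue - spill);
            long hiMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~hiMask) | hi;
        }
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {              // pick up the spilled high bits
            value |= blocks[block + 1] << (64 - shift);
        }
        return value & mask;
    }

    public static void main(String[] args) {
        // 1000 node addresses that fit in 20 bits: ~2.5 KB of longs
        // instead of 8 KB for a plain long[1000].
        PackedArraySketch arr = new PackedArraySketch(1000, 20);
        arr.set(0, 123456);
        arr.set(999, 7);
        System.out.println(arr.get(0));   // 123456
        System.out.println(arr.get(999)); // 7
    }
}
```

This is why the issue targets FST.nodeAddress and friends: node addresses rarely need the full width of an int or long, so packing them to their actual bit width saves a large, roughly proportional amount of heap.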
Re: VOTE: Lucene/Solr 4.0-ALPHA
On Jun 10, 2012, at 11:22 AM, Jack Krupansky wrote:

> I reviewed the Solr 4.0 wiki, which sounds as if the intent is for a single alpha and a single beta

If someone is willing to assemble a release and the release can get the release votes, there is no reason we can't have multiple alphas or betas. Most things put on the wiki are guidelines or hopes more than anything - nothing is really set in stone. It's all subject to change given who expends what effort and what circumstances accumulate.

Bottom line, anyone can be an RM, anyone can build an alpha, beta, release candidate, etc. You just need to get three PMC members to vote for your release. Given that, it does not make a lot of sense to put too much into intent or plans, IMO. If circumstances warrant it, and someone is willing to make the releases, I'm sure we will do whatever makes the most sense given the feedback we get from the first alpha. Maybe we have one alpha and multiple betas. Maybe we have one alpha and decide to release. I think it makes sense to plan (hope?) minimally - that is, one alpha, one beta sounds reasonable in terms of a bunch of cats stating intent - and let further work arise from the release response.

- Mark Miller
lucidimagination.com
Re: LBHttpSolrServer doc needs a little improvement
On Jun 11, 2012, at 7:28 AM, Jack Krupansky wrote:

> The wiki for LBHttpSolrServer is a little out of date. It says the feature is “experimental” and “currently being developed” even though SOLR-844 is closed. There are a couple of “LB!HttpSolrServer” links that point to nonexistent pages. The class javadoc has half but not all of the doc from the wiki. The simplest solution may be to move the rest of the wiki doc into the javadoc. I’m not sure what should be done with the wiki, though. How can a wiki link to javadoc when the link depends on the Solr release? Or, maybe just make the wiki and javadoc be the same. And the SolrJ wiki makes no mention of LBHttpSolrServer.
>
> -- Jack Krupansky

Feel free to jump in and make improvements - anyone can edit the wiki, and there are many instances of out-of-date information, or holes in information.

- Mark
[JENKINS] Lucene-Solr-trunk-Windows-Java6-64 - Build # 505 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/505/ 1 tests failed. REGRESSION: org.apache.solr.handler.TestReplicationHandler.test Error Message: expected:498 but was:0 Stack Trace: java.lang.AssertionError: expected:498 but was:0 at __randomizedtesting.SeedInfo.seed([EC0818932746A69C:645C274989BACB64]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391) at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 12840 lines...] [junit4] 2 45148 T1174 C20 REQ [collection1] webapp=/solr path=/replication params={command=filecontentchecksum=truegeneration=7wt=filestreamfile=_3.fdx} status=0 QTime=0 [junit4] 2 45148 T1182 C21 REQ [collection1]
[jira] [Created] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments
Sebastian Lutze created LUCENE-4133: --- Summary: FastVectorHighlighter: A weighted approach for ordered fragments Key: LUCENE-4133 URL: https://issues.apache.org/jira/browse/LUCENE-4133 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.0, 5.0 Reporter: Sebastian Lutze Priority: Minor Fix For: 4.0 Attachments: LUCENE-4133.patch The FastVectorHighlighter currently disregards IDF-weights for matching terms within generated fragments. In the worst case, a fragment which contains a high number of very common words is scored higher than a fragment that contains *all* of the terms which have been used in the original query. This patch provides ordered fragments with IDF-weighted terms: *For each distinct matching term per fragment:* _weight = weight + IDF * boost_ *For each fragment:* _weight = weight * numTerms * 1 / sqrt( numTerms )_ |weight| total weight of fragment |IDF| inverse document frequency for each distinct matching term |boost| query boost as provided, for example _term^2_ |numTerms| total number of matching terms per fragment *Method:* {code:java} public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) { float totalBoost = 0; List<SubInfo> subInfos = new ArrayList<SubInfo>(); HashSet<String> distinctTerms = new HashSet<String>(); int length = 0; for( WeightedPhraseInfo phraseInfo : phraseInfoList ){ subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) ); for ( TermInfo ti : phraseInfo.getTermsInfos()) { if ( distinctTerms.add( ti.getText() ) ) totalBoost += ti.getWeight() * phraseInfo.getBoost(); length++; } } totalBoost *= length * ( 1 / Math.sqrt( length ) ); getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) ); } {code} The ranking formula should be the same, or at least similar, to the one used in QueryTermScorer.
*This patch contains:* * a changed class-member in FieldPhraseList (termInfos to termsInfos) * a changed local variable in SimpleFieldFragList (score to totalBoost) * a missing @Override added in SimpleFragListBuilder * class WeightedFieldFragList, an implementation of FieldFragList * class WeightedFragListBuilder, an implementation of BaseFragListBuilder * class WeightedFragListBuilderTest, a simple test case * updated docs for FVH Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
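The scoring done in the add() method above can be reproduced in a small standalone sketch (class and method names here are hypothetical; the real patch operates on WeightedPhraseInfo/TermInfo objects):

```java
import java.util.HashSet;
import java.util.Set;

// Standalone sketch of the proposed fragment scoring: each *distinct*
// matching term contributes IDF * boost, and the sum is then multiplied
// by length * (1 / sqrt(length)) = sqrt(length), where length counts
// all (non-distinct) matches in the fragment.
public class FragmentScore {
    public static float score(String[] terms, float[] idfWeights, float boost) {
        Set<String> distinct = new HashSet<String>();
        float totalBoost = 0;
        int length = 0;
        for (int i = 0; i < terms.length; i++) {
            if (distinct.add(terms[i])) {
                totalBoost += idfWeights[i] * boost;
            }
            length++;
        }
        totalBoost *= length * (1 / Math.sqrt(length));
        return totalBoost;
    }

    public static void main(String[] args) {
        // Three matches, two distinct terms: score = (2 + 3) * sqrt(3)
        float s = score(new String[] {"foo", "bar", "foo"},
                        new float[] {2.0f, 3.0f, 2.0f}, 1.0f);
        System.out.println(s);
    }
}
```

The sqrt(length) factor rewards fragments with more matches, but sublinearly, so a pile of repeated common terms cannot outrank a fragment covering all distinct query terms.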
[jira] [Updated] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Lutze updated LUCENE-4133: Attachment: LUCENE-4133.patch FastVectorHighlighter: A weighted approach for ordered fragments Key: LUCENE-4133 URL: https://issues.apache.org/jira/browse/LUCENE-4133 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.0, 5.0 Reporter: Sebastian Lutze Priority: Minor Labels: FastVectorHighlighter Fix For: 4.0 Attachments: LUCENE-4133.patch The FastVectorHighlighter currently disregards IDF-weights for matching terms within generated fragments. In the worst case, a fragment which contains a high number of very common words is scored higher than a fragment that contains *all* of the terms which have been used in the original query. This patch provides ordered fragments with IDF-weighted terms: *For each distinct matching term per fragment:* _weight = weight + IDF * boost_ *For each fragment:* _weight = weight * numTerms * 1 / sqrt( numTerms )_ |weight| total weight of fragment |IDF| inverse document frequency for each distinct matching term |boost| query boost as provided, for example _term^2_ |numTerms| total number of matching terms per fragment *Method:* {code:java} public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) { float totalBoost = 0; List<SubInfo> subInfos = new ArrayList<SubInfo>(); HashSet<String> distinctTerms = new HashSet<String>(); int length = 0; for( WeightedPhraseInfo phraseInfo : phraseInfoList ){ subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) ); for ( TermInfo ti : phraseInfo.getTermsInfos()) { if ( distinctTerms.add( ti.getText() ) ) totalBoost += ti.getWeight() * phraseInfo.getBoost(); length++; } } totalBoost *= length * ( 1 / Math.sqrt( length ) ); getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) ); } {code} The ranking formula should be the same, or at least similar, to the one used in QueryTermScorer. *This patch contains:* * a changed class-member in FieldPhraseList (termInfos to termsInfos) * a changed local variable in SimpleFieldFragList (score to totalBoost) * a missing @Override added in SimpleFragListBuilder * class WeightedFieldFragList, an implementation of FieldFragList * class WeightedFragListBuilder, an implementation of BaseFragListBuilder * class WeightedFragListBuilderTest, a simple test case * updated docs for FVH Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292822#comment-13292822 ] Sebastian Lutze commented on LUCENE-3440: - Hi Koji, bq. Is the next the last one? Almost. :) The next thing would be Solr integration. So, I just realized: trunk is not trunk anymore! This one is for branch_4x: https://issues.apache.org/jira/browse/LUCENE-4133 Tests are fine. FastVectorHighlighter: IDF-weighted terms for ordered fragments Key: LUCENE-3440 URL: https://issues.apache.org/jira/browse/LUCENE-3440 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Reporter: Sebastian Lutze Priority: Minor Labels: FastVectorHighlighter Fix For: 4.0 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, weight-vs-boost_table01.html, weight-vs-boost_table02.html The FastVectorHighlighter gives every term found in a fragment an equal weight, which causes a higher ranking for fragments with a high number of words or, in the worst case, a high number of very common words, than for fragments that contain *all* of the terms used in the original query. This patch provides ordered fragments with IDF-weighted terms: total weight = total weight + IDF for unique term per fragment * boost of query; The ranking formula should be the same, or at least similar, to the one used in org.apache.lucene.search.highlight.QueryTermScorer. The patch is simple, but it works for us. Some ideas: - A better approach would be moving the whole fragment scoring into a separate class. - Switch scoring via parameter - Exact phrases should be given an even better score, regardless of whether a phrase query was executed or not - edismax/dismax parameters pf, ps and pf^boost should be observed and corresponding fragments should be ranked higher -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
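The ranking problem described in the issue above can be made concrete with a toy comparison (all weights invented for illustration; this is not the patch's code):

```java
// Toy comparison of equal-weight vs. IDF-weight fragment scoring.
public class HighlightWeighting {
    // Old behaviour (sketch): every match contributes an equal weight of 1.
    public static float equalWeight(int numMatches) {
        return numMatches;
    }

    // Proposed behaviour (sketch): each distinct matching term contributes its IDF.
    public static float idfWeight(float[] idfOfDistinctTerms) {
        float total = 0;
        for (float idf : idfOfDistinctTerms) total += idf;
        return total;
    }

    public static void main(String[] args) {
        // Fragment A: five hits on one very common term (low IDF).
        // Fragment B: two hits, one per rare query term (high IDF).
        float a = equalWeight(5), b = equalWeight(2);
        float aIdf = idfWeight(new float[] {0.2f});
        float bIdf = idfWeight(new float[] {3.0f, 4.0f});
        System.out.println(a > b);       // true: equal weighting ranks A first
        System.out.println(bIdf > aIdf); // true: IDF weighting ranks B first
    }
}
```

This is exactly the inversion the issue complains about: under equal weighting the common-word fragment wins; under IDF weighting the fragment covering the rare query terms wins.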
[jira] [Resolved] (LUCENE-2748) Convert all Lucene web properties to use the ASF CMS
[ https://issues.apache.org/jira/browse/LUCENE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-2748. - Resolution: Fixed Convert all Lucene web properties to use the ASF CMS Key: LUCENE-2748 URL: https://issues.apache.org/jira/browse/LUCENE-2748 Project: Lucene - Java Issue Type: Bug Reporter: Grant Ingersoll Assignee: Grant Ingersoll Attachments: modify_ui.diff The new CMS has a lot of nice features (and some kinks to still work out) and Forrest just doesn't cut it anymore, so we should move to the ASF CMS: http://apache.org/dev/cms.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 293 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/293/ 1 tests failed. REGRESSION: org.apache.solr.handler.TestReplicationHandler.test Error Message: expected:498 but was:0 Stack Trace: java.lang.AssertionError: expected:498 but was:0 at __randomizedtesting.SeedInfo.seed([B511A30BDD1F4E06:3D459CD173E323FE]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391) at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 13675 lines...] [junit4] 2 35971 T582 C20 REQ [collection1] webapp=/solr path=/replication params={command=filelistwt=javabingeneration=7} status=0 QTime=1 [junit4] 2 35972 T599 oash.SnapPuller.fetchLatestIndex Number of files in
[jira] [Commented] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292842#comment-13292842 ] Uwe Schindler commented on LUCENE-4129: --- I am fine with the patch. I would like to fix the CFS issues, too. But we already have an issue. add CodecHeader to .frq and .prx Key: LUCENE-4129 URL: https://issues.apache.org/jira/browse/LUCENE-4129 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch We did this for all other files, but not .frq/.prx. Currently the postings writer only records itself in the blocktree terms dictionary, which is fine, but that's really documenting the .tim itself, that it is Blocktree with Lucene40Postings metadata. I think we should put headers in .frq/.prx as well: e.g. they could detect file jumbling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292846#comment-13292846 ] Robert Muir commented on LUCENE-4129: - I will look into the CFS stuff too after this one! add CodecHeader to .frq and .prx Key: LUCENE-4129 URL: https://issues.apache.org/jira/browse/LUCENE-4129 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch We did this for all other files, but not .frq/.prx. Currently the postings writer only records itself in the blocktree terms dictionary, which is fine, but that's really documenting the .tim itself, that it is Blocktree with Lucene40Postings metadata. I think we should put headers in .frq/.prx as well: e.g. they could detect file jumbling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
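The point of a codec header — letting a file identify its own format so that jumbled or mis-renamed files fail fast at open time instead of producing garbage reads — can be sketched generically. This mirrors the spirit of Lucene's CodecUtil.writeHeader/checkHeader, but the wire format below is invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Generic sketch of a codec header: a magic number, a codec name and a
// version written at the start of a file, validated on open.
public class CodecHeader {
    static final int MAGIC = 0x1234C0DE; // arbitrary magic for this sketch

    public static byte[] writeHeader(String codec, int version, byte[] payload)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(MAGIC);
        out.writeUTF(codec);
        out.writeInt(version);
        out.write(payload);
        out.flush();
        return bytes.toByteArray();
    }

    // Returns the header's version, or throws if the file is not what we
    // expect (e.g. two files of different types were jumbled/renamed).
    public static int checkHeader(byte[] file, String expectedCodec)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        if (in.readInt() != MAGIC) {
            throw new IOException("not a codec file (bad magic)");
        }
        String codec = in.readUTF();
        if (!codec.equals(expectedCodec)) {
            throw new IOException("expected codec " + expectedCodec
                    + " but got " + codec);
        }
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        byte[] frq = writeHeader("SketchPostingsFrq", 0, new byte[] {1, 2, 3});
        System.out.println(checkHeader(frq, "SketchPostingsFrq")); // prints 0
    }
}
```

With such a header on .frq and .prx, opening a .prx where a .frq was expected fails with a clear codec-mismatch error rather than being misparsed.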
Re: LBHttpSolrServer doc needs a little improvement
Thanks. I wasn't sure what the policy was for wiki updates by non-committers. I updated the wiki as I specified, including the addition of a reference on the SolrJ wiki, and fixed a bunch of typos and a couple of errors in the example code as well. -- Jack Krupansky -Original Message- From: Mark Miller Sent: Monday, June 11, 2012 9:48 AM To: dev@lucene.apache.org Subject: Re: LBHttpSolrServer doc needs a little improvement On Jun 11, 2012, at 7:28 AM, Jack Krupansky wrote: The wiki for LBHttpSolrServer is a little out of date. It says the feature is “experimental” and “currently being developed” even though SOLR-844 is closed. There are a couple of “LB!HttpSolrServer” links that point to nonexistent pages. The class javadoc has half but not all of the doc from the wiki. The simplest solution may be to move the rest of the wiki doc into the javadoc. I’m not sure what should be done with the wiki though. How can a wiki link to javadoc when the link depends on the Solr release? Or, maybe just make the wiki and javadoc be the same. And the SolrJ wiki makes no mention of LBHttpSolrServer. -- Jack Krupansky Feel free to jump in and make improvements - anyone can edit the wiki, and there are many instances of out-of-date information, or holes in information. - Mark - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Lutze updated LUCENE-4133: Description: The FastVectorHighlighter currently disregards IDF-weights for matching terms within generated fragments. In the worst case, a fragment which contains a high number of very common words is scored higher than a fragment that contains *all* of the terms which have been used in the original query. This patch provides ordered fragments with IDF-weighted terms: *For each distinct matching term per fragment:* _weight = weight + IDF * boost_ *For each fragment:* _weight = weight * length * 1 / sqrt( length )_ |weight| total weight of fragment |IDF| inverse document frequency for each distinct matching term |boost| query boost as provided, for example _term^2_ |length| total number of non-distinct matching terms per fragment *Method:* {code:java} public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) { float totalBoost = 0; List<SubInfo> subInfos = new ArrayList<SubInfo>(); HashSet<String> distinctTerms = new HashSet<String>(); int length = 0; for( WeightedPhraseInfo phraseInfo : phraseInfoList ){ subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) ); for ( TermInfo ti : phraseInfo.getTermsInfos()) { if ( distinctTerms.add( ti.getText() ) ) totalBoost += ti.getWeight() * phraseInfo.getBoost(); length++; } } totalBoost *= length * ( 1 / Math.sqrt( length ) ); getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) ); } {code} The ranking formula should be the same, or at least similar, to the one used in QueryTermScorer. *This patch contains:* * a changed class-member in FieldPhraseList (termInfos to termsInfos) * a changed local variable in SimpleFieldFragList (score to totalBoost) * a missing @Override added in SimpleFragListBuilder * class WeightedFieldFragList, an implementation of FieldFragList * class WeightedFragListBuilder, an implementation of BaseFragListBuilder * class WeightedFragListBuilderTest, a simple test case * updated docs for FVH Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. was: The FastVectorHighlighter currently disregards IDF-weights for matching terms within generated fragments. In the worst case, a fragment which contains a high number of very common words is scored higher than a fragment that contains *all* of the terms which have been used in the original query. This patch provides ordered fragments with IDF-weighted terms: *For each distinct matching term per fragment:* _weight = weight + IDF * boost_ *For each fragment:* _weight = weight * numTerms * 1 / sqrt( numTerms )_ |weight| total weight of fragment |IDF| inverse document frequency for each distinct matching term |boost| query boost as provided, for example _term^2_ |numTerms| total number of matching terms per fragment *Method:* {code:java} public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) { float totalBoost = 0; List<SubInfo> subInfos = new ArrayList<SubInfo>(); HashSet<String> distinctTerms = new HashSet<String>(); int length = 0; for( WeightedPhraseInfo phraseInfo : phraseInfoList ){ subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) ); for ( TermInfo ti : phraseInfo.getTermsInfos()) { if ( distinctTerms.add( ti.getText() ) ) totalBoost += ti.getWeight() * phraseInfo.getBoost(); length++; } } totalBoost *= length * ( 1 / Math.sqrt( length ) ); getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) ); } {code} The ranking formula should be the same, or at least similar, to the one used in QueryTermScorer. *This patch contains:* * a changed class-member in FieldPhraseList (termInfos to termsInfos) * a changed local variable in SimpleFieldFragList (score to totalBoost) * a missing @Override added in SimpleFragListBuilder * class WeightedFieldFragList, an implementation of FieldFragList * class WeightedFragListBuilder, an implementation of BaseFragListBuilder * class WeightedFragListBuilderTest, a simple test case * updated docs for FVH Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. FastVectorHighlighter: A weighted approach for ordered fragments Key: LUCENE-4133 URL: https://issues.apache.org/jira/browse/LUCENE-4133 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.0, 5.0
[jira] [Resolved] (LUCENE-4129) add CodecHeader to .frq and .prx
[ https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4129. - Resolution: Fixed Fix Version/s: 5.0 4.0 add CodecHeader to .frq and .prx Key: LUCENE-4129 URL: https://issues.apache.org/jira/browse/LUCENE-4129 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.0, 5.0 Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch We did this for all other files, but not .frq/.prx. Currently the postings writer only records itself in the blocktree terms dictionary, which is fine, but that's really documenting the .tim itself, that it is Blocktree with Lucene40Postings metadata. I think we should put headers in .frq/.prx as well: e.g. they could detect file jumbling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4131) .cfs/.cfe should have a codecheader
[ https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4131: Attachment: LUCENE-4131_cfe.patch Trivial patch for .cfe. Looking at the .cfs now. .cfs/.cfe should have a codecheader --- Key: LUCENE-4131 URL: https://issues.apache.org/jira/browse/LUCENE-4131 Project: Lucene - Java Issue Type: Improvement Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-4131_cfe.patch The new .cfs is more tricky, but I still think we can do it. We should definitely fix this for .cfe. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292885#comment-13292885 ] Dawid Weiss commented on LUCENE-4120: - I looked at the patch and it looks good to me, but I didn't really analyze it in depth. As for FST packing, the idea is fairly simple -- you reduce the overall size of the FST by moving states which have lots of incoming arcs to offsets which compress well (in vcoding). At least I think that's what Mike implemented (Mike is an unpredictable genius :) ). This presentation has some details: http://ciaa-fsmnlp-2011.univ-tours.fr/ciaa/upload/files/Weiss-Daciuk.pdf FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0 Attachments: LUCENE-4120.patch There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
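The space saving behind PackedInts/GrowableWriter — storing n-bit values back-to-back in a long[] instead of one full int per value — can be sketched minimally (a hypothetical toy, not Lucene's implementation; assumes bitsPerValue is less than 64 and sizes are small enough that the bit count fits in an int):

```java
// Minimal bit-packed integer array: values of bitsPerValue bits each are
// stored contiguously in a long[], straddling long boundaries as needed.
public class PackedArray {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    public PackedArray(int size, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        this.blocks = new long[(size * bitsPerValue + 63) / 64];
    }

    public void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift))
                | ((value & mask) << shift);
        // The value may straddle two longs; write the spilled high bits.
        if (shift + bitsPerValue > 64) {
            int spill = shift + bitsPerValue - 64;
            long highMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~highMask)
                    | ((value & mask) >>> (bitsPerValue - spill));
        }
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            value |= blocks[block + 1] << (64 - shift);
        }
        return value & mask;
    }
}
```

With 7 bits per value, 100 values fit in 11 longs (88 bytes) versus 400 bytes for an int[100] — the kind of saving the issue is after for FST.nodeAddress and FST.inCounts.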
[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken
[ https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4130: Attachment: LUCENE-4130.patch The problem is the ad-hoc substring'ing done in listAll: it doesn't work with norms/dv because they use CFS filenames with segment suffixes. Instead of this substring, I added an IndexFileNames.parseSegmentName that is just like stripSegmentName, except it returns the other part. CompoundFileDirectory.listAll is broken --- Key: LUCENE-4130 URL: https://issues.apache.org/jira/browse/LUCENE-4130 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-4130.patch, LUCENE-4130_test.patch The files returned by listAll are not actually the files in the CFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
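The segment-name split behind that fix can be illustrated with a simplified standalone version (hypothetical code; Lucene's real IndexFileNames handles more filename shapes than this sketch does):

```java
// Sketch of splitting a Lucene-style per-segment filename such as
// "_3.frq" or "_3_nrm.cfs" into its segment name ("_3") and the
// remainder (".frq" / "_nrm.cfs"). The remainder is what distinguishes
// suffixed files (e.g. norms/docvalues inside a CFS).
public class SegmentFileNames {
    // Everything up to (not including) the first '.' or the second '_'.
    public static String parseSegmentName(String filename) {
        int idx = indexOfSegmentNameEnd(filename);
        return idx == -1 ? filename : filename.substring(0, idx);
    }

    // The rest of the filename after the segment name.
    public static String stripSegmentName(String filename) {
        int idx = indexOfSegmentNameEnd(filename);
        return idx == -1 ? "" : filename.substring(idx);
    }

    private static int indexOfSegmentNameEnd(String filename) {
        for (int i = 1; i < filename.length(); i++) { // skip the leading '_'
            char c = filename.charAt(i);
            if (c == '.' || c == '_') return i;
        }
        return -1;
    }
}
```

A naive substring at a fixed position (the bug listAll had) would misparse "_3_nrm.cfs", since the segment suffix adds a second underscore before the extension.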
[jira] [Resolved] (SOLR-3211) Allow parameter override in conjunction with spellcheck.maxCollationTries
[ https://issues.apache.org/jira/browse/SOLR-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer resolved SOLR-3211. -- Resolution: Fixed Committed. Trunk: r1348936, Branch_4x: r1348937

Allow parameter override in conjunction with spellcheck.maxCollationTries
Key: SOLR-3211 URL: https://issues.apache.org/jira/browse/SOLR-3211 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 3.6, 4.0 Reporter: James Dyer Assignee: James Dyer Priority: Minor Fix For: 4.0, 5.0 Attachments: SOLR-3211.patch

A couple of users on the mailing list recently asked about being able to override the mm parameter when SpellCheckComponent issues queries to check for # hits for a collation candidate. The issue is that if the query had mm=0, pretty much everything will generate hits, but for collation-checking purposes a low mm is almost never desirable. It might be worthwhile to generalize this to let other parameters be overridden as well.
[jira] [Commented] (LUCENE-4130) CompoundFileDirectory.listAll is broken
[ https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292918#comment-13292918 ] Michael McCandless commented on LUCENE-4130: +1, sneaky.
[jira] [Resolved] (LUCENE-4128) add safety to preflex segmentinfo upgrade
[ https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4128. Resolution: Fixed

add safety to preflex segmentinfo upgrade
Key: LUCENE-4128 URL: https://issues.apache.org/jira/browse/LUCENE-4128 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-4128.patch, LUCENE-4128.patch

Currently the one-time upgrade depends on whether the upgraded .si file exists, and the writing is done in a try/finally so it's removed if an IOException happens. But I think there could be a power loss or something else in the middle of this; the upgraded .si file could be bogus, and then the user would have to manually remove it (they probably wouldn't know). I think instead we should just have a marker file on completion, that we create after we successfully fsync the upgraded .si file. This way, if something happens, we just rewrite the thing.
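The marker-file scheme proposed in the issue can be sketched like this. The ".si.done" marker name and the file contents are invented for illustration; this is not Lucene's upgrade code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MarkerFileUpgrade {
    // Only trust the upgraded file once a marker created *after*
    // fsync'ing it exists; otherwise redo the upgrade from scratch.
    static void upgrade(Path dir, String segment) throws IOException {
        Path upgraded = dir.resolve(segment + ".si");
        Path marker = dir.resolve(segment + ".si.done");
        if (Files.exists(marker)) {
            return; // marker present: a previous upgrade fully completed
        }
        // No marker: any existing .si may be a torn write, so rewrite it.
        try (FileChannel ch = FileChannel.open(upgraded,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap("upgraded segment info".getBytes()));
            ch.force(true); // fsync the upgraded file first...
        }
        Files.createFile(marker); // ...only then create the completion marker
    }
}
```

A power loss before the marker is created simply causes the next run to rewrite the .si file, which is always safe.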
[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken
[ https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4130: Attachment: LUCENE-4130.patch Cleaned-up patch, removing the duplicate 'find segment boundary/indexOf' code shared between stripSegmentName and parseSegmentName (so it's not easy to break the relationship between the two), and returning the empty string from parse (which is more correct, and also means CFS is transparent for files without a segment prefix). Also removed the TODO from TestAllFilesHaveCodecHeader to recurse into CFS. I think this is ready to commit.
[jira] [Resolved] (LUCENE-4130) CompoundFileDirectory.listAll is broken
[ https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4130. - Resolution: Fixed Fix Version/s: 4.0, 5.0
[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 294 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/294/ 1 tests failed. FAILED: org.apache.solr.handler.TestReplicationHandler.test Error Message: expected:<494> but was:<0> Stack Trace: java.lang.AssertionError: expected:<494> but was:<0> at __randomizedtesting.SeedInfo.seed([7C7B1BC6CDCC693D:F42F241C633004C5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716) at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 17504 lines...] [junit4] 2> 60601 T2776 C193 REQ [collection1] webapp=/solr path=/replication params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9.fnm} status=0 QTime=0 [junit4] 2> 60604 T2776 C193 REQ
Grouping - Boosting large groups
Hi forum, I've implemented grouping using the TermFirstPassGroupingCollector and TermSecondPassGroupingCollector, pretty much exactly as the example at the API. This works really well. I'm getting the groups sorted by the computed relevance, and within each group the docs are sorted by a numeric field. So far, so good. Now I want to make things more complicated by boosting larger groups in addition to the existing relevance sort. For example, if the first group has a relevancy score of 1 and 2 docs, and the second group has a score of 0.9 and 4 docs, I want to boost the second group so it will appear before the first. Basically I'm trying to boost the groups according to the number of elements in the groups. I couldn't figure out how to do that or find an example anywhere. I hope I'm making sense. Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Grouping-Boosting-large-groups-tp3988959.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
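One pragmatic answer (an assumption on my part, not an official Lucene API) is to collect groups normally with the two-pass collectors and then re-rank the resulting groups by a combination of score and size before rendering. A sketch, where the Group record is a hypothetical stand-in for the values you would read out of the second pass's TopGroups:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class GroupBoost {
    // Hypothetical stand-in for a collected group: top score plus doc
    // count, as obtained from the second-pass collector's results.
    record Group(String key, float score, int size) {}

    // Re-rank by relevance combined with a size factor, e.g.
    // score * (1 + ln(size)). The formula is an illustration only;
    // tune it to taste.
    static List<Group> rerankBySize(List<Group> groups) {
        List<Group> out = new ArrayList<>(groups);
        out.sort(Comparator.comparingDouble(
                (Group g) -> g.score() * (1 + Math.log(g.size()))).reversed());
        return out;
    }
}
```

With the numbers from the question, the second group scores 0.9 * (1 + ln 4) ≈ 2.15 versus 1.0 * (1 + ln 2) ≈ 1.69 for the first, so the larger group moves to the front.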
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292958#comment-13292958 ] Michael McCandless commented on LUCENE-4120: Patch looks great! Kuromoji's TokenInfoDictionaryBuilder doesn't compile w/ the patch ... it just needs the added arg to FST.pack. It seems sort of odd to have the new .save method on ReaderImpl... can it be on Mutable/Impl instead, or maybe FST does its own saving or something? In all the places we now pass random.nextFloat() for acceptableOverheadRatio (to FST.pack or MemoryPostingsFormat), shouldn't it be COMPACT .. FASTEST instead of 0.0 .. 1.0? Can you fix the comment for FST.pack? It's no longer necessarily 8 bytes per node ... maybe just say up to 8 bytes per node, depending on acceptableOverheadRatio? Maybe rename the new PackedInts.getWriter method to e.g. getWriterByFormat? I was confused just staring at it...
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292966#comment-13292966 ] Michael McCandless commented on LUCENE-4120: bq. As for fst packing, the idea is fairly simple – you reduce the overall size of the fst by moving states which have lots of incoming arcs to offsets which compress well (in vcoding). That's all I did, inspired by your talk/paper... I think we could do more :)
[jira] [Updated] (LUCENE-4131) .cfs/.cfe should have a codecheader
[ https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4131: Attachment: LUCENE-4131.patch Patch also including CFS (for branch_4x). Trunk won't need the ugly stuff because it won't need to support 3.0 indexes.

.cfs/.cfe should have a codecheader
Key: LUCENE-4131 URL: https://issues.apache.org/jira/browse/LUCENE-4131 Project: Lucene - Java Issue Type: Improvement Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-4131.patch, LUCENE-4131_cfe.patch

The new .cfs is more tricky, but I still think we can do it. We should definitely fix this for .cfe.
[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)
[ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292977#comment-13292977 ] Simon Willnauer commented on LUCENE-4061: - Patch looks good. I wonder if you can't create the ReaderManager in advance and make it final. I mean, if you do add categories, which seems to be the purpose of this writer, you need it anyway, and the cost should be considerably low. That would remove the need for locking on it entirely.

Improvements to DirectoryTaxonomyWriter (synchronization and others)
Key: LUCENE-4061 URL: https://issues.apache.org/jira/browse/LUCENE-4061 Project: Lucene - Java Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.0 Attachments: LUCENE-4061.patch, LUCENE-4061.patch

DirTaxoWriter synchronizes in too many places. For instance, addCategory() is fully synchronized, while only a small part of it needs to be. Additionally, getCacheMemoryUsage looks bogus -- it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it. If the changes will be simple, I'll port them to 3.6.1 as well.
[jira] [Created] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
Hoss Man created LUCENE-4134: Summary: modify release process/scripts to use svn for rc/release publishing (svnpubsub) Key: LUCENE-4134 URL: https://issues.apache.org/jira/browse/LUCENE-4134 Project: Lucene - Java Issue Type: Task Reporter: Hoss Man

By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be entirely managed using svnpubsub ... our use of the Apache CMS for lucene.apache.org puts us in compliance for our main website, but the dist dir used for publishing release artifacts also needs to be managed via svn.
[jira] [Commented] (LUCENE-4131) .cfs/.cfe should have a codecheader
[ https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292981#comment-13292981 ] Michael McCandless commented on LUCENE-4131: +1
[jira] [Assigned] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher reassigned SOLR-3533: -- Assignee: Erik Hatcher (was: Stefan Matheis (steffkes))

Show CharFilters in Schema Browser
Key: SOLR-3533 URL: https://issues.apache.org/jira/browse/SOLR-3533 Project: Solr Issue Type: Improvement Affects Versions: 4.0 Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Minor Fix For: 4.0 Attachments: SOLR-3533.patch

Schema Browser (on trunk) currently does not show CharFilters. The example/ schema has this definition that can be used to demonstrate, though it needs to be uncommented:
{code}
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
[jira] [Resolved] (SOLR-3533) Show CharFilters in Schema Browser
[ https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher resolved SOLR-3533. Resolution: Fixed Went ahead and committed this to trunk and branch_4x.
[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
[ https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292986#comment-13292986 ] Hoss Man commented on LUCENE-4134: -- Recent email from INFRA...
{noformat}
FYI: infrastructure policy regarding website hosting has changed as of November 2011: we are requiring all websites and dist/ dirs to be svnpubsub or ASF CMS backed by the end of 2012. If your PMC has already met this requirement congratulations, you can ignore the remainder of this post. As stated on http://www.apache.org/dev/project-site.html#svnpubsub we are migrating our webserver infrastructure to 100% svnpubsub over the course of 2012. If your site does not currently make use of this technology, it is time to consider a migration effort, as rsync-based sites will be PERMANENTLY FROZEN in Jan 2013 due ... NOTE: the policy for dist/ dirs for managing project releases is similar. We have setup a dedicated svn server for handling this, please contact infra when you are ready to start using it.
{noformat}
Some docs: http://www.apache.org/dev/release.html#upload-ci

At a minimum we need to open a Jira with INFRA when we are ready for them to set up https://dist.apache.org/repos/dist/release/lucene and start using it for subsequent release publishing (instead of copying to the magic dist dir on people.apache.org and waiting for rsync). But as part of this new process there will also be a https://dist.apache.org/repos/dist/dev/lucene directory where release candidates can be put for review (instead of people.apache.org/~releasemanager/...), and if/when they are voted successfully, a simple svn mv to dist/release/lucene makes them official and pushes them to the mirrors. So we should also change our release scripts to start svn committing the release candidates there instead of scp'ing to people.apache.org.
[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
[ https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292988#comment-13292988 ] Robert Muir commented on LUCENE-4134: - I agree we should do the first part. As for the second part, I personally don't want to use any scripts that ssh or svn commit automatically, so it's no problem for me. I think instead we should just have instructions on where we should commit things manually in ReleaseTODO etc. If someone wants to add automation that's great, but I just don't like automation when it comes to my passwords.
[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
[ https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292994#comment-13292994 ] Hoss Man commented on LUCENE-4134: -- rmuir: doesn't the automation already exist in buildAndPushRelease.py? ... doesn't that automatically scp the RCs to people.apache.org:~you/public_html/staging_area/ ? I'm just suggesting we change that to do the svn commit to https://dist.apache.org/repos/dist/dev/lucene ... the RCs are still uploaded automatically, they would just start getting uploaded to an INFRA-blessed location that would make it easier to (manually) publish them post-VOTE with an svn mv.
[jira] [Created] (LUCENE-4135) TestNumericQueryParser fails on java 7
Robert Muir created LUCENE-4135: --- Summary: TestNumericQueryParser fails on java 7 Key: LUCENE-4135 URL: https://issues.apache.org/jira/browse/LUCENE-4135 Project: Lucene - Java Issue Type: Bug Components: modules/queryparser Affects Versions: 4.0 Reporter: Robert Muir

http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java7-64/49/ Seed reproduces on branch-4x for me on linux as well:
ant test -Dtestcase=TestNumericQueryParser -Dtests.seed=E6EC0E1871B28E1E -Dtests.multiplier=3 -Dtests.locale=es_PE -Dtests.timezone=Africa/Tunis -Dargs="-Dfile.encoding=UTF-8"
[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
[ https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292997#comment-13292997 ] Robert Muir commented on LUCENE-4134: - Hoss: right, I'm saying I don't use that script :) I think we should still lay out the new instructions on the wiki for people who don't want scripts svn committing for them, that's all.
[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)
[ https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293001#comment-13293001 ]

Hoss Man commented on LUCENE-4134:
----------------------------------

Ah ... sorry ... I thought you were saying we shouldn't add automation for this ... didn't realize you meant "I don't use the automation we currently have".

bq. I think we should still lay out the new instructions on the wiki for people who don't want scripts svn committing for them, that's all.

+1

> modify release process/scripts to use svn for rc/release publishing (svnpubsub)
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-4134
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4134
>             Project: Lucene - Java
>          Issue Type: Task
>            Reporter: Hoss Man
>
> By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be entirely managed using svnpubsub ... our use of the Apache CMS for lucene.apache.org puts us in compliance for our main website, but the dist dir used for publishing release artifacts also needs to be managed via svn.
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293012#comment-13293012 ]

Dawid Weiss commented on LUCENE-4120:
-------------------------------------

bq. That's all I did, inspired by your talk/paper... I think we could do more

Remember, I didn't talk about my _failed_ attempts; there is a very good chance you may be thinking of those ;)

> FST should use packed integer arrays
> ------------------------------------
>
>                 Key: LUCENE-4120
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4120
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: LUCENE-4120.patch
>
> There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of:
> * FST.nodeAddress (GrowableWriter)
> * FST.inCounts (GrowableWriter)
> * FST.nodeRefToAddress (read-only Reader)
> The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
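For readers unfamiliar with the technique, the idea behind a packed integer array is to store n values of bitsPerValue bits each back-to-back in a long[], instead of wasting 32 bits per entry in an int[]. The sketch below illustrates the concept only; the class and method names are illustrative and are not Lucene's actual PackedInts API. It assumes write-once slots (set ORs bits in rather than clearing first).

```java
// Conceptual sketch of a packed integer array (illustrative, not Lucene's PackedInts).
class PackedArray {
    private final long[] blocks;
    private final int bits;
    private final long mask;

    PackedArray(int size, int bitsPerValue) {
        this.bits = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        // Enough 64-bit blocks to hold size * bitsPerValue bits.
        this.blocks = new long[(int) (((long) size * bitsPerValue + 63) / 64)];
    }

    // Write-once set: assumes the slot's bits are still zero.
    void set(int index, long value) {
        long bitPos = (long) index * bits;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= (value & mask) << shift;
        if (shift + bits > 64) { // value straddles two longs
            blocks[block + 1] |= (value & mask) >>> (64 - shift);
        }
    }

    long get(int index) {
        long bitPos = (long) index * bits;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = blocks[block] >>> shift;
        if (shift + bits > 64) {
            v |= blocks[block + 1] << (64 - shift);
        }
        return v & mask;
    }
}
```

With 5 bits per value, 100 entries fit in 500 bits (eight longs) instead of 400 bytes; that is the space saving the issue is after for FST.nodeAddress and friends.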
[JENKINS] Lucene-Solr-trunk-Linux-Java6-64 - Build # 862 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/862/

2 tests failed.

REGRESSION: org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta3.testCompositePk_DeltaImport_empty

Error Message:
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed

Stack Trace:
org.apache.solr.common.SolrException: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
	at __randomizedtesting.SeedInfo.seed([722383488ECB025B:B52018272F5E5F59]:0)
	at org.apache.solr.util.TestHarness.update(TestHarness.java:260)
	at org.apache.solr.util.TestHarness.checkUpdateStatus(TestHarness.java:304)
	at org.apache.solr.util.TestHarness.validateUpdate(TestHarness.java:274)
	at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:413)
	at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:392)
	at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:386)
	at org.apache.solr.SolrTestCaseJ4.clearIndex(SolrTestCaseJ4.java:758)
	at org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta3.setUp(TestSqlEntityProcessorDelta3.java:67)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:873)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
	at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Caused by:
[jira] [Resolved] (LUCENE-3949) Fix license headers in all Java files to not be in Javadocs /** format
[ https://issues.apache.org/jira/browse/LUCENE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved LUCENE-3949.
------------------------------
    Resolution: Fixed

Committed revision 1348980. - trunk
Committed revision 1348984. - 4x

> Fix license headers in all Java files to not be in Javadocs /** format
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-3949
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3949
>             Project: Lucene - Java
>          Issue Type: Task
>            Reporter: Uwe Schindler
>            Assignee: Hoss Man
>             Fix For: 4.0
>         Attachments: LUCENE-3949.patch, fix-license-jdoc.pl
>
> Our current license headers in all .java files are (for a reason I don't know) in Javadocs format. This means that when you have a class without Javadocs, the license header is used as the Javadocs. I reviewed lots of other Apache projects; most of them use the correct /* header, but some (including Lucene+Solr) use the Javadocs one. We should change this.
Re: Issues with whitespace tokenization in QueryParser
: NOTE: I definitely don't want to discourage you from tackling this
: issue, but I think it's fair to mention there is a workaround, and
: that's if you can preprocess your queries yourself (maybe you don't
: allow all the lucene syntax to your users or something like that), you
: can escape the whitespace yourself such as rain\ coat, and I think
: your synonyms will work as expected.

Alternatively: use a QueryParser that doesn't know/care about any special markup and just analyzes the entire input against a single (configured) field and generates the appropriate query -- Solr's FieldQParser works this way, for example.

You have to pick a tradeoff between "I want to support query operators like ':', '+', '-', and ' ' that let me build up BooleanQuery objects and query specific fields" vs "I want the entire query string analyzed as one chunk".

: really tripping them up. A prime example is that a search for dress shoes
: returns a list of dresses and random shoes (not necessarily dress shoes). I
: wish that I was able to synonym compound words to single tokens (e.g. dress
: shoes = dress_shoes), but with this whitespace tokenization issue, it's
: impossible.

This is one of the main use cases of the DismaxQParser (and now EDismaxQParser as well) with the pf param in Solr ... you can have it query for both "dress" and/or "shoes" in some set of fields (qf), but also for the entire phrase "dress shoes" in a distinct set of fields (pf), which gets a higher score.

http://wiki.apache.org/solr/DisMax
http://wiki.apache.org/solr/DisMaxQParserPlugin
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

-Hoss
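The whitespace-escaping workaround described above (turning "rain coat" into "rain\ coat" before handing it to the query parser) can be sketched with a small preprocessing helper. This is a hypothetical utility, not part of Lucene's QueryParser API, and it assumes the application has already decided the input is a single phrase worth gluing together:

```java
class QueryEscaper {
    // Hypothetical helper (not Lucene API): backslash-escape whitespace so a
    // whitespace-tokenizing query parser sees "rain coat" as one token "rain\ coat".
    static String escapeWhitespace(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeWhitespace("rain coat")); // rain\ coat
    }
}
```

In practice you would only apply this to user input you control (e.g. when you don't expose the full query syntax), since it also escapes whitespace that the user meant as an operator separator.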
[jira] [Commented] (LUCENE-4131) .cfs/.cfe should have a codecheader
[ https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293028#comment-13293028 ]

Uwe Schindler commented on LUCENE-4131:
---------------------------------------

+1, the crazy code is funny to read understand ;-]

> .cfs/.cfe should have a codecheader
> -----------------------------------
>
>                 Key: LUCENE-4131
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4131
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>         Attachments: LUCENE-4131.patch, LUCENE-4131_cfe.patch
>
> The new .cfs is more tricky, but I still think we can do it. We should definitely fix this for .cfe.
[JENKINS] Lucene-Solr-trunk-Windows-Java6-64 - Build # 507 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/507/

1 tests failed.

REGRESSION: org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<498> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<498> but was:<0>
	at __randomizedtesting.SeedInfo.seed([FB5AB0EC651FBCE9:730E8F36CBE3D111]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391)
	at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
	at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log:
[...truncated 17145 lines...]
[junit4]   2> 43351 T3021 oash.SnapPuller.fetchLatestIndex Number of files in latest index in master: 10
[junit4]   2> 43356 T3004 C168 REQ [collection1] webapp=/solr path=/replication
[jira] [Created] (SOLR-3534) dismax and edismax should default to df when qf is absent.
David Smiley created SOLR-3534:
----------------------------------

             Summary: dismax and edismax should default to df when qf is absent.
                 Key: SOLR-3534
                 URL: https://issues.apache.org/jira/browse/SOLR-3534
             Project: Solr
          Issue Type: Improvement
          Components: query parsers
    Affects Versions: 4.0
            Reporter: David Smiley
            Assignee: David Smiley
            Priority: Minor

The dismax and edismax query parsers should default to df when the qf parameter is absent. They only use the defaultSearchField in schema.xml as a fallback now.
[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml
[ https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293054#comment-13293054 ]

David Smiley commented on SOLR-2724:
------------------------------------

Bernd: Support for a default search field name still exists, mostly for compatibility, and perhaps in some people's views as a matter of preference. It wasn't actually required before its deprecation. I thought it was only the fallback for parsing a lucene query, but you indeed point out that dismax has it as a fallback for 'qf', and it's used by the highlighter as a fallback for 'hl.fl', although it appears the highlighter consults 'df' too.

The main point behind its deprecation is that I think you should be explicit in a request about which field(s) apply to which query strings or other features, because the schema (schema.xml) can't know. The same applies to the default query operator, which is even more of an odd duck sitting in schema.xml.

Bernd, simply define qf in your request handler definition to make Solr respond correctly to the same queries you had before.

Arguably, Dismax/Edismax should consult df as a default when qf isn't specified. I created SOLR-3534 for this issue.

> Deprecate defaultSearchField and defaultOperator defined in schema.xml
> ----------------------------------------------------------------------
>
>                 Key: SOLR-2724
>                 URL: https://issues.apache.org/jira/browse/SOLR-2724
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis, search
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>             Fix For: 3.6, 4.0
>         Attachments: SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I've always been surprised to see the <defaultSearchField> element and <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file since the first time I saw them. They just seem out of place to me since they are more query-parser related than schema related. But not only are they misplaced, I feel they shouldn't exist. For query parsers, we already have a df parameter that works just fine, and explicit field references. And the default lucene query operator should stay at OR -- if a particular query wants different behavior then use q.op or simply use "OR".
> <similarity> seems like something better placed in solrconfig.xml than in the schema.
> In my opinion, the defaultSearchField and defaultOperator configuration elements should be deprecated in Solr 3.x and removed in Solr 4. And <similarity> should move to solrconfig.xml. I am willing to do it, provided there is consensus on it of course.
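The advice in this thread to "define qf in your request handler definition" might look like the following solrconfig.xml fragment. This is a sketch, not text from the thread: the handler name and the field name `text` are illustrative stand-ins for whatever your deprecated defaultSearchField used to point at.

```xml
<!-- Hypothetical handler config: 'text' stands in for your old defaultSearchField. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">text</str>
  </lst>
</requestHandler>
```

With qf set in the handler defaults, requests that previously relied on the schema-level default search field keep working without the client having to pass qf or df on every request.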
[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)
[ https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293063#comment-13293063 ]

Shai Erera commented on LUCENE-4061:
------------------------------------

Believe me, I wanted to avoid it too, but ReaderManager is allocated like that for a few reasons:

* It's lazy, as the comment in the code says -- it's a waste to open an IR if your DirTaxoWriter session is going to be short-lived.
** Personally, I think this is a minor issue, and if it were only that, I'd make it final.
* The TaxoWriterCache can be 'complete', which means all the categories currently known to DirTW are cached. In that case, it is a waste to keep the reader open, and we close it.
** This is true for Cl2oCache, since it keeps all categories in memory.
** But LruCache is not like that, since it potentially evicts entries from the cache. So it can be 'complete' until it evicts the first entry, in which case it will never be complete, and we'll need to keep the reader open.

Currently, when we don't need ReaderManager, we close it. We also don't open it until a few cache misses occur. To change this would mean sacrificing efficiency by always keeping a Reader open, even if it's not needed. It wastes RAM, file handles and what not. Not sure it's worth it. What do you think?

> Improvements to DirectoryTaxonomyWriter (synchronization and others)
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4061
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.0
>         Attachments: LUCENE-4061.patch, LUCENE-4061.patch
>
> DirTaxoWriter synchronizes in too many places. For instance, addCategory() is fully synchronized, while only a small part of it needs to be. Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is interested can query the specific cache impl it has. Currently, only Cl2oTaxoWriterCache supports it.
> If the changes will be simple, I'll port them to 3.6.1 as well.
[jira] [Commented] (SOLR-3534) dismax and edismax should default to df when qf is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293073#comment-13293073 ]

Jack Krupansky commented on SOLR-3534:
--------------------------------------

I would also suggest that the default, if neither qf nor df is present, should be "text", preferably as a symbolic constant.

> dismax and edismax should default to df when qf is absent.
> ----------------------------------------------------------
>
>                 Key: SOLR-3534
>                 URL: https://issues.apache.org/jira/browse/SOLR-3534
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>    Affects Versions: 4.0
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>
> The dismax and edismax query parsers should default to df when the qf parameter is absent. They only use the defaultSearchField in schema.xml as a fallback now.
[jira] [Commented] (SOLR-3534) dismax and edismax should default to df when qf is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293074#comment-13293074 ]

David Smiley commented on SOLR-3534:
------------------------------------

RE a "text" default -- that would be yet another default and, worse, IMO, it would be too hidden a default. Being explicit by specifying a parameter on the request is best, IMO.

> dismax and edismax should default to df when qf is absent.
> ----------------------------------------------------------
>
>                 Key: SOLR-3534
>                 URL: https://issues.apache.org/jira/browse/SOLR-3534
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>    Affects Versions: 4.0
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>
> The dismax and edismax query parsers should default to df when the qf parameter is absent. They only use the defaultSearchField in schema.xml as a fallback now.
[jira] [Created] (SOLR-3536) How can I treat special character as alpha. like !@#$%^*().
phatak.prachi created SOLR-3536:
-----------------------------------

             Summary: How can I treat special character as alpha. like !@#$%^*().
                 Key: SOLR-3536
                 URL: https://issues.apache.org/jira/browse/SOLR-3536
             Project: Solr
          Issue Type: Wish
            Reporter: phatak.prachi

I need to allow search on the special characters. For example, if I have "Wi-Fi", "RET-34", "Wi fi" and the user enters only "-", then it should return "Wi-Fi" and "RET-34".
[jira] [Created] (SOLR-3535) Add block support for XMLLoader
Mikhail Khludnev created SOLR-3535:
--------------------------------------

             Summary: Add block support for XMLLoader
                 Key: SOLR-3535
                 URL: https://issues.apache.org/jira/browse/SOLR-3535
             Project: Solr
          Issue Type: Sub-task
          Components: update
    Affects Versions: 4.1, 5.0
            Reporter: Mikhail Khludnev
            Priority: Minor

I'd like to add the following update xml message:

<add-block>
  <doc></doc>
  <doc></doc>
</add-block>

Out of scope for now:
* other update formats
* update log support (NRT), should not be a big deal
* overwrite feature support for block updates - it's more complicated, I'll tell you why

Alternatively:
* wdyt about adding an attribute to the current tag: {pre}<add block="true">{pre}
* or we can establish a RunBlockUpdateProcessor which treats every <add></add> as a block.

*Test is included!!*

How would you suggest improving the patch?
[jira] [Updated] (SOLR-3535) Add block support for XMLLoader
[ https://issues.apache.org/jira/browse/SOLR-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated SOLR-3535:
-----------------------------------
    Attachment: SOLR-3535.patch

> Add block support for XMLLoader
> -------------------------------
>
>                 Key: SOLR-3535
>                 URL: https://issues.apache.org/jira/browse/SOLR-3535
>             Project: Solr
>          Issue Type: Sub-task
>          Components: update
>    Affects Versions: 4.1, 5.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>         Attachments: SOLR-3535.patch
>
> I'd like to add the following update xml message:
> <add-block>
>   <doc></doc>
>   <doc></doc>
> </add-block>
> Out of scope for now:
> * other update formats
> * update log support (NRT), should not be a big deal
> * overwrite feature support for block updates - it's more complicated, I'll tell you why
> Alternatively:
> * wdyt about adding an attribute to the current tag: {pre}<add block="true">{pre}
> * or we can establish a RunBlockUpdateProcessor which treats every <add></add> as a block.
> *Test is included!!*
> How would you suggest improving the patch?
[jira] [Resolved] (SOLR-3536) How can I treat special character as alpha. like !@#$%^*().
[ https://issues.apache.org/jira/browse/SOLR-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-3536.
----------------------------
    Resolution: Not A Problem

Please ask questions like this on the solr-user mailing list: http://lucene.apache.org/solr/discussion.html

> How can I treat special character as alpha. like !@#$%^*().
> -----------------------------------------------------------
>
>                 Key: SOLR-3536
>                 URL: https://issues.apache.org/jira/browse/SOLR-3536
>             Project: Solr
>          Issue Type: Wish
>            Reporter: phatak.prachi
>              Labels: newbie
>
> I need to allow search on the special characters. For example, if I have "Wi-Fi", "RET-34", "Wi fi" and the user enters only "-", then it should return "Wi-Fi" and "RET-34".
[jira] [Resolved] (SOLR-2352) TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param.
[ https://issues.apache.org/jira/browse/SOLR-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-2352.
----------------------------
    Resolution: Fixed

Committed revision 1349012. - trunk
Committed revision 1349013. - 4x

> TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2352
>                 URL: https://issues.apache.org/jira/browse/SOLR-2352
>             Project: Solr
>          Issue Type: Bug
>          Components: SearchComponents - other
>    Affects Versions: 3.1
>         Environment: Ubuntu 10.04/Arch solr 3.x branch r1058326
>            Reporter: Jed Glazner
>            Assignee: Hoss Man
>             Fix For: 4.0
>         Attachments: SOLR-2352.patch
>
> When searching using the term vector component and setting fl=*,score, the result is an HTTP 400 error 'undefined field: *'. If you disable the TVC, the search works properly. Example bad request...
> {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*{code}
> 3.1 stack trace:
> {noformat}
> SEVERE: org.apache.solr.common.SolrException: undefined field: *
> at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
> ...
> {noformat}
> The workaround is to explicitly use the tv.fl param when using pseudo-fields in the fl...
> {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*&tv.fl=includes{code}
[jira] [Created] (SOLR-3537) TermVectorComponent should support globs in fl and tv.fl combined with per-field overrides of other params
Hoss Man created SOLR-3537:
------------------------------

             Summary: TermVectorComponent should support globs in fl and tv.fl combined with per-field overrides of other params
                 Key: SOLR-3537
                 URL: https://issues.apache.org/jira/browse/SOLR-3537
             Project: Solr
          Issue Type: Task
            Reporter: Hoss Man

TermVectorComponent should be improved so that fields can be specified in tv.fl (or fl) using globs a la the ReturnFields helper class. Per-field overrides for the various options TVC supports should work with all fields, even if specified as part of a glob.