[jira] [Commented] (LUCENE-4145) "Unhandled exception" from test framework (in json parsing of test output files?)
[ https://issues.apache.org/jira/browse/LUCENE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295490#comment-13295490 ] Dawid Weiss commented on LUCENE-4145: - This is weird, I'll look into it. > "Unhandled exception" from test framework (in json parsing of test output > files?) > - > > Key: LUCENE-4145 > URL: https://issues.apache.org/jira/browse/LUCENE-4145 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man >Assignee: Dawid Weiss > > Working on SOLR-3267 i got a weird exception printed to the junit output... > {noformat} >[junit4] Unhandled exception in thread: Thread[pumper-events,5,main] >[junit4] > com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonParseException: > No such reference: id#org.apache.solr.search.TestSort[3] > ... > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3546) Add index page to Admin UI
Lance Norskog created SOLR-3546: --- Summary: Add index page to Admin UI Key: SOLR-3546 URL: https://issues.apache.org/jira/browse/SOLR-3546 Project: Solr Issue Type: New Feature Components: web gui Reporter: Lance Norskog Priority: Minor It would be great to index a file by uploading it. In designing schemas and testing features I often make one or two test documents. It would be great to upload these directly from the UI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4146) -Dtests.iters combined with -Dtestmethod never fails?
[ https://issues.apache.org/jira/browse/LUCENE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295489#comment-13295489 ] Dawid Weiss commented on LUCENE-4146: - Also, completing the above answer -- this issue also affects things like "re-running" a test from Eclipse and other IDEs. If you run your suite with -Dtests.iters=5 you'll get a tree of tests that executed, with their "unique" names that include a seed. If you click on a given test and re-run it, Eclipse will try to filter execution to that particular test (that name), and if the seed is random (and not fixed) the chances of such a test occurring again are nearly zero, so you'll get an empty result (no executed test). I've tried a number of workarounds/hacks but none of them worked well. This is really the best of what I've tried. > -Dtests.iters combined with -Dtestmethod never fails? > - > > Key: LUCENE-4146 > URL: https://issues.apache.org/jira/browse/LUCENE-4146 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man > Attachments: LUCENE-4146.fail.patch, > TEST-org.apache.lucene.TestSearch.iters-no-fail.xml, > TEST-org.apache.lucene.TestSearch.no-iters-fail.xml > > > a test that is hardcoded to fail will report success if you run it with > -Dtests.iters
[jira] [Commented] (LUCENE-4146) -Dtests.iters combined with -Dtestmethod never fails?
[ https://issues.apache.org/jira/browse/LUCENE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295485#comment-13295485 ] Dawid Weiss commented on LUCENE-4146: - This isn't a bug, Hoss. This is an unfortunate API shortcoming of JUnit that I had to accommodate somehow. So what happens is that: 1) no two JUnit tests can have the same "description" (which in realistic terms means no two JUnit tests can have an identical method name); this confuses the hell out of all IDE clients and other clients (like ant, maven, etc.). 2) because of the above (and wanting to have separate tests for repetitions), repeated test names are created so that they contain a sequential number and a seed (to make them unique). 3) because of the above, a method filter no longer works because that exact string doesn't match the generated pseudo-method name. A workaround is to add globs around the method name, as in: {noformat} ant test -Dtests.iters=2 -Dtestcase=TestSearch -Dtestmethod=*testFailure* {noformat} Yeah, I realize this sucks but I have no better ideas for the moment (that would work with the existing JUnit infrastructure). > -Dtests.iters combined with -Dtestmethod never fails? > - > > Key: LUCENE-4146 > URL: https://issues.apache.org/jira/browse/LUCENE-4146 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man > Attachments: LUCENE-4146.fail.patch, > TEST-org.apache.lucene.TestSearch.iters-no-fail.xml, > TEST-org.apache.lucene.TestSearch.no-iters-fail.xml > > > a test that is hardcoded to fail will report success if you run it with > -Dtests.iters
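The naming scheme and the glob workaround described above can be sketched in isolation. This is a hedged illustration, not the actual randomizedtesting code: the exact name format and the minimal glob matcher are assumptions chosen to show why an exact method-name filter misses seed-suffixed repetition names while a glob still matches.

```java
public class RepeatedTestNames {
    // Build a unique per-repetition name from a base method name, a
    // sequence number, and a seed -- conceptually what the framework does.
    static String repetitionName(String method, int iter, String seed) {
        return String.format("%s {#%d seed=[%s]}", method, iter, seed);
    }

    // Minimal glob matcher: '*' matches any run of characters.
    // (Toy version; does not escape other regex metacharacters.)
    static boolean globMatches(String glob, String name) {
        return name.matches(glob.replace("*", ".*"));
    }

    public static void main(String[] args) {
        String generated = repetitionName("testFailure", 1, "E9EE2618BEEE855E");
        System.out.println(generated);
        // An exact filter no longer matches, but a glob filter does:
        System.out.println(generated.equals("testFailure"));          // false
        System.out.println(globMatches("*testFailure*", generated)); // true
    }
}
```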
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295473#comment-13295473 ] David Smiley commented on SOLR-3534: Whoops -- that commit (#1350466) was mis-commented as SOLR-3304. > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch, > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now.
[jira] [Updated] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-3534: --- Attachment: SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch This is an updated patch. Instead of SolrPluginUtils, I chose QueryParsing, which already has a similar method for q.op. And like q.op, I made the second argument the string that the caller resolves. Some callers don't have convenient params to provide. The fact that some don't led me to start more refactorings to QParser, which I decided to withdraw so as not to make this issue do too much at once. I already committed test modifications so that this patch will pass. (I jumped the gun, perhaps, but no matter.) You should see this change in the subversion tab in JIRA. > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch, > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now.
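The fallback order the issue asks for can be sketched in plain Java. This is a hedged illustration of the resolution logic only: method and parameter names are hypothetical, not Solr's actual QueryParsing API -- prefer the "df" request parameter, else fall back to the schema's defaultSearchField.

```java
import java.util.Map;

public class DefaultFieldResolution {
    // Resolve the default field: "df" param if present, else the schema
    // default. Names here are illustrative, not Solr's real API.
    static String resolveDefaultField(Map<String, String> params, String schemaDefault) {
        String df = params.get("df");
        return df != null ? df : schemaDefault;
    }

    public static void main(String[] args) {
        System.out.println(resolveDefaultField(Map.of("df", "text"), "content")); // text
        System.out.println(resolveDefaultField(Map.of(), "content"));             // content
    }
}
```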
[jira] [Updated] (SOLR-3522) "literal" function can not be parsed
[ https://issues.apache.org/jira/browse/SOLR-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3522: --- Attachment: SOLR-3522.patch Patch that should fix the problem ... except that the test still fails in a way that suggests StringDistanceFunction isn't implementing equals properly (two FunctionQueries parsed from identical input don't compare as equal), so now I need to go down that rabbit hole. (I may just have a stupid mistake in the test I'm not seeing at the moment.) > "literal" function can not be parsed > > > Key: SOLR-3522 > URL: https://issues.apache.org/jira/browse/SOLR-3522 > Project: Solr > Issue Type: Bug >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0, 3.6.1 > > Attachments: SOLR-3522.patch > > > attempting to use the "literal" function in the fl param causes a parse > error... > Example queries with functions that work fine... > {noformat} > http://localhost:8983/solr/collection1/select?q=*:*&fl=foo:sum%284,5%29 > http://localhost:8983/solr/collection1/select?fl=score&q={!func}strdist%28%22foo%22,%22fo%22,edit%29 > {noformat} > Examples using the literal function that fail... > {noformat} > http://localhost:8983/solr/collection1/select?q=*:*&fl=foo:literal%28%22foo%22%29 > http://localhost:8983/solr/collection1/select?fl=score&q={!func}strdist%28%22foo%22,literal%28%22fo%22%29,edit%29 > {noformat}
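The equals problem mentioned above comes down to value-based equality: two objects parsed from identical input only compare equal if equals()/hashCode() compare fields rather than identity. A minimal sketch -- the class below is a stand-in for a function like strdist(), not Lucene's actual StringDistanceFunction:

```java
import java.util.Objects;

public class ValueEquality {
    // Toy stand-in for a parsed function value source with three arguments.
    static final class StrDist {
        final String a, b, measure;
        StrDist(String a, String b, String measure) { this.a = a; this.b = b; this.measure = measure; }
        // Value-based equals: compare the fields, not object identity.
        @Override public boolean equals(Object o) {
            if (!(o instanceof StrDist)) return false;
            StrDist other = (StrDist) o;
            return a.equals(other.a) && b.equals(other.b) && measure.equals(other.measure);
        }
        // Keep hashCode consistent with equals, as the contract requires.
        @Override public int hashCode() { return Objects.hash(a, b, measure); }
    }

    public static void main(String[] args) {
        // Two instances "parsed" from the same input now compare equal:
        System.out.println(new StrDist("foo", "fo", "edit")
                .equals(new StrDist("foo", "fo", "edit"))); // true
    }
}
```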
[jira] [Updated] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4132: --- Attachment: LUCENE-4132.patch Good catch, Mike! It went away in the last changes. I re-added testReuse, asserting that e.g. the MP instances returned from LiveIWC are not the same. > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Assignee: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are a few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. 
> * Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome!), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". > It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings, though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC, from that Config, and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and needs to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea?
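The two-object idea floated above can be sketched as a full Config used at construction time plus a narrower "live" view returned afterwards, so callers of getConfig only see setters that take effect on a running writer. All names below are illustrative assumptions; this is not Lucene's actual IndexWriterConfig API.

```java
public class LiveConfigSketch {
    // Only the settings that may change after construction live here.
    interface LiveConfig {
        LiveConfig setRAMBufferSizeMB(double mb);
        double getRAMBufferSizeMB();
    }

    // The full construction-time config implements the live view too.
    static final class Config implements LiveConfig {
        private double ramBufferMB = 16.0;
        public Config setRAMBufferSizeMB(double mb) { this.ramBufferMB = mb; return this; }
        public double getRAMBufferSizeMB() { return ramBufferMB; }
    }

    static final class Writer {
        private final Config config;
        Writer(Config c) { this.config = c; }
        // Callers get the narrow interface: only "live" setters are visible.
        LiveConfig getConfig() { return config; }
    }

    public static void main(String[] args) {
        Writer w = new Writer(new Config());
        w.getConfig().setRAMBufferSizeMB(64.0); // takes effect "live"
        System.out.println(w.getConfig().getRAMBufferSizeMB()); // 64.0
    }
}
```

User code still builds a Config and passes it to the writer; only the type returned by getConfig narrows.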
[jira] [Updated] (SOLR-3522) "literal" function can not be parsed
[ https://issues.apache.org/jira/browse/SOLR-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3522: --- Fix Version/s: 3.6.1 Assignee: Hoss Man Summary: "literal" function can not be parsed (was: "literal" function can not be parsed in 4x/trunk) Looking into this, it seems that the literal function is completely broken in 3.6 as well -- raw literals work, just not {{literal("foo")}} or {{literal($foo)}}. The problem seems to be a simple mistake of calling "fp.getString()" (which returns the entire input string) instead of using fp.parseArg() ... I'll work on a test & fix. > "literal" function can not be parsed > > > Key: SOLR-3522 > URL: https://issues.apache.org/jira/browse/SOLR-3522 > Project: Solr > Issue Type: Bug >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0, 3.6.1 > > > attempting to use the "literal" function in the fl param causes a parse > error... > Example queries with functions that work fine... > {noformat} > http://localhost:8983/solr/collection1/select?q=*:*&fl=foo:sum%284,5%29 > http://localhost:8983/solr/collection1/select?fl=score&q={!func}strdist%28%22foo%22,%22fo%22,edit%29 > {noformat} > Examples using the literal function that fail... > {noformat} > http://localhost:8983/solr/collection1/select?q=*:*&fl=foo:literal%28%22foo%22%29 > http://localhost:8983/solr/collection1/select?fl=score&q={!func}strdist%28%22foo%22,literal%28%22fo%22%29,edit%29 > {noformat}
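The class of bug described above -- returning the whole input instead of parsing the next argument -- can be shown with a toy parser. This is a hedged sketch, not Solr's FunctionQParser; the class and method names merely mirror the ones mentioned in the comment.

```java
public class ArgParsing {
    // Toy parser state: the full input and a cursor into it.
    static final class Parser {
        final String input;
        int pos;
        Parser(String input, int pos) { this.input = input; this.pos = pos; }
        // Returns the ENTIRE input -- the wrong call for reading one argument.
        String getString() { return input; }
        // Parses the next quoted argument starting at the cursor -- the right call.
        String parseArg() {
            int start = input.indexOf('"', pos) + 1;
            int end = input.indexOf('"', start);
            pos = end + 1;
            return input.substring(start, end);
        }
    }

    public static void main(String[] args) {
        // Cursor positioned just after "literal(":
        Parser p = new Parser("literal(\"foo\")", "literal(".length());
        System.out.println(p.getString()); // literal("foo")  <- whole input, not an argument
        System.out.println(p.parseArg());  // foo             <- the actual argument
    }
}
```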
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295426#comment-13295426 ] Chris Russell commented on SOLR-2894: - Erik, I can't get your patch to apply cleanly to solr 1350445 $ patch -p0 -i SOLR-2894.patch patching file solr/core/src/test/org/apache/solr/handler/component/DistributedFacetPivotTest.java patching file solr/core/src/java/org/apache/solr/handler/component/EntryCountComparator.java patching file solr/core/src/java/org/apache/solr/handler/component/PivotNamedListCountComparator.java patching file solr/core/src/java/org/apache/solr/handler/component/PivotFacetHelper.java Hunk #2 FAILED at 103. 1 out of 2 hunks FAILED -- saving rejects to file solr/core/src/java/org/apache/solr/handler/component/PivotFacetHelper.java.rej patching file solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java Hunk #11 FAILED at 799. 1 out of 17 hunks FAILED -- saving rejects to file solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java.rej patching file solr/core/src/java/org/apache/solr/util/PivotListEntry.java patching file solr/solrj/src/java/org/apache/solr/common/params/FacetParams.java patching file solr/test-framework/src/java/org/apache/solr/BaseDistributedSearchTestCase.java > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Erik Hatcher > Fix For: 4.0 > > Attachments: SOLR-2894.patch, SOLR-2894.patch, > distributed_pivot.patch, distributed_pivot.patch > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. 
[jira] [Resolved] (SOLR-3267) TestSort failures (reproducible)
[ https://issues.apache.org/jira/browse/SOLR-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-3267. Resolution: Fixed I think my thought process when putting the "numberOfOddities" check in that test was that we shouldn't fail if a randomly generated string just happened to wind up being a valid function, or all control/blank characters (or "score", or "_docid_") ... but that if it happened more than a non-trivial number of times, that was odd and should cause a failure so someone would look at the test. Looking at some of the problematic seeds, I realized that the common situation with oddities was: * randomly generated strings that were all whitespace and/or control characters * randomly generated strings that were valid quote sequences (which means they can be treated as a (literal) function). So I changed it as follows... * removed all the "oddity" checking * added a loop in the event that a random string is all whitespace, but made it fail hard if 37 attempts all produce strings that are entirely whitespace (rather than an "infinite" loop) * improved the "munging" of the random strings to ensure they aren't valid functions (or literal quoted strings) * made the test fail hard if any string produced parses as a function or query instead of a field name. Committed revision 1350444. - trunk Committed revision 1350445. - 4x > TestSort failures (reproducible) > > > Key: SOLR-3267 > URL: https://issues.apache.org/jira/browse/SOLR-3267 > Project: Solr > Issue Type: Bug >Reporter: Dawid Weiss >Assignee: Hoss Man > Fix For: 4.0 > > > {noformat} > Over 0.2% oddities in test: 14/6386 have func/query parsing semenatics gotten > broader? > {noformat} > Huh? 
Steps to reproduce: > {noformat} > ant test -Dtestcase=TestSort -Dtestmethod=testRandomFieldNameSorts > -Dtests.seed=-3e789c8564f08cbd:515c61b079794ea7:-6347ac0df7ad45c0 > -Dargs="-Dfile.encoding=UTF-8" > [junit] Testcase: > testRandomFieldNameSorts(org.apache.solr.search.TestSort):FAILED > [junit] Over 0.2% oddities in test: 14/6386 have func/query parsing > semenatics gotten broader? > [junit] junit.framework.AssertionFailedError: Over 0.2% oddities in test: > 14/6386 have func/query parsing semenatics gotten broader? > [junit] at org.junit.Assert.fail(Assert.java:93) > [junit] at org.junit.Assert.assertTrue(Assert.java:43) > [junit] at > org.apache.solr.search.TestSort.testRandomFieldNameSorts(TestSort.java:145) > [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > [junit] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > [junit] at java.lang.reflect.Method.invoke(Method.java:597) > [junit] at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) > [junit] at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > [junit] at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) > [junit] at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > [junit] at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > [junit] at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) > [junit] at > org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) > [junit] at > org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:739) > [junit] at > org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:655) > [junit] at > 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) > [junit] at > org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:566) > [junit] at > org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) > [junit] at > org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:628) > [junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18) > [junit] at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) > [junit] at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) > [junit] at > org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) > [junit] at > org.apache.lucene.util.LuceneTestCaseRunner.runChild(Lu
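The retry strategy in the commit described above -- keep drawing random strings, but fail hard after a bounded number of all-whitespace draws rather than looping forever -- can be sketched like this. The limit of 37 mirrors the comment; the string generator is a stand-in, not the test's actual one.

```java
import java.util.Random;

public class NonWhitespaceFieldName {
    // Draw random strings until one is not all whitespace; fail hard after
    // 37 consecutive all-whitespace draws instead of spinning forever.
    static String randomNonWhitespaceString(Random r) {
        for (int attempt = 0; attempt < 37; attempt++) {
            String s = randomString(r);
            if (!s.trim().isEmpty()) return s;
        }
        throw new AssertionError("37 attempts in a row produced all-whitespace strings");
    }

    // Illustrative generator: 1-8 printable ASCII characters.
    static String randomString(Random r) {
        StringBuilder sb = new StringBuilder();
        int len = 1 + r.nextInt(8);
        for (int i = 0; i < len; i++) sb.append((char) (32 + r.nextInt(95)));
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = randomNonWhitespaceString(new Random(42));
        System.out.println(s.trim().isEmpty()); // false
    }
}
```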
[jira] [Resolved] (SOLR-3542) Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440)
[ https://issues.apache.org/jira/browse/SOLR-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-3542. -- Resolution: Fixed Fix Version/s: 5.0 Committed in trunk and 4x. I also set WeightedFragListBuilder as the default in the example solrconfig.xml. Many thanks, Sebastian! > Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440) > - > > Key: SOLR-3542 > URL: https://issues.apache.org/jira/browse/SOLR-3542 > Project: Solr > Issue Type: Improvement > Components: highlighter >Affects Versions: 4.0 >Reporter: Sebastian Lutze >Assignee: Koji Sekiguchi >Priority: Minor > Labels: FastVectorHighlighter, highlight, patch > Fix For: 4.0, 5.0 > > Attachments: SOLR-3542.patch > > > This patch integrates a weight-based approach for sorting highlighted > fragments. > See LUCENE-4133 (Part of LUCENE-3440). > This patch contains: > - Introduction of class WeightedFragListBuilder, an implementation of > SolrFragListBuilder > - Updated example configuration
[jira] [Updated] (LUCENE-4146) -Dtests.iters combined with -Dtestmethod never fails?
[ https://issues.apache.org/jira/browse/LUCENE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-4146: - Attachment: TEST-org.apache.lucene.TestSearch.no-iters-fail.xml TEST-org.apache.lucene.TestSearch.iters-no-fail.xml LUCENE-4146.fail.patch trivial patch adding a test that is guaranteed to fail. When run simply, it fails as expected... {noformat} hossman@bester:~/lucene/4x_dev/lucene/core$ ant test -Dtestcase=TestSearch -Dtestmethod=testFailure Buildfile: /home/hossman/lucene/4x_dev/lucene/core/build.xml ... test: [junit4] says ¡Hola! Master seed: E9EE2618BEEE855E [junit4] Executing 1 suite with 1 JVM. [junit4] Suite: org.apache.lucene.TestSearch [junit4] FAILURE 0.14s | TestSearch.testFailure [junit4]> Throwable #1: java.lang.AssertionError: This statement is false [junit4]>at __randomizedtesting.SeedInfo.seed([E9EE2618BEEE855E:8153D5F484DEE7F1]:0) [junit4]>at org.junit.Assert.fail(Assert.java:93) [junit4]>at org.junit.Assert.assertTrue(Assert.java:43) [junit4]>at org.apache.lucene.TestSearch.testFailure(TestSearch.java:39) ... [junit4]> [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestSearch -Dtests.method=testFailure -Dtests.seed=E9EE2618BEEE855E -Dtests.locale=es_PA -Dtests.timezone=Pacific/Chatham -Dargs="-Dfile.encoding=UTF-8" [junit4] 2> [junit4]> (@AfterClass output) [junit4] 2> NOTE: test params are: codec=Lucene40: {}, sim=RandomSimilarityProvider(queryNorm=false,coord=false): {}, locale=es_PA, timezone=Pacific/Chatham [junit4] 2> NOTE: Linux 2.6.31-23-generic amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=2,threads=1,free=105287320,total=124125184 [junit4] 2> NOTE: All tests run in this JVM: [TestSearch] [junit4] 2> [junit4] Completed in 0.37s, 1 test, 1 failure <<< FAILURES! [junit4] [junit4] JVM J0: 0.53 .. 1.50 = 0.97s [junit4] Execution time total: 1.55 sec. 
[junit4] Tests summary: 1 suite, 1 test, 1 failure BUILD FAILED /home/hossman/lucene/4x_dev/lucene/common-build.xml:1019: The following error occurred while executing this line: /home/hossman/lucene/4x_dev/lucene/common-build.xml:745: There were test failures: 1 suite, 1 test, 1 failure Total time: 5 seconds hossman@bester:~/lucene/4x_dev/lucene/core$ cp ../build/core/test/TEST-org.apache.lucene.TestSearch.xml ~/tmp/TEST-org.apache.lucene.TestSearch.no-iters-fail.xml {noformat} However, when using -Dtests.iters, the test "passes" - but there's no obvious record that it even ran... {noformat} hossman@bester:~/lucene/4x_dev/lucene/core$ ant test -Dtests.iters=2 -Dtestcase=TestSearch -Dtestmethod=testFailure Buildfile: /home/hossman/lucene/4x_dev/lucene/core/build.xml ... test: [junit4] says cześć. Master seed: 9BA05DE6F296F7C4 [junit4] Executing 1 suite with 1 JVM. [junit4] Suite: org.apache.lucene.TestSearch [junit4] Completed in 0.07s, 0 tests [junit4] [junit4] JVM J0: 0.73 .. 1.45 = 0.71s [junit4] Execution time total: 1.47 sec. [junit4] Tests summary: 1 suite, 0 tests [echo] 5 slowest tests: [tophints] 0.15s | org.apache.lucene.TestSearch BUILD SUCCESSFUL Total time: 5 seconds hossman@bester:~/lucene/4x_dev/lucene/core$ cp ../build/core/test/TEST-org.apache.lucene.TestSearch.xml ~/tmp/TEST-org.apache.lucene.TestSearch.iters-no-fail.xml {noformat} (note in the XML file that it says no tests were run) > -Dtests.iters combined with -Dtestmethod never fails? > - > > Key: LUCENE-4146 > URL: https://issues.apache.org/jira/browse/LUCENE-4146 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man > Attachments: LUCENE-4146.fail.patch, > TEST-org.apache.lucene.TestSearch.iters-no-fail.xml, > TEST-org.apache.lucene.TestSearch.no-iters-fail.xml > > > a test that is hardcoded to fail will report success if you run it with > -Dtests.iters
[jira] [Created] (LUCENE-4146) -Dtests.iters combined with -Dtestmethod never fails?
Hoss Man created LUCENE-4146: Summary: -Dtests.iters combined with -Dtestmethod never fails? Key: LUCENE-4146 URL: https://issues.apache.org/jira/browse/LUCENE-4146 Project: Lucene - Java Issue Type: Improvement Reporter: Hoss Man A test that is hardcoded to fail will report success if you run it with -Dtests.iters
[JENKINS] Lucene-Solr-4.x-Windows-Java6-64 - Build # 76 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java6-64/76/ 1 tests failed. REGRESSION: org.apache.solr.spelling.suggest.SuggesterTSTTest.testReload Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([847D645062E1B0E6:438D1C53A8A248F4]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:459) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) at org.apache.solr.spelling.suggest.SuggesterTest.testReload(SuggesterTest.java:91) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='ac']/int[@name='numFound'][.='2'] xml response was: 04 request was:q=ac&spellcheck.count=2&qt=/suggest_tst&spellcheck.onlyMorePopular=true at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:452) ... 39 more Build Log: [...truncated 10016 lines...] [junit4] 2>at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.eval
Re: Corrupt index
I can confirm 2.9.4 had autoCommit, but it is gone in 3.0.3 already, so Lucene.Net doesn't have autoCommit. So I don't have autoCommit set to true, but I can clearly see a segments_1 file there along with the other files. If that helps, it always keeps the name segments_1 at 32 bytes; it never changes. And again, if I kill the process and try to open the index with Luke 3.3, the index folder is being wiped out. Not sure what to make of all that. On Fri, Jun 15, 2012 at 3:21 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hmm, OK: in 2.9.4 / 3.0.x, if you open IW on a new directory, it will > make a zero-segment commit. This was changed/fixed in 3.1 with > LUCENE-2386. > > In 2.9.x (not 3.0.x) there is still an autoCommit parameter, > defaulting to false, but if you set it to true then IndexWriter will > periodically commit. > > Seeing segment files created and merge is definitely expected, but > it's not expected to see segments_N files unless you pass > autoCommit=true. > > Mike McCandless > > http://blog.mikemccandless.com > > On Thu, Jun 14, 2012 at 8:14 PM, Itamar Syn-Hershko > wrote: > > Not what I'm seeing. I actually see a lot of segments created and merged > > while it operates. Expected? > > > > Reminding you, this is 2.9.4 / 3.0.3 > > > > On Fri, Jun 15, 2012 at 3:10 AM, Michael McCandless > > wrote: > >> > >> Right: Lucene never autocommits anymore ... > >> > >> If you create a new index, add a bunch of docs, and things crash > >> before you have a chance to commit, then there is no index (not even a > >> 0 doc one) in that directory. > >> > >> Mike McCandless > >> > >> http://blog.mikemccandless.com > >> > >> On Thu, Jun 14, 2012 at 1:41 PM, Itamar Syn-Hershko > > >> wrote: > >> > I'm quite certain this shouldn't happen also when Commit wasn't > called. > >> > > >> > Mike, can you comment on that? 
> >> > > >> > On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens > >> > wrote: > >> >> > >> >> Well, the only thing I see is that there is no place where > >> >> writer.Commit() > >> >> is called in the delegate assigned to corpusReader.OnDocument. I > know > >> >> that > >> >> lucene is very transactional, and at least in 3.x, the writer will > >> >> never > >> >> auto commit to the index. You can write millions of documents, but > if > >> >> commit is never called, those documents aren't actually part of the > >> >> index. > >> >> Committing isn't a cheap operation, so you definitely don't want to > do > >> >> it > >> >> on every document. > >> >> > >> >> You can test it yourself with this (naive) solution. Right below the > >> >> writer.SetUseCompoundFile(false) line, add "int numDocsAdded = 0;". > At > >> >> the > >> >> end of the corpusReader.OnDocument delegate add: > >> >> > >> >> // Example only. I wouldn't suggest committing this often > >> >> if(++numDocsAdded % 5 == 0) > >> >> { > >> >>writer.Commit(); > >> >> } > >> >> > >> >> I had the application crash for real on this file: > >> >> > >> >> > >> >> > http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2 > , > >> >> about 20% into the operation. Without the commit, the index is > empty. > >> >> Add > >> >> it in, and I get 755 files in the index after it crashes. > >> >> > >> >> > >> >> Thanks, > >> >> Christopher > >> >> > >> >> On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko > >> >> wrote: > >> >> > >> >> > >> >> > Yes, reproduced in first try. See attached program - I referenced > it > >> >> > to > >> >> > current trunk. > >> >> > > >> >> > > >> >> > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko > >> >> > wrote: > >> >> > > >> >> >> Christopher, > >> >> >> > >> >> >> I used the IndexBuilder app from here > >> >> >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThings > >> >> >> with a > >> >> >> 8.5GB wikipedia dump. 
> >> >> >> > >> >> >> After running for 2.5 days I had to forcefully close it (infinite > >> >> >> loop > >> >> >> in > >> >> >> the wiki-markdown parser at 92%, go figure), and the 40-something > GB > >> >> >> index > >> >> >> I had by then was unusable. I then was able to reproduce this > >> >> >> > >> >> >> Please note I now added a few safe-guards you might want to remove > >> >> >> to > >> >> >> make sure the app really crashes on process kill. > >> >> >> > >> >> >> I'll try to come up with a better way to reproduce this - > hopefully > >> >> >> Mike > >> >> >> will be able to suggest better ways than manual process kill... > >> >> >> > >> >> >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens < > >> >> >> currens.ch...@gmail.com> wrote: > >> >> >> > >> >> >>> Mike, The codebase for lucene.net should be almost identical to > >> >> >>> java's > >> >> >>> 3.0.3 release, and LUCENE-1044 is included in that. > >> >> >>> > >> >> >>> Itamar, are you committing the index regularly? I only ask > because > >> >> >>> I > >> >> >>> can't > >> >> >>> reproduce it myself by forcibly terminating the process while > it
[jira] [Updated] (LUCENE-4145) "Unhandled exception" from test framework (in json parsing of test output files?)
[ https://issues.apache.org/jira/browse/LUCENE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-4145: - Summary: "Unhandled exception" from test framework (in json parsing of test output files?) (was: "Unhandled exception" from test framework?) FWIW: i can reproduce this fairly trivially ... let me know if you want me to capture anything in particular. > "Unhandled exception" from test framework (in json parsing of test output > files?) > - > > Key: LUCENE-4145 > URL: https://issues.apache.org/jira/browse/LUCENE-4145 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man >Assignee: Dawid Weiss > > Working on SOLR-3267 i got a weird exception printed to the junit output... > {noformat} >[junit4] Unhandled exception in thread: Thread[pumper-events,5,main] >[junit4] > com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonParseException: > No such reference: id#org.apache.solr.search.TestSort[3] > ... > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4145) "Unhandled exception" from test framework?
[ https://issues.apache.org/jira/browse/LUCENE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295404#comment-13295404 ] Hoss Man commented on LUCENE-4145: -- Execution was... {noformat} hossman@bester:~/lucene/dev/solr/core$ ant test -Dtests.iters=10 -Dtestcase=TestSort -Dtestmethod=testRandomFieldNameSorts ... validate: common.test: [junit4] says aloha! Master seed: D6A9197BD551566E [junit4] Executing 1 suite with 1 JVM. [junit4] Unhandled exception in thread: Thread[pumper-events,5,main] [junit4] com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonParseException: No such reference: id#org.apache.solr.search.TestSort[3] [junit4] at com.carrotsearch.ant.tasks.junit4.events.json.JsonDescriptionAdapter.deserialize(JsonDescriptionAdapter.java:90) [junit4] at com.carrotsearch.ant.tasks.junit4.events.json.JsonDescriptionAdapter.deserialize(JsonDescriptionAdapter.java:15) [junit4] at com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonDeserializerExceptionWrapper.deserialize(JsonDeserializerExceptionWrapper.java:51) [junit4] at com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.GsonToMiniGsonTypeAdapterFactory$3.read(GsonToMiniGsonTypeAdapterFactory.java:85) [junit4] at com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:86) [junit4] at com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:170) [junit4] at com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.Gson.fromJson(Gson.java:720) [junit4] at com.carrotsearch.ant.tasks.junit4.events.Deserializer.deserialize(Deserializer.java:31) [junit4] at com.carrotsearch.ant.tasks.junit4.LocalSlaveStreamHandler.pumpEvents(LocalSlaveStreamHandler.java:100) [junit4] at 
com.carrotsearch.ant.tasks.junit4.LocalSlaveStreamHandler$1.run(LocalSlaveStreamHandler.java:73) [junit4] at java.lang.Thread.run(Thread.java:662) ... {noformat} ...and the (ant) process is still running, but no files in solr/build/solr-core/test have been modified in over 20 minutes. > "Unhandled exception" from test framework? > -- > > Key: LUCENE-4145 > URL: https://issues.apache.org/jira/browse/LUCENE-4145 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man >Assignee: Dawid Weiss > > Working on SOLR-3267 i got a weird exception printed to the junit output... > {noformat} >[junit4] Unhandled exception in thread: Thread[pumper-events,5,main] >[junit4] > com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonParseException: > No such reference: id#org.apache.solr.search.TestSort[3] > ... > {noformat}
[jira] [Commented] (SOLR-3542) Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440)
[ https://issues.apache.org/jira/browse/SOLR-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295403#comment-13295403 ] Koji Sekiguchi commented on SOLR-3542: -- Patch looks good! Will commit soon. > Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440) > - > > Key: SOLR-3542 > URL: https://issues.apache.org/jira/browse/SOLR-3542 > Project: Solr > Issue Type: Improvement > Components: highlighter >Affects Versions: 4.0 >Reporter: Sebastian Lutze >Assignee: Koji Sekiguchi >Priority: Minor > Labels: FastVectorHighlighter, highlight, patch > Fix For: 4.0 > > Attachments: SOLR-3542.patch > > > This patch integrates a weight-based approach for sorting highlighted > fragments. > See LUCENE-4133 (Part of LUCENE-3440). > This patch contains: > - Introduction of class WeightedFragListBuilder, an implementation of > SolrFragListBuilder > - Updated example-configuration
[jira] [Created] (LUCENE-4145) "Unhandled exception" from test framework?
Hoss Man created LUCENE-4145: Summary: "Unhandled exception" from test framework? Key: LUCENE-4145 URL: https://issues.apache.org/jira/browse/LUCENE-4145 Project: Lucene - Java Issue Type: Improvement Reporter: Hoss Man Assignee: Dawid Weiss Working on SOLR-3267 i got a weird exception printed to the junit output... {noformat} [junit4] Unhandled exception in thread: Thread[pumper-events,5,main] [junit4] com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonParseException: No such reference: id#org.apache.solr.search.TestSort[3] ... {noformat}
Re: Corrupt index
Hmm, OK: in 2.9.4 / 3.0.x, if you open IW on a new directory, it will make a zero-segment commit. This was changed/fixed in 3.1 with LUCENE-2386. In 2.9.x (not 3.0.x) there is still an autoCommit parameter, defaulting to false, but if you set it to true then IndexWriter will periodically commit. Seeing segment files created and merge is definitely expected, but it's not expected to see segments_N files unless you pass autoCommit=true. Mike McCandless http://blog.mikemccandless.com On Thu, Jun 14, 2012 at 8:14 PM, Itamar Syn-Hershko wrote: > Not what I'm seeing. I actually see a lot of segments created and merged > while it operates. Expected? > > Reminding you, this is 2.9.4 / 3.0.3 > > On Fri, Jun 15, 2012 at 3:10 AM, Michael McCandless > wrote: >> >> Right: Lucene never autocommits anymore ... >> >> If you create a new index, add a bunch of docs, and things crash >> before you have a chance to commit, then there is no index (not even a >> 0 doc one) in that directory. >> >> Mike McCandless >> >> http://blog.mikemccandless.com >> >> On Thu, Jun 14, 2012 at 1:41 PM, Itamar Syn-Hershko >> wrote: >> > I'm quite certain this shouldn't happen also when Commit wasn't called. >> > >> > Mike, can you comment on that? >> > >> > On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens >> > wrote: >> >> >> >> Well, the only thing I see is that there is no place where >> >> writer.Commit() >> >> is called in the delegate assigned to corpusReader.OnDocument. I know >> >> that >> >> lucene is very transactional, and at least in 3.x, the writer will >> >> never >> >> auto commit to the index. You can write millions of documents, but if >> >> commit is never called, those documents aren't actually part of the >> >> index. >> >> Committing isn't a cheap operation, so you definitely don't want to do >> >> it >> >> on every document. >> >> >> >> You can test it yourself with this (naive) solution. Right below the >> >> writer.SetUseCompoundFile(false) line, add "int numDocsAdded = 0;". 
At >> >> the >> >> end of the corpusReader.OnDocument delegate add: >> >> >> >> // Example only. I wouldn't suggest committing this often >> >> if(++numDocsAdded % 5 == 0) >> >> { >> >> writer.Commit(); >> >> } >> >> >> >> I had the application crash for real on this file: >> >> >> >> >> >> http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2, >> >> about 20% into the operation. Without the commit, the index is empty. >> >> Add >> >> it in, and I get 755 files in the index after it crashes. >> >> >> >> >> >> Thanks, >> >> Christopher >> >> >> >> On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko >> >> wrote: >> >> >> >> >> >> > Yes, reproduced in first try. See attached program - I referenced it >> >> > to >> >> > current trunk. >> >> > >> >> > >> >> > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko >> >> > wrote: >> >> > >> >> >> Christopher, >> >> >> >> >> >> I used the IndexBuilder app from here >> >> >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThings >> >> >> with a >> >> >> 8.5GB wikipedia dump. >> >> >> >> >> >> After running for 2.5 days I had to forcefully close it (infinite >> >> >> loop >> >> >> in >> >> >> the wiki-markdown parser at 92%, go figure), and the 40-something GB >> >> >> index >> >> >> I had by then was unusable. I then was able to reproduce this >> >> >> >> >> >> Please note I now added a few safe-guards you might want to remove >> >> >> to >> >> >> make sure the app really crashes on process kill. >> >> >> >> >> >> I'll try to come up with a better way to reproduce this - hopefully >> >> >> Mike >> >> >> will be able to suggest better ways than manual process kill... >> >> >> >> >> >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens < >> >> >> currens.ch...@gmail.com> wrote: >> >> >> >> >> >>> Mike, The codebase for lucene.net should be almost identical to >> >> >>> java's >> >> >>> 3.0.3 release, and LUCENE-1044 is included in that. 
>> >> >>> >> >> >>> Itamar, are you committing the index regularly? I only ask because >> >> >>> I >> >> >>> can't >> >> >>> reproduce it myself by forcibly terminating the process while it's >> >> >>> indexing. I've tried both 3.0.3 and 2.9.4. If I don't commit at >> >> >>> all >> >> >>> and >> >> >>> terminate the process (even with a 10,000 4K documents created), >> >> >>> there >> >> >>> will >> >> >>> be no documents in the index when I open it in luke, which I >> >> >>> expect. >> >> >>> If >> >> >>> I >> >> >>> commit at 10,000 documents, and terminate it a few thousand after >> >> >>> that, >> >> >>> the >> >> >>> index has the first ten thousand that were committed. I've even >> >> >>> terminated >> >> >>> it *while* a second commit was taking place, and it still had all >> >> >>> of >> >> >>> the >> >> >>> documents I expected. >> >> >>> >> >> >>> It may be that I'm not trying to reproducing it correctly. Do you >> >> >>> have a >> >> >>> minimal amount of code that can reproduce it? >>
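The pattern Christopher describes in this thread — count documents as they are added and call commit every N documents, since anything added but never committed is not part of the index after a crash — can be sketched standalone. The `IndexWriterStub` class below is a hypothetical stand-in used only so the example runs without Lucene; it is not Lucene's real `IndexWriter` API. Only the counting/commit cadence mirrors the thread:

```java
// Standalone sketch of the commit-every-N pattern discussed above.
// IndexWriterStub is a made-up stand-in: like Lucene's IndexWriter in 3.x,
// added documents only become durable/visible once commit() is called.
import java.util.ArrayList;
import java.util.List;

public class PeriodicCommit {
    static class IndexWriterStub {
        private final List<String> pending = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();
        void addDocument(String doc) { pending.add(doc); }
        void commit() { committed.addAll(pending); pending.clear(); }
        int committedCount() { return committed.size(); }
    }

    public static void main(String[] args) {
        IndexWriterStub writer = new IndexWriterStub();
        int commitEvery = 1000;   // tune: commits are expensive, so not per-document
        int numDocsAdded = 0;
        for (int i = 0; i < 3500; i++) {
            writer.addDocument("doc-" + i);
            if (++numDocsAdded % commitEvery == 0) {
                writer.commit();  // make the last batch durable
            }
        }
        // If the process were killed here, only committed batches survive:
        System.out.println(writer.committedCount()); // prints 3000, not 3500
    }
}
```

The trade-off is exactly the one discussed in the thread: committing per document is prohibitively slow, while never committing means a crash leaves an empty (or stale) index, so a batch interval is chosen in between.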
[jira] [Assigned] (SOLR-3267) TestSort failures (reproducible)
[ https://issues.apache.org/jira/browse/SOLR-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reassigned SOLR-3267: -- Assignee: Hoss Man > TestSort failures (reproducible) > > > Key: SOLR-3267 > URL: https://issues.apache.org/jira/browse/SOLR-3267 > Project: Solr > Issue Type: Bug >Reporter: Dawid Weiss >Assignee: Hoss Man > Fix For: 4.0 > > > {noformat} > Over 0.2% oddities in test: 14/6386 have func/query parsing semenatics gotten > broader? > {noformat} > Huh? Steps to reproduce: > {noformat} > ant test -Dtestcase=TestSort -Dtestmethod=testRandomFieldNameSorts > -Dtests.seed=-3e789c8564f08cbd:515c61b079794ea7:-6347ac0df7ad45c0 > -Dargs="-Dfile.encoding=UTF-8" > [junit] Testcase: > testRandomFieldNameSorts(org.apache.solr.search.TestSort):FAILED > [junit] Over 0.2% oddities in test: 14/6386 have func/query parsing > semenatics gotten broader? > [junit] junit.framework.AssertionFailedError: Over 0.2% oddities in test: > 14/6386 have func/query parsing semenatics gotten broader? 
> [junit] at org.junit.Assert.fail(Assert.java:93) > [junit] at org.junit.Assert.assertTrue(Assert.java:43) > [junit] at > org.apache.solr.search.TestSort.testRandomFieldNameSorts(TestSort.java:145) > [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > [junit] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > [junit] at java.lang.reflect.Method.invoke(Method.java:597) > [junit] at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) > [junit] at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > [junit] at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) > [junit] at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > [junit] at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > [junit] at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) > [junit] at > org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) > [junit] at > org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:739) > [junit] at > org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:655) > [junit] at > org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) > [junit] at > org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:566) > [junit] at > org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) > [junit] at > org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:628) > [junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18) > [junit] at > 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) > [junit] at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) > [junit] at > org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) > [junit] at > org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) > [junit] at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) > [junit] at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) > [junit] at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) > [junit] at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) > [junit] at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) > [junit] at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > [junit] at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) > [junit] at > org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) > [junit] at > org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) > [junit] at > org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) > [junit] at > org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) > [junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18) > [junit] at org.junit.runners.ParentR
Re: Corrupt index
Not what I'm seeing. I actually see a lot of segments created and merged while it operates. Expected? Reminding you, this is 2.9.4 / 3.0.3 On Fri, Jun 15, 2012 at 3:10 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Right: Lucene never autocommits anymore ... > > If you create a new index, add a bunch of docs, and things crash > before you have a chance to commit, then there is no index (not even a > 0 doc one) in that directory. > > Mike McCandless > > http://blog.mikemccandless.com > > On Thu, Jun 14, 2012 at 1:41 PM, Itamar Syn-Hershko > wrote: > > I'm quite certain this shouldn't happen also when Commit wasn't called. > > > > Mike, can you comment on that? > > > > On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens > > wrote: > >> > >> Well, the only thing I see is that there is no place where > writer.Commit() > >> is called in the delegate assigned to corpusReader.OnDocument. I know > >> that > >> lucene is very transactional, and at least in 3.x, the writer will never > >> auto commit to the index. You can write millions of documents, but if > >> commit is never called, those documents aren't actually part of the > index. > >> Committing isn't a cheap operation, so you definitely don't want to do > it > >> on every document. > >> > >> You can test it yourself with this (naive) solution. Right below the > >> writer.SetUseCompoundFile(false) line, add "int numDocsAdded = 0;". At > >> the > >> end of the corpusReader.OnDocument delegate add: > >> > >> // Example only. I wouldn't suggest committing this often > >> if(++numDocsAdded % 5 == 0) > >> { > >>writer.Commit(); > >> } > >> > >> I had the application crash for real on this file: > >> > >> > http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2 > , > >> about 20% into the operation. Without the commit, the index is empty. > >> Add > >> it in, and I get 755 files in the index after it crashes. 
> >> > >> > >> Thanks, > >> Christopher > >> > >> On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko > >> wrote: > >> > >> > >> > Yes, reproduced in first try. See attached program - I referenced it > to > >> > current trunk. > >> > > >> > > >> > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko > >> > wrote: > >> > > >> >> Christopher, > >> >> > >> >> I used the IndexBuilder app from here > >> >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThingswith a > >> >> 8.5GB wikipedia dump. > >> >> > >> >> After running for 2.5 days I had to forcefully close it (infinite > loop > >> >> in > >> >> the wiki-markdown parser at 92%, go figure), and the 40-something GB > >> >> index > >> >> I had by then was unusable. I then was able to reproduce this > >> >> > >> >> Please note I now added a few safe-guards you might want to remove to > >> >> make sure the app really crashes on process kill. > >> >> > >> >> I'll try to come up with a better way to reproduce this - hopefully > >> >> Mike > >> >> will be able to suggest better ways than manual process kill... > >> >> > >> >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens < > >> >> currens.ch...@gmail.com> wrote: > >> >> > >> >>> Mike, The codebase for lucene.net should be almost identical to > java's > >> >>> 3.0.3 release, and LUCENE-1044 is included in that. > >> >>> > >> >>> Itamar, are you committing the index regularly? I only ask because > I > >> >>> can't > >> >>> reproduce it myself by forcibly terminating the process while it's > >> >>> indexing. I've tried both 3.0.3 and 2.9.4. If I don't commit at > all > >> >>> and > >> >>> terminate the process (even with a 10,000 4K documents created), > there > >> >>> will > >> >>> be no documents in the index when I open it in luke, which I expect. > >> >>> If > >> >>> I > >> >>> commit at 10,000 documents, and terminate it a few thousand after > >> >>> that, > >> >>> the > >> >>> index has the first ten thousand that were committed. 
I've even > >> >>> terminated > >> >>> it *while* a second commit was taking place, and it still had all of > >> >>> the > >> >>> documents I expected. > >> >>> > >> >>> It may be that I'm not trying to reproducing it correctly. Do you > >> >>> have a > >> >>> minimal amount of code that can reproduce it? > >> >>> > >> >>> > >> >>> Thanks, > >> >>> Christopher > >> >>> > >> >>> On Wed, Jun 13, 2012 at 9:31 AM, Michael McCandless < > >> >>> luc...@mikemccandless.com> wrote: > >> >>> > >> >>> > Hi Itamar, > >> >>> > > >> >>> > One quick question: does Lucene.Net include the fixes done for > >> >>> > LUCENE-1044 (to fsync files on commit)? Those are very important > >> >>> > for > >> >>> > an index to be intact after OS/JVM crash or power loss. > >> >>> > > >> >>> > More responses below: > >> >>> > > >> >>> > On Tue, Jun 12, 2012 at 8:20 PM, Itamar Syn-Hershko < > >> >>> ita...@code972.com> > >> >>> > wrote: > >> >>> > > >> >>> > > I'm a Lucene.Net committer, and there is a chance we have a bug > in > >> >>> our > >> >>> > > FSDirectory impleme
Re: Corrupt index
Right: Lucene never autocommits anymore ... If you create a new index, add a bunch of docs, and things crash before you have a chance to commit, then there is no index (not even a 0 doc one) in that directory. Mike McCandless http://blog.mikemccandless.com On Thu, Jun 14, 2012 at 1:41 PM, Itamar Syn-Hershko wrote: > I'm quite certain this shouldn't happen also when Commit wasn't called. > > Mike, can you comment on that? > > On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens > wrote: >> >> Well, the only thing I see is that there is no place where writer.Commit() >> is called in the delegate assigned to corpusReader.OnDocument. I know >> that >> lucene is very transactional, and at least in 3.x, the writer will never >> auto commit to the index. You can write millions of documents, but if >> commit is never called, those documents aren't actually part of the index. >> Committing isn't a cheap operation, so you definitely don't want to do it >> on every document. >> >> You can test it yourself with this (naive) solution. Right below the >> writer.SetUseCompoundFile(false) line, add "int numDocsAdded = 0;". At >> the >> end of the corpusReader.OnDocument delegate add: >> >> // Example only. I wouldn't suggest committing this often >> if(++numDocsAdded % 5 == 0) >> { >> writer.Commit(); >> } >> >> I had the application crash for real on this file: >> >> http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2, >> about 20% into the operation. Without the commit, the index is empty. >> Add >> it in, and I get 755 files in the index after it crashes. >> >> >> Thanks, >> Christopher >> >> On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko >> wrote: >> >> >> > Yes, reproduced in first try. See attached program - I referenced it to >> > current trunk. 
>> > >> > >> > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko >> > wrote: >> > >> >> Christopher, >> >> >> >> I used the IndexBuilder app from here >> >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThings with a >> >> 8.5GB wikipedia dump. >> >> >> >> After running for 2.5 days I had to forcefully close it (infinite loop >> >> in >> >> the wiki-markdown parser at 92%, go figure), and the 40-something GB >> >> index >> >> I had by then was unusable. I then was able to reproduce this >> >> >> >> Please note I now added a few safe-guards you might want to remove to >> >> make sure the app really crashes on process kill. >> >> >> >> I'll try to come up with a better way to reproduce this - hopefully >> >> Mike >> >> will be able to suggest better ways than manual process kill... >> >> >> >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens < >> >> currens.ch...@gmail.com> wrote: >> >> >> >>> Mike, The codebase for lucene.net should be almost identical to java's >> >>> 3.0.3 release, and LUCENE-1044 is included in that. >> >>> >> >>> Itamar, are you committing the index regularly? I only ask because I >> >>> can't >> >>> reproduce it myself by forcibly terminating the process while it's >> >>> indexing. I've tried both 3.0.3 and 2.9.4. If I don't commit at all >> >>> and >> >>> terminate the process (even with a 10,000 4K documents created), there >> >>> will >> >>> be no documents in the index when I open it in luke, which I expect. >> >>> If >> >>> I >> >>> commit at 10,000 documents, and terminate it a few thousand after >> >>> that, >> >>> the >> >>> index has the first ten thousand that were committed. I've even >> >>> terminated >> >>> it *while* a second commit was taking place, and it still had all of >> >>> the >> >>> documents I expected. >> >>> >> >>> It may be that I'm not trying to reproducing it correctly. Do you >> >>> have a >> >>> minimal amount of code that can reproduce it? 
>> >>> >> >>> >> >>> Thanks, >> >>> Christopher >> >>> >> >>> On Wed, Jun 13, 2012 at 9:31 AM, Michael McCandless < >> >>> luc...@mikemccandless.com> wrote: >> >>> >> >>> > Hi Itamar, >> >>> > >> >>> > One quick question: does Lucene.Net include the fixes done for >> >>> > LUCENE-1044 (to fsync files on commit)? Those are very important >> >>> > for >> >>> > an index to be intact after OS/JVM crash or power loss. >> >>> > >> >>> > More responses below: >> >>> > >> >>> > On Tue, Jun 12, 2012 at 8:20 PM, Itamar Syn-Hershko < >> >>> ita...@code972.com> >> >>> > wrote: >> >>> > >> >>> > > I'm a Lucene.Net committer, and there is a chance we have a bug in >> >>> our >> >>> > > FSDirectory implementation that causes indexes to get corrupted >> >>> > > when >> >>> > > indexing is cut while the IW is still open. As it roots from some >> >>> > > retroactive fixes you made, I'd appreciate your feedback. >> >>> > > >> >>> > > Correct me if I'm wrong, but by design Lucene should be able to >> >>> recover >> >>> > > rather quickly from power failures or app crashes. Since existing >> >>> segment >> >>> > > files are read only, only new segments that are still being >> >>> > > written >> >>> can >> >>> > get
[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295395#comment-13295395 ] Michael McCandless commented on LUCENE-4062: Very cool graphs! Somehow you should turn them into a blog post :) > More fine-grained control over the packed integer implementation that is > chosen > --- > > Key: LUCENE-4062 > URL: https://issues.apache.org/jira/browse/LUCENE-4062 > Project: Lucene - Java > Issue Type: Improvement > Components: core/other >Reporter: Adrien Grand >Assignee: Adrien Grand >Priority: Minor > Labels: performance > Fix For: 4.0 > > Attachments: LUCENE-4062-2.patch, LUCENE-4062.patch, > LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, > LUCENE-4062.patch, LUCENE-4062.patch > > > In order to save space, Lucene has two main PackedInts.Mutable implementations, > one that is very fast and is based on a byte/short/integer/long array > (Direct*) and another one which packs bits in a memory-efficient manner > (Packed*). > The packed implementation tends to be much slower than the direct one, which > discourages some Lucene components from using it. On the other hand, if you store > 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%. > If you are willing to trade some space for speed, you could store 3 of these 21-bit > integers in a long, resulting in an overhead of 1/3 bit per value. One > advantage of this approach is that you never need to read more than one block > to read or write a value, so this can be significantly faster than Packed32 > and Packed64, which always need to read/write two blocks in order to avoid > costly branches. > I ran some tests, and for 1000 21-bit values, this implementation takes > less than 2% more space and has 44% faster writes and 30% faster reads. The > 12-bit version (5 values per block) has the same performance improvement and > a 6% memory overhead compared to the packed implementation. 
> In order to select the best implementation for a given integer size, I wrote > the {{PackedInts.getMutable(valueCount, bitsPerValue, > acceptableOverheadPerValue)}} method. This method selects the fastest > implementation that has less than {{acceptableOverheadPerValue}} wasted bits > per value. For example, if you accept an overhead of 20% > ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty > reasonable, here is what implementations would be selected: > * 1: Packed64SingleBlock1 > * 2: Packed64SingleBlock2 > * 3: Packed64SingleBlock3 > * 4: Packed64SingleBlock4 > * 5: Packed64SingleBlock5 > * 6: Packed64SingleBlock6 > * 7: Direct8 > * 8: Direct8 > * 9: Packed64SingleBlock9 > * 10: Packed64SingleBlock10 > * 11: Packed64SingleBlock12 > * 12: Packed64SingleBlock12 > * 13: Packed64 > * 14: Direct16 > * 15: Direct16 > * 16: Direct16 > * 17: Packed64 > * 18: Packed64SingleBlock21 > * 19: Packed64SingleBlock21 > * 20: Packed64SingleBlock21 > * 21: Packed64SingleBlock21 > * 22: Packed64 > * 23: Packed64 > * 24: Packed64 > * 25: Packed64 > * 26: Packed64 > * 27: Direct32 > * 28: Direct32 > * 29: Direct32 > * 30: Direct32 > * 31: Direct32 > * 32: Direct32 > * 33: Packed64 > * 34: Packed64 > * 35: Packed64 > * 36: Packed64 > * 37: Packed64 > * 38: Packed64 > * 39: Packed64 > * 40: Packed64 > * 41: Packed64 > * 42: Packed64 > * 43: Packed64 > * 44: Packed64 > * 45: Packed64 > * 46: Packed64 > * 47: Packed64 > * 48: Packed64 > * 49: Packed64 > * 50: Packed64 > * 51: Packed64 > * 52: Packed64 > * 53: Packed64 > * 54: Direct64 > * 55: Direct64 > * 56: Direct64 > * 57: Direct64 > * 58: Direct64 > * 59: Direct64 > * 60: Direct64 > * 61: Direct64 > * 62: Direct64 > Under 32 bits per value, only 13, 17 and 22-26 bits per value would still > choose the slower Packed64 implementation. Allowing a 50% overhead would > prevent the packed implementation from being selected for bits per value under 32. 
> Allowing an overhead of 32 bits per value would make sure that a Direct* > implementation is always selected. > Next steps would be to: > * make lucene components use this {{getMutable}} method and let users decide > what trade-off better suits them, > * write a Packed32SingleBlock implementation if necessary (I didn't do it > because I have no 32-bits computer to test the performance improvements). > I think this would allow more fine-grained control over the speed/space > trade-off, what do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jsp
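The single-block idea described in this issue can be sketched in a few lines. The following is an illustrative sketch only (class and method names are invented, not Lucene's real Packed64SingleBlock implementation): fit as many b-bit values as possible into each 64-bit long, so every read or write touches exactly one block, at the cost of a few wasted bits per block.

```java
// Illustrative sketch of the "single block" packing trade-off from this
// issue: e.g. 3 values of 21 bits per long, wasting 64 - 3*21 = 1 bit per
// block (1/3 bit per value). Hypothetical names, not Lucene's actual API.
public class SingleBlockSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final int valuesPerBlock; // e.g. 3 for 21-bit values
    private final long mask;

    public SingleBlockSketch(int valueCount, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.valuesPerBlock = 64 / bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        this.blocks = new long[(valueCount + valuesPerBlock - 1) / valuesPerBlock];
    }

    public void set(int index, long value) {
        int b = index / valuesPerBlock;
        int shift = (index % valuesPerBlock) * bitsPerValue;
        // clear the slot, then OR the new value in -- one block touched
        blocks[b] = (blocks[b] & ~(mask << shift)) | ((value & mask) << shift);
    }

    public long get(int index) {
        int shift = (index % valuesPerBlock) * bitsPerValue;
        return (blocks[index / valuesPerBlock] >>> shift) & mask;
    }

    // wasted bits per value: 64/valuesPerBlock - bitsPerValue
    public double overheadBitsPerValue() {
        return 64.0 / valuesPerBlock - bitsPerValue;
    }

    public static void main(String[] args) {
        SingleBlockSketch s = new SingleBlockSketch(1000, 21);
        s.set(0, 12345);
        s.set(1, (1L << 21) - 1);
        System.out.println(s.get(1) + " overhead=" + s.overheadBitsPerValue());
    }
}
```

For 21-bit values this reproduces the overhead quoted above (1/3 bit per value), while a get or set never crosses a block boundary, which is exactly why the issue reports it being faster than Packed64.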
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295394#comment-13295394 ] Michael McCandless commented on LUCENE-4132: Hmm we are no longer cloning the IWC passed into IW? Maybe we shouldn't remove testReuse? > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Assignee: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". 
> It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC, from that Config, and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and needs to handle a different type only if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
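The split being proposed could look roughly like this; all names here are invented for illustration and do not match Lucene's actual IndexWriterConfig API. The point is that the type system itself documents which settings are "live": only the live subset is reachable from the running writer.

```java
// Hypothetical sketch of the Config / live-config split discussed above.
// The full Config is consumed once at construction; afterwards the writer
// exposes only the "live" subset, so callers cannot touch one-time settings.
public class LiveConfigSketch {
    interface LiveSettings {
        void setRAMBufferSizeMB(double mb);
        double getRAMBufferSizeMB();
    }

    static class Config implements LiveSettings {
        private double ramBufferMB = 16.0;   // live: writer reads it on each use
        private int mergeFactor = 10;        // one-time: fixed after construction
        public void setRAMBufferSizeMB(double mb) { ramBufferMB = mb; }
        public double getRAMBufferSizeMB() { return ramBufferMB; }
        public void setMergeFactor(int mf) { mergeFactor = mf; }
        public int getMergeFactor() { return mergeFactor; }
    }

    static class Writer {
        private final Config config; // not cloned in this sketch
        Writer(Config c) { config = c; }
        // callers only see the live settings; setMergeFactor is unreachable here
        LiveSettings getLiveConfig() { return config; }
    }

    public static void main(String[] args) {
        Writer w = new Writer(new Config());
        w.getLiveConfig().setRAMBufferSizeMB(64.0); // takes effect "live"
        System.out.println(w.getLiveConfig().getRAMBufferSizeMB());
    }
}
```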
[jira] [Commented] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295390#comment-13295390 ] Michael Garski commented on SOLR-2592: -- The reason for requiring the unique id to be hashable is that the distributed real-time get component needs it to retrieve a document based on only the unique id, and that component in turn is required for SolrCloud. Unit tests that exercise the patch thoroughly are still needed and I will be diving into them later this week, so please keep that in mind if you are using this outside of a test environment. > Pluggable shard lookup mechanism for SolrCloud > -- > > Key: SOLR-2592 > URL: https://issues.apache.org/jira/browse/SOLR-2592 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Affects Versions: 4.0 >Reporter: Noble Paul > Attachments: dbq_fix.patch, pluggable_sharding.patch, > pluggable_sharding_V2.patch > > > If the data in a cloud can be partitioned on some criteria (say range, hash, > attribute value etc.), it will be easy to narrow down the search to a smaller > subset of shards and in effect can achieve more efficient search. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
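The comment above explains why a hashable unique id matters: any node can then map an id to its shard with no lookup table, which is what lets distributed real-time get fetch a document from nothing but its id. A minimal sketch of that kind of router (illustrative only, not the patch's actual plugin interface):

```java
// Minimal sketch of hash-based shard lookup: the shard for a document is a
// pure function of its unique id, so a real-time get can route on id alone.
// Hypothetical code, not the SOLR-2592 patch's API.
public class ShardRouterSketch {
    public static int shardFor(String uniqueId, int numShards) {
        // floorMod keeps the result in [0, numShards) even for negative hashes
        return Math.floorMod(uniqueId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardFor("doc-42", 4));
    }
}
```

Because the mapping is deterministic, indexing and real-time get agree on the target shard without any coordination, which is the property the pluggable mechanism has to preserve.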
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295389#comment-13295389 ] Chris Russell commented on SOLR-2894: - Erik, what revision of solr did you apply the patch to? Did you not encounter the issues I encountered? > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Erik Hatcher > Fix For: 4.0 > > Attachments: SOLR-2894.patch, SOLR-2894.patch, > distributed_pivot.patch, distributed_pivot.patch > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux-Java7-64 - Build # 100 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java7-64/100/ 5 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta2.testCompositePk_FullImport Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([3A718873E85EA769:6418C6443206F7BF]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:459) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) at org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta2.add1document(TestSqlEntityProcessorDelta2.java:85) at org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta2.testCompositePk_FullImport(TestSqlEntityProcessorDelta2.java:93) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//*[@numFound='1'] xml response was: 010*:* OR add1documentstandard202.2 request was:start=0&q=*:*+OR+add1document&qt=standard&rows=20&version=2.2 at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:452) ... 40 mor
Re: Welcome Adrien Grand as a new Lucene/Solr committer
Welcome to the team, Adrien! -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 7 June 2012, at 20:11, Michael McCandless wrote: > I'm pleased to announce that Adrien Grand has joined our ranks as a > committer. > > He has been contributing various patches to Lucene/Solr, recently to > Lucene's packed ints implementation, giving a nice performance gain in > some cases. For example check out > http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html > (look for annotation U). > > Adrien, it's tradition that you introduce yourself with a brief bio. > > As soon as your SVN access is set up, you should then be able to add > yourself to the committers list on the website as well. > > Congratulations! > > Mike McCandless > > http://blog.mikemccandless.com > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-1259) scale() function doesn't work in multisegment indexes
[ https://issues.apache.org/jira/browse/SOLR-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1259. Resolution: Fixed Fix Version/s: (was: 4.0) 3.1 Assignee: Yonik Seeley The core bug was evidently fixed long ago, but the issue was left open for future improvements. Those improvements are now tracked in SOLR-3545 > scale() function doesn't work in multisegment indexes > - > > Key: SOLR-1259 > URL: https://issues.apache.org/jira/browse/SOLR-1259 > Project: Solr > Issue Type: Bug >Affects Versions: 1.4 >Reporter: Hoss Man >Assignee: Yonik Seeley > Fix For: 3.1 > > Attachments: SOLR-1259.patch > > > per yonik's comments in an email... > bq. Darn... another SOLR- related issue. scale() will now only scale > per-segment. > ...we either need to fix, or document prior to releasing 1.4 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3545) make scale function more efficient in multi-segment indexes
Hoss Man created SOLR-3545: -- Summary: make scale function more efficient in multi-segment indexes Key: SOLR-3545 URL: https://issues.apache.org/jira/browse/SOLR-3545 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Yonik Seeley offshoot of SOLR-1259 where yonik said... bq. ... handle the situation the same as ord()... via top() to pop back to the top level reader. This isn't so bad since scale() was never really production quality anyway, since it doesn't cache the min and max, recomputing them each time. bq. Committed, and moving the remainder of the work (per-segment fieldcache usage, caching min+max) ... [to future] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
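The inefficiency tracked here is easy to see in a sketch: the scaled value of any document depends on the min and max across ALL segments, so they must be computed over the whole index, and recomputing them on every request is the wasted work that caching would remove. Illustrative code only (not Solr's ScaleFloatFunction):

```java
// Sketch of why scale() wants a cached top-level min/max: one pass over all
// segments yields (min, max), and that pair -- not anything per-segment --
// is what every scaled value depends on. Hypothetical code, not Solr's.
public class ScaleSketch {
    // map v from [min, max] onto [target0, target1]
    static float scale(float v, float min, float max, float target0, float target1) {
        if (max == min) return target0; // degenerate range: everything maps low
        return (v - min) / (max - min) * (target1 - target0) + target0;
    }

    // one pass over all segments; this is the result worth caching
    static float[] minMax(float[][] segments) {
        float min = Float.POSITIVE_INFINITY, max = Float.NEGATIVE_INFINITY;
        for (float[] seg : segments) {
            for (float v : seg) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        return new float[] { min, max };
    }

    public static void main(String[] args) {
        float[][] segs = { { 2f, 10f }, { 4f, 8f } };
        float[] mm = minMax(segs);
        System.out.println(scale(6f, mm[0], mm[1], 0f, 1f));
    }
}
```

Scaling per segment (the SOLR-1259 bug) would use each segment's own min/max and give inconsistent values across segments; computing the global pair once and caching it is the improvement this issue asks for.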
[jira] [Assigned] (SOLR-3041) Solrs using SolrCloud feature for having shared config in ZK, might not all start successfully when started for the first time simultaneously
[ https://issues.apache.org/jira/browse/SOLR-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reassigned SOLR-3041: -- Assignee: Mark Miller Could you please assess & triage this for 4.0? > Solrs using SolrCloud feature for having shared config in ZK, might not all > start successfully when started for the first time simultaneously > - > > Key: SOLR-3041 > URL: https://issues.apache.org/jira/browse/SOLR-3041 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.0 > Environment: Exact version: > https://builds.apache.org/job/Solr-trunk/1718/artifact/artifacts/apache-solr-4.0-2011-12-28_08-33-55.tgz >Reporter: Per Steffensen >Assignee: Mark Miller > Fix For: 4.0 > > Original Estimate: 96h > Remaining Estimate: 96h > > Starting Solr like this > java -DzkHost= -Dbootstrap_confdir=./myproject/conf > -Dcollection.configName=myproject_conf -Dsolr.solr.home=./myproject -jar > start.jar > When not already there (starting Solr for the first time) the content of > ./myproject/conf will be copied by Solr into ZK. That process does not work > very well in parallel, so if the content is not there and I start several > Solrs simultaneously, one or more of them might not start successfully. > I see exceptions like the ones shown below, and the Solrs throwing them will > not work correctly afterwards. > I know that there could be different workarounds, like making sure to always > start one Solr and wait for a while before starting the rest of them, but I > think we should really be more robust in these cases. 
> Regards, Per Steffensen > exception example 1 (the znode causing the problem can be different than > /configs/myproject_conf/protwords.txt) > org.apache.solr.common.cloud.ZooKeeperException: > at > org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:193) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:337) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:294) > at > org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240) > at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93) > at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97) > at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) > at > org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713) > at org.mortbay.jetty.servlet.Context.startContext(Context.java:140) > at > org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282) > at > org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518) > at > org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499) > at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) > at > org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) > at > org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156) > at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) > at > org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) > at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) > at > org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130) > at org.mortbay.jetty.Server.doStart(Server.java:224) > at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) > at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.mortbay.start.Main.invokeMain(Main.java:194) > at org.mortbay.start.Main.start(Main.java:534) > at org.mortbay.start.Main.start(Main.java:441) > at org.mortbay.start.Main.main(Main.java:119) > Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: > KeeperErrorCode = NodeExists for /configs/myproject_conf/protwords.txt > at org.apache.zookeeper.KeeperException.create(KeeperException.java:110) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) > at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) > at > org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java
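The stack trace above ends in a NodeExistsException, which suggests one robustness fix in the spirit of this report: make the config bootstrap idempotent, treating "node already exists" as success, since a concurrent starter has simply won the race. The sketch below simulates ZooKeeper's create() with a map; it is not the actual SolrZkClient code, and all names are invented.

```java
import java.util.concurrent.ConcurrentHashMap;

// Simulated sketch of an idempotent config upload. ConcurrentHashMap stands
// in for ZooKeeper: putIfAbsent models create() throwing NodeExistsException
// when the znode is already there. Treating "already created" as success
// makes simultaneous first starts safe.
public class ZkBootstrapSketch {
    static final ConcurrentHashMap<String, byte[]> fakeZk = new ConcurrentHashMap<>();

    static void ensureConfigNode(String path, byte[] data) {
        // A non-null return means another Solr already uploaded this node;
        // that is not an error for shared, identical config -- just proceed.
        fakeZk.putIfAbsent(path, data);
    }

    public static void main(String[] args) {
        // two "simultaneous" starters upload the same file; neither fails
        ensureConfigNode("/configs/myproject_conf/protwords.txt", new byte[] { 1 });
        ensureConfigNode("/configs/myproject_conf/protwords.txt", new byte[] { 1 });
        System.out.println(fakeZk.size());
    }
}
```

With the real client the equivalent is catching NodeExistsException around create() and continuing, which removes the need for the "start one Solr first" workaround mentioned in the report.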
[jira] [Updated] (SOLR-3313) Rename "Query Type" to "Request Handler" in SolrJ APIs
[ https://issues.apache.org/jira/browse/SOLR-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3313: --- Component/s: (was: web gui) clients - java Summary: Rename "Query Type" to "Request Handler" in SolrJ APIs (was: Rename "Query Type" to "Request Handler" in API and UI ) The Admin UI was already updated to reflect this in another issue, so clarifying scope of summary to be specific about SolrJ. > Rename "Query Type" to "Request Handler" in SolrJ APIs > -- > > Key: SOLR-3313 > URL: https://issues.apache.org/jira/browse/SOLR-3313 > Project: Solr > Issue Type: Improvement > Components: clients - java >Reporter: David Smiley > Fix For: 4.0 > > > Nobody should speak of "query types" any more; it's "request handlers". I > understand we want to retain the "qt" parameter as such but I think we should > change the names of it wherever we can find it. We can leave some older API > methods in place as deprecated. > As an example, in SolrJ I have to call solrQuery.setQueryType("/blah") > instead of setRequestHandler() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
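The "leave some older API methods in place as deprecated" part of this issue is the standard Java shim pattern; a minimal sketch with an invented class name (not the real SolrQuery source):

```java
// Sketch of the proposed rename with a deprecated shim: old callers keep
// compiling (with a warning) while new code uses the clearer name.
// Hypothetical class, for illustration of the deprecation pattern only.
public class QuerySketch {
    private String requestHandler = "/select";

    /** @deprecated use {@link #setRequestHandler(String)} instead */
    @Deprecated
    public void setQueryType(String qt) {
        setRequestHandler(qt); // delegate so behavior cannot drift
    }

    public void setRequestHandler(String handler) {
        this.requestHandler = handler;
    }

    public String getRequestHandler() {
        return requestHandler;
    }

    public static void main(String[] args) {
        QuerySketch q = new QuerySketch();
        q.setQueryType("/blah"); // old name still works via the new one
        System.out.println(q.getRequestHandler());
    }
}
```

Delegating the deprecated method to the new one (rather than duplicating the body) keeps the two names permanently in sync, which is what makes this kind of rename safe to ship.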
[jira] [Updated] (SOLR-725) CoreContainer/CoreDescriptor/SolrCore cleansing
[ https://issues.apache.org/jira/browse/SOLR-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-725: -- Fix Version/s: (was: 4.1) Removing fix version since this issue hasn't gotten much attention lately and doesn't appear to be a priority for anyone at the moment. As always: if someone wants to take on this work they are welcome to do so at any time and the target release can be revisited > CoreContainer/CoreDescriptor/SolrCore cleansing > --- > > Key: SOLR-725 > URL: https://issues.apache.org/jira/browse/SOLR-725 > Project: Solr > Issue Type: Improvement >Affects Versions: 1.3 >Reporter: Henri Biestro > Attachments: solr-725.patch, solr-725.patch, solr-725.patch, > solr-725.patch > > > These 3 classes and the name vs alias handling are somewhat confusing. > The recent SOLR-647 & SOLR-716 have created a bit of flux. > This issue attempts to clarify the model and the list of operations. > h3. CoreDescriptor: describes the parameters of a SolrCore > h4. Definitions > * has one name > ** The CoreDescriptor name may represent multiple aliases; in that > case, first alias is the SolrCore name > * has one instance directory location > * has one config & schema name > h4. Operations > The class is only a parameter passing facility > h3. SolrCore: manages a Lucene index > h4. Definitions > * has one unique *name* (in the CoreContainer) > **the *name* is used in JMX to identify the core > * has one current set of *aliases* > **the name is the first alias > h4. Name & alias operations > * *get name/aliases*: obvious > * *alias*: adds an alias to this SolrCore > * *unalias*: removes an alias from this SolrCore > * *name*: sets the SolrCore name > **potentially impacts JMX registration > * *rename*: picks a new name from the SolrCore aliases > **triggered when alias name is already in use > h3. CoreContainer: manages all relations between cores & descriptors > h4. 
Definitions > * has a set of aliases (each of them pointing to one core) > **ensure alias uniqueness. > h4. SolrCore instance operations > * *load*: makes a SolrCore available for requests > **creates a SolrCore > **registers all SolrCore aliases in the aliases set > **(load = create + register) > * *unload*: removes a core identified by one of its aliases > **stops handling the Lucene index > **all SolrCore aliases are removed > * *reload*: recreate the core identified by one of its aliases > * *create*: create a core from a CoreDescriptor > **readies up the Lucene index > * *register*: registers all aliases of a SolrCore > > h4. SolrCore alias operations > * *swap*: swaps 2 aliases > **method: swap > * *alias*: creates 1 alias for a core, potentially unaliasing a > previously used alias > **The SolrCore name being an alias, this operation might trigger > a SolrCore rename > * *unalias*: removes 1 alias for a core > **The SolrCore name being an alias, this operation might trigger > a SolrCore rename > * *rename*: renames a core > h3. CoreAdminHandler: handles CoreContainer operations > * *load*/*create*: CoreContainer load > * *unload*: CoreContainer unload > * *reload*: CoreContainer reload > * *swap*: CoreContainer swap > * *alias*: CoreContainer alias > * *unalias*: CoreContainer unalias > * *rename*: CoreContainer rename > * *persist*: CoreContainer persist, writes the solr.xml > * *status*: returns the status of all/one SolrCore -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
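The alias-uniqueness and swap operations in the model above reduce to a small invariant over a single map. A sketch with invented names (not Solr's actual CoreContainer):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the CoreContainer alias model described in this issue: one map
// enforces alias uniqueness (each alias points at exactly one core), and
// swap() is just two map writes. Hypothetical, not Solr's real API.
public class AliasSketch {
    static class Core {
        final String id;
        Core(String id) { this.id = id; }
    }

    private final Map<String, Core> aliases = new HashMap<>();

    // alias: point name at core; if the name was in use, the previous core
    // silently loses that alias (the "potentially unaliasing" case above)
    void alias(String name, Core core) {
        aliases.put(name, core);
    }

    // swap: exchange the cores behind two aliases
    void swap(String a, String b) {
        Core ca = aliases.get(a), cb = aliases.get(b);
        aliases.put(a, cb);
        aliases.put(b, ca);
    }

    Core lookup(String name) {
        return aliases.get(name);
    }

    public static void main(String[] args) {
        AliasSketch c = new AliasSketch();
        c.alias("live", new Core("core1"));
        c.alias("staging", new Core("core2"));
        c.swap("live", "staging"); // atomic cutover in the real container
        System.out.println(c.lookup("live").id);
    }
}
```

Keeping all aliases in one container-level map is what makes the uniqueness guarantee cheap to enforce; the subtle cases the issue lists (rename when a core's first alias is removed) are policy layered on top of this same map.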
[jira] [Updated] (SOLR-731) CoreDescriptor.getCoreContainer should not be public
[ https://issues.apache.org/jira/browse/SOLR-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-731: -- Fix Version/s: (was: 4.0) Removing fix version since this issue hasn't gotten much attention lately and doesn't appear to be a priority for anyone for 4.0. As always: if someone wants to take on this work they are welcome to do so at any time and the target release can be revisited In particular: I note that SolrCore.getCoreDescriptor and CoreDescriptor.getCoreContainer both seem to be fairly widely used now throughout the code base, so it's not clear to me that the intent/belief stated in this issue is still valid. > CoreDescriptor.getCoreContainer should not be public > > > Key: SOLR-731 > URL: https://issues.apache.org/jira/browse/SOLR-731 > Project: Solr > Issue Type: Bug >Affects Versions: 1.3 >Reporter: Henri Biestro > Attachments: solr-731.patch > > > For the very same reasons that CoreDescriptor.getCoreProperties did not need > to be public (aka SOLR-724) > It also means the CoreDescriptor ctor should not need a CoreContainer > The CoreDescriptor is only meant to be describing a "to-be created SolrCore". > However, we need access to the CoreContainer from the SolrCore now that we > are guaranteed the CoreContainer always exists. > This is also a natural consequence of SOLR-647 now that the CoreContainer is > not a map of CoreDescriptor but a map of SolrCore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4082) Implement explain in ToParentBlockJoinQuery$BlockJoinWeight
[ https://issues.apache.org/jira/browse/LUCENE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen resolved LUCENE-4082. --- Resolution: Fixed Committed to branch4x and trunk. > Implement explain in ToParentBlockJoinQuery$BlockJoinWeight > --- > > Key: LUCENE-4082 > URL: https://issues.apache.org/jira/browse/LUCENE-4082 > Project: Lucene - Java > Issue Type: Improvement > Components: modules/join >Affects Versions: 3.4, 3.5, 3.6 >Reporter: Christoph Kaser >Assignee: Martijn van Groningen >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4082.patch, LUCENE-4082.patch > > > At the moment, ToParentBlockJoinQuery$BlockJoinWeight.explain throws an > UnsupportedOperationException. It would be useful if it could instead return > the score of parent document, even if the explanation on how that score was > calculated is missing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Corrupt index
> If this is the case, 2328 probably made its way to Lucene.Net since we are > using the released sources for porting, and we now need to apply 3418 in > the current version. Itamar: I confirmed that 2328 is in the latest code. Thanks, Troy On Wed, Jun 13, 2012 at 5:45 PM, Itamar Syn-Hershko wrote: > Mike, > > On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: > >> Hi Itamar, >> >> One quick question: does Lucene.Net include the fixes done for >> LUCENE-1044 (to fsync files on commit)? Those are very important for >> an index to be intact after OS/JVM crash or power loss. >> > > Definitely, as Christopher noted we are about to release a 3.0.3 compatible > version, which is a line-by-line port of the Java version. > > >> You shouldn't even have to run CheckIndex ... because (as of >> LUCENE-1044) we now fsync all segment files before writing the new >> segments_N file, and then removing old segments_N files (and any >> segments that are no longer referenced). >> >> You do have to remove the write.lock if you aren't using >> NativeFSLockFactory (but this has been the default lock impl for a >> while now). >> > > Somewhat unrelated to this thread, but what should I expect to see? From > time to time we do see write.lock present after an app-crash or power > failure. Also, what are the steps that are expected to be performed in such > cases? > > >> >> > Last week I have been playing with rather large indexes and crashed my >> app >> > while it was indexing. I wasn't able to open the index, and Luke was even >> > kind enough to wipe the index folder clean even though I opened it in >> > read-only mode. I re-ran this, and after another crash running CheckIndex >> > revealed nothing - the index was detected to be an empty one. I am not >> > entirely sure what could be the cause for this, but I suspect it has >> > been corrupted by the crash. >> >> Had no commit completed (no segments file written)? 
>> >> If you don't fsync then all sorts of crazy things are possible... >> > > Ok, so we do have fsync since LUCENE-1044 is present, and there were > segments present from previous commits. Any idea what went wrong? > > >> > I've been looking at these: >> > >> > >> https://issues.apache.org/jira/browse/LUCENE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel >> > >> https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel >> >> (And LUCENE-1044 before that ... it was LUCENE-1044 that >> LUCENE-2328 broke...). >> > > So 2328 broke 1044, and this was fixed only in 3.4, right? so 2328 made it > to a 3.0.x release while the fix for it (3418) was only released in 3.4. Am > I right? > > If this is the case, 2328 probably made its way to Lucene.Net since we are > using the released sources for porting, and we now need to apply 3418 in > the current version. > > Does it make sense to just port FSDirectory from 3.4 to 3.0.3? or were > there API or other changes that will make our life miserable if we do that? > > >> >> > And it seems like this is what I was experiencing. Mike and Mark will >> > probably be able to tell if this is what they saw or not, but as far as I >> > can tell this is not an expected behavior of a Lucene index. >> >> Definitely not expected behavior: assuming nothing is flipping bits, >> then on OS/JVM crash or power loss your index should be fine, just >> reverted to the last successful commit. >> > > What I suspected. Will try to reproduce reliably - any recommendations? not > really feeling like reinventing the wheel here... > > MockDirectoryWrapper wasn't ported yet as it appears to exist only in 3.4, > and as you said it won't really help here anyway > > >> >> > What I'm looking for at the moment is some advice on what FSDirectory >> > implementation to use to make sure no corruption can happen. 
The 3.4 >> version >> > (which is where LUCENE-3418 was committed to) seems to handle a lot of >> > things the 3.0 doesn't, but on the other hand LUCENE-3418 was >> introduced by >> > changes made to the 3.0 codebase. >> >> Hopefully it's just that you are missing fsync! >> >> > Also, is there any test in the suite checking for those scenarios? >> >> Our test framework has a sneaky MockDirectoryWrapper that, after a >> test finishes, goes and corrupts any unsync'd files and then verifies >> the index is still OK... it's good because it'll catch any times we >> are missing calls to sync, but, it's not low level enough such that if >> FSDir is failing to actually call fsync (that was the bug in >> LUCENE-3418) then it won't catch that... >> >> Mike McCandless >> >> http://blog.mikemccandless.com >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
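The fsync behavior the thread turns on (added in LUCENE-1044, accidentally dropped by LUCENE-2328, restored in LUCENE-3418) boils down to forcing file contents to stable storage before a commit becomes visible. A plain-Java sketch of that primitive, not Lucene's FSDirectory code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncDemo {
    // Write bytes to a file and force them to stable storage. Without the
    // force (fsync) call, an OS crash or power loss can silently lose the
    // write even though it appeared to succeed. Illustrative only.
    public static void writeAndSync(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(data));
            // force(true) asks the OS to flush both file data and metadata
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("segments_", ".demo");
        writeAndSync(p, "commit data".getBytes());
        System.out.println(new String(Files.readAllBytes(p))); // commit data
        Files.delete(p);
    }
}
```

This mirrors the commit ordering Mike describes: sync the segment files first, only then write the new segments_N file, so a crash can only ever lose the uncommitted tail.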
[jira] [Updated] (LUCENE-4082) Implement explain in ToParentBlockJoinQuery$BlockJoinWeight
[ https://issues.apache.org/jira/browse/LUCENE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-4082: -- Attachment: LUCENE-4082.patch Included explain in the random test. > Implement explain in ToParentBlockJoinQuery$BlockJoinWeight > --- > > Key: LUCENE-4082 > URL: https://issues.apache.org/jira/browse/LUCENE-4082 > Project: Lucene - Java > Issue Type: Improvement > Components: modules/join >Affects Versions: 3.4, 3.5, 3.6 >Reporter: Christoph Kaser >Assignee: Martijn van Groningen >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4082.patch, LUCENE-4082.patch > > > At the moment, ToParentBlockJoinQuery$BlockJoinWeight.explain throws an > UnsupportedOperationException. It would be useful if it could instead return > the score of the parent document, even if the explanation of how that score was > calculated is missing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4082) Implement explain in ToParentBlockJoinQuery$BlockJoinWeight
[ https://issues.apache.org/jira/browse/LUCENE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-4082: -- Fix Version/s: 5.0 4.0 Assignee: Martijn van Groningen > Implement explain in ToParentBlockJoinQuery$BlockJoinWeight > --- > > Key: LUCENE-4082 > URL: https://issues.apache.org/jira/browse/LUCENE-4082 > Project: Lucene - Java > Issue Type: Improvement > Components: modules/join >Affects Versions: 3.4, 3.5, 3.6 >Reporter: Christoph Kaser >Assignee: Martijn van Groningen >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4082.patch, LUCENE-4082.patch > > > At the moment, ToParentBlockJoinQuery$BlockJoinWeight.explain throws an > UnsupportedOperationException. It would be useful if it could instead return > the score of the parent document, even if the explanation of how that score was > calculated is missing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295310#comment-13295310 ] Hoss Man commented on SOLR-3534: The point of the test is to assert that DismaxQParser can function correctly with nothing but a "q" param and a schema specifying a defaultSearchField. If that's covered by another test you're adding (or that already exists) then great, we don't need it. > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3544) Under heavy load json response is cut at some arbitrary position
[ https://issues.apache.org/jira/browse/SOLR-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295307#comment-13295307 ] Hoss Man commented on SOLR-3544: can you provide some more details please... 1) what servlet container are you using? 2) how big (in bytes) are the responses when they work? how big are they when "cut off"? 3) does the "cut off" always happen on/around a specific piece of markup? (ie: when closing a list or an object) or in the middle of arbitrary string values? is it possible there are certain byte sequences that always occur just before/at/after the cutoff happens? 4) your blog post mentions... bq. Unfortunately, there was no indication of any malfunction in Solr except for the “Broken Pipe” notification that the client has closed the connection. ...where are you seeing this? in packet sniffing tool? in the solr logs? ... what exactly is the full message? > Under heavy load json response is cut at some arbitrary position > > > Key: SOLR-3544 > URL: https://issues.apache.org/jira/browse/SOLR-3544 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 > Environment: Linux version 2.6.32-5-amd64 (Debian 2.6.32-38) > (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) >Reporter: Dušan Omerčević > > We query solr for 30K documents using json as the response format. Normally > this works perfectly fine. But when the machine comes under heavy load (all > cores utilized) the response got interrupted at arbitrary position. We > circumvented the problem by switching to xml response format. > I've written the full description here: > http://restreaming.wordpress.com/2012/06/14/the-curious-case-of-solr-malfunction/ -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295306#comment-13295306 ] David Smiley commented on SOLR-3534: TestExtendedDismaxParser line 126 already tests that defaultSearchField is consulted. In this patch I added another test above it to ensure that "df" is consulted. > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295303#comment-13295303 ] Hoss Man commented on SOLR-3534: bq. I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField>". yes, sorry .. i was using "<defaultSearchField/>" as shorthand for <defaultSearchField>...something...</defaultSearchField>, that was bad on my part and totally confusing. bq. So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it? No, i'm saying that as long as "<defaultSearchField/>" is legal and supported configuration, then this specific test (of "dismaxNoDefaults") should use a schema that has a "<defaultSearchField/>" in it since that's the point of the test. schema-minimal.xml should certainly not have a "<defaultSearchField/>" added, since that would no longer truly be a minimal schema.xml > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml
[ https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295299#comment-13295299 ] Hoss Man commented on SOLR-2724: David: looking back at the mailing list ~ 28/Mar/12 it's not clear what exactly was the problem that required reverting at the time ... were the test failures even caused by this specific issue, or something else that you committed right around the same time? Given that we've already created the 4x branch and started pushing towards Alpha, i would at least move forward with making sure trunk & 4x are at parity with 3.6 in terms of the changes to the example and the log/error messages. Depending on what the issue was with the tests we can figure out how we want to move forward from there. bq. I take issue with Yonik's comment "we're not really gaining anything with this change". ... I don't think defaultSearchField & defaultOperator have a need to exist, let alone exist in schema.xml, thereby creating unnecessary complexity in understanding the product – albeit in a small way. I think the question is "if we stop promoting them in the example, and start encouraging an alternative instead, what is gained by actually removing the support in the code for existing users who already have them in the config and upgrade?" It's one thing to say in CHANGES.txt "we've removed feature X because doing so allowed us (add feature|fix bug) Y, so if you used X in the past now you have to use Z instead" but there is no "Y" in this case (that i see) ... we're just telling people "we've removed X because we think Z is better, so if you used X in the past now you have to use Z instead". You may feel it's a complexity for new users to understand why these things are in schema.xml -- which is fine, but isn't removing them from the example schema.xml enough to address that? What is the value gained in removing the ability to use it for existing users who already understand it? 
This is the crux of my suggestion way, way, WAY back in this issue about why i didn't/don't think there was a strong motivation to remove support completely in 4x - an opinion echoed by Yonik & Erick. As evidenced by recent mailing list comments from folks like Bernd & Rohit, there is already clear confusion for existing users just by removing these from the example -- let alone removing support for it from the code. > Deprecate defaultSearchField and defaultOperator defined in schema.xml > -- > > Key: SOLR-2724 > URL: https://issues.apache.org/jira/browse/SOLR-2724 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis, search >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Fix For: 3.6, 4.0 > > Attachments: > SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > I've always been surprised to see the <defaultSearchField> and <solrQueryParser defaultOperator> elements defined in the schema.xml file since > the first time I saw them. They just seem out of place to me since they are > more query parser related than schema related. But not only are they > misplaced, I feel they shouldn't exist. For query parsers, we already have a > "df" parameter that works just fine, and explicit field references. And the > default lucene query operator should stay at OR -- if a particular query > wants different behavior then use q.op or simply use "OR". > Seems like something better placed in solrconfig.xml than in the > schema. > In my opinion, defaultSearchField and defaultOperator configuration elements > should be deprecated in Solr 3.x and removed in Solr 4. And these settings should move to solrconfig.xml. I am willing to do it, provided there is > consensus on it of course. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
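The migration this thread argues over can be shown concretely. A sketch of the before and after, assuming a field named "text" (the field name is illustrative; the element and parameter names are standard Solr configuration):

```xml
<!-- Before (schema.xml): the element this issue deprecates -->
<defaultSearchField>text</defaultSearchField>

<!-- After (solrconfig.xml): a per-handler "df" default instead -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>
```

The solrconfig.xml form keeps query-parser defaults with the rest of the query configuration, which is the "out of place" complaint in the issue description; the debate above is only about whether the old schema.xml form should also stop working for existing users.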
[jira] [Created] (SOLR-3544) Under heavy load json response is cut at some arbitrary position
Dušan Omerčević created SOLR-3544: - Summary: Under heavy load json response is cut at some arbitrary position Key: SOLR-3544 URL: https://issues.apache.org/jira/browse/SOLR-3544 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.1 Environment: Linux version 2.6.32-5-amd64 (Debian 2.6.32-38) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) Reporter: Dušan Omerčević We query solr for 30K documents using json as the response format. Normally this works perfectly fine. But when the machine comes under heavy load (all cores utilized) the response gets interrupted at an arbitrary position. We circumvented the problem by switching to the xml response format. I've written the full description here: http://restreaming.wordpress.com/2012/06/14/the-curious-case-of-solr-malfunction/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295296#comment-13295296 ] David Smiley commented on SOLR-3534: Hoss, I like your suggestion of refactoring this to SolrPluginUtils (not *Tools which doesn't exist). And also I realized that SolrParams.get() takes a 2nd arg for the default which can be s.getDefaultSearchFieldName(), simplifying this even more. bq. As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared – that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere shouldn't it? (if/when "<defaultSearchField/>" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified) I'm confused by this, especially since you "+1"'ed on throwing an exception. I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField>". So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it? > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3729) Allow using FST to hold terms data in DocValues.BYTES_*_SORTED
[ https://issues.apache.org/jira/browse/LUCENE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3729: Attachment: LUCENE-3729.patch here is a first cut at using FST to hold terms in sorted DocValues. This patch holds all data in an FST and currently doesn't support a direct source, i.e. all FSTs are loaded into memory even during merging. All tests (except BWcompat -- which is good!) pass. I think we can have this as a first step but not as the default? > Allow using FST to hold terms data in DocValues.BYTES_*_SORTED > -- > > Key: LUCENE-3729 > URL: https://issues.apache.org/jira/browse/LUCENE-3729 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Michael McCandless >Assignee: Michael McCandless > Labels: gsoc2012, lucene-gsoc-11 > Attachments: LUCENE-3729.patch, LUCENE-3729.patch, LUCENE-3729.patch, > LUCENE-3729.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera reassigned LUCENE-4132: -- Assignee: Shai Erera > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Assignee: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are a few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". > It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. 
So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and needs to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295269#comment-13295269 ] Shai Erera commented on LUCENE-4132: Thanks Robert. I'll wait until Sunday and commit it. > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are a few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". > It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. 
So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and needs to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
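The Config/IWC split sketched in the issue description can be modeled in plain Java. Everything below (Config, LiveConfig, Writer) is a hypothetical illustration of the proposal, not Lucene's actual API:

```java
// Plain-Java sketch of the proposed split: a one-time Config captures all
// settings at writer creation; the object returned by getConfig() exposes
// setters ONLY for "live" settings, so the type system tells users which
// changes take effect. Names are hypothetical, not Lucene's API.
public class LiveConfigDemo {
    // One-time configuration, handed to the writer at construction.
    public static class Config {
        final int maxThreadStates;   // not live: fixed once the writer exists
        double ramBufferMB;          // live: may be changed afterwards
        public Config(int maxThreadStates, double ramBufferMB) {
            this.maxThreadStates = maxThreadStates;
            this.ramBufferMB = ramBufferMB;
        }
    }

    // The "live" view: only live settings get setters here.
    public static class LiveConfig {
        private final Config config;
        LiveConfig(Config config) { this.config = config; }
        public void setRAMBufferMB(double mb) { config.ramBufferMB = mb; }
        public double getRAMBufferMB() { return config.ramBufferMB; }
        public int getMaxThreadStates() { return config.maxThreadStates; }
        // deliberately no setter for maxThreadStates
    }

    // Stand-in for IndexWriter: built from Config, hands back the live view.
    public static class Writer {
        private final LiveConfig live;
        public Writer(Config config) { this.live = new LiveConfig(config); }
        public LiveConfig getConfig() { return live; }
    }

    public static void main(String[] args) {
        Writer w = new Writer(new Config(8, 16.0));
        w.getConfig().setRAMBufferMB(64.0); // takes effect immediately
        System.out.println(w.getConfig().getRAMBufferMB()); // 64.0
    }
}
```

As the comment notes, user code barely changes: it still builds one config object up front, and only code that calls getConfig() sees the narrower live type.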
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295268#comment-13295268 ] Hoss Man commented on SOLR-3534: bq. dismax&edismax should look at 'df' before falling back to defaultSearchField +1 ... i thought it already did that, but i guess not. If we are "deprecating/discouraging" and instructing people to use "df" instead, then we should absolutely make 100% certain any code path we ship that currently consults <defaultSearchField/> checks "df" first. (if/when the code paths that consult <defaultSearchField/> are removed, they should still consult "df") bq. dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them. +1 .. (although i suppose edismax could still be usable if every clause is fully qualified with a fieldname/alias and fail only when a clause that requires a default is encountered ... just like the LuceneQParser) bq. I ran all tests before committing and found the MinimalSchemaTest failed related to the "dismaxNoDefaults" request handler in the test solrconfig.xml which was added in SOLR-1776. The problem is throwing an exception if there's no 'qf', 'df', or default search field. I disagree with that test – it is erroneous/misleading to use dismax without specifying a default via any of those 3 mechanisms. I am inclined to delete the "dismaxNoDefaults" test request handler (assuming there are no other ramifications). I want to get input from Hoss who put it there so I'll wait. As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared -- that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere, shouldn't it? 
(if/when "" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified) -- The current patch looks like a great start to me ... but i would suggest refactoring this core little bit into it's own method in SolrPluginTools and replacing every use of getDefaultSearchFieldName in the code base with it (and add a link to it from getDefaultSearchFieldName javadocs encouraging people to use it instead)... {code} /** * returns the effective default field based on the params or * hardcoded schema default. may be null if either exists specified. * @see CommonParams#DF * @see IndexSchema#getDefaultSearchFieldName */ public static String getDefaultField(final IndexSchema s, final SolrParams p) { String df = p.get(CommonParams.DF); if (df == null) { df = s.getDefaultSearchFieldName(); } return df; } {code} > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295261#comment-13295261 ] Erik Hatcher commented on SOLR-2894: Trey - would you be in a position to test out the latest patch? I built my latest one by starting with the March 5, 2012 SOLR-2894.patch file. > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Erik Hatcher > Fix For: 4.0 > > Attachments: SOLR-2894.patch, SOLR-2894.patch, > distributed_pivot.patch, distributed_pivot.patch > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented.
[jira] [Updated] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-2894: --- Attachment: SOLR-2894.patch Patch updated to 4x branch. Simon, just for you, I removed NamedListHelper as well :) (folded its one method into PivotFacetHelper) Tests pass.
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295220#comment-13295220 ] Robert Muir commented on LUCENE-4132: - thanks, +1

> IndexWriterConfig live settings
> ---
>
> Key: LUCENE-4132
> URL: https://issues.apache.org/jira/browse/LUCENE-4132
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Shai Erera
> Priority: Minor
> Fix For: 4.0, 5.0
>
> Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch
>
>
> A while ago there was a discussion about making some IW settings "live" and I remember that RAM buffer size was one of them. Judging from IW code, I see that RAM buffer can be changed "live" as IW never caches it.
> However, I don't remember which other settings were decided to be "live" and I don't see any documentation in IW nor IWC for that. IW.getConfig mentions:
> {code}
> * NOTE: some settings may be changed on the
> * returned {@link IndexWriterConfig}, and will take
> * effect in the current IndexWriter instance. See the
> * javadocs for the specific setters in {@link
> * IndexWriterConfig} for details.
> {code}
> But there's no text on e.g. IWC.setRAMBuffer mentioning that.
> I think that it'd be good if we made it easier for users to tell which of the settings are "live" ones. There are a few possible ways to do it:
> * Introduce a custom @live.setting tag on the relevant IWC.set methods, and add special text for them in build.xml
> ** Or, drop the tag and just document it clearly.
> * Separate IWC into two interfaces, LiveConfig and OneTimeConfig (name proposals are welcome !), have IWC impl both, and introduce another IW.getLiveConfig which will return that interface, thereby clearly letting the user know which of the settings are "live".
> It'd be good if IWC itself could only expose setXYZ methods for the "live" settings though.
So perhaps, off the top of my head, we can do something like
> this:
> * Introduce a Config object, which is essentially what IWC is today, and pass it to IW.
> * IW will create a different object, IWC, from that Config, and IW.getConfig will return IWC.
> * IWC itself will only have setXYZ methods for the "live" settings.
> It adds another object, but user code doesn't change - it still creates a Config object when initializing IW, and needs to handle a different type only if it ever calls IW.getConfig.
> Maybe that's not such a bad idea?
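The Config/live-config split proposed above can be sketched in a few lines. Everything here is invented for illustration (the class and interface names are not Lucene's actual API): a one-time Config is consumed at construction, and the writer hands back a narrow view that only exposes setters for settings the writer re-reads on the fly:

```java
// Hedged sketch of the "two interfaces" idea from LUCENE-4132: a one-time
// Config consumed at construction, and a LiveConfig view returned by the
// writer that only exposes setters for "live" settings. All names are
// invented for illustration; this is not Lucene's actual API.
public class LiveConfigSketch {

    /** Settings bundle passed to the writer at construction time. */
    static class Config {
        final int maxThreadStates;   // one-time: cannot change after init
        double ramBufferSizeMB;      // live: the writer re-reads it on the fly
        Config(int maxThreadStates, double ramBufferSizeMB) {
            this.maxThreadStates = maxThreadStates;
            this.ramBufferSizeMB = ramBufferSizeMB;
        }
    }

    /** The view handed back by writer.getConfig(): only live setters. */
    interface LiveConfig {
        void setRAMBufferSizeMB(double mb);
        double getRAMBufferSizeMB();
    }

    static class Writer {
        private final Config config;
        Writer(Config c) { this.config = c; }
        LiveConfig getConfig() {
            // Expose only the live subset; one-time settings have no setter here.
            return new LiveConfig() {
                public void setRAMBufferSizeMB(double mb) { config.ramBufferSizeMB = mb; }
                public double getRAMBufferSizeMB() { return config.ramBufferSizeMB; }
            };
        }
    }

    public static void main(String[] args) {
        Writer w = new Writer(new Config(8, 16.0));
        w.getConfig().setRAMBufferSizeMB(64.0); // takes effect "live"
        if (w.getConfig().getRAMBufferSizeMB() != 64.0) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The type system then documents the live/one-time distinction instead of javadoc alone, which is the advantage Shai's proposal is after.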
[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295195#comment-13295195 ] Dawid Weiss commented on LUCENE-4062: - Ok, thanks - makes sense. Is the code for these benchmarks somewhere?

> More fine-grained control over the packed integer implementation that is chosen
> ---
>
> Key: LUCENE-4062
> URL: https://issues.apache.org/jira/browse/LUCENE-4062
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/other
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Labels: performance
> Fix For: 4.0
>
> Attachments: LUCENE-4062-2.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch
>
>
> In order to save space, Lucene has two main PackedInts.Mutable implementations, one that is very fast and is based on a byte/short/integer/long array (Direct*) and another one which packs bits in a memory-efficient manner (Packed*).
> The packed implementation tends to be much slower than the direct one, which discourages some Lucene components from using it. On the other hand, if you store 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%.
> If you accept trading some space for speed, you could store 3 of these 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One advantage of this approach is that you never need to read more than one block to read or write a value, so this can be significantly faster than Packed32 and Packed64 which always need to read/write two blocks in order to avoid costly branches.
> I ran some tests, and for 1000 21-bit values, this implementation takes less than 2% more space and has 44% faster writes and 30% faster reads. The 12-bit version (5 values per block) has the same performance improvement and a 6% memory overhead compared to the packed implementation.
> In order to select the best implementation for a given integer size, I wrote the {{PackedInts.getMutable(valueCount, bitsPerValue, acceptableOverheadPerValue)}} method. This method selects the fastest implementation that has less than {{acceptableOverheadPerValue}} wasted bits per value. For example, if you accept an overhead of 20% ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty reasonable, here is what implementations would be selected:
> * 1: Packed64SingleBlock1
> * 2: Packed64SingleBlock2
> * 3: Packed64SingleBlock3
> * 4: Packed64SingleBlock4
> * 5: Packed64SingleBlock5
> * 6: Packed64SingleBlock6
> * 7: Direct8
> * 8: Direct8
> * 9: Packed64SingleBlock9
> * 10: Packed64SingleBlock10
> * 11: Packed64SingleBlock12
> * 12: Packed64SingleBlock12
> * 13: Packed64
> * 14: Direct16
> * 15: Direct16
> * 16: Direct16
> * 17: Packed64
> * 18: Packed64SingleBlock21
> * 19: Packed64SingleBlock21
> * 20: Packed64SingleBlock21
> * 21: Packed64SingleBlock21
> * 22: Packed64
> * 23: Packed64
> * 24: Packed64
> * 25: Packed64
> * 26: Packed64
> * 27: Direct32
> * 28: Direct32
> * 29: Direct32
> * 30: Direct32
> * 31: Direct32
> * 32: Direct32
> * 33: Packed64
> * 34: Packed64
> * 35: Packed64
> * 36: Packed64
> * 37: Packed64
> * 38: Packed64
> * 39: Packed64
> * 40: Packed64
> * 41: Packed64
> * 42: Packed64
> * 43: Packed64
> * 44: Packed64
> * 45: Packed64
> * 46: Packed64
> * 47: Packed64
> * 48: Packed64
> * 49: Packed64
> * 50: Packed64
> * 51: Packed64
> * 52: Packed64
> * 53: Packed64
> * 54: Direct64
> * 55: Direct64
> * 56: Direct64
> * 57: Direct64
> * 58: Direct64
> * 59: Direct64
> * 60: Direct64
> * 61: Direct64
> * 62: Direct64
> Under 32 bits per value, only 13, 17 and 22-26 bits per value would still choose the slower Packed64 implementation. Allowing a 50% overhead would prevent the packed implementation from being selected for bits per value under 32.
> Allowing an overhead of 32 bits per value would make sure that a Direct* implementation is always selected.
> Next steps would be to:
> * make lucene components use this {{getMutable}} method and let users decide what trade-off better suits them,
> * write a Packed32SingleBlock implementation if necessary (I didn't do it because I have no 32-bits computer to test the performance improvements).
> I think this would allow more fine-grained control over the speed/space trade-off, what do you think?
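The single-block trick described in this issue (three 21-bit values per 64-bit long, so any read or write touches exactly one block) can be sketched as follows. This is an illustrative mock of the idea behind Packed64SingleBlock21, not the implementation from the patch:

```java
// Sketch of the single-block packing scheme from LUCENE-4062: three 21-bit
// values per 64-bit long (63 bits used, 1 bit wasted per block), so a value
// can always be read or written by touching exactly one block. Mirrors the
// idea behind Packed64SingleBlock21, not its actual implementation.
public class Packed21Sketch {
    private static final int BITS = 21;
    private static final long MASK = (1L << BITS) - 1;  // 0x1FFFFF
    private final long[] blocks;

    Packed21Sketch(int valueCount) {
        blocks = new long[(valueCount + 2) / 3]; // 3 values per block
    }

    long get(int index) {
        int shift = (index % 3) * BITS;          // 0, 21, or 42
        return (blocks[index / 3] >>> shift) & MASK;
    }

    void set(int index, long value) {
        int shift = (index % 3) * BITS;
        int b = index / 3;
        // Clear the 21-bit slot, then OR the new value in.
        blocks[b] = (blocks[b] & ~(MASK << shift)) | ((value & MASK) << shift);
    }

    public static void main(String[] args) {
        Packed21Sketch p = new Packed21Sketch(1000);
        p.set(0, 123456);
        p.set(1, MASK);      // max 21-bit value, same block as index 0
        p.set(2, 7);
        if (p.get(0) != 123456 || p.get(1) != MASK || p.get(2) != 7)
            throw new AssertionError();
        System.out.println("ok");
    }
}
```

Because the shift is always within one long, there is no cross-block masking or branching, which is where the speedup over Packed64 described above comes from.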
[jira] [Commented] (SOLR-3535) Add block support for XMLLoader
[ https://issues.apache.org/jira/browse/SOLR-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295189#comment-13295189 ] Hoss Man commented on SOLR-3535:

bq. Or simply allow SolrInputDocument as a normal value and existing APIs could be used to add them. This would also be slightly more powerful, allowing more than one child list for the same parent.

"allow SolrInputDocument as a normal value" ... a normal value to what? where? ... are you describing the same thing as Mikhail, modeling "children" SolrInputDocuments as field values of the parent SolrInputDocument? If so then I ask you the same questions I asked him above...

{quote}
bq. why new api/property is necessary? is solrInputDoc.addField("skus", new Object[]\{sku1, sku2, sku3\}) not enough?

Are you suggesting we model child documents as objects (SolrInputDocuments I guess?) in a special field? ... what if I put child documents in multiple fields? would that signify the different types of child? how would solr model that in the (lucene) Documents when giving them to the IndexWriter? How would solr know how to order the children from multiple fields/lists when creating the block? Wouldn't the "type of child" information be better living in the child documents themselves? (particularly since that "type" information needs to be in the child documents anyway so that the filter query for a BJQ can be specified.)
It also seems like it would require code that wants to know what children exist in a document to do a lot of work to find that out (it would need to iterate every field in the SolrInputDocument and do reflection to see if the values are child documents or not). Another concern off the top of my head is that a lot of existing code (including any custom update processors people might have) would assume those child documents are multivalued field values and would probably break – hence a new method on SolrInputDocument seems wiser (code that doesn't know about it may not do what you want, but at least it won't break)
{quote}

> Add block support for XMLLoader
> ---
>
> Key: SOLR-3535
> URL: https://issues.apache.org/jira/browse/SOLR-3535
> Project: Solr
> Issue Type: Sub-task
> Components: update
> Affects Versions: 4.1, 5.0
> Reporter: Mikhail Khludnev
> Priority: Minor
>
> Attachments: SOLR-3535.patch
>
>
> I'd like to add the following update xml message:
> > > >
> out of scope for now:
> * other update formats
> * update log support (NRT), should not be a big deal
> * overwrite feature support for block updates - it's more complicated, I'll tell you why
> Alt
> * wdyt about adding an attribute to the current tag {pre}{pre}
> * or we can establish a RunBlockUpdateProcessor which treats every > as a block.
> *Test is included!!*
> How would you suggest improving the patch?
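Hoss's argument above (keep children in a dedicated list with a new method, rather than hiding them among field values) can be illustrated with a small mock: field-iterating code never sees the children, and finding them needs no reflection. This is an invented stand-in class, not Solr's real SolrInputDocument:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative mock of the "dedicated child list" design argued for in
// SOLR-3535. Not Solr's SolrInputDocument: fields here are single-valued
// for brevity, and all method names are assumptions for the sketch.
public class InputDoc {
    private final Map<String, Object> fields = new LinkedHashMap<>();
    private final List<InputDoc> children = new ArrayList<>();

    void addField(String name, Object value) { fields.put(name, value); }
    void addChildDocument(InputDoc child)    { children.add(child); }

    Map<String, Object> getFields()    { return fields; }
    List<InputDoc> getChildDocuments() { return children; }

    public static void main(String[] args) {
        InputDoc parent = new InputDoc();
        parent.addField("id", "product-1");
        InputDoc sku = new InputDoc();
        sku.addField("type", "sku");  // the "type" lives in the child itself,
        parent.addChildDocument(sku); // as suggested for the BJQ filter query
        // Field-only consumers (e.g. update processors) see only real fields:
        if (parent.getFields().size() != 1) throw new AssertionError();
        // Child-aware code gets the children directly, in insertion order:
        if (parent.getChildDocuments().size() != 1) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Keeping the children out of the field map is exactly what prevents existing multivalue-assuming code from breaking, while insertion order into the child list answers the "how are children ordered in the block" question.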
Re: Corrupt index
I'm quite certain this shouldn't happen even when Commit wasn't called. Mike, can you comment on that? On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens < currens.ch...@gmail.com> wrote: > Well, the only thing I see is that there is no place where writer.Commit() > is called in the delegate assigned to corpusReader.OnDocument. I know that > lucene is very transactional, and at least in 3.x, the writer will never > auto commit to the index. You can write millions of documents, but if > commit is never called, those documents aren't actually part of the index. > Committing isn't a cheap operation, so you definitely don't want to do it > on every document. > > You can test it yourself with this (naive) solution. Right below the > writer.SetUseCompoundFile(false) line, add "int numDocsAdded = 0;". At the > end of the corpusReader.OnDocument delegate add: > > // Example only. I wouldn't suggest committing this often > if(++numDocsAdded % 5 == 0) > { >    writer.Commit(); > } > > I had the application crash for real on this file: > > http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2 > , > about 20% into the operation. Without the commit, the index is empty. Add > it in, and I get 755 files in the index after it crashes. > > > Thanks, > Christopher > > On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko >wrote: > > > Yes, reproduced in first try. See attached program - I referenced it to > > current trunk. > > > > > > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko >wrote: > > > >> Christopher, > >> > >> I used the IndexBuilder app from here > >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThings with a > >> 8.5GB wikipedia dump. > >> > >> After running for 2.5 days I had to forcefully close it (infinite loop > in > >> the wiki-markdown parser at 92%, go figure), and the 40-something GB > index > >> I had by then was unusable.
I then was able to reproduce this > >> > >> Please note I now added a few safe-guards you might want to remove to > >> make sure the app really crashes on process kill. > >> > >> I'll try to come up with a better way to reproduce this - hopefully Mike > >> will be able to suggest better ways than manual process kill... > >> > >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens < > >> currens.ch...@gmail.com> wrote: > >> > >>> Mike, The codebase for lucene.net should be almost identical to java's > >>> 3.0.3 release, and LUCENE-1044 is included in that. > >>> > >>> Itamar, are you committing the index regularly? I only ask because I > >>> can't > >>> reproduce it myself by forcibly terminating the process while it's > >>> indexing. I've tried both 3.0.3 and 2.9.4. If I don't commit at all > and > >>> terminate the process (even with a 10,000 4K documents created), there > >>> will > >>> be no documents in the index when I open it in luke, which I expect. > If > >>> I > >>> commit at 10,000 documents, and terminate it a few thousand after that, > >>> the > >>> index has the first ten thousand that were committed. I've even > >>> terminated > >>> it *while* a second commit was taking place, and it still had all of > the > >>> documents I expected. > >>> > >>> It may be that I'm not trying to reproducing it correctly. Do you > have a > >>> minimal amount of code that can reproduce it? > >>> > >>> > >>> Thanks, > >>> Christopher > >>> > >>> On Wed, Jun 13, 2012 at 9:31 AM, Michael McCandless < > >>> luc...@mikemccandless.com> wrote: > >>> > >>> > Hi Itamar, > >>> > > >>> > One quick question: does Lucene.Net include the fixes done for > >>> > LUCENE-1044 (to fsync files on commit)? Those are very important for > >>> > an index to be intact after OS/JVM crash or power loss. 
> >>> > > >>> > More responses below: > >>> > > >>> > On Tue, Jun 12, 2012 at 8:20 PM, Itamar Syn-Hershko < > >>> ita...@code972.com> > >>> > wrote: > >>> > > >>> > > I'm a Lucene.Net committer, and there is a chance we have a bug in > >>> our > >>> > > FSDirectory implementation that causes indexes to get corrupted > when > >>> > > indexing is cut while the IW is still open. As it roots from some > >>> > > retroactive fixes you made, I'd appreciate your feedback. > >>> > > > >>> > > Correct me if I'm wrong, but by design Lucene should be able to > >>> recover > >>> > > rather quickly from power failures or app crashes. Since existing > >>> segment > >>> > > files are read only, only new segments that are still being written > >>> can > >>> > get > >>> > > corrupted. Hence, recovering from worst-case scenarios is done by > >>> simply > >>> > > removing the write.lock file. The worst that could happen then is > >>> having > >>> > the > >>> > > last segment damaged, and that can be fixed by removing those > files, > >>> > > possibly by running CheckIndex on the index. > >>> > > >>> > You shouldn't even have to run CheckIndex ... because (as of > >>> > LUCENE-1044) we now fsync all segment files before writing the new > >>> > segments_N f
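The periodic-commit workaround suggested in the thread above (commit every N documents so a crash loses at most the last uncommitted batch) can be sketched in pure Java. The Writer interface here is a stand-in for Lucene's IndexWriter, and the interval of 2 is only for the demo; a real indexer would commit far less often:

```java
// Sketch of the periodic-commit pattern from the thread above: commit
// every N documents so a crash loses at most the last uncommitted batch.
// The Writer interface is a stand-in for Lucene's IndexWriter; a sensible
// interval in practice is thousands of docs, not the tiny demo value here.
public class PeriodicCommit {
    interface Writer {
        void addDocument(String doc);
        void commit();
    }

    /** Adds all docs, committing after every commitInterval additions. */
    static int indexAll(Iterable<String> docs, Writer w, int commitInterval) {
        int added = 0;
        for (String doc : docs) {
            w.addDocument(doc);
            if (++added % commitInterval == 0) {
                w.commit(); // expensive: batch it, never commit per-doc
            }
        }
        w.commit(); // final commit so the tail batch is durable too
        return added;
    }

    public static void main(String[] args) {
        final int[] commits = {0};
        Writer w = new Writer() {
            public void addDocument(String doc) { }
            public void commit() { commits[0]++; }
        };
        int n = indexAll(java.util.Arrays.asList("a", "b", "c", "d", "e"), w, 2);
        // 5 docs, interval 2: commits after docs 2 and 4, plus the final one.
        if (n != 5 || commits[0] != 3) throw new AssertionError();
        System.out.println("ok");
    }
}
```

This captures the transactional behavior Christopher describes: documents added since the last commit are simply not part of the index if the process dies, so the commit interval bounds the worst-case loss.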
[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295176#comment-13295176 ] Adrien Grand commented on LUCENE-4062: -- The x axis is the number of bits per value while the y axis is the number of values that are read or written per second. For every bitsPerValue and bit-packing scheme, I took the impl with the lowest working bitsPerValue. (For example, bitsPerValue=19 would give a Direct32, a Packed64(bitsPerValue=19), a Packed8ThreeBlocks(24 bits per value) and a Packed64SingleBlock(bitsPerValue=21)). There are 4 lines because we currently have 4 different bit-packing schemes. In the first two cases, values are read at random offsets while the two bulk tests read/write a large number of values sequentially. I didn't want to test {{System.arraycopy}} against a naive for-loop, I just noticed that {{Direct64}} bulk operations didn't use {{arraycopy}}, so I fixed that and added a few words about it so that people understand why the throughput increases when bitsPerValue > 32, which is counter-intuitive.

> More fine-grained control over the packed integer implementation that is chosen
> ---
>
> Key: LUCENE-4062
> URL: https://issues.apache.org/jira/browse/LUCENE-4062
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/other
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Labels: performance
> Fix For: 4.0
>
> Attachments: LUCENE-4062-2.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch
[JENKINS] Lucene-Solr-tests-only-4.x - Build # 92 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/92/ 1 tests failed. REGRESSION: org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete Error Message: searcher529 wasn't soon enough after soft529: 1339694370043 !< 1339694369890 + 100 (fudge) Stack Trace: java.lang.AssertionError: searcher529 wasn't soon enough after soft529: 1339694370043 !< 1339694369890 + 100 (fudge) at __randomizedtesting.SeedInfo.seed([C734A40A40E36661:781C975B4BABD1]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete(SoftAutoCommitTest.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 10175 lines...] [junit4] 2> 13259 T2223 oasc.SolrDeletionPolicy.onCommit SolrDeletionPolicy.onCommit: commits:num=1 [junit4] 2> commit{dir=MockDirWrapper(org.apache.lucene.store.RAMDirectory@5f2dcf8d lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@57895c19),segFN=segments_1,generation=1,filenames=[segments_1] [junit4] 2> 13259 T2223 oa
[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295167#comment-13295167 ] Dawid Weiss commented on LUCENE-4062: - What's on the axes in those plots? System.arraycopy is an intrinsic -- it'll be much faster than any other loop that doesn't eliminate bounds checks (and I think with more complex logic this will not be done).

> More fine-grained control over the packed integer implementation that is chosen
> ---
>
> Key: LUCENE-4062
> URL: https://issues.apache.org/jira/browse/LUCENE-4062
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/other
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Labels: performance
> Fix For: 4.0
>
> Attachments: LUCENE-4062-2.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295153#comment-13295153 ] Simon Willnauer commented on LUCENE-2878:
-
hey alan, I won't be able to look at this this week but will do early next week! good stuff on a brief look!
> Allow Scorer to expose positions and payloads aka. nuke spans
> --
>
> Key: LUCENE-2878
> URL: https://issues.apache.org/jira/browse/LUCENE-2878
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/search
> Affects Versions: Positions Branch
> Reporter: Simon Willnauer
> Assignee: Simon Willnauer
> Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor
> Fix For: Positions Branch
>
> Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch
>
>
> Currently we have two somewhat separate types of queries: the ones which can
> make use of positions (mainly spans) and payloads (spans). Yet Span*Query
> doesn't really do scoring comparable to what other queries do, and at the end
> of the day they duplicate a lot of code all over Lucene. Span*Queries are
> also limited to other Span*Query instances, such that you cannot use a
> TermQuery or a BooleanQuery with SpanNear or anything like that.
> Besides the Span*Query limitation, other queries lack a quite interesting
> feature: they cannot score based on term proximity, since scorers don't
> expose any positional information. All those problems bugged me for a while
> now, so I started working on that using the bulkpostings API.
I would have done that first cut on trunk,
> but TermScorer works on a BlockReader that does not expose positions, while
> the one in this branch does. I started adding a new Positions class which
> users can pull from a scorer; to prevent unnecessary positions enums I added
> ScorerContext#needsPositions and eventually Scorer#needsPayloads to create
> the corresponding enum on demand. Yet, currently only TermQuery / TermScorer
> implements this API, and others simply return null instead.
> To show that the API really works, and that our BulkPostings work fine with
> positions too, I cut over TermSpanQuery to use a TermScorer under the hood
> and nuked TermSpans entirely. A nice side effect of this was that the
> Position BulkReading implementation got some exercise, which now all works
> with positions :) while payloads for bulk reading are kind of experimental
> in the patch and only work with the Standard codec.
> So all spans now work on top of TermScorer (I truly hate spans since today),
> including the ones that need payloads (StandardCodec ONLY)!! I didn't bother
> to implement the other codecs yet since I want to get feedback on the API
> and on this first cut before I go on with it. I will upload the
> corresponding patch in a minute.
> I also had to cut over SpanQuery.getSpans(IR) to
> SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk
> first, but after that pain today I need a break first :).
> The patch passes all core tests
> (org.apache.lucene.search.highlight.HighlighterTest still fails, but I
> didn't look into the MemoryIndex BulkPostings API yet)
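The API direction described in the message above can be sketched as follows: a scorer is told up front whether positions are needed (the `ScorerContext#needsPositions` flag mentioned in the comment) and hands out a positions iterator on demand, instead of routing positional matching through a separate Spans hierarchy. All names and shapes here are illustrative stand-ins, not the patch's actual code; in particular, positions come from a precomputed array here, while the real scorer would read them lazily from the postings.

```java
// Hedged sketch of "pull positions from the scorer on demand".
class PositionsSketch {
    interface PositionsEnum {
        int nextPosition(); // next match position in the current doc, -1 when exhausted
    }

    // Stand-in for the first-cut TermScorer from the comment above.
    static class TermScorerSketch {
        final int[] pos;               // match positions for the current doc (toy data)
        final boolean needsPositions;  // requested via the ScorerContext in the patch

        TermScorerSketch(int[] pos, boolean needsPositions) {
            this.pos = pos;
            this.needsPositions = needsPositions;
        }

        PositionsEnum positions() {
            if (!needsPositions) {
                return null; // the enum is only created when it was requested
            }
            return new PositionsEnum() {
                int i = 0;
                public int nextPosition() {
                    return i < pos.length ? pos[i++] : -1;
                }
            };
        }
    }
}
```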
[JENKINS] Lucene-Solr-trunk-Linux-Java7-64 - Build # 278 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java7-64/278/ 4 tests failed. REGRESSION: org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta2.testCompositePk_DeltaImport_replace_nodelete Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([65A41DCBAF63C272:E8DE2C1A940D8298]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:459) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) at org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta2.testCompositePk_DeltaImport_replace_nodelete(TestSqlEntityProcessorDelta2.java:203) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//*[@numFound='0'] xml response was: 010desc:hello OR XtestCompositePk_DeltaImport_replace_nodeletestandard202.2prefix-1hello2012-06-14T16:34:07.474Z request was:start=0&q=desc:hello+OR+XtestCompositePk_DeltaImport_replace_nodelete&qt=standard&rows=20&version=2.2 at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4
[JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 64 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/64/ 1 tests failed. REGRESSION: org.apache.solr.handler.component.SpellCheckComponentTest.testThresholdTokenFrequency Error Message: Path not found: /spellcheck/suggestions/[1]/suggestion Stack Trace: java.lang.RuntimeException: Path not found: /spellcheck/suggestions/[1]/suggestion at __randomizedtesting.SeedInfo.seed([6587EC4535619BCC:EF2063B4BA8AA2B7]:0) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:545) at org.apache.solr.SolrTestCaseJ4.assertJQ(SolrTestCaseJ4.java:493) at org.apache.solr.handler.component.SpellCheckComponentTest.testThresholdTokenFrequency(SpellCheckComponentTest.java:211) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 10434 lines...] [junit4]>at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) [junit4]>at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) [junit4]>at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(Tes
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward updated LUCENE-2878:
--
Attachment: LUCENE-2878.patch
Updated patch implementing startOffset and endOffset on UnionDocsAndPositionsEnum. MultiPhraseQuery can now return its positions properly.
> Allow Scorer to expose positions and payloads aka. nuke spans
> --
>
> Key: LUCENE-2878
> URL: https://issues.apache.org/jira/browse/LUCENE-2878
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/search
> Affects Versions: Positions Branch
> Reporter: Simon Willnauer
> Assignee: Simon Willnauer
> Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor
> Fix For: Positions Branch
>
> Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch
>
>
> Currently we have two somewhat separate types of queries: the ones which can
> make use of positions (mainly spans) and payloads (spans). Yet Span*Query
> doesn't really do scoring comparable to what other queries do, and at the end
> of the day they duplicate a lot of code all over Lucene. Span*Queries are
> also limited to other Span*Query instances, such that you cannot use a
> TermQuery or a BooleanQuery with SpanNear or anything like that.
> Besides the Span*Query limitation, other queries lack a quite interesting
> feature: they cannot score based on term proximity, since scorers don't
> expose any positional information. All those problems bugged me for a while
> now, so I started working on that using the bulkpostings API.
I would have done that first cut on trunk,
> but TermScorer works on a BlockReader that does not expose positions, while
> the one in this branch does. I started adding a new Positions class which
> users can pull from a scorer; to prevent unnecessary positions enums I added
> ScorerContext#needsPositions and eventually Scorer#needsPayloads to create
> the corresponding enum on demand. Yet, currently only TermQuery / TermScorer
> implements this API, and others simply return null instead.
> To show that the API really works, and that our BulkPostings work fine with
> positions too, I cut over TermSpanQuery to use a TermScorer under the hood
> and nuked TermSpans entirely. A nice side effect of this was that the
> Position BulkReading implementation got some exercise, which now all works
> with positions :) while payloads for bulk reading are kind of experimental
> in the patch and only work with the Standard codec.
> So all spans now work on top of TermScorer (I truly hate spans since today),
> including the ones that need payloads (StandardCodec ONLY)!! I didn't bother
> to implement the other codecs yet since I want to get feedback on the API
> and on this first cut before I go on with it. I will upload the
> corresponding patch in a minute.
> I also had to cut over SpanQuery.getSpans(IR) to
> SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk
> first, but after that pain today I need a break first :).
> The patch passes all core tests
> (org.apache.lucene.search.highlight.HighlighterTest still fails, but I
> didn't look into the MemoryIndex BulkPostings API yet)
[jira] [Updated] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4062:
-
Attachment: LUCENE-4062-2.patch
I have run more tests on {{PackedInts}} impls over the last few days to test their relative performance. It appears that the specializations in {{Packed64SingleBlock}} don't help much and even hurt performance in some cases. Moreover, replacing the naive bulk operations with a {{System.arraycopy}} in {{Direct64}} is a big win. (See attached patch.)
You can look at the details of the tests here: http://people.apache.org/~jpountz/packed_ints.html (contiguous=Packed64, padding=Packed64SingleBlock, 3 blocks=Packed*ThreeBlocks, direct=Direct*). The tests were run on a 64-bit computer (Core 2 Duo E5500) with valueCount=10 000 000. "Memory overhead" is {unused space in bits}/{bits per value}, while the other charts measure the number of gets/sets per second.
The random get/set results are very good for the packed versions, probably because they manage to fit many more values into the CPU caches than other impls. The reason why bulk get/set is faster when bitsPerValue>32 is that Direct64 uses System.arraycopy instead of a naive copy (in a for loop).
> More fine-grained control over the packed integer implementation that is chosen
> ---
>
> Key: LUCENE-4062
> URL: https://issues.apache.org/jira/browse/LUCENE-4062
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/other
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Labels: performance
> Fix For: 4.0
>
> Attachments: LUCENE-4062-2.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch
>
>
> In order to save space, Lucene has two main PackedInts.Mutable implementations:
> one that is very fast and is based on a byte/short/integer/long array
> (Direct*), and another one which packs bits in a memory-efficient manner
> (Packed*).
> The packed implementation tends to be much slower than the direct one, which
> discourages some Lucene components from using it. On the other hand, if you
> store 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%.
> If you are willing to trade some space for speed, you could store 3 of these
> 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One
> advantage of this approach is that you never need to read more than one block
> to read or write a value, so this can be significantly faster than Packed32
> and Packed64, which always need to read/write two blocks in order to avoid
> costly branches.
> I ran some tests, and for 1000 21-bit values, this implementation takes
> less than 2% more space and has 44% faster writes and 30% faster reads. The
> 12-bit version (5 values per block) has the same performance improvement and
> a 6% memory overhead compared to the packed implementation.
> In order to select the best implementation for a given integer size, I wrote
> the {{PackedInts.getMutable(valueCount, bitsPerValue,
> acceptableOverheadPerValue)}} method. This method selects the fastest
> implementation that has less than {{acceptableOverheadPerValue}} wasted bits
> per value.
For example, if you accept an overhead of 20%
> ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty
> reasonable, here is what implementations would be selected:
> * 1: Packed64SingleBlock1
> * 2: Packed64SingleBlock2
> * 3: Packed64SingleBlock3
> * 4: Packed64SingleBlock4
> * 5: Packed64SingleBlock5
> * 6: Packed64SingleBlock6
> * 7: Direct8
> * 8: Direct8
> * 9: Packed64SingleBlock9
> * 10: Packed64SingleBlock10
> * 11: Packed64SingleBlock12
> * 12: Packed64SingleBlock12
> * 13: Packed64
> * 14: Direct16
> * 15: Direct16
> * 16: Direct16
> * 17: Packed64
> * 18: Packed64SingleBlock21
> * 19: Packed64SingleBlock21
> * 20: Packed64SingleBlock21
> * 21: Packed64SingleBlock21
> * 22: Packed64
> * 23: Packed64
> * 24: Packed64
> * 25: Packed64
> * 26: Packed64
> * 27: Direct32
> * 28: Direct32
> * 29: Direct32
> * 30: Direct32
> * 31: Direct32
> * 32: Direct32
> * 33: Packed64
> * 34: Packed64
> * 35: Packed64
> * 36: Packed64
> * 37: Packed64
> * 38: Packed64
> * 39: Packed64
> * 40: Packed64
> * 41: Packed64
> * 42: Packed64
> * 43: Packed64
> * 44: Packed64
> * 45: Packed64
> * 46: Packed64
> * 47: Packed64
> * 48: Packed64
> * 49: Packed64
> * 50: Packed64
> * 51: Packed64
> * 52: Packed64
> * 53: Packed64
> * 54: Direct64
> * 55: Direct64
> * 56: Direct64
> * 57: Direct64
> * 58
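The `System.arraycopy` point from the comment above can be illustrated with a toy `long[]`-backed store: once each value occupies exactly one long (the Direct64 layout), the naive per-element bulk copy can be replaced by a single `arraycopy`, which the JVM compiles to an intrinsic memory copy. This is a sketch, not Direct64's actual code.

```java
// Toy stand-in for a Direct64-style store: one value per long.
class Direct64Sketch {
    final long[] values;

    Direct64Sketch(int valueCount) {
        values = new long[valueCount];
    }

    void set(int index, long value) { values[index] = value; }
    long get(int index) { return values[index]; }

    // Naive bulk get: one load/store (plus bounds check) per element.
    int bulkGetNaive(int index, long[] dest, int off, int len) {
        int n = Math.min(len, values.length - index);
        for (int i = 0; i < n; i++) {
            dest[off + i] = values[index + i];
        }
        return n;
    }

    // arraycopy bulk get: same result, one intrinsic memory copy.
    int bulkGetFast(int index, long[] dest, int off, int len) {
        int n = Math.min(len, values.length - index);
        System.arraycopy(values, index, dest, off, n);
        return n;
    }
}
```

Both variants return how many values were copied; only the second delegates the inner loop to the runtime, which is where the bulk get/set win for bitsPerValue&gt;32 in the linked charts comes from.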
[jira] [Resolved] (SOLR-1958) Empty fetchMailsSince exception
[ https://issues.apache.org/jira/browse/SOLR-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer resolved SOLR-1958.
--
Resolution: Fixed
Fix Version/s: 5.0
Committed...Trunk: r1350269, Branch_4x: r1350278
> Empty fetchMailsSince exception
> ---
>
> Key: SOLR-1958
> URL: https://issues.apache.org/jira/browse/SOLR-1958
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Environment: Ubuntu 9.10 x86_64 Linux 2.6.31-302-rs
> Reporter: Max Lynch
> Assignee: James Dyer
> Labels: dih
> Fix For: 4.0, 5.0
>
> Attachments: SOLR-1958.patch, SOLR-1958.patch
>
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> When using the MailEntityProcessor, import would fail if fetchMailsSince was
> not specified.
[JENKINS] Solr-trunk - Build # 1884 - Failure
Build: https://builds.apache.org/job/Solr-trunk/1884/ 1 tests failed. REGRESSION: org.apache.solr.cloud.OverseerTest.testShardAssignmentBigger Error Message: could not find counter for shard:null Stack Trace: java.lang.AssertionError: could not find counter for shard:null at __randomizedtesting.SeedInfo.seed([B8806B0A91010277:2B6E95717B68FD14]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertNotNull(Assert.java:526) at org.apache.solr.cloud.OverseerTest.__CLR2_6_3v4oypg1pbz(OverseerTest.java:369) at org.apache.solr.cloud.OverseerTest.testShardAssignmentBigger(OverseerTest.java:251) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 41827 lines...] [junit4] 2>at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531) [junit4] 2>at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507) [junit4] 2> [junit4] 2> 290958 T2034 oasc.LeaderElector$1.process WARNING org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Sessio
[jira] [Updated] (SOLR-1958) Empty fetchMailsSince exception
[ https://issues.apache.org/jira/browse/SOLR-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-1958:
-
Attachment: SOLR-1958.patch
Here's an even simpler patch to fix this. I will commit this to trunk & back-port to 4x as it is a trivial change. However, I'm "blind" with MailEntityProcessor as I do not have a mailserver to run the unit test against. (See SOLR-2175...I've done a little research so far on this but haven't found the right answer yet...)
> Empty fetchMailsSince exception
> ---
>
> Key: SOLR-1958
> URL: https://issues.apache.org/jira/browse/SOLR-1958
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Environment: Ubuntu 9.10 x86_64 Linux 2.6.31-302-rs
> Reporter: Max Lynch
> Assignee: James Dyer
> Labels: dih
> Fix For: 4.0
>
> Attachments: SOLR-1958.patch, SOLR-1958.patch
>
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> When using the MailEntityProcessor, import would fail if fetchMailsSince was
> not specified.
[jira] [Commented] (SOLR-3535) Add block support for XMLLoader
[ https://issues.apache.org/jira/browse/SOLR-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295049#comment-13295049 ] Yonik Seeley commented on SOLR-3535:
bq. 1) add "List getChildDocuments()" to SolrInputDocument
Or simply allow SolrInputDocument *as* a normal value, and existing APIs could be used to add them. This would also be slightly more powerful, allowing more than one child list for the same parent.
> Add block support for XMLLoader
> ---
>
> Key: SOLR-3535
> URL: https://issues.apache.org/jira/browse/SOLR-3535
> Project: Solr
> Issue Type: Sub-task
> Components: update
> Affects Versions: 4.1, 5.0
> Reporter: Mikhail Khludnev
> Priority: Minor
> Attachments: SOLR-3535.patch
>
> I'd like to add the following update xml message:
>
>
>
> out of scope for now:
> * other update formats
> * update log support (NRT), should not be a big deal
> * overwrite feature support for block updates - it's more complicated, I'll
> tell you why
> Alt
> * wdyt about adding an attribute to the current tag {pre}{pre}
> * or we can establish a RunBlockUpdateProcessor which treats every
> as a block.
> *Test is included!!*
> How would you suggest improving the patch?
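Yonik's suggestion in the comment above can be modeled with a toy document class (deliberately not SolrJ): if a document is allowed as a normal field value, no dedicated child-document API is needed, and one parent can carry several child lists under different field names. The class and field names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "child document as a plain field value".
class NestedDoc {
    final Map<String, List<Object>> fields = new LinkedHashMap<>();

    // A value can be a String, a number... or another NestedDoc.
    void addField(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        NestedDoc parent = new NestedDoc();
        parent.addField("id", "book1");
        NestedDoc ch1 = new NestedDoc();
        ch1.addField("id", "book1_ch1");
        parent.addField("chapters", ch1);            // child document as a plain value
        parent.addField("reviews", new NestedDoc()); // a second child list on the same parent
    }
}
```

The "more than one child list" point falls out for free: `chapters` and `reviews` above are just two multivalued fields that happen to hold documents.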
[jira] [Assigned] (SOLR-3542) Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440)
[ https://issues.apache.org/jira/browse/SOLR-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-3542:
Assignee: Koji Sekiguchi
> Highlighter: Integration of LUCENE-4133 (Part of LUCENE-3440)
> -
>
> Key: SOLR-3542
> URL: https://issues.apache.org/jira/browse/SOLR-3542
> Project: Solr
> Issue Type: Improvement
> Components: highlighter
> Affects Versions: 4.0
> Reporter: Sebastian Lutze
> Assignee: Koji Sekiguchi
> Priority: Minor
> Labels: FastVectorHighlighter, highlight, patch
> Fix For: 4.0
>
> Attachments: SOLR-3542.patch
>
> This patch integrates a weight-based approach for sorting highlighted
> fragments.
> See LUCENE-4133 (Part of LUCENE-3440).
> This patch contains:
> - Introduction of class WeightedFragListBuilder, an implementation of
> SolrFragListBuilder
> - Updated example-configuration
[jira] [Reopened] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reopened LUCENE-4062: -- Assignee: Adrien Grand (was: Michael McCandless) > More fine-grained control over the packed integer implementation that is > chosen > --- > > Key: LUCENE-4062 > URL: https://issues.apache.org/jira/browse/LUCENE-4062 > Project: Lucene - Java > Issue Type: Improvement > Components: core/other >Reporter: Adrien Grand >Assignee: Adrien Grand >Priority: Minor > Labels: performance > Fix For: 4.0 > > Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, > LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch > > > In order to save space, Lucene has two main PackedInts.Mutable implementations, > one that is very fast and is based on a byte/short/integer/long array > (Direct*) and another one which packs bits in a memory-efficient manner > (Packed*). > The packed implementation tends to be much slower than the direct one, which > discourages some Lucene components from using it. On the other hand, if you store > 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%. > If you are willing to trade some space for speed, you could store 3 of these 21-bit > integers in a long, resulting in an overhead of 1/3 bit per value. One > advantage of this approach is that you never need to read more than one block > to read or write a value, so this can be significantly faster than Packed32 > and Packed64 which always need to read/write two blocks in order to avoid > costly branches. > I ran some tests, and for 1000 21-bit values, this implementation takes > less than 2% more space and has 44% faster writes and 30% faster reads. The > 12-bit version (5 values per block) has the same performance improvement and > a 6% memory overhead compared to the packed implementation. 
> In order to select the best implementation for a given integer size, I wrote > the {{PackedInts.getMutable(valueCount, bitsPerValue, > acceptableOverheadPerValue)}} method. This method select the fastest > implementation that has less than {{acceptableOverheadPerValue}} wasted bits > per value. For example, if you accept an overhead of 20% > ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty > reasonable, here is what implementations would be selected: > * 1: Packed64SingleBlock1 > * 2: Packed64SingleBlock2 > * 3: Packed64SingleBlock3 > * 4: Packed64SingleBlock4 > * 5: Packed64SingleBlock5 > * 6: Packed64SingleBlock6 > * 7: Direct8 > * 8: Direct8 > * 9: Packed64SingleBlock9 > * 10: Packed64SingleBlock10 > * 11: Packed64SingleBlock12 > * 12: Packed64SingleBlock12 > * 13: Packed64 > * 14: Direct16 > * 15: Direct16 > * 16: Direct16 > * 17: Packed64 > * 18: Packed64SingleBlock21 > * 19: Packed64SingleBlock21 > * 20: Packed64SingleBlock21 > * 21: Packed64SingleBlock21 > * 22: Packed64 > * 23: Packed64 > * 24: Packed64 > * 25: Packed64 > * 26: Packed64 > * 27: Direct32 > * 28: Direct32 > * 29: Direct32 > * 30: Direct32 > * 31: Direct32 > * 32: Direct32 > * 33: Packed64 > * 34: Packed64 > * 35: Packed64 > * 36: Packed64 > * 37: Packed64 > * 38: Packed64 > * 39: Packed64 > * 40: Packed64 > * 41: Packed64 > * 42: Packed64 > * 43: Packed64 > * 44: Packed64 > * 45: Packed64 > * 46: Packed64 > * 47: Packed64 > * 48: Packed64 > * 49: Packed64 > * 50: Packed64 > * 51: Packed64 > * 52: Packed64 > * 53: Packed64 > * 54: Direct64 > * 55: Direct64 > * 56: Direct64 > * 57: Direct64 > * 58: Direct64 > * 59: Direct64 > * 60: Direct64 > * 61: Direct64 > * 62: Direct64 > Under 32 bits per value, only 13, 17 and 22-26 bits per value would still > choose the slower Packed64 implementation. Allowing a 50% overhead would > prevent the packed implementation to be selected for bits per value under 32. 
> Allowing an overhead of 32 bits per value would make sure that a Direct* > implementation is always selected. > Next steps would be to: > * make lucene components use this {{getMutable}} method and let users decide > what trade-off better suits them, > * write a Packed32SingleBlock implementation if necessary (I didn't do it > because I have no 32-bits computer to test the performance improvements). > I think this would allow more fine-grained control over the speed/space > trade-off, what do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
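The arithmetic behind these trade-offs is easy to verify outside Lucene. The following self-contained sketch (not Lucene's implementation; method names are illustrative) computes wasted bits per value for a Packed64SingleBlock-style layout, which fits as many whole values as possible into each 64-bit long, versus the Direct8/16/32/64 variants, which round the value width up to the next native type:

```java
// Overhead arithmetic behind the proposed
// PackedInts.getMutable(valueCount, bitsPerValue, acceptableOverheadPerValue)
// selection — a sketch of the math from the issue, not Lucene's code.
class PackedOverhead {
    // Wasted bits per value for a "single block" scheme that fits
    // floor(64 / bitsPerValue) values in each long.
    static float singleBlockOverhead(int bitsPerValue) {
        int valuesPerBlock = 64 / bitsPerValue;
        return (64f - valuesPerBlock * bitsPerValue) / valuesPerBlock;
    }

    // Wasted bits per value for the Direct8/16/32/64 variants, which round
    // bitsPerValue up to the next byte/short/int/long width.
    static int directOverhead(int bitsPerValue) {
        int width = bitsPerValue <= 8 ? 8 : bitsPerValue <= 16 ? 16
                  : bitsPerValue <= 32 ? 32 : 64;
        return width - bitsPerValue;
    }

    public static void main(String[] args) {
        // 21-bit values: 3 per long => 1/3 bit of overhead per value,
        // versus 32-21 = 11 bits wasted per value in a Direct32.
        System.out.println(singleBlockOverhead(21)); // ~0.33
        System.out.println(directOverhead(21));      // 11
    }
}
```

These numbers match the issue description: 21-bit values cost 1/3 extra bit each in a single-block long (under 2% overhead) against an 11/32 = 35% loss in Direct32, which is why an acceptable-overhead knob is needed to pick between the two families.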
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295026#comment-13295026 ] Erik Hatcher commented on SOLR-2894: Trey - thanks for the positive feedback. I'll apply the patch, run the tests, review the code, and so on. Might be a couple of weeks, unless I can get to this today. > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Erik Hatcher > Fix For: 4.0 > > Attachments: SOLR-2894.patch, distributed_pivot.patch, > distributed_pivot.patch > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295025#comment-13295025 ] David Smiley commented on SOLR-3534: Just to keep these concerns separated, this issue, SOLR-3534 is about two things: * dismax&edismax should look at 'df' before falling back to defaultSearchField * dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them. SOLR-2724 is about the deprecation of defaultSearchField > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3534) dismax and edismax should default to "df" when "qf" is absent.
[ https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295019#comment-13295019 ] David Smiley commented on SOLR-3534: defaultSearchField may be referenced in a bunch of places but it is always a default for something else that you should be specifying (typically 'df'). I've commented out my defaultSearchField long before it was deprecated. > dismax and edismax should default to "df" when "qf" is absent. > -- > > Key: SOLR-3534 > URL: https://issues.apache.org/jira/browse/SOLR-3534 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 4.0 >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: > SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch > > > The dismax and edismax query parsers should default to "df" when the "qf" > parameter is absent. They only use the defaultSearchField in schema.xml as a > fallback now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
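The lookup order this issue proposes for dismax/edismax can be sketched in plain Java. The parameter names `qf` and `df` are Solr's; everything else below (class and method names, the exception type) is illustrative, not Solr's actual code:

```java
import java.util.*;

// Sketch of the fallback order SOLR-3534 asks for:
//   qf -> df -> schema defaultSearchField -> error.
// Hypothetical helper; Solr's real query parsers are structured differently.
class DefaultFieldResolver {
    static String resolve(Map<String, String> params, String defaultSearchField) {
        String qf = params.get("qf");
        if (qf != null && !qf.isEmpty()) {
            return qf;                      // explicit query fields win
        }
        String df = params.get("df");
        if (df != null && !df.isEmpty()) {
            return df;                      // consulted BEFORE the schema default
        }
        if (defaultSearchField != null) {
            return defaultSearchField;      // deprecated schema.xml fallback
        }
        // Second point of the issue: fail loudly instead of parsing nothing.
        throw new IllegalArgumentException(
            "neither qf, df, nor defaultSearchField is specified");
    }
}
```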
[jira] [Updated] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4132: --- Attachment: LUCENE-4132.patch bq. Can we override all methods so the javadocs aren't confusing. Good idea! Done bq. Also can we rename it to LiveIndexWriterConfig? Done > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". 
> It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
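The "two objects" idea sketched above can be illustrated in a few lines. The names here are illustrative (Lucene ultimately went with LiveIndexWriterConfig); the point is only the shape: the full config is consumed at construction time, while the writer hands back a narrower view whose setters are exactly the live ones:

```java
// Sketch of splitting the writer config into a full construction-time
// object and a "live" view. Names (LiveConfig, Config, Writer) are
// illustrative, not Lucene's API.
interface LiveConfig {
    // Takes effect on the already-open writer; RAM buffer size is the
    // canonical live setting from the discussion.
    LiveConfig setRAMBufferSizeMB(double mb);
    double getRAMBufferSizeMB();
}

class Config implements LiveConfig {
    private double ramBufferMB = 16.0;

    public Config setRAMBufferSizeMB(double mb) {
        ramBufferMB = mb;
        return this;
    }

    public double getRAMBufferSizeMB() {
        return ramBufferMB;
    }
    // One-time settings (codec, merge policy, ...) would live here too,
    // with no setter exposed through LiveConfig.
}

class Writer {
    private final Config config;

    Writer(Config config) {
        this.config = config;
    }

    // Callers only see the live subset, so a non-live setter can never be
    // mistaken for something that takes effect after construction.
    LiveConfig getConfig() {
        return config;
    }
}
```

User code still passes a `Config` to the writer; the only change is the static type returned by `getConfig()`, which is the documentation-by-types benefit the issue is after.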
[jira] [Resolved] (LUCENE-4144) OOM when call optimize
[ https://issues.apache.org/jira/browse/LUCENE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4144. Resolution: Not A Problem Please raise this on the java-u...@lucene.apache.org list instead. Also, we've made good reductions in RAM usage since 2.1, so it may be that simply upgrading to the latest release (3.6) resolves this. > OOM when call optimize > -- > > Key: LUCENE-4144 > URL: https://issues.apache.org/jira/browse/LUCENE-4144 > Project: Lucene - Java > Issue Type: New Feature > Components: core/index >Affects Versions: 2.1 >Reporter: Zhenglin Sun > Fix For: 2.1 > > > The index file is about 6G. When I update the index it works fine, but I > hit an OOM when calling optimize: > Caused by: java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: > 969048, Num elements: 242258 > at > org.apache.lucene.index.TermInfosReader.ensureIndexIsRead(TermInfosReader.java:90) > at > org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:133) > at > org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:51) > at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java:482) > at > org.apache.lucene.index.IndexReader.deleteDocuments(IndexReader.java:573) > at > org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java:1776) > at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:1670) > at > org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1521) > at > org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:1351) > at > org.apache.lucene.index.IndexWriter.maybeFlushRamSegments(IndexWriter.java:1344) > at > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:763) > at > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:743) -- This message is automatically generated by JIRA. 
[JENKINS] Lucene-Solr-trunk-Linux-Java7-64 - Build # 276 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java7-64/276/ 1 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([5210B0FC43222B81]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:321) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3127) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log: [...truncated 33938 lines...] 
[junit4] 2> 27912 T1782 C77 P40466 oasu.DirectUpdateHandler2.commit end_commit_flush [junit4] 2> 27912 T1789 oasc.SolrCore.registerSearcher [collection1] Registered new searcher Searcher@ec28733 main{StandardDirectoryReader(segments_4:1142 _e2(5.0):C2237/109 _ey(5.0):C247 _ex(5.0):C3)} [junit4] 2> 27915 T1914 C78 P60019 oasu.DirectUpdateHandler2.commit start commit{flags=0,version=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false} [junit4] 2> 27947 T1914 C78 P60019 oasc.SolrDeletionPolicy.onCommit SolrDeletionPolicy.onCommit: commits:num=2 [junit4] 2> commit{dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java7-64/checkout/solr/build/solr-core/test/J0/org.apache.solr.cloud.RecoveryZkTest-1339677498320/jetty2/index.20120614233844499,segFN=segments_4,generation=4,filenames=[_ed_Lucene40_0.prx, _ee_Lucene40_0.tim, _ee_Lucene40_0.tip, _ec_Lucene40_0.tim, _ee_Lucene40_0.frq, _ef_Lucene40_0.frq, _ec_Lucene40_0.tip, _e3_1.del, _e3.si, _ed_nrm.cfe, _ec.fnm, _ef.si, _ec_nrm.cfs, _ef_Lucene40_0.tip, _eb.fnm, _ef_Lucene40_0.tim, _ec_Lucene40_0.frq, _ed_nrm.cfs, _ef_Lucene40_0.prx, _ec_nrm.cfe, _e3_Lucene40_0.prx, _ef.fdx, _ec.fdt, _ef.fdt, _e3_Lucene40_0.frq, _ec.fdx, _
Re: Corrupt index
On Wed, Jun 13, 2012 at 8:45 PM, Itamar Syn-Hershko wrote: > Mike, > > On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless > wrote: >> >> Hi Itamar, >> >> One quick question: does Lucene.Net include the fixes done for >> LUCENE-1044 (to fsync files on commit)? Those are very important for >> an index to be intact after OS/JVM crash or power loss. > > > Definitely, as Christopher noted we are about to release a 3.0.3 compatible > version, which is line-by-line port of the Java version. Hmm OK. Then we still need to explain the corruption... >> You shouldn't even have to run CheckIndex ... because (as of >> LUCENE-1044) we now fsync all segment files before writing the new >> segments_N file, and then removing old segments_N files (and any >> segments that are no longer referenced). >> >> You do have to remove the write.lock if you aren't using >> NativeFSLockFactory (but this has been the default lock impl for a >> while now). > > Somewhat unrelated to this thread, but what should I expect to see? from > time to time we do see write.lock present after an app-crash or power > failure. Also, what are the steps that are expected to be performed in such > cases? If you are using NativeFSLockFactory, you will see a write.lock but it will not actually be locked (according to the OS); so, it's fine. If you are using SimpleFSLockFactory then the presence of write.lock means the index is still locked and you'll have to remove it. >> > Last week I have been playing with rather large indexes and crashed my >> > app >> > while it was indexing. I wasn't able to open the index, and Luke was >> > even >> > kind enough to wipe the index folder clean even though I opened it in >> > read-only mode. I re-ran this, and after another crash running >> > CheckIndex >> > revealed nothing - the index was detected to be an empty one. I am not >> > entirely sure what could be the cause for this, but I suspect it has >> > been corrupted by the crash. 
>> >> Had no commit completed (no segments file written)? >> >> If you don't fsync then all sorts of crazy things are possible... > > Ok, so we do have fsync since LUCENE-1044 is present, and there were > segments present from previous commits. Any idea what went wrong? I don't know! >> > I've been looking at these: >> > >> > >> > https://issues.apache.org/jira/browse/LUCENE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel >> > >> > https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel >> >> (And LUCENE-1044 before that ... it was LUCENE-1044 that LUCENE-2328 >> broke...). > > So 2328 broke 1044, and this was fixed only in 3.4, right? so 2328 made it > to a 3.0.x release while the fix for it (3418) was only released in 3.4. Am > I right? > > If this is the case, 2328 probably made it's way to Lucene.Net since we are > using the released sources for porting, and we now need to apply 3418 in the > current version. OK that makes sense: 2328 broke things as of 3.0.3, and 3418 fixed things in 3.4. > Does it make sense to just port FSDirectory from 3.4 to 3.0.3? or were there > API or other changes that will make our life miserable if we do that? Hmmm I'm not certain offhand: maybe diff the two sources? The fix in 3418 was trivial in the end, so maybe just backport that. >> > And it seems like this is what I was experiencing. Mike and Mark will >> > probably be able to tell if this is what they saw or not, but as far as >> > I >> > can tell this is not an expected behavior of a Lucene index. >> >> Definitely not expected behavior: assuming nothing is flipping bits, >> then on OS/JVM crash or power loss your index should be fine, just >> reverted to the last successful commit. > > What I suspected. Will try to reproduce reliably - any recommendations? not > really feeling like reinventing the wheel here... 
> > MockDirectoryWrapper wasn't ported yet as it appears to only appear in 3.4, > and as you said it won't really help here anyway Use a spare computer and try pulling the plug on it ... or pull a (hot swappable/pluggable) hard drive while indexing onto it ... You can also use a virtual machine and power it off ungracefully / kill the process. If any of these events can corrupt the index then there's a bug somewhere (or: the IO system ignores fsync). Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
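The durability rule Mike describes — fsync every segment file, only then write the new segments_N commit point, and only then remove old files — can be sketched in miniature with plain java.nio. File names and the on-disk format here are illustrative, not Lucene's real index format:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

// Miniature sketch of the LUCENE-1044 commit protocol: sync data, publish
// the commit pointer, then delete the previous one. After a crash at any
// point, some segments_N from a completed commit still exists intact.
class CommitSketch {
    static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap(data));
            ch.force(true); // fsync: survives OS crash / power loss
        }
    }

    static void commit(Path dir, byte[] segmentData, long gen) throws IOException {
        // 1. fsync the segment data before it is referenced anywhere
        writeDurably(dir.resolve("_seg_" + gen), segmentData);
        // 2. publish the new commit point
        writeDurably(dir.resolve("segments_" + gen), new byte[] {1});
        // 3. only now is it safe to drop the previous commit point
        Files.deleteIfExists(dir.resolve("segments_" + (gen - 1)));
    }

    // Runs two commits in a temp dir and checks the visible commit state.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("commit-sketch");
            commit(dir, new byte[] {1}, 1);
            commit(dir, new byte[] {2}, 2);
            return Files.exists(dir.resolve("segments_2"))
                && !Files.exists(dir.resolve("segments_1"));
        } catch (IOException e) {
            return false;
        }
    }
}
```

Skipping the `force(true)` call in step 1 is exactly the LUCENE-2328 class of bug discussed above: on power loss the OS may reorder writes so that segments_N lands on disk before the data it references.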
[jira] [Resolved] (SOLR-3511) Refactor overseer to use a distributed "work"queue
[ https://issues.apache.org/jira/browse/SOLR-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved SOLR-3511. -- Resolution: Fixed Committed to 4.x too > Refactor overseer to use a distributed "work"queue > -- > > Key: SOLR-3511 > URL: https://issues.apache.org/jira/browse/SOLR-3511 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Sami Siren >Assignee: Sami Siren > Fix For: 4.0 > > Attachments: SOLR-3511.patch, SOLR-3511.patch > > > By using a queue, the overseer becomes watch-free, a lot simpler, and probably > less buggy too.
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294989#comment-13294989 ] Michael McCandless commented on LUCENE-4132: Also can we rename it to LiveIndexWriterConfig? LiveConfig is too generic I think... > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". 
> It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Solr-4.x - Build # 9 - Still Failing
Build: https://builds.apache.org/job/Solr-4.x/9/ 1 tests failed. FAILED: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([18DE9A9DE2F3DF31]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:321) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3149) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log: [...truncated 46889 lines...] 
[junit4] 2> 26469 T1539 oasc.Overseer.coreChanged Core change pooled: 127.0.0.1:56723_solr states:[coll:collection1 core:collection1 props:{num_shards=1, shard=shard1, state=active, core=collection1, collection=collection1, node_name=127.0.0.1:56723_solr, base_url=http://127.0.0.1:56723/solr}] [junit4] 2> 26469 T1539 oascc.ZkStateReader$3.process Updating live nodes [junit4] 2> 26470 T1619 oascc.ZkStateReader.updateCloudState Manual update of cluster state initiated [junit4] 2> 26470 T1619 oascc.ZkStateReader.updateCloudState Updating cloud state from ZooKeeper... [junit4] 2> 26470 T1539 oasc.RecoveryStrategy.close WARNING Stopping recovery for core collection1 zkNodeName=127.0.0.1:56723_solr_collection1 [junit4] 2> 26471 T1619 oasc.Overseer$CloudStateUpdater.run Announcing new cluster state [junit4] 2> 26471 T1539 oascc.SolrZkClient.makePath makePath: /collections/collection1/leaders/shard1 [junit4] 2> 26478 T1483 oascc.ZkStateReader$2.process A cluster state change has occurred [junit4] 2> 26478 T1479 oascc.ZkStateReader$2.process A cluster state change has occurred [junit4] 2> 26480 T1539 oascc.ZkStateReader$2.process A cluster state change has occurred [junit4] 2> 26481 T1539 oasc.Overseer.processLeaderNodesChanged Leader nod
[jira] [Resolved] (SOLR-3543) JavaBinLoader catches (and logs) exceptions and the (solrj)client has no idea that an update failed
[ https://issues.apache.org/jira/browse/SOLR-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved SOLR-3543. -- Resolution: Fixed > JavaBinLoader catches (and logs) exceptions and the (solrj)client has no idea > that an update failed > --- > > Key: SOLR-3543 > URL: https://issues.apache.org/jira/browse/SOLR-3543 > Project: Solr > Issue Type: Bug > Components: update >Reporter: Sami Siren > Fix For: 4.0 > > Attachments: SOLR-3543.patch > > > When submitting docs to Solr with the javabin wire format, the server responds > with 200 OK even when there was an error. The exception is only logged at the > server. > When using the XML format, the error is correctly reported back. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 114 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java6-64/114/ 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.handler.TestReplicationHandler Error Message: ERROR: SolrIndexSearcher opens=74 closes=73 Stack Trace: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=74 closes=73 at __randomizedtesting.SeedInfo.seed([E845A956CCCB28BB]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:190) at org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:82) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:752) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log: [...truncated 9942 lines...] [junit4] 2> 108 T775 oejs.AbstractConnector.doStart Started SocketConnector@0.0.0.0:35739 [junit4] 2> 108 T775 oasc.SolrResourceLoader.locateSolrHome JNDI not configured for solr (NoInitialContextEx) [junit4] 2> 108 T775 oasc.SolrResourceLoader.locateSolrHome using system property solr.solr.home: ./org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave [junit4] 2> 109 T775 oasc.SolrResourceLoader. new SolrResourceLoader for deduced Solr Home: './org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave/' [junit4] 2> 111 T775 oass.SolrDispatchFilter.init SolrDispatchFilter.init() [junit4] 2> 112 T775 oasc.SolrResourceLoader.locateSolrHome JNDI not configured for solr (NoInitialContextEx) [junit4] 2> 112 T775 oasc.SolrResourceLoader.locateSolrHome using system property solr.solr.home: ./org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave [junit4] 2> 112 T775 oasc.CoreContainer$Initializer.initialize looking for solr.xml: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/build/solr-core/test/J1/./org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave/solr.xml [junit4] 2> 112 T775 oasc.CoreContainer. 
New CoreContainer 1100893972 [junit4] 2> 112 T775 oasc.CoreContainer$Initializer.initialize no solr.xml file found - using default [junit4] 2> 112 T775 oasc.CoreContainer.load Loading CoreContainer using Solr Home: './org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave/' [junit4] 2> 113 T775 oasc.SolrResourceLoader. new SolrResourceLoader for directory: './org.apache.solr.handler.TestReplicationHandler$SolrInstance-1339672437471/slave/' [junit4] 2> 117 T775 oasc.CoreContainer.load Registering Log Listener [junit4] 2> 125 T775 oashc.HttpShardHandlerFactory.getParameter Setting socketTimeout to: 0 [junit4] 2> 125 T775 oashc.HttpShardHandlerFactory.getParameter Setting urlScheme to: http:// [junit4] 2> 125 T775 oashc.HttpShardHandlerFactory.getParameter Setting connTimeout to: 0 [junit4] 2> 125 T775 oashc.HttpShardHandlerFactory.getParameter Setting maxConnectionsPerHost to: 20 [junit4] 2> 126 T775 oashc.Htt
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294939#comment-13294939 ] Robert Muir commented on LUCENE-4132: - Can we override *all* methods so the javadocs aren't confusing. I don't want the methods split in the javadocs between IWC and LiveConfig: LiveConfig is expert and should be a subset, not a portion. > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. 
> * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". > It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 101 - Failure!
Wrt thread scheduling -- has anybody ever tried dtrace with hotspot on a linux system? Does it work? http://docs.oracle.com/javase/6/docs/technotes/guides/vm/dtrace.html I see there are probes to inspect thread lifecycle but I never played with dtrace so I've no idea how it works/ if it does work on linux. Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 101 - Failure!
> It certainly wouldn't be easy to do ... but it sure would it be nice :) I meant "difficult" as in "practically impossible" :) But then so are these -- http://js1k.com/ >> There's been some talk about tools to detect data races at the hotspot Found it -- see this thread: http://cs.oswego.edu/pipermail/concurrency-interest/2011-September/008205.html The tool I briefly looked at was this one: http://babelfish.arc.nasa.gov/trac/jpf > Or, even, just a way to record and then visualize what the thread > scheduling had been for a given test failure. In this case I could > have easily seen that a merge had completed before the NRT reader was > pulled (which is... unusual). This is in fact relatively easy if we allowed a jenkins run with some minor boot classpath adjustments overriding Thread's init/exit methods and logging timings from there. Obviously it'd have to be bound to a particular jvm version/ distribution but it can be done. Bytecode instrumentation would be a nicer alternative here but I'm not sure how deep it can go in terms of precedence (it'd probably need to be a native agent and this seems like an overkill). I also think (didn't check) YourKit's profiler has a thread schedule visualizer but this adds additional overhead and requires a gui (or remoting). Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
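A lightweight version of the thread-lifecycle logging discussed above can be sketched without any boot-classpath surgery by wrapping each Runnable before handing it to a Thread. This is only an illustrative sketch (class and names are made up, not an existing tool), and it only sees tasks you wrap yourself, unlike the Thread init/exit override Dawid describes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ThreadLifecycleSketch {
    // Events recorded by wrapped tasks; synchronized because worker threads append concurrently.
    static final List<String> EVENTS = Collections.synchronizedList(new ArrayList<String>());

    // Wraps a task so its start and exit are recorded, approximating the
    // Thread init/exit logging proposed above, without touching the boot classpath.
    static Runnable logged(final String name, final Runnable task) {
        return new Runnable() {
            public void run() {
                EVENTS.add(name + ":start");
                try {
                    task.run();
                } finally {
                    EVENTS.add(name + ":exit"); // recorded even if the task throws
                }
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(logged("merge-sim", new Runnable() {
            public void run() { /* simulated merge work */ }
        }));
        t.start();
        t.join();
        System.out.println(EVENTS); // [merge-sim:start, merge-sim:exit]
    }
}
```

Replaying the recorded event order after a failing seed would give a coarse picture of the scheduling, though without JVM cooperation it cannot capture threads the test framework itself spawns.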
[jira] [Updated] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4132: --- Attachment: LUCENE-4132.patch Thanks Uwe. The test is now fixed by saving all 'synthetic' methods and all 'setter' methods and verifying in the end that all of them were received from IWC too. > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch, > LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". 
> It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294867#comment-13294867 ] Uwe Schindler commented on LUCENE-4132: --- Hi Shai, ignore all methods with isSynthetic() set (that are covariant overrides compatibility methods, access$xx() methods for access to private fields/ctors/...). > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". 
> It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
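Uwe's isSynthetic() suggestion can be sketched as below. Base and Sub are toy stand-ins for LiveConfig and IndexWriterConfig (not the real classes): the covariant override in Sub makes javac emit a synthetic bridge method `Base setFoo(int)` alongside the real one, and that bridge is exactly what the filter skips:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class SyntheticFilterSketch {
    public static class Base {
        public Base setFoo(int v) { return this; }
    }
    public static class Sub extends Base {
        @Override
        public Sub setFoo(int v) { return this; } // javac adds a synthetic bridge Base setFoo(int)
    }

    // Collects declared setters while skipping synthetic methods
    // (bridge methods for covariant overrides, access$xx accessors, ...).
    public static List<Method> realSetters(Class<?> clazz) {
        List<Method> setters = new ArrayList<Method>();
        for (Method m : clazz.getDeclaredMethods()) {
            if (m.isSynthetic()) continue;
            if (m.getName().startsWith("set")) setters.add(m);
        }
        return setters;
    }

    public static void main(String[] args) {
        // Sub declares two setFoo methods (the real override plus the bridge);
        // the filter keeps only the real one.
        System.out.println(Sub.class.getDeclaredMethods().length); // 2
        System.out.println(realSetters(Sub.class).size());         // 1
    }
}
```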
[jira] [Updated] (LUCENE-4132) IndexWriterConfig live settings
[ https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4132: --- Attachment: LUCENE-4132.patch Sorry if it came across like that, but I don't mean to rush or shove this issue in. I'm usually after consensus and I appreciate your feedback. I took another look at this, and found a solution without generics. Funny thing is, that's the first solution that came to my mind, but I guess at the time it didn't look promising enough, so I discarded it :). Now we have only LiveConfig and IndexWriterConfig, where IWC extends LC and overrides all setter methods. The "live" setters are overridden just to return IWC type, and call super.setXYZ(). So we don't have code dup, and whoever has IWC type at hand, will receive IWC back from all set() methods. LC is a public class with package-private ctors, one that takes IWC (used by IndexWriter) and one that takes Analyzer+Version, to match IWC's. It contains all "live" members as private, and the others as protected, so that IWC can set them. Since it cannot be sub-classed outside the package, this is 'safe'. The only thing that bothers me, and I'm not sure if it can be fixed, but this is not critical either, is TestIWC.testSettersChaining(). For some reason, even though I override the setters from LC in IWC, and set their return type to IWC, reflection still returns their return type as LiveConfig. This only affects the test, since if I do: {code} IndexWriterConfig conf; conf.setMaxBufferedDocs(); // or any other set from LC {code} the return type is IWC. If anyone knows how to solve it, please let me know, otherwise we'll just have to live with the modification to the test, and the chance that future "live" setters may be incorrectly overridden by IWC to not return the IWC type. That is not an error, just a convenience.
Besides that, and if I follow your comments and concerns properly, I think this is now ready to commit -- there's no extra complexity (generics, 3 classes etc.), and with better compile time protection against misuse. > IndexWriterConfig live settings > --- > > Key: LUCENE-4132 > URL: https://issues.apache.org/jira/browse/LUCENE-4132 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Shai Erera >Priority: Minor > Fix For: 4.0, 5.0 > > Attachments: LUCENE-4132.patch, LUCENE-4132.patch, LUCENE-4132.patch > > > A while ago there was a discussion about making some IW settings "live" and I > remember that RAM buffer size was one of them. Judging from IW code, I see > that RAM buffer can be changed "live" as IW never caches it. > However, I don't remember which other settings were decided to be "live" and > I don't see any documentation in IW nor IWC for that. IW.getConfig mentions: > {code} > * NOTE: some settings may be changed on the > * returned {@link IndexWriterConfig}, and will take > * effect in the current IndexWriter instance. See the > * javadocs for the specific setters in {@link > * IndexWriterConfig} for details. > {code} > But there's no text on e.g. IWC.setRAMBuffer mentioning that. > I think that it'd be good if we make it easier for users to tell which of the > settings are "live" ones. There are few possible ways to do it: > * Introduce a custom @live.setting tag on the relevant IWC.set methods, and > add special text for them in build.xml > ** Or, drop the tag and just document it clearly. > * Separate IWC to two interfaces, LiveConfig and OneTimeConfig (name > proposals are welcome !), have IWC impl both, and introduce another > IW.getLiveConfig which will return that interface, thereby clearly letting > the user know which of the settings are "live". > It'd be good if IWC itself could only expose setXYZ methods for the "live" > settings though. 
So perhaps, off the top of my head, we can do something like > this: > * Introduce a Config object, which is essentially what IWC is today, and pass > it to IW. > * IW will create a different object, IWC from that Config and IW.getConfig > will return IWC. > * IWC itself will only have setXYZ methods for the "live" settings. > It adds another object, but user code doesn't change - it still creates a > Config object when initializing IW, and need to handle a different type if it > ever calls IW.getConfig. > Maybe that's not such a bad idea? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-
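The LiveConfig/IndexWriterConfig split described above (live setters overridden only to narrow the return type, delegating to super) might look roughly like this. The class names mirror the proposal but the members and settings are purely illustrative, not the actual Lucene API:

```java
public class ConfigSketch {
    public static class LiveConfig {
        private int maxBufferedDocs = 16;         // a "live" member, private to LC
        protected String mergePolicy = "default"; // non-live member, settable by the subclass

        LiveConfig() {}                           // package-private ctor, so LC can't be subclassed outside the package

        public LiveConfig setMaxBufferedDocs(int n) {
            this.maxBufferedDocs = n;
            return this;
        }
        public int getMaxBufferedDocs() { return maxBufferedDocs; }
    }

    public static class IndexWriterConfig extends LiveConfig {
        // Live setter overridden only to narrow the return type;
        // it delegates to super, so there is no code duplication.
        @Override
        public IndexWriterConfig setMaxBufferedDocs(int n) {
            super.setMaxBufferedDocs(n);
            return this;
        }
        // Non-live setter exists only on IWC.
        public IndexWriterConfig setMergePolicy(String p) {
            this.mergePolicy = p;
            return this;
        }
    }

    public static void main(String[] args) {
        // Chaining through a live setter stays typed as IndexWriterConfig.
        IndexWriterConfig conf = new IndexWriterConfig()
                .setMaxBufferedDocs(100)
                .setMergePolicy("tiered");
        System.out.println(conf.getMaxBufferedDocs()); // 100
    }
}
```

The reflection quirk mentioned in the comment comes from the same covariant override: the compiler emits a synthetic bridge method with the LiveConfig return type next to the narrowed one, so a naive getDeclaredMethods() scan sees both.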
[jira] [Commented] (SOLR-3406) Support grouped range and query facets.
[ https://issues.apache.org/jira/browse/SOLR-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294847#comment-13294847 ] Martijn van Groningen commented on SOLR-3406: - Sure. I think what is in here can be committed. The only thing that needs work is caching. Right now, when facet.query is used in combination with group.facet=true, caching doesn't take place. I think this can be fixed in a new issue that refers to this issue. In the meantime the patch in this issue can get committed. > Support grouped range and query facets. > --- > > Key: SOLR-3406 > URL: https://issues.apache.org/jira/browse/SOLR-3406 > Project: Solr > Issue Type: New Feature >Reporter: David >Assignee: Martijn van Groningen >Priority: Critical > Fix For: 4.0 > > Attachments: SOLR-2898-backport.patch, SOLR-3406.patch, > SOLR-3406.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > Need the ability to support grouped range and query facets. Grouped facet > fields have already been implemented in SOLR-2898 but we still need the > ability to compute grouped range and query facets.