[JENKINS] Lucene-Solr-5.x-Windows (32bit/jdk1.7.0_67) - Build # 4406 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4406/
Java: 32bit/jdk1.7.0_67 -client -XX:+UseParallelGC (asserts: false)

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch

Error Message:
There were too many update fails - we expect it can happen, but shouldn't easily

Stack Trace:
java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily
    at __randomizedtesting.SeedInfo.seed([2A0C9DA5A101DC38:ABEA13BDD65EBC04]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertFalse(Assert.java:68)
    at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:223)
    at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Commented] (SOLR-6810) Faster searching limited but high rows across many shards all with many hits
[ https://issues.apache.org/jira/browse/SOLR-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258710#comment-14258710 ] Shalin Shekhar Mangar commented on SOLR-6810: - {quote} Something to keep in mind for future optimizations: If we can use searcher leases, we know exactly which documents we need to retrieve from step 2 and can pass their ordinals in step 3. That would appear to represent another very large speedup... if you need doc 42 and 77 from a shard, you can get just those two docs instead of docs 1 through 77. edit: either ordinals (positions in the ranked doc list) or internal lucene docids would work if we're using searcher leases. {quote} Maybe I missed something, but if we make sure that step 2 is executed on the same replicas as step 1 (which we would have to do for searcher leases anyway), then the query results should already be in the cache and the ordinals in the ranked doc list are just the top N? bq. Which begs the question: what are the downsides of using docValues for the ID field by default, and are those downsides enough to implement this alternate merge implementation? I'm not saying otherwise... just throwing it out there. I don't know. I'll create a benchmark to experiment with these ideas. In any case, existing indexes where the ID field does not use docValues will also get a speedup with this new algorithm. Faster searching limited but high rows across many shards all with many hits Key: SOLR-6810 URL: https://issues.apache.org/jira/browse/SOLR-6810 Project: Solr Issue Type: Improvement Components: search Reporter: Per Steffensen Assignee: Shalin Shekhar Mangar Labels: distributed_search, performance Attachments: branch_5x_rev1642874.patch, branch_5x_rev1642874.patch, branch_5x_rev1645549.patch Searching limited but high rows across many shards, all with many hits, is slow. E.g.:
* Query from outside client: q=something&rows=1000
* Resulting in sub-requests to each shard something a-la this:
** 1) q=something&rows=1000&fl=id,score
** 2) Request the full documents with ids in the global-top-1000 found among the top-1000 from each shard

What does the subject mean?
* "limited but high rows" means 1000 in the example above
* "many shards" means 200-1000 in our case
* "all with many hits" means that each of the shards has a significant number of hits on the query

The problem grows with all three factors above. Doing such a query on our system takes between 5 minutes and 1 hour - depending on a lot of things. It ought to be much faster, so let's make it so. Profiling shows that the problem is that it takes a lot of time to access the store to get ids for (up to) 1000 docs (the value of the rows parameter) per shard. With 1000 shards, that is up to 1 million ids that have to be fetched. There is really no good reason to ever read information from the store for more than the overall top-1000 documents that have to be returned to the client. For further detail see the mail-thread "Slow searching limited but high rows across many shards all with high hits" started 13/11-2014 on dev@lucene.apache.org -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
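The two-step scheme described above (cheap id+score sub-requests first, full document retrieval only for the global top-N) can be sketched roughly as follows. This is a hypothetical, simplified illustration in plain Java - the `Hit` record and `globalTopIds` method are invented names, not SolrJ or the actual patch - showing why only the merged top-N ids ever need a store access:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TwoPhaseMerge {
    // A scored hit as returned by the cheap first phase (fl=id,score).
    record Hit(String id, float score) {}

    // Merge each shard's top-N (id, score) pairs and keep only the global top-N ids.
    // Only these ids need full-document retrieval in the second phase.
    static List<String> globalTopIds(List<List<Hit>> perShardTopN, int n) {
        PriorityQueue<Hit> all =
                new PriorityQueue<>(Comparator.<Hit>comparingDouble(Hit::score).reversed());
        perShardTopN.forEach(all::addAll);
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n && !all.isEmpty(); i++) {
            ids.add(all.poll().id());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two shards, each returning its local top hits with scores only.
        List<List<Hit>> shards = List.of(
                List.of(new Hit("a", 9f), new Hit("b", 3f)),
                List.of(new Hit("c", 7f), new Hit("d", 5f)));
        // Only a, c, d make the global top-3; doc b is never fetched from the store.
        System.out.println(globalTopIds(shards, 3));
    }
}
```

The key point of the issue is that the expensive store lookup happens for at most N documents total, rather than N documents per shard.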
[jira] [Updated] (SOLR-6448) Improve SolrJ support for Collections API
[ https://issues.apache.org/jira/browse/SOLR-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-6448: --- Attachment: SOLR-6448.patch Still trying to wrap up one test (BALANCESHARDUNIQUE). In the middle of validating that all supported SolrJ methods (for each of the Collection API calls) are actually supported. Once that's done, I should be able to commit it, and with SolrJ supporting all Collection API calls, it should be a merry Christmas! :) Improve SolrJ support for Collections API - Key: SOLR-6448 URL: https://issues.apache.org/jira/browse/SOLR-6448 Project: Solr Issue Type: Improvement Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-6448.patch, SOLR-6448.patch, SOLR-6448.patch Right now SolrJ doesn't really support all of the Collections API. This is a parent issue for bringing SolrJ support for all APIs up to where it should be. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6448) Improve SolrJ support for Collections API
[ https://issues.apache.org/jira/browse/SOLR-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258721#comment-14258721 ] Shalin Shekhar Mangar commented on SOLR-6448: - bq. Still trying to wrap up one test (BALANCESHARDUNIQUE). Isn't that and REBALANCELEADERS no longer in 5.0? See SOLR-6691 Improve SolrJ support for Collections API - Key: SOLR-6448 URL: https://issues.apache.org/jira/browse/SOLR-6448 Project: Solr Issue Type: Improvement Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-6448.patch, SOLR-6448.patch, SOLR-6448.patch Right now SolrJ doesn't really support all of the collections API. This is a parent issue for bringing SolrJ support for all APIs up to where it should be. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6889) debug, highlight with parallel streams
Shinichiro Abe created SOLR-6889: Summary: debug, highlight with parallel streams Key: SOLR-6889 URL: https://issues.apache.org/jira/browse/SOLR-6889 Project: Solr Issue Type: Improvement Affects Versions: Trunk Reporter: Shinichiro Abe I think we could gain a little search performance by using Stream.parallel().forEach(), which has processor awareness via the fork/join framework under the hood. Especially it would affect docList's for-loop processes, e.g. debugging, highlighting. It seems to me that this improvement is effective in environments with many CPUs. My test conditions: 1. Core i5 (2-core/4-thread), standalone Solr. 2. q=日本&debug=true&hl=true, other parameters are [here|https://github.com/anond2/simplesearch/blob/master/conf/solrconfig.xml#L836]. 3. 7171 hits / 12000 docs (taken from the ja.wikipedia dump). 4. Compared to trunk, parallel streams are a little faster. My query execution results (QTime):
{noformat}
== rows=10 ==
     trunk  patch
1st    236    146
2nd    179    100
3rd     79     72
4th     75     53
5th     91     80

== rows=50 ==
     trunk  patch
1st    485    325
2nd    225    243
3rd    199    151
4th    168    127
5th    149    118

== rows=100 ==
     trunk  patch
1st    948    607
2nd    472    390
3rd    237    201
4th    256    200
5th    224    178

== rows=500 ==
     trunk  patch
1st   3248   2826
2nd   1545   1067
3rd   1563    801
4th   1551    816
5th   1452    777
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
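The idea in SOLR-6889 above - replacing a sequential per-document for-loop with a parallel stream that fans work out over the fork/join common pool - can be illustrated with a minimal sketch. This is not the actual Solr patch; the `highlight` method, the integer doc ids, and the result map are stand-ins for a per-document operation such as highlighting or debug explain:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ParallelDocList {
    // Stand-in for an independent per-document operation (e.g. highlighting one doc).
    static String highlight(int docId) {
        return "doc" + docId + ":highlighted";
    }

    public static void main(String[] args) {
        List<Integer> docList = List.of(1, 2, 3, 4);
        // Results container must be thread-safe, since forEach on a parallel
        // stream may run the lambda concurrently on fork/join worker threads.
        Map<Integer, String> results = new ConcurrentHashMap<>();
        docList.parallelStream().forEach(id -> results.put(id, highlight(id)));
        System.out.println(results.get(3));
    }
}
```

This only pays off when the per-document work is independent and non-trivial, which matches the observation in the issue that the gain shows up mainly at higher rows and on multi-core machines.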
[jira] [Updated] (SOLR-6889) debug, highlight with parallel streams
[ https://issues.apache.org/jira/browse/SOLR-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated SOLR-6889: - Attachment: SOLR-6889.patch A simple patch, please take a look. debug, highlight with parallel streams -- Key: SOLR-6889 URL: https://issues.apache.org/jira/browse/SOLR-6889 Project: Solr Issue Type: Improvement Affects Versions: Trunk Reporter: Shinichiro Abe Attachments: SOLR-6889.patch I think we could gain a little search performance by using Stream.parallel().forEach(), which has processor awareness via the fork/join framework under the hood. Especially it would affect docList's for-loop processes, e.g. debugging, highlighting. It seems to me that this improvement is effective in environments with many CPUs. My test conditions: 1. Core i5 (2-core/4-thread), standalone Solr. 2. q=日本&debug=true&hl=true, other parameters are [here|https://github.com/anond2/simplesearch/blob/master/conf/solrconfig.xml#L836]. 3. 7171 hits / 12000 docs (taken from the ja.wikipedia dump). 4. Compared to trunk, parallel streams are a little faster. My query execution results (QTime):
{noformat}
== rows=10 ==
     trunk  patch
1st    236    146
2nd    179    100
3rd     79     72
4th     75     53
5th     91     80

== rows=50 ==
     trunk  patch
1st    485    325
2nd    225    243
3rd    199    151
4th    168    127
5th    149    118

== rows=100 ==
     trunk  patch
1st    948    607
2nd    472    390
3rd    237    201
4th    256    200
5th    224    178

== rows=500 ==
     trunk  patch
1st   3248   2826
2nd   1545   1067
3rd   1563    801
4th   1551    816
5th   1452    777
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6137) RussianLightStemmer incorrectly handles the words ending with 'ее'
[ https://issues.apache.org/jira/browse/LUCENE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258768#comment-14258768 ] Robert Muir commented on LUCENE-6137: - Can you propose your changes to http://members.unine.ch/jacques.savoy/clef/index.html? Like snowball, these are just implementations of those algorithms. They have done tests and written papers and so on, and can better evaluate these changes. RussianLightStemmer incorrectly handles the words ending with 'ее' -- Key: LUCENE-6137 URL: https://issues.apache.org/jira/browse/LUCENE-6137 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.10.2 Reporter: Alexander Sofronov Attachments: LUCENE-6137.patch Consider the forms of Russian word синий and the result returned by RussianLightStemmer: синий - син синяя - син синее - сине синие - син I think the correct result should be: синий - син синяя - син синее - син синие - син -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6137) RussianLightStemmer incorrectly handles the words ending with 'ее'
[ https://issues.apache.org/jira/browse/LUCENE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258787#comment-14258787 ] Uwe Schindler commented on LUCENE-6137: --- Hi, I also agree you should raise this issue at the CLEF people who invented that stemmer! I talked with my wife (she has russian as mother language) and she can confirm your problem with some *neutral* adjective forms, but - as expected - she can confirm that removing -ee is too risky, because this would change also superlatives (also using -ee), too, which is not intended to be done by a light stemmer. I think this might be threason not to remove -ee by default (this changes meaning). RussianLightStemmer incorrectly handles the words ending with 'ее' -- Key: LUCENE-6137 URL: https://issues.apache.org/jira/browse/LUCENE-6137 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.10.2 Reporter: Alexander Sofronov Attachments: LUCENE-6137.patch Consider the forms of Russian word синий and the result returned by RussianLightStemmer: синий - син синяя - син синее - сине синие - син I think the correct result should be: синий - син синяя - син синее - син синие - син -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
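The ambiguity discussed in the comments above can be shown with a toy example. This is not the real RussianLightStemmer code - just a hypothetical one-rule stripper - illustrating why blindly removing a trailing "ее" is risky: the same ending marks both the neuter adjective form and the comparative, so a rule that fixes "синее" also mangles comparatives like "сильнее" ("stronger"):

```java
public class EeSuffixDemo {
    // Toy rule: strip a trailing "ее" (Cyrillic) if present.
    // The real light stemmer applies a more conservative set of suffix rules.
    static String strip(String word) {
        return word.endsWith("ее") ? word.substring(0, word.length() - 2) : word;
    }

    public static void main(String[] args) {
        System.out.println(strip("синее"));   // neuter adjective: stems to the desired "син"
        System.out.println(strip("сильнее")); // comparative: also stripped, changing the meaning
    }
}
```

Both words lose the suffix, which is exactly the conflation the comment warns about.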
[jira] [Commented] (SOLR-6448) Improve SolrJ support for Collections API
[ https://issues.apache.org/jira/browse/SOLR-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258801#comment-14258801 ] Erick Erickson commented on SOLR-6448: -- BALANCESHARDUNIQUE works fine in 5.0; it correctly updates the cluster state, so I left it in 5.0. It may not be useful since the use-case I intended it for is REBALANCELEADERS, but it's still there. REBALANCELEADERS, OTOH, was pulled from 5.0. I put REBALANCELEADERS back into trunk last night (along with a bad svn EOL style, sh) and I'll back-port after 5.0 is cut. [~anshumg] There are a bunch of tests in TestReplicaProperties that might be useful models. Those tests exercise the functionality pretty thoroughly; maybe it would be enough for the SolrJ tests to do more minimal testing? Also, ReplicaPropertiesBase has some helper methods that you might find useful. Improve SolrJ support for Collections API - Key: SOLR-6448 URL: https://issues.apache.org/jira/browse/SOLR-6448 Project: Solr Issue Type: Improvement Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-6448.patch, SOLR-6448.patch, SOLR-6448.patch Right now SolrJ doesn't really support all of the Collections API. This is a parent issue for bringing SolrJ support for all APIs up to where it should be. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release 4.10.3 RC1
What happened with 4.10.3? There seems to be a downloadable version in the download archives, but the home page download link still talks about 4.10.2. Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/ On 15 December 2014 at 20:39, Mark Miller markrmil...@gmail.com wrote: This Vote has passed. I'll start the process tomorrow. - Mark http://about.me/markrmiller - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-5.x #798: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-5.x/798/

3 tests failed.

FAILED:  org.apache.solr.hadoop.MorphlineGoLiveMiniMRTest.org.apache.solr.hadoop.MorphlineGoLiveMiniMRTest

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
    at __randomizedtesting.SeedInfo.seed([2DB1F4A2C332384F]:0)
    at org.apache.lucene.util.TestRuleTemporaryFilesCleanup.before(TestRuleTemporaryFilesCleanup.java:105)
    at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.before(TestRuleAdapter.java:26)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:35)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
    at java.lang.Thread.run(Thread.java:745)

FAILED:  org.apache.solr.hadoop.MorphlineBasicMiniMRTest.testPathParts

Error Message:
Test abandoned because suite timeout was reached.

Stack Trace:
java.lang.Exception: Test abandoned because suite timeout was reached.
    at __randomizedtesting.SeedInfo.seed([CED3797B9FD1E475]:0)

FAILED:  org.apache.solr.hadoop.MorphlineBasicMiniMRTest.org.apache.solr.hadoop.MorphlineBasicMiniMRTest

Error Message:
Suite timeout exceeded (= 720 msec).

Stack Trace:
java.lang.Exception: Suite timeout exceeded (= 720 msec).
    at __randomizedtesting.SeedInfo.seed([CED3797B9FD1E475]:0)

Build Log:
[...truncated 54009 lines...]
BUILD FAILED
/usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:552: The following error occurred while executing this line:
/usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:204: The following error occurred while executing this line:
: Java returned: 1

Total time: 378 minutes 4 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.8.0) - Build # 1971 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/1971/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseG1GC (asserts: false)

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch

Error Message:
There were too many update fails - we expect it can happen, but shouldn't easily

Stack Trace:
java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily
    at __randomizedtesting.SeedInfo.seed([663E0C429B5621D6:E7D8825AEC0941EA]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertFalse(Assert.java:68)
    at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:223)
    at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
    at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Comment Edited] (LUCENE-6137) RussianLightStemmer incorrectly handles the words ending with 'ее'
[ https://issues.apache.org/jira/browse/LUCENE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258787#comment-14258787 ] Uwe Schindler edited comment on LUCENE-6137 at 12/25/14 5:43 PM: - Hi, I also agree you should raise this issue at the CLEF people who invented that stemmer! I talked with my wife (she has russian as mother language) and she can confirm your problem with some *neutral* adjective forms, but - as expected - she can confirm that removing -ee is too risky, because this would change also comparative form (also using -ee), too, which is not intended to be done by a light stemmer. I think this might be the reason not to remove -ee by default (this changes meaning). was (Author: thetaphi): Hi, I also agree you should raise this issue at the CLEF people who invented that stemmer! I talked with my wife (she has russian as mother language) and she can confirm your problem with some *neutral* adjective forms, but - as expected - she can confirm that removing -ee is too risky, because this would change also superlatives (also using -ee), too, which is not intended to be done by a light stemmer. I think this might be threason not to remove -ee by default (this changes meaning). RussianLightStemmer incorrectly handles the words ending with 'ее' -- Key: LUCENE-6137 URL: https://issues.apache.org/jira/browse/LUCENE-6137 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.10.2 Reporter: Alexander Sofronov Attachments: LUCENE-6137.patch Consider the forms of Russian word синий and the result returned by RussianLightStemmer: синий - син синяя - син синее - сине синие - син I think the correct result should be: синий - син синяя - син синее - син синие - син -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6448) Improve SolrJ support for Collections API
[ https://issues.apache.org/jira/browse/SOLR-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258831#comment-14258831 ] Anshum Gupta commented on SOLR-6448: Shalin, as Erick mentioned, REBALANCELEADERS was not in there but BALANCESHARDUNIQUE is. The list that I put up with my comment is wrong, as I never put RebalanceLeaders in there. The rest is all fine. Considering REBALANCELEADERS is only in trunk now, I'll leave it out for now, open another issue for just that, and handle it after 5x is out. Improve SolrJ support for Collections API - Key: SOLR-6448 URL: https://issues.apache.org/jira/browse/SOLR-6448 Project: Solr Issue Type: Improvement Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-6448.patch, SOLR-6448.patch, SOLR-6448.patch Right now SolrJ doesn't really support all of the Collections API. This is a parent issue for bringing SolrJ support for all APIs up to where it should be. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6840) Remove legacy solr.xml mode
[ https://issues.apache.org/jira/browse/SOLR-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258838#comment-14258838 ] Alexandre Rafalovitch commented on SOLR-6840: - Probably not caught by the tests, but the DIH example uses the legacy form of solr.xml. Somebody needs to add empty core.properties files, etc. The only issue is that one of the cores was declared as default; not sure if any examples rely on this. Remove legacy solr.xml mode --- Key: SOLR-6840 URL: https://issues.apache.org/jira/browse/SOLR-6840 Project: Solr Issue Type: Task Reporter: Steve Rowe Assignee: Erick Erickson Priority: Blocker Fix For: 5.0 Attachments: SOLR-6840.patch, SOLR-6840.patch On the [Solr Cores and solr.xml page|https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml], the Solr Reference Guide says: {quote} Starting in Solr 4.3, Solr will maintain two distinct formats for {{solr.xml}}, the _legacy_ and _discovery_ modes. The former is the format we have become accustomed to, in which all of the cores one wishes to define in a Solr instance are defined in {{solr.xml}} in {{<cores><core/>...<core/></cores>}} tags. This format will continue to be supported through the entire 4.x code line. As of Solr 5.0 this form of solr.xml will no longer be supported. Instead Solr will support _core discovery_. [...] The new core discovery mode structure for solr.xml will become mandatory as of Solr 5.0, see: Format of solr.xml. {quote} AFAICT, nothing has been done to remove legacy {{solr.xml}} mode from 5.0 or trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
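The fix the comment above asks for amounts to dropping a (possibly empty) core.properties marker file into each core's directory, which is how core discovery finds cores instead of reading `<core>` entries from solr.xml. A hypothetical sketch of creating that layout (the `solrhome`/`db` paths are invented for illustration, not the actual DIH example layout):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CoreDiscoveryLayout {
    public static void main(String[] args) throws IOException {
        // Stand-in for the solr home directory that core discovery scans.
        Path solrHome = Files.createTempDirectory("solrhome");
        // One core directory, e.g. <solr home>/db for a DIH example core.
        Path core = solrHome.resolve("db");
        Files.createDirectories(core);
        // An empty core.properties is enough to mark this directory as a core;
        // the core name defaults to the directory name when the file is empty.
        Files.createFile(core.resolve("core.properties"));
        System.out.println(Files.exists(core.resolve("core.properties")));
    }
}
```

Note the open question in the comment: core discovery has no direct equivalent of a core "declared as default" in legacy solr.xml, so any example relying on that would need adjusting.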
[jira] [Commented] (SOLR-6448) Improve SolrJ support for Collections API
[ https://issues.apache.org/jira/browse/SOLR-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258843#comment-14258843 ] Erick Erickson commented on SOLR-6448: -- Works for me, thanks! Improve SolrJ support for Collections API - Key: SOLR-6448 URL: https://issues.apache.org/jira/browse/SOLR-6448 Project: Solr Issue Type: Improvement Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-6448.patch, SOLR-6448.patch, SOLR-6448.patch Right now SolrJ doesn't really support all of the collections API. This is a parent issue for bringing SolrJ support for all APIs up to where it should be. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6840) Remove legacy solr.xml mode
[ https://issues.apache.org/jira/browse/SOLR-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258844#comment-14258844 ] Erick Erickson commented on SOLR-6840: -- [~romseygeek] I've actually got a pretty open week ahead; I can move this forward for a while if the patch is your most current. Or not, up to you. [~arafalov] Yeah, one of the tasks I have on my list is to grep for cores in all the files in the project to be sure they're all gone. Although the parsing code should raise errors if any are still in there. We'll see. Remove legacy solr.xml mode --- Key: SOLR-6840 URL: https://issues.apache.org/jira/browse/SOLR-6840 Project: Solr Issue Type: Task Reporter: Steve Rowe Assignee: Erick Erickson Priority: Blocker Fix For: 5.0 Attachments: SOLR-6840.patch, SOLR-6840.patch On the [Solr Cores and solr.xml page|https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml], the Solr Reference Guide says: {quote} Starting in Solr 4.3, Solr will maintain two distinct formats for {{solr.xml}}, the _legacy_ and _discovery_ modes. The former is the format we have become accustomed to, in which all of the cores one wishes to define in a Solr instance are defined in {{solr.xml}} in {{<cores><core/>...<core/></cores>}} tags. This format will continue to be supported through the entire 4.x code line. As of Solr 5.0 this form of solr.xml will no longer be supported. Instead Solr will support _core discovery_. [...] The new core discovery mode structure for solr.xml will become mandatory as of Solr 5.0, see: Format of solr.xml. {quote} AFAICT, nothing has been done to remove legacy {{solr.xml}} mode from 5.0 or trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 714 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/714/ 1 tests failed. FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: Error from server at http://127.0.0.1:65319/t_/vu: java.lang.NullPointerException Stack Trace: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error from server at http://127.0.0.1:65319/t_/vu: java.lang.NullPointerException at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:213) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testErrorHandling(CollectionsAPIDistributedZkTest.java:452) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:201) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at
[jira] [Commented] (SOLR-6127) Improve Solr's exampledocs data
[ https://issues.apache.org/jira/browse/SOLR-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258855#comment-14258855 ] ASF subversion and git services commented on SOLR-6127: --- Commit 1647918 from [~ehatcher] in branch 'dev/trunk' [ https://svn.apache.org/r1647918 ] SOLR-6127: Improve example docs, using films data Improve Solr's exampledocs data --- Key: SOLR-6127 URL: https://issues.apache.org/jira/browse/SOLR-6127 Project: Solr Issue Type: Improvement Components: documentation, scripts and tools Reporter: Varun Thacker Assignee: Erik Hatcher Fix For: 5.0, Trunk Attachments: LICENSE.txt, README.txt, README.txt, SOLR-6127.patch, film.csv, film.json, film.xml, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py Currently - The CSV example has 10 documents. - The JSON example has 4 documents. - The XML example has 32 documents. 1. We should have equal number of documents and the same documents in all the example formats 2. A data set which is slightly more comprehensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6127) Improve Solr's exampledocs data
[ https://issues.apache.org/jira/browse/SOLR-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258856#comment-14258856 ] Erik Hatcher commented on SOLR-6127: made first commit of this, to trunk. made some adjustments like renaming the generated files to plural (films, instead of film). this works well with the steps from the included README.txt. porting to 5x is a consideration, but for now we'll proceed with this on trunk and work on migrating to films instead of techproducts. Improve Solr's exampledocs data --- Key: SOLR-6127 URL: https://issues.apache.org/jira/browse/SOLR-6127 Project: Solr Issue Type: Improvement Components: documentation, scripts and tools Reporter: Varun Thacker Assignee: Erik Hatcher Fix For: 5.0, Trunk Attachments: LICENSE.txt, README.txt, README.txt, SOLR-6127.patch, film.csv, film.json, film.xml, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py, freebase_film_dump.py Currently - The CSV example has 10 documents. - The JSON example has 4 documents. - The XML example has 32 documents. 1. We should have equal number of documents and the same documents in all the example formats 2. A data set which is slightly more comprehensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues
[ https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258857#comment-14258857 ] Gopal Patwa commented on SOLR-5944: --- not sure if this patch is complete but it would be nice to have this in 5.0 Support updates of numeric DocValues Key: SOLR-5944 URL: https://issues.apache.org/jira/browse/SOLR-5944 Project: Solr Issue Type: New Feature Reporter: Ishan Chattopadhyaya Assignee: Shalin Shekhar Mangar Attachments: SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch LUCENE-5189 introduced support for updates to numeric docvalues. It would be really nice to have Solr support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-6138) ItalianLightStemmer
Massimo Pasquini created LUCENE-6138: Summary: ItalianLightStemmer Key: LUCENE-6138 URL: https://issues.apache.org/jira/browse/LUCENE-6138 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Affects Versions: 4.10.2 Reporter: Massimo Pasquini Priority: Minor I expect a stemmer to transform nouns in their singular and plural forms into a shorter common form. The implementation of the ItalianLightStemmer doesn't apply any stemming to words shorter then 6 characters in length. This leads to some annoying results: singular form | plural form 4|5 chars in length (no stemming) alga - alga | alghe - alghe fuga - fuga | fughe - fughe lega - lega | leghe - leghe 5|6 chars in length (stemming only on plural form) vanga - vanga | vanghe - vang verga - verga | verghe - verg I suppose that such limitation on words length is to avoid other side effects on shorter words not in the set above, but I think something must be reviewed in the code for better results. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
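The length cutoff described above can be illustrated with a toy sketch (a hypothetical simplification for this report, not the actual ItalianLightStemmer code): words under 6 characters are returned untouched, so 5-character plurals like alghe never reach the suffix-stripping step, while 6-character plurals like vanghe do.

```java
// Illustrative sketch of the reported behavior, NOT the real Lucene stemmer:
// suffix stripping is skipped entirely for words shorter than 6 characters.
public class ItalianLightStemSketch {
    public static String stem(String word) {
        if (word.length() < 6) return word;  // short word: no stemming at all
        // crude plural/ending stripping, just enough to mirror the examples
        if (word.endsWith("he")) return word.substring(0, word.length() - 2);
        if (word.endsWith("e") || word.endsWith("a")) return word.substring(0, word.length() - 1);
        return word;
    }
}
```

With this sketch, stem("alghe") stays "alghe" (5 chars, untouched) while stem("vanghe") becomes "vang", reproducing the asymmetry in the table above.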
[jira] [Updated] (LUCENE-6138) ItalianLightStemmer doesn't apply on words shorter then 6 chars in length
[ https://issues.apache.org/jira/browse/LUCENE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Massimo Pasquini updated LUCENE-6138: - Summary: ItalianLightStemmer doesn't apply on words shorter then 6 chars in length (was: ItalianLightStemmer) ItalianLightStemmer doesn't apply on words shorter then 6 chars in length - Key: LUCENE-6138 URL: https://issues.apache.org/jira/browse/LUCENE-6138 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Affects Versions: 4.10.2 Reporter: Massimo Pasquini Priority: Minor I expect a stemmer to transform nouns in their singular and plural forms into a shorter common form. The implementation of the ItalianLightStemmer doesn't apply any stemming to words shorter then 6 characters in length. This leads to some annoying results: singular form | plural form 4|5 chars in length (no stemming) alga - alga | alghe - alghe fuga - fuga | fughe - fughe lega - lega | leghe - leghe 5|6 chars in length (stemming only on plural form) vanga - vanga | vanghe - vang verga - verga | verghe - verg I suppose that such limitation on words length is to avoid other side effects on shorter words not in the set above, but I think something must be reviewed in the code for better results. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6435) Add script to simplify posting content to Solr
[ https://issues.apache.org/jira/browse/SOLR-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258898#comment-14258898 ] ASF subversion and git services commented on SOLR-6435: --- Commit 1647928 from [~ehatcher] in branch 'dev/trunk' [ https://svn.apache.org/r1647928 ] SOLR-6435: Add script to simplify posting content to Solr Add script to simplify posting content to Solr -- Key: SOLR-6435 URL: https://issues.apache.org/jira/browse/SOLR-6435 Project: Solr Issue Type: Improvement Components: scripts and tools Affects Versions: 4.10 Reporter: Erik Hatcher Assignee: Erik Hatcher Fix For: 5.0, Trunk Attachments: SOLR-6435.patch, SOLR-6435.patch Solr's SimplePostTool (example/exampledocs/post.jar) provides a very useful, simple way to get common types of content into Solr. With the new start scripts and the directory refactoring, let's move this tool to a first-class, non example script fronted tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6435) Add script to simplify posting content to Solr
[ https://issues.apache.org/jira/browse/SOLR-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258901#comment-14258901 ] Erik Hatcher commented on SOLR-6435: Put a simple stake in the ground on trunk with bin/post. TODO's: create comparable bin/post.cmd for Windows; centralize common environment (like Java and variables) across bin/solr and bin/post; merge this to branch_5x Add script to simplify posting content to Solr -- Key: SOLR-6435 URL: https://issues.apache.org/jira/browse/SOLR-6435 Project: Solr Issue Type: Improvement Components: scripts and tools Affects Versions: 4.10 Reporter: Erik Hatcher Assignee: Erik Hatcher Fix For: 5.0, Trunk Attachments: SOLR-6435.patch, SOLR-6435.patch Solr's SimplePostTool (example/exampledocs/post.jar) provides a very useful, simple way to get common types of content into Solr. With the new start scripts and the directory refactoring, let's move this tool to a first-class, non example script fronted tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6138) ItalianLightStemmer doesn't apply on words shorter then 6 chars in length
[ https://issues.apache.org/jira/browse/LUCENE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258906#comment-14258906 ] Erick Erickson commented on LUCENE-6138: I think the discussion at LUCENE-6137 applies here. ItalianLightStemmer doesn't apply on words shorter then 6 chars in length - Key: LUCENE-6138 URL: https://issues.apache.org/jira/browse/LUCENE-6138 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Affects Versions: 4.10.2 Reporter: Massimo Pasquini Priority: Minor I expect a stemmer to transform nouns in their singular and plural forms into a shorter common form. The implementation of the ItalianLightStemmer doesn't apply any stemming to words shorter then 6 characters in length. This leads to some annoying results: singular form | plural form 4|5 chars in length (no stemming) alga - alga | alghe - alghe fuga - fuga | fughe - fughe lega - lega | leghe - leghe 5|6 chars in length (stemming only on plural form) vanga - vanga | vanghe - vang verga - verga | verghe - verg I suppose that such limitation on words length is to avoid other side effects on shorter words not in the set above, but I think something must be reviewed in the code for better results. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6770) Add/edit param sets and use them in Requests
[ https://issues.apache.org/jira/browse/SOLR-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258911#comment-14258911 ] David Smiley commented on SOLR-6770: +1 Add/edit param sets and use them in Requests Key: SOLR-6770 URL: https://issues.apache.org/jira/browse/SOLR-6770 Project: Solr Issue Type: Sub-task Reporter: Noble Paul Assignee: Noble Paul Fix For: Trunk Attachments: SOLR-6770.patch, SOLR-6770.patch, SOLR-6770.patch Make it possible to define paramsets and use them directly in requests, for example: {code} curl http://localhost:8983/solr/collection1/config/params -H 'Content-type:application/json' -d '{ create : {x: { a:"A val", b:"B val"} }, update : {y: { x:"X val", Y:"Y val"} }, modify : {y: { x:"X val modified"} }, delete : "z" }' # do a GET to view all the configured params curl http://localhost:8983/solr/collection1/config/params # or GET with a specific name to get only one set of params curl http://localhost:8983/solr/collection1/config/params/x {code} This data will be stored in conf/params.json. It is applied at request time, so adding/editing params will not trigger a core reload and has no impact on performance. Example usage: http://localhost/solr/collection/select?useParams=x,y Or it can be configured directly on a request handler as follows: {code} <requestHandler name="/dump1" class="DumpRequestHandler" useParams="x"/> {code} {{useParams}} specified in the request overrides the one specified in {{requestHandler}}.
[jira] [Created] (LUCENE-6139) TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset
David Smiley created LUCENE-6139: Summary: TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset Key: LUCENE-6139 URL: https://issues.apache.org/jira/browse/LUCENE-6139 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: David Smiley The default highlighter has a TokenGroup class that is passed to Formatter.highlightTerm(). TokenGroup also has getStartOffset() and getEndOffset() methods that ostensibly return the start and end offsets into the original text of the current term. These getters aren't called by Lucene or Solr but they are made available and are useful to me. _The problem is that they return the wrong offsets when there are tokens at the same position._ I believe this was an oversight of LUCENE-627 in which these getters should have been updated but weren't. The fix is simple: return matchStartOffset and matchEndOffset from these getters, not startOffset and endOffset. I think this oversight would not have occurred if Highlighter didn't have package-access to TokenGroup's fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6137) RussianLightStemmer incorrectly handles the words ending with 'ее'
[ https://issues.apache.org/jira/browse/LUCENE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258922#comment-14258922 ] Alexander Sofronov commented on LUCENE-6137: OK, I sent email to Ljiljana Doalmic and Jacques Savoy. BTW, I found that Perl version of Russian stemmer (http://members.unine.ch/jacques.savoy/clef/russianStemmerPerl.txt) handles ее and ие endings properly. RussianLightStemmer incorrectly handles the words ending with 'ее' -- Key: LUCENE-6137 URL: https://issues.apache.org/jira/browse/LUCENE-6137 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.10.2 Reporter: Alexander Sofronov Attachments: LUCENE-6137.patch Consider the forms of Russian word синий and the result returned by RussianLightStemmer: синий - син синяя - син синее - сине синие - син I think the correct result should be: синий - син синяя - син синее - син синие - син -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6139) TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset
[ https://issues.apache.org/jira/browse/LUCENE-6139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258931#comment-14258931 ] David Smiley commented on LUCENE-6139: -- I propose that TokenGroup's fields become private and that Highlighter access them via its getters -- the ones it already has, actually; no need for more. This raises the question of whether the distinction between matchStartOffset and startOffset (and the end variants) serves any purpose. That is, toss startOffset (& endOffset), then rename matchStartOffset (& matchEndOffset) to startOffset (& endOffset). They aren't used, and I doubt others use them, because I think the offset info, when needed, is accessed at the end via TextFragment (populated from TokenGroup.matchStartOffset & matchEndOffset). FYI, I didn't go that route because I want *all* matches, and I found the custom Formatter approach more appealing than passing a very large numFragments, from an efficiency standpoint. h4. Unrelated questions about Highlighter Not directly related to this are a couple of burning questions I have about Highlighter: * Why oh why does Highlighter call formatter.highlightTerm for essentially *every* token? If TokenGroup.getTotalScore() is 0, I argue it shouldn't. All the built-in Fragmenters (and one I just wrote) start with a zero-score short-circuit. * Why does a 0-score fragment remain a part of the fragments priority queue; why isn't it tossed out when the fragment closes out? One might argue it's needless when numFragments (the size of the PQ) is small, but it'd be nice to ask for 'all' fragments/matches without a huge PQ even if there is just one real match. * Why is all text run through the encoder and appended to a newText StringBuilder, even when the fragment has no score? If there's no point, then it's a waste to do it and then not use it, as it won't be part of a returned fragment. 
Again, I think 0-score fragments should be immediately dropped, and newText should only be for the current fragment. TokenGroup.getStart|EndOffset should return matchStart|EndOffset not start|endOffset Key: LUCENE-6139 URL: https://issues.apache.org/jira/browse/LUCENE-6139 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: David Smiley The default highlighter has a TokenGroup class that is passed to Formatter.highlightTerm(). TokenGroup also has getStartOffset() and getEndOffset() methods that ostensibly return the start and end offsets into the original text of the current term. These getters aren't called by Lucene or Solr but they are made available and are useful to me. _The problem is that they return the wrong offsets when there are tokens at the same position._ I believe this was an oversight of LUCENE-627 in which these getters should have been updated but weren't. The fix is simple: return matchStartOffset and matchEndOffset from these getters, not startOffset and endOffset. I think this oversight would not have occurred if Highlighter didn't have package-access to TokenGroup's fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
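The zero-score short-circuit argued for in the comment above can be sketched with a minimal stand-in. MiniTokenGroup here is a hypothetical placeholder for Lucene's TokenGroup (only its getTotalScore() accessor is mirrored); the real callback is Formatter.highlightTerm(String, TokenGroup) in the highlighter module.

```java
// Sketch of a formatter that only decorates terms whose group actually
// matched; zero-score groups pass through unchanged, avoiding wasted work.
public class ZeroScoreSkipFormatter {
    // Stand-in for org.apache.lucene.search.highlight.TokenGroup.
    public static class MiniTokenGroup {
        private final float totalScore;
        public MiniTokenGroup(float totalScore) { this.totalScore = totalScore; }
        public float getTotalScore() { return totalScore; }
    }

    // Mirrors the Formatter.highlightTerm contract: return the original text
    // when the token group carries no score (i.e. the token was not a match).
    public static String highlightTerm(String originalText, MiniTokenGroup group) {
        if (group.getTotalScore() <= 0) return originalText;  // short-circuit
        return "<em>" + originalText + "</em>";
    }
}
```

The built-in SimpleHTMLFormatter applies the same score check before wrapping a term; the point of the comment above is that Highlighter itself could skip the call entirely for unmatched tokens.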
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2382 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2382/ 1 tests failed. REGRESSION: org.apache.solr.cloud.HttpPartitionTest.testDistribSearch Error Message: Didn't see all replicas for shard shard1 in c8n_1x2 come up within 3 ms! ClusterState: { control_collection:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:1, autoCreated:true, shards:{shard1:{ range:8000-7fff, state:active, replicas:{core_node1:{ base_url:http://127.0.0.1:10795;, node_name:127.0.0.1:10795_, core:collection1, state:active, leader:true, maxShardsPerNode:1}, c8n_1x2:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:2, shards:{shard1:{ range:8000-7fff, state:active, replicas:{ core_node1:{ base_url:http://127.0.0.1:10795;, node_name:127.0.0.1:10795_, core:c8n_1x2_shard1_replica1, state:active, leader:true}, core_node2:{ base_url:http://127.0.0.1:10812;, node_name:127.0.0.1:10812_, core:c8n_1x2_shard1_replica2, state:recovering, maxShardsPerNode:1}, collection1:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:1, autoCreated:true, shards:{ shard1:{ range:8000-, state:active, replicas:{core_node2:{ base_url:http://127.0.0.1:10806;, node_name:127.0.0.1:10806_, core:collection1, state:active, leader:true}}}, shard2:{ range:0-7fff, state:active, replicas:{ core_node1:{ base_url:http://127.0.0.1:10802;, node_name:127.0.0.1:10802_, core:collection1, state:active, leader:true}, core_node3:{ base_url:http://127.0.0.1:10812;, node_name:127.0.0.1:10812_, core:collection1, state:active, maxShardsPerNode:1}} Stack Trace: java.lang.AssertionError: Didn't see all replicas for shard shard1 in c8n_1x2 come up within 3 ms! 
ClusterState: { control_collection:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:1, autoCreated:true, shards:{shard1:{ range:8000-7fff, state:active, replicas:{core_node1:{ base_url:http://127.0.0.1:10795;, node_name:127.0.0.1:10795_, core:collection1, state:active, leader:true, maxShardsPerNode:1}, c8n_1x2:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:2, shards:{shard1:{ range:8000-7fff, state:active, replicas:{ core_node1:{ base_url:http://127.0.0.1:10795;, node_name:127.0.0.1:10795_, core:c8n_1x2_shard1_replica1, state:active, leader:true}, core_node2:{ base_url:http://127.0.0.1:10812;, node_name:127.0.0.1:10812_, core:c8n_1x2_shard1_replica2, state:recovering, maxShardsPerNode:1}, collection1:{ autoAddReplicas:false, router:{name:compositeId}, replicationFactor:1, autoCreated:true, shards:{ shard1:{ range:8000-, state:active, replicas:{core_node2:{ base_url:http://127.0.0.1:10806;, node_name:127.0.0.1:10806_, core:collection1, state:active, leader:true}}}, shard2:{ range:0-7fff, state:active, replicas:{ core_node1:{ base_url:http://127.0.0.1:10802;, node_name:127.0.0.1:10802_, core:collection1, state:active, leader:true}, core_node3:{ base_url:http://127.0.0.1:10812;, node_name:127.0.0.1:10812_, core:collection1, state:active, maxShardsPerNode:1}} at __randomizedtesting.SeedInfo.seed([AB63F3415CA9F9E7:2A857D592BF699DB]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.ensureAllReplicasAreActive(AbstractFullDistribZkTestBase.java:1940) at org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:247) at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_40-ea-b09) - Build # 11810 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/11810/ Java: 64bit/jdk1.8.0_40-ea-b09 -XX:-UseCompressedOops -XX:+UseG1GC (asserts: true) 1 tests failed. FAILED: org.apache.solr.cloud.OverseerStatusTest.testDistribSearch Error Message: Error from server at http://127.0.0.1:47564/bny: reload the collection time out:180s Stack Trace: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error from server at http://127.0.0.1:47564/bny: reload the collection time out:180s at __randomizedtesting.SeedInfo.seed([41BACA907B6A6EED:C05C44880C350ED1]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:213) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.invokeCollectionApi(AbstractFullDistribZkTestBase.java:1867) at org.apache.solr.cloud.OverseerStatusTest.doTest(OverseerStatusTest.java:103) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at