[JENKINS] Lucene-Solr-5.x-Linux (64bit/jdk1.8.0_20) - Build # 11471 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11471/ Java: 64bit/jdk1.8.0_20 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC (asserts: true) 9 tests failed. REGRESSION: org.apache.solr.schema.CurrencyFieldOpenExchangeTest.testMockExchangeRateProvider Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([B6F3FA906348E49:4462C45B02F413D]:0) at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:765) at org.apache.solr.util.TestHarness.getCore(TestHarness.java:209) at org.apache.solr.schema.AbstractCurrencyFieldTest.testMockExchangeRateProvider(AbstractCurrencyFieldTest.java:124) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[jira] [Commented] (SOLR-6625) HttpClient callback in HttpSolrServer
[ https://issues.apache.org/jira/browse/SOLR-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217591#comment-14217591 ] Per Steffensen commented on SOLR-6625: -- Sorry, I do not have the time to go into details. But from a quick reading, and as I remember the SOLR-4470 implementation, most/all of what you say sounds reasonable. Please note that SOLR-4470 hasn't been committed to the Apache Solr code-base. I provided it a long time ago, and just recently [~janhoy] and I updated it to fit the tip of trunk, and I know Jan intended to try to push it to the code-base. I do not know what happened after that. Please also note that in SOLR-4470 I tried to prepare for additional authentication types, but it is hard to make it 100% right when you do not know the nature of the actual types being implemented in the future. The essence is that {{AuthCredentials}} should carry [information about] the authentications to be used for the request(s). How to use them is an implementation detail of the specific authentication type (implementing {{AbstractAuthMethod}}), and it may require a little rearranging of code to implement authentication type #2. Basing it on a general callback feature sounds like a good idea. I believe in never designing for the future, but if I didn't at least try to sketch the idea in a framework, there is a big risk that the next authentication type would be implemented in a completely different way. I also believe in separation of concerns, so I would really like the authentication concern to be handled in one single place - {{AuthCredentials}} was my attempt to make such a place. HttpClient callback in HttpSolrServer - Key: SOLR-6625 URL: https://issues.apache.org/jira/browse/SOLR-6625 Project: Solr Issue Type: Improvement Components: SolrJ Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Attachments: SOLR-6625.patch, SOLR-6625.patch Some of our setups use Solr in a SPNego/kerberos setup (we've done this by adding our own filters to the web.xml). We have an issue in that SPNego requires a negotiation step, but some HttpSolrServer requests are not repeatable, notably the PUT/POST requests. So, what happens is, HttpSolrServer sends the request, the server responds with a negotiation request, and the request fails because the request is not repeatable. We've modified our code to send a repeatable request beforehand in these cases. It would be nicer if HttpSolrServer provided a pre/post callback when it was making an httpclient request. This would allow administrators to make changes to the request for authentication purposes, and would allow users to make per-request changes to the httpclient calls (i.e. modify the httpclient requestconfig to change the timeout on a per-request basis). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
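The pre/post callback is only described in prose in the issue; the following is a minimal Java sketch of what such a hook could look like. The interface name, method names, and the idea that HttpSolrServer would invoke them around each request are assumptions for illustration (neither SOLR-6625 nor SOLR-4470 defines this API); the per-request timeout example uses the standard HttpClient 4.3+ RequestConfig.
{code:java}
// Hypothetical sketch only: these names and signatures are not defined anywhere in SOLR-6625.
import org.apache.http.HttpResponse;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.HttpRequestBase;

/** Callback that HttpSolrServer could invoke around each HttpClient request (hypothetical). */
interface HttpClientRequestCallback {
    /** Called just before the request is executed; may mutate headers, config, etc. */
    void preRequest(HttpRequestBase request);

    /** Called after a response is received, before SolrJ parses it. */
    void postRequest(HttpRequestBase request, HttpResponse response);
}

/** Example use case from the description: change the timeout for a single request. */
class PerRequestTimeoutCallback implements HttpClientRequestCallback {
    private final int socketTimeoutMillis;

    PerRequestTimeoutCallback(int socketTimeoutMillis) {
        this.socketTimeoutMillis = socketTimeoutMillis;
    }

    @Override
    public void preRequest(HttpRequestBase request) {
        // A RequestConfig set on the request overrides the client-wide defaults.
        request.setConfig(RequestConfig.custom()
            .setSocketTimeout(socketTimeoutMillis)
            .build());
    }

    @Override
    public void postRequest(HttpRequestBase request, HttpResponse response) {
        // e.g. inspect WWW-Authenticate headers as part of a SPNego negotiation step.
    }
}
{code}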
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1263: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1263/ 2 tests failed. FAILED: org.apache.solr.hadoop.MorphlineBasicMiniMRTest.testPathParts Error Message: Test abandoned because suite timeout was reached. Stack Trace: java.lang.Exception: Test abandoned because suite timeout was reached. at __randomizedtesting.SeedInfo.seed([F4C2D6522278DA38]:0) FAILED: org.apache.solr.hadoop.MorphlineBasicMiniMRTest.org.apache.solr.hadoop.MorphlineBasicMiniMRTest Error Message: Suite timeout exceeded (= 720 msec). Stack Trace: java.lang.Exception: Suite timeout exceeded (= 720 msec). at __randomizedtesting.SeedInfo.seed([F4C2D6522278DA38]:0) Build Log: [...truncated 53113 lines...] BUILD FAILED /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:548: The following error occurred while executing this line: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:200: The following error occurred while executing this line: : Java returned: 1 Total time: 415 minutes 21 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_20) - Build # 11634 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/11634/ Java: 32bit/jdk1.8.0_20 -client -XX:+UseG1GC (asserts: false) 4 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.DisMaxRequestHandlerTest Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([3674920D63D7F469]:0) at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:767) at org.apache.solr.util.TestHarness.getCoreInc(TestHarness.java:219) at org.apache.solr.util.TestHarness.update(TestHarness.java:235) at org.apache.solr.util.BaseTestHarness.checkUpdateStatus(BaseTestHarness.java:282) at org.apache.solr.util.BaseTestHarness.validateUpdate(BaseTestHarness.java:252) at org.apache.solr.DisMaxRequestHandlerTest.beforeClass(DisMaxRequestHandlerTest.java:40) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:767) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.SolrCore.init(SolrCore.java:896) at 
org.apache.solr.core.SolrCore.init(SolrCore.java:653) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:510) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:274) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:268) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:535) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:519) at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:305) at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:230) at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:78) at
[JENKINS] Lucene-Solr-5.x-Windows (32bit/jdk1.8.0_40-ea-b09) - Build # 4335 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4335/ Java: 32bit/jdk1.8.0_40-ea-b09 -client -XX:+UseSerialGC (asserts: false) 6 tests failed. REGRESSION: org.apache.solr.cloud.RemoteQueryErrorTest.testDistribSearch Error Message: There are still nodes recoverying - waited for 15 seconds Stack Trace: java.lang.AssertionError: There are still nodes recoverying - waited for 15 seconds at __randomizedtesting.SeedInfo.seed([D66E1866D300EA27:5788967EA45F8A1B]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:178) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.waitForRecoveriesToFinish(AbstractFullDistribZkTestBase.java:840) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.waitForThingsToLevelOut(AbstractFullDistribZkTestBase.java:1459) at org.apache.solr.cloud.RemoteQueryErrorTest.doTest(RemoteQueryErrorTest.java:45) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217708#comment-14217708 ] Modassar Ather commented on LUCENE-5205: Thanks [~talli...@apache.org] for your response. I am using the SpanQueryParser and the fix for the query hanging issue from your github site, as provided in your comment. I tried using WhitespaceTokenizer. Will check with StandardAnalyzer too. With WhitespaceTokenizer, q=field:(SEARCH TOOLS PROVIDER & CONSULTING COMPANY) still gets transformed to the following: +spanNear([field:search, field:tools, field:provider, field:, field:consulting, field:company], 0, true) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). 
Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217708#comment-14217708 ] Modassar Ather edited comment on LUCENE-5205 at 11/19/14 10:45 AM: --- Thanks [~talli...@apache.org] for your response. I am using the SpanQueryParser and the fix for the query hanging issue from your github site, as provided in your comment. I am using WhitespaceTokenizer. With WhitespaceTokenizer, q=field:(SEARCH TOOLS PROVIDER & CONSULTING COMPANY) still gets transformed to the following: +spanNear([field:search, field:tools, field:provider, field:, field:consulting, field:company], 0, true) I am trying to find the possible cause of the removal of '&' in my config. was (Author: modassar): Thanks [~talli...@apache.org] for your response. I am using the SpanQueryParser and the fix for the query hanging issue from your github site, as provided in your comment. I tried using WhitespaceTokenizer. Will check with StandardAnalyzer too. With WhitespaceTokenizer, q=field:(SEARCH TOOLS PROVIDER & CONSULTING COMPANY) still gets transformed to the following: +spanNear([field:search, field:tools, field:provider, field:, field:consulting, field:company], 0, true) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. 
* Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217770#comment-14217770 ] Yonik Seeley commented on SOLR-6633: bq. Finally, I am not clear on what this really means: all fields go into the 'df'. Do we mean, there is a magic copyField or something? I'm not clear on this either... my best guess is that it is like a copyField. And all *values* (but not keys) are copied into this field? I'm not quite clear on mapUniqueKeyOnly either... (what the Only refers to). I guess if it's false, then all the fields in the JSON Object are mapped to Solr fields based on the key in the JSON? Oh, and when we have magic field names, the convention in Solr has been an underscore on both sides (or not at all). So can we use \_src\_ or src instead of _src please? let /update/json/docs store the source json as well --- Key: SOLR-6633 URL: https://issues.apache.org/jira/browse/SOLR-6633 Project: Solr Issue Type: Bug Reporter: Noble Paul Assignee: Noble Paul Labels: EaseOfUse Fix For: 5.0, Trunk Attachments: SOLR-6633.patch, SOLR-6633.patch it is a common requirement to store the entire JSON as a field in Solr. we can have an extra param srcField=field_name to specify the field name. The /update/json/docs is only useful when all the json fields are predefined or in schemaless mode. The better option would be to store the content in a store-only field and index the data in another field in other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- This means that the uniqueKeyField will be extracted from the fields and all fields
         go into the 'df' field. In this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
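For readers trying the configuration above: a minimal sketch of posting one JSON document to /update/json/docs so the raw JSON lands in the '_src' field, assuming a Solr instance at localhost:8983 with a core named collection1 and the initParams defaults shown above (host, port, core name, and the sample document are assumptions for illustration):
{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PostJsonDoc {
    public static void main(String[] args) throws Exception {
        // Arbitrary sample document; 'id' is assumed to be the schema's uniqueKey field.
        String json = "{\"id\":\"doc1\",\"title\":\"hello\",\"tags\":[\"a\",\"b\"]}";
        URL url = new URL("http://localhost:8983/solr/collection1/update/json/docs?commit=true");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        // With the quoted defaults, the uniqueKey is extracted, the verbatim JSON goes to _src,
        // and all other values are indexed into the 'df' field ('text' in that config).
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
{code}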
[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217793#comment-14217793 ] Tim Allison commented on LUCENE-5205: - Good to hear the github workaround works. If a committer has any interest in taking this on, it would be great to merge this into trunk...and then we could deprecate AnalyzingQueryParser, SurroundQueryParser and ComplexPhraseQueryParser just in time for 5.0. :) In pure Lucene, with a WhitespaceAnalyzer, the '&' is still making it through the parsing process. {noformat} spanNear([field:SEARCH, field:TOOLS, field:PROVIDER, field:&, field:CONSULTING, field:COMPANY], 0, true) {noformat} What filters are you applying? From your output, at least the LowerCaseFilterFactory, but anything else? [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). 
Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
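The difference between the two span queries in this thread comes down to how each analyzer tokenizes the '&'. A minimal, self-contained sketch of that difference follows (the no-arg analyzer constructors assume a recent Lucene, roughly 4.10+/5.x; older 4.x releases take a Version argument):
{code:java}
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class AmpersandTokens {
    static void dump(String label, Analyzer analyzer, String text) throws Exception {
        System.out.print(label + ":");
        try (TokenStream ts = analyzer.tokenStream("field", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                System.out.print(" [" + term + "]");
            }
            ts.end();
        }
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        String text = "SEARCH TOOLS PROVIDER & CONSULTING COMPANY";
        // Keeps '&' as a term (and does not lowercase), so the parser builds a span containing field:&.
        dump("WhitespaceAnalyzer", new WhitespaceAnalyzer(), text);
        // Drops the lone '&' and lowercases, so the term simply disappears from the span query.
        dump("StandardAnalyzer", new StandardAnalyzer(), text);
    }
}
{code}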
[jira] [Comment Edited] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217793#comment-14217793 ] Tim Allison edited comment on LUCENE-5205 at 11/19/14 11:58 AM: Good to hear the github workaround works. If a committer has any interest in taking this on, it would be great to merge this into trunk...and then we could deprecate AnalyzingQueryParser, SurroundQueryParser and ComplexPhraseQueryParser just in time for 5.0. :) In pure Lucene, with a WhitespaceAnalyzer, the '&' is still making it through the parsing process. {noformat} spanNear([field:SEARCH, field:TOOLS, field:PROVIDER, field:&, field:CONSULTING, field:COMPANY], 0, true) {noformat} When I use a StandardAnalyzer, the '&' is correctly dropped: {noformat} spanNear([field:search, field:tools, field:provider, field:consulting, field:company], 1, true) {noformat} What filters are you applying? From your output, at least the LowerCaseFilterFactory, but anything else? was (Author: talli...@mitre.org): Good to hear the github workaround works. If a committer has any interest in taking this on, it would be great to merge this into trunk...and then we could deprecate AnalyzingQueryParser, SurroundQueryParser and ComplexPhraseQueryParser just in time for 5.0. :) In pure Lucene, with a WhitespaceAnalyzer, the '&' is still making it through the parsing process. {noformat} spanNear([field:SEARCH, field:TOOLS, field:PROVIDER, field:&, field:CONSULTING, field:COMPANY], 0, true) {noformat} What filters are you applying? From your output, at least the LowerCaseFilterFactory, but anything else? [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. 
* Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This
[jira] [Updated] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5205: Description: This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. Until this is added to the Lucene project, I've added a standalone lucene-addons repo (with jars compiled for the latest stable build of Lucene) on [github|https://github.com/tballison/lucene-addons]. was: This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. 
Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that
[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217817#comment-14217817 ] Modassar Ather commented on LUCENE-5205: It is solr.PatternReplaceFilterFactory in my analyzer chain which is replacing '&' with blank. Thanks for sharing the above details. [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. 
Any and all feedback is welcome. Thank you. Until this is added to the Lucene project, I've added a standalone lucene-addons repo (with jars compiled for the latest stable build of Lucene) on [github|https://github.com/tballison/lucene-addons]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
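A minimal sketch of the interaction described in the last two comments: a pattern-replace step that rewrites '&' to an empty string changes the term text but does not remove the token, which is where the empty 'field:' term in the spanNear output comes from. The no-arg tokenizer constructor and setReader() assume a 5.x-style Lucene; in 4.x the tokenizer takes a Reader in its constructor:
{code:java}
import java.io.StringReader;
import java.util.regex.Pattern;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.pattern.PatternReplaceFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class EmptyTokenDemo {
    public static void main(String[] args) throws Exception {
        WhitespaceTokenizer tokenizer = new WhitespaceTokenizer();
        tokenizer.setReader(new StringReader("PROVIDER & CONSULTING"));
        // Same effect as a PatternReplaceFilterFactory configured with pattern="&" replacement="".
        try (TokenStream ts = new PatternReplaceFilter(tokenizer, Pattern.compile("&"), "", true)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                // Prints: [PROVIDER] [] [CONSULTING] -- note the zero-length middle token.
                System.out.print("[" + term + "] ");
            }
            ts.end();
        }
        System.out.println();
        // A length or stop filter placed after the replace step would drop the empty token.
    }
}
{code}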
[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217823#comment-14217823 ] Tim Allison commented on LUCENE-5205: - Ah, ok, so to confirm, no further action is required from me on the issue? Are you ok with single quotes becoming operators? Can you see a way of improving that behavior? [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. 
Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. Until this is added to the Lucene project, I've added a standalone lucene-addons repo (with jars compiled for the latest stable build of Lucene) on [github|https://github.com/tballison/lucene-addons]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-SmokeRelease-5.x - Build # 212 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-5.x/212/ No tests ran. Build Log: [...truncated 51672 lines...] prepare-release-no-sign: [mkdir] Created dir: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/dist [copy] Copying 446 files to /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/dist/lucene [copy] Copying 254 files to /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/dist/solr [smoker] Java 1.7 JAVA_HOME=/home/jenkins/tools/java/latest1.7 [smoker] NOTE: output encoding is US-ASCII [smoker] [smoker] Load release URL file:/usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/dist/... [smoker] [smoker] Test Lucene... [smoker] test basics... [smoker] get KEYS [smoker] 0.1 MB in 0.01 sec (15.6 MB/sec) [smoker] check changes HTML... [smoker] download lucene-5.0.0-src.tgz... [smoker] 27.8 MB in 0.04 sec (661.5 MB/sec) [smoker] verify md5/sha1 digests [smoker] download lucene-5.0.0.tgz... [smoker] 63.8 MB in 0.17 sec (376.3 MB/sec) [smoker] verify md5/sha1 digests [smoker] download lucene-5.0.0.zip... [smoker] 73.2 MB in 0.11 sec (675.7 MB/sec) [smoker] verify md5/sha1 digests [smoker] unpack lucene-5.0.0.tgz... [smoker] verify JAR metadata/identity/no javax.* or java.* classes... [smoker] test demo with 1.7... [smoker] got 5569 hits for query lucene [smoker] checkindex with 1.7... [smoker] check Lucene's javadoc JAR [smoker] unpack lucene-5.0.0.zip... [smoker] verify JAR metadata/identity/no javax.* or java.* classes... [smoker] test demo with 1.7... [smoker] got 5569 hits for query lucene [smoker] checkindex with 1.7... [smoker] check Lucene's javadoc JAR [smoker] unpack lucene-5.0.0-src.tgz... [smoker] make sure no JARs/WARs in src dist... [smoker] run ant validate [smoker] run tests w/ Java 7 and testArgs='-Dtests.jettyConnector=Socket -Dtests.multiplier=1 -Dtests.slow=false'... [smoker] test demo with 1.7... [smoker] got 206 hits for query lucene [smoker] checkindex with 1.7... [smoker] generate javadocs w/ Java 7... [smoker] [smoker] Crawl/parse... [smoker] [smoker] Verify... [smoker] confirm all releases have coverage in TestBackwardsCompatibility [smoker] find all past Lucene releases... [smoker] run TestBackwardsCompatibility.. [smoker] success! [smoker] [smoker] Test Solr... [smoker] test basics... [smoker] get KEYS [smoker] 0.1 MB in 0.00 sec (76.6 MB/sec) [smoker] check changes HTML... [smoker] download solr-5.0.0-src.tgz... [smoker] 34.1 MB in 0.07 sec (501.3 MB/sec) [smoker] verify md5/sha1 digests [smoker] download solr-5.0.0.tgz... [smoker] 146.4 MB in 0.26 sec (572.3 MB/sec) [smoker] verify md5/sha1 digests [smoker] download solr-5.0.0.zip... [smoker] 152.5 MB in 0.22 sec (698.3 MB/sec) [smoker] verify md5/sha1 digests [smoker] unpack solr-5.0.0.tgz... [smoker] verify JAR metadata/identity/no javax.* or java.* classes... [smoker] unpack lucene-5.0.0.tgz... 
[smoker] **WARNING**: skipping check of /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar: it has javax.* classes [smoker] **WARNING**: skipping check of /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar: it has javax.* classes [smoker] verify WAR metadata/contained JAR identity/no javax.* or java.* classes... [smoker] unpack lucene-5.0.0.tgz... [smoker] copying unpacked distribution for Java 7 ... [smoker] test solr example w/ Java 7... [smoker] start Solr instance (log=/usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0-java7/solr-example.log)... [smoker] No process found for Solr node running on port 8983 [smoker] starting Solr on port 8983 from /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0-java7 [smoker] Startup failed; see log /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0-java7/solr-example.log [smoker] [smoker] Starting Solr on port 8983 from
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217858#comment-14217858 ] Noble Paul commented on SOLR-6633: -- I hope you are all clear about the functionality/usecase. If the API/configuration needs change please suggest. bq.I'm not clear on this either... my best guess is that it is like a copyField. And all values (but not keys) are copied into this field? It is like a copyField but without a src field. bq.I'm not quite clear on mapUniqueKeyOnly either... (what the Only refers to). All the values are extracted and dumped into a field. But it ensures that a uniqueKey is created. They don't need to use this attribute at all. f=text:/**&f=uniqueKeyField:/unique-field-name should do the trick. Then, if the json does not have a value for uniqueKey it fails. bq.Oh, and when we have magic field names, the convention in Solr has been an underscore on both sides {{_src}} is not a magic field. It is explicitly added to the schema and it is explicitly specified here as well. let /update/json/docs store the source json as well --- Key: SOLR-6633 URL: https://issues.apache.org/jira/browse/SOLR-6633 Project: Solr Issue Type: Bug Reporter: Noble Paul Assignee: Noble Paul Labels: EaseOfUse Fix For: 5.0, Trunk Attachments: SOLR-6633.patch, SOLR-6633.patch it is a common requirement to store the entire JSON as a field in Solr. we can have an extra param srcField=field_name to specify the field name. The /update/json/docs is only useful when all the json fields are predefined or in schemaless mode. The better option would be to store the content in a store-only field and index the data in another field in other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- This means that the uniqueKeyField will be extracted from the fields and all fields
         go into the 'df' field. In this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
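A minimal sketch of the explicit-mapping alternative mentioned above, passing f= parameters instead of relying on the mapUniqueKeyOnly default. The schema field 'id', the JSON path '/docid', and the host/core name are assumptions used purely for illustration:
{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PostWithFieldMappings {
    public static void main(String[] args) throws Exception {
        String json = "{\"docid\":\"doc1\",\"title\":\"hello world\"}";
        // f=id:/docid  -> copy the value at JSON path /docid into the uniqueKey field 'id'
        // f=text:/**   -> index every other value into the catch-all 'text' field
        String params = "commit=true"
                + "&f=" + URLEncoder.encode("id:/docid", "UTF-8")
                + "&f=" + URLEncoder.encode("text:/**", "UTF-8");
        URL url = new URL("http://localhost:8983/solr/collection1/update/json/docs?" + params);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
{code}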
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217881#comment-14217881 ]
Alexandre Rafalovitch commented on SOLR-6633:
bq. They don't need to use this attribute at all; f=text:/**&f=uniqueKeyField:/unique-field-name should do the trick.
So, the advantage of not using the parameter syntax above is that it will automatically figure out what the uniqueKeyField is from the schema? Similar to the UUID URP? But what happens if somebody specifies both? Do we get double content in text? Can we also use the params to populate other fields anyway? (I guess yes.) And what happens if the original JSON is super fat, can we specify exclusion rules? I bet this will be asked too. We don't have to implement it, but will it fit into the current model? I like the feature; I am just trying to make sure it does not cause confusion through multiplication of options. In my own mind, when I was thinking about this use case (store the original JSON), I imagined an URP that just pulls the original JSON from the request. Again, similar to the UUID URP, one can add it into the chain.

let /update/json/docs store the source json as well
---
Key: SOLR-6633  URL: https://issues.apache.org/jira/browse/SOLR-6633  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul  Labels: EaseOfUse  Fix For: 5.0, Trunk  Attachments: SOLR-6633.patch, SOLR-6633.patch
It is a common requirement to store the entire JSON as a field in Solr. We can have an extra param srcField=field_name to specify the field name. The /update/json/docs handler is only useful when all the json fields are predefined or in schemaless mode; the better option would be to store the content in a store-only field and index the data in another field in the other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field;
         in this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217912#comment-14217912 ]
Noble Paul commented on SOLR-6633:
bq. So, the advantage of not using the parameter syntax above is that it will automatically figure out what the uniqueKeyField is from the schema? Similar to the UUID URP?
Yes and no. I want this to work seamlessly even if the uniqueKey is changed, without mucking about in solrconfig.xml. I also want it to just work when there is no uniqueKey present in the json. Basically, out of the box, it should just work for any json. I hate to tell newbies that they need to edit solrconfig.xml to just get anything working. I would recommend this only if you are a newbie. I should document in place how to do the explicit mappings with wildcards.
bq. And what happens if the original JSON is super fat, can we specify exclusion rules?
No, there are only inclusion rules, but the syntax is quite powerful to achieve that.
bq. Again, similar to UUID URP one can add into the chain.
URP just fails the simplicity test. It is extremely hard for even experts to get their heads around. I HATE the fact that we recommend hard-to-do configuration to everyone. If we want to get first-time users on board we will need to stop all that. First-time users just need stuff to work.

let /update/json/docs store the source json as well
---
Key: SOLR-6633  URL: https://issues.apache.org/jira/browse/SOLR-6633  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul  Labels: EaseOfUse  Fix For: 5.0, Trunk  Attachments: SOLR-6633.patch, SOLR-6633.patch
It is a common requirement to store the entire JSON as a field in Solr. We can have an extra param srcField=field_name to specify the field name. The /update/json/docs handler is only useful when all the json fields are predefined or in schemaless mode; the better option would be to store the content in a store-only field and index the data in another field in the other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field;
         in this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
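For the explicit-mapping route mentioned above (wildcards instead of relying on mapUniqueKeyOnly), a request might look roughly like this. The field names, JSON paths, host and core are invented for illustration; only the f=text:/** style of mapping comes from the comments on this issue.
{code}
# hypothetical explicit mapping: dump every value into 'text' and map the uniqueKey from the /id path
curl 'http://localhost:8983/solr/collection1/update/json/docs?f=text:/**&f=id:/id&commit=true' \
  -H 'Content-type:application/json' \
  -d '{"id":"book1","title":"Some Title"}'
{code}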
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217918#comment-14217918 ]
Alexandre Rafalovitch commented on SOLR-6633:
We agree completely on the *newbie* message. I am just trying to make sure it is clear how this fits into the rest of Solr without creating a jarring jump between step 1 and step 2. So, to be clear, this covers step 1. Then, for step 2 (e.g. _and now to handle dates_), this connects smoothly to what? To *f=dateField:/xyz* and schemaless mode? To an explicit creation of a date field/type in an Admin UI?

let /update/json/docs store the source json as well
---
Key: SOLR-6633  URL: https://issues.apache.org/jira/browse/SOLR-6633  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul  Labels: EaseOfUse  Fix For: 5.0, Trunk  Attachments: SOLR-6633.patch, SOLR-6633.patch
It is a common requirement to store the entire JSON as a field in Solr. We can have an extra param srcField=field_name to specify the field name. The /update/json/docs handler is only useful when all the json fields are predefined or in schemaless mode; the better option would be to store the content in a store-only field and index the data in another field in the other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field;
         in this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1946 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1946/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseSerialGC (asserts: true) 8 tests failed. REGRESSION: org.apache.solr.TestTrie.testTrieDoubleRangeSearch Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:765) at org.apache.solr.util.TestHarness.getCoreInc(TestHarness.java:219) at org.apache.solr.util.TestHarness.update(TestHarness.java:235) at org.apache.solr.util.BaseTestHarness.checkUpdateStatus(BaseTestHarness.java:282) at org.apache.solr.util.BaseTestHarness.validateUpdate(BaseTestHarness.java:252) at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:677) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:656) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:650) at org.apache.solr.TestTrie.testTrieDoubleRangeSearch(TestTrie.java:142) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217922#comment-14217922 ]
Noble Paul commented on SOLR-6633:
bq. But what happens if somebody specifies both? Do we get double content in text?
No, mapUniqueKeyOnly overrides other field definitions.

let /update/json/docs store the source json as well
---
Key: SOLR-6633  URL: https://issues.apache.org/jira/browse/SOLR-6633  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul  Labels: EaseOfUse  Fix For: 5.0, Trunk  Attachments: SOLR-6633.patch, SOLR-6633.patch
It is a common requirement to store the entire JSON as a field in Solr. We can have an extra param srcField=field_name to specify the field name. The /update/json/docs handler is only useful when all the json fields are predefined or in schemaless mode; the better option would be to store the content in a store-only field and index the data in another field in the other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field;
         in this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well
[ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217939#comment-14217939 ]
Noble Paul commented on SOLR-6633:
bq. for step 2 (e.g. and now to handle dates) this connects smoothly to what? To a f=dateField:/xyz and schemaless mode? To an explicit creation of a date field/type in an Admin UI?
This is not within the scope of this feature. Actually, the objective of this was to introduce {{srcField}} only. Then I realized that it needed to do more to achieve the objective.

let /update/json/docs store the source json as well
---
Key: SOLR-6633  URL: https://issues.apache.org/jira/browse/SOLR-6633  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul  Labels: EaseOfUse  Fix For: 5.0, Trunk  Attachments: SOLR-6633.patch, SOLR-6633.patch
It is a common requirement to store the entire JSON as a field in Solr. We can have an extra param srcField=field_name to specify the field name. The /update/json/docs handler is only useful when all the json fields are predefined or in schemaless mode; the better option would be to store the content in a store-only field and index the data in another field in the other modes. The relevant section in solrconfig.xml:
{code:xml}
<initParams path="/update/json/docs">
  <lst name="defaults">
    <!-- this ensures that the entire json doc will be stored verbatim into one field -->
    <str name="srcField">_src</str>
    <!-- the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field;
         in this config df is already configured to be 'text' -->
    <str name="mapUniqueKeyOnly">true</str>
    <str name="df">text</str>
  </lst>
</initParams>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-6570) Run SolrZkClient session watch asynchronously
[ https://issues.apache.org/jira/browse/SOLR-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-6570: - Assignee: Mark Miller Run SolrZkClient session watch asynchronously - Key: SOLR-6570 URL: https://issues.apache.org/jira/browse/SOLR-6570 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Spin off from SOLR-6261. This kind of already happens because the only session watcher in {{ConnectionManager}} does it's processing async (changed in SOLR-5615), but this is more consistent and avoids the possibility that a second session watcher or a change to that code re-surfaces the issue again. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6533) Support editing common solrconfig.xml values
[ https://issues.apache.org/jira/browse/SOLR-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217968#comment-14217968 ]
ASF subversion and git services commented on SOLR-6533:
Commit 1640564 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1640564 ] SOLR-6533: Fixes the formatting for the CHANGES entry ...

Support editing common solrconfig.xml values
Key: SOLR-6533  URL: https://issues.apache.org/jira/browse/SOLR-6533  Project: Solr  Issue Type: Sub-task  Reporter: Noble Paul  Attachments: SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch, SOLR-6533.patch
There are a bunch of properties in solrconfig.xml which users want to edit. We will attack them first. These properties will be persisted to a separate file called config.json (or whatever file). Instead of saving in the same format, we will have well-known properties which users can directly edit:
{code}
updateHandler.autoCommit.maxDocs
query.filterCache.initialSize
{code}
The API will be modeled around the bulk schema API:
{code:javascript}
curl http://localhost:8983/solr/collection1/config -H 'Content-type:application/json' -d '{
  "set-property"   : {"updateHandler.autoCommit.maxDocs": 5},
  "unset-property" : "updateHandler.autoCommit.maxDocs"
}'
{code}
{code:javascript}
// or use this to set ${mypropname} values
curl http://localhost:8983/solr/collection1/config -H 'Content-type:application/json' -d '{
  "set-user-property"   : {"mypropname": "my_prop_val"},
  "unset-user-property" : "mypropname"
}'
{code}
The values stored in the config.json will always take precedence and will be applied after loading solrconfig.xml. An HTTP GET on the /config path will give the real config that is applied. An HTTP GET on /config/overlay gives out the content of the configOverlay.json. /config/component-name gives only the child of the same name from /config.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
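As a companion to the write examples in the description, reading the applied configuration back would presumably use the paths named there; the host and core name below are illustrative.
{code}
# the real (effective) config after the overlay is applied
curl http://localhost:8983/solr/collection1/config
# only the overlay, i.e. the properties edited through the API
curl http://localhost:8983/solr/collection1/config/overlay
# a single child of the config, e.g. just the updateHandler section
curl http://localhost:8983/solr/collection1/config/updateHandler
{code}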
[jira] [Commented] (SOLR-6570) Run SolrZkClient session watch asynchronously
[ https://issues.apache.org/jira/browse/SOLR-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217976#comment-14217976 ] ASF subversion and git services commented on SOLR-6570: --- Commit 1640566 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1640566 ] SOLR-6570: Run SolrZkClient session watch asynchronously. Run SolrZkClient session watch asynchronously - Key: SOLR-6570 URL: https://issues.apache.org/jira/browse/SOLR-6570 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Spin off from SOLR-6261. This kind of already happens because the only session watcher in {{ConnectionManager}} does it's processing async (changed in SOLR-5615), but this is more consistent and avoids the possibility that a second session watcher or a change to that code re-surfaces the issue again. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6570) Run SolrZkClient session watch asynchronously
[ https://issues.apache.org/jira/browse/SOLR-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217977#comment-14217977 ] ASF subversion and git services commented on SOLR-6570: --- Commit 1640568 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640568 ] SOLR-6570: Run SolrZkClient session watch asynchronously. Run SolrZkClient session watch asynchronously - Key: SOLR-6570 URL: https://issues.apache.org/jira/browse/SOLR-6570 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Spin off from SOLR-6261. This kind of already happens because the only session watcher in {{ConnectionManager}} does it's processing async (changed in SOLR-5615), but this is more consistent and avoids the possibility that a second session watcher or a change to that code re-surfaces the issue again. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6758) Solr node to node communication errors w/ SSL + client auth
liuqibj created SOLR-6758: - Summary: Solr node to node communication errors w/ SSL + client auth Key: SOLR-6758 URL: https://issues.apache.org/jira/browse/SOLR-6758 Project: Solr Issue Type: Bug Reporter: liuqibj Enable two solr servers SSL w/ client auth(JSSE) and then change solr.xml to use 8443 port and change zookeeper to use https instead of http. then starting the two solr server and found these errors. Any suggestions? ERROR - 2014-11-19 12:55:31.125; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://9.12.11.9:8443/solr/collection2_shard1_replica2 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:308) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:804) Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://9.12.11.9::8443/solr/collection2_shard1_replica2 at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:507) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156) at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:118) at java.util.concurrent.FutureTask.run(FutureTask.java:273) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482) at java.util.concurrent.FutureTask.run(FutureTask.java:273) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626) ... 
1 more Caused by: javax.net.ssl.SSLHandshakeException: com.ibm.jsse2.util.j: PKIX path building failed: java.security.cert.CertPathBuilderException: PKIXCertPathBuilderImpl could not build a valid CertPath.; internal cause is: java.security.cert.CertPathValidatorException: The certificate issued by CN=testing, L=lt, ST=mass, OU=test, O=IBM test, C=ZQ is not trusted; internal cause is: java.security.cert.CertPathValidatorException: Certificate chaining error at com.ibm.jsse2.j.a(j.java:36) at com.ibm.jsse2.qc.a(qc.java:199) at com.ibm.jsse2.ab.a(ab.java:171) at com.ibm.jsse2.ab.a(ab.java:180) at com.ibm.jsse2.bb.a(bb.java:346) at com.ibm.jsse2.bb.a(bb.java:559) at com.ibm.jsse2.ab.r(ab.java:554) at com.ibm.jsse2.ab.a(ab.java:325) at com.ibm.jsse2.qc.a(qc.java:617) at com.ibm.jsse2.qc.h(qc.java:103) at com.ibm.jsse2.qc.a(qc.java:166) at com.ibm.jsse2.qc.startHandshake(qc.java:649) at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:533) at
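Not part of the report itself, but a "PKIX path building failed ... Certificate chaining error" like the one above usually means the JVM making the request does not trust the certificate chain presented by the other node. One common remedy is to import the issuing certificate into the truststore each Solr JVM uses; the alias, file names and password below are placeholders.
{code}
# import the issuing CA certificate into a truststore (names and password are placeholders)
keytool -importcert -trustcacerts -alias solr-ca -file ca.crt -keystore solr-trust.jks -storepass changeit

# then point the Solr JVM at that truststore via the standard JSSE system properties
-Djavax.net.ssl.trustStore=/path/to/solr-trust.jks -Djavax.net.ssl.trustStorePassword=changeit
{code}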
[jira] [Resolved] (SOLR-6570) Run SolrZkClient session watch asynchronously
[ https://issues.apache.org/jira/browse/SOLR-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-6570. --- Resolution: Fixed Fix Version/s: Trunk 5.0 Thanks Ramkumar! Run SolrZkClient session watch asynchronously - Key: SOLR-6570 URL: https://issues.apache.org/jira/browse/SOLR-6570 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk Spin off from SOLR-6261. This kind of already happens because the only session watcher in {{ConnectionManager}} does it's processing async (changed in SOLR-5615), but this is more consistent and avoids the possibility that a second session watcher or a change to that code re-surfaces the issue again. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-6500) Refactor FileFetcher in SnapPuller, add debug logging
[ https://issues.apache.org/jira/browse/SOLR-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-6500: - Assignee: Mark Miller Refactor FileFetcher in SnapPuller, add debug logging - Key: SOLR-6500 URL: https://issues.apache.org/jira/browse/SOLR-6500 Project: Solr Issue Type: Improvement Components: replication (java), SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk I was debugging some replication slowness and felt the need for some debug statements in this code path, which then pointed me to a lot of repeated code between local fs and directory file fetching logic in SnapPuller (for which there was a TODO as well), so went ahead and refactored that as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6500) Refactor FileFetcher in SnapPuller, add debug logging
[ https://issues.apache.org/jira/browse/SOLR-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-6500: -- Fix Version/s: Trunk 5.0 Refactor FileFetcher in SnapPuller, add debug logging - Key: SOLR-6500 URL: https://issues.apache.org/jira/browse/SOLR-6500 Project: Solr Issue Type: Improvement Components: replication (java), SolrCloud Reporter: Ramkumar Aiyengar Assignee: Mark Miller Priority: Minor Fix For: 5.0, Trunk I was debugging some replication slowness and felt the need for some debug statements in this code path, which then pointed me to a lot of repeated code between local fs and directory file fetching logic in SnapPuller (for which there was a TODO as well), so went ahead and refactored that as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217994#comment-14217994 ] Steve Rowe commented on LUCENE-5317: bq. I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message. For the record, I'm sure that this is user error. I've successfully used {{trunk}} for the base directory in the past - what did you use? [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217994#comment-14217994 ] Steve Rowe edited comment on LUCENE-5317 at 11/19/14 3:10 PM: -- bq. I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message. For the record, I'm sure that this is user error. I've successfully used {{trunk}} (literally just that) for the base directory in the past - what did you use? was (Author: steve_rowe): bq. I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message. For the record, I'm sure that this is user error. I've successfully used {{trunk}} for the base directory in the past - what did you use? [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6752) Buffer Cache allocate/lost should be exposed through JMX
[ https://issues.apache.org/jira/browse/SOLR-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218010#comment-14218010 ]
Mark Miller commented on SOLR-6752:
bq. SOLR-6752 Buffer Cache allocate/lost should be exposed through JMX
I'm a little confused by the title - isn't this simply exposing hdfs block cache metrics via JMX? Why anything specific about allocate/lost? I see all sorts of stats in the getStatistics call. We also probably want to try and align some of the stat key names with other cache objects in Solr: the query cache, filter cache, etc. Also, I don't believe these will get registered with the jmx server. I think only the top level class for a plugin is by default - eg the HdfsDirectoryFactory itself.

Buffer Cache allocate/lost should be exposed through JMX
Key: SOLR-6752  URL: https://issues.apache.org/jira/browse/SOLR-6752  Project: Solr  Issue Type: Bug  Reporter: Mike Drob  Assignee: Mark Miller  Labels: metrics  Attachments: SOLR-6752.patch
Currently, {{o.a.s.store.blockcache.Metrics}} has fields for tracking buffer allocations and losses, but they are never updated nor exposed to a receiving metrics system. We should do both.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors
Simon Endele created SOLR-6759:
Summary: ExpandComponent does not call finish() on DelegatingCollectors
Key: SOLR-6759  URL: https://issues.apache.org/jira/browse/SOLR-6759  Project: Solr  Issue Type: Bug  Reporter: Simon Endele
We have a PostFilter for ACL filtering in action that has a similar structure to CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all documents and finally calls delegate.collect() for all docs in its finish() method. In contrast to CollapsingQParserPlugin, our PostFilter is also called by the ExpandComponent (on purpose). But as the finish() method is never called by the ExpandComponent, the expand section in the result is always empty. Tested with Solr 4.10.2.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors
[ https://issues.apache.org/jira/browse/SOLR-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Simon Endele updated SOLR-6759:
Attachment: ExpandComponent.java.patch
I'm not a Solr expert, but if I understand the code right, this can be fixed with a few lines. Added a patch. Seems to work for us.

ExpandComponent does not call finish() on DelegatingCollectors
Key: SOLR-6759  URL: https://issues.apache.org/jira/browse/SOLR-6759  Project: Solr  Issue Type: Bug  Reporter: Simon Endele  Attachments: ExpandComponent.java.patch
We have a PostFilter for ACL filtering in action that has a similar structure to CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all documents and finally calls delegate.collect() for all docs in its finish() method. In contrast to CollapsingQParserPlugin, our PostFilter is also called by the ExpandComponent (on purpose). But as the finish() method is never called by the ExpandComponent, the expand section in the result is always empty. Tested with Solr 4.10.2.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
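The pattern described in this issue, buffering documents in collect() and only forwarding them to the delegate when finish() is called, is why a component that never invokes finish() ends up with an empty result. The sketch below is a deliberately simplified, self-contained illustration of that pattern; it does not use the real Solr DelegatingCollector or PostFilter signatures.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the delegating-collector pattern; not the actual Solr API.
public class BufferingCollectorSketch {

    interface Sink { void collect(int docId); }

    /** Buffers every doc and only forwards to the delegate in finish(). */
    static class BufferingCollector implements Sink {
        private final Sink delegate;
        private final List<Integer> buffered = new ArrayList<>();

        BufferingCollector(Sink delegate) { this.delegate = delegate; }

        @Override
        public void collect(int docId) {
            buffered.add(docId);          // nothing reaches the delegate yet
        }

        void finish() {
            for (int docId : buffered) {  // the step ExpandComponent never triggers
                delegate.collect(docId);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<>();
        BufferingCollector collector = new BufferingCollector(results::add);
        for (int doc = 0; doc < 5; doc++) {
            collector.collect(doc);
        }
        System.out.println("before finish(): " + results); // [] -> looks like an empty expand section
        collector.finish();
        System.out.println("after finish():  " + results); // [0, 1, 2, 3, 4]
    }
}
{code}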
[jira] [Resolved] (SOLR-6610) In stateFormat=2, ZkController.publishAndWaitForDownStates always times out
[ https://issues.apache.org/jira/browse/SOLR-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-6610. -- Resolution: Fixed Fix Version/s: Trunk 5.0 In stateFormat=2, ZkController.publishAndWaitForDownStates always times out --- Key: SOLR-6610 URL: https://issues.apache.org/jira/browse/SOLR-6610 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Labels: solrcloud Fix For: 5.0, Trunk Attachments: SOLR-6610.patch Using stateFormat=2, our solr always takes a while to start up and spits out this warning line: {quote} WARN - 2014-10-08 17:30:24.290; org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes published as DOWN in our cluster state. {quote} Looking at the code, this is probably because ZkController.publishAndWaitForDownStates is called in ZkController.init, which gets called via ZkContainer.initZookeeper in CoreContainer.load before any of the stateFormat=2 collection watches are set in the CoreContainer.preRegisterInZk call a few lines later. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6760) New optimized DistributedQueue implementation for overseer
Noble Paul created SOLR-6760:
Summary: New optimized DistributedQueue implementation for overseer
Key: SOLR-6760  URL: https://issues.apache.org/jira/browse/SOLR-6760  Project: Solr  Issue Type: Bug  Reporter: Noble Paul  Assignee: Noble Paul
Currently the DQ works as follows:
* read all items in the directory
* sort them all
* take the head and return it
This works well when we have only a handful of items in the queue. If the number of items in the queue is much larger (in the tens of thousands), this is counterproductive.
As the overseer queue is a multiple-producers + single-consumer queue, we can read them all in bulk, and before processing each item just do a zk.exists(itemname); if all is well we don't need to do the fetch-all + sort thing again.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6760) New optimized DistributedQueue implementation for overseer
[ https://issues.apache.org/jira/browse/SOLR-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6760: - Description: Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it This works well when we have only a handful of items in the Queue. If the items in the queue is much larger (in tens of thousands) , this is counterproductive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again was: Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it This works well when we have only a handful of items in the Queue. If the items in the queue is much larger in tens of thousands, this is counterprodcutive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again New optimized DistributedQueue implementation for overseer -- Key: SOLR-6760 URL: https://issues.apache.org/jira/browse/SOLR-6760 Project: Solr Issue Type: Bug Reporter: Noble Paul Assignee: Noble Paul Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it This works well when we have only a handful of items in the Queue. If the items in the queue is much larger (in tens of thousands) , this is counterproductive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6586) JmxMonitoredMap#getAttribute is not very efficient.
[ https://issues.apache.org/jira/browse/SOLR-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218055#comment-14218055 ] ASF subversion and git services commented on SOLR-6586: --- Commit 1640582 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1640582 ] SOLR-6747: Add an optional caching option as a workaround for SOLR-6586. JmxMonitoredMap#getAttribute is not very efficient. --- Key: SOLR-6586 URL: https://issues.apache.org/jira/browse/SOLR-6586 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6761) Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.
Timothy Potter created SOLR-6761: Summary: Ability to ignore commit and optimize requests from clients when running in SolrCloud mode. Key: SOLR-6761 URL: https://issues.apache.org/jira/browse/SOLR-6761 Project: Solr Issue Type: New Feature Components: SolrCloud, SolrJ Reporter: Timothy Potter In most SolrCloud environments, it's advisable to only rely on auto-commits (soft and hard) configured in solrconfig.xml and not send explicit commit requests from client applications. In fact, I've seen cases where improperly coded client applications can send commit requests too frequently, which can lead to harming the cluster's health. As a system administrator, I'd like the ability to disallow commit requests from client applications. Ideally, I could configure the updateHandler to ignore the requests and return an HTTP response code of my choosing as I may not want to break existing client applications by returning an error. In other words, I may want to just return 200 vs. 405. The same goes for optimize requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
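Nothing like this exists yet, since the issue is the feature request itself, but purely to illustrate what the requested knob might look like in solrconfig.xml, something along these lines; the ignoreClientCommits/ignoreClientOptimizes elements and their attribute are invented for this sketch and are not real Solr settings.
{code:xml}
<!-- hypothetical sketch only; these ignore* elements do not exist at the time of this issue -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- swallow explicit commit/optimize requests from clients and answer with the configured status -->
  <ignoreClientCommits responseCode="200"/>
  <ignoreClientOptimizes responseCode="200"/>
  <!-- rely on auto commits instead -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
{code}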
[jira] [Commented] (SOLR-6747) Add an optional caching option as a workaround for SOLR-6586.
[ https://issues.apache.org/jira/browse/SOLR-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218054#comment-14218054 ] ASF subversion and git services commented on SOLR-6747: --- Commit 1640582 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1640582 ] SOLR-6747: Add an optional caching option as a workaround for SOLR-6586. Add an optional caching option as a workaround for SOLR-6586. - Key: SOLR-6747 URL: https://issues.apache.org/jira/browse/SOLR-6747 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 5.0, Trunk Attachments: SOLR-6747.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6760) New optimized DistributedQueue implementation for overseer
[ https://issues.apache.org/jira/browse/SOLR-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6760: - Description: Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it and discard everything else * rinse and repeat This works well when we have only a handful of items in the Queue. If the items in the queue is much larger (in tens of thousands) , this is counterproductive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again was: Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it This works well when we have only a handful of items in the Queue. If the items in the queue is much larger (in tens of thousands) , this is counterproductive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again New optimized DistributedQueue implementation for overseer -- Key: SOLR-6760 URL: https://issues.apache.org/jira/browse/SOLR-6760 Project: Solr Issue Type: Bug Reporter: Noble Paul Assignee: Noble Paul Currently the DQ works as follows * read all items in the directory * sort them all * take the head and return it and discard everything else * rinse and repeat This works well when we have only a handful of items in the Queue. If the items in the queue is much larger (in tens of thousands) , this is counterproductive As the overseer queue is a multiple producers + single consumer queue, We can read them all in bulk and before processing each item , just do a zk.exists(itemname) and if all is well we don't need to do the fetch all + sort thing again -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
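A rough sketch of the bulk-read approach described above, using the plain ZooKeeper client API: list and sort the children once, then just confirm each entry still exists before handling it instead of re-listing and re-sorting the whole directory for every item. The queue path, the processing step and the delete-on-success behaviour are illustrative assumptions, not the actual patch.
{code:java}
import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

// Sketch: read the queue children in bulk, then do a cheap zk.exists() per item before processing it.
public class BulkQueuePeekSketch {

    private final ZooKeeper zk;
    private final String queuePath; // e.g. "/overseer/queue" (illustrative)

    public BulkQueuePeekSketch(ZooKeeper zk, String queuePath) {
        this.zk = zk;
        this.queuePath = queuePath;
    }

    public void drainOnce() throws KeeperException, InterruptedException {
        // one listing + one sort for the whole batch, instead of once per consumed item
        List<String> children = zk.getChildren(queuePath, false);
        Collections.sort(children);

        for (String child : children) {
            String path = queuePath + "/" + child;
            // cheap check: the entry may already be gone (e.g. removed by a previous pass or a restart)
            if (zk.exists(path, false) == null) {
                continue;
            }
            byte[] data = zk.getData(path, false, null);
            process(data);       // placeholder for whatever the overseer does with the message
            zk.delete(path, -1); // remove the processed entry
        }
    }

    private void process(byte[] data) {
        // placeholder: deserialize and act on the queue message here
    }
}
{code}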
[jira] [Commented] (SOLR-6586) JmxMonitoredMap#getAttribute is not very efficient.
[ https://issues.apache.org/jira/browse/SOLR-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218072#comment-14218072 ] ASF subversion and git services commented on SOLR-6586: --- Commit 1640587 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640587 ] SOLR-6747: Add an optional caching option as a workaround for SOLR-6586. JmxMonitoredMap#getAttribute is not very efficient. --- Key: SOLR-6586 URL: https://issues.apache.org/jira/browse/SOLR-6586 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6752) Buffer Cache allocate/lost should be exposed through JMX
[ https://issues.apache.org/jira/browse/SOLR-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218077#comment-14218077 ]
Mike Drob commented on SOLR-6752:
bq. I'm a little confused by the title - isn't this simply exposing hdfs block cache metrics via JMX? Why anything specific about allocate/lost? I see all sorts of stats in the getStatistics call.
The buffer allocation/lost metrics are currently not exposed at all. When moving everything from {{metricsRecord.setMetric}} to {{stats.add}} in the metrics method, these lines are completely new, instead of being converted over.
{noformat}
+ stats.add("buffercache.allocations", getPerSecond(shardBuffercacheAllocate.getAndSet(0), seconds));
+ stats.add("buffercache.lost", getPerSecond(shardBuffercacheLost.getAndSet(0), seconds));
{noformat}
bq. We also probably want to try and align some of the stat key names with other cache objects in Solr: the query cache, filter cache, etc.
This makes sense, I'll look into it.
bq. Also, I don't believe these will get registered with the jmx server. I think only the top level class for a plugin is by default - eg the HdfsDirectoryFactory itself.
Not sure I understand this. Where should I look to make sure this is getting registered?

Buffer Cache allocate/lost should be exposed through JMX
Key: SOLR-6752  URL: https://issues.apache.org/jira/browse/SOLR-6752  Project: Solr  Issue Type: Bug  Reporter: Mike Drob  Assignee: Mark Miller  Labels: metrics  Attachments: SOLR-6752.patch
Currently, {{o.a.s.store.blockcache.Metrics}} has fields for tracking buffer allocations and losses, but they are never updated nor exposed to a receiving metrics system. We should do both.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
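Regarding the registration question, one quick way to see what actually gets registered is to list the MBeans the Solr JVM exposes and look for the cache entries. The domain filter below is a guess; the exact object names depend on how the core's JMX reporting is configured.
{code:java}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Lists MBeans in the current JVM; run inside the Solr JVM or adapt it to a remote JMX connection.
public class ListSolrMBeansSketch {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        for (ObjectName name : server.queryNames(null, null)) {
            // crude filter: Solr's JMX beans normally live under a domain starting with "solr"
            if (name.getDomain().startsWith("solr")) {
                System.out.println(name);
            }
        }
    }
}
{code}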
[jira] [Assigned] (SOLR-6762) bin/solr create_collection not working due to not finding configName but the configSet exists
[ https://issues.apache.org/jira/browse/SOLR-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-6762: Assignee: Timothy Potter bin/solr create_collection not working due to not finding configName but the configSet exists - Key: SOLR-6762 URL: https://issues.apache.org/jira/browse/SOLR-6762 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter On trunk, when doing: bin/solr create_collection -n foo2 -c sample_techproducts_configs The collection cannot be created because of: Connecting to ZooKeeper at localhost:9983 Uploading /Users/timpotter/dev/lw/projects/solr_trunk_co/solr/server/solr/configsets/sample_techproducts_configs/conf for config sample_techproducts_configs to ZooKeeper at localhost:9983 Creating new collection 'foo2' using command: http://192.168.1.2:7574/solr/admin/collections?action=CREATEname=foo2numShards=1replicationFactor=1maxShardsPerNode=1configSet=sample_techproducts_configs { responseHeader:{ status:0, QTime:16121}, failure:{:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'foo2_shard1_replica1': Unable to create core [foo2_shard1_replica1] Caused by: Could not find configName for collection foo2 found:[data_driven_schema_configs, sample_techproducts_configs]}} The bin/solr create_collection command uploads the sample_techproducts_configs configset but the create collection seems to require the configName instead. Logs: INFO - 2014-11-19 16:14:38.001; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Get the message id:/overseer/collection-queue-work/qn-14 message:{ operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; OverseerCollectionProcessor.processMessage : create , { operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; Could not obtain config name INFO - 2014-11-19 16:14:38.003; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; building a new collection: foo2 INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Create collection foo2 with shards [shard1] INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; state version foo2 1 INFO - 2014-11-19 16:14:38.006; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... 
(live nodes size: 2) INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating SolrCores for new collection foo2, shardNames [shard1] , replicationFactor : 1 INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating shard foo2_shard1_replica1 as part of slice shard1 of collection foo2 on 192.168.1.2:8983_solr INFO - 2014-11-19 16:14:38.106; org.apache.solr.handler.admin.CoreAdminHandler; core create command numShards=1shard=shard1name=foo2_shard1_replica1action=CREATEcollection=foo2wt=javabinqt=/admin/coresversion=2 INFO - 2014-11-19 16:14:38.107; org.apache.solr.cloud.ZkController; publishing core=foo2_shard1_replica1 state=down collection=foo2 INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.ZkController; look for our core node name INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.109; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state numShards=1 message={ operation:state, numShards:1, shard:shard1, roles:null, state:down, core:foo2_shard1_replica1, collection:foo2, node_name:192.168.1.2:8983_solr, base_url:http://192.168.1.2:8983/solr} INFO - 2014-11-19 16:14:38.211; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 2) INFO - 2014-11-19 16:14:39.108; org.apache.solr.cloud.ZkController; waiting to find shard id in clusterstate for foo2_shard1_replica1 INFO - 2014-11-19 16:14:39.108; org.apache.solr.cloud.ZkController; Check for
[jira] [Resolved] (SOLR-6747) Add an optional caching option as a workaround for SOLR-6586.
[ https://issues.apache.org/jira/browse/SOLR-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-6747. --- Resolution: Fixed Add an optional caching option as a workaround for SOLR-6586. - Key: SOLR-6747 URL: https://issues.apache.org/jira/browse/SOLR-6747 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 5.0, Trunk Attachments: SOLR-6747.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6762) bin/solr create_collection not working due to not finding configName but the configSet exists
Timothy Potter created SOLR-6762: Summary: bin/solr create_collection not working due to not finding configName but the configSet exists Key: SOLR-6762 URL: https://issues.apache.org/jira/browse/SOLR-6762 Project: Solr Issue Type: Bug Reporter: Timothy Potter On trunk, when doing: bin/solr create_collection -n foo2 -c sample_techproducts_configs The collection cannot be created because of: Connecting to ZooKeeper at localhost:9983 Uploading /Users/timpotter/dev/lw/projects/solr_trunk_co/solr/server/solr/configsets/sample_techproducts_configs/conf for config sample_techproducts_configs to ZooKeeper at localhost:9983 Creating new collection 'foo2' using command: http://192.168.1.2:7574/solr/admin/collections?action=CREATEname=foo2numShards=1replicationFactor=1maxShardsPerNode=1configSet=sample_techproducts_configs { responseHeader:{ status:0, QTime:16121}, failure:{:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'foo2_shard1_replica1': Unable to create core [foo2_shard1_replica1] Caused by: Could not find configName for collection foo2 found:[data_driven_schema_configs, sample_techproducts_configs]}} The bin/solr create_collection command uploads the sample_techproducts_configs configset but the create collection seems to require the configName instead. Logs: INFO - 2014-11-19 16:14:38.001; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Get the message id:/overseer/collection-queue-work/qn-14 message:{ operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; OverseerCollectionProcessor.processMessage : create , { operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; Could not obtain config name INFO - 2014-11-19 16:14:38.003; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; building a new collection: foo2 INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Create collection foo2 with shards [shard1] INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; state version foo2 1 INFO - 2014-11-19 16:14:38.006; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... 
(live nodes size: 2) INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating SolrCores for new collection foo2, shardNames [shard1] , replicationFactor : 1 INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating shard foo2_shard1_replica1 as part of slice shard1 of collection foo2 on 192.168.1.2:8983_solr INFO - 2014-11-19 16:14:38.106; org.apache.solr.handler.admin.CoreAdminHandler; core create command numShards=1shard=shard1name=foo2_shard1_replica1action=CREATEcollection=foo2wt=javabinqt=/admin/coresversion=2 INFO - 2014-11-19 16:14:38.107; org.apache.solr.cloud.ZkController; publishing core=foo2_shard1_replica1 state=down collection=foo2 INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.ZkController; look for our core node name INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.109; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state numShards=1 message={ operation:state, numShards:1, shard:shard1, roles:null, state:down, core:foo2_shard1_replica1, collection:foo2, node_name:192.168.1.2:8983_solr, base_url:http://192.168.1.2:8983/solr} INFO - 2014-11-19 16:14:38.211; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 2) INFO - 2014-11-19 16:14:39.108; org.apache.solr.cloud.ZkController; waiting to find shard id in clusterstate for foo2_shard1_replica1 INFO - 2014-11-19 16:14:39.108; org.apache.solr.cloud.ZkController; Check for collection zkNode:foo2 INFO - 2014-11-19 16:14:39.109; org.apache.solr.cloud.ZkController; Creating collection in ZooKeeper:foo2 INFO - 2014-11-19 16:14:39.109; org.apache.solr.cloud.ZkController; Looking for collection configName INFO - 2014-11-19 16:14:39.109; org.apache.solr.cloud.ZkController; Could not find collection configName - pausing for 3 seconds
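The fix referenced further down this thread switches from a configSet parameter to the Collections API's collection.configName parameter. As a rough illustration of the corrected CREATE call, here is a self-contained, JDK-only sketch; the host, port, collection name and config name are placeholders lifted from the report, and this is not the bin/solr script's actual code:

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class CreateCollectionSketch {
  public static void main(String[] args) throws Exception {
    String solrBase = "http://localhost:8983/solr";       // placeholder node address
    String collection = "foo2";                            // names taken from the report
    String configName = "sample_techproducts_configs";     // configset already uploaded to ZK

    // The Collections API wants collection.configName; sending configSet is what failed above.
    String url = solrBase + "/admin/collections?action=CREATE"
        + "&name=" + URLEncoder.encode(collection, "UTF-8")
        + "&numShards=1&replicationFactor=1&maxShardsPerNode=1"
        + "&collection.configName=" + URLEncoder.encode(configName, "UTF-8");

    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestMethod("GET");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // responseHeader/status tells you whether CREATE succeeded
      }
    }
  }
}
{code}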
[jira] [Commented] (SOLR-6747) Add an optional caching option as a workaround for SOLR-6586.
[ https://issues.apache.org/jira/browse/SOLR-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218071#comment-14218071 ] ASF subversion and git services commented on SOLR-6747: --- Commit 1640587 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640587 ] SOLR-6747: Add an optional caching option as a workaround for SOLR-6586. Add an optional caching option as a workaround for SOLR-6586. - Key: SOLR-6747 URL: https://issues.apache.org/jira/browse/SOLR-6747 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 5.0, Trunk Attachments: SOLR-6747.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 687 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/687/ 88 tests failed. REGRESSION: org.apache.solr.BasicFunctionalityTest.testRequestHandlerBaseException Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([B7DC30A9F14F5AF1:A508121957E39394]:0) at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:765) at org.apache.solr.util.TestHarness.getCore(TestHarness.java:209) at org.apache.solr.util.TestHarness$LocalRequestFactory.makeRequest(TestHarness.java:422) at org.apache.solr.SolrTestCaseJ4.req(SolrTestCaseJ4.java:1004) at org.apache.solr.BasicFunctionalityTest.testRequestHandlerBaseException(BasicFunctionalityTest.java:401) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
[jira] [Commented] (SOLR-6691) REBALANCELEADERS needs to change the leader election queue.
[ https://issues.apache.org/jira/browse/SOLR-6691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218097#comment-14218097 ] Erick Erickson commented on SOLR-6691: -- [~noble.paul] [~markrmiller] OK, I'm working this out (slowly). Here's the deal though. I don't see a graceful way of telling a node that is _currently_ a leader to stop being leader. Oh, and it must re-insert itself at the end of the leader-elector queue. I don't really want to down the node, that seems far too harsh, but perhaps it's not. Also, here's what I'm trying at this point (I'll improve it if necessary before committing the patch): for leader rebalancing, basically just delete the leader ephemeral election node. I'd like the leader node itself to do this but I don't yet see a clean way to inform the current leader it should abdicate that role. I can do this from anywhere, but it seems cleaner if the core itself does it. 1. Each node is watching the one before it. So when the leader ephemeral node disappears, the next node gets the event and looks through the queue to see if some _other_ node is preferred leader. If so, it puts itself at the end of the leader election queue and does _not_ become leader. But it does remove its own ephemeral node so the next node in the chain gets that event and so on. 1a. I'm having trouble having the leader that's abdicating get the message that it should abdicate its role. I'm trying to have the leader watch its own ephemeral node, is there a better way? Note that the only place this really produces churn is when a BALANCESHARDUNIQUE is issued and then immediately a REBALANCELEADERS is issued. Otherwise, when cores are loaded, if they are the preferred leader they insert themselves at the head of the leader-elector queue, so REBALANCELEADERS in that case shouldn't cause any unnecessary churn. As you can tell, I'm a bit stymied; I'll plug along but wondered if there's some prior art I haven't found yet. Thanks! REBALANCELEADERS needs to change the leader election queue. --- Key: SOLR-6691 URL: https://issues.apache.org/jira/browse/SOLR-6691 Project: Solr Issue Type: Bug Reporter: Erick Erickson Assignee: Erick Erickson The original code (SOLR-6517) assumed that changes in the clusterstate after issuing a command to the overseer to change the leader indicated that the leader was successfully changed. Fortunately, Noble clued me in that this isn't the case and that the potential leader needs to insert itself in the leader election queue before triggering the change leader command. Inserting itself at the front of the queue should probably happen in BALANCESHARDUNIQUE when the preferredLeader property is assigned as well. [~noble.paul] Do evil things happen if a node joins at the head but it's _already_ in the queue? These ephemeral nodes in the queue are watching each other. So if node1 is the leader you have node1 - node2 - node3 - node4 where - means watches. Now, if node3 puts itself at the head of the list, you have {code} node1 - node2 - node3 - node4 {code} I _think_ when I was looking at this it all just worked. 1. Node 1 goes down. Nodes 2 and 3 duke it out but there's code to ensure that node3 becomes the leader and node2 inserts itself at the end so it's watching node 4. 2. Node 2 goes down, nobody gets notified and it doesn't matter. 3. Node 3 goes down, node 4 gets notified and starts watching node 2 by inserting itself at the end of the list. 4. Node 4 goes down, nobody gets notified and it doesn't matter. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
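For readers trying to picture the queue mechanics discussed above: a minimal sketch of the "each node watches the one before it" chain using the raw ZooKeeper client. This is not Solr's LeaderElector; the election path, node naming and error handling are simplified assumptions.

{code:java}
import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

/** Minimal "each node watches the one before it" election chain. */
public class ElectionChainSketch {

  private final ZooKeeper zk;
  private final String electionPath; // e.g. /collections/foo/leader_elect/shard1/election
  private String myNode;             // our ephemeral-sequential node name, e.g. n_0000000003

  public ElectionChainSketch(ZooKeeper zk, String electionPath) {
    this.zk = zk;
    this.electionPath = electionPath;
  }

  /** Joins at the end of the queue and starts watching the node ahead of us. */
  public void join() throws Exception {
    String path = zk.create(electionPath + "/n_", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    myNode = path.substring(path.lastIndexOf('/') + 1);
    checkQueue();
  }

  private void checkQueue() throws Exception {
    List<String> children = zk.getChildren(electionPath, false);
    Collections.sort(children);
    int me = children.indexOf(myNode);
    if (me == 0) {
      System.out.println(myNode + " is at the head of the queue: become leader");
      return;
    }
    // Watch only the node immediately ahead of us; when it vanishes we re-check the queue.
    String ahead = electionPath + "/" + children.get(me - 1);
    Watcher onPredecessorGone = new Watcher() {
      @Override public void process(WatchedEvent event) {
        try {
          checkQueue(); // maybe we're leader now, maybe we just watch someone else
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      }
    };
    if (zk.exists(ahead, onPredecessorGone) == null) {
      checkQueue(); // predecessor vanished between getChildren() and exists(): re-check now
    }
  }
}
{code}

The sketch only ever joins at the end of the queue; jumping to the head, and getting the current leader to abdicate, is exactly the part still being worked out in the comment above.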
[jira] [Commented] (SOLR-6365) specify appends, defaults, invariants outside of the component
[ https://issues.apache.org/jira/browse/SOLR-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218121#comment-14218121 ] ASF subversion and git services commented on SOLR-6365: --- Commit 1640594 from [~noble.paul] in branch 'dev/trunk' [ https://svn.apache.org/r1640594 ] SOLR-6365 implicit requesthandler (specifiedin code) takes lower precedence over initParams specify appends, defaults, invariants outside of the component --- Key: SOLR-6365 URL: https://issues.apache.org/jira/browse/SOLR-6365 Project: Solr Issue Type: Improvement Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0, Trunk Attachments: SOLR-6365-crappy-test.patch, SOLR-6365.patch, SOLR-6365.patch, SOLR-6365.patch The components are configured in solrconfig.xml mostly for specifying these extra parameters. If we separate these out, we can avoid specifying the components altogether and make solrconfig much simpler. Eventually we want users to see all functions as paths instead of components and control these params from outside , through an API and persisted in ZK objectives : * define standard components implicitly and let users override some params only * reuse standard params across components * define multiple param sets and mix and match these params at request time example {code:xml} !-- use json for all paths and _txt as the default search field-- initParams name=global path=/** lst name=defaults str name=wtjson/str str name=df_txt/str /lst /initParams {code} other examples {code:xml} initParams name=a path=/dump3,/root/*,/root1/** lst name=defaults str name=aA/str /lst lst name=invariants str name=bB/str /lst lst name=appends str name=cC/str /lst /initParams requestHandler name=/dump3 class=DumpRequestHandler/ requestHandler name=/dump4 class=DumpRequestHandler/ requestHandler name=/root/dump5 class=DumpRequestHandler/ requestHandler name=/root1/anotherlevel/dump6 class=DumpRequestHandler/ requestHandler name=/dump1 class=DumpRequestHandler initParams=a/ requestHandler name=/dump2 class=DumpRequestHandler initParams=a lst name=defaults str name=aA1/str /lst lst name=invariants str name=bB1/str /lst lst name=appends str name=cC1/str /lst /requestHandler {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
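The flattened {code:xml} samples above configure three parameter sections. As a standalone illustration of how defaults, appends and invariants combine at request time, here is a sketch that mirrors the usual SolrParams wrapping order; it is not the initParams implementation itself, and the parameter names simply echo the example config:

{code:java}
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;

public class InitParamsMergeSketch {
  public static void main(String[] args) {
    // What the client actually sent.
    ModifiableSolrParams request = new ModifiableSolrParams();
    request.set("wt", "xml");
    request.set("b", "clientValue");

    // The three sections an initParams (or requestHandler) block can carry.
    ModifiableSolrParams defaults = new ModifiableSolrParams();
    defaults.set("wt", "json");
    defaults.set("df", "_txt");

    ModifiableSolrParams appends = new ModifiableSolrParams();
    appends.set("c", "C");

    ModifiableSolrParams invariants = new ModifiableSolrParams();
    invariants.set("b", "B");

    // Defaults fill gaps, appends accumulate, invariants always win.
    SolrParams effective = SolrParams.wrapDefaults(request, defaults);
    effective = SolrParams.wrapAppended(effective, appends);
    effective = SolrParams.wrapDefaults(invariants, effective);

    System.out.println("wt = " + effective.get("wt")); // xml  (client value overrides the default)
    System.out.println("df = " + effective.get("df")); // _txt (default applies)
    System.out.println("b  = " + effective.get("b"));  // B    (invariant overrides the client)
    System.out.println("c  = " + effective.get("c"));  // C    (appended)
  }
}
{code}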
[jira] [Assigned] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-4799: Assignee: Noble Paul SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Assignee: Noble Paul Priority: Minor Labels: DIH, dataimportHandler, dih Attachments: SOLR-4799.patch DIH is mostly considered a playground tool, and real usages end up with SolrJ. I want to contribute a few improvements targeting DIH performance. This one provides a performant approach for joining SQL entities with minimal memory, in contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by its PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins the ordered result sets with a ‘zipper’ algorithm. Do you think it’s worth contributing to DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
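The "zipper" in this description is a streaming merge join over two result sets that SQL has already sorted. A tiny self-contained sketch, with plain arrays standing in for the parent and child JDBC cursors (no DIH classes involved):

{code:java}
/** One-pass "zipper" join of a PK-ordered parent stream with a parent_id-ordered child stream. */
public class ZipperJoinSketch {

  public static void main(String[] args) {
    // Stand-ins for two JDBC result sets, each already ORDER BY'd in SQL:
    //   parents ordered by id, children ordered by parent_id.
    int[] parentIds = {1, 2, 3};
    int[] childParentIds = {1, 1, 3};
    String[] childSkus = {"a", "b", "c"};

    int c = 0; // the child cursor only ever moves forward, so memory stays O(1)
    for (int pk : parentIds) {
      // Skip child rows whose parent key is behind us (orphans or already-consumed keys).
      while (c < childParentIds.length && childParentIds[c] < pk) {
        c++;
      }
      // Emit every child that matches the current parent.
      while (c < childParentIds.length && childParentIds[c] == pk) {
        System.out.println("parent " + pk + " <- child " + childSkus[c]);
        c++;
      }
    }
  }
}
{code}

Because both cursors only move forward, memory use stays constant regardless of table size, which is the contrast with CachedSqlEntityProcessor drawn above.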
[jira] [Created] (SOLR-6763) Shard leader election thread can persist across connection loss
Alan Woodward created SOLR-6763: --- Summary: Shard leader election thread can persist across connection loss Key: SOLR-6763 URL: https://issues.apache.org/jira/browse/SOLR-6763 Project: Solr Issue Type: Bug Reporter: Alan Woodward A ZK connection loss during a call to ElectionContext.waitForReplicasToComeUp() will result in two leader election processes for the shard running within a single node - the initial election that was waiting, and another spawned by the ReconnectStrategy. After the function returns, the first election will create an ephemeral leader node. The second election will then also attempt to create this node, fail, and try to put itself into recovery. It will also set the 'isLeader' value in its CloudDescriptor to false. The first election, meanwhile, is happily maintaining the ephemeral leader node. But any updates that are sent to the shard will cause an exception due to the mismatch between the cloudstate (where this node is the leader) and the local CloudDescriptor leader state. I think the fix is straightforward - the call to zkClient.getChildren() in waitForReplicasToComeUp should be made with 'retryOnReconnect=false', rather than 'true' as it is currently, because once the connection has dropped we're going to launch a new election process anyway. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
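A rough sketch of the waiting loop Alan describes, assuming SolrZkClient#getChildren takes a retry-on-connection-loss boolean as its third argument (the issue calls the flag 'retryOnReconnect'); the method shape, names and timing below are illustrative, not the actual ElectionContext.waitForReplicasToComeUp code:

{code:java}
import java.util.List;

import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.zookeeper.KeeperException;

/** Rough sketch of a wait loop that does NOT silently survive a ZK reconnect. */
public class WaitForReplicasSketch {

  /** Returns true once enough replica entries show up, false if this election should be abandoned. */
  static boolean waitForReplicas(SolrZkClient zkClient, String shardPath,
                                 int expectedReplicas, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      try {
        // retry flag = false: if the connection drops, fail here instead of resuming after the
        // reconnect, because the reconnect strategy will have started its own election anyway.
        List<String> replicas = zkClient.getChildren(shardPath, null, false);
        if (replicas.size() >= expectedReplicas) {
          return true;
        }
      } catch (KeeperException e) {
        return false; // connection lost: let the new election process take over
      }
      Thread.sleep(500);
    }
    return true; // give up waiting and proceed with whoever showed up
  }
}
{code}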
[jira] [Assigned] (SOLR-4799) SQLEntityProcessor for zipper join
[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-4799: Assignee: (was: Noble Paul) SQLEntityProcessor for zipper join -- Key: SOLR-4799 URL: https://issues.apache.org/jira/browse/SOLR-4799 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Mikhail Khludnev Priority: Minor Labels: DIH, dataimportHandler, dih Attachments: SOLR-4799.patch DIH is mostly considered a playground tool, and real usages end up with SolrJ. I want to contribute a few improvements targeting DIH performance. This one provides a performant approach for joining SQL entities with minimal memory, in contrast to http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor The idea is: * parent table is explicitly ordered by its PK in SQL * children table is explicitly ordered by parent_id FK in SQL * children entity processor joins the ordered result sets with a ‘zipper’ algorithm. Do you think it’s worth contributing to DIH? cc: [~goksron] [~jdyer] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6365) specify appends, defaults, invariants outside of the component
[ https://issues.apache.org/jira/browse/SOLR-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218130#comment-14218130 ] ASF subversion and git services commented on SOLR-6365: --- Commit 1640595 from [~noble.paul] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640595 ] SOLR-6365 implicit requesthandler (specified in code) takes lower precedence over initParams specify appends, defaults, invariants outside of the component --- Key: SOLR-6365 URL: https://issues.apache.org/jira/browse/SOLR-6365 Project: Solr Issue Type: Improvement Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0, Trunk Attachments: SOLR-6365-crappy-test.patch, SOLR-6365.patch, SOLR-6365.patch, SOLR-6365.patch The components are configured in solrconfig.xml mostly for specifying these extra parameters. If we separate these out, we can avoid specifying the components altogether and make solrconfig much simpler. Eventually we want users to see all functions as paths instead of components and control these params from outside , through an API and persisted in ZK objectives : * define standard components implicitly and let users override some params only * reuse standard params across components * define multiple param sets and mix and match these params at request time example {code:xml} !-- use json for all paths and _txt as the default search field-- initParams name=global path=/** lst name=defaults str name=wtjson/str str name=df_txt/str /lst /initParams {code} other examples {code:xml} initParams name=a path=/dump3,/root/*,/root1/** lst name=defaults str name=aA/str /lst lst name=invariants str name=bB/str /lst lst name=appends str name=cC/str /lst /initParams requestHandler name=/dump3 class=DumpRequestHandler/ requestHandler name=/dump4 class=DumpRequestHandler/ requestHandler name=/root/dump5 class=DumpRequestHandler/ requestHandler name=/root1/anotherlevel/dump6 class=DumpRequestHandler/ requestHandler name=/dump1 class=DumpRequestHandler initParams=a/ requestHandler name=/dump2 class=DumpRequestHandler initParams=a lst name=defaults str name=aA1/str /lst lst name=invariants str name=bB1/str /lst lst name=appends str name=cC1/str /lst /requestHandler {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6762) bin/solr create_collection not working due to not finding configName but the configSet exists
[ https://issues.apache.org/jira/browse/SOLR-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218144#comment-14218144 ] ASF subversion and git services commented on SOLR-6762: --- Commit 1640598 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1640598 ] SOLR-6762: use collection.configName parameter instead of configSet when creating a collection bin/solr create_collection not working due to not finding configName but the configSet exists - Key: SOLR-6762 URL: https://issues.apache.org/jira/browse/SOLR-6762 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter On trunk, when doing: bin/solr create_collection -n foo2 -c sample_techproducts_configs The collection cannot be created because of: Connecting to ZooKeeper at localhost:9983 Uploading /Users/timpotter/dev/lw/projects/solr_trunk_co/solr/server/solr/configsets/sample_techproducts_configs/conf for config sample_techproducts_configs to ZooKeeper at localhost:9983 Creating new collection 'foo2' using command: http://192.168.1.2:7574/solr/admin/collections?action=CREATEname=foo2numShards=1replicationFactor=1maxShardsPerNode=1configSet=sample_techproducts_configs { responseHeader:{ status:0, QTime:16121}, failure:{:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'foo2_shard1_replica1': Unable to create core [foo2_shard1_replica1] Caused by: Could not find configName for collection foo2 found:[data_driven_schema_configs, sample_techproducts_configs]}} The bin/solr create_collection command uploads the sample_techproducts_configs configset but the create collection seems to require the configName instead. Logs: INFO - 2014-11-19 16:14:38.001; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Get the message id:/overseer/collection-queue-work/qn-14 message:{ operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; OverseerCollectionProcessor.processMessage : create , { operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; Could not obtain config name INFO - 2014-11-19 16:14:38.003; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; building a new collection: foo2 INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Create collection foo2 with shards [shard1] INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; state version foo2 1 INFO - 2014-11-19 16:14:38.006; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... 
(live nodes size: 2) INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating SolrCores for new collection foo2, shardNames [shard1] , replicationFactor : 1 INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating shard foo2_shard1_replica1 as part of slice shard1 of collection foo2 on 192.168.1.2:8983_solr INFO - 2014-11-19 16:14:38.106; org.apache.solr.handler.admin.CoreAdminHandler; core create command numShards=1shard=shard1name=foo2_shard1_replica1action=CREATEcollection=foo2wt=javabinqt=/admin/coresversion=2 INFO - 2014-11-19 16:14:38.107; org.apache.solr.cloud.ZkController; publishing core=foo2_shard1_replica1 state=down collection=foo2 INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.ZkController; look for our core node name INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.109; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state numShards=1 message={ operation:state, numShards:1, shard:shard1, roles:null, state:down, core:foo2_shard1_replica1, collection:foo2, node_name:192.168.1.2:8983_solr, base_url:http://192.168.1.2:8983/solr} INFO - 2014-11-19 16:14:38.211; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
[jira] [Commented] (SOLR-6761) Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218145#comment-14218145 ] Yonik Seeley commented on SOLR-6761: How about even more general: a minimum commitWithin and the ability to downgrade an immediate commit or softCommit to a soft commitWithin. Perhaps a special value of -1 could mean disallow / don't actually do it . So minCommitWithin=5000 would convert an incoming commit to commitWithin=5000 and would convert commitWithin=10 to commitWithin=5000 Ability to ignore commit and optimize requests from clients when running in SolrCloud mode. --- Key: SOLR-6761 URL: https://issues.apache.org/jira/browse/SOLR-6761 Project: Solr Issue Type: New Feature Components: SolrCloud, SolrJ Reporter: Timothy Potter In most SolrCloud environments, it's advisable to only rely on auto-commits (soft and hard) configured in solrconfig.xml and not send explicit commit requests from client applications. In fact, I've seen cases where improperly coded client applications can send commit requests too frequently, which can lead to harming the cluster's health. As a system administrator, I'd like the ability to disallow commit requests from client applications. Ideally, I could configure the updateHandler to ignore the requests and return an HTTP response code of my choosing as I may not want to break existing client applications by returning an error. In other words, I may want to just return 200 vs. 405. The same goes for optimize requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
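Yonik's suggestion amounts to a clamp on commitWithin. A self-contained sketch of that rule; minCommitWithin and the -1 "disallow" sentinel are hypothetical names taken from the comment, not an existing Solr setting:

{code:java}
/** Sketch of the proposed floor on commitWithin. */
public class MinCommitWithinSketch {

  static final int DISALLOW = -1; // special value: drop client commits entirely

  /**
   * @param minCommitWithin configured floor in ms, or -1 to disallow client commits
   * @param explicitCommit  client sent commit=true (or optimize)
   * @param commitWithin    client-supplied commitWithin in ms, or -1 if absent
   * @return the commitWithin to actually apply, or -1 meaning "do nothing"
   */
  static int effectiveCommitWithin(int minCommitWithin, boolean explicitCommit, int commitWithin) {
    if (minCommitWithin == DISALLOW) {
      return -1;                              // ignore client commit requests outright
    }
    if (explicitCommit) {
      return minCommitWithin;                 // downgrade commit/softCommit to a soft commitWithin
    }
    if (commitWithin >= 0 && commitWithin < minCommitWithin) {
      return minCommitWithin;                 // e.g. commitWithin=10 becomes commitWithin=5000
    }
    return commitWithin;                      // already at or above the floor (or absent)
  }

  public static void main(String[] args) {
    System.out.println(effectiveCommitWithin(5000, true, -1));  // 5000: explicit commit downgraded
    System.out.println(effectiveCommitWithin(5000, false, 10)); // 5000: too-aggressive commitWithin raised
    System.out.println(effectiveCommitWithin(-1, true, -1));    // -1: commits disallowed entirely
  }
}
{code}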
[jira] [Resolved] (SOLR-6762) bin/solr create_collection not working due to not finding configName but the configSet exists
[ https://issues.apache.org/jira/browse/SOLR-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6762. -- Resolution: Fixed Fix Version/s: 5.0 bin/solr create_collection not working due to not finding configName but the configSet exists - Key: SOLR-6762 URL: https://issues.apache.org/jira/browse/SOLR-6762 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter Fix For: 5.0 On trunk, when doing: bin/solr create_collection -n foo2 -c sample_techproducts_configs The collection cannot be created because of: Connecting to ZooKeeper at localhost:9983 Uploading /Users/timpotter/dev/lw/projects/solr_trunk_co/solr/server/solr/configsets/sample_techproducts_configs/conf for config sample_techproducts_configs to ZooKeeper at localhost:9983 Creating new collection 'foo2' using command: http://192.168.1.2:7574/solr/admin/collections?action=CREATEname=foo2numShards=1replicationFactor=1maxShardsPerNode=1configSet=sample_techproducts_configs { responseHeader:{ status:0, QTime:16121}, failure:{:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'foo2_shard1_replica1': Unable to create core [foo2_shard1_replica1] Caused by: Could not find configName for collection foo2 found:[data_driven_schema_configs, sample_techproducts_configs]}} The bin/solr create_collection command uploads the sample_techproducts_configs configset but the create collection seems to require the configName instead. Logs: INFO - 2014-11-19 16:14:38.001; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Get the message id:/overseer/collection-queue-work/qn-14 message:{ operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; OverseerCollectionProcessor.processMessage : create , { operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; Could not obtain config name INFO - 2014-11-19 16:14:38.003; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; building a new collection: foo2 INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Create collection foo2 with shards [shard1] INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; state version foo2 1 INFO - 2014-11-19 16:14:38.006; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... 
(live nodes size: 2) INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating SolrCores for new collection foo2, shardNames [shard1] , replicationFactor : 1 INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating shard foo2_shard1_replica1 as part of slice shard1 of collection foo2 on 192.168.1.2:8983_solr INFO - 2014-11-19 16:14:38.106; org.apache.solr.handler.admin.CoreAdminHandler; core create command numShards=1shard=shard1name=foo2_shard1_replica1action=CREATEcollection=foo2wt=javabinqt=/admin/coresversion=2 INFO - 2014-11-19 16:14:38.107; org.apache.solr.cloud.ZkController; publishing core=foo2_shard1_replica1 state=down collection=foo2 INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.ZkController; look for our core node name INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.109; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state numShards=1 message={ operation:state, numShards:1, shard:shard1, roles:null, state:down, core:foo2_shard1_replica1, collection:foo2, node_name:192.168.1.2:8983_solr, base_url:http://192.168.1.2:8983/solr} INFO - 2014-11-19 16:14:38.211; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 2) INFO - 2014-11-19 16:14:39.108; org.apache.solr.cloud.ZkController; waiting to find shard id in clusterstate for foo2_shard1_replica1 INFO - 2014-11-19 16:14:39.108;
[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.7.0) - Build # 1904 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/1904/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseParallelGC (asserts: true) 4 tests failed. REGRESSION: org.apache.solr.search.SpatialFilterTest.testLatLonType Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([491625682ACE4ADF:5DD4E9832E225FF8]:0) at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:763) at org.apache.solr.util.TestHarness.getCoreInc(TestHarness.java:219) at org.apache.solr.util.TestHarness.update(TestHarness.java:235) at org.apache.solr.util.BaseTestHarness.checkUpdateStatus(BaseTestHarness.java:282) at org.apache.solr.util.BaseTestHarness.validateUpdate(BaseTestHarness.java:252) at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:677) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:656) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:650) at org.apache.solr.SolrTestCaseJ4.clearIndex(SolrTestCaseJ4.java:1056) at org.apache.solr.search.SpatialFilterTest.setupDocs(SpatialFilterTest.java:36) at org.apache.solr.search.SpatialFilterTest.testLatLonType(SpatialFilterTest.java:85) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
Re: [JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2215 - Still Failing
Apologies -- I haven't been following the commits closely this week. Does anyone have any idea what changed at the low levels of the Solr testing class hierarchy to cause these failures in a variety of tests? : SolrCore 'collection1' is not available due to init failure: Error : instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:532) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517) : at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:289) : at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:214) : at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77) : at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) : at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:529) : at org.apache.solr.core.SolrCore.init(SolrCore.java:796) : ... 8 more : Caused by: java.lang.IllegalAccessException: Class org.apache.solr.core.SolrResourceLoader can not access a member of class org.apache.lucene.util.LuceneTestCase$3 with modifiers : at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:109) : at java.lang.Class.newInstance(Class.java:368) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:529) : ... 15 more :[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=SampleTest -Dtests.method=testSimple -Dtests.seed=2E6E8F9ADADFEACF -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=ja_JP_JP_#u-ca-japanese -Dtests.timezone=Europe/Lisbon -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
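The quoted trace dies inside Class.newInstance() on an anonymous class (LuceneTestCase$3). Anonymous classes never get a public constructor, so reflective instantiation from another package (here SolrResourceLoader) fails the access check with exactly the IllegalAccessException shown. A tiny demo of the modifier check that is tripping:

{code:java}
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;
import java.util.concurrent.Callable;

/** Shows why reflective instantiation of an anonymous class fails from another package. */
public class AnonymousClassReflectionDemo {

  // Anonymous class: javac generates its constructor, and that constructor is never public.
  static final Callable<String> ANON = new Callable<String>() {
    @Override public String call() { return "hi"; }
  };

  public static void main(String[] args) {
    Class<?> clazz = ANON.getClass();
    System.out.println(clazz.getName()); // something like AnonymousClassReflectionDemo$1
    for (Constructor<?> c : clazz.getDeclaredConstructors()) {
      // Not public, so Class.newInstance() called from a class in a different package
      // (like SolrResourceLoader.newInstance) throws IllegalAccessException.
      System.out.println(c + "  public? " + Modifier.isPublic(c.getModifiers()));
    }
  }
}
{code}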
[jira] [Commented] (SOLR-6762) bin/solr create_collection not working due to not finding configName but the configSet exists
[ https://issues.apache.org/jira/browse/SOLR-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218158#comment-14218158 ] ASF subversion and git services commented on SOLR-6762: --- Commit 1640600 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640600 ] SOLR-6762: use collection.configName parameter instead of configSet when creating a collection bin/solr create_collection not working due to not finding configName but the configSet exists - Key: SOLR-6762 URL: https://issues.apache.org/jira/browse/SOLR-6762 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter Fix For: 5.0 On trunk, when doing: bin/solr create_collection -n foo2 -c sample_techproducts_configs The collection cannot be created because of: Connecting to ZooKeeper at localhost:9983 Uploading /Users/timpotter/dev/lw/projects/solr_trunk_co/solr/server/solr/configsets/sample_techproducts_configs/conf for config sample_techproducts_configs to ZooKeeper at localhost:9983 Creating new collection 'foo2' using command: http://192.168.1.2:7574/solr/admin/collections?action=CREATEname=foo2numShards=1replicationFactor=1maxShardsPerNode=1configSet=sample_techproducts_configs { responseHeader:{ status:0, QTime:16121}, failure:{:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'foo2_shard1_replica1': Unable to create core [foo2_shard1_replica1] Caused by: Could not find configName for collection foo2 found:[data_driven_schema_configs, sample_techproducts_configs]}} The bin/solr create_collection command uploads the sample_techproducts_configs configset but the create collection seems to require the configName instead. Logs: INFO - 2014-11-19 16:14:38.001; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Get the message id:/overseer/collection-queue-work/qn-14 message:{ operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; OverseerCollectionProcessor.processMessage : create , { operation:create, fromApi:true, name:foo2, replicationFactor:1, numShards:1, maxShardsPerNode:1} WARN - 2014-11-19 16:14:38.002; org.apache.solr.cloud.OverseerCollectionProcessor; Could not obtain config name INFO - 2014-11-19 16:14:38.003; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; building a new collection: foo2 INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Create collection foo2 with shards [shard1] INFO - 2014-11-19 16:14:38.004; org.apache.solr.cloud.Overseer$ClusterStateUpdater; state version foo2 1 INFO - 2014-11-19 16:14:38.006; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... 
(live nodes size: 2) INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating SolrCores for new collection foo2, shardNames [shard1] , replicationFactor : 1 INFO - 2014-11-19 16:14:38.103; org.apache.solr.cloud.OverseerCollectionProcessor; Creating shard foo2_shard1_replica1 as part of slice shard1 of collection foo2 on 192.168.1.2:8983_solr INFO - 2014-11-19 16:14:38.106; org.apache.solr.handler.admin.CoreAdminHandler; core create command numShards=1shard=shard1name=foo2_shard1_replica1action=CREATEcollection=foo2wt=javabinqt=/admin/coresversion=2 INFO - 2014-11-19 16:14:38.107; org.apache.solr.cloud.ZkController; publishing core=foo2_shard1_replica1 state=down collection=foo2 INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.ZkController; look for our core node name INFO - 2014-11-19 16:14:38.108; org.apache.solr.cloud.DistributedQueue$LatchWatcher; NodeChildrenChanged fired on path /overseer/queue state SyncConnected INFO - 2014-11-19 16:14:38.109; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state numShards=1 message={ operation:state, numShards:1, shard:shard1, roles:null, state:down, core:foo2_shard1_replica1, collection:foo2, node_name:192.168.1.2:8983_solr, base_url:http://192.168.1.2:8983/solr} INFO - 2014-11-19 16:14:38.211; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged
[jira] [Commented] (SOLR-6691) REBALANCELEADERS needs to change the leader election queue.
[ https://issues.apache.org/jira/browse/SOLR-6691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218162#comment-14218162 ] Noble Paul commented on SOLR-6691: -- bq. As you can tell, I'm a bit stymied, I'll plug along but wondered if there's some prior art I haven't found yet. Plz look at the overseer role feature. There is a core admin command which force-evicts an overseer. REBALANCELEADERS needs to change the leader election queue. --- Key: SOLR-6691 URL: https://issues.apache.org/jira/browse/SOLR-6691 Project: Solr Issue Type: Bug Reporter: Erick Erickson Assignee: Erick Erickson The original code (SOLR-6517) assumed that changes in the clusterstate after issuing a command to the overseer to change the leader indicated that the leader was successfully changed. Fortunately, Noble clued me in that this isn't the case and that the potential leader needs to insert itself in the leader election queue before triggering the change leader command. Inserting itself at the front of the queue should probably happen in BALANCESHARDUNIQUE when the preferredLeader property is assigned as well. [~noble.paul] Do evil things happen if a node joins at the head but it's _already_ in the queue? These ephemeral nodes in the queue are watching each other. So if node1 is the leader you have node1 - node2 - node3 - node4 where - means watches. Now, if node3 puts itself at the head of the list, you have {code} node1 - node2 - node3 - node4 {code} I _think_ when I was looking at this it all just worked. 1. Node 1 goes down. Nodes 2 and 3 duke it out but there's code to ensure that node3 becomes the leader and node2 inserts itself at the end so it's watching node 4. 2. Node 2 goes down, nobody gets notified and it doesn't matter. 3. Node 3 goes down, node 4 gets notified and starts watching node 2 by inserting itself at the end of the list. 4. Node 4 goes down, nobody gets notified and it doesn't matter. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-6764) Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset
[ https://issues.apache.org/jira/browse/SOLR-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-6764: Assignee: Timothy Potter Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset --- Key: SOLR-6764 URL: https://issues.apache.org/jira/browse/SOLR-6764 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter This is exactly what we don't want ;-) Fire up a collection that uses the data_driven_schema_configs (such as by doing: bin/solr -e cloud -noprompt) and then try to index our example docs using: $ java -Durl=http://localhost:8983/solr/gettingstarted/update -jar post.jar *.xml Here goes the spew ... SimplePostTool version 1.5 Posting files to base url http://localhost:8983/solr/gettingstarted/update using content-type application/xml.. POSTing file gb18030-example.xml POSTing file hd.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status500/intint name=QTime19/int/lstlst name=errorstr name=msgServer Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2/strstr name=traceorg.apache.solr.common.SolrException: Server Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) /strint name=code500/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_other.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status400/intint name=QTime630/int/lstlst name=errorstr name=msgERROR: [doc=IW-02] Error adding field 'price'='11.50' msg=For input string: 11.50/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_video.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? 
response lst name=responseHeaderint name=status400/intint name=QTime5/int/lstlst name=errorstr name=msgERROR: [doc=MA147LL/A] Error adding field 'weight'='5.5' msg=For input string: 5.5/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file manufacturers.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status500/intint name=QTime2/int/lstlst name=errorstr name=msgException writing document id adata to the index; possible analysis error./strstr name=traceorg.apache.solr.common.SolrException: Exception writing document id adata to the index; possible analysis error. at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:168) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:328)
[jira] [Created] (SOLR-6764) Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset
Timothy Potter created SOLR-6764: Summary: Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset Key: SOLR-6764 URL: https://issues.apache.org/jira/browse/SOLR-6764 Project: Solr Issue Type: Bug Reporter: Timothy Potter This is exactly what we don't want ;-) Fire up a collection that uses the data_driven_schema_configs (such as by doing: bin/solr -e cloud -noprompt) and then try to index our example docs using: $ java -Durl=http://localhost:8983/solr/gettingstarted/update -jar post.jar *.xml Here goes the spew ... SimplePostTool version 1.5 Posting files to base url http://localhost:8983/solr/gettingstarted/update using content-type application/xml.. POSTing file gb18030-example.xml POSTing file hd.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status500/intint name=QTime19/int/lstlst name=errorstr name=msgServer Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2/strstr name=traceorg.apache.solr.common.SolrException: Server Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) /strint name=code500/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_other.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status400/intint name=QTime630/int/lstlst name=errorstr name=msgERROR: [doc=IW-02] Error adding field 'price'='11.50' msg=For input string: 11.50/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_video.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? 
response lst name=responseHeaderint name=status400/intint name=QTime5/int/lstlst name=errorstr name=msgERROR: [doc=MA147LL/A] Error adding field 'weight'='5.5' msg=For input string: 5.5/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file manufacturers.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status500/intint name=QTime2/int/lstlst name=errorstr name=msgException writing document id adata to the index; possible analysis error./strstr name=traceorg.apache.solr.common.SolrException: Exception writing document id adata to the index; possible analysis error. at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:168) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:328) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117) at
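The "For input string: 11.50" and "5.5" messages are plain NumberFormatExceptions: the add-unknown-fields chain has presumably already typed 'price' and 'weight' from earlier whole-number values, so later decimal values fail to parse as longs. A two-line illustration of that failure mode (not the Solr code path itself):

{code:java}
public class FieldGuessingParseDemo {
  public static void main(String[] args) {
    // A value like "5" parses fine as a long, so a guessed field can end up long-typed...
    System.out.println(Long.parseLong("5"));
    // ...and a later decimal value then fails exactly like the indexing errors above:
    // java.lang.NumberFormatException: For input string: "11.50"
    System.out.println(Long.parseLong("11.50"));
  }
}
{code}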
[jira] [Commented] (SOLR-6708) Smoke tester couldn't communicate with Solr started using 'bin/solr start'
[ https://issues.apache.org/jira/browse/SOLR-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218202#comment-14218202 ] ASF subversion and git services commented on SOLR-6708: --- Commit 1640609 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640609 ] SOLR-6708: backport fix from trunk Smoke tester couldn't communicate with Solr started using 'bin/solr start' -- Key: SOLR-6708 URL: https://issues.apache.org/jira/browse/SOLR-6708 Project: Solr Issue Type: Bug Affects Versions: 5.0 Reporter: Steve Rowe Assignee: Timothy Potter Attachments: solr-example.log The nightly-smoke target failed on ASF Jenkins [https://builds.apache.org/job/Lucene-Solr-SmokeRelease-5.x/208/]: {noformat} [smoker] unpack solr-5.0.0.tgz... [smoker] verify JAR metadata/identity/no javax.* or java.* classes... [smoker] unpack lucene-5.0.0.tgz... [smoker] **WARNING**: skipping check of /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar: it has javax.* classes [smoker] **WARNING**: skipping check of /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar: it has javax.* classes [smoker] verify WAR metadata/contained JAR identity/no javax.* or java.* classes... [smoker] unpack lucene-5.0.0.tgz... [smoker] copying unpacked distribution for Java 7 ... [smoker] test solr example w/ Java 7... [smoker] start Solr instance (log=/usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0-java7/solr-example.log)... [smoker] startup done [smoker] Failed to determine the port of a local Solr instance, cannot create core! [smoker] test utf8... [smoker] [smoker] command sh ./exampledocs/test_utf8.sh http://localhost:8983/solr/techproducts; failed: [smoker] ERROR: Could not curl to Solr - is curl installed? Is Solr not running? [smoker] [smoker] [smoker] stop server using: bin/solr stop -p 8983 [smoker] No process found for Solr node running on port 8983 [smoker] ***WARNING***: Solr instance didn't respond to SIGINT; using SIGKILL now... [smoker] ***WARNING***: Solr instance didn't respond to SIGKILL; ignoring... 
[smoker] Traceback (most recent call last): [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 1526, in module [smoker] main() [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 1471, in main [smoker] smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, c.is_signed, ' '.join(c.test_args)) [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 1515, in smokeTest [smoker] unpackAndVerify(java, 'solr', tmpDir, artifact, svnRevision, version, testArgs, baseURL) [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 616, in unpackAndVerify [smoker] verifyUnpacked(java, project, artifact, unpackPath, svnRevision, version, testArgs, tmpDir, baseURL) [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 783, in verifyUnpacked [smoker] testSolrExample(java7UnpackPath, java.java7_home, False) [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 888, in testSolrExample [smoker] run('sh ./exampledocs/test_utf8.sh http://localhost:8983/solr/techproducts', 'utf8.log') [smoker] File /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/dev-tools/scripts/smokeTestRelease.py, line 541, in run [smoker] raise RuntimeError('command %s failed; see log file %s' % (command, logPath)) [smoker] RuntimeError: command sh ./exampledocs/test_utf8.sh http://localhost:8983/solr/techproducts; failed; see log file /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/lucene/build/smokeTestRelease/tmp/unpack/solr-5.0.0-java7/example/utf8.log BUILD FAILED /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-5.x/build.xml:410: exec
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2216 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2216/ 1 tests failed. REGRESSION: org.apache.solr.search.TestStressUserVersions.testStressReorderVersions Error Message: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: java.lang.RuntimeException: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([9610E8D364D8F592:8AD6D1C56E13711F]:0) at org.apache.solr.search.TestRTGBase.clearIndex(TestRTGBase.java:54) at org.apache.solr.search.TestStressUserVersions.testStressReorderVersions(TestStressUserVersions.java:63) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
solr magento e-commerce plugin/extension
I am interested in exploring the option of building a plugin/extension for open-source e-commerce frameworks like Magento, OpenCart, etc. Such a plugin could give the open-source e-commerce community an easy on-boarding path to Solr. Please let me know if there has been any discussion or work done in this area. Thanks Anurag
Solr with LAMP/ XAMPP/ WAMP/MAMP
Given the growing need for Solr in web development, what do we think about building a Solr-integrated package with the *AMP stacks? The integrated package would install Solr alongside Apache, PHP and MySQL, and would also offer the option of configuring/controlling Solr through the common admin interface. Could it become a go-to package for the web development community?
Re: [JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2215 - Still Failing
I think this might be to do with Mike's changes in r1640457, but for some reason I can't up from svn or the apache git repo at the moment so I'm not certain. Alan Woodward www.flax.co.uk On 19 Nov 2014, at 17:05, Chris Hostetter wrote: Apologies -- I haven't been following the commits closely this week. Does anyone have any idea what changed at the low levels of the Solr testing class hierarchy to cause these failures in a variety of tests? : SolrCore 'collection1' is not available due to init failure: Error : instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:532) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517) : at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:289) : at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:214) : at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77) : at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) : at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:529) : at org.apache.solr.core.SolrCore.init(SolrCore.java:796) : ... 8 more : Caused by: java.lang.IllegalAccessException: Class org.apache.solr.core.SolrResourceLoader can not access a member of class org.apache.lucene.util.LuceneTestCase$3 with modifiers : at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:109) : at java.lang.Class.newInstance(Class.java:368) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:529) : ... 15 more :[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=SampleTest -Dtests.method=testSimple -Dtests.seed=2E6E8F9ADADFEACF -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=ja_JP_JP_#u-ca-japanese -Dtests.timezone=Europe/Lisbon -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: solr magento e-commerce plugin/extension
Hello Anurag, Nice to hear from you. I'm a big Solr fan, though due to time constraints it has been impossible for me to find time for contributions. During Google Summer of Code '13 I developed the Advance Course Search[1] module for Moodle[2]; you can look at the source code here[3]. The plug-in has been properly tested with Solr 3.6 and 4.0. You can also find integrations with TYPO3, WordPress and Drupal. Feel free to ask me if you have any questions, and don't forget to send me your feedback ;) Good luck, Cheers! Related links - [1] - https://docs.moodle.org/dev/Course_search [2] - https://moodle.org/plugins/view.php?plugin=tool_coursesearch [3] - https://github.com/shashirepo/moodle-tool_coursesearch On Wed, Nov 19, 2014 at 11:12 PM, Anurag Sharma anura...@gmail.com wrote: I am interested in exploring the option of building a plugin/extension for open-source e-commerce frameworks like Magento, OpenCart, etc. Such a plugin could give the open-source e-commerce community an easy on-boarding path to Solr. Please let me know if there has been any discussion or work done in this area. Thanks Anurag -- Thanks Regards Shashikant Vaishnav Developer at ClickHereMedia.co.uk Intern at Google Summer of Code http://about.me/shashitechno
Re: [JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2215 - Still Failing
Oh, I also saw this before committing, was confused, ran ant clean test in solr directory, and it passed, so I thought ant clean fixed it ... I guess not. With this change, in LuceneTestCase's newIndexWriterConfig, I sometimes randomly subclass ConcurrentMergeScheduler (to turn off merge throttling) in the random IWC that's returned. Does this make Solr unhappy? Why is Solr trying to instantiate the merge scheduler class that's already instantiated on IWC? I'm confused... Mike McCandless http://blog.mikemccandless.com On Wed, Nov 19, 2014 at 1:00 PM, Alan Woodward a...@flax.co.uk wrote: I think this might be to do with Mike's changes in r1640457, but for some reason I can't up from svn or the apache git repo at the moment so I'm not certain. Alan Woodward www.flax.co.uk On 19 Nov 2014, at 17:05, Chris Hostetter wrote: Apologies -- I haven't been following the commits closely this week. Does anyone have any idea what changed at the low levels of the Solr testing class hierarchy to cause these failures in a variety of tests? : SolrCore 'collection1' is not available due to init failure: Error : instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:532) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517) : at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:289) : at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:214) : at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77) : at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) : at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:529) : at org.apache.solr.core.SolrCore.init(SolrCore.java:796) : ... 8 more : Caused by: java.lang.IllegalAccessException: Class org.apache.solr.core.SolrResourceLoader can not access a member of class org.apache.lucene.util.LuceneTestCase$3 with modifiers : at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:109) : at java.lang.Class.newInstance(Class.java:368) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:529) : ... 15 more :[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=SampleTest -Dtests.method=testSimple -Dtests.seed=2E6E8F9ADADFEACF -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=ja_JP_JP_#u-ca-japanese -Dtests.timezone=Europe/Lisbon -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6764) Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset
[ https://issues.apache.org/jira/browse/SOLR-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218255#comment-14218255 ] Timothy Potter commented on SOLR-6764: -- Seems like a race-condition in the managed schema stuff as sometimes this works and other times it doesn't ... also seeing errors like this: Caused by: java.lang.NullPointerException at org.apache.lucene.analysis.core.StopFilter.accept(StopFilter.java:108) at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:52) at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:45) at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:617) at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318) at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:240) at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455) at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1398) at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164) ... 60 more This should never happen and doesn't in non-cloud mode. Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset --- Key: SOLR-6764 URL: https://issues.apache.org/jira/browse/SOLR-6764 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter This is exactly what we don't want ;-) Fire up a collection that uses the data_driven_schema_configs (such as by doing: bin/solr -e cloud -noprompt) and then try to index our example docs using: $ java -Durl=http://localhost:8983/solr/gettingstarted/update -jar post.jar *.xml Here goes the spew ... SimplePostTool version 1.5 Posting files to base url http://localhost:8983/solr/gettingstarted/update using content-type application/xml.. POSTing file gb18030-example.xml POSTing file hd.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? 
response lst name=responseHeaderint name=status500/intint name=QTime19/int/lstlst name=errorstr name=msgServer Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2/strstr name=traceorg.apache.solr.common.SolrException: Server Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) /strint name=code500/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_other.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status400/intint name=QTime630/int/lstlst name=errorstr name=msgERROR: [doc=IW-02] Error adding field 'price'='11.50' msg=For input string: 11.50/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_video.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status400/intint name=QTime5/int/lstlst name=errorstr name=msgERROR: [doc=MA147LL/A] Error adding field 'weight'='5.5' msg=For input string:
Re: Solr with LAMP/ XAMPP/ WAMP/MAMP
On 11/19/2014 10:56 AM, Anurag Sharma wrote: Given the growing need for Solr in web development, what do we think about building a Solr-integrated package with the *AMP stacks? The integrated package would install Solr alongside Apache, PHP and MySQL, and would also offer the option of configuring/controlling Solr through the common admin interface. Could it become a go-to package for the web development community? I believe that any high-level integration work like that would pull a large amount of focus away from Solr itself. There are lots of people inventing that wheel. The nature of the work produces a highly customized wheel that requires further customization or a complete rewrite to work with a different project. There are at least three Solr client packages for PHP, which will let a LAMP project integrate with Solr. In no particular order, these are the ones I know about: https://code.google.com/p/solr-php-client/ http://pecl.php.net/package/solr http://www.solarium-project.org/ Solr supports importing from a database, which can be MySQL, but better results can be obtained by writing an indexing client yourself, where you can manipulate the data in any way you choose. That could be in PHP like the rest of the stack being discussed here, or it could be written in another language. I *do* think it might be a good idea for us to write and maintain supported Solr clients for languages beyond Java -- work at a low level of integration instead of a high level. I personally don't have the kind of expertise required, but the expertise is out there. The others might not agree with this, and I won't stand in the way of the idea you've described if others want to pursue it. Thanks, Shawn - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
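To make the "write an indexing client yourself" suggestion concrete, here is a minimal SolrJ sketch in Java (any of the PHP clients above would play the same role); the URL, field names and values are placeholders, not anything from this thread:

{code}
import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SimpleIndexer {
  public static void main(String[] args) throws IOException, SolrServerException {
    // Point SolrJ at a core/collection; the URL below is a placeholder.
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
    try {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "example-1");
      doc.addField("name", "Example document");
      solr.add(doc);     // send the document to Solr
      solr.commit();     // make it searchable
    } finally {
      solr.shutdown();   // release the underlying HTTP client
    }
  }
}
{code}

The same few steps (create a client, build a SolrInputDocument, add, commit) are what any language-specific client ultimately wraps.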
[jira] [Commented] (SOLR-6764) Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset
[ https://issues.apache.org/jira/browse/SOLR-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218279#comment-14218279 ] Steve Rowe commented on SOLR-6764: -- Some of this might be a dupe of SOLR-6016 - there are fields in the example docs that have a mix of integer and floating point values, so depending on which file gets indexed first, a doc with a field previously detected as integral will fail when field values are floating point. Can't index exampledocs/*.xml into collection based on the data_driven_schema_configs configset --- Key: SOLR-6764 URL: https://issues.apache.org/jira/browse/SOLR-6764 Project: Solr Issue Type: Bug Reporter: Timothy Potter Assignee: Timothy Potter This is exactly what we don't want ;-) Fire up a collection that uses the data_driven_schema_configs (such as by doing: bin/solr -e cloud -noprompt) and then try to index our example docs using: $ java -Durl=http://localhost:8983/solr/gettingstarted/update -jar post.jar *.xml Here goes the spew ... SimplePostTool version 1.5 Posting files to base url http://localhost:8983/solr/gettingstarted/update using content-type application/xml.. POSTing file gb18030-example.xml POSTing file hd.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status500/intint name=QTime19/int/lstlst name=errorstr name=msgServer Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2/strstr name=traceorg.apache.solr.common.SolrException: Server Error request: http://192.168.1.2:8983/solr/gettingstarted_shard2_replica2/update?update.chain=add-unknown-fields-to-the-schemaamp;update.distrib=TOLEADERamp;distrib.from=http%3A%2F%2F192.168.1.2%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2Famp;wt=javabinamp;version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) /strint name=code500/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_other.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status400/intint name=QTime630/int/lstlst name=errorstr name=msgERROR: [doc=IW-02] Error adding field 'price'='11.50' msg=For input string: 11.50/strint name=code400/int/lst /response SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file ipod_video.xml SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: ?xml version=1.0 encoding=UTF-8? 
<response> <lst name="responseHeader"><int name="status">400</int><int name="QTime">5</int></lst><lst name="error"><str name="msg">ERROR: [doc=MA147LL/A] Error adding field 'weight'='5.5' msg=For input string: "5.5"</str><int name="code">400</int></lst> </response> SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/gettingstarted/update POSTing file manufacturers.xml SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:8983/solr/gettingstarted/update SimplePostTool: WARNING: Response: <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"><int name="status">500</int><int name="QTime">2</int></lst><lst name="error"><str name="msg">Exception writing document id adata to the index; possible analysis error.</str><str name="trace">org.apache.solr.common.SolrException: Exception writing document id adata to the index; possible analysis error. at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:168) at
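Steve's explanation lines up with the "For input string" errors above: whichever example file is indexed first decides the guessed field type, and later floating-point values then fail to parse. A toy sketch in plain Java (not Solr's actual field-guessing code) of that ordering dependence:

{code}
// Sketch only, not Solr's implementation: the first value seen decides the type,
// and every later value must parse as that type.
public class TypeGuessDemo {
  public static void main(String[] args) {
    String[] weights = {"7", "5.5"};                   // values as seen from different example docs
    boolean guessedIntegral = isIntegral(weights[0]);  // the first document "wins" the guess
    for (String w : weights) {
      if (guessedIntegral) {
        // With this input order, "5.5" throws the same NumberFormatException the report shows.
        System.out.println(w + " -> " + Long.parseLong(w));
      } else {
        System.out.println(w + " -> " + Double.parseDouble(w));
      }
    }
  }

  private static boolean isIntegral(String s) {
    try { Long.parseLong(s); return true; } catch (NumberFormatException e) { return false; }
  }
}
{code}

Run with the values in this order it fails on "5.5"; reverse the order and both values parse as doubles, which is why the failures look order-dependent rather than deterministic.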
[jira] [Created] (SOLR-6765) Provide better error messages for mis-configured Zookeeper ensemble connection strings
Erick Erickson created SOLR-6765: Summary: Provide better error messages for mis-configured Zookeeper ensemble connection strings Key: SOLR-6765 URL: https://issues.apache.org/jira/browse/SOLR-6765 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 5.0, Trunk Reporter: Erick Erickson From the user's list: The person specified the Zookeeper connection string as: SOLR_ZK_ENSEMBLE=zookeeper1:2181/solr,zookeeper2:2181/solr,zookeeper3:2181/solr Here's a snippet: Depending on how or where I set the Zookeeper ensemble details I get different results. We should detect this kind of thing and produce an informative error message. This particular user was using Cloudera Search, but I think this is a fairly generic issue that is common to vanilla Solr. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
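For context on the mis-configuration being reported: a ZooKeeper connection string is a comma-separated list of host:port pairs with at most one chroot suffix at the very end, so repeating /solr after every host is exactly the kind of input a better error message should flag. A small sketch of the distinction in plain Java (illustrative only, not Solr's validation code):

{code}
// Sketch only -- not Solr's validation logic. Shows the difference between a chroot
// repeated per host (invalid) and a single trailing chroot (the usual form).
public class ZkConnectStrings {
  public static void main(String[] args) {
    // What the user configured: a /solr chroot after every host.
    String wrong = "zookeeper1:2181/solr,zookeeper2:2181/solr,zookeeper3:2181/solr";
    // The usual form: comma-separated host:port pairs, one chroot at the very end.
    String right = "zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/solr";

    System.out.println(looksValid(wrong));  // false
    System.out.println(looksValid(right));  // true
  }

  static boolean looksValid(String connectString) {
    // Only the last host:port entry may carry a chroot path.
    String[] hosts = connectString.split(",");
    for (int i = 0; i < hosts.length - 1; i++) {
      if (hosts[i].contains("/")) {
        return false;
      }
    }
    return true;
  }
}
{code}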
[JENKINS] Lucene-Solr-5.x-Linux (64bit/jdk1.7.0_67) - Build # 11473 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11473/ Java: 64bit/jdk1.7.0_67 -XX:+UseCompressedOops -XX:+UseParallelGC (asserts: true) 4 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestCloudPivotFacet Error Message: Suite timeout exceeded (= 720 msec). Stack Trace: java.lang.Exception: Suite timeout exceeded (= 720 msec). at __randomizedtesting.SeedInfo.seed([8BD0A5A31D76461C]:0) REGRESSION: org.apache.solr.TestRandomFaceting.testRandomFaceting Error Message: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' Stack Trace: org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at __randomizedtesting.SeedInfo.seed([8BD0A5A31D76461C:86B88576488F8EA3]:0) at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:763) at org.apache.solr.util.TestHarness.getCoreInc(TestHarness.java:219) at org.apache.solr.util.TestHarness.update(TestHarness.java:235) at org.apache.solr.util.BaseTestHarness.checkUpdateStatus(BaseTestHarness.java:282) at org.apache.solr.util.BaseTestHarness.validateUpdate(BaseTestHarness.java:252) at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:677) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:656) at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:650) at org.apache.solr.SolrTestCaseJ4.clearIndex(SolrTestCaseJ4.java:1056) at org.apache.solr.TestRandomFaceting.init(TestRandomFaceting.java:51) at org.apache.solr.TestRandomFaceting.testRandomFaceting(TestRandomFaceting.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
Re: [JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2215 - Still Failing
So digging in… Solr instantiates the merge scheduler via it's ResourceLoader, which takes a class name. The random indexconfig snippet sets the classname to whatever the value of ${solr.tests.mergeScheduler} is. This is set in SolrTestCaseJ4.newRandomConfig(): System.setProperty(solr.tests.mergeScheduler, iwc.getMergeScheduler().getClass().getName()); And I guess you can't call Class.newInstance() on an anonymous class? Alan Woodward www.flax.co.uk On 19 Nov 2014, at 18:10, Michael McCandless wrote: Oh, I also saw this before committing, was confused, ran ant clean test in solr directory, and it passed, so I thought ant clean fixed it ... I guess not. With this change, in LuceneTestCase's newIndexWriterConfig, I sometimes randomly subclass ConcurrentMergeScheduler (to turn off merge throttling) in the random IWC that's returned. Does this make Solr unhappy? Why is Solr trying to instantiate the merge scheduler class that's already instantiated on IWC? I'm confused... Mike McCandless http://blog.mikemccandless.com On Wed, Nov 19, 2014 at 1:00 PM, Alan Woodward a...@flax.co.uk wrote: I think this might be to do with Mike's changes in r1640457, but for some reason I can't up from svn or the apache git repo at the moment so I'm not certain. Alan Woodward www.flax.co.uk On 19 Nov 2014, at 17:05, Chris Hostetter wrote: Apologies -- I haven't been following the commits closely this week. Does anyone have any idea what changed at the low levels of the Solr testing class hierarchy to cause these failures in a variety of tests? : SolrCore 'collection1' is not available due to init failure: Error : instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:532) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517) : at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:289) : at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:214) : at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77) : at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) : at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:529) : at org.apache.solr.core.SolrCore.init(SolrCore.java:796) : ... 8 more : Caused by: java.lang.IllegalAccessException: Class org.apache.solr.core.SolrResourceLoader can not access a member of class org.apache.lucene.util.LuceneTestCase$3 with modifiers : at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:109) : at java.lang.Class.newInstance(Class.java:368) : at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:529) : ... 15 more :[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=SampleTest -Dtests.method=testSimple -Dtests.seed=2E6E8F9ADADFEACF -Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=ja_JP_JP_#u-ca-japanese -Dtests.timezone=Europe/Lisbon -Dtests.asserts=true -Dtests.file.encoding=US-ASCII -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
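Alan's diagnosis is easy to reproduce outside the test framework: an anonymous class is never public and exposes no public constructor, so code in another package (such as SolrResourceLoader reading the class name from the solr.tests.mergeScheduler system property) cannot instantiate it reflectively and gets the IllegalAccessException seen in the stack trace. A standalone sketch, not the Lucene/Solr test code itself:

{code}
import java.lang.reflect.Modifier;

// Standalone sketch: shows why a resource loader in another package cannot
// Class.newInstance() an anonymous subclass that it only knows by name.
public class AnonymousClassDemo {
  public static void main(String[] args) {
    Object scheduler = new Object() { };   // stand-in for the anonymous merge scheduler subclass
    Class<?> clazz = scheduler.getClass();

    // This is the kind of name that ends up in the system property.
    System.out.println("class name: " + clazz.getName());                          // e.g. AnonymousClassDemo$1
    System.out.println("anonymous:  " + clazz.isAnonymousClass());                 // true
    System.out.println("public:     " + Modifier.isPublic(clazz.getModifiers()));  // false
    System.out.println("public constructors: " + clazz.getConstructors().length);  // 0 -- none is generated

    // Code in another package that does Class.forName(name).newInstance()
    // therefore fails with IllegalAccessException.
  }
}
{code}

A named, public, static subclass (or handing Solr the already-constructed instance instead of its class name) avoids the problem.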
[jira] [Created] (LUCENE-6065) remove foreign readers from merge, fix LeafReader instead.
Robert Muir created LUCENE-6065: --- Summary: remove foreign readers from merge, fix LeafReader instead. Key: LUCENE-6065 URL: https://issues.apache.org/jira/browse/LUCENE-6065 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Currently, SegmentMerger has supported two classes of citizens being merged: # SegmentReader # foreign reader (e.g. some FilterReader) It does an instanceof check and executes the merge differently. In the SegmentReader case: stored field and term vectors are bulk-merged, norms and docvalues are transferred directly without piling up on the heap, CRC32 verification runs with IO locality of the data being merged, etc. Otherwise, we treat it as a foreign reader and its slow. This is just the low-level, it gets worse as you wrap with more stuff. A great example there is SortingMergePolicy: not only will it have the low-level slowdowns listed above, it will e.g. cache/pile up OrdinalMaps for all string docvalues fields being merged and other silliness that just makes matters worse. Another use case is 5.0 users wishing to upgrade from fieldcache to docvalues. This should be possible to implement with a simple incremental transition based on a mergepolicy that uses UninvertingReader. But we shouldnt populate internal fieldcache entries unnecessarily on merge and spike RAM until all those segment cores are released, and other issues like bulk merge of stored fields and not piling up norms should still work: its completely unrelated. There are more problems we can fix if we clean this up, checkindex/checkreader can run efficiently where it doesn't need to RAM spike like merging, we can remove the checkIntegrity() method completely from LeafReader, since it can always be accomplished on producers, etc. In general it would be nice to just have one codepath for merging that is as efficient as we can make it, and to support things like index modifications during merge. I spent a few weeks writing 3 different implementations to fix this (interface, optional abstract class, fix LeafReader), and the latter is the only one i don't completely hate: I think our APIs should be efficient for indexing as well as search. So the proposal is simple, its to instead refactor LeafReader to just require the producer APIs as abstract methods (and FilterReaders should work on that). The search-oriented APIs can just be final methods that defer to those. So we would add 5 abstract methods, but implement 10 current methods as final based on those, and then merging would always be efficient. 
{code} // new abstract codec-based apis /** * Expert: retrieve thread-private TermVectorsReader * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract TermVectorsReader getTermVectorsReader(); /** * Expert: retrieve thread-private StoredFieldsReader * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract StoredFieldsReader getFieldsReader(); /** * Expert: retrieve underlying NormsProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract NormsProducer getNormsReader(); /** * Expert: retrieve underlying DocValuesProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract DocValuesProducer getDocValuesReader(); /** * Expert: retrieve underlying FieldsProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract FieldsProducer getPostingsReader(); // user/search oriented public apis based on the above public final Fields fields(); public final void document(int, StoredFieldVisitor); public final Fields getTermVectors(int); public final NumericDocValues getNumericDocValues(String); public final Bits getDocsWithField(String); public final BinaryDocValues getBinaryDocValues(String); public final SortedDocValues getSortedDocValues(String); public final SortedNumericDocValues getSortedNumericDocValues(String); public final SortedSetDocValues getSortedSetDocValues(String); public final NumericDocValues getNormValues(String); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
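To illustrate the "final search methods defer to the producer API" part of the proposal, here is a rough fragment (an assumption about what the wrappers could look like, not code from a patch) built only on the abstract accessors listed above:

{code}
// Hypothetical fragment, for illustration only: the search-oriented methods become
// thin final wrappers over the codec producers exposed by the abstract methods above.
public final Fields fields() throws IOException {
  ensureOpen();
  return getPostingsReader();              // a FieldsProducer is itself a Fields
}

public final void document(int docID, StoredFieldVisitor visitor) throws IOException {
  ensureOpen();
  getFieldsReader().visitDocument(docID, visitor);
}

public final Fields getTermVectors(int docID) throws IOException {
  ensureOpen();
  TermVectorsReader termVectors = getTermVectorsReader();
  return termVectors == null ? null : termVectors.get(docID);
}
{code}

With wrappers of this shape, SegmentMerger (and CheckIndex) could pull the producers from any LeafReader and keep the single efficient merge path the issue describes.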
[jira] [Commented] (LUCENE-6065) remove foreign readers from merge, fix LeafReader instead.
[ https://issues.apache.org/jira/browse/LUCENE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218334#comment-14218334 ] Michael McCandless commented on LUCENE-6065: +1 remove foreign readers from merge, fix LeafReader instead. Key: LUCENE-6065 URL: https://issues.apache.org/jira/browse/LUCENE-6065 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Currently, SegmentMerger has supported two classes of citizens being merged: # SegmentReader # foreign reader (e.g. some FilterReader) It does an instanceof check and executes the merge differently. In the SegmentReader case: stored field and term vectors are bulk-merged, norms and docvalues are transferred directly without piling up on the heap, CRC32 verification runs with IO locality of the data being merged, etc. Otherwise, we treat it as a foreign reader and its slow. This is just the low-level, it gets worse as you wrap with more stuff. A great example there is SortingMergePolicy: not only will it have the low-level slowdowns listed above, it will e.g. cache/pile up OrdinalMaps for all string docvalues fields being merged and other silliness that just makes matters worse. Another use case is 5.0 users wishing to upgrade from fieldcache to docvalues. This should be possible to implement with a simple incremental transition based on a mergepolicy that uses UninvertingReader. But we shouldnt populate internal fieldcache entries unnecessarily on merge and spike RAM until all those segment cores are released, and other issues like bulk merge of stored fields and not piling up norms should still work: its completely unrelated. There are more problems we can fix if we clean this up, checkindex/checkreader can run efficiently where it doesn't need to RAM spike like merging, we can remove the checkIntegrity() method completely from LeafReader, since it can always be accomplished on producers, etc. In general it would be nice to just have one codepath for merging that is as efficient as we can make it, and to support things like index modifications during merge. I spent a few weeks writing 3 different implementations to fix this (interface, optional abstract class, fix LeafReader), and the latter is the only one i don't completely hate: I think our APIs should be efficient for indexing as well as search. So the proposal is simple, its to instead refactor LeafReader to just require the producer APIs as abstract methods (and FilterReaders should work on that). The search-oriented APIs can just be final methods that defer to those. So we would add 5 abstract methods, but implement 10 current methods as final based on those, and then merging would always be efficient. 
{code} // new abstract codec-based apis /** * Expert: retrieve thread-private TermVectorsReader * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract TermVectorsReader getTermVectorsReader(); /** * Expert: retrieve thread-private StoredFieldsReader * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract StoredFieldsReader getFieldsReader(); /** * Expert: retrieve underlying NormsProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract NormsProducer getNormsReader(); /** * Expert: retrieve underlying DocValuesProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract DocValuesProducer getDocValuesReader(); /** * Expert: retrieve underlying FieldsProducer * @throws AlreadyClosedException if this reader is closed * @lucene.internal */ protected abstract FieldsProducer getPostingsReader(); // user/search oriented public apis based on the above public final Fields fields(); public final void document(int, StoredFieldVisitor); public final Fields getTermVectors(int); public final NumericDocValues getNumericDocValues(String); public final Bits getDocsWithField(String); public final BinaryDocValues getBinaryDocValues(String); public final SortedDocValues getSortedDocValues(String); public final SortedNumericDocValues getSortedNumericDocValues(String); public final SortedSetDocValues getSortedSetDocValues(String); public final NumericDocValues getNormValues(String); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5317: Attachment: lucene5317v2.patch I made the mistake of following instructions and tried {{/trunk}} and {{/trunk/}} yesterday. I tried with a git diff file yesterday, and I also just tried with a git --no-prefix diff file today, which seems to work with a traditional svn (patch attached). Today, I tried three variations of trunk. Still confident this is user error. Is there a size limit on diffs or is there something screwy with the attached diff file? [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-5317: --- Attachment: LUCENE-5317.patch When I tried to make a new review request with your latest patch, I get this error: {quote} The specified diff file could not be parsed. Line 2: No valid separator after the filename was found in the diff header {quote} I've successfully applied your patch to my svn checkout (using {{svn patch}}), and I'm posting it here unchanged. [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3774) /admin/mbean returning duplicate search handlers with names that map to their classes?
[ https://issues.apache.org/jira/browse/SOLR-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218384#comment-14218384 ] ASF subversion and git services commented on SOLR-3774: --- Commit 1640623 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1640623 ] SOLR-3774: Solr adds RequestHandler SolrInfoMBeans twice to the JMX server. /admin/mbean returning duplicate search handlers with names that map to their classes? -- Key: SOLR-3774 URL: https://issues.apache.org/jira/browse/SOLR-3774 Project: Solr Issue Type: Bug Reporter: Hoss Man Attachments: SOLR-3774.patch, SOLR-3774.patch Offshoot of SOLR-3232... bq. Along with some valid entries with names equal to the request handler names (/get search /browse) it also turned up one with the name org.apache.solr.handler.RealTimeGetHandler and another with the name org.apache.solr.handler.component.SearchHandler ...seems that we may have a bug with request handlers getting registered multiple times, once under their real name and once using their class? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_40-ea-b09) - Build # 4441 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4441/ Java: 64bit/jdk1.8.0_40-ea-b09 -XX:-UseCompressedOops -XX:+UseSerialGC (asserts: true) 2 tests failed. REGRESSION: org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch Error Message: commitWithin did not work on node: http://127.0.0.1:64164/collection1 expected:68 but was:67 Stack Trace: java.lang.AssertionError: commitWithin did not work on node: http://127.0.0.1:64164/collection1 expected:68 but was:67 at __randomizedtesting.SeedInfo.seed([CA71B9A4F4F6031D:4B9737BC83A96321]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:345) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869) at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[jira] [Commented] (SOLR-3774) /admin/mbean returning duplicate search handlers with names that map to their classes?
[ https://issues.apache.org/jira/browse/SOLR-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218394#comment-14218394 ] ASF subversion and git services commented on SOLR-3774: --- Commit 1640625 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1640625 ] SOLR-3774: Solr adds RequestHandler SolrInfoMBeans twice to the JMX server. /admin/mbean returning duplicate search handlers with names that map to their classes? -- Key: SOLR-3774 URL: https://issues.apache.org/jira/browse/SOLR-3774 Project: Solr Issue Type: Bug Reporter: Hoss Man Attachments: SOLR-3774.patch, SOLR-3774.patch Offshoot of SOLR-3232... bq. Along with some valid entries with names equal to the request handler names (/get search /browse) it also turned up one with the name org.apache.solr.handler.RealTimeGetHandler and another with the name org.apache.solr.handler.component.SearchHandler ...seems that we may have a bug with request handlers getting registered multiple times, once under their real name and once using their class? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218401#comment-14218401 ] Tim Allison commented on LUCENE-5317: - Great. Thank you. I just tried svn diff from the svn checkout that I had patched with the correct git diff...with no luck. I hadn't even svn-added the concordance directory, so the diff file was quite short. Are you using rbtools or have you had luck with the web interface? And success with installing rbtools: {noformat} Searching for RBTools Reading https://pypi.python.org/simple/RBTools/ Download error on https://pypi.python.org/simple/RBTools/: [Errno 10061] No conn ection could be made because the target machine actively refused it -- Some pack ages may not be found! Couldn't find index page for 'RBTools' (maybe misspelled?) Scanning index of all packages (this may take a while) Reading https://pypi.python.org/simple/ Download error on https://pypi.python.org/simple/: [Errno 10061] No connection c ould be made because the target machine actively refused it -- Some packages may not be found! No local packages or download links found for RBTools error: Could not find suitable distribution for Requirement.parse('RBTools') {noformat} [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218407#comment-14218407 ] Steve Rowe commented on LUCENE-5317: One of the nice things about {{svn patch}} is that it automatically does the {{svn add}} stuff for you. I've never tried rbtools, always used the web interface, no problems so far. [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Review Request 28247: LUCENE-5317 Add Concordance capability to Lucene
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28247/ --- Review request for lucene. Repository: lucene Description --- This patch is a start towards adding a concordance capability to Lucene. It currently relies on converting queries to SpanQueries and then doing the calculations to build concordance windows with sort keys. Once spans are nuked and the positions branch is ready, it should be straightforward to modify this to use positions. There is plenty of room for optimization and for general cleanup. Use of concordances dates back to the 13th century (according to Wikipedia), but this can still be a very useful capability for advanced analysts, linguists and lawyers. Diffs - trunk/dev-tools/idea/.idea/ant.xml 1640617 trunk/dev-tools/idea/.idea/modules.xml 1640617 trunk/dev-tools/idea/.idea/workspace.xml 1640617 trunk/dev-tools/maven/lucene/pom.xml.template 1640617 trunk/lucene/build.xml 1640617 trunk/lucene/concordance/build.xml PRE-CREATION trunk/lucene/concordance/ivy.xml PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/AbstractConcordanceWindowCollector.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceSearcher.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceSearcherUtil.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceSortKey.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceSortOrder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceSorter.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceWindow.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/ConcordanceWindowCollector.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/DedupingConcordanceWindowCollector.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/DefaultSortKeyBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/DocIdBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/DocMetadataExtractor.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/DocumentOrderSortKey.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/FieldBasedDocIdBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/IndexIdDocIdBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/SimpleDocMetadataExtractor.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/SortKeyBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/WindowBuilder.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/DocTokenOffsets.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/DocTokenOffsetsIterator.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/OffsetLengthStartComparator.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/OffsetStartComparator.java PRE-CREATION 
trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/OffsetUtil.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/RandomAccessCharOffsetContainer.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/ReanalyzingTokenCharOffsetsReader.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/SimpleAnalyzerUtil.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/TargetTokenNotFoundException.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/TokenCharOffsetRequests.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/charoffsets/TokenCharOffsetsReader.java PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/concordance/package.html PRE-CREATION trunk/lucene/concordance/src/java/org/apache/lucene/search/queries/SpanQueryConverter.java PRE-CREATION
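The description above notes that, under the hood, this runs SpanQuery's getSpans() and then re-analyzes the stored text to recover character offsets for each window. For readers who want to see what that getSpans() walk looks like, below is a minimal sketch against stock Lucene 4.x APIs (the issue's Affects Version line); it is not code from the patch, and the field name, analyzer, and sample text are made-up assumptions.

{code:java}
import java.util.HashMap;

import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermContext;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.search.spans.Spans;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SpansWalkSketch {
  public static void main(String[] args) throws Exception {
    RAMDirectory dir = new RAMDirectory();
    Version v = Version.LUCENE_45; // matches the issue's Affects Version
    try (IndexWriter writer =
             new IndexWriter(dir, new IndexWriterConfig(v, new WhitespaceAnalyzer(v)))) {
      Document doc = new Document();
      doc.add(new TextField("body", "jakarta apache lucene is an apache project", Field.Store.YES));
      writer.addDocument(doc);
    }

    SpanTermQuery query = new SpanTermQuery(new Term("body", "apache"));
    try (DirectoryReader reader = DirectoryReader.open(dir)) {
      for (AtomicReaderContext leaf : reader.leaves()) {
        // getSpans() reports token positions; a concordance window builder would then
        // re-analyze the stored field to map these positions back to character offsets.
        Spans spans = query.getSpans(leaf, leaf.reader().getLiveDocs(),
            new HashMap<Term, TermContext>());
        while (spans.next()) {
          System.out.println("doc=" + (leaf.docBase + spans.doc())
              + " startPosition=" + spans.start() + " endPosition=" + spans.end());
        }
      }
    }
  }
}
{code}

Each reported (doc, start, end) position range is what a window builder would then turn into the "words before / words after" context described in the issue.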
[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218436#comment-14218436 ] Tim Allison commented on LUCENE-5317: - Switching to Chrome was the answer, apparently: [link|https://reviews.apache.org/r/28247/] [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6752) Buffer Cache allocate/lost metrics should be exposed
[ https://issues.apache.org/jira/browse/SOLR-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated SOLR-6752: Summary: Buffer Cache allocate/lost metrics should be exposed (was: Buffer Cache allocate/lost should be exposed through JMX) Buffer Cache allocate/lost metrics should be exposed Key: SOLR-6752 URL: https://issues.apache.org/jira/browse/SOLR-6752 Project: Solr Issue Type: Bug Reporter: Mike Drob Assignee: Mark Miller Labels: metrics Attachments: SOLR-6752.patch Currently, {{o.a.s.store.blockcache.Metrics}} has fields for tracking buffer allocations and losses, but they are never updated nor exposed to a receiving metrics system. We should do both. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6766) Switch o.a.s.store.blockcache.Metrics to use JMX
Mike Drob created SOLR-6766: --- Summary: Switch o.a.s.store.blockcache.Metrics to use JMX Key: SOLR-6766 URL: https://issues.apache.org/jira/browse/SOLR-6766 Project: Solr Issue Type: Bug Reporter: Mike Drob The Metrics class currently reports to hadoop metrics, but it would be better to report to JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6752) Buffer Cache allocate/lost metrics should be exposed
[ https://issues.apache.org/jira/browse/SOLR-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218449#comment-14218449 ] Mike Drob commented on SOLR-6752: - I will split this issue into this and SOLR-6766 to reduce confusion. Buffer Cache allocate/lost metrics should be exposed Key: SOLR-6752 URL: https://issues.apache.org/jira/browse/SOLR-6752 Project: Solr Issue Type: Bug Reporter: Mike Drob Assignee: Mark Miller Labels: metrics Attachments: SOLR-6752.patch Currently, {{o.a.s.store.blockcache.Metrics}} has fields for tracking buffer allocations and losses, but they are never updated nor exposed to a receiving metrics system. We should do both. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
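For readers unfamiliar with what "exposing" such counters involves, below is a minimal, generic sketch of publishing two counters over JMX with the standard javax.management API. It is not Solr's o.a.s.store.blockcache.Metrics class; the class, interface, and ObjectName are invented for illustration, and SOLR-6766 tracks the actual reporting change.

{code:java}
// File: BufferCacheMetricsMBean.java -- standard MBean interface (implementation class name + "MBean").
public interface BufferCacheMetricsMBean {
  long getBufferAllocations();
  long getBuffersLost();
}

// File: BufferCacheMetrics.java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class BufferCacheMetrics implements BufferCacheMetricsMBean {
  private final AtomicLong allocations = new AtomicLong();
  private final AtomicLong lost = new AtomicLong();

  // The block cache would call these as buffers are handed out and dropped.
  public void incrementAllocations() { allocations.incrementAndGet(); }
  public void incrementLost() { lost.incrementAndGet(); }

  @Override public long getBufferAllocations() { return allocations.get(); }
  @Override public long getBuffersLost() { return lost.get(); }

  public static void main(String[] args) throws Exception {
    BufferCacheMetrics metrics = new BufferCacheMetrics();
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // The ObjectName is made up; a real integration would follow Solr's JMX naming scheme.
    server.registerMBean(metrics, new ObjectName("example.blockcache:type=BufferCacheMetrics"));
    metrics.incrementAllocations();
    metrics.incrementLost();
    Thread.sleep(60000L); // keep the JVM alive long enough to inspect the bean with jconsole
  }
}
{code}

Once registered, the two attributes show up under the chosen ObjectName in jconsole or any other JMX client.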
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.9.0-ea-b34) - Build # 11636 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/11636/ Java: 64bit/jdk1.9.0-ea-b34 -XX:+UseCompressedOops -XX:+UseSerialGC (asserts: false) 9 tests failed. REGRESSION: org.apache.solr.TestDistributedMissingSort.testDistribSearch Error Message: Expected mime type application/octet-stream but got text/html. html head meta http-equiv=Content-Type content=text/html;charset=ISO-8859-1/ titleError 500 {msg=SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3',trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:767) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:294) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:202) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:137) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.SolrCore.lt;initgt;(SolrCore.java:896) at org.apache.solr.core.SolrCore.lt;initgt;(SolrCore.java:653) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:510) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:274) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:268) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.util.LuceneTestCase$3' at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:535) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:519) at org.apache.solr.update.SolrIndexConfig.buildMergeScheduler(SolrIndexConfig.java:305) at org.apache.solr.update.SolrIndexConfig.toIndexWriterConfig(SolrIndexConfig.java:230) at org.apache.solr.update.SolrIndexWriter.lt;initgt;(SolrIndexWriter.java:78) at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:530) at org.apache.solr.core.SolrCore.lt;initgt;(SolrCore.java:797) ... 8 more Caused by: java.lang.IllegalAccessException: Class org.apache.solr.core.SolrResourceLoader can not access a member of class org.apache.lucene.util.LuceneTestCase$3 with modifiers at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:101) at java.lang.Class.newInstance(Class.java:433) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:531) ... 15 more
[jira] [Comment Edited] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218436#comment-14218436 ] Tim Allison edited comment on LUCENE-5317 at 11/19/14 8:27 PM: --- Switching to Chrome was the answer, apparently: [link|https://reviews.apache.org/r/28247/] Thank you for the tip on svn patch...I just upgraded my Linux svn to something more appropriate for this decade, and now patch is available. :) was (Author: talli...@mitre.org): Switching to Chrome was the answer, apparently: [link|https://reviews.apache.org/r/28247/] [PATCH] Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Fix For: 4.9 Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: solr magento e-commerce plugin/extension
I have only two resources on this (both for Magento): *) http://www.magentocommerce.com/magento-connect/solr-bridge-search.html *) http://inchoo.net/ecommerce/install-apache-solr/ (has links in the comments) Regards, Alex Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 19 November 2014 12:42, Anurag Sharma anura...@gmail.com wrote: I am interested in exploring the option of building a plugin/extension for open-source e-commerce frameworks like Magento, OpenCart, etc. Such a plugin could give the open-source e-commerce community an easy on-ramp to Solr. Please let me know if there has been any discussion or work in this area. Thanks Anurag - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Solr with LAMP/ XAMPP/ WAMP/MAMP
On 19 November 2014 13:21, Shawn Heisey apa...@elyograg.org wrote: I *do* think it might be a good idea for us to write and maintain supported Solr clients for languages beyond Java. I believe this has been announced as one of the focus items at Lucene/Solr Revolution, including a call for the maintainers of current client frameworks to consider donating their libraries. The first non-Java target, I think, was C, though I have no further details. I think it would be nice to have a mailing list specifically for the client-library maintainers, so they could all discuss the impact of things like a schema XML format change on their libraries. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5317) Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst updated LUCENE-5317: --- Fix Version/s: (was: 4.9) Summary: Concordance capability (was: [PATCH] Concordance capability) Concordance capability -- Key: LUCENE-5317 URL: https://issues.apache.org/jira/browse/LUCENE-5317 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the topn) in a subset of documents and see the words before and after. Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases. Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after). Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring. Many thanks to my colleague, Jason Robinson, for input on the design of this patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3774) /admin/mbean returning duplicate search handlers with names that map to their classes?
[ https://issues.apache.org/jira/browse/SOLR-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-3774. --- Resolution: Fixed Fix Version/s: Trunk 5.0 Assignee: Mark Miller /admin/mbean returning duplicate search handlers with names that map to their classes? -- Key: SOLR-3774 URL: https://issues.apache.org/jira/browse/SOLR-3774 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 5.0, Trunk Attachments: SOLR-3774.patch, SOLR-3774.patch Offshoot of SOLR-3232... bq. Along with some valid entries with names equal to the request handler names (/get search /browse) it also turned up one with the name org.apache.solr.handler.RealTimeGetHandler and another with the name org.apache.solr.handler.component.SearchHandler ...seems that we may have a bug with request handlers getting registered multiple times, once under their real name and once using their class? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5318) Co-occurrence counts from Concordance
[ https://issues.apache.org/jira/browse/LUCENE-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5318: Fix Version/s: (was: 4.9) Co-occurrence counts from Concordance - Key: LUCENE-5318 URL: https://issues.apache.org/jira/browse/LUCENE-5318 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 4.5 Reporter: Tim Allison Labels: patch Attachments: cooccur_v1.patch.gz This patch calculates co-occurrence statistics on search terms within a window of x tokens. This can help in synonym discovery and anywhere else co-occurrence stats have been used. The attached patch depends on LUCENE-5317. Again, many thanks to my colleague, Jason Robinson, for advice in developing this code and for his modifications to this code to make it more Solr-friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
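As a toy illustration of windowed co-occurrence counting (not code from the patch), the sketch below counts the terms that appear within +/- 3 tokens of a target term in a token sequence; the window size, tokenization, and sample text are arbitrary assumptions.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class WindowCooccurrence {
  /** Count terms appearing within +/- window tokens of any occurrence of target. */
  static Map<String, Integer> cooccur(String[] tokens, String target, int window) {
    Map<String, Integer> counts = new HashMap<>();
    for (int i = 0; i < tokens.length; i++) {
      if (!tokens[i].equals(target)) continue;
      int from = Math.max(0, i - window);
      int to = Math.min(tokens.length - 1, i + window);
      for (int j = from; j <= to; j++) {
        if (j == i) continue;                      // skip the target occurrence itself
        counts.merge(tokens[j], 1, Integer::sum);  // increment the co-occurrence count
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    String[] tokens = "jakarta apache lucene is an apache project".split(" ");
    // Counts (map iteration order may vary): jakarta=1, lucene=2, is=2, an=2, project=1
    System.out.println(cooccur(tokens, "apache", 3));
  }
}
{code}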
[jira] [Updated] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5205: Fix Version/s: (was: 4.9) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser --- Key: LUCENE-5205 URL: https://issues.apache.org/jira/browse/LUCENE-5205 Project: Lucene - Core Issue Type: Improvement Components: core/queryparser Reporter: Tim Allison Labels: patch Attachments: LUCENE-5205-cleanup-tests.patch, LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_improve_stop_word_handling.patch, LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt This parser extends QueryParserBase and includes functionality from: * Classic QueryParser: most of its syntax * SurroundQueryParser: recursive parsing for near and not clauses. * ComplexPhraseQueryParser: can handle near queries that include multiterms (wildcard, fuzzy, regex, prefix), * AnalyzingQueryParser: has an option to analyze multiterms. At a high level, there's a first pass BooleanQuery/field parser and then a span query parser handles all terminal nodes and phrases. Same as classic syntax: * term: test * fuzzy: roam~0.8, roam~2 * wildcard: te?t, test*, t*st * regex: /\[mb\]oat/ * phrase: jakarta apache * phrase with slop: jakarta apache~3 * default or clause: jakarta apache * grouping or clause: (jakarta apache) * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta * multiple fields: title:lucene author:hatcher Main additions in SpanQueryParser syntax vs. classic syntax: * Can require in order for phrases with slop with the \~ operator: jakarta apache\~3 * Can specify not near: fever bieber!\~3,10 :: find fever but not if bieber appears within 3 words before or 10 words after it. * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 lucene\]\~4 :: find jakarta within 3 words of apache, and that hit has to be within four words before lucene * Can also use \[\] for single level phrasal queries instead of as in: \[jakarta apache\] * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 :: find apache and then either lucene or solr within three words. * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two words of ap*che and that hit has to be within ten words of something like solr or that lucene regex. * Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2 * Can use negative only query: -jakarta :: Find all docs that don't contain jakarta * Can use an edit distance 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!). Trivial additions: * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, prefix =2) * Can specifiy Optimal String Alignment (OSA) vs Levenshtein for distance =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein) This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Any and all feedback is welcome. Thank you. 
Until this is added to the Lucene project, I've added a standalone lucene-addons repo (with jars compiled for the latest stable build of Lucene) on [github|https://github.com/tballison/lucene-addons]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
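For readers less familiar with span queries, the sketch below shows roughly what two of the syntax examples above correspond to in terms of hand-built stock Lucene span queries. It is an approximation for illustration, not the parser's actual output, and the field name is an assumption.

{code:java}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class SpanSyntaxSketch {
  public static void main(String[] args) {
    // "jakarta apache"~3 with the in-order tilde: jakarta within 3 positions of apache, order required.
    SpanQuery jakartaApache = new SpanNearQuery(
        new SpanQuery[] {
            new SpanTermQuery(new Term("body", "jakarta")),
            new SpanTermQuery(new Term("body", "apache"))
        },
        3,      // slop
        true);  // inOrder

    // fever bieber!~3,10 : "fever", but not if "bieber" appears within
    // 3 positions before or 10 positions after it.
    SpanQuery feverNotNearBieber = new SpanNotQuery(
        new SpanTermQuery(new Term("body", "fever")),
        new SpanTermQuery(new Term("body", "bieber")),
        3,    // pre
        10);  // post

    System.out.println(jakartaApache);
    System.out.println(feverNotNearBieber);
  }
}
{code}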
[jira] [Updated] (LUCENE-5503) Trivial fixes to WeightedSpanTermExtractor
[ https://issues.apache.org/jira/browse/LUCENE-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated LUCENE-5503: Fix Version/s: (was: 4.9) Trivial fixes to WeightedSpanTermExtractor -- Key: LUCENE-5503 URL: https://issues.apache.org/jira/browse/LUCENE-5503 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Affects Versions: 4.7 Reporter: Tim Allison Priority: Minor Attachments: LUCENE-5503.patch The conversion of PhraseQuery to SpanNearQuery miscalculates the slop if there are stop words in some cases. The issue only really appears if there is more than one intervening run of stop words, e.g. "ab the cd the the ef". I also noticed that the inOrder determination is based on the newly calculated slop; it should probably be based on the original phraseQuery.getSlop(). Patch and unit tests are on the way. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
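To make the slop arithmetic concrete, below is a small sketch (not the patch itself) of converting a PhraseQuery whose stop words were removed at analysis time, leaving position gaps, into a SpanNearQuery. With "ab the cd the the ef" the indexed phrase positions are ab@0, cd@2, ef@5, so the span slop has to cover the phrase's own slop plus the three skipped positions; phrase slop and span slop are not identical measures, so this is only an approximation, and the field name is an assumption.

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class PhraseToSpanSlopSketch {
  public static void main(String[] args) {
    // "ab the cd the the ef" with "the" removed as a stop word leaves position gaps:
    // ab@0, cd@2, ef@5.
    PhraseQuery phrase = new PhraseQuery();
    phrase.add(new Term("f", "ab"), 0);
    phrase.add(new Term("f", "cd"), 2);
    phrase.add(new Term("f", "ef"), 5);

    Term[] terms = phrase.getTerms();
    int[] positions = phrase.getPositions();

    // Total positions skipped by stop words: (2-0-1) + (5-2-1) = 1 + 2 = 3.
    int skipped = 0;
    for (int i = 1; i < positions.length; i++) {
      skipped += positions[i] - positions[i - 1] - 1;
    }

    List<SpanQuery> clauses = new ArrayList<>();
    for (Term t : terms) {
      clauses.add(new SpanTermQuery(t));
    }

    // The report suggests basing inOrder on the original phrase slop, not the recalculated one.
    boolean inOrder = phrase.getSlop() == 0;
    // The span slop must account for the stop-word gaps on top of the phrase's own slop.
    int spanSlop = phrase.getSlop() + skipped;
    SpanNearQuery span =
        new SpanNearQuery(clauses.toArray(new SpanQuery[0]), spanSlop, inOrder);
    System.out.println(span + "  (slop=" + spanSlop + ")");
  }
}
{code}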
[jira] [Resolved] (LUCENE-5496) Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends
[ https://issues.apache.org/jira/browse/LUCENE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved LUCENE-5496. - Resolution: Later Given the drift of trunk from the original patch, this is probably better started from scratch when there is interest. Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends - Key: LUCENE-5496 URL: https://issues.apache.org/jira/browse/LUCENE-5496 Project: Lucene - Core Issue Type: Task Components: core/queryparser, core/search Affects Versions: 4.8, Trunk Reporter: Tim Allison Priority: Minor Attachments: LUCENE-5496-lucene_core_sandbox_v1.patch, LUCENE-5496_4x_deprecations.patch As we get closer to 5.0, I propose adding some deprecations in the queryparsers realm of 4.x. Are we ready to get rid of all fuzzyMinSims in trunk? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org