[jira] [Updated] (SOLR-9518) Kerberos Delegation Tokens doesn't work without a chrooted ZK
[ https://issues.apache.org/jira/browse/SOLR-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ishan Chattopadhyaya updated SOLR-9518:
---------------------------------------
    Description:

Starting up Solr 6.2.0 (with delegation tokens enabled) that doesn't have a chrooted ZK, I see the following in the startup logs:
{code}
2016-09-15 07:08:22.453 ERROR (main) [   ] o.a.s.s.SolrDispatchFilter Could not start Solr. Check solr/home property and the logs
2016-09-15 07:08:22.477 ERROR (main) [   ] o.a.s.c.SolrCore null:java.lang.StringIndexOutOfBoundsException: String index out of range: -1
	at java.lang.String.substring(String.java:1927)
	at org.apache.solr.security.KerberosPlugin.init(KerberosPlugin.java:138)
	at org.apache.solr.core.CoreContainer.initializeAuthenticationPlugin(CoreContainer.java:316)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:442)
	at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:158)
	at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:134)
	at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:137)
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:856)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:348)
	at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1379)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1341)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:772)
	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:517)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:41)
	at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)
	at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:499)
	at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:147)
	at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:180)
	at org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:458)
	at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:64)
	at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:610)
	at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:529)
{code}
To me, it seems that adding a check for the presence of a chrooted ZK, and calculating the relative ZK path only if it exists, should suffice. I'll add a patch for this shortly.
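The substring(-1) failure suggests KerberosPlugin derives the delegation-token path from the chroot portion of the ZK connect string without checking that a chroot is present. A minimal sketch of the kind of guard the proposed patch would add (the helper name and exact behavior are assumptions for illustration, not the actual Solr code):

```java
public class ZkChrootCheck {
    // Hypothetical helper: compute the token path relative to the ZK chroot,
    // but only when the connect string actually contains one.
    static String relativeChrootPath(String zkHost, String path) {
        int chrootStart = zkHost.indexOf('/');
        if (chrootStart == -1) {
            // No chroot present: calling substring(-1) here is exactly what
            // throws the StringIndexOutOfBoundsException in the report above.
            return path;
        }
        String chroot = zkHost.substring(chrootStart);
        return chroot + path;
    }

    public static void main(String[] args) {
        // Without a chroot the path is used as-is; with one it is prefixed.
        System.out.println(relativeChrootPath("zk1:2181,zk2:2181", "/security/token"));
        System.out.println(relativeChrootPath("zk1:2181/solr", "/security/token"));
    }
}
```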
[jira] [Commented] (SOLR-9518) Kerberos Delegation Tokens doesn't work without a chrooted ZK
[ https://issues.apache.org/jira/browse/SOLR-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522053#comment-15522053 ]

Ishan Chattopadhyaya commented on SOLR-9518:
--------------------------------------------
[~noble.paul], can you please review?

> Kerberos Delegation Tokens doesn't work without a chrooted ZK
> -------------------------------------------------------------
>
>                 Key: SOLR-9518
>                 URL: https://issues.apache.org/jira/browse/SOLR-9518
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>         Attachments: SOLR-9518.patch, SOLR-9518.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7460) Should SortedNumericDocValues expose a per-document random-access API?
[ https://issues.apache.org/jira/browse/LUCENE-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522041#comment-15522041 ]

Adrien Grand commented on LUCENE-7460:
--------------------------------------
Sorted numerics are a bit hard for me to reason about since I am not very clear about the use-cases, but I guess that in some cases one would want to use the minimum value when sorting in ascending order and the max value when sorting in descending order, so having fast access to the maximum value too feels like an important feature. Of course users can index the min/max values directly, but I think there is also some value in flexibility, e.g. we do not require users to index edge n-grams to run prefix queries. That said, I do not feel too strongly about it and mostly wanted to give some visibility to this change of our doc values API and discuss it. If you feel strongly about keeping the iterator API, I'm good with it.

> Should SortedNumericDocValues expose a per-document random-access API?
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-7460
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7460
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Sorted numerics used to expose a per-document random-access API so that
> accessing the median or max element would be cheap. The new
> SortedNumericDocValues still exposes the number of values a document has, but
> the only way to read values is to use {{nextValue}}, which forces reading all
> values in order to reach the max value.
> For instance, {{SortedNumericSelector.MAX}} does the following in master (the
> important part is the for-loop):
> {code}
> private void setValue() throws IOException {
>   int count = in.docValueCount();
>   for (int i = 0; i < count; i++) {
>     value = in.nextValue();
>   }
> }
>
> @Override
> public int nextDoc() throws IOException {
>   int docID = in.nextDoc();
>   if (docID != NO_MORE_DOCS) {
>     setValue();
>   }
>   return docID;
> }
> {code}
> while it used to simply look up the value at index {{count-1}} in 6.x:
> {code}
> @Override
> public long get(int docID) {
>   in.setDocument(docID);
>   final int count = in.count();
>   if (count == 0) {
>     return 0; // missing
>   } else {
>     return in.valueAt(count-1);
>   }
> }
> {code}
> This could be a conscious decision since a sequential API gives more
> opportunities to the codec to compress efficiently, but on the other hand
> this API prevents sorting by max or median values from being efficient.
> On my end I have a preference for the random-access API.
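To make the cost difference concrete, here is a small self-contained sketch with plain arrays standing in for one document's sorted values (an illustration only, not the real DocValues API): the iterator style must consume every value to reach the maximum, while the random-access style reads exactly one.

```java
public class SortedNumericAccess {
    // Iterator-style: touch every value to learn the max (last) one, mirroring
    // the nextValue() loop in SortedNumericSelector.MAX quoted above.
    static long maxViaIterator(long[] sortedValues) {
        long value = 0; // 0 plays the "missing" role, as in the 6.x code
        for (int i = 0; i < sortedValues.length; i++) {
            value = sortedValues[i]; // stands in for in.nextValue()
        }
        return value;
    }

    // Random-access style: jump straight to index count-1, as 6.x did.
    static long maxViaRandomAccess(long[] sortedValues) {
        int count = sortedValues.length;
        return count == 0 ? 0 : sortedValues[count - 1];
    }

    public static void main(String[] args) {
        long[] vals = {1L, 3L, 9L};
        System.out.println(maxViaIterator(vals) + " " + maxViaRandomAccess(vals));
    }
}
```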
[JENKINS] Lucene-Solr-6.x-Solaris (64bit/jdk1.8.0) - Build # 412 - Still Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/412/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseSerialGC 3 tests failed. FAILED: org.apache.solr.util.TestSolrCLIRunExample.testInteractiveSolrCloudExample Error Message: After running Solr cloud example, test collection 'testCloudExamplePrompt' not found in Solr at: http://localhost:47541/solr; tool output: Welcome to the SolrCloud example! This interactive session will help you launch a SolrCloud cluster on your local workstation. To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]: Ok, let's start up 1 Solr nodes for your example SolrCloud cluster. Please enter the port for node1 [8983]: Oops! Looks like port 47541 is already being used by another process. Please choose a different port. Please enter a port for node 1 [8983]: Creating Solr home directory /export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/build/solr-core/test/J1/temp/solr.util.TestSolrCLIRunExample_3D70E669AD8FD1F3-001/tempDir-002/cloud/node1/solr Starting up Solr on port 2 using command: /export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/bin/solr start -cloud -p 2 -s "temp/solr.util.TestSolrCLIRunExample_3D70E669AD8FD1F3-001/tempDir-002/cloud/node1/solr" Stack Trace: java.lang.AssertionError: After running Solr cloud example, test collection 'testCloudExamplePrompt' not found in Solr at: http://localhost:47541/solr; tool output: Welcome to the SolrCloud example! This interactive session will help you launch a SolrCloud cluster on your local workstation. To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]: Ok, let's start up 1 Solr nodes for your example SolrCloud cluster. Please enter the port for node1 [8983]: Oops! Looks like port 47541 is already being used by another process. Please choose a different port. 
Please enter a port for node 1 [8983]: Creating Solr home directory /export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/build/solr-core/test/J1/temp/solr.util.TestSolrCLIRunExample_3D70E669AD8FD1F3-001/tempDir-002/cloud/node1/solr Starting up Solr on port 2 using command: /export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/bin/solr start -cloud -p 2 -s "temp/solr.util.TestSolrCLIRunExample_3D70E669AD8FD1F3-001/tempDir-002/cloud/node1/solr" at __randomizedtesting.SeedInfo.seed([3D70E669AD8FD1F3:E60106A39AFA1495]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.util.TestSolrCLIRunExample.testInteractiveSolrCloudExample(TestSolrCLIRunExample.java:434) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Created] (SOLR-9560) Solr should check max open files and other ulimits and refuse to start if they are set too low
Shalin Shekhar Mangar created SOLR-9560:
-------------------------------------------

             Summary: Solr should check max open files and other ulimits and refuse to start if they are set too low
                 Key: SOLR-9560
                 URL: https://issues.apache.org/jira/browse/SOLR-9560
             Project: Solr
          Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Shalin Shekhar Mangar
             Fix For: 6.3, master (7.0)

Solr should check max open files and other ulimits and refuse to start if they are set too low. Specifically:
# max open files should be at least 32768
# max memory size and virtual memory should both be unlimited
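A minimal sketch of how such a startup check could look from inside the JVM, assuming a Unix JVM that exposes the com.sun.management extension; the threshold and message are illustrative, not actual Solr code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class UlimitCheck {
    static final long MIN_OPEN_FILES = 32768;

    // Pure threshold check, kept separate so the policy is testable.
    static boolean openFilesLimitOk(long maxOpenFiles) {
        return maxOpenFiles >= MIN_OPEN_FILES;
    }

    // Reads the descriptor limit via the com.sun.management extension where
    // the JVM provides it (Unix); returns -1 when it cannot be determined.
    static long currentMaxOpenFiles() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        long max = currentMaxOpenFiles();
        if (max != -1 && !openFilesLimitOk(max)) {
            // A real implementation would abort startup here.
            System.err.println("max open files " + max + " is below " + MIN_OPEN_FILES);
        }
    }
}
```

Max memory size and virtual memory are not exposed through the standard MXBeans, so checking them would likely need `ulimit` output or /proc, which is platform-specific.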
Re: Solr configuration format fracturing
Did you know about configoverlay.json?

+1 to the discussion. Additional fuel to the fire is that the /config endpoint will return solrconfig.xml + overlay.json merged, but not params.json. Confusing. Additionally, /config output is JSON, but not one that can round-trip AFAIK.

Regards,
   Alex

On 26 Sep 2016 12:42 AM, "Shawn Heisey" wrote:
> There seems to be some fracturing in the format of various Solr
> configs.  Most of the config uses XML, but some new features in the last
> few years are using JSON, particularly where SolrCloud and Zookeeper are
> concerned.  When notifications about SOLR-9557 came through, it revealed
> that there is a config file sitting next to solrconfig.xml named
> "params.json" that Solr will use.  I wasn't aware of this until reading
> that issue.
>
> This leads me to suggest something rather drastic for 7.0: Consolidate
> all configuration formats and agree to consistent format usage unless
> there is another major discussion and agreement to change formats.
>
> I did consider starting this discussion in Jira, but it's fairly major,
> so the dev list seemed like the right place to start.
>
> Comments from some new users have come my way along the lines of "XML is
> so 90's ... get with the times!"  Image problems like that can be fatal
> to a software project, even if there's no technical problem.
>
> The likely winner in the format discussion is pure unmodified JSON, but
> I'm not going to make any assumptions.  SOLR-8029 has some format
> discussions that may be relevant here.
>
> IMHO, in order to make the idea successful, Solr 7.0 will need to
> automatically convert most configs on startup from the old format to the
> new format without user intervention.  If there's something that we find
> we can't convert automatically, that should result in a failure to
> start, with a helpful message so the user has some idea what they need
> to do.
>
> Thoughts?  Is this too scary to contemplate?  Should I open an umbrella
> issue in Jira to get the ball rolling?
>
> Thanks,
> Shawn
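The "convert automatically on startup" idea in the quoted mail can be illustrated with a naive sketch that maps a one-level XML config element to JSON. This is purely illustrative: real solrconfig.xml has typed elements (<int>, <str>, <lst>), attributes, and deep nesting that would need proper mapping rules.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XmlToJsonSketch {
    // Convert a flat XML element like <query><maxBooleanClauses>1024
    // </maxBooleanClauses></query> into a JSON object of string values.
    static String convert(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            StringBuilder json = new StringBuilder("{\"").append(root.getTagName()).append("\":{");
            NodeList children = root.getChildNodes();
            boolean first = true;
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace text
                if (!first) json.append(',');
                first = false;
                json.append('"').append(n.getNodeName()).append("\":\"")
                    .append(n.getTextContent().trim()).append('"');
            }
            return json.append("}}").toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<query><maxBooleanClauses>1024</maxBooleanClauses></query>"));
    }
}
```

A real converter would also have to decide what to do with XML features JSON cannot express directly (attributes, comments, repeated element names), which is where the "fail to start with a helpful message" fallback would come in.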
[jira] [Assigned] (SOLR-9411) Better validation for Schema API
[ https://issues.apache.org/jira/browse/SOLR-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl reassigned SOLR-9411:
---------------------------------
    Assignee: Jan Høydahl

> Better validation for Schema API
> --------------------------------
>
>                 Key: SOLR-9411
>                 URL: https://issues.apache.org/jira/browse/SOLR-9411
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>         Attachments: SOLR-9411.patch
>
> The Schema REST API needs better validation before applying changes:
> * It should not be allowed to delete the uniqueKey field (also handled in SOLR-9349)
> * When adding a dynamic field, the API should check that its name begins or ends with
> {{*}}. Today the change succeeds, but you get errors later.
> These are two known cases. We should harden validation across the board for
> all known schema requirements.
[jira] [Updated] (SOLR-9411) Better validation for Schema API
[ https://issues.apache.org/jira/browse/SOLR-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated SOLR-9411:
------------------------------
    Attachment: SOLR-9411.patch

This patch fixes a bug in {{add-dynamic-field}}, where the field got created using {{SchemaField.create()}} instead of {{managedIndexSchema.newDynamicField()}}.
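The dynamic-field rule described in the issue is simple enough to validate up front. A hedged sketch of such a check, using a hypothetical helper rather than the actual ManagedIndexSchema code:

```java
public class DynamicFieldValidation {
    // The issue's stated rule: a dynamic field name must begin or end
    // with '*'; anything else should be rejected before the change is
    // applied instead of failing later.
    static boolean isValidDynamicFieldName(String name) {
        return name != null && !name.isEmpty()
                && (name.startsWith("*") || name.endsWith("*"));
    }

    public static void main(String[] args) {
        System.out.println(isValidDynamicFieldName("*_txt"));   // valid suffix pattern
        System.out.println(isValidDynamicFieldName("attr_*"));  // valid prefix pattern
        System.out.println(isValidDynamicFieldName("plain"));   // should be rejected
    }
}
```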
[jira] [Updated] (LUCENE-7465) Add a PatternTokenizer that uses Lucene's RegExp implementation
[ https://issues.apache.org/jira/browse/LUCENE-7465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-7465:
---------------------------------------
    Attachment: LUCENE-7465.patch

Another iteration, adding {{SimplePatternSplitTokenizer}}. It's surprisingly different from the non-split case, and sort of complex :) But it does pass its tests. I haven't compared performance to {{PatternTokenizer}} with group -1 yet. I'll see if I can simplify it, but I think this is otherwise close.

> Add a PatternTokenizer that uses Lucene's RegExp implementation
> ---------------------------------------------------------------
>
>                 Key: LUCENE-7465
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7465
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (7.0), 6.3
>
>         Attachments: LUCENE-7465.patch, LUCENE-7465.patch
>
> I think there are some nice benefits to a version of PatternTokenizer that
> uses Lucene's RegExp impl instead of the JDK's:
> * Lucene's RegExp is compiled to a DFA up front, so if a "too hard" RegExp
> is attempted the user discovers it up front instead of later on when a
> "lucky" document arrives
> * It processes the incoming characters as a stream, only pulling 128
> characters at a time, vs the existing {{PatternTokenizer}} which currently
> reads the entire string up front (this has caused heap problems in the past)
> * It should be fast.
> I named it {{SimplePatternTokenizer}}, and it still needs a factory and
> improved tests, but I think it's otherwise close.
> It currently does not take a {{group}} parameter because Lucene's RegExps
> don't yet implement sub-group capture. I think we could add that at some
> point, but it's a bit tricky.
> This doesn't even have group=-1 support (like String.split) ... I think if we
> did that we should maybe name it differently
> ({{SimplePatternSplitTokenizer}}?).
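The streaming point above (pull 128 characters at a time instead of materializing the whole input) can be sketched in plain Java. Here Character.isLetter stands in for the compiled DFA's accept step, so this illustrates only the buffering strategy, not the tokenizer itself:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamingTokenizeSketch {
    // Emit runs of letters as tokens while reading the input through a
    // fixed 128-char buffer, never holding the whole string in memory.
    static List<String> tokenize(Reader in) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char[] buffer = new char[128]; // fixed-size pull, as described above
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (Character.isLetter(buffer[i])) { // stand-in for DFA accept
                        current.append(buffer[i]);
                    } else if (current.length() > 0) {
                        tokens.add(current.toString());
                        current.setLength(0);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(new StringReader("foo bar,baz")));
    }
}
```

Tokens that happen to straddle a buffer boundary survive because the partial token is carried in `current` across reads, which is the property a streaming tokenizer has to preserve.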
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_102) - Build # 17901 - Failure!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/17901/ Java: 64bit/jdk1.8.0_102 -XX:+UseCompressedOops -XX:+UseG1GC 2 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.client.solrj.TestLBHttpSolrClient Error Message: 1 thread leaked from SUITE scope at org.apache.solr.client.solrj.TestLBHttpSolrClient: 1) Thread[id=2513, name=Connection evictor, state=TIMED_WAITING, group=TGRP-TestLBHttpSolrClient] at java.lang.Thread.sleep(Native Method) at org.apache.http.impl.client.IdleConnectionEvictor$1.run(IdleConnectionEvictor.java:66) at java.lang.Thread.run(Thread.java:745) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.client.solrj.TestLBHttpSolrClient: 1) Thread[id=2513, name=Connection evictor, state=TIMED_WAITING, group=TGRP-TestLBHttpSolrClient] at java.lang.Thread.sleep(Native Method) at org.apache.http.impl.client.IdleConnectionEvictor$1.run(IdleConnectionEvictor.java:66) at java.lang.Thread.run(Thread.java:745) at __randomizedtesting.SeedInfo.seed([F56801609D417348]:0) FAILED: org.apache.solr.client.solrj.TestLBHttpSolrClient.testReliability Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space Build Log: [...truncated 13226 lines...] 
[junit4] Suite: org.apache.solr.client.solrj.TestLBHttpSolrClient [junit4] 2> Creating dataDir: /home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-solrj/test/J0/temp/solr.client.solrj.TestLBHttpSolrClient_F56801609D417348-001/init-core-data-001 [junit4] 2> 95449 INFO (SUITE-TestLBHttpSolrClient-seed#[F56801609D417348]-worker) [] o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (true) via: @org.apache.solr.util.RandomizeSSL(reason=, value=NaN, ssl=NaN, clientAuth=NaN) [junit4] 2> 95452 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.a.s.SolrTestCaseJ4 ###Starting testTwoServers [junit4] 2> 95457 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.e.j.s.Server jetty-9.3.8.v20160314 [junit4] 2> 95458 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.e.j.s.h.ContextHandler Started o.e.j.s.ServletContextHandler@3c3ea710{/solr,null,AVAILABLE} [junit4] 2> 95460 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.e.j.s.ServerConnector Started ServerConnector@efa6c9c{SSL,[ssl, http/1.1]}{127.0.0.1:39244} [junit4] 2> 95460 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.e.j.s.Server Started @97415ms [junit4] 2> 95460 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.a.s.c.s.e.JettySolrRunner Jetty properties: {solr.data.dir=/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-solrj/test/J0/temp/solr.client.solrj.TestLBHttpSolrClient_F56801609D417348-001/instance-0-001/collection1/data, solrconfig=bad_solrconfig.xml, hostContext=/solr, hostPort=39244} [junit4] 2> 95460 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.a.s.c.SolrXmlConfig Loading container configuration from /home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-solrj/test/J0/temp/solr.client.solrj.TestLBHttpSolrClient_F56801609D417348-001/instance-0-001/solr.xml [junit4] 2> 95467 
INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.a.s.c.CorePropertiesLocator Found 1 core definitions underneath /home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-solrj/test/J0/temp/solr.client.solrj.TestLBHttpSolrClient_F56801609D417348-001/instance-0-001/. [junit4] 2> 95467 INFO (TEST-TestLBHttpSolrClient.testTwoServers-seed#[F56801609D417348]) [] o.a.s.c.CorePropertiesLocator Cores are: [collection1] [junit4] 2> 95474 INFO (coreLoadExecutor-214-thread-1) [] o.a.s.c.SolrConfig Using Lucene MatchVersion: 7.0.0 [junit4] 2> 95478 INFO (coreLoadExecutor-214-thread-1) [] o.a.s.c.SolrConfig Loaded SolrConfig: solrconfig.xml [junit4] 2> 95480 INFO (coreLoadExecutor-214-thread-1) [] o.a.s.s.IndexSchema [collection1] Schema name=test [junit4] 2> 95482 INFO (coreLoadExecutor-214-thread-1) [] o.a.s.s.IndexSchema [collection1] unique key field: id [junit4] 2> 95483 INFO (coreLoadExecutor-214-thread-1) [] o.a.s.c.CoreContainer Creating SolrCore 'collection1' using configuration from instancedir /home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-solrj/test/J0/temp/solr.client.solrj.TestLBHttpSolrClient_F56801609D417348-001/instance-0-001/./collection1 [junit4] 2> 95484 INFO (coreLoadExecutor-214-thread-1) [ x:collection1] o.a.s.c.SolrCore [[collection1] ] Opening new SolrCore at
[jira] [Comment Edited] (LUCENE-7398) Nested Span Queries are buggy
[ https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15519320#comment-15519320 ]

Paul Elschot edited comment on LUCENE-7398 at 9/25/16 10:00 PM:
----------------------------------------------------------------
Patch of 24 Sep 2016, work in progress. Edit: superseded on 25 Sep, this can be ignored.

This introduces SpanNearQuery.MatchNear to choose the matching method.

The ORDERED_LAZY case is still the patch of 14 August; this should be changed back to the current implementation and be used to implement ORDERED_LOOKAHEAD.

This implements MatchNear.UNORDERED_STARTPOS and uses that as the default implementation for the unordered case. The implementation of UNORDERED_STARTPOS is in NearSpansUnorderedStartPos, which is simpler than the current NearSpansUnordered; there is no SpansCell. I'd expect this StartPos implementation to be a little faster, so I also implemented it as the default for the unordered case.

In only one test case is the UNORDERED_LAZY method needed to pass the test. The question is whether it is ok to change the default unordered implementation to only use the span start positions.

The collect() method is moved to the superclass ConjunctionSpans; this simplification might be done at another issue.

> Nested Span Queries are buggy
> -----------------------------
>
>                 Key: LUCENE-7398
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7398
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.5, 6.x
>            Reporter: Christoph Goller
>            Assignee: Alan Woodward
>            Priority: Critical
>         Attachments: LUCENE-7398-20160814.patch, LUCENE-7398-20160924.patch,
> LUCENE-7398-20160925.patch, LUCENE-7398.patch, LUCENE-7398.patch,
> TestSpanCollection.java
>
> Example of a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene
> mapping research worldwide.
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping],
> 0, true), body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as
> "coordinate gene research". It does not match "coordinate gene mapping
> research" with Lucene 5.5 or 6.1; it did however match with Lucene 4.10.4. It
> probably stopped working with the changes on SpanQueries in 5.3. I will
> attach a unit test that shows the problem.
[jira] [Updated] (LUCENE-7398) Nested Span Queries are buggy
[ https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Elschot updated LUCENE-7398: - Attachment: LUCENE-7398-20160925.patch Patch of 25 Sep 2016. Compared to the previous patch, this removes the ORDERED_STARTPOS case, because I don't know whether that is needed. Also this restores backward compatibility. Compared to master, this has: Four MatchNear methods, two are the current ones, they are called ORDERED_LAZY and UNORDERED_LAZY, and these are used when the current builder and constructors use a boolean ordered argument. The third case is ORDERED_LOOKAHEAD, which is from the patch of 18 August. The last case is UNORDERED_STARTPOS, which is simpler than UNORDERED_LAZY, hopefully a little faster, and with better completeness of the result. Javadocs for all four cases have been added. All test cases from here have been added, and where necessary they have been modified to use ORDERED_LOOKAHEAD and to not do span collection. These tests pass. For the last case, UNORDERED_STARTPOS, no test cases have been added yet. This is still to be done. Does anyone have more difficult cases? Minor point: the collect() method was moved to the superclass ConjunctionSpans. Feedback welcome, especially on the javadocs of SpanNearQuery.MatchNear. Instead of adding backtracking methods, it might be better to do counting of input spans in a matching window. I'm hoping that the UNORDERED_STARTPOS case can be extended for that. Any ideas there? 
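The four matching methods Paul describes, and the backward-compatible mapping from the existing boolean ordered argument onto the two *_LAZY methods, could be sketched roughly as below. The enum constant names follow the patch description; the wrapper class and method names are illustrative only, not the actual patch code:

```java
// Illustrative sketch of the SpanNearQuery.MatchNear selector described in the
// patch of 25 Sep 2016. Constant names mirror the comment; the class and the
// fromOrdered helper are hypothetical stand-ins, not Lucene API.
public class MatchNearSketch {
    public enum MatchNear {
        ORDERED_LAZY,        // current ordered implementation
        ORDERED_LOOKAHEAD,   // ordered matching from the earlier patch
        UNORDERED_LAZY,      // current unordered implementation (uses SpansCell)
        UNORDERED_STARTPOS;  // simpler unordered matching on start positions only

        // Backward compatibility: the existing builder/constructors that take a
        // boolean `ordered` map onto the two *_LAZY methods, so old callers
        // keep their current behavior.
        public static MatchNear fromOrdered(boolean ordered) {
            return ordered ? ORDERED_LAZY : UNORDERED_LAZY;
        }
    }

    public static void main(String[] args) {
        System.out.println(MatchNear.fromOrdered(true));   // prints ORDERED_LAZY
        System.out.println(MatchNear.fromOrdered(false));  // prints UNORDERED_LAZY
    }
}
```

The point of the extra indirection is that new callers can opt into ORDERED_LOOKAHEAD or UNORDERED_STARTPOS explicitly while existing code is untouched.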
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and macros
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Description: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query and a *macro* to be performed with the topic. Macros will be run using Solr's built-in parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. was: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query and a *macro* to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. 
> Allow topic expression to store queries and macros > -- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. This ticket > will allow the topic to store the topic query and a *macro* to be performed > with the topic. > Macros will be run using Solr's built-in parameter substitution: > Sample syntax: > {code} > topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") > {code} > The query and macro will be stored with the topic. Topics can be retrieved > and executed as part of the larger macro using Solr's built in parameter > substitution. > {code} > http://localhost:8983/solr/collection1/stream?expr=update(classify(model, > ${topic}))=topic(collection1,) > {code} > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics and macros. > The parallel function can then be used to run the topics/macros in parallel > across a large number of workers.
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and macros
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Description: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query and a *macro* to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. was: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query as well as a macro to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. 
> Allow topic expression to store queries and macros > -- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. This ticket > will allow the topic to store the topic query and a *macro* to be performed > with the topic. > Macros will be run using Solr's builtin parameter substitution: > Sample syntax: > {code} > topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") > {code} > The query and macro will be stored with the topic. Topics can be retrieved > and executed as part of the larger macro using Solr's built in parameter > substitution. > {code} > http://localhost:8983/solr/collection1/stream?expr=update(classify(model, > ${topic}))=topic(collection1,) > {code} > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics and macros. > The parallel function can then be used to run the topics/macros in parallel > across a large number of workers.
[jira] [Commented] (SOLR-8241) Evaluate W-TinyLfu cache
[ https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521377#comment-15521377 ] Ben Manes commented on SOLR-8241: - I took a look to refresh myself on LFUCache and decay. I don't think there is an issue because TinyLFU has similar logic to age the frequencies asynchronously. It observes a sample of 10 * maximum size and then halves the counters. The difference is the counters are stored in an array, are 4-bit, and represent all items (not just those currently residing in the cache). This extended history and using frequency for admission (rather than eviction) is what allows the policy to have a superior hit rate and be amortized O(1). > Evaluate W-TinyLfu cache > > > Key: SOLR-8241 > URL: https://issues.apache.org/jira/browse/SOLR-8241 > Project: Solr > Issue Type: Wish > Components: search >Reporter: Ben Manes >Priority: Minor > Attachments: SOLR-8241.patch > > > SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). > The discussions seem to indicate that the higher hit rate (vs LRU) is offset > by the slower performance of the implementation. An original goal appeared to > be to introduce ARC, a patented algorithm that uses ghost entries to retain > history information. > My analysis of Window TinyLfu indicates that it may be a better option. It > uses a frequency sketch to compactly estimate an entry's popularity. It uses > LRU to capture recency and operate in O(1) time. When using available > academic traces the policy provides a near optimal hit rate regardless of the > workload. > I'm getting ready to release the policy in Caffeine, which Solr already has a > dependency on. But, the code is fairly straightforward and a port into Solr's > caches instead is a pragmatic alternative. More interesting is what the > impact would be in Solr's workloads and feedback on the policy's design. 
> https://github.com/ben-manes/caffeine/wiki/Efficiency
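The aging mechanism Ben describes, observe a sample of 10 * maximum size, then halve every counter, can be illustrated with a toy frequency table. A real implementation packs saturating 4-bit counters into a CountMin-style sketch for compactness; this sketch only mirrors the saturation and halving behavior, and all names here are hypothetical, not Caffeine or Solr API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of TinyLFU-style frequency aging: after observing
// sampleSize = 10 * maximumSize events, every counter is halved, so stale
// popularity decays over time. Counters saturate at 15, matching the 4-bit
// counters described in the comment. (Class and method names are illustrative.)
public class TinyLfuAgingSketch {
    private final Map<String, Integer> counters = new HashMap<>();
    private final int sampleSize;
    private int observed = 0;

    public TinyLfuAgingSketch(int maximumSize) {
        this.sampleSize = 10 * maximumSize;
    }

    public void record(String key) {
        counters.merge(key, 1, (a, b) -> Math.min(15, a + b)); // saturate at 4 bits
        if (++observed >= sampleSize) {
            reset();
        }
    }

    private void reset() {
        counters.replaceAll((k, v) -> v / 2); // halve every counter
        observed = 0;
    }

    public int frequency(String key) {
        return counters.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        TinyLfuAgingSketch sketch = new TinyLfuAgingSketch(2); // sampleSize = 20
        for (int i = 0; i < 19; i++) sketch.record("a");
        System.out.println(sketch.frequency("a")); // 15: saturated at the 4-bit maximum
        sketch.record("a");                        // 20th event triggers aging
        System.out.println(sketch.frequency("a")); // 7: counter halved
    }
}
```

Because history covers all observed keys rather than only resident entries, an admission policy can compare the frequency of a candidate against the would-be victim, which is where the hit-rate advantage comes from.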
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_102) - Build # 17900 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/17900/ Java: 64bit/jdk1.8.0_102 -XX:+UseCompressedOops -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.solr.handler.TestReplicationHandler.doTestStressReplication Error Message: timed out waiting for collection1 startAt time to exceed: Sun Sep 25 20:09:56 WEST 2016 Stack Trace: java.lang.AssertionError: timed out waiting for collection1 startAt time to exceed: Sun Sep 25 20:09:56 WEST 2016 at __randomizedtesting.SeedInfo.seed([1AB77B68D36CBF84:C11C7BAED644D637]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.handler.TestReplicationHandler.watchCoreStartAt(TestReplicationHandler.java:1508) at org.apache.solr.handler.TestReplicationHandler.doTestStressReplication(TestReplicationHandler.java:858) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 11895 lines...] [junit4] Suite:
[JENKINS] Lucene-Solr-6.x-Solaris (64bit/jdk1.8.0) - Build # 411 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/411/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.solr.schema.TestCloudSchemaless.test Error Message: QUERY FAILED: xpath=/response/arr[@name='fields']/lst/str[@name='name'][.='newTestFieldInt445'] request=/schema/fields?wt=xml response=
[jira] [Updated] (SOLR-9559) Add ExecutorStream to execute stored topics and macros
[ https://issues.apache.org/jira/browse/SOLR-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9559: - Description: The ExecutorStream will execute the stored topics and macros from SOLR-9387. The ExecutorStream can be pointed at a SolrCloud collection where the topics are stored and it will execute the topics and macros in batches. The ExecutorStream will support parallel execution of topics/macros as well. This will allow the workload to be spread across a cluster of worker nodes. was: The ExecutorStream will execute the stored topics and macros from SOLR-9387. The ExecutorStream can be pointed at a SolrCloud collection where the topics are stored and it will execute the topics and macros in batches. The ExecutorStream will support parallel execution of topics/macros as well. This will allow the workload to be spread across a cluster worker nodes. > Add ExecutorStream to execute stored topics and macros > -- > > Key: SOLR-9559 > URL: https://issues.apache.org/jira/browse/SOLR-9559 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > The ExecutorStream will execute the stored topics and macros from SOLR-9387. > The ExecutorStream can be pointed at a SolrCloud collection where the topics > are stored and it will execute the topics and macros in batches. > The ExecutorStream will support parallel execution of topics/macros as well. > This will allow the workload to be spread across a cluster of worker nodes.
[jira] [Updated] (SOLR-9559) Add ExecutorStream to execute stored topics and macros
[ https://issues.apache.org/jira/browse/SOLR-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9559: - Description: The ExecutorStream will execute the stored topics and macros from SOLR-9387. The ExecutorStream can be pointed at a SolrCloud collection where the topics are stored and it will execute the topics and macros in batches. The ExecutorStream will support parallel execution of topics/macros as well. This will allow the workload to be spread across a cluster worker nodes. was: The ExecutorStream will execute the stored topics and macros from SOLR-9387. The ExecutorStream can be pointed at a SolrCloud collection where the topics are stored and it will execute the topics and macros in batches. The ExecutorStream will support parallel execution of topics/macros as well. This will allow the workload to be spread over a large number of worker nodes. > Add ExecutorStream to execute stored topics and macros > -- > > Key: SOLR-9559 > URL: https://issues.apache.org/jira/browse/SOLR-9559 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > The ExecutorStream will execute the stored topics and macros from SOLR-9387. > The ExecutorStream can be pointed at a SolrCloud collection where the topics > are stored and it will execute the topics and macros in batches. > The ExecutorStream will support parallel execution of topics/macros as well. > This will allow the workload to be spread across a cluster worker nodes.
Solr configuration format fracturing
There seems to be some fracturing in the format of various Solr configs. Most of the config uses XML, but some new features in the last few years are using JSON, particularly where SolrCloud and Zookeeper are concerned. When notifications about SOLR-9557 came through, it revealed that there is a config file sitting next to solrconfig.xml named "params.json" that Solr will use. I wasn't aware of this until reading that issue.

This leads me to suggest something rather drastic for 7.0: Consolidate all configuration formats and agree to consistent format usage unless there is another major discussion and agreement to change formats. I did consider starting this discussion in Jira, but it's fairly major, so the dev list seemed like the right place to start.

Comments from some new users have come my way along the lines of "XML is so 90's ... get with the times!" Image problems like that can be fatal to a software project, even if there's no technical problem. The likely winner in the format discussion is pure unmodified JSON, but I'm not going to make any assumptions. SOLR-8029 has some format discussions that may be relevant here.

IMHO, in order to make the idea successful, Solr 7.0 will need to automatically convert most configs on startup from the old format to the new format without user intervention. If there's something that we find we can't convert automatically, that should result in a failure to start, with a helpful message so the user has some idea what they need to do.

Thoughts? Is this too scary to contemplate? Should I open an umbrella issue in Jira to get the ball rolling?

Thanks,
Shawn
[jira] [Updated] (SOLR-9559) Add ExecutorStream to execute stored topics and macros
[ https://issues.apache.org/jira/browse/SOLR-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9559: - Summary: Add ExecutorStream to execute stored topics and macros (was: Add ExecutorStream) > Add ExecutorStream to execute stored topics and macros > -- > > Key: SOLR-9559 > URL: https://issues.apache.org/jira/browse/SOLR-9559 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > The ExecutorStream will execute the stored topics and macros from SOLR-9387. > The ExecutorStream can be pointed at a SolrCloud collection where the topics > are stored and it will execute the topics and macros in batches. > The ExecutorStream will support parallel execution of topics/macros as well. > This will allow the workload to be spread over a large number of worker nodes.
[jira] [Created] (SOLR-9559) Add ExecutorStream
Joel Bernstein created SOLR-9559: Summary: Add ExecutorStream Key: SOLR-9559 URL: https://issues.apache.org/jira/browse/SOLR-9559 Project: Solr Issue Type: New Feature Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein The ExecutorStream will execute the stored topics and macros from SOLR-9387. The ExecutorStream can be pointed at a SolrCloud collection where the topics are stored and it will execute the topics and macros in batches. The ExecutorStream will support parallel execution of topics/macros as well. This will allow the workload to be spread over a large number of worker nodes.
[jira] [Comment Edited] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521130#comment-15521130 ] Pushkar Raste edited comment on SOLR-9506 at 9/25/16 5:14 PM: -- POC/Initial commit - https://github.com/praste/lucene-solr/commit/ca55daa9ea1eb23232173b50111b9068f1817c13 There are two issues we still need to solve. * How to compute {{versionsInHash}} from {{versionsInHash}} of individual segments. We can not use current {{versionsHash}} (unless we cache all the individual version numbers), as it is not additive. Consider the following scenario *Leader segments, versions and versionsHash* *seg1* : versions: 100, 101, 102 versionHash: hash(100) + hash(101) + hash(102) *seg2*: versions: 103, 104, 105 versionHash: hash(103) + hash(104) + hash(105) \\ \\ *Replica segments, versions and hash* *seg1*: versions: 100, 101 versionHash: hash(100) + hash(101) *seg2*: versions: 102, 103, 104, 105 versionHash: hash(102) + hash(103) + hash(104) + hash(105) \\ \\Leader and Replica are essentially in sync, however using the current method there is no way to compute and ensure the cumulative {{versionHash}} of leader and replica would match. \\ \\Even if we decide not to cache {{IndexFingerprint}} per segment but just to parallelize the computation, I think we still would run into the issue mentioned above. * I still need to figure out how to keep cache in {{DefaultSolrCoreState}}, so that we can reuse {{IndexFingerprint}} of individual segments when a new Searcher is opened. was (Author: praste): POC/Initial commit - https://github.com/praste/lucene-solr/commit/ca55daa9ea1eb23232173b50111b9068f1817c13 There are two issues we still need to solve. * How to compute `versionsInHash` from `versionsInHash` of individual segments. We can not use current `versionsHash` (unless we cache all the individual version numbers), as it is not additive. 
Consider following scenario *Leader segments, versions and hash* *seg1* : versions: 100, 101, 102 versionHash: hash(100) + hash(101) + hash(102) *seg2*: versions: 103, 104, 105 versionHash: hash(103) + hash(104) + hash(105) \\ \\ *Replica segments, versions and hash* *seg1*: versions: 100, 101 versionHash: hash(100) + hash(101) *seg2*: versions: 102, 103, 104, 105 versionHash: hash(102) + hash(103) + hash(104) + hash(105) \\ \\Leader and Replica are essentially in sync, however using current method there is no way to compute and ensure cumulative `versionHash` of leader and replica would match * I still need to figure out how to keep cache in `DefaultSolrCoreState`, so that we can reuse `IndexFingerprint` of individual segments when a new Searcher is opened. > cache IndexFingerprint for each segment > --- > > Key: SOLR-9506 > URL: https://issues.apache.org/jira/browse/SOLR-9506 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul > > The IndexFingerprint is cached per index searcher. it is quite useless during > high throughput indexing. If the fingerprint is cached per segment it will > make it vastly more efficient to compute the fingerprint
[jira] [Commented] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521130#comment-15521130 ] Pushkar Raste commented on SOLR-9506: - POC/Initial commit - https://github.com/praste/lucene-solr/commit/ca55daa9ea1eb23232173b50111b9068f1817c13 There are two issues we still need to solve. * How to compute `versionsInHash` from `versionsInHash` of individual segments. We can not use current `versionsHash` (unless we cache all the individual version numbers), as it is not additive. Consider following scenario *Leader segments, versions and hash* *seg1* : versions: 100, 101, 102 versionHash: hash(100) + hash(101) + hash(102) *seg2*: versions: 103, 104, 105 versionHash: hash(103) + hash(104) + hash(105) \\ \\ *Replica segments, versions and hash* *seg1*: versions: 100, 101 versionHash: hash(100) + hash(101) *seg2*: versions: 102, 103, 104, 105 versionHash: hash(102) + hash(103) + hash(104) + hash(105) \\ \\Leader and Replica are essentially in sync, however using current method there is no way to compute and ensure cumulative `versionHash` of leader and replica would match * I still need to figure out how to keep cache in `DefaultSolrCoreState`, so that we can reuse `IndexFingerprint` of individual segments when a new Searcher is opened. > cache IndexFingerprint for each segment > --- > > Key: SOLR-9506 > URL: https://issues.apache.org/jira/browse/SOLR-9506 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul > > The IndexFingerprint is cached per index searcher. it is quite useless during > high throughput indexing. If the fingerprint is cached per segment it will > make it vastly more efficient to compute the fingerprint
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and macros
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Description: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query as well as a macro to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. was: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query as well as a macro to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro as using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. 
> Allow topic expression to store queries and macros > -- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. This ticket > will allow the topic to store the topic query as well as a macro to be > performed with the topic. > Macros will be run using Solr's builtin parameter substitution: > Sample syntax: > {code} > topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") > {code} > The query and macro will be stored with the topic. Topics can be retrieved > and executed as part of the larger macro using Solr's built in parameter > substitution. > {code} > http://localhost:8983/solr/collection1/stream?expr=update(classify(model, > ${topic}))=topic(collection1,) > {code} > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics and macros. > The parallel function can then be used to run the topics/macros in parallel > across a large number of workers.
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and macros
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Summary: Allow topic expression to store queries and macros (was: Allow topic expression to store queries and perform actions) > Allow topic expression to store queries and macros > -- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. This ticket > will allow the topic to store the topic query as well as a macro to be > performed with the topic. > Macros will be run using Solr's builtin parameter substitution: > Sample syntax: > {code} > topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") > {code} > The query and macro will be stored with the topic. Topics can be retrieved > and executed as part of the larger macro using Solr's built in parameter > substitution. > {code} > http://localhost:8983/solr/collection1/stream?expr=update(classify(model, > ${topic}))=topic(collection1,) > {code} > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics and macros. > The parallel function can then be used to run the topics/macros in parallel > across a large number of workers.
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and perform actions
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Description: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query as well as a macro to be performed with the topic. Macros will be run using Solr's builtin parameter substitution: Sample syntax: {code} topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") {code} The query and macro will be stored with the topic. Topics can be retrieved and executed as part of the larger macro using Solr's built in parameter substitution. {code} http://localhost:8983/solr/collection1/stream?expr=update(classify(model, ${topic}))=topic(collection1,) {code} Because topics are stored in a SolrCloud collection this will allow for storing millions of topics and macros. The parallel function can then be used to run the topics/macros in parallel across a large number of workers. was: The topic expression already stores the checkpoints for a topic. This ticket will allow the topic to store the topic query as well. Because topics are stored in a SolrCloud collection this will allow for storing millions of topics/queries. The parallel function can then be used to execute the topic queries in parallel across a large number of workers. > Allow topic expression to store queries and perform actions > --- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. This ticket > will allow the topic to store the topic query as well as a macro to be > performed with the topic.
> Macros will be run using Solr's builtin parameter substitution: > Sample syntax: > {code} > topic(collection1, q="*:*", macro="update(classify(model, ${topic}))") > {code} > The query and macro will be stored with the topic. Topics can be retrieved > and executed as part of the larger macro using Solr's built in parameter > substitution. > {code} > http://localhost:8983/solr/collection1/stream?expr=update(classify(model, > ${topic}))=topic(collection1,) > {code} > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics and macros. > The parallel function can then be used to run the topics/macros in parallel > across a large number of workers.
[jira] [Updated] (SOLR-9387) Allow topic expression to store queries and perform actions
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-9387: - Summary: Allow topic expression to store queries and perform actions (was: Scalable stored queries and alerts with the topic Streaming Expression) > Allow topic expression to store queries and perform actions > --- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. > This ticket will allow the topic to store the topic query as well. > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics/queries. > The parallel function can then be used to execute the topic queries in > parallel across a large number of workers.
[jira] [Assigned] (SOLR-9387) Allow topic expression to store queries and perform actions
[ https://issues.apache.org/jira/browse/SOLR-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-9387: Assignee: Joel Bernstein > Allow topic expression to store queries and perform actions > --- > > Key: SOLR-9387 > URL: https://issues.apache.org/jira/browse/SOLR-9387 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > The topic expression already stores the checkpoints for a topic. > This ticket will allow the topic to store the topic query as well. > Because topics are stored in a SolrCloud collection this will allow for > storing millions of topics/queries. > The parallel function can then be used to execute the topic queries in > parallel across a large number of workers.
[jira] [Resolved] (SOLR-8577) Add AlertStream and ModelStream to the Streaming API
[ https://issues.apache.org/jira/browse/SOLR-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-8577. -- Resolution: Duplicate > Add AlertStream and ModelStream to the Streaming API > > > Key: SOLR-8577 > URL: https://issues.apache.org/jira/browse/SOLR-8577 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein > > The AlertStream will return the top N "new" documents for a query from a > SolrCloud collection. The AlertStream will track the highest version numbers > from each shard and use these as checkpoints to determine new content. > The DaemonStream (SOLR-8550) can be used to create "live" alerts that run at > intervals. Sample syntax: > {code} > daemon(alert(collection1, q="hello", n="20"), runInterval="2000") > {code} > The DaemonStream can be installed in a SolrCloud worker node where it can > live and send out alerts. > *AI Models* > The *AlertStream* will also accept an optional *ModelStream* which will apply > a machine learning model to the alert. For example: > {code} > alert(collection1, q="hello", n="20", model(collection2, id="model1")) > {code} > The ModelStream will return a machine learning model saved in a SolrCloud > collection. Function queries for different model types will be developed so > the models can be applied in the re-ranker or as a sort. > *Taking action* > Custom decorator streams can be developed that *take actions based on the AI > driven alerts*. For example the pseudo code below would run the function > *someAction* on the Tuples emitted by the AlertStream. > {code} > daemon(someAction(alert(...))) > {code} > *Learning* > While some SolrCloud worker collections are alerting and taking action, other > worker collections can be *learning models* which can be applied for > alerting.
For example: > {code} > daemon(update(logit())) > {code} > The pseudo code above calls the LogitStream (SOLR-8492) which would learn a > Logistic Regression model and flow the model into a SolrCloud collection. The > model can then be used for alerting and taking action on new data as it > enters the system.
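The checkpointing idea behind the AlertStream — remembering the highest version number seen on each shard and treating anything newer as "new" content — can be sketched in a few lines of Java. This is a hypothetical helper (the Checkpoints class and isNew method are invented here), not the actual AlertStream implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-shard checkpointing: a document counts as "new" only if its
// version exceeds the highest version previously seen on its shard.
public class Checkpoints {
    private final Map<String, Long> highestVersion = new HashMap<>();

    // Returns true (and advances the shard's checkpoint) if this version
    // has not been seen before; false for old or replayed documents.
    boolean isNew(String shard, long version) {
        long seen = highestVersion.getOrDefault(shard, Long.MIN_VALUE);
        if (version > seen) {
            highestVersion.put(shard, version);
            return true;
        }
        return false;
    }
}
```

A daemon-style loop would poll each shard, feed (shard, version) pairs through such a filter, and emit only the tuples that pass — which is why rerunning an alert at intervals does not re-deliver documents already seen.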
[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 3563 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/3563/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 3 tests failed. FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.test Error Message: Timeout occured while waiting response from server at: http://127.0.0.1:54849/yf_o/mt Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:54849/yf_o/mt at __randomizedtesting.SeedInfo.seed([420ECADA066E5E46:CA5AF500A89233BE]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:619) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:261) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:250) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.makeRequest(CollectionsAPIDistributedZkTest.java:400) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:898) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.test(CollectionsAPIDistributedZkTest.java:178) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (SOLR-9558) DIH TemplateTransformer does not support multiple values
[ https://issues.apache.org/jira/browse/SOLR-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15520840#comment-15520840 ] Shalin Shekhar Mangar commented on SOLR-9558: - So you want to transform the same column by multiple (different) templates to create a multi-valued field with each value being the output of an individual template? > DIH TemplateTransformer does not support multiple values > > > Key: SOLR-9558 > URL: https://issues.apache.org/jira/browse/SOLR-9558 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: 6.2 >Reporter: Ted Sullivan >Priority: Minor > Fix For: trunk > > Attachments: SOLR-9558.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The DIH TemplateTransformer does not support multiple templates with the same > column name. Rather than creating a List of values as it should do in this > case, the value of the last tag with the same column name replaces > the values of previous transforms for that column. The reason is that it uses > a single HashMap to store the transformations with a key on column name. The > fix is to detect if a column has previously been transformed within the same > field set and to create a List for that column when this occurs.
[jira] [Updated] (SOLR-9558) DIH TemplateTransformer does not support multiple values
[ https://issues.apache.org/jira/browse/SOLR-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Sullivan updated SOLR-9558: --- Attachment: SOLR-9558.patch > DIH TemplateTransformer does not support multiple values > > > Key: SOLR-9558 > URL: https://issues.apache.org/jira/browse/SOLR-9558 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: 6.2 >Reporter: Ted Sullivan >Priority: Minor > Fix For: trunk > > Attachments: SOLR-9558.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The DIH TemplateTransformer does not support multiple templates with the same > column name. Rather than creating a List of values as it should do in this > case, the value of the last tag with the same column name replaces > the values of previous transforms for that column. The reason is that it uses > a single HashMap to store the transformations with a key on column name. The > fix is to detect if a column has previously been transformed within the same > field set and to create a List for that column when this occurs.
[jira] [Created] (SOLR-9558) DIH TemplateTransformer does not support multiple values
Ted Sullivan created SOLR-9558: -- Summary: DIH TemplateTransformer does not support multiple values Key: SOLR-9558 URL: https://issues.apache.org/jira/browse/SOLR-9558 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - DataImportHandler Affects Versions: 6.2 Reporter: Ted Sullivan Priority: Minor Fix For: trunk The DIH TemplateTransformer does not support multiple templates with the same column name. Rather than creating a List of values as it should do in this case, the value of the last tag with the same column name replaces the values of previous transforms for that column. The reason is that it uses a single HashMap to store the transformations with a key on column name. The fix is to detect if a column has previously been transformed within the same field set and to create a List for that column when this occurs.
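The fix described in SOLR-9558 — detecting that a column was already written within the same field set and promoting its value to a List instead of overwriting it — can be sketched as follows. This is an illustrative helper (the MultiValueRow class is invented here), not the actual TemplateTransformer code from the attached patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of multi-valued column accumulation: the first write stores a
// scalar, a second write to the same column promotes it to a List, and
// further writes append to that List.
public class MultiValueRow {
    private final Map<String, Object> row = new HashMap<>();

    @SuppressWarnings("unchecked")
    void put(String column, Object value) {
        Object existing = row.get(column);
        if (existing == null) {
            row.put(column, value);                 // first value: store as-is
        } else if (existing instanceof List) {
            ((List<Object>) existing).add(value);   // already multi-valued: append
        } else {
            List<Object> values = new ArrayList<>();
            values.add(existing);                   // promote scalar to a List
            values.add(value);
            row.put(column, values);
        }
    }

    Object get(String column) {
        return row.get(column);
    }
}
```

With the buggy behavior, the second put would simply replace the first value in the HashMap; the promotion branch is what preserves earlier template outputs.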
[jira] [Comment Edited] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch
[ https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15520747#comment-15520747 ] Pushkar Raste edited comment on SOLR-9310 at 9/25/16 12:51 PM: --- I went through logs at https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-MacOSX/429/consoleFull If PeerSync was unsuccessful I would expect to see a line like {{o.a.s.u.PeerSync Fingerprint comparison: -1}} However, I don't see such a line. I could think of two scenarios that could break the test * the data directory could get deleted while a node is brought down, since the data directory is created in {{temp}}. Upon restart the replica would have no frame of reference and will have to fall back on replication. * we need a better check than relying on the number of requests made to {{ReplicationHandler}} was (Author: praste): I went through logs in the failed test email notification but those are truncated. Where can I look at the entire build.log for the test. Only thing I could think of at this point is data directory could get deleted while a node is brought down, since data directory is created in {{temp}}. Upon restart replica would have no frame of reference and will have to fall back on replication. > PeerSync fails on a node restart due to IndexFingerPrint mismatch > - > > Key: SOLR-9310 > URL: https://issues.apache.org/jira/browse/SOLR-9310 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Pushkar Raste >Assignee: Noble Paul > Fix For: 5.5.3, 6.3, trunk > > Attachments: PeerSync_3Node_Setup.jpg, PeerSync_Experiment.patch, > SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, > SOLR-9310.patch, SOLR-9310.patch, SOLR-9310_3ReplicaTest.patch, > SOLR-9310_5x.patch, SOLR-9310_final.patch > > > I found that Peer Sync fails if a node restarts and documents were indexed > while node was down. IndexFingerPrint check fails after recovering node > applies updates.
> This happens only when the node restarts and not if the node just misses updates due > to a reason other than it being down. > Please check the attached patch for the test.
[jira] [Commented] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch
[ https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15520747#comment-15520747 ] Pushkar Raste commented on SOLR-9310: - I went through logs in the failed test email notification but those are truncated. Where can I look at the entire build.log for the test? The only thing I could think of at this point is that the data directory could get deleted while a node is brought down, since the data directory is created in {{temp}}. Upon restart the replica would have no frame of reference and will have to fall back on replication. > PeerSync fails on a node restart due to IndexFingerPrint mismatch > - > > Key: SOLR-9310 > URL: https://issues.apache.org/jira/browse/SOLR-9310 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Pushkar Raste >Assignee: Noble Paul > Fix For: 5.5.3, 6.3, trunk > > Attachments: PeerSync_3Node_Setup.jpg, PeerSync_Experiment.patch, > SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, > SOLR-9310.patch, SOLR-9310.patch, SOLR-9310_3ReplicaTest.patch, > SOLR-9310_5x.patch, SOLR-9310_final.patch > > > I found that Peer Sync fails if a node restarts and documents were indexed > while node was down. IndexFingerPrint check fails after recovering node > applies updates. > This happens only when node restarts and not if node just misses updates due > to a reason other than it being down. > Please check attached patch for the test.
[jira] [Resolved] (LUCENE-7452) improve exception message: child query must only match non-parent docs, but parent docID=180314...
[ https://issues.apache.org/jira/browse/LUCENE-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev resolved LUCENE-7452. -- Resolution: Fixed > improve exception message: child query must only match non-parent docs, but > parent docID=180314... > -- > > Key: LUCENE-7452 > URL: https://issues.apache.org/jira/browse/LUCENE-7452 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 6.2 >Reporter: Mikhail Khludnev >Assignee: Mikhail Khludnev >Priority: Minor > Fix For: master (7.0), 6.3 > > Attachments: LUCENE-7452.patch > > > when the parent filter intersects with the child query the exception exposes internal > details: docnum and scorer class. I propose an exception message to suggest > executing a query intersecting them both. There is an opinion to add this > suggestion in addition to existing details. > My main concern against is, when the index is constantly updated even SOLR-9582 > allows searching for docnum it would be like catching the wind, also think > about the cloud case. But a user advised to execute the query intersection can > catch problem documents even if they occur sporadically.
[jira] [Commented] (LUCENE-7452) improve exception message: child query must only match non-parent docs, but parent docID=180314...
[ https://issues.apache.org/jira/browse/LUCENE-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15520705#comment-15520705 ] Mikhail Khludnev commented on LUCENE-7452: -- Thanks, Alexandre! > improve exception message: child query must only match non-parent docs, but > parent docID=180314... > -- > > Key: LUCENE-7452 > URL: https://issues.apache.org/jira/browse/LUCENE-7452 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 6.2 >Reporter: Mikhail Khludnev >Assignee: Mikhail Khludnev >Priority: Minor > Fix For: master (7.0), 6.3 > > Attachments: LUCENE-7452.patch > > > when the parent filter intersects with the child query the exception exposes internal > details: docnum and scorer class. I propose an exception message to suggest > executing a query intersecting them both. There is an opinion to add this > suggestion in addition to existing details. > My main concern against is, when the index is constantly updated even SOLR-9582 > allows searching for docnum it would be like catching the wind, also think > about the cloud case. But a user advised to execute the query intersection can > catch problem documents even if they occur sporadically.
[jira] [Updated] (LUCENE-7465) Add a PatternTokenizer that uses Lucene's RegExp implementation
[ https://issues.apache.org/jira/browse/LUCENE-7465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-7465: --- Attachment: LUCENE-7465.patch Patch. I added the {{SimplePatternTokenizerFactory}} as well. {{SimplePatternTokenizer}} takes either a String (parses as a regexp and compiles it) or a DFA (expert user who pre-built their own automaton). I folded in a nice idea from [~rcmuir] to optimize the ascii code points even when using {{CharacterRunAutomaton}}. It's quite fast, ~46% faster than {{PatternTokenizer}} when tokenizing 1 MB chunks from the English Wikipedia export, using a simplistic whitespace regexp {{\[^ \t\r\n]+}}. And it's nice that it doesn't read the entire input into heap! > Add a PatternTokenizer that uses Lucene's RegExp implementation > --- > > Key: LUCENE-7465 > URL: https://issues.apache.org/jira/browse/LUCENE-7465 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: master (7.0), 6.3 > > Attachments: LUCENE-7465.patch > > > I think there are some nice benefits to a version of PatternTokenizer that > uses Lucene's RegExp impl instead of the JDK's: > * Lucene's RegExp is compiled to a DFA up front, so if a "too hard" RegExp > is attempted the user discovers it up front instead of later on when a > "lucky" document arrives > * It processes the incoming characters as a stream, only pulling 128 > characters at a time, vs the existing {{PatternTokenizer}} which currently > reads the entire string up front (this has caused heap problems in the past) > * It should be fast. > I named it {{SimplePatternTokenizer}}, and it still needs a factory and > improved tests, but I think it's otherwise close. > It currently does not take a {{group}} parameter because Lucene's RegExps > don't yet implement sub group capture. I think we could add that at some > point, but it's a bit tricky. 
> This doesn't even have group=-1 support (like String.split) ... I think if we > did that we should maybe name it differently > ({{SimplePatternSplitTokenizer}}?).
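The streaming, automaton-driven tokenization described in LUCENE-7465 can be illustrated with a tiny self-contained example for the whitespace regexp {{\[^ \t\r\n]+}}. This is a hand-rolled two-state automaton invented for illustration — not Lucene's RegExp/CharacterRunAutomaton machinery — but it shows the key property: characters are consumed one at a time and tokens are emitted on state transitions, so the whole input never needs to sit in heap at once:

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled automaton matching [^ \t\r\n]+: we are either "inside" a
// token (last char accepted) or "outside" (last char rejected); leaving
// the inside state emits the accumulated token.
public class TinyPatternTokenizer {

    // The single-character acceptance test for [^ \t\r\n].
    static boolean accepts(char c) {
        return c != ' ' && c != '\t' && c != '\r' && c != '\n';
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (accepts(c)) {
                current.append(c);               // stay in the "inside token" state
            } else if (current.length() > 0) {
                tokens.add(current.toString());  // leaving the token: emit it
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());      // emit a trailing token at EOF
        }
        return tokens;
    }
}
```

A real DFA compiled from an arbitrary regexp generalizes the accepts check to a state-transition table, which is why pre-compiling the pattern up front (as Lucene's RegExp does) surfaces "too hard" patterns immediately instead of at query time.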