[jira] [Commented] (SOLR-2493) SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.
[ https://issues.apache.org/jira/browse/SOLR-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028974#comment-13028974 ]

Stephane Bailliez commented on SOLR-2493:

The problem here is hardly about naming; it is about using classes correctly when offered the choice. A mistake was made. That's it. We expect committers to be sufficiently knowledgeable about the codebase when committing code. That's true anywhere.

You can hardly expect a service ItemService to have methods such as getItemFromDatabase(), getItemFromServerOnTheOtherSideOfThePlanet(), getItemFromFile() or getItemFromMemory() because there are 4 different implementations of it. You have getItem(), and the 4 different implementations each do something different internally.

I actually wonder why the config is not parsed entirely at startup, rather than having nodes lying around and cherry-picked.

SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.

    Key: SOLR-2493
    URL: https://issues.apache.org/jira/browse/SOLR-2493
    Project: Solr
    Issue Type: Bug
    Components: search
    Affects Versions: 3.1
    Reporter: Stephane Bailliez
    Assignee: Uwe Schindler
    Priority: Blocker
    Labels: core, parser, performance, request, solr
    Fix For: 3.1.1, 3.2, 4.0
    Attachments: SOLR-2493-3.x.patch, SOLR-2493.patch

I'm marking this as a blocker because I think it is a serious issue that should be addressed asap with a release. With the current code, this is nowhere near suitable for production use.

For each instance created, SolrQueryParser calls

    getSchema().getSolrConfig().getLuceneVersion(luceneMatchVersion, Version.LUCENE_24)

instead of using

    getSchema().getSolrConfig().luceneMatchVersion

This creates a massive performance hit. For each request, there are generally 3 query parsers created, and each of them parses the XML node in the config. That involves creating an instance of XPath, and behind the scenes the usual factory finder pattern kicks in within the XML parser and does a loadClass.

The stack is typically:

    at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
    at com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
    at com.sun.org.apache.xpath.internal.XPathContext.init(XPathContext.java:100)
    at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
    at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
    at org.apache.solr.core.Config.getNode(Config.java:230)
    at org.apache.solr.core.Config.getVal(Config.java:256)
    at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
    at org.apache.solr.search.SolrQueryParser.init(SolrQueryParser.java:76)
    at org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)

With the current 3.1 code, I barely reach 250 qps with 16 concurrent users on a near-empty index. After switching SolrQueryParser to use getSchema().getSolrConfig().luceneMatchVersion and running a quick bench test, performance becomes reasonable, beyond 2000 qps.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
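The difference the report describes (re-running an XPath lookup for every parser instance versus reading a value that was parsed once) can be illustrated with a small Java sketch. This is not Solr's code: DemoConfig, lookup() and the XML snippet are hypothetical stand-ins, and only the javax.xml calls are real JDK APIs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CachedConfigSketch {

    /** Hypothetical config holder; an assumption for illustration, not Solr's Config class. */
    static final class DemoConfig {
        private final Document doc;
        /** Counts how often the expensive XPath path was taken. */
        int xpathEvaluations = 0;
        /** Parsed once in the constructor, then read as a plain field. */
        final String luceneMatchVersion;

        DemoConfig(String xml) {
            try {
                this.doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
            } catch (Exception e) {
                throw new IllegalArgumentException("cannot parse config", e);
            }
            this.luceneMatchVersion = lookup("/config/luceneMatchVersion");
        }

        /** Slow path: creates an XPath instance and evaluates it on every call. */
        String lookup(String path) {
            xpathEvaluations++;
            try {
                XPath xpath = XPathFactory.newInstance().newXPath();
                return ((String) xpath.evaluate(path, doc, XPathConstants.STRING)).trim();
            } catch (Exception e) {
                throw new IllegalStateException("cannot evaluate " + path, e);
            }
        }
    }

    public static void main(String[] args) {
        DemoConfig cfg = new DemoConfig(
                "<config><luceneMatchVersion>LUCENE_31</luceneMatchVersion></config>");

        // Reading the cached field costs no further XPath work.
        for (int i = 0; i < 3; i++) {
            String unused = cfg.luceneMatchVersion;
        }
        System.out.println("evaluations after field reads: " + cfg.xpathEvaluations);

        // Calling the lookup per "parser instance" repeats the XPath work each time.
        for (int i = 0; i < 3; i++) {
            cfg.lookup("/config/luceneMatchVersion");
        }
        System.out.println("evaluations after lookups: " + cfg.xpathEvaluations);
    }
}
```

The counter only makes the extra work visible; it says nothing about the actual cost, which in the report comes mostly from the factory lookup and class loading behind each XPath evaluation.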
[jira] [Created] (SOLR-2493) SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.
SolrQueryParser constantly parse luceneMatchVersion in solrconfig. Large performance hit.

    Key: SOLR-2493
    URL: https://issues.apache.org/jira/browse/SOLR-2493
    Project: Solr
    Issue Type: Bug
    Components: search
    Affects Versions: 3.1
    Reporter: Stephane Bailliez
    Priority: Blocker

I'm marking this as a blocker because I think it is a serious issue that should be addressed asap with a release. With the current code, this is nowhere near suitable for production use.

For each instance created, SolrQueryParser calls

    getSchema().getSolrConfig().getLuceneVersion(luceneMatchVersion, Version.LUCENE_24)

instead of using

    getSchema().getSolrConfig().luceneMatchVersion

This creates a massive performance hit. For each request, there are generally 3 query parsers created, and each of them parses the XML node in the config. That involves creating an instance of XPath, and behind the scenes the usual factory finder pattern kicks in within the XML parser and does a loadClass.

The stack is typically:

    at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
    at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
    at com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
    at com.sun.org.apache.xpath.internal.XPathContext.init(XPathContext.java:100)
    at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
    at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
    at org.apache.solr.core.Config.getNode(Config.java:230)
    at org.apache.solr.core.Config.getVal(Config.java:256)
    at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
    at org.apache.solr.search.SolrQueryParser.init(SolrQueryParser.java:76)
    at org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)

With the current 3.1 code, I barely reach 250 qps with 16 concurrent users on a near-empty index. After switching SolrQueryParser to use getSchema().getSolrConfig().luceneMatchVersion and running a quick bench test, performance becomes reasonable, beyond 2000 qps.
[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication
[ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12636273#action_12636273 ]

Stephane Bailliez commented on SOLR-433:

Note that in the latest patches, the grep used to retrieve the snapshot is incorrect:

{noformat}
ls ${data_dir}|grep ${snap_prefix}\.|grep -v wip|sort -r|head -1
{noformat}

This always retrieves the latest entry in the ls output. The grep for the prefix needs an anchor; otherwise it will never update the index snapshot (since 'snapshot' is present in every snapshot of index):

{noformat}
ls ${data_dir}|grep ^${snap_prefix}\.|grep -v wip|sort -r|head -1
{noformat}

This should be changed in snappuller and snapinstaller.

MultiCore and SpellChecker replication

    Key: SOLR-433
    URL: https://issues.apache.org/jira/browse/SOLR-433
    Project: Solr
    Issue Type: Improvement
    Components: replication, spellchecker
    Affects Versions: 1.3
    Reporter: Otis Gospodnetic
    Fix For: 1.4
    Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch

With MultiCore functionality coming along, it looks like we'll need to be able to:

A) snapshot each core's index directory, and
B) replicate any and all cores' complete data directories, not just their index directories.

Pulled from the "spellchecker and multi-core index replication" thread - http://markmail.org/message/pj2rjzegifd6zm7m

Otis:
I think that makes sense - distribute everything for a given core, not just its index. And the spellchecker could then also have its data dir (and only index/ underneath really) and be replicated in the same fashion. Right?

Ryan:
Yes, that was my thought. If an arbitrary directory could be distributed, then you could have

    /path/to/dist/index/...
    /path/to/dist/spelling-index/...
    /path/to/dist/foo

and that would all get put into a snapshot. This would also let you put multiple cores within a single distribution:

    /path/to/dist/core0/index/...
    /path/to/dist/core0/spelling-index/...
    /path/to/dist/core0/foo
    /path/to/dist/core1/index/...
    /path/to/dist/core1/spelling-index/...
    /path/to/dist/core1/foo

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
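The effect of the missing anchor in the grep from the comment above can be reproduced with a small Java analogue of the ls | grep | grep -v wip | sort -r | head -1 pipeline. This is only a sketch: the directory names and the "snapshot" prefix below are made up for illustration and are not taken from the patches.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class SnapshotPickSketch {

    /**
     * Mimics: ls | grep [^]prefix\. | grep -v wip | sort -r | head -1
     * "anchored" decides whether the prefix must match at the start of the name.
     */
    static Optional<String> latest(List<String> names, String prefix, boolean anchored) {
        Pattern p = Pattern.compile((anchored ? "^" : "") + Pattern.quote(prefix) + "\\.");
        return names.stream()
                .filter(n -> p.matcher(n).find())   // grep for the prefix
                .filter(n -> !n.contains("wip"))    // grep -v wip
                .max(Comparator.naturalOrder());    // sort -r | head -1
    }

    public static void main(String[] args) {
        // Hypothetical directory listing, assumed for this example only.
        List<String> names = Arrays.asList(
                "snapshot.20081001120000",
                "snapshot.20081003120000-wip",
                "spellchecker.snapshot.20081002120000");

        // Unanchored: every name containing "snapshot." is a candidate.
        System.out.println(latest(names, "snapshot", false).orElse("none"));
        // Anchored: only names that start with "snapshot." are candidates.
        System.out.println(latest(names, "snapshot", true).orElse("none"));
    }
}
```

With the unanchored pattern, a name that merely contains the prefix can sort ahead of the real index snapshots and be picked every time, which matches the symptom described in the comment.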
[jira] Created: (SOLR-759) DateField already formats the date as a string before sending it to the writer
DateField already formats the date as a string before sending it to the writer

    Key: SOLR-759
    URL: https://issues.apache.org/jira/browse/SOLR-759
    Project: Solr
    Issue Type: Bug
    Affects Versions: 1.3
    Reporter: Stephane Bailliez
    Priority: Minor

    public void write(XMLWriter xmlWriter, String name, Fieldable f) throws IOException {
      xmlWriter.writeDate(name, toExternal(f));
    }

    public void write(TextResponseWriter writer, String name, Fieldable f) throws IOException {
      writer.writeDate(name, toExternal(f));
    }

The above calls the method on the writer that takes a string as a value. This makes the formatting logic in the response writer irrelevant, and it is inefficient when you need to format the date differently in a custom writer (you have to parse the string back into a date object and then format it again).
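To make the round trip described above concrete, here is a small Java sketch of a custom writer that wants its own date format. All names (SlashDateWriter, toExternal, the "yyyy/MM/dd" output format) are hypothetical and only loosely modelled on the methods quoted in the report; this is not Solr's writer API.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateWriterSketch {

    static final String ISO = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    /** Builds a UTC formatter; SimpleDateFormat is not thread-safe, so create one per use. */
    static SimpleDateFormat utc(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    /** Stand-in for a field type's string conversion (assumed ISO-8601 in UTC). */
    static String toExternal(Date d) {
        return utc(ISO).format(d);
    }

    /** Hypothetical custom writer that emits dates as yyyy/MM/dd. */
    static final class SlashDateWriter {
        int reparses = 0;

        /** Date overload: one formatting step, no parsing. */
        String writeDate(String name, Date value) {
            return name + "=" + utc("yyyy/MM/dd").format(value);
        }

        /** String overload: must parse the pre-formatted value before formatting again. */
        String writeDate(String name, String external) {
            try {
                reparses++;
                return writeDate(name, utc(ISO).parse(external));
            } catch (ParseException e) {
                throw new IllegalArgumentException("unexpected date string: " + external, e);
            }
        }
    }

    public static void main(String[] args) {
        SlashDateWriter w = new SlashDateWriter();
        Date d = new Date(0L);
        System.out.println(w.writeDate("ts", d));             // formatted directly
        System.out.println(w.writeDate("ts", toExternal(d))); // format, parse, format again
        System.out.println("reparses: " + w.reparses);
    }
}
```

Passing the date object to the writer leaves the formatting decision with the writer; passing the string forces any writer with a different format through the extra parse step counted here.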
[jira] Commented: (LUCENE-816) Manage dependencies in the build with ivy
[ https://issues.apache.org/jira/browse/LUCENE-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476022 ]

Stephane Bailliez commented on LUCENE-816:

I see some very deep similarities with some known build here...

Manage dependencies in the build with ivy

    Key: LUCENE-816
    URL: https://issues.apache.org/jira/browse/LUCENE-816
    Project: Lucene - Java
    Issue Type: New Feature
    Components: Analysis
    Affects Versions: 2.1
    Reporter: Nicolas Lalevée
    Attachments: common-build.tar.gz, external-libs.tar.gz, ivy-build.patch

There were issues with making the 2.1 release: http://www.nabble.com/-VOTE--release-Lucene-2.1-tf3228536.html#a8994721
The discussion then turned to maven, and also to ivy.

I propose here a draft, a proof of concept of an ant + ivy build. I made this build parallel to the actual one, so people can evaluate it. Note that I have only ivy-ified the core, the demo and contrib/benchmark. The other contrib projects can be ivy-ified quite easily.

The build system is in the common-build directory. In this directory we have:
* common-build.xml : the main common build, which handles dependencies with ivy
* common-build-project.xml : builds a java project: core, demo, or a contrib one
* common-build-webapp.xml : extends common-build-project and has some tasks for building a war
* common-build-modules.xml : allows building several projects, just using some subant task
* common-build-gcj.xml : builds with gcj. It worked once; it needs to be fixed
* ivyconf.xml, ivyconf.properties : ivy configuration
* build.xml : a little task to generate the ivyconf.xml to use with the eclipse ivy plugin
* eclipse directory : contains some XSL/XML to generate .classpath and .project

To test it and see how cool ivy is:

    cd contrib/benchmark
    ant -f build-ivy.xml buildeep

and look at the new local-libs directory at the root of the lucene directory!