[ https://issues.apache.org/jira/browse/SOLR-342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll updated SOLR-342:
---------------------------------

    Attachment: SOLR-342.tar.gz

First crack at implementing this. All tests pass on OS X except SolrJ's SolrExceptionTest, but that test also fails on a clean version, so I am convinced the failure is not due to anything I did.

My personal benchmarking of just the Lucene side of things (see indexing.alg in Lucene contrib/benchmark) shows pretty significant performance gains. This is also anecdotally confirmed by my basic testing in Solr.

I set the default to 16MB, per Mike McCandless's defaults in Lucene, but this is probably too low given the server nature of Solr, where a lot more memory is likely to be available.

There are 4 new configuration possibilities:

<ramBufferSizeMB> - When set, <maxBufferedDocs> is set to DISABLE_AUTO_FLUSH. The default is the maxBufferedDocs way, but this could be changed to be the other way around (and probably should be).

<mergePolicy> - Sets the MergePolicy. The new Lucene policy is LogByteSizeMergePolicy; the old Lucene policy is LogDocMergePolicy. LogByteSizeMergePolicy by default.

<mergeScheduler> - Sets the way merges are performed. The new way is ConcurrentMergeScheduler, which runs the merges in separate background threads. The old way was SerialMergeScheduler. Concurrent by default.

<luceneAutoCommit> - Specifies whether the Lucene IndexWriter should autoCommit flushes. false is best for performance. I still need to develop recommendations for when to change this. Named this way to avoid confusion with Solr's version. false by default.
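As a rough illustration of how the four options described above might look in solrconfig.xml, here is a minimal sketch. The element names come from this comment; the placement under <indexDefaults>, the use of fully qualified Lucene class names as element text, and the 32MB value are assumptions for illustration and may not match what the patch actually expects.

```xml
<!-- Hypothetical solrconfig.xml fragment. Placement under <indexDefaults>
     and all values are assumptions, not taken from the patch. -->
<indexDefaults>
  <!-- Flush by RAM usage; when set, maxBufferedDocs becomes DISABLE_AUTO_FLUSH.
       32 is an example value (the comment above mentions a 16MB default). -->
  <ramBufferSizeMB>32</ramBufferSizeMB>

  <!-- New Lucene merge policy; the old one is LogDocMergePolicy.
       Specifying the class name as element text is an assumption. -->
  <mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>

  <!-- Runs merges in background threads; the old way is SerialMergeScheduler. -->
  <mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>

  <!-- Whether the Lucene IndexWriter autoCommits flushes; false is best for performance. -->
  <luceneAutoCommit>false</luceneAutoCommit>
</indexDefaults>
```

Leaving <ramBufferSizeMB> out would, per the description above, fall back to the maxBufferedDocs way of flushing.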
The patch is inside the tar file, along with a bundle of the Lucene jars (not technically the latest, but only a couple of days old).

> Add support for Lucene's new Indexing and merge features (excluding Document/Field/Token reuse)
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-342
>                 URL: https://issues.apache.org/jira/browse/SOLR-342
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: SOLR-342.tar.gz
>
>
> LUCENE-843 adds support for new indexing capabilities using the
> setRAMBufferSizeMB() method that should significantly speed up indexing for
> many applications. To fix this, we will need the trunk version of Lucene (or
> wait for the next official release of Lucene).
> A side effect of this is that Lucene's new, faster StandardTokenizer will also
> be incorporated.
> We also need to think about how we want to incorporate the new merge scheduling
> functionality (the new default in Lucene is to do merges in a background thread).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.