FYI https://issues.apache.org/jira/browse/SOLR-1353
On Sun, Aug 9, 2009 at 2:02 PM, Yonik Seeley<[email protected]> wrote:
> It looks like implementing the new attribute stuff will not be enough
> - the token architecture has changed enough that it looks like we must
> cache tokenstreams to get back to good performance.
>
> -Yonik
> http://www.lucidimagination.com
>
>
> On Sun, Aug 9, 2009 at 12:57 PM, Yonik Seeley<[email protected]> wrote:
>> OK, I've isolated (magnified) the effect with a test I just checked in.
>> Indexing documents directly at the UpdateHandler was 85% faster before
>> the latest lucene update.
>>
>> Run the test like this:
>>
>> ant test -Dtestcase=TestIndexingPerformance -Dargs="-server -Diter=100000"; grep throughput build/test-results/*TestIndexingPerformance.xml
>>
>> To run on an older trunk version, just copy over
>> src/test/org/apache/solr/update/TestIndexingPerformance.java
>> src/test/test-files/solr/conf/solrconfig_perf.xml
>>
>> I had a throughput of 10946 docs/sec before the lucene update, and 5849
>> after.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>> On Sun, Aug 9, 2009 at 12:10 PM, Yonik Seeley<[email protected]> wrote:
>>> On Sun, Aug 9, 2009 at 12:01 PM, Grant Ingersoll<[email protected]> wrote:
>>>> Or bite the bullet and upgrade to the incrementToken() method.
>>>
>>> Right - I'm not sure if that would fix it or not - I haven't been
>>> involved in the new Token attribute stuff...
>>> I'm currently writing a basic indexing unit test that we can use to
>>> measure this (the standard solrconfig does stuff that slows down
>>> indexing a lot, but helps in catching bugs on edge cases by creating
>>> many segments).
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>
>
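
For anyone following along, here is a rough sketch of the general idea behind "cache tokenstreams": keep one reusable instance per thread and re-point it at new input instead of constructing a fresh one per field. This is plain Java and not the Lucene or Solr API; ReusableSplitter, SplitterCache, and ReuseDemo are hypothetical names made up for illustration, and the whitespace splitting just stands in for a real analysis chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Small driver showing that repeated calls on one thread reuse one instance.
class ReuseDemo {
    public static void main(String[] args) {
        System.out.println(SplitterCache.tokenize("cache the token stream"));
        System.out.println(SplitterCache.tokenize("reuse it for the next field"));
        System.out.println("instances constructed: " + ReusableSplitter.CONSTRUCTED.get());
    }
}

// Hypothetical stand-in for an analysis chain that is costly to construct.
// This is NOT a Lucene or Solr class.
final class ReusableSplitter {
    // Counts constructions so the effect of reuse is visible.
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    private final List<String> tokens = new ArrayList<>();

    ReusableSplitter() {
        CONSTRUCTED.incrementAndGet();
    }

    // Re-point the existing instance at new input instead of allocating a new one.
    // The returned list is reused, so its contents change on the next call.
    List<String> reset(String text) {
        tokens.clear();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}

// Hypothetical per-thread cache: one splitter per thread, so no locking is needed.
final class SplitterCache {
    private static final ThreadLocal<ReusableSplitter> CACHE =
        ThreadLocal.withInitial(ReusableSplitter::new);

    static List<String> tokenize(String text) {
        return CACHE.get().reset(text);
    }
}
```

The per-thread choice is an assumption on my part: it avoids synchronization at the cost of one cached instance per indexing thread, and callers must consume the tokens before asking for the next batch.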
