On Tue, Aug 11, 2009 at 6:50 AM, Robert Muir<rcm...@gmail.com> wrote:
> On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch<busch...@gmail.com> wrote:
>> There was a performance test in Solr that apparently ran much slower
>> after upgrading to the new Lucene jar. This test is testing a rather
>> uncommon scenario: very very short documents.
>
> Actually, it's more uncommon than that: it's very very short documents,
> without implementing reusableTokenStream().
> This makes it basically a benchmark of ctor cost... it doesn't really
> benchmark the token API, in my opinion.

You would be surprised... there are quite a few Solr users that have
relatively short documents... or even if they are sizeable documents,
they have up to hundreds of short metadata-type fields (generally a
token or two).

Reusing TokenStreams has become a must in Solr, IMO, since construction
costs (hashmap lookups, etc.) and GC costs (larger objects) have been
growing.  I'm focused on that now...

Robert's taking a crack at fixing things up so users can actually
create reusable analyzers out of our filters:
https://issues.apache.org/jira/browse/LUCENE-1794
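The reuse pattern under discussion can be sketched roughly like this: keep a per-thread cached stream and reset it onto new input instead of constructing a fresh one per field. This is a minimal illustration only, using made-up stand-in classes (FakeTokenStream, ReuseSketch), not Lucene's actual API:

```java
// Sketch of the reuse idea behind reusableTokenStream().
// All classes here are hypothetical stand-ins, not Lucene types.
public class ReuseSketch {
    // Trivial stand-in for a TokenStream that can be reset onto new input.
    static class FakeTokenStream {
        String input;
        FakeTokenStream(String input) { this.input = input; }
        void reset(String newInput) { this.input = newInput; }
    }

    // Per-thread cache, so concurrent indexing threads don't share state.
    private final ThreadLocal<FakeTokenStream> cached =
        new ThreadLocal<FakeTokenStream>();

    FakeTokenStream reusableTokenStream(String text) {
        FakeTokenStream ts = cached.get();
        if (ts == null) {
            ts = new FakeTokenStream(text);  // pay construction cost once
            cached.set(ts);
        } else {
            ts.reset(text);                  // cheap: no new allocation
        }
        return ts;
    }

    public static void main(String[] args) {
        ReuseSketch a = new ReuseSketch();
        FakeTokenStream first = a.reusableTokenStream("doc one");
        FakeTokenStream second = a.reusableTokenStream("doc two");
        System.out.println(first == second); // same instance reused
    }
}
```

With hundreds of tiny metadata fields per document, that "construct once, reset many times" path is exactly where the ctor and GC savings come from.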

-Yonik
http://www.lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
