On Aug 11, 2009, at 3:21 PM, Michael Busch wrote:
On 8/11/09 4:13 AM, Grant Ingersoll wrote:
On Aug 11, 2009, at 4:28 AM, Michael Busch wrote:
I'm not responding to just you there, but more to the
growing pack of those speaking against the new API. I don't see
specific issues being brought up - the only issues I have seen
brought up have been addressed in JIRA issues that have received
no comments indicating the fix was not good enough. So we are
seeing a lot of general complaints, but specific complaints have
been addressed as far as I can tell.
Thanks Mark. Yeah, I'm really not sure what the problem here
actually is now. There was a performance test in Solr that apparently
ran much slower after upgrading to the new Lucene jar. This test
is testing a rather uncommon scenario: very, very short documents.
That is not an uncommon scenario. Solr has very, very short fields
_ALL THE TIME_.
I meant that having documents that only contain very short fields is
not as common as having docs with a decent amount of text. Maybe
I'm wrong - in either case I didn't try to say it's not an important
use case. I think it is important to have good performance here
too. The point I was trying to make was that we tested performance
more thoroughly for the case we thought would be more common.
FWIW, I think the most common scenario is: one or two large fields and
several (usually in the range of 5-10, but I have seen cases with many)
small fields, at least that has been my experience. Some of the small
fields require analysis, some don't.
According to the numbers posted on LUCENE-1796 it now seems like
it's fixed - even for documents with only very short fields and no
reusable TokenStreams.
Very cool.