+1. We don't use Solr, but we have quite a bunch of medium- and
short-sized documents, plus heaps of metadata fields.
I'm yet to read Uwe's example, but I feel I'm a bit misunderstood by
some of you.
Did you read it yet? What do you think about it?
My gripe with the new API is not that it brings us troubles
(which are solved one way or another), it is that the switch and
associated migration costs bring zero benefits in immediate and remote
future.
The only person who tried to disprove this claim is Uwe. Others
either say "the problems are solved, so it's okay to move to the new
API", or "this will be usable when flexindexing arrives". Sorry, the
latter argument doesn't hold up: this API is orthogonal to
flexindexing, or at least nobody has shown otherwise.
Whether the API is orthogonal to flexible indexing depends on how you
define "flexible indexing". I admit the term is vague and probably
not clearly defined anywhere.
I agree that if flexible indexing only means changing the encoding,
i.e. *how* data is stored, e.g. PFOR vs. the current posting format,
then yes, we don't need the new TokenStream API for it.
But our goals for flexible indexing go beyond that. We want
to allow customizing *what* data is stored in the inverted index. The
very first discussion about flexible indexing, which took place several
years ago, can be found in the wiki:
http://wiki.apache.org/lucene-java/FlexibleIndexing
Already in this very early proposal it was suggested to have the
following posting formats as a start:
a. <doc>+
b. <doc, boost>+
c. <doc, freq, <position>+ >+
d. <doc, freq, <position, boost>+ >+
For d. you need to change the TokenStream API. How else can we get the
boost from the source to the indexer? Of course you can always serialize
the additional data into the payload byte array, but if filters want to
do something with it, performance suffers because they have to decode and
re-encode the bytes. The new API solves this problem very nicely. Once we
open up the posting format like this, people will want to store different
custom things in there. The new TokenStream API is prepared for that -
the old one isn't.
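
To make the payload vs. typed-value point concrete, here is a minimal,
self-contained sketch in plain Java. It does not use Lucene classes:
BoostAttribute, AttributeHolder and the helper methods are hypothetical
names made up for illustration, and the real TokenStream API may look
different. It only contrasts a filter that has to decode and re-encode a
boost carried in a byte array with one that reads and writes a typed
per-position value directly.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class BoostSketch {

    // Payload route: the boost travels as an opaque byte array.
    static byte[] encodeBoost(float boost) {
        return ByteBuffer.allocate(4).putFloat(boost).array();
    }

    static float decodeBoost(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    // A filter that halves the boost has to decode and re-encode for every token.
    static byte[] halveViaPayload(byte[] payload) {
        return encodeBoost(decodeBoost(payload) * 0.5f);
    }

    // Typed route: a per-position value object (hypothetical, not a Lucene class).
    static final class BoostAttribute {
        float boost = 1.0f;
    }

    // Minimal stand-in for a holder shared by all filters in a chain:
    // one instance per attribute type, created on first request.
    static final class AttributeHolder {
        private final Map<Class<?>, Object> attributes = new HashMap<>();

        <T> T addAttribute(Class<T> type, Supplier<T> factory) {
            return type.cast(attributes.computeIfAbsent(type, k -> factory.get()));
        }
    }

    // The same filter on the typed route: no serialization involved.
    static void halveViaAttribute(AttributeHolder holder) {
        BoostAttribute attr = holder.addAttribute(BoostAttribute.class, BoostAttribute::new);
        attr.boost *= 0.5f;
    }

    public static void main(String[] args) {
        // Payload route: encode, let the filter run, decode again to inspect.
        byte[] payload = halveViaPayload(encodeBoost(2.0f));
        System.out.println("payload boost:   " + decodeBoost(payload));

        // Typed route: producer and filter share the same attribute instance.
        AttributeHolder holder = new AttributeHolder();
        holder.addAttribute(BoostAttribute.class, BoostAttribute::new).boost = 2.0f;
        halveViaAttribute(holder);
        System.out.println("attribute boost: "
                + holder.addAttribute(BoostAttribute.class, BoostAttribute::new).boost);
    }
}
```

Both routes end up with the same value; the difference is that every
filter on the payload route pays for a decode/encode round trip, while on
the typed route the filters just share one object.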
Michael
So, what I'm arguing against is adding some code (and forcing users to
migrate) just because we can, with no other reasons.