Right - this API is not required, even for flexible indexing as it looks
like it will emerge. I think it's just there to help. Originally, I
think the idea was to reduce how much casting was going to be needed.
Also, a given chain will more easily be able to deal with just the
attributes that it wants - rather than one stream with methods for a ton
of attributes that may or may not be there.
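
To make the "just the attributes it wants" point concrete, here is a rough
sketch in plain Java. It does not use Lucene's actual classes: Attr,
TermAttr, PayloadAttr, AttrSource and TermOnlyConsumer are all hypothetical
names made up for illustration, and the real API may well differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical marker interface; NOT Lucene's real Attribute type.
interface Attr {}

// Two hypothetical attributes a stream might or might not carry.
final class TermAttr implements Attr { String term = ""; }
final class PayloadAttr implements Attr { byte[] payload = new byte[0]; }

// Hypothetical stand-in for a stream that carries a set of attributes.
final class AttrSource {
    private final Map<Class<? extends Attr>, Attr> attrs = new HashMap<>();

    // Register an attribute instance under its concrete class.
    <T extends Attr> void add(T attr) { attrs.put(attr.getClass(), attr); }

    // Lets a consumer check for an attribute without touching it.
    <T extends Attr> boolean has(Class<T> type) { return attrs.containsKey(type); }

    // Typed lookup: the single cast lives here, not in every consumer.
    <T extends Attr> T get(Class<T> type) {
        Attr a = attrs.get(type);
        if (a == null) {
            throw new IllegalArgumentException("no attribute " + type.getSimpleName());
        }
        return type.cast(a);
    }
}

// A consumer that only knows about TermAttr and ignores everything else.
final class TermOnlyConsumer {
    static String describe(AttrSource src) {
        return "term=" + src.get(TermAttr.class).term;
    }
}
```

A consumer written this way never sees a method for payloads, offsets or
anything else it did not ask for, and a stream that carries no PayloadAttr
simply reports has(PayloadAttr.class) == false.
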
Because Lucene is a full text engine, I think it likely that the ingestion
process will continue to rely on text? That text can be converted to
anything by the index chain, and stored in whatever form in whatever file? I
think I'm missing something in your argument there.
- Mark
Earwin Burrfoot wrote:
Well, I have real use cases for it, but all of it is still missing the
biggest piece: search side support. It's the 900 lb. elephant in the room.
The 500 lb. elephant is the fact that all these attributes, AIUI, require
you to hook in your own indexing chain, etc. in order to even be indexed,
which is all package private stuff. It's not even clear to me what happens
right now if you were to, say, have a TokenStream that had only one
Attribute on it and none of the existing attributes (term buffer, length,
position, etc.). Please correct me if I am wrong, I still don't have a deep
understanding of it all.
Even pseudocode would be good. "Custom indexing chain for abstract
attributes" sounds like one of those microsoft.com definitions - serious,
determined, but vague.
If you take current Token and start throwing away some of its fields,
the resulting index contents are obvious for some combinations and
absurd for others. You don't need this new API to handle obvious ones.
Oh, and now it seems the new QP is dependent on it all.
That's why I said earlier "before more damage is done".
Michael has always been up front that this new API is in preparation for
flexible indexing. It doesn't give us the goodness - he has laid out the
reasons for moving before the goodness comes more than once I think.
My problem is not waiting for 'goodness'. It is that I don't currently
see what goodness will come from this API even in the remote future.
That's why I am asking! :)
Flexible indexing will lead to all kinds of little cool things - the likes of
which have been discussed a lot in older emails. It will likely lead to things
we cannot predict as well.
Everything will be more flexible. It also could play a part in CSF, and work on
allowing custom files to plug into merging. Plus everything else that's been
mentioned (pfor, etc.). I've been sold on the long term benefits. I don't think
you need this API for them, but it's my understanding it helps solve part of the
equation.
Yeah. I too, would like to see all these little cool things, and I
don't think we need this API for them.
Flexible indexing is going to handle various different datatypes
besides text, so I can only reiterate - it cannot rely on a generic
stream-based text-handling API for consuming data.
A bunch of issues have come up. To my knowledge, they have been addressed with
vigor every time. If someone is unhappy with how something has been addressed,
and it needs to be addressed further, please speak up. Otherwise, I don't think the
sky is falling - I think the new API is being shaken out.
An API is born dead without use cases. If a year later we get closer to
the flexible indexing it is supposed to support, and then we understand we
missed some crucial thing - WHAM! our back-compat policy kicks in and
makes our lives miserable once more.
--
- Mark
http://www.lucidimagination.com