I do agree, Hoss, that it makes sense to think about an IndexDocument
and a SearchDocument or something along those lines for 3.0.
Also note, I added some comments to 1219 about the history of Fieldable.
On Mar 18, 2008, at 9:26 PM, Chris Hostetter wrote:
: Really, I think we could go just back to a single Field class instead
: of the three classes Fieldable, AbstractField and Field. If we had
: this then LUCENE-1219 would be easier to cleanly implement.
It's probably worth reviewing the original reasons why Fieldable and
AbstractField were added...
http://issues.apache.org/jira/browse/LUCENE-545
I'm not intimately familiar with most of this, but at its core, the
purpose seems primarily related to Fields in *returned* documents after
a search has been performed, particularly relating to lazy loading -- so
that alternate impls could be returned based on FieldSelector options.
I'm not sure how much consideration was given to the impacts on future
changes to the API of Documents/Fields being *indexed*.
somewhere i wrote a nice long diatribe on how in my opinion the biggest
flaw in Lucene's general API was the reuse of "Document" and "Field" for
two radically different purposes, such that half the methods in each
class are meaningless in the other half of the contexts they are used
for... i can't find it, but here's a less angry version of the same
sentiment, plus some followup discussion...
http://www.nabble.com/-jira--Created%3A-%28LUCENE-778%29-Allow-overriding-a-Document-to8406796.html
https://issues.apache.org/jira/browse/LUCENE-778
(note that parallel discussion occurred both in email replies and in
Jira comments, they are both worth reading)
All of this is fixable in Lucene 3.0, where we will be free to change
the API; but in the meantime, the fact that 2.3 uses an interface means
we are stuck with supporting it without changing it in 2.4, since right
now clients can implement their own Fieldable impl and then pass it to
Document.add(Fieldable) before indexing the doc.
(things would be a lot easier if the old Document.add(Field) had been
left alone and documented as being explicitly for *indexing* docs, while
a new method was used for Documents being returned by searches ... but
that's water under the bridge)
The best short term approach I can think of for addressing LUCENE-1219
in 2.4:
1) list the new methods in a new interface that extends Fieldable
(ByteArrayReuseFieldable or something)
2) add the new methods to AbstractField so that it implements
ByteArrayReuseFieldable
3) put an instanceof check for ByteArrayReuseFieldable in
DocumentsWriter.
It's not pretty, but it's backwards compatible.
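A rough sketch of what those three steps might look like. Everything here is a simplified stand-in written for illustration, not the real Lucene classes: the method names binaryValue(), binaryOffset() and binaryLength(), and the WriterSketch class standing in for DocumentsWriter, are all assumptions.

```java
// Hypothetical stand-in for the existing 2.3 interface; it must not change.
interface Fieldable {
    String name();
    byte[] binaryValue();
}

// Step 1: the new methods live in a sub-interface, so existing
// third-party Fieldable impls keep compiling untouched.
interface ByteArrayReuseFieldable extends Fieldable {
    int binaryOffset(); // assumed method name
    int binaryLength(); // assumed method name
}

// Step 2: the abstract base class implements the new interface and
// supplies defaults that cover the whole array.
abstract class AbstractField implements ByteArrayReuseFieldable {
    private final String name;

    AbstractField(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }

    public int binaryOffset() {
        return 0;
    }

    public int binaryLength() {
        byte[] value = binaryValue();
        return value == null ? 0 : value.length;
    }
}

// A field that exposes only a slice of a (possibly reused) byte array.
final class Field extends AbstractField {
    private final byte[] value;
    private final int offset;
    private final int length;

    Field(String name, byte[] value, int offset, int length) {
        super(name);
        this.value = value;
        this.offset = offset;
        this.length = length;
    }

    public byte[] binaryValue() {
        return value;
    }

    @Override
    public int binaryOffset() {
        return offset;
    }

    @Override
    public int binaryLength() {
        return length;
    }
}

// Step 3: the writer checks for the new interface and falls back to
// the old whole-array behavior for plain Fieldable impls.
final class WriterSketch {
    static int bytesToWrite(Fieldable f) {
        if (f instanceof ByteArrayReuseFieldable) {
            return ((ByteArrayReuseFieldable) f).binaryLength();
        }
        byte[] value = f.binaryValue();
        return value == null ? 0 : value.length;
    }
}
```

A client that implemented Fieldable directly against 2.3 goes down the fallback branch, while anything built on AbstractField picks up the offset/length behavior.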
This reminds me of a slightly off topic idea that's been floating around
in the back of my head for a while relating to our backwards
compatibility commitments and the issues of interfaces and abstract
classes (which i haven't thought through all the way, but i'm throwing
it out there as long as we're talking about it) ...
Committers tend to prefer abstract classes for extension points because
it makes it easier to support backwards compatibility in the cases where
we want to add methods to extendable APIs and the "default" behavior for
these new methods is simple (or obvious delegation to existing methods)
so that people who have written custom impls can upgrade easily without
needing to invest time in making changes.
But abstract classes can be harder to mock when doing mock testing, and
some developers would prefer interfaces that they can implement with
their existing classes -- i suspect these people who would prefer
interfaces are willing to invest the time to make changes to their impls
when upgrading lucene if the interfaces were to change.
Perhaps the solution is a middle ground: altering our APIs such that all
extension points we advertise have both an abstract base class as well
as an interface, and all methods that take them as arguments use the
interface name. Then we relax our backcompat commitments such that we
guarantee between minor releases that the interfaces won't change unless
the corresponding abstract base class changes to account for it ... so
if customers subclass the base class their code will continue to work,
but if they implement the interface directly, ignoring the base class,
they are on their own to ensure their code compiles against future minor
versions.
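To make the idea concrete, here is a minimal sketch of one such extension point. The names TermNormalizer, TermNormalizerBase, LowerCaseNormalizer and PipelineSketch are invented for illustration and are not part of Lucene; normalizeAll() plays the role of a method added in a later minor release.

```java
import java.util.Locale;

// The interface: every API method that accepts this extension point
// uses this type in its signature.
interface TermNormalizer {
    String normalize(String term);

    // Imagine this method was added in a later minor release.
    String[] normalizeAll(String[] terms);
}

// The matching abstract base class: it changes in the same release and
// supplies a default for the new method by delegating to the old one.
abstract class TermNormalizerBase implements TermNormalizer {
    public String[] normalizeAll(String[] terms) {
        String[] out = new String[terms.length];
        for (int i = 0; i < terms.length; i++) {
            out[i] = normalize(terms[i]);
        }
        return out;
    }
}

// A user subclass written before normalizeAll() existed. It still
// compiles and works, because the base class filled in the gap.
final class LowerCaseNormalizer extends TermNormalizerBase {
    public String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }
}

// Library-side code only ever names the interface.
final class PipelineSketch {
    static String[] run(TermNormalizer normalizer, String[] terms) {
        return normalizer.normalizeAll(terms);
    }
}
```

Someone who implemented TermNormalizer directly, for example on an existing class or a mock, would have to add normalizeAll() themselves when upgrading; that is the trade-off the relaxed commitment would make explicit.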
Like i said, i haven't thought it through completely, but at first
glance it seems like it would give both committers and lucene users a
lot of extra flexibility, without sacrificing much in the way of
compatibility commitments. the key would be in adopting it rigorously
and religiously.
-Hoss
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ