On 6/16/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
Ryan: independent of the javadoc comment on loadStoredFields about it
possibly being refactored somewhere else, the build method doesn't really
match the semantics of the DocumentBuilder class.

i think i commented in SOLR-193 that it didn't belong in DocumentBuilder,

The original DocumentBuilder was very efficient at least... it
directly built the Lucene document with no intermediate state (but
then we added constraint checking, like multiValued, etc, and we had
to build a hash anyway).

Something else I had in mind for DocumentBuilder was that it could
also be used on the client side to stream XML documents with no
intermediate state... probably not a use case that most are interested
in though, esp since SOLR-20 is now in.

Some other comments... sorry if these have already been discussed, but
I'm finding less time to keep up with JIRA patches (until after they
are committed sometimes).

SolrDocument :
- should it be Iterable at least?
- I'm not crazy about an ArrayList per field, considering that most
fields aren't multiValued, but I guess it's not too much of an issue.
The alternative would be to have getField() return Object instead of
Collection<Object>
- I have to wonder why we have a SolrDocument at all though, compared
to a Map<String,Collection<Object>>
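
A minimal sketch of the "getField() returns Object" alternative, to make
the trade-off concrete. SimpleDoc and its method names are hypothetical,
not the SolrDocument from the patch, and it assumes field values are never
null and never a List themselves. A single value is stored bare, and a
list is only allocated when a second value shows up:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the SolrDocument class from the patch.
class SimpleDoc implements Iterable<Map.Entry<String, Object>> {
  // Each entry is either a bare Object (single value) or a List<Object>
  // (two or more values). Assumes values are never null or Lists themselves.
  private final Map<String, Object> fields = new LinkedHashMap<String, Object>();

  @SuppressWarnings("unchecked")
  void addField(String name, Object value) {
    Object existing = fields.get(name);
    if (existing == null) {
      fields.put(name, value);                 // common case: no list allocated
    } else if (existing instanceof List) {
      ((List<Object>) existing).add(value);    // already multi-valued
    } else {
      List<Object> values = new ArrayList<Object>(2);
      values.add(existing);                    // upgrade single value to a list
      values.add(value);
      fields.put(name, values);
    }
  }

  // Returns the bare value, a List<Object>, or null if the field is absent.
  Object getField(String name) {
    return fields.get(name);
  }

  // Uniform view for callers that always want a collection.
  @SuppressWarnings("unchecked")
  Collection<Object> getFieldValues(String name) {
    Object v = fields.get(name);
    if (v == null) return Collections.emptyList();
    if (v instanceof List) return (List<Object>) v;
    return Collections.singletonList(v);
  }

  // Iterates over (field name, value-or-list) pairs in insertion order.
  public Iterator<Map.Entry<String, Object>> iterator() {
    return fields.entrySet().iterator();
  }
}
```

The cost is an instanceof check on every access, which is roughly why
this ends up looking like a thin wrapper over a Map anyway.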


SolrInputDocument :
- it seems like it should convey extra state (currently field and
document boosts), but not behavior.  keepDuplicates logic seems like
extra overhead that will rarely be useful, and if it is useful, the
logic should probably be executed when building the Lucene Document
(when the schema is available for more info).
- keeping boosts in an extra map : uggg... going from a simple float
to boxing + hashing doesn't seem great performance-wise, and it also
doesn't match the current Lucene Document interface which allows a
boost per field value (although lucene indexing currently lacks the
ability to boost them separately).
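
A rough sketch of the "state, not behavior" shape I have in mind, with the
boost kept as a plain float next to each value instead of in a side map.
InputField and InputDoc are made-up names for illustration, not classes
from the patch, and duplicate handling is deliberately left out on the
assumption that it happens later, when the Lucene Document is built with
the schema in hand:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative, not from the patch.
class InputField {
  final String name;
  final Object value;
  final float boost;   // primitive float: no boxing, no map lookup

  InputField(String name, Object value, float boost) {
    this.name = name;
    this.value = value;
    this.boost = boost;
  }
}

// Carries fields and boosts only; no duplicate checking or schema logic.
class InputDoc {
  private final List<InputField> fields = new ArrayList<InputField>();
  private float boost = 1.0f;   // document-level boost

  void addField(String name, Object value) {
    addField(name, value, 1.0f);
  }

  // One boost per added value, so repeated names each keep their own boost.
  void addField(String name, Object value, float boost) {
    fields.add(new InputField(name, value, boost));
  }

  void setBoost(float boost) { this.boost = boost; }

  float getBoost() { return boost; }

  List<InputField> getFields() { return fields; }
}
```

Since every value carries its own boost, this lines up with a
boost-per-field-value interface without any extra bookkeeping.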


-Yonik
