[
https://issues.apache.org/jira/browse/LUCENE-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120782#comment-13120782
]
Michael McCandless commented on LUCENE-3186:
--------------------------------------------
I hit several test failures w/ the patch, all from same exc, eg:
{noformat}
ant test-core -Dtestcase=TestIndexWriterDelete -Dtestmethod=testDeleteAllSlowly -Dtests.seed=-8b03171493c8e71:-20c60f02979a36ec:-485ea5dac6836e33
{noformat}
I didn't dig...
Net/net this approach looks good! Though I have to say it's hard to
follow all the numerous tied-together classes we now have for
DocValues... I find myself getting "lost" but I don't know how to
simplify it.
Random feedback:
* About the nocommit in SegmentMerger: I think writing FIS after
merge is OK? In theory nothing should care? Anything that needs
FIS during merging should receive the object, not load it from
the directory...
* You can remove the TODO in DocValuesConsumer.merge :)
* Make sure we mark TypePromoter as @lucene.internal.
* If we hit exc inside FixedStraightBytesImpl.merge, we are still
setting merge=true; is finish then called, up above? (At which
point we are going to try to write to the closed file).
* Maybe rename that bool to "merged" or "didMerge" or something?
"merge" sounds like it's an imperative command.
* The jdocs for DocValuesConsumer.merge say "Merging segments out of
order is not supported", but you just mean segments must arrive in
sorted order right? (Ie, TieredMP merges non-sequential segments,
which we have been calling out-of-order merge).
* Since DocValuesConsumer.merge ignores the incoming IndexDocValues
(a MultiIndexDocValues), do we even need to pass that in...?
* It's sort of confusing how we have a DocValuesConsumer.MergeState
(holds one segment's reader) and the oal.index.codecs.MergeState
(references all segments); maybe rename the former to
SingleReaderMergeState? SubMergeState? (Something to indicate it
just covers one sub reader at a time).
* It looks like we don't handle the case of merging segs when a
given field is always FixedStraightBytes but the size had changed
from one seg to another? (We throw IAE in
FixedStraightBytesImpl.merge). Are there other cases that will
lead to "late binding" exc during merge?
* In TypePromoter, instead of "promoted.flags & ~(1 << 3)" can you
name that constant bit mask? EG IS_BYTES or NOT_NUMERIC
or something?
* With this patch, does this mean we can type-promote on the fly?
Ie, if I make a SlowMRWrapper, and pull its perDocValues, we will
present the promoted type (across all segments) correctly? And
when the user looks up the value for a certain docID, we promote
it as needed?
* Typo in comment in Floats.java: "only bulk merge is value type is
the same otherwise size differs": change "is" to "if"
* Can we rename Writer.setNextEnum -> Writer.setNextMergeEnum?
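A minimal sketch of the two merge-flag points above (only set the flag when the merge actually completed, and name it for an outcome rather than a command). The class and method names here are hypothetical and are not taken from the patch; it also assumes that the output is closed at the end of merge on both the success and the failure path, and that finish skips its own write when the bulk merge already produced the file.

```java
import java.io.IOException;

// Hypothetical stand-in for a DocValues writer; none of these names come from the patch.
class MergeFlagSketch {
    // Named for an outcome ("did the bulk merge complete?") rather than a command.
    private boolean didMerge = false;
    private boolean outputOpen = true;

    // failDuringCopy simulates an exception thrown part-way through the bulk copy.
    void merge(boolean failDuringCopy) throws IOException {
        boolean success = false;
        try {
            if (failDuringCopy) {
                throw new IOException("simulated failure during bulk copy");
            }
            success = true;
        } finally {
            outputOpen = false;  // assumption: the output is closed on both paths
            didMerge = success;  // only a completed copy counts as merged
        }
    }

    // Returns true if finish() had to write something itself.
    boolean finish() throws IOException {
        if (didMerge) {
            return false; // the bulk merge already wrote and closed the file
        }
        if (!outputOpen) {
            // Makes the "write to the closed file" case explicit instead of silent.
            throw new IOException("finish() called after a failed merge closed the output");
        }
        outputOpen = false;
        return true;
    }

    boolean didMerge() {
        return didMerge;
    }
}
```

With the flag tied to success, a failed merge leaves didMerge false, so a later finish call cannot mistake the half-written file for a completed bulk merge.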
> DocValues type should be recorded in FNX file to early fail if user specifies
> incompatible type
> ----------------------------------------------------------------------------------------------
>
> Key: LUCENE-3186
> URL: https://issues.apache.org/jira/browse/LUCENE-3186
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.0
> Reporter: Simon Willnauer
> Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-3186.patch
>
>
> Currently segment merger fails if the docvalues type is not compatible across
> segments. We already catch this problem if somebody changes the values type
> for a field within one segment but not across segments. In order to do that
> we should record the type in the fnx file along with the field numbers.
> I marked this 4.0 since it should not block the landing on trunk.