[
https://issues.apache.org/jira/browse/LUCENE-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041448#comment-13041448
]
Shai Erera commented on LUCENE-3153:
------------------------------------
The difference between the two is that on add/UpdateDocument, we can fail fast.
Upon commit, it's a failure that happens too late.
So I'm not at all convinced now that we should fail on this. Really, apps
shouldn't be fiddling w/ norms, at least the apps I know of always index a
field the same way. I don't know how common it is for apps to flip the norms
bit, and clearly they can only do it one way. So maybe what we should be doing
is:
* Consolidate norms info in .fnx -- that's a good idea regardless of the
issue.
* Have javadocs sort out any confusion -- we don't fail add/updateDoc attempts,
just follow javadocs semantics
* Provide an API for apps to disable norms for a field, since that's practically
the only direction in which we want to allow the aforementioned change.
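To make the "one direction only" semantics concrete, here is a minimal sketch. It is a toy model, not Lucene code: FieldNormsRegistry, addField, disableNorms and hasNorms are hypothetical names, and the map merely stands in for the per-field norms info that would be consolidated in .fnx. It assumes we do not fail on add and simply report whether norms were kept.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical model of a central, per-field norms flag.
 * Illustrative only; none of these names exist in Lucene.
 */
final class FieldNormsRegistry {
    // field name -> true while norms are still enabled for that field
    private final Map<String, Boolean> normsEnabled = new HashMap<>();

    /** Records a field occurrence and returns whether norms are actually kept. */
    boolean addField(String name, boolean wantsNorms) {
        // The first occurrence sets the flag. Afterwards it can only stay as is
        // or flip from enabled to disabled, never back to enabled.
        boolean enabled = normsEnabled.getOrDefault(name, true) && wantsNorms;
        normsEnabled.put(name, enabled);
        return enabled;
    }

    /** Hypothetical explicit API: the app deliberately drops norms for a field. */
    void disableNorms(String name) {
        normsEnabled.put(name, false);
    }

    /** Unknown fields are reported as having no norms. */
    boolean hasNorms(String name) {
        return normsEnabled.getOrDefault(name, false);
    }
}
```

In this sketch, adding "a" w/ norms after disableNorms("a") does not throw; addField just returns false, which is the "follow javadocs semantics" behavior rather than the fail-fast one.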
Hmm ... another scenario hit me as I wrote the above lines:
* App adds a field w/o norms.
* App deletes the document w/ the field
* App adds a field w/ norms -- now what? norms are marked disabled for that
field, but the only document that caused that is deleted.
commit() can be called in between and several documents can be added w/ and w/o
norms -- point is, this just gets complicated. This is another reason IMO to
let apps manage norms and trust that they don't fiddle w/ norms. The
'disableNorms' API may still be useful for an app that does not fiddle w/
norms, but decides it does not need norms for a field anymore.
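The delete scenario above can be reproduced with a toy model. This is not Lucene code: ToyNormsIndex and its methods are made up for illustration, and it assumes the per-field "norms omitted" bit is sticky and is not revisited when a document is deleted, as the scenario describes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy index (hypothetical): live docs plus a sticky per-field "norms omitted" bit. */
final class ToyNormsIndex {
    private final Map<Integer, String> liveDocs = new HashMap<>(); // docId -> field name
    private final Set<String> omitted = new HashSet<>();           // fields w/ norms disabled
    private int nextDocId = 0;

    int addDocument(String field, boolean withNorms) {
        if (!withNorms) {
            omitted.add(field); // sticky: never cleared by later adds
        }
        liveDocs.put(nextDocId, field);
        return nextDocId++;
    }

    void deleteDocument(int docId) {
        // Deleting a doc does not touch the field-level bit.
        liveDocs.remove(docId);
    }

    boolean isNormsOmitted(String field) {
        return omitted.contains(field);
    }

    int liveDocCount() {
        return liveDocs.size();
    }

    public static void main(String[] args) {
        ToyNormsIndex idx = new ToyNormsIndex();
        int first = idx.addDocument("a", false); // field added w/o norms
        idx.deleteDocument(first);               // the only such doc is deleted
        idx.addDocument("a", true);              // field added w/ norms
        // The one live doc asked for norms, yet the field is still marked as omitted.
        System.out.println(idx.liveDocCount() + " live doc(s), norms omitted: "
                + idx.isNormsOmitted("a"));
    }
}
```

Resetting the bit on delete would mean tracking which documents caused it, across commits and merges, which is the kind of complexity that argues for leaving norms management to the app.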
> Adding field w/ norms should fail if same field was added w/o norms already
> ---------------------------------------------------------------------------
>
> Key: LUCENE-3153
> URL: https://issues.apache.org/jira/browse/LUCENE-3153
> Project: Lucene - Java
> Issue Type: Bug
> Components: core/index
> Reporter: Shai Erera
> Fix For: 4.0
>
>
> A spinoff from LUCENE-3146. Consider the following two scenarios, according
> to how 4.0 currently works:
> * Field "a" is added w/ norms. Sometime later field "a" is added to a
> document w/o norms -- norms are disabled for field "a", for all docs.
> * Field "a" is added w/o norms - norms are disabled for field "a". Sometime
> later field "a" is added to a document w/ norms -- app thinks norms were
> added, while in fact they are dropped.
> This is a bug and case #2 should fail on add/updateDocument - app should know
> norms were not added. While case #1 isn't great either, it's the only way an
> app can choose to disable norms for field "a", after instances of it already
> contain norms, so we should support that scenario.
> In order to detect that early, we should track norms info in .fnx, as Mike
> describes at LUCENE-3146. Since this changes the index format, we should also
> update the "file format" page after we do it.
> Not sure what's the deal w/ 3.x indexes that are read by 4.0 code. Initially
> they won't have a .fnx file, so no central norms information exists to detect
> the cases I've described above. Over time, as segments are merged, .fnx will
> include information from more and more segments, but there's always a chance
> a few segments will still contain the norms for field "a". I'm not very
> familiar w/ that part of the code, but I think that:
> * If .fnx says "no norms for field a", then we ignore any norms information
> that may or may not exist in segments.
> * If .fnx says "norms for field a", then we need to make up some norms values
> for (old) segments w/ no norms? We need to make up values during segment
> merge and search?