[ https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913646#action_12913646 ]
Michael McCandless commented on LUCENE-2649:
--------------------------------------------

bq. I think this is a better option than adding a parameter to Parser since we can have an easy upgrade path. Parser is an interface, so we can not just add to it without breaking compatibility. To change things in 4.x, 3.x should have an upgrade path.

Hmm... I'd rather make an exception to 3.x, ie, allow the addition of this method to the interface, than confuse the 4.x API, going forward, with 2 classes? Creating a custom FieldCache parser is an extremely advanced use case... very few users do this, and those that do will grok this method?

bq. However, I don't cache the Bits separately since this is an edge case that should be avoided, but at least does not fail if you are not consistent.

This makes me nervous since it can now lead to further cases of field cache insanity, ie, you loaded it once w/o the valid bits, and again w/ the valid bits, and now your values array is taking up 2X the RAM. It's already bad enough that FC allows one kind of insanity :)

bq. This does cache a MatchAllBits even when 'cacheValidBits' is false, since that is small (a small class with one int)

Hmm... but if I pass false here, it shouldn't spend any time allocating the bit set, building it, checking the bit set for "all bits set", etc.?

{quote}
bq. * We don't have to @Deprecate for 4.0 - just remove it, and note this in MIGRATE.txt. (Though for 3.x we need the deprecation, so maybe do 3.x patch first, then remove deprecations for 4.0?).

My plan was to apply with deprecations to 4.x, then merge with 3.x. Then replace the calls in 4.x, then remove the old functions. Does this sound reasonable?
{quote}

OK that sounds like a good plan!

bq. Right, the ValidBits are only checked for docs that exist (and the FC values are only set for docs that exist -- this has not changed), and may contain false positives for deleted docs. I think this is OK since most use cases (I can think of) deal with deletions anyway. Any ideas how/if we should change this?

I think this is the right approach -- expecting FC's valid bits to take deletions into account is too much. We have IR.getDeletedDocs for this.

But, eg this means classes like FCRF will still have to consult deleted docs.

Really, "in general" we need a better way for the query execution path to enforce deleted docs. Eg if the FCRF will be AND'd w/ a query that's already excluding del docs then it need not be careful about deletions...

bq. (I did not realize that the FC is reused after deletions -- so clever)

Ha! There was a time when it didn't ;)

> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.
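
For readers skimming the thread, here is a rough, plain-Java sketch of the idea in the issue description and of the deletions point in the comment. It does not use the Lucene API or the attached patch: `IntValuesWithBits`, `ValidBitsSketch`, `load`, and `matches` are hypothetical names, and `java.util.BitSet` stands in for whatever bits type the patch settles on. It only shows why a values array alone cannot distinguish "value is 0" from "no value", and why a range check would still need to consult deleted docs separately.

```java
import java.util.BitSet;

/** Hypothetical holder: one value per doc plus a bit for each doc that really had a value. */
final class IntValuesWithBits {
    final int[] values;
    final BitSet valid; // bit set => the doc had a value for the field

    IntValuesWithBits(int[] values, BitSet valid) {
        this.values = values;
        this.valid = valid;
    }
}

final class ValidBitsSketch {

    /** Builds the holder from per-doc field text; null means the doc has no value. */
    static IntValuesWithBits load(String[] fieldPerDoc) {
        int[] values = new int[fieldPerDoc.length];
        BitSet valid = new BitSet(fieldPerDoc.length);
        for (int doc = 0; doc < fieldPerDoc.length; doc++) {
            if (fieldPerDoc[doc] != null) {
                values[doc] = Integer.parseInt(fieldPerDoc[doc]);
                valid.set(doc);
            }
        }
        return new IntValuesWithBits(values, valid);
    }

    /**
     * Range check in the spirit of a field-cache range filter. The valid bits
     * know nothing about deletions, so deleted docs are checked separately.
     */
    static boolean matches(IntValuesWithBits cache, BitSet deleted, int doc, int min, int max) {
        return cache.valid.get(doc)
            && !deleted.get(doc)
            && cache.values[doc] >= min
            && cache.values[doc] <= max;
    }

    public static void main(String[] args) {
        // doc 0 has the value 0, doc 1 has no value, doc 2 has the value 7
        IntValuesWithBits cache = load(new String[] {"0", null, "7"});
        BitSet deleted = new BitSet();
        deleted.set(2); // doc 2 is deleted, but its valid bit stays set

        for (int doc = 0; doc < cache.values.length; doc++) {
            System.out.println("doc " + doc + " matches [0, 10]: "
                + matches(cache, deleted, doc, 0, 10));
        }
    }
}
```

In this sketch docs 0 and 1 both store 0 in the values array, and only the valid bit tells them apart. Doc 2 keeps its valid bit after deletion, which is the "false positive for deleted docs" case discussed above.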