[ 
https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913646#action_12913646
 ] 

Michael McCandless commented on LUCENE-2649:
--------------------------------------------


bq. I think this is a better option than adding a parameter to Parser since we 
can have an easy upgrade path. Parser is an interface, so we cannot just add 
to it without breaking compatibility. To change things in 4.x, 3.x should have 
an upgrade path.

Hmm... I'd rather make an exception to 3.x, ie, allow the addition of
this method to the interface, than confuse the 4.x API, going forward,
with 2 classes?

Creating a custom FieldCache parser is an extremely advanced use
case... very few users do this, and those that do will grok this
method?

bq. However, I don't cache the Bits separately since this is an edge case that 
should be avoided, but at least does not fail if you are not consistent.

This makes me nervous since it can now lead to further cases of field
cache insanity, ie, you loaded it once w/o the valid bits, and again
w/ the valid bits, and now your values array is taking up 2X the RAM.

It's already bad enough that FC allows one kind of insanity :)
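
To make the 2X concern concrete, here is a minimal sketch of how it could
arise. It is a toy, not Lucene's FieldCache: the class ToyFieldCache, its
methods, and the idea that the cache key includes the valid-bits flag are all
assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy cache, not Lucene's FieldCache: the key includes the valid-bits flag. */
class ToyFieldCache {
    // Key is "field|withValidBits", so the same field can be cached twice.
    private final Map<String, int[]> entries = new HashMap<>();
    private final int maxDoc;

    ToyFieldCache(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    int[] getInts(String field, boolean withValidBits) {
        String key = field + "|" + withValidBits;
        // Each distinct key allocates (and, in a real cache, fills) a fresh values array.
        return entries.computeIfAbsent(key, k -> new int[maxDoc]);
    }

    /** Total ints held across all entries, a rough proxy for RAM used by values arrays. */
    long cachedInts() {
        long total = 0;
        for (int[] values : entries.values()) {
            total += values.length;
        }
        return total;
    }
}
```

Asking for the same field once with and once without valid bits leaves two
full-size arrays in the map, which is the inconsistency described above.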

bq. This does cache a MatchAllBits even when 'cacheValidBits' is false, since 
that is small (a small class with one int)

Hmm... but if I pass false here, it shouldn't spend any time
allocating the bit set, building it, checking the bit set for "all
bits set", etc.?
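
A rough sketch of the behavior I have in mind, under assumptions: ToyIntLoader
and its fields are hypothetical names, and a null entry in the input array
stands in for "this doc has no value". It is not the code in the patch.

```java
import java.util.BitSet;

/** Hypothetical loader; names are illustrative and not part of the patch. */
class ToyIntLoader {
    final int[] values;
    final BitSet validBits; // null when the caller did not ask for valid bits

    ToyIntLoader(Integer[] perDocValues, boolean cacheValidBits) {
        values = new int[perDocValues.length];
        // Allocate the bit set only on request; otherwise no bit-set work happens at all.
        BitSet bits = cacheValidBits ? new BitSet(perDocValues.length) : null;
        for (int doc = 0; doc < perDocValues.length; doc++) {
            Integer v = perDocValues[doc];
            if (v == null) {
                continue; // doc has no value, leave the default 0
            }
            values[doc] = v;
            if (bits != null) {
                bits.set(doc);
            }
        }
        validBits = bits;
    }
}
```

With cacheValidBits set to false the loop only fills the values array, and no
"all bits set" check is needed afterwards because there is nothing to check.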

{quote}
bq.     *  We don't have to @Deprecate for 4.0 - just remove it, and note this 
in MIGRATE.txt. (Though for 3.x we need the deprecation, so maybe do 3.x patch 
first, then remove deprecations for 4.0?).

My plan was to apply with deprecations to 4.x, then merge with 3.x.  Then 
replace the calls in 4.x, then remove the old functions.  Does this sound 
reasonable?
{quote}

OK that sounds like a good plan!

bq. Right, the ValidBits are only checked for docs that exist (and the FC 
values are only set for docs that exist -- this has not changed), and may 
contain false positives for deleted docs.  I think this is OK since most use 
cases (I can think of) deal with deletions anyway.   Any ideas how/if we should 
change this?

I think this is the right approach -- expecting FC's valid bits to
take deletions into account is too much.  We have IR.getDeletedDocs
for this.

But, eg this means classes like FCRF (FieldCacheRangeFilter) will still
have to consult deleted docs.

Really, "in general" we need a better way for the query execution path
to enforce deleted docs.  Eg if the FCRF will be AND'd w/ a query
that's already excluding del docs then it need not be careful about
deletions...
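
As a sketch of that split of responsibilities (assumptions: ToyRangeMatcher,
its constructor, and the callerExcludesDeletions flag are hypothetical and only
stand in for a filter like FCRF; plain java.util.BitSet replaces Lucene's Bits):

```java
import java.util.BitSet;

/** Hypothetical range check over cached values; a toy stand-in, not FieldCacheRangeFilter. */
class ToyRangeMatcher {
    private final int[] values;
    private final BitSet validBits;   // docs that actually have a value
    private final BitSet deletedDocs; // may be null if there are no deletions
    private final int min;
    private final int max;

    ToyRangeMatcher(int[] values, BitSet validBits, BitSet deletedDocs, int min, int max) {
        this.values = values;
        this.validBits = validBits;
        this.deletedDocs = deletedDocs;
        this.min = min;
        this.max = max;
    }

    boolean matches(int doc, boolean callerExcludesDeletions) {
        if (!validBits.get(doc)) {
            return false; // no value for this doc, so the default 0 must not match
        }
        // Valid bits may still be set for deleted docs, so check deletions
        // unless the caller (eg an AND'd query) already excludes them.
        if (!callerExcludesDeletions && deletedDocs != null && deletedDocs.get(doc)) {
            return false;
        }
        return values[doc] >= min && values[doc] <= max;
    }
}
```

The flag is just one way to express "someone upstream already handled
deletions"; how the query execution path would really communicate that is the
open question.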

bq.  (I did not realize that the FC is reused after deletions -- so clever)

Ha!  There was a time when it didn't ;)


> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.


