[ 
https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated LUCENE-2649:
----------------------------------

    Attachment: LUCENE-2649-FieldCacheWithBitSet.patch

Here is a new patch that removes the static config.  Rather than put a property 
on the Parser class, I added a class:
{code:java}
  public abstract static class CacheConfig {
    public abstract boolean cacheValidBits();
  }
{code}
and this gets passed to the getXXXValues function:
{code:java}
ByteValues getByteValues(IndexReader reader, String field,
                         ByteParser parser, CacheConfig config)
{code}
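
To illustrate how a flag like cacheValidBits() could drive what gets loaded, here is a minimal self-contained sketch. It does not use the patch's real classes: ByteValues, ByteValuesLoader, and the Byte[] "source" array (null meaning "doc has no value") are hypothetical stand-ins, and only the CacheConfig shape is taken from the snippet above.

```java
import java.util.BitSet;

// Same shape as the CacheConfig shown above; everything else is hypothetical.
abstract class CacheConfig {
    public abstract boolean cacheValidBits();
}

// Hypothetical holder: the per-doc values plus an optional set of docs that have a value.
final class ByteValues {
    final byte[] values;
    final BitSet valid; // null when valid bits were not requested

    ByteValues(byte[] values, BitSet valid) {
        this.values = values;
        this.valid = valid;
    }
}

final class ByteValuesLoader {
    // 'source' simulates the indexed field; a null entry means the doc has no value.
    static ByteValues load(Byte[] source, CacheConfig config) {
        byte[] values = new byte[source.length];
        BitSet valid = config.cacheValidBits() ? new BitSet(source.length) : null;
        for (int doc = 0; doc < source.length; doc++) {
            if (source[doc] != null) {
                values[doc] = source[doc];
                if (valid != null) {
                    valid.set(doc);
                }
            }
        }
        return new ByteValues(values, valid);
    }
}
```

With the flag off, a caller only gets the value array and cannot tell a missing value from a real zero; with it on, valid.get(doc) answers that question.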

I think this is a better option than adding a parameter to Parser since we can 
have an easy upgrade path.  Parser is an interface, so we cannot just add to 
it without breaking compatibility.  To change things in 4.x, 3.x should have an 
upgrade path.

I took Mike's suggestion and included the CacheConfig hashcode in the Cache key 
-- however, I don't cache the Bits separately since this is an edge case that 
*should* be avoided, but at least does not fail if you are not consistent.
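
As a rough sketch of what "the CacheConfig hashcode in the Cache key" could look like: the CacheKey class below is hypothetical and is not the patch's actual key class. It only shows that two requests for the same field and parser with differently-hashing configs end up under different keys, so an inconsistent caller gets a second entry rather than a failure.

```java
import java.util.Objects;

// Hypothetical cache key; field names and layout are assumptions for illustration.
final class CacheKey {
    final String field;
    final Object parser;   // parser instance, or null for the default parser
    final int configHash;  // hash code of the config used to build the entry

    CacheKey(String field, Object parser, Object config) {
        this.field = field;
        this.parser = parser;
        this.configHash = (config == null) ? 0 : config.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CacheKey)) return false;
        CacheKey other = (CacheKey) o;
        return field.equals(other.field)
            && Objects.equals(parser, other.parser)
            && configHash == other.configHash;
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, parser, configHash);
    }
}
```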

This does cache a MatchAllBits even when 'cacheValidBits' is false, since that 
is small (a small class with one int).
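
For readers unfamiliar with the idea, a "match all" bits object can be as small as the sketch below. The Bits interface here is a local stand-in and ValidBitsFactory is a hypothetical helper; the point is only that when every doc has a value, a single int (the length) replaces a full bit set.

```java
import java.util.BitSet;

// Local stand-in interface, declared here so the example is self-contained.
interface Bits {
    boolean get(int index);
    int length();
}

// Every doc is "valid"; the only state is the number of docs.
final class MatchAllBits implements Bits {
    private final int len;

    MatchAllBits(int len) {
        this.len = len;
    }

    public boolean get(int index) {
        return true;
    }

    public int length() {
        return len;
    }
}

// Wraps a java.util.BitSet for the general case.
final class BitSetBits implements Bits {
    private final BitSet bits;
    private final int len;

    BitSetBits(BitSet bits, int len) {
        this.bits = bits;
        this.len = len;
    }

    public boolean get(int index) {
        return bits.get(index);
    }

    public int length() {
        return len;
    }
}

// Hypothetical helper: use the compact form when all docs have a value.
final class ValidBitsFactory {
    static Bits compact(BitSet valid, int maxDoc) {
        return valid.cardinality() == maxDoc
            ? new MatchAllBits(maxDoc)
            : new BitSetBits(valid, maxDoc);
    }
}
```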

-----------

bq.     *  We don't have to @Deprecate for 4.0 - just remove it, and note this 
in MIGRATE.txt. (Though for 3.x we need the deprecation, so maybe do 3.x patch 
first, then remove deprecations for 4.0?).

My plan was to apply with deprecations to 4.x, then merge with 3.x.  Then 
replace the calls in 4.x, then remove the old functions.  Does this sound 
reasonable?

I would like this to get in 3.x since we could then remove many Solr types in 
4.x and have a 3.x migration path.

bq.  * FieldCache.EntryCreator looks orphan'd?

dooh, thanks


bq. It looks like the valid bits will not reflect deletions (by design), right? 
Ie caller cannot rely on valid always incorporating deleted docs. (Eg the 
MatchAll opto disregards deletions, and, a reopened segment can have new 
deletions yet shares the FC entry).

Right, the ValidBits are only checked for docs that exist (and the FC values 
are only set for docs that exist -- this has not changed), and may contain 
false positives for deleted docs.  I think this is OK since most use cases (I 
can think of) deal with deletions anyway.   Any ideas how/if we should change 
this?  (I did not realize that the FC is reused after deletions -- so clever)
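
A caller that does need "has a value and is not deleted" can combine the two sets itself. The sketch below uses plain java.util.BitSet for both, which is an assumption made for illustration; LiveValidDocs is a hypothetical helper, not part of the patch.

```java
import java.util.BitSet;

final class LiveValidDocs {
    // Returns the docs that have a value AND are not deleted.
    // Neither input is modified.
    static BitSet liveAndValid(BitSet valid, BitSet deleted) {
        BitSet result = (BitSet) valid.clone();
        result.andNot(deleted); // clear every bit that is set in 'deleted'
        return result;
    }
}
```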

----------------

bq. I'm having trouble understanding the use case for this bitset.

My motivation is for supporting the sortMissingLast feature in Solr sorting 
(that could now be pushed to Lucene).  For example if I have a bunch of 
documents and only some have the field "bytes" -- sorting 'bytes desc' works 
great, but sorting 'bytes asc' puts all the documents that do not have the 
field 'bytes' first since the FieldCache thinks they are all zero.
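
To make the sorting problem concrete, here is a small sketch of an ascending sort that uses valid bits to push docs without a value to the end. It is not Solr's or Lucene's comparator; MissingLastSort and its parameters are hypothetical, and doc ids are sorted as boxed Integers purely to keep the example short.

```java
import java.util.BitSet;
import java.util.Comparator;

final class MissingLastSort {
    // Orders doc ids by ascending value; docs with no value go last.
    static Comparator<Integer> ascendingMissingLast(final byte[] values, final BitSet valid) {
        return (a, b) -> {
            boolean va = valid.get(a);
            boolean vb = valid.get(b);
            if (va != vb) {
                return va ? -1 : 1; // the doc that has a value sorts first
            }
            if (!va) {
                return Integer.compare(a, b); // both missing: tie-break on doc id
            }
            int c = Byte.compare(values[a], values[b]);
            return c != 0 ? c : Integer.compare(a, b);
        };
    }
}
```

Given values {3, 0, 1, 0} where only docs 0 and 2 actually have a value, sorting on the raw array alone would put docs 1 and 3 first because their missing values read as zero; with the valid bits consulted, docs 2 and 0 come first and the two missing docs follow.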

If we get this working in Solr, we can deprecate and delete all the "sortable" 
number fields and have that same functionality on Trie* fields.







> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, 
> LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  
> However there is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a 
> BitSet for all valid docs.
