[jira] [Commented] (LUCENE-4883) Hide FieldCache behind an UninvertingFilterReader

Robert Muir (JIRA) Tue, 26 Mar 2013 05:33:18 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613730#comment-13613730 ]


Robert Muir commented on LUCENE-4883:
-------------------------------------

I've been thinking about this in a lot of detail over the last few months, so I
have a few more ideas (I'm not sure this is really the best/easiest path):

Currently FieldCache "uses" the docvalues APIs, but violates them in a couple
of ways. I was trying to think of ways we could do this long term that would
give us a FilterReader that would also pass CheckIndex. If we can do this, it's
nice, as someone could call IndexWriter.addIndexes(ir) and "upgrade" from
fieldcache to docvalues. But unfortunately I think it's a good deal of work and
not easy to do immediately.

Anyway I think these are the three trickiest parts:

# How can we make the FilterReader's fieldinfos consistent with the docvalues
types? I think it needs to take this information up-front: a mapping of field
names from the underlying fieldinfos to docvalues types. Note that this would
also make fieldcache "type insanity" impossible. It also lets someone easily
control which fields are allowed to have fieldcaches built for them.
# How can we prevent non-dense ordinals (e.g. the case where someone "sorts on
a multivalued field")? Today Lucene happily allows this, but with a typed,
no-insanity FilterReader I think we should throw an exception instead: it means
someone specified the incorrect docvalues type for the field (it should have
been SORTED_SET). Also, in the FilterReader's ctor, we can try to use
underlying statistics on the field to detect up front whether any fields are
actually multivalued, and throw an exception early.
# How can we expose "missing" for NumericDocValues? One idea is just to see
this "bitset" as another NumericDocValues field (that only has values of 0 or
1) and provide sugar in the API that makes this happen automatically. I think
for SortedDocValues we should actually try to move to the same approach
long-term (instead of returning -1).
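
To make the three points above concrete, here is a rough sketch in plain Java.
It is a toy model, not Lucene code and not the attached patch: DvType,
FieldStats, NumericView and TypedUninvertingView are all hypothetical names
made up for illustration, and the "more values than documents" check is only
one assumed way of using field statistics. It also assumes that 1 in the 0/1
view means "document has a value".

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the docvalues types; not Lucene's real enum. */
enum DvType { NUMERIC, BINARY, SORTED, SORTED_SET }

/** Hypothetical per-field statistics taken from the underlying reader. */
final class FieldStats {
    final int docCount;     // documents that have at least one value
    final long valueCount;  // total values across those documents

    FieldStats(int docCount, long valueCount) {
        this.docCount = docCount;
        this.valueCount = valueCount;
    }
}

/** Minimal numeric per-document view, modeled loosely on NumericDocValues. */
interface NumericView {
    long get(int docID);
}

/** Toy model of a typed uninverting view; it does no real uninverting. */
final class TypedUninvertingView {
    private final Map<String, DvType> types;

    /** Takes the field-to-type mapping up front and validates it eagerly. */
    TypedUninvertingView(Map<String, DvType> mapping, Map<String, FieldStats> stats) {
        for (Map.Entry<String, DvType> e : mapping.entrySet()) {
            FieldStats s = stats.get(e.getKey());
            boolean singleValued = e.getValue() != DvType.SORTED_SET;
            // More values than documents means some document is multivalued.
            if (s != null && singleValued && s.valueCount > s.docCount) {
                throw new IllegalArgumentException("field \"" + e.getKey()
                    + "\" looks multivalued; expected SORTED_SET, got " + e.getValue());
            }
        }
        this.types = new HashMap<>(mapping);
    }

    /** Returns the declared type, or null if the field may not be uninverted. */
    DvType typeOf(String field) {
        return types.get(field);
    }

    /** Rejects any request that disagrees with the declared type. */
    void checkType(String field, DvType requested) {
        DvType declared = types.get(field);
        if (declared != requested) {
            throw new IllegalStateException("field \"" + field + "\" is declared as "
                + declared + ", not " + requested);
        }
    }

    /** Exposes a docs-with-value bitset as a numeric view: 1 = has value, 0 = missing. */
    static NumericView hasValueView(BitSet docsWithValue) {
        return docID -> docsWithValue.get(docID) ? 1L : 0L;
    }
}
```

In this model a field that is absent from the mapping can never have a cache
built for it, asking for a field under a second type fails instead of silently
building a second cache, and a single-valued type declared on a multivalued
field is rejected in the constructor rather than at sort time.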

                
> Hide FieldCache behind an UninvertingFilterReader
> -------------------------------------------------
>
>                 Key: LUCENE-4883
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4883
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>         Attachments: LUCENE-4883.patch
>
>
> From a discussion on the mailing list:
> {{
> rmuir:
> I think instead FieldCache should actually be completely package
> private and hidden behind a UninvertingFilterReader and accessible via
> the existing AtomicReader docValues methods.
> }}


