[jira] [Commented] (LUCENE-3443) Port 3.x FieldCache.getDocsWithField() to trunk

Michael McCandless (Commented) (JIRA) Tue, 08 Nov 2011 10:32:17 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146457#comment-13146457
 ]


Michael McCandless commented on LUCENE-3443:
--------------------------------------------


bq. That represents a pretty significant performance regression though, and it 
seems we should try to avoid stuff like that on trunk (better in a branch if 
one is going to remove functionality with the idea of adding it back later).

Well, we haven't released trunk yet, and this is not a perf regression
vs 3.4.0?

I.e., there's no guarantee w/in trunk that performance won't "change",
just like there's no guarantee APIs and index format won't "change".

bq. Or we could generate the bits by default - the extra cache entry if not 
needed is less serious than doubling the generation time.

I'd rather not do that; apps that don't use sort missing first/last
shouldn't be forced to spend the RAM (even if it's only a bit per
doc).

I think we can find a solution that makes it optional; I just don't
think we have to do it here, now.

This issue can focus on getting to 3.x's FC impl.

bq. Another small piece we are losing from trunk is numUniqueTerms.

Well, we track this in FC (only in trunk) today, but it's not used
anywhere, I think?  Also, this is redundant with
Terms.getUniqueTermCount()?

{quote}
What about returning an object, so instead of getLongs() returning a long[], it 
would return a LongValues that had
a long[] as well as numUniqueTerms, and docsWithValues (optionally set). This 
also halves the number of cache lookups needed when using sortMissingLast and 
supplies a place to put more info in the future (stuff like numUniqueTerms).
{quote}

I think for now we should stick w/ simple arrays for this issue; we
can explore returning an object under a new issue.  The extra cache
lookup is really a negligible cost.
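
For readers following the thread, the object-returning idea quoted above can be pictured roughly as below. This is only a sketch, not code from the patch or from Lucene: the class name LongValuesSketch and the method hasDocsWithValues() are hypothetical, and java.util.BitSet stands in for whatever Bits implementation the cache would actually hold.

```java
import java.util.BitSet;

/**
 * Hypothetical holder illustrating the quoted proposal; not a Lucene class.
 * One cache entry would carry the values, the unique-term count and,
 * optionally, the bits marking which docs actually have a value.
 */
final class LongValuesSketch {
    /** One slot per document; docs without a value keep the default 0. */
    final long[] values;
    /** Number of distinct terms seen while filling the array. */
    final int numUniqueTerms;
    /** Set bit means the doc has a value; null if the bits were not requested. */
    final BitSet docsWithValues;

    LongValuesSketch(long[] values, int numUniqueTerms, BitSet docsWithValues) {
        this.values = values;
        this.numUniqueTerms = numUniqueTerms;
        this.docsWithValues = docsWithValues;
    }

    /** True only when the optional bits were populated. */
    boolean hasDocsWithValues() {
        return docsWithValues != null;
    }
}
```

With a holder like this, a sort-missing-last comparator would fetch values and bits through one cache lookup instead of two, which is the saving the quote refers to; the cost is that every caller receives a wrapper object rather than a plain long[].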

                
> Port 3.x FieldCache.getDocsWithField() to trunk
> -----------------------------------------------
>
>                 Key: LUCENE-3443
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3443
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.5, 4.0
>
>         Attachments: LUCENE-3443.patch
>
>
> [Spinoff from LUCENE-3390]
> I think the approach in 3.x for handling un-valued docs, and making it
> possible to specify how such docs are sorted, is better than the
> solution we have in trunk.
> I like that FC has a dedicated method to get the Bits for docs with field
> -- easy for apps to directly use.  And I like that the
> bits have their own entry in the FC.
> One downside is that it's 2 passes to get values and valid bits, but
> I think we can fix this by passing optional bool to FC.getXXX methods
> indicating you want the bits, and then populate the FC entry for the
> missing bits as well.  (We can do that for 3.x and trunk). Then it's
> single pass.
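
The single-pass idea in the description above can be sketched as follows. This is a minimal illustration under stated assumptions, not the FieldCache implementation: the class SinglePassFillSketch, its fill() method and the setDocsWithField flag are hypothetical names, the Long[] input (null meaning "doc has no value") stands in for walking the field's terms, and java.util.BitSet stands in for Bits.

```java
import java.util.BitSet;

/** Hypothetical sketch of filling values and "has value" bits in one pass. */
final class SinglePassFillSketch {
    final long[] values;
    /** Null unless the caller asked for the bits. */
    final BitSet docsWithField;

    private SinglePassFillSketch(long[] values, BitSet docsWithField) {
        this.values = values;
        this.docsWithField = docsWithField;
    }

    /**
     * source[doc] == null means the doc has no value for the field.
     * When setDocsWithField is false, no bit set is allocated at all.
     */
    static SinglePassFillSketch fill(Long[] source, boolean setDocsWithField) {
        long[] values = new long[source.length];
        BitSet bits = setDocsWithField ? new BitSet(source.length) : null;
        for (int doc = 0; doc < source.length; doc++) {
            Long v = source[doc];
            if (v == null) {
                continue; // un-valued doc: slot stays 0, bit stays clear
            }
            values[doc] = v;
            if (bits != null) {
                bits.set(doc); // recorded in the same loop, no second pass
            }
        }
        return new SinglePassFillSketch(values, bits);
    }
}
```

Note that a doc whose real value is 0 and a doc with no value both end up with 0 in the array; only the bits tell them apart, which is why sort missing first/last needs them, and why callers that never ask for them pay no extra RAM in this sketch.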
