[ 
https://issues.apache.org/jira/browse/LUCENE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437916#comment-13437916
 ] 

Robert Muir commented on LUCENE-4315:
-------------------------------------

I agree with the patch. 

On a related note, we should consider Fields.getUniqueTermCount,
which has a default implementation that sums across fields (preflex
overrides it).

This was necessary to have some way to access the segment-level unique term
count for 3.x indexes, which do not actually know this information per field
and so override this method to provide it.

But there is no need to have this on AtomicReader (I think it's way too
expert; just get the Fields and get it from there), and we can consider
deprecating this in 4.x and removing it in trunk altogether, since at that
point someone can just use the field-level statistics.
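
For illustration, "sums across fields" could look roughly like the sketch
below. It uses simplified stand-in types rather than the real Lucene classes:
UniqueTermCountSketch, TermsStats and uniqueTermCount are hypothetical names,
and the convention that -1 means "unknown" is an assumption carried over from
the size() discussion in the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class UniqueTermCountSketch {

    /** Hypothetical stand-in for per-field term statistics; -1 means "unknown". */
    interface TermsStats {
        long size();
    }

    /**
     * Sums the per-field term counts. If any field cannot report its count
     * (returns -1), the total is unknown as well, so -1 is returned.
     */
    static long uniqueTermCount(Map<String, TermsStats> fields) {
        long total = 0;
        for (TermsStats terms : fields.values()) {
            if (terms == null) {
                continue; // field without terms contributes nothing
            }
            long count = terms.size();
            if (count == -1) {
                return -1; // one unknown field makes the sum unknown
            }
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, TermsStats> fields = new LinkedHashMap<>();
        fields.put("title", () -> 120L);
        fields.put("body", () -> 4500L);
        System.out.println(uniqueTermCount(fields)); // prints 4620

        fields.put("legacy", () -> -1L); // a field that does not know its count
        System.out.println(uniqueTermCount(fields)); // prints -1
    }
}
```

A caller that only needs per-field numbers would skip the sum entirely and
ask each field for its own count, which is the usage suggested above.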

                
> Minor fixes for Fields abstract class, TermVectorsWriter
> --------------------------------------------------------
>
>                 Key: LUCENE-4315
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4315
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0-BETA
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.0, 4.0
>
>         Attachments: LUCENE-4315.patch
>
>
> The Fields abstract class is a little bit inconsistent. It does not allow
> iterator() to throw IOException, but size() is allowed to do so. This is
> inconsistent, as looping through the iterator always yields the size without
> an IOException.
> Also, Fields.size() allows -1 as a return value, but almost all
> implementations (only MultiFields and FieldFilteredAtomicReader may return
> -1) actually implement it in a very cheap way. These are simple statistics;
> we should rethink this:
> - TermVectorsWriter's basic merging (without optimization) requires this
> information, and also Terms.size()
> - We can default Fields.size() to counting the iterator if it is not
> explicitly implemented. This method is called only by IndexReader
> introspection and TermVectors merging.
> We should maybe enforce that size() for Fields returns a value >= 0 (Preflex
> also knows its size!), and if the impl class does not have it (MultiFields,
> FieldFilteredAtomicReader), loop by supplying a default impl.
> The current patch still allows -1 as a return value and removes IOException
> from the signature.
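
The default proposed in the description (count the iterator unless the
implementation knows its size) might be sketched as follows. This is only an
illustration with stand-in types, not the attached patch: FieldsSizeSketch,
SketchFields, IteratingFields and CountingFields are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class FieldsSizeSketch {

    /** Hypothetical stand-in for the Fields abstract class. */
    abstract static class SketchFields implements Iterable<String> {
        /** Default: walk the iterator and count; always returns a value >= 0. */
        public int size() {
            int count = 0;
            for (String ignored : this) {
                count++;
            }
            return count;
        }
    }

    /** Knows only how to iterate, so it inherits the counting default. */
    static final class IteratingFields extends SketchFields {
        private final List<String> names;

        IteratingFields(String... names) {
            this.names = Arrays.asList(names);
        }

        @Override
        public Iterator<String> iterator() {
            return names.iterator();
        }
    }

    /** Knows its size cheaply and overrides the default to avoid the loop. */
    static final class CountingFields extends SketchFields {
        private final List<String> names;

        CountingFields(String... names) {
            this.names = Arrays.asList(names);
        }

        @Override
        public Iterator<String> iterator() {
            return names.iterator();
        }

        @Override
        public int size() {
            return names.size();
        }
    }

    public static void main(String[] args) {
        System.out.println(new IteratingFields("title", "body", "id").size()); // prints 3
        System.out.println(new CountingFields("title", "body").size());        // prints 2
    }
}
```

With a default like this, callers such as term vector merging would never
have to handle -1; the cost is a linear pass for implementations that do not
override size().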
