[jira] Commented: (LUCENE-328) Some utilities for a compact sparse filter

Eks Dev (JIRA) Thu, 29 Dec 2005 07:25:25 -0800

    [ http://issues.apache.org/jira/browse/LUCENE-328?page=comments#action_12361375 ]


Eks Dev commented on LUCENE-328:
--------------------------------

I've been looking at this code and found a few minor enhancements that could 
be made:

1. Is there any particular reason why SortedVIntList does not implement the 
DocNrSkipper interface? The method getDocNrSkipper() is there, but the 
declaration is missing. 

2. Should getDocNrSkipper() (and the DocNrSkipper interface) throw 
IOException? I have tried to add a TermDocsSortedIntList to the family, but 
all methods in TermDocs throw IOException, and it is not nice to swallow this 
exception silently so early, inside DocNrSkipper. Any better ideas for dealing 
with that? 

3. Paul, why does SkipFilter exist (here I refer to JIRA-330)? Wouldn't it be 
better to use the DocNrSkipper interface directly (SkipFilter does nothing but 
wrap this interface)? The same question applies to IterFilter. Did I get 
something wrong here? 
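
To make point 2 concrete, here is a minimal sketch of what letting the 
exception propagate could look like. The names Skipper and DocSource, the 
nextDocNr(int) signature and the "-1 means exhausted" convention are all 
assumptions made for illustration; they are stand-ins for DocNrSkipper and 
TermDocs, not the API in the attached files.

```java
import java.io.IOException;

final class SkipperSketch {
    /** Hypothetical stand-in for DocNrSkipper; the attached interface may differ. */
    interface Skipper {
        /** Returns the first doc number >= target, or -1 when there is none. */
        int nextDocNr(int target) throws IOException;
    }

    /** Hypothetical stand-in for an IO-backed source such as TermDocs. */
    interface DocSource {
        boolean skipTo(int target) throws IOException;
        int doc();
    }

    /** Adapter that lets the IOException reach the caller instead of swallowing it. */
    static Skipper fromSource(final DocSource source) {
        return target -> source.skipTo(target) ? source.doc() : -1;
    }

    /** In-memory source over a sorted array; failOnRead simulates an index read error. */
    static DocSource arraySource(final int[] sortedDocs, final boolean failOnRead) {
        return new DocSource() {
            private int pos = 0;

            public boolean skipTo(int target) throws IOException {
                if (failOnRead) {
                    throw new IOException("simulated read failure");
                }
                // Advance to the first doc number that is >= target.
                while (pos < sortedDocs.length && sortedDocs[pos] < target) {
                    pos++;
                }
                return pos < sortedDocs.length;
            }

            public int doc() {
                return sortedDocs[pos];
            }
        };
    }

    public static void main(String[] args) throws IOException {
        Skipper ok = fromSource(arraySource(new int[] {2, 5, 9}, false));
        System.out.println(ok.nextDocNr(3));

        Skipper broken = fromSource(arraySource(new int[] {2, 5, 9}, true));
        try {
            broken.nextDocNr(0);
        } catch (IOException e) {
            // The caller decides what to do with the failure.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The cost of this choice is that the in-memory implementations would also have 
to declare an exception they never throw.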


I must say, excellent work! 
A lot of use cases related to filtering and non-scoring queries can be handled 
efficiently with this.

> Some utilities for a compact sparse filter
> ------------------------------------------
>
>          Key: LUCENE-328
>          URL: http://issues.apache.org/jira/browse/LUCENE-328
>      Project: Lucene - Java
>         Type: Improvement
>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>  Environment: Operating System: other
> Platform: Other
>     Reporter: paul.elschot
>     Assignee: Lucene Developers
>     Priority: Minor
>  Attachments: AndDocNrSkipper.java, AndDocNrSkipper.java, 
> BitSetSortedIntList.java, DocNrSkipper.java, DocNrSkipper.java, 
> IntArraySortedIntList.java, IntArraySortedIntList.java, OrDocNrSkipper.java, 
> OrDocNrSkipper.java, SortedVIntList.java, SortedVIntList.java, 
> SortedVIntList.java, TestDocNrSkippers.java, TestDocNrSkippers.java, 
> TestSortedVIntList.java, TestSortedVIntList.java, TestSortedVIntList.java, 
> intIterator.java
>
> Two files are attached that might form the basis for an alternative 
> filter implementation that is more memory efficient than one bit 
> per doc when less than about 1/8 of the docs pass through the filter. 
>  
> The document numbers are stored in RAM as VInt's from the Lucene index 
> format. These VInt's encode the difference between two successive 
> document numbers, much like a PositionDelta in the Positions: 
> http://jakarta.apache.org/lucene/docs/fileformats.html 
>  
> The getByteSize() method can be used to verify the compression 
> once a SortedVIntList is constructed. 
> The precise conditions under which this is more memory efficient than 
> one bit per document are not easy to specify in advance.
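
The encoding described in the quoted text can be illustrated with a small 
standalone sketch. It is not the attached SortedVIntList; the class and method 
names are made up for this example. It stores the difference between 
successive document numbers in the VInt layout of the Lucene file format 
(seven data bits per byte, high bit set when more bytes follow). When every 
difference fits in one byte, each passing document costs one byte, which 
matches one bit per document at a density of about 1/8.

```java
import java.io.ByteArrayOutputStream;

final class VIntDeltaSketch {
    /** Encodes ascending doc numbers as VInt-coded differences. */
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocs) {
            int delta = doc - prev;
            if (delta < 0) {
                throw new IllegalArgumentException("doc numbers must be ascending");
            }
            // Low-order 7 bits first; high bit marks "more bytes follow".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = doc;
        }
        return out.toByteArray();
    }

    /** Decodes count doc numbers from bytes produced by encode(). */
    static int[] decode(byte[] bytes, int count) {
        int[] docs = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            prev += delta;
            docs[i] = prev;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {5, 200, 201, 1000};
        int maxDoc = 1000000; // assumed index size, for comparison only
        byte[] encoded = encode(docs);
        System.out.println("VInt bytes:   " + encoded.length);
        System.out.println("bitset bytes: " + (maxDoc + 7) / 8);
    }
}
```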
