[ 
https://issues.apache.org/jira/browse/LUCENE-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066628#comment-13066628
 ] 

Paul Elschot commented on LUCENE-3325:
--------------------------------------

This was more or less suggested in:

"Compressing Term Positions in Web Indexes", Hao Yan, Shuai Ding, Torsten Suel, 
SIGIR '09.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.152.4748&rep=rep1&type=pdf

in sections 7 and 8, and especially the last sentence: "... one could even 
consider storing the parsed documents themselves in highly compressed form and 
accessing these during a position data lookup, instead of keeping the positions 
in inverted lists."


> Transpose positions in index
> ----------------------------
>
>                 Key: LUCENE-3325
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3325
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: core/index
>            Reporter: Paul Elschot
>            Priority: Minor
>
> When positions are used in queries with many terms, each term in each 
> document causes a seek in the positions, and in large indexes these seeks can 
> be far apart even when the terms are in the same document.
> The number of (disk) cache misses of such position seeks might be reduced by 
> putting the positions for all terms in the same document directly behind each 
> other. This should have a noticeable effect when terms are alphabetically 
> close, for example for truncations, and it should also help when the 
> documents have few enough positions to fill a cache entry (disk page, cache 
> line).
> This might also help the performance of highlighting based on indexed 
> positions.
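
The layout change described in the issue can be illustrated with a toy
model. The sketch below is not Lucene code and does not use any Lucene API;
the class TransposedPositionsSketch, its Entry type and the sample data are
all hypothetical. It writes the same position blocks back to back in two
orders, term-major (all documents of a term together) and document-major
(all terms of a document together, the "transposed" order), and measures
how far apart the blocks for one document's query terms end up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: hypothetical names, no Lucene classes involved.
final class TransposedPositionsSketch {

    /** The positions of one term in one document. */
    static final class Entry {
        final String term;
        final int doc;
        final int[] positions;

        Entry(String term, int doc, int[] positions) {
            this.term = term;
            this.doc = doc;
            this.positions = positions;
        }
    }

    /** Term-major order: sort by term, then by document. */
    static List<Entry> termMajor(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((Entry e) -> e.term)
                .thenComparingInt(e -> e.doc));
        return sorted;
    }

    /** Document-major (transposed) order: sort by document, then by term. */
    static List<Entry> docMajor(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingInt((Entry e) -> e.doc)
                .thenComparing(e -> e.term));
        return sorted;
    }

    /**
     * Distance, in stored positions, between the first and the last block
     * that must be read for the given terms in the given document.
     * Returns 0 when fewer than two blocks match.
     */
    static int seekSpan(List<Entry> ordered, int doc, List<String> terms) {
        int offset = 0;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (Entry e : ordered) {
            if (e.doc == doc && terms.contains(e.term)) {
                min = Math.min(min, offset);
                max = Math.max(max, offset);
            }
            offset += e.positions.length; // blocks are written back to back
        }
        return max < min ? 0 : max - min;
    }

    /** Made-up data: two alphabetically close terms in three documents. */
    static List<Entry> sample() {
        return Arrays.asList(
                new Entry("index", 0, new int[] {3, 9}),
                new Entry("index", 1, new int[] {1, 6}),
                new Entry("index", 2, new int[] {2, 4}),
                new Entry("indexed", 0, new int[] {5, 7}),
                new Entry("indexed", 1, new int[] {0, 8}),
                new Entry("indexed", 2, new int[] {1, 3}));
    }

    public static void main(String[] args) {
        List<Entry> entries = sample();
        List<String> query = Arrays.asList("index", "indexed");
        System.out.println("term-major span: "
                + seekSpan(termMajor(entries), 0, query));
        System.out.println("doc-major span:  "
                + seekSpan(docMajor(entries), 0, query));
    }
}
```

In the term-major order the two blocks for document 0 are separated by the
blocks of every other document containing the first term, so the gap grows
with the index size. In the document-major order they are adjacent. The
model counts stored positions only and ignores compression, skip data and
real page or cache-line sizes.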
