[ 
https://issues.apache.org/jira/browse/LUCENE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054679#comment-16054679
 ] 

Mikhail Khludnev commented on LUCENE-7863:
------------------------------------------

[~dsmiley], thanks for breaking the silence. 
It gives six docs:
||name||name_rev||
|foo|oof|
|foo|oof|
|foo|oof|
|bar|rab|
|bar|rab|
|bar|rab|

The term dictionary is:

||field/term||posting offset (relative)||
||name|| ||
|bar|0|
|foo|3|
||name_rev|| ||
|oof|3|
|rab|3|

||Postings (absolute values)||
|3,4,5|
|0,1,2|
|0,1,2|
|3,4,5|

Thus we still see 12 postings: each of the four terms stores its own list of 
three doc IDs. That duplication is what I want to avoid. Or do you propose 
having auxiliary docs like foo->oof?
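
To make the count concrete, here is a toy sketch in plain Python. It is not Lucene's actual term dictionary or postings format; {{build_index}}, {{total_postings}} and {{distinct_postings}} are hypothetical helper names, and reversing the string only stands in for what a reversing filter would emit. It rebuilds the four posting lists from the tables above and compares the number of stored postings with the number that would remain if identical lists were stored once.

```python
from collections import defaultdict

# The six docs from the table above: doc ids 0-2 are "foo", 3-5 are "bar".
docs = [{"name": "foo"}] * 3 + [{"name": "bar"}] * 3


def build_index(docs):
    """Toy inverted index: (field, term) -> list of doc ids.

    name_rev is derived by reversing name (an assumption standing in
    for the output of a reversing token filter).
    """
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        index[("name", doc["name"])].append(doc_id)
        index[("name_rev", doc["name"][::-1])].append(doc_id)
    return dict(index)


def total_postings(index):
    """Number of doc ids stored when every term keeps its own list."""
    return sum(len(postings) for postings in index.values())


def distinct_postings(index):
    """Number of doc ids left if identical posting lists were stored once."""
    unique_lists = {tuple(postings) for postings in index.values()}
    return sum(len(postings) for postings in unique_lists)


index = build_index(docs)
print(total_postings(index))     # 12
print(distinct_postings(index))  # 6
```

In this toy case foo/oof and bar/rab point at identical doc ID lists, so half of the stored postings are redundant.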


> Don't repeat postings and positions on ReverseWF, EdgeNGram, etc  
> ------------------------------------------------------------------
>
>                 Key: LUCENE-7863
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7863
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Mikhail Khludnev
>         Attachments: LUCENE-7863.hazard
>
>
> h2. Context
> \*suffix and \*infix\* searches on large indexes. 
> h2. Problem
> Obviously applying {{ReversedWildcardFilter}} doubles an index size, and I'm 
> shuddering to think about EdgeNGrams...
> h2. Proposal 
> _DRY_



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

