[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Luo resolved ASTERIXDB-2339.
---------------------------------
    Resolution: Implemented

> Improve Inverted Index Merge Performance
> ----------------------------------------
>
>                 Key: ASTERIXDB-2339
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2339
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: STO - Storage
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>            Priority: Major
>
> Currently, an inverted index merge is implemented as a full range scan: 
> token+key pairs are generated and fed into a priority queue to obtain a 
> global ordering. However, a token typically corresponds to tens or 
> hundreds of keys, or even many more. As a result, most token comparisons 
> are wasted, because consecutive pairs often share the same token. To 
> improve this, we can use two priority queues, one for tokens and one for 
> keys. For each token, we merge its inverted lists using the key priority 
> queue. After that, we fetch the next token from the token queue and merge 
> its inverted lists in the same way.
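
The two-queue idea described above can be sketched roughly as follows. This is only an illustration, not the AsterixDB implementation: it assumes each index component is an in-memory, token-sorted list of (token, sorted keys) entries with each token appearing at most once per component, and the names Component and merge_components are hypothetical. Deleted or duplicate keys are not handled.

```python
import heapq
from typing import Iterator, List, Sequence, Tuple

# Hypothetical stand-in for one index component:
# a token-sorted list of (token, sorted keys) entries.
Component = Sequence[Tuple[str, Sequence[int]]]


def merge_components(
    components: Sequence[Component],
) -> Iterator[Tuple[str, List[int]]]:
    """Merge components with one queue for tokens and one for keys."""
    # Token queue: one (token, component index, entry position) per
    # non-empty component.
    token_queue = [
        (comp[0][0], i, 0) for i, comp in enumerate(components) if comp
    ]
    heapq.heapify(token_queue)

    while token_queue:
        token = token_queue[0][0]
        lists = []
        # Collect the inverted list of every component whose current
        # token equals the smallest token, and advance that component.
        while token_queue and token_queue[0][0] == token:
            _, i, pos = heapq.heappop(token_queue)
            lists.append(components[i][pos][1])
            if pos + 1 < len(components[i]):
                next_token = components[i][pos + 1][0]
                heapq.heappush(token_queue, (next_token, i, pos + 1))
        # Key queue: heapq.merge runs a heap-based k-way merge that
        # compares keys only, so the token is not re-compared per key.
        yield token, list(heapq.merge(*lists))
```

With this layout, tokens are compared once per component entry rather than once per token+key pair, which is the saving the description is after when inverted lists are long.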



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
