[ 
https://issues.apache.org/jira/browse/OAK-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nuno Santos resolved OAK-10682.
-------------------------------
    Fix Version/s: 1.62.0
       Resolution: Done

> [Indexing job] Improve Mongo regex filter to only use positive conditions (no 
> negations)
> ----------------------------------------------------------------------------------------
>
>                 Key: OAK-10682
>                 URL: https://issues.apache.org/jira/browse/OAK-10682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing
>         Environment: The current implementation filters excluded paths 
> and custom regexes using a condition like
> {noformat}
> { _id: { $nin: [ /^[0-9]{1,3}:\/content\/dam\/.*$/ ] } }
> {noformat}
> Mongo cannot evaluate this condition without retrieving the full document, 
> because a value of {{null}} would also match this condition and the index 
> does not contain {{null}} values. Therefore, when excluded paths are 
> configured, the download is much slower because Mongo has to retrieve every 
> single document to evaluate the condition.
> As a workaround, we can transform the regex into one that matches the 
> complement of the original regex, using a [negative 
> lookahead|https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex].
> This allows rewriting the filter condition using only positive conditions, 
> which can be evaluated using only the index.
>            Reporter: Nuno Santos
>            Priority: Major
>              Labels: indexing
>             Fix For: 1.62.0
>
>
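
To make the negative-lookahead rewrite described above concrete, here is a
minimal Java sketch. It is not the code committed for this issue: the class
name ComplementRegex and the helpers complement() and matches() are
hypothetical, and the sketch assumes the exclusion regex is a single pattern
anchored with ^ (no top-level alternation). It uses java.util.regex with
find(), which, like a Mongo $regex, looks for a match anywhere in the string
unless the pattern is anchored.

```java
import java.util.List;
import java.util.regex.Pattern;

public class ComplementRegex {

    // Hypothetical helper: turns a start-anchored exclusion regex into a regex
    // that matches exactly the strings the original does NOT match, by moving
    // the original body into a negative lookahead placed right after ^.
    // Assumption: the whole pattern is anchored (no top-level alternation).
    public static String complement(String anchoredRegex) {
        if (!anchoredRegex.startsWith("^")) {
            throw new IllegalArgumentException("expected a regex starting with ^: " + anchoredRegex);
        }
        return "^(?!" + anchoredRegex.substring(1) + ")";
    }

    // Unanchored search, similar to how a $regex condition is evaluated.
    public static boolean matches(String regex, String id) {
        return Pattern.compile(regex).matcher(id).find();
    }

    public static void main(String[] args) {
        // Exclusion regex from the issue description (slashes unescaped for Java).
        String excluded = "^[0-9]{1,3}:/content/dam/.*$";
        String included = complement(excluded);

        // Shape of a positive-only filter, printed in Mongo shell notation.
        System.out.println("{ _id: { $regex: /" + included.replace("/", "\\/") + "/ } }");

        // Sample _id values are made up for illustration.
        for (String id : List.of("3:/content/dam/a.png", "2:/content/site", "1:/apps")) {
            System.out.println(id
                    + " excluded=" + matches(excluded, id)
                    + " included=" + matches(included, id));
        }
    }
}
```

For every sample _id, exactly one of the two patterns matches, which is what
lets the $nin condition be replaced by a single positive $regex condition.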




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
