[ https://issues.apache.org/jira/browse/OAK-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nuno Santos resolved OAK-10682.
-------------------------------
    Fix Version/s: 1.62.0
       Resolution: Done

> [Indexing job] Improve Mongo regex filter to only use positive conditions (no negations)
> ----------------------------------------------------------------------------------------
>
>                 Key: OAK-10682
>                 URL: https://issues.apache.org/jira/browse/OAK-10682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing
>         Environment: The current implementation of filtering excluded paths
> and custom regexes uses a condition like
> {noformat}
> { _id: { $nin: [ /^[0-9]{1,3}:\/content\/dam\/.*$/ ] } }
> {noformat}
> Mongo cannot evaluate this condition without retrieving the full document,
> because a value of {{null}} would also match this condition and the index
> does not contain {{null}} values. Therefore, when the index has excluded
> paths, the download is much slower because Mongo has to retrieve every
> single document to evaluate the condition.
> As a workaround, we can transform the regex into an equivalent one that
> matches the complement of the original regex using
> [negative lookahead|https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex].
> This allows rewriting the filter using only positive conditions, which can
> be evaluated using only the index.
>            Reporter: Nuno Santos
>            Priority: Major
>              Labels: indexing
>             Fix For: 1.62.0
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
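
The rewrite described in the issue can be illustrated with a minimal sketch, written in Python rather than Oak's Java. The helper names `complement_regex`, `positive_filter` and `is_kept` are hypothetical, and the sketch assumes the excluded pattern is anchored at the start with `^`. It is not the code that was merged for OAK-10682, and Python's `re` engine is used only as a stand-in for Mongo's regex engine.

```python
import re

# Exclusion pattern from the issue description (slashes left unescaped,
# since Python patterns do not need the \/ escaping of the Mongo shell).
EXCLUDED = r"^[0-9]{1,3}:/content/dam/.*$"


def complement_regex(excluded: str) -> str:
    """Return a pattern that matches exactly the ids `excluded` rejects.

    Hypothetical helper. It wraps the original pattern in a negative
    lookahead evaluated at the start of the string, so it is only a true
    complement when `excluded` is itself anchored with '^'.
    """
    return "^(?!" + excluded + ")"


def positive_filter(excluded: str) -> dict:
    """Build a filter document that uses a positive $regex instead of $nin."""
    return {"_id": {"$regex": complement_regex(excluded)}}


def is_kept(doc_id: str, excluded: str = EXCLUDED) -> bool:
    """Evaluate the rewritten condition locally against a single _id."""
    return re.search(complement_regex(excluded), doc_id) is not None


if __name__ == "__main__":
    for doc_id in ["3:/content/dam/a.png", "3:/content/site/page", "1:/apps"]:
        print(doc_id, "kept" if is_kept(doc_id) else "excluded")
```

Because the rewritten condition is a plain regex match on `_id`, it no longer has the "also matches missing or null values" semantics of `$nin`, which is the property the issue identifies as forcing full document retrieval.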