[ https://issues.apache.org/jira/browse/OAK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15705462#comment-15705462 ]
Stefan Egli commented on OAK-5186:
----------------------------------

Note that this problem is mostly relevant if the include paths contain globs. Without globs the check is just a string comparison, which is cheaper. Even so, for large sets of include paths _without_ globs, having a graph (and even such a first-level-name optimization) could still be beneficial.

> ChangeSetFIlterImpl: support many includePaths by filtering for 1st path name
> -----------------------------------------------------------------------------
>
>                 Key: OAK-5186
>                 URL: https://issues.apache.org/jira/browse/OAK-5186
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.5.14
>            Reporter: Stefan Egli
>             Fix For: 1.6
>
>         Attachments: OAK-5186.patch
>
>
> When the ChangeSetFilterImpl has a large number of include paths and that is
> combined with a large-ish ChangeSet (many paths), the comparison
> becomes expensive: there is a loop over each ChangeSet path, which in turn
> loops through each include path. Basically an {{O(n*m)}}.
> A probably ideal solution would be to implement a tree whose items are
> the path elements, and to have two such trees: the filter one and the
> ChangeSet one.
> A simpler and perhaps 'good enough' solution could be to just look at the
> first-level names of both the filter include paths and the ChangeSet paths:
> if a ChangeSet path's first-level name is not in the set of first-level names
> of the include paths, then it can't be included. That would
> allow skipping the pattern comparison (which is slower even though it is a
> compiled {{Pattern}}).
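
For illustration only, the 'good enough' first-level-name idea could look roughly like the sketch below. This is not the attached patch and not the actual ChangeSetFilterImpl code: the class FirstLevelPrefilter, its methods and the simplified glob handling ("*" within one path segment, "**" across segments) are all hypothetical and only meant to show the shape of the check.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Oak: cheap first-level-name check
// in front of the (slower) compiled Pattern comparison.
final class FirstLevelPrefilter {
    private final Set<String> firstLevelNames = new HashSet<>();
    private final List<Pattern> includePatterns = new ArrayList<>();
    // true if some include path has a glob in its first segment,
    // in which case the set lookup cannot rule anything out
    private boolean wildcardFirstLevel = false;

    FirstLevelPrefilter(List<String> includeGlobs) {
        for (String glob : includeGlobs) {
            String first = firstLevelName(glob);
            if (first.contains("*")) {
                wildcardFirstLevel = true;
            } else {
                firstLevelNames.add(first);
            }
            includePatterns.add(globToPattern(glob));
        }
    }

    // "/content/a/b" -> "content"; "/" -> ""
    static String firstLevelName(String path) {
        int start = path.startsWith("/") ? 1 : 0;
        int end = path.indexOf('/', start);
        return end < 0 ? path.substring(start) : path.substring(start, end);
    }

    // Assumed, simplified glob syntax: "**" matches anything,
    // "*" matches within a single path segment.
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                if (i + 1 < glob.length() && glob.charAt(i + 1) == '*') {
                    regex.append(".*");
                    i++; // consume the second '*'
                } else {
                    regex.append("[^/]*");
                }
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    boolean includes(String changeSetPath) {
        // Cheap check first: an unknown first-level name can't be included.
        if (!wildcardFirstLevel
                && !firstLevelNames.contains(firstLevelName(changeSetPath))) {
            return false;
        }
        // Only now pay for the pattern comparison.
        for (Pattern p : includePatterns) {
            if (p.matcher(changeSetPath).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With include paths such as "/content/**/foo" and "/apps/*/config", a ChangeSet path under "/var" would be rejected by the set lookup alone, without touching any Pattern. The worst case is still O(n*m) for paths that pass the first-level check, so this only helps when the ChangeSet paths are spread over first-level names that the filter does not cover.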