[jira] [Comment Edited] (JCRVLT-831) For collection of namespace prefixes, avoid iterating over sibling nodes not contained in the filter(s)

Julian Reschke (Jira) Tue, 27 Jan 2026 02:13:22 -0800


    [ 
https://issues.apache.org/jira/browse/JCRVLT-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18054567#comment-18054567
 ]


Julian Reschke edited comment on JCRVLT-831 at 1/27/26 10:12 AM:
-----------------------------------------------------------------

Attempt to summarize the situation:

1. FileVault had performance issues when it checked a potentially huge 
folder for nodes matching the filters. This was fixed with JCRVLT-789 
(version 3.8.4), where we skip that check when it is clear from the filter 
config that the set of nodes is already known in advance.

2. We found that we only fixed one of two issues; the same problem was present 
in the "namespace prefix scan" phase. We applied a similar fix for that case 
(this ticket).

3. What's left is the case where FV needs to collect namespace prefixes from 
all sibling nodes in a collection, because the collection is ordered and the 
siblings are serialized as empty elements (see 
https://jackrabbit.apache.org/filevault/docview.html#Empty_Elements). Note that 
FileVault only checks the primary node type; strictly speaking, that is not 
correct, but apparently it's not a problem in practice.

4. In JCRVLT-836, we added a lot of DEBUG logging to make it easier to 
understand what actually is happening, and how long it takes.

5. A simple change would be to always serialize the whole namespace registry. 
That would avoid all scanning, but it would increase the size of the generated 
XML. As a compromise, that shortcut could be restricted to the case where we 
actually *have* ordered collections.
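
To make the trade-off in item 5 concrete, here is a minimal sketch in plain Java. It is not FileVault code: the class PrefixSketch, its methods, and the use of plain strings for node names are all hypothetical, and the real implementation works on JCR nodes and the workspace filter. It only illustrates the idea that an ordered collection could declare every registered prefix instead of scanning siblings, while the other case scans just the names covered by the filter.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names are FileVault or JCR API.
final class PrefixSketch {

    // Returns the prefix of a qualified name such as "cq:Page", or null if there is none.
    static String prefixOf(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        return colon > 0 ? qualifiedName.substring(0, colon) : null;
    }

    // Scanning approach: visits every given name, so the cost grows with the folder size.
    static Set<String> scanPrefixes(Collection<String> names) {
        Set<String> prefixes = new TreeSet<>();
        for (String name : names) {
            String prefix = prefixOf(name);
            if (prefix != null) {
                prefixes.add(prefix);
            }
        }
        return prefixes;
    }

    // Assumed compromise: ordered collections skip the scan and use all registered
    // prefixes; otherwise only the names covered by the filter are scanned.
    static Set<String> collectPrefixes(boolean orderedCollection,
                                       Collection<String> includedNames,
                                       Set<String> registeredPrefixes) {
        if (orderedCollection) {
            return new TreeSet<>(registeredPrefixes);
        }
        return scanPrefixes(includedNames);
    }

    public static void main(String[] args) {
        List<String> included = List.of("jcr:content", "cq:page", "plain");
        Set<String> registry = Set.of("jcr", "nt", "cq", "sling");
        System.out.println(collectPrefixes(false, included, registry)); // [cq, jcr]
        System.out.println(collectPrefixes(true, included, registry));  // [cq, jcr, nt, sling]
    }
}
```

In this sketch the ordered case never touches the sibling names, which is the point of the shortcut; the price is that unused prefixes (here "nt" and "sling") end up in the result, corresponding to the larger XML mentioned above.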

cc: [~jhoh], [~kwin], [~cschneider], [~patlego].

> For collection of namespace prefixes, avoid iterating over sibling nodes not 
> contained in the filter(s)
> -------------------------------------------------------------------------------------------------------
>
>                 Key: JCRVLT-831
>                 URL: https://issues.apache.org/jira/browse/JCRVLT-831
>             Project: Jackrabbit FileVault
>          Issue Type: Improvement
>          Components: vlt
>            Reporter: Julian Reschke
>            Assignee: Julian Reschke
>            Priority: Major
>             Fix For: 4.2.0
>
>
> It appears that the changes for JCRVLT-789 improved the performance for 
> exporting the *content*; however, the phase of collecting namespace prefixes 
> does not use that optimization.
> (TBD: write a test)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)