[ https://issues.apache.org/jira/browse/OAK-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joerg Hoh updated OAK-11672:
----------------------------
Description:
Iterating through large result sets is a known reason why JCR queries can be
slow. Also, some abstractions first consume the entire JCR result set for
later post-processing, even if only a few results are actually read afterwards.
Right now it is hardly possible to find out how many results of a query were
consumed without inspecting the code and possibly re-executing it, but this
information is often needed when analyzing performance problems. For that
reason, the node iterator of a query result should be able to log more details
about how it is used:
* log a WARN once more than 1000 results were read, and again once more than
10'000 results were read, hinting at performance issues.
* log at TRACE level for every 100 results beyond 10'000
(To make these messages easier to understand, it would be good if the matching
query were listed as well.)
Note: There are already warnings if a lot of nodes are read, either as part of
a traversal or an index traversal. But at that level it is unclear why that
many nodes are read. This issue adds log statements for a special case
(reading many results) which can happen independently of the (index)
traversal, but which in combination with such a warning makes it easier to
propose an improvement.
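The proposed thresholds could be sketched roughly as follows. This is only an
illustration, not the actual Oak change: the class name
ResultCountingIterator, its constructor, and the (level, message) callback
used in place of a real logger are all hypothetical, and the sketch assumes
the WARN fires exactly when the counter reaches 1000 and 10'000.

```java
import java.util.Iterator;
import java.util.function.BiConsumer;

/**
 * Hypothetical wrapper around a query result iterator that counts how many
 * results were consumed. Not part of the Oak API.
 */
final class ResultCountingIterator<T> implements Iterator<T> {

    // Thresholds taken from the issue description.
    static final long FIRST_WARN = 1_000;
    static final long SECOND_WARN = 10_000;
    static final long TRACE_STEP = 100;

    private final Iterator<T> delegate;
    private final String statement;
    // Stand-in for a logger: receives (level, message). Assumption for this sketch.
    private final BiConsumer<String, String> log;
    private long count;

    ResultCountingIterator(Iterator<T> delegate, String statement,
                           BiConsumer<String, String> log) {
        this.delegate = delegate;
        this.statement = statement;
        this.log = log;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T item = delegate.next();
        count++;
        if (count == FIRST_WARN || count == SECOND_WARN) {
            // WARN at both thresholds, including the query statement.
            log.accept("WARN", count + " results read for query: " + statement);
        } else if (count > SECOND_WARN && count % TRACE_STEP == 0) {
            // TRACE for every 100 results beyond the second threshold.
            log.accept("TRACE", count + " results read for query: " + statement);
        }
        return item;
    }

    /** Number of results consumed so far. */
    long getCount() {
        return count;
    }
}
```

Wrapping an iterator over 10'250 results this way and reading it to the end
would emit two WARN messages (at 1000 and 10'000) and two TRACE messages (at
10'100 and 10'200).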
was:
Iterating through large result sets is a known reason why JCR queries can be
slow. Also, some abstractions first consume the entire JCR result set for
later post-processing, even if only a few results are actually read afterwards.
Right now it is hardly possible to find out how many results of a query were
consumed without inspecting the code and possibly re-executing it, but this
information is often needed when analyzing performance problems. For that
reason, the node iterator of a query result should be able to log more details
about how it is used:
* log a WARN once more than 1000 results were read, and again once more than
10'000 results were read, hinting at performance issues.
* log at TRACE level for every 100 results beyond 10'000
(To make these messages easier to understand, it would be good if the matching
query were listed as well.)
> WARN if large query result sets are read
> ----------------------------------------
>
> Key: OAK-11672
> URL: https://issues.apache.org/jira/browse/OAK-11672
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: jcr
> Reporter: Joerg Hoh
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)