[ 
https://issues.apache.org/jira/browse/OAK-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Hoh updated OAK-11672:
----------------------------
    Description: 
Iterating through large result sets is a known reason why JCR queries can be 
slow. In addition, some abstractions first consume the entire JCR result set 
for later post-processing, even if only a few results are actually used 
afterwards.

Right now it is hardly possible to find out how many results of a query were 
consumed without inspecting the code and possibly re-executing the query; but 
that information is often required for the analysis of performance problems. 
For that reason the node iterator of a query result should be able to log more 
details about how it is used:
 * log a WARN once more than 1000 results were read, and again once more than 
10'000 results were read, hinting at performance issues.
 * log at TRACE level for every 100 results beyond 10'000

(To make these messages easier to understand, it would be good if the matching 
query could be logged as well.)
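
The proposed thresholds could be sketched as a counting wrapper around the 
result iterator. This is only an illustration under stated assumptions, not 
Oak's actual implementation: the class name CountingIterator, the constants, 
and the (level, message) log callback are hypothetical, and a real change 
would presumably log through the logger already used by the query result 
classes.

```java
import java.util.Iterator;
import java.util.function.BiConsumer;

/** Hypothetical sketch: counts consumed results and reports at the proposed thresholds. */
final class CountingIterator<T> implements Iterator<T> {
    // Thresholds taken from the issue description.
    static final long FIRST_WARN = 1_000;
    static final long SECOND_WARN = 10_000;
    static final long TRACE_INTERVAL = 100;

    private final Iterator<T> delegate;
    private final String statement;               // query text, included in every message
    private final BiConsumer<String, String> log; // assumed callback: (level, message)
    private long count;

    CountingIterator(Iterator<T> delegate, String statement,
                     BiConsumer<String, String> log) {
        this.delegate = delegate;
        this.statement = statement;
        this.log = log;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T item = delegate.next();
        count++;
        if (count == FIRST_WARN + 1 || count == SECOND_WARN + 1) {
            // "More than N" is reached with result N + 1; warn once per threshold.
            log.accept("WARN", "More than " + (count - 1)
                    + " results read for query: " + statement);
        } else if (count > SECOND_WARN
                && (count - SECOND_WARN) % TRACE_INTERVAL == 0) {
            // Beyond the second threshold, report progress every 100 results.
            log.accept("TRACE", count + " results read for query: " + statement);
        }
        return item;
    }

    /** Number of results consumed so far. */
    long getCount() {
        return count;
    }
}
```

Wrapping the iterator rather than the query execution keeps the count tied to 
what the caller actually consumed, which is the information that is missing 
today.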

 Note: There are already warnings if a lot of nodes are read, either as part of 
a traversal or an index traversal. But at that level it is unclear why that 
many nodes are read. This issue will add log statements for a special case 
(reading many results) which can happen independently of the (index) 
traversal, but which, in combination with such a warning, makes it easier to 
propose an improvement.

> WARN if large query result sets are read
> ----------------------------------------
>
>                 Key: OAK-11672
>                 URL: https://issues.apache.org/jira/browse/OAK-11672
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: jcr
>            Reporter: Joerg Hoh
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
