[ https://issues.apache.org/jira/browse/JCR-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Kuijpers updated JCR-4770:
--------------------------------
    Description: 
A query can yield so many results (for example, in migration scripts) that it 
fails. The failure does not occur while the query is executed, but while 
iterating through its results.

We have a few migration scripts in our codebase that need to migrate content 
(such as CMS components, pages, ...). We also have, especially in production, 
quite a lot of content. Such scripts can easily find 100,000+ nodes and thus 
produce a result set that is bigger than the "query read limit".
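
To illustrate why the failure shows up during iteration rather than at 
execution time, here is a minimal sketch of a lazily evaluated result iterator 
that counts reads and aborts once a limit is exceeded. This is not Oak's 
implementation: the class ReadLimitedIterator, its fields, and the exception 
type are hypothetical and only mirror the behaviour visible in the stack trace.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

// Hypothetical sketch, not Oak code: wraps a lazy result iterator, counts
// every read, and aborts once the configured read limit is exceeded.
final class ReadLimitedIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final long readLimit;
    private long readCount;

    ReadLimitedIterator(Iterator<T> delegate, long readLimit) {
        this.delegate = delegate;
        this.readLimit = readLimit;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T item = delegate.next();
        readCount++;
        // The check runs per row, so creating the iterator never fails;
        // only consuming too many rows does.
        if (readCount > readLimit) {
            throw new IllegalStateException(
                    "The query read more than " + readLimit + " nodes");
        }
        return item;
    }

    public static void main(String[] args) {
        // Ten rows are available, but the (assumed) limit is five.
        Iterator<Integer> rows =
                new ReadLimitedIterator<>(IntStream.range(0, 10).iterator(), 5);
        int consumed = 0;
        try {
            while (rows.hasNext()) {
                rows.next();
                consumed++;
            }
        } catch (IllegalStateException e) {
            System.out.println("Stopped after " + consumed + " rows: " + e.getMessage());
        }
    }
}
```

Because the limit is only checked as rows are consumed, a migration script 
gets partway through its work before the exception is thrown.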

This limit can currently be configured at the system level, either through a 
system property or through OSGi configuration. QueryEngineSettingsService takes 
care of that. 

Raising this limit means raising it for the entire system, that is, for every 
query that is executed. It would be ideal if we could configure this limit at 
the query level, for example through an option (like the options for traversal 
and for index tag selection). I propose adding an option:

"select * from ... option (readlimit 999999)" 

which would take precedence over the limit that is active in 
QueryEngineSettingsService. It would then be the responsibility of the 
developer who creates the query to specify an appropriate override (or to 
specify no override at all, of course, in which case the system-level limit 
still applies).
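
The intended precedence could be sketched roughly as follows. This is only an 
illustration of the proposal, not existing Oak code: the class 
ReadLimitOptionSketch, its methods, and the regex-based parsing are 
assumptions (a real implementation would presumably extend the query parser 
that already handles the traversal and index tag options).

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed precedence: a per-query
// "option (readlimit N)" overrides the system-level read limit.
final class ReadLimitOptionSketch {
    // Finds "readlimit <digits>" inside an "option ( ... )" clause.
    private static final Pattern READ_LIMIT = Pattern.compile(
            "option\\s*\\([^)]*\\breadlimit\\s+(\\d+)[^)]*\\)",
            Pattern.CASE_INSENSITIVE);

    // Returns the per-query limit, or empty if the option is absent.
    static OptionalLong parseReadLimit(String statement) {
        Matcher m = READ_LIMIT.matcher(statement);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    // The query option wins; otherwise the system-level limit applies.
    static long effectiveReadLimit(String statement, long systemLimit) {
        return parseReadLimit(statement).orElse(systemLimit);
    }

    public static void main(String[] args) {
        long systemLimit = 100000L; // assumed system-level setting
        System.out.println(effectiveReadLimit(
                "select * from [cq:Page] option (readlimit 999999)", systemLimit));
        System.out.println(effectiveReadLimit(
                "select * from [cq:Page]", systemLimit));
    }
}
```

With this approach only queries that explicitly ask for a higher limit are 
affected, and all other queries keep the system-wide protection.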

Stack trace of such a failing query, as it currently occurs:

{code:java}
10.03.2022 16:55:00.032 *WARN* [qtp881876674-5121] org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor Index-Traversed 100000 nodes with filter Filter(query=select [jcr:path], [jcr:score], * from [cq:Page] as a where isdescendantnode(a, '/content') /* xpath: /jcr:root/content//element(*, cq:Page) */, path=/content//*)
10.03.2022 16:55:00.228 *ERROR* [qtp881876674-5121] com.day.crx.delite.impl.servlets.QueryServlet Exception while searching
org.apache.jackrabbit.oak.query.RuntimeNodeTraversalException: The query read or traversed more than 100000 nodes. To avoid affecting other tasks, processing was stopped.
        at org.apache.jackrabbit.oak.query.FilterIterators.checkReadLimit(FilterIterators.java:70) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.plugins.index.Cursors.checkReadLimit(Cursors.java:67) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:411) [org.apache.jackrabbit.oak-lucene:1.22.9]
        at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:392) [org.apache.jackrabbit.oak-lucene:1.22.9]
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor.hasNext(FulltextIndex.java:432) [org.apache.jackrabbit.oak-lucene:1.22.9]
        at org.apache.jackrabbit.oak.query.ast.SelectorImpl.nextInternal(SelectorImpl.java:515) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:508) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:876) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:903) [org.apache.jackrabbit.oak-core:1.22.9]
        at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:103) [org.apache.jackrabbit.oak-jcr:1.22.9]
        at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:128) [org.apache.jackrabbit.oak-jcr:1.22.9]
        at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:83) [org.apache.jackrabbit.oak-jcr:1.22.9]
        at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702) [org.apache.jackrabbit.oak-jcr:1.22.9]
        at org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.next(PrefetchIterator.java:88) [org.apache.jackrabbit.oak-jcr:1.22.9]
        at org.apache.jackrabbit.commons.iterator.RangeIteratorAdapter.next(RangeIteratorAdapter.java:152) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
        at org.apache.jackrabbit.commons.iterator.RangeIteratorDecorator.next(RangeIteratorDecorator.java:92) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
        at org.apache.jackrabbit.commons.iterator.RowIteratorAdapter.nextRow(RowIteratorAdapter.java:76) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
{code}




> Query read limit should be overridable through query option 
> ------------------------------------------------------------
>
>                 Key: JCR-4770
>                 URL: https://issues.apache.org/jira/browse/JCR-4770
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query, sql
>            Reporter: Henry Kuijpers
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
