Dave Reynolds created JENA-2328:
-----------------------------------
Summary: Query timeouts failing when plan phase is long
Key: JENA-2328
URL: https://issues.apache.org/jira/browse/JENA-2328
Project: Apache Jena
Issue Type: Bug
Components: ARQ
Reporter: Dave Reynolds
Attachments: TestQueryExecutionTimeout3.java
In a production service with a large TDB store (around 500MT) we find that some
complex queries evade the query timeouts (set to 90s first result, 120s total)
and then run for hours soaking up all available CPU cores. While the queries
show no clear pattern, and it has been hard to replicate in a controlled setting,
we do now have one example which is expressible as a test case. See attached.
The behaviour is that the abort() call from the alarm timeout is received by
QueryExecDataset before there is an iterator to cancel - the QueryExecDataset
instance is deep in getPlan() which itself executes part of the query. In the
specific example it's OpSlice which is iterating through the offset while still
in the planning phase. However, not all queries which cause this sort of
behaviour use offsets.
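
To make the failure mode concrete, here is a toy model of the race described
above. It is not Jena code: ToyExec, ToyIterator and the midPlan hook are
hypothetical stand-ins, and the "alarm" is fired synchronously halfway through
the planning loop so that the outcome is deterministic. It only illustrates
why an abort() that can reach nothing but the iterator is lost when it arrives
during planning.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model only: none of these classes are Jena classes. */
public class AbortBeforeIteratorSketch {

    /** Hypothetical stand-in for a cancellable result iterator. */
    static final class ToyIterator {
        final AtomicBoolean cancelled = new AtomicBoolean(false);
        void cancel() { cancelled.set(true); }
    }

    /** Hypothetical stand-in for an execution whose abort() only reaches the iterator. */
    static final class ToyExec {
        private volatile ToyIterator iterator;   // stays null until planning has finished
        long rowsSkippedDuringPlanning = 0;

        /** Called by the timeout alarm; does nothing if no iterator exists yet. */
        void abort() {
            ToyIterator it = iterator;
            if (it != null) {
                it.cancel();
            }
        }

        /** "Planning" that eagerly walks an offset; midPlan fires halfway, like the alarm. */
        ToyIterator execute(long offset, Runnable midPlan) {
            for (long i = 0; i < offset; i++) {
                if (i == offset / 2) {
                    midPlan.run();
                }
                rowsSkippedDuringPlanning++;
            }
            iterator = new ToyIterator();
            return iterator;
        }
    }

    /** Returns true if planning ran to completion and the abort never reached the iterator. */
    public static boolean abortIsLost(long offset) {
        ToyExec exec = new ToyExec();
        ToyIterator it = exec.execute(offset, exec::abort);
        return exec.rowsSkippedDuringPlanning == offset && !it.cancelled.get();
    }

    public static void main(String[] args) {
        System.out.println("abort lost: " + abortIsLost(1_000));
    }
}
```

In this model the abort arrives while the iterator field is still null, so the
loop finishes all of its work and the iterator handed back afterwards is never
cancelled.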
Sorry, but I have no PR to offer at this stage. I have looked at whether it's
possible to have getPlan() return some future or deferrable plan so that the
top-level exec has a handle on something that it can abort. However, the
changes look far-reaching and I don't yet have a satisfactory approach to
offer.
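
For illustration only, one possible shape of the "deferrable plan" idea is
sketched below. This is an assumption about how such a handle could look, not
a proposed patch and not an existing Jena API: DeferredPlan and its methods
are hypothetical names. The point is simply that the cancel flag exists before
any planning work starts, and the eager work polls it. As in a real timeout,
the alarm is modelled as a callback, here fired synchronously halfway through
the loop so that the result is deterministic.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model only: DeferredPlan is a hypothetical handle, not a Jena class. */
public class DeferredPlanSketch {

    /** A plan handle that can be aborted before or during its eager work. */
    static final class DeferredPlan {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);
        private final long offset;
        long rowsSkipped = 0;

        DeferredPlan(long offset) {
            this.offset = offset;
        }

        /** Safe to call at any time, including while run() is in progress. */
        void abort() {
            cancelled.set(true);
        }

        /** Eager offset walk that checks the cancel flag on every step. */
        void run(Runnable alarm) {
            for (long i = 0; i < offset; i++) {
                if (i == offset / 2) {
                    alarm.run();   // simulated timeout firing mid-plan
                }
                if (cancelled.get()) {
                    throw new CancellationException("aborted after " + rowsSkipped + " rows");
                }
                rowsSkipped++;
            }
        }
    }

    /** Runs the scenario and returns how many rows were skipped before the abort took effect. */
    public static long rowsSkippedBeforeAbort(long offset) {
        DeferredPlan plan = new DeferredPlan(offset);
        try {
            plan.run(plan::abort);
        } catch (CancellationException expected) {
            // The abort reached the planning loop, which is the desired behaviour here.
        }
        return plan.rowsSkipped;
    }

    public static void main(String[] args) {
        System.out.println("rows skipped before abort: " + rowsSkippedBeforeAbort(1_000));
    }
}
```

With an offset of 1,000 the loop stops after 500 rows instead of running to
completion. The hard part in the real code base, as noted above, is threading
such a handle through every place where planning does eager work.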