[ https://issues.apache.org/jira/browse/DRILL-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750360#comment-17750360 ]
James Turton edited comment on DRILL-8448 at 8/2/23 2:26 PM:
-------------------------------------------------------------

Thanks for reporting this. There are a few Drill Jira issues that describe queries getting stuck in CANCELLATION_REQUESTED, e.g. DRILL-5500. I believe that any failure that leaves one or more fragments waiting for a lock (monitor) that never gets released can result in a query stuck in this state. Such failures can result from a failed HBase pstore access, but they can also arise in other ways, as the other Jira issues show.

[~volodymyr] [~cgivre] one general way to fix this might be to set a query cancellation timeout after which any fragment threads that are still running (or blocked) get forcibly interrupted - has anything like this ever been discussed?
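
To make the proposed cancellation timeout concrete, here is a minimal sketch in plain Java. It is not Drill code: the class name, `interruptAfterTimeout`, the thread name and the 200 ms timeout are all hypothetical, and the "fragment" is just a thread blocked on a lock that is never released. One assumption matters: the sketch uses `ReentrantLock.lockInterruptibly()`, because a Java thread blocked entering a `synchronized` block does not respond to `Thread.interrupt()`. So an interrupt-based timeout would only help where the blocking call is interruptible.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class CancellationTimeoutSketch {

    // Hypothetical watchdog: once the timeout elapses, interrupt every
    // fragment thread that is still alive (running or blocked).
    static void interruptAfterTimeout(List<Thread> fragmentThreads,
                                      long timeoutMillis,
                                      ScheduledExecutorService scheduler) {
        scheduler.schedule(() -> {
            for (Thread t : fragmentThreads) {
                if (t.isAlive()) {
                    t.interrupt();
                }
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Simulates a fragment stuck on a lock that is never released and
    // returns true if the fragment thread ended after the timeout fired.
    static boolean demo() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // held by the calling thread and never released

        Thread fragment = new Thread(() -> {
            try {
                // Interruptible acquisition; a plain synchronized block
                // would NOT be woken up by Thread.interrupt().
                lock.lockInterruptibly();
                lock.unlock();
            } catch (InterruptedException e) {
                // Cancellation timeout fired: the fragment would clean up here.
            }
        }, "fragment-0");
        fragment.start();

        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        try {
            interruptAfterTimeout(List.of(fragment), 200, scheduler);
            fragment.join(5000); // wait for the fragment to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
        return !fragment.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("fragment terminated: " + demo());
    }
}
```

In a real system, the timeout would presumably be started when the query enters CANCELLATION_REQUESTED, and the query would be moved to a terminal state once all fragment threads have exited.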
> Query cancellation not possible with hbase persistent store
> -----------------------------------------------------------
>
>                 Key: DRILL-8448
>                 URL: https://issues.apache.org/jira/browse/DRILL-8448
>             Project: Apache Drill
>          Issue Type: Bug
>   Affects Versions: 1.20.3
>            Reporter: Christian Pfarr
>            Priority: Minor
>         Attachments: 1b3dcd7e-60b4-069f-0c53-c1ee168b2d9c.json,
> 1b3dcd7e-60b4-069f-0c53-c1ee168b2d9c.log,
> 1b419c0e-0018-3775-794b-77513f64df56.json,
> 1b419c0e-0018-3775-794b-77513f64df56.log, image-2023-08-02-08-47-40-413.png
>
>
> When we cancel a running query, it hangs in the state
> CANCELLATION_REQUESTED and stays in the list of running queries in the web UI.
> !image-2023-08-02-08-47-40-413.png!
> In the logs we see that these queries have failed because of an error with our
> HBase persistent store. Maybe it's just a bug in the UI?
> I've attached the profiles and the logs for these cancellation requests.