[ https://issues.apache.org/jira/browse/DRILL-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kunal Khatua updated DRILL-6879:
--------------------------------
Description:
When running a very large query on a cluster with limited resources, we noticed
that a VM thread on one of the nodes freezes the fragment threads while it does
some work (GC, perhaps?). This is a clear indication that the query is stuck in
a state from which it might not recover.
Under such circumstances, it makes sense to cancel the query, or at least warn
the user on the query's page in the WebUI, once the time without progress
exceeds a certain threshold.
To detect this, the user can check the {{Last Progress}} column in the
Fragments Overview section, which will show large times.
!image-2018-12-04-11-54-54-247.png!
was:
When running a very large query on a cluster with limited resources, we noticed
that a VM thread on one of the nodes freezes the fragment threads while it does
some work (GC, perhaps?). This is a clear indication that the query is stuck in
a state from which it might not recover.
Under such circumstances, it makes sense to cancel the query, or at least warn
the user on the query's page in the WebUI, once the time without progress
exceeds a certain threshold.
To detect this, the user can check the {{Last Progress}} column in the
Fragments Overview section, which will show large times.
!image-2018-12-04-11-54-54-247.png|thumbnail!
> Indicate a warning in the WebUI when a query makes no progress for a long time
> ------------------------------------------------------------------------------
>
> Key: DRILL-6879
> URL: https://issues.apache.org/jira/browse/DRILL-6879
> Project: Apache Drill
> Issue Type: Improvement
> Components: Execution - Monitoring, Web Server
> Affects Versions: 1.14.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Labels: user-experience
> Fix For: 1.16.0
>
> Attachments: image-2018-12-04-11-54-54-247.png
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> When running a very large query on a cluster with limited resources, we
> noticed that a VM thread on one of the nodes freezes the fragment threads
> while it does some work (GC, perhaps?). This is a clear indication that the
> query is stuck in a state from which it might not recover.
> Under such circumstances, it makes sense to cancel the query, or at least
> warn the user on the query's page in the WebUI, once the time without
> progress exceeds a certain threshold.
> To detect this, the user can check the {{Last Progress}} column in the
> Fragments Overview section, which will show large times.
> !image-2018-12-04-11-54-54-247.png!
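
The threshold check described above could be sketched roughly as follows. This
is only an illustration, not Drill's implementation: the class name
StalledFragmentCheck, the method findStalled, the fragment ids, and the
300-second threshold are all hypothetical, and the sketch assumes that a
last-progress timestamp (in milliseconds) is available for each fragment.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Drill's code base. */
public class StalledFragmentCheck {

    /**
     * Returns the ids of fragments whose last reported progress is older
     * than the given threshold, relative to nowMillis.
     */
    public static List<String> findStalled(Map<String, Long> lastProgressMillis,
                                           long nowMillis,
                                           Duration threshold) {
        List<String> stalled = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastProgressMillis.entrySet()) {
            long idleMillis = nowMillis - entry.getValue();
            if (idleMillis > threshold.toMillis()) {
                stalled.add(entry.getKey());
            }
        }
        return stalled;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // Assumed sample data: fragment id -> time of last progress.
        Map<String, Long> lastProgress = new LinkedHashMap<>();
        lastProgress.put("00-00", now - 2_000L);    // progressed 2 s ago
        lastProgress.put("01-03", now - 400_000L);  // silent for 400 s

        // Assumed threshold; the issue does not specify a value.
        List<String> stalled =
            findStalled(lastProgress, now, Duration.ofSeconds(300));
        if (!stalled.isEmpty()) {
            System.out.println("WARNING: no progress from fragments " + stalled);
        }
    }
}
```

A warning in the WebUI would only need the non-empty result; whether to cancel
the query instead is a separate policy decision that this sketch leaves open.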
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)