On Wed, 19 Oct 2011 12:32:25 +0100 Chris Wilson <[email protected]> wrote:
> On Tue, 11 Oct 2011 16:39:09 +0200, Daniel Vetter <[email protected]> wrote:
> > From: Ben Widawsky <[email protected]>
> >
> > This was pulled out of the per ring error handling patch series as it
> > actually fixes two issues, and bikeshedding appears to be going on
> > there.
> >
> > First, remove setting hangcheck_count when we do notify ring. While it
> > seems counterintuitive to be setting up a timer to catch hangcheck_count
> > greater than 0 with hangcheck_count already greater than 0, actually
> > when we go to check if the GPU is hung we clear that value if the GPU is
> > still alive. Leaving this is actually harmful as submitting work could
> > falsely clear the count while the hangcheck code is checking the count.
> > I can't think of a case where this doesn't just delay the inevitable
> > reset... but I didn't spend too much time thinking about it.
> >
> > Second, for Gen5+ we have more information to be considered when
> > determining if the GPU is stuck, primarily the media ring (and blitter
> > ring in gen6). This patch will check all available rings, and also updates
> > error state with the new information. It theoretically can fix false
> > positives, but I haven't actually come across such a case.
> >
> > Signed-off-by: Ben Widawsky <[email protected]>
> > [danvet: remove remnants of an unrelated cleanup patch]
> > Signed-off-by: Daniel Vetter <[email protected]>
>
> NAK: This failed to detect a hang, leaving my box frozen. I suspect that
> the value of INSTDONE was fluctuating on the render ring even though we
> had no requests pending and so could assume that it was idle.
> -Chris
>

How is that different from the previous behavior? We checked INSTDONE on
the render ring before this patch too.

_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
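
To make the question concrete, here is a rough userspace model of the check as the commit message describes it. It is only a sketch under stated assumptions, not the i915 code: the struct and function names, the fixed count of three rings, and the threshold of two unchanged ticks are all hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_RINGS      3  /* render, media, blitter: assumed for this sketch */
#define HANG_THRESHOLD 2  /* hypothetical: unchanged ticks before declaring a hang */

/* Per-ring register snapshot taken on each timer tick (hypothetical layout). */
struct ring_sample {
	uint32_t acthd;
	uint32_t instdone;
};

struct hangcheck_state {
	struct ring_sample last[NUM_RINGS];
	int count;
};

/*
 * One hangcheck timer tick. Returns true once no ring has shown any
 * change for HANG_THRESHOLD consecutive ticks. Any change on any ring
 * counts as progress and clears the count.
 */
static bool hangcheck_tick(struct hangcheck_state *hc,
			   const struct ring_sample cur[NUM_RINGS])
{
	bool progress = false;

	for (int i = 0; i < NUM_RINGS; i++) {
		if (cur[i].acthd != hc->last[i].acthd ||
		    cur[i].instdone != hc->last[i].instdone)
			progress = true;
	}

	if (progress) {
		/* Something moved: remember the new snapshot and start over. */
		memcpy(hc->last, cur, sizeof(hc->last));
		hc->count = 0;
		return false;
	}

	return ++hc->count >= HANG_THRESHOLD;
}
```

In this model, a render-ring INSTDONE value that keeps changing clears the count on every tick, so no hang is ever reported, and that holds whether one ring or all rings are sampled.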
