Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu, Aug 07, 2014 at 02:47:21PM +0200, Jiri Kosina wrote:
> On Fri, 11 Jul 2014, Pavel Machek wrote:
>
> > > > > Ok, so I have set up machines for ktest / autobisect, and found out
> > > > > that 3.16-rc1 no longer has that problem. Oh well, bisect would not
> > > > > be fun, anyway...
> > > >
> > > > I am still seeing the problem with 3.16-rc2.
> > >
> > > I'm confused now. Is the bisect result
> > >
> > > commit 773875bfb6737982903c42d1ee88cf60af80089c
> > > Author: Daniel Vetter
> > > Date: Mon Jan 27 10:00:30 2014 +0100
> > >
> > > drm/i915: Don't set the 8to6 dither flag when not scaling
> > >
> > > now the culprit or not? Or do we have 2 different bugs at hand here?
> >
> > Three different issues, it seems. Two ring initialization problems,
> > one went away in 3.16 (for me), second did not (suspend for jikos),
> > third -- trivial issue with 8to6 dither.
>
> The patch below seems to finally cure the problem at my system; I've just
> attached it to freedesktop bugzilla, but sending it to this thread as well
> to hopefully get as much testing coverage by affected people as possible.
>
> I am going on with testing whether it really completely fixes the problem
> or just made it less likely.

Picked up for -fixes, thanks for the patch.
-Daniel

> From: Jiri Kosina
> Subject: [PATCH] drm/i915: read HEAD register back in init_ring_common() to
> enforce ordering
>
> Without this, ring initialization fails reliably during resume with
>
> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
> 0001f001 head ff8804 tail start 000e4000
>
> Signed-off-by: Jiri Kosina
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 279488a..7add7ee 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -517,6 +517,9 @@ static int init_ring_common(struct intel_engine_cs *ring)
>  	else
>  		ring_setup_phys_status_page(ring);
>
> +	/* Enforce ordering by reading HEAD register back */
> +	I915_READ_HEAD(ring);
> +
>  	/* Initialize the ring. This must happen _after_ we've cleared the ring
>  	 * registers with the above sequence (the readback of the HEAD registers
>  	 * also enforces ordering), otherwise the hw might lose the new ring
>
> --
> Jiri Kosina
> SUSE Labs

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu, 7 Aug 2014, Jiri Kosina wrote:

> The patch below seems to finally cure the problem at my system; I've just
> attached it to freedesktop bugzilla, but sending it to this thread as well
> to hopefully get as much testing coverage by affected people as possible.
>
> I am going on with testing whether it really completely fixes the problem
> or just made it less likely.

Okay, after 31 suspend-resume cycles, the problem appeared again (while
without the patch, it triggers with 100% reliability).

So it's not a complete fix, it just makes the problem much less visible.
Going back to bugzilla discussion.

> From: Jiri Kosina
> Subject: [PATCH] drm/i915: read HEAD register back in init_ring_common() to
> enforce ordering
>
> Without this, ring initialization fails reliably during resume with
>
> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
> 0001f001 head ff8804 tail start 000e4000
>
> Signed-off-by: Jiri Kosina
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 279488a..7add7ee 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -517,6 +517,9 @@ static int init_ring_common(struct intel_engine_cs *ring)
>  	else
>  		ring_setup_phys_status_page(ring);
>
> +	/* Enforce ordering by reading HEAD register back */
> +	I915_READ_HEAD(ring);
> +
>  	/* Initialize the ring. This must happen _after_ we've cleared the ring
>  	 * registers with the above sequence (the readback of the HEAD registers
>  	 * also enforces ordering), otherwise the hw might lose the new ring

-- 
Jiri Kosina
SUSE Labs
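The "complete fix, or just less likely?" question above can be put in numbers. The following is only a sketch under a simplifying assumption that is not from the thread: every suspend-resume cycle fails independently with the same probability p. The helper names are made up for illustration.

```c
#include <assert.h>

/* Probability that n consecutive cycles all succeed, ASSUMING each cycle
 * fails independently with probability p (a modelling assumption, not a
 * measured property of the bug discussed in this thread). */
double prob_all_clean(double p, int n)
{
	double clean = 1.0;
	int i;

	assert(p >= 0.0 && p <= 1.0 && n >= 0);
	for (i = 0; i < n; i++)
		clean *= 1.0 - p;
	return clean;
}

/* Smallest number of cycles after which a residual failure rate p would
 * have shown up at least once with probability >= 1 - alpha. */
int cycles_needed(double p, double alpha)
{
	double clean = 1.0;
	int n = 0;

	assert(p > 0.0 && p < 1.0 && alpha > 0.0 && alpha < 1.0);
	while (clean > alpha) {
		clean *= 1.0 - p;
		n++;
	}
	return n;
}
```

Under that model, calling `cycles_needed(1.0 / 31.0, 0.05)` estimates how long a test run would have to be before a failure rate of roughly one in 31 cycles is unlikely to hide. A handful of clean resumes says very little about a failure that rare.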
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Fri, 11 Jul 2014, Pavel Machek wrote:

> > > > Ok, so I have set up machines for ktest / autobisect, and found out
> > > > that 3.16-rc1 no longer has that problem. Oh well, bisect would not be
> > > > fun, anyway...
> > >
> > > I am still seeing the problem with 3.16-rc2.
> >
> > I'm confused now. Is the bisect result
> >
> > commit 773875bfb6737982903c42d1ee88cf60af80089c
> > Author: Daniel Vetter
> > Date: Mon Jan 27 10:00:30 2014 +0100
> >
> > drm/i915: Don't set the 8to6 dither flag when not scaling
> >
> > now the culprit or not? Or do we have 2 different bugs at hand here?
>
> Three different issues, it seems. Two ring initialization problems,
> one went away in 3.16 (for me), second did not (suspend for jikos),
> third -- trivial issue with 8to6 dither.

The patch below seems to finally cure the problem at my system; I've just
attached it to freedesktop bugzilla, but sending it to this thread as well
to hopefully get as much testing coverage by affected people as possible.

I am going on with testing whether it really completely fixes the problem
or just made it less likely.

From: Jiri Kosina
Subject: [PATCH] drm/i915: read HEAD register back in init_ring_common() to
 enforce ordering

Without this, ring initialization fails reliably during resume with

	[drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head ff8804 tail start 000e4000

Signed-off-by: Jiri Kosina
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 279488a..7add7ee 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -517,6 +517,9 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	else
 		ring_setup_phys_status_page(ring);
 
+	/* Enforce ordering by reading HEAD register back */
+	I915_READ_HEAD(ring);
+
 	/* Initialize the ring. This must happen _after_ we've cleared the ring
 	 * registers with the above sequence (the readback of the HEAD registers
 	 * also enforces ordering), otherwise the hw might lose the new ring

-- 
Jiri Kosina
SUSE Labs
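For readers unfamiliar with the idiom used in the patch above, here is a minimal user-space sketch of the "clear, read back, then program" sequence. It is not i915 code. The register indices, `CTL_VALID`, and `fake_ring_init()` are hypothetical names, and the "registers" are a plain array. On real memory-mapped hardware the read is typically what forces earlier posted writes to reach the device; on this array it is only a placeholder for that step.

```c
#include <stdint.h>

/* Hypothetical register indices into a fake register file; a real driver
 * would go through its own MMIO accessors instead. */
enum { REG_HEAD, REG_TAIL, REG_START, REG_CTL, REG_COUNT };

#define CTL_VALID 0x1u /* illustrative "ring enabled" bit */

static void reg_write(volatile uint32_t *regs, int idx, uint32_t val)
{
	regs[idx] = val;
}

static uint32_t reg_read(volatile uint32_t *regs, int idx)
{
	return regs[idx];
}

/* Returns 0 on success, -1 if the registers do not hold what was written. */
int fake_ring_init(volatile uint32_t *regs, uint32_t start, uint32_t ctl)
{
	/* Step 1: clear the ring registers. */
	reg_write(regs, REG_CTL, 0);
	reg_write(regs, REG_HEAD, 0);
	reg_write(regs, REG_TAIL, 0);

	/* Step 2: read HEAD back and discard the value. On real MMIO this
	 * acts as an ordering point between step 1 and step 3. */
	(void)reg_read(regs, REG_HEAD);

	/* Step 3: program the new ring and enable it. */
	reg_write(regs, REG_START, start);
	reg_write(regs, REG_CTL, ctl | CTL_VALID);

	/* Step 4: verify that the programmed state stuck. */
	if (!(reg_read(regs, REG_CTL) & CTL_VALID) ||
	    reg_read(regs, REG_START) != start ||
	    reg_read(regs, REG_HEAD) != 0)
		return -1;
	return 0;
}
```

The check in step 4 loosely mirrors the kind of condition behind the "ring initialization failed" message quoted in the patch description, which reports the ctl, head, tail and start values when the ring does not come up as programmed.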
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Fri, 11 Jul 2014, Pavel Machek wrote:

> > > > Ok, so I have set up machines for ktest / autobisect, and found out
> > > > that 3.16-rc1 no longer has that problem. Oh well, bisect would not be
> > > > fun, anyway...
> > >
> > > I am still seeing the problem with 3.16-rc2.
> >
> > I'm confused now. Is the bisect result
> >
> > commit 773875bfb6737982903c42d1ee88cf60af80089c
> > Author: Daniel Vetter
> > Date: Mon Jan 27 10:00:30 2014 +0100
> >
> > drm/i915: Don't set the 8to6 dither flag when not scaling
> >
> > now the culprit or not? Or do we have 2 different bugs at hand here?
>
> Three different issues, it seems. Two ring initialization problems,
> one went away in 3.16 (for me), second did not (suspend for jikos),
> third -- trivial issue with 8to6 dither.

That's a correct assessment. The ring initialization failure I reported is
still there.

-- 
Jiri Kosina
SUSE Labs
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Mon 2014-07-07 10:39:08, Daniel Vetter wrote:
> On Fri, Jun 27, 2014 at 03:37:16PM +0200, Jiri Kosina wrote:
> > On Thu, 26 Jun 2014, Pavel Machek wrote:
> >
> > > Ok, so I have set up machines for ktest / autobisect, and found out that
> > > 3.16-rc1 no longer has that problem. Oh well, bisect would not be fun,
> > > anyway...
> >
> > I am still seeing the problem with 3.16-rc2.
>
> I'm confused now. Is the bisect result
>
> commit 773875bfb6737982903c42d1ee88cf60af80089c
> Author: Daniel Vetter
> Date: Mon Jan 27 10:00:30 2014 +0100
>
> drm/i915: Don't set the 8to6 dither flag when not scaling
>
> now the culprit or not? Or do we have 2 different bugs at hand here?

Three different issues, it seems. Two ring initialization problems,
one went away in 3.16 (for me), second did not (suspend for jikos),
third -- trivial issue with 8to6 dither.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Fri, Jun 27, 2014 at 03:37:16PM +0200, Jiri Kosina wrote:
> On Thu, 26 Jun 2014, Pavel Machek wrote:
>
> > Ok, so I have set up machines for ktest / autobisect, and found out that
> > 3.16-rc1 no longer has that problem. Oh well, bisect would not be fun,
> > anyway...
>
> I am still seeing the problem with 3.16-rc2.

I'm confused now. Is the bisect result

commit 773875bfb6737982903c42d1ee88cf60af80089c
Author: Daniel Vetter
Date: Mon Jan 27 10:00:30 2014 +0100

drm/i915: Don't set the 8to6 dither flag when not scaling

now the culprit or not? Or do we have 2 different bugs at hand here?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu, 26 Jun 2014, Pavel Machek wrote:

> Ok, so I have set up machines for ktest / autobisect, and found out that
> 3.16-rc1 no longer has that problem. Oh well, bisect would not be fun,
> anyway...

I am still seeing the problem with 3.16-rc2.

-- 
Jiri Kosina
SUSE Labs
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Mon 2014-06-09 13:03:31, Jiri Kosina wrote:
> On Mon, 9 Jun 2014, Pavel Machek wrote:
>
> > > > Strange. It seems 3.15 with the patch reverted only boots in 30% or so
> > > > cases... And I've seen resume failure, too, so maybe I was just lucky
> > > > that it worked for a while.
> > >
> > > git bisect really likes 25f397a429dfa43f22c278d0119a60 - you're about
> > > the 5th report or so that claims this is the culprit but it's
> > > something else. The above code is definitely not used in i915 so bogus
> > > bisect result.
> >
> > Note I did not do the bisect, I only attempted revert and test.
> >
> > And did three boots of successful s2ram.. only to find out that it
> > does not really fix s2ram, I was just lucky :-(.
> >
> > Unfortunately, this means my s2ram problem will be tricky/impossible
> > to bisect :-(.
>
> Welcome to the situation I have been in for past several months.

Ok, so I have set up machines for ktest / autobisect, and found out that
3.16-rc1 no longer has that problem. Oh well, bisect would not be fun,
anyway...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Mon, 9 Jun 2014, Pavel Machek wrote:

> > > Strange. It seems 3.15 with the patch reverted only boots in 30% or so
> > > cases... And I've seen resume failure, too, so maybe I was just lucky
> > > that it worked for a while.
> >
> > git bisect really likes 25f397a429dfa43f22c278d0119a60 - you're about
> > the 5th report or so that claims this is the culprit but it's
> > something else. The above code is definitely not used in i915 so bogus
> > bisect result.
>
> Note I did not do the bisect, I only attempted revert and test.
>
> And did three boots of successful s2ram.. only to find out that it
> does not really fix s2ram, I was just lucky :-(.
>
> Unfortunately, this means my s2ram problem will be tricky/impossible
> to bisect :-(.

Welcome to the situation I have been in for past several months.

-- 
Jiri Kosina
SUSE Labs
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Mon 2014-06-09 11:25:20, Daniel Vetter wrote:
> On Sun, Jun 8, 2014 at 1:11 AM, Pavel Machek wrote:
> > Strange. It seems 3.15 with the patch reverted only boots in 30% or so
> > cases... And I've seen resume failure, too, so maybe I was just lucky
> > that it worked for a while.
>
> git bisect really likes 25f397a429dfa43f22c278d0119a60 - you're about
> the 5th report or so that claims this is the culprit but it's
> something else. The above code is definitely not used in i915 so bogus
> bisect result.

Note I did not do the bisect, I only attempted revert and test.

And did three boots of successful s2ram.. only to find out that it
does not really fix s2ram, I was just lucky :-(.

Unfortunately, this means my s2ram problem will be tricky/impossible
to bisect :-(.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Sun, Jun 8, 2014 at 1:11 AM, Pavel Machek wrote:
> Strange. It seems 3.15 with the patch reverted only boots in 30% or so
> cases... And I've seen resume failure, too, so maybe I was just lucky
> that it worked for a while.

git bisect really likes 25f397a429dfa43f22c278d0119a60 - you're about
the 5th report or so that claims this is the culprit but it's
something else. The above code is definitely not used in i915 so bogus
bisect result.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Sat 2014-06-07 14:06:14, Pavel Machek wrote:
> On Thu 2014-05-15 17:31:54, Daniel Vetter wrote:
> > On Thu, May 15, 2014 at 5:29 PM, Jiri Kosina wrote:
> > >> > Note that X do work somehow after resume (I can't switch virtual
> > >> > desktops and dialog is stuck on screen, but it is not complete
> > >> > failure). I can do ctrl-alt-f1 and get to useful prompt.
> > >>
> > >> Oops. You were right. It seems it is duplicate after all.
> > >>
> > >> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
> > >> 0001f001 head 1020 tail start 3000
> > >
> > > Pavel, thanks a lot for testing.
> > >
> > > Adding Daniel and Chris to CC -- we have another incarnation of the bug
> > > that is being chased in 76554.
> >
> > Someone succeeding at a bisect would be awesome ... Note that the only
> > key here is the ring init failure in dmesg.
>
> Oh and... the machine has problems coming up after reboot (never seen
> those before 3.15). Sometimes, boot will hang with blank screen, and hard
> powerdown is needed...

Strange. It seems 3.15 with the patch reverted only boots in 30% or so
cases... And I've seen resume failure, too, so maybe I was just lucky
that it worked for a while.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu 2014-05-15 17:31:54, Daniel Vetter wrote:
> On Thu, May 15, 2014 at 5:29 PM, Jiri Kosina wrote:
> >> > Note that X do work somehow after resume (I can't switch virtual
> >> > desktops and dialog is stuck on screen, but it is not complete
> >> > failure). I can do ctrl-alt-f1 and get to useful prompt.
> >>
> >> Oops. You were right. It seems it is duplicate after all.
> >>
> >> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
> >> 0001f001 head 1020 tail start 3000
> >
> > Pavel, thanks a lot for testing.
> >
> > Adding Daniel and Chris to CC -- we have another incarnation of the bug
> > that is being chased in 76554.
>
> Someone succeeding at a bisect would be awesome ... Note that the only
> key here is the ring init failure in dmesg.

Oh and... the machine has problems coming up after reboot (never seen
those before 3.15). Sometimes, boot will hang with blank screen, and hard
powerdown is needed...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu 2014-05-15 17:31:54, Daniel Vetter wrote:
> On Thu, May 15, 2014 at 5:29 PM, Jiri Kosina wrote:
> >> > Note that X do work somehow after resume (I can't switch virtual
> >> > desktops and dialog is stuck on screen, but it is not complete
> >> > failure). I can do ctrl-alt-f1 and get to useful prompt.
> >>
> >> Oops. You were right. It seems it is duplicate after all.
> >>
> >> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
> >> 0001f001 head 1020 tail start 3000
> >
> > Pavel, thanks a lot for testing.
> >
> > Adding Daniel and Chris to CC -- we have another incarnation of the bug
> > that is being chased in 76554.
>
> Someone succeeding at a bisect would be awesome ... Note that the only
> key here is the ring init failure in dmesg.

Hmm. I was slowly getting ready for doing the bisect, but it seems
someone did it for me.

Date: Wed, 28 May 2014 18:25:21 +0100
From: Ken Moffat
To: Daniel Vetter
Cc: Dave Airlie, linux-ker...@vger.kernel.org
Subject: Resume from suspend broken in 3.15. (bisected)
User-Agent: Mutt/1.5.23 (2014-03-12)

I manually reverted 25f397a429dfa43f22c278d0119a60a343aa568f and it
seems I'm back at working suspend/resume in i915.

The 3.15 regression seemed to be "if suspend works once, it seems to
work until reboot"; otherwise it failed in about 70% of cases. I did two
reboots so far with 25f... reverted, and it seems to behave ok.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [Intel-gfx] 3.15-rc: regression in suspend
On Thu, May 15, 2014 at 5:29 PM, Jiri Kosina wrote:
>> > Note that X do work somehow after resume (I can't switch virtual
>> > desktops and dialog is stuck on screen, but it is not complete
>> > failure). I can do ctrl-alt-f1 and get to useful prompt.
>>
>> Oops. You were right. It seems it is duplicate after all.
>>
>> [drm:init_ring_common] *ERROR* render ring initialization failed ctl
>> 0001f001 head 1020 tail start 3000
>
> Pavel, thanks a lot for testing.
>
> Adding Daniel and Chris to CC -- we have another incarnation of the bug
> that is being chased in 76554.

Someone succeeding at a bisect would be awesome ... Note that the only
key here is the ring init failure in dmesg.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
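Since the ring init failure in dmesg is named above as the only reliable marker, an automated good/bad check during bisection could key on that message alone. The helper below is a hypothetical sketch, not part of any existing tool. It only tests whether a single log line carries the signature quoted in this thread, and it makes no attempt to parse the register values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if a kernel log line carries the ring init failure
 * signature quoted in this thread. The function name is made up for
 * illustration. */
bool is_ring_init_failure(const char *line)
{
	return line != NULL &&
	       strstr(line, "[drm:init_ring_common]") != NULL &&
	       strstr(line, "ring initialization failed") != NULL;
}
```

A wrapper script could feed each line of the saved kernel log through a check like this after every resume and mark the kernel under test as bad on the first match.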