Re: [REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-10 Thread Henrik Rydberg
On Mon, Jul 09, 2012 at 03:13:25PM +0200, Henrik Rydberg wrote:
> On Thu, Jul 05, 2012 at 10:34:10AM +0200, Henrik Rydberg wrote:
> > On Thu, Jul 05, 2012 at 08:54:46AM +0200, Henrik Rydberg wrote:
> > > > Thanks for tracking down the source of this corruption.  I don't have
> > > > any such hardware, so until someone can figure it out, I think we
> > > > should apply this patch.
> > >
> > > In that case, I would have to massage the patch a bit first; it
> > > creates a problem with suspend/resume. Might be something with
> > > nva3_pm.c, who knows. I am really stabbing in the dark here. :-)
> >
> > It seems the suspend/resume problem is unrelated (bad systemd update),
> > so I am fine with applying this as is. Obviously not the best
> > solution, and if I have time I will continue to look for problems in
> > the nva3 copy code, but for now,
> >
> > Signed-off-by: Henrik Rydberg rydb...@euromail.se
>
> I have not encountered the problem in a long while, and I do not have
> the patch applied. It is entirely possible that this was fixed by
> something else. Unless you have already applied the patch, I would
> suggest holding on to it to see if the problem reappears.
>
> Sorry for the churn.

... and there it was again, hours after giving up on it. Oh well.

What makes this bug particularly difficult is that as soon as the
patch is applied, the problem disappears and does not show itself
again - with or without the patch applied. Sounds very much like the
problem is a failure state that does not get reset by current
mainline, but somehow gets reset with the patch applied.

I also learnt that the problem is not in the nva3_copy code itself; I
reverted nva3_copy.c and nva3_pm.c back to v3.4, but the problem persisted.

A DMA problem elsewhere, in the drm code or in the pci layer, seems
more likely than this particular hardware having problems with this
particular copy engine. As it stands, though, applying the patch is
the only thing known to work.

Thanks,
Henrik
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-09 Thread Henrik Rydberg
On Thu, Jul 05, 2012 at 10:34:10AM +0200, Henrik Rydberg wrote:
> On Thu, Jul 05, 2012 at 08:54:46AM +0200, Henrik Rydberg wrote:
> > > Thanks for tracking down the source of this corruption.  I don't have
> > > any such hardware, so until someone can figure it out, I think we
> > > should apply this patch.
> > 
> > In that case, I would have to massage the patch a bit first; it
> > creates a problem with suspend/resume. Might be something with
> > nva3_pm.c, who knows. I am really stabbing in the dark here. :-)
> 
> It seems the suspend/resume problem is unrelated (bad systemd update),
> so I am fine with applying this as is. Obviously not the best
> solution, and if I have time I will continue to look for problems in
> the nva3 copy code, but for now,
> 
> Signed-off-by: Henrik Rydberg rydb...@euromail.se

I have not encountered the problem in a long while, and I do not have
the patch applied. It is entirely possible that this was fixed by
something else. Unless you have already applied the patch, I would
suggest holding on to it to see if the problem reappears.

Sorry for the churn.

Thanks,
Henrik


[REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-05 Thread Ben Skeggs
On Thu, Jul 05, 2012 at 08:31:13AM +0200, Henrik Rydberg wrote:
> Hi Ben, Dave,
Hey Henrik,

> 
> Since 3.5-rc0, I have been experiencing occasional screen corruption
> on my MacBookAir3,1, using a GeForce 320M (nv50, 0xaf). The X driver
> version is xf86-video-nouveau-1.0.1-1 (arch).
> 
> I do not know what the root problem is, but I have been able to
> isolate the symptoms to the usage of nva3_copy.c. The patch below is
> the least intrusive way I could find which kills the symptoms.
> 
> Hopefully this will shed some light on the true problem, such that a
> fix can be found for 3.5.
Thanks for tracking down the source of this corruption.  I don't have
any such hardware, so until someone can figure it out, I think we
should apply this patch.

Cheers,
Ben.

> 
> Thanks,
> Henrik
> 
> The nva3 copy engine exhibits random memory corruption in at least one
> case, the GeForce 320M (nv50, 0xaf) in the MacBookAir3,1.  This patch
> omits creating the engine for the specific chipset, falling back to
> M2MF, which kills the symptoms.
> ---
Signed-off-by: Ben Skeggs bske...@redhat.com

>  drivers/gpu/drm/nouveau/nouveau_state.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_state.c b/drivers/gpu/drm/nouveau/nouveau_state.c
> index 19706f0..b466937 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_state.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_state.c
> @@ -731,7 +731,6 @@ nouveau_card_init(struct drm_device *dev)
>  			case 0xa3:
>  			case 0xa5:
>  			case 0xa8:
> -			case 0xaf:
>  				nva3_copy_create(dev);
>  				break;
>  			}


[REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-05 Thread Henrik Rydberg
On Thu, Jul 05, 2012 at 08:54:46AM +0200, Henrik Rydberg wrote:
> > Thanks for tracking down the source of this corruption.  I don't have
> > any such hardware, so until someone can figure it out, I think we
> > should apply this patch.
> 
> In that case, I would have to massage the patch a bit first; it
> creates a problem with suspend/resume. Might be something with
> nva3_pm.c, who knows. I am really stabbing in the dark here. :-)

It seems the suspend/resume problem is unrelated (bad systemd update),
so I am fine with applying this as is. Obviously not the best
solution, and if I have time I will continue to look for problems in
the nva3 copy code, but for now,

Signed-off-by: Henrik Rydberg rydb...@euromail.se

Thanks,
Henrik


[REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-05 Thread Henrik Rydberg
> Thanks for tracking down the source of this corruption.  I don't have
> any such hardware, so until someone can figure it out, I think we
> should apply this patch.

In that case, I would have to massage the patch a bit first; it
creates a problem with suspend/resume. Might be something with
nva3_pm.c, who knows. I am really stabbing in the dark here. :-)

Thanks,
Henrik


[REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-05 Thread Henrik Rydberg
Hi Ben, Dave,

Since 3.5-rc0, I have been experiencing occasional screen corruption
on my MacBookAir3,1, using a GeForce 320M (nv50, 0xaf). The X driver
version is xf86-video-nouveau-1.0.1-1 (arch).

I do not know what the root problem is, but I have been able to
isolate the symptoms to the usage of nva3_copy.c. The patch below is
the least intrusive way I could find which kills the symptoms.

Hopefully this will shed some light on the true problem, such that a
fix can be found for 3.5.

Thanks,
Henrik

The nva3 copy engine exhibits random memory corruption in at least one
case, the GeForce 320M (nv50, 0xaf) in the MacBookAir3,1.  This patch
omits creating the engine for the specific chipset, falling back to
M2MF, which kills the symptoms.
---
 drivers/gpu/drm/nouveau/nouveau_state.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_state.c b/drivers/gpu/drm/nouveau/nouveau_state.c
index 19706f0..b466937 100644
--- a/drivers/gpu/drm/nouveau/nouveau_state.c
+++ b/drivers/gpu/drm/nouveau/nouveau_state.c
@@ -731,7 +731,6 @@ nouveau_card_init(struct drm_device *dev)
 			case 0xa3:
 			case 0xa5:
 			case 0xa8:
-			case 0xaf:
 				nva3_copy_create(dev);
 				break;
 			}
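
For anyone reading along without the driver source at hand, here is a
minimal userspace sketch of what the hunk above changes. It is not the
nouveau code: enum copy_path, select_copy_path(), the patch_applied flag
and check_copy_path_selection() are hypothetical names made up for
illustration, and the default branch simply models "no nva3 copy engine
is created", which the patch description says results in M2MF being
used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "which engine ends up doing copies".
 * These names do not exist in the nouveau driver. */
enum copy_path {
	COPY_PATH_M2MF,      /* fallback when no nva3 copy engine is created */
	COPY_PATH_NVA3_COPY  /* nva3 copy engine was created */
};

/* Models the chipset switch touched by the patch.
 * patch_applied == false: 0xaf is still listed next to 0xa3/0xa5/0xa8.
 * patch_applied == true:  0xaf is dropped and takes the fallback path. */
static enum copy_path select_copy_path(unsigned int chipset, bool patch_applied)
{
	switch (chipset) {
	case 0xa3:
	case 0xa5:
	case 0xa8:
		return COPY_PATH_NVA3_COPY;
	case 0xaf:
		return patch_applied ? COPY_PATH_M2MF : COPY_PATH_NVA3_COPY;
	default:
		/* Assumption of this sketch: anything not listed uses M2MF. */
		return COPY_PATH_M2MF;
	}
}

/* Self-check: only the 0xaf result depends on the patch. */
static void check_copy_path_selection(void)
{
	assert(select_copy_path(0xaf, false) == COPY_PATH_NVA3_COPY);
	assert(select_copy_path(0xaf, true) == COPY_PATH_M2MF);
	assert(select_copy_path(0xa3, false) == select_copy_path(0xa3, true));
	assert(select_copy_path(0xa8, false) == select_copy_path(0xa8, true));
}
```

Calling check_copy_path_selection() from a small main() is enough to
exercise it. The point of the sketch is only that the patch changes
behaviour for 0xaf and nothing else; it says nothing about where the
corruption actually comes from.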


