Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-26 Thread Daniel Vetter
On Tue, Mar 25, 2014 at 06:42:13PM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 24 Mar 2014, Daniel Vetter wrote:
> 
> > >> Like I've said the entire teardown sequence for legacy drm drivers is
> > >> terminally busted, so the only hope we have is to reapply this missing
> > >> duct-tape which made your X crash. But if that itself isn't a regression
> > >> there's no way to fix the current drm/mga driver without a complete
> > >> rewrite as a new-style kernel modesetting driver.
> > >> -Daniel
> > >
> > > If someone understands the locking issues I pointed out above, it could be
> > > easy to fix.
> > 
> > The locking issue isn't your problem, the real issue is that putting a
> > irq_uninstall into core code will break all the new (properly working)
> > drivers. And you can't really fix this in mga itself since the
> > lifetime rules of the register mappings are totally broken. It's a
> > fundamental misdesign of the legacy drm driver architecture and the
> > _only_ way to fix this bug for real is to rewrite this all. Which was
> > done for all the still used drivers like i915, radeon, nouveau, ...
> > -Daniel
> 
> When I tried Radeon AGP card with the KMS driver, it lacked the 
> possibility to set video mode with fbset and the framebuffer console was 
> very slow because it wasn't accelerated.
> 
> So, Radeon with the new driver is much less useable than Matrox.
> 
> Did I misconfigure something? Or, is console acceleration and modesetting 
> deliberately unsupported in KMS drivers?

You've missed nothing and the performance is abysmal intentionally. It
could be fixed (we've had patches for i915 to accelerate the console)
but it's not worth it - fbdev emulation is just here for backwards compat.

If you want fast console on kms drivers you need to look at David
Herrmann's kmscon. That uses GL drivers on top of egl+gbm+kms and that
will be fast. The long-term plan is to switch to that and have a very
minimal shim in the kernel on top of kms as an emergency logging console
only (i.e. even less than the current fbdev stuff).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-25 Thread Mikulas Patocka


On Mon, 24 Mar 2014, Daniel Vetter wrote:

> >> Like I've said the entire teardown sequence for legacy drm drivers is
> >> terminally busted, so the only hope we have is to reapply this missing
> >> duct-tape which made your X crash. But if that itself isn't a regression
> >> there's no way to fix the current drm/mga driver without a complete
> >> rewrite as a new-style kernel modesetting driver.
> >> -Daniel
> >
> > If someone understands the locking issues I pointed out above, it could be
> > easy to fix.
> 
> The locking issue isn't your problem, the real issue is that putting a
> irq_uninstall into core code will break all the new (properly working)
> drivers. And you can't really fix this in mga itself since the
> lifetime rules of the register mappings are totally broken. It's a
> fundamental misdesign of the legacy drm driver architecture and the
> _only_ way to fix this bug for real is to rewrite this all. Which was
> done for all the still used drivers like i915, radeon, nouveau, ...
> -Daniel

When I tried Radeon AGP card with the KMS driver, it lacked the 
possibility to set video mode with fbset and the framebuffer console was 
very slow because it wasn't accelerated.

So, Radeon with the new driver is much less useable than Matrox.

Did I misconfigure something? Or, is console acceleration and modesetting 
deliberately unsupported in KMS drivers?

Mikulas


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-25 Thread Daniel Vetter
On Tue, Mar 25, 2014 at 12:11 AM, Andreas Mohr  wrote:
> On Mon, Mar 24, 2014 at 10:46:49PM +0100, Daniel Vetter wrote:
>> On Mon, Mar 24, 2014 at 9:40 PM, Mikulas Patocka  wrote:
>> > If someone understands the locking issues I pointed out above, it could be
>> > easy to fix.
>>
>> The locking issue isn't your problem, the real issue is that putting a
>> irq_uninstall into core code will break all the new (properly working)
>> drivers. And you can't really fix this in mga itself since the
>> lifetime rules of the register mappings are totally broken. It's a
>> fundamental misdesign of the legacy drm driver architecture and the
>> _only_ way to fix this bug for real is to rewrite this all. Which was
>> done for all the still used drivers like i915, radeon, nouveau, ...
>
> That sounds plausible - yet with meatballs (ok, maybe I should omit
> such quite possibly unjustified qualification) such as this:
>
> git show --stat 771fe6b912fca54f03
>
> how is a bunch of marginally-trained hobbyists ever supposed to be implementing
> a working practical (i.e., "base") driver for *various* currently unsupported
> (booted would perhaps even be a more fitting word?) hardware?
>
> While the result of a wc -l check of the drivers/gpu/drm/r128 dir itself
> seems quite positive, that still might be not much of help
> when eyeing the large KMS changes that had to be done elsewhere.
>
> I guess we can make use of all the practical advice/links that we can get...
> (such as hints at good candidates of existing KMSified drivers
> which don't come with the full bells and whistles package,
> hints at suitably sized KMS support commits, grandma tutorials, ...).
> Some semi-short search wasn't overly successful, with links such as
> http://www.x.org/wiki/ModeSetting/
> https://en.wikipedia.org/wiki/KMS_%28Linux_kernel%29#Linux
> "New, Generic X.Org KMS Driver Work" 
> http://www.phoronix.com/scan.php?page=news_item&px=OTk1OA
>
> Or perhaps I should just state outright that I seem to be in need
> of a working solution for my kernel upgrade pain
> which I would be deemed to want semi-soonish
> (the i810, MGA users and some others might be sharing my thoughts).
> IOW, my r128 driver is somewhat of a "still used driver", thank you very much.
>
> Thanks for having managed to survive my posting in an asbestos-lined
> garment (apologies if it came across in harsh terms :),
>
> Andreas Mohr
> (not necessarily a member of the forced-monopoly hardware upgrade treadmill 
> cult)

Well, I personally have an i810 here and for about 3 years haven't
gotten around to implementing a kms stack for it (just for fun,
essentially). And that's on a platform which can steal a lot of the
kms infrastructure from the i915 driver.

For simple devices getting a basic kms driver off the ground (that
means no acceleration support in the kernel with GEM, and obviously
userspace not ported) seems to be doable in one gsoc term, at least
someone managed it. Which means 3 months full-time work.

So yeah, writing real gpu drivers is a fairly non-trivial task. And all
the people who can do it are fully busy getting the new shit to work.

Wrt stealing code: It's better to look at the legacy fbdev drivers,
since they have modeset support. And meanwhile we have fairly decent
drm DocBook in the kernel, the latest updates are in drm-next.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Andreas Mohr
Hi,

On Mon, Mar 24, 2014 at 10:46:49PM +0100, Daniel Vetter wrote:
> On Mon, Mar 24, 2014 at 9:40 PM, Mikulas Patocka  wrote:
> > If someone understands the locking issues I pointed out above, it could be
> > easy to fix.
> 
> The locking issue isn't your problem, the real issue is that putting a
> irq_uninstall into core code will break all the new (properly working)
> drivers. And you can't really fix this in mga itself since the
> lifetime rules of the register mappings are totally broken. It's a
> fundamental misdesign of the legacy drm driver architecture and the
> _only_ way to fix this bug for real is to rewrite this all. Which was
> done for all the still used drivers like i915, radeon, nouveau, ...

That sounds plausible - yet with meatballs (ok, maybe I should omit
such quite possibly unjustified qualification) such as this:

git show --stat 771fe6b912fca54f03

how is a bunch of marginally-trained hobbyists ever supposed to be implementing
a working practical (i.e., "base") driver for *various* currently unsupported
(booted would perhaps even be a more fitting word?) hardware?

While the result of a wc -l check of the drivers/gpu/drm/r128 dir itself
seems quite positive, that still might be not much of help
when eyeing the large KMS changes that had to be done elsewhere.

I guess we can make use of all the practical advice/links that we can get...
(such as hints at good candidates of existing KMSified drivers
which don't come with the full bells and whistles package,
hints at suitably sized KMS support commits, grandma tutorials, ...).
Some semi-short search wasn't overly successful, with links such as
http://www.x.org/wiki/ModeSetting/
https://en.wikipedia.org/wiki/KMS_%28Linux_kernel%29#Linux
"New, Generic X.Org KMS Driver Work" 
http://www.phoronix.com/scan.php?page=news_item&px=OTk1OA

Or perhaps I should just state outright that I seem to be in need
of a working solution for my kernel upgrade pain
which I would be deemed to want semi-soonish
(the i810, MGA users and some others might be sharing my thoughts).
IOW, my r128 driver is somewhat of a "still used driver", thank you very much.

Thanks for having managed to survive my posting in an asbestos-lined
garment (apologies if it came across in harsh terms :),

Andreas Mohr
(not necessarily a member of the forced-monopoly hardware upgrade treadmill 
cult)


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Daniel Vetter
On Mon, Mar 24, 2014 at 9:40 PM, Mikulas Patocka  wrote:
>> > > -> All hell breaks loose if Xorg dies and takes all its mappings with it
>> > > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
>> > > driver hanging in the air if there's an interrupt still pending (or
>> > > anything else fwiw).
>> >
>> > For me that crash happened when xorg exited with a fatal error too.
>>
>> Is this fatal error itself a regression or have you seen that on older
>> kernels, too?
>
> In my case that Xorg error was not kernel-related at all. It happened
> because of unknown symbol because I used mga_dri.so from Debian 6 in
> Debian 7 (mga_dri.so isn't shipped in Debian 7 anymore). I can still play
> quake with that old mga_dri.so, although in some other scenario it causes
> failure because of unknown symbol. I should probably recompile mga_dri on
> my own.
>
>> Like I've said the entire teardown sequence for legacy drm drivers is
>> terminally busted, so the only hope we have is to reapply this missing
>> duct-tape which made your X crash. But if that itself isn't a regression
>> there's no way to fix the current drm/mga driver without a complete
>> rewrite as a new-style kernel modesetting driver.
>> -Daniel
>
> If someone understands the locking issues I pointed out above, it could be
> easy to fix.

The locking issue isn't your problem, the real issue is that putting a
irq_uninstall into core code will break all the new (properly working)
drivers. And you can't really fix this in mga itself since the
lifetime rules of the register mappings are totally broken. It's a
fundamental misdesign of the legacy drm driver architecture and the
_only_ way to fix this bug for real is to rewrite this all. Which was
done for all the still used drivers like i915, radeon, nouveau, ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Mikulas Patocka


On Mon, 24 Mar 2014, Daniel Vetter wrote:

> On Mon, Mar 24, 2014 at 01:17:12PM -0400, Mikulas Patocka wrote:

> > > > > Hmm, given that Mikulas in
> > > > > https://lkml.org/lkml/2014/2/26/537
> > > > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > > > that that indeed may have been a regression at <= 3.13 proper even
> > > > > (which may pose interesting questions about the level of testing coverage
> > > > > we still enjoy [not!?] in this hardware area).
> > 
> > That patch drops a mutex, so it is not correct. There is mutex recursion -
> > we need to uninstall the irq in drm_master_destroy, because here we are
> > committed to destroying the device. But the routine that uninstalls the irq
> > takes struct_mutex, which is already held in drm_master_destroy.
> > 
> > I suppose that the person who maintains drm reworks the patch so that it's 
> > correct:
> > 
> > - could we use a different mutex to protect the irq in drm_irq.c? Or 
> > possibly no mutex at all and use cmpxchg to manipulate the variable 
> > dev->irq_enabled? - this seems like the best solution. But I am not sure 
> > if the code in drm_irq.c somehow depends on the facts that other parts of 
> > the drm subsystem take struct_mutex.
> > 
> > - could we pass a new argument to drm_irq_uninstall that tells it not to 
> > take the mutex? drm_master_destroy would set this argument to 1. 
> > drm_master_destroy is mostly called with struct_mutex held, but there may 
> > be places in vmwgfx_drv.c where drm_master_put (which calls 
> > drm_master_destroy) may be called without struct_mutex held.
> > 
> > Is it true that drm_master_destroy can be called without struct_mutex 
> > held? I don't know.
> > 
> > 
> > I think drm maintainer should sort out the above issues and modify the 
> > patch accordingly.
> > 
> > > -> All hell breaks loose if Xorg dies and takes all its mappings with it
> > > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> > > driver hanging in the air if there's an interrupt still pending (or
> > > anything else fwiw).
> > 
> > For me that crash happened when xorg exited with a fatal error too.
> 
> Is this fatal error itself a regression or have you seen that on older
> kernels, too?

In my case that Xorg error was not kernel-related at all. It happened 
because of unknown symbol because I used mga_dri.so from Debian 6 in 
Debian 7 (mga_dri.so isn't shipped in Debian 7 anymore). I can still play 
quake with that old mga_dri.so, although in some other scenario it causes 
failure because of unknown symbol. I should probably recompile mga_dri on 
my own.

> Like I've said the entire teardown sequence for legacy drm drivers is
> terminally busted, so the only hope we have is to reapply this missing
> duct-tape which made your X crash. But if that itself isn't a regression
> there's no way to fix the current drm/mga driver without a complete
> rewrite as a new-style kernel modesetting driver.
> -Daniel

If someone understands the locking issues I pointed out above, it could be 
easy to fix.

Mikulas


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Daniel Vetter
On Mon, Mar 24, 2014 at 01:17:12PM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 24 Mar 2014, Daniel Vetter wrote:
> 
> > On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote:
> > > On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr  wrote:
> > > > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> > > >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
> > > >> >
> > > >> > which did end up flawless on 3.12.0-rc2+, too
> > > >> > (but failed to improve the issue on 3.14.0-rc7+).
> > > >> >
> > > > >> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > > > >> > (neither dri disable nor libdrm upgrade helps) affected.
> > > >> > Does anyone know which change caused that issue?
> > > >> > (I'm asking because bisect here would be relatively painful).
> > > >>
> > > >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> > > >> 3.14 rc only, or did it happen already in the previous release?
> > > >
> > > > Hmm, given that Mikulas in
> > > > https://lkml.org/lkml/2014/2/26/537
> > > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > > that that indeed may have been a regression at <= 3.13 proper even
> > > > > (which may pose interesting questions about the level of testing coverage
> > > > > we still enjoy [not!?] in this hardware area).
> 
> That patch drops a mutex, so it is not correct. There is mutex recursion -
> we need to uninstall the irq in drm_master_destroy, because here we are
> committed to destroying the device. But the routine that uninstalls the irq
> takes struct_mutex, which is already held in drm_master_destroy.
> 
> I suppose that the person who maintains drm reworks the patch so that it's 
> correct:
> 
> - could we use a different mutex to protect the irq in drm_irq.c? Or 
> possibly no mutex at all and use cmpxchg to manipulate the variable 
> dev->irq_enabled? - this seems like the best solution. But I am not sure 
> if the code in drm_irq.c somehow depends on the facts that other parts of 
> the drm subsystem take struct_mutex.
> 
> - could we pass a new argument to drm_irq_uninstall that tells it not to 
> take the mutex? drm_master_destroy would set this argument to 1. 
> drm_master_destroy is mostly called with struct_mutex held, but there may 
> be places in vmwgfx_drv.c where drm_master_put (which calls 
> drm_master_destroy) may be called without struct_mutex held.
> 
> Is it true that drm_master_destroy can be called without struct_mutex 
> held? I don't know.
> 
> 
> I think drm maintainer should sort out the above issues and modify the 
> patch accordingly.
> 
> > > > Oh well, seems I'll have to prepare/build 3.13 now...
> > > 
> > > It's > 15 year old hardware, so yes I believe we have close to 0
> > > testing coverage on it outside of distros,
> > > 
> > > I'm not even sure I have one anymore, I might be able to test an MGA in one
> > > box.
> > 
> > I haven't done a full read of all the related code, but this smells like a
> > similar bug I've hit all over the place in the i810 driver (another one of
> > those undead drm drivers of yonders). Ingredients:
> > 
> > 1) Xorg creates a drm mapping of the register space.
> > 2) Xorg tells the hw-specific drm which drm mapping has the hw registers,
> > and the driver uses that. Iirc this has been done as some form of OS
> > abstraction. Also note that these mappings aren't refcounted, so the first
> > guy to call drm_rmmap wins.
> > 
> > -> All hell breaks loose if Xorg dies and takes all its mappings with it
> > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> > driver hanging in the air if there's an interrupt still pending (or
> > anything else fwiw).
> 
> For me that crash happened when xorg exited with a fatal error too.

Is this fatal error itself a regression or have you seen that on older
kernels, too?

Like I've said the entire teardown sequence for legacy drm drivers is
terminally busted, so the only hope we have is to reapply this missing
duct-tape which made your X crash. But if that itself isn't a regression
there's no way to fix the current drm/mga driver without a complete
rewrite as a new-style kernel modesetting driver.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Mikulas Patocka


On Mon, 24 Mar 2014, Daniel Vetter wrote:

> On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote:
> > On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr  wrote:
> > > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> > >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
> > >> >
> > >> > which did end up flawless on 3.12.0-rc2+, too
> > >> > (but failed to improve the issue on 3.14.0-rc7+).
> > >> >
> > >> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > >> > (neither dri disable nor libdrm upgrade helps) affected.
> > >> > Does anyone know which change caused that issue?
> > >> > (I'm asking because bisect here would be relatively painful).
> > >>
> > >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> > >> 3.14 rc only, or did it happen already in the previous release?
> > >
> > > Hmm, given that Mikulas in
> > > https://lkml.org/lkml/2014/2/26/537
> > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > that that indeed may have been a regression at <= 3.13 proper even
> > > (which may pose interesting questions about the level of testing coverage
> > > we still enjoy [not!?] in this hardware area).

That patch drops a mutex, so it is not correct. There is mutex recursion -
we need to uninstall the irq in drm_master_destroy, because here we are
committed to destroying the device. But the routine that uninstalls the irq
takes struct_mutex, which is already held in drm_master_destroy.

I suppose that the person who maintains drm reworks the patch so that it's 
correct:

- could we use a different mutex to protect the irq in drm_irq.c? Or 
possibly no mutex at all and use cmpxchg to manipulate the variable 
dev->irq_enabled? - this seems like the best solution. But I am not sure 
if the code in drm_irq.c somehow depends on the facts that other parts of 
the drm subsystem take struct_mutex.

- could we pass a new argument to drm_irq_uninstall that tells it not to 
take the mutex? drm_master_destroy would set this argument to 1. 
drm_master_destroy is mostly called with struct_mutex held, but there may 
be places in vmwgfx_drv.c where drm_master_put (which calls 
drm_master_destroy) may be called without struct_mutex held.

Is it true that drm_master_destroy can be called without struct_mutex 
held? I don't know.


I think drm maintainer should sort out the above issues and modify the 
patch accordingly.
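
A minimal userspace sketch of the first option above (no mutex, cmpxchg on
the enabled flag). This is not drm code: struct model_dev, model_irq_install
and model_irq_uninstall are hypothetical names, and C11
atomic_compare_exchange_strong stands in for the kernel's cmpxchg. It only
illustrates that the 1 -> 0 transition is won by exactly one caller, so the
teardown body runs once without struct_mutex being taken. Whether drm_irq.c
relies on struct_mutex for anything else is left open here, as above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the irq-related part of a drm device. */
struct model_dev {
	atomic_int irq_enabled;     /* 1 while the handler is installed */
	atomic_int uninstall_calls; /* how often the teardown body really ran */
};

/* Flip irq_enabled 0 -> 1; returns false if it was already installed. */
static bool model_irq_install(struct model_dev *dev)
{
	int expected = 0;

	assert(dev != NULL);
	return atomic_compare_exchange_strong(&dev->irq_enabled, &expected, 1);
}

/*
 * Flip irq_enabled 1 -> 0 without taking any lock. Only the caller that
 * wins the compare-and-exchange runs the teardown body; everyone else
 * sees the flag already cleared and returns false.
 */
static bool model_irq_uninstall(struct model_dev *dev)
{
	int expected = 1;

	assert(dev != NULL);
	if (!atomic_compare_exchange_strong(&dev->irq_enabled, &expected, 0))
		return false;

	/* A real driver's irq_uninstall hook would be called here. */
	atomic_fetch_add(&dev->uninstall_calls, 1);
	return true;
}
```

With this shape, a caller that already holds a lock (as drm_master_destroy
does with struct_mutex) and a caller that holds none can both call the
uninstall routine, since it takes no lock itself.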

> > > Oh well, seems I'll have to prepare/build 3.13 now...
> > 
> > It's > 15 year old hardware, so yes I believe we have close to 0
> > testing coverage on it outside of distros,
> > 
> > I'm not even sure I have one anymore, I might be able to test an MGA in one
> > box.
> 
> I haven't done a full read of all the related code, but this smells like a
> similar bug I've hit all over the place in the i810 driver (another one of
> those undead drm drivers of yonders). Ingredients:
> 
> 1) Xorg creates a drm mapping of the register space.
> 2) Xorg tells the hw-specific drm which drm mapping has the hw registers,
> and the driver uses that. Iirc this has been done as some form of OS
> abstraction. Also note that these mappings aren't refcounted, so the first
> guy to call drm_rmmap wins.
> 
> -> All hell breaks loose if Xorg dies and takes all its mappings with it
> (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> driver hanging in the air if there's an interrupt still pending (or
> anything else fwiw).

For me that crash happened when xorg exited with a fatal error too.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Daniel Vetter
On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote:
> On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr  wrote:
> > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
> >> >
> >> > which did end up flawless on 3.12.0-rc2+, too
> >> > (but failed to improve the issue on 3.14.0-rc7+).
> >> >
> >> > So, for all intents and purposes, drm infrastructure seems unavoidably
> >> > (neither dri disable nor libdrm upgrade helps) affected.
> >> > Does anyone know which change caused that issue?
> >> > (I'm asking because bisect here would be relatively painful).
> >>
> >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> >> 3.14 rc only, or did it happen already in the previous release?
> >
> > Hmm, given that Mikulas in
> > https://lkml.org/lkml/2014/2/26/537
> > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > that that indeed may have been a regression at <= 3.13 proper even
> > (which may pose interesting questions about the level of testing coverage
> > we still enjoy [not!?] in this hardware area).
> >
> > Oh well, seems I'll have to prepare/build 3.13 now...
> 
> It's > 15 year old hardware, so yes I believe we have close to 0
> testing coverage on it outside of distros,
> 
> I'm not even sure I have one anymore, I might be able to test an MGA in one 
> box.

I haven't done a full read of all the related code, but this smells like a
similar bug I've hit all over the place in the i810 driver (another one of
those undead drm drivers of yonders). Ingredients:

1) Xorg creates a drm mapping of the register space.
2) Xorg tells the hw-specific drm which drm mapping has the hw registers,
and the driver uses that. Iirc this has been done as some form of OS
abstraction. Also note that these mappings aren't refcounted, so the first
guy to call drm_rmmap wins.

-> All hell breaks loose if Xorg dies and takes all its mappings with it
(in master_destroy, since the Xorg /dev fd is the master) and leaves the
driver hanging in the air if there's an interrupt still pending (or
anything else fwiw).
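
The failure mode can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct model_map, struct model_driver and the functions below are hypothetical and do not correspond to real drm structures. The "mapped" flag exists purely so the model can report the problem; the legacy driver has no such check and would simply touch register space that is no longer mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a register mapping created by userspace. */
struct model_map {
    bool mapped;
    unsigned int regs[4];
};

/* The driver only borrows the pointer; there is no reference count. */
struct model_driver {
    struct model_map *mmio;
};

/* Models the mapping being removed while the driver still points at it. */
static void model_rmmap(struct model_map *map)
{
    assert(map != NULL);
    map->mapped = false;
}

/* Returns 0 on success, or -1 at the point where a real legacy driver
 * would access unmapped register space. */
static int model_irq_uninstall_regs(struct model_driver *drv)
{
    assert(drv != NULL);
    if (drv->mmio == NULL || !drv->mmio->mapped)
        return -1;
    drv->mmio->regs[0] = 0;     /* e.g. mask all interrupt sources */
    return 0;
}
```

Calling model_irq_uninstall_regs() before model_rmmap() succeeds; calling it afterwards returns -1, which corresponds to the ordering seen when Xorg dies first.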

In my case with i810 it was some subtle thing with error codes somewhere
else (iirc, been a while) which made Xorg fall over a bit differently and
so crash the kernel. Presuming this is the case I think we need a proper
bisect here to figure out the root-cause and re-apply the lost duct-tape.

"Properly" fixing the underlying bug for any sane definition of "proper"
is impossible - the legacy drm driver model is just too broken for that.
And for new-style drivers killing the irq support when they don't expect
it is not cool, since those drivers are sane and assume full control over
all hw interactions (not like their legacy brethren).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Daniel Vetter
On Mon, Mar 24, 2014 at 01:17:12PM -0400, Mikulas Patocka wrote:
>
>
> On Mon, 24 Mar 2014, Daniel Vetter wrote:
>
> > On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote:
> > > On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr a...@lisas.de wrote:
> > > > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> > > >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr a...@lisas.de wrote:
> > > >> >
> > > >> > which did end up flawless on 3.12.0-rc2+, too
> > > >> > (but failed to improve the issue on 3.14.0-rc7+).
> > > >> >
> > > >> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > > >> > (neither dri disable nor libdrm upgrade helps) affected.
> > > >> > Does anyone know which change caused that issue?
> > > >> > (I'm asking because bisect here would be relatively painful).
> > > >>
> > > >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> > > >> 3.14 rc only, or did it happen already in the previous release?
> > > >
> > > > Hmm, given that Mikulas in
> > > > https://lkml.org/lkml/2014/2/26/537
> > > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > > that that indeed may have been a regression at <= 3.13 proper even
> > > > (which may pose interesting questions about the level of testing coverage
> > > > we still enjoy [not!?] in this hardware area).
>
> That patch drops a mutex, so it is not correct. There is mutex recursion -
> we need to uninstall the irq in drm_master_destroy, because here we are
> committed to destroy the device. But the routine that uninstalls the irq
> takes struct_mutex, which is already held in drm_master_destroy.
>
> I suppose that the person who maintains drm reworks the patch so that it's
> correct:
>
> - could we use a different mutex to protect the irq in drm_irq.c? Or
> possibly no mutex at all and use cmpxchg to manipulate the variable
> dev->irq_enabled? - this seems like the best solution. But I am not sure
> if the code in drm_irq.c somehow depends on the fact that other parts of
> the drm subsystem take struct_mutex.
>
> - could we pass a new argument to drm_irq_uninstall that tells it not to
> take the mutex? drm_master_destroy would set this argument to 1.
> drm_master_destroy is mostly called with struct_mutex held, but there may
> be places in vmwgfx_drv.c where drm_master_put (which calls
> drm_master_destroy) may be called without struct_mutex held.
>
> Is it true that drm_master_destroy can be called without struct_mutex
> held? I don't know.
>
>
> I think the drm maintainer should sort out the above issues and modify the
> patch accordingly.
>
> > > > Oh well, seems I'll have to prepare/build 3.13 now...
> > >
> > > It's > 15 year old hardware, so yes I believe we have close to 0
> > > testing coverage on it outside of distros,
> > >
> > > I'm not even sure I have one anymore, I might be able to test an MGA in
> > > one box.
> >
> > I haven't done a full read of all the related code, but this smells like a
> > similar bug I've hit all over the place in the i810 driver (another one of
> > those undead drm drivers of yonders). Ingredients:
> >
> > 1) Xorg creates a drm mapping of the register space.
> > 2) Xorg tells the hw-specific drm which drm mapping has the hw registers,
> > and the driver uses that. Iirc this has been done as some form of OS
> > abstraction. Also note that these mappings aren't refcounted, so the first
> > guy to call drm_rmmap wins.
> >
> > -> All hell breaks loose if Xorg dies and takes all its mappings with it
> > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> > driver hanging in the air if there's an interrupt still pending (or
> > anything else fwiw).
>
> For me that crash happened when xorg exited with a fatal error too.

Is this fatal error itself a regression or have you seen that on older
kernels, too?

Like I've said the entire teardown sequence for legacy drm drivers is
terminally busted, so the only hope we have is to reapply this missing
duct-tape which made your X crash. But if that itself isn't a regression
there's no way to fix the current drm/mga driver without a complete
rewrite as a new-style kernel modesetting driver.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Mikulas Patocka


On Mon, 24 Mar 2014, Daniel Vetter wrote:

> On Mon, Mar 24, 2014 at 01:17:12PM -0400, Mikulas Patocka wrote:
> > > > > Hmm, given that Mikulas in
> > > > > https://lkml.org/lkml/2014/2/26/537
> > > > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > > > that that indeed may have been a regression at <= 3.13 proper even
> > > > > (which may pose interesting questions about the level of testing coverage
> > > > > we still enjoy [not!?] in this hardware area).
> >
> > That patch drops a mutex, so it is not correct. There is mutex recursion -
> > we need to uninstall the irq in drm_master_destroy, because here we are
> > committed to destroy the device. But the routine that uninstalls the irq
> > takes struct_mutex, which is already held in drm_master_destroy.
> >
> > I suppose that the person who maintains drm reworks the patch so that it's
> > correct:
> >
> > - could we use a different mutex to protect the irq in drm_irq.c? Or
> > possibly no mutex at all and use cmpxchg to manipulate the variable
> > dev->irq_enabled? - this seems like the best solution. But I am not sure
> > if the code in drm_irq.c somehow depends on the fact that other parts of
> > the drm subsystem take struct_mutex.
> >
> > - could we pass a new argument to drm_irq_uninstall that tells it not to
> > take the mutex? drm_master_destroy would set this argument to 1.
> > drm_master_destroy is mostly called with struct_mutex held, but there may
> > be places in vmwgfx_drv.c where drm_master_put (which calls
> > drm_master_destroy) may be called without struct_mutex held.
> >
> > Is it true that drm_master_destroy can be called without struct_mutex
> > held? I don't know.
> >
> >
> > I think the drm maintainer should sort out the above issues and modify the
> > patch accordingly.
> >
> > > -> All hell breaks loose if Xorg dies and takes all its mappings with it
> > > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> > > driver hanging in the air if there's an interrupt still pending (or
> > > anything else fwiw).
> >
> > For me that crash happened when xorg exited with a fatal error too.
>
> Is this fatal error itself a regression or have you seen that on older
> kernels, too?

In my case that Xorg error was not kernel-related at all. It happened 
because of unknown symbol because I used mga_dri.so from Debian 6 in 
Debian 7 (mga_dri.so isn't shipped in Debian 7 anymore). I can still play 
quake with that old mga_dri.so, although in some other scenario it causes 
failure because of unknown symbol. I should probably recompile mga_dri on 
my own.

> Like I've said the entire teardown sequence for legacy drm drivers is
> terminally busted, so the only hope we have is to reapply this missing
> duct-tape which made your X crash. But if that itself isn't a regression
> there's no way to fix the current drm/mga driver without a complete
> rewrite as a new-style kernel modesetting driver.
> -Daniel

If someone understands the locking issues I pointed out above, it could be 
easy to fix.

Mikulas


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Daniel Vetter
On Mon, Mar 24, 2014 at 9:40 PM, Mikulas Patocka mpato...@redhat.com wrote:
>> > > -> All hell breaks loose if Xorg dies and takes all its mappings with it
>> > > (in master_destroy, since the Xorg /dev fd is the master) and leaves the
>> > > driver hanging in the air if there's an interrupt still pending (or
>> > > anything else fwiw).
>> >
>> > For me that crash happened when xorg exited with a fatal error too.
>>
>> Is this fatal error itself a regression or have you seen that on older
>> kernels, too?
>
> In my case that Xorg error was not kernel-related at all. It happened
> because of unknown symbol because I used mga_dri.so from Debian 6 in
> Debian 7 (mga_dri.so isn't shipped in Debian 7 anymore). I can still play
> quake with that old mga_dri.so, although in some other scenario it causes
> failure because of unknown symbol. I should probably recompile mga_dri on
> my own.
>
>> Like I've said the entire teardown sequence for legacy drm drivers is
>> terminally busted, so the only hope we have is to reapply this missing
>> duct-tape which made your X crash. But if that itself isn't a regression
>> there's no way to fix the current drm/mga driver without a complete
>> rewrite as a new-style kernel modesetting driver.
>> -Daniel
>
> If someone understands the locking issues I pointed out above, it could be
> easy to fix.

The locking issue isn't your problem, the real issue is that putting a
irq_uninstall into core code will break all the new (properly working)
drivers. And you can't really fix this in mga itself since the
lifetime rules of the register mappings are totally broken. It's a
fundamental misdesign of the legacy drm driver architecture and the
_only_ way to fix this bug for real is to rewrite this all. Which was
done for all the still used drivers like i915, radeon, nouveau, ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-24 Thread Andreas Mohr
Hi,

On Mon, Mar 24, 2014 at 10:46:49PM +0100, Daniel Vetter wrote:
> On Mon, Mar 24, 2014 at 9:40 PM, Mikulas Patocka mpato...@redhat.com wrote:
> > If someone understands the locking issues I pointed out above, it could be
> > easy to fix.
>
> The locking issue isn't your problem, the real issue is that putting a
> irq_uninstall into core code will break all the new (properly working)
> drivers. And you can't really fix this in mga itself since the
> lifetime rules of the register mappings are totally broken. It's a
> fundamental misdesign of the legacy drm driver architecture and the
> _only_ way to fix this bug for real is to rewrite this all. Which was
> done for all the still used drivers like i915, radeon, nouveau, ...

That sounds plausible - yet with meatballs (ok, maybe I should omit
such quite possibly unjustified qualification) such as this:

git show --stat 771fe6b912fca54f03

how is a bunch of marginally-trained hobbyists ever supposed to be implementing
a working practical (i.e., base) driver for *various* currently unsupported
(booted would perhaps even be a more fitting word?) hardware?

While the result of a wc -l check of the drivers/gpu/drm/r128 dir itself
seems quite positive, that still might be not much of help
when eyeing the large KMS changes that had to be done elsewhere.

I guess we can make use of all the practical advice/links that we can get...
(such as hints at good candidates of existing KMSified drivers
which don't come with the full bells and whistles package,
hints at suitably sized KMS support commits, grandma tutorials, ...).
Some semi-short search wasn't overly successful, with links such as
http://www.x.org/wiki/ModeSetting/
https://en.wikipedia.org/wiki/KMS_%28Linux_kernel%29#Linux
New, Generic X.Org KMS Driver Work 
http://www.phoronix.com/scan.php?page=news_item&px=OTk1OA

Or perhaps I should just state outright that I seem to be in need
of a working solution for my kernel upgrade pain
which I would be deemed to want semi-soonish
(the i810, MGA users and some others might be sharing my thoughts).
IOW, my r128 driver is somewhat of a "still used" driver, thank you very much.

Thanks for having managed to survive my posting in an asbestos-lined
garment (apologies if it came across in harsh terms :),

Andreas Mohr
(not necessarily a member of the forced-monopoly hardware upgrade treadmill 
cult)


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Dave Airlie
On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr  wrote:
> On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
>> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
>> >
>> > which did end up flawless on 3.12.0-rc2+, too
>> > (but failed to improve the issue on 3.14.0-rc7+).
>> >
>> > So, for all intents and purposes, drm infrastructure seems unavoidably
>> > (neither dri disable nor libdrm upgrade helps) affected.
>> > Does anyone know which change caused that issue?
>> > (I'm asking because bisect here would be relatively painful).
>>
>> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
>> 3.14 rc only, or did it happen already in the previous release?
>
> Hmm, given that Mikulas in
> https://lkml.org/lkml/2014/2/26/537
> offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> that that indeed may have been a regression at <= 3.13 proper even
> (which may pose interesting questions about the level of testing coverage
> we still enjoy [not!?] in this hardware area).
>
> Oh well, seems I'll have to prepare/build 3.13 now...

It's > 15 year old hardware, so yes I believe we have close to 0
testing coverage on it outside of distros,

I'm not even sure I have one anymore, I might be able to test an MGA in one box.

Dave.


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Andreas Mohr
On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
> >
> > which did end up flawless on 3.12.0-rc2+, too
> > (but failed to improve the issue on 3.14.0-rc7+).
> >
> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > (neither dri disable nor libdrm upgrade helps) affected.
> > Does anyone know which change caused that issue?
> > (I'm asking because bisect here would be relatively painful).
> 
> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> 3.14 rc only, or did it happen already in the previous release?

Hmm, given that Mikulas in
https://lkml.org/lkml/2014/2/26/537
offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
that that indeed may have been a regression at <= 3.13 proper even
(which may pose interesting questions about the level of testing coverage
we still enjoy [not!?] in this hardware area).

Oh well, seems I'll have to prepare/build 3.13 now...

Thanks,

Andreas Mohr


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Linus Torvalds
On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr  wrote:
>
> which did end up flawless on 3.12.0-rc2+, too
> (but failed to improve the issue on 3.14.0-rc7+).
>
> So, for all intents and purposes, drm infrastructure seems unavoidably
> (neither dri disable nor libdrm upgrade helps) affected.
> Does anyone know which change caused that issue?
> (I'm asking because bisect here would be relatively painful).

So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
3.14 rc only, or did it happen already in the previous release?

   Linus


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Andreas Mohr
Hi,

On Sun, Mar 23, 2014 at 12:43:17AM +0100, Andreas Mohr wrote:
> Hi,
> 
> now testing 3.14-rc7 here (r128 hardware rather than MGA),
> and I seem to still be experiencing the same or very similar crash as you 
> here:

I decided to do some more experimentation:

I added a

Section "Module"
Disable "dri"
EndSection

which did reliably disable dri according to 3.12.0-rc2+ xdpyinfo / Xorg.0.log

On -rc7 however, this changed the issue from a kernel OOPS into merely a
Xorg.0.log trace and abort (no such issue on working kernel
given my userspace configuration!):

[68.334] Backtrace:
[68.335] 0: /usr/bin/X (xorg_backtrace+0x49) [0xb7767769]
[68.335] 1: /usr/bin/X (0xb75ea000+0x181186) [0xb776b186]
[68.335] 2: linux-gate.so.1 (__kernel_rt_sigreturn+0x0) [0xe40c]
[68.335] 3: /usr/lib/i386-linux-gnu/libpixman-1.so.0 (0xb743+0x727c0) [0xb74a27c0]
[68.335] 4: /usr/lib/i386-linux-gnu/libpixman-1.so.0 (0xb743+0x57abf) [0xb7487abf]
[68.335] 5: /usr/lib/i386-linux-gnu/libpixman-1.so.0 (pixman_blt+0x7d) [0xb74367ad]
[68.335] 6: /usr/lib/xorg/modules/libfb.so (fbCopyNtoN+0x2af) [0xb6de066f]
[68.335] 7: /usr/bin/X (miCopyRegion+0x17c) [0xb7743f6c]
[68.336] 8: /usr/bin/X (miDoCopy+0x4f0) [0xb77445a0]
[68.336] 9: /usr/lib/xorg/modules/libfb.so (fbCopyArea+0x7e) [0xb6de089e]
[68.336] 10: /usr/lib/xorg/modules/libxaa.so (0xb6f04000+0xacf3) [0xb6f0ecf3]
[68.336] 11: /usr/lib/xorg/modules/libxaa.so (0xb6f04000+0x548d3) [0xb6f588d3]
[68.336] 12: /usr/bin/X (0xb75ea000+0x10929d) [0xb76f329d]
[68.336] 13: /usr/bin/X (0xb75ea000+0x15b60f) [0xb774560f]
[68.336] 14: /usr/bin/X (0xb75ea000+0x16cd21) [0xb7756d21]
[68.336] 15: /usr/bin/X (0xb75ea000+0x16d50c) [0xb775750c]
[68.337] 16: /usr/bin/X (0xb75ea000+0xbe0b7) [0xb76a80b7]
[68.337] 17: /usr/bin/X (miPointerUpdateSprite+0x2a7) [0xb7751687]
[68.337] 18: /usr/bin/X (0xb75ea000+0x16791a) [0xb775191a]
[68.337] 19: /usr/bin/X (0xb75ea000+0xcd1c4) [0xb76b71c4]
[68.337] 20: /usr/bin/X (0xb75ea000+0x101bfe) [0xb76ebbfe]
[68.337] 21: /usr/bin/X (0xb75ea000+0x4437b) [0xb762e37b]
[68.337] 22: /usr/bin/X (WindowHasNewCursor+0x3b) [0xb762f6db]
[68.337] 23: /usr/bin/X (ChangeWindowAttributes+0xb0c) [0xb7655e1c]
[68.338] 24: /usr/bin/X (0xb75ea000+0x35da7) [0xb761fda7]
[68.338] 25: /usr/bin/X (0xb75ea000+0x3c375) [0xb7626375]
[68.338] 26: /usr/bin/X (0xb75ea000+0x29e95) [0xb7613e95]
[68.338] 27: /lib/i386-linux-gnu/i686/cmov/libc.so.6 (__libc_start_main+0xf5) [0xb71ef8f5]
[68.338] 28: /usr/bin/X (0xb75ea000+0x2a1e9) [0xb76141e9]
[68.338] 
[68.338] Bus error at address 0xb4f1ed4e
[68.338] 
Fatal server error:
[68.339] Caught signal 7 (Bus error). Server aborting




I then also decided to update libdrm package:

i   libdrm-dev:i386       2.4.52-1           i386  Userspace interface to kernel DRM services -- development files
ii  libdrm-intel1:i386    2.4.52-1           i386  Userspace interface to intel-specific kernel DRM services -- runtime
rc  libdrm-nouveau1       2.4.21-1~squeeze3  i386  Userspace interface to nouveau-specific kernel DRM services -- runtime
ii  libdrm-nouveau2:i386  2.4.52-1           i386  Userspace interface to nouveau-specific kernel DRM services -- runtime
ii  libdrm-radeon1:i386   2.4.52-1           i386  Userspace interface to radeon-specific kernel DRM services -- runtime
ii  libdrm2:i386          2.4.52-1           i386  Userspace interface to kernel DRM services -- runtime


which did end up flawless on 3.12.0-rc2+, too
(but failed to improve the issue on 3.14.0-rc7+).

So, for all intents and purposes, drm infrastructure seems unavoidably
(neither dri disable nor libdrm upgrade helps) affected.
Does anyone know which change caused that issue?
(I'm asking because bisect here would be relatively painful).

Thanks,

Andreas Mohr


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Linus Torvalds
On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr a...@lisas.de wrote:

> which did end up flawless on 3.12.0-rc2+, too
> (but failed to improve the issue on 3.14.0-rc7+).
>
> So, for all intents and purposes, drm infrastructure seems unavoidably
> (neither dri disable nor libdrm upgrade helps) affected.
> Does anyone know which change caused that issue?
> (I'm asking because bisect here would be relatively painful).

So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
3.14 rc only, or did it happen already in the previous release?

   Linus


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Andreas Mohr
On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr a...@lisas.de wrote:
> >
> > which did end up flawless on 3.12.0-rc2+, too
> > (but failed to improve the issue on 3.14.0-rc7+).
> >
> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > (neither dri disable nor libdrm upgrade helps) affected.
> > Does anyone know which change caused that issue?
> > (I'm asking because bisect here would be relatively painful).
> >
> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> 3.14 rc only, or did it happen already in the previous release?

Hmm, given that Mikulas in
https://lkml.org/lkml/2014/2/26/537
offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
that that indeed may have been a regression at = 3.13 proper even
(which may pose interesting questions about the level of testing coverage
we still enjoy [not!?] in this hardware area).

Oh well, seems I'll have to prepare/build 3.13 now...

Thanks,

Andreas Mohr


Re: 3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-23 Thread Dave Airlie
On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr a...@lisas.de wrote:
> On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
>> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr a...@lisas.de wrote:
>> >
>> > which did end up flawless on 3.12.0-rc2+, too
>> > (but failed to improve the issue on 3.14.0-rc7+).
>> >
>> > So, for all intents and purposes, drm infrastructure seems unavoidably
>> > (neither dri disable nor libdrm upgrade helps) affected.
>> > Does anyone know which change caused that issue?
>> > (I'm asking because bisect here would be relatively painful).
>>
>> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
>> 3.14 rc only, or did it happen already in the previous release?
>
> Hmm, given that Mikulas in
> https://lkml.org/lkml/2014/2/26/537
> offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> that that indeed may have been a regression at = 3.13 proper even
> (which may pose interesting questions about the level of testing coverage
> we still enjoy [not!?] in this hardware area).
>
> Oh well, seems I'll have to prepare/build 3.13 now...

It's  15 year old hardware, so yes I believe we have close to 0
testing coverage on it outside of distros,

I'm not even sure I have one anymore, I might be able to test an MGA in one box.

Dave.


3.14-rc7 crashes in drm ([PATCH] a crash in mga_driver_irq_uninstall)

2014-03-22 Thread Andreas Mohr
Hi,

now testing 3.14-rc7 here (r128 hardware rather than MGA),
and I seem to still be experiencing the same or very similar crash as you here:

agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
pci 0000:01:00.0: putting AGP V2 device into 4x mode
Registering platform device 'r128_cce.0'. Parent at platform
device: 'r128_cce.0': device_add
bus: 'platform': add device r128_cce.0
PM: Adding info for platform:r128_cce.0
__allocate_fw_buf: fw-r128/r128_cce.bin buf=dd9ec800
platform r128_cce.0: firmware: direct-loading firmware r128/r128_cce.bin
fw_set_page_data: fw-r128/r128_cce.bin buf=dd9ec800 data=e07f8000 size=2048
bus: 'platform': remove device r128_cce.0
PM: Removing info for platform:r128_cce.0
fw_name_devm_release: fw_name-r128/r128_cce.bin devm-dd9ccfcc released
__fw_free_buf: fw-r128/r128_cce.bin buf=dd9ec800 data=e07f8000 size=2048
evbug: Event. Dev: input7, Type: 2, Code: 0, Value: 1
evbug: Event. Dev: input7, Type: 2, Code: 1, Value: 1
evbug: Event. Dev: input7, Type: 0, Code: 0, Value: 0
evbug: Event. Dev: input7, Type: 2, Code: 0, Value: 2
evbug: Event. Dev: input7, Type: 0, Code: 0, Value: 0
BUG: unable to handle kernel paging request at e07f0040
IP: [<e293fdb8>] r128_driver_irq_uninstall+0x18/0x1d [r128]
*pde = 1f414067 *pte = 00000000
Oops: 0002 [#1]
Modules linked in: lp r128 drm uinput nls_iso8859_1 nls_cp437 vfat fat radeonfb cfbfillrect cfbimgblt cfbcopyarea i2c_algo_bit fb_ddc i2c_core fb fbdev ppdev loop fuse firewire_sbp2 mcs7830 usbnet usb_storage mii iTCO_wdt iTCO_vendor_support snd_maestro3 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_oss pcmcia snd_seq_midi_event snd_seq snd_seq_device snd_timer microcode firewire_ohci dell_laptop sg dcdbas yenta_socket snd firewire_core sr_mod pcmcia_rsrc psmouse crc_itu_t cdrom pcmcia_core pcspkr video backlight evbug evdev uhci_hcd floppy rtc_cmos ehci_hcd intel_agp intel_gtt usbcore usb_common lpc_ich mfd_core
CPU: 0 PID: 4674 Comm: Xorg Not tainted 3.14.0-rc7+ #9
Hardware name: Dell Computer Corporation Inspiron 8000   /Inspiron 8000, BIOS A23 01/21/2004
task: ded082f0 ti: da6c4000 task.ti: da6c4000
EIP: 0060:[<e293fdb8>] EFLAGS: 00213246 CPU: 0
EIP is at r128_driver_irq_uninstall+0x18/0x1d [r128]
EAX: 00000000 EBX: dd9eb400 ECX: 00000000 EDX: e07f0000
ESI: 0001 EDI: dd9ccd40 EBP: da6c5d48 ESP: da6c5d48
 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
CR0: 80050033 CR2: e07f0040 CR3: 1da53000 CR4: 000007d0
Stack:
 da6c5d78 e284dc79 00000000 e28696c5 e285fddc e28696bd 000b 01cc54c0
 00203202 dd9eb400 dd9eb400 00000000 da6c5d90 e293b163 dd9ec8e0 dd9ec8e8
 dd9eb400 dd9eb400 da6c5d98 e293fc9c da6c5dc0 e284c542 0001 e2869575
Call Trace:
 [<e284dc79>] drm_irq_uninstall+0x119/0x13b [drm]
 [<e293b163>] r128_do_cleanup_cce+0x15/0xb3 [r128]
 [<e293fc9c>] r128_driver_lastclose+0x8/0xa [r128]
 [<e284c542>] drm_lastclose+0x40/0x143 [drm]
 [<e284ca37>] drm_release+0x3f2/0x419 [drm]
 [<c10b4c07>] __fput+0xca/0x185
 [<c10b4ce8>] fput+0x8/0xa
 [<c103c213>] task_work_run+0x4f/0x60
 [<c102a4fc>] do_exit+0x27f/0x6bb
 [<c1032bc0>] ? __sigqueue_free+0x2c/0x2f
 [<c102b4df>] do_group_exit+0x2e/0x65
 [<c1034963>] get_signal_to_deliver+0x420/0x45b
 [<c1033788>] ? __send_signal.constprop.34+0x15a/0x234
 [<c10014a2>] do_signal+0x34/0x6d0
 [<c1033fdf>] ? do_send_specific+0x4a/0x74
 [<c1001b69>] do_notify_resume+0x2b/0x52
 [<c12c5a33>] work_notifysig+0x24/0x29
Code: 50 10 b8 01 00 00 00 89 42 44 5d c3 55 31 c0 89 e5 5d c3 55 8b 80 e8 00 00 00 89 e5 85 c0 74 0e 8b 80 94 00 00 00 8b 50 10 31 c0 <89> 42 40 5d c3 55 ba f0 0c 94 e2 89 e5 b8 68 0d 94 e2 e8 55 15
EIP: [<e293fdb8>] r128_driver_irq_uninstall+0x18/0x1d [r128] SS:ESP 0068:da6c5d48
CR2: e07f0040
---[ end trace 018ccfcd552fb6cf ]---
Fixing recursive fault but reboot is needed!
device: '254:0': device_add
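
The oops above can be cross-checked by hand. The bytes at offset 0x18 into r128_driver_irq_uninstall in the Code: line are 89 42 40, which on x86 is a 32-bit store with an 8-bit displacement. A small Python sketch of that decode follows; the function name is made up for illustration, it handles only this one encoding, and the EDX value of e07f0000 is an assumption (the dump as archived shows it cut short to e07f):

```python
def decode_mov_store_disp8(code):
    """Decode a 3-byte 'mov %reg32, disp8(%base)' (opcode 0x89, ModRM mod=01).

    Only this single encoding is handled; anything else raises ValueError.
    """
    regs = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
    opcode, modrm, disp = code
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x89 or mod != 1 or rm == 4:  # rm == 4 would need a SIB byte
        raise ValueError("unsupported encoding")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return regs[reg], regs[rm], disp


src, base, disp = decode_mov_store_disp8(bytes.fromhex("894240"))
edx = 0xE07F0000  # assumption: EDX was e07f0000 at the time of the fault
print(f"mov %{src}, {disp:#x}(%{base}) -> writes to {edx + disp:#010x}")
```

Under that assumption the store targets e07f0040, which matches CR2, and the error code 0002 indicates a kernel-mode write to a page that is not present. That would be consistent with the uninstall hook writing to a register mapping that is no longer valid, though the dump alone does not prove it.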

Applying your (probably experimental?) posted patch
(thanks for having done the necessary debugging work for me :)
upon next boot made this dump go away,
but I got greeted with a relatively similar:

[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] No driver support for vblank timestamp query.
[drm] Initialized r128 2.5.0 20030725 for 0000:01:00.0 on minor 0
agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
pci 0000:01:00.0: putting AGP V2 device into 4x mode
Registering platform device 'r128_cce.0'. Parent at platform
device: 'r128_cce.0': device_add
bus: 'platform': add device r128_cce.0
PM: Adding info for platform:r128_cce.0
__allocate_fw_buf: fw-r128/r128_cce.bin buf=dda02aa0
platform r128_cce.0: firmware: direct-loading firmware r128/r128_cce.bin
fw_set_page_data: fw-r128/r128_cce.bin buf=dda02aa0 data=e080c000 size=2048
bus: 'platform': remove device r128_cce.0
PM: Removing info for platform:r128_cce.0
fw_name_devm_release: fw_name-r128/r128_cce.bin devm-dda289cc released
__fw_free_buf: