[PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

2015-10-03 Thread Ingo Molnar

* Archit Taneja  wrote:

> 
> 
> On 9/17/2015 2:04 PM, Ingo Molnar wrote:
> >
> >
> >* Ingo Molnar  wrote:
> >
> >
> >
> >>So this patch was whitespace damaged - I applied it by hand and made the 
> >>commit
> >
> >>below. This has solved the crash, thanks Archit!
> >
> >
> >
> >Spoke too soon - the attached (allyesconfig-ish) config still crashes, first 
> >there
> >
> >are a handful of kobject debug warnings, then:
> 
> The error handling in the driver is bad. The main problem is that the
> driver_load op calls mgag200_driver_unload if anything fails, which doesn't
> work well if driver_load fails mid way.
> 
> I'll post out patches to fix this. But you'll need to undo the patch
> I'd sent previously.

Thanks, with Linus's latest kernel that has your fixes included I can no longer 
reproduce the crash.

The MGAG200 driver still generates the following kobject warnings:

[  269.353392] calling  mgag200_init+0x0/0x3b @ 1
[  269.358702] bus: 'pci': add driver mgag200
[  269.363760] bus: 'pci': driver_probe_device: matched device :0b:00.0 
with driver mgag200
[  269.373748] bus: 'pci': really_probe: probing driver mgag200 with device 
:0b:00.0
[  269.383073] mgag200 :0b:00.0: no default pinctrl state
[  269.389590] devices_kset: Moving :0b:00.0 to end of list
[  269.399437] device: 'controlD64': device_add
[  269.404901] PM: Adding info for No Bus:controlD64
[  269.410986] device: 'card0': device_add
[  269.415877] PM: Adding info for No Bus:card0
[  269.431210] [ cut here ]
[  269.436655] WARNING: CPU: 0 PID: 230 at lib/kobject.c:582 
kobject_get+0x33/0x6a()
[  269.445503] kobject: 'ttm' (86c288b0): is not initialized, yet 
kobject_get() is being called.
[  269.456290] Modules linked in:
[  269.460057] CPU: 0 PID: 230 Comm: kworker/0:1 Tainted: GWL  
4.3.0-rc3-02057-g77880ef-dirty #183
[  269.471428] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
SE5C600.86B.02.02.0002.122320131210 12/23/2013
[  269.483389] Workqueue: events work_for_cpu_fn
[  269.488612]   88041fabf9a8 81892127 
88041fabf9f0
[  269.497809]  88041fabf9e0 81145b0f 8189405f 
86c288b0
[  269.506882]   86c288b0 00ff 
88041fabfa40
[  269.515943] Call Trace:
[  269.518951]  [] dump_stack+0x4b/0x64
[  269.524975]  [] warn_slowpath_common+0x9f/0xb8
[  269.531970]  [] ? kobject_get+0x33/0x6a
[  269.538271]  [] warn_slowpath_fmt+0x4c/0x4e
[  269.544976]  [] ? lock_is_held+0x55/0x66
[  269.551382]  [] kobject_get+0x33/0x6a
[  269.557500]  [] kobject_add_internal+0x58/0x2c4
[  269.564589]  [] kobject_init_and_add+0x73/0x7e
[  269.571586]  [] ttm_mem_global_init+0xc6/0x2cd
[  269.578583]  [] ? kasan_poison_shadow+0x2f/0x31
[  269.585674]  [] ? kasan_unpoison_shadow+0x14/0x35
[  269.592958]  [] ? kasan_poison_shadow+0x2f/0x31
[  269.600044]  [] ? kasan_kmalloc+0x4b/0x50
[  269.606544]  [] ? __kmalloc+0x13e/0x180
[  269.612856]  [] ? kasan_poison_shadow+0x2f/0x31
[  269.619945]  [] ? drm_global_item_ref+0x67/0xad
[  269.627056]  [] mgag200_ttm_mem_global_init+0x12/0x14
[  269.634725]  [] drm_global_item_ref+0x7e/0xad
[  269.641619]  [] mgag200_mm_init+0x50/0x199
[  269.648223]  [] mgag200_driver_load+0x34a/0x4c7
[  269.655316]  [] drm_dev_register+0x6f/0xb0
[  269.661923]  [] drm_get_pci_dev+0xff/0x1c2
[  269.668526]  [] mga_pci_probe+0xa6/0xad
[  269.674840]  [] local_pci_probe+0x3d/0x82
[  269.685976]  [] work_for_cpu_fn+0x14/0x1b
[  269.692622]  [] process_one_work+0x28e/0x4ef
[  269.699424]  [] ? process_one_work+0x170/0x4ef
[  269.706438]  [] process_scheduled_works+0x21/0x2f
[  269.713728]  [] worker_thread+0x1fb/0x2bd
[  269.720238]  [] ? cancel_delayed_work_sync+0x15/0x15
[  269.727815]  [] ? cancel_delayed_work_sync+0x15/0x15
[  269.735395]  [] kthread+0xc5/0xcd
[  269.741118]  [] ? kthread_parkme+0x24/0x24
[  269.747727]  [] ret_from_fork+0x3f/0x70
[  269.754041]  [] ? kthread_parkme+0x24/0x24
[  269.760641] ---[ end trace 0609b8147f0ef5df ]---
[  269.766074] [ cut here ]
[  269.771499] WARNING: CPU: 0 PID: 230 at include/linux/kref.h:47 
kobject_get+0x5d/0x6a()
[  269.780921] Modules linked in:
[  269.784691] CPU: 0 PID: 230 Comm: kworker/0:1 Tainted: GWL  
4.3.0-rc3-02057-g77880ef-dirty #183
[  269.796062] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
SE5C600.86B.02.02.0002.122320131210 12/23/2013
[  269.808019] Workqueue: events work_for_cpu_fn
[  269.813274]   88041fabf9f8 81892127 

[  269.822347]  88041fabfa30 81145b0f 81894089 
86c288b0
[  269.831405]   86c288b0 00ff 
88041fabfa40
[  269.840455] Call Trace:
[  269.843459]  [] dump_stack+0x4b/0x64
[  269.849477]  [] warn_slowpath_common+0x9f/0xb8
[  269.856468]  [] ? kobject_get+0x5d/0x6a
[  269.862783]  [] warn_slowpath_null+0x1a/0x1c
[  269.869571]  [] 

[PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

2015-09-17 Thread Sudip Mukherjee
On Thu, Sep 17, 2015 at 04:24:21PM +0530, Archit Taneja wrote:
> 
> 
> On 9/17/2015 2:04 PM, Ingo Molnar wrote:
> >
> >
> >* Ingo Molnar  wrote:
> >
> >
> >
> >>So this patch was whitespace damaged - I applied it by hand and made the 
> >>commit
> >
> >>below. This has solved the crash, thanks Archit!
> >
> >
> >
> >Spoke too soon - the attached (allyesconfig-ish) config still crashes, first 
> >there
> >
> >are a handful of kobject debug warnings, then:
> 
> The error handling in the driver is bad. The main problem is that
> the driver_load op calls mgag200_driver_unload if anything fails,
> which doesn't work well if driver_load fails mid way.
mgag200_driver_unload is trying to unload everything evenif that has not
succeeded in initializing. Here the ttm failed to initialize but
still mgag200_mm_fini was called to unload it.

regards
sudip


[PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

2015-09-17 Thread Archit Taneja


On 9/17/2015 2:04 PM, Ingo Molnar wrote:
>
>
> * Ingo Molnar  wrote:
>
>
>
>> So this patch was whitespace damaged - I applied it by hand and made the 
>> commit
>
>> below. This has solved the crash, thanks Archit!
>
>
>
> Spoke too soon - the attached (allyesconfig-ish) config still crashes, first 
> there
>
> are a handful of kobject debug warnings, then:

The error handling in the driver is bad. The main problem is that the 
driver_load op calls mgag200_driver_unload if anything fails, which 
doesn't work well if driver_load fails mid way.

I'll post out patches to fix this. But you'll need to undo the patch
I'd sent previously.

Thanks,
Archit

>
>
>
> [  115.274847] [drm:mgag200_mm_init] *ERROR* Failed setting up TTM memory 
> accounting subsystem.
>
> [  115.274853] BUG: unable to handle kernel NULL pointer dereference at   
> (null)
>
> [  115.274856] IP: [] drm_mode_config_cleanup+0x1f/0x1cc
>
> [  115.274858] PGD 0
>
> [  115.274860] Oops:  [#1] SMP DEBUG_PAGEALLOC KASAN
>
> [  115.274861] Modules linked in:
>
> [  115.274862] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: GWL  
> 4.3.0-rc1-01729-g525850e-dirty #59
>
> [  115.274863] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
> SE5C600.86B.02.02.0002.122320131210 12/23/2013
>
> [  115.274865] Workqueue: events work_for_cpu_fn
>
> [  115.274865] task: 88017d2a8000 ti: 88017d2b task.ti: 
> 88017d2b
>
> [  115.274868] RIP: 0010:[]  [] 
> drm_mode_config_cleanup+0x1f/0x1cc
>
> [  115.274868] RSP: :88017d2b7bd8  EFLAGS: 00010246
>
> [  115.274869] RAX:  RBX: 8800ba0a4520 RCX: 
> 81186ff1
>
> [  115.274870] RDX: 88017d2b7a90 RSI: 83d73579 RDI: 
> 8800ba0a4520
>
> [  115.274870] RBP: 88017d2b7bf0 R08: 0001 R09: 
> 
>
> [  115.274871] R10: 00075000 R11: 85147a82 R12: 
> 8800ba0a4520
>
> [  115.274872] R13: 8800ba0a4da8 R14: fffe R15: 
> 8800ba0a3400
>
> [  115.274873] FS:  () GS:880420e0() 
> knlGS:
>
> [  115.274874] CS:  0010 DS:  ES:  CR0: 80050033
>
> [  115.274875] CR2:  CR3: 05e2d000 CR4: 
> 001406f0
>
> [  115.274876] Stack:
>
> [  115.274877]  8800ba0a3400 8800ba0a4520 fffe 
> 88017d2b7c10
>
> [  115.274879]  81cb6a3e 8800ba0a4520 8800ba0a3400 
> 88017d2b7c78
>
> [  115.274881]  81cb6efe 07200100 aa55 
> 075b
>
> [  115.274881] Call Trace:
>
> [  115.274883]  [] mgag200_driver_unload+0x30/0x48
>
> [  115.274884]  [] mgag200_driver_load+0x4a8/0x4ba
>
> [  115.274886]  [] drm_dev_register+0x6f/0xb0
>
> [  115.274887]  [] drm_get_pci_dev+0xff/0x1c2
>
> [  115.274889]  [] mga_pci_probe+0xa6/0xad
>
> [  115.274890]  [] local_pci_probe+0x3d/0x82
>
> [  115.274891]  [] work_for_cpu_fn+0x14/0x1b
>
> [  115.274893]  [] process_one_work+0x28e/0x4ef
>
> [  115.274895]  [] ? process_one_work+0x170/0x4ef
>
> [  115.274897]  [] process_scheduled_works+0x21/0x2f
>
>
>
> Full log is below, config attached as well.
>
>
>
> Thanks,
>
>
>
>   Ingo
>
>
>
> >
>
> [  115.243547] mgag200 :0b:00.0: no default pinctrl state
>
> [  115.243732] devices_kset: Moving :0b:00.0 to end of list
>
> [  115.247078] device: 'controlD64': device_add
>
> [  115.247494] PM: Adding info for No Bus:controlD64
>
> [  115.247979] device: 'card0': device_add
>
> [  115.248315] PM: Adding info for No Bus:card0
>
> [  115.274511] [ cut here ]
>
> [  115.274518] WARNING: CPU: 0 PID: 4 at lib/kobject.c:582 
> kobject_get+0x33/0x6a()
>
> [  115.274519] kobject: 'ttm' (86c28700): is not initialized, yet 
> kobject_get() is being called.
>
> [  115.274521] Modules linked in:
>
> [  115.274524] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: GWL  
> 4.3.0-rc1-01729-g525850e-dirty #59
>
> [  115.274525] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
> SE5C600.86B.02.02.0002.122320131210 12/23/2013
>
> [  115.274529] Workqueue: events work_for_cpu_fn
>
> [  115.274532]   88017d2b79a8 8188f38b 
> 88017d2b79f0
>
> [  115.274534]  88017d2b79e0 81144593 8189129f 
> 86c28700
>
> [  115.274536]   86c28700 007f4000 
> 88017d2b7a40
>
> [  115.274537] Call Trace:
>
> [  115.274540]  [] dump_stack+0x4b/0x64
>
> [  115.274543]  [] warn_slowpath_common+0x9f/0xb8
>
> [  115.274545]  [] ? kobject_get+0x33/0x6a
>
> [  115.274547]  [] warn_slowpath_fmt+0x4c/0x4e
>
> [  115.274551]  [] ? lock_is_held+0x55/0x66
>
> [  115.274553]  [] kobject_get+0x33/0x6a
>
> [  115.274554]  [] kobject_add_internal+0x58/0x2c4
>
> [  115.274556]  [] kobject_init_and_add+0x73/0x7e
>
> [  115.274559]  [] ttm_mem_global_init+0xc6/0x2cd
>
> [  115.274563]  [] ? kasan_poison_shadow+0x2f/0x31
>
> [  115.274564]  [] 

[PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

2015-09-17 Thread Ingo Molnar

* Ingo Molnar  wrote:

> So this patch was whitespace damaged - I applied it by hand and made the 
> commit 
> below. This has solved the crash, thanks Archit!

Spoke too soon - the attached (allyesconfig-ish) config still crashes, first 
there 
are a handful of kobject debug warnings, then:

[  115.274847] [drm:mgag200_mm_init] *ERROR* Failed setting up TTM memory 
accounting subsystem.
[  115.274853] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[  115.274856] IP: [] drm_mode_config_cleanup+0x1f/0x1cc
[  115.274858] PGD 0 
[  115.274860] Oops:  [#1] SMP DEBUG_PAGEALLOC KASAN
[  115.274861] Modules linked in:
[  115.274862] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: GWL  
4.3.0-rc1-01729-g525850e-dirty #59
[  115.274863] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
SE5C600.86B.02.02.0002.122320131210 12/23/2013
[  115.274865] Workqueue: events work_for_cpu_fn
[  115.274865] task: 88017d2a8000 ti: 88017d2b task.ti: 
88017d2b
[  115.274868] RIP: 0010:[]  [] 
drm_mode_config_cleanup+0x1f/0x1cc
[  115.274868] RSP: :88017d2b7bd8  EFLAGS: 00010246
[  115.274869] RAX:  RBX: 8800ba0a4520 RCX: 81186ff1
[  115.274870] RDX: 88017d2b7a90 RSI: 83d73579 RDI: 8800ba0a4520
[  115.274870] RBP: 88017d2b7bf0 R08: 0001 R09: 
[  115.274871] R10: 00075000 R11: 85147a82 R12: 8800ba0a4520
[  115.274872] R13: 8800ba0a4da8 R14: fffe R15: 8800ba0a3400
[  115.274873] FS:  () GS:880420e0() 
knlGS:
[  115.274874] CS:  0010 DS:  ES:  CR0: 80050033
[  115.274875] CR2:  CR3: 05e2d000 CR4: 001406f0
[  115.274876] Stack:
[  115.274877]  8800ba0a3400 8800ba0a4520 fffe 
88017d2b7c10
[  115.274879]  81cb6a3e 8800ba0a4520 8800ba0a3400 
88017d2b7c78
[  115.274881]  81cb6efe 07200100 aa55 
075b
[  115.274881] Call Trace:
[  115.274883]  [] mgag200_driver_unload+0x30/0x48
[  115.274884]  [] mgag200_driver_load+0x4a8/0x4ba
[  115.274886]  [] drm_dev_register+0x6f/0xb0
[  115.274887]  [] drm_get_pci_dev+0xff/0x1c2
[  115.274889]  [] mga_pci_probe+0xa6/0xad
[  115.274890]  [] local_pci_probe+0x3d/0x82
[  115.274891]  [] work_for_cpu_fn+0x14/0x1b
[  115.274893]  [] process_one_work+0x28e/0x4ef
[  115.274895]  [] ? process_one_work+0x170/0x4ef
[  115.274897]  [] process_scheduled_works+0x21/0x2f

Full log is below, config attached as well.

Thanks,

Ingo

>
[  115.243547] mgag200 :0b:00.0: no default pinctrl state
[  115.243732] devices_kset: Moving :0b:00.0 to end of list
[  115.247078] device: 'controlD64': device_add
[  115.247494] PM: Adding info for No Bus:controlD64
[  115.247979] device: 'card0': device_add
[  115.248315] PM: Adding info for No Bus:card0
[  115.274511] [ cut here ]
[  115.274518] WARNING: CPU: 0 PID: 4 at lib/kobject.c:582 
kobject_get+0x33/0x6a()
[  115.274519] kobject: 'ttm' (86c28700): is not initialized, yet 
kobject_get() is being called.
[  115.274521] Modules linked in:
[  115.274524] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: GWL  
4.3.0-rc1-01729-g525850e-dirty #59
[  115.274525] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
SE5C600.86B.02.02.0002.122320131210 12/23/2013
[  115.274529] Workqueue: events work_for_cpu_fn
[  115.274532]   88017d2b79a8 8188f38b 
88017d2b79f0
[  115.274534]  88017d2b79e0 81144593 8189129f 
86c28700
[  115.274536]   86c28700 007f4000 
88017d2b7a40
[  115.274537] Call Trace:
[  115.274540]  [] dump_stack+0x4b/0x64
[  115.274543]  [] warn_slowpath_common+0x9f/0xb8
[  115.274545]  [] ? kobject_get+0x33/0x6a
[  115.274547]  [] warn_slowpath_fmt+0x4c/0x4e
[  115.274551]  [] ? lock_is_held+0x55/0x66
[  115.274553]  [] kobject_get+0x33/0x6a
[  115.274554]  [] kobject_add_internal+0x58/0x2c4
[  115.274556]  [] kobject_init_and_add+0x73/0x7e
[  115.274559]  [] ttm_mem_global_init+0xc6/0x2cd
[  115.274563]  [] ? kasan_poison_shadow+0x2f/0x31
[  115.274564]  [] ? kasan_unpoison_shadow+0x14/0x35
[  115.274566]  [] ? kasan_poison_shadow+0x2f/0x31
[  115.274567]  [] ? kasan_kmalloc+0x4b/0x50
[  115.274569]  [] ? __kmalloc+0x13e/0x180
[  115.274571]  [] ? kasan_poison_shadow+0x2f/0x31
[  115.274573]  [] ? drm_global_item_ref+0x67/0xad
[  115.274577]  [] mgag200_ttm_mem_global_init+0x12/0x14
[  115.274579]  [] drm_global_item_ref+0x7e/0xad
[  115.274581]  [] mgag200_mm_init+0x50/0x199
[  115.274583]  [] mgag200_driver_load+0x34a/0x4ba
[  115.274587]  [] drm_dev_register+0x6f/0xb0
[  115.274589]  [] drm_get_pci_dev+0xff/0x1c2
[  115.274590]  [] mga_pci_probe+0xa6/0xad
[  115.274593]  [] local_pci_probe+0x3d/0x82
[  115.274595]  [] 

[PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

2015-09-16 Thread Ingo Molnar

* Archit Taneja  wrote:

> From: Archit Taneja 
> Date: Mon, 14 Sep 2015 20:11:43 +0530
> Subject: [PATCH] drm/mgag200: Prevent calling drm_fb_helper_fini twice
> 
> mgag200_fbdev_init's error handling path calls drm_fb_helper_fini before
> bailing out. The error handling path of mgag200_driver_load also ends
> up calling drm_fb_helper_fini.
> 
> This results in drm_fb_helper_fini being called twice if the driver load
> drm op fails somewhere in between.
> 
> Make only mgag200_driver_unload call drm_fb_helper_fini, remove the call
> from mgag200_fbdev_init.
> 
> Signed-off-by: Archit Taneja 
> ---
>  drivers/gpu/drm/mgag200/mgag200_fb.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/mgag200/mgag200_fb.c
> b/drivers/gpu/drm/mgag200/mgag200_fb.c
> index 87de15e..6259b0a 100644
> --- a/drivers/gpu/drm/mgag200/mgag200_fb.c
> +++ b/drivers/gpu/drm/mgag200/mgag200_fb.c
> @@ -280,20 +280,16 @@ int mgag200_fbdev_init(struct mga_device *mdev)
> 
>   ret = drm_fb_helper_single_add_all_connectors(>helper);
>   if (ret)
> - goto fini;
> + return ret;
> 
>   /* disable all the possible outputs/crtcs before entering KMS mode */
>   drm_helper_disable_unused_functions(mdev->dev);
> 
>   ret = drm_fb_helper_initial_config(>helper, bpp_sel);
>   if (ret)
> - goto fini;
> + return ret;
> 
>   return 0;
> -
> -fini:
> - drm_fb_helper_fini(>helper);
> - return ret;
>  }
> 
>  void mgag200_fbdev_fini(struct mga_device *mdev)

So this patch was whitespace damaged - I applied it by hand and made the commit 
below. This has solved the crash, thanks Archit!

And yes, you are right that my config probably crashed with older kernels too.

Ingo

=>
>From 60d733a3ec19dc72372e12207a0a86293cd40cf5 Mon Sep 17 00:00:00 2001
From: Archit Taneja <arch...@codeaurora.org>
Date: Mon, 14 Sep 2015 20:53:57 +0530
Subject: [PATCH] drm/mgag200: Fix calling drm_fb_helper_fini() twice

mgag200_fbdev_init()'s error handling path calls
drm_fb_helper_fini() before bailing out. The error handling path
of mgag200_driver_load() also ends up calling drm_fb_helper_fini().

This results in drm_fb_helper_fini() being called twice if the
driver load drm op fails somewhere in between.

Make only mgag200_driver_unload() call drm_fb_helper_fini(), remove
the call from mgag200_fbdev_init().

Reported-by: Ingo Molnar 
Signed-off-by: Archit Taneja 
Cc: Archit Taneja 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: David Airlie 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Sudip Mukherjee 
Cc: Thomas Gleixner 
Cc: dri-devel 
Link: http://lkml.kernel.org/r/55F6E68D.8070800 at codeaurora.org
Signed-off-by: Ingo Molnar 
---
 drivers/gpu/drm/mgag200/mgag200_fb.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_fb.c 
b/drivers/gpu/drm/mgag200/mgag200_fb.c
index 87de15ea1f93..6259b0a5fc38 100644
--- a/drivers/gpu/drm/mgag200/mgag200_fb.c
+++ b/drivers/gpu/drm/mgag200/mgag200_fb.c
@@ -280,20 +280,16 @@ int mgag200_fbdev_init(struct mga_device *mdev)

ret = drm_fb_helper_single_add_all_connectors(>helper);
if (ret)
-   goto fini;
+   return ret;

/* disable all the possible outputs/crtcs before entering KMS mode */
drm_helper_disable_unused_functions(mdev->dev);

ret = drm_fb_helper_initial_config(>helper, bpp_sel);
if (ret)
-   goto fini;
+   return ret;

return 0;
-
-fini:
-   drm_fb_helper_fini(>helper);
-   return ret;
 }

 void mgag200_fbdev_fini(struct mga_device *mdev)