Re: [PATCH] video: fbdev: arkfb: fix possible object reference leak

2023-10-06 Thread Ondrej Zary
On Friday 06 October 2023, Helge Deller wrote:
> On 10/5/23 09:01, Zhang Shurong wrote:
> > Add missing pci_disable_device() in error path in ark_pci_probe().
> 
> Do you have this hardware and tested your patch?
> I'm sure there is a reason, why "pci_disable_device()" was commented
> out in the original submission in commit 681e14730c73c...

The pci_disable_device() call is commented out in many fbdev drivers because
calling it might prevent the display from working.

> 
> Additionally I'm wondering why your patch doesn't show up in
> the fbdev patchwork, although you added linux-fbdev mailing list.
> Probably a vger issue.
> 
> Helge
> 
> 
> > Signed-off-by: Zhang Shurong 
> > ---
> >   drivers/video/fbdev/arkfb.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/video/fbdev/arkfb.c b/drivers/video/fbdev/arkfb.c
> > index 60a96fdb5dd8..6c4e5065646f 100644
> > --- a/drivers/video/fbdev/arkfb.c
> > +++ b/drivers/video/fbdev/arkfb.c
> > @@ -1064,7 +1064,7 @@ static int ark_pci_probe(struct pci_dev *dev, const 
> > struct pci_device_id *id)
> >   err_dac:
> > pci_release_regions(dev);
> >   err_request_regions:
> > -/* pci_disable_device(dev); */
> > +   pci_disable_device(dev);
> >   err_enable_device:
> > framebuffer_release(info);
> > return rc;
> > @@ -1085,7 +1085,7 @@ static void ark_pci_remove(struct pci_dev *dev)
> >
> > pci_iounmap(dev, info->screen_base);
> > pci_release_regions(dev);
> > -/* pci_disable_device(dev); */
> > +   pci_disable_device(dev);
> >
> > framebuffer_release(info);
> > }
> 
> 



-- 
Ondrej Zary


Re: [PATCH v4 1/4] video: fbdev: atyfb: only use ioremap_uc() on i386 and ia64

2023-03-08 Thread Ondrej Zary
On Wednesday 08 March 2023 21:01:09 Luis Chamberlain wrote:
> On Wed, Mar 08, 2023 at 09:07:07PM +0800, Baoquan He wrote:
> > From: Arnd Bergmann 
> > 
> > ioremap_uc() is only meaningful on old x86-32 systems with the PAT
> > extension, and on ia64 with its slightly unconventional ioremap()
> > behavior, everywhere else this is the same as ioremap() anyway.
> > 
> > Change the only driver that still references ioremap_uc() to only do so
> > on x86-32/ia64 in order to allow removing that interface at some
> > point in the future for the other architectures.
> > 
> > On some architectures, ioremap_uc() just returns NULL, changing
> > the driver to call ioremap() means that they now have a chance
> > of working correctly.
> > 
> > Signed-off-by: Arnd Bergmann 
> > Signed-off-by: Baoquan He 
> > Cc: Helge Deller 
> > Cc: Thomas Zimmermann 
> > Cc: Christophe Leroy 
> > Cc: linux-fb...@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> 
> Reviewed-by: Luis Chamberlain 
> 
> Is anyone using this driver these days? How often do fbdev drivers get
> audited to see what can be nuked?

Older servers have integrated ATI Rage XL chips and this is the only driver for 
them.

-- 
Ondrej Zary


[PATCH] fbdev: i740fb: use memset_io() to clear screen

2022-04-10 Thread Ondrej Zary
sparse complains that using memset() on __iomem pointer is wrong:
incorrect type in argument 1 (different address spaces)

Use memset_io() to clear screen instead.

Tested on real i740 cards.

Signed-off-by: Ondrej Zary 
---
 drivers/video/fbdev/i740fb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/i740fb.c b/drivers/video/fbdev/i740fb.c
index 52cce0db8bd3..dd45ea8203be 100644
--- a/drivers/video/fbdev/i740fb.c
+++ b/drivers/video/fbdev/i740fb.c
@@ -740,7 +740,7 @@ static int i740fb_set_par(struct fb_info *info)
if (i)
return i;
 
-   memset(info->screen_base, 0, info->screen_size);
+   memset_io(info->screen_base, 0, info->screen_size);
 
vga_protect(par);
 
-- 
Ondrej Zary



Re: [PATCH 1/7] video: fbdev: i740fb: Error out if 'pixclock' equals zero

2022-04-10 Thread Ondrej Zary
On Friday 08 April 2022 03:58:10 Zheyu Ma wrote:
> On Fri, Apr 8, 2022 at 3:50 AM Helge Deller  wrote:
> >
> > On 4/4/22 10:47, Zheyu Ma wrote:
> > > The userspace program could pass any values to the driver through
> > > ioctl() interface. If the driver doesn't check the value of 'pixclock',
> > > it may cause divide error.
> > >
> > > Fix this by checking whether 'pixclock' is zero in the function
> > > i740fb_check_var().
> > >
> > > The following log reveals it:
> > >
> > > > divide error: 0000 [#1] PREEMPT SMP KASAN PTI
> > > RIP: 0010:i740fb_decode_var drivers/video/fbdev/i740fb.c:444 [inline]
> > > RIP: 0010:i740fb_set_par+0x272f/0x3bb0 drivers/video/fbdev/i740fb.c:739
> > > Call Trace:
> > > fb_set_var+0x604/0xeb0 drivers/video/fbdev/core/fbmem.c:1036
> > > do_fb_ioctl+0x234/0x670 drivers/video/fbdev/core/fbmem.c:1112
> > > fb_ioctl+0xdd/0x130 drivers/video/fbdev/core/fbmem.c:1191
> > > vfs_ioctl fs/ioctl.c:51 [inline]
> > > __do_sys_ioctl fs/ioctl.c:874 [inline]
> > >
> > > Signed-off-by: Zheyu Ma 
> >
> > Hello Zheyu,
> >
> > I've applied the patches #2-#7 of this series, but left
> > out this specific patch (for now).
> > As discussed on the mailing list we can try to come up with a
> > better fix (to round up the pixclock when it's invalid).
> > If not, I will apply this one later.
> 
> I'm also looking forward to a more appropriate patch for this driver!

I was not able to reproduce it at first but finally found it: the monitor must 
be unplugged. If a valid EDID is present, the fb_validate_mode() call in 
i740fb_check_var() will refuse a zero pixclock.

I haven't found any obvious way to correct a zero pixclock value. Most other 
drivers simply return -EINVAL.
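
For reference, the -EINVAL approach boils down to something like the sketch
below. This is a standalone userspace model, not the driver code: struct
fake_var, check_pixclock() and pixclock_mhz() are made-up names, and the real
check would live in i740fb_check_var() and work on struct fb_var_screeninfo.

```c
#include <errno.h>

/* Hypothetical stand-in for the one field of struct fb_var_screeninfo
 * that matters here. pixclock is the pixel period in picoseconds. */
struct fake_var {
	unsigned int pixclock;
};

/* Reject a zero pixclock before anything divides by it.
 * Returns 0 if the value is usable, -EINVAL otherwise. */
static int check_pixclock(const struct fake_var *var)
{
	if (!var->pixclock)
		return -EINVAL;
	return 0;
}

/* Convert the pixel period (ps) to a pixel clock in MHz.
 * Only safe to call after check_pixclock() has returned 0. */
static unsigned int pixclock_mhz(const struct fake_var *var)
{
	return 1000000 / var->pixclock;
}
```

With the check in place, a zero value never reaches the division; rounding up
to a valid value instead would need a per-driver notion of the nearest
supported pixclock, which this sketch does not attempt.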

> Thanks,
> Zheyu Ma
> 


-- 
Ondrej Zary


Re: [BUG] fbdev: i740fb: Divide error when ‘var->pixclock’ is zero

2022-04-05 Thread Ondrej Zary



On Tuesday 05 April 2022 08:33:57 Helge Deller wrote:
> Hello Geert,
> 
> On 4/4/22 13:46, Geert Uytterhoeven wrote:
> > Hi Helge,
> >
> > On Sun, Apr 3, 2022 at 5:41 PM Helge Deller  wrote:
> >> On 4/3/22 13:26, Zheyu Ma wrote:
> >>> I found a bug in the function i740fb_set_par().
> >>
> >> Nice catch!
> >>
> >>> When the user calls the ioctl system call without setting the value to
> >>> 'var->pixclock', the driver will throw a divide error.
> >>>
> >>> This bug occurs because the driver uses the value of 'var->pixclock'
> >>> without checking it, as the following code snippet shows:
> >>>
> >>> if ((1000000 / var->pixclock) > DACSPEED8) {
> >>>  dev_err(info->device, "requested pixclock %i MHz out of range
> >>> (max. %i MHz at 8bpp)\n",
> >>>  1000000 / var->pixclock, DACSPEED8);
> >>> return -EINVAL;
> >>> }
> >>>
> >>> We can fix this by checking the value of 'var->pixclock' in the
> >>> function i740fb_check_var() similar to commit
> >>> b36b242d4b8ea178f7fd038965e3cac7f30c3f09, or we should set the lowest
> >>> supported value when this field is zero.
> >>> I have no idea about which solution is better.
> >>
> >> Me neither.
> >> I think a solution like commit b36b242d4b8ea178f7fd038965e3cac7f30c3f09
> >> is sufficient.
> >>
> >> Note that i740fb_set_par() is called in i740fb_resume() as well.
> >> Since this doesn't come from userspace I think adding a check for
> >> the return value there isn't necessary.
> >>
> >> Would you mind sending a patch like 
> >> b36b242d4b8ea178f7fd038965e3cac7f30c3f09 ?
> >
> > When passed an invalid value, .check_var() is supposed to
> > round up the invalid to a valid value, if possible.
> 
> I don't disagree.
> The main problem probably is: what is the next valid value?
> This needs to be analyzed on a per-driver basis and ideally tested.
> Right now a division-by-zero is triggered, which is probably worse.

I still have an i740 card so I can test it.

> That said, currently I'd prefer to apply the zero-checks patches over
> any untested patches. It's easy to revert such checks if a better solution
> becomes available.
> 
> Thoughts?
> 
> > Commit b36b242d4b8ea178 ("video: fbdev: asiliantfb: Error out if
> > 'pixclock' equals zero") does not do that.
> 
> Helge
> 


-- 
Ondrej Zary


Re: [Nouveau] nouveau broken again on Riva TNT2 in 5.14.0-rc2

2021-07-23 Thread Ondrej Zary
On Friday 23 July 2021 09:26:10 Daniel Vetter wrote:
> On Thu, Jul 22, 2021 at 9:51 PM Karol Herbst  wrote:
> >
> > hey thanks for the report.
> >
> > This is a known issue and the fix is pending in drm-misc-fixes and
> > should land in 5.14 soonish.
> 
> It just landed in Linus' tree yesterday, please retest that or -rc3.
> If it's still broken it's something else.
> -Daniel

Thanks, it works!

-- 
Ondrej Zary


nouveau broken again on Riva TNT2 in 5.14.0-rc2

2021-07-22 Thread Ondrej Zary
Hello,
nouveau is broken again:

[   58.795794] BUG: kernel NULL pointer dereference, address: 0000017c
[   58.795835] #PF: supervisor read access in kernel mode
[   58.795844] #PF: error_code(0x0000) - not-present page
[   58.795851] *pde = 00000000
[   58.795862] Oops: 0000 [#1] SMP
[   58.795875] CPU: 0 PID: 1730 Comm: Xorg Not tainted 5.14.0-rc2+ #391
[   58.795886] Hardware name: VIA Technologies, Inc. VT82C694X/694X, BIOS 6.00 
PG 02/19/2002
[   58.795894] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
[   58.796716] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 c0 01 00 00 fe 89 f0 
e8 e5 ee ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 c3 <03> 93 7c 
01 00 00 0f b7 c1 f6 83 84 01 00 00 80 74 07 e8 8a bc 72
[   58.796728] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
[   58.796736] ESI: 00000020 EDI: c18bc600 EBP: c7c49d88 ESP: c7c49d84
[   58.796744] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210246
[   58.796754] CR0: 80050033 CR2: 0000017c CR3: 07e12000 CR4: 00000690
[   58.796762] Call Trace:
[   58.796774]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
[   58.796952]  ? ttm_bo_reserve.constprop.16+0x1c/0x1c [nouveau]
[   58.797122]  drm_mode_cursor_common+0x13b/0x1ad
[   58.797150]  ? ttm_bo_reserve.constprop.16+0x1c/0x1c [nouveau]
[   58.797322]  drm_mode_cursor_ioctl+0x2e/0x36
[   58.797335]  ? drm_mode_setplane+0x203/0x203
[   58.797346]  drm_ioctl_kernel+0x66/0x99
[   58.797366]  drm_ioctl+0x211/0x2d8
[   58.797377]  ? drm_mode_setplane+0x203/0x203
[   58.797389]  ? __cond_resched+0x1e/0x22
[   58.797409]  ? mutex_lock+0xb/0x24
[   58.797422]  ? rpm_resume.part.14+0x6f/0x362
[   58.797447]  ? ktime_get_mono_fast_ns+0x5e/0xf2
[   58.797469]  ? __pm_runtime_resume+0x5b/0x63
[   58.797480]  nouveau_drm_ioctl+0x65/0x81 [nouveau]
[   58.797662]  ? nouveau_cli_work+0xc3/0xc3 [nouveau]
[   58.797838]  vfs_ioctl+0x1a/0x24
[   58.797850]  __ia32_sys_ioctl+0x6ea/0x704
[   58.797861]  ? doublefault_shim+0x120/0x120
[   58.797872]  ? exit_to_user_mode_prepare+0x9e/0x10c
[   58.797900]  do_int80_syscall_32+0x53/0x6e
[   58.797910]  entry_INT80_32+0xf0/0xf0
[   58.797923] EIP: 0xb7f04092
[   58.797932] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 
e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80  8d b4 
26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
[   58.797943] EAX: ffffffda EBX: 0000000e ECX: c01c64a3 EDX: bf9a15c0
[   58.797952] ESI: 00997850 EDI: c01c64a3 EBP: 0000000e ESP: bf9a1574
[   58.797959] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[   58.797972] Modules linked in: i2c_dev nouveau wmi hwmon drm_ttm_helper 
psmouse serio_raw via_agp sg parport_pc 8139cp parport
[   58.798016] CR2: 0000017c
[   58.798147] ---[ end trace 732829d39ed65de9 ]---


d02117f8efaa5fbc37437df1ae955a147a2a424a is the first bad commit

-- 
Ondrej Zary


Re: [PATCH] drm/nouveau: fix dma_address check for CPU/GPU sync

2021-06-14 Thread Ondrej Zary
On Monday 14 June 2021 13:05:17 Christian König wrote:
> AGP for example doesn't have a dma_address array.
> 
> Signed-off-by: Christian König 

Fixes NULL pointer dereference in nouveau_bo_sync_for_device on AGP cards.

Tested-by: Ondrej Zary 

> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
> b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 3e09df0472ce..170aba99a110 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -546,7 +546,7 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
>   struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
>   int i, j;
>  
> - if (!ttm_dma)
> + if (!ttm_dma || !ttm_dma->dma_address)
>   return;
>   if (!ttm_dma->pages) {
>   NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);
> @@ -582,7 +582,7 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
>   struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
>   int i, j;
>  
> - if (!ttm_dma)
> + if (!ttm_dma || !ttm_dma->dma_address)
>   return;
>   if (!ttm_dma->pages) {
>   NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);


-- 
Ondrej Zary


Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-11 Thread Ondrej Zary
On Friday 11 June 2021 14:38:18 Christian König wrote:
> 
> Am 10.06.21 um 19:59 schrieb Christian König:
> > Am 10.06.21 um 19:50 schrieb Ondrej Zary:
> >> [SNIP]
> >>> I can't see how this is called from the nouveau code, only 
> >>> possibility I
> >>> see is that it is maybe called through the AGP code somehow.
> >> Yes, you're right:
> >> [   13.192663] Call Trace:
> >> [   13.192678]  dump_stack+0x54/0x68
> >> [   13.192690]  ttm_tt_init+0x11/0x8a [ttm]
> >> [   13.192699]  ttm_agp_tt_create+0x39/0x51 [ttm]
> >> [   13.192840]  nouveau_ttm_tt_create+0x17/0x22 [nouveau]
> >> [   13.192856]  ttm_tt_create+0x78/0x8c [ttm]
> >> [   13.192864]  ttm_bo_handle_move_mem+0x7d/0xca [ttm]
> >> [   13.192873]  ttm_bo_validate+0x92/0xc8 [ttm]
> >> [   13.192883]  ttm_bo_init_reserved+0x216/0x243 [ttm]
> >> [   13.192892]  ttm_bo_init+0x45/0x65 [ttm]
> >> [   13.193018]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
> >> [   13.193150]  nouveau_bo_init+0x8c/0x94 [nouveau]
> >> [   13.193273]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
> >> [   13.193407]  nouveau_bo_new+0x44/0x57 [nouveau]
> >> [   13.193537]  nouveau_channel_prep+0xa3/0x269 [nouveau]
> >> [   13.193665]  nouveau_channel_new+0x3c/0x5f7 [nouveau]
> >> [   13.193679]  ? slab_free_freelist_hook+0x3b/0xa7
> >> [   13.193686]  ? kfree+0x9e/0x11a
> >> [   13.193781]  ? nvif_object_sclass_put+0xd/0x16 [nouveau]
> >> [   13.193908]  nouveau_drm_device_init+0x2e2/0x646 [nouveau]
> >> [   13.193924]  ? pci_enable_device_flags+0x1e/0xac
> >> [   13.194052]  nouveau_drm_probe+0xeb/0x188 [nouveau]
> >> [   13.194182]  ? nouveau_drm_device_init+0x646/0x646 [nouveau]
> >> [   13.194195]  pci_device_probe+0x89/0xe9
> >> [   13.194205]  really_probe+0x127/0x2a7
> >> [   13.194212]  driver_probe_device+0x5b/0x87
> >> [   13.194219]  device_driver_attach+0x2e/0x41
> >> [   13.194226]  __driver_attach+0x7c/0x83
> >> [   13.194232]  bus_for_each_dev+0x4c/0x66
> >> [   13.194238]  driver_attach+0x14/0x16
> >> [   13.194244]  ? device_driver_attach+0x41/0x41
> >> [   13.194251]  bus_add_driver+0xc5/0x16c
> >> [   13.194258]  driver_register+0x87/0xb9
> >> [   13.194265]  __pci_register_driver+0x38/0x3b
> >> [   13.194271]  ? 0xf0c0d000
> >> [   13.194362]  nouveau_drm_init+0x14c/0x1000 [nouveau]
> >>
> >> How is ttm_dma_tt->dma_address allocated?
> >
> > Mhm, I need to double check how AGP is supposed to work.
> >
> > Since barely anybody is using it these days it is something which 
> > breaks from time to time.
> 
> I have no idea how that ever worked in the first place since AGP isn't 
> supposed to sync between CPU/GPU. Everything is coherent for that case.
> 
> Anyway here is a patch which adds a check to those functions if the 
> dma_address array is allocated in the first place. Please test it.

Thanks, the patch fixes the problem and nouveau now works!
Should be applied to 5.12-stable too (5.11 is affected too but EOL).

It's weird that it worked before.
Looks like dma_address was used uninitialized - it contained some random
crap:
[   12.293304] nouveau_bo_sync_for_device: ttm_dma->dma_address=3e055971 
ttm_dma->ttm.num_pages=18
[   12.293321] ttm_dma->dma_address[0]=0x0
[   12.293341] ttm_dma->dma_address[1]=0x0
[   12.293360] ttm_dma->dma_address[2]=0xee728980
[   12.293379] ttm_dma->dma_address[3]=0xed1cb120
[   12.293397] ttm_dma->dma_address[4]=0x12
[   12.293416] ttm_dma->dma_address[5]=0x0
[   12.293434] ttm_dma->dma_address[6]=0x1
[   12.293453] ttm_dma->dma_address[7]=0x0
[   12.293471] ttm_dma->dma_address[8]=0x1
[   12.293490] ttm_dma->dma_address[9]=0x0
[   12.293510] ttm_dma->dma_address[10]=0x101
[   12.293528] ttm_dma->dma_address[11]=0xee7289ec
[   12.293546] ttm_dma->dma_address[12]=0xee7289ec
[   12.293564] ttm_dma->dma_address[13]=0x0
[   12.293581] ttm_dma->dma_address[14]=0x0
[   12.293599] ttm_dma->dma_address[15]=0x0
[   12.293616] ttm_dma->dma_address[16]=0x0
[   12.293634] ttm_dma->dma_address[17]=0x0
But it did not matter as dma_sync_single_for_device is a no-op here.
When dma_address is properly initialized to NULL, it crashes...
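
To illustrate the failure mode, here is a simplified userspace model, not the
actual nouveau code. struct fake_tt and sync_for_device() are hypothetical
names; the guard mirrors the "!ttm_dma || !ttm_dma->dma_address" check from
the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct ttm_tt used here. */
struct fake_tt {
	unsigned long *dma_address;	/* NULL when no DMA array exists (e.g. AGP) */
	size_t num_pages;
};

/* Returns the number of pages that would be synced, or 0 when there
 * is nothing to sync. Without the dma_address check, the loop below
 * would dereference a NULL pointer. */
static size_t sync_for_device(const struct fake_tt *tt)
{
	size_t i, synced = 0;

	if (!tt || !tt->dma_address)
		return 0;

	for (i = 0; i < tt->num_pages; i++) {
		/* the real code would call dma_sync_single_for_device() here */
		(void)tt->dma_address[i];
		synced++;
	}
	return synced;
}
```

In this model an AGP-style object (dma_address == NULL, num_pages == 18)
returns early instead of faulting on the first array access.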

> Thanks,
> Christian.
> 
> >
> > Thanks for the backtrace,
> > Christian.
> >
> >>   I cannot find any assignment
> >> executed (in the working code):
> >>
> >> $ git grep dma_address\ = drivers/gpu/
> >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: 
> >> sg->sgl->dma_address = add

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-10 Thread Ondrej Zary
On Thursday 10 June 2021 08:43:06 Christian König wrote:
> 
> Am 09.06.21 um 22:00 schrieb Ondrej Zary:
> > On Wednesday 09 June 2021 11:21:05 Christian König wrote:
> >> Am 09.06.21 um 09:10 schrieb Ondrej Zary:
> >>> On Wednesday 09 June 2021, Christian König wrote:
> >>>> Am 09.06.21 um 08:57 schrieb Ondrej Zary:
> >>>>> [SNIP]
> >>>>>> Thanks for the heads up. So the problem with my patch is already fixed,
> >>>>>> isn't it?
> >>>>> The NULL pointer dereference in nouveau_bo_wr16 introduced in
> >>>>> 141b15e59175aa174ca1f7596188bd15a7ca17ba was fixed by
> >>>>> aea656b0d05ec5b8ed5beb2f94c4dd42ea834e9d.
> >>>>>
> >>>>> That's the bug I hit when bisecting the original problem:
> >>>>> NULL pointer dereference in nouveau_bo_sync_for_device
> >>>>> It's caused by:
> >>>>> # first bad commit: [e34b8feeaa4b65725b25f49c9b08a0f8707e8e86] drm/ttm: 
> >>>>> merge ttm_dma_tt back into ttm_tt
> >>>> Good that I've asked :)
> >>>>
> >>>> Ok that's a bit strange. e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 was
> >>>> created mostly automated.
> >>>>
> >>>> Do you have the original backtrace of that NULL pointer deref once more?
> >>> The original backtrace is here: 
> >>> https://lkml.org/lkml/2021/6/5/350
> >> And the problem is that ttm_dma->dma_address is NULL, right? Mhm, I
> >> don't see how that can happen since nouveau is using ttm_sg_tt_init().
> >>
> >> Apart from that what nouveau does here is rather questionable since you
> >> need a coherent architecture for most things anyway, but that's not what
> >> we are trying to fix here.
> >>
> >> Can you try to narrow down if ttm_sg_tt_init is called before calling
> >> this function for the tt object in question?
> > ttm_sg_tt_init is not called:
> > [   12.150124] nouveau 0000:01:00.0: DRM: VRAM: 31 MiB
> > [   12.150133] nouveau 0000:01:00.0: DRM: GART: 128 MiB
> > [   12.150143] nouveau 0000:01:00.0: DRM: BMP version 5.6
> > [   12.150151] nouveau 0000:01:00.0: DRM: No DCB data found in VBIOS
> > [   12.151362] ttm_tt_init
> > [   12.151370] ttm_tt_init_fields
> > [   12.151374] ttm_tt_alloc_page_directory
> > [   12.151615] BUG: kernel NULL pointer dereference, address: 00000000
> 
> Please add dump_stack(); to ttm_tt_init() and report back with the 
> backtrace.
> 
> I can't see how this is called from the nouveau code, only possibility I 
> see is that it is maybe called through the AGP code somehow.

Yes, you're right:
[   13.192663] Call Trace:
[   13.192678]  dump_stack+0x54/0x68
[   13.192690]  ttm_tt_init+0x11/0x8a [ttm]
[   13.192699]  ttm_agp_tt_create+0x39/0x51 [ttm]
[   13.192840]  nouveau_ttm_tt_create+0x17/0x22 [nouveau]
[   13.192856]  ttm_tt_create+0x78/0x8c [ttm]
[   13.192864]  ttm_bo_handle_move_mem+0x7d/0xca [ttm]
[   13.192873]  ttm_bo_validate+0x92/0xc8 [ttm]
[   13.192883]  ttm_bo_init_reserved+0x216/0x243 [ttm]
[   13.192892]  ttm_bo_init+0x45/0x65 [ttm]
[   13.193018]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
[   13.193150]  nouveau_bo_init+0x8c/0x94 [nouveau]
[   13.193273]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
[   13.193407]  nouveau_bo_new+0x44/0x57 [nouveau]
[   13.193537]  nouveau_channel_prep+0xa3/0x269 [nouveau]
[   13.193665]  nouveau_channel_new+0x3c/0x5f7 [nouveau]
[   13.193679]  ? slab_free_freelist_hook+0x3b/0xa7
[   13.193686]  ? kfree+0x9e/0x11a
[   13.193781]  ? nvif_object_sclass_put+0xd/0x16 [nouveau]
[   13.193908]  nouveau_drm_device_init+0x2e2/0x646 [nouveau]
[   13.193924]  ? pci_enable_device_flags+0x1e/0xac
[   13.194052]  nouveau_drm_probe+0xeb/0x188 [nouveau]
[   13.194182]  ? nouveau_drm_device_init+0x646/0x646 [nouveau]
[   13.194195]  pci_device_probe+0x89/0xe9
[   13.194205]  really_probe+0x127/0x2a7
[   13.194212]  driver_probe_device+0x5b/0x87
[   13.194219]  device_driver_attach+0x2e/0x41
[   13.194226]  __driver_attach+0x7c/0x83
[   13.194232]  bus_for_each_dev+0x4c/0x66
[   13.194238]  driver_attach+0x14/0x16
[   13.194244]  ? device_driver_attach+0x41/0x41
[   13.194251]  bus_add_driver+0xc5/0x16c
[   13.194258]  driver_register+0x87/0xb9
[   13.194265]  __pci_register_driver+0x

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-09 Thread Ondrej Zary
On Wednesday 09 June 2021 11:21:05 Christian König wrote:
> Am 09.06.21 um 09:10 schrieb Ondrej Zary:
> > On Wednesday 09 June 2021, Christian König wrote:
> >> Am 09.06.21 um 08:57 schrieb Ondrej Zary:
> >>> [SNIP]
> >>>> Thanks for the heads up. So the problem with my patch is already fixed,
> >>>> isn't it?
> >>> The NULL pointer dereference in nouveau_bo_wr16 introduced in
> >>> 141b15e59175aa174ca1f7596188bd15a7ca17ba was fixed by
> >>> aea656b0d05ec5b8ed5beb2f94c4dd42ea834e9d.
> >>>
> >>> That's the bug I hit when bisecting the original problem:
> >>> NULL pointer dereference in nouveau_bo_sync_for_device
> >>> It's caused by:
> >>> # first bad commit: [e34b8feeaa4b65725b25f49c9b08a0f8707e8e86] drm/ttm: 
> >>> merge ttm_dma_tt back into ttm_tt
> >> Good that I've asked :)
> >>
> >> Ok that's a bit strange. e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 was
> >> created mostly automated.
> >>
> >> Do you have the original backtrace of that NULL pointer deref once more?
> > The original backtrace is here: 
> > https://lkml.org/lkml/2021/6/5/350
> 
> And the problem is that ttm_dma->dma_address is NULL, right? Mhm, I 
> don't see how that can happen since nouveau is using ttm_sg_tt_init().
> 
> Apart from that what nouveau does here is rather questionable since you 
> need a coherent architecture for most things anyway, but that's not what 
> we are trying to fix here.
> 
> Can you try to narrow down if ttm_sg_tt_init is called before calling 
> this function for the tt object in question?

ttm_sg_tt_init is not called:
[   12.150124] nouveau 0000:01:00.0: DRM: VRAM: 31 MiB
[   12.150133] nouveau 0000:01:00.0: DRM: GART: 128 MiB
[   12.150143] nouveau 0000:01:00.0: DRM: BMP version 5.6
[   12.150151] nouveau 0000:01:00.0: DRM: No DCB data found in VBIOS
[   12.151362] ttm_tt_init
[   12.151370] ttm_tt_init_fields
[   12.151374] ttm_tt_alloc_page_directory
[   12.151615] BUG: kernel NULL pointer dereference, address: 00000000



-- 
Ondrej Zary


Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-09 Thread Ondrej Zary
On Wednesday 09 June 2021, Christian König wrote:
> Am 09.06.21 um 08:57 schrieb Ondrej Zary:
> > [SNIP]
> >> Thanks for the heads up. So the problem with my patch is already fixed,
> >> isn't it?
> > The NULL pointer dereference in nouveau_bo_wr16 introduced in
> > 141b15e59175aa174ca1f7596188bd15a7ca17ba was fixed by
> > aea656b0d05ec5b8ed5beb2f94c4dd42ea834e9d.
> >
> > That's the bug I hit when bisecting the original problem:
> > NULL pointer dereference in nouveau_bo_sync_for_device
> > It's caused by:
> > # first bad commit: [e34b8feeaa4b65725b25f49c9b08a0f8707e8e86] drm/ttm: 
> > merge ttm_dma_tt back into ttm_tt
> 
> Good that I've asked :)
> 
> Ok that's a bit strange. e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 was 
> created mostly automated.
> 
> Do you have the original backtrace of that NULL pointer deref once more?

The original backtrace is here: https://lkml.org/lkml/2021/6/5/350

-- 
Ondrej Zary


Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-09 Thread Ondrej Zary
On Wednesday 09 June 2021, Christian König wrote:
> Am 08.06.21 um 23:59 schrieb Ondrej Zary:
> > On Tuesday 08 June 2021 22:01:56 Ondrej Zary wrote:
> >> On Tuesday 08 June 2021 20:47:42 Ondrej Zary wrote:
> >>> On Monday 07 June 2021 22:58:43 Ondrej Zary wrote:
> >>>> On Sunday 06 June 2021 23:16:03 Ondrej Zary wrote:
> >>>>> On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> >>>>>> On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> >>>>>>> Hello,
> >>>>>>> I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer 
> >>>>>>> dereference in nouveau_bo_sync_for_device.
> >>>>>>> Found various reports like this but that was back in February so that 
> >>>>>>> should be fixed now.
> >>>>>> So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
> >>>>>> https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> >>>>>>
> >>>>>> Added some debug printks to nouveau_bo_sync_for_device:
> >>>>>> [   22.225048] ttm_dma=fc33b500
> >>>>>> [   22.225066] ttm_dma->num_pages=18
> >>>>>> [   22.225071] i=0 num_pages=16
> >>>>>> [   22.225077] ttm_dma->dma_address=00000000
> >>>>>> [   22.225094] BUG: kernel NULL pointer dereference, address: 00000000
> >>>>>>
> >>>>>> So ttm->dma_address is NULL.
> >>>>>>
> >>>>> Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it 
> >>>>> does not work...
> >>>>> Not sure what I did before.
> >>>>>
> >>>>> Bisecting between 5.10 and 5.11 is impossible - I keep hitting a 
> >>>>> never-ending stream of bugs.
> >>>>> As always with nouveau...
> >>>> e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 seems to be the first bad commit
> >>>> Going back one commit makes it crash in a different way:
> >>>>
> >>>> [   55.444208] BUG: kernel NULL pointer dereference, address: 000001b0
> >>>> [   55.444219] #PF: supervisor read access in kernel mode
> >>>> [   55.444222] #PF: error_code(0x0000) - not-present page
> >>>> [   55.444225] *pde = 00000000
> >>>> [   55.444231] Oops: 0000 [#1] SMP
> >>>> [   55.444237] CPU: 0 PID: 1740 Comm: Xorg Not tainted 5.9.0-rc5+ #361
> >>>> [   55.444240] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
> >>>> [   55.444321] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
> >>>> [   55.444326] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 
> >>>> 89 f0 e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 
> >>>> c3 <03> 93 b0 01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
> >>>> [   55.444330] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
> >>>> [   55.444334] ESI: 00000020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
> >>>> [   55.444338] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 
> >>>> 00210246
> >>>> [   55.444341] CR0: 80050033 CR2: 000001b0 CR3: 27896000 CR4: 00000690
> >>>> [   55.444344] Call Trace:
> >>>> [   55.444395]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
> >>>> [   55.444442]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> >>>> [   55.444451]  drm_mode_cursor_common+0x13b/0x1ad
> >>>> [   55.444497]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> >>>> [   55.444504]  drm_mode_cursor_ioctl+0x2e/0x36
> >>>> [   55.444509]  ? drm_mode_setplane+0x203/0x203
> >>>> [   55.444514]  drm_ioctl_kernel+0x66/0x99
> >>>> [   55.444518]  drm_ioctl+0x211/0x2d8
> >>>> [   55.444522]  ? drm_mode_setplane+0x203/0x203
> >>>> [   55.444529]  ? _cond_resched+0x1e/0x22
> >>>> [   55.444533]  ? mutex_lock+0xb/0x24
> >>>> [   55.444582]  ? nouveau_bo_add_io_reserve_lru+0x53/0x58 [nouveau]
> >>>> [   55.444589]  ? rpm_

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-08 Thread Ondrej Zary
On Tuesday 08 June 2021 22:01:56 Ondrej Zary wrote:
> On Tuesday 08 June 2021 20:47:42 Ondrej Zary wrote:
> > On Monday 07 June 2021 22:58:43 Ondrej Zary wrote:
> > > On Sunday 06 June 2021 23:16:03 Ondrej Zary wrote:
> > > > On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> > > > > On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> > > > > > Hello,
> > > > > > I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer 
> > > > > > dereference in nouveau_bo_sync_for_device.
> > > > > > > Found various reports like this but that was back in February so 
> > > > > > that should be fixed now.
> > > > > 
> > > > > So it is the same bug. Broken since 5.11. This revert fixes it in 
> > > > > 5.11:
> > > > > https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> > > > > 
> > > > > Added some debug printks to nouveau_bo_sync_for_device:
> > > > > [   22.225048] ttm_dma=fc33b500
> > > > > [   22.225066] ttm_dma->num_pages=18
> > > > > [   22.225071] i=0 num_pages=16
> > > > > [   22.225077] ttm_dma->dma_address=
> > > > > [   22.225094] BUG: kernel NULL pointer dereference, address: 
> > > > > 
> > > > > So ttm->dma_address is NULL.
> > > > > 
> > > > 
> > > > Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it 
> > > > does not work...
> > > > Not sure what I did before.
> > > > 
> > > > > Bisecting between 5.10 and 5.11 is impossible - I keep hitting a 
> > > > > never-ending stream of bugs.
> > > > As always with nouveau...
> > > 
> > > e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 seems to be the first bad commit
> > > Going back one commit makes it crash in a different way:
> > > 
> > > > [   55.444208] BUG: kernel NULL pointer dereference, address: 000001b0
> > > > [   55.444219] #PF: supervisor read access in kernel mode
> > > > [   55.444222] #PF: error_code(0x0000) - not-present page
> > > > [   55.444225] *pde = 00000000
> > > > [   55.444231] Oops: 0000 [#1] SMP
> > > [   55.444237] CPU: 0 PID: 1740 Comm: Xorg Not tainted 5.9.0-rc5+ #361
> > > [   55.444240] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
> > > [   55.444321] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
> > > [   55.444326] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 
> > > 89 f0 e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 
> > > c3 <03> 93 b0 01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
> > > > [   55.444330] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
> > > > [   55.444334] ESI: 00000020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
> > > > [   55.444338] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 
> > > > 00210246
> > > > [   55.444341] CR0: 80050033 CR2: 000001b0 CR3: 27896000 CR4: 00000690
> > > [   55.444344] Call Trace:
> > > [   55.444395]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
> > > [   55.42]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> > > [   55.51]  drm_mode_cursor_common+0x13b/0x1ad
> > > [   55.97]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> > > [   55.444504]  drm_mode_cursor_ioctl+0x2e/0x36
> > > [   55.444509]  ? drm_mode_setplane+0x203/0x203
> > > [   55.444514]  drm_ioctl_kernel+0x66/0x99
> > > [   55.444518]  drm_ioctl+0x211/0x2d8
> > > [   55.444522]  ? drm_mode_setplane+0x203/0x203
> > > [   55.444529]  ? _cond_resched+0x1e/0x22
> > > [   55.444533]  ? mutex_lock+0xb/0x24
> > > [   55.444582]  ? nouveau_bo_add_io_reserve_lru+0x53/0x58 [nouveau]
> > > [   55.444589]  ? rpm_resume.part.13+0x72/0x365
> > > [   55.444594]  ? ktime_get_mono_fast_ns+0x5e/0xf2
> > > [   55.444598]  ? __pm_runtime_resume+0x5b/0x63
> > > [   55.444647]  nouveau_drm_ioctl+0x65/0x81 [nouveau]
> > > [   55.444696]  ? nouveau_cli_work+0xc3/0xc3 [nouveau]
> > > [   55.444702]  vfs_ioctl+0x1a/0x24
> > > [   55.444706]  __ia32_sys_ioctl+0x583/0x59d
> > > [   55.444711]  ? doublefault_shim+0x120/0x120
> > > [   55.444717]  ? exit_to_user_mode_prepare+0x71/0xba
> > > [   55.444721]  do_int80_syscall_32+0x2c/0x39
> > > [   55.444725]  entry_INT80_32+0xf0/0xf0
> > > [   55.444729] EIP: 0xb7fb2092
> > > [   55.444733] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 3

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-08 Thread Ondrej Zary
On Tuesday 08 June 2021 20:47:42 Ondrej Zary wrote:
> On Monday 07 June 2021 22:58:43 Ondrej Zary wrote:
> > On Sunday 06 June 2021 23:16:03 Ondrej Zary wrote:
> > > On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> > > > On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> > > > > Hello,
> > > > > I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer 
> > > > > dereference in nouveau_bo_sync_for_device.
> > > > > Found various reports like this but that was back in February, so that 
> > > > > should be fixed now.
> > > > 
> > > > So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
> > > > https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> > > > 
> > > > Added some debug printks to nouveau_bo_sync_for_device:
> > > > [   22.225048] ttm_dma=fc33b500
> > > > [   22.225066] ttm_dma->num_pages=18
> > > > [   22.225071] i=0 num_pages=16
> > > > [   22.225077] ttm_dma->dma_address=
> > > > [   22.225094] BUG: kernel NULL pointer dereference, address: 
> > > > 
> > > > So ttm->dma_address is NULL.
> > > > 
> > > 
> > > Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it 
> > > does not work...
> > > Not sure what I did before.
> > > 
> > > Bisecting between 5.10 and 5.11 is impossible - I keep hitting 
> > > neverending stream of bugs.
> > > As always with nouveau...
> > 
> > e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 seems to be the first bad commit
> > Going back one commit makes it crash in a different way:
> > 
> > [   55.444208] BUG: kernel NULL pointer dereference, address: 01b0
> > [   55.444219] #PF: supervisor read access in kernel mode
> > [   55.444222] #PF: error_code(0x) - not-present page
> > [   55.444225] *pde = 
> > [   55.444231] Oops:  [#1] SMP
> > [   55.444237] CPU: 0 PID: 1740 Comm: Xorg Not tainted 5.9.0-rc5+ #361
> > [   55.444240] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
> > [   55.444321] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
> > [   55.444326] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 89 
> > f0 e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 c3 
> > <03> 93 b0 01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
> > [   55.444330] EAX:  EBX:  ECX:  EDX: 
> > [   55.444334] ESI: 0020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
> > [   55.444338] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210246
> > [   55.444341] CR0: 80050033 CR2: 01b0 CR3: 27896000 CR4: 0690
> > [   55.444344] Call Trace:
> > [   55.444395]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
> > [   55.42]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> > [   55.51]  drm_mode_cursor_common+0x13b/0x1ad
> > [   55.97]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> > [   55.444504]  drm_mode_cursor_ioctl+0x2e/0x36
> > [   55.444509]  ? drm_mode_setplane+0x203/0x203
> > [   55.444514]  drm_ioctl_kernel+0x66/0x99
> > [   55.444518]  drm_ioctl+0x211/0x2d8
> > [   55.444522]  ? drm_mode_setplane+0x203/0x203
> > [   55.444529]  ? _cond_resched+0x1e/0x22
> > [   55.444533]  ? mutex_lock+0xb/0x24
> > [   55.444582]  ? nouveau_bo_add_io_reserve_lru+0x53/0x58 [nouveau]
> > [   55.444589]  ? rpm_resume.part.13+0x72/0x365
> > [   55.444594]  ? ktime_get_mono_fast_ns+0x5e/0xf2
> > [   55.444598]  ? __pm_runtime_resume+0x5b/0x63
> > [   55.444647]  nouveau_drm_ioctl+0x65/0x81 [nouveau]
> > [   55.444696]  ? nouveau_cli_work+0xc3/0xc3 [nouveau]
> > [   55.444702]  vfs_ioctl+0x1a/0x24
> > [   55.444706]  __ia32_sys_ioctl+0x583/0x59d
> > [   55.444711]  ? doublefault_shim+0x120/0x120
> > [   55.444717]  ? exit_to_user_mode_prepare+0x71/0xba
> > [   55.444721]  do_int80_syscall_32+0x2c/0x39
> > [   55.444725]  entry_INT80_32+0xf0/0xf0
> > [   55.444729] EIP: 0xb7fb2092
> > [   55.444733] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 
> > 00 e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 
> >  8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
> > [   55.444737] EAX: ffda EBX: 000e ECX: c01c64a3 EDX: bfe89750
> > [   55.444741] ESI: 02580b40 EDI: c01c64a3 EBP: 000e ESP: bfe89704
> > [   55.444744] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 00200292
> > [   55.444748] Modules linked in: i2c_de

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-08 Thread Ondrej Zary
On Monday 07 June 2021 22:58:43 Ondrej Zary wrote:
> On Sunday 06 June 2021 23:16:03 Ondrej Zary wrote:
> > On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> > > On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> > > > Hello,
> > > > I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer 
> > > > dereference in nouveau_bo_sync_for_device.
> > > > Found various reports like this but that was back in February, so that 
> > > > should be fixed now.
> > > 
> > > So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
> > > https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> > > 
> > > Added some debug printks to nouveau_bo_sync_for_device:
> > > [   22.225048] ttm_dma=fc33b500
> > > [   22.225066] ttm_dma->num_pages=18
> > > [   22.225071] i=0 num_pages=16
> > > [   22.225077] ttm_dma->dma_address=
> > > [   22.225094] BUG: kernel NULL pointer dereference, address: 
> > > 
> > > So ttm->dma_address is NULL.
> > > 
> > 
> > Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it does 
> > not work...
> > Not sure what I did before.
> > 
> > Bisecting between 5.10 and 5.11 is impossible - I keep hitting neverending 
> > stream of bugs.
> > As always with nouveau...
> 
> e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 seems to be the first bad commit
> Going back one commit makes it crash in a different way:
> 
> [   55.444208] BUG: kernel NULL pointer dereference, address: 01b0
> [   55.444219] #PF: supervisor read access in kernel mode
> [   55.444222] #PF: error_code(0x) - not-present page
> [   55.444225] *pde = 
> [   55.444231] Oops:  [#1] SMP
> [   55.444237] CPU: 0 PID: 1740 Comm: Xorg Not tainted 5.9.0-rc5+ #361
> [   55.444240] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
> [   55.444321] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
> [   55.444326] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 89 f0 
> e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 c3 <03> 93 
> b0 01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
> [   55.444330] EAX:  EBX:  ECX:  EDX: 
> [   55.444334] ESI: 0020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
> [   55.444338] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210246
> [   55.444341] CR0: 80050033 CR2: 01b0 CR3: 27896000 CR4: 0690
> [   55.444344] Call Trace:
> [   55.444395]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
> [   55.42]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> [   55.51]  drm_mode_cursor_common+0x13b/0x1ad
> [   55.97]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
> [   55.444504]  drm_mode_cursor_ioctl+0x2e/0x36
> [   55.444509]  ? drm_mode_setplane+0x203/0x203
> [   55.444514]  drm_ioctl_kernel+0x66/0x99
> [   55.444518]  drm_ioctl+0x211/0x2d8
> [   55.444522]  ? drm_mode_setplane+0x203/0x203
> [   55.444529]  ? _cond_resched+0x1e/0x22
> [   55.444533]  ? mutex_lock+0xb/0x24
> [   55.444582]  ? nouveau_bo_add_io_reserve_lru+0x53/0x58 [nouveau]
> [   55.444589]  ? rpm_resume.part.13+0x72/0x365
> [   55.444594]  ? ktime_get_mono_fast_ns+0x5e/0xf2
> [   55.444598]  ? __pm_runtime_resume+0x5b/0x63
> [   55.444647]  nouveau_drm_ioctl+0x65/0x81 [nouveau]
> [   55.444696]  ? nouveau_cli_work+0xc3/0xc3 [nouveau]
> [   55.444702]  vfs_ioctl+0x1a/0x24
> [   55.444706]  __ia32_sys_ioctl+0x583/0x59d
> [   55.444711]  ? doublefault_shim+0x120/0x120
> [   55.444717]  ? exit_to_user_mode_prepare+0x71/0xba
> [   55.444721]  do_int80_syscall_32+0x2c/0x39
> [   55.444725]  entry_INT80_32+0xf0/0xf0
> [   55.444729] EIP: 0xb7fb2092
> [   55.444733] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 
> e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80  8d 
> b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
> [   55.444737] EAX: ffda EBX: 000e ECX: c01c64a3 EDX: bfe89750
> [   55.444741] ESI: 02580b40 EDI: c01c64a3 EBP: 000e ESP: bfe89704
> [   55.444744] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 00200292
> [   55.444748] Modules linked in: i2c_dev nouveau serial_cs snd_intel8x0 
> snd_ac97_codec wmi hwmon ttm ac97_bus 8139cp snd_pcm pcmcia snd_timer snd sg 
> soundcore psmouse yenta_socket serio_raw pcmcia_rsrc pcmcia_core intel_agp 
> parport_pc parport
> [   55.444769] CR2: 01b0
> [   55.444774] ---[ end trace e2b0d4c3c2e4e488 ]---
> [   55.444827] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
> [   55.444831] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 89 f

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-07 Thread Ondrej Zary
On Sunday 06 June 2021 23:16:03 Ondrej Zary wrote:
> On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> > On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> > > Hello,
> > > I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer dereference 
> > > in nouveau_bo_sync_for_device.
> > > Found various reports like this but that was back in February, so that 
> > > should be fixed now.
> > 
> > So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
> > https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> > 
> > Added some debug printks to nouveau_bo_sync_for_device:
> > [   22.225048] ttm_dma=fc33b500
> > [   22.225066] ttm_dma->num_pages=18
> > [   22.225071] i=0 num_pages=16
> > [   22.225077] ttm_dma->dma_address=
> > [   22.225094] BUG: kernel NULL pointer dereference, address: 
> > 
> > So ttm->dma_address is NULL.
> > 
> 
> Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it does 
> not work...
> Not sure what I did before.
> 
> Bisecting between 5.10 and 5.11 is impossible - I keep hitting neverending 
> stream of bugs.
> As always with nouveau...

e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 seems to be the first bad commit
Going back one commit makes it crash in a different way:

[   55.444208] BUG: kernel NULL pointer dereference, address: 01b0
[   55.444219] #PF: supervisor read access in kernel mode
[   55.444222] #PF: error_code(0x) - not-present page
[   55.444225] *pde = 
[   55.444231] Oops:  [#1] SMP
[   55.444237] CPU: 0 PID: 1740 Comm: Xorg Not tainted 5.9.0-rc5+ #361
[   55.444240] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
[   55.444321] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
[   55.444326] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 89 f0 
e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 c3 <03> 93 b0 
01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
[   55.444330] EAX:  EBX:  ECX:  EDX: 
[   55.444334] ESI: 0020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
[   55.444338] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210246
[   55.444341] CR0: 80050033 CR2: 01b0 CR3: 27896000 CR4: 0690
[   55.444344] Call Trace:
[   55.444395]  nv04_crtc_cursor_set+0x148/0x1d8 [nouveau]
[   55.42]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
[   55.51]  drm_mode_cursor_common+0x13b/0x1ad
[   55.97]  ? ttm_bo_reserve.constprop.15+0x1c/0x1c [nouveau]
[   55.444504]  drm_mode_cursor_ioctl+0x2e/0x36
[   55.444509]  ? drm_mode_setplane+0x203/0x203
[   55.444514]  drm_ioctl_kernel+0x66/0x99
[   55.444518]  drm_ioctl+0x211/0x2d8
[   55.444522]  ? drm_mode_setplane+0x203/0x203
[   55.444529]  ? _cond_resched+0x1e/0x22
[   55.444533]  ? mutex_lock+0xb/0x24
[   55.444582]  ? nouveau_bo_add_io_reserve_lru+0x53/0x58 [nouveau]
[   55.444589]  ? rpm_resume.part.13+0x72/0x365
[   55.444594]  ? ktime_get_mono_fast_ns+0x5e/0xf2
[   55.444598]  ? __pm_runtime_resume+0x5b/0x63
[   55.444647]  nouveau_drm_ioctl+0x65/0x81 [nouveau]
[   55.444696]  ? nouveau_cli_work+0xc3/0xc3 [nouveau]
[   55.444702]  vfs_ioctl+0x1a/0x24
[   55.444706]  __ia32_sys_ioctl+0x583/0x59d
[   55.444711]  ? doublefault_shim+0x120/0x120
[   55.444717]  ? exit_to_user_mode_prepare+0x71/0xba
[   55.444721]  do_int80_syscall_32+0x2c/0x39
[   55.444725]  entry_INT80_32+0xf0/0xf0
[   55.444729] EIP: 0xb7fb2092
[   55.444733] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 
e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80  8d b4 
26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
[   55.444737] EAX: ffda EBX: 000e ECX: c01c64a3 EDX: bfe89750
[   55.444741] ESI: 02580b40 EDI: c01c64a3 EBP: 000e ESP: bfe89704
[   55.444744] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 00200292
[   55.444748] Modules linked in: i2c_dev nouveau serial_cs snd_intel8x0 
snd_ac97_codec wmi hwmon ttm ac97_bus 8139cp snd_pcm pcmcia snd_timer snd sg 
soundcore psmouse yenta_socket serio_raw pcmcia_rsrc pcmcia_core intel_agp 
parport_pc parport
[   55.444769] CR2: 01b0
[   55.444774] ---[ end trace e2b0d4c3c2e4e488 ]---
[   55.444827] EIP: nouveau_bo_wr16+0x8/0x27 [nouveau]
[   55.444831] Code: 85 ff 74 0d 80 7d f3 00 74 07 80 a6 f4 01 00 00 fe 89 f0 
e8 0c ef ff ff 8d 65 f4 89 f8 5b 5e 5f 5d c3 55 01 d2 89 e5 53 89 c3 <03> 93 b0 
01 00 00 0f b7 c1 f6 83 b8 01 00 00 80 74 07 e8 40 49 69
[   55.444835] EAX:  EBX:  ECX:  EDX: 
[   55.444838] ESI: 0020 EDI: e7a14400 EBP: e786fd98 ESP: e786fd94
[   55.444842] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210246
[   55.444845] CR0: 80050033 CR2: 01b0 CR3: 27896000 CR4: 0690


-- 
Ondrej Zary


Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-07 Thread Ondrej Zary
On Saturday 05 June 2021 23:34:23 Ondrej Zary wrote:
> On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> > Hello,
> > I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer dereference in 
> > nouveau_bo_sync_for_device.
> > Found various reports like this but that was back in February, so that should 
> > be fixed now.
> 
> So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
> https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html
> 
> Added some debug printks to nouveau_bo_sync_for_device:
> [   22.225048] ttm_dma=fc33b500
> [   22.225066] ttm_dma->num_pages=18
> [   22.225071] i=0 num_pages=16
> [   22.225077] ttm_dma->dma_address=
> [   22.225094] BUG: kernel NULL pointer dereference, address: 
> 
> So ttm->dma_address is NULL.
> 

Tested reverting f295c8cfec833c2707ff1512da10d65386dde7af again and it does not 
work...
Not sure what I did before.

Bisecting between 5.10 and 5.11 is impossible - I keep hitting a never-ending 
stream of bugs.
As always with nouveau...

-- 
Ondrej Zary


nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-07 Thread Ondrej Zary
Hello,
I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer dereference in 
nouveau_bo_sync_for_device.
Found various reports like this, but that was back in February, so that should be 
fixed now.

[   21.003216] BUG: kernel NULL pointer dereference, address: 
[   21.003235] #PF: supervisor read access in kernel mode
[   21.003243] #PF: error_code(0x) - not-present page
[   21.003250] *pde = 
[   21.003258] Oops:  [#1] SMP
[   21.003268] CPU: 0 PID: 222 Comm: systemd-udevd Not tainted 5.13.0-rc4+ #327
[   21.003278] Hardware name:  /848P-ICH5, BIOS 6.00 PG 02/03/2005
[   21.003285] EIP: nouveau_bo_sync_for_device+0x9e/0xbf [nouveau]
[   21.003571] Code: 02 89 45 e8 01 d1 8b 19 89 5d ec bb 01 00 00 00 3b 5d e8 
74 0d 89 d8 c1 e0 05 03 45 ec 39 04 99 74 1e 8b 46 10 89 d9 c1 e1 0c <8b> 14 10 
8b 47 e0 8b 40 08 6a 01 e8 d5 03 55 df 01 5d f0 58 eb ae
[   21.003588] EAX:  EBX: 0010 ECX: 0001 EDX: 
[   21.003597] ESI: c3e90280 EDI: c185a494 EBP: c2ed7c10 ESP: c2ed7bf8
[   21.003606] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210206
[   21.003615] CR0: 80050033 CR2:  CR3: 02ecb000 CR4: 0690
[   21.003625] Call Trace:
[   21.003635]  nouveau_bo_validate+0x3f/0x48 [nouveau]
[   21.003911]  nouveau_bo_pin+0xf0/0x187 [nouveau]
[   21.004182]  nouveau_channel_prep+0xc0/0x269 [nouveau]
[   21.004454]  nouveau_channel_new+0x3c/0x5f5 [nouveau]
[   21.004725]  ? slab_free_freelist_hook+0x3b/0xa7
[   21.004740]  ? kfree+0x9e/0x11a
[   21.004749]  ? nvif_object_sclass_put+0xd/0x16 [nouveau]
[   21.004944]  nouveau_drm_device_init+0x2e2/0x646 [nouveau]
[   21.005186]  ? pci_enable_device_flags+0x23/0x97
[   21.005202]  nouveau_drm_probe+0xe5/0x182 [nouveau]
[   21.005443]  ? nouveau_drm_device_init+0x646/0x646 [nouveau]
[   21.005683]  pci_device_probe+0x89/0xe9
[   21.005696]  really_probe+0x127/0x2b9
[   21.005707]  driver_probe_device+0x62/0x89
[   21.005715]  device_driver_attach+0x2e/0x41
[   21.005724]  __driver_attach+0x83/0x8a
[   21.005732]  bus_for_each_dev+0x4c/0x66
[   21.005740]  driver_attach+0x14/0x16
[   21.005747]  ? device_driver_attach+0x41/0x41
[   21.005756]  bus_add_driver+0xc5/0x16c
[   21.005764]  driver_register+0x87/0xb9
[   21.005772]  __pci_register_driver+0x38/0x3b
[   21.005780]  ? 0xf0be4000
[   21.005787]  nouveau_drm_init+0x14c/0x1000 [nouveau]
[   21.005964]  do_one_initcall+0x5a/0x134
[   21.005975]  ? __vunmap+0x124/0x12d
[   21.005984]  ? __vunmap+0x124/0x12d
[   21.005992]  ? kmem_cache_alloc+0xa8/0xb6
[   21.006001]  ? do_init_module+0x17/0x1cf
[   21.006012]  do_init_module+0x46/0x1cf
[   21.006021]  load_module+0x1799/0x1bcb
[   21.006032]  __ia32_sys_finit_module+0x72/0x7a
[   21.006044]  do_int80_syscall_32+0x53/0x62
[   21.006054]  entry_INT80_32+0xf0/0xf0
[   21.006063] EIP: 0xb7f40092
[   21.006071] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 
e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80  8d b4 
26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
[   21.006086] EAX: ffda EBX: 0010 ECX: b7e9bbdd EDX: 
[   21.006095] ESI: 008f27d0 EDI: 008f9e10 EBP:  ESP: bfa140b8
[   21.006103] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 00200296
[   21.006114] Modules linked in: nouveau(+) snd_intel8x0 snd_ac97_codec pcmcia 
wmi hwmon ac97_bus yenta_socket pcmcia_rsrc drm_ttm_helper snd_pcm ttm 
snd_timer pcmcia_core psmouse 8139cp snd sg soundcore serio_raw parport_pc 
intel_agp parport
[   21.006165] CR2: 
[   21.006201] ---[ end trace 02dc541683feafc6 ]---
[   21.006211] EIP: nouveau_bo_sync_for_device+0x9e/0xbf [nouveau]
[   21.006460] Code: 02 89 45 e8 01 d1 8b 19 89 5d ec bb 01 00 00 00 3b 5d e8 
74 0d 89 d8 c1 e0 05 03 45 ec 39 04 99 74 1e 8b 46 10 89 d9 c1 e1 0c <8b> 14 10 
8b 47 e0 8b 40 08 6a 01 e8 d5 03 55 df 01 5d f0 58 eb ae
[   21.006476] EAX:  EBX: 0010 ECX: 0001 EDX: 
[   21.006485] ESI: c3e90280 EDI: c185a494 EBP: c2ed7c10 ESP: c2ed7bf8
[   21.006494] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210206
[   21.006503] CR0: 80050033 CR2:  CR3: 02ecb000 CR4: 00000690


-- 
Ondrej Zary


Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-07 Thread Ondrej Zary
On Saturday 05 June 2021 21:43:52 Ondrej Zary wrote:
> Hello,
> I'm testing 5.13.0-rc4 and nouveau crashes with NULL pointer dereference in 
> nouveau_bo_sync_for_device.
> Found various reports like this but that was back in February, so that should 
> be fixed now.

So it is the same bug. Broken since 5.11. This revert fixes it in 5.11:
https://lists.freedesktop.org/archives/dri-devel/2021-February/298531.html

Added some debug printks to nouveau_bo_sync_for_device:
[   22.225048] ttm_dma=fc33b500
[   22.225066] ttm_dma->num_pages=18
[   22.225071] i=0 num_pages=16
[   22.225077] ttm_dma->dma_address=
[   22.225094] BUG: kernel NULL pointer dereference, address: 

So ttm->dma_address is NULL.
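
A minimal user-space sketch of the failure mode, not the actual nouveau code. The struct and function names (fake_ttm_dma, sync_pages_for_device) are hypothetical stand-ins, and it only assumes that the sync loop walks a per-page DMA address array. It shows how a guard on a missing array would avoid the dereference from the debug output above:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TTM DMA bookkeeping; not the kernel struct. */
struct fake_ttm_dma {
	size_t num_pages;
	const uint64_t *dma_address; /* may be NULL, as in the debug output */
};

/*
 * Walk the per-page DMA addresses and count how many pages would be synced.
 * Returns 0 without touching the array when the array is missing.
 */
size_t sync_pages_for_device(const struct fake_ttm_dma *ttm_dma)
{
	size_t i, synced = 0;

	/* Guard: bail out instead of dereferencing a NULL array. */
	if (!ttm_dma || !ttm_dma->dma_address)
		return 0;

	for (i = 0; i < ttm_dma->num_pages; i++) {
		/* A real driver would call its DMA sync helper here. */
		(void)ttm_dma->dma_address[i];
		synced++;
	}
	return synced;
}
```

Whether skipping the sync is the right behaviour for this hardware is a separate question; the sketch only illustrates the missing check.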

-- 
Ondrej Zary


Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-29 Thread Ondrej Zary
On Saturday 10 October 2020 02:02:42 Karol Herbst wrote:
> On Sat, Oct 10, 2020 at 12:23 AM Ilia Mirkin  wrote:
> >
> > On Fri, Oct 9, 2020 at 5:54 PM Karol Herbst  wrote:
> > >
> > > On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary  wrote:
> > > >
> > > > Hello,
> > > > I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working:
> > > > [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 
> > > > 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri 
> > > > Oct 9 22:31:40 CEST 2020
> > > > ...
> > > > [   14.771464] nouveau :01:00.0: GPU not supported on big-endian
> > > > [   14.771782] nouveau: probe of :01:00.0 failed with error -38
> > > >
> > > > big-endian? WTF? The machine is x86.
> > > >
> > >
> > > > mhh, we reworked the endianness checks a bit and apparently that broke
> > > something... I will give it some thoughts, but could you be so kind
> > > and create an mmiotrace under 5.9 with nouveau? You won't need to
> > > start X or anything while doing it. Just enable the trace and modprobe
> > > nouveau and collect the trace.
> >
> > Looks like nvkm_device_endianness unconditionally reads out 0x4. I
> > don't think that reg is there pre-NV11. At least NV4, NV5, NV10 and
> > maybe NV15 (which is logically pre-NV11) don't support big-endian
> > mode. Not sure about NV1A, which was the IGP of the series and IIRC
> > logically pre-NV11 as well (but clearly could only be used with x86
> > chips, since it was part of the motherboard).
> >
> > Aha, it's documented in rnndb:
> >
> > https://github.com/envytools/envytools/blob/master/rnndb/bus/pmc.xml
> > 
> >
> 
> ohh, I should have checked there.. yeah, will write a fix for it then.
> Before my patch we just always tried to switch it, but never threw an
> error.

Any progress with the patch?

-- 
Ondrej Zary
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-10 Thread Ondrej Zary
On Saturday 10 October 2020 00:23:38 Ilia Mirkin wrote:
> On Fri, Oct 9, 2020 at 5:54 PM Karol Herbst  wrote:
> >
> > On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary  wrote:
> > >
> > > Hello,
> > > I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working:
> > > [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) 
> > > 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 
> > > 22:31:40 CEST 2020
> > > ...
> > > [   14.771464] nouveau :01:00.0: GPU not supported on big-endian
> > > [   14.771782] nouveau: probe of :01:00.0 failed with error -38
> > >
> > > big-endian? WTF? The machine is x86.
> > >
> >
> > mhh, we reworked the endianness checks a bit and apparently that broke
> > something... I will give it some thoughts, but could you be so kind
> > and create an mmiotrace under 5.9 with nouveau? You won't need to
> > start X or anything while doing it. Just enable the trace and modprobe
> > nouveau and collect the trace.
> 
> Looks like nvkm_device_endianness unconditionally reads out 0x4. I
> don't think that reg is there pre-NV11. At least NV4, NV5, NV10 and
> maybe NV15 (which is logically pre-NV11) don't support big-endian
> mode. Not sure about NV1A, which was the IGP of the series and IIRC
> logically pre-NV11 as well (but clearly could only be used with x86
> chips, since it was part of the motherboard).

Yes, you're right. Forcing nvkm_device_endianness to return true allows
5.9.0-rc8 to work:
[0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) 
8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 22:31:40 
CEST 2020
...
[   12.311258] nouveau :01:00.0: bios: DCB table not found
[   12.311583] nouveau :01:00.0: bios: DCB table not found
[   12.311834] nouveau :01:00.0: bios: DCB table not found
[   12.311847] nouveau :01:00.0: bios: DCB table not found
[   12.311989] agpgart-intel :00:00.0: AGP 3.0 bridge
[   12.312017] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
back to 2.x
[   12.312031] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
[   12.312066] nouveau :01:00.0: putting AGP V2 device into 4x mode
[   12.312162] agpgart-intel :00:00.0: AGP 3.0 bridge
[   12.312182] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
back to 2.x
[   12.312195] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
[   12.312230] nouveau :01:00.0: putting AGP V2 device into 4x mode
[   12.312247] nouveau :01:00.0: tmr: unknown input clock freq
[   12.318341] nouveau :01:00.0: fb: 32 MiB SDRAM
[   12.76] [TTM] Zone  kernel: Available graphics memory: 385048 KiB
[   12.92] [TTM] Initializing pool allocator
[   12.333434] nouveau :01:00.0: DRM: VRAM: 31 MiB
[   12.333443] nouveau :01:00.0: DRM: GART: 128 MiB
[   12.333453] nouveau :01:00.0: DRM: BMP version 5.6
[   12.333460] nouveau :01:00.0: DRM: No DCB data found in VBIOS
[   12.335355] nouveau :01:00.0: DRM: MM: using M2MF for buffer copies
[   12.335443] nouveau :01:00.0: bios: DCB table not found
[   12.336033] nouveau :01:00.0: DRM: Saving VGA fonts
[   12.376420] nouveau :01:00.0: DRM: No DCB data found in VBIOS
[   12.410397] nouveau :01:00.0: DRM: allocated 1280x1024 fb: 0x4000, bo 
b68d2ac4
[   12.441217] fbcon: nouveaudrmfb (fb0) is primary device
[   12.591964] Console: switching to colour frame buffer device 160x64
[   12.593876] nouveau :01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
[   12.594944] [drm] Initialized nouveau 1.3.1 20120801 for :01:00.0 on 
minor 0

BTW, the 5.8 kernel (which appeared today in Debian backports) is broken the same 
way.
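
A rough user-space sketch of the idea discussed above, not the actual fix. The function names (chip_has_endian_switch, endianness_ok) are made up for illustration, the 0x11 cutoff is taken literally from "pre-NV11" in the quoted mail, and treating a register value of 0 as little-endian is an assumption. The point is only that the register at 0x4 should not be trusted on chips that do not implement it:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption from the thread: chips before NV11 lack the endian switch
 * register. NV15 and NV1A are uncertain there; this sketch uses a plain
 * numeric cutoff and does not special-case them.
 */
bool chip_has_endian_switch(uint8_t chipset)
{
	return chipset >= 0x11;
}

/*
 * Returns true when the GPU can be used by a host of the given endianness.
 * boot1 models the value read from register 0x4 and is ignored on chips
 * that do not implement it. Assumed for illustration: 0 means little-endian.
 */
bool endianness_ok(uint8_t chipset, uint32_t boot1, bool host_big_endian)
{
	if (!chip_has_endian_switch(chipset))
		return !host_big_endian; /* treated as little-endian only */

	return (boot1 != 0) == host_big_endian;
}
```

With this model, an NV05 on x86 passes regardless of what the read of 0x4 returns, which matches the behaviour obtained by forcing the function to return true.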

> Aha, it's documented in rnndb:
> 
> https://github.com/envytools/envytools/blob/master/rnndb/bus/pmc.xml
> 
> 
>   -ilia
> 


-- 
Ondrej Zary


nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-10 Thread Ondrej Zary
Hello,
I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working:
[0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) 
8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 22:31:40 
CEST 2020
...
[   14.771464] nouveau :01:00.0: GPU not supported on big-endian
[   14.771782] nouveau: probe of :01:00.0 failed with error -38

big-endian? WTF? The machine is x86.

It works fine with Debian 5.7 kernel (5.7.10-1~bpo10+1):
[0.00] Linux version 5.7.0-0.bpo.2-686 (debian-ker...@lists.debian.org) 
(gcc version 8.3.0 (Debian 8.3.0-6), GNU ld (GNU Binutils for Debian) 2.31.1) 
#1 SMP Debian 5.7.10-1~bpo10+1 (2020-07-30)
...
[   23.266196] nouveau :01:00.0: NVIDIA NV05 (20154000)
[   23.288582] nouveau :01:00.0: bios: version 02.05.20.02.00
[   23.288869] nouveau :01:00.0: bios: DCB table not found
[   23.289595] nouveau :01:00.0: bios: DCB table not found
[   23.289956] nouveau :01:00.0: bios: DCB table not found
[   23.290015] nouveau :01:00.0: bios: DCB table not found
[   23.290215] agpgart-intel :00:00.0: AGP 3.0 bridge
[   23.290287] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
back to 2.x
[   23.290351] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
[   23.290430] nouveau :01:00.0: putting AGP V2 device into 4x mode
[   23.290565] agpgart-intel :00:00.0: AGP 3.0 bridge
[   23.290627] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
back to 2.x
[   23.290690] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
[   23.290768] nouveau :01:00.0: putting AGP V2 device into 4x mode
[   23.290830] nouveau :01:00.0: tmr: unknown input clock freq
[   23.293026] nouveau :01:00.0: fb: 32 MiB SDRAM
[   23.301269] [TTM] Zone  kernel: Available graphics memory: 382728 KiB
[   23.301327] [TTM] Initializing pool allocator
[   23.301414] nouveau :01:00.0: DRM: VRAM: 31 MiB
[   23.301465] nouveau :01:00.0: DRM: GART: 128 MiB
[   23.301518] nouveau :01:00.0: DRM: BMP version 5.6
[   23.301570] nouveau :01:00.0: DRM: No DCB data found in VBIOS
[   23.303594] nouveau :01:00.0: DRM: MM: using M2MF for buffer copies
[   23.303719] nouveau :01:00.0: bios: DCB table not found
[   23.304904] nouveau :01:00.0: DRM: Saving VGA fonts
[   23.349089] nouveau :01:00.0: DRM: No DCB data found in VBIOS
[   23.349681] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   23.383066] nouveau :01:00.0: DRM: allocated 1280x1024 fb: 0x4000, bo 
b10d2f17
[   23.413903] fbcon: nouveaudrmfb (fb0) is primary device
[   23.569851] Console: switching to colour frame buffer device 160x64
[   23.571050] nouveau :01:00.0: fb0: nouveaudrmfb frame buffer device


-- 
Ondrej Zary


Re: nouveau: System crashes with NVIDIA GeForce 8600 GT

2019-08-19 Thread Ondrej Zary
On Saturday 17 August 2019 14:50:33 Alex Dewar wrote:
> Hi all,
>
> I'm getting frequent system crashes (every few hours or so) and it seems
> that the nouveau driver is causing the issue (dmesg output below). I see it
> with both v5.2.8 and the v4.19 LTS kernel. Sometimes the system
> completely freezes and sometimes seemingly just the nouveau driver goes
> down. The screen freezes and colours stream across it. Often after I
> reboot the BIOS logo is mangled too until the first modeset. The crash
> seems to be happening in nv50_fb_intr() in nv50.c.
>
> I'm not sure if this is related, but the system now often freezes on
> suspend or resume since I switched from using the old (recently
> abandoned) proprietary NVIDIA drivers, again both with 5.2 and 4.19
> kernels. Blacklisting the nouveau driver doesn't seem to fix it however,
> though I guess the graphics card could still be causing issues in some
> other way? I never had problems with suspend and resume before.
>
> Any suggestions about how I could debug this further?

Is it really a software problem (does it still work fine with the proprietary 
driver)?
These nVidia chips are known to fail, and the corrupted BIOS logo suggests that.

-- 
Ondrej Zary

Re: [PATCH] video: fbdev: remove dead igafb driver

2017-10-18 Thread Ondrej Zary
On Wednesday 18 October 2017, David Miller wrote:
> From: John Paul Adrian Glaubitz <glaub...@physik.fu-berlin.de>
> Date: Wed, 18 Oct 2017 15:14:27 +0200
>
> > Hi Bartlomiej!
> >
> > On 10/18/2017 02:56 PM, Bartlomiej Zolnierkiewicz wrote:
> >> igafb driver hasn't compiled since at least kernel v2.6.34 as
> >> commit 6016a363f6b5 ("of: unify phandle name in struct device_node")
> >> missed updating igafb.c to use dp->phandle instead of dp->node.
> >
> > Would it take a lot of work to port the driver to the new interface?
> >
> > I'm not sure which SPARC machines use this particular framebuffer, but
> > my plans are to fix up all these old framebuffer drivers. I have
> > already
> > received several Amiga (Zorro) graphics cards for testing the updated
> > drivers on Amiga.
> >
> > It could be that I actually have this particular SPARC framebuffer in
> > my hardware collection.
>
> Unless you have a 32-bit sparc laptop, you don't have a machine that
> will use this driver.

There are also some x86 PCI cards using this chip.

-- 
Ondrej Zary


[Nouveau] [PATCH] [resend] nouveau: Disable AGP for SiS 761

2015-09-30 Thread Ondrej Zary
On Wednesday 30 September 2015, Samuel Pitoiset wrote:
> This patch has been merged by Ben yesterday.
>
> http://cgit.freedesktop.org/~darktama/nouveau/commit/?id=8c713f90a63ffca10d122af09d439f3409c933ed
>
> Why did you send a new version? Is the previous patch wrong?

Oops, sorry. Didn't notice it was merged.

-- 
Ondrej Zary


[PATCH] [resend] nouveau: Disable AGP for SiS 761

2015-09-30 Thread Ondrej Zary
SiS 761 chipset does not support AGP cards but has AGP capability (for
the onboard video). At least PC Chips A31G board using this chipset has
an AGP-like AGPro slot that's wired to the PCI bus. Enabling AGP will
fail (GPU lockup and software fbcon, X11 hangs).

Add support for matching just the host bridge in nvkm_device_agp_quirks
and add entry for SiS 761 with mode 0 (AGP disabled).

Signed-off-by: Ondrej Zary 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
index 814cb51..385a90f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
@@ -35,6 +35,8 @@ static const struct nvkm_device_agp_quirk
 nvkm_device_agp_quirks[] = {
/* VIA Apollo PRO133x / GeForce FX 5600 Ultra - fdo#20341 */
{ PCI_VENDOR_ID_VIA, 0x0691, PCI_VENDOR_ID_NVIDIA, 0x0311, 2 },
+   /* SiS 761 does not support AGP cards, use PCI mode */
+   { PCI_VENDOR_ID_SI, 0x0761, PCI_ANY_ID, PCI_ANY_ID, 0 },
{},
 };

@@ -137,8 +139,10 @@ nvkm_agp_ctor(struct nvkm_pci *pci)
while (quirk->hostbridge_vendor) {
if (info.device->vendor == quirk->hostbridge_vendor &&
info.device->device == quirk->hostbridge_device &&
-   pci->pdev->vendor == quirk->chip_vendor &&
-   pci->pdev->device == quirk->chip_device) {
+   (quirk->chip_vendor == (u16)PCI_ANY_ID ||
+   pci->pdev->vendor == quirk->chip_vendor) &&
+   (quirk->chip_device == (u16)PCI_ANY_ID ||
+   pci->pdev->device == quirk->chip_device)) {
nvkm_info(subdev, "forcing default agp mode to %dX, "
      "use NvAGP= to override\n",
  quirk->mode);
-- 
Ondrej Zary
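
The wildcard match added by the patch above can be tried outside the kernel.
What follows is only a minimal userspace sketch, not the nouveau code: the
struct layout, the ANY_ID macro and agp_quirk_mode() are stand-ins invented
for illustration, and the vendor IDs are written out as numbers instead of
the kernel's PCI_VENDOR_ID_* constants. It shows why the (u16) cast is
needed: PCI_ANY_ID is (~0), an int with all bits set, while the quirk fields
are 16 bits wide, so an uncast comparison would never be true.

```c
#include <stdint.h>

/* Same value as the kernel's PCI_ANY_ID: (~0), all bits set in an int. */
#define ANY_ID (~0)

/* Hypothetical userspace copy of the quirk entry layout used in the patch. */
struct agp_quirk {
	uint16_t hostbridge_vendor;
	uint16_t hostbridge_device;
	uint16_t chip_vendor;
	uint16_t chip_device;
	int mode;
};

static const struct agp_quirk quirks[] = {
	/* VIA Apollo PRO133x / GeForce FX 5600 Ultra: exact chip match */
	{ 0x1106, 0x0691, 0x10de, 0x0311, 2 },
	/* SiS 761: any chip behind this host bridge, AGP disabled */
	{ 0x1039, 0x0761, (uint16_t)ANY_ID, (uint16_t)ANY_ID, 0 },
	{ 0 },
};

/* Returns the forced AGP mode, or -1 when no quirk entry applies. */
int agp_quirk_mode(uint16_t hb_vendor, uint16_t hb_device,
		   uint16_t chip_vendor, uint16_t chip_device)
{
	const struct agp_quirk *q;

	for (q = quirks; q->hostbridge_vendor; q++) {
		/*
		 * Without the cast, q->chip_vendor (0xffff promoted to
		 * int 65535) would be compared against -1 and never match.
		 */
		if (hb_vendor == q->hostbridge_vendor &&
		    hb_device == q->hostbridge_device &&
		    (q->chip_vendor == (uint16_t)ANY_ID ||
		     chip_vendor == q->chip_vendor) &&
		    (q->chip_device == (uint16_t)ANY_ID ||
		     chip_device == q->chip_device))
			return q->mode;
	}
	return -1;
}
```

With this table, a TNT2 (10de:002d) behind the SiS 761 bridge (1039:0761)
gets mode 0, while the same card behind the VIA bridge matches nothing
because that entry still requires the exact chip.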



No more new fbdev drivers, please

2015-09-25 Thread Ondrej Zary
On Friday 25 September 2015, Aaro Koskinen wrote:
> Hi,
>
> On Thu, Sep 24, 2015 at 03:27:01PM +0300, Tomi Valkeinen wrote:
> > fbdev is (more or less) maintained, but it's a deprecated framework. All
> > new Linux display drivers should be done on DRM.
> >
> > So let's not add any more new fbdev drivers.
> >
> > I will continue to maintain the current fbdev drivers, and I don't mind
> > adding some new features to those current drivers, as long as the amount
> > of code required to add the features stays sensible.
> >
> > I see we have three fbdev drivers in staging: xgifb, fbtft and sm750fb,
> > and the question is what to do with those.
>
> I was still planning to work on xgifb as I need it on some systems for
> the console.

xgifb supports these devices:
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_20
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_27
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_40
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_42

Two of them are already supported by sisfb:
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_20
PCI_VENDOR_ID_XGI, PCI_DEVICE_ID_XGI_40

So I think that support for the remaining two (and missing features, if any) 
should be added to sisfb.

-- 
Ondrej Zary


No more new fbdev drivers, please

2015-09-24 Thread Ondrej Zary
On Thursday 24 September 2015 17:59:12 Daniel Vetter wrote:
> On Thu, Sep 24, 2015 at 11:21:15AM -0400, Austin S Hemmelgarn wrote:
> > On 2015-09-24 08:46, Thomas Petazzoni wrote:
> > >Hello,
> > >
> > >On Thu, 24 Sep 2015 15:27:01 +0300, Tomi Valkeinen wrote:
> > >>fbdev is (more or less) maintained, but it's a deprecated framework.
> > >> All new Linux display drivers should be done on DRM.
> > >>
> > >>So let's not add any more new fbdev drivers.
> > >>
> > >>I will continue to maintain the current fbdev drivers, and I don't mind
> > >>adding some new features to those current drivers, as long as the
> > >> amount of code required to add the features stays sensible.
> > >>
> > >>I see we have three fbdev drivers in staging: xgifb, fbtft and sm750fb,
> > >>and the question is what to do with those.
> > >>
> > >>xgifb was added in 2010, and is still in staging.
> > >>
> > >>fbtft looks like maybe some kind of framework on top of fbdev, with
> > >>fbtft specific subdrivers... I didn't look at it in detail, but my gut
> > >>says "never".
> > >
> > >fbtft mainly drives some very simple I2C-based or SPI-based displays,
> > >and DRM is I believe overkill for such displays. Last time I talked
> > >with Laurent Pinchart about such drivers, I believe he said that such
> > >simple drivers could probably continue to use the fbdev subsystem.
> >
> > I have to agree, using DRM _really_ doesn't make sense for these, the
> > devices in question are (AFAIK) simple I2C or SPI connected frame-buffer
> > chips that are hooked up to equally simple TFT displays.  There's no 3d
> > acceleration at all from what I can tell, there's _very_ limited 2d
> > acceleration, and most of the stuff that the DRM framework provides
> > call-backs for would have to be done on the CPU anyway.  On top of that,
> > it's targeted at small embedded systems with limited memory, and the DRM
> > framework is by no-means lightweight (TBH, fbdev isn't really either, but
> > it's much more light weight than DRM).
>
> See my other mail, but you can write very simple drm drivers. And if
> there's really a bloat problem for small systems we can add Kconfig knobs
> to throw out everything not needed for simple drivers. The only problem
> really is that everyone with such simple drivers doesn't even consider drm
> "because I don't have a desktop gpu" which is just silly - drm has become
> rather flexible. And that's essentially why writing simple drm drivers
> still has a bit too much boilerplate, since no one yet bothered to add a
> bit of helper support needed.

Is there a simple way to convert existing fbdev drivers to DRM? Let's say I 
want to convert tridentfb to DRM, keeping the 2D acceleration (pan, fillrect, 
copyarea, imageblit) to be usable by the console (and maybe extend it to X11 
using some generic 2D driver?)

-- 
Ondrej Zary


AGP cards in PCI mode (fake slots like AGPro, AGP Express, AGI, AGX, XGP)

2015-09-15 Thread Ondrej Zary
On Monday 14 September 2015 04:31:43 Alex Deucher wrote:
> On Sun, Sep 13, 2015 at 2:57 PM, Ondrej Zary wrote:
> > Hello,
> > I have a PC Chips A31G board with AGPro slot and found that nouveau does
> > not work properly with it. Console works but reverts to software mode,
> > X11 hangs with mouse cursor only.
> >
> > The slot is physically AGP 1.5V but is wired to PCI bus as the chipset
> > (SiS 761) does not support AGP cards. To further complicate things, the
> > chipset has AGP capability - but only for the integrated video. You can
> > see that in the lspci output below - the AGP card is on bus 0 and SiS
> > card on bus 1 (AGP bus behind the AGP bridge). The SiS card is not used
> > (can be disabled in BIOS but it does not improve things - as the AGP
> > capability of the host bridge remains active).
> >
> > As seen in dmesg below, kernel tries to set AGP 8x mode for all AGP
> > devices, including the AGP 4x TNT2 card which is not even connected to
> > the AGP bridge.
> >
> > Setting nouveau.agpmode=0 makes it work but how can we make this case
> > work automatically?
> >
> > Radeon driver does some "ring test" and if it fails, it disables AGP mode
> > and retries. That seems to work a bit (with R7000 but not with R7200).
>
> You can boot with radeon.agpmode=-1 to force pci mode.

Found out that the autoswitch to PCI mode works correctly. Radeon 7000 works
without any parameters (first ring test fails, then the driver disables AGP
and the second ring test succeeds).

Radeon 7200 does not work even with agpmode=-1: the ring test fails (both ring
tests fail if booted without the agpmode=-1 parameter):

[   13.111087] [drm] radeon kernel modesetting enabled.
[   13.679593] [drm] initializing kernel modesetting (R100 0x1002:0x5144 0x1002:0x02AA).
[   13.679669] [drm] Forcing AGP to PCI mode
[   13.679737] [drm] register mmio base: 0xFEA0
[   13.679790] [drm] register mmio size: 524288
[   13.683710] radeon :00:05.0: VRAM: 128M 0xD000 - 0xD7FF (32M used)
[   13.683782] radeon :00:05.0: GTT: 512M 0xB000 - 0xCFFF
[   13.684802] [drm] Detected VRAM RAM=128M, BAR=128M
[   13.684856] [drm] RAM width 128bits DDR
[   13.687628] [TTM] Zone  kernel: Available graphics memory: 240180 kiB
[   13.687686] [TTM] Initializing pool allocator
[   13.687778] [drm] radeon: 32M of VRAM memory ready
[   13.687831] [drm] radeon: 512M of GTT memory ready.
[   13.687904] [drm] GART: num cpu pages 131072, num gpu pages 131072
[   13.690406] [drm] PCI GART of 512M enabled (table at 0x1A68).
[   13.690510] radeon :00:05.0: WB disabled
[   13.690566] radeon :00:05.0: fence driver on ring 0 use gpu addr 0xb000 and cpu addr 0xc0051000
[   13.690638] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   13.690693] [drm] Driver supports precise vblank timestamp query.
[   13.690780] [drm] radeon: irq initialized.
[   13.690844] [drm] Loading R100 Microcode
[   14.575464] [drm] radeon: ring at 0xB0001000
[   14.750114] [drm:r100_ring_test [radeon]] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
[   14.750423] [drm:r100_cp_init [radeon]] *ERROR* radeon: cp isn't working (-22).
[   14.750631] radeon :00:05.0: failed initializing CP (-22).
[   14.750770] radeon :00:05.0: Disabling GPU acceleration
[   14.925219] [drm:r100_cp_fini [radeon]] *ERROR* Wait for CP idle timeout, shutting down CP.
[   14.927652] [drm] radeon: cp finalized
[   14.933949] [drm] Radeon Display Connectors
[   14.934079] [drm] Connector 0:
[   14.934196] [drm]   DVI-I-1
[   14.934310] [drm]   HPD1
[   14.934424] [drm]   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
[   14.934563] [drm]   Encoders:
[   14.934679] [drm] CRT1: INTERNAL_DAC1
[   14.934804] [drm] DFP1: INTERNAL_TMDS1
[   14.991877] [drm] fb mappable at 0xD004
[   14.992023] [drm] vram apper at 0xD000
[   14.992147] [drm] size 1310720
[   14.992264] [drm] fb depth is 8
[   14.992381] [drm]pitch is 1280
[   14.993653] fbcon: radeondrmfb (fb0) is primary device
[   15.072448] Console: switching to colour frame buffer device 160x64
[   15.096165] radeon :00:05.0: fb0: radeondrmfb frame buffer device
[   15.096581] [drm] Initialized radeon 2.43.0 20080528 for :00:05.0 on minor 0

-- 
Ondrej Zary
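
The fallback described in this message (ring test in AGP mode, then AGP
disabled and a second ring test in PCI mode) can be sketched roughly as
follows. This is a hypothetical userspace model, not the radeon driver's
code: enum bus_mode, ring_test_fn, bring_up_ring() and the two fake ring
tests are names made up for the sketch.

```c
#include <stdbool.h>

/* Hypothetical bus modes; not the radeon driver's own types. */
enum bus_mode { BUS_AGP, BUS_PCI };

/* Hypothetical callback: returns true if the ring test passes in this mode. */
typedef bool (*ring_test_fn)(enum bus_mode mode);

/*
 * Try AGP first unless force_pci is set (the agpmode=-1 case), then fall
 * back to PCI once. Returns 0 and stores the working mode in *mode_out,
 * or -1 when both attempts fail and acceleration has to be disabled.
 */
int bring_up_ring(ring_test_fn test, bool force_pci, enum bus_mode *mode_out)
{
	if (!force_pci && test(BUS_AGP)) {
		*mode_out = BUS_AGP;
		return 0;
	}
	if (test(BUS_PCI)) {
		*mode_out = BUS_PCI;
		return 0;
	}
	return -1;
}

/* Stand-in for the Radeon 7000 case: only PCI mode passes the ring test. */
bool pci_only_slot(enum bus_mode mode)
{
	return mode == BUS_PCI;
}

/* Stand-in for the Radeon 7200 case: the ring test fails in both modes. */
bool never_works(enum bus_mode mode)
{
	(void)mode;
	return false;
}
```

In this model the Radeon 7000 ends up in PCI mode after one failed AGP
attempt, and the Radeon 7200 returns -1 with or without force_pci, which
corresponds to the "Disabling GPU acceleration" line in the log above.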


AGP cards in PCI mode (fake slots like AGPro, AGP Express, AGI, AGX, XGP)

2015-09-14 Thread Ondrej Zary
On Sunday 13 September 2015 21:12:25 Ilia Mirkin wrote:
> On Sun, Sep 13, 2015 at 2:57 PM, Ondrej Zary wrote:
> > Hello,
> > I have a PC Chips A31G board with AGPro slot and found that nouveau does
> > not work properly with it. Console works but reverts to software mode,
> > X11 hangs with mouse cursor only.
> >
> > The slot is physically AGP 1.5V but is wired to PCI bus as the chipset
> > (SiS 761) does not support AGP cards. To further complicate things, the
> > chipset has AGP capability - but only for the integrated video. You can
> > see that in the lspci output below - the AGP card is on bus 0 and SiS
> > card on bus 1 (AGP bus behind the AGP bridge). The SiS card is not used
> > (can be disabled in BIOS but it does not improve things - as the AGP
> > capability of the host bridge remains active).
>
> I believe we can handle it with a blacklist. If the chipset just
> doesn't support AGP at all, we should just set agpmode=0 irrespective
> of the card plugged in, right?
>
> Shouldn't the agpgart know about this and not even allow any setting
> at all? This is where we get the idea to set 8x AGP from. See
> drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c for details.

The chipset does not support AGP slot but supports AGP for the integrated
video. So it shouldn't be completely disabled.

> The alternative is to add to nvkm_device_agp_quirks, and just add
> something that matches just the host bridge vendor/device, ignoring
> the chip.

Something like this?

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
index 814cb51..385a90f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/agp.c
@@ -35,6 +35,8 @@ static const struct nvkm_device_agp_quirk
 nvkm_device_agp_quirks[] = {
/* VIA Apollo PRO133x / GeForce FX 5600 Ultra - fdo#20341 */
{ PCI_VENDOR_ID_VIA, 0x0691, PCI_VENDOR_ID_NVIDIA, 0x0311, 2 },
+   /* SiS 761 does not support AGP cards, use PCI mode */
+   { PCI_VENDOR_ID_SI, 0x0761, PCI_ANY_ID, PCI_ANY_ID, 0 },
{},
 };

@@ -137,8 +139,10 @@ nvkm_agp_ctor(struct nvkm_pci *pci)
while (quirk->hostbridge_vendor) {
if (info.device->vendor == quirk->hostbridge_vendor &&
info.device->device == quirk->hostbridge_device &&
-   pci->pdev->vendor == quirk->chip_vendor &&
-   pci->pdev->device == quirk->chip_device) {
+   (quirk->chip_vendor == (u16)PCI_ANY_ID ||
+   pci->pdev->vendor == quirk->chip_vendor) &&
+   (quirk->chip_device == (u16)PCI_ANY_ID ||
+   pci->pdev->device == quirk->chip_device)) {
    nvkm_info(subdev, "forcing default agp mode to %dX, "
  "use NvAGP= to override\n",
  quirk->mode);
-- 
Ondrej Zary


AGP cards in PCI mode (fake slots like AGPro, AGP Express, AGI, AGX, XGP)

2015-09-13 Thread Ondrej Zary
Hello,
I have a PC Chips A31G board with AGPro slot and found that nouveau does not
work properly with it. Console works but reverts to software mode, X11 hangs
with mouse cursor only.

The slot is physically AGP 1.5V but is wired to PCI bus as the chipset (SiS
761) does not support AGP cards. To further complicate things, the chipset has
AGP capability - but only for the integrated video. You can see that in the
lspci output below - the AGP card is on bus 0 and SiS card on bus 1 (AGP bus
behind the AGP bridge). The SiS card is not used (can be disabled in BIOS but
it does not improve things - as the AGP capability of the host bridge remains
active).

As seen in dmesg below, kernel tries to set AGP 8x mode for all AGP devices,
including the AGP 4x TNT2 card which is not even connected to the AGP bridge.

Setting nouveau.agpmode=0 makes it work but how can we make this case work
automatically?

Radeon driver does some "ring test" and if it fails, it disables AGP mode and
retries. That seems to work a bit (with R7000 but not with R7200).

But I think that we shouldn't even touch the AGP registers of other devices
in this case as it might break the integrated video.
But how can we know that the card is connected to the AGP bus? There does not
seem to be a reliable way...

dmesg:
[   22.015411] nouveau  [  DEVICE][:00:05.0] BOOT0  : 0x20154000
[   22.015473] nouveau  [  DEVICE][:00:05.0] Chipset: NV05 (NV05)
[   22.015527] nouveau  [  DEVICE][:00:05.0] Family : NV04
[   22.041131] nouveau  [   VBIOS][:00:05.0] using image from PRAMIN
[   22.041194] nouveau  [   VBIOS][:00:05.0] BMP version 5.6
[   22.041382] nouveau  [   VBIOS][:00:05.0] version 02.05.20.02.00
[   22.041561] nouveau W[   VBIOS][:00:05.0] DCB table not found
[   22.041867] nouveau W[   VBIOS][:00:05.0] DCB table not found
[   22.042079] nouveau W[   VBIOS][:00:05.0] DCB table not found
[   22.042133] nouveau W[   VBIOS][:00:05.0] DCB table not found
[   22.042245] nouveau W[  PTIMER][:00:05.0] unknown input clock freq
[   22.042306] nouveau  [ PFB][:00:05.0] RAM type: SDRAM
[   22.042360] nouveau  [ PFB][:00:05.0] RAM size: 32 MiB
[   22.042413] nouveau  [ PFB][:00:05.0]ZCOMP: 0 tags
[   22.047063] nouveau  [ CLK][:00:05.0] --:
[   22.047137] nouveau W[   VBIOS][:00:05.0] DCB table not found
[   22.047220] agpgart-amd64 :00:00.0: AGP 3.0 bridge
[   22.047281] agpgart: systemd-udevd tried to set rate=x12. Setting to AGP3 x8 mode.
[   22.047348] agpgart-amd64 :00:00.0: putting AGP V3 device into 8x mode
[   22.047425] nouveau :00:05.0: putting AGP V3 device into 8x mode
[   22.047503] pci :01:00.0: putting AGP V3 device into 8x mode
[   22.047632] [TTM] Zone  kernel: Available graphics memory: 239112 kiB
[   22.047685] [TTM] Initializing pool allocator
[   22.047744] [TTM] Initializing DMA pool allocator
[   22.047814] nouveau  [ DRM] VRAM: 31 MiB
[   22.047865] nouveau  [ DRM] GART: 64 MiB
[   22.047918] nouveau  [ DRM] BMP version 5.6
[   22.047971] nouveau W[ DRM] No DCB data found in VBIOS
[   22.051250] nouveau  [ DRM] Saving VGA fonts
[   22.099912] nouveau W[ DRM] No DCB data found in VBIOS
[   22.101006] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   22.101061] [drm] Driver supports precise vblank timestamp query.
[   22.102645] nouveau  [ DRM] MM: using M2MF for buffer copies
[   22.133344] nouveau  [ DRM] allocated 1280x1024 fb: 0x4000, bo db2d6c00
[   22.133545] fbcon: nouveaufb (fb0) is primary device
[   22.369387] nouveau E[ DRM] GPU lockup - switching to software fbcon
[   22.378443] Console: switching to colour frame buffer device 160x64
[   22.395704] nouveau :00:05.0: fb0: nouveaufb frame buffer device
[   22.395808] nouveau :00:05.0: registered panic notifier
[   22.396783] [drm] Initialized nouveau 1.2.2 20120801 for :00:05.0 on minor 0

lspci -vvnn:
00:00.0 Host bridge [0600]: Silicon Integrated Systems [SiS] 761/M761 Host [1039:0761] (rev 01)
Subsystem: Elitegroup Computer Systems Device [1019:0131]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- TAbort- 
Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [a4] HyperTransport: UnitID Clumping
...
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation NV5 [Riva TNT2 Model 64 / Model 64 Pro] [10de:002d] (rev 15) (prog-if 00 [VGA controller])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- 

[PATCH] Another card with wrong primary dac adj

2013-07-20 Thread Ondrej Zary
On Friday 19 July 2013 23:50:50 Alex Deucher wrote:
> On Fri, Jul 19, 2013 at 3:08 PM, Ondrej Zary wrote:
> > Hello,
> > got another card with "too bright" problem:
> > Sapphire Radeon VE 7000 DDR (VGA+S-Video)
> >
> > lspci -vnn:
> > 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD]
> > nee ATI RV100 QY [Radeon 7000/VE] [1002:5159] (prog-if 00 [VGA
> > controller]) Subsystem: PC Partner Limited Sapphire Radeon VE 7000 DDR
> > [174b:7c28]
> >
> > The patch below fixes the problem for this card.
>
> Applied.
>
> > But I don't like the blacklist, couldn't some heuristic be used instead?
>
> How about the attached patch?

Thanks, it fixes my card without the quirk. So if it does not break anything, 
it can be used instead of my patch.

> > The interesting thing is that the manufacturer is the same as the other
> > card needing the same quirk. I wonder how many different types are broken
> > this way.
>
> So far we only have two quirks, so it doesn't seem that widespread.
>
> Alex
>
> > The "wrong" ps2_pdac_adj value that comes from BIOS on this card is
> > 0x300.
> >
> > 
> > drm/radeon: Add primary dac adj quirk for Sapphire Radeon VE 7000 DDR
> >
> > Values from BIOS are wrong, causing too bright colors.
> > Use default values instead.
> >
> > Signed-off-by: Ondrej Zary 
> > ---
> >  drivers/gpu/drm/radeon/radeon_combios.c |8 ++--
> >  1 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
> > index 78edadc..8528b81 100644
> > --- a/drivers/gpu/drm/radeon/radeon_combios.c
> > +++ b/drivers/gpu/drm/radeon/radeon_combios.c
> > @@ -971,10 +971,14 @@ struct radeon_encoder_primary_dac *radeon_combios_get_primary_dac_info(struct
> > }
> >
> > /* quirks */
> > +   /* Radeon 7000 (RV100) */
> > +   if (((dev->pdev->device == 0x5159) &&
> > +   (dev->pdev->subsystem_vendor == 0x174B) &&
> > +   (dev->pdev->subsystem_device == 0x7c28)) ||
> > /* Radeon 9100 (R200) */
> > -   if ((dev->pdev->device == 0x514D) &&
> > +  ((dev->pdev->device == 0x514D) &&
> > (dev->pdev->subsystem_vendor == 0x174B) &&
> > -   (dev->pdev->subsystem_device == 0x7149)) {
> > +   (dev->pdev->subsystem_device == 0x7149))) {
> > /* vbios value is bad, use the default */
> > found = 0;
> > }
> > --
> > Ondrej Zary
> > ___
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel


-- 
Ondrej Zary


[PATCH] Another card with wrong primary dac adj

2013-07-19 Thread Ondrej Zary
Hello,
got another card with "too bright" problem:
Sapphire Radeon VE 7000 DDR (VGA+S-Video)

lspci -vnn:
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI 
RV100 QY [Radeon 7000/VE] [1002:5159] (prog-if 00 [VGA controller])
Subsystem: PC Partner Limited Sapphire Radeon VE 7000 DDR [174b:7c28]

The patch below fixes the problem for this card.
But I don't like the blacklist, couldn't some heuristic be used instead?
The interesting thing is that the manufacturer is the same as the other card
needing the same quirk. I wonder how many different types are broken this way.

The "wrong" ps2_pdac_adj value that comes from BIOS on this card is 0x300.


drm/radeon: Add primary dac adj quirk for Sapphire Radeon VE 7000 DDR

Values from BIOS are wrong, causing too bright colors.
Use default values instead.

Signed-off-by: Ondrej Zary 
---
 drivers/gpu/drm/radeon/radeon_combios.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
index 78edadc..8528b81 100644
--- a/drivers/gpu/drm/radeon/radeon_combios.c
+++ b/drivers/gpu/drm/radeon/radeon_combios.c
@@ -971,10 +971,14 @@ struct radeon_encoder_primary_dac *radeon_combios_get_primary_dac_info(struct
}

/* quirks */
+   /* Radeon 7000 (RV100) */
+   if (((dev->pdev->device == 0x5159) &&
+   (dev->pdev->subsystem_vendor == 0x174B) &&
+   (dev->pdev->subsystem_device == 0x7c28)) ||
/* Radeon 9100 (R200) */
-   if ((dev->pdev->device == 0x514D) &&
+  ((dev->pdev->device == 0x514D) &&
(dev->pdev->subsystem_vendor == 0x174B) &&
-   (dev->pdev->subsystem_device == 0x7149)) {
+   (dev->pdev->subsystem_device == 0x7149))) {
    /* vbios value is bad, use the default */
found = 0;
}
-- 
Ondrej Zary
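
Since the nested if grows with every affected card, the same device and
subsystem ID match could also be written as a table lookup. The following is
only a hedged userspace sketch of that idea, assuming nothing beyond the two
ID triples quoted in the patch; struct pdac_quirk and
pdac_bios_value_is_bad() are hypothetical names, not functions from
radeon_combios.c, and this is not the heuristic asked about above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: PCI device ID plus subsystem vendor/device. */
struct pdac_quirk {
	uint16_t device;
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
};

/* The two cards named in the patch; both have subsystem vendor 0x174B. */
static const struct pdac_quirk pdac_quirks[] = {
	{ 0x5159, 0x174B, 0x7c28 },	/* Radeon 7000 (RV100) */
	{ 0x514D, 0x174B, 0x7149 },	/* Radeon 9100 (R200) */
};

/* Returns true when the BIOS primary DAC values should be ignored. */
bool pdac_bios_value_is_bad(uint16_t device, uint16_t sub_vendor,
			    uint16_t sub_device)
{
	size_t i;

	for (i = 0; i < sizeof(pdac_quirks) / sizeof(pdac_quirks[0]); i++) {
		if (device == pdac_quirks[i].device &&
		    sub_vendor == pdac_quirks[i].subsystem_vendor &&
		    sub_device == pdac_quirks[i].subsystem_device)
			return true;
	}
	return false;
}
```

A caller would then reset found to 0 whenever the lookup returns true, and
adding another card becomes a one-line table entry.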


Future desktop on dumb frame buffers?

2011-03-21 Thread Ondrej Zary
On Monday 21 March 2011 20:34:38 Corbin Simpson wrote:
> On Mon, Mar 21, 2011 at 12:25 PM, Jesse Barnes wrote:
> > On Mon, 21 Mar 2011 19:19:43 +
> >
> > timofonic timofonic wrote:
> >> So if KMS is so cool and provides many advantages over fbdev and
> >> such... Why isn't more widely used instead of still relying on fbdev?
> >> Why still using fbdev emulation (that is partial and somewhat broken,
> >> it seems) instead using KMS directly?
> >
> > Used by what?  All three major GPU device classes have KMS support
> > (Intel, ATI, and nVidia).  If you want it for a particular device, you
> > can always port it over.
> >
> > As for fbdev emulation, what's still using it?  There's nothing
> > stopping projects from converting over; X and Wayland can already
> > handle KMS APIs just fine.
> >
> >> I know the graphic driver situation is quite bad on Linux, especially
> >> on the embedded world. Fbdev seems is still quite used there by binary
> >> blob drivers.
> >
> > Probably for a couple of reasons:
> >  1) inertia: fbdev has been around a lot longer, and provides most of
> >  what embedded devices need anyway
> >  2) feature set: why bother doing a full KMS driver if you're not
> >  going to use any of the additional features it would provide (output
> >  management, memory management, execution management)
>
> Related: We are still missing basic userspace tools (kmsset, e.g.),
> some kind of direct KMS console (kmscon would work, if it existed),
> and an xf86-video-modesetting which compiles and works (this is
> actually possible now, with some patches that landed in 2.6.38 for
> generic KMS access.)

This looks interesting. If existing *fb drivers could be easily converted to 
KMS (including 2D acceleration) and then used in X with a common driver, it 
would be great. Let's say, convert cyber2000fb driver to KMS and use it in X 
with 2D acceleration.

> This is important to me, as the various old drivers I've been hacking
> on won't be accepted upstream without some sort of userspace which can
> work with them. One of the big goals of KMS was a generic
> userspace-facing API, like FB, but without the suck.


-- 
Ondrej Zary


[Regression, post-2.6.34] Hibernation broken on machines with radeon/KMS and r300

2010-06-16 Thread Ondrej Zary
On Wednesday 16 June 2010, Rafael J. Wysocki wrote:
> On Tuesday, June 15, 2010, Rafael J. Wysocki wrote:
> > On Monday, June 14, 2010, Alex Deucher wrote:
> > > On Mon, Jun 14, 2010 at 3:03 PM, Rafael J. Wysocki wrote:
> > > > On Monday, June 14, 2010, Alex Deucher wrote:
> > > >> On Mon, Jun 14, 2010 at 10:53 AM, Rafael J. Wysocki wrote:
> > > >> > Alex, Dave,
> > > >> >
> > > >> > I'm afraid hibernation is broken on all machines using radeon/KMS
> > > >> > with r300 after commit ce8f53709bf440100cb9d31b1303291551cf517f
> > > >> > (drm/radeon/kms/pm: rework power management).  At least, I'm able
> > > >> > to reproduce the symptom, which is that the machine hangs hard
> > > >> > around the point where an image is created (probably during the
> > > >> > device thaw phase), on two different boxes with r300 (the output
> > > >> > of lspci from one of them is attached for reference, the other one
> > > >> > is HP nx6325).
> > > >> >
> > > >> > Suspend to RAM appears to work fine at least on one of the
> > > >> > affected boxes.
> > > >> >
> > > >> > Unfortunately, the commit above changes a lot of code and it's not
> > > >> > too easy to figure out what's wrong with it and I didn't have the
> > > >> > time to look more into details of this failure.  However, it looks
> > > >> > like you use .suspend() and .resume() callbacks as .freeze() and
> > > >> > .thaw() which may not be 100% correct (in fact it looks like the
> > > >> > "legacy" PCI suspend/resume is used, which is not recommended any
> > > >> > more).
> > > >>
> > > >> Does it work any better after Dave's last drm pull request?
> > > >
> > > > Nope.  The symptom is slightly different, though, because now it
> > > > hangs after turning off the screen.
> > > >
> > > >> With the latest changes, pm should not be a factor unless it's
> > > >> explicitly enabled via sysfs.
> > > >
> > > > Well, I guess the first pm patch changed more than just pm, then.
> > >
> > > Does this patch help?
> > > http://lists.freedesktop.org/archives/dri-devel/2010-June/001314.html
> >
> > No, it doesn't.  I try to hibernate, everything works to the point where
> > the screen goes off and the box hangs (solid).  Normally, it would turn
> > the screen back on and continue with saving the image.
> >
> > But, since that happens with the patch above applied, I think it doesn't
> > really pass the suspend phase (IOW, it probably hangs somewhere in the
> > radeon's suspend routine).
>
> I've just verified that in fact hibernation works on HP nx6325 with
> 2.6.35-rc3, but it takes about 55 sec. to suspend the graphics adapter in
> the "freeze" phase.  Surprisingly enough, during suspend to RAM it works
> normally (as well as in the "poweroff" phase of hibernation).

It takes 2 minutes on RV530: 
https://bugzilla.redhat.com/show_bug.cgi?id=586522
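
The remark quoted above about .suspend() and .resume() being used as .freeze() and .thaw() can be illustrated with a toy model. This is a rough Python sketch, not kernel code: the callback names mirror the phases discussed in the thread, while the two driver tables (legacy_ops, split_ops) and the helper routines are hypothetical and only show why a full power-down would also run in the "freeze" phase of hibernation when one routine is reused for every phase.

```python
# Toy model (not kernel code): the order in which driver callbacks run
# for each power transition discussed in the thread.
TRANSITIONS = {
    "suspend_to_ram": ["suspend", "resume"],
    # The hibernation image is created between freeze and thaw,
    # written out after thaw, and then the machine powers off.
    "hibernate": ["freeze", "thaw", "poweroff"],
}


def full_suspend():
    return "power the device down"


def full_resume():
    return "power the device up and reinitialise"


def quiesce():
    return "stop DMA and interrupts, keep the device powered"


def unquiesce():
    return "restart DMA and interrupts"


# Hypothetical legacy-style driver: one pair of routines reused for every phase.
legacy_ops = {
    "suspend": full_suspend,
    "resume": full_resume,
    "freeze": full_suspend,
    "thaw": full_resume,
    "poweroff": full_suspend,
}

# Hypothetical driver with phase-specific freeze/thaw callbacks.
split_ops = dict(legacy_ops, freeze=quiesce, thaw=unquiesce)


def trace(transition, ops):
    """Return (phase, action) pairs in the order the phases would run."""
    return [(phase, ops[phase]()) for phase in TRANSITIONS[transition]]


for name, ops in (("legacy", legacy_ops), ("split", split_ops)):
    for phase, action in trace("hibernate", ops):
        print(f"{name:6} {phase:8} -> {action}")
```

In this model the legacy table pays the full suspend cost once in "freeze" and again in "poweroff", which is one plausible reading of why a slow suspend routine would show up during the "freeze" phase.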


-- 
Ondrej Zary


regression: 2.6.35-rc1 hangs on i865G with KMS

2010-06-06 Thread Ondrej Zary
On Sunday 06 June 2010 11:04:44 Dave Airlie wrote:
> On Sun, Jun 6, 2010 at 6:28 AM, Ondrej Zary wrote:
> > On Saturday 05 June 2010 02:23:27 Eric Anholt wrote:
> >> On Fri, 4 Jun 2010 22:01:28 +0200, Ondrej Zary wrote:
> >> > Hello,
> >> > I'm testing 2.6.35-rc1 kernel on Asus P4P800-VM (i865G chipset). After
> >> > loading i915 module, the screen goes blank and the kernel hangs
> >> > completely (same with 2.6.35-rc1-git2). This does not happen with
> >> > "i915.modeset=0" parameter.
> >> >
> >> > This problem does not appear with 2.6.34. Is this a known regression?
> >>
> >> Not known as far as I know -- we'd enjoy a bisect with a bug report on
> >> bugs.freedesktop.org.
>
> Can you try the attached patch?

Thanks, applied it on 2.6.35-rc2 and it seems to work fine.


-- 
Ondrej Zary


regression: 2.6.35-rc1 hangs on i865G with KMS

2010-06-05 Thread Ondrej Zary
On Saturday 05 June 2010 02:23:27 Eric Anholt wrote:
> On Fri, 4 Jun 2010 22:01:28 +0200, Ondrej Zary wrote:
> > Hello,
> > I'm testing 2.6.35-rc1 kernel on Asus P4P800-VM (i865G chipset). After
> > loading i915 module, the screen goes blank and the kernel hangs
> > completely (same with 2.6.35-rc1-git2). This does not happen with
> > "i915.modeset=0" parameter.
> >
> > This problem does not appear with 2.6.34. Is this a known regression?
>
> Not known as far as I know -- we'd enjoy a bisect with a bug report on
> bugs.freedesktop.org.

Serial console with some printk()s added:
[0.00] Initializing cgroup subsys cpuset
[0.00] Linux version 2.6.35-rc1-git2 (root@test) (gcc version 4.4.4 
(Debian 4.4.4-1) ) #14 SMP Sat Jun 5 21:51:52 0
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
[0.00]  BIOS-e820: 000e8000 - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 1f73 (usable)
[0.00]  BIOS-e820: 1f73 - 1f74 (ACPI data)
[0.00]  BIOS-e820: 1f74 - 1f7f (ACPI NVS)
[0.00]  BIOS-e820: 1f7f - 1f80 (reserved)
[0.00]  BIOS-e820: ffb8 - 0001 (reserved)
[0.00] Notice: NX (Execute Disable) protection missing in CPU or 
disabled in BIOS!
[0.00] DMI 2.3 present.
[0.00] AMI BIOS detected: BIOS may corrupt low RAM, working around it.
[0.00] last_pfn = 0x1f730 max_arch_pfn = 0x10
[0.00] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[0.00] found SMP MP-table at [c00ff780] ff780
[0.00] init_memory_mapping: -1f73
[0.00] ACPI: RSDP 000fad50 00021 (v02 ACPIAM)
[0.00] ACPI: XSDT 1f730100 0003C (v01 A M I  OEMXSDT  09000505 MSFT 
0097)
[0.00] ACPI: FACP 1f730290 000F4 (v03 A M I  OEMFACP  09000505 MSFT 
0097)
[0.00] ACPI: DSDT 1f7303f0 036A7 (v01  PPVM1 PPVM1911 0911 INTL 
02002026)
[0.00] ACPI: FACS 1f74 00040
[0.00] ACPI: APIC 1f730390 0005C (v01 A M I  OEMAPIC  09000505 MSFT 
0097)
[0.00] ACPI: OEMB 1f740040 0003F (v01 A M I  OEMBIOS  09000505 MSFT 
0097)
[0.00] 503MB LOWMEM available.
[0.00]   mapped low ram: 0 - 1f73
[0.00]   low ram: 0 - 1f73
[0.00] Zone PFN ranges:
[0.00]   DMA  0x0010 -> 0x1000
[0.00]   Normal   0x1000 -> 0x0001f730
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[2] active PFN ranges
[0.00] 0: 0x0010 -> 0x009f
[0.00] 0: 0x0100 -> 0x0001f730
[0.00] Using APIC driver default
[0.00] ACPI: PM-Timer IO Port: 0x808
[0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x81] disabled)
[0.00] ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
[0.00] IOAPIC[0]: apic_id 1, version 32, address 0xfec0, GSI 0-23
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] Using ACPI (MADT) for SMP configuration information
[0.00] SMP: Allowing 2 CPUs, 1 hotplug CPUs
[0.00] PM: Registered nosave memory: 0009f000 - 000a
[0.00] PM: Registered nosave memory: 000a - 000e8000
[0.00] PM: Registered nosave memory: 000e8000 - 0010
[0.00] Allocating PCI resources starting at 1f80 (gap: 
1f80:e038)
[0.00] setup_percpu: NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:2 
nr_node_ids:1
[0.00] PERCPU: Embedded 12 pages/cpu @c180 s28160 r0 d20992 u2097152
[0.00] pcpu-alloc: s28160 r0 d20992 u2097152 alloc=1*4194304
[0.00] pcpu-alloc: [0] 0 1
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 127696
[0.00] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.35-rc1-git2 
root=/dev/sda1 ro console=ttyS0 console=tty0
[0.00] PID hash table entries: 2048 (order: 1, 8192 bytes)
[0.00] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[0.00] Enabling fast FPU save and restore... done.
[0.00] Enabling unmasked SIMD FPU exception support... done.
[0.00] Initializing CPU#0
[0.00] Subtract (41 early reservations)
[0.00]   #1 [001000 - 002000]   EX TRAMPOLINE
[0.00]   #2 [000100 - 00013cff2c]   TEXT DATA BSS
[0.00]   #3 [00013d - 00013d62a8] BRK
[0.00]   #4 [0ff790 - 10]   BIOS reserved

regression: 2.6.35-rc1 hangs on i865G with KMS

2010-06-05 Thread Ondrej Zary
On Saturday 05 June 2010 02:23:27 Eric Anholt wrote:
> On Fri, 4 Jun 2010 22:01:28 +0200, Ondrej Zary wrote:
> > Hello,
> > I'm testing 2.6.35-rc1 kernel on Asus P4P800-VM (i865G chipset). After
> > loading i915 module, the screen goes blank and the kernel hangs
> > completely (same with 2.6.35-rc1-git2). This does not happen with
> > "i915.modeset=0" parameter.
> >
> > This problem does not appear with 2.6.34. Is this a known regression?
>
> Not known as far as I know -- we'd enjoy a bisect with a bug report on
> bugs.freedesktop.org.

I don't like that place: of the 13 bugs I have reported there, 11 are still 
open :( I even provided a patch for one and nobody cares...

Bisect reveals two possible bad commits:
386516744ba45d50f42c6999151cc210cb4f96e4
drm/fb: fix fbdev object model + cleanup properly

or
8be48d924c307e72e3797ab5bde81b07a1ccc52d
drm/kms/fb: move to using fb helper crtc grouping instead of core crtc list

Which of the two is to blame depends on the last step of bisection: that 
kernel also hangs, but in a different way. The monitor has no signal, whereas 
in every other bad case the monitor was just blank.

Reverting these commits is not possible.
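
Why an ambiguous last step leaves two suspects instead of one can be shown with a small simulation. This is only a sketch under stated assumptions: it is a toy bisection in Python, not git's implementation, and the commit labels (c1 to c5) and verdicts are hypothetical placeholders rather than the real history between 2.6.34 and 2.6.35-rc1.

```python
def bisect(commits, classify):
    """Toy bisection over commits ordered oldest -> newest.

    commits[0] is assumed known good and commits[-1] known bad.
    classify(commit) returns "good", "bad" or "unsure".
    Returns every commit that could still be the first bad one.
    """
    good, bad = 0, len(commits) - 1
    skipped = set()
    while True:
        # Commits strictly between the last good and first bad one
        # that have not been tested yet.
        untested = [i for i in range(good + 1, bad) if i not in skipped]
        if not untested:
            break
        mid = untested[len(untested) // 2]
        verdict = classify(commits[mid])
        if verdict == "good":
            good = mid
        elif verdict == "bad":
            bad = mid
        else:
            # The kernel misbehaves, but not in the way being hunted,
            # so the step cannot be classified either way.
            skipped.add(mid)
    return commits[good + 1 : bad + 1]


# Hypothetical history: c3 hangs differently (no signal instead of a blank screen).
commits = ["known-good", "c1", "c2", "c3", "c4", "c5", "known-bad"]
verdicts = {"c1": "good", "c2": "good", "c3": "unsure", "c4": "bad", "c5": "bad"}

suspects = bisect(commits, verdicts.__getitem__)
print(suspects)
```

If the differently hanging commit is counted as bad, it is the first bad commit; if it is counted as good, the next commit is. Without a clear verdict, both remain candidates.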

-- 
Ondrej Zary


regression: 2.6.35-rc1 hangs on i865G with KMS

2010-06-04 Thread Ondrej Zary
Hello,
I'm testing 2.6.35-rc1 kernel on Asus P4P800-VM (i865G chipset). After loading 
i915 module, the screen goes blank and the kernel hangs completely (same with 
2.6.35-rc1-git2). This does not happen with "i915.modeset=0" parameter.

This problem does not appear with 2.6.34. Is this a known regression?

-- 
Ondrej Zary