Bug#1029602: Bug report: kernel oops in vmw_fb_dirty_flush()

2023-01-30 Thread Keyu Tao

Hi Rusin,

Thank you for your timely response. I tested v6.2-rc5 yesterday, and
the bug is not reproducible there.


On 1/31/23 03:54, Zack Rusin wrote:

On Tue, 2023-01-31 at 00:36 +0800, Keyu Tao wrote:

Hi vmwgfx maintainers,

An out-of-bounds access in the vmwgfx-specific framebuffer implementation
can be easily triggered by fbterm (a framebuffer terminal emulator) when
it scrolls the screen.

With some debugging, it seems that vmw_fb_dirty_flush() does not handle
vinfo.yoffset correctly after a call to `ioctl(fbdev_fd,
FBIOPAN_DISPLAY, );`, and a subsequent access to the mapped
memory area then causes the oops.

As the current mainline vmwgfx implementation (in Linux 6.2-rc) has removed
this framebuffer implementation, this bug can be triggered only in Linux
stable. I have tested it with vanilla 6.1.8 and 5.10.165, and both oops.

This bug was first reported at
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029602>, and the
maintainer there suggested that I report this issue upstream :)

Relevant information (for self-compiled Linux 6.1.8):

- /proc/version: Linux version 6.1.8 (tao@mira) (gcc (Debian 10.2.1-6)
10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #7 SMP
PREEMPT_DYNAMIC Mon Jan 30 21:09:02 CST 2023

- Linux distribution: Debian GNU/Linux 11 (bullseye)

- Architecture (uname -mi): x86_64 unknown

- Virtualization software: VMware Fusion 13 Player

- How to reproduce:
    1. Install (or compile) fbterm
    2. Run fbterm under a tty (by a user with read & write permission to
/dev/fb0, usually users in video group), and try to make it scroll (for
example by pressing Enter for a few seconds)
    3. The graphics hang and the kernel oopses.



Thanks a lot for the detailed report. Is there any chance that you could try
any of the 6.2 rc releases to see if you can reproduce? We removed all of the
hand rolled fb code and ported it to drm helpers in change:
df42523c12f8 ("drm/vmwgfx: Port the framebuffer code to drm fb helpers")
which for the first time got into the official kernel in v6.2-rc1. So any
kernel after that shouldn't crash with fbterm, if anyone could verify that'd
be much appreciated.

z




Bug#1029602: Bug report: kernel oops in vmw_fb_dirty_flush()

2023-01-30 Thread Zack Rusin
On Tue, 2023-01-31 at 00:36 +0800, Keyu Tao wrote:
> Hi vmwgfx maintainers,
> 
> An out-of-bounds access in the vmwgfx-specific framebuffer implementation
> can be easily triggered by fbterm (a framebuffer terminal emulator) when
> it scrolls the screen.
> 
> With some debugging, it seems that vmw_fb_dirty_flush() does not handle
> vinfo.yoffset correctly after a call to `ioctl(fbdev_fd,
> FBIOPAN_DISPLAY, );`, and a subsequent access to the mapped
> memory area then causes the oops.
> 
> As the current mainline vmwgfx implementation (in Linux 6.2-rc) has removed
> this framebuffer implementation, this bug can be triggered only in Linux
> stable. I have tested it with vanilla 6.1.8 and 5.10.165, and both oops.
> 
> This bug was first reported at
> <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029602>, and the
> maintainer there suggested that I report this issue upstream :)
> 
> Relevant information (for self-compiled Linux 6.1.8):
> 
> - /proc/version: Linux version 6.1.8 (tao@mira) (gcc (Debian 10.2.1-6)
> 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #7 SMP
> PREEMPT_DYNAMIC Mon Jan 30 21:09:02 CST 2023
> 
> - Linux distribution: Debian GNU/Linux 11 (bullseye)
> 
> - Architecture (uname -mi): x86_64 unknown
> 
> - Virtualization software: VMware Fusion 13 Player
> 
> - How to reproduce:
>    1. Install (or compile) fbterm
>    2. Run fbterm under a tty (by a user with read & write permission to
> /dev/fb0, usually users in video group), and try to make it scroll (for
> example by pressing Enter for a few seconds)
>    3. The graphics hang and the kernel oopses.
> 

Thanks a lot for the detailed report. Is there any chance that you could try
any of the 6.2 rc releases to see if you can reproduce? We removed all of the
hand rolled fb code and ported it to drm helpers in change:
df42523c12f8 ("drm/vmwgfx: Port the framebuffer code to drm fb helpers")
which for the first time got into the official kernel in v6.2-rc1. So any
kernel after that shouldn't crash with fbterm, if anyone could verify that'd
be much appreciated.

z


Bug#1029602: Bug report: kernel oops in vmw_fb_dirty_flush()

2023-01-30 Thread Keyu Tao

Hi vmwgfx maintainers,

An out-of-bounds access in the vmwgfx-specific framebuffer implementation
can be easily triggered by fbterm (a framebuffer terminal emulator) when
it scrolls the screen.


With some debugging, it seems that vmw_fb_dirty_flush() does not handle
vinfo.yoffset correctly after a call to `ioctl(fbdev_fd,
FBIOPAN_DISPLAY, );`, and a subsequent access to the mapped
memory area then causes the oops.
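
To make the role of yoffset concrete, here is a minimal sketch (not the
driver's code) of the bounds a panned display has to respect. It assumes
the usual fbdev convention that the visible window starts yoffset rows
into the virtual framebuffer and that each row occupies line_length
bytes. The function name and the example geometry (600 visible rows,
1200 virtual rows, 3200 bytes per row) are hypothetical and chosen only
for illustration.

```python
def visible_window(yoffset, yres, yres_virtual, line_length):
    """Return the (start, end) byte offsets of the panned visible window.

    Returns None when the requested pan would place part of the visible
    window outside the virtual framebuffer (no y-wrap assumed).
    """
    if yoffset < 0 or yoffset + yres > yres_virtual:
        return None
    start = yoffset * line_length
    end = start + yres * line_length  # exclusive end offset
    return start, end


# Hypothetical geometry: 600 visible rows, 1200 virtual rows,
# 3200 bytes per row (for example 800 pixels at 32 bpp).
YRES, YRES_VIRTUAL, LINE_LENGTH = 600, 1200, 3200

top = visible_window(0, YRES, YRES_VIRTUAL, LINE_LENGTH)
bottom = visible_window(600, YRES, YRES_VIRTUAL, LINE_LENGTH)
too_far = visible_window(601, YRES, YRES_VIRTUAL, LINE_LENGTH)
```

Any copy into the mapped area that applies yoffset has to stay inside
such a window; a destination computed past its end would touch memory
beyond the mapping.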


As the current mainline vmwgfx implementation (in Linux 6.2-rc) has removed
this framebuffer implementation, this bug can be triggered only in Linux
stable. I have tested it with vanilla 6.1.8 and 5.10.165, and both oops.


This bug was first reported at
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029602>, and the
maintainer there suggested that I report this issue upstream :)


Relevant information (for self-compiled Linux 6.1.8):

- /proc/version: Linux version 6.1.8 (tao@mira) (gcc (Debian 10.2.1-6) 
10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #7 SMP 
PREEMPT_DYNAMIC Mon Jan 30 21:09:02 CST 2023


- Linux distribution: Debian GNU/Linux 11 (bullseye)

- Architecture (uname -mi): x86_64 unknown

- Virtualization software: VMware Fusion 13 Player

- How to reproduce:
  1. Install (or compile) fbterm
  2. Run fbterm under a tty (by a user with read & write permission to 
/dev/fb0, usually users in video group), and try to make it scroll (for 
example by pressing Enter for a few seconds)

  3. The graphics hang and the kernel oopses.

- decoded oops message:

[   31.519514] BUG: unable to handle page fault for address: a7c5019d6000
[   31.519843] #PF: supervisor write access in kernel mode
[   31.520149] #PF: error_code(0x0002) - not-present page
[   31.520453] PGD 167 P4D 167 PUD 11bc067 PMD 31f0d067 PTE 0
[   31.520784] Oops: 0002 [#1] PREEMPT SMP PTI
[   31.521022] CPU: 0 PID: 7 Comm: kworker/0:0 Kdump: loaded Not tainted 6.1.8 #7
[   31.521266] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[   31.521796] Workqueue: events vmw_fb_dirty_flush [vmwgfx]
[   31.522080] RIP: 0010:memcpy_orig (/home/tao/Downloads/linux-6.1.8/arch/x86/lib/memcpy_64.S:85)
[   31.522396] Code: 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe 7c 35 48 83 ea 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b 5e 18 48 8d 76 20 <4c> 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 48 8d 7f 20 73 d4 83

All code

   0:   00 48 89                add    %cl,-0x77(%rax)
   3:   f8                      clc
   4:   48 83 fa 20             cmp    $0x20,%rdx
   8:   72 7e                   jb     0x88
   a:   40 38 fe                cmp    %dil,%sil
   d:   7c 35                   jl     0x44
   f:   48 83 ea 20             sub    $0x20,%rdx
  13:   48 83 ea 20             sub    $0x20,%rdx
  17:   4c 8b 06                mov    (%rsi),%r8
  1a:   4c 8b 4e 08             mov    0x8(%rsi),%r9
  1e:   4c 8b 56 10             mov    0x10(%rsi),%r10
  22:   4c 8b 5e 18             mov    0x18(%rsi),%r11
  26:   48 8d 76 20             lea    0x20(%rsi),%rsi
  2a:*  4c 89 07                mov    %r8,(%rdi)       <-- trapping instruction
  2d:   4c 89 4f 08             mov    %r9,0x8(%rdi)
  31:   4c 89 57 10             mov    %r10,0x10(%rdi)
  35:   4c 89 5f 18             mov    %r11,0x18(%rdi)
  39:   48 8d 7f 20             lea    0x20(%rdi),%rdi
  3d:   73 d4                   jae    0x13
  3f:   83                      .byte 0x83

Code starting with the faulting instruction
===
   0:   4c 89 07                mov    %r8,(%rdi)
   3:   4c 89 4f 08             mov    %r9,0x8(%rdi)
   7:   4c 89 57 10             mov    %r10,0x10(%rdi)
   b:   4c 89 5f 18             mov    %r11,0x18(%rdi)
   f:   48 8d 7f 20             lea    0x20(%rdi),%rdi
  13:   73 d4                   jae    0xffe9
  15:   83                      .byte 0x83
[   31.523208] RSP: 0018:a7c50005be10 EFLAGS: 00010202
[   31.523555] RAX: a7c5019d5c00 RBX: 0c80 RCX: 0c80
[   31.523841] RDX: 0840 RSI: a7c500e73a20 RDI: a7c5019d6000
[   31.524071] RBP:  R08:  R09: 
[   31.524299] R10:  R11:  R12: a7c500e73600
[   31.524525] R13: 97ba70af4cd8 R14: 97ba70b4 R15: 97ba70af4800
[   31.524753] FS:  () GS:97ba9180() knlGS:
[   31.524981] CS:  0010 DS:  ES:  CR0: 80050033
[   31.525209] CR2: a7c5019d6000 CR3: 37a10002 CR4: 003706f0
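
The register dump above can be cross-checked with a little arithmetic.
The sketch below is an illustration, not part of the original report. It
uses the register values exactly as printed here; only differences
between them matter, so any common leading digits missing from this copy
do not affect the result. It also assumes, based on the memcpy_orig
disassembly above, that RDX is decremented by 0x20 once before the copy
loop and once at the top of every 32-byte iteration. The function name
is hypothetical.

```python
PAGE_SIZE = 4096  # base page size on x86_64


def analyze_memcpy_fault(rax, rdi, rdx, cr2, page_size=PAGE_SIZE):
    """Reconstruct the state of an interrupted forward memcpy_orig copy.

    rax: destination start (the routine copies RDI to RAX on entry)
    rdi: current destination pointer at the fault
    rdx: remaining-count register at the fault
    cr2: faulting address reported by the CPU
    """
    copied = rdi - rax  # bytes stored before the faulting store
    # Assumption from the disassembly: one "sub $0x20,%rdx" before the
    # loop, plus one per iteration, including the iteration that faulted.
    decrements = copied // 0x20 + 2
    length = rdx + 0x20 * decrements  # original memcpy length
    return {
        "copied": copied,
        "length": length,
        "fault_at_current_dst": cr2 == rdi,
        "fault_page_aligned": cr2 % page_size == 0,
    }


# Register values as printed in the dump above.
report = analyze_memcpy_fault(
    rax=0xA7C5019D5C00,
    rdi=0xA7C5019D6000,
    rdx=0x0840,
    cr2=0xA7C5019D6000,
)
```

With these values the copy works out to 0xc80 (3200) bytes, matching RBX
and RCX, and it faulted 0x400 bytes in, exactly at a page boundary. That
is consistent with a row copy whose destination starts within the last
1 KiB before a not-present page, though it does not by itself show where
the bad offset comes from.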

[   31.525440] Call Trace:
[   31.525670]  
[   31.525900] vmw_fb_dirty_flush (/home/tao/Downloads/linux-6.1.8/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c:244) vmwgfx
[   31.526162] process_one_work (/home/tao/Downloads/linux-6.1.8/kernel/workqueue.c:2289)
[   31.526399] worker_thread 
(/home/tao/Downloads/linux-6.1.8/./include/linux/list.h:292