re: kernel stack usage

2020-05-30 Thread matthew green
glad to see this effort and the clean up already!

ideally, we can break the kernel build if large stack consumers
are added to the kernel.  i'd be OK with it being default on,
with of course a way to skip it, and if necessary it can have
a whitelist of "OK large users."
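
A build-time check along these lines could be sketched as follows. This is only an illustration: the report format (one "size function ..." entry per line), the names find_offenders and check, and the small_function entry are assumptions made for the example, and the 1024-byte default limit is taken from the "anything > 1k" suggestion elsewhere in this thread.

```python
import re

# Assumed report format: "<frame bytes> <function> ..." on each line,
# e.g. "4096 pci_conf_print at pci_subr.c:4812".
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)")

def find_offenders(report_lines, limit=1024, allowed=frozenset()):
    """Return (size, function) pairs above `limit` that are not in `allowed`."""
    offenders = []
    for line in report_lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # header or blank line
        size, func = int(m.group(1)), m.group(2)
        if size > limit and func not in allowed:
            offenders.append((size, func))
    return sorted(offenders, reverse=True)

def check(report_lines, limit=1024, allowed=frozenset(), skip=False):
    """Return an exit status: 0 lets the build continue, 1 breaks it."""
    if skip:
        return 0  # the "way to skip it"
    offenders = find_offenders(report_lines, limit, allowed)
    for size, func in offenders:
        print(f"stack frame too large: {func} uses {size} bytes (limit {limit})")
    return 1 if offenders else 0

# Sample input; small_function is a made-up entry for the example.
REPORT = [
    "Frame/b Function",
    "4096 pci_conf_print at pci_subr.c:4812",
    "1264 cdioctl at cd.c:1204",
    "1248 uvm_swap_stats at uvm_swap.c:726",
    "512 small_function at example.c:1",
]
ALLOWED = frozenset({"cdioctl", "uvm_swap_stats"})
```

With this sample, check(REPORT, allowed=ALLOWED) reports pci_conf_print and returns 1, while passing skip=True returns 0 without looking at the report.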

> 1264    cdioctl at cd.c:1204
> 1248    uvm_swap_stats at uvm_swap.c:726

i think we can ignore these two.  they're both going to be
early in the stack so very unlikely to be problematic.


.mrg.


Re: kernel stack usage

2020-05-30 Thread Jonathan A. Kollasch
On Sat, May 30, 2020 at 11:52:18AM +0200, Martin Husemann wrote:
> 1248    aubtfwl_attach_hook at aubtfwl.c:273
> 

I took care of this.  It was placing MAXPATHLEN+1 chars on the stack.

While PNBUF_GET/PUT() seemed like a possible choice, I decided on
kmem_asprintf()/kmem_strfree(), as in reality it needs nowhere near a
MAXPATHLEN.


Re: kernel stack usage

2020-05-30 Thread Jaromír Doleček
On Sat, May 30, 2020 at 18:41, Jason Thorpe wrote:
> These two seem slightly bogus.  coredump_note_elf64() was storing register
> state on the stack, but not nearly 3K worth.  procfs_domounts() has nearly
> nothing on the stack as far as I can tell, and the one function that could be
> auto-inlined that it calls doesn't have much either.
>

struct statvfs is certainly over 3 KB - line 619

Jaromir


Re: kernel stack usage

2020-05-30 Thread Jason Thorpe


> On May 30, 2020, at 7:18 AM, Christos Zoulas wrote:
> 
> 3352 80b940bb:procfs_domounts+0xd
> 3264 80c677da:coredump_note_elf64+0xb

These two seem slightly bogus.  coredump_note_elf64() was storing register
state on the stack, but not nearly 3K worth.  procfs_domounts() has nearly
nothing on the stack as far as I can tell, and the one function that could be
auto-inlined that it calls doesn't have much either.

-- thorpej



Re: kernel stack usage

2020-05-30 Thread Jaromír Doleček
I've fixed several where I felt comfortable; feel free to do more:
4096    pci_conf_print at pci_subr.c:4812
4096    dtv_demux_read at dtv_demux.c:493
3408    genfb_calc_hsize at genfb.c:630
2240    bwfm_rx_event_cb at bwfm.c:2099
1664    wdcprobe_with_reset at wdc.c:491

Jaromir



Re: kernel stack usage

2020-05-30 Thread Christos Zoulas
In article <20200530095218.gb28...@mail.duskware.de>,
Martin Husemann wrote:
>Hey folks,
>
>triggered by some experiments simonb did on mips I wrote a script to find
>the functions using the biggest stack frames in my current sparc64 kernel.
>
>The top 20 list is:
>
>Frame/b Function
>4096    pci_conf_print at pci_subr.c:4812
>4096    dtv_demux_read at dtv_demux.c:493
>3536    SHA3_Selftest at sha3.c:430
>3408    genfb_calc_hsize at genfb.c:630
>3248    radeonfb_pickres at radeonfb.c:4127
>2304    radeonfb_set_cursor at radeonfb.c:3690
>2272    gem_pci_attach at if_gem_pci.c:147
>2256    twoway_memmem at memmem.c:84
>2240    bwfm_rx_event_cb at bwfm.c:2099
>2240    compat_60_ptmget_ioctl at tty_60.c:70
>2112    db_stack_trace_print at db_trace.c:77
>1664    wdcprobe_with_reset at wdc.c:491
>1424    nfsrv_rename at nfs_serv.c:1906
>1408    OF_mapintr at ofw_machdep.c:728
>1344    sysctl_hw_firmware_path at firmload.c:81
>1280    fw_bmr at firewire.c:2296
>1264    cdioctl at cd.c:1204
>1248    cpu_reset_fpustate at cpu.c:400
>1248    aubtfwl_attach_hook at aubtfwl.c:273
>1248    uvm_swap_stats at uvm_swap.c:726
>
>(left column is size of the frame on sparc64 in bytes)
>
>I think anything > 1k is dubious and should be checked.

I agree, here is the same for x86_64/GENERIC...

4408 8027af14:pci_conf_print+0xd
4128 80a8dca0:dtv_demux_read+0xb
3352 80b940bb:procfs_domounts+0xd
3272 80e36b4b:SHA3_Selftest+0xd
3264 80c677da:coredump_note_elf64+0xb
3240 80b537c6:genfb_calc_hsize.isra.0+0x5
2704 80c66a88:coredump_note_elf32+0xb
2408 80227a71:process_machdep_doxstate+0xd
2184 804381fd:linux_ioctl_termios+0xd
2168 80440b2d:linux32_ioctl_termios+0xd
2112 802c5579:gem_pci_attach+0xb
2104 80e465c3:twoway_memmem+0xd
2088 806b5c18:bwfm_rx_event_cb+0xd
2072 8097e221:compat_60_ptmget_ioctl+0xd
2064 8053ce72:db_stack_trace_print+0x11
1488 8064f943:wdcprobe_with_reset+0xb
1384 80d7ee2b:ipmi_match+0x9
1328 80467a95:usb_add_event+0x7
1304 80ba85bb:nfsrv_rename+0xd
1256 8053069f:acpicpu_md_pstate_sysctl_all+0xd
1256 8052cd13:acpicpu_start+0x9
1240 8043b162:linux_sys_rt_sigreturn+0x9
1192 80b9392a:procfs_do_pid_stat+0xd
1176 80d70810:sysctl_hw_firmware_path+0xd
1160 807b30a3:radeon_cs_ioctl+0xd
1128 8044610f:oss_ioctl_mixer+0xd
1128 8024dd3c:cdioctl+0xd
1112 80c633ca:uvm_swap_stats.part.1+0xd
1104 804fdf59:fw_bmr+0xb
1096 80c8c0ab:ktrwrite+0xd
1080 80573ca6:ahc_print_register+0xd
1080 80550de8:procfs_getonecpu+0xd
1080 8048d95e:aubtfwl_attach_hook+0x9
1064 80cf925d:proc_regio+0xd
1064 80ccb407:bufq_alloc+0xd
1064 80ae8090:ar5112SetPowerTable+0xd
1064 80582aa0:ahd_print_register+0xd
1064 80568a53:tpmread+0xd
1064 80382262:txp_attach+0xd
1064 802636da:ata_probe_caps+0xd
1048 80e06097:ar9003_paprd_tx_tone_done+0xd
1048 80d87e9b:sdl_print+0x9
1048 80c67604:coredump_getseghdrs_elf64+0xd
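
The "address:function+offset" entries above look like they point at the instruction that reserves the frame. One way to produce a list of this shape is to scan "objdump -d" output for "sub $imm,%rsp" instructions; the sketch below does that, but it is a guess at the approach, not the script actually used. The helper names, the sample disassembly (including the made-up function tiny), and the masking of addresses to their low 32 bits are assumptions for the example.

```python
import re

# Function label line in objdump -d output, e.g. "ffffffff8027af07 <pci_conf_print>:"
LABEL_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:")
# Stack adjustment in AT&T syntax, e.g. "<addr>:\t48 81 ec ... \tsub    $0x1138,%rsp"
SUB_RE = re.compile(r"^\s*([0-9a-f]+):.*\bsub\s+\$0x([0-9a-f]+),%rsp")

def frame_sizes(disassembly):
    """Yield (bytes, sub_address, function, offset) for each `sub $imm,%rsp`."""
    func, start = None, 0
    for line in disassembly.splitlines():
        m = LABEL_RE.match(line)
        if m:
            start, func = int(m.group(1), 16), m.group(2)
            continue
        m = SUB_RE.match(line)
        if m and func is not None:
            addr = int(m.group(1), 16)
            yield int(m.group(2), 16), addr, func, addr - start

def report(disassembly, limit=1024):
    """Format frames larger than `limit` bytes, biggest first."""
    rows = sorted((r for r in frame_sizes(disassembly) if r[0] > limit), reverse=True)
    return [f"{size} {addr & 0xffffffff:08x}:{func}+{off:#x}"
            for size, addr, func, off in rows]

# Hand-written sample in objdump's layout; not real kernel disassembly.
SAMPLE = """\
ffffffff8027af07 <pci_conf_print>:
ffffffff8027af07:\t55                   \tpush   %rbp
ffffffff8027af14:\t48 81 ec 38 11 00 00 \tsub    $0x1138,%rsp
ffffffff80a8dc95 <dtv_demux_read>:
ffffffff80a8dca0:\t48 81 ec 20 10 00 00 \tsub    $0x1020,%rsp
ffffffff80100000 <tiny>:
ffffffff80100004:\t48 83 ec 18          \tsub    $0x18,%rsp
"""
```

A scan like this only sees the explicit stack adjustment, so it misses space taken by pushed registers and by variable-length allocations; the numbers are a lower bound on the real frame size.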




Re: Fix for slow run(4) configuration on OHCI/UHCI

2020-05-30 Thread Christos Zoulas
In article <24273.15131.928644.743...@guava.gson.org>,
Andreas Gustafsson wrote:
>Hi all,
>
>When I connect a USB WiFi adapter based on a Ralink RT5370 chip to a
>USB port that uses an OHCI or UHCI host controller, running "ifconfig
>run0 up" takes a very long time, about 20-30 seconds.
>
>This is because the run(4) driver writes a large amount of data to the
>device just two bytes at a time using the WRITE_2 command, and each
>write takes two full USB frames of 1 millisecond each on UHCI, or
>three frames on OHCI.  With an EHCI or XHCI controller, the large
>number of transfers is not a problem as these controllers can perform
>multiple control transfers within a single frame.
>
>The driver already contains code to do the transfers in larger blocks
>using the WRITE_REGION_1 command, but it's #if'ed out with a comment
>saying it is "not stable on RT2860".  The FreeBSD driver is similar,
>except that its version of the #if'ed-out code limits the transfers
>to a maximum of 64 bytes at a time.
>
>I have verified that enabling the use of WRITE_REGION_1 with the
>64-byte limit from FreeBSD works with my RT5370-based adapter, and
>makes it configure about ten times faster on OHCI and UHCI.
>
>I'm now using the following patch, which enables the use of
>WRITE_REGION_1 using blocks of up to 64 bytes for the RT5370 only:
>
>  https://www.gson.org/netbsd/patches/run-faster.patch
>
>OK to commit?

Yes, let's do it but put a comment in the commit message that provides
a summary of this message, so next time someone encounters an issue,
they find the information in one place.

christos
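
The cost described in the quoted message can be modelled roughly as follows. This is a sketch, not the driver's code: the helper names and the 4096-byte payload are made up for the example, and it assumes a block write occupies the same number of USB frames as a two-byte write, which the message does not state.

```python
def split_region(data, max_block=64):
    """Split `data` into consecutive blocks of at most `max_block` bytes."""
    return [data[i:i + max_block] for i in range(0, len(data), max_block)]

def transfer_time_ms(nbytes, bytes_per_write, frames_per_write, frame_ms=1.0):
    """Estimate bus time when every control write occupies whole USB frames."""
    writes = -(-nbytes // bytes_per_write)  # ceiling division
    return writes * frames_per_write * frame_ms

# Hypothetical 4096-byte payload on UHCI (two 1 ms frames per write).
PAYLOAD = 4096
two_byte_ms = transfer_time_ms(PAYLOAD, 2, 2)   # WRITE_2 style, 2 bytes per write
block_ms = transfer_time_ms(PAYLOAD, 64, 2)     # WRITE_REGION_1 style, 64-byte blocks
```

Under these assumptions the model gives 4096 ms for two-byte writes against 128 ms for 64-byte blocks, a factor of 32. The speedup reported in the message is about ten times, so the remaining time is presumably spent elsewhere.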



kernel stack usage

2020-05-30 Thread Martin Husemann
Hey folks,

triggered by some experiments simonb did on mips I wrote a script to find
the functions using the biggest stack frames in my current sparc64 kernel.

The top 20 list is:

Frame/b Function
4096    pci_conf_print at pci_subr.c:4812
4096    dtv_demux_read at dtv_demux.c:493
3536    SHA3_Selftest at sha3.c:430
3408    genfb_calc_hsize at genfb.c:630
3248    radeonfb_pickres at radeonfb.c:4127
2304    radeonfb_set_cursor at radeonfb.c:3690
2272    gem_pci_attach at if_gem_pci.c:147
2256    twoway_memmem at memmem.c:84
2240    bwfm_rx_event_cb at bwfm.c:2099
2240    compat_60_ptmget_ioctl at tty_60.c:70
2112    db_stack_trace_print at db_trace.c:77
1664    wdcprobe_with_reset at wdc.c:491
1424    nfsrv_rename at nfs_serv.c:1906
1408    OF_mapintr at ofw_machdep.c:728
1344    sysctl_hw_firmware_path at firmload.c:81
1280    fw_bmr at firewire.c:2296
1264    cdioctl at cd.c:1204
1248    cpu_reset_fpustate at cpu.c:400
1248    aubtfwl_attach_hook at aubtfwl.c:273
1248    uvm_swap_stats at uvm_swap.c:726

(left column is size of the frame on sparc64 in bytes)

I think anything > 1k is dubious and should be checked.

Martin