re: kernel stack usage
glad to see this effort and the clean up already!

ideally, we can break the kernel build if large stack consumers
are added to the kernel.  i'd be OK with it being default on,
with of course a way to skip it, and if necessary it can have
a whitelist of "OK large users."

> 1264  cdioctl at cd.c:1204
> 1248  uvm_swap_stats at uvm_swap.c:726

i think we can ignore these two.  they're both going to be
early in the stack so very unlikely to be problematic.

.mrg.
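A minimal sketch of the build check suggested above, assuming the frame sizes have already been extracted into a name-to-bytes mapping. The 1024-byte limit follows the "> 1k is dubious" remark in the thread; the allowlist contents, the function names and the exit-status convention are hypothetical and not part of any existing NetBSD build tool.

```python
import sys

# Assumed limit: the thread only says frames over 1k deserve a look.
FRAME_LIMIT = 1024

# Hypothetical allowlist of "OK large users", seeded with the two
# functions the message above proposes to ignore.
ALLOWED = {"cdioctl", "uvm_swap_stats"}


def check_frames(frames, limit=FRAME_LIMIT, allowed=ALLOWED):
    """Return (size, name) pairs over the limit and not allowlisted,
    largest frame first."""
    offenders = [
        (size, name)
        for name, size in frames.items()
        if size > limit and name not in allowed
    ]
    return sorted(offenders, reverse=True)


def build_status(frames):
    """Report offenders on stderr; return 1 if the build should fail."""
    offenders = check_frames(frames)
    for size, name in offenders:
        print(f"{name}: {size}-byte frame exceeds {FRAME_LIMIT}",
              file=sys.stderr)
    return 1 if offenders else 0


# Sample input taken from the sparc64 listing in this thread.
sample = {
    "pci_conf_print": 4096,
    "fw_bmr": 1280,
    "cdioctl": 1264,
    "uvm_swap_stats": 1248,
}
status = build_status(sample)
```

A makefile rule could pass the returned status to the build so that a new large consumer stops it, with a variable to skip the check entirely.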
Re: kernel stack usage
On Sat, May 30, 2020 at 11:52:18AM +0200, Martin Husemann wrote:
> 1248  aubtfwl_attach_hook at aubtfwl.c:273

I took care of this.  It was placing MAXPATHLEN+1 chars on the stack.
While PNBUF_GET/PUT() seemed like a possible choice, I decided on
kmem_asprintf()/kmem_strfree(), as in reality it needs nowhere near
a MAXPATHLEN.
Re: kernel stack usage
On Sat, May 30, 2020 at 18:41, Jason Thorpe wrote:
> These two seem slightly bogus.  coredump_note_elf64() was storing register
> state on the stack, but not nearly 3K worth.  procfs_domounts() has nearly
> nothing on the stack as far as I can tell, and the one function that could be
> auto-inlined that it calls doesn't have much either.

struct statvfs is certainly over 3 KB - line 619

Jaromir
Re: kernel stack usage
> On May 30, 2020, at 7:18 AM, Christos Zoulas wrote:
>
> 3352 80b940bb:procfs_domounts+0xd
> 3264 80c677da:coredump_note_elf64+0xb

These two seem slightly bogus.  coredump_note_elf64() was storing register
state on the stack, but not nearly 3K worth.  procfs_domounts() has nearly
nothing on the stack as far as I can tell, and the one function that could be
auto-inlined that it calls doesn't have much either.

-- thorpej
Re: kernel stack usage
I've fixed several where I felt comfortable, feel free to do more:

4096  pci_conf_print at pci_subr.c:4812
4096  dtv_demux_read at dtv_demux.c:493
3408  genfb_calc_hsize at genfb.c:630
2240  bwfm_rx_event_cb at bwfm.c:2099
1664  wdcprobe_with_reset at wdc.c:491

Jaromir

On Sat, May 30, 2020 at 16:18, Christos Zoulas wrote:
>
> In article <20200530095218.gb28...@mail.duskware.de>,
> Martin Husemann wrote:
> >Hey folks,
> >
> >triggered by some experiments simonb did on mips I wrote a script to find
> >the functions using the biggest stack frame in my current sparc64 kernel.
> >
> >The top 15 list is:
> >
> >Frame/b  Function
> >4096     pci_conf_print at pci_subr.c:4812
> >4096     dtv_demux_read at dtv_demux.c:493
> >3536     SHA3_Selftest at sha3.c:430
> >3408     genfb_calc_hsize at genfb.c:630
> >3248     radeonfb_pickres at radeonfb.c:4127
> >2304     radeonfb_set_cursor at radeonfb.c:3690
> >2272     gem_pci_attach at if_gem_pci.c:147
> >2256     twoway_memmem at memmem.c:84
> >2240     bwfm_rx_event_cb at bwfm.c:2099
> >2240     compat_60_ptmget_ioctl at tty_60.c:70
> >2112     db_stack_trace_print at db_trace.c:77
> >1664     wdcprobe_with_reset at wdc.c:491
> >1424     nfsrv_rename at nfs_serv.c:1906
> >1408     OF_mapintr at ofw_machdep.c:728
> >1344     sysctl_hw_firmware_path at firmload.c:81
> >1280     fw_bmr at firewire.c:2296
> >1264     cdioctl at cd.c:1204
> >1248     cpu_reset_fpustate at cpu.c:400
> >1248     aubtfwl_attach_hook at aubtfwl.c:273
> >1248     uvm_swap_stats at uvm_swap.c:726
> >
> >(left column is size of the frame on sparc64 in bytes)
> >
> >I think anything > 1k is dubious and should be checked.
>
> I agree, here is the same for x86_64/GENERIC...
>
> 4408 8027af14:pci_conf_print+0xd
> 4128 80a8dca0:dtv_demux_read+0xb
> 3352 80b940bb:procfs_domounts+0xd
> 3272 80e36b4b:SHA3_Selftest+0xd
> 3264 80c677da:coredump_note_elf64+0xb
> 3240 80b537c6:genfb_calc_hsize.isra.0+0x5
> 2704 80c66a88:coredump_note_elf32+0xb
> 2408 80227a71:process_machdep_doxstate+0xd
> 2184 804381fd:linux_ioctl_termios+0xd
> 2168 80440b2d:linux32_ioctl_termios+0xd
> 2112 802c5579:gem_pci_attach+0xb
> 2104 80e465c3:twoway_memmem+0xd
> 2088 806b5c18:bwfm_rx_event_cb+0xd
> 2072 8097e221:compat_60_ptmget_ioctl+0xd
> 2064 8053ce72:db_stack_trace_print+0x11
> 1488 8064f943:wdcprobe_with_reset+0xb
> 1384 80d7ee2b:ipmi_match+0x9
> 1328 80467a95:usb_add_event+0x7
> 1304 80ba85bb:nfsrv_rename+0xd
> 1256 8053069f:acpicpu_md_pstate_sysctl_all+0xd
> 1256 8052cd13:acpicpu_start+0x9
> 1240 8043b162:linux_sys_rt_sigreturn+0x9
> 1192 80b9392a:procfs_do_pid_stat+0xd
> 1176 80d70810:sysctl_hw_firmware_path+0xd
> 1160 807b30a3:radeon_cs_ioctl+0xd
> 1128 8044610f:oss_ioctl_mixer+0xd
> 1128 8024dd3c:cdioctl+0xd
> 1112 80c633ca:uvm_swap_stats.part.1+0xd
> 1104 804fdf59:fw_bmr+0xb
> 1096 80c8c0ab:ktrwrite+0xd
> 1080 80573ca6:ahc_print_register+0xd
> 1080 80550de8:procfs_getonecpu+0xd
> 1080 8048d95e:aubtfwl_attach_hook+0x9
> 1064 80cf925d:proc_regio+0xd
> 1064 80ccb407:bufq_alloc+0xd
> 1064 80ae8090:ar5112SetPowerTable+0xd
> 1064 80582aa0:ahd_print_register+0xd
> 1064 80568a53:tpmread+0xd
> 1064 80382262:txp_attach+0xd
> 1064 802636da:ata_probe_caps+0xd
> 1048 80e06097:ar9003_paprd_tx_tone_done+0xd
> 1048 80d87e9b:sdl_print+0x9
> 1048 80c67604:coredump_getseghdrs_elf64+0xd
Re: kernel stack usage
In article <20200530095218.gb28...@mail.duskware.de>,
Martin Husemann wrote:
>Hey folks,
>
>triggered by some experiments simonb did on mips I wrote a script to find
>the functions using the biggest stack frame in my current sparc64 kernel.
>
>The top 15 list is:
>
>Frame/b  Function
>4096     pci_conf_print at pci_subr.c:4812
>4096     dtv_demux_read at dtv_demux.c:493
>3536     SHA3_Selftest at sha3.c:430
>3408     genfb_calc_hsize at genfb.c:630
>3248     radeonfb_pickres at radeonfb.c:4127
>2304     radeonfb_set_cursor at radeonfb.c:3690
>2272     gem_pci_attach at if_gem_pci.c:147
>2256     twoway_memmem at memmem.c:84
>2240     bwfm_rx_event_cb at bwfm.c:2099
>2240     compat_60_ptmget_ioctl at tty_60.c:70
>2112     db_stack_trace_print at db_trace.c:77
>1664     wdcprobe_with_reset at wdc.c:491
>1424     nfsrv_rename at nfs_serv.c:1906
>1408     OF_mapintr at ofw_machdep.c:728
>1344     sysctl_hw_firmware_path at firmload.c:81
>1280     fw_bmr at firewire.c:2296
>1264     cdioctl at cd.c:1204
>1248     cpu_reset_fpustate at cpu.c:400
>1248     aubtfwl_attach_hook at aubtfwl.c:273
>1248     uvm_swap_stats at uvm_swap.c:726
>
>(left column is size of the frame on sparc64 in bytes)
>
>I think anything > 1k is dubious and should be checked.

I agree, here is the same for x86_64/GENERIC...

4408 8027af14:pci_conf_print+0xd
4128 80a8dca0:dtv_demux_read+0xb
3352 80b940bb:procfs_domounts+0xd
3272 80e36b4b:SHA3_Selftest+0xd
3264 80c677da:coredump_note_elf64+0xb
3240 80b537c6:genfb_calc_hsize.isra.0+0x5
2704 80c66a88:coredump_note_elf32+0xb
2408 80227a71:process_machdep_doxstate+0xd
2184 804381fd:linux_ioctl_termios+0xd
2168 80440b2d:linux32_ioctl_termios+0xd
2112 802c5579:gem_pci_attach+0xb
2104 80e465c3:twoway_memmem+0xd
2088 806b5c18:bwfm_rx_event_cb+0xd
2072 8097e221:compat_60_ptmget_ioctl+0xd
2064 8053ce72:db_stack_trace_print+0x11
1488 8064f943:wdcprobe_with_reset+0xb
1384 80d7ee2b:ipmi_match+0x9
1328 80467a95:usb_add_event+0x7
1304 80ba85bb:nfsrv_rename+0xd
1256 8053069f:acpicpu_md_pstate_sysctl_all+0xd
1256 8052cd13:acpicpu_start+0x9
1240 8043b162:linux_sys_rt_sigreturn+0x9
1192 80b9392a:procfs_do_pid_stat+0xd
1176 80d70810:sysctl_hw_firmware_path+0xd
1160 807b30a3:radeon_cs_ioctl+0xd
1128 8044610f:oss_ioctl_mixer+0xd
1128 8024dd3c:cdioctl+0xd
1112 80c633ca:uvm_swap_stats.part.1+0xd
1104 804fdf59:fw_bmr+0xb
1096 80c8c0ab:ktrwrite+0xd
1080 80573ca6:ahc_print_register+0xd
1080 80550de8:procfs_getonecpu+0xd
1080 8048d95e:aubtfwl_attach_hook+0x9
1064 80cf925d:proc_regio+0xd
1064 80ccb407:bufq_alloc+0xd
1064 80ae8090:ar5112SetPowerTable+0xd
1064 80582aa0:ahd_print_register+0xd
1064 80568a53:tpmread+0xd
1064 80382262:txp_attach+0xd
1064 802636da:ata_probe_caps+0xd
1048 80e06097:ar9003_paprd_tx_tone_done+0xd
1048 80d87e9b:sdl_print+0x9
1048 80c67604:coredump_getseghdrs_elf64+0xd
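The two listings use different formats, and the x86_64 one carries compiler-generated suffixes such as .isra.0 and .part.1, so the names do not line up directly. The following is a rough sketch of how the x86_64 lines could be parsed and normalized for comparison. The regular expression is inferred only from the lines shown in this thread, and parse_x86_line and parse_listing are hypothetical helper names, not part of the script used to produce the listings.

```python
import re

# Inferred from lines like "3240 80b537c6:genfb_calc_hsize.isra.0+0x5":
# size, hex address, ':', symbol, optional ".suffix" parts, "+0x" offset.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+[0-9a-f]+:([A-Za-z_]\w*)(?:\.\w+)*\+0x[0-9a-f]+\s*$"
)


def parse_x86_line(line):
    """Return (function_name, frame_bytes), or None if the line
    does not look like a listing entry."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return m.group(2), int(m.group(1))


def parse_listing(text):
    """Map each function name to its largest reported frame size."""
    frames = {}
    for line in text.splitlines():
        parsed = parse_x86_line(line)
        if parsed is not None:
            name, size = parsed
            frames[name] = max(size, frames.get(name, 0))
    return frames


# A few entries copied from the x86_64 listing above.
x86_64 = parse_listing("""\
3240 80b537c6:genfb_calc_hsize.isra.0+0x5
1112 80c633ca:uvm_swap_stats.part.1+0xd
1128 8024dd3c:cdioctl+0xd
""")

# The matching sparc64 sizes from the original posting.
sparc64 = {"genfb_calc_hsize": 3408, "uvm_swap_stats": 1248, "cdioctl": 1264}

for name in sorted(sparc64):
    if name in x86_64:
        print(f"{name}: sparc64 {sparc64[name]}, x86_64 {x86_64[name]}")
```

Stripping the suffix folds a split-out clone back into its parent function, which is why the sketch keeps the largest size seen for a name rather than the last one.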
Re: Fix for slow run(4) configuration on OHCI/UHCI
In article <24273.15131.928644.743...@guava.gson.org>,
Andreas Gustafsson wrote:
>Hi all,
>
>When I connect a USB WiFi adapter based on a Ralink RT5370 chip to a
>USB port that uses an OHCI or UHCI host controller, running "ifconfig
>run0 up" takes a very long time, about 20-30 seconds.
>
>This is because the run(4) driver writes a large amount of data to the
>device just two bytes at a time using the WRITE_2 command, and each
>write takes two full USB frames of 1 millisecond each on UHCI, or
>three frames on OHCI.  With an EHCI or XHCI controller, the large
>number of transfers is not a problem as these controllers can perform
>multiple control transfers within a single frame.
>
>The driver already contains code to do the transfers in larger blocks
>using the WRITE_REGION_1 command, but it's #if'ed out with a comment
>saying it is "not stable on RT2860".  The FreeBSD driver is similar,
>except that its version of the #if'ed-out code limits the transfers
>to a maximum of 64 bytes at a time.
>
>I have verified that enabling the use of WRITE_REGION_1 with the
>64-byte limit from FreeBSD works with my RT5370 based adapter, and
>makes it configure about ten times faster on OHCI and UHCI.
>
>I'm now using the following patch, which enables the use of
>WRITE_REGION_1 using blocks of up to 64 bytes for the RT5370 only:
>
>  https://www.gson.org/netbsd/patches/run-faster.patch
>
>OK to commit?

Yes, let's do it but put a comment in the commit message that provides
a summary of this message, so next time someone encounters an issue,
they find the information in one place.

christos
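The arithmetic behind the slowdown can be sketched as follows, using only the figures quoted above: two bytes per WRITE_2, 1 ms USB frames, two frames per write on UHCI and three on OHCI, and blocks of at most 64 bytes for WRITE_REGION_1. The 4096-byte payload is a hypothetical size chosen for illustration, not the amount the run(4) driver actually writes, and the function names are made up for this sketch.

```python
FRAME_MS = 1  # one full-speed USB frame, as stated in the message


def split_blocks(data, max_len=64):
    """Split a payload into consecutive blocks of at most max_len bytes."""
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]


def write2_transfers(nbytes):
    """Number of two-byte WRITE_2 commands needed for nbytes."""
    return (nbytes + 1) // 2


def write2_time_ms(nbytes, frames_per_write):
    """Time spent if every WRITE_2 costs frames_per_write frames."""
    return write2_transfers(nbytes) * frames_per_write * FRAME_MS


payload = bytes(4096)  # hypothetical payload size, for illustration only

print("WRITE_2 transfers:     ", write2_transfers(len(payload)))
print("WRITE_2 on UHCI (ms):  ", write2_time_ms(len(payload), 2))
print("WRITE_2 on OHCI (ms):  ", write2_time_ms(len(payload), 3))
print("64-byte block transfers:", len(split_blocks(payload)))
```

The sketch counts block transfers but does not estimate their duration, since the message gives no per-transfer cost for WRITE_REGION_1; the reported gain of about ten times is a measurement from the message, not something this calculation derives.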
kernel stack usage
Hey folks,

triggered by some experiments simonb did on mips I wrote a script to find
the functions using the biggest stack frame in my current sparc64 kernel.

The top 15 list is:

Frame/b  Function
4096     pci_conf_print at pci_subr.c:4812
4096     dtv_demux_read at dtv_demux.c:493
3536     SHA3_Selftest at sha3.c:430
3408     genfb_calc_hsize at genfb.c:630
3248     radeonfb_pickres at radeonfb.c:4127
2304     radeonfb_set_cursor at radeonfb.c:3690
2272     gem_pci_attach at if_gem_pci.c:147
2256     twoway_memmem at memmem.c:84
2240     bwfm_rx_event_cb at bwfm.c:2099
2240     compat_60_ptmget_ioctl at tty_60.c:70
2112     db_stack_trace_print at db_trace.c:77
1664     wdcprobe_with_reset at wdc.c:491
1424     nfsrv_rename at nfs_serv.c:1906
1408     OF_mapintr at ofw_machdep.c:728
1344     sysctl_hw_firmware_path at firmload.c:81
1280     fw_bmr at firewire.c:2296
1264     cdioctl at cd.c:1204
1248     cpu_reset_fpustate at cpu.c:400
1248     aubtfwl_attach_hook at aubtfwl.c:273
1248     uvm_swap_stats at uvm_swap.c:726

(left column is size of the frame on sparc64 in bytes)

I think anything > 1k is dubious and should be checked.

Martin