Re: Request for Testing: TCP RACK
On 3/12/24 6:39 AM, Nuno Teixeira wrote: Hello, I'm curious about tcp RACK. As I do not run on a server background, only a laptop and a rpi4 for poudriere, git, browsing, some torrent and ssh/sftp connections, will I see any difference using RACK? What tests should I do for comparison? I found this blog post from Klara was a good backgrounder to get me comfortable with testing out RACK: https://klarasystems.com/articles/using-the-freebsd-rack-tcp-stack/ I've been using it on a busy'ish server and a workstation without any issues, but I'll defer to others as to which areas testing should focus on. -pete -- Pete Wright p...@nomadlogic.org
NLNet Labs Ending Dev of drill(1)
I just came across this blog post which seems to indicate that the drill(1) utility from NLNet is ending development in favor of a Rust-based tool: https://blog.nlnetlabs.nl/domain-dns-building-blocks-for-rust-application-developers/ https://fosstodon.org/@nlnetlabs/111964417192522741 I was curious if a) anyone was aware of this and b) will we maintain a version of drill(1) in base or revert to including dns/bind-tools in base? not trying to start a "rust in base" discussion, just curious if i should start making plans now to have a replacement for this tool at my site. -pete -- Pete Wright p...@nomadlogic.org
Re: nvme controller reset failures on recent -CURRENT
There's a tiny chance that this could be something more exotic, but my money is on hardware gone bad after 2 years of service. I don't think this is 'wear out' of the NAND (it's only 15TB written, but it could be if this drive is really, really crappy NAND: first generation QLC maybe, but it seems too new). It might also be a connector problem that's developed over time. There might be a few other things too, but I don't think this is a U.2 drive with funky cables. The system was probably idle the majority of those two years of power on time. It's one of these: https://www.techpowerup.com/ssd-specs/intel-660p-512-gb.d437 I've seen comments that these generally don't need cooling. I just ordered a heatsink with some nice big fins, but it will take a week or more to arrive. just wanted to add another data-point to this discussion. i had a crucial NVME drive on my workstation that recently was showing similar problems. after much debugging i came to the same conclusion that it was getting too hot. i went ahead and purchased a Sabrent NVME drive that came with a heat sink. i've also started making much more use of my workstation (and the disk subsystem) and have had zero issues. so lessons learnt: 1. M.2 nvme really does need proper cooling, much more so than traditional SATA/SAS/SCSI drives. 2. not all vendors do a great job reporting the health of devices -pete -- Pete Wright p...@nomadlogic.org
Re: upgrade failure
On 12/12/23 1:16 PM, AN wrote: Hi Pete: After running make clean I now get the following error: i think Warner mentioned in this thread that running "make delete-old", or possibly "make delete-old-libs", cleans up some stale files. i've run into similar issues in the past and that seemed to help. -pete -- Pete Wright p...@nomadlogic.org
Re: upgrade failure
On 12/11/23 18:19, AN wrote: I just had this failure trying to upgrade from 14 to 15. Any help fixing this would be appreciated. --- _SERINS --- install -C -o root -g wheel -m 444 /usr/src/contrib/llvm-project/libcxx/include/__system_error/errc.h /usr/src/contrib/llvm-project/libcxx/include/__system_error/error_category.h /usr/src/contrib/llvm-project/libcxx/include/__system_error/error_code.h /usr/src/contrib/llvm-project/libcxx/include/__system_error/error_condition.h /usr/src/contrib/llvm-project/libcxx/include/__system_error/system_error.h /usr/include/c++/v1/__system_error/ --- _THRINS --- install -C -o root -g wheel -m 444 /usr/src/contrib/llvm-project/libcxx/include/__thread/formatter.h /usr/src/contrib/llvm-project/libcxx/include/__thread/id.h /usr/src/contrib/llvm-project/libcxx/include/__thread/poll_with_backoff.h /usr/src/contrib/llvm-project/libcxx/include/__thread/this_thread.h /usr/src/contrib/llvm-project/libcxx/include/__thread/thread.h /usr/src/contrib/llvm-project/libcxx/include/__thread/timed_backoff_policy.h /usr/include/c++/v1/__thread/ --- _TUPINS --- install -C -o root -g wheel -m 444 /usr/src/contrib/llvm-project/libcxx/include/__tuple/make_tuple_types.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/pair_like.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/sfinae_helpers.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_element.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_indices.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_like.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_like_ext.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_size.h /usr/src/contrib/llvm-project/libcxx/include/__tuple/tuple_types.h /usr/include/c++/v1/__tuple/ install: target directory `/usr/include/c++/v1/__tuple/' does not exist did you run "make clean" before doing make buildworld/buildkernel? -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: nvme timeout issues with hardware and bhyve vm's
On Thu, Dec 07, 2023 at 04:19:12PM -0800, Chuck Tuffli wrote:
> On Thu, Dec 7, 2023 at 2:39 PM Pete Wright wrote:
> ...
> > Hi Warner, just resurfacing this thread because I've had a few lockups
> > on my workstation running 14.0-STABLE. I was able to capture a photo of
> > the hang and this seems to be the most important line:
> >
> > nvme0: Resetting controller due to a timeout and possible hot unplug.
> >
> > When I scan the device after reboot I don't see any errors, but if there
> > is a particular thing I should check via nvmecontrol please let me know.
> > Also, since it mentions possible hot unplug I wonder if this is
> > hardware/firmware related to my system?
>
> Does the device support Persistent Log pages (LID=0x0d)? If so, it
> might be interesting to dump those.
>

unfortunately it does not. i will probably just script this up and dump some data into my local prometheus server so i can see what the temp looks like over time.

-p -- Pete Wright p...@nomadlogic.org
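A minimal sketch of what "scripting this up" could look like, assuming the temperature line has the shape quoted elsewhere in this thread from "nvmecontrol logpage -p 2 nvme0". The metric name nvme_temperature_celsius and the device label are made-up examples, not an existing exporter's naming.

```python
import re

# Matches the composite temperature line as quoted in this thread, e.g.
# "Temperature:314 K, 40.85 C, 105.53 F". Whitespace after the colon is
# tolerated because real output may be column-aligned (assumption).
TEMP_RE = re.compile(r"Temperature:\s*(\d+)\s*K")


def kelvin_from_logpage(text):
    """Return the composite temperature in Kelvin, or None if absent."""
    match = TEMP_RE.search(text)
    return int(match.group(1)) if match else None


def to_prometheus(device, kelvin):
    """Format one gauge sample in the Prometheus text exposition format.

    The metric name is a hypothetical example chosen for this sketch.
    """
    celsius = kelvin - 273.15
    return 'nvme_temperature_celsius{device="%s"} %.2f' % (device, celsius)


if __name__ == "__main__":
    sample = "Temperature:314 K, 40.85 C, 105.53 F"
    print(to_prometheus("nvme0", kelvin_from_logpage(sample)))
```

One way to use this would be to run it from cron and write the resulting line to a file that a textfile-style collector picks up; the exact path and collector setup depend on the local Prometheus installation.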
Re: nvme timeout issues with hardware and bhyve vm's
On 12/7/23 3:16 PM, Craig Leres wrote: On 12/7/23 15:09, Tomoaki AOKI wrote: If I myself encounter this kind of problem ON BARE METAL HARDWARE, I would usually suspect *Overheating caused hang of NVMe controller or PCI bridge on SSD, or This would also be my first guess. Five years ago I had an nvme in an intel nuc that would sometimes "go to sleep", here's the thread https://lists.freebsd.org/pipermail/freebsd-hackers/2018-May/052783.html @imp helpfully suggested running "nvmecontrol logpage -p 2 nvme0" which showed mine was hot (60° C/140° F)! I adjusted the fan settings in the bios and have never had an issue since. oh interesting, i'll run that next time it locks up. the box is well ventilated, but that's not to say it's not overheating. right now it's at: Temperature:314 K, 40.85 C, 105.53 F nvmecontrol doesn't list any errors or warnings though: Media errors: 0 No. error info log entries: 0 Warning Temp Composite Time:0 Error Temp Composite Time: 0 thanks for the tip! -pete -- Pete Wright p...@nomadlogic.org
Re: nvme timeout issues with hardware and bhyve vm's
On 12/7/23 2:49 PM, Warner Losh wrote: On Thu, Dec 7, 2023 at 3:38 PM Pete Wright wrote: On 10/13/23 7:34 PM, Warner Losh wrote:
> the messages i posted in the start of the thread are from the VM itself
> (13.2-RELEASE). The zpool on the hypervisor (13.2-RELEASE) showed no
> such issues.
>
> Based on your comment about the improvements in 14 I'll focus my efforts
> on my workstation, it seemed to happen regularly so hopefully i can find
> a repro case.
>
> Let me know if you see similar messages in stable/14. I think I've fixed
> all the issues with timeouts, though you shouldn't ever see them in a vm
> setup unless something else weird is going on.
>

Hi Warner, just resurfacing this thread because I've had a few lockups on my workstation running 14.0-STABLE. I was able to capture a photo of the hang and this seems to be the most important line: nvme0: Resetting controller due to a timeout and possible hot unplug. When I scan the device after reboot I don't see any errors, but if there is a particular thing I should check via nvmecontrol please let me know. Also, since it mentions possible hot unplug I wonder if this is hardware/firmware related to my system? Anyway, haven't found a repro case yet but it has locked up a few times the past two weeks.

What the message means is that (a) we stopped getting interrupts from the device and (b) when we went to check on the status of the device it read back like missing hardware. So is this from inside the VM running under bhyve, or in the host that's hosting the VM? We have different next steps depending on where it is.

OK awesome thanks for that context, so this is on a bare metal workstation. -pete -- Pete Wright p...@nomadlogic.org
Re: nvme timeout issues with hardware and bhyve vm's
On 10/13/23 7:34 PM, Warner Losh wrote: the messages i posted in the start of the thread are from the VM itself (13.2-RELEASE). The zpool on the hypervisor (13.2-RELEASE) showed no such issues. Based on your comment about the improvements in 14 I'll focus my efforts on my workstation, it seemed to happen regularly so hopefully i can find a repro case. Let me know if you see similar messages in stable/14. I think I've fixed all the issues with timeouts, though you shouldn't ever see them in a vm setup unless something else weird is going on. Hi Warner, just resurfacing this thread because I've had a few lockups on my workstation running 14.0-STABLE. I was able to capture a photo of the hang and this seems to be the most important line: nvme0: Resetting controller due to a timeout and possible hot unplug. When I scan the device after reboot I don't see any errors, but if there is a particular thing I should check via nvmecontrol please let me know. Also, since it mentions possible hot unplug I wonder if this is hardware/firmware related to my system? Anyway, haven't found a repro case yet but it has locked up a few times the past two weeks. -pete -- Pete Wright p...@nomadlogic.org
Re: [HEADS-UP] Quick update to 14.0-RELEASE schedule
On 11/15/23 16:30, Glen Barber wrote: The alternative would be to say nothing at all. Either way, it is a productivity, communication drain. It is a lose-lose situation no matter how one looks at it given the above context. We either get chastised for being "too open" into insights delaying an official announcement, or for being "not open enough" when there is silence from RE when a release does not meet its scheduled announcement date. Glen, I appreciate your transparency and the efforts you have made to keep everyone in the loop. I think people get excited about new releases, which is probably a good thing. IMHO you've been striking a good balance. As an operator these updates are really helpful for me, as they allow me to adjust timelines on my end for updating my fleet of servers. You and the RE team do an incredible job - and personally I am thankful for the caution and common sense you all bring to this complex process. Cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: nvme timeout issues with hardware and bhyve vm's
On 10/13/23 6:24 AM, Warner Losh wrote: On Thu, Oct 12, 2023, 10:53 PM Pete Wright wrote: On 10/12/23 8:45 PM, Warner Losh wrote: > What version is that kernel? oh dang i sent this to the wrong list, i'm not running current. the hypervisor and vm are both 13.2 and my workstation is a recent 14.0 pre-release build. i'll do more homework tomorrow and post to questions or a more appropriate list. Are the messages from the VM? Stable/14 should have the important nvme changes I've made lately. The bhyve in 13.2 is lacking a number of nvme fixes that have gone into current and stable/14. It's hard to say where the fault is coming from. the messages i posted in the start of the thread are from the VM itself (13.2-RELEASE). The zpool on the hypervisor (13.2-RELEASE) showed no such issues. Based on your comment about the improvements in 14 I'll focus my efforts on my workstation, it seemed to happen regularly so hopefully i can find a repro case. thanks warner! -pete -- Pete Wright p...@nomadlogic.org
Re: nvme timeout issues with hardware and bhyve vm's
On 10/12/23 8:45 PM, Warner Losh wrote: What version is that kernel? oh dang i sent this to the wrong list, i'm not running current. the hypervisor and vm are both 13.2 and my workstation is a recent 14.0 pre-release build. i'll do more homework tomorrow and post to questions or a more appropriate list. -pete -- Pete Wright p...@nomadlogic.org
nvme timeout issues with hardware and bhyve vm's
hey there - i was curious if anyone has had issues with nvme devices recently. i'm chasing down similar issues on my workstation which has a physical NVMe zroot, and on a bhyve VM which has a large pool exposed as a NVMe device (and is backed by a zvol). on the most recent bhyve issue the VM reported this:

Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_START 13737432416007567 vs 13737432371683671
Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_START 13737432718499597 vs 13737432371683671
Oct 13 02:52:52 emby kernel: nvme1: timeout with nothing complete, resetting
Oct 13 02:52:52 emby kernel: nvme1: Resetting controller due to a timeout.
Oct 13 02:52:52 emby kernel: nvme1: RECOVERY_WAITING
Oct 13 02:52:52 emby kernel: nvme1: resetting controller
Oct 13 02:52:53 emby kernel: nvme1: waiting
Oct 13 02:53:23 emby syslogd: last message repeated 114 times
Oct 13 02:53:23 emby kernel: nvme1: controller ready did not become 1 within 30500 ms
Oct 13 02:53:23 emby kernel: nvme1: failing outstanding i/o
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:1 cid:119 nsid:1 lba:4968850592 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:1 cid:119 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: failing outstanding i/o
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:6 cid:0 nsid:1 lba:5241952432 len:32
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:3 cid:123 nsid:1 lba:4968850336 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:3 cid:123 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:3 cid:0 nsid:1 lba:5242495888 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:528 len:16
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:5 cid:0 nsid:1 lba:4934226784 len:96
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:6442449936 len:16
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: READ sqid:3 cid:0 nsid:1 lba:6442450448 len:16
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:5 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:6 cid:0 cdw0:0
Oct 13 02:53:25 emby kernel: nvd1: detached

I had similar issues on my workstation as well. Scrubbing the NVMe device on my real-hardware workstation hasn't turned up any issues, but the system has locked up a handful of times. Just curious if others have seen the same, or if someone could point me in the right direction... thanks! -pete -- Pete Wright p...@nomadlogic.org
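When comparing several such incidents it can help to summarize how many commands were aborted on each submission queue. The following is a rough sketch, assuming the "ABORTED - BY REQUEST" lines keep the field layout shown in the log above; the function name count_aborts is made up for this example.

```python
import re
from collections import Counter

# Field layout copied from the kernel messages in this post; other kernel
# versions may print these lines differently (assumption).
ABORT_RE = re.compile(
    r"(nvme\d+): ABORTED - BY REQUEST \(\w+/\w+\) "
    r"crd:\d+ m:\d+ dnr:\d+ sqid:(\d+) cid:(\d+)"
)


def count_aborts(log_text):
    """Count aborted commands per (controller, submission queue id)."""
    counts = Counter()
    for controller, sqid, _cid in ABORT_RE.findall(log_text):
        counts[(controller, int(sqid))] += 1
    return counts


# A few lines taken verbatim from the log above.
SAMPLE = """\
Oct 13 02:53:23 emby kernel: nvme1: WRITE sqid:1 cid:119 nsid:1 lba:4968850592 len:256
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:1 cid:119 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 sqid:3 cid:123 cdw0:0
Oct 13 02:53:23 emby kernel: nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:0 sqid:3 cid:0 cdw0:0
"""

if __name__ == "__main__":
    for (controller, sqid), n in sorted(count_aborts(SAMPLE).items()):
        print(controller, "sqid", sqid, "aborted", n)
```

Feeding it the relevant slice of /var/log/messages instead of SAMPLE would give a per-queue tally for a whole incident.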
Re: zfs autotrim default to off now
On 8/28/23 12:00, Alexander Motin wrote: On 28.08.2023 13:56, Pete Wright wrote: So to be clear, if we were using the default autotrim=enabled behavior we in fact weren't having our SSDs trimmed? I think that's my concern, as an admin I was under the impression that it was enabled by default but apparently that wasn't actually happening. We wanted autotrim to be enabled by default, but it was not enabled, and it was reported as not enabled, so there should be no confusion. The only confusion may have been if you tried to read the code and see it should have been enabled. ok, thanks Alexander! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: zfs autotrim default to off now
On 8/28/23 07:23, Alexander Motin wrote: Hi Pete, On 27.08.2023 23:34, Pete Wright wrote: looking at a recent pull of CURRENT i'm noticing this in the git logs: #15079 set autotrim default to 'off' everywhere which references this openzfs PR: https://github.com/openzfs/zfs/pull/15079 looking at the PR i'm not seeing a reference to a bug report or anything, is anyone able to point me to a bug report for this. it seems like a pretty major issue: "As it turns out having autotrim default to 'on' on FreeBSD never really worked due to mess with defines where userland and kernel module were getting different default values (userland was defaulting to 'off', module was thinking it's 'on')." i'd just like to make sure i better understand the issue and can see if my systems are impacted. You are probably misinterpreting the quote. There is nothing wrong with the autotrim itself, assuming your specific devices properly handle it. It is just saying that setting it to "on" by default on FreeBSD, that was done to keep pre-OpenZFS behavior, appeared broken for a while. So that commit merely confirmed the status quo. It should not affect any already existing pools. On a new pool creation the default is now officially "off", matching OpenZFS on other platforms, but there is no reason why you can not set it to "on", if it is beneficial for your devices and workloads. As alternative, for example, you may run trim manually from time to time during any low activity periods. OK I think that makes sense. So to be clear, if we were using the default autotrim=enabled behavior we in fact weren't having our SSDs trimmed? I think that's my concern, as an admin I was under the impression that it was enabled by default but apparently that wasn't actually happening. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
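For admins who want to see where they stand, the reported value per pool is what matters. Below is a small sketch, assuming the tab-separated output of "zpool get -H -o name,value autotrim"; the helper names and the sample pool names (zroot, tank) are made up for illustration.

```python
import subprocess


def pools_without_autotrim(zpool_output):
    """Return the pools whose reported autotrim value is "off".

    Expects the tab-separated text printed by
    `zpool get -H -o name,value autotrim`.
    """
    pools = []
    for line in zpool_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[1].strip() == "off":
            pools.append(fields[0])
    return pools


def read_autotrim():
    """Run zpool and return its raw output (needs ZFS on the host)."""
    result = subprocess.run(
        ["zpool", "get", "-H", "-o", "name,value", "autotrim"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical output for two pools, used only to show the parsing.
SAMPLE = "zroot\toff\ntank\ton\n"
print(pools_without_autotrim(SAMPLE))
```

For any pool listed, "zpool set autotrim=on <pool>" enables it, and "zpool trim <pool>" starts a manual trim, matching the two options described in the reply above.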
zfs autotrim default to off now
looking at a recent pull of CURRENT i'm noticing this in the git logs: #15079 set autotrim default to 'off' everywhere which references this openzfs PR: https://github.com/openzfs/zfs/pull/15079 looking at the PR i'm not seeing a reference to a bug report or anything, is anyone able to point me to a bug report for this? it seems like a pretty major issue: "As it turns out having autotrim default to 'on' on FreeBSD never really worked due to mess with defines where userland and kernel module were getting different default values (userland was defaulting to 'off', module was thinking it's 'on')." i'd just like to make sure i better understand the issue and can see if my systems are impacted. thanks! -pete -- Pete Wright p...@nomadlogic.org
Re: kabylake + drm-515-kmod/drm-510-kmod hangs
On 8/21/23 11:24, Mark Johnston wrote: Does your system have the debug.debugger_on_panic sysctl set to 1? If so, does setting it to 0 allow the system to reboot following the hang?

oh fantastic, yea that works - thanks for the heads up!

Commit cedc82c0466a in src changed the layout of a structure used by a stub in the GPU firmware kernel modules. If you rebuild the one(s) you need from ports, does the problem persist?

FWIW I had to do this:

$ cd /usr/ports/graphics/gpu-firmware-intel-kmod
$ sudo make reinstall FLAVOR=kabylake

bingo that was it. i missed that change, thanks for pointing that out Mark. I'm able to load the i915kms module now. next step is figuring out why the nvidia-drm module is causing a panic (this is one of those funky intel + nvidia GPU laptops). but i get a core on that, and having the intel gpu load allows me to run X at least. thanks everyone! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: kabylake + drm-515-kmod/drm-510-kmod hangs
On 8/21/23 10:49, Poul-Henning Kamp wrote: Pete Wright writes: i've got a kabylake laptop that i've been using with drm-kmod for several years without much hassle. after upgrading to a new CURRENT this weekend I've found that when loading either the 510 or 515 drm-kmod kernel modules my system will hang. Does it make any difference if you load the module from single-user mode ? I've seen similar problems on my T14s if I try to load it from multi-user. oh that's an interesting idea. i just tried it and unfortunately got the same results. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: kabylake + drm-515-kmod/drm-510-kmod hangs
On 8/21/23 10:53, Cy Schubert wrote: In message <76275772-a9c3-ed59-5fb3-47a13d2a6...@nomadlogic.org>, Pete Wright writes: hey there, i've got a kabylake laptop that i've been using with drm-kmod for several years without much hassle. after upgrading to a new CURRENT this weekend I've found that when loading either the 510 or 515 drm-kmod kernel modules my system will hang. unfortunately i am not getting a panic or crash, the screen stops updating and i am unable to ping or SSH into the system. interestingly the capslock LED still toggles but doing a CTL+ALT+DEL does not seem to do anything useful and i have to manually power cycle. any tips for finding out what's going on? i've booted the system with verbose dmesg output, and loaded the module with "kldload -v" but do not get any useful output. here's the uname: FreeBSD colony 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #0 main-n264924-e2340276fc73: Sun Aug 20 21:28:44 PDT 2023 pete@colony:/usr/obj/usr/home/pete/git/freebsd/amd64.amd64/sys/GENERIC amd64 these are the log messages i see before the system locks up: Aug 21 10:40:34 colony kernel: iic0: on iicbus0 Aug 21 10:40:35 colony kernel: drmn0: on vgapci0 Aug 21 10:40:35 colony kernel: vgapci0: child drmn0 requested pci_enable_io Aug 21 10:40:35 colony syslogd: last message repeated 1 times Aug 21 10:40:35 colony kernel: [drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-19).
Aug 21 10:40:35 colony kernel: [drm] Got stolen memory base 0x4b80, size 0x400 Aug 21 10:40:35 colony kernel: lkpi_iic0: on drmn0 Aug 21 10:40:35 colony kernel: iicbus1: on lkpi_iic0 Aug 21 10:40:35 colony kernel: iic1: on iicbus1 Aug 21 10:40:35 colony kernel: lkpi_iic1: on drmn0 Aug 21 10:40:35 colony kernel: iicbus2: on lkpi_iic1 Aug 21 10:40:35 colony kernel: iic2: on iicbus2 Aug 21 10:40:35 colony kernel: lkpi_iic2: on drmn0 Aug 21 10:40:35 colony kernel: iicbus3: on lkpi_iic2 Aug 21 10:40:35 colony kernel: iic3: on iicbus3 Aug 21 10:40:35 colony kernel: lkpi_iic3: on drmn0 Aug 21 10:40:35 colony kernel: iicbus4: on lkpi_iic3 Aug 21 10:40:35 colony kernel: iic4: on iicbus4 cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA Rebuilding drm-51[05]-kmod after an update to LinuxKPI affecting the ABI used by the drm modules is required. Typically I get a kernel panic on a page fault when this occurs. Depending on how memory is laid out on your system you may get a hang instead. You need to install the new kernel and world first. Disable xdm, gdm, any other *dm, or simply do not use startx. From a text console session rebuild the drm port and reinstall it. I use poudriere here. My procedure is to update the poudriere jail, rebuild the port (-C option) and pkg upgrade -f or pkg install -f. Use this approach if you use poudriere. Thanks Cy, yes my local scripts make sure to update the ports tree, then rebuild the drm-kmod module i'm using as a package. then i remove the old pkg, install the freshly built one, then reboot. this ensures my kernel and drm modules match. this is how i've been doing it for years on all my systems since we started the work on the drm-kmod. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
kabylake + drm-515-kmod/drm-510-kmod hangs
hey there, i've got a kabylake laptop that i've been using with drm-kmod for several years without much hassle. after upgrading to a new CURRENT this weekend I've found that when loading either the 510 or 515 drm-kmod kernel modules my system will hang. unfortunately i am not getting a panic or crash, the screen stops updating and i am unable to ping or SSH into the system. interestingly the capslock LED still toggles but doing a CTL+ALT+DEL does not seem to do anything useful and i have to manually power cycle. any tips for finding out what's going on? i've booted the system with verbose dmesg output, and loaded the module with "kldload -v" but do not get any useful output. here's the uname: FreeBSD colony 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #0 main-n264924-e2340276fc73: Sun Aug 20 21:28:44 PDT 2023 pete@colony:/usr/obj/usr/home/pete/git/freebsd/amd64.amd64/sys/GENERIC amd64 these are the log messages i see before the system locks up: Aug 21 10:40:34 colony kernel: iic0: on iicbus0 Aug 21 10:40:35 colony kernel: drmn0: on vgapci0 Aug 21 10:40:35 colony kernel: vgapci0: child drmn0 requested pci_enable_io Aug 21 10:40:35 colony syslogd: last message repeated 1 times Aug 21 10:40:35 colony kernel: [drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-19). 
Aug 21 10:40:35 colony kernel: [drm] Got stolen memory base 0x4b80, size 0x400 Aug 21 10:40:35 colony kernel: lkpi_iic0: on drmn0 Aug 21 10:40:35 colony kernel: iicbus1: on lkpi_iic0 Aug 21 10:40:35 colony kernel: iic1: on iicbus1 Aug 21 10:40:35 colony kernel: lkpi_iic1: on drmn0 Aug 21 10:40:35 colony kernel: iicbus2: on lkpi_iic1 Aug 21 10:40:35 colony kernel: iic2: on iicbus2 Aug 21 10:40:35 colony kernel: lkpi_iic2: on drmn0 Aug 21 10:40:35 colony kernel: iicbus3: on lkpi_iic2 Aug 21 10:40:35 colony kernel: iic3: on iicbus3 Aug 21 10:40:35 colony kernel: lkpi_iic3: on drmn0 Aug 21 10:40:35 colony kernel: iicbus4: on lkpi_iic3 Aug 21 10:40:35 colony kernel: iic4: on iicbus4 cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: huge amount laundry memory not being cleaned up
On 7/11/23 10:37 AM, Olivier Certner wrote: Le mardi 11 juillet 2023, 01:51:02 CEST Pete Wright a écrit : sorry for the noise folks, this is almost certainly an issue with my local env. this is helpful for me, as i'll have a better idea as to where i should focus my efforts trying to find this memory hog next time. I have had what I suspect to be the same issue on 13.2-STABLE with KDE and very long sessions. I say "suspect" since I seem to remember that laundry usage went down after logging out (or restarting kwin or plasmashell), but I'm not completely sure now. So may not be specific to your local env after all, nor to CURRENT. Regards. oh interesting. i've found myself having to restart plasma once a week or so to fix some glitches with the tool bar. i just run: kquitapp5 plasmashell && kstart5 plasmashell so it's certainly possible that's where the issue lies. i'd try another window manager, but i really like plasma5 lol. maybe i'll take one for the team and run mate or xfce for a few weeks and see if i see similar behavior. -pete -- Pete Wright p...@nomadlogic.org
Re: huge amount laundry memory not being cleaned up
On 7/10/23 4:26 PM, Pete Wright wrote: i'm doing a build world now since i'll need to reboot this box anyway, just to get everything up to date. interestingly enough i'm still pegged at 14G of laundry memory, and my 2G swap is 100% utilized. once the build world completes i'll do a double check and see if i can find any large consumers of resident memory which may lead me in the right direction. so...build world completed, and after quitting my kde5 session all of the laundry got freed up right away. i suspect either firefox/chrome didn't exit cleanly, or there was something funky going on with kde. sorry for the noise folks, this is almost certainly an issue with my local env. this is helpful for me, as i'll have a better idea as to where i should focus my efforts trying to find this memory hog next time. -p -- Pete Wright p...@nomadlogic.org
Re: huge amount laundry memory not being cleaned up
On 7/10/23 4:01 PM, Mark Millard wrote: Pete Wright wrote on Date: Mon, 10 Jul 2023 19:35:26 UTC : hi there, i've got a workstation running CURRENT that recently ran out of swap space. i killed the usual suspects (chrome, firefox and thunderbird) and noticed some odd behavior. while some memory did get freed up - after leaving the system idle for 4 hours i still have 14G of memory in the Laundry according to top. I also have noticed that very little data has paged out of swap (100MB out of 2G). i was wondering if there was a good way to determine what is in the laundry, I do not know how to get a breakdown of the laundry's usage. But I'd expect, say, for example, top's resident memory figures would count what is in the laundry as resident. If correct, given the large laundry usage, maybe some resident figures would be suggestive? thanks Mark, so yea i poked around and didn't see any large consumers of resident memory. or get diagnostic info on why it's not cleaning itself up? As I understand, it would take one of the following to change the status of the pages in the laundry in normal operation: A) access to a page by a program, turning the page into being in the active category. (It might go through inactive to get there?) B) memory pressure leading to sending the page to the swap in order to provide a page for a different use. Time alone does not contribute much as I understand. More on-demand driven. Laundry is sort of "inactive but known to be dirty" as I understand, in some respects just a subset of inactive optimized for being closer to ready to page out to swap space if needed. i'm doing a build world now since i'll need to reboot this box anyway, just to get everything up to date. interestingly enough i'm still pegged at 14G of laundry memory, and my 2G swap is 100% utilized. once the build world completes i'll do a double check and see if i can find any large consumers of resident memory which may lead me in the right direction. cheers!
-pete -- Pete Wright p...@nomadlogic.org
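This does not give a breakdown of what is in the laundry, but the size of the laundry queue can be tracked over time outside of top. A minimal sketch, assuming the output of "sysctl vm.stats.vm.v_laundry_count hw.pagesize" in the usual "name: value" form; the sample numbers below are invented so that they work out to the 14G figure mentioned in this thread.

```python
def parse_sysctl(text):
    """Parse "name: value" lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def laundry_gib(values):
    """Laundry size in GiB: number of laundry pages times the page size."""
    pages = values["vm.stats.vm.v_laundry_count"]
    page_size = values["hw.pagesize"]
    return pages * page_size / 2**30


# Hypothetical sample: 3670016 pages of 4096 bytes each.
SAMPLE = "vm.stats.vm.v_laundry_count: 3670016\nhw.pagesize: 4096\n"
print(laundry_gib(parse_sysctl(SAMPLE)))
```

Logging this value periodically, for example before and after restarting a desktop session, would show whether a particular application is what keeps the pages in the laundry.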
huge amount laundry memory not being cleaned up
hi there, i've got a workstation running CURRENT that recently ran out of swap space. i killed the usual suspects (chrome, firefox and thunderbird) and noticed some odd behavior. while some memory did get freed up - after leaving the system idle for 4 hours i still have 14G of memory in the Laundry according to top. I also have noticed that very little data has paged out of swap (100MB out of 2G). i was wondering if there was a good way to determine what is in the laundry, or get diagnostic info on why it's not cleaning itself up? when i've seen this before only a reboot will get the system back to being stable; if i re-launch my desktop apps they'll quickly start trying to page out to disk again, creating an OOM condition. system has 32G of RAM and is running this checkout: FreeBSD topanga 14.0-CURRENT FreeBSD 14.0-CURRENT #66 main-n263884-d2a45e9e817a: Thu Jun 29 15:50:44 PDT 2023 Cheers, -pete -- Pete Wright p...@nomadlogic.org
Re: PinePhone Pro Boots On CURRENT
On 5/30/23 6:02 AM, Mario Marietto wrote: That's interesting. As I have already said, I haven't bought the pinephone pro, because it is expensive for me. So I'm working on a parallel project. I've bought this phone instead: https://www.hdblog.it/schede-tecniche/samsung-galaxy-a6_i3655/ That's cheaper. In the specs I read that it has a mali gpu, too: Mali-T830MP2. So, eventually, can I use the Lima and the PanFrost driver even for my samsung galaxy A6? I've started planning to install FreeBSD on top of the Android Kernel, using a specific patch, but now I'm thinking that maybe I can install FreeBSD there natively. Can someone tell me if it is doable, giving a look at the specs of that phone model? thanks. have you tried putting a snapshot from CURRENT on a memory card and booting it? i'm not familiar with that device, but i suspect you'd need serial console access. -pete
Re: PinePhone Pro Boots On CURRENT
On 5/30/23 05:19, Stephan Lichtenauer wrote: But, once i get the linux DTB for this guy into a disk image i'm going to see if i can get the display up next, but would love to hear any input or pointers from folks with more ARM porting experience than me. I am probably telling you something you already know, but afaik the Pinephone Pro uses the Rockchip RK3399 which according to the datasheet (https://opensource.rock-chips.com/images/d/d7/Rockchip_RK3399_Datasheet_V2.1-20200323.pdf page 16, 1.2.9 Graphics Engine) contains a Mali GPU. That means Ruslan Bukin's Panfrost article in the FreeBSD Journal Jul/Aug 2021 at https://freebsdfoundation.org/wp-content/uploads/2021/08/The-Panfrost-Driver.pdf might be interesting regarding graphics. Looking forward to your updates! Oh sweet - I wasn't aware of that thanks for the pointer! I'm hoping to have some cycles to hack on this this week and will post updates or if I'm feeling ambitious (and not burnt out from day job) will create a wiki page for this device. cheers! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
PinePhone Pro Boots On CURRENT
i've had this pinephone pro for a few months now and finally got around to attempting to boot FreeBSD on it. Here's the phone: https://wiki.pine64.org/wiki/PinePhone_Pro I needed to get this serial console adapter which works flawlessly with "cu" (USB TTL Serial Adapter Converter Cable 3.3v/3v3 3.5mm Stereo Jack Cable): https://www.amazon.com/dp/B00XSPECIA?psc=1=ppx_yo2ov_dt_b_product_details then i just downloaded the latest CURRENT snapshot and put it on a microsd card and it booted into multi-user mode. here's the dmesg: https://www.nomadlogic.org/ppro-dmesg.txt i am working on building a new image now to include the pinephone pro DTB file from Linux to see if that improves some of the hardware detection. certainly a long way to go before this could be a useful mobile device, but i'm very encouraged that i can actually boot the thing. i suspect we'll need to use iwlwifi to get the AzureWave AW-CM256SM wifi and bluetooth card working... But, once i get the linux DTB for this guy into a disk image i'm going to see if i can get the display up next, but would love to hear any input or pointers from folks with more ARM porting experience than me. thanks! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: current status of zfs block_cloning on CURRENT?
On Tue, Apr 25, 2023 at 12:35:29AM -0700, Mark Millard wrote: > Warner Losh wrote on > Date: Tue, 25 Apr 2023 04:30:26 UTC : > > > On Mon, Apr 24, 2023 at 9:49 PM Charlie Li wrote: > > > > > Charlie Li wrote: > > > > Pete Wright wrote: > > > >> i've seen a few threads about the block_cloning feature causing data > > > >> corruption issues on CURRENT and have been keen to avoid enabling it > > > >> until the dust settles. i was under the impression that we either > > > >> reverted or disabled block_cloning on CURRENT, but when i ran "zpool > > > >> upgrade" on a pool today it reported block_cloning was enabled. this > > > >> is on a system i rebuilt yesterday. > > > >> > > > > The dust has settled. > > > Barely... > > > >> i was hoping to get some clarity on the effect of having this feature > > > >> enabled, is this enough to trigger the data corruption bug or does > > > >> something on the zfs filesystem itself have to be enabled to trigger > > > >> this? > > > >> > > > > The initial problem with block_cloning [0][1] was fixed in commits > > > > e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and > > > > 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit > > > > 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption > > > > problem [2][3] was fixed in commit > > > > 63ee747febbf024be0aace61161241b53245449e. All were committed between > > > > 15-17 April. > > > > > > > > [0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103 > > > > [1] https://github.com/openzfs/zfs/pull/14739 > > > > [2] https://github.com/openzfs/zfs/issues/14753 > > > > [3] https://github.com/openzfs/zfs/pull/14761 > > > > > > > Given mjg@'s thread reporting further crashes/panics, you may want to > > > keep the sysctl disabled if you upgraded the pool already. > > > > > > > I thought the plan was to keep it disabled until after 14. And even then, > > when it comes back in, it will be a new feature It should never be enabled. 
> > > https://lists.freebsd.org/archives/freebsd-current/2023-April/003514.html > > had Pawel Jakub Dawidek reporting adding a sysctl vfs.zfs.bclone_enabled > to allow the feature to be actually in use in 14, with a default that > would not have it in use. (Any cases of previously enable but not "in > use" here is wording simplification as I understand: special handling > if active from a previous pool upgrade and later activity so that > it cleans itself up, or something like that.) ah ok thanks for that insight. on my system where i did upgrade the pool i have this sysctl: $ sysctl vfs.zfs.bclone_enabled vfs.zfs.bclone_enabled: 0 which seems to jive with the statement above. thanks! -p -- Pete Wright p...@nomadlogic.org
Re: current status of zfs block_cloning on CURRENT?
On 4/24/23 21:30, Warner Losh wrote: On Mon, Apr 24, 2023 at 9:49 PM Charlie Li wrote: Charlie Li wrote: > Pete Wright wrote: >> i've seen a few threads about the block_cloning feature causing data >> corruption issues on CURRENT and have been keen to avoid enabling it >> until the dust settles. i was under the impression that we either >> reverted or disabled block_cloning on CURRENT, but when i ran "zpool >> upgrade" on a pool today it reported block_cloning was enabled. this >> is on a system i rebuilt yesterday. >> > The dust has settled. Barely... >> i was hoping to get some clarity on the effect of having this feature >> enabled, is this enough to trigger the data corruption bug or does >> something on the zfs filesystem itself have to be enabled to trigger >> this? >> > The initial problem with block_cloning [0][1] was fixed in commits > e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and > 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit > 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption > problem [2][3] was fixed in commit > 63ee747febbf024be0aace61161241b53245449e. All were committed between > 15-17 April. > > [0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103 > [1] https://github.com/openzfs/zfs/pull/14739 > [2] https://github.com/openzfs/zfs/issues/14753 > [3] https://github.com/openzfs/zfs/pull/14761 > Given mjg@'s thread reporting further crashes/panics, you may want to keep the sysctl disabled if you upgraded the pool already. I thought the plan was to keep it disabled until after 14. And even then, when it comes back in, it will be a new feature. It should never be enabled. that was my reading of things too - thanks for the tip on disabling the sysctl knob Charlie, I'll do that. if this is really intended to be live i'd like to suggest we update zpool-features(7) at the least so others aren't caught by surprise. i'd propose a PR myself, but I'm not 100% clear on what its intent is. 
-pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
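For anyone who wants to script the check discussed in this thread, here is a minimal sketch. It assumes the vfs.zfs.bclone_enabled sysctl mentioned in the thread is present and that 0 means "not in use" and 1 means "in use"; the helper name bclone_state is made up for illustration and is not part of any FreeBSD tool.

```shell
#!/bin/sh
# Hypothetical helper: turn a vfs.zfs.bclone_enabled value into a short message.
# Assumption: 0 = block cloning not in use, 1 = block cloning in use.
bclone_state() {
    case "$1" in
        0) echo "block cloning disabled" ;;
        1) echo "block cloning enabled" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Only report when the sysctl actually exists on this system.
if val=$(sysctl -n vfs.zfs.bclone_enabled 2>/dev/null); then
    bclone_state "$val"
fi
```

On a system without that sysctl the script prints nothing, which keeps it safe to drop into a periodic check.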
current status of zfs block_cloning on CURRENT?
hi everyone, i've seen a few threads about the block_cloning feature causing data corruption issues on CURRENT and have been keen to avoid enabling it until the dust settles. i was under the impression that we either reverted or disabled block_cloning on CURRENT, but when i ran "zpool upgrade" on a pool today it reported block_cloning was enabled. this is on a system i rebuilt yesterday. i was hoping to get some clarity on the effect of having this feature enabled, is this enough to trigger the data corruption bug or does something on the zfs filesystem itself have to be enabled to trigger this? i also noticed there is no entry for this feature in zpool-features(7), hence i thought i was safe to upgrade my pool. thanks in advance, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: RFC: A new NFS mount option to encourage use of Kerberized mounts
On Mon, Mar 13, 2023 at 07:25:07PM -0700, Rick Macklem wrote: > Hi, > > I have implemented a new mount option for NFSv4.1/4.2 mounts > that I hope will encourage use of Kerberos and TLS to help > secure NFS mounts. Although I do not know why users choose > to not use Kerberized NFS mounts, I think that the administrative > issues related to the "machine credential" is a factor. > This new option, which I have called "syskrb5" (feel free to > suggest a better name), avoids the need for a Kerberos machine > credential. > > > So, does this sound like something that should be committed > to FreeBSD? > speaking as an enduser.. this sounds pretty fantastic, i have several workloads in public cloud that use NFS, and having this added layer of auth would be really beneficial from a security perspective. i also like how it should be much easier for me to manage as well. one question - do you see other NFS implementations getting ready to roll out this support on their end? i ask because it would be nice to have this client support working and well tested by the time other vendors start offering this support server side. for example AWS EFS. thanks! -pete -- Pete Wright p...@nomadlogic.org
Re: Drm-kmod and 14-CURRENT
On 10/14/22 10:14, Patrick Bowen wrote: Hello all, I've just used reinstall.sh to add a CURRENT boot environment to a 13.1 ZFS installation. Xorg doesn't load in CURRENT, presumably because the drm-kmod doesn't work with 14. I tried to build drm-current-kmod from ports but the build errors out. I can send all the error messages and dmesg and such if necessary, but I'm betting it's just a simple fix that I'm unaware of. The graphics is i915 Intel BTW. I think you want to use graphics/drm-510-kmod this is what i use on my systems running CURRENT, it tracks the 5.10 linux kernel. iirc drm-current-kmod and drm-devel-kmod have been retired in favor of this one. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Updating EFI boot loader results in boot hangup
On 8/14/22 13:26, Graham Perrin wrote: On 14/08/2022 20:26, Pete Wright wrote: … has anyone else who has been impacted by this been able to recover? … If you have multiple boot environments: do you have a non-affected BE (prior to c32dde3166922f55927764464d13f1bc9640f5f6)?
So unfortunately i didn't have a recent BE, but I was able to do the following to get back up:
1. download latest CURRENT snapshot memstick image from ftp.freebsd.org and put it on a usb drive
2. boot via usb drive and enter live shell
3. load zfs kmod: kldload zfs
4. import zroot: zpool import -R /mnt/ zroot
5. mount ROOT filesystem: zfs mount zroot/ROOT/default
6. copy usb loader to zroot: cp /boot/loader /mnt/boot/loader
i'd recommend just using boot environments, it's much easier and is specifically what they are for :) -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Updating EFI boot loader results in boot hangup
On Sun, Aug 14, 2022 at 10:26:32AM +0200, Stefan Esser wrote: > Am 14.08.22 um 04:20 schrieb Oleg Lelchuk: > > Yes, Yasuhiro and I have the same error. > > Just a "me too", also on ZFS, on a Ryzen 3 based system. > > Booting the latest USB snapshot image worked, but not when I copy > the whole of /boot from that USB stick to my ZFS boot partition. > > The system is usable if I boot from USB and manually mount the ZFS > file systems over the USB boot image. > adding my voice to the chorus here of "me too". ryzen5/uefi/zfs setup. has anyone else who has been impacted by this been able to recover? i can boot into a memstick image, and access my uefi shell via the bios - but i've never had to hack on btx like this before. any links or pointers would be appreciated! -pete -- Pete Wright p...@nomadlogic.org
Re: Chasing OOM Issues - good sysctl metrics to use?
On 5/29/22 11:16, Mark Millard wrote: FYI, the combination:
vm.pageout_oom_seq=120 # in /boot/loader.conf
vm.swap_enabled=0 # in /etc/sysctl.conf
vm.swap_idle_enabled=0 # in /etc/sysctl.conf
still has not caused me any additional problems and helps avoid loss of access by avoiding the relevant interaction-processes from having their kernel stacks swapped out. (Not that the effect of vm.swap_enabled=0 is limited to interaction-processes.) So, the combination is now part of the configuration of each FreeBSD that I use. awesome thanks Mark! i appreciate your feedback and input, i've certainly learned quite a bit as well which is awesome. i'm going to revert the diff as well when i kick off my weekly rebuild now. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 5/14/22 01:09, Mark Millard wrote: One of the points is to see if I get any evidence of vm.swap_enabled=0 with vm.swap_idle_enabled=0 ending up contributing to any problems in my normal usage. So far: no. vm.pageout_oom_seq=120 is in use for this, my normal context since sometime in 2018. So to revive an old thread here. it looks like setting these two sysctl knobs has helped the situation:
vm.swap_enabled=0
vm.swap_idle_enabled=0
i've gone 7 days without any OOM events under normal work usage (as opposed to about 4 days previously). this includes the following patch to vm_pageout.c that tijl@ shared with us:
diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
index 36d5f327580..df827af3075 100644
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -1069,7 +1069,7 @@ vm_pageout_laundry_worker(void *arg)
         nclean = vmd->vmd_free_count +
             vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt;
         ndirty = vmd->vmd_pagequeues[PQ_LAUNDRY].pq_cnt;
-        if (target == 0 && ndirty * isqrt(howmany(nfreed + 1,
+        if (target == 0 && ndirty * isqrt(howmany(nfreed,
             vmd->vmd_free_target - vmd->vmd_free_min)) >= nclean) {
             target = vmd->vmd_background_launder_target;
         }
I have adjusted my behavior a little bit as well, since i do quite a bit of work in the AWS console in firefox I've been better at closing out all of those tabs when i'm not using them (their console is a serious memory hog). i've also started using an official chrome binary inside an ubuntu jail which is where i run slack and discord, that seems to behave better as well in terms of memory utilization. i am going to revert the vm_pageout.c patch today when i do my weekly rebuild of world to see how things go, maybe that'll help determine if it's really the sysctls helping or not. cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: no hw.acpi.video.lcd0 with graphics/drm-510-kmod
On 5/23/22 00:37, Emmanuel Vadot wrote: On Sun, 22 May 2022 09:08:12 -0700 Pete Wright wrote: hello, i have a lenovo P43s laptop running current. i've noticed that since graphics/drm-510-kmod became available hw.acpi.video.lcd0 ceases to exist (which makes it impossible to adjust screen brightness). i've installed graphics/drm-54-kmod and things work as expected. previously i was running drm-devel-kmod without issues, so i think it's probably an issue with the 5.10 driver? i can file a bug report for this (or just test out patches here if that's easier), but wanted to see if anyone here had observed this on other laptops first. cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA Not sure what's happening as this sysctl is exposed by acpi_video(4) and not related to drm. But you shouldn't need this, backlight(8) should work everywhere once drm is loaded. yea that's why i was confused, i investigated acpi_video changes and didn't see anything obvious. before i started trying to bisect recent acpi changes i thought it would be a good idea to test the drm-54-kmod port, which restored that sysctl knob. sounds like i should continue to try to bisect the acpi kernel changes under 5.10 and see what i can find though. cheers, -p -- Pete Wright p...@nomadlogic.org @nomadlogicLA
no hw.acpi.video.lcd0 with graphics/drm-510-kmod
hello, i have a lenovo P43s laptop running current. i've noticed that since graphics/drm-510-kmod became available hw.acpi.video.lcd0 ceases to exist (which makes it impossible to adjust screen brightness). i've installed graphics/drm-54-kmod and things work as expected. previously i was running drm-devel-kmod without issues, so i think it's probably an issue with the 5.10 driver? i can file a bug report for this (or just test out patches here if that's easier), but wanted to see if anyone here had observed this on other laptops first. cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 5/11/22 12:52, Mark Millard wrote: Relative to avoiding hang-ups, so far it seems that use of vm.swap_enabled=0 with vm.swap_idle_enabled=0 makes hang-ups less likely/less frequent/harder to produce examples of. But is no guarantee of lack of a hang-up. It does change the cause of the hang-up (in that it avoids processes with kernel stacks swapped out being involved). thanks for the above analysis Mark. i am going to test these settings out now as i'm still seeing the lockup. this most recent hang-up was using a patch tijl@ asked me to test (attached to this email), and the default setting of vm.pageout_oom_seq: 12. interestingly enough with the patch applied i observed a smaller amount of memory used for laundry as well as less swap space used until right before the crash. cheers, -p -- Pete Wright p...@nomadlogic.org @nomadlogicLA
From f3260a2eb1cdc86f9216b7923d7b09704a37e79d Mon Sep 17 00:00:00 2001
From: Tijl Coosemans
Date: Sun, 8 May 2022 12:19:28 +0200
Subject: [PATCH] vm: stop background laundering when no progress
Let vm_pageout_laundry_worker go to sleep when it cannot make progress (nfreed == 0). This allows other threads to run so they can hopefully free some memory. This restores behaviour from before c098768e4dad and 60684862588f and prevents some OOM kills.
Tested by: Pete Wright
MFC after: 2 weeks
---
 sys/vm/vm_pageout.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
index 36d5f3275800..fbb2cd4128f0 100644
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -1061,15 +1061,13 @@ vm_pageout_laundry_worker(void *arg)
          * clean pages freed by the page daemon since the last
          * background laundering. Thus, as the ratio of dirty to
          * clean inactive pages grows, the amount of memory pressure
-         * required to trigger laundering decreases. We ensure
-         * that the threshold is non-zero after an inactive queue
-         * scan, even if that scan failed to free a single clean page.
+         * required to trigger laundering decreases.
          */
 trybackground:
         nclean = vmd->vmd_free_count +
             vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt;
         ndirty = vmd->vmd_pagequeues[PQ_LAUNDRY].pq_cnt;
-        if (target == 0 && ndirty * isqrt(howmany(nfreed + 1,
+        if (target == 0 && ndirty * isqrt(howmany(nfreed,
             vmd->vmd_free_target - vmd->vmd_free_min)) >= nclean) {
             target = vmd->vmd_background_launder_target;
         }
-- 2.35.1
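To make the effect of the one-line change easier to see, here is a rough shell model of the threshold test in the patch. It is only a sketch: howmany() is written as the usual round-up division, isqrt() as a plain integer square root, the "target == 0" part of the condition is left out, and the page counts are made-up numbers, not measurements from the affected workstation.

```shell
#!/bin/sh
# howmany(x, y): round-up integer division.
howmany() { echo $(( ($1 + $2 - 1) / $2 )); }

# isqrt(n): largest r with r*r <= n.
isqrt() {
    n=$1; r=0
    while [ $(( (r + 1) * (r + 1) )) -le "$n" ]; do r=$((r + 1)); done
    echo "$r"
}

# should_launder ndirty nclean term window
#   term   = "nfreed + 1" (before the patch) or "nfreed" (with the patch)
#   window = vmd_free_target - vmd_free_min
should_launder() {
    ndirty=$1; nclean=$2; term=$3; window=$4
    factor=$(isqrt "$(howmany "$term" "$window")")
    if [ $(( ndirty * factor )) -ge "$nclean" ]; then echo yes; else echo no; fi
}

# Hypothetical numbers with nfreed == 0 (the pagedaemon freed nothing):
should_launder 3000 2000 1 1000   # before the patch, term = nfreed + 1 = 1
should_launder 3000 2000 0 1000   # with the patch,   term = nfreed     = 0
```

With nfreed == 0 the old expression still yields a factor of 1, so background laundering starts whenever ndirty >= nclean; with the patch the factor is 0 and the worker does not start laundering on that pass, which matches the "go to sleep when it cannot make progress" description in the commit message.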
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/29/22 11:38, Mark Millard wrote: On 2022-Apr-29, at 11:08, Pete Wright wrote: On 4/23/22 19:20, Pete Wright wrote: The developers handbook has a section debugging deadlocks that he referenced in a response to another report (on freebsd-hackers). https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-deadlocks d'oh - thanks for the correction! -pete hello, i just wanted to provide an update on this issue. so the good news is that by removing the file backed swap the deadlocks have indeed gone away! thanks for sorting me out on that front Mark! Glad it helped. d'oh - went out for lunch and workstation locked up. i *knew* i shouldn't have said anything lol. i still am seeing a memory leak with either firefox or chrome (maybe both where they create a voltron of memory leaks?). this morning firefox and chrome had been killed when i first logged in. fortunately the system has remained responsive for several hours which was not the case previously. when looking at my metrics i see vm.domain.0.stats.inactive take a nose dive from around 9GB to 0 over the course of 1min. the timing seems to align with around the time when firefox crashed, and is preceded by a large spike in vm.domain.0.stats.active from ~1GB to 7GB 40mins before the apps crashed. after the binaries were killed memory metrics seem to have recovered (laundry size grew, and inactive size grew by several gigs for example). Since the form of kill here is tied to sustained low free memory ("failed to reclaim memory"), you might want to report the vm.domain.0.stats.free_count figures from various time frames as well:
vm.domain.0.stats.free_count: Free pages
(It seems you are converting pages to byte counts in your report, the units I'm not really worried about so long as they are obvious.) There are also figures possibly tied to the handling of the kill activity but some being more like thresholds than usage figures, such as:
vm.domain.0.stats.free_severe: Severe free pages
vm.domain.0.stats.free_min: Minimum free pages
vm.domain.0.stats.free_reserved: Reserved free pages
vm.domain.0.stats.free_target: Target free pages
vm.domain.0.stats.inactive_target: Target inactive pages
ok thanks Mark, based on this input and the fact i did manage to lock up my system, i'm going to get some metrics up on my website and share them publicly when i have time. i'll definitely take your input into account when sharing this info. Also, what value were you using for: vm.pageout_oom_seq
$ sysctl vm.pageout_oom_seq
vm.pageout_oom_seq: 120
$
cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
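Since the vm.domain.0.stats.* figures above are page counts while the report talks in GB, a small conversion helper may keep the units obvious. This is a sketch only: pages_to_mib is a made-up name, and the 4096-byte page size in the example is an assumption (the real value can be read from hw.pagesize).

```shell
#!/bin/sh
# pages_to_mib PAGES PAGESIZE
# Convert a page count to whole MiB; both arguments are plain integers.
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Example with assumed values: 262144 pages of 4096 bytes each.
pages_to_mib 262144 4096

# On a live FreeBSD system the inputs could come from sysctl, e.g.:
#   pages_to_mib "$(sysctl -n vm.domain.0.stats.free_count)" "$(sysctl -n hw.pagesize)"
```

The same helper works for the threshold values (free_min, free_target and so on), which makes it easier to compare them with free_count on one graph.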
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/23/22 19:20, Pete Wright wrote: The developers handbook has a section debugging deadlocks that he referenced in a response to another report (on freebsd-hackers). https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-deadlocks d'oh - thanks for the correction! -pete hello, i just wanted to provide an update on this issue. so the good news is that by removing the file backed swap the deadlocks have indeed gone away! thanks for sorting me out on that front Mark! i still am seeing a memory leak with either firefox or chrome (maybe both where they create a voltron of memory leaks?). this morning firefox and chrome had been killed when i first logged in. fortunately the system has remained responsive for several hours which was not the case previously. when looking at my metrics i see vm.domain.0.stats.inactive take a nose dive from around 9GB to 0 over the course of 1min. the timing seems to align with around the time when firefox crashed, and is preceded by a large spike in vm.domain.0.stats.active from ~1GB to 7GB 40mins before the apps crashed. after the binaries were killed memory metrics seem to have recovered (laundry size grew, and inactive size grew by several gigs for example). maybe i'll have to gather data and post it online for anyone who would be interested in seeing this in graph form. although, frankly i feel like it's a browser problem which i can work around by running them in jails with resource limits in place via rctl. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/23/22 12:31, Mark Millard wrote: I think you may have taken my suggestion backwards . . . Unfortunately, vnode (file) based swap space should be *avoided* and partitions are what should be used in order to avoid deadlocks: On 2017-Feb-13, at 7:20 PM, Konstantin Belousov wrote on the freebsd-arm list: QUOTE swapfile write requires the write request to come through the filesystem write path, which might require the filesystem to allocate more memory and read some data. E.g. it is known that any ZFS write request allocates memory, and that write request on large UFS file might require allocating and reading an indirect block buffer to find the block number of the written block, if the indirect block was not yet read. As result, swapfile swapping is more prone to the trivial and unavoidable deadlocks where the pagedaemon thread, which produces free memory, needs more free memory to make a progress. Swap write on the raw partition over simple partitioning scheme directly over HBA are usually safe, while e.g. zfs over geli over umass is the worst construction. END QUOTE The developers handbook has a section debugging deadlocks that he referenced in a response to another report (on freebsd-hackers). https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-deadlocks d'oh - thanks for the correction! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/22/22 18:46, Mark Millard wrote: On 2022-Apr-22, at 16:42, Pete Wright wrote: On 4/21/22 21:18, Mark Millard wrote: Messages in the console out would be appropriate to report. Messages might also be available via the following at appropriate times: that is what is frustrating. i will get notification that the processes are killed:
Apr 22 09:55:15 topanga kernel: pid 76242 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:19 topanga kernel: pid 76288 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:20 topanga kernel: pid 76259 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:22 topanga kernel: pid 76252 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:23 topanga kernel: pid 76267 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:24 topanga kernel: pid 76234 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:26 topanga kernel: pid 76275 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Those messages are not reporting being out of swap as such. They are reporting sustained low free RAM despite a number of less drastic attempts to gain back free RAM (to above some threshold). FreeBSD does not swap out the kernel stacks for processes that stay in a runnable state: it just continues to page. Thus just one large process that has a huge working set of active pages can lead to OOM kills in a context where no other set of processes would be enough to gain the free RAM required. Such contexts are not really a swap issue. Thank you for this clarification/explanation - that totally makes sense! Based on there being only 1 "killed:" reason, I have a suggestion that should allow delaying such kills for a long time. That in turn may help with investigating without actually suffering the kills during the activity: more time with low free RAM to observe. Great idea thank-you! and thanks for the example settings and descriptions as well. But those are large but finite activities. If you want to leave something running for days, weeks, months, or whatever that produces the sustained low free RAM conditions, the problem will eventually happen. Ultimately one may have to exit and restart such processes once in a while, exiting enough of them to give a little time with sufficient free RAM. perfect - since this is a workstation my run-time for these processes is probably a week as i update my system and pkgs over the weekend, then dog food current during the work week. yes i have 2GB of swap that resides on a nvme device. I assume a partition style. Otherwise there are other issues involved --that likely should be avoided by switching to partition style. so i kinda lied - initially i had just a 2G swap, but i added a second 20G swap a while ago to have enough space to capture some cores while testing drm-kmod work. based on this comment i am going to only use the 20G file backed swap and see how that goes. this is my fstab entry currently for the file backed swap: md99 none swap sw,file=/root/swap1,late 0 0 ZFS (so with ARC)? UFS? Both? i am using ZFS and am setting my vfs.zfs.arc.max to 10G. i have also experienced this crash with that set to the default unlimited value as well. I use ZFS on systems with at least 8 GiBytes of RAM, but I've never tuned ZFS. So I'm not much help for that side of things. since we started this thread I've gone ahead and removed the zfs.arc.max setting since it's cruft at this point. i initially added it to test a configuration i deployed to a server hosting a bunch of VMs. I'm hoping that vm.pageout_oom_seq=120 (or more) makes it so you do not have to have identified everything up front and can explore easier. 
Note that vm.pageout_oom_seq is both a loader tunable and a writeable runtime tunable:
# sysctl -T vm.pageout_oom_seq
vm.pageout_oom_seq: 120
amd64_ZFS amd64 1400053 1400053 # sysctl -W vm.pageout_oom_seq
vm.pageout_oom_seq: 120
So you can use it to extend the time when the machine is already running. fantastic. thanks again for taking your time and sharing your knowledge and experience with me Mark! these types of journeys are why i run current on my daily driver, it really helps me better understand the OS so that i can be a better admin on the "real" servers i run for work. it's also just fun to learn stuff too heh. -p -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/22/22 13:39, tech-lists wrote: Hi, On Thu, Apr 21, 2022 at 07:16:42PM -0700, Pete Wright wrote: hello - on my workstation running CURRENT (amd64/32g of ram) i've been running into a scenario where after 4 or 5 days of daily use I get an OOM event and both chromium and firefox are killed. then in the next day or so the system will become very unresponsive in the morning when i unlock my screensaver, forcing a manual power cycle. I have the following set in /etc/sysctl.conf on a stable/13 workstation. Am using zfs with 32GB RAM.
vm.pageout_oom_seq=120
vm.pfault_oom_attempts=-1
vm.pageout_update_period=0
Since setting these here, OOM is a rarity. I don't profess to exactly know what they do in detail though. But my experience since these were set is hardly any OOM and big users of memory like firefox don't crash. nice, i will give those a test next time i crash which will be by next thurs if the pattern continues. looking at the sysctl descriptions:
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM
vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
vm.pageout_update_period: Maximum active LRU update period
i could certainly see how those could be helpful. in an ideal world i'd find the root cause of the system lock-ups, but it would be nice to just move on from this :) cheers, -p -- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Chasing OOM Issues - good sysctl metrics to use?
On 4/21/22 21:18, Mark Millard wrote: Messages in the console out would be appropriate to report. Messages might also be available via the following at appropriate times: that is what is frustrating. i will get notification that the processes are killed:
Apr 22 09:55:15 topanga kernel: pid 76242 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:19 topanga kernel: pid 76288 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:20 topanga kernel: pid 76259 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:22 topanga kernel: pid 76252 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:23 topanga kernel: pid 76267 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:24 topanga kernel: pid 76234 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
Apr 22 09:55:26 topanga kernel: pid 76275 (firefox), jid 0, uid 1001, was killed: failed to reclaim memory
the system in this case had killed both firefox and chrome while i was afk. i logged back in and started them up to do more work, then the next logline is from this morning when i had to force power off/on the system as the keyboard and network were both unresponsive:
Apr 22 09:58:20 topanga syslogd: kernel boot file is /boot/kernel/kernel
Do you have any swap partitions set up and in use? The details could be relevant. Do you have swap set up some other way than via swap partition use? No swap? yes i have 2GB of swap that resides on a nvme device. ZFS (so with ARC)? UFS? Both? i am using ZFS and am setting my vfs.zfs.arc.max to 10G. i have also experienced this crash with that set to the default unlimited value as well. The first block of lines from a top display could be relevant, particularly when it is clearly progressing towards having the problem. (After the problem is too late.) (I just picked top as a way to get a bunch of the information all together automatically.) 
since the initial OOM events happen when i am AFK it is difficult to get relevant stats out of top. this is why i've started collecting more detailed metrics in prometheus. my hope is i'll be able to do a better job observing how my system is behaving over time, in the run up to the OOM event as well as right before and after. there are heaps of metrics collected though so hoping someone can point me in the right direction :) -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
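
Since the kills happen while nobody is at the keyboard, one low-effort aid is to summarise them from the log afterwards. The following is a minimal sketch, not something from the thread: the function name count_oom_kills is made up for illustration, and it assumes the kernel messages have exactly the "was killed: failed to reclaim memory" shape quoted above.

```shell
#!/bin/sh
# Summarise kernel OOM kills per process name.
# Reads syslog-style lines on stdin (for example /var/log/messages).
# count_oom_kills is a hypothetical helper name, not an existing tool.
count_oom_kills() {
    # Keep only lines of the form
    #   ... kernel: pid 76242 (chrome), jid 0, uid 1001, was killed: failed to reclaim memory
    # and reduce each one to the process name inside the parentheses.
    sed -n 's/.*kernel: pid [0-9][0-9]* (\([^)]*\)).*was killed: failed to reclaim memory.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

It could be run as "count_oom_kills < /var/log/messages". Process names containing spaces would be cut short by the final awk step, which is acceptable for a quick overview but worth knowing.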
Chasing OOM Issues - good sysctl metrics to use?
hello - on my workstation running CURRENT (amd64/32g of ram) i've been running into a scenario where after 4 or 5 days of daily use I get an OOM event and both chromium and firefox are killed. then in the next day or so the system will become very unresponsive when i unlock my screensaver in the morning, forcing a manual power cycle.

one thing i've noticed is growing swap usage but plenty of free and inactive memory, as well as a GB or so of memory in the Laundry state according to top. my understanding is that seeing swap usage grow over time is expected and doesn't necessarily indicate a problem. but what concerns me is the system locking up while seeing quite a bit of disk i/o (maybe from paging back in?).

in order to help chase this down i've set up the prometheus_sysctl_exporter(8) to send data to a local prometheus instance. the goal is to examine memory utilization over time to help detect any issues. so my question is this: what OIDs would be useful to help diagnose weird memory issues like this? i'm currently looking at:

sysctl_vm_domain_0_stats_laundry
sysctl_vm_domain_0_stats_active
sysctl_vm_domain_0_stats_free_count
sysctl_vm_domain_0_stats_inactive_pps

thanks in advance - and i'd be happy to share my data if anyone is interested :)

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
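
For a quick look without a Prometheus instance, the same kind of counters can be sampled straight from sysctl(8). This is only a sketch under assumptions: it reads the system-wide vm.stats.vm.v_*_count counters rather than the per-domain OIDs listed above, and the helper names pages_to_mib and sample_vm_queues are made up for illustration.

```shell
#!/bin/sh
# Convert a page count to MiB.
#   $1 = number of pages, $2 = page size in bytes
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Print one timestamped sample of the page queues in MiB.
# Assumes FreeBSD's sysctl(8) and the system-wide vm.stats.vm counters;
# these are not the per-domain OIDs named in the message.
sample_vm_queues() {
    pagesize=$(sysctl -n hw.pagesize)
    line=$(date '+%Y-%m-%dT%H:%M:%S')
    for q in free active inactive laundry wire; do
        pages=$(sysctl -n "vm.stats.vm.v_${q}_count")
        line="$line $q=$(pages_to_mib "$pages" "$pagesize")MiB"
    done
    echo "$line"
}
```

Calling sample_vm_queues from a loop with a sleep, appending to a file, gives a plain-text record that survives a hard power cycle and can be compared against the timestamps of the kill messages.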
Re: -CURRENT hangs since at least 2022-04-04
On 4/18/22 12:23, filis+fbsdcurr...@filis.org wrote:
> Hi, I'm running -CURRENT on this one desktop box which is a "Ryzen 7 4800U with Radeon Graphics", since it didn't work on 13R. I use Boot environments and on 2022-04-04 I updated it and it started to completely freeze under X (I haven't tried letting it run without X) after a few dozen minutes. I went on vacation and came back today and updated it again to see if the issue went away, but it froze again. I went back to the latest BE before 2022-04-04, which is from 2022-03-21 and so far it works fine again. I use a different machine to build and then rsync /usr/src and /usr/obj over and run make installworld, etc locally and also pkg upgrade (I use FreeBSD -latest packages) everything, so I can't quite tell if this is related to base or drm-kmod and I'm not too familiar with changes in the timeframe between 2022-03-21 and 2022-04-04 that would affect my setup. Is there anything I can try and/or find or collect info to shed more light on this?

After updating your CURRENT environment did you rebuild the drm-kmod package? that's usually required as the LKPI is much more of a moving target on that branch compared to STABLE or RELEASE.

i have a pretty much identical setup and building/installing drm-devel-kmod has been working flawlessly for quite a while. after building/installing my latest world i do the following (this is from a local script i use when rebuilding):

cd $PORTS/graphics/drm-devel-kmod
sudo pkg unlock -y drm-devel-kmod
sudo make package
sudo pkg upgrade -y work/pkg/*.pkg
sudo pkg lock -y drm-devel-kmod

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
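
The rebuild steps in this message can be wrapped so they are previewed before anything is touched. The sketch below is an illustration, not the author's actual script: the run and rebuild_drm_kmod helpers and the DRY_RUN switch are assumptions, the /usr/ports fallback for PORTS is a guess, and "make -C" replaces the cd step. The four commands themselves are the ones quoted in the message.

```shell
#!/bin/sh
# Rebuild and reinstall drm-devel-kmod after a world/kernel update.
# Set DRY_RUN=1 to print the commands instead of running them.
# run, rebuild_drm_kmod and DRY_RUN are hypothetical names for this sketch.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"          # preview only
    else
        "$@"               # execute the command as given
    fi
}

rebuild_drm_kmod() {
    # PORTS comes from the original message; /usr/ports is an assumed default.
    port_dir="${PORTS:-/usr/ports}/graphics/drm-devel-kmod"
    run sudo pkg unlock -y drm-devel-kmod &&
    run sudo make -C "$port_dir" package &&
    run sudo pkg upgrade -y "$port_dir"/work/pkg/*.pkg &&
    run sudo pkg lock -y drm-devel-kmod
}
```

Running "DRY_RUN=1 rebuild_drm_kmod" prints the four commands in order; without DRY_RUN each step only runs if the previous one succeeded, so a failed build never leaves the package unlocked and half upgraded without notice.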
Re: loading amdgpu results in immediate power off on 12.3-STABLE r371721
On 3/25/22 21:42, Chris wrote:
> This probably isn't the correct list. But it's the closest of all the lists I'm subscribed to. Please forgive me. OK so here's what happened. I couldn't get the trackpad on a Dell laptop I just got to work in FreeBSD-13. So after a couple of days, I gave up and tried 12.3-STABLE r371721 today. Once I got the network (wifi) going, I pkg installed drm-kmod && its depends. Added kld_list="amdgpu" to rc.conf && rebooted. The moment it loaded, the screen went black and it powered off. Booted to single-user, fsck && cp /var/log/messages to ~/ . I'm attaching a copy in case it sheds any light on the cause. The most interesting thing about all this, is that amdgpu worked flawlessly on 13 -- go figure.

this discussion is probably best suited for the freebsd-x11 mailing list, but i think you can try a couple things:

- give NomadBSD a spin (https://nomadbsd.org/). it's a live USB image that does a really good job at auto-detecting hardware and giving you a nice desktop. it's based on freebsd-13.0. you can also install it on your disk if everything looks good. i frequently use it to test hardware support on new systems i encounter.
- it's hard to tell without any hardware info provided, but it's possible you have an older AMD gpu, as such you might want to try using radeonkms in rc.conf rather than amdgpu.

if neither of those things help i'd definitely suggest subscribing to the freebsd-x11@ mailing list to get the appropriate eyes on things: https://lists.freebsd.org/subscription/freebsd-x11

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Is the graphics on AMD A8-7410 APU (Radeon R5 Graphics) supported?
On 3/23/22 15:09, Chris wrote:
> On 2022-03-23 14:57, Pete Wright wrote:
>> On 3/23/22 14:44, Chris wrote:
>>> On a releng/13 install. I've installed the drm-kmod and loaded both the amdgpu and the radeonkms (at different times). I also installed the xf86-ati driver. But X isn't happy with it. This is in a laptop with the A8-7410. It claims Radeon R5 (formerly Carrizo). I find no mention of it in the FBSD Graphics wiki, or any of the links from there. Has anyone set one of these up successfully? Is it even possible? If so, what must I do?
>> hey chris - what happens when you load the amdgpu driver via rc.conf? does it load correctly, or does the system crash on boot? for example what does "dmesg | grep drm" look like?
> Hey pete, thanks for the prompt reply! It "flashes" but the resolution doesn't appear to change. It's booting UEFI, if that should matter. grep(1) output attached. It's a bit long to paste inline.
>> assuming it does load successfully i think you are using the wrong xorg driver as the xf86-ati driver is for ATI, not AMD, devices. Does loading xf86-video-amdgpu work better?
> You're probably right. I'll give that a try. Thanks again! :-)

no worries! i took a peek at your dmesg and i think once you install the amdgpu xorg driver you'll be good to go. the screen flash is just the display cutting over to the new driver and is expected. looks like it detected both of your displays too so your window manager should pick them up too.

a bunch of work was done recently to make xorg load up correctly without any configuration files - so you shouldn't even need to set up an xorg.conf or anything like that :)

have fun!

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Is the graphics on AMD A8-7410 APU (Radeon R5 Graphics) supported?
On 3/23/22 14:44, Chris wrote:
> On a releng/13 install. I've installed the drm-kmod and loaded both the amdgpu and the radeonkms (at different times). I also installed the xf86-ati driver. But X isn't happy with it. This is in a laptop with the A8-7410. It claims Radeon R5 (formerly Carrizo). I find no mention of it in the FBSD Graphics wiki, or any of the links from there. Has anyone set one of these up successfully? Is it even possible? If so, what must I do?

hey chris - what happens when you load the amdgpu driver via rc.conf? does it load correctly, or does the system crash on boot? for example what does "dmesg | grep drm" look like?

assuming it does load successfully i think you are using the wrong xorg driver as the xf86-ati driver is for ATI, not AMD, devices. Does loading xf86-video-amdgpu work better?

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: WSLg update on 1-5-2021 - BSD / WSL
On 5/1/21 12:42 PM, Chargen wrote:
> Dear all please note that I hope this message will be discussed to get this on the roadmap for FreeBSD. Perhaps there is already talk about && work done on that. I would like to suggest having a BSD side for Microsoft FOSS ambitions and get to know the BSD license. I hope the tech people here, know which nuts and bolts would be ready to boot a *BSD subsystem kernel and make that available on Windows 10 installations.

I believe most of the effort to make this happen lies with Microsoft - it is their product after all. WSL under the covers is Hyper-V which supports FreeBSD pretty well. I believe most of the work would be on the Windows side to get the plumbing in place to spin up a FreeBSD VM. There are open discussions on the WSL github system where people have asked for this but it has not gained much traction with Microsoft.

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: freebsd.org ftp site throws BOGUS cert
On 3/4/21 10:20 AM, Chris wrote:
> This post probably belongs on a different list that I am not subscribed to. But...

The appropriate list is freebsd-hubs@ https://lists.freebsd.org/mailman/listinfo/freebsd-hubs

> I can't get to the freebsd ftp sites because the cert(s) appear to be bad: https://ftp0.tuk.freebsd.org/

they should be able to track down the owner of that site, it looks like the TLS cert is valid for download.freebsd.org:

* Server certificate:
* subject: CN=download.freebsd.org
* start date: Feb 14 20:17:20 2021 GMT
* expire date: May 15 20:17:20 2021 GMT
* subjectAltName does not match ftp0.tuk.freebsd.org

download.freebsd.org may actually be the preferred method for accessing resources on this system. For example if geo-loadbalancing is used. anywho - the people on the above mailing list would know best.

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/20/21 11:09 AM, Emmanuel Vadot wrote: On Wed, 20 Jan 2021 09:47:28 -0800 Pete Wright wrote: On 1/19/21 11:55 PM, Emmanuel Vadot wrote: i'm happy now running the current-kmod but let me know if it'd be helpful to do any more tests or provide additional info. So what did you change ? ok i think i spot the issue - in my checkout of the ports tree via the github mirror at git://github.com/freebsd/freebsd-ports.git it looks like the pkg-plist doesn't include the %%SOURCE%%KMODSRC%% statements: $ cat pkg-plist %%AMDGPU%%/%%KMODDIR%%/amdgpu.ko %%AMDKFD%%/%%KMODDIR%%/amdkfd.ko /%%KMODDIR%%/drm.ko %%I915%%/%%KMODDIR%%/i915kms.ko /%%KMODDIR%%/linuxkpi_gplv2.ko /%%KMODDIR%%/radeonkms.ko /%%KMODDIR%%/ttm.ko $ on the drm-current-kmod plist things look as we would expect them i believe: $ head pkg-plist %%AMDGPU%%/%%KMODDIR%%/amdgpu.ko /%%KMODDIR%%/drm.ko %%I915%%/%%KMODDIR%%/i915kms.ko /%%KMODDIR%%/linuxkpi_gplv2.ko /%%KMODDIR%%/radeonkms.ko /%%KMODDIR%%/ttm.ko %%SOURCEKMODSRC%%/Makefile %%SOURCEKMODSRC%%/kconfig.mk %%SOURCEKMODSRC%%/amd/Makefile %%SOURCEKMODSRC%%/amd/amdgpu/Makefile $ I can file a PR with a patch later today if that's helpful, if this isn't due to bad git workspace on my end. cheers, -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA drm-devel-kmod doesn't install the sources on purpose. It never had and never will. So, did you "solve" the problem by switching to drm-current-kmod or to drm-devel-kmod ? ah i see, thanks for the clarification. so as of now i'm using the drm-current-kmod on my amdgpu system. 
using the drm-devel-kmod throws the previously reported error trying to load linuxkpi_gplv2.ko:

KLD drm.ko: depends on linuxkpi_gplv2 - not available or version mismatch
linker_load_file: /boot/modules/drm.ko - unsupported file type
KLD amdgpu.ko: depends on drmn - not available or version mismatch
linker_load_file: /boot/modules/amdgpu.ko - unsupported file type

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
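
When the kernel linker prints several of these messages, it helps to reduce them to "which module is missing which dependency". The following is a small sketch assuming the message wording is exactly as quoted above; kld_missing_deps is a made-up helper name.

```shell
#!/bin/sh
# Print "module dependency" pairs for kernel linker dependency errors on stdin.
# kld_missing_deps is a hypothetical helper name for this sketch.
kld_missing_deps() {
    # Matches: KLD drm.ko: depends on linuxkpi_gplv2 - not available or version mismatch
    # Prints:  drm.ko linuxkpi_gplv2
    sed -n 's/.*KLD \([^:]*\): depends on \([^ ]*\) - not available or version mismatch.*/\1 \2/p'
}
```

Fed with "dmesg | kld_missing_deps", the errors above come out as "drm.ko linuxkpi_gplv2" and "amdgpu.ko drmn", which makes it clear that the first failure (linuxkpi_gplv2) is the one to chase and the second is a consequence of it.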
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 11:55 PM, Emmanuel Vadot wrote:
>> i'm happy now running the current-kmod but let me know if it'd be helpful to do any more tests or provide additional info.
> So what did you change ?

ok i think i've spotted the issue - in my checkout of the ports tree via the github mirror at git://github.com/freebsd/freebsd-ports.git it looks like the pkg-plist doesn't include the %%SOURCE%%KMODSRC%% statements:

$ cat pkg-plist
%%AMDGPU%%/%%KMODDIR%%/amdgpu.ko
%%AMDKFD%%/%%KMODDIR%%/amdkfd.ko
/%%KMODDIR%%/drm.ko
%%I915%%/%%KMODDIR%%/i915kms.ko
/%%KMODDIR%%/linuxkpi_gplv2.ko
/%%KMODDIR%%/radeonkms.ko
/%%KMODDIR%%/ttm.ko
$

on the drm-current-kmod plist things look as we would expect them i believe:

$ head pkg-plist
%%AMDGPU%%/%%KMODDIR%%/amdgpu.ko
/%%KMODDIR%%/drm.ko
%%I915%%/%%KMODDIR%%/i915kms.ko
/%%KMODDIR%%/linuxkpi_gplv2.ko
/%%KMODDIR%%/radeonkms.ko
/%%KMODDIR%%/ttm.ko
%%SOURCEKMODSRC%%/Makefile
%%SOURCEKMODSRC%%/kconfig.mk
%%SOURCEKMODSRC%%/amd/Makefile
%%SOURCEKMODSRC%%/amd/amdgpu/Makefile
$

I can file a PR with a patch later today if that's helpful, if this isn't due to bad git workspace on my end.

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
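
The comparison of the two plists can be done mechanically. This is a sketch under assumptions: plist_installs_sources is a made-up name, and the pattern simply looks for lines beginning with %%SOURCE, which is how the source entries appear in the listing above (the exact tag spelling in the real ports tree may differ).

```shell
#!/bin/sh
# Report whether a pkg-plist (path given as $1) lists kernel module sources.
# plist_installs_sources is a hypothetical helper; the %%SOURCE prefix is
# taken from the listing in the message and is an assumption about the tag.
plist_installs_sources() {
    if grep -q '^%%SOURCE' "$1"; then
        echo "installs sources"
    else
        echo "modules only"
    fi
}
```

Run against each port's pkg-plist, it would print "modules only" for the drm-devel-kmod listing shown above and "installs sources" for drm-current-kmod.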
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 1:18 PM, Emmanuel Vadot wrote:
>> interesting - so it seems like if i have drm-devel-kmod installed this will fail (missing or wrong linuxkpi_gplv2.ko). this happens both if i install the pkg and rebuild the kernel, and if i build the kernel w/o the pkg installed.
> Don't use the package, always rebuild from the latest ports. see below
>> yet, if i have the drm-current-kmod pkg installed, then "make buildkernel" it looks like the i915/amdgpu modules get built and an "installkernel" drops the linuxkpi_gplv2.ko module under /boot/kernel. at that point i am able to successfully load the amdgpu.ko.
> drm-current-kmod will also install its sources in /usr/local/sys/ and this will get built with buildkernel. The problem is that if the package is old (and it is right now) you might have sources that either don't compile or don't work correctly.

OK interesting, in both cases I was building the package from my local ports tree (via "make package"). i should have better explained that in previous emails. i verified my checkout was up to date as well (it includes your latest commits from Sunday and Monday).

i'm happy now running the current-kmod but let me know if it'd be helpful to do any more tests or provide additional info.

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 12:18 PM, Pete Wright wrote:
> On 1/19/21 12:11 PM, Emmanuel Vadot wrote:
>> On Tue, 19 Jan 2021 11:40:04 -0800 Pete Wright wrote:
>>> On 1/19/21 11:33 AM, Pete Wright wrote:
>>>> On 1/19/21 6:26 AM, Thomas Laus wrote:
>>>>> I perform a CURRENT build weekly on a more powerful build machine and then export /usr/src and /usr/obj via NFS to other slower PC's. The 'installkernel' phase failed with 'linuxkpi_gplv2.ko' not found. It looks like this file is not installed before the rest of the 'drm-current-kmod' files. This causes the 'installkernel' over NFS to fail. My fix was to un-install drm-current-kmod, install the kernel and then re-install drm-current-kmod.
>>>> hrm, i'm not sure this is specifically an NFS issue. I am building/installing locally on my workstation but am getting similar errors trying to load drm-devel-kmod's amdgpu mod. at this point even uninstalling drm-devel-kmod, make installkernel, install drm-devel-kmod pkg results in the same problem.
>>> forgot to include dmesg error:
>>> KLD drm.ko: depends on linuxkpi_gplv2 - not available or version mismatch
>>> linker_load_file: /boot/modules/drm.ko - unsupported file type
>>> KLD amdgpu.ko: depends on drmn - not available or version mismatch
>>> linker_load_file: /boot/modules/amdgpu.ko - unsupported file type
>>> -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
>> Sounds like you have an old linuxkpi_gplv2.ko in /boot/kernel/
> Thanks Manu - so it looks like i don't have that file under /boot/kernel/ but in /boot/modules instead:
> $ find /boot/ -name '*linuxkpi*' -print
> /boot/modules/linuxkpi_gplv2.ko
> /boot/kernel/linuxkpi.ko
> /boot/kernel.old/linuxkpi.ko
> $ pkg which /boot/modules/linuxkpi_gplv2.ko
> /boot/modules/linuxkpi_gplv2.ko was installed by package drm-current-kmod-5.4.62.g20210118
> $
> above is after installing the current kmod to see if it behaved differently than the devel one.
> -pete

interesting - so it seems like if i have drm-devel-kmod installed this will fail (missing or wrong linuxkpi_gplv2.ko).
this happens both if i install the pkg and rebuild the kernel, and if i build the kernel w/o the pkg installed.

yet, if i have the drm-current-kmod pkg installed, then "make buildkernel" it looks like the i915/amdgpu modules get built and an "installkernel" drops the linuxkpi_gplv2.ko module under /boot/kernel. at that point i am able to successfully load the amdgpu.ko.

finally, if i install the drm-current-kmod pkg fresh (without doing the above buildkernel/installkernel) i get the linuxkpi_gplv2 error as above.

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 12:11 PM, Emmanuel Vadot wrote:
> On Tue, 19 Jan 2021 11:40:04 -0800 Pete Wright wrote:
>> On 1/19/21 11:33 AM, Pete Wright wrote:
>>> On 1/19/21 6:26 AM, Thomas Laus wrote:
>>>> I perform a CURRENT build weekly on a more powerful build machine and then export /usr/src and /usr/obj via NFS to other slower PC's. The 'installkernel' phase failed with 'linuxkpi_gplv2.ko' not found. It looks like this file is not installed before the rest of the 'drm-current-kmod' files. This causes the 'installkernel' over NFS to fail. My fix was to un-install drm-current-kmod, install the kernel and then re-install drm-current-kmod.
>>> hrm, i'm not sure this is specifically an NFS issue. I am building/installing locally on my workstation but am getting similar errors trying to load drm-devel-kmod's amdgpu mod. at this point even uninstalling drm-devel-kmod, make installkernel, install drm-devel-kmod pkg results in the same problem.
>> forgot to include dmesg error:
>> KLD drm.ko: depends on linuxkpi_gplv2 - not available or version mismatch
>> linker_load_file: /boot/modules/drm.ko - unsupported file type
>> KLD amdgpu.ko: depends on drmn - not available or version mismatch
>> linker_load_file: /boot/modules/amdgpu.ko - unsupported file type
>> -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA
> Sounds like you have an old linuxkpi_gplv2.ko in /boot/kernel/

Thanks Manu - so it looks like i don't have that file under /boot/kernel/ but in /boot/modules instead:

$ find /boot/ -name '*linuxkpi*' -print
/boot/modules/linuxkpi_gplv2.ko
/boot/kernel/linuxkpi.ko
/boot/kernel.old/linuxkpi.ko
$ pkg which /boot/modules/linuxkpi_gplv2.ko
/boot/modules/linuxkpi_gplv2.ko was installed by package drm-current-kmod-5.4.62.g20210118
$

above is after installing the current kmod to see if it behaved differently than the devel one.
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 11:33 AM, Pete Wright wrote:
> On 1/19/21 6:26 AM, Thomas Laus wrote:
>> I perform a CURRENT build weekly on a more powerful build machine and then export /usr/src and /usr/obj via NFS to other slower PC's. The 'installkernel' phase failed with 'linuxkpi_gplv2.ko' not found. It looks like this file is not installed before the rest of the 'drm-current-kmod' files. This causes the 'installkernel' over NFS to fail. My fix was to un-install drm-current-kmod, install the kernel and then re-install drm-current-kmod.
> hrm, i'm not sure this is specifically an NFS issue. I am building/installing locally on my workstation but am getting similar errors trying to load drm-devel-kmod's amdgpu mod. at this point even uninstalling drm-devel-kmod, make installkernel, install drm-devel-kmod pkg results in the same problem.

forgot to include dmesg error:

KLD drm.ko: depends on linuxkpi_gplv2 - not available or version mismatch
linker_load_file: /boot/modules/drm.ko - unsupported file type
KLD amdgpu.ko: depends on drmn - not available or version mismatch
linker_load_file: /boot/modules/amdgpu.ko - unsupported file type

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM problem installing kernel on main-c561-gc3e75b6c1
On 1/19/21 6:26 AM, Thomas Laus wrote:
> I perform a CURRENT build weekly on a more powerful build machine and then export /usr/src and /usr/obj via NFS to other slower PC's. The 'installkernel' phase failed with 'linuxkpi_gplv2.ko' not found. It looks like this file is not installed before the rest of the 'drm-current-kmod' files. This causes the 'installkernel' over NFS to fail. My fix was to un-install drm-current-kmod, install the kernel and then re-install drm-current-kmod.

hrm, i'm not sure this is specifically an NFS issue. I am building/installing locally on my workstation but am getting similar errors trying to load drm-devel-kmod's amdgpu mod. at this point even uninstalling drm-devel-kmod, make installkernel, install drm-devel-kmod pkg results in the same problem.

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: git and the loss of revision numbers
On 12/28/20 4:38 PM, monochrome wrote:
> what would be the git command for reverting source to a previous version using these numbers? for example, with svn and old numbers: svnlite update -r367627 /usr/src

I will generally just checkout the short git hash like so in my local checkout:

$ git checkout b81783dc98e6

(the leading "g" in a version string such as gb81783dc98e6 is not part of the hash, so leave it off.) you can quickly get the hashes by running "git log" from your checkout.

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
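
Going from one of the new version strings to something git accepts is a one-line text operation. The helper below is a minimal sketch; hash_from_version is a made-up name, and it assumes the string ends in "-g" followed by the abbreviated hash, as in main-c561-gc3e75b6c1 from the DRM thread subject.

```shell
#!/bin/sh
# Extract the abbreviated commit hash from a version string such as
# "main-c561-gc3e75b6c1", where "-g" marks the start of the git hash.
# Prints nothing if the string does not end in that form.
# hash_from_version is a hypothetical helper name.
hash_from_version() {
    printf '%s\n' "$1" | sed -n 's/.*-g\([0-9a-f][0-9a-f]*\)$/\1/p'
}
```

From the root of a checkout this could be used as: git checkout "$(hash_from_version main-c561-gc3e75b6c1)".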
Re: CURRENT failing at contrib/unbound/util/config_file.c:122:20
On 11/7/20 3:56 PM, Cy Schubert wrote:
> In message , Pete Wright writes:
>> wondering if anyone else is having this error building CURRENT today:
>> --- config_file.o ---
>> /usr/home/pete/git/freebsd/contrib/unbound/util/config_file.c:122:20: error: use of undeclared identifier 'UNBOUND_DNS_OVER_HTTPS_PORT'
>>         cfg->https_port = UNBOUND_DNS_OVER_HTTPS_PORT;
>>                           ^
>> 1 error generated.
>> --- all_subdir_lib/ncurses ---
>> my last commit from the github mirror is:
>> commit efb48d58bee75fdb221adece8ef5a13cede99e8c (HEAD -> master, origin/master, origin/HEAD)
>> Author: tuexen
>> Date: Sat Nov 7 21:17:49 2020 +
>>     The ioctl() calls using FIONREAD, FIONWRITE, FIONSPACE, and SIOCATMARK access the socket send or receive buffer. This is not possible for listening sockets since r319722. Because send()/recv() calls fail on listening sockets, fail also ioctl() indicating EINVAL.
>> so not sure if it's been found or if this is a real issue.
> No such problem here. What do you see on line 1397 of /usr/src/usr.sbin/unbound/config.h? Also, uname -a, please. And, git status usr.sbin/unbound, looking for local mods. Your cwd will need to be the root of your git tree.

Thanks Cy for the confirmation that it is working on your end. I think my git checkout must be in a bad state locally. Even after a make clean and purge of my obj directory it was still failing. I then did a fresh checkout using the cgit-beta server and it is building fine now by the looks of it.

I should have done that before spam'ing the list - d'oh.

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
CURRENT failing at contrib/unbound/util/config_file.c:122:20
wondering if anyone else is having this error building CURRENT today:

--- config_file.o ---
/usr/home/pete/git/freebsd/contrib/unbound/util/config_file.c:122:20: error: use of undeclared identifier 'UNBOUND_DNS_OVER_HTTPS_PORT'
        cfg->https_port = UNBOUND_DNS_OVER_HTTPS_PORT;
                          ^
1 error generated.
--- all_subdir_lib/ncurses ---

my last commit from the github mirror is:

commit efb48d58bee75fdb221adece8ef5a13cede99e8c (HEAD -> master, origin/master, origin/HEAD)
Author: tuexen
Date: Sat Nov 7 21:17:49 2020 +
    The ioctl() calls using FIONREAD, FIONWRITE, FIONSPACE, and SIOCATMARK access the socket send or receive buffer. This is not possible for listening sockets since r319722. Because send()/recv() calls fail on listening sockets, fail also ioctl() indicating EINVAL.

so not sure if it's been found or if this is a real issue.

thx!
-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: second ZFS pool no longer auto-imports
On 9/29/20 1:53 PM, xto...@hotmail.com wrote:
> Pete Wright wrote:
>> Hello, I have a workstation with two ZFS pools (zroot and tank0). After upgrading CURRENT post the OpenZFS merge I have found that my tank0 pool no longer auto-imports on boot. After the system has booted I am able to import it via "zpool import tank0" then mount all of its filesystems without issues or errors. I've also run a scrub on the pool and no issues were identified there either. I've also been careful to *not* run "zpool upgrade" on this system as my zroot is geli encrypted, and have been waiting for the all clear to do that. Has anyone else noticed this behavior? Perhaps I'm missing an option or something - kinda confused here as to what may have changed to cause this to happen.
> Check /usr/src/UPDATING, 20200824 entry.

thank-you! I had read that entry but obviously skipped over the last paragraph, sorry for the noise.

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
second ZFS pool no longer auto-imports
Hello, I have a workstation with two ZFS pools (zroot and tank0). After upgrading CURRENT post the OpenZFS merge I have found that my tank0 pool no longer auto-imports on boot. After the system has booted I am able to import it via "zpool import tank0" then mount all of its filesystems without issues or errors. I've also run a scrub on the pool and no issues were identified there either.

I've also been careful to *not* run "zpool upgrade" on this system as my zroot is geli encrypted, and have been waiting for the all clear to do that.

Has anyone else noticed this behavior? Perhaps I'm missing an option or something - kinda confused here as to what may have changed to cause this to happen.

Thanks!
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Can't forward X11 apps over ssh since migrating to 13-CURRENT
On 9/21/20 8:39 PM, O'Connor, Daniel wrote:
> On 22 Sep 2020, at 08:06, Patrick McMunn wrote:
>> I don't know if it's just coincidental or if it's because of some change in 13-CURRENT, but I recently migrated from 12.1-STABLE, and now I am unable to forward X11 apps over ssh. The only app I was accustomed to running this way was Handbrake. It worked fine before, but now i get this:
>> $ ghb
>> Unable to init server: Could not connect to 127.0.0.1: Connection refused
>> (ghb:87219): Gtk-WARNING **: 13:12:41.281: cannot open display:
>> I have tried other apps like Wireshark and even xclock just to see, but they won't work either. Has anyone else had problems with X11 forwarding on 13-CURRENT? If it's working for everyone else, at least I can know it's probably not 13-CURRENT's fault, and I need to look elsewhere for the cause. And yes, my sshd_config has it enabled. It worked fine before, and I made sure to keep the same config.
> What is the value of DISPLAY? (ie echo $DISPLAY) Is sshd listening on that port? eg..
> [test 3:36] ~> echo $DISPLAY
> localhost:11.0
> [test 3:36] ~> sockstat|grep 6011
> radar sshd 5414 8 tcp6 ::1:6011 *:*
> radar sshd 5414 9 tcp4 127.0.0.1:6011 *:*

might have missed this but how is the ssh session being established? i just verified "ssh -X host" allows me to redirect X to my local workstation. both systems are running CURRENT as well.

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
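
The check in the quoted reply relies on the X11 convention that display number N corresponds to TCP port 6000 + N (display 11 above, port 6011). The helper below is a minimal sketch of that mapping; display_to_port is a made-up name.

```shell
#!/bin/sh
# Map an X11 DISPLAY value (host:display.screen) to the TCP port used for
# that display: 6000 + display number.
# display_to_port is a hypothetical helper name.
display_to_port() {
    d=${1##*:}      # drop everything up to the last colon, e.g. "11.0"
    d=${d%%.*}      # drop the screen suffix, e.g. "11"
    echo $(( 6000 + d ))
}
```

Inside the ssh session this could be used as: sockstat -l | grep "$(display_to_port "$DISPLAY")". If DISPLAY is empty on the remote side, forwarding was never set up for the session, which matches the empty "cannot open display:" in the error above.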
Re: Plans for git
On 9/21/20 12:49 PM, Christian Weisgerber wrote:
> On 2020-09-21, Alexander Leidinger wrote:
>> In my opinion the people which drive this didn't keep it behind closed curtains, and they went step by step more public, as they made progress. To me it looks like now, that they have something which is presentable to the world (and not only to committers), they presented it to the world.
> Since I am one of the sad people who managed to miss all this public information, where can we find a summary of what's planned for the switch?

I believe the most detailed report on this was in the 2020-04 quarterly status report: https://www.freebsd.org/news/status/report-2020-04-2020-06.html#Git-Migration-Working-Group

before this, work was mentioned in previous updates as part of the core team's status update. there is also the freebsd-git@ mailing list here: https://lists.freebsd.org/pipermail/freebsd-git/

i don't subscribe to that list personally but i have checked the archives periodically when i have questions that pop up.

hope this helps,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Plans for git
> making quarterly reports about this for almost a year as well. We put out calls for people to help with the efforts about the same time. We have tried at every step of the way to be open and honest that this was going to happen. All developer centric communications

I would argue that quarterly reports are actually one of the few methods of getting accurate information about the state of the project as a non-insider. i've been following the progress of this work via the quarterly status reports for years now, and as someone who is merely a freebsd operator felt like i was more or less kept up to date on this eventuality. honestly there has to be *some* responsibility of operators to at least make an effort to keep up to date on the status of the various efforts in such a large project.

and as an outsider the idea that comms can only happen on the mailing list isn't the greatest - how am i to know that the idea of one person on the ML carries more weight than another, or one person's opinion is the "official" stated opinion of the core group?

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Deprecating ftpd in the FreeBSD base system?
On 9/17/20 12:49 PM, John-Mark Gurney wrote:
> Ian Lepore wrote this message on Thu, Sep 17, 2020 at 09:01 -0600:
>> On Thu, 2020-09-17 at 18:43 +0400, Gleb Popov wrote:
>>> On Thu, Sep 17, 2020 at 6:05 PM Cy Schubert <cy.schub...@cschubert.com> wrote:
>>>> I've been advocating removing FTP (and HTTP) from libfetch as well. People should be using HTTPS only.
>>>
>>> Isn't this a bit too much? I often find myself in need to download something starting with "http://" or "ftp://" and use fetch for this.
>>
>> Indeed, we have products which rely on this ability in libfetch and we have to keep supporting them for many many years to come. I hate it when someone imperiously declares [For security reasons] "People should/shouldn't be using __". You have no idea what the context is, and thus no ability to declare what should or shouldn't be used in that context. For example, two embedded systems talking to each other over a point to point link within a sealed device are not concerned about man in the middle attacks or other modern internet threats.
>
> And I really dislike when people want to make sure that their unique case that less than a percent of people would ever hit blocks the security improvements for the majority of people... I've given up on a number of security improvements in FreeBSD because of this attitude...

while i tend to agree with you here - i would say that in this case there is a very large use case where preservation of http is very important to a wide base of users:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
https://cloud.google.com/compute/docs/storing-retrieving-metadata
https://docs.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service

regarding the main topic tho - dropping ftpd from base seems like a good iteration in clearing out cruft from the code base so we can focus on things with much larger user bases. fortunately we have an excellent ports/pkg infrastructure to service this need if it arises.
-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Undefined symbol "lzc_remap"
On 9/14/20 1:00 PM, Kyle Evans wrote:
> On Mon, Sep 14, 2020 at 2:56 PM Pete Wright wrote:
>> Hello, I have a system running current that is acting a little odd after a rebuild from last night (sept 13th). After reboot my root zfs pool mounted fine, but my second datavol "tank0" didn't auto-import/mount. A manual "zpool import" then "zfs mount -a" got everything back where it should be, but I am noticing some odd behavior with iocage:
>>
>> ImportError: /lib/libzfs.so.3: Undefined symbol "lzc_remap"
>>
>> Interestingly enough this is the second update i've done to this system since the import of openzfs code, and iocage was operating without issues previously. i am wondering does iocage need to be rebuilt against newer sources or did something else change recently?
>
> Hi, You'll need a ports tree >= r548105 and rebuild devel/py-libzfs from that -- that should be sufficient.

Thanks Kyle - you saved me quite a bit of debugging, i'll give that a spin now :)

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
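Kyle's answer comes down to one precondition (ports tree at or past r548105) followed by a rebuild of devel/py-libzfs. A minimal sketch of the precondition check is below. The function names and the idea of passing the revision in by hand are assumptions made for illustration; they are not part of iocage or the ports tooling.

```shell
#!/bin/sh
# Minimum ports revision quoted in the reply above.
MIN_PORTS_REV=548105

# ports_rev_ok REV
# Succeeds when REV (a plain svn revision number, without the leading "r")
# is at or past MIN_PORTS_REV. Hypothetical helper, for illustration only.
ports_rev_ok() {
    [ "$1" -ge "$MIN_PORTS_REV" ]
}

# report REV
# Print which step comes next for the given ports revision.
report() {
    if ports_rev_ok "$1"; then
        echo "r$1: new enough, rebuild devel/py-libzfs"
    else
        echo "r$1: too old, update the ports tree first"
    fi
}
```

Once the tree is new enough, the rebuild itself would be something along the lines of "make -C /usr/ports/devel/py-libzfs deinstall reinstall clean", assuming the ports tree lives in /usr/ports.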
Undefined symbol "lzc_remap"
Hello, I have a system running current that is acting a little odd after a rebuild from last night (sept 13th). After reboot my root zfs pool mounted fine, but my second datavol "tank0" didn't auto-import/mount. A manual "zpool import" then "zfs mount -a" got everything back where it should be, but I am noticing some odd behavior with iocage:

ImportError: /lib/libzfs.so.3: Undefined symbol "lzc_remap"

Interestingly enough this is the second update i've done to this system since the import of openzfs code, and iocage was operating without issues previously. i am wondering does iocage need to be rebuilt against newer sources or did something else change recently?

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: DRM Project report (week of August 10)
On 8/17/20 1:46 AM, Emmanuel Vadot wrote:
> Hello, 5.4 was finally reached! For AMD users it means that Navi12/14, Arcturus and Renoir should work. For Intel users it means that TigerLake should work too.
>
> No ports update for now as I want to give current users a bit of time to update their base (as the ports needs recent addition to base linuxkpi) but if you have a current >= 364233 you can test directly the master branch of https://github.com/freebsd/drm-kmod/
>
> I plan to commit the port update at the end of the week, and probably at the end of the month we will switch drm-current-kmod to 5.4.

thanks Manu! would it make sense to make this version available in graphics/drm-devel-kmod now to make it easier to test for people who update world/kernel more frequently?

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: -CURRENT and drm-devel-kmod
On 7/11/20 6:55 AM, Andreas Nilsson wrote: However, when I load i915kms from -devel the console stops refreshing. It only refreshes when I switch (Ctrl+alt+Fx). I see it refresh and display the new content just before switching to the requested console. add hw.i915kms.enable_psr=0 to /boot/loader.conf This is a bug that none of my hardware have and I don't really know what's happening for now. Thanks, setting that fixed the problem! ah thanks for asking this question Andreas (and the pointer Manu)! I've had this issue for a while but have been just blindly typing my login and xinit wrapper. interestingly enough, I've found if I start then exit Xorg the console starts updating again. -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: vfs_mouse.c breakage?
On 6/1/20 5:10 PM, Rick Macklem wrote: It also needed . It is ancient code (that started out in SunOS. if I recall correctly), where they used things like "bool_t" and set them with TRUE/FALSE (upper case). Unfortunately, those includes love to include other includes... Anyhow, I think it is fixed now, rick I can confirm on my end as well - thanks Rick! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: vfs_mouse.c breakage?
On 6/1/20 2:50 PM, Rick Macklem wrote:
> Pete Wright wrote:
>> Subject: vfs_mouse.c breakage?
>
> Not sure if the vfs mouse is broken (sorry, I couldn't resist), but...

hah nice - dyslexia + poor eyesight are not good bedfellows :^)

> I think it needs a: #include but it will take a little while for me to test this. Thanks for reporting it, rick

no prob - adding that include threw some more errors

$ git diff
diff --git a/sys/kern/vfs_mount.c b/sys/kern/vfs_mount.c
index 03f95b2845f9..4282b1938095 100644
--- a/sys/kern/vfs_mount.c
+++ b/sys/kern/vfs_mount.c
@@ -39,6 +39,7 @@
 #include
 __FBSDID("$FreeBSD$");
 
+#include
 #include
 #include
 #include

here's a snippet of the error output:

--- vfs_mount.o ---
In file included from /usr/home/pete/git/freebsd/sys/kern/vfs_mount.c:42:
In file included from /usr/home/pete/git/freebsd/sys/rpc/auth.h:50:
/usr/home/pete/git/freebsd/sys/rpc/xdr.h:105:3: error: type name requires a specifier or qualifier
bool_t (*x_getlong)(struct XDR *, long *);

I'll sit tight for now - thanks for checking it out!

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
vfs_mouse.c breakage?
hello - i am having issues building CURRENT after this was applied:

https://reviews.freebsd.org/D25045
https://svnweb.freebsd.org/base?view=revision&revision=361699

--- vfs_mount.o ---
/usr/home/pete/git/freebsd/sys/kern/vfs_mount.c:2360:27: error: use of undeclared identifier 'AUTH_SYS'
        exp->ex_secflavors[0] = AUTH_SYS;
                                ^
1 error generated.
*** [vfs_mount.o] Error code 1

was curious if others are seeing this?

cheers,
-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
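When a build stops on an undeclared identifier like the one above, a quick first step is to find which header defines it, so the missing #include can be identified. The sketch below shows one way to do that. The function name is made up for illustration, and it only looks for plain "#define SYMBOL" lines, so enum members or typedefs would not be found.

```shell
#!/bin/sh
# headers_defining SYMBOL DIR
# List the header files under DIR that contain a "#define SYMBOL" line.
# Hypothetical helper: it matches preprocessor defines only.
headers_defining() {
    sym=$1 dir=$2
    find "$dir" -type f -name '*.h' -exec \
        grep -l -E "^[[:space:]]*#[[:space:]]*define[[:space:]]+${sym}([^A-Za-z0-9_]|\$)" {} + | sort
}
```

A call such as "headers_defining AUTH_SYS /usr/src/sys" (the path is an assumption about where the source tree is checked out) would print the candidate headers, one per line.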
Re: lockups on lenovo p43s under current
On 5/11/20 3:16 PM, Pete Wright wrote:
> hello, i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but behaves fine when running STABLE. i've tried to find a fully reproducible situation to get this system to lock up but haven't found anything yet. i am starting to suspect that the changes implemented in this review may be the issue though:
>
> https://reviews.freebsd.org/D23728
>
> my reasoning is that i've observed issues when:
> - removing AC power from the laptop, or inserting AC power
> - when the system display has gone to sleep
> - randomly hanging during boot with this as last line:
>   battery0: battery enitialization start
>
> unfortunately while the above seem to be cases where this has happened i haven't been able to 100% reproduce yet.
>
> so my first question is - would it be possible to just revert the changes in that diff, or has too much time gone past to just back out that single change? alternatively, is there any debugging information i can get on my end that might help figure out what the root cause is?

closing the loop on this - I am able to run CURRENT on this system by defining this in /boot/loader.conf:

hint.hwpstate_intel.0.disabled="1"

thanks to Diane Bruce for mentioning this issue in the following thread which gave me the hint i needed:

https://lists.freebsd.org/pipermail/freebsd-current/2020-May/076123.html

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
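For anyone applying the loader.conf workaround above on several machines, a small idempotent helper avoids adding the same line twice. This is only a sketch: set_tunable is a made-up name, it compares the text before the first "=" exactly, and it does not update a value that is already present.

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE
# Append NAME="VALUE" to FILE unless a line assigning NAME already exists.
# Hypothetical helper: commented-out lines are ignored, and an existing
# assignment is left untouched even if its value differs.
set_tunable() {
    file=$1 name=$2 value=$3
    if [ -f "$file" ] &&
        awk -F= -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"
    then
        echo "unchanged: $name already present in $file"
    else
        printf '%s="%s"\n' "$name" "$value" >> "$file"
        echo "added: $name to $file"
    fi
}
```

Run as root, "set_tunable /boot/loader.conf hint.hwpstate_intel.0.disabled 1" would add the line from the mail on first use and do nothing on later runs.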
Re: lockups on lenovo p43s under current
On 5/11/20 4:21 PM, Pete Wright wrote:
> On 5/11/20 3:28 PM, Yuri Pankov wrote:
>> Pete Wright wrote:
>>> hello, i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but behaves fine when running STABLE. i've tried to find a fully reproducible situation to get this system to lock up but haven't found anything yet. i am starting to suspect that the changes implemented in this review may be the issue though:
>>>
>>> https://reviews.freebsd.org/D23728
>>>
>>> my reasoning is that i've observed issues when:
>>> - removing AC power from the laptop, or inserting AC power
>>> - when the system display has gone to sleep
>>> - randomly hanging during boot with this as last line:
>>>   battery0: battery enitialization start
>>>
>>> unfortunately while the above seem to be cases where this has happened i haven't been able to 100% reproduce yet.
>>>
>>> so my first question is - would it be possible to just revert the changes in that diff, or has too much time gone past to just back out that single change? alternatively, is there any debugging information i can get on my end that might help figure out what the root cause is?
>>
>> Not really what you are asking, but it's possible to disable ACPI subdevices, so you could check if disabling cmbat completely helps and it's indeed the suspect:
>>
>> debug.acpi.disabled="cmbat"
>
> Thanks Yuri, So I was able to boot my system once via battery with this set, but unfortunately it crashed after I tried to suspend/resume. Realizing that was a bit optimistic I attempted to reboot the system and wasn't able to get it to fully boot after several attempts.
>
> I believe the next step at this point is to check out the code right before this commit and see if I can get it to successfully boot. I'll report back if I find anything after that test.

To follow up on this, I believe the updates in the above review may be the culprit. What I have done is built a memstick.img set to the commit right before the changes in D23728 were merged.
running this image I can boot my system, disconnect and reconnect AC power without any issues. i then booted from a memstick using the latest snapshot of current. i can disconnect AC power without issues, but reconnecting hangs the system immediately. i've tested this a couple times and it seems pretty reproducible, not sure what the best next step would be though. would someone here be willing to help me debug this, or would it be best to file a PR along with a dmesg and output from acpiconf? cheers! -pete -- Pete Wright p...@nomadlogic.org @nomadlogicLA ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: lockups on lenovo p43s under current
On 5/11/20 3:28 PM, Yuri Pankov wrote:
> Pete Wright wrote:
>> hello, i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but behaves fine when running STABLE. i've tried to find a fully reproducible situation to get this system to lock up but haven't found anything yet. i am starting to suspect that the changes implemented in this review may be the issue though:
>>
>> https://reviews.freebsd.org/D23728
>>
>> my reasoning is that i've observed issues when:
>> - removing AC power from the laptop, or inserting AC power
>> - when the system display has gone to sleep
>> - randomly hanging during boot with this as last line:
>>   battery0: battery enitialization start
>>
>> unfortunately while the above seem to be cases where this has happened i haven't been able to 100% reproduce yet.
>>
>> so my first question is - would it be possible to just revert the changes in that diff, or has too much time gone past to just back out that single change? alternatively, is there any debugging information i can get on my end that might help figure out what the root cause is?
>
> Not really what you are asking, but it's possible to disable ACPI subdevices, so you could check if disabling cmbat completely helps and it's indeed the suspect:
>
> debug.acpi.disabled="cmbat"

Thanks Yuri, So I was able to boot my system once via battery with this set, but unfortunately it crashed after I tried to suspend/resume. Realizing that was a bit optimistic I attempted to reboot the system and wasn't able to get it to fully boot after several attempts.

I believe the next step at this point is to check out the code right before this commit and see if I can get it to successfully boot. I'll report back if I find anything after that test.

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
lockups on lenovo p43s under current
hello, i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but behaves fine when running STABLE. i've tried to find a fully reproducible situation to get this system to lock up but haven't found anything yet. i am starting to suspect that the changes implemented in this review may be the issue though:

https://reviews.freebsd.org/D23728

my reasoning is that i've observed issues when:
- removing AC power from the laptop, or inserting AC power
- when the system display has gone to sleep
- randomly hanging during boot with this as last line:
  battery0: battery enitialization start

unfortunately while the above seem to be cases where this has happened i haven't been able to 100% reproduce yet.

so my first question is - would it be possible to just revert the changes in that diff, or has too much time gone past to just back out that single change? alternatively, is there any debugging information i can get on my end that might help figure out what the root cause is?

cheers,
-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Xorg question
On 5/9/20 2:35 AM, Filippo Moretti wrote:
> I run the latest current and I have the following packages installed:
> xf86-input-keyboard-1.9.0_4
> xf86-input-libinput-0.28.2_1
> xf86-input-mouse-1.9.3_3
> Should I keep all of them or may I keep xf86-input-libinput

I don't think there is any harm in having all three of those packages installed. In fact I have all three on my system, but allow Xorg to auto configure itself, which picks up libinput by default.

-p

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: OpenZFS port updated
On 4/17/20 2:54 PM, Ryan Moeller wrote:
> On Apr 17, 2020, at 4:56 PM, Pete Wright wrote:
>> On 4/17/20 11:35 AM, Ryan Moeller wrote:
>>> FreeBSD support has been merged into the master branch of the openzfs/zfs repository, and the FreeBSD ports have been switched to this branch.
>>
>> Congratulations on this effort - big milestone!
>>
>>> OpenZFS brings many exciting features to FreeBSD, including:
>>> * native encryption
>>
>> Is there a good doc reference available for using this? I believe this is zfs filesystem level encryption and not a replacement for our existing full-disk-encryption scheme that currently works?
>
> I’m not aware of a good current doc for this. If anyone finds/writes something, please post it! There are some old resources you can find with a quick search that do a pretty good job of covering the basic ideas, but I think the exact syntax of commands may be slightly changed in the final implementation. The encryption is performed at a filesystem level (per-dataset).

thanks for the clarification Ryan. I may try to test this out in the near future and will try to record my findings in a wiki or somewhere. being able to do filesystem level encryption is something i have several immediate use cases for.

thanks!
-p

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: OpenZFS port updated
On 4/17/20 11:35 AM, Ryan Moeller wrote:
> FreeBSD support has been merged into the master branch of the openzfs/zfs repository, and the FreeBSD ports have been switched to this branch.

Congratulations on this effort - big milestone!

> OpenZFS brings many exciting features to FreeBSD, including:
> * native encryption

Is there a good doc reference available for using this? I believe this is zfs filesystem level encryption and not a replacement for our existing full-disk-encryption scheme that currently works?

thanks again!
-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Working on Zoom port
On 4/12/20 12:26 PM, Eric McCorkle wrote:
> All, Given how Zoom is getting used a lot more these days, I've started working on a port that installs the Zoom linux client. Here is a link to my github if anyone wants to help: https://github.com/emc2/freebsd-ports/tree/zoom
>
> I'm not done yet. The zoom linux client installs a bunch of Qt libraries in its own directory. These either need to be installed with a port, or else the right configs need to be set to search for libraries there. I'm going to take a break, but I'm going to circle back to this.

Thanks Eric, I remember trying to get this working several months ago via the linux compatibility layer and got stuck. i hope to take another whack at it based on your repository. in my ideal world i'd be able to get this working in a jail via, but i think just getting the bits to work is probably the most important task.

i've had working solutions based on jitsi and riot.im with acceptable performance, so i suspect our webcamd bits are in good enough shape to support this. interested to see how this effort progresses :)

-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Error building in libunbound
recent checkout, getting this error building libunbound:

--- all_subdir_lib/libunbound ---
/usr/home/pete/git/freebsd/contrib/unbound/util/log.c:120:30: error: use of undeclared identifier 'UB_SYSLOG_FACILITY'
        openlog(ident, LOG_NDELAY, UB_SYSLOG_FACILITY);
                                   ^

wondering if anyone else is seeing this. The commit history for log.c doesn't show any recent changes.

Cheers,
-pete

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: error building scsi_da.c with recent current
no worries - and thanks for the workaround. i'm going to take a look before i sign-off for the night, hopefully github will be in sync by then. cheers, -pete On Mon, Dec 16, 2019 at 3:15 PM Warner Losh wrote: > The prior version (r355814) works just fine, if you wanted to get going... > Sorry for the hassle.. I'm flushing my WIP trees, deleting old stuff and > forgot this change had a dependency on something that's not quite ready yet. > > Warner > > On Mon, Dec 16, 2019 at 4:12 PM pete wright wrote: > >> Thanks, maybe git mirror was lagging will test in a bit. >> >> Thanks! >> -pete >> >> On Mon, Dec 16, 2019, 3:09 PM Warner Losh wrote: >> >>> Update. That version was in place for about an hour before I fixed it. >>> r355815 broke it and r355818 fixed it. >>> >>> Warner >>> >>> >>> On Mon, Dec 16, 2019 at 3:27 PM pete wright >>> wrote: >>> >>>> here's the error i'm getting when building: >>>> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:28: error: >>>> expected >>>> identifier >>>> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >>>>^ >>>> /usr/home/pete/git/freebsd/sys/sys/sysctl.h:126:18: note: expanded from >>>> macro 'OID_AUTO' >>>> #define OID_AUTO(-1) >>>> ^ >>>> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:1: error: type >>>> specifier missing, defaults to 'int' [-Werror,-Wimplicit-int] >>>> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >>>> ^ >>>> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:13: error: this >>>> function declaration is not a prototype [-Werror,-Wstrict-prototypes] >>>> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >>>> ^ >>>> 3 errors generated. >>>> *** [scsi_da.o] Error code 1 >>>> >>>> >>>> >>>> might be this commit? 
>>>> commit 5fa79c6768be78d78815156f8ecf50cb2008233f (HEAD -> master, >>>> origin/master, origin/HEAD) >>>> Author: imp >>>> Date: Mon Dec 16 18:16:44 2019 + >>>> >>>> Implement a system-wide limit or da and ada devices for delete. >>>> >>>> Excesively large TRIMs can result in timeouts, which cause big >>>> problems. Limit trims to 1GB to mititgate these issues. >>>> >>>> Reviewed by: scottl >>>> Differential Revision: https://reviews.freebsd.org/D22809 >>>> >>>> >>>> >>>> if there is any additional info needed happy to provide it. >>>> >>>> cheers, >>>> -pete >>>> >>>> -- >>>> pete wright >>>> www.nomadlogic.org >>>> @nomadlogicLA >>>> ___ >>>> freebsd-current@freebsd.org mailing list >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current >>>> To unsubscribe, send any mail to " >>>> freebsd-current-unsubscr...@freebsd.org" >>>> >>> -- pete wright www.nomadlogic.org @nomadlogicLA ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: error building scsi_da.c with recent current
Thanks, maybe git mirror was lagging will test in a bit. Thanks! -pete On Mon, Dec 16, 2019, 3:09 PM Warner Losh wrote: > Update. That version was in place for about an hour before I fixed it. > r355815 broke it and r355818 fixed it. > > Warner > > > On Mon, Dec 16, 2019 at 3:27 PM pete wright wrote: > >> here's the error i'm getting when building: >> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:28: error: expected >> identifier >> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >>^ >> /usr/home/pete/git/freebsd/sys/sys/sysctl.h:126:18: note: expanded from >> macro 'OID_AUTO' >> #define OID_AUTO(-1) >> ^ >> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:1: error: type >> specifier missing, defaults to 'int' [-Werror,-Wimplicit-int] >> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >> ^ >> /usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:13: error: this >> function declaration is not a prototype [-Werror,-Wstrict-prototypes] >> SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN, >> ^ >> 3 errors generated. >> *** [scsi_da.o] Error code 1 >> >> >> >> might be this commit? >> commit 5fa79c6768be78d78815156f8ecf50cb2008233f (HEAD -> master, >> origin/master, origin/HEAD) >> Author: imp >> Date: Mon Dec 16 18:16:44 2019 + >> >> Implement a system-wide limit or da and ada devices for delete. >> >> Excesively large TRIMs can result in timeouts, which cause big >> problems. Limit trims to 1GB to mititgate these issues. >> >> Reviewed by: scottl >> Differential Revision: https://reviews.freebsd.org/D22809 >> >> >> >> if there is any additional info needed happy to provide it. 
>> >> cheers, >> -pete >> >> -- >> pete wright >> www.nomadlogic.org >> @nomadlogicLA >> ___ >> freebsd-current@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-current >> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org >> " >> > ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
error building scsi_da.c with recent current
here's the error i'm getting when building:

/usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:28: error: expected identifier
SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN,
                           ^
/usr/home/pete/git/freebsd/sys/sys/sysctl.h:126:18: note: expanded from macro 'OID_AUTO'
#define OID_AUTO (-1)
                 ^
/usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:1: error: type specifier missing, defaults to 'int' [-Werror,-Wimplicit-int]
SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN,
^
/usr/home/pete/git/freebsd/sys/cam/scsi/scsi_da.c:1544:13: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
SYSCTL_INT64(_kern_cam_da, OID_AUTO, default_max_delete, CTLFLAG_RWTUN,
            ^
3 errors generated.
*** [scsi_da.o] Error code 1

might be this commit?

commit 5fa79c6768be78d78815156f8ecf50cb2008233f (HEAD -> master, origin/master, origin/HEAD)
Author: imp
Date: Mon Dec 16 18:16:44 2019 +

    Implement a system-wide limit or da and ada devices for delete.

    Excesively large TRIMs can result in timeouts, which cause big
    problems. Limit trims to 1GB to mititgate these issues.

    Reviewed by: scottl
    Differential Revision: https://reviews.freebsd.org/D22809

if there is any additional info needed happy to provide it.

cheers,
-pete

-- pete wright www.nomadlogic.org @nomadlogicLA
Re: breakage at usr.sbin/jail/Makefile
On 11/20/19 8:13 PM, Glen Barber wrote:
> On Wed, Nov 20, 2019 at 07:57:57PM -0800, Pete Wright wrote:
>> Hello, looks like some of the recent commits to usr.sbin/jail/Makefile have broken CURRENT. I am getting this error when attempting a buildworld:
>>
>> ===> usr.sbin/jail (cleandir)
>> make[4]: "/usr/home/pete/git/freebsd/usr.sbin/jail/Makefile" line 21: Malformed conditional (${LINKER_TYPE} == "bfd" && ${MACHINE} == "riscv")
>> make[4]: Fatal errors encountered -- cannot continue
>> make[4]: stopped in /usr/home/pete/git/freebsd/usr.sbin/jail
>> *** [cleandir_subdir_usr.sbin/jail] Error code 1
>>
>> here's the code in question:
>>
>> 18 # workaround for GNU ld (GNU Binutils) 2.33.1:
>> 19 # relocation truncated to fit: R_RISCV_GPREL_I against `.LANCHOR2'
>> 20 # https://bugs.freebsd.org/242109
>> 21 .if ${LINKER_TYPE} == "bfd" && ${MACHINE} == "riscv"
>> 22 CFLAGS+=-Wl,--no-relax
>> 23 .endif
>>
>> looks like Ed Maste caught this already in https://bugs.freebsd.org/242109 but wanted to flag it here as well in case anyone else runs into this in the hopes it saves some debugging time :)
>
> Reverted out of frustration in r354935.

thanks! the issue seems to be that on my amd64 system ${LINKER_TYPE} is not defined during the cleandir stage, so the comparison in the conditional is malformed. i was contemplating suggesting updating the .if clause in the Makefile like so:

.if defined(LINKER_TYPE) && ${LINKER_TYPE} == "bfd" && ${MACHINE} == "riscv"

it allows things to compile on my end, but i'm not sure this is the best way to resolve this issue.

-p

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
breakage at usr.sbin/jail/Makefile
Hello, looks like some of the recent commits to usr.sbin/jail/Makefile have broken CURRENT. I am getting this error when attempting a buildworld:

===> usr.sbin/jail (cleandir)
make[4]: "/usr/home/pete/git/freebsd/usr.sbin/jail/Makefile" line 21: Malformed conditional (${LINKER_TYPE} == "bfd" && ${MACHINE} == "riscv")
make[4]: Fatal errors encountered -- cannot continue
make[4]: stopped in /usr/home/pete/git/freebsd/usr.sbin/jail
*** [cleandir_subdir_usr.sbin/jail] Error code 1

here's the code in question:

18 # workaround for GNU ld (GNU Binutils) 2.33.1:
19 # relocation truncated to fit: R_RISCV_GPREL_I against `.LANCHOR2'
20 # https://bugs.freebsd.org/242109
21 .if ${LINKER_TYPE} == "bfd" && ${MACHINE} == "riscv"
22 CFLAGS+=-Wl,--no-relax
23 .endif

looks like Ed Maste caught this already in https://bugs.freebsd.org/242109 but wanted to flag it here as well in case anyone else runs into this in the hopes it saves some debugging time :)

-p

-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: Problems with AMDGPU and two grafic cards
On 10/18/19 3:07 PM, mms.vanbreukelin...@gmail.com wrote: Hi Pete, did BIOS update, has been critical. in the meantime with lot o'luck I got the scfb-driver to work but only in 1024x76 (but the bios-logo starting in 1920x1280) and with 'dbus-launch kstart5 plasmashell' I have a root-login to plasmashell. I guess I can add further video-modes to the driver in xorg.conf or set the boot kernel option for 1920x1280 but now, after two days with just 3 hours of sleep, I'm too tired and grateful enough it did graphics at all. The CD-medium has been from february and didn't do 'mode 5' on loaders prompt, the one I have now is setting up /usr/src with 1920x1080 (I just thought on 1280x1024, that's a resolution where you really need an expensive screen. It's a ASUS r7240-o4gd5-L (GB, DVI, HDMI, Active, LP) and the Ryzen 7 has 2700MhZ. X. Ok, so I think for that graphics adapter you should ensure you are using the "radeonkms" kernel module, can you confirm you have this set in your rc.conf? kld_list="radeonkms" If you have tried booting using that kernel module can you share the output of "dmesg | grep drm" assuming that it is different than the previous dmesg you posted in this thread? if it is the same, then I'm not sure why the firmware is failing to load and will have to defer to others on the list. cheers, -p -- Pete Wright p...@nomadlogic.org @nomadlogicLA ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
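The first check asked for above (is radeonkms actually listed in kld_list in rc.conf?) can be scripted. The helper below is a sketch under a few assumptions: kld_listed is a made-up name, only a plain kld_list="..." assignment is parsed, and "+=" forms or full module paths are not handled.

```shell
#!/bin/sh
# kld_listed FILE MODULE
# Succeeds when MODULE is one of the space-separated entries of the last
# kld_list="..." line in FILE. Hypothetical helper: entries written as
# full paths (for example /boot/modules/radeonkms.ko) will not match.
kld_listed() {
    file=$1 module=$2
    list=$(sed -n 's/^kld_list="\([^"]*\)".*$/\1/p' "$file" | tail -n 1)
    for entry in $list; do
        [ "$entry" = "$module" ] && return 0
    done
    return 1
}
```

For example, "kld_listed /etc/rc.conf radeonkms || echo 'radeonkms is not in kld_list'" gives a quick yes/no before moving on to the "dmesg | grep drm" output.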
Re: Troubles with the X-Server on an RADEON GPU 240
On 10/2/19 9:36 AM, Miranda Maria Sophie Van den Breukelingen wrote:

Revision: svn'd but not yet built 353009 on CURRENT GENERIC; built: xorg-xserver; llvm-devel X -configure log tells, can't open dev/io; falling back to scfb no screens found /etc/rc.conf: kld_load="amdgpu". sddm: xauth in sloop... kernel rebuilt?

I would make sure you've installed the drm-kmod pkg, it is not clear to me that you have done this yet as per above. This should install the appropriate kernel drivers, just make sure you read the pkg message after installing it (esp making sure you are a member of the "video" group).

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
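The "video" group membership mentioned above can be checked with a few lines of sh. This is only a sketch: in_group is a hypothetical helper name, and it assumes group names contain no whitespace.

```sh
#!/bin/sh
# in_group USER GROUP (hypothetical helper)
# Succeeds if USER is a member of GROUP.
in_group() {
    user=$1
    group=$2
    # id -Gn prints every group name the user belongs to, separated by spaces.
    for g in $(id -Gn "$user"); do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}
```

For example, `in_group pete video || echo "not in video"` (the user name here is only an illustration). On FreeBSD a missing membership can be added with `pw groupmod video -m <user>`; the user needs to log in again for it to take effect.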
Re: Booting anything after r352057 kills console
On 9/23/19 2:32 PM, Thomas Laus wrote:

Poul-Henning Kamp [p...@phk.freebsd.dk] wrote:

In message <11db909b-57ee-b452-6a17-90ec2765c...@acm.org>, Thomas Laus writes:

Where do I go from here? The computer is an Intel i5 Skylake with onboard graphics.

Based on personal experience:
1. Deinstall drm ports
2. Remove all remaining drm related files under /boot
3. Reinstall drm port

That did not work. On a successful boot after using beadm to rollback to r352057, I see the following items startup after setting the ntpd security policy: starting ntpd configuring vt: blanktime sanity check of sshd configuration start sshd start sendmail & sendmail submit as well as cron start background checks login

On all svn updates after r352057, the last item logged is the ntpd security policy and then the console goes black. The computer is dead and I can't login through ssh nor change to another console. I have to hit the reset switch to reboot. Even ctrl-alt-delete is not functioning.

I remember having similar issues a while ago when we were first hacking on drm, one thing to try is updating /boot/loader.conf with the following:

debug.debugger_on_panic=0
dev.drm.skip_ddb="1"
dev.drm.drm_debug_persist="1"

these are semi-documented in the wiki here: https://wiki.freebsd.org/Graphics#Issues_.2F_Bugs

while they may not solve the issue, they will hopefully give us better info as to why the system is hanging. Also, are you able to boot the previously working kernel (iirc you can do this via the boot loader menu) successfully? and lastly, can you boot single user then manually attempt to load the kernel module via kldload i915kms.ko?

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
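A small sketch for adding the three loader.conf settings quoted above without duplicating them on repeated runs. The function names add_tunable and add_drm_debug_tunables are hypothetical; the file path is passed in so the sketch can be tried on a scratch copy before touching /boot/loader.conf.

```sh
#!/bin/sh
# add_tunable FILE LINE (hypothetical helper)
# Appends LINE to FILE unless an identical line is already present.
add_tunable() {
    conf=$1
    line=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# add_drm_debug_tunables FILE (hypothetical helper)
# Adds the three debugging settings from the message above.
add_drm_debug_tunables() {
    add_tunable "$1" 'debug.debugger_on_panic=0'
    add_tunable "$1" 'dev.drm.skip_ddb="1"'
    add_tunable "$1" 'dev.drm.drm_debug_persist="1"'
}
```

Running `add_drm_debug_tunables /boot/loader.conf` as root would apply the settings; they take effect on the next boot.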
Re: AMD & 13.0 Current
On 9/15/19 4:14 PM, c...@riseup.net wrote:

The other thing to note, is that for 13.0 Current they both need a line in /boot/loader.conf:

hw.syscons.disable=1

Why is this line needed and for what purposes?

This issue, and the one it links to, go a bit towards describing the issue: https://github.com/FreeBSDDesktop/kms-drm/issues/127

I have to use the workaround as well, which isn't ideal to be honest, but i just think due to the other higher priority tasks it hasn't been addressed yet.

Does the new hardware in this new build also work well in FreeBSD 12 stable branch? The FreeBSD current is for testing purposes so we are very interested in how the above-mentioned new hardware works on stable or production version.

I run both CURRENT and 12.0-RELEASE on Ryzen with amdgpu graphics and they work fine. I actually use AWS instances that have the AMD Epyc CPU installed and those work great as well - so i def feel like AMD is becoming relevant again on both desktop and servers :)

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: 13.0 Current - r350702 exposed a Xorg failure
On 8/9/19 8:56 PM, Clay Daniels Jr. wrote:

I was eager to load the new 13.0 Current snapshot yesterday as I wanted to play with the new FUSE tools. I was running 13.0 Current r350491 from last week and everything was going great. So last night, a little late I guess, I wiped the older install and loaded r350702. Then I loaded Xorg, all 172 packages, and loaded the drm-kmod video driver kernel modules, and then ran startx (as user of course). I got errors & it was late so today I looked closer. It said:

"xauth: file .serverauth.1039 does not exist"

Well, this file is apparently something created automatically. I played with the half-running install for a long time. It ran fine in console mode. Then I wiped it and reloaded the same newer r350702. No Go. Wiped the new r350702 and reloaded the older r350491 that was working just fine last night. Same problem: .serverauth.xxx

Now, I do know that the drm-kmod was the same (g20190710) that had worked for me at least two times already. I do not know if the Xorg pkg is the same. I couldn't find a date other than "latest". I'm writing this email from my Linux partition.

first thing that comes to mind, did you make sure to add your user to the "video" group? this doesn't sound related though...this does sound like a local configuration issue. iirc when i ran into this problem in the past it was due to permissions, either a .serverauth file owned by root or a UID that no longer exists.

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
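The ownership problem described above can be looked for with find. A rough sketch, assuming the X authority files live directly in the home directory; foreign_xauth_files is a hypothetical helper name, and including .Xauthority in the search is an addition not mentioned in the thread.

```sh
#!/bin/sh
# foreign_xauth_files DIR OWNER (hypothetical helper)
# Prints X authority files directly under DIR that are NOT owned by OWNER
# (a user name or numeric UID). No output means ownership looks fine.
foreign_xauth_files() {
    dir=$1
    owner=$2
    find "$dir" -maxdepth 1 \
        \( -name '.serverauth.*' -o -name '.Xauthority' \) \
        ! -user "$owner"
}
```

For instance, `foreign_xauth_files "$HOME" "$(id -u)"` lists any such file owned by root or by a stale UID; removing or chown-ing those files is one way to rule this cause out.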
Re: acpi issues on FreeBSD-current_r350103 on Thinkpad A485
On 7/19/19 3:54 PM, Evilham wrote:
>
> Serious issue:
> I was just debugging this right now, more infos with a proper bug
> report will come, but I think the system encounters a deadlock
> sometimes with the drm-kmod / amdgpu which results in a kernel panic.
> It is a serious issue, but it allows me to use the computer for work,
> it doesn't happen every couple hours, but it does happen a couple
> times a day.
>
> FWIW, this is part of the crashlog:
>
> WARNING !drm_modeset_is_locked(>mutex) failed at
> /wrkdirs/usr/ports/graphics/drm-fbsd12.0-kmod/work/kms-drm-6365030/drivers/gpu/drm/drm_atomic_helper.c:821
> [Multiple times...]
> kernel trap 22 with interrupts disabled
> kernel trap 22 with interrupts disabled
> kernel trap 22 with interrupts disabled
> kernel trap 22 with interrupts disabled
> panic: spin lock held too long
>

interesting. can you post this kernel panic, and any backtraces you are able to get here: https://github.com/FreeBSDDesktop/kms-drm/issues

also, are you using the xf86-video-amdgpu driver, or the stock modesetting driver to X?

thanks!
-pete
-- Pete Wright p...@nomadlogic.org 310.309.9298
Re: undefined symbol: random_source_register during kernel compilation
On 7/18/19 5:21 AM, M - Krasznai András wrote:

Hi I have been trying to compile a freebsd-current kernel since the 16th of July, and keep getting the following error during "make buildkernel":

Building /usr/obj/usr/src/amd64.amd64/sys/G13NEW/kernel.full
linking kernel.full
ld: error: undefined symbol: random_source_register
referenced by ivy.c:108 (/usr/src/sys/dev/random/ivy.c:108)
ivy.o:(rdrand_modevent)
ld: error: undefined symbol: random_source_deregister
referenced by ivy.c:115 (/usr/src/sys/dev/random/ivy.c:115)
ivy.o:(rdrand_modevent)
ld: error: undefined symbol: random_source_register
referenced by nehemiah.c:124 (/usr/src/sys/dev/random/nehemiah.c:124)
nehemiah.o:(nehemiah_modevent)
ld: error: undefined symbol: random_source_deregister
referenced by nehemiah.c:133 (/usr/src/sys/dev/random/nehemiah.c:133)
nehemiah.o:(nehemiah_modevent)
*** Error code 1
Stop.
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/G13NEW
.ERROR_TARGET='kernel.full'
.ERROR_META_FILE='/usr/obj/usr/src/amd64.amd64/sys/G13NEW/kernel.full.meta'
.MAKE.LEVEL='2'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose curdirOk= yes'

I deleted and resynchronized the source tree and emptied the /usr/obj directory, but it did not help. How could I get kernel compilation work again? I would like to say that the make.conf, src.conf files as well as my kernel configuration file have not changed in a couple of months (since I installed freebsd-current).

are you able to build GENERIC? if so might be worth looking at the deltas between GENERIC and your custom configuration and trying to zero in on what may be causing this to fail.

-p
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
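One way to look at the deltas between GENERIC and a custom kernel configuration is to strip comments, normalize whitespace, and compare the sorted lines. This is a sketch only: kconf_normalize and kconf_delta are hypothetical helper names, and a custom config that uses an "include GENERIC" line would need that include expanded first.

```sh
#!/bin/sh
# kconf_normalize FILE (hypothetical helper)
# Drops comments and blank lines, squeezes whitespace, and sorts the result.
kconf_normalize() {
    sed -e 's/#.*//' \
        -e 's/[[:space:]]\{1,\}/ /g' \
        -e 's/^ //' -e 's/ $//' \
        -e '/^$/d' "$1" | LC_ALL=C sort -u
}

# kconf_delta GENERIC CUSTOM (hypothetical helper)
# Prints "- line" for entries only in GENERIC and "+ line" for entries
# only in the custom configuration.
kconf_delta() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    kconf_normalize "$1" > "$a"
    kconf_normalize "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/- /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/+ /'
    rm -f "$a" "$b"
}
```

Assuming the custom configuration is named G13NEW (inferred from the object directory in the build log) and lives next to GENERIC, the call would be `kconf_delta /usr/src/sys/amd64/conf/GENERIC /usr/src/sys/amd64/conf/G13NEW`. Lines marked "-" are the first candidates to check when a symbol goes missing at link time.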
Re: frequent panics from current built today
On 6/25/19 11:37 PM, Doug Moore wrote:

This problem is almost certainly my fault, due to r349393. Once I learned of the problem and verified that I was responsible, I reverted that change with r349405. I apologize for the multiple inconveniences I have caused.

Thanks Doug, this has fixed things on my end!

Cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
frequent panics from current built today
hello - i've had a system panic multiple times after rebuilding world/kernel today. here's my info:

FreeBSD topanga 13.0-CURRENT FreeBSD 13.0-CURRENT 2474a68216f(master) GENERIC-NODEBUG amd64

Looking at two of the panic texts I see this:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x52
fault code = supervisor read data, page not present
instruction pointer = 0x20:0x80f47f54
stack pointer = 0x0:0xfe0122e4a6d0
frame pointer = 0x0:0xfe0122e4a7b0
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 69399 (java)
trap number = 12
panic: page fault
cpuid = 1
time = 1561514025
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0122e4a380
vpanic() at vpanic+0x19d/frame 0xfe0122e4a3d0
panic() at panic+0x43/frame 0xfe0122e4a430
trap_fatal() at trap_fatal+0x39c/frame 0xfe0122e4a490
trap_pfault() at trap_pfault+0x49/frame 0xfe0122e4a4f0
trap() at trap+0x29f/frame 0xfe0122e4a600
calltrap() at calltrap+0x8/frame 0xfe0122e4a600
--- trap 0xc, rip = 0x80f47f54, rsp = 0xfe0122e4a6d0, rbp = 0xfe0122e4a7b0 ---
vm_map_lookup() at vm_map_lookup+0x2a4/frame 0xfe0122e4a7b0
vm_fault_hold() at vm_fault_hold+0x72/frame 0xfe0122e4a900
vm_fault() at vm_fault+0x60/frame 0xfe0122e4a940
trap_pfault() at trap_pfault+0x164/frame 0xfe0122e4a9a0
trap() at trap+0x42b/frame 0xfe0122e4aab0
calltrap() at calltrap+0x8/frame 0xfe0122e4aab0
--- trap 0xc, rip = 0x41f695e0, rsp = 0x7fffdfff0738, rbp = 0x7fffdfff07b0 ---
KDB: enter: panic
__curthread () at /usr/home/pete/git/freebsd/sys/amd64/include/pcpu.h:246
246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));

I have usable core files and can post any additional debug info needed. The first crash happened when launching chrome, the second (which is what the above text is from) happened when I was running a java process in a jail.
Has anyone else seen this?

Cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
Re: mergemaster still having issues
On 5/17/19 10:51 AM, David Wolfskill wrote:

On Fri, May 17, 2019 at 10:46:57AM -0700, Pete Wright wrote:

hello - i am still having issues with mergemaster when building/installing current on my end as per this thread: https://lists.freebsd.org/pipermail/freebsd-current/2019-May/073403.html

my source contains the fix that I think we committed to address the original issue, are there any other steps required to get things working:

*** Creating the temporary root environment in /var/tmp/temproot
*** /var/tmp/temproot ready for use
*** Creating and populating directory structure in /var/tmp/temproot
cp: /usr/home/pete/git/freebsd/etc/master.passwd: No such file or directory
*** FATAL ERROR: Cannot copy files to the temproot environment

cheers, -pete

Yes: you need to actually install that (new) version of mergemaster before you try to use it. E.g.:

pushd usr.bin/mergemaster && make install; popd

Then re-start the userland installation (e.g., "mergemaster -p ...") and proceed with the rest of the install as usual.

Ah OK, i think i missed that in the thread - and the promised update to the UPDATING file in the commit doesn't seem to have landed yet either. Thanks David!

-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA
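The pushd/popd pair in the suggestion above is a bash/csh-style builtin and is not available in FreeBSD's /bin/sh. A subshell gives the same effect in plain sh; the sketch below assumes the source tree location is passed in, and install_from_subdir is a hypothetical helper name.

```sh
#!/bin/sh
# install_from_subdir SRCDIR SUBDIR (hypothetical helper)
# Runs "make install" inside SRCDIR/SUBDIR without changing the caller's
# working directory.
install_from_subdir() {
    srcdir=$1
    subdir=$2
    # The parentheses start a subshell, so the cd is undone automatically
    # when it exits and no popd is needed.
    ( cd "$srcdir/$subdir" && make install )
}
```

With the tree from this thread that would be `install_from_subdir /usr/home/pete/git/freebsd usr.bin/mergemaster`, run as root, followed by the usual "mergemaster -p ..." step.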
mergemaster still having issues
hello - i am still having issues with mergemaster when building/installing current on my end as per this thread: https://lists.freebsd.org/pipermail/freebsd-current/2019-May/073403.html

my source contains the fix that I think we committed to address the original issue, are there any other steps required to get things working:

*** Creating the temporary root environment in /var/tmp/temproot
*** /var/tmp/temproot ready for use
*** Creating and populating directory structure in /var/tmp/temproot
cp: /usr/home/pete/git/freebsd/etc/master.passwd: No such file or directory
*** FATAL ERROR: Cannot copy files to the temproot environment

cheers,
-pete
-- Pete Wright p...@nomadlogic.org @nomadlogicLA