Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d
Cy Schubert writes: > In message , Gleb Smirnoff writes: > > On Tue, Apr 09, 2024 at 07:02:11PM +0200, FreeBSD User wrote: > > F> The crash is still present on the most recent checked out sources as of > mi > > nutes ago. > > F> I just checked out on HEAD the latest commits (see below, just for the r > ec > > ord and to prevent > > F> being wrong here). > > F> > > F> [...] > > F> commit 841cf52595b6a6b98e266b63e54a7cf6fb6ca73e (HEAD -> main, origin/ma > in > > , origin/HEAD) > > > > Is the crash same or different? Can you please share backtrace? > > The new panic is: > > Fatal trap 12: page fault while in kernel mode > cpuid = 3; apic id = 03 > fault virtual address = 0x28 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0x80729d8d > stack pointer = 0x28:0xfe00b59c0a70 > frame pointer = 0x28:0xfe00b59c0aa0 > code segment= base 0x0, limit 0xf, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags= interrupt enabled, resume, IOPL = 0 > current process = 2697 (rpcbind) > rdi: f80004fcd720 rsi: rdx: fe00b59c0b68 > rcx: r8: 0001 r9: 3b9ac9e0 > rax: 3b9aca00 rbx: fe00b59c0b68 rbp: fe00b59c0aa0 > r10: 0020 r11: r12: > r13: 0020 r14: 0020 r15: f80004fcd720 > trap number = 12 > panic: page fault > cpuid = 3 > time = 1712682162 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfe00b59c0760 > vpanic() at vpanic+0x135/frame 0xfe00b59c0890 > panic() at panic+0x43/frame 0xfe00b59c08f0 > trap_fatal() at trap_fatal+0x40b/frame 0xfe00b59c0950 > trap_pfault() at trap_pfault+0x46/frame 0xfe00b59c09a0 > calltrap() at calltrap+0x8/frame 0xfe00b59c09a0 > --- trap 0xc, rip = 0x80729d8d, rsp = 0xfe00b59c0a70, rbp = > 0xfe00b59c0aa0 --- > uiomove_faultflag() at uiomove_faultflag+0x9d/frame 0xfe00b59c0aa0 > uipc_soreceive_stream_or_seqpacket() at uipc_soreceive_stream_or_seqpacket+0 > x38c/frame 0xfe00b59c0b30 > soreceive() at soreceive+0x2f/frame 0xfe00b59c0b50 > clnt_vc_soupcall() at 
clnt_vc_soupcall+0x139/frame 0xfe00b59c0c00 > sorwakeup_locked() at sorwakeup_locked+0x98/frame 0xfe00b59c0c20 > uipc_sosend_stream_or_seqpacket() at uipc_sosend_stream_or_seqpacket+0x58e/f > rame 0xfe00b59c0ce0 > sousrsend() at sousrsend+0x5f/frame 0xfe00b59c0d40 > dofilewrite() at dofilewrite+0x7f/frame 0xfe00b59c0d90 > sys_write() at sys_write+0xb3/frame 0xfe00b59c0e00 > amd64_syscall() at amd64_syscall+0x115/frame 0xfe00b59c0f30 > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe00b59c0f30 > --- syscall (4, FreeBSD ELF64, write), rip = 0x1d82f79281a, rsp = > 0x1d82c63be78, rbp = 0x1d82c63bee0 --- > Uptime: 39s > Dumping 515 out of 7969 MB:..4%..13%..22%..32%..41%..53%..63%..72%..81%..91% > > (kgdb) bt > #0 __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57 > #1 doadump (textdump=textdump@entry=1) at /opt/src/git-src/sys/kern/kern_sh > utdown.c:404 > #2 0x806bd7d9 in kern_reboot (howto=260) at > /opt/src/git-src/sys/kern/kern_shutdown.c:524 > #3 0x806bdcf2 in vpanic (fmt=0x80ae0f0d "%s", > ap=ap@entry=0xfe00b59c08d0) at /opt/src/git-src/sys/kern/kern_shutdown.c > :976 > #4 0x806bdb43 in panic (fmt=) at > /opt/src/git-src/sys/kern/kern_shutdown.c:892 > #5 0x80a597fb in trap_fatal (frame=0xfe00b59c09b0, eva=40) at > /opt/src/git-src/sys/amd64/amd64/trap.c:950 > #6 0x80a59846 in trap_pfault (frame=, usermode=false, > signo=, ucode=) at /opt/src/git-src/sys/amd64/ > amd64/trap.c:758 > #7 > #8 uiomove_faultflag (cp=0xf80004fcd720, n=32, > uio=uio@entry=0xfe00b59c0b68, nofault=nofault@entry=0) at > /opt/src/git-src/sys/kern/subr_uio.c:240 > #9 0x80729ce9 in uiomove (cp=0xf80004fcd720, n=0, > uio=uio@entry=0xfe00b59c0b68) at /opt/src/git-src/sys/kern/subr_uio.c:19 > 3 > #10 0x80774f1c in uipc_soreceive_stream_or_seqpacket > (so=0xf800361f4000, psa=, uio=0xfe00b59c0b68, > mp0=, controlp=0xfe00b59c0bc0, flagsp=0xfe00b59c0ba8) > at /opt/src/git-src/sys/kern/uipc_usrreq.c:1420 > #11 0x8076d4ff in soreceive (so=0xf80004fcd720, > 
so@entry=0xf800361f4000, psa=psa@entry=0x0, uio=uio@entry=0xfe00b59c > 0b68, mp0=0x0, mp0@entry=0xfe00b59c0bb8, controlp=0x1, > controlp@entry=0xfe0
Re: kernel crash in tcp_subr.c:2386
In message <20240212193044.e089d...@slippy.cwsent.com>, Cy Schubert writes:
> In message <625e0ea4-9413-45ad-b05c-500833a1d...@freebsd.org>,
> tuexen@freebsd.org writes:
> > > On Feb 12, 2024, at 10:36, Alexander Leidinger wrote:
> > >
> > > Hi,
> > >
> > > I got a coredump with sources from 2024-02-10-144617 (GMT+0100):
> > Hi Alexander,
> >
> > we are aware of this problem, but haven't found a way to reproduce it.
> > Do you know how to reproduce this?
>
> I've reproduced this by rebooting any one of my machines in my basement.
> The other machines will panic as below.
>
> I've reverted the three tcp timer commits, expecting one of them to be the
> cause.

Another data point: I build on a build machine and NFS mount /usr/obj on my other machines. Another symptom of this problem is that the NFS share will appear corrupted. And df -htnfs will sometimes not display the mounted NFS share.

If not a kernel page fault, random kernel memory can be overwritten resulting in bizarre behaviour prior.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: kernel crash in tcp_subr.c:2386
; No locals. > > #15 0x808597d6 in syscallenter (td=3D0xf8068ef99740) > >at = > /space/system/usr_src/sys/amd64/amd64/../../kern/subr_syscall.c:186 > >se =3D 0x80a48330 > >p =3D 0xfe07f29995c0 > >sa =3D 0xf8068ef99b30 > >error =3D > >sy_thr_static =3D > >traced =3D > > #16 amd64_syscall (td=3D0xf8068ef99740, traced=3D0) > >at /space/system/usr_src/sys/amd64/amd64/trap.c:1192 > >ksi =3D {ksi_link =3D {tqe_next =3D 0xfe08a079ef30, > >tqe_prev =3D 0x808588af }, ksi_info =3D = > { > >si_signo =3D 1, si_errno =3D 0, si_code =3D 2015268872, = > si_pid =3D -512, > >si_uid =3D 2398721856, si_status =3D -2042, > >si_addr =3D 0xfe08a079ef40, si_value =3D {sival_int =3D = > -1602621824, > > sival_ptr =3D 0xfe08a079ee80, sigval_int =3D = > -1602621824, > > sigval_ptr =3D 0xfe08a079ee80}, _reason =3D {_fault =3D= > { > >_trapno =3D 1489045984}, _timer =3D {_timerid =3D = > 1489045984, > >_overrun =3D 17999}, _mesgq =3D {_mqd =3D 1489045984}, = > _poll =3D { > >_band =3D 77306605406688}, _capsicum =3D {_syscall =3D = > 1489045984}, > > __spare__ =3D {__spare1__ =3D 77306605406688, __spare2__ = > =3D { > > 1489814048, 17999, 208, 0, 0, 0, 992191072, > > ksi_flags =3D 975329968, ksi_sigq =3D 0x8082f8f3 = > } > > #17 > > No locals. > > #18 0x3af13b17fc9a in ?? () > > No symbol table info available. > > Backtrace stopped: Cannot access memory at address 0x3af13a225ab8 > > ---snip--- > >=20 > > Any ideas? > >=20 > > Due to another issue in userland, I updated to 2024-02-11-212006, but = > I have the above mentioned version and core still in a BE if needed > >=20 > > Bye, > > Alexander. > > -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: noatime on ufs2
In message <3f6cf45c-3d34-4da6-9b81-337eb70bb...@karels.net>, Mike Karels write s: > On 30 Jan 2024, at 15:48, Cy Schubert wrote: > > > In message c > > om> > > , Rick Macklem writes: > >> On Tue, Jan 30, 2024 at 10:49=E2=80=AFAM Mike Karels wro > t= > >> e: > >>> > >>> On 30 Jan 2024, at 3:00, Olivier Certner wrote: > >>> > >>>> Hi Warner, > >>>> > >>>>> I strongly oppose this notion to control this from loader.conf. Root i= > >> s > >>>>> mounted read-only, so it doesn't matter. That's why I liked Mike's > >>>>> suggestion: root isn't special. > >>>> > >>>> Then in fact there is nothing to oppose. You've just said yourself tha= > >> t root is mounted first read-only. As Mike already said, it is remounted > r= > >> /w in userland later in the boot process. I just re-checked the code, bec > a= > >> use I only had a vague recollection of all this, and can confirm. > >>>> > >>>> I mentioned the need to modify '/etc/loader.conf' as a possible consequ= > >> ence, not as a goal. Given what we have established, there is no need to > c= > >> hange it at all. > >>>> > >>>> The root FS is thus in no way more special in the sysctl proposal than = > >> with Mike's (assuming it doesn't rely on sysctl), this is an independent p > r= > >> operty due to the boot process design. > >>> > >>> With the possible exception that the sysctl mechanism might then have to > >>> apply to mount update. > >>> > >>>>>>> It also seems undesirable to add a sysctl to control a value that th= > >> e > >>>>>>> kernel doesn't use. > >>>>>> > >>>>>> The kernel has to use it to guarantee some uniform behavior irrespect= > >> ive > >>>>>> of the mount being performed through mount(8) or by a direct call to > >>>>>> nmount(2). I think this consistency is important. 
Perhaps all > >>>>>> auto-mounters and mount helpers always run mount(8) and never deal wi= > >> th > >>>>>> nmount(2), I would have to check (I seem to remember that, a long tim= > >> e ago, > >>>>>> when nmount(2) was introduced as an enhancement over mount(2), the st= > >> ance > >>>>>> was that applications should use mount(8) and not nmount(2) directly)= > >> . > >>>>>> Even if there were no obvious callers of nmount(2), I would be a bit > >>>>>> uncomfortable with this discrepancy in behavior. > >>> > >>> Based on a quick git grep, it looks like most of the things in base use > >>> nmount(2), not mount(2). If they use mount(8), then it's not a problem > >>> because mount(8) would be the first thing to get things right. If, by > >>> mount helpers, you mean things like mount_nfs and mount_mfs, then mount(8 > = > >> ) > >>> uses them rather than the reverse. I also don't remember any admonition > >>> not to use nmount(2). mount(8) has a limited set of file system types th > = > >> at > >>> it handles directly. > >>> > >>>>> I disagree. I think Mike's suggestion was better and dealt with POLA a= > >> nd > >>>>> POLA breaking in a sane way. If the default is applied universally in = > >> user > >>>>> space, then we need not change the kernel at all. > >>>> > >>>> I think applying the changes to userland only is really a bad idea. I'= > >> ve already explained why, but going to do it again in case you missed that > .= > >> If you have counter-arguments, fine, but I would like to see them. > >>>> > >>>> Changing userland only causes a discrepancy between mount(8) and nmount= > >> (2). Even if the project would take a stance that nmount(2) is not a publ > i= > >> c API and mount(8) must always be used, the system call will still be ther > e= > >> And if it's not supposed to be used, what's the problem with changing it > = > >> as well? > >>> > >>> I don't think that stance has been taken; nmount(2) is certainly document > = > >> ed. 
> >>> But I think that user level changes are required in both cases. First, f > = > >> or > >>> the kernel to do the right thing, it needs to
Re: noatime on ufs2
t; atime ..." is given on the command line, noatime will not be included in
> > the kernel options. The kernel can't tell why, whether nothing was specified
> > or the option was explicit. In theory, three states can be encoded using
> > nmount; options could include "atime", "noatime", or neither. But that's
> > not what the current user level does, so changes are required. Given that,
> > it makes the most sense to have mount(8) and others to incorporate the
> > default into their operation, and just give the kernel the answer. btw,
> > see mntopts(3) for where this code would go.
> These days most mount options are parsed in the kernel via vfs_getopts(),
> but not "atime". It appears that "(no)atime" sets/clears MNT_NOATIME in
> userspace via the getmntopts() function that lives in
> /usr/src/sbin/mount/getmntopts.c.
>
> I think this is mostly cruft left over from the mount(2)->nmount(2) conversion,
> for generic options that cover all file systems.
>
> Personally, I like the idea of the addition of a defaults line in
> fstab(5), but am
> not sure what needs to be done for things like auto mounting?

automountd will require addition of options to existing configuration. am-utils users can add a default line. Or an addition of a "default" specification, which would make it incompatible with Linux and Solaris. Currently our autofs is 100% compatible (minus the /net bug) with both.

>
> I'll admit I do not see what the default value of "(no)atime" is, so long as it
> can be overridden on a per mount basis. A change to what the installer sets,
> seems fine to me.
>
> rick
> [...]

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
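To make the three states mentioned above concrete ("atime", "noatime", or neither), here is a minimal sh sketch. It is only an illustration and not the mntopts(3) or mount(8) code: the function names atime_state and resolve_atime are made up for this example, and "last option wins" is an assumption.

```shell
#!/bin/sh
# atime_state OPTS
#   Classify a comma-separated mount option string (hypothetical helper).
#   Prints "atime", "noatime", or "unset" when neither was given.
#   Assumption for this sketch: if both appear, the last one wins.
atime_state() (
    state=unset
    IFS=,
    for opt in $1; do
        case $opt in
            atime|noatime) state=$opt ;;
        esac
    done
    printf '%s\n' "$state"
)

# resolve_atime OPTS DEFAULT
#   Apply a site-wide default only when the user said nothing,
#   so an explicit "atime" is never overridden by the default.
resolve_atime() {
    s=$(atime_state "$1")
    if [ "$s" = unset ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$s"
    fi
}
```

With this, resolve_atime "rw" noatime yields the default, while resolve_atime "rw,atime" noatime keeps the explicit request. That distinction is what is lost when only a single flag bit reaches the kernel.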
Re: Removing fdisk and bsdlabel (legacy partition tools)
On January 26, 2024 7:13:15 PM PST, Ed Maste wrote:
>On Wed, 24 Jan 2024 at 15:43, Julian H. Stacey wrote:
>>
>> Probably many do, clueless there's a proposal to remove them,
>> as many won't be tracking lists (I haven't been tracking lately,
>> focused on moving home, others will have other distractions)
>
>As Rod suggested I'll have the tools emit a warning when they are run,
>so that those users will become aware.
>https://reviews.freebsd.org/D43585
>https://reviews.freebsd.org/D43586
>

We can also point people to the two new ports.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message <20240125101308.92e93...@slippy.cwsent.com>, Cy Schubert writes: > In message <84c6f3b1-58b3-44f8-aeaf-35f78e059...@quip.cz>, Miroslav Lachman > wri > tes: > > On 25/01/2024 06:50, Cy Schubert wrote: > > > In message > > > l. > > c > > > > > > >> > > >> What can they do that gpart can't do? > > > > > > This was quite a while ago, booted off my recovery USB attempting to repa > ir > > > some self caused damage. The ability to edit (vi) a file with starting > > > addresses and lengths, visually using bsdlabel, was suited to my panicked > > > state as I worked to recover the machine. > > > > > > A visual view of columns of a bsdlabel, editing a label using vi, checkin > g > > > and double checking numbers before committing them is handy.The visual > > > format and the ability to adjust the numbers in an editor before committi > ng > > > them is handy. You can't do this with gpart, as it's transactional. And > > > bsdinstall doesn't give one the opportunity to check the numbers in detai > l > > > on a console before committing them. > > > > If you really like your editor of choice to edit partition table, you > > can use gpart backup and gpart restore like this: > > > > gpart backup ada0 > ada0.part > > vi ada0.part > > gpart restore -F -l < ada0.part > > That would work. > > > > > > Maybe a good GSoC project may be to replace bsdlabel's driect writes to > > > disk with geom calls. Though, t doesn't need to be bsdlabel, but some kin > d > > > of utility that displays the existing label in an editor session where > > > changes can be made, using the editor, and committed. This could even be > an > > > enhancement to bsdinstall: call it expert mode or whatever. > > > > Manipulating partition table in editor session can be achieved by few > > lines of shell script as a wrapper around gpart backup & gpart restore. > > Or just build a gpart edit mode with the functions used to implement backup > and restore. Excellent idea. Thank you. A small project to work on. 
> > > > > Kind regards > > Miroslav Lachman A freebsd-bsdlabel port has been created making way for its removal. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message <84c6f3b1-58b3-44f8-aeaf-35f78e059...@quip.cz>, Miroslav Lachman writes:
> On 25/01/2024 06:50, Cy Schubert wrote:
> > In message c
> >
> >>
> >> What can they do that gpart can't do?
> >
> > This was quite a while ago, booted off my recovery USB attempting to repair
> > some self caused damage. The ability to edit (vi) a file with starting
> > addresses and lengths, visually using bsdlabel, was suited to my panicked
> > state as I worked to recover the machine.
> >
> > A visual view of columns of a bsdlabel, editing a label using vi, checking
> > and double checking numbers before committing them is handy. The visual
> > format and the ability to adjust the numbers in an editor before committing
> > them is handy. You can't do this with gpart, as it's transactional. And
> > bsdinstall doesn't give one the opportunity to check the numbers in detail
> > on a console before committing them.
>
> If you really like your editor of choice to edit partition table, you
> can use gpart backup and gpart restore like this:
>
> gpart backup ada0 > ada0.part
> vi ada0.part
> gpart restore -F -l < ada0.part

That would work.

> > Maybe a good GSoC project may be to replace bsdlabel's direct writes to
> > disk with geom calls. Though, it doesn't need to be bsdlabel, but some kind
> > of utility that displays the existing label in an editor session where
> > changes can be made, using the editor, and committed. This could even be an
> > enhancement to bsdinstall: call it expert mode or whatever.
>
> Manipulating partition table in editor session can be achieved by few
> lines of shell script as a wrapper around gpart backup & gpart restore.

Or just build a gpart edit mode with the functions used to implement backup and restore. Excellent idea. Thank you. A small project to work on.

> Kind regards
> Miroslav Lachman

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
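The "few lines of shell script" wrapper around gpart backup and gpart restore discussed above could look roughly like the sketch below. It is untested against real disks and makes some assumptions: the function name gpart_edit and the GPART override variable are invented for this example (the override exists only so the flow can be dry-run with a stub), and the provider name is passed to gpart restore explicitly, since gpart(8) documents restore as taking a provider argument.

```shell
#!/bin/sh
# gpart_edit DISK
#   Dump the partition table of DISK, open it in an editor, and restore
#   the edited table only if it was actually changed.
#   GPART (hypothetical override) defaults to the real gpart(8);
#   EDITOR defaults to vi.
gpart_edit() {
    disk=$1
    gpart_cmd=${GPART:-gpart}
    tmp=$(mktemp "${TMPDIR:-/tmp}/gpart-edit.XXXXXX") || return 1

    # Save the current table in the text format that "gpart restore" reads.
    if ! "$gpart_cmd" backup "$disk" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    cp "$tmp" "$tmp.orig"

    # Check and double-check the numbers in an editor before committing.
    ${EDITOR:-vi} "$tmp"

    if cmp -s "$tmp" "$tmp.orig"; then
        echo "no changes; partition table left alone"
        status=0
    else
        # -F destroys the existing table first, -l restores labels.
        "$gpart_cmd" restore -F -l "$disk" < "$tmp"
        status=$?
    fi
    rm -f "$tmp" "$tmp.orig"
    return $status
}
```

Something like gpart_edit md0 on a scratch memory disk would be a sensible first trial, because restore -F rewrites the whole table rather than applying a delta.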
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message <2369865.bDOn7JOVgO@ravel>, Olivier Certner writes:
> From: Olivier Certner
> To: Cy Schubert
> Subject: Re: Removing fdisk and bsdlabel (legacy partition tools)
> Date: Thu, 25 Jan 2024 10:43:18 +0100
> Message-ID: <2369865.bDOn7JOVgO@ravel>
> In-Reply-To: <20240125055019.ccf1...@slippy.cwsent.com>
>
> Hi,
>
> > A visual view of columns of a bsdlabel, editing a label using vi, checking
> > and double checking numbers before committing them is handy. The visual
> > format and the ability to adjust the numbers in an editor before committing
> > them is handy. You can't do this with gpart, as it's transactional. And
> > bsdinstall doesn't give one the opportunity to check the numbers in detail
> > on a console before committing them.
>
> You seem to want to be able to stack a number of modifications before actually
> pushing them. Actually, gpart(8) already can do that! Please see the
> "OPERATIONAL FLAGS" section in gpart(8).
>
> In between your tentative modifications, just use 'gpart show' to see where
> you stand.

gpart(8) should have a vi mode. That is different than having changes pending and committing them. A person is still entering commands rather than doing something like editing a spreadsheet, which is what editing a file is kind-of like. Even something like,

gpart show ada0s2 > some_file
vi some_file
gpart batch ada0s2 < some_file

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message , Warner Losh writes:
> On Wed, Jan 24, 2024, 10:07 PM Cy Schubert wrote:
>
> > In message <202401242347.40onlwkz099...@gndrsh.dnsmgr.net>, "Rodney W.
> > Grimes"
> > writes:
> > > > I would agree personally, to moving to ports (eg ports/sysutils) with
> > > > a DEPRECATED in the DESCR or something, or better yet a Make
> > > > invocation event to say "superseded, here is how to proceed against
> > > > advice") or something.
> > >
> > > They are totally useless as ports when you're booted from install
> > > media and working from a standalone shell. These are the exact
> > > times you want things like fdisk and bsdlabel so you can figure
> > > out wtf is going on, and bsdinstall is NOT gonna help you.
> >
> > This is certainly a good point.
> >
>
> What can they do that gpart can't do?

This was quite a while ago, booted off my recovery USB attempting to repair some self caused damage. The ability to edit (vi) a file with starting addresses and lengths, visually using bsdlabel, was suited to my panicked state as I worked to recover the machine.

A visual view of columns of a bsdlabel, editing a label using vi, checking and double checking numbers before committing them is handy. The visual format and the ability to adjust the numbers in an editor before committing them is handy. You can't do this with gpart, as it's transactional. And bsdinstall doesn't give one the opportunity to check the numbers in detail on a console before committing them.

Maybe a good GSoC project may be to replace bsdlabel's direct writes to disk with geom calls. Though, it doesn't need to be bsdlabel, but some kind of utility that displays the existing label in an editor session where changes can be made, using the editor, and committed. This could even be an enhancement to bsdinstall: call it expert mode or whatever.
-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message , "Patrick M. Hausen" writes:
> Hi all,
>
> > On 25.01.2024 at 00:47, Rodney W. Grimes wrote:
> >
> >> I would agree personally, to moving to ports (eg ports/sysutils) with
> >> a DEPRECATED in the DESCR or something, or better yet a Make
> >> invocation event to say "superseded, here is how to proceed against
> >> advice") or something.
> >
> > They are totally useless as ports when you're booted from install
> > media and working from a standalone shell. These are the exact
> > times you want things like fdisk and bsdlabel so you can figure
> > out wtf is going on, and bsdinstall is NOT gonna help you.
> >
> > I know there are a boat load of people that have built their
> > own installers for VM's and stuff, running UFS and I bet you
> > they are using MBR disks too. PLEASE do not kick these tiny
> > little and very usable and pretty universal (as far as I know
> > ALL BSD's have fdisk and bsdlabel/disklabel) tools out of
> > the base system.
> >
> > The world is NOT 2TB nvme drives with GPT, EFI and ZFS,
> > yours might not be, but I am pretty certain I am not
> > alone in this other world.
>
> I totally understand that point, but what exactly do these tools do that
> gpart cannot? On MBR disks? With BSD partitions?
>
> Ever since I found out that gpart can manage *all* on-disk partition formats
> I have not been using anything else. You can create your MBR partitions
> and BSD labels just fine with gpart. At least in all situations I encountered,
> there might of course be edge cases I simply don't know.

On occasion when trying to manipulate a disk label, gpart will refuse to. Usually when creating or manipulating a label on a zvol one doesn't want to use on the host system, that is destined to be used in a VM. It's simpler to create the partitions and labels beforehand, attach the zvol to the VM, boot and install (or test) within the VM.

In this case one doesn't even care if geom sees the "disk" or its partitions on the host because the "disk" is destined for use in a VM. I've created zvols for use by various VMs in this manner.

I agree with Rod's remark that when one is in panic mode working through a difficult situation extra tools, not fewer, can help.

Regarding extra tools, I do maintain a full copy of FreeBSD on a USB disk, in order to recover from catastrophic situations. They're extremely rare, the last of which was the result of a commit that broke loader (or was it the boot blocks -- I can't remember the exact details anymore) in 12 or 13-CURRENT. The extra tools came in handy as I worked through the mess.

>
> gpart is not the "GPT partition tool". It's the universal swiss army knife
> "GEOM partition tool" for all disk partitioning in any format supported.
>
> Kind regards,
> Patrick
>

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message <202401242347.40onlwkz099...@gndrsh.dnsmgr.net>, "Rodney W. Grimes" writes: > > I would agree personally, to moving to ports (eg ports/sysutils) with > > a DEPRECATED in the DESCR or something, or better yet a Make > > invokation event to say "superceded, here is how to proceed against > > advice") or something. > > They are totally useless as ports when your booted from install > media and working from a standalone shell. These are the exact > times you want things like fdisk and bsdlabel so you can figure > out wtf is going on, and bsdinstall is NOT gona help you. This is certainly a good point. > > I know there are a boat load of people that have built there > own installers for VM's and stuff, running UFS and I bet you > they are using MBR disks too. PLEASE do not kick these tiny > little and very usable and pretty univeral (as far as I know > ALL BSD's have fdisk and bsdlabel/disklabel) tools out of > the base system. > > The world is NOT 2TB nvme drives with GPT, EFI and ZFS, > yours might not be, but I am pretty certain I am not > alone in this other world. > > > -G > > > > On Thu, Jan 25, 2024 at 3:30?AM Warner Losh wrote: > > > > > > On Wed, Jan 24, 2024 at 8:45?AM Ed Maste wrote: > > >> > > >> MBR (PC BIOS) partition tables were historically maintained with > > >> fdisk(8), but gpart(8) has long been the preferred method for working > > >> with partition tables of all types. fdisk has been declared as > > >> obsolete in the man page since 2015. Similarly BSD disklabels were > > >> historically maintained with bsdlabel. It does not yet have a > > >> deprecation notice - I have proposed a man page addition in > > >> https://reviews.freebsd.org/D43563. > > >> > > >> I would like to disconnect these from the build, and subsequently > > >> remove them. This is prompted by a recent bsdlabel bug report which > > >> uncovered a longstanding buffer overflow in that tool. 
Effort is much > > >> better focused on contemporary, maintained tools rather than > > >> investigating issues in deprecated ones. Removing these tools would > > >> happen in FreeBSD 15 only (no change in 14 or 13). > > >> > > >> Code review to disconnect fdisk: https://reviews.freebsd.org/D43575 > > >> > > >> Note that this effort is limited to these maintenance tools only - > > >> there is no change to kernel or gpart support for MBR or BSD > > >> disklablel partitioning. That said, MBR partitioning and BSD > > >> disklabels are best considered legacy formats and should be avoided > > >> for new installations, if possible. > > >> > > >> If anyone is using fdisk and/or bsdlabel rather than gpart I would > > >> appreciate knowing what is preventing you from using the contemporary > > >> tools. > > > > > > > > > nanobsd's legacy.sh still is using disklabel in two spots. > > > > > > But one is to just do gpart create -s bsd and the other is to display it. > Easy > > > to fix, but even easier to delete legacy.sh entirely. It's not really nee > ded any > > > more and was a product of CHS addressing... Now that we use LBA, it's > > > better to use the new embedded ones. Even at $WORK where we kinda > > > use legacy, we replace the partitioning stuff with our own custom thing.. > . > > > > > > Those are the only users in the tree, but not for long :) > > > > > > fdisk was good, but somewhere around the CHS -> LBA transition things > > > got weird with it, and for really big disks there were reports of issues > that > > > I could never encounter when I set out to fix them... Most likely due to > a > > > mismatch in the CHS data and the LBA data being recorded in the MBR. > > > The in-kernel gpart copes so much better. > > > > > > I wouldn't object to making these ports, but both these programs use 'sek > ret' > > > bits from the kernel that might not remain exposed as we clean things up. > > > Though the IOCTLs they do (or used to do) may no longer be relevant. 
It's > > > been so long that I've forgotten > > > > > > Warner > -- > Rod Grimes rgri...@freebsd.or > g > -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: Removing fdisk and bsdlabel (legacy partition tools)
In message , Ed Maste writes:
> MBR (PC BIOS) partition tables were historically maintained with
> fdisk(8), but gpart(8) has long been the preferred method for working
> with partition tables of all types. fdisk has been declared as
> obsolete in the man page since 2015. Similarly BSD disklabels were
> historically maintained with bsdlabel. It does not yet have a
> deprecation notice - I have proposed a man page addition in
> https://reviews.freebsd.org/D43563.
>
> I would like to disconnect these from the build, and subsequently
> remove them. This is prompted by a recent bsdlabel bug report which
> uncovered a longstanding buffer overflow in that tool. Effort is much
> better focused on contemporary, maintained tools rather than
> investigating issues in deprecated ones. Removing these tools would
> happen in FreeBSD 15 only (no change in 14 or 13).
>
> Code review to disconnect fdisk: https://reviews.freebsd.org/D43575
>
> Note that this effort is limited to these maintenance tools only -
> there is no change to kernel or gpart support for MBR or BSD
> disklabel partitioning. That said, MBR partitioning and BSD
> disklabels are best considered legacy formats and should be avoided
> for new installations, if possible.
>
> If anyone is using fdisk and/or bsdlabel rather than gpart I would
> appreciate knowing what is preventing you from using the contemporary
> tools.
>

We need to fix the kern.geom.debugflags sysctl foot shooting option so that it works. (Not that bsdlabel or fdisk worked around the issue). Otherwise one is left with boot to single user or from alternate media if that doesn't work.

I do have a patch that circumvents the problem. I haven't looked at it in years and it probably needs some cleanup though.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: NFSv4 crash of CURRENT
In message , Rick Macklem writes:
> On Sat, Jan 13, 2024 at 12:39 PM Ronald Klop wrote:
> >
> >
> > From: FreeBSD User
> > Date: 13 January 2024 19:34
> > To: FreeBSD CURRENT
> > Subject: NFSv4 crash of CURRENT
> >
> > Hello,
> >
> > running CURRENT client (FreeBSD 15.0-CURRENT #4 main-n267556-69748e62e82a: Sat Jan 13 18:08:32
> > CET 2024 amd64). One NFSv4 server is same OS revision as the mentioned client, other is FreeBSD
> > 13.2-RELEASE-p8. Both offer NFSv4 filesystems, non-kerberized.
> >
> > I can crash the client reproducibly by accessing the one or other NFSv4 FS (a simple ls -la).
> > The NFSv4 FS is backed by ZFS (if this matters). I do not have physical access to the client
> > host, luckily the box recovers.
> Did you rebuild both the nfscommon and nfscl modules from the same sources?
> I did a commit to main that changes the interface between these two
> modules and did bump the
> __FreeBSD_version to 1500010, which should cause both to be rebuilt.
> (If you have "options NFSCL" in your kernel config, both should have
> been rebuilt as a part of
> the kernel build.)
>

Is anyone by chance seeing autofs in the backtrace too?

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: CDE on FreeBSD 14.0 Release
In message <20231127172506.horde.iibmvttnl5om_j0h4cvd...@webmail.in-berlin.de>, "Rolf M. Dietze" writes:
> Hi,
>
> might be that I am on the wrong list for this, since my Problem
> exists on 14.0Release.
>
> After an out of the box install of FreeBSD 14.0 Release and following
> https://forums.freebsd.org/threads/setting-up-common-desktop-environment-for-modern-use.69475/

First, unless you plan on using the CDE calendar, you don't need the dtspc entry in inetd.conf.

I'm currently running CDE, installed using pkg, on 14.0-RELEASE using dtlogin at $JOB. At home I use CDE on 15-CURRENT. I use xdm at home instead because dtlogin doesn't support PAM* but xdm does. Caveat, if you use xdm you will need to add dtsession to your .xsession file.

* My home network uses MIT KRB5 to serve passwords and LDAP to serve UID/GID information. This requires pam_krb5, and as dtlogin is not PAM aware it can only work with accounts in /etc/passwd. Hence xdm. I'll probably submit a pull request to the CDE development team one day but considering all the other things on my plate, PAM support within dtlogin is pretty low on my priority list.

> I am stuck with CDE. CDE loads, dtlogin starts and presents the login
> or greeter window, but upon logging in I get a popup telling
> "The desktop messaging system could not be started". Guess I am
> missing some config steps. Any pointer for further reading?
> I had CDE running on 13.2Release

Make sure your /etc/hosts has an entry for localhost. Also make sure your machine's PTR record is correct and that it matches its A record.

It's most likely your hostname doesn't match any IP on your network. Just add your correct hostname's IP to /etc/hosts or if using dhcp, prefix your hostname to the localhost entry in /etc/hosts like this,

127.0.0.1 my_hostname_whatever_it_is localhost localhost.my.domain

Also make sure rpcbind is running.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
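A quick way to check the /etc/hosts part of the advice above is a small sh helper along these lines. It is only a sketch: the function name hosts_has_name is made up for this example, and it looks at the hosts file alone, so it says nothing about DNS (PTR/A records) or whether rpcbind is running.

```shell
#!/bin/sh
# hosts_has_name FILE NAME
#   Succeeds if NAME appears as a host name or alias (any field after
#   the address) on a non-comment line of a hosts(5)-style FILE.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

Typical use would be hosts_has_name /etc/hosts "$(hostname)" || echo "hostname missing from /etc/hosts", run before starting dtlogin or xdm.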
autofs -hosts maps
Hi,

The discussion about NFS exports of ZFS snapshots prompted me to play around with -hosts maps on my network. -hosts maps are mounted on /net.

I've discovered that -hosts maps don't work with most shares but do with others. I've only played with this for a few minutes so I don't fully understand why some maps work and others do not. Some of the underlying directories that don't work are ZFS while others are UFS. Yet auto_home maps mounting the same directories do work, and mounting the shares by hand (using mount_nfs) also works.

Just putting this out there should someone else have noticed this. I'll play around with this a little over the weekend.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: revision not displayed in a2440348eed7
On Wed, 8 Nov 2023 15:14:34 +0100 Marek Zarychta wrote: > W dniu 8.11.2023 o 14:10, Marek Zarychta pisze: > > > > W dniu 27.09.2023 o 01:07, Tomoaki AOKI pisze: > >> On Tue, 26 Sep 2023 15:19:46 -0700 > >> Cy Schubert wrote: > >> > >>> In message <20230926231431.20f42fec1075c3980446c...@dec.sakura.ne.jp>, > >>> Tomoaki > >>> AOKI writes: > >>>> On Tue, 26 Sep 2023 15:48:50 +0200 > >>>> Marek Zarychta wrote: > >>>> > >>>>> W dniu 26.09.2023 o 13:30, KIRIYAMA Kazuhiko pisze: > >>>>>> At least up to 15.0-CURRENT, nothing has happend by > >>>>>> WITHOUT_REPRODUCIBLE_BUILD=yes. Something has changed in > >>>>>> 15.0-CURRENT at some time. I've rebuilded with 3fb80f1476c7, > >>>>>> but revision not showed by `uname -a' ;-( > >>>>>> > >>>>>> What changed > >>>>> Nothing changed. Perhaps your build system can't check git hash ? If > >>>>> your sources are from git repository, you need at least git-lite > >>>>> installed and full git repository available on build machine. If you > >>>>> checked out the repository with gitup and have gitup installed, it > >>>>> should also work. It won't work if your build machine has access to > >>>>> only a part of the repository like worktree. > >>>>> > >>>>> Cheers > >>>>> > >>>>> -- > >>>>> Marek Zarychta > >>>> Just a possibility, but copying src tree to directory other than the > >>>> directory where checked out from git repo and building there could > >>>> lose track with git hash. > >>>> > >>>> Another possibility is that if you build src with any user other than > >>>> the one owning local (pulled) git repo could also lose track with git > >>>> hash. For example, if I `git log HEAD` with regular user and the local > >>>> repo is pulled by root, it fails. No special configuration is done. 
> >>>> > >>>> % git log HEAD > >>>> fatal: detected dubious ownership in repository at '/usr/src' > >>>> To add an exception for this directory, call: > >>>> > >>>> git config --global --add safe.directory /usr/src > >>>> > >>>> > >>> This could be due to e6dc6a27230, which was committed this morning. > >>> There > >>> is discussion on the src commits ML (dev-commits-src-all, > >>> dev-commits-src-main) about reverting the change. > >>> > >>> > >>> -- > >>> Cheers, > >>> Cy Schubert > >>> FreeBSD UNIX: Web: https://FreeBSD.org > >>> NTP: Web: https://nwtime.org > >>> > >>> e^(i*pi)+1=0 > >> Would be unrelated here, unfortunately. > >> As the subject says, the commit the original reporter is bitten at (not > >> bi-sected) is at a2440348eed7, which is before e6dc6a27230. > > > > Let's refresh this thread. It looks like (at least for stable/14) > > build system doesn't hardcode revision into the kernel anymore. Last > > time it worked to me was just after branching stable/14. Today I tried > > to build kernel from sources mounted over NFS and I ened with: > > > > # strings /usr/obj/usr/src/amd64.amd64/sys/BSDONDELL/kernel | grep > > 14.0-STABLE > > @(#)FreeBSD 14.0-STABLE #6 -dirty: Tue Nov 7 14:04:35 CET 2023 > > FreeBSD 14.0-STABLE #6 -dirty: Tue Nov 7 14:04:35 CET 2023 > > 14.0-STABLE > > > > the source repository is updated, consisted, but mounted read-only > > over NFS > > > > /usr/src# git status > > On branch stable/14 > > Your branch is up to date with 'origin/stable/14'. > > > > Untracked files: > > (use "git add ..." to include in what will be committed) > > sys/amd64/conf/BSDONDELL > > > > It took 2.53 seconds to enumerate untracked files. > > See 'git help status' for information on how to improve this. > > > > nothing added to commit but untracked files present (use "git add" to > > track) > > > > > > Any clues what could be wrong ? Does /usr/src/ require write > > permissions now ? > > > I am sorry for the false alarm. 
It looks like using META MODE prevented > updating this info. Af
Re: Kernel with INVARIANTS panicing if drm is loaded
,
>     arg=0x80a472c8 ) at /usr/src/sys/dev/vt/vt_core.c:1018
> #30 0x8078ffcf in atkbd_intr (kbd=0x80cef898 ,
>     arg=) at /usr/src/sys/dev/atkbdc/atkbd.c:565
> #31 0x804b1376 in intr_event_execute_handlers (ie=0xf800010ece00,
>     p=) at /usr/src/sys/kern/kern_intr.c:1205
> #32 ithread_execute_handlers (ie=0xf800010ece00, p=)
>     at /usr/src/sys/kern/kern_intr.c:1218
> #33 ithread_loop (arg=arg@entry=0xf80001c5aea0)
>     at /usr/src/sys/kern/kern_intr.c:1306
> #34 0x804adae2 in fork_exit (
>     callout=0x804b1120 , arg=0xf80001c5aea0,
>     frame=0xfe00ce259f40) at /usr/src/sys/kern/kern_fork.c:1160
> #35
> #36 0x0b88 in ?? ()
> Backtrace stopped: Cannot access memory at address 0xbc7
> (kgdb)
>

Can you submit a PR for this?

GFP_KERNEL is an alias for M_WAITOK, which is verboten when intel_atomic_state_alloc() makes its call to kzalloc(), an alias for kmalloc().

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: how to set vfs.zfs.arc.max in 15-current ?
In message , void writes:
> Is there a new way to set arc.max in 15-current?
>
> It's no longer settable (except to "0") in main-n265801 (Oct 7th)
> while multiuser.
>
> # sysctl vfs.zfs.arc.max=8589934592
> vfs.zfs.arc.max: 0
> sysctl: vfs.zfs.arc.max=8589934592: Invalid argument

Try reducing your arc.max by a factor of 10. The "Invalid argument" error suggests that it's probably failing in param_set_arc_max() in the val >= arc_all_memory() comparison.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
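A rough way to test that theory before touching the sysctl is to compare the requested value with the machine's memory. This is only a sketch under stated assumptions: `arc_max_plausible` is a hypothetical helper, `hw.physmem` is used as a stand-in for whatever arc_all_memory() returns in the kernel, and any other bounds the kernel handler may apply are not modelled.

```shell
#!/bin/sh
# arc_max_plausible REQUESTED_BYTES PHYSMEM_BYTES  (hypothetical helper)
# Mirrors only the comparison named in the mail: a non-zero request
# that is >= total memory is rejected. 0 is accepted because the
# original report says 0 can still be set.
arc_max_plausible() {
    req=$1
    mem=$2
    [ "$req" -eq 0 ] && return 0
    [ "$req" -lt "$mem" ]
}
```

For example, `arc_max_plausible 8589934592 "$(sysctl -n hw.physmem)" || echo "request is not below physical memory"` would flag an 8 GiB request on a machine that reports slightly less than 8 GiB of usable memory.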
Re: revision not displayed in a2440348eed7
In message <20230926231431.20f42fec1075c3980446c...@dec.sakura.ne.jp>, Tomoaki AOKI writes:
> On Tue, 26 Sep 2023 15:48:50 +0200
> Marek Zarychta wrote:
>
> > On 26.09.2023 at 13:30, KIRIYAMA Kazuhiko wrote:
> > > At least up to 15.0-CURRENT, nothing has happend by
> > > WITHOUT_REPRODUCIBLE_BUILD=yes. Something has changed in
> > > 15.0-CURRENT at some time. I've rebuilded with 3fb80f1476c7,
> > > but revision not showed by `uname -a' ;-(
> > >
> > > What changed
> >
> > Nothing changed. Perhaps your build system can't check git hash ? If
> > your sources are from git repository, you need at least git-lite
> > installed and full git repository available on build machine. If you
> > checked out the repository with gitup and have gitup installed, it
> > should also work. It won't work if your build machine has access to
> > only a part of the repository like worktree.
> >
> > Cheers
> >
> > --
> > Marek Zarychta
>
> Just a possibility, but copying src tree to directory other than the
> directory where checked out from git repo and building there could
> lose track with git hash.
>
> Another possibility is that if you build src with any user other than
> the one owning local (pulled) git repo could also lose track with git
> hash. For example, if I `git log HEAD` with regular user and the local
> repo is pulled by root, it fails. No special configuration is done.
>
> % git log HEAD
> fatal: detected dubious ownership in repository at '/usr/src'
> To add an exception for this directory, call:
>
>         git config --global --add safe.directory /usr/src
>

This could be due to e6dc6a27230, which was committed this morning. There is discussion on the src commits ML (dev-commits-src-all, dev-commits-src-main) about reverting the change.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
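To tell quickly whether a kernel version string carries the git revision at all, a pattern match is enough. The sketch below is an illustration, not something from the thread: `has_git_rev` is a hypothetical helper, and the `nCOUNT-HASH` pattern (a commit count followed by a 12-character hash) is inferred only from the version strings quoted in this archive, such as `komquats-n265075-2e8edbc285cf` and `main-n264924-e2340276fc73`.

```shell
#!/bin/sh
# has_git_rev VERSION_STRING  (hypothetical helper)
# Succeeds if the string contains something shaped like
# "n<commit count>-<12 hex digits>", the form seen in the uname
# output quoted elsewhere in this archive.
has_git_rev() {
    printf '%s\n' "$1" | grep -Eq 'n[0-9]+-[0-9a-f]{12}'
}
```

On a running system this could be used as `has_git_rev "$(uname -v)" || echo "no git revision in kernel ident"`. If the check fails, the ownership problem shown above (`git log HEAD` failing with "dubious ownership") is one of the causes suggested in the thread.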
Re: ZFS Panics Still
On Tue, 12 Sep 2023 05:29:41 +0100
Graham Perrin wrote:

> On 12/09/2023 00:17, Cy Schubert wrote:
>
> > … poudriere …
>
> > panic: vm_page_dequeue_deferred: page 0xfe000b7e9748 has unexpected
> > queue state
> > …
>
> <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265795> is for arm64.
> Should we broaden the hardware field, there?

Probably.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
ZFS Panics Still
294, a = {{flags = 16, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711696}, order = 13 '\r', pool = 0 '\000', flags = 0 '\000', oflags = 0 '\000', psind = 0 '\000', segind = 5 '\005', valid = 255 '\377', dirty = 0 '\000'} (kgdb) At frame 13 *vfsp contains: $7 = {mnt_vfs_ops = 1, mnt_kern_flag = 1090847177, mnt_flag = 268439568, mnt_pcpu = 0xfe010d84cbc0, mnt_rootvnode = 0x0, mnt_vnodecovered = 0xf8008dc691c0, mnt_op = 0x83bb3080 , mnt_vfc = 0x83bb3228 , mnt_mtx = {lock_object = { lo_name = 0x80abf68d "struct mount mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0xf8021fd75b00}, mtx_lock = 0}, mnt_gen = 1, mnt_list = {tqe_next = 0x0, tqe_prev = 0xfe00c45f2168}, mnt_syncer = 0x0, mnt_ref = 27, mnt_nvnodelist = { tqh_first = 0xf800665c5000, tqh_last = 0xf8007cc90aa8}, mnt_nvnodelistsize = 24, mnt_writeopcount = 1, mnt_opt = 0xf80084d56cd0, mnt_optnew = 0x0, mnt_stat = {f_version = 538182936, f_type = 222, f_flags = 268439568, f_bsize = 512, f_iosize = 131072, f_blocks = 251997486, f_bfree = 248369646, f_bavail = 248369646, f_files = 248516350, f_ffree = 248369646, f_syncwrites = 0, f_asyncwrites = 0, f_syncreads = 0, f_asyncreads = 0, f_nvnodelistsize = 113, f_spare0 = 0, f_spare = {0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {-313067424, 1444670686}}, f_charspare = '\000' , f_fstypename = "zfs", '\000' , f_mntfromname = "bob/poudriere/bob/jails/HEADi386-new-ports-ref/04", '\000' , f_mntonname = "/poudriere/bob/data/.m/HEADi386-new-ports/04", '\000' }, mnt_cred = 0xf800c83cb200, mnt_data = 0xf800b713e000, mnt_time = 0, mnt_iosize_max = 65536, mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 1242221059, mnt_lockref = 0, mnt_secondary_writes = 0, mnt_secondary_accwrites = 0, mnt_susp_owner = 0x0, mnt_exjail = 0x0, mnt_gjprovider = 0x0, mnt_listmtx = {lock_object = { lo_name = 0x80b1539e "struct mount vlist mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0xf8021fd82a80}, mtx_lock = 0}, mnt_lazyvnodelist = {tqh_first = 
0x0, tqh_last = 0xfe00da21d550}, mnt_lazyvnodelistsize = 0, mnt_upper_pending = 0, mnt_explock = {lock_object = { lo_name = 0x80b6167f "explock", lo_flags = 108199936, lo_data = 0, lo_witness = 0xf8021fd82880}, lk_lock = 1, lk_exslpfail [...] tqh_first = 0x0, tqh_last = 0xfe00da21d590}, mnt_notify = { tqh_first = 0x0, tqh_last = 0xfe00da21d5a0}, mnt_taskqueue_link = { stqe_next = 0x0}, mnt_taskqueue_flags = 0, mnt_unmount_retries = 0}

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Possible regression in main causing poor performance
In message , Mark Millard writes:
> On Sep 5, 2023, at 08:58, Cy Schubert wrote:
>
> > In message <20230830204406.24fd...@slippy.cwsent.com>, Cy Schubert writes:
> >> In message <20230830184426.gm1...@freebsd.org>, Glen Barber writes:
> >>>
> >>>
> >>> On Mon, Aug 28, 2023 at 06:06:09PM -0700, Mark Millard wrote:
> >>>> Has any more been learned about this? Is it still an issue?
> >>>>
> >>>
> >>> I rebooted the machine before the ALPHA3 builds with no other changes,
> >>> and the overall times for 14.x builds went back to normal. I do not
> >>> like to experiment with builders during a release cycle, but as we are
> >>> going to have 15.x snapshots available moving forward, I will not reboot
> >>> that machine next week in hopes to get some useful data.
> >>>
> >>> If my memory serves correctly, mm@ has a pending ZFS import from
> >>> upstream for both main and stable/14 pending. Whether or not that will
> >>> resolve any issue here, I do not know.
> >>
> >> Two of my poudriere builder machines have experienced different panics
> >> since the ZFS import two days ago. The problems have been documented on the
> >> -current list.
> >
> > Just an update.
> >
> > The three pull requests amotin@ pointed to did resolve all my problems. A
> > subsequent update which included the latest ZFS commits worked just as
> > well, without any new regressions. AFAIAC this problem has been resolved.
> >
> > The random email corruptions have also been resolved.
> >
> >
> > --
> > Cheers,
> > Cy Schubert
> > FreeBSD UNIX: Web: https://FreeBSD.org
> > NTP: Web: https://nwtime.org
> >
> > e^(i*pi)+1=0
> >
> >
> >
> >
> > =C2=9C9O8
>
> The just-above quoted line looks like a corruption to me.
>

Hmm. Just to rule out that a build of the exmh2 and nmh-devel packages might have been corrupt, I've rebuilt the two and will continue to monitor.

This email was sent by a rebuilt exmh2 and nmh-devel.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Possible regression in main causing poor performance
In message <20230830184426.gm1...@freebsd.org>, Glen Barber writes:
>
>
> On Mon, Aug 28, 2023 at 06:06:09PM -0700, Mark Millard wrote:
> > Has any more been learned about this? Is it still an issue?
> >
>
> I rebooted the machine before the ALPHA3 builds with no other changes,
> and the overall times for 14.x builds went back to normal. I do not
> like to experiment with builders during a release cycle, but as we are
> going to have 15.x snapshots available moving forward, I will not reboot
> that machine next week in hopes to get some useful data.
>
> If my memory serves correctly, mm@ has a pending ZFS import from
> upstream for both main and stable/14 pending. Whether or not that will
> resolve any issue here, I do not know.

Two of my poudriere builder machines have experienced different panics since the ZFS import two days ago. The problems have been documented on the -current list.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Another ZFS Panic -- buffer modified while frozen
A different panic on a different amd64 machine also running poudriere but building amd64 packages. Exmh was just started, displaying back to my laptop at the time of panic.

panic: buffer modified while frozen!
cpuid = 1
time = 1693417762
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe008e67fba0
vpanic() at vpanic+0x132/frame 0xfe008e67fcd0
panic() at panic+0x43/frame 0xfe008e67fd30
arc_cksum_verify() at arc_cksum_verify+0x12c/frame 0xfe008e67fd80
arc_buf_destroy_impl() at arc_buf_destroy_impl+0x6f/frame 0xfe008e67fdc0
arc_buf_destroy() at arc_buf_destroy+0xd5/frame 0xfe008e67fdf0
dbuf_destroy() at dbuf_destroy+0x60/frame 0xfe008e67fe40
dbuf_evict_one() at dbuf_evict_one+0x176/frame 0xfe008e67fe70
dbuf_evict_thread() at dbuf_evict_thread+0x345/frame 0xfe008e67fef0
fork_exit() at fork_exit+0x82/frame 0xfe008e67ff30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe008e67ff30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 3h46m10s
Dumping 1962 out of 8122 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
57      __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu ,
(kgdb) bt
#0  __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=1)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:405
#2  0x806c1b30 in kern_reboot (howto=260)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:526
#3  0x806c202f in vpanic (
    fmt=0x83d82b7c "buffer modified while frozen!",
    ap=ap@entry=0xfe008e67fd10)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:970
#4  0x806c1dd3 in panic (fmt=)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:894
#5  0x83ae5f2c in arc_cksum_verify (buf=0xf80188cde180)
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/arc.c:1475
#6  0x83ae99ff in arc_buf_destroy_impl (
    buf=buf@entry=0xf80188cde180)
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/arc.c:3113
#7  0x83ae9625 in arc_buf_destroy (buf=0xf80188cde180,
    tag=tag@entry=0xf80104a534c8)
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/arc.c:3889
#8  0x83b0eee0 in dbuf_destroy (db=db@entry=0xf80104a534c8)
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/dbuf.c:2983
#9  0x83b17996 in dbuf_evict_one ()
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/dbuf.c:781
--Type for more, q to quit, c to continue without paging--c
#10 0x83b0c345 in dbuf_evict_thread (unused=)
    at /opt/src/git-src/sys/contrib/openzfs/module/zfs/dbuf.c:819
#11 0x80677ab2 in fork_exit (
    callout=0x83b0c000 , arg=0x0,
    frame=0xfe008e67ff40) at /opt/src/git-src/sys/kern/kern_fork.c:1160
#12
(kgdb)

FreeBSD cwsys 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 150 #4 komquats-n265089-b22aae410bc7: Wed Aug 30 04:38:24 PDT 2023 root@cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK2 amd64

Almost the same configuration as the other machine.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: ZFS Page Derefrence
In message , Mark Johnston writes:
> On Tue, Aug 29, 2023 at 07:08:35PM -0700, Cy Schubert wrote:
> > Hi
> >
> > Just got the following panic on an amd64 machine running poudriere building
> > i386 packages.
> >
> > panic: vm_page_dequeue_deferred: page 0xfe000b222808 has unexpected
> > queue state^M
> > [...]
> >
> > uname reports,
> >
> > FreeBSD bob 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 150 #1
> > komquats-n265075-2e8edbc285cf: Tue Aug 29 03:51:59 PDT 2023
> > root@cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK2 amd64
> >
> > My BREAK2 kernel removes devices I don't use and enables keystrokes to
> > interrupt the system from the console (conserver). Local patches affect
> > ipfilter only.
> >
> > Head of core.txt:
> >
> > __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
> > 57      __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu ,
> > (kgdb) #0  __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
> > #1  doadump (textdump=textdump@entry=1)
> >     at /opt/src/git-src/sys/kern/kern_shutdown.c:405
> > #2  0x806c1b30 in kern_reboot (howto=260)
> >     at /opt/src/git-src/sys/kern/kern_shutdown.c:526
> > #3  0x806c202f in vpanic (
> >     fmt=0x80b5da55 "%s: page %p has unexpected queue state",
> >     ap=ap@entry=0xfe00bf55d770)
> >     at /opt/src/git-src/sys/kern/kern_shutdown.c:970
> > #4  0x806c1dd3 in panic (fmt=)
> >     at /opt/src/git-src/sys/kern/kern_shutdown.c:894
> > #5  0x809daab2 in vm_page_dequeue_deferred (m=,
> >     m@entry=0xfe000b222808) at /opt/src/git-src/sys/vm/vm_page.c:3790
> > #6  0x809ddfeb in vm_page_free_prep (m=m@entry=0xfe000b222808)
> >     at /opt/src/git-src/sys/vm/vm_page.c:3928
>
> Could you please print/x *m from this frame?

Sure.

(kgdb) print/x *m
$1 = {plinks = {q = {tqe_next = 0x, tqe_prev = 0x}, s = {ss = {
        sle_next = 0x}}, memguard = {p = 0x, v = 0x}, uma = {slab = 0x,
      zone = 0x}}, listq = {tqe_next = 0x, tqe_prev = 0x}, object = 0x0,
  pindex = 0x572c, phys_addr = 0x1b67d5000, md = {pv_list = {
      tqh_first = 0x0, tqh_last = 0xfe000b222840}, pv_gen = 0xf4a,
    pat_mode = 0x6}, ref_count = 0x0, busy_lock = 0xfffe, a = {{
      flags = 0x10, queue = 0xff, act_count = 0x0}, _bits = 0xff0010},
  order = 0xd, pool = 0x0, flags = 0x1, oflags = 0x0, psind = 0x0,
  segind = 0x5, valid = 0xff, dirty = 0x0}
(kgdb)

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
ZFS Page Derefrence
dule/os/freebsd/zfs/kmod_core.c:168
#16 0x8054b482 in devfs_ioctl (ap=0xfe00bf55dc50)
    at /opt/src/git-src/sys/fs/devfs/devfs_vnops.c:933
#17 0x807cf032 in vn_ioctl (fp=0xf801909f6870, com=,
    data=0xfe00bf55dd50, active_cred=0xf800b6ed0b00, td=)
    at /opt/src/git-src/sys/kern/vfs_vnops.c:1701
#18 0x8054bb5e in devfs_ioctl_f (fp=, fp@entry=, com=, com@entry=,
    data=, data@entry=, cred=, cred@entry=, td=, td@entry=)
    at /opt/src/git-src/sys/fs/devfs/devfs_vnops.c:864
#19 0x8073aca6 in fo_ioctl (fp=0xf801909f6870, com=3222821401,
    data=, active_cred=, td=0xfe00c3e43900)
    at /opt/src/git-src/sys/sys/file.h:366
#20 kern_ioctl (td=td@entry=0xfe00c3e43900, fd=4,
    com=com@entry=3222821401, data=, data@entry=0xfe00bf55dd50 "\017")
    at /opt/src/git-src/sys/kern/sys_generic.c:805
#21 0x8073a9b2 in sys_ioctl (td=0xfe00c3e43900, uap=0xfe00c3e43d00)
    at /opt/src/git-src/sys/kern/sys_generic.c:713
#22 0x80a73a88 in syscallenter (td=)
    at /opt/src/git-src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#23 amd64_syscall (td=0xfe00c3e43900, traced=0)
    at /opt/src/git-src/sys/amd64/amd64/trap.c:1197
#24
#25 0x191264a4fbca in ?? ()
Backtrace stopped: Cannot access memory at address 0x19125ca905c8
(kgdb)

*vp looks good. Dump is available if needed.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Possible issue with linux xattr support?
On August 27, 2023 12:55:23 PM PDT, Felix Palmen wrote:
>* Dmitry Chagin [20230827 22:46]:
>> On Sun, Aug 27, 2023 at 07:59:32PM +0200, Felix Palmen wrote:
>> > * Dmitry Chagin [20230827 20:54]:
>> > > 1. which fs are you using?
>> >
>> > ZFS.
>> >
>> > > 2. jailed?
>> >
>> > Yes, this is during building ports with poudriere.
>> >
>>
>> I think it's a weird prohibition on changing system namespace extattr
>> attributes, look to comments in extattr_check_cred()
>
>Maybe that's when I should finally start trying to understand the stuff
>in src.git ;)
>
>> I can fix this completely disabling exttatr for jailed proc,
>> however, it's gonna be bullshit, though
>
>Would probably be better than nothing. AFAIK, "Linux jails" are used a
>lot, probably with userlands from distributions actually using xattr.
>
>Cheers, Felix
>

If we are to break it to fix a problem, maybe a sysctl to enable/disable then?

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.
Re: kabylake + drm-515-kmod/drm-510-kmod hangs
In message <76275772-a9c3-ed59-5fb3-47a13d2a6...@nomadlogic.org>, Pete Wright writes:
> hey there,
> i've got a kabylake laptop that i've been using with drm-kmod for
> several years without much hassle. after upgrading to a new CURRENT
> this weekend I've found that when loading either the 510 or 515 drm-kmod
> kernel modules my system will hang.
>
> unfortunately i am not getting a panic or crash, the screen stops
> updating and i am unable to ping or SSH into the system. interestingly
> the capslock LED still toggles but doing a CTL+ALT+DEL does not seem to
> do anything useful and i have to manually power cycle.
>
> any tips for finding out what's going on? i've booted the system with
> verbose dmesg output, and loaded the module with "kldload -v" but do not
> get any useful output.
>
> here's the uname:
> FreeBSD colony 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #0
> main-n264924-e2340276fc73: Sun Aug 20 21:28:44 PDT 2023
> pete@colony:/usr/obj/usr/home/pete/git/freebsd/amd64.amd64/sys/GENERIC amd64
>
>
> these are the log messages i see before the system locks up:
> Aug 21 10:40:34 colony kernel: iic0: on iicbus0
> Aug 21 10:40:35 colony kernel: drmn0: on vgapci0
> Aug 21 10:40:35 colony kernel: vgapci0: child drmn0 requested pci_enable_io
> Aug 21 10:40:35 colony syslogd: last message repeated 1 times
> Aug 21 10:40:35 colony kernel: [drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-19).
> Aug 21 10:40:35 colony kernel: [drm] Got stolen memory base 0x4b80, size 0x400
> Aug 21 10:40:35 colony kernel: lkpi_iic0: on drmn0
> Aug 21 10:40:35 colony kernel: iicbus1: on lkpi_iic0
> Aug 21 10:40:35 colony kernel: iic1: on iicbus1
> Aug 21 10:40:35 colony kernel: lkpi_iic1: on drmn0
> Aug 21 10:40:35 colony kernel: iicbus2: on lkpi_iic1
> Aug 21 10:40:35 colony kernel: iic2: on iicbus2
> Aug 21 10:40:35 colony kernel: lkpi_iic2: on drmn0
> Aug 21 10:40:35 colony kernel: iicbus3: on lkpi_iic2
> Aug 21 10:40:35 colony kernel: iic3: on iicbus3
> Aug 21 10:40:35 colony kernel: lkpi_iic3: on drmn0
> Aug 21 10:40:35 colony kernel: iicbus4: on lkpi_iic3
> Aug 21 10:40:35 colony kernel: iic4: on iicbus4
>
>
>
> cheers,
> -pete
>
> --
> Pete Wright
> p...@nomadlogic.org
> @nomadlogicLA
>

Rebuilding drm-51[05]-kmod is required after an update to LinuxKPI that affects the ABI used by the drm modules. Typically I get a kernel panic on a page fault when this occurs. Depending on how memory is laid out on your system you may get a hang instead.

You need to install the new kernel and world first. Disable xdm, gdm, any other *dm, or simply do not use startx. From a text console session, rebuild the drm port and reinstall it.

I use poudriere here. My procedure is to update the poudriere jail, rebuild the port (-C option) and pkg upgrade -f or pkg install -f. Use this approach if you use poudriere.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
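The poudriere procedure described above could look roughly like the following. This is a sketch under assumptions, not the author's exact commands: `drm_rebuild_plan` is a hypothetical helper that only prints the commands rather than running them, the jail name is a placeholder, and it presumes the poudriere package repository is already configured for pkg on the target machine. Installing the new kernel and world and disabling the display manager happen before this and are not shown.

```shell
#!/bin/sh
# drm_rebuild_plan JAIL PORT  (hypothetical helper; prints, does not execute)
# JAIL is the poudriere jail name, PORT a ports origin such as
# graphics/drm-515-kmod. The package name is taken from the last
# path component of PORT.
drm_rebuild_plan() {
    jail=$1
    port=$2
    pkgname=${port##*/}
    cat <<EOF
poudriere jail -u -j $jail
poudriere bulk -C -j $jail $port
pkg upgrade -f $pkgname
EOF
}
```

Running `drm_rebuild_plan main-amd64 graphics/drm-515-kmod` (with `main-amd64` standing in for a real jail name) prints the three steps in order: update the jail, rebuild only the named port with `-C`, then force-reinstall the package from a text console.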
Re: Defaulting serial communication to 115200 bps for FreeBSD 14
In message , Ed Maste writes:
> FreeBSD currently uses 9600 bps as the default for serial
> communication -- in the boot loader, kernel serial console, /etc/ttys,
> and so on. This was consistent with most equipment in the 90s, when
> these defaults were established. Today 115200 bps seems to be much
> more common, and I'm proposing that we make it the default for FreeBSD
> 14.0.
>
> I have a review open: https://reviews.freebsd.org/D36295. There are a
> few minor nits in the review to be addressed still but assuming
> there's general agreement I'll iterate on those and commit this in a
> few logical chunks.
>

There should probably be an UPDATING entry for those who use boot0 to revert back to 9600 in that case.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: ZFS deadlock in 14
bb112e in sleepq_wait (wchan=, > wchan@entry=0xf80108fe1540, pri=, pri@entry=0) at > /usr/src/sys/kern/subr_sleepqueue.c:660 >#4 0x80ade224 in _cv_wait (cvp=0xf80108fe1540, > lock=0xf80108fe14d0) at /usr/src/sys/kern/kern_condvar.c:146 >#5 0x820b383b in txg_wait_synced_impl (dp=0xf80108fe1000, > txg=8751529, txg@entry=0, wait_sig=wait_sig@entry=0) at > /usr/src/sys/contrib/openzfs/module/zfs/txg.c:726 >#6 0x820b31eb in txg_wait_synced (dp=, > txg=, txg@entry=0) at > /usr/src/sys/contrib/openzfs/module/zfs/txg.c:736 >#7 0x81fa5fc5 in zfsvfs_teardown (zfsvfs=0xf81ab3c81000, > unmounting=unmounting@entry=0) at > /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:1661 >#8 0x81fa5db9 in zfs_suspend_fs (zfsvfs=) at > /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:1954 >#9 0x821680ff in zfs_ioc_rollback (fsname=0xfe0301913000 > "zroot-default-ref/03", fsname@entry= available>, innvl=, innvl@entry= is not available>, >outnvl=0xf81601748640, outnvl@entry= is not available>) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:4401 >#10 0x82163836 in zfsdev_ioctl_common (vecnum=vecnum@entry=25, > zc=zc@entry=0xfe0301913000, flag=flag@entry=0) at > /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:7798 >#11 0x81f969aa in zfsdev_ioctl (dev=, > zcmd=, zcmd@entry= available>, arg=0xfe02fd546d50 "\017", arg@entry= value is not available>, flag=, td=) >at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:168 >#12 0x809dc9cc in devfs_ioctl (ap=0xfe02fd546c40) at > /usr/src/sys/fs/devfs/devfs_vnops.c:935 >#13 0x80c5cac0 in vn_ioctl (fp=0xf81e9207f0a0, com= out>, data=0xfe02fd546d50, active_cred=0xf8026a65a900, > td=) at /usr/src/sys/kern/vfs_vnops.c:1697 >#14 0x809dd07e in devfs_ioctl_f (fp=, fp@entry= reading variable: value is not available>, com=, > com@entry=, > data=, data@entry= available>, >cred=, cred@entry= available>, td=, td@entry= available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:866 >#15 0x80bca1ce in fo_ioctl 
(fp=0xf81e9207f0a0, com=3222821401,
> data=, active_cred=, td=) at
> /usr/src/sys/sys/file.h:367
>#16 kern_ioctl (td=td@entry=0xfe0314249020, fd=,
> com=com@entry=3222821401, data=, data@entry=0xfe02fd546d50
> "\017") at /usr/src/sys/kern/sys_generic.c:807
>#17 0x80bc9f64 in sys_ioctl (td=0xfe0314249020,
> td@entry=,
> uap=0xfe0314249420, uap@entry= available>) at /usr/src/sys/kern/sys_generic.c:715
>#18 0x8104d8e0 in syscallenter (td=) at
> /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
>#19 amd64_syscall (td=0xfe0314249020, traced=0) at
> /usr/src/sys/amd64/amd64/trap.c:1199
>#20
>#21 0x05c8e125953a in ?? ()
>Backtrace stopped: Cannot access memory at address 0x5c8d89c8018
>
>DES

Yes, this is the same panic my poudriere builder building amd64 packages gets. The poudriere builder, also running on amd64, building i386 packages gets a different panic.

I'm on my phone and don't have a keyboard to look up the PR number.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.
Re: ZFS deadlock in 14
The poudriere build machine building amd64 packages also panicked, but with:

Dumping 2577 out of 8122 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:59
59      __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu ,
(kgdb) #0  __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:59
#1  doadump (textdump=textdump@entry=1)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:407
#2  0x806c10e0 in kern_reboot (howto=260)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:528
#3  0x806c15df in vpanic (
    fmt=0x80b6c5f5 "%s: possible deadlock detected for %p (%s), blocked for %d ticks\n",
    ap=ap@entry=0xfe008e698e90)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:972
#4  0x806c1383 in panic (fmt=)
    at /opt/src/git-src/sys/kern/kern_shutdown.c:896
#5  0x8064a5ea in deadlkres ()
    at /opt/src/git-src/sys/kern/kern_clock.c:201
#6  0x80677632 in fork_exit (callout=0x8064a2c0 , arg=0x0,
    frame=0xfe008e698f40) at /opt/src/git-src/sys/kern/kern_fork.c:1162
#7
(kgdb)

This is consistent with PR/271945. Reducing -J to 1 or 5:1 circumvents this panic.

This is certainly a different panic from the one experienced on the poudriere builder building i386 packages. Both machines run in amd64 mode.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Cy Schubert writes:
> This is new. Instead of affecting the machine with poudriere building amd64
> packages, it affected the other machine with poudriere building i386
> packages. This is new since the two recent ZFS patches.
>
> Don't get me wrong, the two new patches have resulted in I believe better
> availability of the poudriere machine building amd64 packages. I doubt the
> two patches caused this but they may have exposed this problem, probably
> fixed by another patch or two.
>
> Sorry, there was no dump produced by this panic.
> I'll need to check the config of this machine, swap is a gmirror, which it
> doesn't like to dump to. Below are serial console messages captured by
> conserver.
>
> panic: vm_page_dequeue_deferred: page 0xfe00028fb0d0 has unexpected queue state
> cpuid = 3
> time = 1691807572
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00c50bc600
> vpanic() at vpanic+0x132/frame 0xfe00c50bc730
> panic() at panic+0x43/frame 0xfe00c50bc790
> vm_page_dequeue_deferred() at vm_page_dequeue_deferred+0xb2/frame 0xfe00c50bc7a0
> vm_page_free_prep() at vm_page_free_prep+0x11b/frame 0xfe00c50bc7c0
> vm_page_free_toq() at vm_page_free_toq+0x12/frame 0xfe00c50bc7f0
> vm_object_page_remove() at vm_object_page_remove+0xb6/frame 0xfe00c50bc850
> vn_pages_remove_valid() at vn_pages_remove_valid+0x48/frame 0xfe00c50bc880
> zfs_rezget() at zfs_rezget+0x35/frame 0xfe00c50bca60
> zfs_resume_fs() at zfs_resume_fs+0x1c8/frame 0xfe00c50bcab0
> zfs_ioc_rollback() at zfs_ioc_rollback+0x157/frame 0xfe00c50bcb00
> zfsdev_ioctl_common() at zfsdev_ioctl_common+0x612/frame 0xfe00c50bcbc0
> zfsdev_ioctl() at zfsdev_ioctl+0x12a/frame 0xfe00c50bcbf0
> devfs_ioctl() at devfs_ioctl+0xd2/frame 0xfe00c50bcc40
> vn_ioctl() at vn_ioctl+0xc2/frame 0xfe00c50bccb0
> devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfe00c50bccd0
> kern_ioctl() at kern_ioctl+0x286/frame 0xfe00c50bcd30
> sys_ioctl() at sys_ioctl+0x152/frame 0xfe00c50bce00
> amd64_syscall() at amd64_syscall+0x138/frame 0xfe00c50bcf30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe00c50bcf30
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x20938296107a, rsp = 0x209379aeee18, rbp = 0x209379aeee90 ---
> Uptime: 42m33s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> cpu_reset: Restarting BSP
> cpu_reset_proxy: Stopped CPU 3
>
> --
> Cheers,
> Cy Schubert
> FreeBSD UNIX: Web: https://FreeBSD.org
> NTP: Web: https://nwtime.org
>
> e^(i*pi)+1=0
>
> Cy Schubert writes:
> > I haven't experienced any problems (yet) either.
> >
> > --
> > Cheers,
> > Cy Schubert
> > FreeBSD UNIX: Web: https://FreeBSD.org
> > NTP: Web: https://nwtime.org
> >
> > e^(i*pi)+1=0
> >
> > In message , Kevin Bowling writes:
> > > The two MFVs on head have improved/fixed stability with po
Re: ZFS deadlock in 14
I haven't experienced any problems (yet) either.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

In message , Kevin Bowling writes:
> The two MFVs on head have improved/fixed stability with poudriere for
> me 48 core bare metal.
>
> On Thu, Aug 10, 2023 at 6:37 AM Cy Schubert wrote:
> >
> > In message , Kevin Bowling writes:
> > > Possibly https://github.com/openzfs/zfs/commit/2cb992a99ccadb78d97049b40bd442eb4fdc549d
> > >
> > > On Tue, Aug 8, 2023 at 10:08 AM Dag-Erling Smørgrav wrote:
> > > >
> > > > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a
> > > > deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4
> > > > August) and is 100% reproducible just by starting poudriere bulk in a
> > > > 16-core VM and waiting a few hours until deadlkres kicks in. In the
> > > > latest instance, deadlkres complained about a bash process:
> > > >
> > > > #0 sched_switch (td=td@entry=0xfe02fb1d8000, flags=flags@entry=259) at /usr/src/sys/kern/sched_ule.c:2299
> > > > #1 0x80b5a0a3 in mi_switch (flags=flags@entry=259) at /usr/src/sys/kern/kern_synch.c:550
> > > > #2 0x80babcb4 in sleepq_switch (wchan=0xf818543a9e70, pri=64) at /usr/src/sys/kern/subr_sleepqueue.c:609
> > > > #3 0x80babb8c in sleepq_wait (wchan=, pri=<unavailable>) at /usr/src/sys/kern/subr_sleepqueue.c:660
> > > > #4 0x80b1c1b0 in sleeplk (lk=lk@entry=0xf818543a9e70, flags=flags@entry=2121728, ilk=ilk@entry=0x0, wmesg=wmesg@entry=0x8222a054 "zfs", pri=, pri@entry=64, timo=timo@entry=6, queue=1) at /usr/src/sys/kern/kern_lock.c:310
> > > > #5 0x80b1a23f in lockmgr_slock_hard (lk=0xf818543a9e70, flags=2121728, ilk=, file=0x812544fb "/usr/src/sys/kern/vfs_subr.c", line=3057, lwa=0x0) at /usr/src/sys/kern/kern_lock.c:705
> > > > #6 0x80c59ec3 in VOP_LOCK1 (vp=0xf818543a9e00, flags=2105344, file=0x812544fb "/usr/src/sys/kern/vfs_subr.c", line=3057) at ./vnode_if.h:1120
> > > > #7 _vn_lock (vp=vp@entry=0xf818543a9e00, flags=2105344, file=, line=, line@entry=3057) at /usr/src/sys/kern/vfs_vnops.c:1815
> > > > #8 0x80c4173d in vget_finish (vp=0xf818543a9e00, flags=, vs=vs@entry=VGET_USECOUNT) at /usr/src/sys/kern/vfs_subr.c:3057
> > > > #9 0x80c1c9b7 in cache_lookup (dvp=dvp@entry=0xf802cd02ac40, vpp=vpp@entry=0xfe046b20ac30, cnp=cnp@entry=0xfe046b20ac58, tsp=tsp@entry=0x0, ticksp=ticksp@entry=0x0) at /usr/src/sys/kern/vfs_cache.c:2086
> > > > #10 0x80c2150c in vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:3068
> > > > #11 0x80c32c37 in VOP_LOOKUP (dvp=0xf802cd02ac40, vpp=0xfe046b20ac30, cnp=0xfe046b20ac58) at ./vnode_if.h:69
> > > > #12 vfs_lookup (ndp=ndp@entry=0xfe046b20abd8) at /usr/src/sys/kern/vfs_lookup.c:1266
> > > > #13 0x80c31ce1 in namei (ndp=ndp@entry=0xfe046b20abd8) at /usr/src/sys/kern/vfs_lookup.c:689
> > > > #14 0x80c52090 in kern_statat (td=0xfe02fb1d8000, flag=, fd=-100, path=0xa75b480e070 <error: Cannot access memory at address 0xa75b480e070>, pathseg=pathseg@entry=UIO_USERSPACE, sbp=sbp@entry=0xfe046b20ad18)
> > > >     at /usr/src/sys/kern/vfs_syscalls.c:2441
> > > > #15 0x80c52797 in sys_fstatat (td=, uap=0xfffffe02fb1d8400) at /usr/src/sys/kern/vfs
Re: ZFS deadlock in 14
ff8204075b in dsl_dataset_rollback (fsname=, fsname@entry=0xfe0401d15000 "zroot/poudriere/jails/13amd64-default-ref/15", tosnap=, owner=, result=result@entry=0xf81c826a9ea0)
> >     at /usr/src/sys/contrib/openzfs/module/zfs/dsl_dataset.c:3261
> > #10 0x82168dd9 in zfs_ioc_rollback (fsname=0xfe0401d15000 "zroot/poudriere/jails/13amd64-default-ref/15", fsname@entry=<error reading variable: value is not available>, innvl=, innvl@entry=,
> >     outnvl=0xf81c826a9ea0, outnvl@entry=<error reading variable: value is not available>) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:4405
> > #11 0x82164522 in zfsdev_ioctl_common (vecnum=vecnum@entry=25, zc=zc@entry=0xfe0401d15000, flag=flag@entry=0) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:7798
> > #12 0x81f97fca in zfsdev_ioctl (dev=, zcmd=, zcmd@entry= ble>, arg=0xfe02fb827d50 "\017", arg@entry=<error reading variable: value is not available>, flag=, td=)
> >     at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:168
> > #13 0x809d6212 in devfs_ioctl (ap=0xfe02fb827c50) at /usr/src/sys/fs/devfs/devfs_vnops.c:935
> > #14 0x80c585f2 in vn_ioctl (fp=0xf8052cdd80f0, com=<optimized out>, data=0xfe02fb827d50, active_cred=0xf80122ab1e00, td=) at /usr/src/sys/kern/vfs_vnops.c:1704
> > #15 0x809d68ee in devfs_ioctl_f (fp=, fp@entry=, com=, com@entry=, data= lable>, data@entry=,
> >     cred=, cred@entry=<error reading variable: value is not available>, td=, td@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:866
> > #16 0x80bc57e6 in fo_ioctl (fp=0xf8052cdd80f0, com=3222821401, data=, active_cred=, td=0xfe0422ef8560) at /usr/src/sys/sys/file.h:367
> > #17 kern_ioctl (td=td@entry=0xfe0422ef8560, fd=4, com=com@entry=3222821401, data=, data@entry=0xfffffe02fb827d50 "\017") at /usr/src/sys/kern/sys_generic.c:807
> > #18 0x80bc54f2 in sys_ioctl (td=0xfe0422ef8560, uap=0xfe0422ef8960) at /usr/src/sys/kern/sys_generic.c:715
> > #19 0x81049398 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
> > #20 amd64_syscall (td=0xfe0422ef8560, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1199

[...]

The backtrace looks different, though it certainly smells like PR/271945.

I've had panics similar to PR/271945 on an amd64 machine with a mirrored zpool with four vdevs, running poudriere with amd64 jails. My other amd64 machine, with a mirrored zpool with two vdevs and i386 jails, has no such issue. All other workloads are unaffected. On the affected machine, running poudriere bulk with -J N:1 circumvents the issue, so far.

There were two openzfs cherry-picks this morning. I intend to try them against a full bulk build later today.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: dhclient unable to negotiate on WPA2-Enterprise network (eduroam)
Pull request #787. I can look at it.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

In message , "Naman Sood" writes:
> Hi,
>
> wpa_supplicant-devel unfortunately did not fix my problem. However, applying
> this patch did:
> https://github.com/freebsd/freebsd-src/commit/b393d862dc78a99203455b01e685fb2108e51b05.
>
> Thanks,
> Naman.
> (they/them)
>
> On Sat, Jul 1, 2023, at 00:14, Cy Schubert wrote:
> > On Fri, 30 Jun 2023 10:56:54 -0700
> > Cy Schubert wrote:
> >
> > > Can you try wpa_supplicant-devel? It was updated last week. The -devel
> > > port tracks the latest WPA development.
> >
> > Now that I'm back at home, looking at hostap (our upstream w1.fi) commit
> > logs, there have been a few OpenSSL 3.0 patches applied to wpa since
> > wpa_supplicant/hostapd 2.10 was imported into FreeBSD base (on Jan 18,
> > 2022). Try the wpa_supplicant-devel port, it's current to the latest
> > upstream w1.fi commit. If it fixes your problem, I will import it into
> > FreeBSD base as well.
> >
> > I will backport a few patches applied to base into both ports next week.
> >
> > --
> > Cheers,
> > Cy Schubert
> > FreeBSD UNIX: Web: https://FreeBSD.org
> > NTP: Web: https://nwtime.org
> >
> > e^(i*pi)+1=0
Re: dhclient unable to negotiate on WPA2-Enterprise network (eduroam)
On Fri, 30 Jun 2023 10:56:54 -0700
Cy Schubert wrote:

> Can you try wpa_supplicant-devel? It was updated last week. The -devel port
> tracks the latest WPA development.

Now that I'm back at home, looking at hostap (our upstream w1.fi) commit logs, there have been a few OpenSSL 3.0 patches applied to wpa since wpa_supplicant/hostapd 2.10 was imported into FreeBSD base (on Jan 18, 2022). Try the wpa_supplicant-devel port, it's current to the latest upstream w1.fi commit. If it fixes your problem, I will import it into FreeBSD base as well.

I will backport a few patches applied to base into both ports next week.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: dhclient unable to negotiate on WPA2-Enterprise network (eduroam)
Can you try wpa_supplicant-devel? It was updated last week. The -devel port tracks the latest WPA development. -- Cheers, Cy Schubert FreeBSD UNIX:Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0 Pardon the typos. Small keyboard in use.
Re: Following a panic (271945): zpool status reports 1 data error but identifies no file
On June 11, 2023 5:58:49 AM PDT, Miroslav Lachman <000.f...@quip.cz> wrote:
>On 11/06/2023 14:02, Graham Perrin wrote:
>> See below, should I begin scrubbing? Or (before I begin) might zdb reveal
>> something useful?
>>
>> The supposed error was observable after
>> <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271945>
>>
>> 271945 – panic: deadlres_td_sleep_q: possible deadlock detected for
>> 0xfe0133324ac0 (stat), blocked for 1801328 ticks
>
>[..]
>
>> errors: Permanent errors have been detected in the following files:
>
>Can it be that the error was in file which is deleted now? Or was in snapshot
>which was already destroyed by some automatic script?
>
>Kind regards
>Miroslav Lachman

Zpool export/import or reboot may fix this.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.
Re: another crash and going forward with zfs
In message , Pawel Jakub Dawidek writes:
> On 4/18/23 05:14, Mateusz Guzik wrote:
> > On 4/17/23, Pawel Jakub Dawidek wrote:
> >> Correct me if I'm wrong, but from my understanding there were zero
> >> problems with block cloning when it wasn't in use or now disabled.
> >>
> >> The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to exactly
> >> avoid mess like this and give us more time to sort all the problems out
> >> while making it easy for people to try it.
> >>
> >> If there is no plan to revert the whole import, I don't see what value
> >> removing just block cloning will bring if it is now disabled by default
> >> and didn't cause any problems when disabled.
> >>
> >
> > The feature definitely was not properly stress tested and what not and
> > trying to do it keeps running into panics. Given the complexity of the
> > feature I would expect there are many bugs lurking, some of which
> > possibly related to the on disk format. Not having to deal with any of
> > this can be arranged as described above and is imo the most
> > sensible route given the timeline for 14.0
>
> Block cloning doesn't create, remove or modify any on-disk data until it
> is in use.
>
> Again, if we are not going to revert the whole merge, I see no point in
> reverting block cloning as until it is enabled, its code is not
> executed. This allows people who upgraded the pools to do nothing special
> and it will allow people to test it easily.

In this case zpool upgrade and zpool status should report that no feature upgrades are available instead of enticing users to run zpool upgrade. The userland zpool command should test for this sysctl and print nothing regarding block_cloning.

I can see a scenario where a user zpool upgrades their pools, notices the sysctl and does the unthinkable. Not only would this fill the mailing lists with angry chatter but it would spawn a number of PRs and give us a lot of bad press for data loss.

Should we keep the new ZFS in 14, we should:

1. Make sure that zpool(8) does not mention or offer block_cloning in any way if the sysctl is disabled.

2. Print a cautionary note in release notes advising people not to enable this experimental sysctl. Maybe even have it print "(experimental)" to warn users that it will hurt.

3. Update the man pages to caution that block_cloning is experimental and unstable.

It's not enough to have a sysctl without hiding block_cloning completely from view. Only expose it in zpool(8) when the sysctl is enabled. Let's avoid people mistakenly enabling it.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
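As a rough illustration of the first item, here is a minimal sh sketch, not how zpool(8) does or would do it. It assumes the gate is the vfs.zfs.bclone_enabled sysctl mentioned in the quoted text; the helper names bclone_visible and filter_features are hypothetical and exist only for this example.

```sh
#!/bin/sh
# Hypothetical helper: succeed only when the sysctl value passed in $1 is 1.
# The value is passed in rather than read here so the logic can be tried
# on a machine without ZFS.
bclone_visible() {
    case "$1" in
        1) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical filter: copy a feature listing from stdin to stdout, dropping
# every line that mentions block_cloning unless the sysctl value in $1 is 1.
filter_features() {
    if bclone_visible "$1"; then
        cat
    else
        # grep -v exits non-zero when it prints nothing; that is not an error here.
        grep -v 'block_cloning' || true
    fi
}
```

On a live system this could be tried as `zpool upgrade -v | filter_features "$(sysctl -n vfs.zfs.bclone_enabled)"`. A real fix would belong inside zpool(8) itself; a wrapper like this only shows the intended behaviour.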
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
In message <5a47f62d-0e78-4c3e-84c0-45eeb03c7...@yahoo.com>, Mark Millard writes:
> On Apr 15, 2023, at 07:36, Cy Schubert wrote:
>
> > In message <20230415115452.08911...@thor.intern.walstatt.dynvpn.de>, FreeBSD User writes:
> >> On Thu, 13 Apr 2023 22:18:04 -0700
> >> Mark Millard wrote:
> >>
> >>> On Apr 13, 2023, at 21:44, Charlie Li wrote:
> >>>
> >>>> Mark Millard wrote:
> >>>>> FYI: in my original report for a context that has never had
> >>>>> block_cloning enabled, I reported BOTH missing files and
> >>>>> file content corruption in the poudriere-devel bulk build
> >>>>> testing. This predates:
> >>>>> https://people.freebsd.org/~pjd/patches/brt_revert.patch
> >>>>> but had the changes from:
> >>>>> https://github.com/openzfs/zfs/pull/14739/files
> >>>>> The files were missing from packages installed to be used
> >>>>> during a port's build. No other types of examples of missing
> >>>>> files happened. (But only 11 ports failed.)
> >>>> I also don't have block_cloning enabled. "Missing files" prior to brt_revert may actually
> >>>> be present, but as the corruption also messes with the file(1) signature, some tools like
> >>>> ldconfig report them as missing.
> >>>
> >>> For reference, the specific messages that were not explicit
> >>> null-byte complaints were (some shown with a little context):
> >>>
> >>> ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - not found
> >>> ===> Installing existing package /packages/All/libxml2-2.10.3_1.pkg
> >>> [CA72_ZFS] Installing libxml2-2.10.3_1...
> >>> [CA72_ZFS] Extracting libxml2-2.10.3_1: .. done
> >>> ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - found
> >>> (/usr/local/lib/libxml2.so) . . .
> >>> [CA72_ZFS] Extracting libxslt-1.1.37: .. done
> >>> ===> py39-lxml-4.9.2 depends on shared library: libxslt.so - found
> >>> (/usr/local/lib/libxslt.so) ===> Returning to build of py39-lxml-4.9.2
> >>> . . .
> >>> ===> Configuring for py39-lxml-4.9.2
> >>> Building lxml version 4.9.2.
> >>> Building with Cython 0.29.33.
> >>> Error: Please make sure the libxml2 and libxslt development packages are installed.
> >>>
> >>> [CA72_ZFS] Extracting libunistring-1.1: .. done
> >>> ===> libidn2-2.3.4 depends on shared library: libunistring.so - not found
> >>>
> >>> [CA72_ZFS] Extracting gmp-6.2.1: .. done
> >>> ===> mpfr-4.2.0,1 depends on shared library: libgmp.so - not found
> >>>
> >>> ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
> >>> ===> Installing existing package /packages/All/gmp-6.2.1.pkg
> >>> [CA72_ZFS] Installing gmp-6.2.1...
> >>> the most recent version of gmp-6.2.1 is already installed
> >>> ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
> >>> *** Error code 1
> >>>
> >>> autom4te: error: need GNU m4 1.4 or later: /usr/local/bin/gm4
> >>>
> >>> checking for GNU M4 that supports accurate traces... configure: error: no acceptable m4 could be found in
> >>> $PATH. GNU M4 1.4.6 or later is required; 1.4.16 or newer is recommended.
> >>> GNU M4 1.4.15 uses a buggy replacement strstr on some systems.
> >>> Glibc 2.9 - 2.12 and GNU M4 1.4.11 - 1.4.15 have another strstr bug.
> >>>
> >>> ld: error: /usr/local/lib/libblkid.a: unknown file type
> >>>
> >>> ===
> >>> Mark Millard
> >>> marklmi at yahoo.com
> >>
> >> Hello
> >>
> >> what is the recent status of fixing/mitigating this disastrous bug? Especially f
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
On Sat, 15 Apr 2023 18:07:34 +0200
Florian Smeets wrote:

> On 15.04.23 17:51, FreeBSD User wrote:
> > On Sat, 15 Apr 2023 07:36:25 -0700
> > Cy Schubert wrote:
> >>
> >> With an up-to-date tree + pjd@'s "Fix data corruption when cloning embedded
> >> blocks. #14739" patch I didn't have any issues, except for email messages
> >> with corruption in my sent directory, nowhere else. I'm still investigating
> >> the email messages issue. IMO one is generally safe to run poudriere on the
> >> latest ZFS with the additional patch.
>
> This is also my current observation. I have 2 hosts where I was
> unfortunate enough to update at the wrong time. I currently *think* that
> I'm *not* seeing data corruption with head from April 12th and this
> patch
> https://github.com/openzfs/zfs/commit/d3a6e5ca3b2f684132238ca968bf0b96f17ec7e1.diff
> applied.
>
> One pool has been upgraded with feature@block_cloning and the other hasn't.
>
> > FreeBSD 14.0-CURRENT #8 main-n262175-5ee1c90e50ce: Sat Apr 15 07:57:16 CEST
> > 2023 amd64
> >
> > The box is crashing while trying to update ports with the well known issue:
> >
> > Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
>
> On the pool that has block_cloning enabled I see the above insta panic
> when poudriere starts building. I found a workaround though:
>
> --- /usr/local/share/poudriere/include/fs.sh.orig 2023-04-15 18:03:50.090823000 +0200
> +++ /usr/local/share/poudriere/include/fs.sh 2023-04-15 18:04:04.144736000 +0200
> @@ -295,7 +295,6 @@
>  fi
>
>  zfs clone -o mountpoint=${mnt} \
> - -o sync=disabled \
>  -o atime=off \
>  -o compression=off \
>  ${fs}@${snap} \
>
> With this workaround I was able to build thousands of packages without
> panics or failures due to data corruption.

Thanks for this. I'll test this next week. One should be able to test this by hand to capture a dump.

> Florian

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
In message <20230415175218.777d0...@thor.intern.walstatt.dynvpn.de>, FreeBSD User writes:
> On Sat, 15 Apr 2023 07:36:25 -0700
> Cy Schubert wrote:
>
> > In message <20230415115452.08911...@thor.intern.walstatt.dynvpn.de>, FreeBSD User writes:
> > > On Thu, 13 Apr 2023 22:18:04 -0700
> > > Mark Millard wrote:
> > >
> > > > On Apr 13, 2023, at 21:44, Charlie Li wrote:
> > > >
> > > > > Mark Millard wrote:
> > > > >> FYI: in my original report for a context that has never had
> > > > >> block_cloning enabled, I reported BOTH missing files and
> > > > >> file content corruption in the poudriere-devel bulk build
> > > > >> testing. This predates:
> > > > >> https://people.freebsd.org/~pjd/patches/brt_revert.patch
> > > > >> but had the changes from:
> > > > >> https://github.com/openzfs/zfs/pull/14739/files
> > > > >> The files were missing from packages installed to be used
> > > > >> during a port's build. No other types of examples of missing
> > > > >> files happened. (But only 11 ports failed.)
> > > > > I also don't have block_cloning enabled. "Missing files" prior to brt_revert may actually
> > > > > be present, but as the corruption also messes with the file(1) signature, some tools like
> > > > > ldconfig report them as missing.
> > > >
> > > > For reference, the specific messages that were not explicit
> > > > null-byte complaints were (some shown with a little context):
> > > >
> > > > ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - not found
> > > > ===> Installing existing package /packages/All/libxml2-2.10.3_1.pkg
> > > > [CA72_ZFS] Installing libxml2-2.10.3_1...
> > > > [CA72_ZFS] Extracting libxml2-2.10.3_1: .. done
> > > > ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - found
> > > > (/usr/local/lib/libxml2.so) . . .
> > > > [CA72_ZFS] Extracting libxslt-1.1.37: .. done
> > > > ===> py39-lxml-4.9.2 depends on shared library: libxslt.so - found
> > > > (/usr/local/lib/libxslt.so) ===> Returning to build of py39-lxml-4.9.2
> > > > . . .
> > > > ===> Configuring for py39-lxml-4.9.2
> > > > Building lxml version 4.9.2.
> > > > Building with Cython 0.29.33.
> > > > Error: Please make sure the libxml2 and libxslt development packages are installed.
> > > >
> > > > [CA72_ZFS] Extracting libunistring-1.1: .. done
> > > > ===> libidn2-2.3.4 depends on shared library: libunistring.so - not found
> > > >
> > > > [CA72_ZFS] Extracting gmp-6.2.1: .. done
> > > > ===> mpfr-4.2.0,1 depends on shared library: libgmp.so - not found
> > > >
> > > > ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
> > > > ===> Installing existing package /packages/All/gmp-6.2.1.pkg
> > > > [CA72_ZFS] Installing gmp-6.2.1...
> > > > the most recent version of gmp-6.2.1 is already installed
> > > > ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
> > > > *** Error code 1
> > > >
> > > > autom4te: error: need GNU m4 1.4 or later: /usr/local/bin/gm4
> > > >
> > > > checking for GNU M4 that supports accurate traces... configure: error: no acceptable m4 could be found in
> > > > $PATH. GNU M4 1.4.6 or later is required; 1.4.16 or newer is recommended.
> > > > GNU M4 1.4.15 uses a buggy replacement strstr on some systems.
> > > > Glibc 2.9 - 2.12 and GNU M4 1.4.11 - 1.4.15 have another strstr bug.
> > > >
> > > > ld: error: /usr/local/lib/libblkid.a: unknown file type
> > > >
> > > > ===
> > > > Mark Millard
> > > > marklmi at yahoo.com
> > >
> > > Hello
> > >
> > > what is the recent status of fixing/mitigating this desatr
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
rectory, nowhere else. I'm still investigating the email messages issue. IMO one is generally safe to run poudriere on the latest ZFS with the additional patch.

My tests of the additional patch concluded that it resolved my last problems, except for the sent email problem I'm still investigating. I'm sure there's a simple explanation for it, i.e. the email thread was corrupted by the EXDEV regression, which cannot be fixed by anything, even reverting to the previous ZFS. The data in those files will remain damaged regardless.

I cannot speak to the others who have had poudriere and other issues. I never had any problems with poudriere on top of the new ZFS.

WRT reverting pools with block_cloning to pools without it, your only option is to back up your pool, recreate it without block_cloning, and then restore your data.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
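The backup, recreate and restore sequence described above might look roughly like the following. This is a sketch under assumptions, not a tested procedure: the function name plan_recreate, the snapshot name "migrate", and the pool, backup file and vdev arguments are all hypothetical. The function only prints the commands it would run, so nothing is destroyed by trying it.

```sh
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the commands for
# recreating a pool without the block_cloning feature.
#   $1 = pool name, $2 = backup stream file, $3 = vdev specification
plan_recreate() {
    pool=$1
    backup=$2
    vdevs=$3
    # 1. Snapshot every dataset in the pool recursively.
    echo "zfs snapshot -r ${pool}@migrate"
    # 2. Write a replication stream of the whole pool to the backup file.
    echo "zfs send -R ${pool}@migrate > ${backup}"
    # 3. Destroy the old pool (irreversible once actually run).
    echo "zpool destroy ${pool}"
    # 4. Recreate it with the block_cloning feature explicitly disabled.
    echo "zpool create -o feature@block_cloning=disabled ${pool} ${vdevs}"
    # 5. Restore the datasets from the backup stream.
    echo "zfs receive -F ${pool} < ${backup}"
}
```

For example, `plan_recreate tank /backup/tank.zstream "mirror da0 da1"` prints the five steps for review. The backup file must live outside the pool being destroyed, and the backup should be verified before step 3 is run for real.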
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
In message , Mateusz Guzik writes:
> On 4/13/23, Cy Schubert wrote:
> > On Thu, 13 Apr 2023 19:54:42 +0900
> > Paweł Jakub Dawidek wrote:
> >
> > > On Apr 13, 2023, at 16:10, Cy Schubert wrote:
> > > >
> > > > In message <20230413070426.8a54f...@slippy.cwsent.com>, Cy Schubert writes:
> > > > > In message <20230413064252.1e5c1...@slippy.cwsent.com>, Cy Schubert writes:
> > > > > > In message , Mark Millard writes:
> > > > > > > [This just puts my prior reply's material into Cy's
> > > > > > > adjusted resend of the original. The To/Cc should
> > > > > > > be coomplete this time.]
> > > > > > >
> > > > > > > On Apr 12, 2023, at 22:52, Cy Schubert wrote:
> > > > > > >
> > > > > > > > In message , Mark Millard writes:
> > > > > > > >> From: Charlie Li wrote on
> > > > > > > >> Date: Wed, 12 Apr 2023 20:11:16 UTC :
> > > > > > > >>
> > > > > > > >>> Charlie Li wrote:
> > > > > > > >>>> Mateusz Guzik wrote:
> > > > > > > >>>>> can you please test poudriere with
> > > > > > > >>>>> https://github.com/openzfs/zfs/pull/14739/files
> > > > > > > >>>>>
> > > > > > > >>>> After applying, on the md(4)-backed pool regardless of block_cloning,
> > > > > > > >>>> the cy@ `cp -R` test reports no differing (ie corrupted) files. Will
> > > > > > > >>>> report back on poudriere results (no block_cloning).
> > > > > > > >>>>
> > > > > > > >>> As for poudriere, build failures are still rolling in. These are (and
> > > > > > > >>> have been) entirely random on every run. Some examples from this run:
> > > > > > > >>>
> > > > > > > >>> lang/php81:
> > > > > > > >>> - post-install: @${INSTALL_DATA} ${WRKSRC}/php.ini-development
> > > > > > > >>> ${WRKSRC}/php.ini-production ${WRKDIR}/php.conf ${STAGEDIR}/${PREFIX}/etc
> > > > > > > >>> - consumers fail to build due to corrupted php.conf packaged
> > > > > > > >>>
> > > > > > > >>> devel/ninja:
> > > > > > > >>> - phase: stage
> > > > > > > >>> - install -s -m 555
> > > > > > > >>> /wrkdirs/usr/ports/devel/ninja/work/ninja-1.11.1/ninja
> > > > > > > >>> /wrkdirs/usr/ports/devel/ninja/work/stage/usr/local/bin
> > > > > > > >>> - consumers fail to build due to corrupted bin/ninja packaged
> > > > > > > >>>
> > > > > > > >>> devel/netsurf-buildsystem:
> > > > > > > >>> - phase: stage
> > > > > > > >>> - mkdir -p
> > > > > > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles
> > > > > > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/testtools
> > > > > > > >>> for M in Makefile.top Makefile.tools Makefile.subdir Makefile.pkgconfig
> > > > > > > >>> Makefile.clang Makefile.gcc Makefile.norcroft Makefile.op
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
On Thu, 13 Apr 2023 19:54:42 +0900
Paweł Jakub Dawidek wrote:

> On Apr 13, 2023, at 16:10, Cy Schubert wrote:
> >
> > In message <20230413070426.8a54f...@slippy.cwsent.com>, Cy Schubert writes:
> > > In message <20230413064252.1e5c1...@slippy.cwsent.com>, Cy Schubert writes:
> > > > In message , Mark Millard writes:
> > > > > [This just puts my prior reply's material into Cy's
> > > > > adjusted resend of the original. The To/Cc should
> > > > > be coomplete this time.]
> > > > >
> > > > > On Apr 12, 2023, at 22:52, Cy Schubert wrote:
> > > > >
> > > > > > In message , Mark Millard writes:
> > > > > >> From: Charlie Li wrote on
> > > > > >> Date: Wed, 12 Apr 2023 20:11:16 UTC :
> > > > > >>
> > > > > >>> Charlie Li wrote:
> > > > > >>>> Mateusz Guzik wrote:
> > > > > >>>>> can you please test poudriere with
> > > > > >>>>> https://github.com/openzfs/zfs/pull/14739/files
> > > > > >>>>>
> > > > > >>>> After applying, on the md(4)-backed pool regardless of block_cloning,
> > > > > >>>> the cy@ `cp -R` test reports no differing (ie corrupted) files. Will
> > > > > >>>> report back on poudriere results (no block_cloning).
> > > > > >>>>
> > > > > >>> As for poudriere, build failures are still rolling in. These are (and
> > > > > >>> have been) entirely random on every run. Some examples from this run:
> > > > > >>>
> > > > > >>> lang/php81:
> > > > > >>> - post-install: @${INSTALL_DATA} ${WRKSRC}/php.ini-development
> > > > > >>> ${WRKSRC}/php.ini-production ${WRKDIR}/php.conf ${STAGEDIR}/${PREFIX}/etc
> > > > > >>> - consumers fail to build due to corrupted php.conf packaged
> > > > > >>>
> > > > > >>> devel/ninja:
> > > > > >>> - phase: stage
> > > > > >>> - install -s -m 555
> > > > > >>> /wrkdirs/usr/ports/devel/ninja/work/ninja-1.11.1/ninja
> > > > > >>> /wrkdirs/usr/ports/devel/ninja/work/stage/usr/local/bin
> > > > > >>> - consumers fail to build due to corrupted bin/ninja packaged
> > > > > >>>
> > > > > >>> devel/netsurf-buildsystem:
> > > > > >>> - phase: stage
> > > > > >>> - mkdir -p
> > > > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles
> > > > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/testtools
> > > > > >>> for M in Makefile.top Makefile.tools Makefile.subdir Makefile.pkgconfig
> > > > > >>> Makefile.clang Makefile.gcc Makefile.norcroft Makefile.open64; do \
> > > > > >>> cp makefiles/$M
> > > > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles/;
> > > > > >>> \
> > > > > >>> done
> > > > > >>> - graphics/libnsgif fails to build due to NUL characters in
> > > > > >>> Makefile.{clang,subdir}, causing nothing to link
> > > > > >>
> > > > > >> Summary: I have problems building ports into packages
> > > > > >> via poudriere-devel use despite being fully updated/patched
> > > > > >> (as of when I started the experiment), never having enabled
> > > > > >> block_cloning ( still using openzfs-2.1-freebsd ).
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
In message <20230413070426.8a54f...@slippy.cwsent.com>, Cy Schubert writes:
> In message <20230413064252.1e5c1...@slippy.cwsent.com>, Cy Schubert writes:
> > In message , Mark Millard writes:
> > > [This just puts my prior reply's material into Cy's
> > > adjusted resend of the original. The To/Cc should
> > > be coomplete this time.]
> > >
> > > On Apr 12, 2023, at 22:52, Cy Schubert wrote:
> > >
> > > > In message , Mark Millard writes:
> > > >> From: Charlie Li wrote on
> > > >> Date: Wed, 12 Apr 2023 20:11:16 UTC :
> > > >>
> > > >>> Charlie Li wrote:
> > > >>>> Mateusz Guzik wrote:
> > > >>>>> can you please test poudriere with
> > > >>>>> https://github.com/openzfs/zfs/pull/14739/files
> > > >>>>>
> > > >>>> After applying, on the md(4)-backed pool regardless of block_cloning,
> > > >>>> the cy@ `cp -R` test reports no differing (ie corrupted) files. Will
> > > >>>> report back on poudriere results (no block_cloning).
> > > >>>>
> > > >>> As for poudriere, build failures are still rolling in. These are (and
> > > >>> have been) entirely random on every run. Some examples from this run:
> > > >>>
> > > >>> lang/php81:
> > > >>> - post-install: @${INSTALL_DATA} ${WRKSRC}/php.ini-development
> > > >>> ${WRKSRC}/php.ini-production ${WRKDIR}/php.conf ${STAGEDIR}/${PREFIX}/etc
> > > >>> - consumers fail to build due to corrupted php.conf packaged
> > > >>>
> > > >>> devel/ninja:
> > > >>> - phase: stage
> > > >>> - install -s -m 555
> > > >>> /wrkdirs/usr/ports/devel/ninja/work/ninja-1.11.1/ninja
> > > >>> /wrkdirs/usr/ports/devel/ninja/work/stage/usr/local/bin
> > > >>> - consumers fail to build due to corrupted bin/ninja packaged
> > > >>>
> > > >>> devel/netsurf-buildsystem:
> > > >>> - phase: stage
> > > >>> - mkdir -p
> > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles
> > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/testtools
> > > >>> for M in Makefile.top Makefile.tools Makefile.subdir Makefile.pkgconfig
> > > >>> Makefile.clang Makefile.gcc Makefile.norcroft Makefile.open64; do \
> > > >>> cp makefiles/$M
> > > >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles/;
> > > >>> \
> > > >>> done
> > > >>> - graphics/libnsgif fails to build due to NUL characters in
> > > >>> Makefile.{clang,subdir}, causing nothing to link
> > > >>
> > > >> Summary: I have problems building ports into packages
> > > >> via poudriere-devel use despite being fully updated/patched
> > > >> (as of when I started the experiment), never having enabled
> > > >> block_cloning ( still using openzfs-2.1-freebsd ).
> > > >>
> > > >> In other words, I can confirm other reports that have
> > > >> been made.
> > > >>
> > > >> The details follow.
> > > >>
> > > >> [Written as I was working on setting up for the experiments
> > > >> and then executing those experiments, adjusting as I went
> > > >> along.]
> > > >>
> > > >> I've run my own tests in a context that has never had the
> > > >> zpool upgrade and that jump from before the openzfs import to
> > > >> after the existing commi
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Mark Millard writes:
> [This just puts my prior reply's material into Cy's
> adjusted resend of the original. The To/Cc should
> be complete this time.]
>
> On Apr 12, 2023, at 22:52, Cy Schubert wrote:
>
> > Mark Millard writes:
> >> From: Charlie Li wrote on
> >> Date: Wed, 12 Apr 2023 20:11:16 UTC :
> >>
> >>> Charlie Li wrote:
> >>>> Mateusz Guzik wrote:
> >>>>> can you please test poudriere with
> >>>>> https://github.com/openzfs/zfs/pull/14739/files
> >>>>>
> >>>> After applying, on the md(4)-backed pool regardless of block_cloning,
> >>>> the cy@ `cp -R` test reports no differing (ie corrupted) files. Will
> >>>> report back on poudriere results (no block_cloning).
> >>>>
> >>> As for poudriere, build failures are still rolling in. These are (and
> >>> have been) entirely random on every run. Some examples from this run:
> >>>
> >>> lang/php81:
> >>> - post-install: @${INSTALL_DATA} ${WRKSRC}/php.ini-development
> >>> ${WRKSRC}/php.ini-production ${WRKDIR}/php.conf ${STAGEDIR}/${PREFIX}/etc
> >>> - consumers fail to build due to corrupted php.conf packaged
> >>>
> >>> devel/ninja:
> >>> - phase: stage
> >>> - install -s -m 555
> >>> /wrkdirs/usr/ports/devel/ninja/work/ninja-1.11.1/ninja
> >>> /wrkdirs/usr/ports/devel/ninja/work/stage/usr/local/bin
> >>> - consumers fail to build due to corrupted bin/ninja packaged
> >>>
> >>> devel/netsurf-buildsystem:
> >>> - phase: stage
> >>> - mkdir -p
> >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles
> >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/testtools
> >>> for M in Makefile.top Makefile.tools Makefile.subdir Makefile.pkgconfig
> >>> Makefile.clang Makefile.gcc Makefile.norcroft Makefile.open64; do \
> >>> cp makefiles/$M
> >>> /wrkdirs/usr/ports/devel/netsurf-buildsystem/work/stage/usr/local/share/netsurf-buildsystem/makefiles/;
> >>> \
> >>> done
> >>> - graphics/libnsgif fails to build due to NUL characters in
> >>> Makefile.{clang,subdir}, causing nothing to link
> >>
> >> Summary: I have problems building ports into packages
> >> via poudriere-devel use despite being fully updated/patched
> >> (as of when I started the experiment), never having enabled
> >> block_cloning ( still using openzfs-2.1-freebsd ).
> >>
> >> In other words, I can confirm other reports that have
> >> been made.
> >>
> >> The details follow.
> >>
> >>
> >> [Written as I was working on setting up for the experiments
> >> and then executing those experiments, adjusting as I went
> >> along.]
> >>
> >> I've run my own tests in a context that has never had the
> >> zpool upgrade and that jump from before the openzfs import to
> >> after the existing commits for trying to fix openzfs on
> >> FreeBSD. I report on the sequence of activities getting to
> >> the point of testing as well.
> >>
> >> By personal policy I keep my (non-temporary) pools compatible
> >> with what the most recent ??.?-RELEASE supports, using
> >> openzfs-2.1-freebsd for now. The pools involved below have
> >> never had a zpool upgrade from where they started. (I've no
> >> pools that have ever had a zpool upgrade.)
> >>
> >> (Temporary pools are rare for me, such as this investigation.
> >> But I'm not testing block_cloning or anything new this time.)
> >>
> >> I'll note that I use zfs for bectl, not for redundancy. So
> >> my evidence is more limited in that respect.
> >>
> >> The activities were done on a HoneyComb (16 Cortex-A72 cores).
> >> The system has and supports ECC RAM, 64 GiBytes of RAM are
> >> present.
> >>
> >> I started by duplicating my normal zfs environment to an
Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
> 4e94ac9eb97f (HEAD -> main, freebsd/main, freebsd/HEAD) devel/freebsd-gcc12: Bump to 12.2.0.
> Author: John Baldwin
> Commit: John Baldwin
> CommitDate: 2023-03-25 00:06:40 +
> branch: main
> merge-base: 4e94ac9eb97fab16510b74ebcaa9316613182a72
> merge-base: CommitDate: 2023-03-25 00:06:40 +
> n613214 (--first-parent --count for merge-base)
>
> poudriere attempted to build 476 packages, starting
> with pkg (in order to build the 56 that I explicitly
> indicate that I want). It is my normal set of ports.
> The form of building is biased to allowing a high
> load average compared to the number of hardware
> threads (same as cores here): each builder is allowed
> to use the full count of hardware threads. The build
> used USE_TMPFS="data" instead of the USE_TMPFS=all I
> normally use on the build machine involved.
>
> And it produced some random errors during the attempted
> builds. A type of example that is easy to interpret
> without further exploration is:
>
> pkg_resources.extern.packaging.requirements.InvalidRequirement: Parse error at "'\x00\x00\x00\x00\x00\x00\x00\x00'": Expected W:(0-9A-Za-z)
>
> A fair number of errors are of the form: the build
> installing a previously built package for use in the
> builder but later the builder can not find some file
> from the package's installation.
>
> Another error reported was:
>
> ld: error: /usr/local/lib/libblkid.a: unknown file type
>
> For reference:
>
> [main-CA72-bulk_a-default] [2023-04-12_20h45m32s] [committing:] Queued: 476 Built: 252 Failed: 11 Skipped: 213 Ignored: 0 Fetched: 0 Tobuild: 0 Time: 00:37:52
>
> I started another build that tried to build 224 packages:
> the 11 failed and 213 skipped.
>
> Just 1 package built that failed before:
>
> [00:04:58] [09] [00:04:15] Finished databases/sqlite3@default | sqlite3-3.41.0_1,1: Success
>
> It seems to be the only one where the original failure was not
> an example of complaining about the missing/corrupted content
> of a package install used for building. So it is an example
> of randomly varying behavior.
>
> That, in turn, allowed:
>
> [00:04:58] [01] [00:00:00] Building security/nss | nss-3.89
>
> to build but everything else failed or was skipped.
>
> The sqlite3 vs. other failure difference suggests that writes
> have random problems but later reads reliably see the problem
> that resulted (before the content is deleted).
>
>
> After the above:
>
> # zpool status
>   pool: zroot
>  state: ONLINE
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zroot       ONLINE       0     0     0
>           da0p8     ONLINE       0     0     0
>
> errors: No known data errors
>
> # zpool status
>   pool: zroot
>  state: ONLINE
>   scan: scrub repaired 0B in 00:16:25 with 0 errors on Wed Apr 12 22:15:39 2023
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zroot       ONLINE       0     0     0
>           da0p8     ONLINE       0     0     0
>
> errors: No known data errors
>
>
> ===
> Mark Millard
> marklmi at yahoo.com

Let's try this again. Claws-mail didn't include the list address in the
header. Trying to reply, again, using exmh instead.

Did your pools suffer the EXDEV problem? The EXDEV bug also corrupted files.

I think that, without sufficient investigation, we risk jumping to
conclusions. I've taken an extremely cautious approach, rolling back
snapshots (as much as possible, i.e. poudriere datasets) when EXDEV
corruption was encountered.

I did not roll back any snapshots in my MH mail directory. Rolling back
snapshots of my MH maildir would result in loss of email. I have to live
with that corruption. Corrupted files in my outgoing sent email directory
remain:

slippy$ ugrep -cPa '\x00' ~/.Mail/note | grep -c :1
53
slippy$

There are 53 corrupted files in my note log of 9913 emails. Those files
will never be fixed. They were corrupted by the EXDEV bug. Any new ZFS or
ZFS patches cannot retroactively remove the corruption from those files.

But my poudriere files, because the snapshots were rolled back, were
"repaired" by the rolled back snapshots.

I'm not convinced that there is presently active corruption since the
problem has been fixed. I am convinced that whatever corruption was
written at the time will remain forever or until those files are deleted
or replaced -- just like my email files written to disk at the time.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
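
The NUL-byte check in the message above (`ugrep -cPa '\x00' ... | grep -c :1`) can be approximated without ugrep. The following is only a minimal sketch under stated assumptions: the function names `contains_nul` and `count_nul_files` are made up for illustration, only regular files directly inside the given directory are examined, and a NUL byte is treated as a hint of damage in files expected to be plain text (such as MH mail messages), not as proof of the EXDEV corruption.

```python
import os
import tempfile


def contains_nul(path, chunk_size=1 << 16):
    """Return True if the file at `path` contains at least one NUL byte."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False          # reached EOF without seeing a NUL
            if b"\x00" in chunk:
                return True


def count_nul_files(directory):
    """Count regular files directly inside `directory` that contain a NUL byte.

    Returns a (suspect, total) tuple. Subdirectories and symlinks are skipped.
    """
    suspect = total = 0
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        total += 1
        if contains_nul(entry.path):
            suspect += 1
    return suspect, total


if __name__ == "__main__":
    # Demonstration on throwaway files only; point count_nul_files() at a
    # real directory (for example a mail folder) at your own discretion.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "1"), "wb") as fh:
            fh.write(b"Subject: intact message\n\nbody\n")
        with open(os.path.join(tmp, "2"), "wb") as fh:
            fh.write(b"Subject: damaged\n\n" + b"\x00" * 8 + b"\n")
        suspect, total = count_nul_files(tmp)
        print(f"{suspect} of {total} files contain NUL bytes")
```

Reading in fixed-size chunks keeps memory use flat on large files; because the search is for a single byte, a match can never straddle a chunk boundary.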
Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and crashing)
In message <20230411142831.db824...@slippy.cwsent.com>, Cy Schubert writes:
> In message <434b83db-f6bb-436f-8aa5-385730d20...@dawidek.net>,
> Paweł Jakub Dawidek writes:
> >
> > > On Apr 11, 2023, at 11:31, Cy Schubert wrote:
> > >
> > > In message <20230409161436.5412fa6e@thor.intern.walstatt.dynvpn.de>,
> > > FreeBSD User writes:
> > >> On Sun, 9 Apr 2023 14:37:03 +0200
> > >> Mateusz Guzik wrote:
> > >>
> > >>>> On 4/9/23, FreeBSD User wrote:
> > >>>>> Today, after upgrading to FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b:
> > >>>>> Sun Apr 9 12:01:02 CEST 2023 amd64, AND upgrading ZPOOLs via
> > >>>>>
> > >>>>> zpool upgrade POOLNAME
> > >>>>>
> > >>>>> some boxes keep crashing when starting compiler runs (the trigger is
> > >>>>> different on boxes).
> > >>>>>
> > >>>>> ZFS module is statically compiled into the kernel (if this is of
> > >>>>> importance)
> > >>>>>
> > >>>>> Last known good was:
> > >>>>>
> > >>>>> [...]
> > >>>>> Apr 9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr 9 00:12:57 CEST 2023
> > >>>>> Apr 9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
> > >>>>> Apr 9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
> > >>>>> Apr 9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
> > >>>>> Apr 9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
> > >>>>> [...]
> > >>>>>
> > >>>>> The file /var/crash/info.X
> > >>>>>
> > >>>>> contains:
> > >>>>>
> > >>>>> [...]
> > >>>>>
> > >>>>> root@thor:/var/crash # more info.2
> > >>>>> Dump header from device: /dev/gpt/swap
> > >>>>> Architecture: amd64
> > >>>>> Architecture Version: 2
> > >>>>> Dump Length: 1095192576
> > >>>>> Blocksize: 512
> > >>>>> Compression: none
> > >>>>> Dumptime: 2023-04-09 11:43:41 +
> > >>>>> Hostname: thor.local
> > >>>>> Magic: FreeBSD Kernel Dump
> > >>>>> Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr 9 12:01:02 CEST 2023
> > >>>>> root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
> > >>>>> Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
> > >>>>>
> > >>>>> Dump Parity: 2961465682
> > >>>>> Bounds: 2
> > >>>>> Dump Status: good
> > >>>>>
> > >>>>> Until reconfigured for more debug stuff I do not have more to present.
> > >>>>>
> > >>>>> I remember now, really scared, that there was a HEADSUP on the list
> > >>>>> regarding some serious ZFS problems - I didn't find it right now.
> > >>>>>
> > >>>>> Thanks in advance,
> > >>>>>
> > >>>
> > >>> That's fallout from the new block cloning feature, adding the author
> > >>>
> > >>
> > >> Thanks.
> > >>
> > >> As of this moment, all systems with the newest kernel and the new ZFS
> > >> option enabled, crash - the reason is mostly in different ZFS datasets.
> > >> I guess there is no way back once this faulty option is enabled?
> > >
> > > I've run a test on a scratch pool here, first without block_cloning
> > > enabled, then with. There was no corruption when block_cloning was
> > > disabled. There was corruption when block_cloning was enabled.
> > >
> > > I don't know of any way to revert back nor is there any way to fix or
> > > recover the corrupted blocks.
> >
> > Is the corruption still present after EXDEV fixes?
>
> Yes and no.
>
> Yes, there is corruption when block_cloning is enabled.
>
> There is no corruption when block_cloning is disabled.

I should add some detail to this.

The corruption experienced when block cloning is disabled was fixed by:

- eb1feadc201a
- e2d997d1cbb9
- d012836fb616 (specifically this commit)
- 20be1b4fc4b7

When block_cloning is enabled, the pool is corrupted. This has not been
fixed.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and crashing)
In message <434b83db-f6bb-436f-8aa5-385730d20...@dawidek.net>,
Paweł Jakub Dawidek writes:
>
> > On Apr 11, 2023, at 11:31, Cy Schubert wrote:
> >
> > In message <20230409161436.5412fa6e@thor.intern.walstatt.dynvpn.de>,
> > FreeBSD User writes:
> >> On Sun, 9 Apr 2023 14:37:03 +0200
> >> Mateusz Guzik wrote:
> >>
> >>>> On 4/9/23, FreeBSD User wrote:
> >>>>> Today, after upgrading to FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b:
> >>>>> Sun Apr 9 12:01:02 CEST 2023 amd64, AND upgrading ZPOOLs via
> >>>>>
> >>>>> zpool upgrade POOLNAME
> >>>>>
> >>>>> some boxes keep crashing when starting compiler runs (the trigger is
> >>>>> different on boxes).
> >>>>>
> >>>>> ZFS module is statically compiled into the kernel (if this is of
> >>>>> importance)
> >>>>>
> >>>>> Last known good was:
> >>>>>
> >>>>> [...]
> >>>>> Apr 9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr 9 00:12:57 CEST 2023
> >>>>> Apr 9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
> >>>>> Apr 9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
> >>>>> Apr 9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
> >>>>> Apr 9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
> >>>>> [...]
> >>>>>
> >>>>> The file /var/crash/info.X
> >>>>>
> >>>>> contains:
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>> root@thor:/var/crash # more info.2
> >>>>> Dump header from device: /dev/gpt/swap
> >>>>> Architecture: amd64
> >>>>> Architecture Version: 2
> >>>>> Dump Length: 1095192576
> >>>>> Blocksize: 512
> >>>>> Compression: none
> >>>>> Dumptime: 2023-04-09 11:43:41 +
> >>>>> Hostname: thor.local
> >>>>> Magic: FreeBSD Kernel Dump
> >>>>> Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr 9 12:01:02 CEST 2023
> >>>>> root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
> >>>>> Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
> >>>>>
> >>>>> Dump Parity: 2961465682
> >>>>> Bounds: 2
> >>>>> Dump Status: good
> >>>>>
> >>>>> Until reconfigured for more debug stuff I do not have more to present.
> >>>>>
> >>>>> I remember now, really scared, that there was a HEADSUP on the list
> >>>>> regarding some serious ZFS problems - I didn't find it right now.
> >>>>>
> >>>>> Thanks in advance,
> >>>>>
> >>>
> >>> That's fallout from the new block cloning feature, adding the author
> >>>
> >>
> >> Thanks.
> >>
> >> As of this moment, all systems with the newest kernel and the new ZFS
> >> option enabled, crash - the reason is mostly in different ZFS datasets.
> >> I guess there is no way back once this faulty option is enabled?
> >
> > I've run a test on a scratch pool here, first without block_cloning
> > enabled, then with. There was no corruption when block_cloning was
> > disabled. There was corruption when block_cloning was enabled.
> >
> > I don't know of any way to revert back nor is there any way to fix or
> > recover the corrupted blocks.
>
> Is the corruption still present after EXDEV fixes?

Yes and no.

Yes, there is corruption when block_cloning is enabled.

There is no corruption when block_cloning is disabled.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and crashing)
In message <20230409161436.5412f...@thor.intern.walstatt.dynvpn.de>,
FreeBSD User writes:
> On Sun, 9 Apr 2023 14:37:03 +0200
> Mateusz Guzik wrote:
>
> > On 4/9/23, FreeBSD User wrote:
> > > Today, after upgrading to FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b:
> > > Sun Apr 9 12:01:02 CEST 2023 amd64, AND upgrading ZPOOLs via
> > >
> > > zpool upgrade POOLNAME
> > >
> > > some boxes keep crashing when starting compiler runs (the trigger is
> > > different on boxes).
> > >
> > > ZFS module is statically compiled into the kernel (if this is of
> > > importance)
> > >
> > > Last known good was:
> > >
> > > [...]
> > > Apr 9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr 9 00:12:57 CEST 2023
> > > Apr 9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
> > > Apr 9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
> > > Apr 9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
> > > Apr 9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
> > > [...]
> > >
> > > The file /var/crash/info.X
> > >
> > > contains:
> > >
> > > [...]
> > >
> > > root@thor:/var/crash # more info.2
> > > Dump header from device: /dev/gpt/swap
> > > Architecture: amd64
> > > Architecture Version: 2
> > > Dump Length: 1095192576
> > > Blocksize: 512
> > > Compression: none
> > > Dumptime: 2023-04-09 11:43:41 +
> > > Hostname: thor.local
> > > Magic: FreeBSD Kernel Dump
> > > Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr 9 12:01:02 CEST 2023
> > > root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
> > > Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
> > >
> > > Dump Parity: 2961465682
> > > Bounds: 2
> > > Dump Status: good
> > >
> > > Until reconfigured for more debug stuff I do not have more to present.
> > >
> > > I remember now, really scared, that there was a HEADSUP on the list
> > > regarding some serious ZFS problems - I didn't find it right now.
> > >
> > > Thanks in advance,
> > >
> >
> > That's fallout from the new block cloning feature, adding the author
> >
>
> Thanks.
>
> As of this moment, all systems with the newest kernel and the new ZFS
> option enabled, crash - the reason is mostly in different ZFS datasets.
> I guess there is no way back once this faulty option is enabled?

I've run a test on a scratch pool here, first without block_cloning
enabled, then with. There was no corruption when block_cloning was
disabled. There was corruption when block_cloning was enabled.

I don't know of any way to revert back nor is there any way to fix or
recover the corrupted blocks.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: trpt(8) to be decommissioned
Gleb Smirnoff writes:
> Max,
>
> the reason I want to retire it is not that it consumes 40 Kb
> in the repository. The reason is that it knows kernel structures,
> and fails to compile after changes to them. So the tool that
> nobody uses requires special care when working on TCP. The
> kernel headers disclose the structures for trpt (with some
> protection with _WANT_TCPCB, though) and some software from
> ports (not calling names!) would start using them too. Now a
> kernel developer needs to care not only about trpt, but
> about this software, too.

I recall that when Bryan Cantrill came to one of the local hotels here to
announce Solaris 9, he said that Solaris truss was now an app that called
DTrace functions.

If people feel the need for a trpt-like utility, would it be an idea to
write it using DTrace calls? Could it be a GSoC project? It would be kind
of neat for a co-op student or someone to get their feet wet with systems
programming.

I typically use DTrace when snooping around looking for that proverbial
needle in a haystack. And TCPDEBUG seems to be one of those things that
DTrace was designed to replace. It would be a good project for an
up-and-coming developer who is still in school to work on.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: morse(6) sound
Nuno Teixeira writes:
> Hello all,
>
> Is there any way to get sound from morse(6) without speaker(4) device?

My question is, why is this still in base? Shouldn't it be a port?

I don't think this software is of interest to the majority of FreeBSD
users out there and would be a perfect candidate for migration to ports.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Meta Mode (was: Re: BOOT CRASH -- Current -CURRENT)
Warner Losh writes:
> On Sat, Oct 1, 2022 at 9:06 PM Larry Rosenman wrote:
>
> > On 10/01/2022 10:04 pm, Warner Losh wrote:
> >
> > Do you have a /boot tarball that can be loaded in a VM that recreates the
> > problem (along with a clean hash)?
> >
> > But before you try that, have you tried a completely clean rebuild of the
> > kernel to preclude the possibility that something is somehow cross threaded?
> >
> > Warner
> >
> > On Sat, Oct 1, 2022 at 8:39 PM Larry Rosenman wrote:
> >
> >
> > ❯ more info.11
> > Dump header from device: /dev/mfid0p3
> >   Architecture: amd64
> >   Architecture Version: 2
> >   Dump Length: 126748815
> >   Blocksize: 512
> >   Compression: zstd
> >   Dumptime: 2022-10-01 21:26:40 -0500
> >   Hostname:
> >   Magic: FreeBSD Kernel Dump
> >   Version String: FreeBSD 14.0-CURRENT #168 ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct 1 21:13:01 CDT 2022
> >     r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
> >   Panic String: page fault
> >   Dump Parity: 501115454
> >   Bounds: 11
> >   Dump Status: good
> >
> > I do have source and debug stuff, BUT kgdb croaks on me.
> >
> > I *CAN* give access to the machine.
> >
> > the console backtrace showed something about the kld load of
> > dependencies.
> >
> >
> >
> > --
> > Larry Rosenman http://people.freebsd.org/~ler
> > Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
> > US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
> >
> > let me wipe /usr/obj, and rebuild everything (I *DO* use meta-mode).
> >
>
> I've had fewer problems with it than non-meta mode, but this looks like a
> 'corruption' or 'cross threaded' crash I've chased in the past that went
> away with a rebuild. So it's better to be sure...

I think so too. What may appear to be a gratuitous rebuild of llvm, for
example, is in fact meta mode rebuilding because of some makefile change.
Without meta mode I've experienced odd weirdnesses that are fixed through
a subsequent clean build.

I just started using meta mode again this week after a few years' hiatus
to see if it addresses the occasional weird behaviour due to something not
being rebuilt when it should have been.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
Re: Header symbols that shouldn't be visible to ports?
Alan Somers writes:
> On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> >
> > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > Our /usr/include headers define a lot of symbols that are used by
> > > critical utilities in the base system like ps and ifconfig, but aren't
> > > stable across major releases. Since they aren't stable, utilities
> > > built for older releases won't run correctly on newer ones. Would it
> > > make sense to guard these symbols so they can't be used by programs in
> > > the ports tree? There is some precedent for that, for example
> > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > for userspace code that wants to dig into kernel structures. Similarly
> > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable. The
> > definitions are guarded by additional defines not due to their instability,
> > but because using them in userspace requires (much) more preparation from
> > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > contradicts standardized use of the header (_WANT_MNTOPTNAMES +
> > sys/mount.h).
> >
> > >
> > > In particular, I'm thinking about symbols like the following:
> > > MINCORE_SUPER
> > Why should this symbol be hidden? It is implementation-defined and
> > intended to be exposed to userspace. All MINCORE_* not only MINCORE_SUPER
> > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
>
> Because it isn't stable. It changed for example in rev 847ab36bf22
> for 13.0. Programs using the older value (including virtually every
> Rust program) won't work on 13.0 and later.
>
> >
> > > TDF_*
> > These symbols come from the non-standard header sys/proc.h. If userspace
> > includes the header, it is already outside any formal standard, and I
> > do not see a reason to make the implementation more convoluted there.
> >
> > > PRI_MAX*
> > > PRI_MIN*
> > > PI_*, PRIBIO, PVFS, etc
> > > IFCAP_*
> > These are all implementation-specific and come from non-standard headers,
> > unless I am mistaken, then please correct me.
> >
> > > RLIM_NLIMITS
> > > IFF_*
> > Same.
> >
> > > *_MAXID
> > This is too broad.
>
> I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> sysctl lists. Obviously, these symbols can't be stable, and probably
> aren't useful outside of the base system.
>
> >
> > >
> > > Clearly delineating private symbols like this would ease the
> > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > > FFI basically assumes that symbols once defined will never change.
> >
> > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
>
> I should add a little detail. Rust uses FFI to access C functions,
> and #define'd constants are redefined in the Rust bindings. For most
> Rust programs, the build process doesn't check the contents of
> /usr/include in any way. Instead, all of that stuff is hard-coded in
> the Rust bindings. That makes cross-compiling a breeze! But it does
> cause problems when the C library changes. Adding a new symbol, like
> copy_file_range, isn't so bad. If your Rust program doesn't use it,
> then the Rust binding will become an unused symbol and get eliminated
> by the linker. If your Rust program does use it OTOH, then it will be
> resolved by the dynamic linker at runtime - if you're running on
> FreeBSD 13 or newer. Otherwise, your program will fail to run. A
> bigger problem is with symbols that change. For example, the 64-bit
> inode stuff. Rust programs still use a FreeBSD 11 ABI (we're working
> on that). But other symbols change more frequently. Things like
> PRI_MAX_REALTIME can change between any two releases. That creates a
> big maintenance burden to keep track of them in the FFI bindings. And
> they also aren't very useful in cross-compiled programs targeting a
> FreeBSD 11 ABI. Instead, they really need to have bindings
> automatically generated at build time. That's possible, but it's not
> the default.

This is exactly what happened with DMD D. When 64-bit statfs was
introduced, all DMD D compiled programs failed to run and recompiling
didn't help. The DMD upstream failed to understand the problem. Eventually
the port had to be removed.

>
> So what the Rust community really needs is a way to know which symbols
> will be stable across releases, and which might vary. Are you
> suggesting that anything from a non-POSIX header file should be
> considered variable?
>

Rust and every other community.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0
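
The maintenance burden described in this thread (constants hard-coded in FFI bindings drifting away from the values in the installed headers) can be illustrated with a small check. This is only a sketch under loud assumptions: `parse_defines` and `find_drift` are hypothetical helper names, the `EXAMPLE_*` macros and their values are invented and are not real FreeBSD symbols, and the parser only understands object-like macros whose value is a single integer literal. It is not how the Rust bindings or any existing tool are known to work.

```python
import re

# Matches simple object-like macros with one integer literal as the value,
# optionally followed by a /* comment */, e.g.  #define EXAMPLE_MAX 0x20
_DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+([A-Za-z_]\w*)\s+"
    r"(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9]\d*)\s*(?:/\*.*)?$"
)


def _parse_c_int(literal):
    """Convert a C integer literal (hex, octal or decimal) to an int."""
    if literal[:2].lower() == "0x":
        return int(literal, 16)
    if literal.startswith("0"):
        return int(literal, 8)      # C treats a leading 0 as octal
    return int(literal, 10)


def parse_defines(header_text):
    """Return {NAME: value} for the simple integer #defines in header_text."""
    values = {}
    for line in header_text.splitlines():
        match = _DEFINE_RE.match(line)
        if match:
            name, literal = match.groups()
            values[name] = _parse_c_int(literal)
    return values


def find_drift(bindings, header_text):
    """Compare hard-coded binding constants against a header.

    Returns {NAME: (bound_value, header_value)} for every constant whose
    value differs; header_value is None when the header lacks the macro.
    """
    current = parse_defines(header_text)
    return {
        name: (bound, current.get(name))
        for name, bound in bindings.items()
        if current.get(name) != bound
    }


if __name__ == "__main__":
    # Invented header snippets standing in for two releases of one header.
    old_header = "#define EXAMPLE_SUPER 0x20\n#define EXAMPLE_MAXID 25\n"
    new_header = "#define EXAMPLE_SUPER 0x60 /* changed */\n#define EXAMPLE_MAXID 27\n"

    bindings = parse_defines(old_header)   # values frozen at binding time
    for name, (bound, actual) in sorted(find_drift(bindings, new_header).items()):
        print(f"{name}: bindings say {bound:#x}, header says {actual:#x}")
```

A check like this only detects drift after the fact; generating the bindings from the target's headers at build time, as suggested in the thread, avoids freezing the values in the first place.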
Re: security/clamav: /var/run on TMPFS renders the port broken by design
In message <20220829082514.63926...@thor.intern.walstatt.dynvpn.de>, FreeBSD Us er writes: > Am Sun, 28 Aug 2022 06:11:20 -0700 > Cy Schubert schrieb: > > > In message <20220828130107.1a76d54a.gre...@freebsd.org>, Michael Gmelin > > writes: > > > > > > > > > > > > On Sun, 28 Aug 2022 03:21:24 -0700 > > > Cy Schubert wrote: > > > > > > > In message <16b4-76a1-4e46-b7c3-60492d379...@freebsd.org>, > > > > Michael Gmelin w > > > > rites: > > > > > > > > > > > > > > > > > > > > > On 28. Aug 2022, at 10:42, free...@oldach.net wrote: > > > > > >=20 > > > > > > =EF=BB=BFCy Schubert wrote on Sat, 27 Aug 2022 17:26:38 +0200 > > > > > > (CEST): > > > > > >> As stated before in this thread, replacing /var/run with tmpfs > > > > > >> is not a supported configuration. > > > > > >=20 > > > > > > Not supported? What is the purpose of /etc/rc.d/var then? That > > > > > > creates a t= > > > > > mpfs backed /var, populates it through mtree, and makes a proper > > > > > /var/run av= ailable. > > > > > >=20 > > > > > > However it doesn't (yet) create /var/run/clamav of course. > > > > > >=20 > > > > > > It would be fairly easy to extend /etc/rc.d/var by a logic that > > > > > > walks thro= > > > > > ugh /usr/local/etc/mtree/* and runs mtree on each of the files > > > > > found as need= ed. All that the security/clamav port would need to > > > > > do then is to drop an ap= propriate small mtree file as > > > > > /usr/local/etc/mtree/clamav. =46rom a port's p= erspective that is > > > > > the same logic as dropping service scripts as /usr/local/= > > > > > etc/rc.d/clamav-*. > > > > > > > > > > =46rom a user's perspective, it would be preferable to have this > > > > > happen at s= ervice start though, as (unlike in the setup > > > > > described) reboots don't happen= that frequently, but files in > > > > > /var/run might get deleted manually. 
Maybe so= me rc framework > > > > > based solution would make sense, e.g., a variable `mtree_fil= es`, > > > > > which, if set, is applied in the default start_precmd. Besides > > > > > being mo= re resilient, this would also have the advantage that all > > > > > required file syst= ems should be available at that point and the > > > > > separation between system and p = orts would be more clear. Another > > > > > advantage would be that directories are on= ly created for services > > > > > that are actually enabled/started. > > > > > > > > Unfortunately this requires all ports to include an mtree file. > > > > Relying on port maintainers who are human to ensure that these files > > > > are created and updated when ports are created and maintained will > > > > result in more human error. I've learned over my long career to rely > > > > more on automation than human beings. Automation [should] never fail > > > > and when it does it does temporarily until the bug is found and > > > > fixed. Human beings inconsistently fail. > > > > > > > > If it were an auto-discovery script that created an mtree file as > > > > part of the packaging process, it would be another matter. But this > > > > optional solution path should be discussed on ports@, not here. > > > > > > > > > > > > > > I don't have much skin in the game, but I created a little proof of > > > concept to allow further discussion (which is not ports-specific, as it > > > works for all service scripts): > > > > > > https://reviews.freebsd.org/D36385 > > > > I've been toying with the idea for a few months but was never bothered to > > create a review or even a script for that matter. > > > > > > > > This basically allows both system admins and port maintainers to > > > create mtree files in /usr/local/etc/mtree (or /etc/mtree, as it's > > > always relative to the service script called) which are automatically > > > applied on service start. 
It's non-intrusive and doesn't require any > > > sweeping changes to existing ports/services. > > > >
Re: security/clamav: /var/run on TMPFS renders the port broken by design
In message <20220828130107.1a76d54a.gre...@freebsd.org>, Michael Gmelin writes: > > > > On Sun, 28 Aug 2022 03:21:24 -0700 > Cy Schubert wrote: > > > In message <16b4-76a1-4e46-b7c3-60492d379...@freebsd.org>, > > Michael Gmelin w > > rites: > > > > > > > > > > > > > On 28. Aug 2022, at 10:42, free...@oldach.net wrote: > > > >=20 > > > > =EF=BB=BFCy Schubert wrote on Sat, 27 Aug 2022 17:26:38 +0200 > > > > (CEST): > > > >> As stated before in this thread, replacing /var/run with tmpfs > > > >> is not a supported configuration. > > > >=20 > > > > Not supported? What is the purpose of /etc/rc.d/var then? That > > > > creates a t= > > > mpfs backed /var, populates it through mtree, and makes a proper > > > /var/run av= ailable. > > > >=20 > > > > However it doesn't (yet) create /var/run/clamav of course. > > > >=20 > > > > It would be fairly easy to extend /etc/rc.d/var by a logic that > > > > walks thro= > > > ugh /usr/local/etc/mtree/* and runs mtree on each of the files > > > found as need= ed. All that the security/clamav port would need to > > > do then is to drop an ap= propriate small mtree file as > > > /usr/local/etc/mtree/clamav. =46rom a port's p= erspective that is > > > the same logic as dropping service scripts as /usr/local/= > > > etc/rc.d/clamav-*. > > > > > > =46rom a user's perspective, it would be preferable to have this > > > happen at s= ervice start though, as (unlike in the setup > > > described) reboots don't happen= that frequently, but files in > > > /var/run might get deleted manually. Maybe so= me rc framework > > > based solution would make sense, e.g., a variable `mtree_fil= es`, > > > which, if set, is applied in the default start_precmd. Besides > > > being mo= re resilient, this would also have the advantage that all > > > required file syst= ems should be available at that point and the > > > separation between system and p = orts would be more clear. 
Another > > > advantage would be that directories are on= ly created for services > > > that are actually enabled/started. > > > > Unfortunately this requires all ports to include an mtree file. > > Relying on port maintainers who are human to ensure that these files > > are created and updated when ports are created and maintained will > > result in more human error. I've learned over my long career to rely > > more on automation than human beings. Automation [should] never fail > > and when it does it does temporarily until the bug is found and > > fixed. Human beings inconsistently fail. > > > > If it were an auto-discovery script that created an mtree file as > > part of the packaging process, it would be another matter. But this > > optional solution path should be discussed on ports@, not here. > > > > > > I don't have much skin in the game, but I created a little proof of > concept to allow further discussion (which is not ports-specific, as it > works for all service scripts): > > https://reviews.freebsd.org/D36385 I've been toying with the idea for a few months but was never bothered to create a review or even a script for that matter. > > This basically allows both system admins and port maintainers to > create mtree files in /usr/local/etc/mtree (or /etc/mtree, as it's > always relative to the service script called) which are automatically > applied on service start. It's non-intrusive and doesn't require any > sweeping changes to existing ports/services. Understood that this is a manual process. > > In this specific case, the requester could create > /usr/local/etc/mtree/clamav-clamd with the required content (or > persuade the port maintainer to include that file). > > You could of course add some construct to the ports framework that > picks up certain directories from the package list automatically and > places them into an mtree file as part of the build or installation > process. But that would be an additional feature on top of this change. 
Someone could. Personally, I think that's a lot of work compared to simply saving the state of /var/run at shutdown and restoring it at boot. I can't speak for the ports management though. > > This is meant to inspire more discussions, I'm not trying to force > anything in. ;) Agreed. I cobbled something up yesterday that saves the directory tree state of /var/run prior to shutdown (or manually) and restores it at boot. https://reviews.freebsd.org/D36386 People can try it out if they want. If there's enough interest I'd be willing to commit it. We have a few options on the table and probably more. The ports infrastructure option is probably the most work. Adding functionality to all the ports that use /var/run is also a lot of work and if relying on individual porters, will likely take some time and be varied in implementation and robustness. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: security/clamav: /var/run on TMPFS renders the port broken by design
In message <16b4-76a1-4e46-b7c3-60492d379...@freebsd.org>, Michael Gmelin writes: > > On 28. Aug 2022, at 10:42, free...@oldach.net wrote: > > > > Cy Schubert wrote on Sat, 27 Aug 2022 17:26:38 +0200 (CEST): > >> As stated before in this thread, replacing /var/run with tmpfs is not a > >> supported configuration. > > > > Not supported? What is the purpose of /etc/rc.d/var then? That creates a tmpfs backed /var, populates it through mtree, and makes a proper /var/run available. > > > > However it doesn't (yet) create /var/run/clamav of course. > > > > It would be fairly easy to extend /etc/rc.d/var by a logic that walks through /usr/local/etc/mtree/* and runs mtree on each of the files found as needed. All that the security/clamav port would need to do then is to drop an appropriate small mtree file as /usr/local/etc/mtree/clamav. From a port's perspective that is the same logic as dropping service scripts as /usr/local/etc/rc.d/clamav-*. > > From a user's perspective, it would be preferable to have this happen at service start though, as (unlike in the setup described) reboots don't happen that frequently, but files in /var/run might get deleted manually. Maybe some rc framework based solution would make sense, e.g., a variable `mtree_files`, which, if set, is applied in the default start_precmd. Besides being more resilient, this would also have the advantage that all required file systems should be available at that point and the separation between system and ports would be more clear. Another advantage would be that directories are only created for services that are actually enabled/started. Unfortunately this requires all ports to include an mtree file. Relying on port maintainers who are human to ensure that these files are created and updated when ports are created and maintained will result in more human error.
I've learned over my long career to rely more on automation than on human beings. Automation [should] never fail, and when it does, it fails only temporarily, until the bug is found and fixed. Human beings fail inconsistently. If it were an auto-discovery script that created an mtree file as part of the packaging process, it would be another matter. But this optional solution path should be discussed on ports@, not here. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
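The idea quoted in this message, walking a directory of mtree specifications and applying each one, could be sketched roughly as follows. This is only an illustration: the function name apply_mtree_dir and the MTREE_CMD override are hypothetical and are not part of the rc framework or of any review mentioned in this thread. The mtree flags (-U -i -q) are the ones used in the kq-var-run script posted elsewhere in the thread.

```shell
#!/bin/sh
# Hypothetical helper: apply every mtree spec found in a directory.
# MTREE_CMD is an assumption made for illustration; it lets the command be
# replaced (for example by "echo") when trying the function out.
apply_mtree_dir()
{
	spec_dir=$1	# e.g. /usr/local/etc/mtree
	root=$2		# e.g. /var/run

	# Nothing to do if no spec directory has been installed.
	[ -d "$spec_dir" ] || return 0

	for spec in "$spec_dir"/*; do
		# Skips the unexpanded pattern when the directory is empty.
		[ -f "$spec" ] || continue
		${MTREE_CMD:-mtree} -U -i -q -f "$spec" -p "$root" || return 1
	done
}
```

Called as `apply_mtree_dir /usr/local/etc/mtree /var/run`, this would run once per spec file; whether that belongs in /etc/rc.d/var or in a per-service precmd is the open question discussed above.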
Re: security/clamav: /var/run on TMPFS renders the port broken by design
In message <202208280842.27s8gdxn055...@nuc.oldach.net>, Helge Oldach writes: > Cy Schubert wrote on Sat, 27 Aug 2022 17:26:38 +0200 (CEST): > > As stated before in this thread, replacing /var/run with tmpfs is not a > > supported configuration. > > Not supported? What is the purpose of /etc/rc.d/var then? That creates a tmpfs backed /var, populates it through mtree, and makes a proper /var/run available. > > However it doesn't (yet) create /var/run/clamav of course. > > It would be fairly easy to extend /etc/rc.d/var by a logic that walks through /usr/local/etc/mtree/* and runs mtree on each of the files found as needed. All that the security/clamav port would need to do then is to drop an appropriate small mtree file as /usr/local/etc/mtree/clamav. From a port's perspective that is the same logic as dropping service scripts as /usr/local/etc/rc.d/clamav-*. > > Kind regards > Helge This is because you don't have a /var/run/clamav yet. Unfortunately this does not retroactively create /var/run/clamav. My new copy of the script, attached, also does not retroactively create the directory. Create the directory by hand. Use your server. Reboot and the directories will be recreated. If converting from UFS or ZFS /var/run, simply add the tmpfs mountpoint after adding and enabling the script and reboot. (I prefix all locally written scripts with kq-). Remember, this does not retroactively create /var/run/clamav if it doesn't already exist. This only makes mounting tmpfs on /var/run a possible option.

#!/bin/sh

# PROVIDE: kq-var-run
# REQUIRE: zfs tmp
# BEFORE: FILESYSTEMS

. /etc/rc.subr

name=kq_var_run
rcvar=kq_var_run_enable
extra_commands="load save"
start_cmd="kq_var_run_start"
load_cmd="kq_var_run_load"
save_cmd="kq_var_run_save"
stop_cmd="kq_var_run_stop"

load_rc_config $name

# Set defaults
: ${kq_var_run_enable:="NO"}
: ${kq_var_run_mtree:="/var/db/mtree/BSD.var-run.mtree"}
: ${kq_var_run_autosave:="YES"}

# Recreate the saved directory tree under /var/run.
kq_var_run_load()
{
	test -f ${kq_var_run_mtree} &&
	    mtree -U -i -q -f ${kq_var_run_mtree} -p /var/run > /dev/null
}

# Save the current directory tree of /var/run as an mtree spec.
kq_var_run_save()
{
	if [ ! -d $(dirname ${kq_var_run_mtree}) ]; then
		# Create the parent directory, not a directory named after the spec file.
		mkdir -p $(dirname ${kq_var_run_mtree})
	fi
	mtree -dcbj -p /var/run > ${kq_var_run_mtree}
}

# Only act when /var/run is actually a tmpfs mount.
kq_var_run_start()
{
	df -ttmpfs /var/run > /dev/null 2>&1 &&
	    kq_var_run_load
}

kq_var_run_stop()
{
	df -ttmpfs /var/run > /dev/null 2>&1 &&
	    checkyesno kq_var_run_autosave &&
	    kq_var_run_save
}

run_rc_command "$1"

Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
Re: security/clamav: /var/run on TMPFS renders the port broken by design
In message <20220827082638.57901a72@slippy>, Cy Schubert writes: > On Sat, 27 Aug 2022 15:38:44 +0200 > Juraj Lutter wrote: > > > On 27 Aug 2022, at 15:27, Michael Gmelin wrote: > > > > > >> On 27. Aug 2022, at 15:18, free...@oldach.net wrote: > > >> > > >> Michael Gmelin wrote on Sat, 27 Aug 2022 15:02:04 +0200 (CEST): > > >>> (you're removing /var/run, which shouldn't be removed > > >> > > >> Not quite. It's actually not uncommon to boot with an empty /var. Please see /etc/rc.d/var and related. > > > > > > That's a good point. > > > > > >> The request that ports/packages should consider this case is not exactly unreasonable IMO. > > > > > > If I was the maintainer, I would simply add the code to create the directory for robustness sake (I for one deleted subdirs in /var/run more than once and would expect a port to fix this on restart, also to make sure correct permissions are applied). But since it doesn't seem like this is going to happen, adding a custom rc file would be a viable short term workaround for the requester. > > > > > > I like the idea of having something like tmpfiles.d, it would also help port maintainers (could also be done as a port). > > > > As I have stated in one of those PR: clamd creates file in two locations: > > > > - PidFile > > - LocalSocket > > > > Both the locations could be checked by rc.d script in clamd.conf (also freshclam eventually) and respective directories can be created from within start_precmd() > > > > otis > > > > -- > > Juraj Lutter > > o...@freebsd.org > > > As stated before in this thread, replacing /var/run with tmpfs is not a > supported configuration.
However if users wish to replace /var/run > with tmpfs they can create an rc script (I put my extra rc scripts in > /etc/local/rc.d) to create the hierarc > If one does this they can either use mtree(1) to create the hierarchy > or simply take a snapshot (find /var/run -type d | cpio -o > > /etc/local/my_var_run.cpio), having their rc script recreate the > hierarchy using cpio -i < /etc/local/my_var_run.cpio). And > be periodically updated the archive as needed, probably through a > shutdown script. > > One will notice that /etc/mtree/BSD.var.dist shows us what is created > in /var/run by default during installworld. > > The change requested is not specifically for an individual port but > essentially a FreeBSD-wide infrastructure change. I don't think this > is reasonable without a lot of consideration about what will be broken > during the process of changing build and boot processes and the > potential POLA fallout from such a change. A change like this needs to > be architected. > > I don't think this is the mailing list to discuss this topic. This > should be discussed on ports@. Not here. Maybe it should be moved there > as this is a ports not a base O/S issue. This will resolve the problem:

#!/bin/sh

# PROVIDE: kq-var-run
# REQUIRE: zfs tmp
# BEFORE: FILESYSTEMS

. /etc/rc.subr

name=kq_var_run
rcvar=kq_var_run_enable
extra_commands="update create"
start_cmd="kq_var_run_start"
create_cmd="kq_var_run_create"
update_cmd="kq_var_run_create"
# stop_cmd="kq_var_run_create"

load_rc_config $name

# Set defaults
: ${kq_var_run_enable:="NO"}
: ${kq_var_run_mtree:="/etc/local/mtree/KQ.var-run.mtree"}

# Recreate the saved tree, but only when /var/run is a tmpfs mount.
# -U is needed for mtree to create missing directories; without it
# mtree only reports the differences.
kq_var_run_start()
{
	df -ttmpfs /var/run > /dev/null 2>&1 &&
	    mtree -U -f ${kq_var_run_mtree} -p /var/run
}

# Save the current directory tree of /var/run as an mtree spec.
kq_var_run_create()
{
	mtree -cbdj -p /var/run > ${kq_var_run_mtree}
}

run_rc_command "$1"

A person could add stop_cmd="kq_var_run_create" to save the /var/run mtree at shutdown instead of doing it manually. Works with tmpfs /var/run.
-- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
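Juraj Lutter's suggestion quoted in this message (read PidFile and LocalSocket from clamd.conf and create their directories from start_precmd) might look something like the sketch below. The function names and the way the configuration path is passed in are hypothetical, and setting ownership and permissions, which a real port would also need to do, is left out.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "Key value" directive from a
# clamd.conf-style file. Commented-out lines do not match because their
# first field starts with "#".
conf_value()
{
	awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Hypothetical precmd body: make sure the directories that will hold the
# pid file and the local socket exist before the daemon is started.
create_runtime_dirs()
{
	conf=$1
	for key in PidFile LocalSocket; do
		path=$(conf_value "$key" "$conf")
		[ -n "$path" ] || continue	# directive not set in this file
		mkdir -p "$(dirname "$path")" || return 1
	done
}
```

In an rc.d script this could be wired up by pointing start_precmd at a function that calls create_runtime_dirs with the port's configuration file.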
Re: security/clamav: /var/run on TMPFS renders the port broken by design
On Sat, 27 Aug 2022 15:38:44 +0200 Juraj Lutter wrote: > > On 27 Aug 2022, at 15:27, Michael Gmelin wrote: > > > > > > > >> On 27. Aug 2022, at 15:18, free...@oldach.net wrote: > >> > >> Michael Gmelin wrote on Sat, 27 Aug 2022 15:02:04 +0200 (CEST): > >>> (you're removing /var/run, which shouldn't be removed > >> > >> Not quite. It's actually not uncommon to boot with an empty /var. Please > >> see /etc/rc.d/var and related. > > > > That’s a good point. > > > >> The request that ports/packages should consider this case is not exactly > >> unreasonable IMO. > >> > > > > If I was the maintainer, I would simply add the code to create the > > directory for robustness sake (I for one deleted subdirs in /var/run more > > than once and would expect a port to fix this on restart, also to make sure > > correct permissions are applied). But since it doesn’t seem like this is > > going to happen, adding a custom rc file would be a viable short term > > workaround for the requester. > > > > I like the idea of having something like tmpfiles.d, it would also help > > port maintainers (could also be done as a port). > > > > As I have stated in one of those PR: clamd creates file in two locations: > > - PidFile > - LocalSocket > > Both the locations could be checked by rc.d script in clamd.conf (also > freshclam eventually) and respective directories can be created from within > start_precmd() > > otis > > — > Juraj Lutter > o...@freebsd.org > As stated before in this thread, replacing /var/run with tmpfs is not a supported configuration. However if users wish to replace /var/run with tmpfs they can create an rc script (I put my extra rc scripts in /etc/local/rc.d) to create the hierarc If one does this they can either use mtree(1) to create the hierarchy or simply take a snapshot (find /var/run -type d | cpio -o > /etc/local/my_var_run.cpio), having their rc script recreate the hierarchy using cpio -i < /etc/local/my_var_run.cpio). 
The archive would then need to be updated periodically, probably through a shutdown script. One will notice that /etc/mtree/BSD.var.dist shows us what is created in /var/run by default during installworld. The change requested is not specifically for an individual port but essentially a FreeBSD-wide infrastructure change. I don't think this is reasonable without a lot of consideration about what will be broken during the process of changing build and boot processes and the potential POLA fallout from such a change. A change like this needs to be architected. I don't think this is the mailing list to discuss this topic. This should be discussed on ports@. Not here. Maybe it should be moved there as this is a ports not a base O/S issue. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0
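The snapshot idea in this message can be illustrated without cpio or mtree by saving a plain list of directories and recreating them with mkdir -p. This is a simplified stand-in, not the cpio command from the message: it does not preserve ownership or modes, it does not handle directory names containing newlines, and the function names and list-file format are made up for the example.

```shell
#!/bin/sh
# Save the directory hierarchy below $1 as a list of relative paths in $2.
save_dir_tree()
{
	(cd "$1" && find . -type d) > "$2"
}

# Recreate the hierarchy listed in $2 below $1.
restore_dir_tree()
{
	while IFS= read -r dir; do
		mkdir -p "$1/$dir" || return 1
	done < "$2"
}
```

A shutdown script would call save_dir_tree on /var/run, and a boot-time script would call restore_dir_tree once the tmpfs is mounted.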
Re: DTrace Error
In message <20220725153706.a2bb...@slippy.cwsent.com>, Cy Schubert writes: > In message , Mark Johnston writes: > > On Sun, Jul 24, 2022 at 10:07:19AM -0700, Cy Schubert wrote: > > > In message <20220724030857.b57f...@slippy.cwsent.com>, Cy Schubert writes > : > > > > In message <20220723185533.9ea7d...@slippy.cwsent.com>, Cy Schubert wri > te > > s: > > > > > In message , Mark Johnston writes: > > > > > > On Sat, Jul 23, 2022 at 07:14:44AM -0700, Cy Schubert wrote: > > > > > > > In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schuber > t > > writ > > > > es > > > > > : > > > > > > > > I'm not sure if this is because my obj tree needs a fresh rebui > ld > > and > > > > > > > > > > > > reinstall or if this is a legitimate problem. Regardless of the > d > > trac > > > > e > > > > > > > > command entered, whether it be a fbt or sdt, the following erro > r > > occu > > > > rs > > > > > : > > > > > > > > > > > > > > > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("ent > er > > ing > > > > > > > > ieee80211_vap_setup\n"); }' > > > > > > > > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry > { > > > > > > > > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/p > si > > nfo. > > > > d" > > > > > , > > > > > > > > line 1: failed to copy type of 'pr_gid': Conflicting type is al > re > > ady > > > > de > > > > > fi > > > > > > ned > > > > > > > > slippy# > > > > > > > > > > > > > > > > Old DTrace scripts I've used months or even years ago also fail > w > > ith > > > > th > > > > > e > > > > > > > > same error. It's not this one probe. All probes result in the p > r_ > > gid > > > > er > > > > > ro > > > > > > r. > > > > > > > > > > > > > > > > I'm currently rebuilding my "prod" tree from scratch with the h > op > > e th > > > > at > > > > > > > > > > > > > it's simply something out of sync. But, should it not be, has a > ny > > one > > > > el > > > > > se > > > > > > > > > > > > > > encountered this lately? 
> > > > > > > > > > > > > > A full clean rebuild and installworld/kernel did not change the r > es > > ult. > > > > > > > > > > > This is a new problem. > > > > > > > > > > > > I don't see any such problem on a system built from commit 151abc80 > cd > > e, > > > > > > using GENERIC. Are you using a custom kernel config? Which kernel > > > > > > modules do you have loaded? > > > > > > > > > > [...] > > > > > > > > chuck@ emailed me privately suggesting a roll back to cb2ae6163174. The > > > > > problem is fixed. I'm creating a special branch that reverts only the l > lv > > m > > > > commits since then. > > > > > > llvm 14 is not the problem. There must be something else after cb2ae61631 > 74 > > > > > that is causing the regression. > > > > Are you able to bisect? I spent a bit of time trying to replicate the > > problem based on your kernel config, without any luck yet. > > How fortuitous is this email. I just rebooted my sandbox again and > discovered this is related to non-INVARIANT kernels. Enabling INVARIANTS > "fixes" dtrace. There must be some commit since cb2ae6163174 that affected > non-INVARIANT kernels. As to which one, I'm not sure yet. The commit that introduced the regression to non-INVARIANT kernels is 2449b9e5fe565be757a4b29093fd1c9c6ffcf3c9. Looking at the diff I don't see how it caused the problem but reverting it locally addresses the regression. (Of course one needs to disable building the mac_ddb module in order to have the build succeed.) Without looking at it closer, I suspect that dtrace could be sensitive to one of the struct changes. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
Re: DTrace Error
In message , Mark Johnston writes: > On Sun, Jul 24, 2022 at 10:07:19AM -0700, Cy Schubert wrote: > > In message <20220724030857.b57f...@slippy.cwsent.com>, Cy Schubert writes: > > > In message <20220723185533.9ea7d...@slippy.cwsent.com>, Cy Schubert write > s: > > > > In message , Mark Johnston writes: > > > > > On Sat, Jul 23, 2022 at 07:14:44AM -0700, Cy Schubert wrote: > > > > > > In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schubert > writ > > > es > > > > : > > > > > > > I'm not sure if this is because my obj tree needs a fresh rebuild > and > > > > > > > > > > reinstall or if this is a legitimate problem. Regardless of the d > trac > > > e > > > > > > > command entered, whether it be a fbt or sdt, the following error > occu > > > rs > > > > : > > > > > > > > > > > > > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("enter > ing > > > > > > > ieee80211_vap_setup\n"); }' > > > > > > > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { > > > > > > > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psi > nfo. > > > d" > > > > , > > > > > > > line 1: failed to copy type of 'pr_gid': Conflicting type is alre > ady > > > de > > > > fi > > > > > ned > > > > > > > slippy# > > > > > > > > > > > > > > Old DTrace scripts I've used months or even years ago also fail w > ith > > > th > > > > e > > > > > > > same error. It's not this one probe. All probes result in the pr_ > gid > > > er > > > > ro > > > > > r. > > > > > > > > > > > > > > I'm currently rebuilding my "prod" tree from scratch with the hop > e th > > > at > > > > > > > > > > > it's simply something out of sync. But, should it not be, has any > one > > > el > > > > se > > > > > > > > > > > > encountered this lately? > > > > > > > > > > > > A full clean rebuild and installworld/kernel did not change the res > ult. > > > > > > > > > This is a new problem. 
> > > > > > > > > > I don't see any such problem on a system built from commit 151abc80cd > e, > > > > > using GENERIC. Are you using a custom kernel config? Which kernel > > > > > modules do you have loaded? > > > > > > > > [...] > > > > > > chuck@ emailed me privately suggesting a roll back to cb2ae6163174. The > > > problem is fixed. I'm creating a special branch that reverts only the llv > m > > > commits since then. > > > > llvm 14 is not the problem. There must be something else after cb2ae6163174 > > > that is causing the regression. > > Are you able to bisect? I spent a bit of time trying to replicate the > problem based on your kernel config, without any luck yet. How fortuitous is this email. I just rebooted my sandbox again and discovered this is related to non-INVARIANT kernels. Enabling INVARIANTS "fixes" dtrace. There must be some commit since cb2ae6163174 that affected non-INVARIANT kernels. As to which one, I'm not sure yet. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
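Since the failure turned out to depend on whether INVARIANTS is enabled, a quick way to check a kernel configuration listing for an option is sketched below. The helper name has_kernel_option is hypothetical; it simply scans "options NAME" lines of the kind a kernel built with INCLUDE_CONFIG_FILE embeds.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel config text on stdin contains
# "options NAME" (or "options NAME=value") for the option given as $1.
has_kernel_option()
{
	awk -v opt="$1" '
		$1 == "options" {
			split($2, kv, "=")	# strip an optional =value part
			if (kv[1] == opt)
				found = 1
		}
		END { exit(found ? 0 : 1) }
	'
}
```

On a running system the listing could be fed in with something like `sysctl -n kern.conftxt | has_kernel_option INVARIANTS`, assuming the kernel was built with INCLUDE_CONFIG_FILE.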
Re: DTrace Error
In message <20220724030857.b57f...@slippy.cwsent.com>, Cy Schubert writes: > In message <20220723185533.9ea7d...@slippy.cwsent.com>, Cy Schubert writes: > > In message , Mark Johnston writes: > > > On Sat, Jul 23, 2022 at 07:14:44AM -0700, Cy Schubert wrote: > > > > In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schubert writ > es > > : > > > > > I'm not sure if this is because my obj tree needs a fresh rebuild and > > > > > > reinstall or if this is a legitimate problem. Regardless of the dtrac > e > > > > > command entered, whether it be a fbt or sdt, the following error occu > rs > > : > > > > > > > > > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("entering > > > > > ieee80211_vap_setup\n"); }' > > > > > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { > > > > > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psinfo. > d" > > , > > > > > line 1: failed to copy type of 'pr_gid': Conflicting type is already > de > > fi > > > ned > > > > > slippy# > > > > > > > > > > Old DTrace scripts I've used months or even years ago also fail with > th > > e > > > > > same error. It's not this one probe. All probes result in the pr_gid > er > > ro > > > r. > > > > > > > > > > I'm currently rebuilding my "prod" tree from scratch with the hope th > at > > > > > > > it's simply something out of sync. But, should it not be, has anyone > el > > se > > > > > > > > encountered this lately? > > > > > > > > A full clean rebuild and installworld/kernel did not change the result. > > > > > This is a new problem. > > > > > > I don't see any such problem on a system built from commit 151abc80cde, > > > using GENERIC. Are you using a custom kernel config? Which kernel > > > modules do you have loaded? > > > > [...] > > chuck@ emailed me privately suggesting a roll back to cb2ae6163174. The > problem is fixed. I'm creating a special branch that reverts only the llvm > commits since then. llvm 14 is not the problem. 
There must be something else after cb2ae6163174 that is causing the regression. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
Re: DTrace Error
In message <20220723185533.9ea7d...@slippy.cwsent.com>, Cy Schubert writes: > In message , Mark Johnston writes: > > On Sat, Jul 23, 2022 at 07:14:44AM -0700, Cy Schubert wrote: > > > In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schubert writes: > > > > I'm not sure if this is because my obj tree needs a fresh rebuild and > > > > reinstall or if this is a legitimate problem. Regardless of the dtrace > > > > command entered, whether it be a fbt or sdt, the following error occurs: > > > > > > > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("entering > > > > ieee80211_vap_setup\n"); }' > > > > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { > > > > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psinfo.d", > > > > line 1: failed to copy type of 'pr_gid': Conflicting type is already defined > > > > slippy# > > > > > > > > Old DTrace scripts I've used months or even years ago also fail with the > > > > same error. It's not this one probe. All probes result in the pr_gid error. > > > > > > > > I'm currently rebuilding my "prod" tree from scratch with the hope that > > > > it's simply something out of sync. But, should it not be, has anyone else > > > > encountered this lately? > > > > > > A full clean rebuild and installworld/kernel did not change the result. > > > This is a new problem. > > > > I don't see any such problem on a system built from commit 151abc80cde, > > using GENERIC. Are you using a custom kernel config? Which kernel > > modules do you have loaded? > > [...] chuck@ emailed me privately suggesting a roll back to cb2ae6163174. The problem is fixed. I'm creating a special branch that reverts only the llvm commits since then. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
Re: DTrace Error
In message , Mark Johnston writes: > On Sat, Jul 23, 2022 at 07:14:44AM -0700, Cy Schubert wrote: > > In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schubert writes: > > > I'm not sure if this is because my obj tree needs a fresh rebuild and > > > reinstall or if this is a legitimate problem. Regardless of the dtrace > > > command entered, whether it be a fbt or sdt, the following error occurs: > > > > > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("entering > > > ieee80211_vap_setup\n"); }' > > > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { > > > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psinfo.d", > > > line 1: failed to copy type of 'pr_gid': Conflicting type is already defined > > > slippy# > > > > > > Old DTrace scripts I've used months or even years ago also fail with the > > > same error. It's not this one probe. All probes result in the pr_gid error. > > > > > > I'm currently rebuilding my "prod" tree from scratch with the hope that > > > it's simply something out of sync. But, should it not be, has anyone else > > > encountered this lately? > > > > A full clean rebuild and installworld/kernel did not change the result. > > This is a new problem. > > I don't see any such problem on a system built from commit 151abc80cde, > using GENERIC. Are you using a custom kernel config? Which kernel > modules do you have loaded? The kernel config is custom.
Here is what is reported by the kernel through strings: options CONFIG_AUTOGENERATED makeoptions WITH_CTF=1 makeoptions DEBUG=-g options BREAK_TO_DEBUGGER options SW_WATCHDOG options DIRECTIO options KDB_UNATTENDED options IICHID_SAMPLING options HID_DEBUG options EVDEV_SUPPORT options USB_DEBUG options ATH_ENABLE_11N options AH_AR5416_INTERRUPT_MITIGATION options IEEE80211_SUPPORT_MESH options IEEE80211_DEBUG options SC_PIXEL_MODE options PPS_SYNC options COMPAT_LINUXKPI options PCI_IOV options PCI_HP options IOMMU options EARLY_AP_STARTUP options SMP options NETGDB options NETDUMP options DEBUGNET options ZSTDIO options GZIO options EKCD options VERBOSE_SYSINIT=0 options MALLOC_DEBUG_MAXZONES=8 options QUEUE_MACRO_DEBUG_TRASH options DEADLKRES options GDB options FULL_BUF_TRACKING options DDB options BUF_TRACKING options KDB_TRACE options KDB options RCTL options RACCT_DEFAULT_TO_DISABLED options RACCT options INCLUDE_CONFIG_FILE options DDB_CTF options KDTRACE_HOOKS options KDTRACE_FRAME options MAC options CAPABILITIES options CAPABILITY_MODE options AUDIT options HWPMC_HOOKS options KBD_INSTALL_CDEV options PRINTF_BUFR_SIZE=128 options _KPOSIX_PRIORITY_SCHEDULING options SYSVSEM options SYSVMSG options SYSVSHM options STACK options KTRACE options SCSI_DELAY=5000 options COMPAT_FREEBSD13 options COMPAT_FREEBSD12 options COMPAT_FREEBSD11 options COMPAT_FREEBSD10 options COMPAT_FREEBSD9 options COMPAT_FREEBSD7 options COMPAT_FREEBSD6 options COMPAT_FREEBSD5 options COMPAT_FREEBSD4 options COMPAT_FREEBSD32 options EFIRT options GEOM_LABEL options GEOM_RAID options TMPFS options PSEUDOFS options PROCFS options CD9660 options MSDOSFS options NFS_ROOT options NFSLOCKD options NFSD options NFSCL options MD_ROOT options QUOTA options UFS_GJOURNAL options UFS_DIRHASH options UFS_ACL options SOFTUPDATES options FFS options KERN_TLS options SCTP_SUPPORT options TCP_RFC7413 options TCP_HHOOK options TCP_BLACKBOX options TCP_OFFLOAD options FIB_ALGO options ROUTE_MPATH 
options IPSEC_SUPPORT options INET6 options INET options VIMAGE options PREEMPTION options NUMA options SCHED_ULE options NEW_PCIB options CC_NEWRENO options GEOM_PART_GPT options GEOM_PART_MBR options GEOM_PART_EBR options GEOM_PART_BSD options KDB_TRACE device isa device mem device io device uart_ns8250 device acpi device pci device fdc device ahci device ata device siis device ahc device scbus device ch device da device sa device cd device pass device ses device atkbdc device atkbd device psm device vga device splash device sc device vt device vt_vga device vt_efifb device vt_vbefb device uart device puc device iflib device igc device axp device miibus device crypto device aesni device loop device padlock_rng device rdrand_rng device ether device md device firmware device xz device bpf device uhci device ohci device ehci device xhci device usb device ukbd device umass device virtio device virtio_pci device vtnet device virtio_blk device virtio_scsi device virtio_balloon device kvm_clock device xentimer device evdev device uinput device hid Kernel modules are: slippy# kldstat Id Refs AddressSize Name 1 185 0x8020 10290a8 kernel 21 0x8122a000 36c
Re: DTrace Error
In message <20220723035223.57cd...@slippy.cwsent.com>, Cy Schubert writes: > I'm not sure if this is because my obj tree needs a fresh rebuild and > reinstall or if this is a legitimate problem. Regardless of the dtrace > command entered, whether it be a fbt or sdt, the following error occurs: > > slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("entering > ieee80211_vap_setup\n"); }' > dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { > printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psinfo.d", > line 1: failed to copy type of 'pr_gid': Conflicting type is already defined > slippy# > > Old DTrace scripts I've used months or even years ago also fail with the > same error. It's not this one probe. All probes result in the pr_gid error. > > I'm currently rebuilding my "prod" tree from scratch with the hope that > it's simply something out of sync. But, should it not be, has anyone else > encountered this lately? A full clean rebuild and installworld/kernel did not change the result. This is a new problem. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
DTrace Error
I'm not sure if this is because my obj tree needs a fresh rebuild and reinstall or if this is a legitimate problem. Regardless of the dtrace command entered, whether it be a fbt or sdt, the following error occurs:

slippy# dtrace -n 'fbt::ieee80211_vap_setup:entry { printf("entering ieee80211_vap_setup\n"); }'
dtrace: invalid probe specifier fbt::ieee80211_vap_setup:entry { printf("entering ieee80211_vap_setup\n"); }: "/usr/lib/dtrace/psinfo.d", line 1: failed to copy type of 'pr_gid': Conflicting type is already defined
slippy#

Old DTrace scripts I've used months or even years ago also fail with the same error. It's not this one probe. All probes result in the pr_gid error.

I'm currently rebuilding my "prod" tree from scratch with the hope that it's simply something out of sync. But, should it not be, has anyone else encountered this lately?

-- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
Re: Loader can't find /boot/ua/loader.lua on UFS after main-n255828-18054d0220c
In message , Warner Losh writes: > --8495bd05e03b4d42 > Content-Type: text/plain; charset="UTF-8" > Content-Transfer-Encoding: quoted-printable > > On Mon, May 30, 2022 at 8:14 AM Toomas Soome wrote: > > > > > > > On 30. May 2022, at 17:06, Warner Losh wrote: > > > > > > > > On Mon, May 30, 2022 at 4:26 AM David Wolfskill > > wrote: > > > >> On Mon, May 30, 2022 at 08:40:10AM +0300, Toomas Soome wrote: > >> > ... > >> > Does loader_4th have same issue? > >> > > >> > >> I don't know; I hadn't tried it. I will do so later today & report > >> back. > >> > > > > So if it's only one system, and it's only UFS, then what does fsck of tha= > t > > UFS system tell you? > > The loader can't find its UFS filesystem to read the configuration from. > > So either its having trouble > > finding the device (unlikely since that code hasn't changed in a long > > time), or its having heartburn > > with the UFS system for some reason it's being silent about (within the > > realm of possibilities because > > there might be an unknown edge case in Kirks recent UFS integrity > > changes). I suspect that the 4th > > boot loader will have the same issue, but a different error message. > > > > Others have reported issues with GELI, but that's not in play here, If I'= > m > > reading this correctly. Right? > > > > Warner > > > > > > Ye, thats why I was asking about loader_4th. I=E2=80=99m trying to spot t= > he issue > > from ufs image sample. > > > > I thought it was a good suggestion. My guess on it not working wasn't to > imply it wasn't. Backing out 076002f24d35962f0d21f44bfddd34ee4d7f015d restored the one machine of mine that did have the problem. The other three were fine with that commit. To summarize, things I tried: - Reinstall all boot blocks. - set currdev to my USB rescue disk, ls works, boots fine - boot from my USB rescue disk, set currdev to the boot disk, boots - boot from the USB rescue disk, copy /boot/loader* to the boot disk, works around the problem. 
- Reverting 076002f24d35962f0d21f44bfddd34ee4d7f015d resolves the problem.

Other data points: My three AMD machines on Asus motherboards had no problem with the commit. My Acer laptop with an Intel CPU suffered the same problem. Could it be that malloc() worked on the Asus/AMD machines while it failed on the Acer/Intel laptop?

If my hunch that this may be caused by a malloc() failure is right, would it be a good idea to print a nasty warning when malloc failures occur in the loader? Silently failing and producing weird behaviour is more of a POLA violation than a nasty message. If not this, a loader variable to enable verbose messages might help in debugging these kinds of problems. Again, this assumes that my hunch about a malloc() failure is what actually happened.

-- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org NTP: Web: https://nwtime.org e**(i*pi)+1=0
Re: Considering stepping down from all of my FreeBSD responsibilities
In message <20220401064816.gs60...@eureka.lemis.com>, Greg 'groggy' Lehey writes: > On Friday, 1 April 2022 at 5:58:39 +, Alexey Dokuchaev wrote: > > On Fri, Apr 01, 2022 at 02:20:31PM +0900, Yasuhiro Kimura wrote: > >> Hi Glen, > >> > >> From: Glen Barber > >> Subject: Considering stepping down from all of my FreeBSD responsibilities > >> Date: Fri, 1 Apr 2022 00:15:02 + > >> > >>> Dear community, > >>> > >>> Given the mental toll the past two years or so have taken on me, I have > >>> decided to step down from all of my "hats" within the Project, and take > >>> some time to sort out what my future looks like going forward. > >>> > >>> Happy April 1st. I'm not going anywhere. :-) > >> > >> We are waiting for the announce of FreeBSD 2.2.10-RELEASE. :-) > >> > >> Cf. https://lists.freebsd.org/pipermail/freebsd-announce/2006-April/001055.html > > > > I don't think 2.2.10 is warranted. > > Agreed. The upgrade isn't sufficiently important. > > How about 2.2.9.1? I had a different, more sinister thought: Announcing that we've moved from BSDL to GPLv3 to be more like Linux. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: Deprecating ISA sound cards
In message <20220319022405.ga29...@lonesome.com>, Mark Linimon writes: > Anyone objecting to this, be careful, I might ship a pile of such > things to you from the depths of the closets :-) <<=1 -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: DTrace Brokenness [Solved]
A full clean build resolved the problem. It was likely some incompatible CTF or possibly some other patch that touched DTrace that left my obj tree in an inconsistent state. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few. In message <20220318234704.6c14...@slippy.cwsent.com>, Cy Schubert writes: > It's been a while (~ 4-6 months) since I've last used dtrace. Needing to > use it again today scripts that worked before fail to. > > A first example: > > cwfw# cat dt10.d > #!/usr/sbin/dtrace -s > > fbt::ipf_check:entry { > parintf("%x\n", (void) arg[1]); > } > > cwfw# > > Results in this error: > > cwfw# ./dt10.d > dtrace: failed to compile script ./dt10.d: "/usr/lib/dtrace/psinfo.d", line > 1: cannot translate from "struct thread *" to "lwpsinfo_t *" > cwfw# > > Another example, > > slippy# cat dtrace.d > #!/usr/sbin/dtrace -s > > fbt::uma_reclaim:entry { > printf("in uma_reclaim\n"); > } > slippy# > > Results in the same error: > > slippy# ./dtrace.d > dtrace: failed to compile script ./dtrace.d: "/usr/lib/dtrace/psinfo.d", > line 1: cannot translate from "struct thread *" to "lwpsinfo_t *" > slippy# > > > A variation of the second example, > > slippy# cat dtrace.sh > #!/bin/sh - > dtrace -n 'fbt::uma_reclaim:entry { printf("in uma_reclaim\n"); }' > slippy# > > Results in two errors, the first being that the -n option results in an > invalid probe specified and the second being the struct thread * error. > > slippy# ./dtrace.sh > dtrace: invalid probe specifier fbt::uma_reclaim:entry { printf("in > uma_reclaim\n"); }: "/usr/lib/dtrace/psinfo.d", line 1: cannot translate > from "struct thread *" to "lwpsinfo_t *" > slippy# > > I'm not sure if this is related to 2d5d2a986ce or something else. > > > -- > Cheers, > Cy Schubert > FreeBSD UNIX: Web: https://FreeBSD.org > NTP: Web: https://nwtime.org > > The need of the many outweighs the greed of the few. > > >
DTrace Brokenness
It's been a while (~ 4-6 months) since I last used dtrace. Needing to use it again today, I find that scripts that worked before now fail.

A first example:

cwfw# cat dt10.d
#!/usr/sbin/dtrace -s

fbt::ipf_check:entry {
	parintf("%x\n", (void) arg[1]);
}

cwfw#

Results in this error:

cwfw# ./dt10.d
dtrace: failed to compile script ./dt10.d: "/usr/lib/dtrace/psinfo.d", line 1: cannot translate from "struct thread *" to "lwpsinfo_t *"
cwfw#

Another example,

slippy# cat dtrace.d
#!/usr/sbin/dtrace -s

fbt::uma_reclaim:entry {
	printf("in uma_reclaim\n");
}
slippy#

Results in the same error:

slippy# ./dtrace.d
dtrace: failed to compile script ./dtrace.d: "/usr/lib/dtrace/psinfo.d", line 1: cannot translate from "struct thread *" to "lwpsinfo_t *"
slippy#

A variation of the second example,

slippy# cat dtrace.sh
#!/bin/sh -
dtrace -n 'fbt::uma_reclaim:entry { printf("in uma_reclaim\n"); }'
slippy#

Results in two errors, the first being that the -n option results in an invalid probe specifier and the second being the struct thread * error.

slippy# ./dtrace.sh
dtrace: invalid probe specifier fbt::uma_reclaim:entry { printf("in uma_reclaim\n"); }: "/usr/lib/dtrace/psinfo.d", line 1: cannot translate from "struct thread *" to "lwpsinfo_t *"
slippy#

I'm not sure if this is related to 2d5d2a986ce or something else.

-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
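Every error in this report names "/usr/lib/dtrace/psinfo.d", line 1, rather than a line in the script being run, which suggests the failure happens while dtrace compiles its system-wide D library files and not in the probe that was requested. A minimal sketch of that triage step follows. The helper name dtrace_error_origin is hypothetical and not part of DTrace; it only pattern-matches the message text, and the assumption that library files live under /usr/lib/dtrace is taken from the messages in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of DTrace): classify a dtrace(1) error
# message by where the compile failure was reported.
#   "library" - the message quotes a file under /usr/lib/dtrace/
#   "script"  - anything else (assumed to be the user's own script)
dtrace_error_origin() {
    case "$1" in
        *'"/usr/lib/dtrace/'*'.d"'*) echo "library" ;;
        *) echo "script" ;;
    esac
}
```

If the origin is "library", the probe and script are probably not at fault, and checking that world, kernel and CTF data were built from the same sources is a reasonable next step.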
Re: Dragonfly Mail Agent (dma) in the base system
In message , David Chisnall writes: > On 30/01/2022 14:01, michael.osi...@siemens.com wrote: > > Sendmail: The biggest problem is that authentication strictly requires > > Cyrus SASL, even for stupid ones like PLAIN/LOGIN, according to the > > handbook you must recompile sendmail from base with Cyrus SASL from > > ports to make this possible. A showstopper actually, for two reasons: > > 1. I don't like mixing base and ports, it just creates a messy system. > > 2. While this may work with hosts, when you have jails running off a > > RELEASE in Bastille this obviously will not work. > > Not going to work with sendmail easily. > > I think this is a critical point: at the moment, we're paying the cost > of having a full-featured MTA in the base system, without getting most > of the benefits. Around 2003, I hit exactly this problem. The > instructions after update were slightly terrifying: after each base > system or ports update, I potentially had to recompile my own sendmail. > > There's now a sendmail+sasl configuration in packages and so I was > incredibly happy to be able to move away from using sendmail in base. > Now I have two copies of sendmail on some machines. The one in ports, > for compatibility reasons, looks for config in /etc/mail not under > LOCALBASE, which is a layering violation and means that freebsd-update > periodically tries to corrupt my config. > > I have no strong opinions about where we move to, but moving *from* > shipping a limited sendmail in base would make me very happy. I'd like to add, proceed cautiously. I've been running postfix on my external gateway for a couple of decades but recently migrated all but one of my internal machines from sendmail to postfix. There were a couple of hiccups along the way. In one case there was a mail loop of at(1) jobs which required the tweak of a procmail rule. In the second case nmh submits mail to localhost:587 requiring altering master.cf. 
nmh uses only that port though it can pipe directly to the sendmail binary when built that way. If dma doesn't support SMTP submission, we may need to review various port default options or whether ports even support it. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib
d. > > https://ci.freebsd.org and https://ci.freebsd.org show > successful builds at this point. > > > It looks like Cy may need to report more about the context > for the reported build failure. It was a NO_CLEAN build. A CLEAN build resolved it. There were no mods to this, my prod tree, except for some upcoming ipfilter commits intended for the new year. One would think a META_MODE build would also fail if NO_CLEAN fails. Sorry for the late reply. There are other things here that needed some urgent attention. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: current now panics when starting VBox VM
In message <36923A7F-23DE-490D-B1FA-A8B064740BD6@unrelenting.technology>, Greg V via freebsd-current writes: > > > On November 2, 2021 5:16:35 PM GMT+03:00, Michael Butler via freebsd-current wrote: > >On current as of this morning (I haven't tried to bisect yet) .. > > > > .. with either graphics/drm-devel-kmod or graphics/drm-current-kmod, > >trying to start a VirtualBox VM triggers this panic .. > > > > >#16 0x80c81fc8 at calltrap+0x8 > >#17 0x808b4d69 at sysctl_kern_proc_pathname+0xc9 > > something something https://reviews.freebsd.org/D32738 ? sysctl_kern_proc_pathname was touched recently there. > > (Also can someone commit https://reviews.freebsd.org/D30174 ? These warning-filled reports are unreadable >_<) Usually the first thing to do with virtualbox is rebuild it. That usually fixes any panics I experience here. Of course, make sure your virtualbox ports subdirs are fully patched, as it's an opportune time to update it too. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: [HEADSUP] making /bin/sh the default shell for root
In message <20210922083645.4vnoajyvwq6wf...@aniel.nours.eu>, Baptiste Daroussin writes: > Hello, > > TL;DR: this is not a proposal to deorbit csh from base!!! > > For years now, csh is the default root shell for FreeBSD, csh can be confusing > as a default shell for many as all other unix like settled on a bourne shell > compatible interactive shell: zsh, bash, or variant of ksh. > > Recently our sh(1) has received updates to make it more user friendly in > interactive mode: > * command completion (thanks pstef@) > * improvement in the emacs mode, to make it behave by default like other shells > * improvement in the vi mode (in particular the vi edit to respect $EDITOR) > * support for history as described by POSIX. > > This makes it a usable shell by default, which is why I would like to propose to > make it the default shell for root starting FreeBSD 14.0-RELEASE (not MFCed) > > If no strong arguments have been raised by October 15th, I will make this > proposal happen. > > Again just in case: THIS IS NOT A PROPOSAL TO REMOVE CSH FROM BASE! Having used /bin/sh as my root shell on all my FreeBSD machines, here and at $JOB, except for only one, I feel this is perfectly reasonable. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message <86ilzaoc0z@shiori.com.br>, Filipe da Silva Santos via current writes:
> Hi, thank you for the support, and sorry for the late feedback.
>
> The procedure `wpa_poststart' didn't solve the regression on my system.
> wlan0 doesn't seem to come up.
>
> I have the following output on my log:
>
> | Sep 8 19:09:43 misaka wpa_supplicant[23325]: ioctl[SIOCS80211, op=103, val=0, arg_len=128]: Operation now in progress
> | Sep 8 19:09:43 misaka wpa_supplicant[23325]: wlan0: CTRL-EVENT-SCAN-FAILED ret=-1 retry=1
>
> Here is a sanitized version of my wpa_supplicant.conf:
>
> ctrl_interface=/var/run/wpa_supplicant
> eapol_version=1
> ap_scan=1
> fast_reauth=1
> country=BR
> network={
> ssid=""
> psk=""
> priority=5
> }
> [...]
>
> and rc.conf related settings:
>
> [...]
> ifconfig_wlan0="WPA powersave 10.0.0.110 netmask 0xff00 broadcast 10.0.0.255"
> defaultrouter="10.0.0.1"
>
> wlans_iwm0="wlan0"
> create_args_wlan0="country BR regdomain FCC"
> [...]
>
> The last fix still works, although `sleep' isn't necessary.
>
> @@ -29,6 +29,7 @@
> }
>
> wpa_poststart() {
> + ifconfig ${ifn} down
> ifconfig ${ifn} up
> }
>

d06d7eb09131 has taken care of this.

-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
Sorry for the breakage. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few. In message , Idwer Vollering writes: > I can confirm this commit has addressed the wpa_supplicant 'breakage'. > Thanks for fixing it. > > On Tue, 7 Sep 2021 at 19:37, Cy Schubert wrote: > > > > In message , David Wolfskill writes: > > > > -- > > Cheers, > > Cy Schubert > > FreeBSD UNIX: Web: https://FreeBSD.org > > NTP: Web: https://nwtime.org > > > > The need of the many outweighs the greed of the few. > > > On Tue, Sep 07, 2021 at 10:13:23AM +0200, Jakob Alvermark wrote: > > > > ... > > > > wlan0 does not associate after boot. (This is with iwm, AC 9260) > > > > > > > > My workaround is simply 'ifconfig wlan0 up'. > > > > > > > > After a few seconds wpa_supplicant associates and another few seconds > > > > later I have a DHCP IP address. > > > > > > > > > > I just tried that (running main-n249159-bb61ccd530b7), and that (also) > > > works for me -- in case that data point is of use. > > > > Hi, > > > > Commit 5fcdc19a8111 has addressed this. > > > > > > > >
Re: killall, symlinks, and signal delivery?
On September 7, 2021 3:42:53 PM PDT, Steve Kargl wrote:
>I have stumbled upon a quandary, which I hope someone
>can shed some light upon. In my day job, I often
>generate a sequence of images and display these images
>with ImageMagick's display command. From my csh prompt,
>a quick and dirty foreach() loop
>
>% foreach i (*.png)
>> display $i &
>> sleep 3
>> end
>
>Instead of moving the cursor to each image and hitting
>'q' to close the images, I normally kill all of the
>processes at one time. This used to work:
>
>% killall display
>
>Now I get, for example,
>
>% display z.miff &
>% killall display
>No matching processes belonging to you were found
>% ps -Ukargl | grep display
>19463 1 S 0:00.02 display z.miff (magick)
>19465 1 S+ 0:00.00 grep display
>% ls -l /usr/local/bin/display
>lrwxr-xr-x 1 root wheel - 6 Jun 1 14:18 /usr/local/bin/display@ -> magick
>
>So, there are two possibilities:
>(1) display was once an independent program and not a
>symlink to magick. Thus, killall just worked. Or,
>(2) killall no longer works because command associated
>with process 19463 is not really 'display' and the
>symlink isn't resolved to actually kill 'magick'.
>
>So, just checking (2), here. Is this a change in behavior
>for FreeBSD?
>

It's likely your app is replacing its process name (argv[0]) with something else. ps auxww may give you a hint as to what it might be now.

-- Pardon the typos and autocorrect, small keyboard in use. Cy Schubert FreeBSD UNIX: Web: https://www.FreeBSD.org The need of the many outweighs the greed of the few. Sent from my Android device with K-9 Mail. Please excuse my brevity.
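One way to follow up on that hint is to compare the short command name recorded for the process with its full argument vector. The trailing "(magick)" in the ps output above suggests the two differ for this process. The sketch below uses only the POSIX ps keywords comm and args. The function name show_proc_names is made up for illustration, and the idea that killall matches on the short command name rather than on argv[0] is an assumption to check against killall(1).

```shell
#!/bin/sh
# Hypothetical helper: print the short command name and the full argument
# vector of one process, so a mismatch between the two is easy to see.
show_proc_names() {
    pid=$1
    comm=$(ps -o comm= -p "$pid")   # short command name
    args=$(ps -o args= -p "$pid")   # full argv as started
    # Strip any leading directory in case ps reports a path for comm.
    printf 'comm=%s\nargs=%s\n' "${comm##*/}" "$args"
}
```

For example, `show_proc_names 19463` on the system above would be expected to show whether the name to pass to killall is display or magick.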
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message , David Wolfskill writes: > On Tue, Sep 07, 2021 at 10:13:23AM +0200, Jakob Alvermark wrote: > > ... > > wlan0 does not associate after boot. (This is with iwm, AC 9260) > > > > My workaround is simply 'ifconfig wlan0 up'. > > > > After a few seconds wpa_supplicant associates and another few seconds > > later I have a DHCP IP address. > > > > I just tried that (running main-n249159-bb61ccd530b7), and that (also) > works for me -- in case that data point is of use. Hi, Commit 5fcdc19a8111 has addressed this. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message <86tuix4cys@shiori.com.br>, Filipe da Silva Santos writes: > > I'll have more questions later (need to start working on another job) but > > I'd like to learn more about your configuration to understand why it works > > at boot for myself and philip@ and not for you and the others here on > > -current who have experienced the same issue. Understanding what triggers > > this will go a long way to resolving it. > > Hello, Cy, > I have a Intel AC 3168 and can reproduce both problem and solution. > > I'd love to help with testing and info with the new version. Can you also try the security/wpa_supplicant and security/wpa_supplicant-devel ports, both without the ifconfig mitigation patch? This will more than confirm that this is an upstream problem and not in the FreeBSD Makefiles. This would be of great help as I cannot reproduce the problem at boot, only after boot when using service netif (a problem the old wpa_supplicant 2.9 also had). An additional confirmation that the -devel port has the same problem would help a lot. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message <2780735.ssxfcku...@sigill.theweb.org.ua>, "Oleg V. Nauman" writes: > On 2021 M09 6, Mon 20:31:33 EEST Cy Schubert wrote: > > One last favour to ask, can you try this with the wpa_supplicant-devel > > port, please? I'm trying to narrow down if this is related to the options > > in usr.sbin/wpa/Makefile.inc or an upstream problem. If this behaves the > > same using wpa_supplicant-devel, this tells me to look at the code instead > > of Makefiles. > > > > I can reproduce the service netif restart problem using the old > > wpa_supplicant 2.9, so at least here there is no change in behaviour. > > Though on my sandbox machine the ifconfig down/up is not required -- though > > even the older wpa_supplicant 2.9 behaves the same on my laptop, (no > > regression experienced here). > > > > To help point to either Makefile.inc or contrib/wpa, can you please try the > > wpa_supplicant-devel port. This will tell me where to look next. > > I can confirm that wpa_supplicant from security/wpa_supplicant-devel port > demonstrating the same behavior as wpa_supplicant from base - "ifconfig wlan0 > down ; sleep 5 ; ifconfig wlan0 up" mitigate wlan association issue. Thank you. This is an issue that I'll need to chase down with our upstream. 
In the meantime, while I work on this and bring it to upstream's attention, this should circumvent the issue:

diff --git a/libexec/rc/rc.d/wpa_supplicant b/libexec/rc/rc.d/wpa_supplicant
index 8a86fec90e4d..cfe5f1ab27c6 100755
--- a/libexec/rc/rc.d/wpa_supplicant
+++ b/libexec/rc/rc.d/wpa_supplicant
@@ -12,6 +12,7 @@
 name="wpa_supplicant"
 desc="WPA/802.11i Supplicant for wireless network devices"
+start_postcmd="wpa_poststart"
 rcvar=
 ifn="$2"
@@ -27,6 +28,12 @@ is_ndis_interface()
 	esac
 }
+wpa_poststart() {
+	ifconfig ${ifn} down
+	sleep 3
+	ifconfig ${ifn} up
+}
+
 if is_wired_interface ${ifn} ; then
 	driver="wired"
 elif is_ndis_interface ${ifn} ; then

I'll have more questions later (need to start working on another job) but I'd like to learn more about your configuration to understand why it works at boot for myself and philip@ and not for you and the others here on -current who have experienced the same issue. Understanding what triggers this will go a long way to resolving it. (cc'd philip@)

BTW, my laptop is configured so that wlan0 (iwn0) and bge0 are members of lagg0. Whereas on my sandbox wlan0 (ath0) is used directly.

-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
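For anyone who wants to try the same down/pause/up sequence by hand before patching the rc.d script, it can be sketched as a small shell function. The name bounce_iface and the IFCONFIG override are assumptions made for this example and are not part of the rc.d script; setting IFCONFIG=echo gives a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper mirroring the wpa_poststart workaround:
# bring the interface down, wait, bring it back up.
#   $1 - interface name, e.g. wlan0
#   $2 - delay in seconds between down and up (default 3)
# IFCONFIG may be set to another command (e.g. echo) for a dry run.
bounce_iface() {
    ifn=$1
    delay=${2:-3}
    ${IFCONFIG:-ifconfig} "$ifn" down || return 1
    sleep "$delay"
    ${IFCONFIG:-ifconfig} "$ifn" up
}
```

Running `IFCONFIG=echo bounce_iface wlan0 0` prints the two ifconfig argument lists without touching the interface; dropping the override and running as root applies the workaround for real.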
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
One last favour to ask, can you try this with the wpa_supplicant-devel port, please? I'm trying to narrow down if this is related to the options in usr.sbin/wpa/Makefile.inc or an upstream problem. If this behaves the same using wpa_supplicant-devel, this tells me to look at the code instead of Makefiles. I can reproduce the service netif restart problem using the old wpa_supplicant 2.9, so at least here there is no change in behaviour. Though on my sandbox machine the ifconfig down/up is not required -- though even the older wpa_supplicant 2.9 behaves the same on my laptop, (no regression experienced here). To help point to either Makefile.inc or contrib/wpa, can you please try the wpa_supplicant-devel port. This will tell me where to look next. Fifteen seconds isn't needed. Two or three, even no wait, will do. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few. In message <3000346.zmr5pbt...@sigill.theweb.org.ua>, "Oleg V. Nauman" writes: > On 2021 M09 6, Mon 18:41:13 EEST Cy Schubert wrote: > > I changed mine to be the same as yours. I can connect. (I use iwn(4) and > > ath(4) here.) > > a ) regular reboot - wlan can not associate > b ) service netif restart - wlan can not associate > c ) service netif stop wlan0 ; service netif start wlan0 - wlan can not > associate > d ) ifconfig wlan0 down; sleep 15 ; ifconfig wlan0 up - wlan associated > e ) regular reboot with ifconfig wlan0 down; sleep 15 ; ifconfig wlan0 up > added to /etc/rc.local - wlan associated > > Thank you. > > > > > Do you reboot every time you test or simply this? > > > > service netif stop wlan0 > > service netif start wlan0 > > > > If simply above, does a reboot have it work again? > > > > The reason I ask is, I discovered, today, a quirk in 14-CURRENT, regardless > > of the wpa_supplicant installed. 
It will always associate following a > > reboot however when running the above two commands to stop and start wlan0 > > I can reproduce your problem. The workaround for now is when running the > > above two commands to also ifconfig wlan0 down; ifconfig wlan0 up. > > > > Can you try ifconfig wlan0 down; ifconfig wlan0 up after stopping/starting > > wlan0? You may need to wait 2-3 seconds between down and up. > >
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
I changed mine to be the same as yours. I can connect. (I use iwn(4) and ath(4) here.) Do you reboot every time you test or simply this? service netif stop wlan0 service netif start wlan0 If simply above, does a reboot have it work again? The reason I ask is, I discovered, today, a quirk in 14-CURRENT, regardless of the wpa_supplicant installed. It will always associate following a reboot however when running the above two commands to stop and start wlan0 I can reproduce your problem. The workaround for now is when running the above two commands to also ifconfig wlan0 down; ifconfig wlan0 up. Can you try ifconfig wlan0 down; ifconfig wlan0 up after stopping/starting wlan0? You may need to wait 2-3 seconds between down and up. If this occurs at boot, try the ifconfig down and up anyway (to help narrow down the problem). -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few. In message , Idwer Vollering writes: > There's no core dump in /, wpa_supplicant connects to 802.11b/g/n > (there's no way to lock this, instead of having a mix of standards) on > 2.4 GHz. > > /etc/wpa_supplicant.conf: > network={ > ssid="some ssid" > scan_ssid=1 > key_mgmt=WPA-PSK > psk="some key" > } > > On Mon, 6 Sep 2021 at 15:23, Cy Schubert wrote: > > > > In message , Idwer Vollering writes: > > > On Mon, 6 Sep 2021 at 07:53, Cy Schubert wrote: > > > > > > > > In message <2838567.hhqauc6...@sigill.theweb.org.ua>, "Oleg V. 
Nauman" > > > > writes: > > > > > On 2021 M09 5, Sun 15:52:50 EEST David Wolfskill wrote: > > > > > > Sorry I hadn't noticed this yesterday (so I could have repported it > > > > > > then), but after updating the "head" slice of my laptopp from: > > > > > > > > > > > > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #340 > > > > > > main-n249128-a0c64a443e4c: Fri Sep 3 04:06:12 PDT 2021 > > > > > > r...@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CA > NARY > > > > > > amd64 1400032 1400032 > > > > > > > > > > > > to: > > > > > > > > > > > > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #341 > > > > > > main-n249146-cb5c07649aa0: Sat Sep 4 04:28:27 PDT 2021 > > > > > > r...@g1-51.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CA > NARY > > > > > > amd64 1400032 1400032 > > > > > > > > > > > > I find that while the em0 NIC still works, wlan0 (iwn(4) HW) does n > ot: > > > > > > the WLAN LED doesn't light up. > > > > > > > > > > I am also experiencing issues with wlan after my current update to > > > > > 1f7a6325fe1b. I have checked ath(4) , run(4), rtwn(4) and all of them > > > > > demonstrating the same behavior - wlan can not associate. > > > > > You can mitigate it by using security/wpa_supplicant from ports as re > plac > > > emen > > > > > t > > > > > of wpa_supplicant in base. > > > > > > > > > > . > > > > > > > > > > > > I note that exactly the same hardware works OK in stable/12 and sta > ble/ > > > 13. > > > > > > > > > > > > Peace, > > > > > > david > > > > > > > > > > > > > Can you grep wpa_supplicant in /var/log/messages? This will give us a c > lue. > > > > > > wpa_supplicant stops in wpa_driver_bsd_scan() - > > > https://github.com/freebsd/freebsd-src/blob/bd452dcbede69b1862c769f244948 > f94b > > > 86448b5/contrib/wpa/src/drivers/driver_bsd.c#L1315 > > > > > > Here's some selected output from /var/log/messages. 
> > > > > > Before (built from commit a0c64a443e4cae67a5eea3a61a47d746866de3ee): > > > > > > Sep 6 13:29:40 wpa_supplicant[45348]: Successfully > > > initialized wpa_supplicant > > > Sep 6 13:29:40 wpa_supplicant[45348]: ioctl[SIOCS80211, > > > op=20, val=0, arg_len=7]: Invalid argument > > > Sep 6 13:29:40 syslogd: last message repeated 1 times > > > Sep 6 13:29:46 wpa_supplicant[45349]: wlan1: Trying to > > > associate with (SSID='' freq=2447 MHz) > > > Sep 6 13:29:46 wpa_supplicant[45349]: Failed to add > > > supported
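
The restart workaround described in this message (stop and start netif, then take the interface down, wait a few seconds, and bring it back up) can be sketched as a small script. This is only an illustration: the function names and the `dry_run` switch are hypothetical, and the 3 second default pause is an assumption taken from the "2-3 seconds" suggestion in the message.

```python
import subprocess
import time
from typing import List, Tuple


def workaround_steps(ifname: str = "wlan0", pause: float = 3.0) -> List[Tuple[float, List[str]]]:
    """Return (seconds to wait first, command) pairs for the restart workaround."""
    return [
        (0.0, ["service", "netif", "stop", ifname]),
        (0.0, ["service", "netif", "start", ifname]),
        (0.0, ["ifconfig", ifname, "down"]),
        # The message suggests waiting 2-3 seconds between down and up.
        (pause, ["ifconfig", ifname, "up"]),
    ]


def run_workaround(ifname: str = "wlan0", pause: float = 3.0, dry_run: bool = True) -> None:
    """Print the steps (dry_run=True) or execute them as root (dry_run=False)."""
    for delay, cmd in workaround_steps(ifname, pause):
        if dry_run:
            print(f"(wait {delay:g}s) " + " ".join(cmd))
            continue
        time.sleep(delay)
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_workaround()  # dry run: only prints the command sequence
```

Running it with `dry_run=False` needs root and a FreeBSD system with the named interface; the dry run is safe anywhere.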
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message , Idwer Vollering writes:
> On Mon 6 Sep 2021 at 07:53, Cy Schubert wrote:
> >
> > In message <2838567.hhqauc6...@sigill.theweb.org.ua>, "Oleg V. Nauman"
> > writes:
> > > On 2021 M09 5, Sun 15:52:50 EEST David Wolfskill wrote:
> > > > Sorry I hadn't noticed this yesterday (so I could have reported it
> > > > then), but after updating the "head" slice of my laptop from:
> > > >
> > > > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #340
> > > > main-n249128-a0c64a443e4c: Fri Sep 3 04:06:12 PDT 2021
> > > > r...@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY
> > > > amd64 1400032 1400032
> > > >
> > > > to:
> > > >
> > > > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #341
> > > > main-n249146-cb5c07649aa0: Sat Sep 4 04:28:27 PDT 2021
> > > > r...@g1-51.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY
> > > > amd64 1400032 1400032
> > > >
> > > > I find that while the em0 NIC still works, wlan0 (iwn(4) HW) does not:
> > > > the WLAN LED doesn't light up.
> > >
> > > I am also experiencing issues with wlan after my current update to
> > > 1f7a6325fe1b. I have checked ath(4), run(4), rtwn(4) and all of them
> > > demonstrate the same behavior - wlan can not associate.
> > > You can mitigate it by using security/wpa_supplicant from ports as a
> > > replacement for wpa_supplicant in base.
> > >
> > > > I note that exactly the same hardware works OK in stable/12 and stable/13.
> > > >
> > > > Peace,
> > > > david
> > >
> > Can you grep wpa_supplicant in /var/log/messages? This will give us a clue.
>
> wpa_supplicant stops in wpa_driver_bsd_scan() -
> https://github.com/freebsd/freebsd-src/blob/bd452dcbede69b1862c769f244948f94b86448b5/contrib/wpa/src/drivers/driver_bsd.c#L1315
>
> Here's some selected output from /var/log/messages.
>
> Before (built from commit a0c64a443e4cae67a5eea3a61a47d746866de3ee):
>
> Sep 6 13:29:40 wpa_supplicant[45348]: Successfully initialized wpa_supplicant
> Sep 6 13:29:40 wpa_supplicant[45348]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
> Sep 6 13:29:40 syslogd: last message repeated 1 times
> Sep 6 13:29:46 wpa_supplicant[45349]: wlan1: Trying to associate with (SSID='' freq=2447 MHz)
> Sep 6 13:29:46 wpa_supplicant[45349]: Failed to add supported operating classes IE
> Sep 6 13:29:46 kernel: wlan1: link state changed to UP
> Sep 6 13:29:46 wpa_supplicant[45349]: wlan1: Associated with
> Sep 6 13:29:46 dhclient[45401]: send_packet: No buffer space available
> Sep 6 13:29:46 wpa_supplicant[45349]: wlan1: WPA: Key negotiation completed with [PTK=CCMP GTK=CCMP]
> Sep 6 13:29:46 wpa_supplicant[45349]: wlan1: CTRL-EVENT-CONNECTED - Connection to completed [id=0 id_str=]
>
> After (built from main):
>
> Sep 6 12:19:50 wpa_supplicant[1236]: Successfully initialized wpa_supplicant
> Sep 6 12:19:50 kernel: wlan1: Ethernet address:
> Sep 6 12:19:50 wpa_supplicant[1236]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
> Sep 6 12:19:50 syslogd: last message repeated 1 times
> Sep 6 12:19:50 wpa_supplicant[1237]: wlan1: CTRL-EVENT-SCAN-FAILED ret=-1 retry=1

Is there a wpa_supplicant.core dump in / ?

Can you also send me a sanitized copy of wpa_supplicant.conf, please? I'm interested in the lines proto=, key_mgmt=, pairwise=, group=, eap=, and phase2=. You may not be using eap= or phase2=, which is fine. I'd like to see if there are any differences from what was tested.

Though, looking at your outputs above you're probably using something like:

proto=RSN WPA
key_mgmt=WPA-PSK
pairwise=CCMP
group=CCMP

Is this correct?

If you try ports/security/wpa_supplicant-devel (same codebase as in 14-CURRENT), does it work? (ports/security/wpa_supplicant is the old 2.9 codebase.)

What is your AP set for? 802.11g, 802.11n, 802.11ac?

-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
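
For anyone answering a request like the one above, here is a minimal sketch of producing a sanitized copy of wpa_supplicant.conf and pulling out the requested fields. It is an illustration only: the helper names are hypothetical, and the list of fields treated as secret (`ssid`, `psk`, `password`, `identity`) is an assumption about what "sanitized" should cover. The sample text mirrors the network block posted elsewhere in this thread.

```python
import re

# Fields asked about in the message above.
FIELDS_OF_INTEREST = ("proto", "key_mgmt", "pairwise", "group", "eap", "phase2")
# Assumption: these are the values that should not be posted to a public list.
SECRET_FIELDS = ("ssid", "psk", "password", "identity")

SAMPLE = '''network={
    ssid="some ssid"
    scan_ssid=1
    key_mgmt=WPA-PSK
    psk="some key"
}
'''


def sanitize(conf_text: str) -> str:
    """Replace the values of secret fields with a placeholder, keep the rest."""
    pattern = re.compile(
        r"^([ \t]*(?:%s)[ \t]*=).*$" % "|".join(SECRET_FIELDS), re.MULTILINE
    )
    return pattern.sub(r"\1<redacted>", conf_text)


def fields_of_interest(conf_text: str) -> dict:
    """Collect key=value pairs for the fields of interest, skipping comments."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in FIELDS_OF_INTEREST:
            found[key.strip()] = value.strip()
    return found


if __name__ == "__main__":
    print(sanitize(SAMPLE))
    print(fields_of_interest(SAMPLE))
```

Fields that are absent from the file (proto=, pairwise=, group= in the sample) simply do not appear in the result, which is itself useful information since wpa_supplicant then falls back to its defaults.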
Re: wlan0 no longer functional after n249128-a0c64a443e4c -> n249146-cb5c07649aa0
In message <2838567.hhqauc6...@sigill.theweb.org.ua>, "Oleg V. Nauman" writes:
> On 2021 M09 5, Sun 15:52:50 EEST David Wolfskill wrote:
> > Sorry I hadn't noticed this yesterday (so I could have reported it
> > then), but after updating the "head" slice of my laptop from:
> >
> > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #340
> > main-n249128-a0c64a443e4c: Fri Sep 3 04:06:12 PDT 2021
> > r...@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY
> > amd64 1400032 1400032
> >
> > to:
> >
> > FreeBSD g1-51.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #341
> > main-n249146-cb5c07649aa0: Sat Sep 4 04:28:27 PDT 2021
> > r...@g1-51.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY
> > amd64 1400032 1400032
> >
> > I find that while the em0 NIC still works, wlan0 (iwn(4) HW) does not:
> > the WLAN LED doesn't light up.
>
> I am also experiencing issues with wlan after my current update to
> 1f7a6325fe1b. I have checked ath(4), run(4), rtwn(4) and all of them
> demonstrate the same behavior - wlan can not associate.
> You can mitigate it by using security/wpa_supplicant from ports as a
> replacement for wpa_supplicant in base.
>
> > I note that exactly the same hardware works OK in stable/12 and stable/13.
> >
> > Peace,
> > david

Can you grep wpa_supplicant in /var/log/messages? This will give us a clue.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
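
The log check requested here amounts to keeping only the syslog lines that mention wpa_supplicant. A minimal sketch, assuming the log is available as an iterable of lines; the function name is hypothetical and the sample lines are copied from the output posted in this thread:

```python
from typing import Iterable, List


def grep_lines(lines: Iterable[str], needle: str = "wpa_supplicant") -> List[str]:
    """Return the lines that contain needle, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if needle in line]


SAMPLE_LOG = [
    "Sep 6 12:19:50 wpa_supplicant[1236]: Successfully initialized wpa_supplicant",
    "Sep 6 12:19:50 kernel: wlan1: Ethernet address:",
    "Sep 6 12:19:50 wpa_supplicant[1237]: wlan1: CTRL-EVENT-SCAN-FAILED ret=-1 retry=1",
]

if __name__ == "__main__":
    for line in grep_lines(SAMPLE_LOG):
        print(line)
```

On a live system the same function could be fed `open("/var/log/messages", errors="replace")`, which usually requires root.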
Re: drm-kmod kernel crash fatal trap 12
In message <4894bd36-92bd-596e-cc18-cd3e6aafe...@selasky.org>, Hans Petter Selasky writes:
> On 6/9/21 4:43 PM, Thomas Laus wrote:
> > I updated my system this morning to main-n247260-dc318a4ffab June 9 2021
> > and the first boot after the kernel was loaded I received:
> >
> > 'fatal trap 12' fault virtual address = 0x0
> > fault code = supervisor write data, page not present
> > instruction pointer = 0x20:0x82fc3d1b
> > stack pointer = 0x28:0xfe011aea3330
> > frame pointer = 0x28:0xfe011aea3370
> > code segment = base 0x0 limit 0x, type 0x1b
> > DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 1187 (kldload)
> > trap number = 12
> >
> > I hand copied the screen display since I was not able to generate a
> > crash dump to /var/crash on a zfs file system.
> >
> > I am rebuilding the GENERIC kernel since the crash was using the NODEBUG
> > version. This is 100 percent repeatable.
> >
> > Tom
>
> Make sure you also re-build the drm-kmod module.

And while you're at it, update your copy of the drm-* port to the latest.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: wpa_supplicant: SIGBUS after main-n247052-d40cd26a86a7 -> main-n247092-ec7b47fc81b2
In message <72a5e40f-c973-473c-b2a4-acdd28685...@yahoo.com>, Mark Millard writes:
> Cy Schubert wrote on:
> Date: Tue, 01 Jun 2021 14:02:06 -0700 :
>
> > Can you provide me with a backtrace, using the bt command, please.
>
> That was in the original message from David W. A copy was
> in the reply that you sent to the list as well:
>
> > > (gdb) bt
> > > #0 0x010fb34f in wpa_sm_rx_eapol ()
> > > #1 0x010f3afe in l2_packet_receive ()
> > > #2 0x01122ef3 in eloop_run ()
> > > #3 0x010b44a8 in wpa_supplicant_run ()
> > > #4 0x0109fdec in main ()
>
> But it also had this report about the context:
>
> > > (No debugging symbols found in /usr/obj/usr/src/amd64.amd64/usr.sbin/wpa/wpa_supplicant/wpa_supplicant)
>
> So it was apparently a non-debug build without symbols, limiting
> the information that is available.

Correct. We have debug symbols now and are chasing it down. I suspect a static function address in a structure may be incorrect.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: wpa_supplicant: SIGBUS after main-n247052-d40cd26a86a7 -> main-n247092-ec7b47fc81b2
Can you provide me with a backtrace, using the bt command, please.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.

In message , David Wolfskill writes:
> Reading symbols from /usr/obj/usr/src/amd64.amd64/usr.sbin/wpa/wpa_supplicant/wpa_supplicant...
> (No debugging symbols found in /usr/obj/usr/src/amd64.amd64/usr.sbin/wpa/wpa_supplicant/wpa_supplicant)
> [New LWP 100168]
> Core was generated by `/usr/sbin/wpa_supplicant -s -B -i wlan0 -c /etc/wpa_supplicant.conf -D bsd -P /v'.
> Program terminated with signal SIGBUS, Bus error.
> --Type for more, q to quit, c to continue without paging--
> #0 0x010fb34f in wpa_sm_rx_eapol ()
> (gdb) bt
> #0 0x010fb34f in wpa_sm_rx_eapol ()
> #1 0x010f3afe in l2_packet_receive ()
> #2 0x01122ef3 in eloop_run ()
> #3 0x010b44a8 in wpa_supplicant_run ()
> #4 0x0109fdec in main ()
> (gdb)
>
> wlan0 is an iwn(4) device, in this case. Not yet sure how reproducible
> this is, but wpa_supplicant's issue(s) do not (yet) seem to prevent the
> machine from using the network (as I'm typing on the laptop's keyboard
> to write this).
>
> uname strings: yesterday:
>
> FreeBSD g1-55.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #258 main-n247052-d40cd26a86a7: Mon May 31 05:48:18 PDT 2021 root@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400018 1400018
>
> today:
>
> FreeBSD g1-55.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #259 main-n247092-ec7b47fc81b2: Tue Jun 1 04:49:26 PDT 2021 root@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 1400018 1400018
>
> (Though the laptop did just lose connectivity; checking /var/log/messages:
>
> <6>1 2021-06-01T12:06:15.336865+00:00 g1-55.catwhisker.org kernel - - - wlan0: link state changed to DOWN
> <2>1 2021-06-01T12:09:26.811751+00:00 g1-55.catwhisker.org kernel - - - if_delmulti_locked: detaching ifnet instance 0xf800126d6800
> <2>1 2021-06-01T12:09:26.811773+00:00 g1-55.catwhisker.org syslogd - - - last message repeated 5 times
> <6>1 2021-06-01T12:09:26.811774+00:00 g1-55.catwhisker.org kernel - - - lo0: link state changed to DOWN
> <27>1 2021-06-01T12:09:27.317032+00:00 g1-55.catwhisker.org dhclient 441 - - My address (172.17.1.55) was deleted, dhclient exiting
> <2>1 2021-06-01T12:09:27.317474+00:00 g1-55.catwhisker.org kernel - - - if_delmulti_locked: detaching ifnet instance 0xf800129a1800
>
> I tried "sudo service netif restart" and that brought the connection back
> (for now, anyway).
>
> As the laptop is a machine that I connect to networks I do not
> control, it uses packet filtering (ipfw, which I've been using since
> Whistle Communications, ca. 1998).
>
> The build typescript will be up at
> https://www.catwhisker.org/~david/FreeBSD/history/laptop.14_build_typescript.txt
> shortly.
>
> Peace,
> david
> -- 
> David H. Wolfskill da...@catwhisker.org
> Claiming that Donald Trump won the 2020 election is the opposite of
> patriotism. Make of that what you will.
>
> See https://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: boot loader blank screen
In message <63a29589-22b5-495b-8e0d-14e13091d...@yahoo.com>, Mark Millard write s: > > > On 2021-Jan-5, at 17:54, David Wolfskill wrote: > > > On Wed, Jan 06, 2021 at 12:46:08AM +0200, Toomas Soome wrote: > >> ... > >>> the 58661b3ba9eb should hopefully fix the loader text mode issue, it woul > d be cool if you can verify:) > >>> > >>> thanks, > >>> toomas > >> > >> I think, I got it fixed (at least idwer did confirm for his system, thanks > ). If you can test this patch: http://148-52-235-80.sta.estpak.ee/0001-loader > -rewrite-font-install.patch <http://148-52-235-80.sta.estpak.ee/0001-loader-r > ewrite-font-install.patch> it would be really nice. > >> > >> thanks, > >> toomas > > > > I tested with each of the following "stanzas' in /boot/loader.conf, > > using vt (vs. syscons) in each case (though that breaks video reset > > on resume after suspend): > > > > . . . > > I've done no experiments with an explicit vbe_max_resolution > setting. My context for hw.vga.textmode="0" shows up as > 1920x1200. (I do have the font for this set to 8x16, making > for lots of character cells across and down.) > > > For the below I do not have hw.vga.textmode="0". > > > # hw.vga.textmode="0" textmode=1 doesn't work either. Been using it for years and this is the first time it's borked. > > # vbe_max_resolution=1280x800 > > > > (That is, not specifying anything for hw.vga.textmode or > > vbe_max_resolution.) > > > > This boots OK, but I see no kernel probe messages or single- to > > multi-user mode messages. I can use (e.g.) Ctl+Alt+F2 to switch to > > vty1, see a "login: " prompt, and that (also) works. (This is the > > initial symptom I had reported.) > > So I tried commenting out hw.vga.textmode="1" and I saw everything > I expected in my context. Whiteish on black background (or at least > something very dark). I did not take videos to do detailed > inspections. Didn't work for me. Then again my old eyes didn't detect much difference in contrast. 
> > > > hw.vga.textmode="1" > > # vbe_max_resolution=1280x800 > > > > This works -- boots OK, and I see kernel probe () messages; this is a > > text console (mostly blue text; some white, against a dark background. > > It's a medium-light blue, so it's easy enough to read (unlike a navy > > blue, for example). > > FYI: whiteish on black (or at least something very dark) > in my hw.vga.textmode="1" context. I saw everything here > as well. I did not take videos to do detailed inspections. > > I did not notice any way to tell hw.vga.textmode="1" from > having no hw.vga.textmode assignment at all. But, again, > I did not set up for an after-the-fact detailed review of > what is displayed. Everything becomes normal when X starts, except for the three machines downstairs which don't use X. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: firewall choice
In message , grarpamp writes:
> >>> What's the "best" [1] choice for firewalling these days
> >>> There's pf, ipf and ipfw.
> >>
> >>This question comes up over years.
> >>
> >>Consider starting and joining with people to create
> >>a comparison page on the FreeBSD Wiki,
> >>both a feature / capability comparison table,
> >>and contextual paragraphs.
> >>A mini project like that can help many users
> >>and add their researches to it.
> >
> > I'd be happy to if I knew where to start/how to start/is there a guide.
>
> Starting a wiki is here...
> https://wiki.freebsd.org/
> https://wiki.freebsd.org/AboutWiki
>
> Which falls under larger handbook doc area...
> https://lists.freebsd.org/mailman/listinfo/freebsd-doc
>
> Much of comparison would pull from man pages.
>
> Could also come from posting a call for input / announce
> to questions, hackers, forum, etc.
>
> Wiki should not duplicate admin info from here...
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/firewalls.html
> But would cover this handbook bullet item that is
> not actually covered in the handbook (which
> could link out to the wiki page for that)...
> "- The differences between the firewalls built into FreeBSD."
>
> A full comparison would also want to note and point to
> upstream sources, and have a table of which filter systems
> are supported going forward in each unix OS (the *BSD
> flavors including DragonFly ipfw3 pf, Linux netfilter+nftables,
> Illumos).

pf was originally written when Darren Reed took a job at Sun. He changed the license at the time. FreeBSD moved it (and other software) to contrib, as did NetBSD (in their own way). OpenBSD wrote pf in the space of a week in reaction to the license change.

> And cover layer2 capabilities, switching, bridging, ipv6,
> nat, rate limits / shape / queue, proxy, arbitrary rewriting
> and routing hooks, etc.
>
> NetBSD where ipf was last released has deprecated
> both ipf and pf in favor of npf. While upstream devel and
> maintenance on ipf has died, pf still lives on at OpenBSD.

It's hardly deprecated in NetBSD. Christos Zoulas and I have exchanged a fair bit of code. Darren Reed released and maintained IPF through the Australian National University. NetBSD imported it, like we do here at FreeBSD, into their src tree.

> Anyone can start. Have fun.

My ipf work is documented at https://wiki.freebsd.org/IPFilter.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: svn.freebsd.org
In message <20201112.042716.381474736225590586.y...@utahime.org>, Yasuhiro KIMURA writes:
> At first, lagging has disappeared now.
>
> From: "Bjoern A. Zeeb"
> Subject: Re: svn.freebsd.org
> Date: Wed, 11 Nov 2020 19:12:06 +
>
> > svn.freebsd.org is geolocated imho; so unless you'll tell people to
> > which IPv6/IPv4 address you are connecting it'll be harder to track
> > this down if it is not all mirrors.
>
> I use 192.50.199.249. But svnweb.freebsd.org had also been lagging. So
> it doesn't seem the problem was specific to one mirror.

It's happened a few times this week. Last time was yesterday morning PDT. I didn't notice the exact time though.

And correct, it's not happening now. It updated at approximately 1130U (that's 1930Z). I also noticed that the github repo via the website listed the same commit as its latest as well.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: svn.freebsd.org
In message <0398cede-609f-4789-b056-2809712f9...@lists.zabbadoz.net>, "Bjoern A. Zeeb" writes:
> On 11 Nov 2020, at 18:47, Yasuhiro KIMURA wrote:
>
> > From: Cy Schubert
> > Subject: svn.freebsd.org
> > Date: Wed, 11 Nov 2020 10:20:55 -0800
> >
> >> I've noticed that svn.freebsd.org has been lagging with commits from
> >> repo.freebsd.org. Is this a change or is there something broken? (I use
> >> svn.freebsd.org as the source of truth at $JOB.)
> >>
> >> At the moment svn.freebsd.org is at r367589 while repo.freebsd.org is at
> >> r367596.
> >
> > Not only src but also ports has been lagging. Currently
> > https://svn.freebsd.org/ports/ is r554896 but I received commit
> > message of r554908 from svn-ports-all ML.
> >
> > Also this is not first time. Though I can't remember exactly when,
> > similar situation happened within a week.
>
> svn.freebsd.org is geolocated imho; so unless you'll tell people to
> which IPv6/IPv4 address you are connecting it'll be harder to track
> this down if it is not all mirrors.

slippy$ nslookup svn.freebsd.org
Server:         127.0.0.1
Address:        127.0.0.1#53

Non-authoritative answer:
svn.freebsd.org canonical name = svnmir.geo.freebsd.org.
Name:    svnmir.geo.freebsd.org
Address: 96.47.72.69
Name:    svnmir.geo.freebsd.org
Address: 2610:1c1:1:606c::e6a:0

slippy$

Located on West Coast Canada.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
svn.freebsd.org
I've noticed that svn.freebsd.org has been lagging with commits from repo.freebsd.org. Is this a change or is there something broken? (I use svn.freebsd.org as the source of truth at $JOB.) At the moment svn.freebsd.org is at r367589 while repo.freebsd.org is at r367596. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.
Re: CURRENT failing at contrib/unbound/util/config_file.c:122:20
In message , Pete Wright writes:
> wondering if anyone else is having this error building CURRENT today:
>
> --- config_file.o ---
> /usr/home/pete/git/freebsd/contrib/unbound/util/config_file.c:122:20:
> error: use of undeclared identifier 'UNBOUND_DNS_OVER_HTTPS_PORT'
>         cfg->https_port = UNBOUND_DNS_OVER_HTTPS_PORT;
>                           ^
> 1 error generated.
> --- all_subdir_lib/ncurses ---
>
> my last commit from the github mirror is:
>
> commit efb48d58bee75fdb221adece8ef5a13cede99e8c (HEAD -> master,
> origin/master, origin/HEAD)
> Author: tuexen
> Date:   Sat Nov 7 21:17:49 2020 +
>
>     The ioctl() calls using FIONREAD, FIONWRITE, FIONSPACE, and SIOCATMARK
>     access the socket send or receive buffer. This is not possible for
>     listening sockets since r319722.
>     Because send()/recv() calls fail on listening sockets, fail also
>     ioctl() indicating EINVAL.
>
> so not sure if it's been found or if this is a real issue.

No such problem here. What do you see on line 1397 of /usr/src/usr.sbin/unbound/config.h?

Also, uname -a, please.

And, git status usr.sbin/unbound, looking for local mods. Your cwd will need to be the root of your git tree.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: OpenZFS: kldload zfs.ko freezes on i386 4GB memory
In message , Matthew Macy writes: > On Fri, Oct 30, 2020 at 4:50 PM Cy Schubert wrote > : > > > > In message <20201030233138.gd34...@zxy.spb.ru>, Slawa Olhovchenkov writes: > > > On Fri, Oct 30, 2020 at 04:00:55PM -0700, Cy Schubert wrote: > > > > > > > > > > More stresses memory usually refers to performance penalty. > > > > > > > Usually way for better performance is reduce memory access. > > > > > > > > > > > > The reason filesystems (UFS, ZFS, EXT4, etc.) cache is to avoid dis > k > > > > > > accesses. Nanoseconds vs milliseconds. > > > > > > > > > > I mean compared ZoL ZFS ARC vs old (BSD/Opensolaris/Illumos) ZFS ARC. > > > > > Any reaason to rise ARC hit rate in ZoL case? > > > > > > > > That's what hit rate is. It's a memory access instead of a disk access. > > > > That's what you want. > > > > > > Is ZoL ARC hit rate rise from FreeBSD ARC hit rate? > > > > We don't know that. You should be able to find out by running some tests > > that would populate your ARC and run the test again. I see that my > > -DNO_CLEAN buildworlds run faster, when I run them a second or third time > > after making a minor edit, than they did before. Thus I assume it uses > > memory more efficiently. By default it stores more metadata in ARC, 75% > > instead of IIRC 25% by default. > > > > Getting back to your original question. A more efficient ARC would exercise > > your memory more intensely because you are replacing disk reads with memory > > reads. And as I said before the old ZFS "found" weak RAM on three separate > > occasions in three different machines over the last ten years. You're > > advised to replace the marginal memory. > > Ryan has been able to reproduce this in a VM with 4GB, similarly a VM > with 2GB loads just fine. It would seem that 4GB triggers a bug in > limit handling. We're hoping that we can simply lower one of the > default limits on i386 and make the problem go away. 
>
> Please don't shoot the messenger when I observe that, generally
> speaking, i386 is considered a self supported platform due to ZFS
> general inability to perform well with limited memory or KVA. Long
> mode has been available on virtually all processors shipped since
> 2006.

Yes, I was able to use ZFS on a 2 GB Pentium-M (i386) laptop for many years. ZFS worked well with a little tuning on such a small machine. Last time I booted it was late last year or early this year. It's in a drawer right now. I'll try to pull it out this coming week to test it out. Serendipitous that I was thinking about pulling out that old laptop to test out the new ZFS just last week.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
Re: OpenZFS: kldload zfs.ko freezes on i386 4GB memory
In message <20201030233138.gd34...@zxy.spb.ru>, Slawa Olhovchenkov writes:
> On Fri, Oct 30, 2020 at 04:00:55PM -0700, Cy Schubert wrote:
>
> > > > > More stresses memory usually refers to performance penalty.
> > > > > Usually way for better performance is reduce memory access.
> > > >
> > > > The reason filesystems (UFS, ZFS, EXT4, etc.) cache is to avoid disk
> > > > accesses. Nanoseconds vs milliseconds.
> > >
> > > I mean compared ZoL ZFS ARC vs old (BSD/Opensolaris/Illumos) ZFS ARC.
> > > Any reason to rise ARC hit rate in ZoL case?
> >
> > That's what hit rate is. It's a memory access instead of a disk access.
> > That's what you want.
>
> Is ZoL ARC hit rate rise from FreeBSD ARC hit rate?

We don't know that. You should be able to find out by running some tests that populate your ARC and then running the same tests again. I see that my -DNO_CLEAN buildworlds run faster, when I run them a second or third time after making a minor edit, than they did before. Thus I assume it uses memory more efficiently. By default it stores more metadata in ARC, 75% instead of IIRC 25% by default.

Getting back to your original question. A more efficient ARC would exercise your memory more intensely because you are replacing disk reads with memory reads. And as I said before the old ZFS "found" weak RAM on three separate occasions in three different machines over the last ten years. You're advised to replace the marginal memory.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

The need of the many outweighs the greed of the few.
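
The hit rate being discussed is the fraction of ARC lookups served from memory rather than disk. To compare two ZFS versions as suggested above, one would sample the hit and miss counters before and after a test run and compute the rate over that interval. A minimal sketch: the function names are hypothetical and the sample numbers are made up; on FreeBSD the counters would typically come from the `kstat.zfs.misc.arcstats.hits` and `kstat.zfs.misc.arcstats.misses` sysctls, which is an assumption about the reader's setup rather than something stated in the thread.

```python
from typing import Tuple


def arc_hit_rate(hits: int, misses: int) -> float:
    """Return the fraction of lookups served from ARC (0.0 to 1.0)."""
    total = hits + misses
    if total == 0:
        return 0.0  # no lookups yet, so no meaningful rate
    return hits / total


def hit_rate_between(before: Tuple[int, int], after: Tuple[int, int]) -> float:
    """Hit rate over the interval between two (hits, misses) samples."""
    return arc_hit_rate(after[0] - before[0], after[1] - before[1])


if __name__ == "__main__":
    before = (1_000, 500)     # made-up counters sampled before the test
    after = (10_000, 1_500)   # made-up counters sampled after the test
    print(f"{hit_rate_between(before, after):.1%}")
```

Using the difference between two samples matters because the counters are cumulative since boot; the raw ratio would blend the test run with everything that happened earlier.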
Re: OpenZFS: kldload zfs.ko freezes on i386 4GB memory
In message <20201030224734.gh2...@zxy.spb.ru>, Slawa Olhovchenkov writes: > On Fri, Oct 30, 2020 at 03:34:10PM -0700, Cy Schubert wrote: > > > In message <20201030220809.gg2...@zxy.spb.ru>, Slawa Olhovchenkov writes: > > > On Fri, Oct 30, 2020 at 01:53:10PM -0700, Cy Schubert wrote: > > > > > > > In message <20201030204622.gf2...@zxy.spb.ru>, Slawa Olhovchenkov write > s: > > > > > On Thu, Oct 29, 2020 at 08:13:00PM -0700, Cy Schubert wrote: > > > > > > > > > > > In message , qr > oxan > > > a > > > > > > writes > > > > > > : > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > I have an old i386 machine running r364479. After upgrading to > > > > > > > r367045, running kldload zfs.ko freezes the whole system. > > > > > > > > > > > > > > I also tried to replace the 4GB memory with another 2GB one > > > > > > > and kldload zfs.ko works without freezing the machine. > > > > > > > > > > > > ZFS ARC stresses memory. I've found a number of bad RAM chips over > the > > > > > > years using ZFS. > > > > > > > > > > > > The OpenZFS upgrade significantly changed how it manages ARC. It's > like > > > ly > > > > > > that prior to the OpenZFS upgrade your memory wasn't stressed to th > e po > > > int > > > > > > of failure. You can try to mask the problem by reducing your RAM cl > ock > > > rate > > > > > > > > > > > or or increase one of the other latency settings in your BIOS. Howe > ver, > > > > > > > > > again, this only masks an already weak RAM chip. > > > > > > > > > > Sounds like performance drop and regression > > > > > > > > How so. Please explain. > > > > > > More stresses memory usually refers to performance penalty. > > > Usually way for better performance is reduce memory access. > > > > The reason filesystems (UFS, ZFS, EXT4, etc.) cache is to avoid disk > > accesses. Nanoseconds vs milliseconds. > > I mean compared ZoL ZFS ARC vs old (BSD/Opensolaris/Illumos) ZFS ARC. > Any reaason to rise ARC hit rate in ZoL case? That's what hit rate is. 
It's a memory access instead of a disk access. That's what you want. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org The need of the many outweighs the greed of the few.