Re: Unfamiliar console message: in prompt_tty(): caught signal 2
On Sun, Apr 21, 2024 at 10:16:55PM +0200, Dag-Erling Smørgrav wrote:
> bob prohaska writes:
> > Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2
>
> This means someone ran `su` and pressed Ctrl-C instead of entering a
> password when prompted.

Ahh, that would have been me. Thank you!

bob prohaska
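For anyone decoding a similar message later: the number in "caught signal 2" is an ordinary signal number, and 2 is SIGINT, which is what Ctrl-C sends. Below is a minimal sketch of the lookup using Python's standard signal module. The helper name signal_name is made up for illustration, and the log line is the one quoted above.

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # Number is not a signal known to this platform.
        return f"unknown signal {number}"


# The console message from the original report.
log_line = "Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2"

# The signal number is the last whitespace-separated token of the message.
number = int(log_line.rsplit(" ", 1)[1])
print(number, signal_name(number))
```

On the system itself, kill -l lists the same number-to-name mapping.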
Unfamiliar console message: in prompt_tty(): caught signal 2
On the serial console on a Pi3 v1.1 (so armv7) I just noticed an unfamiliar message:

Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2

Several login failures were reported shortly afterward, so the message seems to have been a console message, not from the tip session used to connect. I've never seen it before and wondered if it has any special importance.

The machine was running buildworld on -current, updated a day or so ago. By the next morning the machine had locked up hard, with no response to the enter-tilde-control-B debugger escape. After power-cycling it came back up after fsck, and buildworld was resumed where it left off.

Thanks for reading,

bob prohaska
Re: Buildworld stops for d3befb534b9 in tests
On Mon, Mar 04, 2024 at 09:54:14AM -0800, Mark Millard wrote: > bob prohaska wrote on > Date: Mon, 04 Mar 2024 16:35:52 UTC : > > > An armv7 (Pi2 v1.1) -current system stopped buildworld with > > > > c++: error: linker command failed with exit code 1 (use -v to see > > invocation) > > *** [capsicum-test.full] Error code 1 > > There might have been more error messages at some earlier point prior to > the above. Such likely would have more detail about what the issue was. > You're right, I missed the start. Here it is: Building /usr/obj/usr/src/arm.armv7/tests/sys/capsicum/capsicum-test.full cc -target armv7-gnueabihf-freebsd15.0 --sysroot=/usr/obj/usr/src/arm.armv7/tmp -B/usr/obj/usr/src/arm.armv7/tmp/usr/bin -O2 -pipe -fno-common -I/usr/src/lib/libarchive/tests -I/usr/src/lib/libarchive -I/usr/obj/usr/src/arm.armv7/lib/libarchive/tests -I/usr/src/contrib/libarchive/libarchive -I/usr/src/contrib/libarchive/libarchive/test -I/usr/src/contrib/libarchive/test_utils -DHAVE_BZLIB_H=1 -DHAVE_LIBLZMA=1 -DHAVE_LZMA_H=1 -DHAVE_ZSTD_H=1 -DHAVE_LIBZSTD=1 -DHAVE_LIBZSTD_COMPRESSOR=1 -DPLATFORM_CONFIG_H=\"/usr/src/lib/libarchive/tests/config_freebsd.h\" -DWITH_OPENSSL -DOPENSSL_API_COMPAT=0x1010L -g -gz=zlib -std=gnu99 -Wno-format-zero-length -fstack-protector-strong -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -Wdate-time -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-error=unused-but-set-parameter -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-address-of-packed-member -Qunused-arguments -c /usr/src/contrib/libarchive/libarchive/test/test_archive_pathmatch.c -o test_archive_pathmatch.o ld: error: undefined symbol: testing::internal::CmpHelperGE(char const*, char const*, long long, long long) >>> referenced by procdesc.cc:199 >>> 
(/usr/src/contrib/capsicum-test/procdesc.cc:199) >>> procdesc.o:(Pdfork_TimeCheck_Test::TestBody()) ld: error: undefined symbol: testing::internal::CmpHelperEQ(char const*, char const*, long long, long long) >>> referenced by gtest.h:1502 >>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502) >>> procdesc.o:(Pdfork_TimeCheck_Test::TestBody()) >>> referenced by gtest.h:1502 >>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502) >>> procdesc.o:(Pdfork_TimeCheck_Test::TestBody()) >>> referenced by gtest.h:1502 >>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502) >>> procdesc.o:(Pdfork_TimeCheck_Test::TestBody()) __cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x64b138 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping __cxa_thread_call_dtors: dtr 0x64b138 from unloaded dso, skipping Building /usr/obj/usr/src/arm.armv7/usr.bin/mandoc/mdoc_markdown.o Thanks for reading, and apologies for the omission! bob prohaska
Buildworld stops for d3befb534b9 in tests
An armv7 (Pi2 v1.1) -current system stopped buildworld with c++: error: linker command failed with exit code 1 (use -v to see invocation) *** [capsicum-test.full] Error code 1 make[6]: stopped in /usr/src/tests/sys/capsicum .ERROR_TARGET='capsicum-test.full' .ERROR_META_FILE='/usr/obj/usr/src/arm.armv7/tests/sys/capsicum/capsicum-test.full.meta' .MAKE.LEVEL='6' MAKEFILE='' .MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose' _ERROR_CMD='c++ -target armv7-gnueabihf-freebsd15.0 --sysroot=/usr/obj/usr/src/arm.armv7/tmp -B/usr/obj/usr/src/arm.armv7/tmp/usr/bin -O2 -pipe -fno-common -I/usr/src/tests -g -gz=zlib -Wno-format-zero-length -fstack-protector-strong -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wpointer-arith -Wno-uninitialized -Wdate-time -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-error=unused-but-set-parameter -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-address-of-packed-member -Qunused-arguments -I/usr/obj/usr/src/arm.armv7/tmp/usr/include/private -DGTEST_HAS_POSIX_RE=1 -DGTEST_HAS_PTHREAD=1 -DGTEST_HAS_STREAM_REDIRECTION=1 -frtti -std=c++14 -Wno-c++11-extensions -Wl,-zrelro -o capsicum-test.full capsicum-test-main.o capsicum-test.o capability-fd.o copy_file_range.o fexecve.o procdesc.o capmode.o fcntl.o ioctl.o openat.o sysctl.o select.o mqueue.o socket.o sctp.o capability-fd-pair.o overhead.o rename.o -lprivategtest -lprocstat -lpthread;' .CURDIR='/usr/src/tests/sys/capsicum' .MAKE='make' .OBJDIR='/usr/obj/usr/src/arm.armv7/tests/sys/capsicum' .TARGETS=' all' CPUTYPE='' DESTDIR='/usr/obj/usr/src/arm.armv7/tmp' LD_LIBRARY_PATH='' MACHINE='arm' MACHINE_ARCH='armv7' MACHINE_CPUARCH='arm' MAKEOBJDIRPREFIX='' MAKESYSPATH='/usr/src/share/mk' MAKE_VERSION='20220726' 
PATH='/usr/obj/usr/src/arm.armv7/tmp/bin:/usr/obj/usr/src/arm.armv7/tmp/usr/sbin:/usr/obj/usr/src/arm.armv7/tmp/usr/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/sbin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/libexec::/sbin:/bin:/usr/sbin:/usr/bin' SRCTOP='/usr/src' OBJTOP='/usr/obj/usr/src/arm.armv7' .MAKE.MAKEFILES='/usr/src/share/mk/sys.mk /usr/src/share/mk/local.sys.env.mk /usr/src/share/mk/src.sys.env.mk /usr/src/share/mk/bsd.mkopt.mk /usr/src/share/mk/src.sys.obj.mk /usr/src/share/mk/local.sys.machine.mk /usr/src/share/mk/meta.sys.mk /usr/src/share/mk/local.meta.sys.env.mk /usr/src/share/mk/auto.obj.mk /usr/src/share/mk/bsd.suffixes.mk /etc/make.conf /usr/src/share/mk/local.sys.mk /usr/src/share/mk/src.sys.mk /etc/src.conf /usr/src/tests/sys/capsicum/Makefile /usr/src/share/mk/src.opts.mk /usr/src/share/mk/bsd.own.mk /usr/src/share/mk/bsd.opts.mk /usr/src/share/mk/bsd.cpu.mk /usr/src/share/mk/bsd.compiler.mk /usr/src/share/mk/bsd.endian.mk /usr/src/share/mk/bsd.linker.mk /usr/src/share/mk/bsd.test.mk /usr/src/share/mk/bsd.init.mk /usr/src/share/mk/local.init.mk /usr/src/share/mk/src.init.mk /usr/src/tests/sys/capsicum/../Makefile.inc /usr/src/tests/Makefile.inc0 /usr/src/share/mk/googletest.test.mk /usr/src/share/mk/googletest.test.inc.mk /usr/src/share/mk/plain.test.mk /usr/src/share/mk/tap.test.mk make[2]: stopped in /usr/src I just re-ran git pull, no changes Thanks for reading, bob prohaska
Re: Missing files on -current
On Sat, Feb 24, 2024 at 03:59:01PM +, Gary Jennejohn wrote: > > The function run_rc_scripts is defined in /usr/src/libexec/rc/rc.subr and > is called in /usr/src/libexec/rc/rc. /etc/rc includes /etc/rc.subr. > > So, maybe one of these files is not up to date under /etc? > My fault, etcupdate reported a conflict and I didn't notice it. Sorry for the noise! bob prohaska
Re: Missing files on -current
On Sat, Feb 24, 2024 at 07:02:19AM -0800, David Wolfskill wrote: > > This is from an amd64 system at main-n268514-61b88a230bac, but > run_rc_scripts is a shell function defined in /etc/rc.subr. > > So the whine about not finding run_rc_scripts would indicate that at > least one of the following is true: > > * The script that should have sourced /etc/rc.subr failed to do so. > > * /etc/rc.csubr is corrupted, and fails to define run_rc_scripts(). > Indeed, it seems to be absent: root@:~ # more /etc/rc.csubr /etc/rc.csubr: No such file or directory root@:~ # However, the same is true of a Pi3 running 14-release p5. It boots reliably once it reaches loader. I wouldn't expect this part of the boot process to be platform dependent. Maybe -current and -release do things differently? > * /etc/rc.subr is missing. Present and accounted for: root@:~ # ls -l /etc/rc.subr -rw-r--r-- 1 root wheel 51911 Nov 18 21:46 /etc/rc.subr root@:~ # Thanks for writing! bob prohaska
Missing files on -current
A Pi4 running -current completed a build/install cycle for world and kernel without obvious errors but failed to reboot, reporting:

...
Warning: no time-of-day clock registered, system time will not be set accurately
Dual Console: Serial Primary, Video Secondary
/etc/rc: run_rc_scripts: not found
/etc/rc: run_rc_scripts: not found
/etc/rc: have: not found
Sat Feb 24 13:42:09 UTC 2024
2024-02-24T13:42:10.007616+00:00 - init 31 - - can't exec getty '/usr/libexec/getty' for port /dev/ttyv1: No such file or directory
...

Uname -a reports:

FreeBSD 15.0-CURRENT FreeBSD 15.0-CURRENT #121 main-n268499-b9870ba93ea9: Fri Feb 23 23:14:59 PST 2024 b...@nemesis.zefox.com:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64distribution.

Power cycling allowed boot to single-user, running fsck -fy reports a clean root file system. /etc/fstab contains

/dev/da0s2a / ufs rw 1 1
/dev/da0s1 /boot/msdos msdosfs rw,noatime 0 0
#tmpfs /tmp tmpfs rw,mode=1777,size=50m 0 0
/dev/da0s2d /usr ufs rw 2 2
/dev/da0s2b none swap sw

There does not seem to be a file named run_rc_scripts present in the filesystem.

Any suggestions on how to back myself out of this corner would be much appreciated!

Thanks for reading,

bob prohaska
Re: bsdinstall use on rpi4
On Sat, Jan 13, 2024 at 05:03:41PM +, void wrote: > > I've used this method with 13-stable and 14-stable, but wondered if > maybe it was depreciated in 15-current. The showstopper is the error marked > [2] which is within seconds followed by [3]. If it was just [1] > then I could work around it. IIRC I didn't use the automatic setup, but rather the manual one. Perhaps that changed something. Also, I was using the snapshot for 14-stable on Pi3, that might be quite different, intentionally or otherwise. "Can't create..." a file looks like a permissions or sequence error, with one of the intermediate directories not created yet. ISTR a thread on this general topic saying to my eye that bsdinstall either didn't or couldn't deal with partitions outside the UFS filesystem. If you got a working msdos partition then I'm badly mistaken. Can't find that thread now. Thanks for reading, bob prohaska
Re: bsdinstall use on rpi4
On Sat, Jan 13, 2024 at 03:26:19PM +, void wrote: > Hi, > > I'm trying to use bsdinstall on > FreeBSD-15.0-CURRENT-arm64-aarch64-RPI-20240111-a61d2c7fbd3c-267507.img > to install to usb3-connected HD, using the 'expert mode' for UFS, > after having initially booted from mmcsd. > > The goal is to boot to usb3 with freebsd on UFS filesystem, and to have > that filesystem split into partitions, and the partitions having various > properties. IOW not to have it all on /. I'd like to use GPT instead of MBR. > > Is this (bsdinstall) method 'correct' or should I use some other method? > > I've tried this method but get errors after the fetching phase. > > 1. "manifest not found on local disk and will be fetched from an > unverified source..." http://void.f-m.fm.user.fm/error1.png > > 2. "error while extracting base.txz: can't create > /usr/share/untrusted/Sonera_Class_2_Root_CA.pem" >http://void.f-m.fm.user.fm/error2.png > > 3. "could not set root password. An installation step has been >aborted. Would you like to restart the installation or exitthe > installer?" >http://void.f-m.fm.user.fm/error3.png > I tried this using 14-release on a Pi3 with a usb mechanical hard disk. I don't recall seeing the errors reported above, but found that the msdos partition wasn't populated. I copied files manually. The resulting host is finicky about booting, sometimes requiring intervention at the serial console to prod u-boot to find the usb disk. This particular Pi is booting without a microSD, it's possible the usb-sata adapter contributes to the problem. It might be worth trying the "bootcode.bin-only" method to see if that helps. Perhaps others can say more. Once FreeBSD is up there have been no obvious problems. Thanks for reading, hth bob prohaska
Re: How much survives an install/reboot cycle?
On Mon, Nov 20, 2023 at 09:12:45AM +0800, Zhenlei Huang wrote: > > > > On Nov 19, 2023, at 11:51 PM, bob prohaska wrote: > > > > How much of a running system's state survives a reboot? I used to think > > the answer was "nothing", but from time to time a second reboot behaves > > a little differently from the previous one. > > Warner has a good description about that. I totally agree. > > > > > The most recent example was an update to bpf.c: Prior to the update an > > armv7 system had been inclined to drop ssh connections left up for days. > > After updating and running a build/install cycle the behavior persisted, > > but since a second reboot with no intentional changes it has stopped. > > The most recent change to bpf.c is 7a974a649848 (bpf: Make dead_bpf_if const) > . > It is not a functional change, and I do not think it will affect ssh. > There could be issues under the earth. > That is most helpful. Very likely the change I saw is simply coincidence. > Anyway please do not hesitate to report if you get recovered by reverting > 7a974a649848. In this case I don't want to revert, the new behavior is desirable. My only puzzle was the seeming delay in its appearance. The only consistent issue remaining is reported in Bug 273566 . It finally dawned on me that the garbage characters must be originated on the USB end, transmitted to the getty process watching the serial end and get stuck in the transmit buffer when the link goes down. When the serial link comes back up they appear on the receiving console display. Many thanks to you and Warner! bob prohaska
How much survives an install/reboot cycle?
How much of a running system's state survives a reboot? I used to think the answer was "nothing", but from time to time a second reboot behaves a little differently from the previous one.

The most recent example was an update to bpf.c: Prior to the update an armv7 system had been inclined to drop ssh connections left up for days. After updating and running a build/install cycle the behavior persisted, but since a second reboot with no intentional changes it has stopped.

I've not tampered with nextboot, so I don't think that's it. Maybe I'm just imagining things.

Thanks for reading,

bob prohaska
Re: www/chromium will not build on a host w/ 8 CPU and 16G mem [RPi4B 8 GiByte example]
On Fri, Aug 25, 2023 at 02:21:33AM -0700, Mark Millard wrote:
>
> That will not help avoid the R_AARCH64_ABS64 abuse,
> unfortunately.
>

Thank you for the analysis. I've posted a bug, id=273349. Sounds like I shouldn't hold my breath 8-(

bob prohaska
Re: www/chromium will not build on a host w/ 8 CPU and 16G mem [RPi4B 8 GiByte example]
On Thu, Aug 24, 2023 at 03:20:50PM -0700, Mark Millard wrote: > bob prohaska wrote on > Date: Thu, 24 Aug 2023 19:44:17 UTC : > > > On Fri, Aug 18, 2023 at 08:05:41AM +0200, Matthias Apitz wrote: > > > > > > sysctl vfs.read_max=128 > > > sysctl vfs.aio.max_buf_aio=8192 > > > sysctl vfs.aio.max_aio_queue_per_proc=65536 > > > sysctl vfs.aio.max_aio_per_proc=8192 > > > sysctl vfs.aio.max_aio_queue=65536 > > > sysctl vm.pageout_oom_seq=120 > > > sysctl vm.pfault_oom_attempts=-1 > > > > > > > Just tried these settings on a Pi4, 8GB. Seemingly no help, > > build of www/chromium failed again, saying only: > > > > ===> Compilation failed unexpectedly. > > Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to > > the maintainer. > > *** Error code 1 > > > > No messages on the console at all, no indication of any swap use at all. > > If somebody can tell me how to invoke MAKE_JOBS_UNSAFE=yes, either > > locally or globally, I'll give it a try. But, if it's a system problem > > I'd expect at least a peep on the console > > Are you going to post the log file someplace? http://nemesis.zefox.com/~bob/data/logs/bulk/main-default/2023-08-20_16h11m59s/logs/errors/chromium-115.0.5790.170_1.log > You may have missed an earlier message. Yes, I did. 
Some (very long) lines above, there is:

[ 96% 53691/55361] "python3" "../../build/toolchain/gcc_link_wrapper.py" --output="./v8_context_snapshot_generator" -- c++ -fuse-ld=lld -Wl,--build-id=sha1 -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--icf=all -Wl,--color-diagnostics -Wl,--undefined-version -Wl,-mllvm,-enable-machine-outliner=never -no-canonical-prefixes -Wl,-O2 -Wl,--gc-sections -rdynamic -pie -Wl,--disable-new-dtags -Wl,--icf=none -L/usr/local/lib -fstack-protector-strong -L/usr/local/lib -o "./v8_context_snapshot_generator" -Wl,--start-group @"./v8_context_snapshot_generator.rsp" -Wl,--end-group -lpthread -lgmodule-2.0 -lglib-2.0 -lgobject-2.0 -lgthread-2.0 -lintl -licui18n -licuuc -licudata -lnss3 -lsmime3 -lnssutil3 -lplds4 -lplc4 -lnspr4 -ldl -lkvm -lexecinfo -lutil -levent -lgio-2.0 -ljpeg -lpng16 -lxml2 -lxslt -lexpat -lwebp -lwebpdemux -lwebpmux -lharfbuzz-subset -lharfbuzz -lfontconfig -lopus -lopenh264 -lm -lz -ldav1d -lX11 -lXcomposite -lXdamage -lXext -lXfixes -lXrender -lXrandr -lXtst -lepoll-shim -ldrm -lxcb -lxkbcommon -lgbm -lXi -lGL -lpci -lffi -ldbus-1 -lpangocairo-1.0 -lpango-1.0 -lcairo -latk-1.0 -latk-bridge-2.0 -lsndio -lFLAC -lsnappy -latspi
FAILED: v8_context_snapshot_generator

Then, a bit further down in the file, there is a series of

ld.lld: error: relocation R_AARCH64_ABS64 cannot be used against local symbol; recompile with -fPIC

complaints. It is unclear whether the two kinds of complaints are related, or whether they're the first.

> How long had it run before stopping?

95 hours, give or take. Nothing about a timeout was reported.

> How does that match up with the MAX_EXECUTION_TIME
> and NOHANG_TIME and the like that you have poudriere set
> up to use ( /usr/local/etc/poudriere.conf ).
NOHANG_TIME=44400
MAX_EXECUTION_TIME=1728000
MAX_EXECUTION_TIME_EXTRACT=144000
MAX_EXECUTION_TIME_INSTALL=144000
MAX_EXECUTION_TIME_PACKAGE=11728000

Admittedly some are plain silly, I just started tacking on zeros after getting timeouts and being unable to match the error message and variable name. I checked for duplicates this time, however.

> Something relevant for the question is what you have for:
>
> # Grep build logs to determine a possible build failure reason. This is
> # only shown on the web interface.
> # Default: yes
> DETERMINE_BUILD_FAILURE_REASON=no
>
> Using DETERMINE_BUILD_FAILURE_REASON leads to large builds
> running for a long time after it starts the process of
> stopping from a timeout the grep activity takes a long
> time and the build activity is not stopped during the
> grep.
>
>
> vm.pageout_oom_seq=120 and vm.pfault_oom_attempts=-1 make
> sense to me for certain kinds of issues involved in large
> builds, presuming sufficient RAM+SWAP for how it is set
> up to operate. vm.pageout_oom_seq is associated with
> console/log messages. if one runs out of RAM+SWAP,
> vm.pfault_oom_attempts=-1 tends to lead to deadlock. But
> it allows slow I/O to have the time to complete and so
> can be useful.
>
> I'm not sure that any vfs.aio.* is actually involved: special
> system calls are involved, splitting requests vs. retrieving
> the status of completed requests later. Use of aio has to be
> explicit in the running software from what I can tell. I've
> no information about which software builds might be using aio
> during the build activity.
>
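To compare those limits with the roughly 95-hour run, it helps to see them in hours and days. The sketch below assumes the values are in seconds, which is how the comments in the sample poudriere.conf describe them; the numbers are copied from this message and the helper name describe is invented for the example.

```python
# Timeout settings quoted in the message, assumed to be in seconds.
settings = {
    "NOHANG_TIME": 44400,
    "MAX_EXECUTION_TIME": 1728000,
    "MAX_EXECUTION_TIME_EXTRACT": 144000,
    "MAX_EXECUTION_TIME_INSTALL": 144000,
    "MAX_EXECUTION_TIME_PACKAGE": 11728000,
}


def describe(seconds: int) -> str:
    """Render a duration in seconds as hours and days, one decimal each."""
    hours = seconds / 3600
    days = hours / 24
    return f"{hours:.1f} h ({days:.1f} d)"


for name, seconds in settings.items():
    print(f"{name:<28} {seconds:>9} s = {describe(seconds)}")
```

Under that assumption MAX_EXECUTION_TIME works out to 480 hours, well beyond a 95-hour build, which fits with no timeout having been reported.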
Re: www/chromium will not build on a host w/ 8 CPU and 16G mem
On Fri, Aug 18, 2023 at 08:05:41AM +0200, Matthias Apitz wrote: > > sysctl vfs.read_max=128 > sysctl vfs.aio.max_buf_aio=8192 > sysctl vfs.aio.max_aio_queue_per_proc=65536 > sysctl vfs.aio.max_aio_per_proc=8192 > sysctl vfs.aio.max_aio_queue=65536 > sysctl vm.pageout_oom_seq=120 > sysctl vm.pfault_oom_attempts=-1 > Just tried these settings on a Pi4, 8GB. Seemingly no help, build of www/chromium failed again, saying only: ===> Compilation failed unexpectedly. Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to the maintainer. *** Error code 1 No messages on the console at all, no indication of any swap use at all. If somebody can tell me how to invoke MAKE_JOBS_UNSAFE=yes, either locally or globally, I'll give it a try. But, if it's a system problem I'd expect at least a peep on the console.... Thanks for reading, bob prohaska
Re: alpha-1 armv7 git failed: fatal: pack is corrupted (SHA1 mismatch)
On Sun, Aug 13, 2023 at 12:45:12PM -0700, Mark Millard wrote: > > Wow. I'm going to suggest doing a clone (to a temporary > place) on one or more different types of system, such > as aarch64 or amd64. If, say, aarch64 works but armv7 > does not, then the corruption may well be in some armv7 > FreeBSD handling of data transfers, in places common > to both https:// and ssh:// use. > That seems to have worked on a Pi4 8GB running -current: root@nemesis:/usr # mv src src.old root@nemesis:/usr # git clone -o freebsd ssh://anon...@git.freebsd.org/src.git /usr/src Cloning into '/usr/src'... The authenticity of host 'git.freebsd.org (96.47.72.109)' can't be established. ED25519 key fingerprint is SHA256:y1ljKrKMD3lDObRUG3xJ9gXwEIuqnh306tSyFd1tuZE. This key is not known by any other names. Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added 'git.freebsd.org' (ED25519) to the list of known hosts. remote: Enumerating objects: 4323641, done. remote: Counting objects: 100% (381285/381285), done. remote: Compressing objects: 100% (28204/28204), done. remote: Total 4323641 (delta 375527), reused 353081 (delta 353081), pack-reused 3942356 Receiving objects: 100% (4323641/4323641), 1.54 GiB | 390.00 KiB/s, done. Resolving deltas: 100% (3432012/3432012), done. Checking objects: 100% (16777216/16777216), done. Updating files: 100% (95944/95944), done. root@nemesis:/usr # > Note that, if you get a good clone, you can locally > copy the tree over to the armv7 media. But that is > not the point of my suggestion above. Under the circumstances it seems like the path of least resistance. Can I do something simple like sftp, using get -r ? Any trick to updating the copy? Many thanks! bob prohaska
Re: alpha-1 armv7 git failed: fatal: pack is corrupted (SHA1 mismatch)
On Sat, Aug 12, 2023 at 08:45:54PM -0700, Mark Millard wrote: > > You might need to use the ssh alternative if your > context allows it: > > ssh://anon...@git.freebsd.org/src.git > > There was a time when git fetch proved unreliable > in my context and I got around it via ssh use. It > also took less time transferring. > Seemingly no dice: # git clone -o freebsd ssh://anon...@git.freebsd.org/src.git /usr/src Cloning into '/usr/src'... remote: Enumerating objects: 4323641, done. remote: Counting objects: 100% (381285/381285), done. remote: Compressing objects: 100% (28204/28204), done. remote: Total 4323641 (delta 375529), reused 353081 (delta 353081), pack-reused 3942356 Receiving objects: 100% (4323641/4323641), 1.54 GiB | 656.00 KiB/s, done. fatal: pack is corrupted (SHA1 mismatch) fatal: fetch-pack: invalid index-pack output # I've added freebsd-current to the cc list 8-( Thanks for writing! bob prohaska
Re: Using etcupdate resolve, was Re: Surprise null root password
I want to thank Patrick, Dmitry and Mark for providing orientation sufficient to make some headway. The Handbook simply says

"If etcupdate(8) is not able to merge a file automatically, the merge conflicts can be resolved with manual interaction by issuing:

# etcupdate resolve"

While not wrong, it's certainly less than the whole story 8-)

It's unfortunate that the example posted was a trivial case; I certainly didn't tamper with BSD.tests.dist, and tf was the correct response.

If I'm understanding correctly, the file presented by the df and e options contains essentially all possible versions, delimited by <<<<<<<<<<<<<, ||| and >>>>>>>>> characters. Once edited, that will become the new local version of the file.

If this is mistaken please say so.

bob prohaska
Using etcupdate resolve, was Re: Surprise null root password
Here's an example of the puzzles faced when using etcupdate that have so far proved baffling:

On running etcupdate resolve, the system reports

Resolving conflict in '/etc/mtree/BSD.tests.dist':
Select: (p) postpone, (df) diff-full, (e) edit,
        (h) help for more options: df
--- /etc/mtree/BSD.tests.dist 2023-05-29 08:29:48.174762000 -0700
+++ /var/db/etcupdate/conflicts/etc/mtree/BSD.tests.dist 2023-06-13 22:55:04.284491000 -0700
@@ -442,6 +442,16 @@
 ..
 ifconfig
 ..
+<<<<<<< yours
+||| original
+md5
+..
+===
+ipfw
+..
+md5
+..
+>>>>>>> new
 mdconfig
 ..
 nvmecontrol
Select: (p) postpone, (df) diff-full, (e) edit,
        (h) help for more options: e

Selecting option e for edit brings up what appears to be a vi window. Using search I can find the line with mdconfig:

<<<<<<< yours
||| original
md5
..
===
ipfw
..
md5
..
>>>>>>> new
mdconfig
..
nvmecontrol
..
pfctl
files
..
..
ping
..

The puzzle at this point is what to do. It looks like the points of interest are the lines marked "yours" and "new", but I'll admit to bafflement about which to modify and whether the modifications needed include the <<<< and >>>>> characters.

If there's a relevant man section please point it out.

Thanks for reading,

bob prohaska
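For what it's worth, the marker layout in the conflict file can be read mechanically: lines after the "yours" marker are the local version, lines after "original" are the old stock version, lines after the === separator are the incoming version, and the "new" marker closes the region. The following is only a sketch of that reading, not how etcupdate itself works. The function name resolve and the prefix matching on the first three marker characters are assumptions made for the example; the sample text is the conflict shown above.

```python
def resolve(lines, keep="new"):
    """Collapse conflict regions, keeping one side and dropping the markers.

    keep is one of "yours", "original" or "new".
    """
    out = []
    section = None  # None outside a conflict region
    for line in lines:
        if line.startswith("<<<"):
            section = "yours"
        elif line.startswith("|||") and section is not None:
            section = "original"
        elif line.startswith("===") and section is not None:
            section = "new"
        elif line.startswith(">>>") and section is not None:
            section = None
        elif section is None or section == keep:
            # Ordinary line, or a line from the side being kept.
            out.append(line)
    return out


sample = """ifconfig
..
<<<<<<< yours
||| original
md5
..
===
ipfw
..
md5
..
>>>>>>> new
mdconfig
..""".splitlines()

print("\n".join(resolve(sample, keep="new")))
```

In this example the "yours" side is empty, so keeping "new" simply adds the ipfw and md5 entries between ifconfig and mdconfig. Doing the same by hand in the editor means deleting the marker lines along with whichever side is not wanted.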
Re: Surprise null root password
On Tue, May 30, 2023 at 11:02:13AM -0700, Mark Millard wrote: > bob prohaska wrote on > Date: Tue, 30 May 2023 15:36:21 UTC : > > > On Tue, May 30, 2023 at 08:41:33AM +0200, Alexander Leidinger wrote: > > > > > > Quoting bob prohaska (from Fri, 26 May 2023 16:26:06 > > > -0700): > > > > > > > On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote: > > > > > > > > > > The question is how you update the configuration files, > > > > > mergemaster/etcupdate/something else? > > > > > > > > > > > > > Via etcupdate after installworld. In the event the system > > > > requests manual intervention I accept "theirs all". It seems > > > > odd if that can null a root password. > > > > > > > > Still, it does seem an outside possibility. I could see it adding > > > > system users, but messing with root's existing password seems a > > > > bit unexpected. > > > > > > As you are posting to -current@, I expect you to report this issue about > > > 14-current systems. As such: there was a "recent" change (2021-10-20) to > > > the > > > root entry to change the shell. > > > https://cgit.freebsd.org/src/commit/etc/master.passwd?id=d410b585b6f00a26c2de7724d6576a3ea7d548b7 > > > > > > By blindly accepting all changes, this has reset the PW to the default > > > setting (empty). > > > > So it's a line-by-line merge. That's the most sensible explanation > > available. > > > > > > > > I suggest to review changes ("df" instead of "tf" in etcupdate) to at > > > least > > > those files which you know you have modified, including the password/group > > > stuff. After that you can decide if the diff which is shown with "df" can > > > be > > > applied ("tf"), or if you want to keep the old version ("mf"), or if you > > > want to modify the current file ("e", with both versions present in the > > > file > > > so that you can copy/paste between the different versions and keep what > > > you > > > need). 
> > >
> >
> > The key sequences required to copy and paste between files in the edit screen
> > were elusive. Probably it was thought self-evident, but not for me. I last tried
> > it long ago, via mergemaster. Is there a guide to commands for merging files
> > using etcupdate? Is it in the vi man page? I couldn't find it.
>
> # man etcupdate
> . . .
> CONFIG FILE
> The etcupdate utility can also be configured by setting variables in an
> optional configuration file named /etc/etcupdate.conf. Note that command
> line options override settings in the configuration file. The
> configuration file is executed by sh(1), so it uses that syntax to set
> configuration variables. The following variables can be set:
>
> . . .
>
> EDITOR Specify a program to edit merge conflicts.
> . . .
> ENVIRONMENT
> The etcupdate utility uses the program identified in the EDITOR
> environment variable to edit merge conflicts. If EDITOR is not set,
> vi(1) is used as the default editor.
>
>
>
> So, if you do not want to use vi, you can use either the EDITOR
> environment variable or an EDITOR assignment in
> /etc/etcupdate.conf to change what editor etcupdate uses for
> you to edit merge conflicts with.

My difficulty is precisely a lack of skill with vi, which I've used and cursed since starting with 386BSD. Evidently I'm a slow learner. I tried other editors, but vi is the only one always available.

For the moment, etcupdate isn't asking for manual intervention. When it next does I'll pay closer attention and ask better questions.

Thanks to you in particular and everybody else who has helped!

bob prohaska
Re: Surprise null root password
On Tue, May 30, 2023 at 08:41:33AM +0200, Alexander Leidinger wrote:
>
> Quoting bob prohaska (from Fri, 26 May 2023 16:26:06 -0700):
>
> > On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote:
> > >
> > > The question is how you update the configuration files,
> > > mergemaster/etcupdate/something else?
> > >
> >
> > Via etcupdate after installworld. In the event the system
> > requests manual intervention I accept "theirs all". It seems
> > odd if that can null a root password.
> >
> > Still, it does seem an outside possibility. I could see it adding
> > system users, but messing with root's existing password seems a
> > bit unexpected.
>
> As you are posting to -current@, I expect you to report this issue about
> 14-current systems. As such: there was a "recent" change (2021-10-20) to the
> root entry to change the shell.
> https://cgit.freebsd.org/src/commit/etc/master.passwd?id=d410b585b6f00a26c2de7724d6576a3ea7d548b7
>
> By blindly accepting all changes, this has reset the PW to the default
> setting (empty).

So it's a line-by-line merge. That's the most sensible explanation available.

>
> I suggest to review changes ("df" instead of "tf" in etcupdate) to at least
> those files which you know you have modified, including the password/group
> stuff. After that you can decide if the diff which is shown with "df" can be
> applied ("tf"), or if you want to keep the old version ("mf"), or if you
> want to modify the current file ("e", with both versions present in the file
> so that you can copy/paste between the different versions and keep what you
> need).
>

The key sequences required to copy and paste between files in the edit screen were elusive. Probably it was thought self-evident, but not for me. I last tried it long ago, via mergemaster. Is there a guide to commands for merging files using etcupdate? Is it in the vi man page? I couldn't find it.

Thanks for writing!

bob prohaska
Re: Surprise null root password
It turns out all seven hosts in my cluster report a null password for root in /usr/src/etc/master.passwd:

root::0:0::0:0:Charlie &:/root:/bin/sh

Is that intentional?

Thanks for reading,

bob prohaska
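A quick way to check a whole file for this is to split each line on colons and look at the second field. The sketch below assumes the ten-field layout that passwd(5) documents for master.passwd (name, password, uid, gid, class, change, expire, gecos, home_dir, shell). The function names are invented for the example, and the daemon entry is only an illustrative non-empty case.

```python
# Field order assumed from passwd(5) for master.passwd-format lines.
FIELDS = ("name", "password", "uid", "gid", "class",
          "change", "expire", "gecos", "home_dir", "shell")


def parse_master_passwd_line(line: str) -> dict:
    """Split one master.passwd-format line into named fields."""
    parts = line.rstrip("\n").split(":")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def has_empty_password(line: str) -> bool:
    """True when the password field of the entry is empty."""
    return parse_master_passwd_line(line)["password"] == ""


# The root line quoted in the message above.
root_line = "root::0:0::0:0:Charlie &:/root:/bin/sh"
# Illustrative entry with a locked ("*") password field.
daemon_line = "daemon:*:1:1::0:0:Owner of many system processes:/root:/usr/sbin/nologin"

for entry in (root_line, daemon_line):
    name = parse_master_passwd_line(entry)["name"]
    print(name, "EMPTY" if has_empty_password(entry) else "set or locked")
```

Run against the copy under /usr/src/etc this would flag root, since the stock template carries no password; the interesting check is the installed /etc/master.passwd after a merge.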
Re: Surprise null root password
On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote: > > The question is how you update the configuration files, > mergemaster/etcupdate/something else? > Via etcupdate after installworld. In the event the system requests manual intervention I accept "theirs all". It seems odd if that can null a root password. Still, it does seem an outside possibility. I could see it adding system users, but messing with root's existing password seems a bit unexpected. Thanks very much for raising the point! bob prohaska
Re: Surprise null root password
On Fri, May 26, 2023 at 07:48:04PM +0100, Ben Laurie wrote:
> -T on ls will give you full time resolution...
>

More's the wonder:

root@www:/usr/src # ls -lT /etc/*p*wd*
-rw-------  1 root  wheel   2099 May 10 17:20:33 2023 /etc/master.passwd
-rw-r--r--  1 root  wheel   1831 May 10 17:20:33 2023 /etc/passwd
-rw-r--r--  1 root  wheel  40960 May 10 17:20:33 2023 /etc/pwd.db
-rw-------  1 root  wheel  40960 May 10 17:20:33 2023 /etc/spwd.db

For sake of clarity, /etc/master.passwd's root line is
root::0:0::0:0:Charlie &:/root:/bin/sh
while /etc/passwd's root line is
root:*:0:0:Charlie &:/root:/bin/sh

I just noticed a second host (Pi3) which is similarly affected. It completed a build/install cycle on May 25, uname -a yields
FreeBSD www.zefox.org 14.0-CURRENT FreeBSD 14.0-CURRENT #46 main-n263122-57a3a161a92f: Thu May 25 21:25:57 PDT 2023 b...@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64

On this host I get
root@www:/usr/src # ls -lT /etc/*p*wd*
-rw-------  1 root  wheel   1796 Nov 12 16:00:03 2022 /etc/master.passwd
-rw-r--r--  1 root  wheel   2430 Oct  1 19:40:22 2020 /etc/passwd
-rw-r--r--  1 root  wheel  40960 Oct  1 19:40:22 2020 /etc/pwd.db
-rw-------  1 root  wheel  40960 Oct  1 19:40:22 2020 /etc/spwd.db
(at least the dates make more sense)

The root line in /etc/master.passwd is
root::0:0::0:0:Charlie &:/root:/bin/sh

I didn't catch any null password reports in the security emails, most likely through lack of attention. As with the first case, passwords seem to work normally (null rejected, normal accepted).

Any advice appreciated!

bob prohaska

> On Fri, 26 May 2023 at 19:45, bob prohaska wrote:
>
> > On Fri, May 26, 2023 at 01:03:19PM -0500, Mike Karels wrote:
> > > On 26 May 2023, at 12:35, bob prohaska wrote:
> > >
> > > > While going through normal security email from a Pi2
> > > > running -current I was disturbed to find:
> > > >
> > > > Checking for passwordless accounts:
> > > > root::0:0::0:0:Charlie &:/root:/bin/sh
> > >
> > > [details snipped]
> > > /etc/master.passwd is the source, but the operational database
> > > is /etc/spwd.db. You should check the date on it as well.
> > > You can rebuild it with "pwd_mkdb -p /etc/master.passwd".
> >
> > At present the host reports:
> > root@www:/usr/src # ls -l /etc/*p*wd*
> > -rw-------  1 root  wheel   2099 May 10 17:20 /etc/master.passwd
> > -rw-r--r--  1 root  wheel   1831 May 10 17:20 /etc/passwd
> > -rw-r--r--  1 root  wheel  40960 May 10 17:20 /etc/pwd.db
> > -rw-------  1 root  wheel  40960 May 10 17:20 /etc/spwd.db
> >
> > /etc/master.passwd reports a null password for root, /etc/passwd
> > has the usual asterisk. The running system reports
> > root@www:/usr/src # uname -a
> > FreeBSD www.zefox.com 14.0-CURRENT FreeBSD 14.0-CURRENT #25
> > main-743516d51f: Thu May 18 00:08:40 PDT 2023
> > b...@www.zefox.com:/usr/obj/usr/src/arm.armv7/sys/GENERIC
> > arm
> > root@www:/usr/src # uname -KU
> > 1400088 1400088
> >
> > I've never manually run pwd_mkdb and most certainly
> > never set a null password for root. It looks rather
> > as if a null password was set for root within one
> > minute after running pwd_mkdb.
> >
> > At this point I'm unsure how to sort out what happened.
> > The obvious next step is to re-establish a non-null
> > root password and rebuild both databases.
> >
> > Is it worthwhile to check for backdoors? There's no
> > evidence to suggest any malicious action (and plenty
> > of stupidity on my end) but the tale is getting
> > curiouser and curiouser.
> >
> > Many thanks for the quick reply!
> >
> > bob prohaska
Re: Surprise null root password
On Fri, May 26, 2023 at 01:03:19PM -0500, Mike Karels wrote:
> On 26 May 2023, at 12:35, bob prohaska wrote:
>
> > While going through normal security email from a Pi2
> > running -current I was disturbed to find:
> >
> > Checking for passwordless accounts:
> > root::0:0::0:0:Charlie &:/root:/bin/sh
>
> [details snipped]
> /etc/master.passwd is the source, but the operational database
> is /etc/spwd.db. You should check the date on it as well.
> You can rebuild it with "pwd_mkdb -p /etc/master.passwd".

At present the host reports:
root@www:/usr/src # ls -l /etc/*p*wd*
-rw-------  1 root  wheel   2099 May 10 17:20 /etc/master.passwd
-rw-r--r--  1 root  wheel   1831 May 10 17:20 /etc/passwd
-rw-r--r--  1 root  wheel  40960 May 10 17:20 /etc/pwd.db
-rw-------  1 root  wheel  40960 May 10 17:20 /etc/spwd.db

/etc/master.passwd reports a null password for root, /etc/passwd has the usual asterisk. The running system reports
root@www:/usr/src # uname -a
FreeBSD www.zefox.com 14.0-CURRENT FreeBSD 14.0-CURRENT #25 main-743516d51f: Thu May 18 00:08:40 PDT 2023 b...@www.zefox.com:/usr/obj/usr/src/arm.armv7/sys/GENERIC arm
root@www:/usr/src # uname -KU
1400088 1400088

I've never manually run pwd_mkdb and most certainly never set a null password for root. It looks rather as if a null password was set for root within one minute after running pwd_mkdb.

At this point I'm unsure how to sort out what happened. The obvious next step is to re-establish a non-null root password and rebuild both databases.

Is it worthwhile to check for backdoors? There's no evidence to suggest any malicious action (and plenty of stupidity on my end) but the tale is getting curiouser and curiouser.

Many thanks for the quick reply!

bob prohaska
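The suggestion to compare the date on /etc/spwd.db with that of /etc/master.passwd can be sketched in a few lines of Python. This is only an illustration, assuming that a database older than its source is worth a closer look; the function name stale_databases and the epoch values are hypothetical and are not taken from the hosts in this thread.

```python
def stale_databases(mtimes, source="/etc/master.passwd"):
    """Return the paths whose modification time is older than the source's.

    mtimes maps a path to its modification time in epoch seconds.
    """
    source_mtime = mtimes[source]
    return sorted(
        path
        for path, mtime in mtimes.items()
        if path != source and mtime < source_mtime
    )


# Hypothetical epoch seconds, invented for illustration only.
example = {
    "/etc/master.passwd": 1_700_000_600,
    "/etc/passwd": 1_700_000_600,
    "/etc/pwd.db": 1_700_000_000,
    "/etc/spwd.db": 1_700_000_000,
}
print(stale_databases(example))
```

On a live system the dictionary could be filled from os.stat(path).st_mtime for each file. Identical timestamps, as in the listing above, would yield an empty list.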
Surprise null root password
While going through normal security email from a Pi2 running -current I was disturbed to find:

Checking for passwordless accounts:
root::0:0::0:0:Charlie &:/root:/bin/sh

The machine had locked up on a -j4 buildworld since sending the mail, so it was taken off the net, power cycled and started single-user. Sure enough, /etc/master.passwd contained a null password for root, but the last modification to the file was two weeks ago according to ls -l.

Stranger still, when fsck'd and brought up multi-user, the normal password was still honored and a null password rejected for both the regular and root accounts.

AFAIK, /etc/master.passwd is _the_ password repository, but clearly I'm wrong. If somebody can tell me what's going on and what to check for before placing the machine back on line it would be much appreciated.

Thanks for reading,

bob prohaska
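For readers who want to repeat the check by hand, here is a minimal Python sketch of the idea behind the "passwordless accounts" report: in a master.passwd-style line the second colon-separated field holds the password hash, and an empty field means no password. The function name passwordless_accounts is hypothetical, the daemon line is an illustrative sample, and this is not claimed to be how the periodic security script is implemented.

```python
def passwordless_accounts(lines):
    """Return account names whose password field (second field) is empty."""
    names = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 2 and fields[1] == "":
            names.append(fields[0])
    return names


# The root line is the one quoted in the report; the rest is a made-up sample.
sample = [
    "# hypothetical excerpt in master.passwd format",
    "root::0:0::0:0:Charlie &:/root:/bin/sh",
    "daemon:*:1:1::0:0:Owner of many system processes:/root:/usr/sbin/nologin",
]
print(passwordless_accounts(sample))
```

Note that this only inspects the text file. As the replies point out, logins are checked against the database built from it, so the two can disagree.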
Re: Stray characters in command history
Here is another example, perhaps a bit clearer. The ssh connection to the first Pi3 in the chain had dropped, so it was re-established via a regular user login, then su was invoked and tip run:

.
To change this login announcement, see motd(5).
Want to go the directory you were just in?
Type "cd -"
bob@pelorus:~ % su
Password:
# tip ucom
Stale lock on cuaU0 PID=2487... overriding.
connected
osed by r31 www s <<<< This appeared spontaneously, then I hit return.
osed: Command not found. <<<<< I didn't type anything.
bob@www:/usr/src %<<<<< The shell prompt on the 2nd Pi3's serial console.

Clearly this has nothing to do with sshd. Might it simply be a misdirected echo of console output generated by a (dead or broken) tip connection? The first example looked possibly malicious, this does not.

Thanks for reading,

bob prohaska

On Sun, May 21, 2023 at 06:49:33AM -0700, bob prohaska wrote:
> Lately I've been playing with buildworld on a Pi3 running -current. The same machine
> acts as a terminal server for a second Pi3 also running -current in my "cluster".
> I ssh into the first Pi3, su to root and run tip to pick up a serial connection to
> the second Pi's console. Both machines are within a week of up-to-date.
>
> While building world on the first Pi3 the ssh connection frequently drops and must be
> re-established. If there was a shell running on the serial console of the second
> Pi3 it typically keeps running and when the tip session is restarted disgorges a
> stream of accumulated console message.
>
> This morning the same thing happened, but I noticed something odd. The stream of
> messages (all login failure announcements from ssh) ended with
>
>
> May 21 06:15:00 www sshd[33562]: error: Fssh_kex_exchange_identification: banner line contains invalid characters
> *+May 21 06:15:19 www sshd[33565]: error: Fssh_kex_exchange_identification: Connection closed by remote host
> May 21 06:15:33 www sshd[33613]: error: Protocol major versions differ: 2 vs. 1
>
> At that point I hit carriage return and got
> *+: No match.
>
> I did not type the *+ so it looks like the characters were somehow introduced elsewhere,
> possibly from the ssh failure message. How they got into the command stream is unclear.
>
> This strikes me as undesirable at best and possibly much worse. The shell reporting
> the "no match" was a regular user shell, but if I'd been su'd to root it appears that
> the unmatched characters would be seen by the root shell as input.
>
> This more-or-less fits with the pattern seen earlier with reboots observed via serial
> console halting on un-typed keystrokes. Those halts were attributed to electrical noise
> on the serial line, but this looks like something injected via part of the network
> login process. Reboot pauses have been an ongoing phenomenon for months, this is the
> first time I've noticed the "invalid characters" message from ssh on the console.
>
> Thanks for reading, apologies if I'm being a worrywart.
>
> bob prohaska
Stray characters in command history
Lately I've been playing with buildworld on a Pi3 running -current. The same machine acts as a terminal server for a second Pi3 also running -current in my "cluster". I ssh into the first Pi3, su to root and run tip to pick up a serial connection to the second Pi's console. Both machines are within a week of up-to-date. While building world on the first Pi3 the ssh connection frequently drops and must be re-established. If there was a shell running on the serial console of the second Pi3 it typically keeps running and when the tip session is restarted disgorges a stream of accumulated console message. This morning the same thing happened, but I noticed something odd. The stream of messages (all login failure announcements from ssh) ended with May 21 06:15:00 www sshd[33562]: error: Fssh_kex_exchange_identification: banner line contains invalid characters *+May 21 06:15:19 www sshd[33565]: error: Fssh_kex_exchange_identification: Connection closed by remote host May 21 06:15:33 www sshd[33613]: error: Protocol major versions differ: 2 vs. 1 At that point I hit carriage return and got *+: No match. I did not type the *+ so it looks like the characters were somehow introduced elsewhere, possibly from the ssh failure message. How they got into the command stream is unclear. This strikes me as undesirable at best and possibly much worse. The shell reporting the "no match" was a regular user shell, but if I'd been su'd to root it appears that the unmatched characters would be seen by the root shell as input. This more-or-less fits with the pattern seen earlier with reboots observed via serial console halting on un-typed keystrokes. Those halts were attributed to electrical noise on the serial line, but this looks like something injected via part of the network login process. Reboot pauses have been an ongoing phenomenon for months, this is the first time I've noticed the "invalid characters" message from ssh on the console. 
Thanks for reading, apologies if I'm being a worrywart. bob prohaska
Re: Making -current machines accept mail from sendmail
On Sat, Mar 04, 2023 at 10:57:59AM -0800, David Wolfskill wrote: > > You might start with checking the output of "sockstat -l" on the machine > that is intended to receive the mail: SMTP is expected to be on 25/tcp. > > If the intended recipient machine does NOT show that 25/tcp is being > listened to, you will need to (install &) start a process to do so. > That may well involve installing (& starting) some MTA -- whether > sendmail, postfix, exim, or even qmail (or something else). > > (I expect that nothing is listening on 25/tcp, as that is what > "connection refused" implies.) > Indeed, that's the case. It looks as if dma isn't intended to replace sendmail, so I'll take the hint in UPDATING and turn sendmail back on. Thank you! bob prohaska
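The point that "connection refused" implies nothing is listening on 25/tcp can also be checked from the sending side. The following Python sketch is an assumption-laden illustration: the function name probe and the default timeout are made up here, and it only classifies the result of a TCP connection attempt without speaking any SMTP.

```python
import socket


def probe(host, port=25, timeout=3.0):
    """Classify a TCP connection attempt to host:port."""
    try:
        # The connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        # The host answered, but no process accepts connections on this port.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
```

A call such as probe("mail.example.org") (a placeholder host name) returning "refused" would match the symptom reported by the -stable hosts. A "timeout" would point at filtering or routing instead of a missing MTA.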
Making -current machines accept mail from sendmail
Is there some special step to turn on dma so hosts can receive email from a sendmail-using host? I've got three hosts using 12/stable (hence sendmail) and a few more running -current (hence dma). The -stable hosts report "connection refused" when sending to -current, but -current has no trouble sending to -stable. On a fresh reboot I don't see any reference to dma in the dmesg output, and ps -aux reports only busdma. Thanks for reading, bob prohaska
Re: Timekeeping problem in /usr/src on new RPI aarch64 snapshot
On Sat, Feb 25, 2023 at 10:33:23AM -0600, Mike Karels wrote:
> On 25 Feb 2023, at 10:16, bob prohaska wrote:
>
> > On Sat, Feb 25, 2023 at 12:21:09AM +0100, Ronald Klop wrote:
> >>
> >> UFS stores the current timestamp in the superblock of the FS on clean
> >> shutdown/unmount. On boot it reads the time from the timestamp in the
> >> superblock of the root FS. Of course this timestamp is behind for the
> >> duration that the machine was off or rebooting so you need to adjust that
> >> using ntp. For ZFS root you can use the fakertc port to do something
> >> similar.
> >>
> >>
> > Mark Millard points out:
> > /etc/localtime         Current zoneinfo file, see tzsetup(8) and zic(8).
> >
> > /etc/wall_cmos_clock   Empty file. Its presence indicates that the
> >                        machine's CMOS clock is set to local time, while
> >                        its absence indicates a UTC CMOS clock.
> >
> > Since there is no /etc/wall_cmos_clock on the newly-installed filesystem
> > it appears the superblock timestamp is then interpreted as UTC when a Pi
> > boots, using whatever happens to be set in /etc/localtime. My confusion
> > is reduced somewhat. On first boot, what is the state of /etc/localtime?
> >
> > I've neglected to run tzsetup immediately on many previous installations
> > and not noticed any complaints about mis-set clocks in buildworld. Is this
> > new behavior?
>
> /etc/localtime is used in formatting dates (e.g. for ls), but is not
> involved in storage of timestamps. Timestamps on files, system time, etc,
> are all in UTC. So the system should act normally if there is no
> /etc/localtime, and after one is added.

Does formatting include calculating offsets from UTC for display? On at least a couple of installs I've observed date reporting UTC time. After running tzsetup, set to PST, date then reported the same numerical time with a PST time zone. This happened very early in an installation lifecycle and seemed to just "go away" after a few reboots, though I never paid close attention since it caused no complaints.

Thanks for replying!

bob prohaska
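The distinction between stored timestamps and their display can be illustrated with a short Python sketch. It assumes a fixed UTC-8 offset labelled "PST" as a stand-in for what /etc/localtime would supply (daylight saving is ignored), and the function name render is hypothetical. The instant chosen corresponds to the date output quoted in the original report.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for a Pacific zoneinfo file; an assumption, ignores DST.
PST = timezone(timedelta(hours=-8), "PST")


def render(epoch_seconds, tz):
    """Format one stored instant the way date(1) would in the given zone."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime(
        "%a %b %d %H:%M:%S %Z %Y"
    )


# One stored value: seconds since the epoch, independent of any time zone.
stamp = int(datetime(2023, 2, 24, 20, 49, 41, tzinfo=timezone.utc).timestamp())

print(render(stamp, timezone.utc))  # Fri Feb 24 20:49:41 UTC 2023
print(render(stamp, PST))           # Fri Feb 24 12:49:41 PST 2023
```

If formatting works this way, changing the zone should shift the displayed clock reading by the offset while the stored number stays the same. A display that keeps the same numerals and only swaps the zone label would suggest the clock itself had been set from a local-time reading.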
Re: Timekeeping problem in /usr/src on new RPI aarch64 snapshot
On Sat, Feb 25, 2023 at 12:21:09AM +0100, Ronald Klop wrote:
>
> UFS stores the current timestamp in the superblock of the FS on clean
> shutdown/unmount. On boot it reads the time from the timestamp in the
> superblock of the root FS. Of course this timestamp is behind for the
> duration that the machine was off or rebooting so you need to adjust that
> using ntp. For ZFS root you can use the fakertc port to do something
> similar.
>
>

Mark Millard points out:
/etc/localtime         Current zoneinfo file, see tzsetup(8) and zic(8).

/etc/wall_cmos_clock   Empty file. Its presence indicates that the
                       machine's CMOS clock is set to local time, while
                       its absence indicates a UTC CMOS clock.

Since there is no /etc/wall_cmos_clock on the newly-installed filesystem it appears the superblock timestamp is then interpreted as UTC when a Pi boots, using whatever happens to be set in /etc/localtime. My confusion is reduced somewhat. On first boot, what is the state of /etc/localtime?

I've neglected to run tzsetup immediately on many previous installations and not noticed any complaints about mis-set clocks in buildworld. Is this new behavior?

Thanks to both Mark and Ronald!

bob prohaska
Timekeeping problem in /usr/src on new RPI aarch64 snapshot
After installing FreeBSD-14.0-CURRENT-arm64-aarch64-RPI-20230223-fe5c211ba873-261074.img on a Pi3 and setting up the hard disk to use separate swap and /usr partitions an oddity came to light regarding dates.

The image file was written to disk the night of the 23rd, from a Pi3 with a correctly-set time and date. It was left idle overnight, configured the morning of the 24th and booted without issue. It then cloned -current into /usr/src, at which point the time was noticed to be wrong, apparently fast. It turned out ntpdate wasn't running, so it was started and then tzsetup was run. After a reboot the time was reported correctly.

However, make buildworld in /usr/src triggers an exhortation to "check your time" and refuses to run further. Running date on the system reports
Fri Feb 24 12:49:41 PST 2023
but ls -l /usr/src reports a time around Feb 24 19:10, an obvious inconsistency. Presumably just waiting until the system clock catches up with the /usr/src timestamps will fix the error. Is there a better method?

Still, the date and time handling don't seem quite right. In at least one earlier instance it appeared that tzsetup altered the reported timezone without shifting the time stamp by the UTC/PDT offset. I always thought timestamps were internally in UTC+timezone, displayed with the right offset. It looks to a casual observer like something else is going on. An earlier fiasco (on this same Pi3) included wildly wrong timestamps in a filesystem.

The Pi3 has no hardware clock, so how does it set the time when booted without a reference?

Thanks for reading,

bob prohaska
Turning security email back on
It looks as if daily email reporting login failures and system status are no longer being sent out on -current. Is there a switch for /etc/rc.conf that will turn them back on? Thanks for reading, bob prohaska
Seeking an idiot's guide to etcupdate/mergemaster
On Mon, Oct 24, 2022 at 08:32:17PM -0700, Mark Millard wrote:
>
> Your /etc/rc.d/ldconfig script seems to have not been updated
> by use of etcupdate or mergemaster or other such. (How much
> else is also out of date? How much of what you have for /etc/
> and the like goes back to 2022-Jan-07 or before?)
>

Alas, that is too true. The system was set up on July 2, 2020 and I've never managed to make sense of either mergemaster or etcupdate. As far as I could tell it didn't matter, since the host ran correctly, until now. It's been transplanted to a new hard drive, which allows the installation of a ports tree. Ports don't install because of the stale /etc/rc.d/ldconfig file.

Since no changes have been made to /etc/ apart from /etc/rc.conf, is it possible to simply let mergemaster or etcupdate install the latest defaults? I have looked at the manpage for etcupdate and didn't recognize any straightforward way to simply accept all updates. This particular system is expendable, so I'd be glad to try things that might not work well, or at all.

Apologies if I'm being dumb (probably guilty) or lazy (definitely guilty). The barrage of questions generated by etcupdate and mergemaster is simply overwhelming. And, I suspect, largely unnecessary.

Thanks for reading!

bob prohaska
Buildworld stops with ld: error: undefined symbol: AcpiWarning on -current
A Pi3 running -current is stopping with
ld: error: undefined symbol: AcpiWarning
during buildworld. The sources are up-to-date as of a few minutes ago. A series of related "Acpi..." errors follow. The build command line is
make -j2 -DWITH_META_MODE buildworld > buildworld.log

Thanks for reading,

bob prohaska
Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))
On Mon, Mar 07, 2022 at 11:45:02AM -0500, Mark Johnston wrote:
> On Mon, Mar 07, 2022 at 04:25:22PM +, Andrew Turner wrote:
> >
> > > On 7 Mar 2022, at 15:13, Mark Johnston wrote:
> > > ...
> > > A (the?) problem is that the compiler is treating "pc" as an alias
> > > for x18, but the rmlock code assumes that the pcpu pointer is loaded
> > > once, as it dereferences "pc" outside of the critical section. On
> > > arm64, if a context switch occurs between the store at _rm_rlock+144 and
> > > the load at +152, and the thread is migrated to another CPU, then we'll
> > > end up using the wrong CPU ID in the rm->rm_writecpus test.
> > >
> > > I suspect the problem is unique to arm64 as its get_pcpu()
> > > implementation is different from the others in that it doesn't use
> > > volatile-qualified inline assembly. This has been the case since
> > > https://cgit.freebsd.org/src/commit/?id=63c858a04d56529eddbddf85ad04fc8e99e73762
> > > <https://cgit.freebsd.org/src/commit/?id=63c858a04d56529eddbddf85ad04fc8e99e73762>
> > > .
> > >
> > > I haven't been able to reproduce any crashes running poudriere in an
> > > arm64 AWS instance, though. Could you please try the patch below and
> > > confirm whether it fixes your panics? I verified that the apparent
> > > problem described above is gone with the patch.
> >
> > Alternatively (or additionally) we could do something like the following.
> > There are only a few MI users of get_pcpu with the main place being in rm
> > locks.
> >
> > diff --git a/sys/arm64/include/pcpu.h b/sys/arm64/include/pcpu.h
> > index 09f6361c651c..59b890e5c2ea 100644
> > --- a/sys/arm64/include/pcpu.h
> > +++ b/sys/arm64/include/pcpu.h
> > @@ -58,7 +58,14 @@ struct pcpu;
> >
> >  register struct pcpu *pcpup __asm ("x18");
> >
> > -#define get_pcpu() pcpup
> > +static inline struct pcpu *
> > +get_pcpu(void)
> > +{
> > +       struct pcpu *pcpu;
> > +
> > +       __asm __volatile("mov %0, x18" : "=&r"(pcpu));
> > +       return (pcpu);
> > +}
> >
> >  static inline struct thread *
> >  get_curthread(void)
>
> Indeed, I think this is probably the best solution.

Just for fun I tried the patch on a Pi3 running -current, updated a day or two prior. The patch applied, compiled and seemed to run acceptably, but when I left a -j2 -DWITH_META_MODE buildworld running it crashed overnight, reporting

login: panic: rm_rlock: recursed on non-recursive rmlock sysctl lock @ /usr/src/sys/kern/kern_sysctl.c:193
cpuid = 0
time = 1646720264
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
_rm_rlock_debug() at _rm_rlock_debug+0x214
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x140
sysctl_root() at sysctl_root+0x1ac
userland_sysctl() at userland_sysctl+0x140
sys___sysctl() at sys___sysctl+0x68
do_el0_sync() at do_el0_sync+0x520
handle_el0_sync() at handle_el0_sync+0x40
--- exception, esr 0x5600
KDB: enter: panic
[ thread pid 869 tid 100091 ]
Stopped at kdb_enter+0x44: undefined f902011f

I tried typing bt at the debugger prompt but got no more output. I've put the buildworld log file at
http://www.zefox.net/~fbsd/rpi3/crashes/20220307/

Hope this is of some use.

bob prohaska
Re: 14-CURRENT/aarch64 build problem
On Tue, Jun 08, 2021 at 09:15:37PM +0200, Juraj Lutter wrote:
> Hi,
>
> I'm having a problem building recent 14-CURRENT/aarch64 as of
> 6d2648bcaba9b14e2f5c76680f3e7608e1f125f4:
>
> --- cddl/lib/libuutil__L ---
> make[4]: make[4]: don't know how to make uu_dprintf.c. Stop
> make[4]: make[4]: don't know how to make uu_open.c. Stop
> `uu_alloc.c' is up to date.
> `uu_avl.c' is up to date.
> `uu_dprintf.c' was not built (being made, type OP_DEPS_FOUND|OP_MARK, flags
> REMAKE|DONE_WAIT|DONE_ALLSRC|DONECYCLE)!
> ...
>
> The build is performed with pristine /usr/obj
>

FWIW, same problem seen here. In an added twist, git pull (hoping for a fix) fails also:

root@www:/usr/src # git pull
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/legacy': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/legacy'
From https://git.freebsd.org/src
 ! [new branch] vendor/openzfs/legacy -> freebsd/vendor/openzfs/legacy (unable to update local ref)
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/master': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/master'
 ! [new branch] vendor/openzfs/master -> freebsd/vendor/openzfs/master (unable to update local ref)
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/zfs-2.1-release': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/zfs-2.1-release'
 ! [new branch] vendor/openzfs/zfs-2.1-release -> freebsd/vendor/openzfs/zfs-2.1-release (unable to update local ref)

Is this a problem at my end, or the server's end?

Thanks for reading,

bob prohaska
Re: URL for stable/13
On Tue, Mar 02, 2021 at 09:46:11AM -0700, Warner Losh wrote: > On Tue, Mar 2, 2021 at 9:18 AM bob prohaska wrote: > > > A while back I obtained a buildable source tree for stable/13 > > but it hasn't been updated in the last few days. Running > > It would help if you asked a question. > Apparently I didn't understand the correct question to ask. 8-) When set up I configured git to use -ff only and simply used git pull . to update, which seemed to work until a few days ago. > If you'd like to know how to update now that you have this tree, I'd > suggest 'git pull --rebase' or 'git pull --ff-only' > Now it seems an explicit git pull --ff-only is required. It just pulled down a substantial crop of updates. Now to see if buildworld succeeds That'll take a while. > If it's some other question, I'm happy to help with that. > You already have, and I didn't even ask the right question... Thank you! bob prohaska ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
URL for stable/13
A while back I obtained a buildable source tree for stable/13 but it hasn't been updated in the last few days. Running
git remote show origin
reports in part

  Fetch URL: https://git.FreeBSD.org/src.git
  Push URL: https://git.FreeBSD.org/src.git
  HEAD branch: main
  Remote branches:
    main tracked
    .
  Local branch configured for 'git pull':
    stable/13 merges with remote stable/13
  Local ref configured for 'git push':
    stable/13 pushes to stable/13 (local out of date)

Thanks for reading, any hints on how to get back in sync are appreciated. This is used for self-hosting on a Raspberry Pi, if it matters.

bob prohaska
Re: Silent hang in buildworld, was Re: Invoking -v for clang during buildworld
re /usr/fbsd/mm-src/sys/arm/conf/GENERIC-NODBG > >> include "GENERIC" > >> > >> ident GENERIC-NODBG > >> > >> makeoptions DEBUG=-g# Build kernel with gdb(1) debug > >> symbols > >> > >> options AUDIT # Not enabled by default in > >> armv7/v6 kernels > >> # Enabled here to allow kyua test > >> runs to > >> # possibly report auditing works. > >> > >> options ALT_BREAK_TO_DEBUGGER > >> > >> options KDB # Enable kernel debugger support > >> > >> # For minimum debugger support (stable branch) use: > >> options KDB_TRACE # Print a stack trace for a panic > >> options DDB # Enable the kernel debugger > >> > >> # Extra stuff: > >> #optionsVERBOSE_SYSINIT=0 # Enable verbose sysinit messages > >> #optionsBOOTVERBOSE=1 > >> #optionsBOOTHOWTO=RB_VERBOSE > >> options ALT_BREAK_TO_DEBUGGER # Enter debugger on keyboard > >> escape sequence > >> options KLD_DEBUG > >> #optionsKTR > >> #optionsKTR_MASK=KTR_TRAP > >> ##options KTR_CPUMASK=0xF > >> #optionsKTR_VERBOSE > >> > >> # Disable any extra checking for. . . > >> nooptions INVARIANTS # Enable calls of extra sanity > >> checking > >> nooptions INVARIANT_SUPPORT # Extra sanity checks of internal > >> structures, required by INVARIANTS > >> nooptions WITNESS # Enable checks to detect > >> deadlocks and cycles > >> nooptions WITNESS_SKIPSPIN# Don't run witness on spinlocks > >> for speed > >> nooptions DEADLKRES # Enable the deadlock resolver > >> nooptions MALLOC_DEBUG_MAXZONES # Separate malloc(9) zones > >> nooptions DIAGNOSTIC > >> nooptions BUF_TRACKING > >> nooptions FULL_BUF_TRACKING > >> nooptions USB_DEBUG > >> nooptions USB_REQ_DEBUG > >> nooptions USB_VERBOSE > >> > >> The /boot/loader.conf file and the /etc/sysctl.conf files > >> both contained: > >> > >> vm.pageout_oom_seq=120 > >> vm.pfault_oom_attempts=-1 > >> > >> (The hw.physmem=979042304 in /boot/loader.conf was very-special, > >> to better approximate your environment. I also controlled the > >> cpu frequency used via a line in /etc/sysctl.conf . 
I do not > >> bother with such non-default frequency usage [or related settings] > >> for RPi*'s with the pre-RPi4B style power connections but do > >> control the frequency for the OPi+2E.) > > > > The following had been left implicit about my context and > > how it manages memory space use. > > > > I'll note that I do not use tmpfs or other such memory based > > file system techniques that could compete for RAM/swap. What > > is in use for the only file system involved is just the > > root file system: > > > > # df -m > > Filesystem1M-blocks Used Avail Capacity Mounted on > > /dev/gpt/BPIM3root 195378 63940 11580836%/ > > devfs 0 0 0 100%/dev > > > > It is a USB SSD. The swap partition is also on that same > > media. (The BPIM3 based name dates back to before the > > BPI-M3 power connection failed and I switched to the > > OPi+2E.) > > > > I'll note that I've started a new from-scratch build without > > LDFLAGS.lld+= -Wl,--threads=1 . So at some point I'll have > > information about how much of a difference (+/-) in swap > > usage it actually made for with vs. without, if any. > > Looks like, for such 4-core contexts, that bothering > with LDFLAGS.lld+= -Wl,--threads=1 is typically a > waste of effort for both swap usage and time . . . > > With LDFLAGS.lld+= -Wl,--threads=1 : > > Mem: . . . , 765700Ki MaxObsActive, 200412Ki MaxObsWired, 954116Ki > MaxObs(Act+Wir) > Swap: . . . , 537588Ki MaxObsUsed > > without: > > Mem: . . ., 715756Ki MaxObsActive, 194816Ki MaxObsWired, 903132Ki > MaxObs(Act+Wir) > Swap: . . ., 557208Ki MaxObsUsed > > > With LDFLAGS.lld+= -Wl,--threads=1 : > > World built in 72960 seconds, ncpu: 4, make -j4 > Kernel(s) GENERIC-NODBG built in 4998 seconds, ncpu: 4, make -j4 > > without: > > World built in 72804 seconds, ncpu: 4, make -j4 > Kernel(s) GENERIC-NODBG built in 4824 seconds, ncpu: 4, make -j4 > > > So, just not that much of a difference compared to the overall > sizes or times involved. 
> A first OS build/install cycle on armv7 (RPI2) using meta mode finished without trouble. Sources were a day or two newer than the kernel, -j4 buildworld took 157121 seconds. Peak swap use was half again as much at 732932. No constraints on ld.lld beyond defaults. I'm a little surprised at the extreme slowness, but this was a fully-debug'd-current kernel and sources were slightly newer than existing world. In case there's interest I've put what log files I could gather at http://www.zefox.net/~fbsd/rpi2/buildworld/main-c950-gff1a307801/ Thanks for your attention and help!! bob prohaska > === > Mark Millard > marklmi at yahoo.com > ( dsl-only.net went > away in early 2018-Mar) > > ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Silent hang in buildworld, was Re: Invoking -v for clang during buildworld
On Sun, Jan 17, 2021 at 12:30:51PM -0800, Mark Millard wrote: > > > On 2021-Jan-17, at 09:40, bob prohaska wrote: > > > On Sat, Jan 16, 2021 at 03:04:04PM -0800, Mark Millard wrote: > >> > >> Other than -j1 style builds (or equivalent), one pretty much > >> always needs to go looking around for a non-panic failure. It > >> is uncommon for all the material to be together in the build > >> log in such contexts. > > > > Running make cleandir twice and restarting -j4 buildworld brought > > the process full circle: A silent hang, no debugger response, no > > console warnings. That's what sent me down the rabbit hole of make > > without clean, which worked at least once... > > Unfortunately, such a hang tends to mean that log files and such > were not completely written out to media. We do not get to see > evidence of the actual failure time frame, just somewhat before. > (compiler/linker output and such can have the same issues of > ending up with incomplete updates.) > > So, pretty much my notes are unlikely to be strongly tied to > any solid evidence: more like alternatives to possibly explore > that could be far off the mark. > > It is not clear if you were using: > > LDFLAGS.lld+= -Wl,--threads=1 > > or some such to limit the multi-thread linking and its memory. No, I wasn't trying to limit ld.lld thread number. > I'll note that if -j4 gets 4 links running in parallel it used > to be each could have something like 5 threads active on a 4 > core machine, so 20 or so threads. (I've not checked llvm11's > lld behavior. It might avoid such for defaults.) > > You have not reported any testing of -j2 or -j3 so far, just > -j4 . (Another way of limiting memory use, power use, temperature, > etc. .) > Not recently, simply because it's so slow to build. On my "production" armv7 machines running stable/12 I do use -j2. But, they get updated only a couple times per year, when there's a security issue. 
> You have not reported if your boot complained about the swap > space size or if you have adjusted related settings to make > non-default tradeoffs for swap amanagment for these specific > tests. I recommend not tailoring and using a swap size total > that is somewhat under what starts to complain when there is > no tailoring. > Both Pi2 and Pi3 have been complaining about too much swap since I first got them. Near as can be told it's never been a demonstrated problem, thus far. Now, as things like LLVM get bigger and bigger, it seems possible excess swap might cause, or obscure, other problems. For the Pi2 I picked 2 GB from the old "2x physical RAM" rule. > > > The residue of the top screen shows > > > > last pid: 63377; load averages: 4.29, 4.18, 4.15 > > up 1+07:11:07 04:46:46 > > 60 processes: 5 running, 55 sleeping > > CPU: 70.7% user, 0.0% nice, 26.5% system, 2.8% interrupt, 0.0% idle > > Mem: 631M Active, 4932K Inact, 92M Laundry, 166M Wired, 98M Buf, 18M Free > > Swap: 2048M Total, 119M Used, 1928M Free, 5% Inuse, 16K In, 3180K Out > > packet_write_wait: Connection to 50.1.20.26 port 22: Broken pipe > > bob@raspberrypi:~ $ ssh www.zefox.comRES STATEC TIMEWCPU > > COMMAND > > ssh: connect to host www.zefox.com port 22: Connection timed out86.17% c++ > > bob@raspberrypi:~ $ 1 990 277M 231M RUN 0 3:26 75.00% c++ > > 63245 bob 1 990 219M 173M CPU0 0 2:10 73.12% c++ > > 62690 bob 1 980 354M 234M RUN 3 9:42 47.06% c++ > > 63377 bob 1 300 5856K 2808K nanslp 0 0:00 3.13% gstat > > 38283 bob 1 240 5208K 608K wait 2 2:00 0.61% sh > > 995 bob 1 200 6668K 1184K CPU3 3 8:46 0.47% top > > 990 bob 1 20012M 1060K select 2 0:48 0.05% sshd > > > > This does not look like ld was in use as of the last top > display update's content. But the time between reasonable > display updates is fairly long relative to CPU activity > so it is only suggestive. 
>
> > [apologies for typing over the remnants]
> >
> > I've put copies of the build and swap logs at
> >
> > http://www.zefox.net/~fbsd/rpi2/buildworld/
> >
> > The last vmstat entry (10 second repeat time) reports:
> > procs     memory        page                   disks      faults        cpu
> > r b  w    avm   fre    flt re pi po  fr  sr   da0 sd0     in  sy   cs  us sy id
> > 4 0 14 969160 91960    685  2  2  1 707 304     0   0  11418 692 1273  45  5 50
> >
> > Does that point to the mem
Silent hang in buildworld, was Re: Invoking -v for clang during buildworld
On Sat, Jan 16, 2021 at 03:04:04PM -0800, Mark Millard wrote:
>
> Other than -j1 style builds (or equivalent), one pretty much
> always needs to go looking around for a non-panic failure. It
> is uncommon for all the material to be together in the build
> log in such contexts.

Running make cleandir twice and restarting -j4 buildworld brought the process full circle: A silent hang, no debugger response, no console warnings. That's what sent me down the rabbit hole of make without clean, which worked at least once...

The residue of the top screen shows

last pid: 63377; load averages: 4.29, 4.18, 4.15 up 1+07:11:07 04:46:46
60 processes: 5 running, 55 sleeping
CPU: 70.7% user, 0.0% nice, 26.5% system, 2.8% interrupt, 0.0% idle
Mem: 631M Active, 4932K Inact, 92M Laundry, 166M Wired, 98M Buf, 18M Free
Swap: 2048M Total, 119M Used, 1928M Free, 5% Inuse, 16K In, 3180K Out
packet_write_wait: Connection to 50.1.20.26 port 22: Broken pipe
bob@raspberrypi:~ $ ssh www.zefox.comRES STATEC TIMEWCPU COMMAND
ssh: connect to host www.zefox.com port 22: Connection timed out86.17% c++
bob@raspberrypi:~ $ 1 99 0 277M 231M RUN 0 3:26 75.00% c++
63245 bob 1 99 0 219M 173M CPU0 0 2:10 73.12% c++
62690 bob 1 98 0 354M 234M RUN 3 9:42 47.06% c++
63377 bob 1 30 0 5856K 2808K nanslp 0 0:00 3.13% gstat
38283 bob 1 24 0 5208K 608K wait 2 2:00 0.61% sh
995 bob 1 20 0 6668K 1184K CPU3 3 8:46 0.47% top
990 bob 1 20 0 12M 1060K select 2 0:48 0.05% sshd

[apologies for typing over the remnants]

I've put copies of the build and swap logs at

http://www.zefox.net/~fbsd/rpi2/buildworld/

The last vmstat entry (10 second repeat time) reports:

procs     memory        page                   disks      faults        cpu
r b  w    avm   fre    flt re pi po  fr  sr   da0 sd0     in  sy   cs  us sy id
4 0 14 969160 91960    685  2  2  1 707 304     0   0  11418 692 1273  45  5 50

Does that point to the memory exhaustion suggested earlier in the thread?

At this point /boot/loader.conf contains vm.pfault_oom_attempts="-1", but that's a relic of long-ago attempts to use USB flash for root and swap.
Might removing it stimulate more warning messages? Thanks for reading! bob prohaska
Re: Invoking -v for clang during buildworld
On Sat, Jan 16, 2021 at 11:17:52AM -0800, Mark Millard wrote: > > > On 2021-Jan-16, at 07:55, bob prohaska wrote: > > > On Fri, Jan 15, 2021 at 09:25:00PM -0800, Mark Millard wrote: > >> > >> On 2021-Jan-15, at 20:37, bob prohaska wrote: > >> > >>> While playing with -current on armv7 using a raspberry pi 2 v1.1 > >>> an error crops up with recent kernels while building world: > >>> > >>> ++: error: linker command failed with exit code 1 (use -v to see > >>> invocation) > >>> *** [clang.full] Error code 1 > >>> > >>> make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang > >>> > >>> How does one invoke -v in this situation? > >> > >> Going a different direction: Going to publish the build log > >> someplace? There is likely more there of interest to isolating > >> the issue(s). > >> > > I've put what I hope is a useful picture at > > http://www.zefox.net/~fbsd/rpi2/buildworld/ > > Looks to me like your -DNO_CLEAN based build is reusing one or > more files with inappropriate/incomplete contents that need to > be regenerated: there are a number of undefined symbols stopping > the linker during its attempt to build the "usr.bin/clang/clang > (all)" material. See below. > [examples snipped] > > FYI: > > I found this by noting the "all_subdir_usr.bin" below and > searching backwards for prior examples and seeing what was > after those examples. > > --- all_subdir_usr.bin --- > c++: error: linker command failed with exit code 1 (use -v to see invocation) > *** [clang.full] Error code 1 > > It never dawned that I wasn't looking at the first error message. > > The undefined symbols seem unlikely to be a voltage problem. > > The zeros are from the units for the integers not being volts > but micro volts. (Which is not the same as saying measurements > reach that scale of accuracy.) > So long as they're measured values they might be worth keeping track of. I thought maybe they were some sort of input or placeholder values. > >> I use META_MODE builds. 
One thing they do is record the > >> command used to try to produce each file. So in that kind > >> of context, identifying what it was trying to build allows > >> finding the related NAME.meta file and looking in it. > >> Not needed now, but worth remembering for the future. > > I see no specific evidence for a kernel problem being involved. > Agreed. The problem is the operator. Thanks for your patience! bob prohaska
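Since the quoted reply says the sysctl integers are in microvolts rather than volts, a reading can be rescaled before it goes into a log. This is only a sketch: uv_to_v is a made-up helper name and the sample reading is invented for illustration, not taken from the Pi2.

```sh
#!/bin/sh
# uv_to_v MICROVOLTS: print the value in volts with three decimals.
# uv_to_v is a hypothetical helper, not an existing FreeBSD command.
uv_to_v() {
    awk -v uv="$1" 'BEGIN { printf "%.3f\n", uv / 1000000 }'
}

# 1200000 is a made-up reading used only to show the scaling.
uv_to_v 1200000
```

The trailing zeroes that looked suspicious in the raw output are what one would expect from a value stored in microvolts.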
Re: Invoking -v for clang during buildworld
On Fri, Jan 15, 2021 at 09:25:00PM -0800, Mark Millard wrote: > > On 2021-Jan-15, at 20:37, bob prohaska wrote: > > > While playing with -current on armv7 using a raspberry pi 2 v1.1 > > an error crops up with recent kernels while building world: > > > > ++: error: linker command failed with exit code 1 (use -v to see invocation) > > *** [clang.full] Error code 1 > > > > make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang > > > > How does one invoke -v in this situation? > > Going a different direction: Going to publish the build log > someplace? There is likely more there of interest to isolating > the issue(s). > I've put what I hope is a useful picture at http://www.zefox.net/~fbsd/rpi2/buildworld/ Files from a clean start would probably be better, but it will take days to get back to that state. I was thinking this might be a kernel problem, but after trying three different kernels, all with the same result, it's looking doubtful. No hint of the "cannot allocate memory" message of earlier troubles, and nothing on the console. One additional question, however: Does the Pi2 have an internal voltage measurement that could be added to the swap logging script? Sysctl -a | grep olt produces a bunch of output, but none of it looks real, with too many trailing zeroes. Power supply problems have been rare, but they caused much hair loss. RaspiOS reports under voltage, does FreeBSD have a comparable feature? > I use META_MODE builds. One thing they do is record the > command used to try to produce each file. So in that kind > of context, identifying what it was trying to build allows > finding the related NAME.meta file and looking in it. > > An example failure for armv7 and 1 GiByte of RAM could be > a simple memory allocation failure: unable to get a > sufficiently large contiguous range from the address space > for some request. (So it never gets to the point of using > swap for it.) Are you controlling how many threads the > linker uses? 
> There have been none of the "unable to allocate memory" messages that characterized the previous failures, and nothing on the console. I do not try to control thread count beyond -j4 on the command line. It wasn't necessary up to a few days ago. It does seem that memory use is vastly greater with the arrival of clang 11, swap use on armv7 gets up past half a GB. With clang 9 it hardly registered. > > For the record, uname -a reports > > FreeBSD www.zefox.com 13.0-CURRENT FreeBSD 13.0-CURRENT #6 > > main-c950-gff1a307801: Wed Jan 13 19:02:18 PST 2021 > > b...@www.zefox.com:/usr/obj/usr/freebsd-src/arm.armv7/sys/GENERIC-MMCCAM > > arm > > > > The present sources are a day or two newer. > > > > Nothing is obviously wrong; swap usage is small, no warnings or errors on > > the console. > > > > In past occurrences, an old kernel (pre-git) worked through the problem. > > If a restart of make buildworld doesn't get past the stoppage I'll check > > again. The pre-git kernel didn't work either, nor did kernel.old, a couple days previous. For clarity, all three were -DNO_CLEAN starts. Thanks for reading, bob prohaska
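The META_MODE suggestion quoted in this message (find the NAME.meta file for the failing target and look at the recorded command) can be sketched as below. It assumes the bmake convention that each recorded command sits on a line beginning with "CMD "; meta_cmds is a made-up helper name, and the path in the usage comment is a placeholder.

```sh
#!/bin/sh
# meta_cmds FILE: print the commands recorded in a bmake .meta file.
# Assumes each recorded command is on a line starting with "CMD ".
# meta_cmds is a hypothetical helper name, not an existing tool.
meta_cmds() {
    sed -n 's/^CMD //p' "$1"
}

# Hypothetical usage; the real location depends on the object directory:
#   meta_cmds /path/to/objdir/usr.bin/clang/clang/clang.full.meta
```

With the full link command in hand, it can be rerun by hand with -v added, which is one way to get at the question that started the thread.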
Invoking -v for clang during buildworld
While playing with -current on armv7 using a raspberry pi 2 v1.1 an error crops up with recent kernels while building world:

++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [clang.full] Error code 1

make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang

How does one invoke -v in this situation?

For the record, uname -a reports

FreeBSD www.zefox.com 13.0-CURRENT FreeBSD 13.0-CURRENT #6 main-c950-gff1a307801: Wed Jan 13 19:02:18 PST 2021 b...@www.zefox.com:/usr/obj/usr/freebsd-src/arm.armv7/sys/GENERIC-MMCCAM arm

The present sources are a day or two newer.

Nothing is obviously wrong; swap usage is small, no warnings or errors on the console.

In past occurrences, an old kernel (pre-git) worked through the problem. If a restart of make buildworld doesn't get past the stoppage I'll check again.

Thanks for reading,
bob prohaska
Re: How does /usr/bin/uname work in plain english?
On Wed, Jan 13, 2021 at 10:15:32PM -0700, Warner Losh wrote: > > > __FreeBSD_version is defined in sys/param.h. For -U, uname prints that > > > value. For -K, it asks the kernel for this value to print. > > > > > > MMmmnnn where MM is the major version, mm is minor, and nnn is > > incremental > > > when the APIs change, approximately weekly. > > > Sounds like the numbers are manually set by humans... I imagined something much more automated. > > He has a newer kernel than userland... however that came to be... > Yes, a new kernel was compiled to fix the "won't boot with HDMI connected" problem on Raspberry Pi. Thanks for explaining! bob prohaska
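The MMmmnnn layout described in the quoted reply can be taken apart with plain shell arithmetic. This is a minimal sketch: split_osreldate is a made-up name, and the sample input is the kernel value that uname -K reported in the original question.

```sh
#!/bin/sh
# split_osreldate N: split an MMmmnnn value into major, minor, revision.
# split_osreldate is a hypothetical helper, not part of uname(1).
split_osreldate() {
    v=$1
    major=$((v / 100000))        # MM: leading digits
    minor=$(( (v / 1000) % 100 )) # mm: next two digits
    rev=$((v % 1000))            # nnn: last three digits
    echo "$major $minor $rev"
}

# Kernel value (uname -K) from the original question.
split_osreldate 1300135
```

For 1300135 this prints "13 0 135". The userland value 1300134 differs only in the last field, which fits the observation that the kernel was newer than the installed world.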
How does /usr/bin/uname work in plain english?
Since the switch to git I've been wondering how /usr/bin/uname works. The man page is thin on details and uname.c is far too subtle.

For example, on my test box uname -a reports

FreeBSD www.zefox.org 13.0-CURRENT FreeBSD 13.0-CURRENT #7 main-c255937-g818390ce0ca5: Wed Jan 13 16:42:12 PST 2021 b...@www.zefox.org:/usr/obj/usr/freebsd-src/arm64.aarch64/sys/GENERIC-MMCCAM arm64

which seems to replay git nomenclature. However, uname -KU reports

1300135 1300134

which is admirably readable, even for me.

Is there a natural language description detailing how uname -KU outputs are computed, and roughly what they mean? I've noticed that different sources sometimes produce the same values, so the level of detail is less, but might suffice for initial reports to the mailing lists.

Thanks for reading,
bob prohaska
Re: HEADS UP: FreeBSD src repo transitioning to git this weekend
On Tue, Dec 22, 2020 at 12:19:03PM -0800, Mark Millard wrote: > > git in base would have licensing issues. > I gather you're referring to GPLv2. A sticky wicket. The trouble with ports is the tree is getting awfully big. The host in question has a 32 GB disk and is over half full with just a base source installation. Adding a "dormant" ports tree will take nearly 2 GB, most of which is not used. Might there be some way to clone a "sparse tree" including only one port, which then leafs out just enough to build that port and dependencies? When the ports system was introduced it seemed a marvel of compactness and efficiency. Time marches on. > Pi2B: v1.1 (armv7 only)? v1.2 running armv7 FreeBSD? > v1.2 running arm64 FreeBSD? > Sorry for the ambiguity... It's v1.1, armv7 only. That's why I want to test git on this particular machine. Git seems to work fine on the Pi3. Thanks for replying! bob prohaska
Re: HEADS UP: FreeBSD src repo transitioning to git this weekend
On Tue, Dec 22, 2020 at 09:34:25PM +0100, Ronald Klop wrote:
>
> what does "pkg install git" do for you? NB: I use "pkg install git-lite".
> Prevents about 1000 dependencies.
>

That seems to have worked. It reported something about package management not being installed, but after a prompt installed pkg-static and set up a version of git which seems to run. Svnlite had been working without this step. This is for a Pi2B v 1.1, arm v7 only.

Using the "mini git primer" at https://hackmd.io/hJgnfzd5TMK-VHgUzshA2g I tried to clone stable/12 expecting that the -beta would be gone. It looks as if I'm still jumping the gun. Although cgit.freebsd.org replies to ping, using

bob@www:/usr % git clone cgit.freebsd.org -b stable/12 freebsd-src

reports:

fatal: repository 'cgit.freebsd.org' does not exist

This is just a rehearsal, so I can wait, but if I've made other mistakes please point them out.

Thanks for your help!
bob prohaska
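The "does not exist" error in this message is consistent with git treating a bare hostname as a local path, since the argument carries neither a scheme nor a repository path. The sketch below checks whether an argument looks like a remote before printing a clone command. is_remote_spec is a made-up, simplified helper, and the URL is an assumption to be replaced with whatever the project announces.

```sh
#!/bin/sh
# is_remote_spec ARG: succeed if ARG looks like a URL (scheme://...) or an
# scp-style remote (user@host:path). Simplified on purpose; anything else
# is assumed to be something git would treat as a local path.
# is_remote_spec is a hypothetical helper, not a git subcommand.
is_remote_spec() {
    case "$1" in
        *://*|*@*:*) return 0 ;;
        *) return 1 ;;
    esac
}

# REPO_URL is an assumption; substitute the URL the project publishes.
REPO_URL="https://git.freebsd.org/src.git"

if is_remote_spec "$REPO_URL"; then
    # Print the command rather than running it, so nothing is fetched here.
    echo "git clone -b stable/12 $REPO_URL freebsd-src"
else
    echo "not a remote: $REPO_URL" >&2
fi
```

Feeding the bare name cgit.freebsd.org to the same check fails, which mirrors what git reported.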
Re: HEADS UP: FreeBSD src repo transitioning to git this weekend
On Wed, Dec 16, 2020 at 05:46:35PM -0700, Warner Losh wrote:
>
> The FreeBSD project will be moving its source repo from subversion to git
> starting this weekend. The docs repo was moved 2 weeks ago. The ports
> repo will move at the end of March, 2021 due to timing issues.
>

Is there some way to obtain git on a Pi2B running 13.0-CURRENT FreeBSD 13.0-CURRENT #2 r365692 without installing the ports tree? I expected to find git in base, but it isn't there. Can it be found under another package name?

Thanks for reading, and any guidance!
bob prohaska
Re: panic: non-current pmap 0xffffa00020eab8f0 on Rpi3
On Tue, Oct 20, 2020 at 12:30:05PM -0400, Mark Johnston wrote:
>
> I set up a RPi3 to try and repro this and have so far managed to trigger
> it once using Peter Holm's stress2 suite, so I'll keep investigating. I
> hadn't configured a dump device, but I was able to confirm from DDB that
> PCPU_GET(curpmap) == >vm_pmap.
>

Is the invalid pmap fault related in any way to intensity of swap usage? That's easily adjusted using -j values building things like www/chromium.

In the past, when I've reported crashes caused by stress2 I've observed a "that's inevitable" sort of response with some regularity. Panics when doing more normal things like make seem to stimulate greater interest.

> For future reference,
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

The pieces are all in place, but the machine doesn't seem to find core dumps when coming up after a crash. It does routinely issue "no core dumps found" during reboot, so it's looking. Is it necessary to issue a dump command from inside the debugger?

Thanks for writing!
bob prohaska
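For reference, the rc.conf side of "the pieces are all in place" usually amounts to two settings. This is a sketch of a typical fragment and not a copy of the poster's configuration; the values are assumptions, with the directory shown being the usual default.

```sh
# /etc/rc.conf fragment (rc.conf uses sh variable syntax).
# Values are illustrative assumptions, not the poster's actual settings.
dumpdev="AUTO"        # pick a swap device listed in /etc/fstab for dumps
dumpdir="/var/crash"  # where savecore(8) stores recovered dumps at boot
```

If the machine stops at the db> prompt on a panic, no dump may have been written yet. ddb(4) documents a dump command for that case, after which savecore should find something on the next boot. Whether that is the cause of the "no core dumps found" message here is not established in the thread.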
Re: panic: non-current pmap 0xffffa00020eab8f0 on Rpi3
On Mon, Oct 19, 2020 at 04:39:54PM -0400, Mark Johnston wrote:
>
> I think vmspace_exit() should issue a release fence with the cmpset and
> an acquire fence when handling the refcnt == 1 case, but I don't see why
> that would make a difference here. So, if you can test a debug patch,
> this one will yield a bit more debug info. If you can provide access to
> a vmcore and kernel debug symbols, that'd be even better.
>

I haven't seen an invalid pmap panic since the report of October 5th. Your patch applied cleanly on the Pi3 running HEAD at r366780M, the M being due to patches supplied by Kyle Evans applied to

M sys/arm/broadcom/bcm2835/bcm2835_mbox.c
M sys/arm/broadcom/bcm2835/bcm2835_sdhci.c
M sys/arm/broadcom/bcm2835/bcm2835_vcbus.c
M sys/arm/broadcom/bcm2835/bcm2835_vcbus.h

AIUI, they're something to do with DMA for peripherals. They've caused no obvious trouble; if you anticipate conflicts let me know and I'll remove them.

I've never seen either a vmcore file or debug symbols on this machine. A sequence of instructions to generate the data needed would be helpful.

Thanks for reading!
bob prohaska
panic: non-current pmap 0xffffa00020eab8f0 on Rpi3
Still seeing non-current pmap panics on the Pi3, this time a B+ running 13.0-CURRENT (GENERIC-MMCCAM) #0 71e02448ffb-c271826(master) during a -j4 buildworld. The backtrace reports

panic: non-current pmap 0xa00020eab8f0
cpuid = 0
time = 1601947137
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x30
	 pc = 0x0072999c lr = 0x0019ec8c sp = 0x5d96c230 fp = 0x5d96c430
db_trace_self_wrapper() at kdb_backtrace+0x38
	 pc = 0x0019ec8c lr = 0x004b4984 sp = 0x5d96c440 fp = 0x5d96c500
kdb_backtrace() at vpanic+0x19c
	 pc = 0x004b4984 lr = 0x004734c0 sp = 0x5d96c510 fp = 0x5d96c560
vpanic() at panic+0x44
	 pc = 0x004734c0 lr = 0x004730dc sp = 0x5d96c570 fp = 0x5d96c620
panic() at pmap_remove_pages+0x5d8
	 pc = 0x004730dc lr = 0x0073fe58 sp = 0x5d96c630 fp = 0x5d96c690
pmap_remove_pages() at vmspace_exit+0xb0
	 pc = 0x0073fe58 lr = 0x006c77a0 sp = 0x5d96c6a0 fp = 0x5d96c700
vmspace_exit() at exit1+0x470
	 pc = 0x006c77a0 lr = 0x0042e5bc sp = 0x5d96c710 fp = 0x5d96c760
exit1() at sys_sys_exit+0x10
	 pc = 0x0042e5bc lr = 0x0042e148 sp = 0x5d96c770 fp = 0x5d96c7c0
sys_sys_exit() at syscallenter+0x104
	 pc = 0x0042e148 lr = 0x007463dc sp = 0x5d96c7d0 fp = 0x5d96c7d0
syscallenter() at svc_handler+0x4c
	 pc = 0x007463dc lr = 0x00745df8 sp = 0x5d96c7e0 fp = 0x5d96c810
svc_handler() at do_el0_sync+0xf0
	 pc = 0x00745df8 lr = 0x00745c08 sp = 0x5d96c820 fp = 0x5d96c830
do_el0_sync() at handle_el0_sync+0x90
	 pc = 0x00745c08 lr = 0x0072c224 sp = 0x5d96c840 fp = 0x5d96c980
handle_el0_sync() at 0x40421150
	 pc = 0x0072c224 lr = 0x40421150 sp = 0x5d96c990 fp = 0xd830
KDB: enter: panic
[ thread pid 2429 tid 100951 ]
Stopped at 0x403fa408
db>

Thanks for reading,
bob prohaska
Re: Strange USB loop
On Thu, Aug 27, 2020 at 10:02:21AM -0700, bob prohaska wrote: > On Tue, Aug 25, 2020 at 11:29:16AM -0700, bob prohaska wrote: > > > With a _different_ FT232 plugged in it also came up normally. > > > > Both are thought to be genuine, but they are of different age > > and produce different recognition messages: > > > > The FT232 that causes trouble reports > > ugen1.4: at usbus1 > > uftdi0 on uhub1 > > uftdi0: on usbus1 > > > > The one that seems to work is newer and reports > > ugen1.4: at usbus1 > > uftdi0 on uhub1 > > uftdi0: on usbus1 > > With the system updated to r364900 both FT232 devices seem to be working. The machine boots from a USB disk successfully with mouse, keyboard and FT232 connected at powerup. Thanks for your help, bob prohaska
Re: usbd_setup_device_desc: getting device descriptor at addr 6 failed, USB_ERR_IOERROR
On Thu, Aug 27, 2020 at 09:41:16PM +0300, Yuri Pankov wrote:
> Another issue that I started seeing lately, didn't try finding out when
> exactly in case someone knows what it's about:
>
> Root mount waiting for: usbus0
> usbd_setup_device_desc: getting device descriptor at addr 6 failed,
> USB_ERR_IOERROR
>
[details snipped]
> So far not seeing any ill effects from this, i.e. I can connect USB HDD to
> these ports, and it's successfully detected.

If it's convenient, connecting a USB-serial adapter and rebooting might be interesting. I'm having trouble with FT232 obstructing disk detection in some cases and self-disconnecting in others on a Pi3B.

Thanks for reading,
bob prohaska
Re: Strange USB loop
On Tue, Aug 25, 2020 at 11:29:16AM -0700, bob prohaska wrote:
> With a _different_ FT232 plugged in it also came up normally.
>
> Both are thought to be genuine, but they are of different age
> and produce different recognition messages:
>
> The FT232 that causes trouble reports
> ugen1.4: at usbus1
> uftdi0 on uhub1
> uftdi0: on usbus1
>
> The one that seems to work is newer and reports
> ugen1.4: at usbus1
> uftdi0 on uhub1
> uftdi0: on usbus1
>
> On balance I think the new kernel is better-behaved. Beyond that
> I'm at a loss. If you can suggest other things to try please do.
>

This morning I found on the console a message:

uftdi0: at uhub1, port 3, addr 4 (disconnected)
uftdi0: detached

but usbconfig -a reported

ugen1.4: at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (90mA)

and lsusb says

Bus /dev/usb Device /dev/ugen1.4: ID 0403:6001 Future Technology Devices International, Ltd FT232 Serial (UART) IC

The FT232 is plugged directly into the Pi. This is the newer, supposedly functional, FT232...

Unplugging and replugging put on the console

ugen1.4: at usbus1 (disconnected)
uftdi0: at uhub1, port 3, addr 4 (disconnected)
uftdi0: detached
ugen1.4: at usbus1
uftdi0 on uhub1
uftdi0: on usbus1

But it still can't connect to the serial port of the correspondent host, which is up and running.

Meanwhile, the FT232 which appeared faulty is working fine overnight on RaspiOS Buster.

Thanks for reading, and any suggestions
bob prohaska
Re: Strange USB loop
On Tue, Aug 25, 2020 at 09:41:41AM +0200, Hans Petter Selasky wrote:
>
> Can you try r364346 ?
>

The kernel compiled and installed without trouble. After running run bootcmd_usb0 the machine loaded the kernel, but stopped at the loader prompt. The keyboard was connected direct to the Pi, the mouse was disconnected.

It isn't obvious why it stopped at the loader prompt. lsdev reports

disk devices:
    disk0:250085377 X 512 blocks (removable)
      disk0s1: DOS/Windows
      disk0s2: FreeBSD
        disk0s2a: FreeBSD UFS
        disk0s2b: FreeBSD swap
    disk1:1953525169 X 512 blocks
      disk1s1: DOS/Windows
      disk1s2: FreeBSD
        disk1s2a: FreeBSD UFS
        disk1s2b: FreeBSD swap
http: (unknown)
net devices:
    net0:

OK

Disk0 is the (bootable) microSD, disk1 is the hard drive.

Boot -s came up single-user, with / mounted from /dev/da0s2a as desired. Fsck reported the filesystem clean. Exit to multi-user worked. The USB system keyboard (plugged into the Pi) worked. The mouse (plugged into the hub after boot) also worked.

A second reboot with the mouse connected via the hub worked without pausing at the loader prompt.

Plugging the FTDI FT232 adapter into the hub triggered a round of

uhub_reattach_port: giving up port reset - device vanished

messages, but this time they stopped when I pulled the FT232. Plugging the FT232 directly into the Pi caused normal recognition.

It looks as if the FT232 somehow interferes with disk discovery. A reboot with USB disk & mouse in the hub but keyboard and FT232 in the Pi again resulted in a mountroot failure, along with a few other error messages:

uhub2: MTT enabled
Root mount waiting for: usbus1 CAM
uhub2: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1 CAM
usb_alloc_device: set address 7 failed (USB_ERR_IOERROR, ignored)
Root mount waiting for: usbus1 CAM
Root mount waiting for: usbus1 CAM
usbd_setup_device_desc: getting device descriptor at addr 7 failed, USB_ERR_IOERROR
usbd_req_re_enumerate: addr=7, set address failed! (USB_ERR_IOERROR, ignored)
Root mount waiting for: usbus1 CAM
Root mount waiting for: usbus1 CAM
usbd_setup_device_desc: getting device descriptor at addr 7 failed, USB_ERR_IOERROR
Root mount waiting for: usbus1 CAM
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
Root mount waiting for: usbus1 CAM
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
ugen1.7: at usbus1 (disconnected)
uhub_reattach_port: could not allocate new device
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Mounting from ufs:/dev/da0s2a failed with error 2; retrying for 3 more seconds
Mounting from ufs:/dev/da0s2a failed with error 2.

Loader variables:
  vfs.root.mountfrom=ufs:/dev/da0s2a
  vfs.root.mountfrom.options=rw

With the FT232 unplugged the machine came up normally.

With a _different_ FT232 plugged in it also came up normally.

Both are thought to be genuine, but they are of different age and produce different recognition messages:

The FT232 that causes trouble reports
ugen1.4: at usbus1
uftdi0 on uhub1
uftdi0: on usbus1

The one that seems to work is newer and reports
ugen1.4: at usbus1
uftdi0 on uhub1
uftdi0: on usbus1

On balance I think the new kernel is better-behaved. Beyond that I'm at a loss. If you can suggest other things to try please do.

Thanks for all your help,
bob prohaska
Re: Strange USB loop
On Tue, Aug 25, 2020 at 12:46:01AM +0200, Hans Petter Selasky wrote:
> On 2020-08-24 18:37, bob prohaska wrote:
> > After updating to
> > FreeBSD 13.0-CURRENT (GENERIC) #5 r364475: Mon Aug 24 06:47:29 PDT 2020
> > on a Pi3 it was necessary to disconnect the mouse, keyboard and usb-serial
>
> You are after:
>
> https://svnweb.freebsd.org/changeset/base/364433
>
> You may want to try a kernel before:
>
> r364379

A kernel for FreeBSD 13.0-CURRENT (GENERIC) #6 r364378: Tue Aug 25 00:46:27 PDT 2020 compiled and installed without incident, but the problem persists.

This time I plugged the keyboard into the hub and got a stream of

uhub_reattach_port: giving up port reset - device vanished

which didn't stop when the keyboard was removed. If the keyboard is moved to the Pi's internal USB connectors the keyboard is recognized and works, but the once-per-second "...device vanished" messages continue.

Attempts to repeat this behavior were frustrating. After a few iterations the error message was triggered by plugging in an FTDI usb-serial adapter, but the messages stopped when it was unplugged.

The hub is
Bus /dev/usb Device /dev/ugen1.4: ID 05e3:0610 Genesys Logic, Inc. 4-port hub

The disk adapter is
Bus /dev/usb Device /dev/ugen1.5: ID 152d:1561 JMicron Technology Corp. / JMicron USA Technology Corp. JMS561U two ports SATA 6Gb/s bridge

Are either of these known troublemakers?

Thanks for reading!
bob prohaska
Re: Strange USB loop
On Mon, Aug 24, 2020 at 05:26:27PM +, Bjoern A. Zeeb wrote:
> On 24 Aug 2020, at 16:37, bob prohaska wrote:
>
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
>
> I hit something like it last weekend and found this one:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237666
>

Hmm, rather discouraging. Same error message, different hardware. Considerable investigation without resolution. Over a year old.

Thanks for writing 8=(

bob prohaska
Strange USB loop
After updating to
FreeBSD 13.0-CURRENT (GENERIC) #5 r364475: Mon Aug 24 06:47:29 PDT 2020
on a Pi3 it was necessary to disconnect the mouse, keyboard and usb-serial adapter to allow the machine to mount root from USB via a hub.

Once the machine came back up with root mounted from USB, I tried plugging the serial adapter, mouse and keyboard back in via the hub. The FTDI serial adapter was recognized without trouble, but when the elderly Dell mouse was connected, a stream of
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
began to scroll on both the monitor and console. Unplugging the mouse made no difference. Plugging the mouse directly into the Pi's USB port allowed recognition and function, but the stream of errors persisted. Network access seems normal.

It looks almost as if there's some sort of infinite loop running in the USB software. The need to disconnect mouse and keyboard to permit mountroot to work isn't new, but the "giving up port reset" _is_ new, at least to me.

Are there any experiments which might narrow down what's wrong?

Thanks for reading,

bob prohaska
Re: Is there any error checking on swap?
On Sun, Jul 12, 2020 at 12:29:12AM -0700, John-Mark Gurney wrote:
> bob prohaska wrote this message on Sat, Jul 11, 2020 at 20:33 -0700:
> > Is there any error checking on swap traffic, along the lines of
> > a checksum or parity test?
> >
> > Just curious what happens if a page written out is corrupted when
> > it comes back.
>
> Looks like it doesn't:
> https://svnweb.freebsd.org/base/head/sys/vm/swap_pager.c?annotate=361965#l1389
>

Certainly nothing about parity or checksums in the comments. All faith in the hardware, I guess.

Thanks for writing!

bob prohaska
Is there any error checking on swap?
Is there any error checking on swap traffic, along the lines of a checksum or parity test?

Just curious what happens if a page written out is corrupted when it comes back.

Thanks for reading,

bob prohaska
Re: Problem compiling Chromium
Don't know about AMD, but on ARM failures that resemble this are common. In some cases the actual error message is tens or a hundred lines prior to the last make output. If you search for the string
Error:
or maybe
error:
does anything show up? Including the colon : helps to reduce irrelevant hits.

Good luck,

bob prohaska

On Sun, Jun 07, 2020 at 04:27:24PM +, Filippo Moretti wrote:
> Good evening,
> FreeBSD sting 13.0-CURRENT FreeBSD 13.0-CURRENT #2 r361787: Sun Jun 7 15:02:09 CEST 2020 root@sting:/usr/obj/usr/src/amd64.amd64/sys/STING amd64
>
> the build fails with the following message
> /common/extensions/api/action.json
> ../../chrome/common/extensions/api/browser_action.json
> ../../chrome/common/extensions/api/browsing_data.json
> ../../chrome/common/extensions/api/extension.json
> ../../chrome/common/extensions/api/idltest.idl
> ../../chrome/common/extensions/api/page_action.json
> ../../chrome/common/extensions/api/top_sites.json
> ninja: build stopped: subcommand failed.
> ===> Compilation failed unexpectedly.
> Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
> the maintainer.
> *** Error code 1
>
> Stop.
> make: stopped in /usr/ports/www/chromium
>
> ===>>> make build failed for www/chromium
> ===>>> Aborting update
>
> I enclose the list of packages installed
> sincerely
> Filippo
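The search suggested above can be sketched as a small shell function. This is only an illustration: the function name `find_build_errors` and the log file name are assumptions, and it presumes the build output was saved to a file in the first place.

```sh
#!/bin/sh
# find_build_errors LOGFILE  (hypothetical helper, not part of any FreeBSD tool)
# Print every line that contains "error:" or "Error:", prefixed with its
# line number. Requiring the trailing colon skips lines that merely
# mention the word, such as "*** Error code 1".
find_build_errors() {
    grep -n -e 'error:' -e 'Error:' "$1"
}
```

Something like `find_build_errors build.log | head` would then show the earliest matches, which tend to be closer to the real cause than the last lines of make output.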
Re: Recovering after a crash during installworld
On Thu, Jun 04, 2020 at 10:42:18PM +0200, Ronald Klop wrote:
> Delete /usr/src and make a new svn checkout.

That turned out to be the solution. The error message persuaded me that the problem was in the executable, not the repository. After devel/subversion produced the identical error, replacing the repository solved the problem.

Thanks for reading,

bob prohaska

>
> Regards,
> Ronald.
>
> From: bob prohaska
> Date: 4 June 2020 21:24
> To: freebsd-current@freebsd.org
> CC: bob prohaska
> Subject: Recovering after a crash during installworld
>
> > A Raspberry Pi3B running -current near r360134 crashed during installworld.
> > Installkernel completed in single-user mode, but it looks like something
> > got corrupted in files related to svnlite:
> >
> > root@www:/usr/src # svnlite up .
> > svn: E235000: In file
> > '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311:
> > assertion failed (format >= 1)
> > Abort (core dumped)
> > root@www:/usr/src # svnlite cleanup .
> > svn: E235000: In file
> > '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311:
> > assertion failed (format >= 1)
> > Abort (core dumped)
> >
> > The machine comes up multi-user without problems, but attempts to update
> > or simply rebuild the system run afoul of the svnlite errors.
> >
> > Is there a practical way to recover?
> >
> > Thanks for reading,
> >
> > bob prohaska
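For anyone repeating the fix, the steps might look roughly like the sketch below. It only prints the commands rather than running them. Two things are assumptions not taken from the thread: the repository URL, and the choice to move the damaged tree aside instead of deleting it outright (which needs enough free disk space to hold both copies).

```sh
#!/bin/sh
# plan_fresh_checkout SRC_DIR REPO_URL  (hypothetical helper)
# Print, without executing, the commands that set a damaged working
# copy aside and check out a fresh one in its place.
plan_fresh_checkout() {
    src=$1
    url=$2
    echo "mv $src $src.broken"
    echo "svnlite checkout $url $src"
}
```

Calling `plan_fresh_checkout /usr/src https://svn.freebsd.org/base/head` prints two lines to review before running them by hand; once the new tree updates cleanly the `.broken` copy can be removed.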
Recovering after a crash during installworld
A Raspberry Pi3B running -current near r360134 crashed during installworld. Installkernel completed in single-user mode, but it looks like something got corrupted in files related to svnlite:

root@www:/usr/src # svnlite up .
svn: E235000: In file '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: assertion failed (format >= 1)
Abort (core dumped)
root@www:/usr/src # svnlite cleanup .
svn: E235000: In file '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: assertion failed (format >= 1)
Abort (core dumped)

The machine comes up multi-user without problems, but attempts to update or simply rebuild the system run afoul of the svnlite errors.

Is there a practical way to recover?

Thanks for reading,

bob prohaska
Re: about loader & console
On Mon, Mar 02, 2020 at 10:47:41AM +0200, Toomas Soome wrote:
> What we have now (on current):
>
> arm: efi console is the same state as in x86, no serial driver to provide
> comconsole, the serial console is only available via redirection from
> firmware.
>

Is it possible, on a Pi3 without WiFi or Bluetooth, to add enable_uart=1 to config.txt? At this point config.txt contains

arm_control=0x200
dtparam=audio=on,i2c_arm=on,spi=on
dtoverlay=mmc
dtoverlay=pwm
dtoverlay=pi3-disable-bt
device_tree_address=0x4000
kernel=u-boot.bin

and is, I think, default.

Thanks for reading,

bob prohaska
Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147 (a vm_pfault_oom_attempts < 0 handling bug as of head -r357026)
On Tue, Jan 28, 2020 at 11:28:14AM -0800, Mark Millard wrote:
>
> On 2020-Jan-28, at 11:02, bob prohaska wrote:
>
> > On Tue, Jan 28, 2020 at 09:42:17AM -0800, Mark Millard wrote:
> >>
> > The (partly)modified kernel compiled and booted without
> > obvious trouble. It's trying to finish buildworld now.
> >

Stopped already, with
Jan 28 11:41:59 www kernel: pid 29909 (cc), jid 0, uid 0, was killed: fault's page allocation failed

> >> If you are testing with vm.pfault_oom_attempts="-1" then
> >> the vm_fault printf message should never happen anyway.
> >>
> > Would it not be interesting if the message appeared in that
> > case?
>
> Thanks for the question: looking at the new code found a bug
> causing oom where it used to be avoided in head -r357025 and
> before.

Glad to be of service, even if inadvertently 8-)

> After vm_waitpfault(dset, vm_pfault_oom_wait * hz)
> the -r357026 code does a vm_pageout_oom(VM_OOM_MEM_PF) no
> matter what, even when vm_pfault_oom_attempts < 0 ||
> fs->oom < vm_pfault_oom_attempts :
>
> New code in head -r357026
> ( nothing to avoid the vm_pageout_oom(VM_OOM_MEM_PF)
> for vm_pfault_oom_attempts < 0 ||
> fs->oom < vm_pfault_oom_attempts ):
>
>     if (fs->m == NULL) {
>         unlock_and_deallocate(fs);
>         if (vm_pfault_oom_attempts < 0 ||
>             fs->oom < vm_pfault_oom_attempts) {
>             fs->oom++;
>             vm_waitpfault(dset, vm_pfault_oom_wait * hz);
>         }
>         if (bootverbose)
>             printf(
>     "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>                 curproc->p_pid, curproc->p_comm);
>         vm_pageout_oom(VM_OOM_MEM_PF);
>         return (KERN_RESOURCE_SHORTAGE);
>     }
>
> Old code in head -r357025
> ( has the goto RetryFault_oom after vm_waitpfault(. . .),
> thereby avoiding the vm_pageout_oom(VM_OOM_MEM_PF) for
> vm_pfault_oom_attempts < 0 || fs->oom < vm_pfault_oom_attempts ) :
>
>     if (fs.m == NULL) {
>         unlock_and_deallocate();
>         if (vm_pfault_oom_attempts < 0 ||
>             oom < vm_pfault_oom_attempts) {
>             oom++;
>             vm_waitpfault(dset,
>                 vm_pfault_oom_wait * hz);
>             goto RetryFault_oom;
>         }
>         if (bootverbose)
>             printf(
>     "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>                 curproc->p_pid, curproc->p_comm);
>         vm_pageout_oom(VM_OOM_MEM_PF);
>         goto RetryFault;
>     }
>
> I expect this is the source of the behavioral
> difference folks have been seeing for OOM kills.
>
>
> As for "gather evidence" messages . . .
>
> >> You may be able to just look and manually delete or
> >> comment out the bootverbose line in the more modern
> >> source that currently looks like:
> >>
> >>     if (bootverbose)
> >>         printf(
> >> "proc %d (%s) failed to alloc page on fault, starting OOM\n",
> >>             curproc->p_pid, curproc->p_comm);
> >>     vm_pageout_oom(VM_OOM_MEM_PF);
> >>     return (KERN_RESOURCE_SHORTAGE);
> >>
> >
> > I can find those lines in /usr/src/sys/vm/vm_fault.c, but
> > unclear on the motivation to comment the lines out. Perhaps
> > to eliminate the return(...) ? Anyway, is it sufficient
> > to insert /* before and */ after?
>
> The only line to delete or comment out in that
> code block is:
>
>     if (bootverbose)
>
> Disabling that line makes the following printf
> always happen, even when a verbose boot was not
> done.

Oops, it's commented out now and the kernel is rebuilding.

>
> Based on the above reported code change, having
> a message before vm_pageout_oom(VM_OOM_MEM_PF) is
> important to getting a report of the kill being
> via that code.
>

Thank you!

bob prohaska
Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147
On Mon, Jan 27, 2020 at 06:22:20PM -0800, Mark Millard wrote:
>
> So far as I know, in the past progress was only made when someone
> already knowledgeable got involved in isolating what was happening
> and how to control it.
>

Indeed. One can only hope said knowledgeables are reading.

Thanks for reading!

bob prohaska
Re: CAM breaks USB [was Re: USB causing boot to hang]
, 100baseTX-FDX, auto
ue0: on smsc0
ue0: Ethernet address: 4e:61:0e:1c:ae:c0
.ugen0.4: at usbus0
umass0 on uhub1
umass0: on usbus0
.ugen0.5: at usbus0
...Restarting file system checks:
/dev/ufs/rootfs: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ufs/rootfs: clean, 109531 free (5579 frags, 12994 blocks, 2.3% fragmentation)
Can't stat /dev/da0d: No such file or directory
Can't stat /dev/da0e: No such file or directory
Can't stat /dev/da0d: No such file or directory
Can't stat /dev/da0e: No such file or directory
Can't stat /dev/da0a: No such file or directory
Can't stat /dev/da0a: No such file or directory
THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY:
ufs: /dev/da0d (/tmp), ufs: /dev/da0e (/usr), ufs: /dev/da0a (/var)
Unknown error 3; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2019-12-06T20:07:21.926442-08:00 init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:

The machine seems able to boot a kernel from r333740 hands-off, so I don't think it's hardware. /boot/loader.conf contains

bob@www:~ % more /boot/loader.conf
kern.cam.boot_delay="2"
vm.pageout_oom_seq="2048"
bob@www:~ %

Booting direct to single-user, running fsck and exiting the shell brought multi-user operation.

Still, it appears that recognition of an FTDI FT232 usb-serial adapter is impaired as well. It had to be unplugged and replugged after booting to be recognized.

Also FWIW, an RPI3 running r355422 seems not to share the difficulty.

Hope this is of some use,

bob prohaska
Re: Rpi3 panic: non-current pmap 0xfffffd001e05b130
On Sat, Nov 30, 2019 at 05:16:15PM -0800, bob prohaska wrote: > A Pi3 running r355024 reported a panic while doing a -j3 make of > www/chromium: > Ok, another panic, looks like a dying storage device. This time there was a preamble on the console: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 c3 90 d8 00 00 08 00 (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error (da0:umass-sim0:0:0:0): Error 5, Retries exhausted swap_pager: I/O error - pageout failed; blkno 1442883,size 4096, error 5 swap_pager: I/O error - pageout failed; blkno 1442884,size 4096, error 5 swap_pager: I/O error - pageout failed; blkno 1442885,size 8192, error 5 swap_pager: I/O error - pageout failed; blkno 1442887,size 4096, error 5 swap_pager: I/O error - pagein failed; blkno 1103209,size 4096, error 5 vm_fault: pager read error, pid 681 (devd) swap_pager: I/O error - pagein failed; blkno 1130270,size 4096, error 5 vm_fault: pager read error, pid 2362 (c++) Nov 30 17:37:34 www kernel: Failed to fully fault in a core file segment at VA 0x4040 with size 0x60b000 to be written at offset 0x32b000 for process devd panic: vm_page_assert_unbusied: page 0xfd0030f8af80 busy @ /usr/src/sys/vm/vm_object.c:777 cpuid = 3 time = 1575164255 Earlier panics didn't have any proximate warnings on the console, but they're probably the same story. apologies for the noise! 
bob prohaska > panic: non-current pmap 0xfd001e05b130 > cpuid = 0 > time = 1575161361 > KDB: stack backtrace: > db_trace_self() at db_trace_self_wrapper+0x28 >pc = 0x00729e4c lr = 0x001066c8 >sp = 0x59f3e2b0 fp = 0x59f3e4c0 > > db_trace_self_wrapper() at vpanic+0x18c >pc = 0x001066c8 lr = 0x00400d7c >sp = 0x59f3e4d0 fp = 0x59f3e580 > > vpanic() at panic+0x44 >pc = 0x00400d7c lr = 0x00400b2c >sp = 0x59f3e590 fp = 0x59f3e610 > > panic() at pmap_remove_pages+0x8d4 >pc = 0x00400b2c lr = 0x0074154c >sp = 0x59f3e620 fp = 0x59f3e6e0 > > pmap_remove_pages() at vmspace_exit+0xc0 >pc = 0x0074154c lr = 0x006c9c00 >sp = 0x59f3e6f0 fp = 0x59f3e720 > > vmspace_exit() at exit1+0x4f8 >pc = 0x006c9c00 lr = 0x003bc2a4 >sp = 0x59f3e730 fp = 0x59f3e7a0 > > exit1() at sys_sys_exit+0x10 >pc = 0x003bc2a4 lr = 0x003bbda8 >sp = 0x59f3e7b0 fp = 0x59f3e7b0 > > sys_sys_exit() at do_el0_sync+0x514 >pc = 0x003bbda8 lr = 0x00747aa4 >sp = 0x59f3e7c0 fp = 0x59f3e860 > > do_el0_sync() at handle_el0_sync+0x90 >pc = 0x00747aa4 lr = 0x0072ca14 >sp = 0x59f3e870 fp = 0x59f3e980 > > handle_el0_sync() at 0x404e6d60 >pc = 0x0072ca14 lr = 0x404e6d60 >sp = 0x59f3e990 fp = 0xd590 > > KDB: enter: panic > [ thread pid 94966 tid 100145 ] > Stopped at 0x40505460: undefined 5442 > db> bt > Tracing pid 94966 tid 100145 td 0xfd002552b000 > db_trace_self() at db_stack_trace+0xf8 > pc = 0x00729e4c lr = 0x00103b0c > sp = 0x59f3de80 fp = 0x59f3deb0 > > db_stack_trace() at db_command+0x228 > pc = 0x00103b0c lr = 0x00103784 > sp = 0x59f3dec0 fp = 0x59f3dfa0 > > db_command() at db_command_loop+0x58 > pc = 0x00103784 lr = 0x0010352c > sp = 0x59f3dfb0 fp = 0x59f3dfd0 > > db_command_loop() at db_trap+0xf4 > pc = 0x0010352c lr = 0x00106830 > sp = 0x59f3dfe0 fp = 0x59f3e200 > > db_trap() at kdb_trap+0x1d8 > pc = 0x00106830 lr = 0x004492fc > sp = 0x59f3e210 fp = 0x59f3e2c0 > > kdb_trap() at do_el1h_sync+0xf4 > pc = 0x004492fc lr = 0x00747418 > sp = 0x59f3e2d0 fp = 0x59f3e300 > > do_el1h_sync() at handle_el1h_sync+0x78 > pc = 
0x00747418 lr = 0x0072c878 > sp = 0x59f3e310 fp = 0x59f3e420 > > handle_el1h_sync() at kdb_enter+0x34 > pc = 0x0072c878 lr = 0x00448948 > sp = 0x59f3e430 fp = 0x59f3e4c0 > > kdb_enter() at vpanic+0x1a8 > pc = 0x00448948 lr = 0x00400d98 > sp = 0x59f3e4d0 fp = 0x59f3e580 > > vpanic() at panic+0x44 > pc
Rpi3 panic: non-current pmap 0xfffffd001e05b130
page disks faults cpu r b w avm fre flt re pi pofr sr mm0 da0 in sy cs us sy id 0 0 12 4523836 52860 6989 186 715 257 6932 25125 1038 1038 30790 1073 29820 14 26 60 dT: 1.002s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/wd/s kBps ms/d %busy Name 1751702 48604.0 492515.0 0 00.0 94.4 mmcsd0 1751702 48604.1 492515.1 0 00.0 94.7 mmcsd0s2 2704658 40821.8 462350.6 0 00.0 71.9 da0 1751702 48604.1 492515.1 0 00.0 94.8 mmcsd0s2b 2704658 40821.8 462350.7 0 00.0 72.6 da0p6 Sat Nov 30 16:48:26 PST 2019 Device 1K-blocks UsedAvail Capacity /dev/mmcsd0s2b4404252 1959504 244474844% /dev/da0p65242880 1957540 328534037% Total 9647132 3917044 573008841% Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114 Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth] 0/1016/1016/19178 mbuf clusters in use (current/cache/total/max) procs memory page disks faults cpu r b w avm fre flt re pi pofr sr mm0 da0 in sy cs us sy id 0 0 12 4523868 46872 6989 186 715 257 6932 25123 0 0 30790 1073 29820 14 26 60 dT: 1.002s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/wd/s kBps ms/d %busy Name 2700681 48883.7 191083.7 0 00.0 92.1 mmcsd0 2700681 48883.8 191083.7 0 00.0 92.5 mmcsd0s2 2709687 43142.1 221083.4 0 00.0 78.2 da0 2700681 48883.8 191083.7 0 00.0 92.6 mmcsd0s2b 2709687 43142.1 221083.4 0 00.0 78.7 da0p6 Sat Nov 30 16:48:28 PST 2019 Device 1K-blocks UsedAvail Capacity /dev/mmcsd0s2b440 It's clear the machine was heavily loaded, but storage didn't appear to be swamped. I hope the foregoing has been of some interest, thanks for reading! bob prohaska ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Reverting -current by date.
On Wed, Nov 20, 2019 at 02:52:22PM -0800, Mark Millard wrote:
>
> Unfortunately for Bob P., no suggestion can meet his full criteria. So
> he has several suggestions to potentially pick from or to use in
> combination.
>

This is a most gracious way of saying my expectations are unreasonable. Sad but not surprising. At least now I know.

Thanks to everyone for enlightening me,

bob prohaska

> >>
> >> . . .
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
Re: g_vfs_done():ufs/rootfs[WRITE flood on rpi3
On Wed, Nov 20, 2019 at 04:41:46PM -0600, Kyle Evans wrote:
>
> The revisions noted are a good data point, thanks! Can you try
> upgrading the kernel past r354875 before I revert the most likely
> candidate up to that point? Perhaps with this patch applied, to make
> sure you're not hitting an interrupt race that's hard to deduce from
> logs: https://reviews.freebsd.org/D22430.diff
>

The r354909 kernel is running now, doing a test compile of www/chromium. So far no problems are apparent.

> More than willing to build a kernel as described and put up for you to
> download, as well, if you'd accept that.
>

If I understand correctly there's no need; if I'm mistaken please let me know.

Thanks for your attention!

bob prohaska

> Thanks,
>
> Kyle Evans
g_vfs_done():ufs/rootfs[WRITE flood on rpi3
Setting hostid: 0x5cd40a6a.
warning: total configured swap (1101063 pages) exceeds maximum recommended amount (920808 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (2411783 pages) exceeds maximum recommended amount (920808 pages).
warning: increase kern.maxswzone or reduce amount of swap.
Starting file system checks:

At this point things returned to normal.

Thanks for reading, I hope it's useful.

bob prohaska
Re: Reverting -current by date.
On Wed, Nov 20, 2019 at 11:18:41AM -0700, Warner Losh wrote:
> On Wed, Nov 20, 2019 at 10:39 AM bob prohaska wrote:
>
> > From time to time it would be handy to revert freebsd-current to
> > an older, well-behaved revision.
> >
> > Is there a mechanism for identifying revision numbers that
> > will at least compile and boot, by date?
> >
>
> Almost all of them will compile. Almost all of those will boot. While some
> build breakage sneaks through, the default assumption is that it's good.
> That's certainly been my experience randomly updating to -current. There's
> some that are more or less performant, mind you, and some that are more or
> less stable, it is true. But the overwhelming vast majority will compile
> and boot, at least for amd64. I have issues less than 1% of the time when
> updating to whatever is current at the moment I fancy an update.
>

Are commits that depend on one another somehow grouped in a single revision?

> There's some hardware that gets broken from time to time, but we don't
> track that specifically. And non-amd64 architectures takes more care and
> planning as any build breakage for those platforms lasts longer, in direct
> proportion to how popular the platform is
>

Point taken. I'm interested in aarch64, which puts me somewhat in the weeds.

> It's all in the commit logs. If you run -current you need to read them.
> They will also tell you almost always if you pick revision X if there was a
> subsequent fix that made things compile you should go with.
>

I take it the strategy would be to go back in the log to a rough date, then browse forward in time looking for signs of major trouble. When the commits turn minor/benign, select a revision from that timeframe.

>
> Study the commit logs? I know I'm harping on that, but when things go
> wrong, that's what I do.
>

I hoped for a more mechanical approach. For example, snapshots are generated from time to time. Presumably, they're vetted in some way and knowing what revisions made it to the snapshot stage might be a starting point. The snapshot server does not appear to contain that information for earlier offerings.

> Also -DNO_CLEAN builds help a lot if you're worried about it not even
> building, though from time to time you run into issues with a NO_CLEAN
> build due to a recent commit that wasn't appreciated at the time of the
> commit, but was later and fixed.
>

Does -DNO_CLEAN behave sanely (and usefully) when going backwards in time? I commonly use it for small forward steps, but time reversal is tricky 8-)

Thanks for replying!

bob prohaska
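The "go back to a rough date, then browse forward" strategy can lean on Subversion's date revisions, where a date in braces stands in for a revision number. The sketch below only prints candidate commands instead of running them. The helper names and the default /usr/src path are assumptions for illustration, and a date says nothing about whether the tree at that point builds or boots, so the commit log still has to be read.

```sh
#!/bin/sh
# svn_date_rev DATE  (hypothetical helper)
# Wrap a YYYY-MM-DD date in braces, the form Subversion accepts
# wherever a revision is expected.
svn_date_rev() {
    printf '{%s}' "$1"
}

# plan_log_window START END [WORKING_COPY]
# Print, without executing, a command that lists commits between two dates.
# The braces are single-quoted so csh-style shells leave them alone.
plan_log_window() {
    printf "svnlite log -r '%s:%s' %s\n" \
        "$(svn_date_rev "$1")" "$(svn_date_rev "$2")" "${3:-/usr/src}"
}

# plan_update_to_date DATE [WORKING_COPY]
# Print, without executing, a command that moves the working copy to the
# state of the repository as of DATE.
plan_update_to_date() {
    printf "svnlite update -r '%s' %s\n" \
        "$(svn_date_rev "$1")" "${2:-/usr/src}"
}
```

For example, `plan_log_window 2019-05-01 2019-05-08` prints a log command covering one week; after reading that log and picking a quiet stretch, `plan_update_to_date` (or an explicit revision number taken from the log) gives the matching update command.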
Reverting -current by date.
From time to time it would be handy to revert freebsd-current to an older, well-behaved revision.

Is there a mechanism for identifying revision numbers that will at least compile and boot, by date?

In my case buildworld seems to be markedly slower than, say, six months ago. Maybe it's hardware, maybe something else. Is there a way to pick a revision number to revert to that's better than merely guessing?

Thanks for reading,

bob prohaska
Re: spurious out of swap kills
On Fri, Sep 13, 2019 at 10:59:58PM -0700, Mark Millard wrote:
> bob prohaska fbsd at www.zefox.net wrote on
> Fri Sep 13 16:24:57 UTC 2019 :
>
> > Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB
> > of swap the job completed successfully some months ago, with peak swap
> > use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition
> > combined with a 4 GB partition. A little over 4GB total seems usable.
> >
> > A few days ago the same attempt stopped with a series of OOMA kills,
> > but in each case simply restarting allowed the compile to pick up
> > where it left off and continue, eventually finishing with a runnable
> > version of chromium. In this case swap use peaked a little over 4 GB.
> >
> > Might this suggest the machine isn't freeing swap in a timely manner?
>
> Are you saying that your increases to:
>
> vm.pageout_oom_seq
>
> no longer prove sufficient? What value for vm.pageout_oom_seq were
> you using that got the recent failures?
>

Correct. Initial value was 2048, later raised to 4096. As far as I could tell the change didn't help. No explicit -j value was set for make, but no more than four jobs were observed in top.

A log of storage activity along with swap total and the last two console messages is at
http://www.zefox.net/~fbsd/rpi3/swaptests/r351586/swapscript.log
along with a sorted list of total swap use, which can be used as a sort of index to the log file. The initial "out of swap space" at the very beginning is a relic from before logging started. Da0 is a Sandisk SDCZ80 usb 3.0 device, mmcsd0 is a Samsung Evo + 128 GB device.

The two points of curiosity to me are:
1. Why did swap use increase from 3.5 GB months ago to 4.2 GB now?
2. Why does stopping and restarting make (which would seem to free un-needed swap) allow the job to finish?

> If more or different configuration/tuning is required, I'm going to
> eventually want to learn about it as well.
>

You will have some company.

Thanks for reading,

bob prohaska
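Swap logging of the kind described in this thread can be approximated with two small filters. This is a sketch, not the script behind the log linked above: the function names are made up here, and it assumes `swapinfo -k` prints a header line followed by one line per device, plus a final "Total" line when there is more than one device.

```sh
#!/bin/sh
# swap_used_kb  (hypothetical helper)
# Read `swapinfo -k` style output on stdin and print the "Used" column of
# the last line: the Total line with several swap devices, or the only
# device line otherwise.
swap_used_kb() {
    awk 'NR > 1 { used = $3 } END { print used }'
}

# peak_kb  (hypothetical helper)
# Read one number per line on stdin and print the largest.
peak_kb() {
    awk 'NR == 1 || $1 + 0 > max + 0 { max = $1 } END { print max }'
}
```

A loop such as `while sleep 10; do swapinfo -k | swap_used_kb; done >> swap.log` collects samples during a build, and `peak_kb < swap.log` afterwards reports the high-water mark in 1K blocks.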
Re: spurious out of swap kills
Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB of swap the job completed successfully some months ago, with peak swap use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition combined with a 4 GB partition. A little over 4GB total seems usable.

A few days ago the same attempt stopped with a series of OOMA kills, but in each case simply restarting allowed the compile to pick up where it left off and continue, eventually finishing with a runnable version of chromium. In this case swap use peaked a little over 4 GB.

Might this suggest the machine isn't freeing swap in a timely manner?

Thanks for reading,

bob prohaska
Firstboot behavior, was Re: New vm-image size is much smaller than previos
[changed subject to follow drift of conversation]

On Sat, May 04, 2019 at 06:03:00AM -0700, Rodney W. Grimes wrote:
>
> Do we even have install note(s) pages for these things, or a wiki page

Not that I know of.

> that documents it, or ? Working around /firstboot does not require
> a serial console, if you know about it ahead of time, you can even
^^ 8-)

The statement is true. The gymnastics required to

> mount the sd image up on another system, and remove firstboot if you
> want, or create a swap partition at the end of the device, make the
> boot partition use up the rest and then iirc growfs on firstboot does
> what you want. (Untested at this time, but that should just work.)

are far from trivial, even for experienced foot-shooters such as myself.

A Pi running Raspbian can download and write the FreeBSD image, but it can't mount ufs to manipulate files. I'm not sure about Mac OS and Windows. That's a likely starting scenario for potential users of FreeBSD on the Pi.

AFAIK it's still necessary to boot single-user, set up the microSD (which is a considerable challenge using gpart unless one is in good practice) and then let the system go to multi-user. Last time I checked, u-boot (or maybe it's loader) couldn't read the USB keyboard to execute boot -s, so the system essentially runs away from the user's control.

I admit not having checked in the last few months, but even if it's fixed, asking a new user to start by using gpart is unlikely to encourage further exploration.

Thanks for your attention!

bob prohaska
Re: New vm-image size is much smaller than previos
On Fri, May 03, 2019 at 07:39:00PM -0700, Rodney W. Grimes wrote: > > On Fri, May 3, 2019, 7:42 PM bob prohaska wrote: > > > > > On Fri, May 03, 2019 at 11:06:15AM -0700, Rodney W. Grimes wrote: > > > > -- Start of PGP signed section. > > > > > On Fri, May 03, 2019 at 10:12:58AM -0700, Enji Cooper wrote: > > > > > > > > > > > > > On May 3, 2019, at 9:57 AM, Alan Somers > > > wrote: > > > > > > > > > > > > > > See r346959. Before first boot, you should expand the image up to > > > > > > > whatever size you want. growfs(8) will automatically expand the > > > file > > > > > > > system. > > > > > > > -Alan > > > > > > > > > > > > > > On Fri, May 3, 2019 at 10:32 AM David Boyd > > > wrote: > > > > > > >> > > > > > > >> The vm-image for 13.0-CURRENT > > > > > > >> > > > > > > >> FreeBSD-13.0-CURRENT-amd64-20190503-r347033.vmdk > > > > > > >> > > > > > > >> is only 4.0 GB in size. Previous images were about 31.0 GB. > > > > > > >> > > > > > > >> This smaller image doesn't leave much room to add packages and > > > other > > > > > > >> customizations. > > > > > > > > > > > > This probably deserves a release note. > > > > > > > > > > It will certainly be mentioned in the 11.3 release notes. > > > > > > > > And those running head snapshots without reading commit messages > > > > are likely to have lots of foot shooting. > > > > > > > > > Glen > > > > -- > > > > Rod Grimes > > > rgri...@freebsd.org > > > > > > At the risk of being branded a wishful thinker, a firstboot script that > > > asked the user for some configuration information would be a great help > > > to both new and experienced foot-shooters. I'm thinking of Raspberry Pi, > > > but perhaps it applies to non-embedded platforms also. > > > > > > > That's not a bad idea... we could press bsdinstall into service for that > > perhaps... we already expand the partition / filesystem to match the media > > size... 
> > As asommers already pointed out a) we already do the for real media > like on the rasberry pi's, etc all in that on first boot they do a > growfs to fill the real media up with the file system. > I misunderstood the significance of "vm-image", thinking it was the same as a bootable microSD image. Apologies for the blunder. My thoughts are about physical media. In that situation the default growfs on firstboot is a real handicap. It makes difficult any local customization of the microSD card, in particular adding a swap partition. A Pi2 is sort of usable without swap, a Pi3 is badly hampered with no swap. Having the existence of /firstboot trigger a configuration script that sets up swap, storage, accounts and network would be a great aid to new users (and old users with imperfect memories). A man page for firstboot would be useful in any case. "What's that empty file supposed to do?" is a very natural question. Unfortunately, by the time the question is discovered it's too late to ask, and the user has to start over. There are references to firstboot in man rc, but that's a very hard way to answer a relatively simple question. Working around /firstboot requires a serial console and considerable patience, at least on a physical Raspberry Pi 2 or 3. Thanks for reading, bob prohaska ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: New vm-image size is much smaller than previos
On Fri, May 03, 2019 at 11:06:15AM -0700, Rodney W. Grimes wrote: > -- Start of PGP signed section. > > On Fri, May 03, 2019 at 10:12:58AM -0700, Enji Cooper wrote: > > > > > > > On May 3, 2019, at 9:57 AM, Alan Somers wrote: > > > > > > > > See r346959. Before first boot, you should expand the image up to > > > > whatever size you want. growfs(8) will automatically expand the file > > > > system. > > > > -Alan > > > > > > > > On Fri, May 3, 2019 at 10:32 AM David Boyd wrote: > > > >> > > > >> The vm-image for 13.0-CURRENT > > > >> > > > >> FreeBSD-13.0-CURRENT-amd64-20190503-r347033.vmdk > > > >> > > > >> is only 4.0 GB in size. Previous images were about 31.0 GB. > > > >> > > > >> This smaller image doesn't leave much room to add packages and other > > > >> customizations. > > > > > > This probably deserves a release note. > > > > It will certainly be mentioned in the 11.3 release notes. > > And those running head snapshots without reading commit messages > are likely to have lots of foot shooting. > > > Glen > -- > Rod Grimes rgri...@freebsd.org At the risk of being branded a wishful thinker, a firstboot script that asked the user for some configuration information would be a great help to both new and experienced foot-shooters. I'm thinking of Raspberry Pi, but perhaps it applies to non-embedded platforms also. The original FreeBSD install program (the one by Jordan Hubbard) did a very serviceable job. Could it (the user interface) be resurrected? Thanks for reading, bob prohaska ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
conflict between revision numbers for update and info
The other night I ran svnlite up on /usr/src, which ended with

Updated to revision 344015

Somewhat later I noticed that uname -a reported the same revision, which seemed odd, since buildworld/buildkernel were still in progress. The next day I ran svnlite info /usr/src, which reported

Revision: 344113

Any idea what's going on?

Thanks for reading,
bob prohaska
Re: CFT: TRIM Consolodation on UFS/FFS filesystems
On Tue, Aug 21, 2018 at 06:47:19PM -0700, Mark Millard wrote:
>
> I've used a SSD both directly via SATA and via a USB enclosure,
> the same partitions/file systems across the uses. Only when it
> was SATA-style-use did TRIM work.
>

This is likely the key to my question. If USB blocks the TRIM service the behavior of the device doesn't matter.

As an aside, Sandisk now says: "Please be informed that we have not tested running TRIM commands on USB flash drive and microSD cards therefore we would not be able to comment on it explicitly."

Thanks for reading,
bob prohaska
Re: CFT: TRIM Consolodation on UFS/FFS filesystems
On Mon, Aug 20, 2018 at 12:40:56PM -0700, Kirk McKusick wrote: > I have recently added TRIM consolodation support for the UFS/FFS > filesystem. This feature consolodates large numbers of TRIM commands > into a much smaller number of commands covering larger blocks of > disk space. Best described by the commit message: > > Author: mckusick > Date: Sun Aug 19 16:56:42 2018 > New Revision: 338056 > URL: https://svnweb.freebsd.org/changeset/base/338056 > > Log: > Add consolodation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem. > > When deleting files on filesystems that are stored on flash-memory > (solid-state) disk drives, the filesystem notifies the underlying > disk of the blocks that it is no longer using. The notification > allows the drive to avoid saving these blocks when it needs to > flash (zero out) one of its flash pages. These notifications of > no-longer-being-used blocks are referred to as TRIM notifications. > In FreeBSD these TRIM notifications are sent from the filesystem > to the drive using the BIO_DELETE command. > > Until now, the filesystem would send a separate message to the drive > for each block of the file that was deleted. Each Gigabyte of file > size resulted in over 3000 TRIM messages being sent to the drive. > This burst of messages can overwhelm the drive's task queue causing > multiple second delays for read and write requests. > > This implementation collects runs of contiguous blocks in the file > and then consolodates them into a single BIO_DELETE command to the > drive. The BIO_DELETE command describes the run of blocks as a > single large block being deleted. Each Gigabyte of file size can > result in as few as two BIO_DELETE commands and is typically less > than ten. Though these larger BIO_DELETE commands take longer to > run, they do not clog the drive task queue, so read and write > commands can intersperse effectively with them. 
>
> Though this new feature has been throughly reviewed and tested, it
> is being added disabled by default so as to minimize the possibility
> of disrupting the upcoming 12.0 release. It can be enabled by running
> ``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
> If no problems arise, we will consider requesting that it be enabled
> by default for 12.0.
>
> Reviewed by: kib
> Tested by: Peter Holm
> Sponsored by: Netflix
>
> This support is off by default, but I am hoping that I can get enough
> testing to ensure that it (a) works, and (b) is helpful that it will
> be reasonable to have it turned on by default in 12.0. The cutoff for
> turning it on by default in 12.0 is September 19th. So I am requesting
> your testing feedback in the near-term. Please let me know if you have
> managed to use it successfully (or not) and also if it provided any
> performance difference (good or bad).
>
> To enable TRIM consolodation either use `sysctl vfs.ffs.dotrimcons=1'
> or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.
>

Will the new feature be active on a Raspberry Pi 3 using flash on microSD and USB for file systems and swap? Can the feature be turned on using one of the conf files in /etc?

According to Sandisk, "All microSD or USB drives are flash memory and does support the TRIM command, however, you will not notice any difference after running TRIM command on memory cards or USB drives. TRIM command is basically used for SSD and Hard drives."

The "you will not notice any difference" qualification makes me slightly uncertain the reply was well-informed, but if there's any hope of success I'd like to try it.

From time to time there seem to be traffic jams among flash devices on the RPI3; it would be a pleasant surprise if this feature helps.

Thanks for reading!
bob prohaska
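On the question of turning the feature on from a conf file in /etc, here is a minimal sketch. It assumes the knob behaves like any other runtime sysctl and that /etc/sysctl.conf is applied at boot; neither point is confirmed in this thread. The helper name persist_sysctl and the scratch file are made up for illustration; only vfs.ffs.dotrimcons comes from the announcement quoted above.

```sh
#!/bin/sh
# Sketch only: keep exactly one "name=value" line for a sysctl in a
# sysctl.conf-style file. persist_sysctl is a hypothetical helper.
persist_sysctl() {
    conf=$1
    name=$2
    value=$3
    # Drop any earlier assignment of the same name, then append the new one.
    if [ -f "$conf" ]; then
        grep -v "^${name}=" "$conf" > "$conf.tmp" || true
        mv "$conf.tmp" "$conf"
    fi
    printf '%s=%s\n' "$name" "$value" >> "$conf"
}

# CONF defaults to a scratch file so the example is harmless to run.
# On a real system the target would presumably be /etc/sysctl.conf.
CONF=${CONF:-$(mktemp)}
persist_sysctl "$CONF" vfs.ffs.dotrimcons 1
cat "$CONF"
```

Running the helper twice leaves a single line in the file, so it can be repeated without piling up duplicate entries.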
Re: building LLVM threads gets killed
On Mon, Aug 20, 2018 at 07:33:32PM +0200, Dimitry Andric wrote: > On 20 Aug 2018, at 16:26, Rodney W. Grimes > wrote: > > > >> It is running out of RAM while running multiple parallel link jobs. If > >> you are building using WITH_DEBUG, turn that off, it consumes large > >> amounts of memory. If you must have debug info, try adding the > >> following flag to the CMake command line: > >> > >> -D LLVM_PARALLEL_LINK_JOBS:STRING="1" > >> > >> That will limit the amount of parallel link jobs to 1, even if you > >> specify -j 8 to gmake or ninja. > >> > >> Brooks, it would not be a bad idea to always use this CMake flag in the > >> llvm ports. :) > > > > And this may also fix the issues that all the small > > memory (aka, RPI*) builders are facing when trying > > to do -j4? > > Possibly, as linking is usually the most memory-consuming part of the > build process (and more so, if debugging is enabled). Are there build > logs available somewhere for those RPI builders? >

There is a collection of RPI3 buildworld logs in http://www.zefox.net/~fbsd/rpi3/swaptests/

The more recent experiments are sorted by revision first, then swap config and then other modifications. If I can do anything to make the records more useful please let me know.

hth,
bob prohaska
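For readers unsure where the flag quoted above goes, a small sketch that only prints a configure line rather than running it. The Ninja generator and the ../llvm source path are placeholders chosen for illustration, and llvm_cmake_cmd is a hypothetical helper; the one piece taken from the thread is the LLVM_PARALLEL_LINK_JOBS setting.

```sh
#!/bin/sh
# Sketch only: print (do not run) a CMake configure line that caps the
# number of parallel link jobs. Generator and source path are placeholders.
llvm_cmake_cmd() {
    link_jobs=${1:-1}
    printf 'cmake -G Ninja -D LLVM_PARALLEL_LINK_JOBS:STRING="%s" ../llvm\n' "$link_jobs"
}

llvm_cmake_cmd 1
```

Compile jobs can still run in parallel with this setting; only the link steps are serialized, which is where the thread says memory use peaks.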
Re: ntpd as ntpd user question
On Mon, Jul 23, 2018 at 09:28:41PM -0700, Kevin Oberman wrote: > On Mon, Jul 23, 2018 at 7:25 PM, Ian Lepore wrote: > > > On Mon, 2018-07-23 at 18:54 -0700, bob prohaska wrote: > > > On Mon, Jul 23, 2018 at 09:34:26PM +0200, Herbert J. Skuhra wrote: > > > > > > > > > > > > Yes, first you press m. Then you will see differences of installed > > > > file (left) and new file (right). Then you press either l or > > > > r: > > > > > > > > l | 1: choose left diff > > > > r | 2: choose right diff > > > > > > > > If the diff tries to remove/add to many lines you can: > > > > > > > > el: edit left diff > > > > er: edit right diff > > > > > > > > And if done you can view the merged file (v) before installing (i) > > > > it. > > > > > > > > I am sure, someone can explain it better! :) > > > > > > > Perhaps, but you've made the essential point. Your reply let me > > > understand that > > > mergemaster does not really "master" the merge, it rather identifies > > > files needing > > > to be merged and then starts sdiff to let me modify files. Never > > > having even looked > > > at sdiff, the learning curve proved very steep. Too steep, in fact. > > > > > > I'm going to try a more incremental approach. > > > > > > Thank you _very_ much! > > > > > > bob prohaska > > > > Your reaction to mergemaster is about the same as mine was when I first > > encountered it very long ago, and re-discovered when I tried it a > > couple years ago. It just seems like more trouble than it's worth, I > > can usually figure out what's broken and fix it by hand faster than > > messing with all the merge stuff. > > > > But, someone told me that if you give mergemaster the right flags it > > can potentially be intervention-free. Those apparently aren't the flag > > or two that're suggested at the bottom of UPDATING. So I didn't really > > dig into that any deeper, but I toss it out there in case someone can > > expand on it. > > > > It certainly makes some sense that it could be done intervention-free. 
> > When doing other diff-based merges (like 'svn update') you only have to
> > intervene when there's an actual conflict between some local change
> > you've made and the incoming changes.
> >
> >
> It gets a LOT simpler if you use "mergemaster -iPUF". Only those files you
> have modified will show up. In most cases, it just zips right by. In most
> that it does not, the use of 'r' or 'l' in merge is all you need, and always
> 'r' except on lines you have modified yourself, so you should know about
> them.
>

I realize your comments are directed to Ian and not me, so please take these $.02 for no more than they're worth.

My problems with mergemaster are _not_ with mergemaster. They're with sdiff. The window presented, along with the prompts, is simply bewildering. I suspect that somebody truly fluent with vi would recognize what's going on at once and have no trouble. I've used vi for a long time, but only in the most naive way, and sdiff's man page is little help for a newcomer. Even a Web search for tutorials found nothing very useful, at least not quickly.

A plain-language description of what sdiff does and how might make the minutiae of the man page comprehensible to non-experts.

Apologies if I'm belaboring the obvious, and thanks for reading!

bob prohaska
Re: ntpd as ntpd user question
On Mon, Jul 23, 2018 at 08:25:59PM -0600, Ian Lepore wrote: > On Mon, 2018-07-23 at 18:54 -0700, bob prohaska wrote: > > at sdiff, the learning curve proved very steep. Too steep, in fact. > > > > I'm going to try a more incremental approach.? > > > > Thank you _very_ much! > > > > bob prohaska > > Your reaction to mergemaster is about the same as mine was when I first > encountered it very long ago, and re-discovered when I tried it a > couple years ago. It just seems like more trouble than it's worth, I > can usually figure out what's broken and fix it by hand faster than > messing with all the merge stuff. > Your suggestion to use vipw seems to have worked. Copied the required line, ran /usr/sbin/pwd_mkdb -p /etc/master.passwd and installworld ran without issue. The machine has now rebooted and ntp has set the clock correctly. I don't see ntpd in a ps -aux output. It's unclear what I need to do next, but at least I'm over the first hurdle. I'll go back to your earlier email and attempt the rest of the updates by hand. Thanks for all your help! bob prohaska > But, someone told me that if you give mergemaster the right flags it > can potentially be intervention-free. Those apparently aren't the flag > or two that're suggested at the bottom of UPDATING. So I didn't really > dig into that any deeper, but I toss it out there in case someone can > expand on it. > > It certainly makes some sense that it could be done intervention-free. > When doing other diff-based merges (like 'svn update') you only have to > intervene when there's an actual conflict between some local change > you've made and the incoming changes. 
> > -- Ian
Re: ntpd as ntpd user question
On Mon, Jul 23, 2018 at 09:34:26PM +0200, Herbert J. Skuhra wrote: > > Yes, first you press m. Then you will see differences of installed > file (left) and new file (right). Then you press either l or > r: > > l | 1:choose left diff > r | 2:choose right diff > > If the diff tries to remove/add to many lines you can: > > el: edit left diff > er: edit right diff > > And if done you can view the merged file (v) before installing (i) it. > > I am sure, someone can explain it better! :) > Perhaps, but you've made the essential point. Your reply let me understand that mergemaster does not really "master" the merge, it rather identifies files needing to be merged and then starts sdiff to let me modify files. Never having even looked at sdiff, the learning curve proved very steep. Too steep, in fact. I'm going to try a more incremental approach. Thank you _very_ much! bob prohaska ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ntpd as ntpd user question
On Sun, Jul 22, 2018 at 01:49:41AM +0200, Herbert J. Skuhra wrote:
> On Sat, Jul 21, 2018 at 03:09:26PM -0700, bob prohaska wrote:
> > The failure is a little surprising, is ntpd a reserved name?
>
> Why? You obviously entered the string "ntpd" instead of an integer when
> asked for the uid!?
>

Sigh...you're right. Must have been sleepier than I thought.

> > The machine is re-running buildworld/installworld from a clean start,
> > so presumably it'll halt over the same error again. When that happens,
> > what's the simplest way to recover? Mergemaster is a big hammer, something
> > less comprehensive might suffice, even manual editing of files.
>
> In this case 'mergemaster -p' is enough.
>

An example or two on the use of mergemaster might be a considerable help. There's something very basic that I don't understand. What is the correct response to the prompts for this simple case? The output displayed is said to be differences, so the "temporary" file's nature isn't self-evident.

It looks as if the most obvious option is m, followed by eb, but that leaves me editing by hand, which is what I thought mergemaster was supposed to avoid.

Thanks for reading, and apologies for being dense.

bob prohaska
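The slip described above (a name typed where a number was expected) is easy to check for before handing a value to adduser. A minimal sketch follows; is_numeric_id is a hypothetical helper, and 12345 is an arbitrary example value, not a recommendation for the ntpd uid.

```sh
#!/bin/sh
# Sketch only: succeed if the argument is a non-empty string of digits,
# i.e. something that could be given where a numeric uid is requested.
is_numeric_id() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

for candidate in ntpd 12345; do
    if is_numeric_id "$candidate"; then
        echo "$candidate: numeric, acceptable as a uid"
    else
        echo "$candidate: not a number, would be rejected"
    fi
done
```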
Re: ntpd as ntpd user question
On Sat, Jul 21, 2018 at 12:14:10PM -0600, Ian Lepore wrote:
>
> I can't see any way that installkernel would lead to the complaint
> about the ntpd user not existing; that check is tied to the
> installworld target.
>

My mistake. I was sleepy and in a hurry. The error message was in installworld and my attempt to adduser ntpd concluded with an error:

Locked : yes
OK? (yes/no): yes
pw: Bad id 'ntpd': invalid
adduser: ERROR: There was an error adding user (ntpd).

On reboot the old ntpd set the clock and I thought all was well.

The failure is a little surprising, is ntpd a reserved name?

The machine is re-running buildworld/installworld from a clean start, so presumably it'll halt over the same error again. When that happens, what's the simplest way to recover? Mergemaster is a big hammer, something less comprehensive might suffice, even manual editing of files.

There's minimal customization on the machine, basically /etc/fstab, /etc/rc.conf and /etc/passwd. Nothing else of real value, so if I kill it in the attempt it won't be a disaster.

Thanks for waking me to my blunder...

bob prohaska
Re: ntpd as ntpd user question
On Sat, Jul 21, 2018 at 11:14:45AM -0600, Ian Lepore wrote:
>
> There's a "pre-world" stage of mergemaster (-Fp option I think) which
> isn't needed often, but one of the times it is needed is apparently
> when new user ids are added. (So I've been told, I've never much used
> mergemaster myself). I think there are some words about it at the very
> bottom of UPDATING.
>

FWIW, installkernel stopped with the note about needing an ntpd user/group. Never having been successful with mergemaster (couldn't make heads nor tails of the "what to do" prompts) I just ran adduser, creating a locked ntpd user and group. Nothing else special done.

The machine is up to r336567 on arm64. Installkernel ran, I didn't touch anything in /etc manually and reboot looked normal. For now it seems ignorance is bliss.

If there's something special I should do (beyond locking) to secure the ntpd account please warn me.

Thanks for reading,
bob prohaska
Re: use of undeclared identifier 'DW_LANG_C11'
On Sat, Jun 09, 2018 at 04:13:08PM -0400, Mark Johnston wrote:
> On Sat, Jun 09, 2018 at 01:07:24PM -0700, bob prohaska wrote:
> > I'm seeing persistent
> > --- dwarf.o ---
> > /usr/src/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c:1980:8: error: use
> > of undeclared identifier 'DW_LANG_C11'
> > case DW_LANG_C11:
> > ^
> > errors very early in buildworld attempts on 334890
> >
> > I've tried "make clean" in /usr/src to no avail, is there something else
> > to try, or should I just wait for a source update?
>
> Please give r334892 a try.

Thank you,
bob prohaska
use of undeclared identifier 'DW_LANG_C11'
I'm seeing persistent

--- dwarf.o ---
/usr/src/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c:1980:8: error: use of undeclared identifier 'DW_LANG_C11'
case DW_LANG_C11:
^

errors very early in buildworld attempts on 334890.

I've tried "make clean" in /usr/src to no avail. Is there something else to try, or should I just wait for a source update?

Thanks for reading,
bob prohaska
Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137
On Thu, Jun 07, 2018 at 05:13:45PM -0700, bob prohaska wrote:
>
> I'll try again, this time with USB swap turned off.

The circle closed, back to the original panic in the subject line. Console, top and buildworld.log files are at http://www.zefox.net/~fbsd/rpi3/crashes/20180607

Thanks for reading,
bob prohaska
Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137
On Wed, Jun 06, 2018 at 10:28:58PM -0700, Mark Millard wrote: > > Looks like there has been another stab at avoiding some > unnecessary Out Of Memory killing of processes: > > Author: alc > Date: Thu Jun 7 02:54:11 2018 > New Revision: 334752 > URL: https://svnweb.freebsd.org/changeset/base/334752 > > > Log: > . . . One visible > effect of this error was that processes were being killed by the > virtual memory system's OOM killer when in fact there was plentiful > free memory. > An RPI3 kernel at 334800 still reported Jun 7 16:28:21 www kernel: pid 71329 (c++), uid 0, was killed: out of swap space during a -j4 buildworld. I wasn't watching top at the time, so I don't know how much swap was in use. Total available was 4 GB, which certainly seems like it ought to be enough. The swap was on both microSD and USB flash. I've run make clean in /usr/src/lib/clang/libllvm and restarted a -j4 buildworld with the -DNO_CLEAN option, and also set sysctl vm.pageout_update_period=0 to see what would happen. Within a few minutes buildworld stopped, the tail of the log file contained --- X86GenEVEX2VEXTables.inc --- llvm-tblgen -gen-x86-EVEX2VEX-tables -I /usr/src/contrib/llvm/include -I /usr/src/contrib/llvm/lib/Target/X86 -d X86GenEVEX2VEXTables.inc.d -o X86GenEVEX2VEXTables.inc /usr/src/contrib/llvm/lib/Target/X86/X86.td --- X86GenFastISel.inc --- llvm-tblgen -gen-fast-isel -I /usr/src/contrib/llvm/include -I /usr/src/contrib/llvm/lib/Target/X86 -d X86GenFastISel.inc.d -o X86GenFastISel.inc /usr/src/contrib/llvm/lib/Target/X86/X86.td --- X86GenDAGISel.inc --- Killed *** [X86GenDAGISel.inc] Error code 137 make[6]: stopped in /usr/src/lib/clang/libllvm 1 error make[6]: stopped in /usr/src/lib/clang/libllvm *** [all_subdir_lib/clang/libllvm] Error code 2 make[5]: stopped in /usr/src/lib/clang I'll try again, this time with USB swap turned off. Thanks for reading! 
bob prohaska
Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137
On Wed, Jun 06, 2018 at 08:55:39PM +0200, Ronald Klop wrote: > On Sat, 02 Jun 2018 13:40:27 +0200, Ronald Klop > wrote: > > > How do you ever run a -j4 buildworld? My RPI3 starts building clang/llvm > with sometimes 500 MB+ per process so everything starts swapping like hell > and takes forever to run. >

Lately, never 8-)

When I started playing with an RPI3, in late 2016, -j4 buildworlds worked usably well. Early in 2018 problems appeared, including Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed, among others. Things didn't really go to pot until somewhat later, when the swap frenzy issue reared its head, and they haven't improved much since.

Sadly, when the swap frenzy workaround of using sysctl vm.pageout_update_period=0 was suggested, a -j4 buildworld then reverted to the old td_lock issue, so it looks as if both bugs are alive and kicking.

Just to complicate matters, I was in the habit of using a USB flash drive as both an outboard file system (/usr/, /var/ and /tmp/) and as a swap device. A very common reaction was to blame the flash device for the trouble, though so far as I can tell a Sandisk Extreme USB flash drive isn't much slower, if any, than a mechanical hard disk for random writes. The same USB flash devices on a Pi2 running 11-Stable seem to be fine.

However, turning off the USB flash swap device does seem to reduce the number of "indefinite wait buffer" messages on the console (they're usually not fatal) so I think there is still something amiss. Whether it's the flash, the USB or the VM system is unclear to me.

For now the workarounds are to run buildworld with no explicit -j value (presumably equivalent to -j1), to use only swap on the microSD card and to use the -DNO_CLEAN option for most buildworld sessions, doing an explicit "make clean" or "rm -rf /usr/obj/usr" when necessary.
In a few cases it seemed helpful to start with "make kernel-toolchain" then follow with "make -DNO_CLEAN buildworld", but I didn't keep good enough records to be certain of the benefits.

Apologies for the length, HTH

bob prohaska
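The low-memory build sequence described above can be written down as a short script. This is a sketch under stated assumptions: it only echoes the commands unless DRY_RUN=0 is set, the run and low_memory_buildworld helpers are hypothetical names, and the use of make -C /usr/src is my choice rather than something stated in the message.

```sh
#!/bin/sh
# Sketch only: the single-job, no-clean build order mentioned above.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

low_memory_buildworld() {
    # No -j option: leave make at its default of one job at a time.
    run make -C /usr/src kernel-toolchain
    run make -C /usr/src -DNO_CLEAN buildworld
}

low_memory_buildworld
```

Keeping the dry-run default makes it easy to review the exact command order before committing a Pi to a build that may take many hours.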
Re: Module compiles looking in /usr/src when alternate src tree is in use [actually the arm_neon.h and stdint.h issue]
On Mon, Apr 09, 2018 at 06:04:24AM -0700, Mark Millard wrote: > On 2018-Apr-8, at 10:08 PM, bob prohaska wrote: > >> . . . > > On my RPi3 > > root@www:/usr/src # ls -l /usr/lib/clang/6.0.0/include/stdint.h > > -rw-r--r-- 1 root wheel 23387 Feb 16 07:37 > > /usr/lib/clang/6.0.0/include/stdint.h > > > > Every other file in that directory is dated January 22nd. > > > > > >> . . . > > > > Looks like it's close enough 8-) > > Removing /usr/lib/clang/6.0.0/include/stdint.h has allowed make kernel > > to proceed past its former point of failure. > > > > Looks like you copied the file there. Its presence is not a > build problem. See below. > > From Feb 16 Email from you: > > From: bob prohaska > Subject: Re: RPI3 can't build kernel-toolchain > Date: February 16, 2018 at 9:09:27 AM PST > To: Mark Millard > Cc: freebsd-arm at freebsd.org, bob prohaska > . . . > Running > cp ./contrib/llvm/tools/clang/lib/Headers/stdint.h > /usr/lib/clang/6.0.0/include > didn't solve the problem. >

I remembered the experiment that worked, and forgot the one that didn't.

Thank you!

bob prohaska
Re: Module compiles looking in /usr/src when alternate src tree is in use [actually the arm_neon.h and stdint.h issue]
On Sun, Apr 08, 2018 at 09:51:19PM -0700, Mark Millard wrote: > Rodney W. Grimes freebsd-rwg at pdx.rh.CN85.dnsmgr.net wrote on > Mon Apr 9 03:54:50 UTC 2018 : > > > Something for some reason included arm_neon.h? > > > # grep -r arm_neon.h /usr/src/sys/ | more > /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:#include > > arm_neon.h is something that the kernel source itself has a reference > to. [But the stdint.h that was in the error messages was found were > the it should not exist as far as I can tell, see below.] > > # find /usr/src -name .svn -prune -o -name arm_neon.h -print > > finds nothing. But . . . > > # find /usr/lib -name arm_neon.h -print > /usr/lib/clang/6.0.0/include/arm_neon.h > > This matches the error message report and is the only > copy around in the system areas to find. (Ignoring > ports materials and /usr/local/ .) > > In turn that arm_neon.h has: > > # grep stdint.h /usr/lib/clang/6.0.0/include/arm_neon.h > #include > > Looking in a tree that I have (from an amd64 -> arm64 cross > build for what is a Cortex-A53 intended use): > > /usr/obj/DESTDIRs/clang-cortexA53-installworld/ > > were I did an installworld for arm64: > > # find /usr/obj/DESTDIRs/clang-cortexA53-installworld -name stdint.h > /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/c++/v1/stdint.h > /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/c++/v1/tr1/stdint.h > /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/sys/stdint.h > /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/stdint.h > > There is no stdint.h under that tree's /usr/lib/ area: > > /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/lib/ > > was not listed anywhere. 
>
> For reference relative to arm_neon.h and this tree:
>
> # find /usr/obj/DESTDIRs/clang-cortexA53-installworld -name arm_neon.h | more
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/lib/clang/6.0.0/include/arm_neon.h
>
> I conclude that:
>
> /usr/lib/clang/6.0.0/include/stdint.h
>
> should not have been created in the first place.
>
> [Does that stdint.h have file-system dates/times matching
> the other files from the build? Or does it look to be
> mismatched and possibly just needs to be deleted?]
>

On my RPi3

root@www:/usr/src # ls -l /usr/lib/clang/6.0.0/include/stdint.h
-rw-r--r-- 1 root wheel 23387 Feb 16 07:37 /usr/lib/clang/6.0.0/include/stdint.h

Every other file in that directory is dated January 22nd.

>
> For reference, all the above is based on source for head -r332293:
>
> # uname -apKU
> FreeBSD FBSDFSSD 12.0-CURRENT FreeBSD 12.0-CURRENT r332293M amd64 amd64
> 1200061 1200061
>
> # svnlite info /usr/src | grep "Re[plv]"
> Relative URL: ^/head
> Repository Root: svn://svn.freebsd.org/base
> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> Revision: 332293
> Last Changed Rev: 332293
>
>
> I do not have an arm64 system that is anywhere near up to
> date at this time so the above evidence is not from a
> self-hosted build: My context is not a full-match.
>

Looks like it's close enough 8-)

Removing /usr/lib/clang/6.0.0/include/stdint.h has allowed make kernel to proceed past its former point of failure.

Thank you!

bob prohaska
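The date mismatch that gave the stray header away can be hunted for mechanically. A minimal sketch, assuming the installed headers share a common timestamp and using a scratch directory in place of /usr/lib/clang/6.0.0/include: newer_than is a hypothetical helper, the file names and dates mimic the ones reported above, and the year is assumed from the thread date.

```sh
#!/bin/sh
# Sketch only: list regular files under a directory that were modified
# more recently than a chosen reference file.
newer_than() {
    find "$1" -type f -newer "$2"
}

# Build a small scratch tree that mimics the situation in the thread:
# two headers from the January install and one copied in later.
demo=$(mktemp -d)
touch -t 201801220000 "$demo/arm_neon.h" "$demo/stdarg.h"
touch -t 201802160737 "$demo/stdint.h"

# Only the later copy should be reported. The scratch tree is left in
# place so it can be inspected afterwards.
newer_than "$demo" "$demo/arm_neon.h"
```

On a real system the reference would be any header known to come from the last installworld, for example arm_neon.h in the same directory.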
Re: Module compiles looking in /usr/src when alternate src tree is in use
On Sun, Apr 08, 2018 at 05:40:55PM -0700, Rodney W. Grimes wrote: > > On Sun, Apr 08, 2018 at 12:00:52PM -0700, Rodney W. Grimes wrote: > > > I am having a compile time issue for a patched that compiled fine on my > > > r329294 system, but now failes to compile with what looks like a wrong > > > header being included. > > > > > Might this be a cousin to the problem reported at > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227274 ? > > > > In that kernel compile (on an RPi3) the compiler complains > > > > In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46: > > In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31: > > /usr/lib/clang/6.0.0/include/stdint.h:228:25: error: typedef redefinition > > with different types ('int16_t' (aka 'short') vs '__int_fast16_t' (aka > > 'int')) > > typedef __int_least16_t int_fast16_t; > > > > The reference to /usr/lib/clang/... seems a bit strange; isn't a major > > purpose of the kernel build procedure to minimize reliance on the > > host system's (already-stale) software? > > Are you building in /usr/src, or are your sources located some place else? > This is a straightforward self-hosted build on an RPi3. Sources are in /usr/src. There are no modifications to the source directories. > Really need the log that includes the cc command line, as that has the > tell tell -I/usr/src/sys in it. That component is totally bogus! At > no time should a src tree rooted at /usr/src-topo be trying to use files > from /usr/src/. > Should files _outside_ /usr/src or /usr/obj _ever_ be referenced during a world or kernel build? I thought the answer was "no". 
The line leading up to the error message is:

--- armv8_crypto_wrap.o ---
cc -target aarch64-unknown-freebsd12.0 --sysroot=/usr/obj/usr/src/arm64.aarch64/tmp -B/usr/obj/usr/src/arm64.aarch64/tmp/usr/bin -c -O3 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG/opt_global.h -I. -I/usr/src/sys -fno-common -g -fPIC -I/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG -ffixed-x18 -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-error-address-of-packed-member -std=iso9899:1999 -Werror -march=armv8-a+crypto /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c
In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46:
In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31:

There's a "-I/usr/src/sys" in the command, which in my case makes sense,
but where does the reference to /usr/lib/clang/ come from, and is it
appropriate?

> > If the two problems are related, should the subject line on the bug
> > report be changed?
>
> It could be, but more info would be needed.
>

Please let me know what additional information is needed.

Thanks for reading,

bob prohaska
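For anyone wanting to check a saved build log for headers pulled in from outside /usr/src and /usr/obj, something along these lines might help. It is a rough sketch only: it assumes the log uses clang's "In file included from PATH:LINE:" form seen in this message, and the function name outside_includes is invented for illustration.

```shell
#!/bin/sh
# outside_includes LOGFILE
# Print each distinct path named in an "In file included from" line of
# LOGFILE that is not under /usr/src/ or /usr/obj/.
# "outside_includes" is a hypothetical helper, not an existing tool.
outside_includes() {
    # Keep only the path part of "In file included from PATH:LINE:",
    # allowing leading whitespace, then drop paths inside the two
    # expected trees and remove duplicates.
    sed -n 's/^[[:space:]]*In file included from \(.*\):[0-9][0-9]*:$/\1/p' "$1" |
        grep -v -e '^/usr/src/' -e '^/usr/obj/' |
        sort -u
}

# Possible use, with buildkernel.log standing in for a saved log:
# outside_includes buildkernel.log
```

This only looks at the include chain that the compiler prints when it reports a diagnostic, so headers that were read without producing an error or warning will not show up.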
Re: Module compiles looking in /usr/src when alternate src tree is in use
On Sun, Apr 08, 2018 at 12:00:52PM -0700, Rodney W. Grimes wrote:
> I am having a compile time issue for a patched that compiled fine on my
> r329294 system, but now fails to compile with what looks like a wrong
> header being included.
>

Might this be a cousin to the problem reported at
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227274 ?

In that kernel compile (on an RPi3) the compiler complains

In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46:
In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31:
/usr/lib/clang/6.0.0/include/stdint.h:228:25: error: typedef redefinition with different types ('int16_t' (aka 'short') vs '__int_fast16_t' (aka 'int'))
typedef __int_least16_t int_fast16_t;

The reference to /usr/lib/clang/... seems a bit strange; isn't a major
purpose of the kernel build procedure to minimize reliance on the
host system's (already-stale) software?

If the two problems are related, should the subject line on the bug
report be changed?

Thanks for reading,

bob prohaska