Re: Unfamiliar console message: in prompt_tty(): caught signal 2

2024-04-21 Thread bob prohaska
On Sun, Apr 21, 2024 at 10:16:55PM +0200, Dag-Erling Smørgrav wrote:
> bob prohaska  writes:
> > Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2
> 
> This means someone ran `su` and pressed Ctrl-C instead of entering a
> password when prompted.

Ahh, that would have been me. Thank you!
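
For the archive: signal 2 is SIGINT, the signal a terminal sends on Ctrl-C,
which fits the explanation above. A minimal sketch for looking up a signal
number from the shell; the function name signame is only illustrative:

```sh
# signame N: print the name of signal number N, using the shell's kill builtin.
signame() {
    kill -l "$1"
}

signame 2    # signal 2 is SIGINT, sent by the terminal on Ctrl-C
```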

bob prohaska




Unfamiliar console message: in prompt_tty(): caught signal 2

2024-04-21 Thread bob prohaska
On the serial console on a Pi3 v1.1 (so armv7) I just noticed an
unfamiliar message:

Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2

Several login failures were reported shortly afterward, so
the message seems to have been a console message, not from
the tip session used to connect.

I've never seen it before and wondered if it has any special 
importance. The machine was running buildworld on -current,
updated a day or so ago.

By the next morning the machine had locked up hard, no
response to the enter-tilde-control-B debugger escape.

After power-cycling it came back up following fsck, and buildworld
was resumed where it left off.

Thanks for reading, 

bob prohaska






Re: Buildworld stops for d3befb534b9 in tests

2024-03-04 Thread bob prohaska
On Mon, Mar 04, 2024 at 09:54:14AM -0800, Mark Millard wrote:
> bob prohaska  wrote on
> Date: Mon, 04 Mar 2024 16:35:52 UTC :
> 
> > An armv7 (Pi2 v1.1) -current system stopped buildworld with
> > 
> > c++: error: linker command failed with exit code 1 (use -v to see 
> > invocation)
> > *** [capsicum-test.full] Error code 1
> 
> There might have been more error messages at some earlier point prior to
> the above. Such likely would have more detail about what the issue was.
> 

You're right, I missed the start. Here it is:

Building /usr/obj/usr/src/arm.armv7/tests/sys/capsicum/capsicum-test.full
cc -target armv7-gnueabihf-freebsd15.0 --sysroot=/usr/obj/usr/src/arm.armv7/tmp 
-B/usr/obj/usr/src/arm.armv7/tmp/usr/bin  -O2 -pipe -fno-common   
-I/usr/src/lib/libarchive/tests -I/usr/src/lib/libarchive 
-I/usr/obj/usr/src/arm.armv7/lib/libarchive/tests 
-I/usr/src/contrib/libarchive/libarchive 
-I/usr/src/contrib/libarchive/libarchive/test 
-I/usr/src/contrib/libarchive/test_utils -DHAVE_BZLIB_H=1 -DHAVE_LIBLZMA=1 
-DHAVE_LZMA_H=1  -DHAVE_ZSTD_H=1 -DHAVE_LIBZSTD=1 -DHAVE_LIBZSTD_COMPRESSOR=1 
-DPLATFORM_CONFIG_H=\"/usr/src/lib/libarchive/tests/config_freebsd.h\" 
-DWITH_OPENSSL -DOPENSSL_API_COMPAT=0x1010L -g -gz=zlib -std=gnu99 
-Wno-format-zero-length -fstack-protector-strong -Wsystem-headers -Werror -Wall 
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes 
-Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign 
-Wdate-time -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable 
-Wno-error=unused-but-set-parameter -Wno-tautological-compare -Wno-unused-value 
-Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion 
-Wno-unused-local-typedef -Wno-address-of-packed-member  -Qunused-arguments   
-c /usr/src/contrib/libarchive/libarchive/test/test_archive_pathmatch.c -o 
test_archive_pathmatch.o
ld: error: undefined symbol: testing::internal::CmpHelperGE(char const*, char 
const*, long long, long long)
>>> referenced by procdesc.cc:199 
>>> (/usr/src/contrib/capsicum-test/procdesc.cc:199)
>>>   procdesc.o:(Pdfork_TimeCheck_Test::TestBody())

ld: error: undefined symbol: testing::internal::CmpHelperEQ(char const*, char 
const*, long long, long long)
>>> referenced by gtest.h:1502 
>>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502)
>>>   procdesc.o:(Pdfork_TimeCheck_Test::TestBody())
>>> referenced by gtest.h:1502 
>>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502)
>>>   procdesc.o:(Pdfork_TimeCheck_Test::TestBody())
>>> referenced by gtest.h:1502 
>>> (/usr/obj/usr/src/arm.armv7/tmp/usr/include/private/gtest/gtest.h:1502)
>>>   procdesc.o:(Pdfork_TimeCheck_Test::TestBody())
__cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x64b138 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x6309c4 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x6ce470 from unloaded dso, skipping
__cxa_thread_call_dtors: dtr 0x64b138 from unloaded dso, skipping
Building /usr/obj/usr/src/arm.armv7/usr.bin/mandoc/mdoc_markdown.o


Thanks for reading, and apologies for the omission!

bob prohaska



Buildworld stops for d3befb534b9 in tests

2024-03-04 Thread bob prohaska
An armv7 (Pi2 v1.1) -current system stopped buildworld with

c++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [capsicum-test.full] Error code 1

make[6]: stopped in /usr/src/tests/sys/capsicum
.ERROR_TARGET='capsicum-test.full'
.ERROR_META_FILE='/usr/obj/usr/src/arm.armv7/tests/sys/capsicum/capsicum-test.full.meta'
.MAKE.LEVEL='6'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
_ERROR_CMD='c++ -target armv7-gnueabihf-freebsd15.0 
--sysroot=/usr/obj/usr/src/arm.armv7/tmp 
-B/usr/obj/usr/src/arm.armv7/tmp/usr/bin -O2 -pipe -fno-common -I/usr/src/tests 
-g -gz=zlib -Wno-format-zero-length -fstack-protector-strong -Wsystem-headers 
-Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wpointer-arith 
-Wno-uninitialized -Wdate-time -Wno-empty-body -Wno-string-plus-int 
-Wno-unused-const-variable -Wno-error=unused-but-set-parameter 
-Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality 
-Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef 
-Wno-address-of-packed-member -Qunused-arguments 
-I/usr/obj/usr/src/arm.armv7/tmp/usr/include/private -DGTEST_HAS_POSIX_RE=1 
-DGTEST_HAS_PTHREAD=1 -DGTEST_HAS_STREAM_REDIRECTION=1 -frtti -std=c++14 
-Wno-c++11-extensions  -Wl,-zrelro   -o capsicum-test.full  
capsicum-test-main.o capsicum-test.o capability-fd.o copy_file_range.o 
fexecve.o procdesc.o capmode.o fcntl.o ioctl.o openat.o sysctl.o select.o 
mqueue.o socket.o sctp.o capability-fd-pair.o overhead.o rename.o 
-lprivategtest   -lprocstat -lpthread;'
.CURDIR='/usr/src/tests/sys/capsicum'
.MAKE='make'
.OBJDIR='/usr/obj/usr/src/arm.armv7/tests/sys/capsicum'
.TARGETS=' all'
CPUTYPE=''
DESTDIR='/usr/obj/usr/src/arm.armv7/tmp'
LD_LIBRARY_PATH=''
MACHINE='arm'
MACHINE_ARCH='armv7'
MACHINE_CPUARCH='arm'
MAKEOBJDIRPREFIX=''
MAKESYSPATH='/usr/src/share/mk'
MAKE_VERSION='20220726'
PATH='/usr/obj/usr/src/arm.armv7/tmp/bin:/usr/obj/usr/src/arm.armv7/tmp/usr/sbin:/usr/obj/usr/src/arm.armv7/tmp/usr/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/sbin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/bin:/usr/obj/usr/src/arm.armv7/tmp/legacy/usr/libexec::/sbin:/bin:/usr/sbin:/usr/bin'
SRCTOP='/usr/src'
OBJTOP='/usr/obj/usr/src/arm.armv7'
.MAKE.MAKEFILES='/usr/src/share/mk/sys.mk /usr/src/share/mk/local.sys.env.mk 
/usr/src/share/mk/src.sys.env.mk /usr/src/share/mk/bsd.mkopt.mk 
/usr/src/share/mk/src.sys.obj.mk /usr/src/share/mk/local.sys.machine.mk 
/usr/src/share/mk/meta.sys.mk /usr/src/share/mk/local.meta.sys.env.mk 
/usr/src/share/mk/auto.obj.mk /usr/src/share/mk/bsd.suffixes.mk /etc/make.conf 
/usr/src/share/mk/local.sys.mk /usr/src/share/mk/src.sys.mk /etc/src.conf 
/usr/src/tests/sys/capsicum/Makefile /usr/src/share/mk/src.opts.mk 
/usr/src/share/mk/bsd.own.mk /usr/src/share/mk/bsd.opts.mk 
/usr/src/share/mk/bsd.cpu.mk /usr/src/share/mk/bsd.compiler.mk 
/usr/src/share/mk/bsd.endian.mk /usr/src/share/mk/bsd.linker.mk 
/usr/src/share/mk/bsd.test.mk /usr/src/share/mk/bsd.init.mk 
/usr/src/share/mk/local.init.mk /usr/src/share/mk/src.init.mk 
/usr/src/tests/sys/capsicum/../Makefile.inc /usr/src/tests/Makefile.inc0 
/usr/src/share/mk/googletest.test.mk /usr/src/share/mk/googletest.test.inc.mk 
/usr/src/share/mk/plain.test.mk /usr/src/share/mk/tap.test.mk 
make[2]: stopped in /usr/src

I just re-ran git pull, no changes

Thanks for reading,

bob prohaska




Re: Missing files on -current

2024-02-24 Thread bob prohaska
On Sat, Feb 24, 2024 at 03:59:01PM +, Gary Jennejohn wrote:
> 
> The function run_rc_scripts is defined in /usr/src/libexec/rc/rc.subr and
> is called in /usr/src/libexec/rc/rc.  /etc/rc includes /etc/rc.subr.
> 
> So, maybe one of these files is not up to date under /etc?
> 

My fault, etcupdate reported a conflict and I didn't
notice it. Sorry for the noise!

bob prohaska
 



Re: Missing files on -current

2024-02-24 Thread bob prohaska
On Sat, Feb 24, 2024 at 07:02:19AM -0800, David Wolfskill wrote:
> 
> This is from an amd64 system at main-n268514-61b88a230bac, but
> run_rc_scripts is a shell function defined in /etc/rc.subr.
> 
> So the whine about not finding run_rc_scripts would indicate that at
> least one of the following is true:
> 
> * The script that should have sourced /etc/rc.subr failed to do so.
> 
> * /etc/rc.csubr is corrupted, and fails to define run_rc_scripts().
>


Indeed, it seems to be absent:
root@:~ # more /etc/rc.csubr
/etc/rc.csubr: No such file or directory
root@:~ #

However, the same is true of a Pi3 running 14-release p5.
It boots reliably once it reaches loader.

I wouldn't expect this part of the boot process to be
platform dependent. Maybe -current and -release do
things differently?
 
> * /etc/rc.subr is missing.
Present and accounted for:
root@:~ # ls -l /etc/rc.subr
-rw-r--r--  1 root wheel 51911 Nov 18 21:46 /etc/rc.subr
root@:~ # 
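
Since run_rc_scripts is a shell function rather than a file, the thing to
check is whether /etc/rc.subr actually defines it. A rough sketch, assuming
the definition starts a line in the form "name()"; the helper name
defines_func is made up for illustration:

```sh
# defines_func FILE NAME: succeed if FILE contains a shell function
# definition of the form "NAME()" at the start of a line.
defines_func() {
    grep -q "^$2[[:space:]]*()" "$1"
}

# Example use on the affected system (paths as discussed in this thread):
#   defines_func /etc/rc.subr run_rc_scripts && echo present || echo missing
```

If this reports "missing" for a file that exists, a stale or conflicted copy
under /etc is a likely suspect.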

Thanks for writing!

bob prohaska




Missing files on -current

2024-02-24 Thread bob prohaska
A Pi4 running -current completed a build/install cycle for world and kernel
without obvious errors but failed to reboot, reporting:
...
Warning: no time-of-day clock registered, system time will not be set accurately
Dual Console: Serial Primary, Video Secondary
/etc/rc: run_rc_scripts: not found
/etc/rc: run_rc_scripts: not found
/etc/rc: have: not found

Sat Feb 24 13:42:09 UTC 2024
2024-02-24T13:42:10.007616+00:00 - init 31 - - can't exec getty 
'/usr/libexec/getty' for port /dev/ttyv1: No such file or directory 
...

Uname -a reports:
FreeBSD  15.0-CURRENT FreeBSD 15.0-CURRENT #121 main-n268499-b9870ba93ea9: Fri 
Feb 23 23:14:59 PST 2024 
b...@nemesis.zefox.com:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC 
arm64distribution.

Power cycling allowed boot to single-user, running fsck -fy reports a clean
root file system.

/etc/fstab contains
/dev/da0s2a   /             ufs      rw           1  1
/dev/da0s1    /boot/msdos   msdosfs  rw,noatime   0  0
#tmpfs        /tmp          tmpfs    rw,mode=1777,size=50m  0  0
/dev/da0s2d   /usr          ufs      rw           2  2
/dev/da0s2b   none          swap     sw

There does not seem to be a file named run_rc_scripts present
in the filesystem.

Any suggestions on how to back myself out of this corner
would be much appreciated!

Thanks for reading,

bob prohaska




Re: bsdinstall use on rpi4

2024-01-13 Thread bob prohaska
On Sat, Jan 13, 2024 at 05:03:41PM +, void wrote:
> 
> I've used this method with 13-stable and 14-stable, but wondered if
> maybe it was deprecated in 15-current. The showstopper is the error marked
> [2] which is within seconds followed by [3]. If it was just [1]
> then I could work around it.
 
IIRC I didn't use the automatic setup, but rather the manual one. Perhaps
that changed something. Also, I was using the snapshot for 14-stable on Pi3,
that might be quite different, intentionally or otherwise. "Can't create..." a
file looks like a permissions or sequence error, with one of the intermediate
directories not created yet. 

ISTR a thread on this general topic saying to my eye that bsdinstall 
either didn't or couldn't deal with partitions outside the UFS 
filesystem. If you got a working msdos partition then I'm badly
mistaken. Can't find that thread now. 

Thanks for reading,

bob prohaska




Re: bsdinstall use on rpi4

2024-01-13 Thread bob prohaska
On Sat, Jan 13, 2024 at 03:26:19PM +, void wrote:
> Hi,
> 
> I'm trying to use bsdinstall on
> FreeBSD-15.0-CURRENT-arm64-aarch64-RPI-20240111-a61d2c7fbd3c-267507.img
> to install to usb3-connected HD, using the 'expert mode' for UFS,
> after having initially booted from mmcsd.
> 
> The goal is to boot to usb3 with freebsd on UFS filesystem, and to have
> that filesystem split into partitions, and the partitions having various
> properties. IOW not to have it all on /. I'd like to use GPT instead of MBR.
> 
> Is this (bsdinstall) method 'correct' or should I use some other method?
> 
> I've tried this method but get errors after the fetching phase.
> 
> 1. "manifest not found on local disk and will be fetched from an
> unverified source..." http://void.f-m.fm.user.fm/error1.png
> 
> 2. "error while extracting base.txz: can't create
> /usr/share/untrusted/Sonera_Class_2_Root_CA.pem"
> http://void.f-m.fm.user.fm/error2.png
> 
> 3. "could not set root password. An installation step has been
> aborted. Would you like to restart the installation or exit the
> installer?"
> http://void.f-m.fm.user.fm/error3.png
> 

I tried this using 14-release on a Pi3 with a usb mechanical 
hard disk. I don't recall seeing the errors reported above, 
but found that the msdos partition wasn't populated. I copied 
files manually.

The resulting host is finicky about booting, sometimes requiring
intervention at the serial console to prod u-boot to find the
usb disk. This particular Pi is booting without a microSD; it's 
possible the usb-sata adapter contributes to the problem. It
might be worth trying the "bootcode.bin-only" method to see
if that helps. Perhaps others can say more.

Once FreeBSD is up there have been no obvious problems.

Thanks for reading, hth

bob prohaska





Re: How much survives an install/reboot cycle?

2023-11-20 Thread bob prohaska
On Mon, Nov 20, 2023 at 09:12:45AM +0800, Zhenlei Huang wrote:
> 
> 
> > On Nov 19, 2023, at 11:51 PM, bob prohaska  wrote:
> > 
> > How much of a running system's state survives a reboot? I used to think
> > the answer was "nothing", but from time to time a second reboot behaves
> > a  little differently from the previous one. 
> 
> Warner has a good description about that. I totally agree.
> 
> > 
> > The most recent example was an update to bpf.c: Prior to the update an
> > armv7 system had been inclined to drop ssh connections left up for days.
> > After updating and running a build/install cycle the behavior persisted,
> > but since a second reboot with no intentional changes it has stopped.
> 
> The most recent change to bpf.c is 7a974a649848 (bpf: Make dead_bpf_if const).
> It is not a functional change, and I do not think it will affect ssh.
> There could be issues under the earth.
> 

That is most helpful. Very likely the change I saw is simply coincidence.

> Anyway please do not hesitate to report if you get recovered by
> reverting 7a974a649848.

In this case I don't want to revert; the new behavior is desirable. My only
puzzle was the seeming delay in its appearance.

The only consistent issue remaining is reported in Bug 273566. It finally
dawned on me that the garbage characters must originate on the USB end, be
transmitted to the getty process watching the serial end, and get stuck in
the transmit buffer when the link goes down. When the serial link comes back
up they appear on the receiving console display.

Many thanks to you and Warner!

bob prohaska




How much survives an install/reboot cycle?

2023-11-19 Thread bob prohaska
How much of a running system's state survives a reboot? I used to think
the answer was "nothing", but from time to time a second reboot behaves
a little differently from the previous one.

The most recent example was an update to bpf.c: Prior to the update an
armv7 system had been inclined to drop ssh connections left up for days.
After updating and running a build/install cycle the behavior persisted,
but since a second reboot with no intentional changes it has stopped.

I've not tampered with nextboot, so I don't think that's it. Maybe I'm
just imagining things.

Thanks for reading,

bob prohaska




Re: www/chromium will not build on a host w/ 8 CPU and 16G mem [RPi4B 8 GiByte example]

2023-08-25 Thread bob prohaska
On Fri, Aug 25, 2023 at 02:21:33AM -0700, Mark Millard wrote:
> 
> That will not help avoid the R_AARCH64_ABS64 abuse,
> unfortunately.
> 
> 
Thank you for the analysis. I've posted a bug, id=273349.

Sounds like I shouldn't hold my breath 8-(

bob prohaska
 
> 



Re: www/chromium will not build on a host w/ 8 CPU and 16G mem [RPi4B 8 GiByte example]

2023-08-24 Thread bob prohaska
On Thu, Aug 24, 2023 at 03:20:50PM -0700, Mark Millard wrote:
> bob prohaska  wrote on
> Date: Thu, 24 Aug 2023 19:44:17 UTC :
> 
> > On Fri, Aug 18, 2023 at 08:05:41AM +0200, Matthias Apitz wrote:
> > > 
> > > sysctl vfs.read_max=128
> > > sysctl vfs.aio.max_buf_aio=8192
> > > sysctl vfs.aio.max_aio_queue_per_proc=65536
> > > sysctl vfs.aio.max_aio_per_proc=8192
> > > sysctl vfs.aio.max_aio_queue=65536
> > > sysctl vm.pageout_oom_seq=120
> > > sysctl vm.pfault_oom_attempts=-1 
> > > 
> > 
> > Just tried these settings on a Pi4, 8GB. Seemingly no help,
> > build of www/chromium failed again, saying only:
> > 
> > ===> Compilation failed unexpectedly.
> > Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
> > the maintainer.
> > *** Error code 1
> > 
> > No messages on the console at all, no indication of any swap use at all.
> > If somebody can tell me how to invoke MAKE_JOBS_UNSAFE=yes, either
> > locally or globally, I'll give it a try. But, if it's a system problem
> > I'd expect at least a peep on the console
> 
> Are you going to post the log file someplace? 


http://nemesis.zefox.com/~bob/data/logs/bulk/main-default/2023-08-20_16h11m59s/logs/errors/chromium-115.0.5790.170_1.log

> You may have  missed an earlier message. 

Yes, I did. Some (very long) lines above there is:

[ 96% 53691/55361] "python3" "../../build/toolchain/gcc_link_wrapper.py" 
--output="./v8_context_snapshot_generator" -- c++ -fuse-ld=lld 
-Wl,--build-id=sha1 -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now 
-Wl,--icf=all -Wl,--color-diagnostics -Wl,--undefined-version 
-Wl,-mllvm,-enable-machine-outliner=never -no-canonical-prefixes -Wl,-O2 
-Wl,--gc-sections -rdynamic -pie -Wl,--disable-new-dtags -Wl,--icf=none 
-L/usr/local/lib  -fstack-protector-strong -L/usr/local/lib  -o 
"./v8_context_snapshot_generator" -Wl,--start-group 
@"./v8_context_snapshot_generator.rsp"  -Wl,--end-group  -lpthread 
-lgmodule-2.0 -lglib-2.0 -lgobject-2.0 -lgthread-2.0 -lintl -licui18n -licuuc 
-licudata -lnss3 -lsmime3 -lnssutil3 -lplds4 -lplc4 -lnspr4 -ldl -lkvm 
-lexecinfo -lutil -levent -lgio-2.0 -ljpeg -lpng16 -lxml2 -lxslt -lexpat -lwebp 
-lwebpdemux -lwebpmux -lharfbuzz-subset -lharfbuzz -lfontconfig -lopus 
-lopenh264 -lm -lz -ldav1d -lX11 -lXcomposite -lXdamage -lXext -lXfixes 
-lXrender -lXrandr -lXtst -lepoll-shim -ldrm -lxcb -lxkbcommon -lgbm -lXi -lGL 
-lpci -lffi -ldbus-1 -lpangocairo-1.0 -lpango-1.0 -lcairo -latk-1.0 
-latk-bridge-2.0 -lsndio -lFLAC -lsnappy -latspi 
FAILED: v8_context_snapshot_generator 

Then, a bit further down in the file, a series of
ld.lld: error: relocation R_AARCH64_ABS64 cannot be used against local symbol; recompile with -fPIC
complaints.

It's unclear whether the two kinds of complaints are related, or whether
they're the first.

> How long had it run before  stopping? 

95 hours, give or take. Nothing about a timeout was reported.

> How does that match up with the MAX_EXECUTION_TIME
> and NOHANG_TIME and the like that you have poudriere set
> up to use ( /usr/local/etc/poudriere.conf ). 

NOHANG_TIME=44400
MAX_EXECUTION_TIME=1728000
MAX_EXECUTION_TIME_EXTRACT=144000
MAX_EXECUTION_TIME_INSTALL=144000
MAX_EXECUTION_TIME_PACKAGE=11728000
Admittedly some are plain silly; I just started
tacking on zeros after getting timeouts and being
unable to match the error message to a variable name.

I checked for duplicates this time, however.
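
To put those numbers in perspective, here is a small conversion sketch. It
assumes the poudriere.conf timeouts are given in seconds; secs_to_hours is
just an illustrative helper and rounds down to whole hours:

```sh
# secs_to_hours N: convert N seconds to whole hours (integer division).
secs_to_hours() {
    echo $(( $1 / 3600 ))
}

# The four distinct values from the settings listed above.
for t in 44400 1728000 144000 11728000; do
    printf '%8d s = %d h\n' "$t" "$(secs_to_hours "$t")"
done
```

On that assumption MAX_EXECUTION_TIME works out to 480 hours, so a run of
about 95 hours would not have reached it.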

> Something  relevant for the question is what you have for:
> 
> # Grep build logs to determine a possible build failure reason.  This is
> # only shown on the web interface.
> # Default: yes
> DETERMINE_BUILD_FAILURE_REASON=no
> 
> Using DETERMINE_BUILD_FAILURE_REASON leads to large builds
> running for a long time after it starts the process of
> stopping from a timeout: the grep activity takes a long
> time and the build activity is not stopped during the
> grep.
> 
> 
> vm.pageout_oom_seq=120 and vm.pfault_oom_attempts=-1 make
> sense to me for certain kinds of issues involved in large
> builds, presuming sufficient RAM+SWAP for how it is set
> up to operate. vm.pageout_oom_seq is associated with
> console/log messages. If one runs out of RAM+SWAP,
> vm.pfault_oom_attempts=-1 tends to lead to deadlock. But
> it allows slow I/O to have the time to complete and so
> can be useful.
> 
> I'm not sure that any vfs.aio.* is actually involved: special
> system calls are involved, splitting requests vs. retrieving
> the status of completed requests later. Use of aio has to be
> explicit in the running software from what I can tell. I've
> no information about which software builds might be using aio
> during the build activity.
> 

Re: www/chromium will not build on a host w/ 8 CPU and 16G mem

2023-08-24 Thread bob prohaska
On Fri, Aug 18, 2023 at 08:05:41AM +0200, Matthias Apitz wrote:
> 
> sysctl vfs.read_max=128
> sysctl vfs.aio.max_buf_aio=8192
> sysctl vfs.aio.max_aio_queue_per_proc=65536
> sysctl vfs.aio.max_aio_per_proc=8192
> sysctl vfs.aio.max_aio_queue=65536
> sysctl vm.pageout_oom_seq=120
> sysctl vm.pfault_oom_attempts=-1 
> 

Just tried these settings on a Pi4, 8GB. Seemingly no help,
build of www/chromium failed again, saying only:

===> Compilation failed unexpectedly.
Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
the maintainer.
*** Error code 1

No messages on the console at all, no indication of any swap use at all.
If somebody can tell me how to invoke MAKE_JOBS_UNSAFE=yes, either
locally or globally, I'll give it a try. But, if it's a system problem
I'd expect at least a peep on the console....
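
On the MAKE_JOBS_UNSAFE question: as far as I know it is an ordinary make
variable in the ports framework, so it can be given on the make command line
for a single build or set in a make.conf file for every build. The sketch
below assumes that; set_make_var is a made-up helper, and the poudriere
make.conf location mentioned in the comments is an assumption to verify
locally:

```sh
# set_make_var FILE VAR VALUE: append "VAR=VALUE" to FILE unless FILE
# already has a line starting with "VAR=".
set_make_var() {
    if ! grep -q "^$2=" "$1" 2>/dev/null; then
        printf '%s=%s\n' "$2" "$3" >> "$1"
    fi
}

# Locally, for one build from the ports tree:
#   make -C /usr/ports/www/chromium MAKE_JOBS_UNSAFE=yes
# Globally (affects every port built with that make.conf):
#   set_make_var /etc/make.conf MAKE_JOBS_UNSAFE yes
# For poudriere builds, the file is assumed to be
#   /usr/local/etc/poudriere.d/make.conf
```

Setting it globally slows every port down to a single make job, so the
one-off form is probably the better first experiment.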

Thanks for reading,

bob prohaska




Re: alpha-1 armv7 git failed: fatal: pack is corrupted (SHA1 mismatch)

2023-08-13 Thread bob prohaska
On Sun, Aug 13, 2023 at 12:45:12PM -0700, Mark Millard wrote:
> 
> Wow. I'm going to suggest doing a clone (to a temporary
> place) on one or more different types of system, such
> as aarch64 or amd64. If, say, aarch64 works but armv7
> does not, then the corruption may well be in some armv7
> FreeBSD handling of data transfers, in places common
> to both https:// and ssh:// use.
> 
That seems to have worked on a Pi4 8GB running -current:

root@nemesis:/usr # mv src src.old
root@nemesis:/usr # git clone  -o  freebsd 
ssh://anon...@git.freebsd.org/src.git /usr/src
Cloning into '/usr/src'...
The authenticity of host 'git.freebsd.org (96.47.72.109)' can't be established.
ED25519 key fingerprint is SHA256:y1ljKrKMD3lDObRUG3xJ9gXwEIuqnh306tSyFd1tuZE.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'git.freebsd.org' (ED25519) to the list of known 
hosts.
remote: Enumerating objects: 4323641, done.
remote: Counting objects: 100% (381285/381285), done.
remote: Compressing objects: 100% (28204/28204), done.
remote: Total 4323641 (delta 375527), reused 353081 (delta 353081), pack-reused 
3942356
Receiving objects: 100% (4323641/4323641), 1.54 GiB | 390.00 KiB/s, done.
Resolving deltas: 100% (3432012/3432012), done.
Checking objects: 100% (16777216/16777216), done.
Updating files: 100% (95944/95944), done.
root@nemesis:/usr # 


> Note that, if you get a good clone, you can locally
> copy the tree over to the armv7 media. But that is
> not the point of my suggestion above.

Under the circumstances it seems like the path of
least resistance. Can I do something simple like
sftp, using get -r ? Any trick to updating the copy? 
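
A git clone keeps its history in the .git directory, so any recursive copy
that includes dotfiles should carry everything needed, and the copy can then
be updated with git pull like the original. A sketch using a tar pipe;
copy_tree is a hypothetical helper name, and whether sftp's get -r preserves
everything git needs is not something tested here:

```sh
# copy_tree SRC DST: copy directory SRC (including dotfiles such as .git)
# into DST, preserving permissions, using a tar pipe.
copy_tree() {
    mkdir -p "$2" &&
    (cd "$1" && tar -cf - .) | (cd "$2" && tar -xpf -)
}

# After copying a clone, the copy can be updated in place, e.g.:
#   git -C /usr/src pull --ff-only
```

The same pipe works across machines by running one side of it under ssh.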

Many thanks!

bob prohaska




Re: alpha-1 armv7 git failed: fatal: pack is corrupted (SHA1 mismatch)

2023-08-13 Thread bob prohaska
On Sat, Aug 12, 2023 at 08:45:54PM -0700, Mark Millard wrote:
> 
> You might need to use the ssh alternative if your
> context allows it:
> 
> ssh://anon...@git.freebsd.org/src.git
> 
> There was a time when git fetch proved unreliable
> in my context and I got around it via ssh use. It
> also took less time transferring.
> 
Seemingly no dice:

# git clone  -o  freebsd ssh://anon...@git.freebsd.org/src.git /usr/src
Cloning into '/usr/src'...
remote: Enumerating objects: 4323641, done.
remote: Counting objects: 100% (381285/381285), done.
remote: Compressing objects: 100% (28204/28204), done.
remote: Total 4323641 (delta 375529), reused 353081 (delta 353081), pack-reused 
3942356
Receiving objects: 100% (4323641/4323641), 1.54 GiB | 656.00 KiB/s, done.
fatal: pack is corrupted (SHA1 mismatch)
fatal: fetch-pack: invalid index-pack output
#

I've added freebsd-current to the cc list 8-(

Thanks for writing!

bob prohaska




Re: Using etcupdate resolve, was Re: Surprise null root password

2023-06-15 Thread bob prohaska
I want to thank Patrick, Dmitry and Mark for providing
orientation sufficient to make some headway. The Handbook
simply says 

"If etcupdate(8) is not able to merge a file automatically, 
the merge conflicts can be resolved with manual interaction 
by issuing:

# etcupdate resolve

While not wrong, it's certainly less than the whole story 8-)

It's unfortunate that the example posted was a trivial case;
certainly I didn't tamper with BSD.tests.dist and tf was the
correct response.

If I'm understanding correctly, the file presented by the
df and e options contains essentially all possible versions,
delimited by <<<<<<<<<<<<<, ||| and >>>>>>>>> 
characters. Once edited, that will become the new local
version of the file. If this is mistaken please say so.
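
To make that concrete, here is a sketch of what a hand edit amounts to when
the "new" side is the one wanted. It assumes the usual seven-character
markers written by merge(1) ("<<<<<<< yours", "||||||| original", "=======",
">>>>>>> new"); keep_new is an illustrative name, not part of etcupdate:

```sh
# keep_new FILE: print FILE with every conflict block replaced by the
# lines from its "new" side; the marker lines themselves are dropped.
keep_new() {
    awk '
        index($0, "<<<<<<< ") == 1               { side = "yours"; next }
        side != "" && index($0, "||||||| ") == 1 { side = "original"; next }
        side != "" && $0 == "======="            { side = "new"; next }
        side != "" && index($0, ">>>>>>> ") == 1 { side = ""; next }
        side == "" || side == "new"              { print }
    ' "$1"
}
```

Whatever the method, the finished file should contain only the wanted lines
and none of the marker lines.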

bob prohaska




Using etcupdate resolve, was Re: Surprise null root password

2023-06-15 Thread bob prohaska
Here's an example of the puzzles faced when using etcupdate
that have so far proved baffling:

On running etcupdate resolve, the system reports

Resolving conflict in '/etc/mtree/BSD.tests.dist':
Select: (p) postpone, (df) diff-full, (e) edit,
(h) help for more options: df
--- /etc/mtree/BSD.tests.dist   2023-05-29 08:29:48.174762000 -0700
+++ /var/db/etcupdate/conflicts/etc/mtree/BSD.tests.dist   2023-06-13 22:55:04.284491000 -0700
@@ -442,6 +442,16 @@
 ..
 ifconfig
 ..
+<<<<<<< yours
+||||||| original
+md5
+..
+=======
+ipfw
+..
+md5
+..
+>>>>>>> new
 mdconfig
 ..
 nvmecontrol
Select: (p) postpone, (df) diff-full, (e) edit,
(h) help for more options: e

Selecting option e for edit brings up what appears to be a
vi window, using search I can find the line with mdconfig:

<<<<<<< yours
||||||| original
md5
..
=======
ipfw
..
md5
..
>>>>>>> new
mdconfig
..
nvmecontrol
..
pfctl
files
..
..
ping
..

The puzzle at this point is what to do. It looks like the
points of interest are the lines marked "yours" and "new",
but I'll admit to bafflement over which to modify and whether
the modifications needed include the <<<< and >>>>> characters.

If there's a relevant man section please point it out.  
 
Thanks for reading,

bob prohaska
 





Re: Surprise null root password

2023-05-30 Thread bob prohaska
On Tue, May 30, 2023 at 11:02:13AM -0700, Mark Millard wrote:
> bob prohaska  wrote on
> Date: Tue, 30 May 2023 15:36:21 UTC :
> 
> > On Tue, May 30, 2023 at 08:41:33AM +0200, Alexander Leidinger wrote:
> > > 
> > > Quoting bob prohaska  (from Fri, 26 May 2023 16:26:06
> > > -0700):
> > > 
> > > > On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote:
> > > > > 
> > > > > The question is how you update the configuration files,
> > > > > mergemaster/etcupdate/something else?
> > > > > 
> > > > 
> > > > Via etcupdate after installworld. In the event the system
> > > > requests manual intervention I accept "theirs all". It seems
> > > > odd if that can null a root password.
> > > > 
> > > > Still, it does seem an outside possibility. I could see it adding
> > > > system users, but messing with root's existing password seems a
> > > > bit unexpected.
> > > 
> > > As you are posting to -current@, I expect you to report this issue about
> > > 14-current systems. As such: there was a "recent" change (2021-10-20) to
> > > the root entry to change the shell.
> > > https://cgit.freebsd.org/src/commit/etc/master.passwd?id=d410b585b6f00a26c2de7724d6576a3ea7d548b7
> > > 
> > > By blindly accepting all changes, this has reset the PW to the default
> > > setting (empty).
> > 
> > So it's a line-by-line merge. That's the most sensible explanation
> > available.
> > 
> > > 
> > > I suggest to review changes ("df" instead of "tf" in etcupdate) to at
> > > least those files which you know you have modified, including the
> > > password/group stuff. After that you can decide if the diff which is
> > > shown with "df" can be applied ("tf"), or if you want to keep the old
> > > version ("mf"), or if you want to modify the current file ("e", with
> > > both versions present in the file so that you can copy/paste between
> > > the different versions and keep what you need).
> > > 
> > 
> > The key sequences required to copy and paste between files in the edit
> > screen were elusive. Probably it was thought self-evident, but not for
> > me. I last tried it long ago, via mergemaster. Is there a guide to
> > commands for merging files using etcupdate? Is it in the vi man page?
> > I couldn't find it.
> 
> # man etcupdate
> . . .
> CONFIG FILE
>  The etcupdate utility can also be configured by setting variables in an
>  optional configuration file named /etc/etcupdate.conf.  Note that command
>  line options override settings in the configuration file.  The
>  configuration file is executed by sh(1), so it uses that syntax to set
>  configuration variables.  The following variables can be set:
> 
>  . . .
> 
>  EDITOR  Specify a program to edit merge conflicts.
> . . .
> ENVIRONMENT
>  The etcupdate utility uses the program identified in the EDITOR
>  environment variable to edit merge conflicts.  If EDITOR is not set,
>  vi(1) is used as the default editor.
> 
> 
> 
> So, if you do not want to use vi, you can use either the EDITOR
> environment variable or an EDITOR assignment in
> /etc/etcupdate.conf to change what editor etcupdate uses for
> you to edit merge conflicts with.

My difficulty is precisely a lack of skill with vi, which I've
used and cursed since starting with 386BSD. Evidently I'm a slow
learner. I tried other editors, but vi is the only one always
available.
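
For reference, the EDITOR behavior quoted above can be sketched as follows.
This only illustrates the documented precedence (a config file read by sh,
with vi as the fallback) and is not etcupdate's own code; editor_for is a
made-up name:

```sh
# editor_for CONF: print the editor that would be chosen if CONF (an
# sh-syntax config file) is read first and vi is the fallback.
editor_for() {
    (
        [ -r "$1" ] && . "$1"
        echo "${EDITOR:-vi}"
    )
}

# One-off override from the environment:
#   EDITOR=ee etcupdate resolve
# Persistent override: put a line such as "EDITOR=ee" in /etc/etcupdate.conf
```

ee(1) is mentioned only as an example of a simpler editor in the base system.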

For the moment, etcupdate isn't asking for manual intervention.
When it next does I'll pay closer attention and ask better questions.

Thanks to you in particular and everybody else who has helped!

bob prohaska




Re: Surprise null root password

2023-05-30 Thread bob prohaska
On Tue, May 30, 2023 at 08:41:33AM +0200, Alexander Leidinger wrote:
> 
> Quoting bob prohaska  (from Fri, 26 May 2023 16:26:06
> -0700):
> 
> > On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote:
> > > 
> > > The question is how you update the configuration files,
> > > mergemaster/etcupdate/something else?
> > > 
> > 
> > Via etcupdate after installworld. In the event the system
> > requests manual intervention I accept "theirs all". It seems
> > odd if that can null a root password.
> > 
> > Still, it does seem an outside possibility. I could see it adding
> > system users, but messing with root's existing password seems a
> > bit unexpected.
> 
> As you are posting to -current@, I expect you to report this issue about
> 14-current systems. As such: there was a "recent" change (2021-10-20) to the
> root entry to change the shell.
> https://cgit.freebsd.org/src/commit/etc/master.passwd?id=d410b585b6f00a26c2de7724d6576a3ea7d548b7
> 
> By blindly accepting all changes, this has reset the PW to the default
> setting (empty).

So it's a line-by-line merge. That's the most sensible explanation available.

> 
> I suggest to review changes ("df" instead of "tf" in etcupdate) to at least
> those files which you know you have modified, including the password/group
> stuff. After that you can decide if the diff which is shown with "df" can be
> applied ("tf"), or if you want to keep the old version ("mf"), or if you
> want to modify the current file ("e", with both versions present in the file
> so that you can copy/paste between the different versions and keep what you
> need).
> 

The key sequences required to copy and paste between files in the edit screen
were elusive. Probably they were thought self-evident, but not to me. I last
tried it long ago, via mergemaster. Is there a guide to commands for merging
files using etcupdate? Is it in the vi man page? I couldn't find it.

Thanks for writing!

bob prohaska




Re: Surprise null root password

2023-05-26 Thread bob prohaska
It turns out all seven hosts in my cluster report
a null password for root in /usr/src/etc/master.passwd:
root::0:0::0:0:Charlie &:/root:/bin/sh

Is that intentional?

Thanks for reading,

bob prohaska




Re: Surprise null root password

2023-05-26 Thread bob prohaska
On Fri, May 26, 2023 at 10:55:49PM +0200, Yuri wrote:
> 
> The question is how you update the configuration files,
> mergemaster/etcupdate/something else?
> 

Via etcupdate after installworld. In the event the system
requests manual intervention I accept "theirs all". It seems
odd if that can null a root password.

Still, it does seem an outside possibility. I could see it adding
system users, but messing with root's existing password seems a
bit unexpected.  

Thanks very much for raising the point!

bob prohaska




Re: Surprise null root password

2023-05-26 Thread bob prohaska
On Fri, May 26, 2023 at 07:48:04PM +0100, Ben Laurie wrote:
> -T on ls will give you full time resolution...
> 
More's the wonder:
root@www:/usr/src # ls -lT /etc/*p*wd*
-rw-------  1 root  wheel   2099 May 10 17:20:33 2023 /etc/master.passwd
-rw-r--r--  1 root  wheel   1831 May 10 17:20:33 2023 /etc/passwd
-rw-r--r--  1 root  wheel  40960 May 10 17:20:33 2023 /etc/pwd.db
-rw-------  1 root  wheel  40960 May 10 17:20:33 2023 /etc/spwd.db

For sake of clarity, /etc/master.passwd's root line is
root::0:0::0:0:Charlie &:/root:/bin/sh
while /etc/passwd's root line is
root:*:0:0:Charlie &:/root:/bin/sh

I just noticed a second host (Pi3) which is similarly affected.
It completed a build/install cycle on May 25, uname -a yields
FreeBSD www.zefox.org 14.0-CURRENT FreeBSD 14.0-CURRENT #46 
main-n263122-57a3a161a92f: Thu May 25 21:25:57 PDT 2023 
b...@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64

On this host I get
root@www:/usr/src # ls -lT /etc/*p*wd*
-rw-------  1 root  wheel   1796 Nov 12 16:00:03 2022 /etc/master.passwd
-rw-r--r--  1 root  wheel   2430 Oct  1 19:40:22 2020 /etc/passwd
-rw-r--r--  1 root  wheel  40960 Oct  1 19:40:22 2020 /etc/pwd.db
-rw-------  1 root  wheel  40960 Oct  1 19:40:22 2020 /etc/spwd.db
(at least the dates make more sense)

The root line in /etc/master.passwd is
root::0:0::0:0:Charlie &:/root:/bin/sh

I didn't catch any null password reports in the security emails,
most likely through lack of attention. As with the first case,
passwords seem to work normally (null rejected, normal accepted).

Any advice appreciated!

bob prohaska




> On Fri, 26 May 2023 at 19:45, bob prohaska  wrote:
> 
> > On Fri, May 26, 2023 at 01:03:19PM -0500, Mike Karels wrote:
> > > On 26 May 2023, at 12:35, bob prohaska wrote:
> > >
> > > > While going through normal security email from a Pi2
> > > > running -current I was disturbed to find:
> > > >
> > > > Checking for passwordless accounts:
> > > > root::0:0::0:0:Charlie &:/root:/bin/sh
> > > >
> > [details snipped]
> > > /etc/master.passwd is the source, but the operational database
> > > is /etc/spwd.db.  You should check the date on it as well.
> > > You can rebuild it with "pwd_mkdb -p /etc/master.passwd".
> >
> > At present the host reports:
> > root@www:/usr/src # ls -l /etc/*p*wd*
> > -rw-------  1 root  wheel   2099 May 10 17:20 /etc/master.passwd
> > -rw-r--r--  1 root  wheel   1831 May 10 17:20 /etc/passwd
> > -rw-r--r--  1 root  wheel  40960 May 10 17:20 /etc/pwd.db
> > -rw-------  1 root  wheel  40960 May 10 17:20 /etc/spwd.db
> >
> > /etc/master.passwd reports a null password for root, /etc/passwd
> > has the usual asterisk. The running system reports
> > root@www:/usr/src # uname -a
> > FreeBSD www.zefox.com 14.0-CURRENT FreeBSD 14.0-CURRENT #25
> > main-743516d51f: Thu May 18 00:08:40 PDT 2023 
> > b...@www.zefox.com:/usr/obj/usr/src/arm.armv7/sys/GENERIC
> > arm
> > root@www:/usr/src # uname -KU
> > 1400088 1400088
> >
> > I've never manually run pwd_mkdb and most certainly
> > never set a null password for root. It looks rather
> > as if a null password was set for root within one
> > minute after running pwd_mkdb.
> >
> > At this point I'm unsure how to sort out what happened.
> > The obvious next step is to re-establish a non-null
> > root password and rebuild both databases.
> >
> > Is it worthwhile to check for backdoors? There's no
> > evidence to suggest any malicious action (and plenty
> > of stupidity on my end) but the tale is getting
> > curiouser and curiouser.
> >
> > Many thanks for the quick reply!
> >
> > bob prohaska
> >
> >
> >
> >
> >



Re: Surprise null root password

2023-05-26 Thread bob prohaska
On Fri, May 26, 2023 at 01:03:19PM -0500, Mike Karels wrote:
> On 26 May 2023, at 12:35, bob prohaska wrote:
> 
> > While going through normal security email from a Pi2
> > running -current I was disturbed to find:
> >
> > Checking for passwordless accounts:
> > root::0:0::0:0:Charlie &:/root:/bin/sh
> >
[details snipped] 
> /etc/master.passwd is the source, but the operational database
> is /etc/spwd.db.  You should check the date on it as well.
> You can rebuild it with "pwd_mkdb -p /etc/master.passwd".

At present the host reports:
root@www:/usr/src # ls -l /etc/*p*wd*
-rw-------  1 root  wheel   2099 May 10 17:20 /etc/master.passwd
-rw-r--r--  1 root  wheel   1831 May 10 17:20 /etc/passwd
-rw-r--r--  1 root  wheel  40960 May 10 17:20 /etc/pwd.db
-rw-------  1 root  wheel  40960 May 10 17:20 /etc/spwd.db

/etc/master.passwd reports a null password for root, /etc/passwd
has the usual asterisk. The running system reports
root@www:/usr/src # uname -a
FreeBSD www.zefox.com 14.0-CURRENT FreeBSD 14.0-CURRENT #25 main-743516d51f: 
Thu May 18 00:08:40 PDT 2023 
b...@www.zefox.com:/usr/obj/usr/src/arm.armv7/sys/GENERIC arm
root@www:/usr/src # uname -KU
1400088 1400088

I've never manually run pwd_mkdb and most certainly
never set a null password for root. It looks rather
as if a null password was set for root within one
minute after running pwd_mkdb.

At this point I'm unsure how to sort out what happened.
The obvious next step is to re-establish a non-null
root password and rebuild both databases. 

Is it worthwhile to check for backdoors? There's no
evidence to suggest any malicious action (and plenty
of stupidity on my end) but the tale is getting
curiouser and curiouser.

Many thanks for the quick reply!

bob prohaska
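
The advice quoted above, to check the date on /etc/spwd.db as well as on /etc/master.passwd, can be sketched as a small modification-time comparison. This is only an illustration: the function name database_is_stale is made up here, and a newer source file is merely a hint that pwd_mkdb has not been run since the last edit, not proof of it.

```python
import os


def database_is_stale(source_path, db_path):
    """Return True when the source file was modified after the database file.

    Hypothetical helper: it compares modification times only and does not
    look inside either file.
    """
    return os.path.getmtime(source_path) > os.path.getmtime(db_path)


if __name__ == "__main__":
    # Paths taken from the thread; skipped quietly on systems without them.
    source, database = "/etc/master.passwd", "/etc/spwd.db"
    if os.path.exists(source) and os.path.exists(database):
        print(source, "newer than", database, ":",
              database_is_stale(source, database))
```

A True result would be consistent with a database that still holds the old password hash while the text file shows a different one.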
 





Surprise null root password

2023-05-26 Thread bob prohaska
While going through normal security email from a Pi2
running -current I was disturbed to find:

Checking for passwordless accounts:
root::0:0::0:0:Charlie &:/root:/bin/sh

The machine had locked up on a -j4 buildworld since
sending the mail, so it was taken off the net, power
cycled and started single-user.

Sure enough, /etc/master.passwd contained a
null password for root, but the last modification
to the file was two weeks ago according to ls -l.

Stranger still, when fsck'd and brought up multi-user,
the normal password was still honored and a null
password rejected for both regular and root accounts.

AFAIK, /etc/master.passwd is _the_ password repository,
but clearly I'm wrong.

If somebody can tell me what's going on and what to
check for before placing the machine back on line
it would be much appreciated.

Thanks for reading,

bob prohaska
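
For readers who want to reproduce the "passwordless accounts" check by hand, here is a minimal sketch. It assumes the ten-field master.passwd(5) layout (name:password:uid:gid:class:change:expire:gecos:home:shell) and is not the script the periodic security run actually uses; the function name and the sample list are illustrative only.

```python
def passwordless_accounts(lines):
    """Return the login names whose password field (second field) is empty."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) >= 2 and fields[1] == "":
            names.append(fields[0])
    return names


# Sample data: the first entry is the line reported in the thread, the
# second is included only as a contrasting entry with a locked password.
sample = [
    "root::0:0::0:0:Charlie &:/root:/bin/sh",
    "daemon:*:1:1::0:0:Owner of many system processes:/root:/usr/sbin/nologin",
]
print(passwordless_accounts(sample))  # prints ['root']
```

An asterisk in the password field, as in /etc/passwd, is not an empty field, which is why only the master.passwd line is flagged.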




Re: Stray characters in command history

2023-05-21 Thread bob prohaska
Here is another example, perhaps a bit clearer.

The ssh connection to the first Pi3 in the chain had dropped, so it was
re-established via a regular user login, then su was invoked and tip run:
.
To change this login announcement, see motd(5).
Want to go the directory you were just in?
Type "cd -"
bob@pelorus:~ % su
Password:
# tip ucom
Stale lock on cuaU0 PID=2487... overriding.
connected
osed by r31 www s         <<<<< This appeared spontaneously, then I hit return.
osed: Command not found.  <<<<< I didn't type anything.
bob@www:/usr/src %        <<<<< The shell prompt on the 2nd Pi3's serial console.

Clearly nothing to do with sshd. Might it simply be a misdirected echo of
console output generated by a (dead or broken) tip connection? The first
example looked possibly malicious; this does not.

Thanks for reading,

bob prohaska



On Sun, May 21, 2023 at 06:49:33AM -0700, bob prohaska wrote:
> Lately I've been playing with buildworld on a Pi3 running -current. The same
> machine acts as a terminal server for a second Pi3 also running -current in my
> "cluster". I ssh into the first Pi3, su to root and run tip to pick up a serial
> connection to the second Pi's console. Both machines are within a week of
> up-to-date.
> 
> While building world on the first Pi3 the ssh connection frequently drops and
> must be re-established. If there was a shell running on the serial console of
> the second Pi3 it typically keeps running and when the tip session is restarted
> disgorges a stream of accumulated console messages.
> 
> This morning the same thing happened, but I noticed something odd. The stream
> of messages (all login failure announcements from ssh) ended with
> 
> 
> May 21 06:15:00 www sshd[33562]: error: Fssh_kex_exchange_identification: banner line contains invalid characters
> *+May 21 06:15:19 www sshd[33565]: error: Fssh_kex_exchange_identification: Connection closed by remote host
> May 21 06:15:33 www sshd[33613]: error: Protocol major versions differ: 2 vs. 1
> 
> At that point I hit carriage return and got
> *+: No match.
> 
> I did not type the *+ so it looks like the characters were somehow introduced
> elsewhere, possibly from the ssh failure message. How they got into the command
> stream is unclear.
> 
> This strikes me as undesirable at best and possibly much worse. The shell
> reporting the "no match" was a regular user shell, but if I'd been su'd to root
> it appears that the unmatched characters would be seen by the root shell as
> input.
> 
> This more-or-less fits with the pattern seen earlier with reboots observed via
> serial console halting on un-typed keystrokes. Those halts were attributed to
> electrical noise on the serial line, but this looks like something injected via
> part of the network login process. Reboot pauses have been an ongoing
> phenomenon for months; this is the first time I've noticed the "invalid
> characters" message from ssh on the console.
> 
> Thanks for reading, apologies if I'm being a worrywart.
> 
> bob prohaska
> 
>  
> 



Stray characters in command history

2023-05-21 Thread bob prohaska
Lately I've been playing with buildworld on a Pi3 running -current. The same
machine acts as a terminal server for a second Pi3 also running -current in my
"cluster". I ssh into the first Pi3, su to root and run tip to pick up a serial
connection to the second Pi's console. Both machines are within a week of
up-to-date.

While building world on the first Pi3 the ssh connection frequently drops and
must be re-established. If there was a shell running on the serial console of
the second Pi3 it typically keeps running and when the tip session is restarted
disgorges a stream of accumulated console messages.

This morning the same thing happened, but I noticed something odd. The stream of
messages (all login failure announcements from ssh) ended with


May 21 06:15:00 www sshd[33562]: error: Fssh_kex_exchange_identification: banner line contains invalid characters
*+May 21 06:15:19 www sshd[33565]: error: Fssh_kex_exchange_identification: Connection closed by remote host
May 21 06:15:33 www sshd[33613]: error: Protocol major versions differ: 2 vs. 1

At that point I hit carriage return and got
*+: No match.

I did not type the *+ so it looks like the characters were somehow introduced
elsewhere, possibly from the ssh failure message. How they got into the command
stream is unclear.

This strikes me as undesirable at best and possibly much worse. The shell
reporting the "no match" was a regular user shell, but if I'd been su'd to root
it appears that the unmatched characters would be seen by the root shell as
input.

This more-or-less fits with the pattern seen earlier with reboots observed via
serial console halting on un-typed keystrokes. Those halts were attributed to
electrical noise on the serial line, but this looks like something injected via
part of the network login process. Reboot pauses have been an ongoing phenomenon
for months; this is the first time I've noticed the "invalid characters" message
from ssh on the console.

Thanks for reading, apologies if I'm being a worrywart.

bob prohaska

 



Re: Making -current machines accept mail from sendmail

2023-03-04 Thread bob prohaska
On Sat, Mar 04, 2023 at 10:57:59AM -0800, David Wolfskill wrote:
> 
> You might start with checking the output of "sockstat -l" on the machine
> that is intended to receive the mail: SMTP is expected to be on 25/tcp.
> 
> If the intended recipient machine does NOT show that 25/tcp is being
> listened to, you will need to (install &) start a process to do so.
> That may well involve installing (& starting) some MTA -- whether
> sendmail, postfix, exim, or even qmail (or something else).
> 
> (I expect that nothing is listening on 25/tcp, as that is what
> "connection refused" implies.)
> 

Indeed, that's the case. It looks as if dma isn't intended 
to replace sendmail, so I'll take the hint in UPDATING and
turn sendmail back on.

Thank you!

bob prohaska
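
The "connection refused" symptom discussed above can also be checked from the sending side. The following is a rough sketch, not a replacement for "sockstat -l" on the receiver: it only reports whether a TCP connection to the given host and port succeeds. The function name and the loopback address in the usage line are placeholders.

```python
import socket


def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts alike.
        return False


if __name__ == "__main__":
    # 25/tcp is the SMTP port mentioned in the thread; 127.0.0.1 is a placeholder.
    print(port_accepts_connections("127.0.0.1", 25))
```

A False result here does not distinguish a missing listener from a firewall; it only shows that nothing answered on that port.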




Making -current machines accept mail from sendmail

2023-03-04 Thread bob prohaska
Is there some special step to turn on dma so hosts
can receive email from a sendmail-using host? 

I've got three hosts using 12/stable (hence sendmail)
and a few more running -current (hence dma). The -stable
hosts report "connection refused" when sending to -current,
but -current has no trouble sending to -stable. 

On a fresh reboot I don't see any reference to dma in the
dmesg output, and ps -aux reports only busdma. 

Thanks for reading,

bob prohaska





Re: Timekeeping problem in /usr/src on new RPI aarch64 snapshot

2023-02-25 Thread bob prohaska
On Sat, Feb 25, 2023 at 10:33:23AM -0600, Mike Karels wrote:
> On 25 Feb 2023, at 10:16, bob prohaska wrote:
> 
> > On Sat, Feb 25, 2023 at 12:21:09AM +0100, Ronald Klop wrote:
> >>
> >> UFS stores the current timestamp in the superblock of the FS on clean
> >> shutdown/unmount. On boot it reads the time from the timestamp in the
> >> superblock of the root FS. Of course this timestamp is behind for the
> >> duration that the machine was off or rebooting so you need to adjust that
> >> using ntp. For ZFS root you can use the fakertc port to do something
> >> similar.
> >>
> >>
> > Mark Millard points out:
> >  /etc/localtime        Current zoneinfo file, see tzsetup(8) and zic(8).
> >
> >  /etc/wall_cmos_clock  Empty file.  Its presence indicates that the
> >                        machine's CMOS clock is set to local time, while
> >                        its absence indicates a UTC CMOS clock.
> >
> > Since there is no /etc/wall_cmos_clock on the newly-installed filesystem
> > it appears the superblock timestamp is then interpreted as UTC when a Pi
> > boots, using whatever happens to be set in /etc/localtime. My confusion
> > is reduced somewhat. On first boot, what is the state of /etc/localtime?
> >
> > I've neglected to run tzsetup immediately on many previous installations
> > and not noticed any complaints about mis-set clocks in buildworld. Is this
> > new behavior?
> 
> /etc/localtime is used in formatting dates (e.g. for ls), but is not
> involved in storage of timestamps.  Timestamps on files, system time, etc,
> are all in UTC.  So the system should act normally if there is no
> /etc/localtime, and after one is added.

Does formatting include calculating offsets from UTC for display?

On at least a couple of installs I've observed date reporting UTC time. 
After running tzsetup, set to PST, date then reported the same numerical
time with a PST time zone. This happened very early in an installation
lifecycle and seemed to just "go away" after a few reboots, though I
never paid close attention since it caused no complaints.

Thanks for replying!

bob prohaska
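
The point made in the quoted reply, that timestamps are stored in UTC and the time zone only affects how they are displayed, can be illustrated with a short sketch. The fixed UTC-8 offset below is an assumption standing in for whatever /etc/localtime selects, and the epoch value was chosen to match the date output reported elsewhere in this thread.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed-offset zone standing in for /etc/localtime (PST = UTC-8).
PST = timezone(timedelta(hours=-8), "PST")


def render(epoch_seconds, tz):
    """Format one stored timestamp for display in the given zone."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime("%Y-%m-%d %H:%M:%S %Z")


stamp = 1677271781  # seconds since the epoch; the stored value never changes

print(render(stamp, timezone.utc))  # 2023-02-24 20:49:41 UTC
print(render(stamp, PST))           # 2023-02-24 12:49:41 PST
```

If changing the zone altered only the label while the digits stayed the same, that would suggest the clock itself had been set as if local time were UTC, which is a separate matter from display formatting.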
 



Re: Timekeeping problem in /usr/src on new RPI aarch64 snapshot

2023-02-25 Thread bob prohaska
On Sat, Feb 25, 2023 at 12:21:09AM +0100, Ronald Klop wrote:
> 
> UFS stores the current timestamp in the superblock of the FS on clean
> shutdown/unmount. On boot it reads the time from the timestamp in the
> superblock of the root FS. Of course this timestamp is behind for the
> duration that the machine was off or rebooting so you need to adjust that
> using ntp. For ZFS root you can use the fakertc port to do something
> similar.
> 
> 
Mark Millard points out:
 /etc/localtime        Current zoneinfo file, see tzsetup(8) and zic(8).

 /etc/wall_cmos_clock  Empty file.  Its presence indicates that the
                       machine's CMOS clock is set to local time, while
                       its absence indicates a UTC CMOS clock.

Since there is no /etc/wall_cmos_clock on the newly-installed filesystem
it appears the superblock timestamp is then interpreted as UTC when a Pi
boots, using whatever happens to be set in /etc/localtime. My confusion
is reduced somewhat. On first boot, what is the state of /etc/localtime?

I've neglected to run tzsetup immediately on many previous installations
and not noticed any complaints about mis-set clocks in buildworld. Is this
new behavior?

Thanks to both Mark and Ronald!

bob prohaska




Timekeeping problem in /usr/src on new RPI aarch64 snapshot

2023-02-24 Thread bob prohaska
After installing 
FreeBSD-14.0-CURRENT-arm64-aarch64-RPI-20230223-fe5c211ba873-261074.img
on a Pi3 and setting up the hard disk to use separate swap and /usr partitions
an oddity came to light regarding dates.

The image file was written to disk the night of the 23rd, from a Pi3 with
a correctly-set time and date. It was left idle overnight, configured the
morning of the 24th and booted without issue. It then cloned -current into
/usr/src, at which point the time was noticed to be wrong, apparently fast.

It turned out ntpdate wasn't running, so it was started and then tzsetup
run. After a reboot the time reported correctly. 

However, make buildworld in /usr/src triggers an exhortation to "check
your time" and refuses to run further.

Running date on the system reports
Fri Feb 24 12:49:41 PST 2023
but ls -l /usr/src reports time around 
Feb 24 19:10
an obvious inconsistency.
 
Presumably just waiting until the system clock catches
up with the /usr/src timestamps will fix the error. Is
there a better method?

Still, the date and time handling don't seem quite right. 
In at least one earlier instance it appeared that tzsetup 
altered the reported time zone without shifting the time
stamp by the UTC/PDT offset. I always thought timestamps
were internally in UTC+timezone, displayed with the right
offset. It looks to a casual observer like something else
is going on. 

An earlier fiasco (on this same Pi3) included wildly wrong
timestamps in a filesystem. The Pi3 has no hardware clock, so
how does it set the time when booted without a reference?

Thanks for reading,

bob prohaska




Turning security email back on

2022-11-15 Thread bob prohaska
It looks as if daily email reporting login failures and system status
are no longer being sent out on -current. Is there a switch for 
/etc/rc.conf that will turn them back on? 

Thanks for reading,

bob prohaska




Seeking an idiot's guide to etcupdate/mergemaster

2022-11-05 Thread bob prohaska
On Mon, Oct 24, 2022 at 08:32:17PM -0700, Mark Millard wrote:
> 
> Your /etc/rc.d/ldconfig script seems to have not been updated
> by use of etcupdate or mergemaster or other such. (How much
> else is also out of date? How much of what you have for /etc/
> and the like goes back to 2022-Jan-07 or before?)
>
 
Alas, that is too true. The system was set up on July 2, 2020
and I've never managed to make sense of either mergemaster or
etcupdate. Far as I could tell it didn't matter, the host ran
correctly, until now.

It's been transplanted to a new hard drive, which allows the
installation of a ports tree. Ports don't install because of
the stale /etc/rc.d/ldconfig file.

Since no changes have been made to /etc/ apart from /etc/rc.conf
is it possible to simply let mergemaster or etcupdate install
the latest defaults? I have looked at the manpage for etcupdate
and didn't recognize any straightforward way to simply accept
all updates. This particular system is expendable, so I'd be
glad to try things that might not work well, or at all. 

Apologies if I'm being dumb (probably guilty) or lazy (definitely
guilty). The barrage of questions generated by etcupdate  and
mergemaster is simply overwhelming. And, I suspect, largely 
unnecessary.   
 
Thanks for reading!

bob prohaska




Buildworld stops with ld: error: undefined symbol: AcpiWarning on -current

2022-04-05 Thread bob prohaska
A Pi3 running -current is stopping with 
ld: error: undefined symbol: AcpiWarning
during buildworld. The sources are up-to-
date as of a few minutes ago. A series of
related "Acpi..." errors follow.

The build command line is 
make -j2 -DWITH_META_MODE  buildworld > buildworld.log

Thanks for reading,

bob prohaska




Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))

2022-03-08 Thread bob prohaska
On Mon, Mar 07, 2022 at 11:45:02AM -0500, Mark Johnston wrote:
> On Mon, Mar 07, 2022 at 04:25:22PM +, Andrew Turner wrote:
> > 
> > > On 7 Mar 2022, at 15:13, Mark Johnston  wrote:
> > > ...
> > > A (the?) problem is that the compiler is treating "pc" as an alias
> > > for x18, but the rmlock code assumes that the pcpu pointer is loaded
> > > once, as it dereferences "pc" outside of the critical section.  On
> > > arm64, if a context switch occurs between the store at _rm_rlock+144 and
> > > the load at +152, and the thread is migrated to another CPU, then we'll
> > > end up using the wrong CPU ID in the rm->rm_writecpus test.
> > > 
> > > I suspect the problem is unique to arm64 as its get_pcpu()
> > > implementation is different from the others in that it doesn't use
> > > volatile-qualified inline assembly.  This has been the case since
> > > > https://cgit.freebsd.org/src/commit/?id=63c858a04d56529eddbddf85ad04fc8e99e73762 .
> > > 
> > > I haven't been able to reproduce any crashes running poudriere in an
> > > arm64 AWS instance, though.  Could you please try the patch below and
> > > confirm whether it fixes your panics?  I verified that the apparent
> > > problem described above is gone with the patch.
> > 
> > Alternatively (or additionally) we could do something like the following. 
> > There are only a few MI users of get_pcpu with the main place being in rm 
> > locks.
> > 
> > diff --git a/sys/arm64/include/pcpu.h b/sys/arm64/include/pcpu.h
> > index 09f6361c651c..59b890e5c2ea 100644
> > --- a/sys/arm64/include/pcpu.h
> > +++ b/sys/arm64/include/pcpu.h
> > @@ -58,7 +58,14 @@ struct pcpu;
> > 
> >  register struct pcpu *pcpup __asm ("x18");
> > 
> > > -#define get_pcpu()  pcpup
> > +static inline struct pcpu *
> > +get_pcpu(void)
> > +{
> > +   struct pcpu *pcpu;
> > +
> > > +   __asm __volatile("mov   %0, x18" : "=&r"(pcpu));
> > +   return (pcpu);
> > +}
> > 
> >  static inline struct thread *
> >  get_curthread(void)
> 
> Indeed, I think this is probably the best solution.

Just for fun I tried the patch on a Pi3 running -current, updated a day or two
prior. The patch applied, compiled and seemed to run acceptably, but when I 
left a -j2 -DWITH_META_MODE buildworld running it crashed overnight, reporting


login: panic: rm_rlock: recursed on non-recursive rmlock sysctl lock @ /usr/src/sys/kern/kern_sysctl.c:193

cpuid = 0
time = 1646720264
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
_rm_rlock_debug() at _rm_rlock_debug+0x214
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x140
sysctl_root() at sysctl_root+0x1ac
userland_sysctl() at userland_sysctl+0x140
sys___sysctl() at sys___sysctl+0x68
do_el0_sync() at do_el0_sync+0x520
handle_el0_sync() at handle_el0_sync+0x40
--- exception, esr 0x5600
KDB: enter: panic
[ thread pid 869 tid 100091 ]
Stopped at  kdb_enter+0x44: undefined   f902011f


I tried typing bt at the debugger prompt but got no more output. 

I've put the buildworld log file at
http://www.zefox.net/~fbsd/rpi3/crashes/20220307/

Hope this is of some use

bob prohaska





Re: 14-CURRENT/aarch64 build problem

2021-06-08 Thread bob prohaska
On Tue, Jun 08, 2021 at 09:15:37PM +0200, Juraj Lutter wrote:
> Hi,
> 
> I'm having a problem building recent 14-CURRENT/aarch64 as of
> 6d2648bcaba9b14e2f5c76680f3e7608e1f125f4:
> 
> --- cddl/lib/libuutil__L ---
> make[4]: make[4]: don't know how to make uu_dprintf.c. Stop
> make[4]: make[4]: don't know how to make uu_open.c. Stop
> `uu_alloc.c' is up to date.
> `uu_avl.c' is up to date.
> `uu_dprintf.c' was not built (being made, type OP_DEPS_FOUND|OP_MARK, flags REMAKE|DONE_WAIT|DONE_ALLSRC|DONECYCLE)!
> ...
> 
> The build is performed with pristine /usr/obj
> 

FWIW, same problem seen here. In an added twist, git pull (hoping for
a fix) fails also:

root@www:/usr/src # git pull
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/legacy': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/legacy'
From https://git.freebsd.org/src
 ! [new branch]  vendor/openzfs/legacy -> freebsd/vendor/openzfs/legacy  (unable to update local ref)
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/master': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/master'
 ! [new branch]  vendor/openzfs/master -> freebsd/vendor/openzfs/master  (unable to update local ref)
error: cannot lock ref 'refs/remotes/freebsd/vendor/openzfs/zfs-2.1-release': 'refs/remotes/freebsd/vendor/openzfs' exists; cannot create 'refs/remotes/freebsd/vendor/openzfs/zfs-2.1-release'
 ! [new branch]  vendor/openzfs/zfs-2.1-release -> freebsd/vendor/openzfs/zfs-2.1-release  (unable to update local ref)

Is this a problem at my end, or the server's end?

Thanks for reading,

bob prohaska
 




Re: URL for stable/13

2021-03-02 Thread bob prohaska
On Tue, Mar 02, 2021 at 09:46:11AM -0700, Warner Losh wrote:
> On Tue, Mar 2, 2021 at 9:18 AM bob prohaska  wrote:
> 
> > A while back I obtained a buildable source tree for stable/13
> > but it hasn't been updated in the last few days. Running
> 
> It would help if you asked a question.
> 

Apparently I didn't understand the correct question to ask.
8-)

When set up I configured git to use --ff-only and simply used
git pull . to update, which seemed to work until a few days ago.

 
> If you'd like to know how to update now that you have this tree, I'd
> suggest 'git pull --rebase' or 'git pull --ff-only'
> 

Now it seems an explicit git pull --ff-only is required.
It just pulled down a substantial crop of updates. Now 
to see if buildworld succeeds. That'll take a while.

> If it's some other question, I'm happy to help with that.
> 

You already have, and I didn't even ask the right question...

Thank you!
bob prohaska


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


URL for stable/13

2021-03-02 Thread bob prohaska
A while back I obtained a buildable source tree for stable/13 
but it hasn't been updated in the last few days. Running
git remote show origin 
reports in part

 Fetch URL: https://git.FreeBSD.org/src.git
  Push  URL: https://git.FreeBSD.org/src.git
  HEAD branch: main
  Remote branches:
main tracked
   
.

 Local branch configured for 'git pull':
stable/13 merges with remote stable/13
  Local ref configured for 'git push':
stable/13 pushes to stable/13 (local out of date)

Thanks for reading; any hints on how to get back in sync would be appreciated.
This is used for self-hosting on a Raspberry Pi, if it matters.  

bob prohaska



Re: Silent hang in buildworld, was Re: Invoking -v for clang during buildworld

2021-01-20 Thread bob prohaska
re /usr/fbsd/mm-src/sys/arm/conf/GENERIC-NODBG
> >> include "GENERIC"
> >> 
> >> ident   GENERIC-NODBG
> >> 
> >> makeoptions DEBUG=-g                # Build kernel with gdb(1) debug symbols
> >> 
> >> options     AUDIT                   # Not enabled by default in armv7/v6 kernels
> >>                                     # Enabled here to allow kyua test runs to
> >>                                     # possibly report auditing works.
> >> 
> >> options     ALT_BREAK_TO_DEBUGGER
> >> 
> >> options     KDB                     # Enable kernel debugger support
> >> 
> >> # For minimum debugger support (stable branch) use:
> >> options     KDB_TRACE               # Print a stack trace for a panic
> >> options     DDB                     # Enable the kernel debugger
> >> 
> >> # Extra stuff:
> >> #options    VERBOSE_SYSINIT=0       # Enable verbose sysinit messages
> >> #options    BOOTVERBOSE=1
> >> #options    BOOTHOWTO=RB_VERBOSE
> >> options     ALT_BREAK_TO_DEBUGGER   # Enter debugger on keyboard escape sequence
> >> options     KLD_DEBUG
> >> #options    KTR
> >> #options    KTR_MASK=KTR_TRAP
> >> ##options   KTR_CPUMASK=0xF
> >> #options    KTR_VERBOSE
> >> 
> >> # Disable any extra checking for. . .
> >> nooptions   INVARIANTS              # Enable calls of extra sanity checking
> >> nooptions   INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
> >> nooptions   WITNESS                 # Enable checks to detect deadlocks and cycles
> >> nooptions   WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
> >> nooptions   DEADLKRES               # Enable the deadlock resolver
> >> nooptions   MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones
> >> nooptions   DIAGNOSTIC
> >> nooptions   BUF_TRACKING
> >> nooptions   FULL_BUF_TRACKING
> >> nooptions   USB_DEBUG
> >> nooptions   USB_REQ_DEBUG
> >> nooptions   USB_VERBOSE
> >> 
> >> The /boot/loader.conf file and the /etc/sysctl.conf files
> >> both contained:
> >> 
> >> vm.pageout_oom_seq=120
> >> vm.pfault_oom_attempts=-1
> >> 
> >> (The hw.physmem=979042304 in /boot/loader.conf was very-special,
> >> to better approximate your environment. I also controlled the
> >> cpu frequency used via a line in /etc/sysctl.conf . I do not
> >> bother with such non-default frequency usage [or related settings]
> >> for RPi*'s with the pre-RPi4B style power connections but do
> >> control the frequency for the OPi+2E.)
> > 
> > The following had been left implicit about my context and
> > how it manages memory space use.
> > 
> > I'll note that I do not use tmpfs or other such memory based
> > file system techniques that could compete for RAM/swap. What
> > is in use for the only file system involved is just the
> > root file system:
> > 
> > # df -m
> > Filesystem         1M-blocks  Used  Avail Capacity  Mounted on
> > /dev/gpt/BPIM3root    195378 63940 115808    36%    /
> > devfs                      0     0      0   100%    /dev
> > 
> > It is a USB SSD. The swap partition is also on that same
> > media. (The BPIM3 based name dates back to before the
> > BPI-M3 power connection failed and I switched to the
> > OPi+2E.)
> > 
> > I'll note that I've started a new from-scratch build without
> > LDFLAGS.lld+= -Wl,--threads=1 . So at some point I'll have
> > information about how much of a difference (+/-) in swap
> > usage it actually made for with vs. without, if any.
> 
> Looks like, for such 4-core contexts, that bothering
> with LDFLAGS.lld+= -Wl,--threads=1 is typically a
> waste of effort for both swap usage and time . . .
> 
> With LDFLAGS.lld+= -Wl,--threads=1 :
> 
> Mem:  . . . , 765700Ki MaxObsActive, 200412Ki MaxObsWired, 954116Ki 
> MaxObs(Act+Wir)
> Swap: . . . , 537588Ki MaxObsUsed
> 
> without:
> 
> Mem:  . . ., 715756Ki MaxObsActive, 194816Ki MaxObsWired, 903132Ki 
> MaxObs(Act+Wir)
> Swap: . . ., 557208Ki MaxObsUsed
> 
> 
> With LDFLAGS.lld+= -Wl,--threads=1 :
> 
> World built in 72960 seconds, ncpu: 4, make -j4
> Kernel(s)  GENERIC-NODBG built in 4998 seconds, ncpu: 4, make -j4
> 
> without:
> 
> World built in 72804 seconds, ncpu: 4, make -j4
> Kernel(s)  GENERIC-NODBG built in 4824 seconds, ncpu: 4, make -j4
> 
> 
> So, just not that much of a difference compared to the overall
> sizes or times involved.
> 
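To put the with/without figures above in relative terms, here is a
minimal Python sketch. It uses only the numbers quoted above; the
helper name pct_change is invented for illustration and is not part
of any FreeBSD tool.

```python
# Relative difference between the two builds, using only the figures
# quoted above. pct_change is a hypothetical helper name.

def pct_change(with_flag, without_flag):
    """Percent change going from the --threads=1 build to the default build."""
    return (without_flag - with_flag) / with_flag * 100.0

# (value with --threads=1, value without)
figures = {
    "MaxObsActive (KiB)": (765700, 715756),
    "MaxObs(Act+Wir) (KiB)": (954116, 903132),
    "MaxObsUsed swap (KiB)": (537588, 557208),
    "buildworld (s)": (72960, 72804),
    "buildkernel (s)": (4998, 4824),
}

for name, (with_flag, without_flag) in figures.items():
    print(f"{name:24s} {pct_change(with_flag, without_flag):+6.2f}%")
```

Every figure moves by well under ten percent, and the buildworld
time by less than one percent, which matches the "not that much of
a difference" reading above.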
A first OS build/install cycle on armv7 (RPi2) using meta mode
finished without trouble. Sources were a day or two newer than
the kernel; the -j4 buildworld took 157121 seconds (about 43.6 hours).
Peak swap use was half again as much at 732932. No constraints on ld.lld
beyond defaults. I'm a little surprised at the extreme slowness,
but this was a fully debug-enabled -current kernel and sources were
slightly newer than the existing world.

In case there's interest I've put what log files I could gather at
http://www.zefox.net/~fbsd/rpi2/buildworld/main-c950-gff1a307801/

Thanks for your attention and help!!

bob prohaska

 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
> 
> 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Silent hang in buildworld, was Re: Invoking -v for clang during buildworld

2021-01-17 Thread bob prohaska
On Sun, Jan 17, 2021 at 12:30:51PM -0800, Mark Millard wrote:
> 
> 
> On 2021-Jan-17, at 09:40, bob prohaska  wrote:
> 
> > On Sat, Jan 16, 2021 at 03:04:04PM -0800, Mark Millard wrote:
> >> 
> >> Other than -j1 style builds (or equivalent), one pretty much
> >> always needs to go looking around for a non-panic failure. It
> >> is uncommon for all the material to be together in the build
> >> log in such contexts.
> > 
> > Running make cleandir twice and restarting -j4 buildworld brought
> > the process full circle: A silent hang, no debugger response, no
> > console warnings. That's what sent me down the rabbit hole of make
> > without clean, which worked at least once...
> 
> Unfortunately, such a hang tends to mean that log files and such
> were not completely written out to media. We do not get to see
> evidence of the actual failure time frame, just somewhat before.
> (compiler/linker output and such can have the same issues of
> ending up with incomplete updates.)
> 
> So, pretty much my notes are unlikely to be strongly tied to
> any solid evidence: more like alternatives to possibly explore
> that could be far off the mark.
> 
> It is not clear if you were using:
> 
> LDFLAGS.lld+= -Wl,--threads=1
> 
> or some such to limit the multi-thread linking and its memory.
No, I wasn't trying to limit ld.lld thread number.

> I'll note that if -j4 gets 4 links running in parallel it used
> to be each could have something like 5 threads active on a 4
> core machine, so 20 or so threads. (I've not checked llvm11's
> lld behavior. It might avoid such for defaults.)
> 
> You have not reported any testing of -j2 or -j3 so far, just
> -j4 . (Another way of limiting memory use, power use, temperature,
> etc. .)
>
Not recently, simply because it's so slow to build. On my "production"
armv7 machines running stable/12 I do use -j2. But, they get updated
only a couple times per year, when there's a security issue. 

> You have not reported if your boot complained about the swap
> space size or if you have adjusted related settings to make
> non-default tradeoffs for swap management for these specific
> tests. I recommend not tailoring and using a swap size total
> that is somewhat under what starts to complain when there is
> no tailoring.
> 
Both Pi2 and Pi3 have been complaining about too much swap
since I first got them. Near as can be told it's never been
a demonstrated problem, thus far. Now, as things like LLVM
get bigger and bigger, it seems possible excess swap might
cause, or obscure, other problems. For the Pi2 I picked 2
GB from the old "2x physical RAM" rule. 

> 
> > The residue of the top screen shows
> > 
> > last pid: 63377;  load averages:  4.29,  4.18,  4.15
> >  up 1+07:11:07  04:46:46
> > 60 processes:  5 running, 55 sleeping
> > CPU: 70.7% user,  0.0% nice, 26.5% system,  2.8% interrupt,  0.0% idle
> > Mem: 631M Active, 4932K Inact, 92M Laundry, 166M Wired, 98M Buf, 18M Free
> > Swap: 2048M Total, 119M Used, 1928M Free, 5% Inuse, 16K In, 3180K Out
> > packet_write_wait: Connection to 50.1.20.26 port 22: Broken pipe
> > bob@raspberrypi:~ $ ssh www.zefox.comRES STATEC   TIMEWCPU 
> > COMMAND
> > ssh: connect to host www.zefox.com port 22: Connection timed out86.17% c++
> > bob@raspberrypi:~ $ 1  990   277M   231M RUN  0   3:26  75.00% c++
> > 63245 bob   1  990   219M   173M CPU0 0   2:10  73.12% c++
> > 62690 bob   1  980   354M   234M RUN  3   9:42  47.06% c++
> > 63377 bob   1  300  5856K  2808K nanslp   0   0:00   3.13% gstat
> > 38283 bob   1  240  5208K   608K wait 2   2:00   0.61% sh
> >  995 bob   1  200  6668K  1184K CPU3 3   8:46   0.47% top
> >  990 bob   1  20012M  1060K select   2   0:48   0.05% sshd
> > 
> 
> This does not look like ld was in use as of the last top
> display update's content. But the time between reasonable
> display updates is fairly long relative to CPU activity
> so it is only suggestive.
> 
> > [apologies for typing over the remnants]
> > 
> > I've put copies of the build and swap logs at
> > 
> > http://www.zefox.net/~fbsd/rpi2/buildworld/
> > 
> > The last vmstat entry (10 second repeat time) reports:
> >  procs          memory                         page   disks            faults      cpu
> >  r b  w     avm     fre   flt  re  pi  po    fr   sr da0 sd0    in    sy    cs us sy id
> >  4 0 14  969160   91960   685   2   2   1   707  304   0   0 11418   692  1273 45  5 50
> > 
> > Does that point to the mem

Silent hang in buildworld, was Re: Invoking -v for clang during buildworld

2021-01-17 Thread bob prohaska
On Sat, Jan 16, 2021 at 03:04:04PM -0800, Mark Millard wrote:
> 
> Other than -j1 style builds (or equivalent), one pretty much
> always needs to go looking around for a non-panic failure. It
> is uncommon for all the material to be together in the build
> log in such contexts.

Running make cleandir twice and restarting -j4 buildworld brought
the process full circle: A silent hang, no debugger response, no
console warnings. That's what sent me down the rabbit hole of make
without clean, which worked at least once...

The residue of the top screen shows

last pid: 63377;  load averages:  4.29,  4.18,  4.15    up 1+07:11:07  04:46:46
60 processes:  5 running, 55 sleeping
CPU: 70.7% user,  0.0% nice, 26.5% system,  2.8% interrupt,  0.0% idle
Mem: 631M Active, 4932K Inact, 92M Laundry, 166M Wired, 98M Buf, 18M Free
Swap: 2048M Total, 119M Used, 1928M Free, 5% Inuse, 16K In, 3180K Out
packet_write_wait: Connection to 50.1.20.26 port 22: Broken pipe
bob@raspberrypi:~ $ ssh www.zefox.comRES STATEC   TIMEWCPU COMMAND
ssh: connect to host www.zefox.com port 22: Connection timed out86.17% c++
bob@raspberrypi:~ $ 1  990   277M   231M RUN  0   3:26  75.00% c++
63245 bob   1  990   219M   173M CPU0 0   2:10  73.12% c++
62690 bob   1  980   354M   234M RUN  3   9:42  47.06% c++
63377 bob   1  300  5856K  2808K nanslp   0   0:00   3.13% gstat
38283 bob   1  240  5208K   608K wait 2   2:00   0.61% sh
  995 bob   1  200  6668K  1184K CPU3 3   8:46   0.47% top
  990 bob   1  20012M  1060K select   2   0:48   0.05% sshd


[apologies for typing over the remnants]

I've put copies of the build and swap logs at

http://www.zefox.net/~fbsd/rpi2/buildworld/

The last vmstat entry (10 second repeat time) reports:
 procs          memory                         page   disks            faults      cpu
 r b  w     avm     fre   flt  re  pi  po    fr   sr da0 sd0    in    sy    cs us sy id
 4 0 14  969160   91960   685   2   2   1   707  304   0   0 11418   692  1273 45  5 50

Does that point to the memory exhaustion suggested earlier in the thread?
At this point /boot/loader.conf contains vm.pfault_oom_attempts="-1", but 
that's a relic of long-ago attempts to use USB flash for root and swap.
Might removing it stimulate more warning messages?

Thanks for reading!

bob prohaska
 


Re: Invoking -v for clang during buildworld

2021-01-16 Thread bob prohaska
On Sat, Jan 16, 2021 at 11:17:52AM -0800, Mark Millard wrote:
> 
> 
> On 2021-Jan-16, at 07:55, bob prohaska  wrote:
> 
> > On Fri, Jan 15, 2021 at 09:25:00PM -0800, Mark Millard wrote:
> >> 
> >> On 2021-Jan-15, at 20:37, bob prohaska  wrote:
> >> 
> >>> While playing with -current on armv7 using a raspberry pi 2 v1.1 
> >>> an error crops up with recent kernels while building world:
> >>> 
> >>> c++: error: linker command failed with exit code 1 (use -v to see
> >>> invocation)
> >>> *** [clang.full] Error code 1
> >>> 
> >>> make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang
> >>> 
> >>> How does one invoke -v in this situation?
> >> 
> >> Going a different direction: Going to publish the build log
> >> someplace? There is likely more there of interest to isolating
> >> the issue(s).
> >> 
> > I've put what I hope is a useful picture at
> > http://www.zefox.net/~fbsd/rpi2/buildworld/
> 
> Looks to me like your -DNO_CLEAN based build is reusing one or
> more files with inappropriate/incomplete contents that need to
> be regenerated: there are a number of undefined symbols stopping
> the linker during its attempt to build the "usr.bin/clang/clang
> (all)" material. See below.
> 
[examples snipped]
> 
> FYI:
> 
> I found this by noting the "all_subdir_usr.bin" below and
> searching backwards for prior examples and seeing what was
> after those examples.
> 
> --- all_subdir_usr.bin ---
> c++: error: linker command failed with exit code 1 (use -v to see invocation)
> *** [clang.full] Error code 1
> 
> 

It never dawned that I wasn't looking at the first error message.
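
The backward search described above can be approximated with a short
script. This is only a sketch, assuming the log uses the
"--- target ---" banners that a parallel make prints when it switches
between jobs. The function names and the sample log lines (including
the undefined symbol name) are made up for illustration.

```python
import re

# Banner line that introduces output belonging to one make target.
MARKER = re.compile(r"^--- (\S+) ---$")

def sections_for(log_lines, target):
    """Yield (banner_line_number, lines) for each block headed '--- target ---'."""
    current, start, block = None, 0, []
    for number, line in enumerate(log_lines, 1):
        match = MARKER.match(line)
        if match:
            if current == target and block:
                yield start, block
            current, start, block = match.group(1), number, []
        elif current == target:
            block.append(line)
    if current == target and block:
        yield start, block

def first_error(log_lines, target):
    """Return (line_number, text) of the earliest 'error:' line for target."""
    for start, block in sections_for(log_lines, target):
        for offset, line in enumerate(block, 1):
            if "error:" in line:
                return start + offset, line
    return None

# Hypothetical excerpt of a -j4 build log.
sample = """\
--- all_subdir_usr.bin ---
ld: error: undefined symbol: some_symbol
--- all_subdir_lib ---
building something else
--- all_subdir_usr.bin ---
c++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [clang.full] Error code 1
""".splitlines()

print(first_error(sample, "all_subdir_usr.bin"))
```

On the sample this reports the earlier undefined-symbol line rather
than the final "linker command failed" summary, which is the point of
looking at prior blocks for the same target.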
 
> 
> The undefined symbols seem unlikely to be a voltage problem.
> 
> The zeros are from the units for the integers not being volts
> but micro volts. (Which is not the same as saying measurements
> reach that scale of accuracy.)
> 

So long as they're measured values they might be worth keeping track of.
I thought maybe they were some sort of input or placeholder values.
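
As a small illustration of the units, the sketch below converts a
"name: value" line of sysctl output from microvolts to volts. The
sysctl name and the reading are hypothetical placeholders, not values
taken from a real Pi2.

```python
# Convert a "name: value" sysctl line whose value is in microvolts.
# Both the name and the number below are made up for illustration.

def parse_microvolt_line(line):
    """Split 'name: value' and return (name, volts)."""
    name, _, value = line.partition(": ")
    return name, int(value) / 1_000_000

name, volts = parse_microvolt_line("hypothetical.regulator.core: 1200000")
print(f"{name} = {volts:.3f} V")
```

A reading with six trailing zeroes is then just a round number of
volts, which would explain why the raw output looked artificial.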

> >> I use META_MODE builds. One thing they do is record the
> >> command used to try to produce each file. So in that kind
> >> of context, identifying what it was trying to build allows
> >> finding the related NAME.meta file and looking in it.
> >> 

Not needed now, but worth remembering for the future.
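
For future reference, pulling the recorded command out of a .meta
file might look like the sketch below. It assumes the command is
stored on lines beginning with "CMD ", which is how bmake's meta
files appear to be laid out. The sample file contents and the
function name are invented for illustration.

```python
# Extract the recorded build command(s) from the text of a .meta file.
# Assumption: each command is on a line that starts with "CMD ".

def commands_from_meta(text):
    """Return the command strings recorded in a .meta file's text."""
    prefix = "CMD "
    return [line[len(prefix):] for line in text.splitlines()
            if line.startswith(prefix)]

# Hypothetical .meta contents, not copied from a real build.
sample_meta = """\
# Meta data file /hypothetical/obj/clang.full.meta
CMD c++ -o clang.full example.o
CWD /hypothetical/obj
TARGET clang.full
-- command output --
"""

for command in commands_from_meta(sample_meta):
    print(command)
```

With the command in hand, one could re-run it manually with -v added
to see the linker invocation.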

> 
> I see no specific evidence for a kernel problem being involved.
> 
Agreed. The problem is the operator.

Thanks for your patience!

bob prohaska



Re: Invoking -v for clang during buildworld

2021-01-16 Thread bob prohaska
On Fri, Jan 15, 2021 at 09:25:00PM -0800, Mark Millard wrote:
> 
> On 2021-Jan-15, at 20:37, bob prohaska  wrote:
> 
> > While playing with -current on armv7 using a raspberry pi 2 v1.1 
> > an error crops up with recent kernels while building world:
> > 
> > c++: error: linker command failed with exit code 1 (use -v to see invocation)
> > *** [clang.full] Error code 1
> > 
> > make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang
> > 
> > How does one invoke -v in this situation?
> 
> Going a different direction: Going to publish the build log
> someplace? There is likely more there of interest to isolating
> the issue(s).
> 
I've put what I hope is a useful picture at
http://www.zefox.net/~fbsd/rpi2/buildworld/
Files from a clean start would probably be better, but it will take 
days to get back to that state. I was thinking this might be a 
kernel problem, but after trying three different kernels, all with
the same result, it's looking doubtful. No hint of the "cannot allocate
memory" message of earlier troubles, and nothing on the console. 

One additional question, however: Does the Pi2 have an internal
voltage measurement that could be added to the swap logging script?
sysctl -a | grep olt
produces a bunch of output, but none of it looks
real, with too many trailing zeroes. Power supply problems have
been rare, but they caused much hair loss. RaspiOS reports
undervoltage; does FreeBSD have a comparable feature?

> I use META_MODE builds. One thing they do is record the
> command used to try to produce each file. So in that kind
> of context, identifying what it was trying to build allows
> finding the related NAME.meta file and looking in it.
> 
> An example failure for armv7 and 1 GiByte of RAM could be
> a simple memory allocation failure: unable to get a
> sufficiently large contiguous range from the address space
> for some request. (So it never gets to the point of using
> swap for it.) Are you controlling how many threads the
> linker uses?
>
There have been none of the "unable to allocate memory" messages
that characterized the previous failures, and nothing on the console. 
I do not try to control thread count beyond -j4 on the command line.
It wasn't necessary up to a few days ago. It does seem that memory
use is vastly greater with the arrival of clang 11; swap use on armv7
gets up past half a GB. With clang 9 it hardly registered.

> > For the record, uname -a reports
> > FreeBSD www.zefox.com 13.0-CURRENT FreeBSD 13.0-CURRENT #6 
> > main-c950-gff1a307801: Wed Jan 13 19:02:18 PST 2021 
> > b...@www.zefox.com:/usr/obj/usr/freebsd-src/arm.armv7/sys/GENERIC-MMCCAM  
> > arm
> > 
> > The present sources are a day or two newer.
> > 
> > Nothing is obviously wrong; swap usage is small, no warnings or errors on 
> > the console.
> > 
> > In past occurrences, an old kernel (pre-git) worked through the problem.
> > If a restart of make buildworld doesn't get past the stoppage I'll check 
> > again.

The pre-git kernel didn't work either, nor did kernel.old, a couple days
previous. For clarity, all three were -DNO_CLEAN starts.

Thanks for reading,

bob prohaska



Invoking -v for clang during buildworld

2021-01-15 Thread bob prohaska
While playing with -current on armv7 using a raspberry pi 2 v1.1 
an error crops up with recent kernels while building world:

c++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [clang.full] Error code 1

make[5]: stopped in /usr/freebsd-src/usr.bin/clang/clang

How does one invoke -v in this situation?

For the record, uname -a reports
FreeBSD www.zefox.com 13.0-CURRENT FreeBSD 13.0-CURRENT #6 
main-c950-gff1a307801: Wed Jan 13 19:02:18 PST 2021 
b...@www.zefox.com:/usr/obj/usr/freebsd-src/arm.armv7/sys/GENERIC-MMCCAM  arm

The present sources are a day or two newer.

Nothing is obviously wrong; swap usage is small, no warnings or errors on the 
console.

In past occurrences, an old kernel (pre-git) worked through the problem.
If a restart of make buildworld doesn't get past the stoppage I'll check again.

Thanks for reading,

bob prohaska




Re: How does /usr/bin/uname work in plain english?

2021-01-13 Thread bob prohaska
On Wed, Jan 13, 2021 at 10:15:32PM -0700, Warner Losh wrote:
> > > __FreeBSD_version is defined in sys/param.h. For -U, uname prints that
> > > value. For -K, it asks the kernel for this value to print.
> > >
> > > MMmmnnn where MM is the major version, mm is minor, and nnn is
> > incremental
> > > when the APIs change, approximately weekly.
> > >

Sounds like the numbers are manually set by humans...
I imagined something much more automated.
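
As a worked example of the MMmmnnn layout described above, here is a
minimal Python sketch. The function name is made up, and it assumes
the seven-digit layout applies to the value being decoded.

```python
# Split a __FreeBSD_version value laid out as MMmmnnn into its parts.
# decode_freebsd_version is a hypothetical helper name.

def decode_freebsd_version(value):
    """Return (major, minor, incremental) for an MMmmnnn value."""
    major, rest = divmod(value, 100000)
    minor, incremental = divmod(rest, 1000)
    return major, minor, incremental

# The two values reported by uname -KU in the original question.
for value in (1300135, 1300134):
    print(value, decode_freebsd_version(value))
```

Both values decode to major 13, minor 0, and differ only in the
incremental part (135 versus 134).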

> 
> He has a newer kernel than userland... however that came to be...
> 
Yes, a new kernel was compiled to fix the "won't boot with HDMI connected"
problem on Raspberry Pi.

Thanks for explaining!

bob prohaska


How does /usr/bin/uname work in plain english?

2021-01-13 Thread bob prohaska
Since the switch to git I've been wondering how /usr/bin/uname works.
The man page is thin on details and uname.c is far too subtle.

For example, on my test box uname -a reports
FreeBSD www.zefox.org 13.0-CURRENT FreeBSD 13.0-CURRENT #7 
main-c255937-g818390ce0ca5: Wed Jan 13 16:42:12 PST 2021 
b...@www.zefox.org:/usr/obj/usr/freebsd-src/arm64.aarch64/sys/GENERIC-MMCCAM  
arm64
which seems to replay git nomenclature.

However, uname -KU reports
1300135 1300134
which is admirably readable, even for me. 

Is there a natural language description detailing how
uname -KU outputs are computed, and roughly what they mean? 
I've noticed that different sources sometimes produce the 
same values, so the level of detail is less, but might suffice
for initial reports to the mailing lists.

Thanks for reading,

bob prohaska

 



Re: HEADS UP: FreeBSD src repo transitioning to git this weekend

2020-12-22 Thread bob prohaska
On Tue, Dec 22, 2020 at 12:19:03PM -0800, Mark Millard wrote:
> 
> git in base would have licensing issues.
> 
I gather you're referring to GPLv2. A sticky wicket.

The trouble with ports is the tree is getting awfully big.
The host in question has a 32 GB disk and is over half full
with just a base source installation. Adding a "dormant"
ports tree will take nearly 2 GB, most of which is not used.

Might there be some way to clone a "sparse tree" including 
only one port, which then leafs out just enough to build that
port and dependencies? 

When the ports system was introduced it seemed a marvel of
compactness and efficiency. Time marches on.

> Pi2B: v1.1 (armv7 only)? v1.2 running armv7 FreeBSD?
> v1.2 running arm64 FreeBSD?
>
Sorry for the ambiguity... It's v1.1, armv7 only. That's
why I want to test git on this particular machine. Git
seems to work fine on the Pi3.

Thanks for replying!

bob prohaska



Re: HEADS UP: FreeBSD src repo transitioning to git this weekend

2020-12-22 Thread bob prohaska
On Tue, Dec 22, 2020 at 09:34:25PM +0100, Ronald Klop wrote:
> 
> what does "pkg install git" do for you? NB: I use "pkg install git-lite".
> Prevents about 1000 dependencies.
> 

That seems to have worked. It reported something about package management
not being installed, but after a prompt installed pkg-static and set
up a version of git which seems to run. Svnlite had been working without
this step. 

This is for a Pi2B v1.1, armv7 only.

Using the "mini git primer" at https://hackmd.io/hJgnfzd5TMK-VHgUzshA2g
I tried to clone stable/12 expecting that the -beta would be gone.

It looks as if I'm still jumping the gun. Although 
cgit.freebsd.org replies to ping, using

bob@www:/usr % git clone cgit.freebsd.org -b stable/12 freebsd-src

reports:

fatal: repository 'cgit.freebsd.org' does not exist

This is just a rehearsal, so I can wait, but if I've 
made other mistakes please point them out.

Thanks for your help!

bob prohaska



Re: HEADS UP: FreeBSD src repo transitioning to git this weekend

2020-12-22 Thread bob prohaska
On Wed, Dec 16, 2020 at 05:46:35PM -0700, Warner Losh wrote:
> 
> The FreeBSD project will be moving its source repo from subversion to git
> starting this weekend. The docs repo was moved 2 weeks ago. The ports
> repo will move at the end of March, 2021 due to timing issues.
> 

Is there some way to obtain git on a Pi2B running 
13.0-CURRENT FreeBSD 13.0-CURRENT #2 r365692
without installing the ports tree? I expected 
to find git in base, but it isn't there. 

Can it be found under another package name?

Thanks for reading, and any guidance!

bob prohaska



Re: panic: non-current pmap 0xffffa00020eab8f0 on Rpi3

2020-10-20 Thread bob prohaska
On Tue, Oct 20, 2020 at 12:30:05PM -0400, Mark Johnston wrote:
> 
> I set up a RPi3 to try and repro this and have so far managed to trigger
> it once using Peter Holm's stress2 suite, so I'll keep investigating.  I
> hadn't configured a dump device, but I was able to confirm from DDB that
> PCPU_GET(curpmap) == >vm_pmap.
> 

Is the invalid pmap fault related in any way to intensity of swap usage?
That's easily adjusted using -j values building things like www/chromium.
In the past, when I've reported crashes caused by stress2 I've observed
a "that's inevitable" sort of response with some regularity. Panics when
doing more normal things like make seem to stimulate greater interest.


> For future reference,
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

The pieces are all in place, but the machine doesn't seem to find core
dumps when coming up after a crash. It does routinely issue "no core dumps
found" during reboot, so it's looking. Is it necessary to issue a dump
command from inside the debugger? 

Thanks for writing!

bob prohaska



Re: panic: non-current pmap 0xffffa00020eab8f0 on Rpi3

2020-10-19 Thread bob prohaska
On Mon, Oct 19, 2020 at 04:39:54PM -0400, Mark Johnston wrote:
> 
> I think vmspace_exit() should issue a release fence with the cmpset and
> an acquire fence when handling the refcnt == 1 case, but I don't see why
> that would make a difference here.  So, if you can test a debug patch,
> this one will yield a bit more debug info.  If you can provide access to
> a vmcore and kernel debug symbols, that'd be even better.
> 

I haven't seen an invalid pmap panic since the report of October 5th.
Your patch applied cleanly on the Pi3 running HEAD at r366780M,
the M being due to patches supplied by Kyle Evans applied to 
M   sys/arm/broadcom/bcm2835/bcm2835_mbox.c
M   sys/arm/broadcom/bcm2835/bcm2835_sdhci.c
M   sys/arm/broadcom/bcm2835/bcm2835_vcbus.c
M   sys/arm/broadcom/bcm2835/bcm2835_vcbus.h

AIUI, they have something to do with DMA for peripherals. They've
caused no obvious trouble; if you anticipate conflicts, let me know
and I'll remove them.

I've never seen either a vmcore file or debug symbols on this machine.
A sequence of instructions to generate the data needed would be helpful.

Thanks for reading!

bob prohaska



panic: non-current pmap 0xffffa00020eab8f0 on Rpi3

2020-10-05 Thread bob prohaska
Still seeing non-current pmap panics on the Pi3, this time a B+ running
13.0-CURRENT (GENERIC-MMCCAM) #0 71e02448ffb-c271826(master)
during a -j4 buildworld.  The backtrace reports

panic: non-current pmap 0xffffa00020eab8f0
cpuid = 0
time = 1601947137
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x30
 pc = 0x0072999c  lr = 0x0019ec8c
 sp = 0x5d96c230  fp = 0x5d96c430

db_trace_self_wrapper() at kdb_backtrace+0x38
 pc = 0x0019ec8c  lr = 0x004b4984
 sp = 0x5d96c440  fp = 0x5d96c500

kdb_backtrace() at vpanic+0x19c
 pc = 0x004b4984  lr = 0x004734c0
 sp = 0x5d96c510  fp = 0x5d96c560

vpanic() at panic+0x44
 pc = 0x004734c0  lr = 0x004730dc
 sp = 0x5d96c570  fp = 0x5d96c620

panic() at pmap_remove_pages+0x5d8
 pc = 0x004730dc  lr = 0x0073fe58
 sp = 0x5d96c630  fp = 0x5d96c690

pmap_remove_pages() at vmspace_exit+0xb0
 pc = 0x0073fe58  lr = 0x006c77a0
 sp = 0x5d96c6a0  fp = 0x5d96c700

vmspace_exit() at exit1+0x470
 pc = 0x006c77a0  lr = 0x0042e5bc
 sp = 0x5d96c710  fp = 0x5d96c760

exit1() at sys_sys_exit+0x10
 pc = 0x0042e5bc  lr = 0x0042e148
 sp = 0x5d96c770  fp = 0x5d96c7c0

sys_sys_exit() at syscallenter+0x104
 pc = 0x0042e148  lr = 0x007463dc
 sp = 0x5d96c7d0  fp = 0x5d96c7d0

syscallenter() at svc_handler+0x4c
 pc = 0x007463dc  lr = 0x00745df8
 sp = 0x5d96c7e0  fp = 0x5d96c810

svc_handler() at do_el0_sync+0xf0
 pc = 0x00745df8  lr = 0x00745c08
 sp = 0x5d96c820  fp = 0x5d96c830

do_el0_sync() at handle_el0_sync+0x90
 pc = 0x00745c08  lr = 0x0072c224
 sp = 0x5d96c840  fp = 0x5d96c980

handle_el0_sync() at 0x40421150
 pc = 0x0072c224  lr = 0x40421150
 sp = 0x5d96c990  fp = 0xd830

KDB: enter: panic
[ thread pid 2429 tid 100951 ]
Stopped at  0x403fa408
db> 
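
The call chain is easier to read with the register lines stripped.
Here is a minimal Python sketch, assuming each frame line has the
"function() at location" shape seen above. The function name
call_chain is invented and the sample is an excerpt of this backtrace.

```python
import re

# A frame line looks like "name() at somewhere+0x123"; register lines do not.
FRAME = re.compile(r"^(\w+)\(\) at (\S+)$")

def call_chain(trace_text):
    """Return frame function names, innermost frame first."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            names.append(match.group(1))
    return names

sample = """\
panic() at pmap_remove_pages+0x5d8
 pc = 0x004730dc  lr = 0x0073fe58
pmap_remove_pages() at vmspace_exit+0xb0
 pc = 0x0073fe58  lr = 0x006c77a0
vmspace_exit() at exit1+0x470
 pc = 0x006c77a0  lr = 0x0042e5bc
exit1() at sys_sys_exit+0x10
"""

print(" <- ".join(call_chain(sample)))
```

Read right to left, the excerpt shows a process exiting (exit1),
tearing down its address space (vmspace_exit), and panicking inside
pmap_remove_pages.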

Thanks for reading,

bob prohaska
 


Re: Strange USB loop

2020-09-02 Thread bob prohaska
On Thu, Aug 27, 2020 at 10:02:21AM -0700, bob prohaska wrote:
> On Tue, Aug 25, 2020 at 11:29:16AM -0700, bob prohaska wrote:
> 
> > With a _different_ FT232 plugged in it also came up normally.
> > 
> > Both are thought to be genuine, but they are of different age
> > and produce different recognition messages:
> > 
> > The FT232 that causes trouble reports
> > ugen1.4:  at usbus1
> > uftdi0 on uhub1
> > uftdi0:  on usbus1
> > 
> > The one that seems to work is newer and reports
> > ugen1.4:  at usbus1
> > uftdi0 on uhub1
> > uftdi0:  on usbus1
> > 

With the system updated to r364900 both FT232 devices seem
to be working. The machine boots from a USB disk successfully
with mouse, keyboard and FT232 connected at powerup.

Thanks for your help,

bob prohaska





Re: usbd_setup_device_desc: getting device descriptor at addr 6 failed, USB_ERR_IOERROR

2020-08-27 Thread bob prohaska
On Thu, Aug 27, 2020 at 09:41:16PM +0300, Yuri Pankov wrote:
> Another issue that I started seeing lately, didn't try finding out when
> exactly in case someone knows what it's about:
> 
> Root mount waiting for: usbus0
> usbd_setup_device_desc: getting device descriptor at addr 6 failed,
> USB_ERR_IOERROR
> 
[details snipped]

> So far not seeing any ill effects from this, i.e. I can connect USB HDD to
> these ports, and it's successfully detected.

If it's convenient, connecting a USB-serial adapter and rebooting might
be interesting. I'm having trouble with FT232 obstructing disk detection
in some cases and self-disconnecting in others on a Pi3B. 

Thanks for reading,

bob prohaska



Re: Strange USB loop

2020-08-27 Thread bob prohaska
On Tue, Aug 25, 2020 at 11:29:16AM -0700, bob prohaska wrote:

> With a _different_ FT232 plugged in it also came up normally.
> 
> Both are thought to be genuine, but they are of different age
> and produce different recognition messages:
> 
> The FT232 that causes trouble reports
> ugen1.4:  at usbus1
> uftdi0 on uhub1
> uftdi0:  on usbus1
> 
> The one that seems to work is newer and reports
> ugen1.4:  at usbus1
> uftdi0 on uhub1
> uftdi0:  on usbus1
> 
> On balance I think the new kernel is better-behaved. Beyond that
> I'm at a loss. If you can suggest other things to try please do.
> 
>  

This morning I found on the console a message:
uftdi0: at uhub1, port 3, addr 4 (disconnected)
uftdi0: detached

but usbconfig -a reported
ugen1.4:  at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) 
pwr=ON (90mA)

and lsusb says
Bus /dev/usb Device /dev/ugen1.4: ID 0403:6001 Future Technology Devices 
International, Ltd FT232 Serial (UART) IC

The FT232 is plugged directly into the Pi. This is the newer, supposedly
functional, FT232...

Unplugging and replugging put on the console
ugen1.4:  at usbus1 (disconnected)
uftdi0: at uhub1, port 3, addr 4 (disconnected)
uftdi0: detached
ugen1.4:  at usbus1
uftdi0 on uhub1
uftdi0:  on usbus1

But it still can't connect to the serial port of the correspondent host,
which is up and running. 

Meanwhile, the FT232 which appeared faulty is working fine overnight
on RaspiOS Buster. 

Thanks for reading, and any suggestions

bob prohaska



Re: Strange USB loop

2020-08-25 Thread bob prohaska
On Tue, Aug 25, 2020 at 09:41:41AM +0200, Hans Petter Selasky wrote:
> 
> Can you try r364346 ?
> 

The kernel compiled and installed without trouble. After running
run bootcmd_usb0 the machine loaded the kernel, but stopped at the 
loader prompt. The keyboard was connected directly to the Pi, the
mouse was disconnected.
 
It isn't obvious why it stopped at the loader prompt. 

lsdev reports
disk devices:
    disk0:    250085377 X 512 blocks (removable)
      disk0s1: DOS/Windows
      disk0s2: FreeBSD
        disk0s2a: FreeBSD UFS
        disk0s2b: FreeBSD swap
    disk1:    1953525169 X 512 blocks
      disk1s1: DOS/Windows
      disk1s2: FreeBSD
        disk1s2a: FreeBSD UFS
        disk1s2b: FreeBSD swap
http: (unknown)
net devices:
    net0:
OK 

Disk0 is the (bootable) microSD, disk1 is the hard drive.
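
The block counts reported by lsdev translate to sizes as sketched
below, assuming the 512-byte block size shown in the listing. The
helper name is made up for illustration.

```python
# Convert the lsdev block counts above into byte sizes.
# disk_bytes is a hypothetical helper name.

def disk_bytes(blocks, block_size=512):
    """Total size in bytes for a device with the given block count."""
    return blocks * block_size

for name, blocks in (("disk0", 250085377), ("disk1", 1953525169)):
    size = disk_bytes(blocks)
    print(f"{name}: {size / 10**9:.1f} GB ({size / 2**30:.1f} GiB)")
```

That works out to roughly a 128 GB card and a 1 TB drive, consistent
with disk0 being the microSD and disk1 the hard drive.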

Boot -s came up single-user, with / mounted from /dev/da0s2a as desired.

Fsck reported the filesystem clean. Exit to multi-user worked.
The USB system keyboard (plugged into the Pi) worked. The mouse
(plugged into the hub after boot) also worked.

A second reboot with the mouse connected via the hub worked without
pausing at the loader prompt.

Plugging the FTDI FT232 adapter into the hub triggered a round of
uhub_reattach_port: giving up port reset - device vanished
messages, but this time they stopped when I pulled the FT232.
Plugging the FT232 directly into the Pi caused normal recognition.

It looks as if the FT232 somehow interferes with disk discovery.
A reboot with USB disk & mouse in the hub but keyboard and FT232
in the Pi again resulted in a mountroot failure, along with a few
other error messages:

uhub2: MTT enabled
Root mount waiting for: usbus1 CAM
uhub2: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1 CAM
usb_alloc_device: set address 7 failed (USB_ERR_IOERROR, ignored)
Root mount waiting for: usbus1 CAM
Root mount waiting for: usbus1 CAM
usbd_setup_device_desc: getting device descriptor at addr 7 failed, 
USB_ERR_IOERROR
usbd_req_re_enumerate: addr=7, set address failed! (USB_ERR_IOERROR, ignored)
Root mount waiting for: usbus1 CAM
Root mount waiting for: usbus1 CAM
usbd_setup_device_desc: getting device descriptor at addr 7 failed, 
USB_ERR_IOERROR
Root mount waiting for: usbus1 CAM
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
Root mount waiting for: usbus1 CAM
usbd_req_re_enumerate: addr=7, port reset failed, USB_ERR_IOERROR
ugen1.7:  at usbus1 (disconnected)
uhub_reattach_port: could not allocate new device
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Mounting from ufs:/dev/da0s2a failed with error 2; retrying for 3 more seconds
Mounting from ufs:/dev/da0s2a failed with error 2.

Loader variables:
  vfs.root.mountfrom=ufs:/dev/da0s2a
  vfs.root.mountfrom.options=rw
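
For reference, the "error 2" in the mountroot lines above is an errno
value. A minimal Python sketch for decoding such numbers follows. The
helper name describe_errno is made up for illustration, and the message
text comes from the machine running the script, so the wording may
differ slightly between systems.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    # errno.errorcode maps numbers to names such as 'ENOENT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 2 is the code reported by "Mounting from ufs:/dev/da0s2a failed".
    print(describe_errno(2))
```

Errno 2 is ENOENT ("No such file or directory"), which would fit a root
device node that never appeared because USB enumeration failed.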

With the FT232 unplugged the machine came up normally.

With a _different_ FT232 plugged in it also came up normally.

Both are thought to be genuine, but they are of different age
and produce different recognition messages:

The FT232 that causes trouble reports
ugen1.4:  at usbus1
uftdi0 on uhub1
uftdi0:  on usbus1

The one that seems to work is newer and reports
ugen1.4:  at usbus1
uftdi0 on uhub1
uftdi0:  on usbus1

On balance I think the new kernel is better-behaved. Beyond that
I'm at a loss. If you can suggest other things to try please do.

Thanks for all your help,

bob prohaska
 


Re: Strange USB loop

2020-08-24 Thread bob prohaska
On Tue, Aug 25, 2020 at 12:46:01AM +0200, Hans Petter Selasky wrote:
> On 2020-08-24 18:37, bob prohaska wrote:
> > After updating to
> > FreeBSD 13.0-CURRENT (GENERIC) #5 r364475: Mon Aug 24 06:47:29 PDT 2020
> > on a Pi3 it was necessary to disconnect the mouse, keyboard and usb-serial
> 
> You are after:
> 
> https://svnweb.freebsd.org/changeset/base/364433
> 
> You may want to try a kernel before:
> 
> r364379

A kernel for 
FreeBSD 13.0-CURRENT (GENERIC) #6 r364378: Tue Aug 25 00:46:27 PDT 2020
compiled and installed without incident, but the problem persists. This
time I plugged the keyboard into the hub and got a stream of 
uhub_reattach_port: giving up port reset - device vanished
which didn't stop when the keyboard was removed. If the keyboard is
moved to the Pi's internal USB connectors the keyboard is recognized
and works, but the once-per-second "...device vanished" messages continue.

Attempts to repeat this behavior were frustrating. After a few iterations
the error message was triggered by plugging in an FTDI usb-serial adapter,
but the messages stopped when it was unplugged. 

The hub is 
Bus /dev/usb Device /dev/ugen1.4: ID 05e3:0610 Genesys Logic, Inc. 4-port hub

The disk adapter is 
Bus /dev/usb Device /dev/ugen1.5: ID 152d:1561 JMicron Technology Corp. / 
JMicron USA Technology Corp. JMS561U two ports SATA 6Gb/s bridge

Are either of these known troublemakers?

Thanks for reading!

bob prohaska



Re: Strange USB loop

2020-08-24 Thread bob prohaska
On Mon, Aug 24, 2020 at 05:26:27PM +0000, Bjoern A. Zeeb wrote:
> On 24 Aug 2020, at 16:37, bob prohaska wrote:
> 
> >
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> > uhub_reattach_port: giving up port reset - device vanished
> >
> 
> I hit something like it last weekend and found this one:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237666
> 

Hmm, rather discouraging. Same error message, different hardware.
Considerable investigation without resolution. Over a year old.

Thanks for writing 8=(

bob prohaska



Strange USB loop

2020-08-24 Thread bob prohaska
After updating to 
FreeBSD 13.0-CURRENT (GENERIC) #5 r364475: Mon Aug 24 06:47:29 PDT 2020
on a Pi3 it was necessary to disconnect the mouse, keyboard and usb-serial
adapter to allow the machine to mount root from USB via a hub.

Once the machine came back up with root mounted from USB, I tried plugging
the serial adapter, mouse and keyboard back in via the hub. 

The FTDI serial adapter was recognized without trouble, but when the
elderly Dell mouse was connected, a stream of

uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished
uhub_reattach_port: giving up port reset - device vanished

began to scroll on both the monitor and console. Unplugging 
the mouse made no difference. Plugging the mouse directly
into the Pi's USB port allowed recognition and function,
but the stream of errors persisted. Network access seems
normal.  

It looks almost as if there's some sort of infinite loop
running in the USB software. The need to disconnect mouse
and keyboard to permit mountroot to work isn't new, but
the "giving up port reset" _is_ new at least to me.

Are there any experiments which might narrow down what's wrong?

Thanks for reading,

bob prohaska



Re: Is there any error checking on swap?

2020-07-12 Thread bob prohaska
On Sun, Jul 12, 2020 at 12:29:12AM -0700, John-Mark Gurney wrote:
> bob prohaska wrote this message on Sat, Jul 11, 2020 at 20:33 -0700:
> > Is there any error checking on swap traffic, along the lines of
> > a checksum or parity test? 
> > 
> > Just curious what happens if a page written out is corrupted  when
> > it comes back.
> 
> Looks like it doesn't:
> https://svnweb.freebsd.org/base/head/sys/vm/swap_pager.c?annotate=361965#l1389
> 

There is certainly nothing about parity or checksums in the comments.
All faith in the hardware, I guess.

Thanks for writing!

bob prohaska



Is there any error checking on swap?

2020-07-11 Thread bob prohaska


Is there any error checking on swap traffic, along the lines of
a checksum or parity test? 

Just curious what happens if a page written out is corrupted  when
it comes back.

Thanks for reading,

bob prohaska
 


Re: Problem compiling Chromium

2020-06-07 Thread bob prohaska
Don't know about AMD, but on ARM failures that resemble this are
common. In some cases the actual error message is tens or a hundred
lines prior to the last make output. 

If you search the build output for the string "Error:" or maybe "error:",
does anything show up? Including the colon helps to reduce irrelevant hits.
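
A minimal sketch of that search in Python, assuming the build output was
captured to a file. The file name build.log and the helper name
find_error_lines are placeholders, not anything from the ports tree.

```python
import re

# Match "Error:" or "error:" including the colon, as suggested above.
ERROR_PATTERN = re.compile(r"[Ee]rror:")


def find_error_lines(lines):
    """Return (line_number, text) pairs for lines containing Error:/error:."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if ERROR_PATTERN.search(text):
            hits.append((number, text.rstrip("\n")))
    return hits
```

With the log saved as build.log, calling
find_error_lines(open("build.log", errors="replace")) would list the
candidate lines with their line numbers, earliest first. The first hit
is usually the one worth reading.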

Good luck,

bob prohaska

On Sun, Jun 07, 2020 at 04:27:24PM +0000, Filippo Moretti wrote:
> Good evening, FreeBSD sting 
> 13.0-CURRENT FreeBSD 13.0-CURRENT #2 r361787: Sun Jun 7 15:02:09 CEST 
> 2020 root@sting:/usr/obj/usr/src/amd64.amd64/sys/STING amd64
> 
> the build fails with the following message
> /common/extensions/api/action.json 
> ../../chrome/common/extensions/api/browser_action.json 
> ../../chrome/common/extensions/api/browsing_data.json 
> ../../chrome/common/extensions/api/extension.json 
> ../../chrome/common/extensions/api/idltest.idl 
> ../../chrome/common/extensions/api/page_action.json 
> ../../chrome/common/extensions/api/top_sites.json
> ninja: build stopped: subcommand failed.
> ===> Compilation failed unexpectedly.
> Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
> the maintainer.
> *** Error code 1
> 
> Stop.
> make: stopped in /usr/ports/www/chromium
> 
> ===>>> make build failed for www/chromium
> ===>>> Aborting update
> 
> I enclose the list of packages installed.
> Sincerely,
> Filippo





Re: Recovering after a crash during installworld

2020-06-05 Thread bob prohaska
On Thu, Jun 04, 2020 at 10:42:18PM +0200, Ronald Klop wrote:
> Delete /usr/src and make a new svn checkout.

That turned out to be the solution. The error message persuaded
me that the problem was in the executable, not the repository.
After devel/subversion produced the identical error, replacing
the repository solved the problem.

Thanks for reading,

bob prohaska


> 
> 
> Regards,
> Ronald.
> 
> 
> From: bob prohaska 
> Date: 4 June 2020 21:24
> To: freebsd-current@freebsd.org
> CC: bob prohaska 
> Subject: Recovering after a crash during installworld
> 
> > 
> > 
> > A Raspberry Pi3B running -current near r360134 crashed during installworld.
> > Installkernel completed in single-user mode, but it looks like something
> > got corrupted in files related to svnlite:
> > 
> > root@www:/usr/src # svnlite up .
> > svn: E235000: In file 
> > '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: 
> > assertion failed (format >= 1)
> > Abort (core dumped)
> > root@www:/usr/src # svnlite cleanup .
> > svn: E235000: In file 
> > '/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: 
> > assertion failed (format >= 1)
> > Abort (core dumped)
> > 
> > The machine comes up multi-user without problems, but attempts to update
> > or simply rebuild the system run afoul of the svnlite errors.
> > 
> > Is there a practical way to recover?
> > 
> > Thanks for reading,
> > 
> > bob prohaska


Recovering after a crash during installworld

2020-06-04 Thread bob prohaska
A Raspberry Pi3B running -current near r360134 crashed during installworld.
Installkernel completed in single-user mode, but it looks like something
got corrupted in files related to svnlite:

root@www:/usr/src # svnlite up .
svn: E235000: In file 
'/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: 
assertion failed (format >= 1)
Abort (core dumped)
root@www:/usr/src # svnlite cleanup .
svn: E235000: In file 
'/usr/src/contrib/subversion/subversion/libsvn_wc/wc_db_wcroot.c' line 311: 
assertion failed (format >= 1)
Abort (core dumped)

The machine comes up multi-user without problems, but attempts to update
or simply rebuild the system run afoul of the svnlite errors.

Is there a practical way to recover?

Thanks for reading,

bob prohaska
 


Re: about loader & console

2020-03-02 Thread bob prohaska
On Mon, Mar 02, 2020 at 10:47:41AM +0200, Toomas Soome wrote:
> What we have now (on current):
> 
> arm: efi console is the same state as in x86, no serial driver to provide 
> comconsole, the serial console is only available via redirection from 
> firmware.
>
 
Is it possible, on a Pi3 without WiFi or Bluetooth, to add 
enable_uart=1
to config.txt?

At this point config.txt contains
arm_control=0x200
dtparam=audio=on,i2c_arm=on,spi=on
dtoverlay=mmc
dtoverlay=pwm
dtoverlay=pi3-disable-bt
device_tree_address=0x4000
kernel=u-boot.bin

and is, I think, default.

Thanks for reading,

bob prohaska



Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147 (a vm_pfault_oom_attempts < 0 handling bug as of head -r357026)

2020-01-28 Thread bob prohaska
On Tue, Jan 28, 2020 at 11:28:14AM -0800, Mark Millard wrote:
> 
> 
> On 2020-Jan-28, at 11:02, bob prohaska  wrote:
> 
> > On Tue, Jan 28, 2020 at 09:42:17AM -0800, Mark Millard wrote:
> >> 
> >> 
> >> 
> > The (partly)modified kernel compiled and booted without
> > obvious trouble. It's trying to finish buildworld now.
> > 
Stopped already, with 
Jan 28 11:41:59 www kernel: pid 29909 (cc), jid 0, uid 0, was killed: fault's 
page allocation failed



> >> If you are testing with vm.pfault_oom_attempts="-1" then
> >> the vm_fault printf message should never happen anyway.
> >> 
> > Would it not be interesting if the message appeared in that
> > case? 
> 
> Thanks for the question: looking at the new code found a bug
> causing oom where it used to be avoided in head -r357025 and
> before.


Glad to be of service, even if inadvertently 8-)

 
> After vm_waitpfault(dset, vm_pfault_oom_wait * hz)
> the -r357026 code does a vm_pageout_oom(VM_OOM_MEM_PF) no
> matter what, even when vm_pfault_oom_attempts < 0 ||
> fs->oom < vm_pfault_oom_attempts :
> 
> New code in head -r357026
> ( nothing to avoid the vm_pageout_oom(VM_OOM_MEM_PF)
> for vm_pfault_oom_attempts < 0 ||
> fs->oom < vm_pfault_oom_attempts ):
> 
>     if (fs->m == NULL) {
>         unlock_and_deallocate(fs);
>         if (vm_pfault_oom_attempts < 0 ||
>             fs->oom < vm_pfault_oom_attempts) {
>             fs->oom++;
>             vm_waitpfault(dset, vm_pfault_oom_wait * hz);
>         }
>         if (bootverbose)
>             printf(
>     "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>                 curproc->p_pid, curproc->p_comm);
>         vm_pageout_oom(VM_OOM_MEM_PF);
>         return (KERN_RESOURCE_SHORTAGE);
>     }
> 
> Old code in head -r357025
> ( has the goto RetryFault_oom after vm_waitpfault(. . .),
> thereby avoiding the vm_pageout_oom(VM_OOM_MEM_PF) for
> vm_pfault_oom_attempts < 0 || fs->oom < vm_pfault_oom_attempts ) :
> 
>     if (fs.m == NULL) {
>         unlock_and_deallocate();
>         if (vm_pfault_oom_attempts < 0 ||
>             oom < vm_pfault_oom_attempts) {
>             oom++;
>             vm_waitpfault(dset,
>                 vm_pfault_oom_wait * hz);
>             goto RetryFault_oom;
>         }
>         if (bootverbose)
>             printf(
>     "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>                 curproc->p_pid, curproc->p_comm);
>         vm_pageout_oom(VM_OOM_MEM_PF);
>         goto RetryFault;
>     }
> 
> I expect this is the source of the behavioral
> difference folks have been seeing for OOM kills.
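
To make the difference concrete, here is a toy Python model of the two
excerpts above. It is only a sketch of the control flow as quoted, not
kernel code: the function names are invented, and the wait and the
retry labels are reduced to comments.

```python
def reaches_oom_r357025(oom, attempts):
    """Old flow: does a failed page allocation reach vm_pageout_oom()?"""
    if attempts < 0 or oom < attempts:
        # oom++; vm_waitpfault(...); goto RetryFault_oom;
        # The goto skips the OOM call entirely.
        return False
    # vm_pageout_oom(VM_OOM_MEM_PF); goto RetryFault;
    return True


def reaches_oom_r357026(oom, attempts):
    """New flow as quoted: the wait no longer skips vm_pageout_oom()."""
    if attempts < 0 or oom < attempts:
        # fs->oom++; vm_waitpfault(...); then execution falls through.
        pass
    # vm_pageout_oom(VM_OOM_MEM_PF); return (KERN_RESOURCE_SHORTAGE);
    return True


if __name__ == "__main__":
    # vm.pfault_oom_attempts="-1" is the setting under test in this thread.
    print("r357025:", reaches_oom_r357025(oom=0, attempts=-1))
    print("r357026:", reaches_oom_r357026(oom=0, attempts=-1))
```

With attempts set to -1 the old flow never reaches the OOM call, while
the new flow reaches it on the first failed allocation, which matches
the kill reported above.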
> 
> 
> As for "gather evidence" messages . . .
> 
> >> You may be able to just look and manually delete or
> >> comment out the bootverbose line in the more modern
> >> source that currently looks like:
> >> 
> >>    if (bootverbose)
> >>        printf(
> >> "proc %d (%s) failed to alloc page on fault, starting OOM\n",
> >>            curproc->p_pid, curproc->p_comm);
> >>    vm_pageout_oom(VM_OOM_MEM_PF);
> >>    return (KERN_RESOURCE_SHORTAGE);
> >> 
> > 
> > I can find those lines in /usr/src/sys/vm/vm_fault.c, but
> > unclear on the motivation to comment the lines out. Perhaps 
> > to eliminate the return(...) ?  Anyway, is it sufficient 
> > to insert /* before and */ after? 
> 
> The only line to delete or comment out in that
> code block is:
> 
>   if (bootverbose)
> 
> Disabling that line makes the following printf
> always happen, even when a verbose boot was not
> done.
Oops, it's commented out now and the kernel is rebuilding.

> 
> Based on the above reported code change, having
> a message before vm_pageout_oom(VM_OOM_MEM_PF) is
> important to getting a report of the kill being
> via that code.
>

Thank you!

bob prohaska
 


Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147

2020-01-27 Thread bob prohaska
On Mon, Jan 27, 2020 at 06:22:20PM -0800, Mark Millard wrote:
> 
> So far as I know, in the past progress was only made when someone
> already knowledgable got involved in isolating what was happening
> and how to control it.
> 
Indeed. One can only hope said knowledgeables are reading

Thanks for reading!

bob prohaska


 


Re: CAM breaks USB [was Re: USB causing boot to hang]

2019-12-06 Thread bob prohaska
, 100baseTX-FDX, auto
ue0:  on smsc0
ue0: Ethernet address: 4e:61:0e:1c:ae:c0
.ugen0.4:  at usbus0
umass0 on uhub1
umass0:  on usbus0
.ugen0.5:  at usbus0
...Restarting file system checks:
/dev/ufs/rootfs: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ufs/rootfs: clean, 109531 free (5579 frags, 12994 blocks, 2.3% 
fragmentation)
Can't stat /dev/da0d: No such file or directory
Can't stat /dev/da0e: No such file or directory
Can't stat /dev/da0d: No such file or directory
Can't stat /dev/da0e: No such file or directory
Can't stat /dev/da0a: No such file or directory
Can't stat /dev/da0a: No such file or directory
THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY:
ufs: /dev/da0d (/tmp), ufs: /dev/da0e (/usr), ufs: /dev/da0a (/var)
Unknown error 3; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2019-12-06T20:07:21.926442-08:00  init 1 - - /bin/sh on /etc/rc terminated 
abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:

The machine seems able to boot hands-off a kernel from r333740, 
so I don't think it's hardware.

/boot/loader.conf contains
bob@www:~ % more /boot/loader.conf
kern.cam.boot_delay="2"
vm.pageout_oom_seq="2048"
bob@www:~ % 

Booting directly to single-user, running fsck and exiting the shell 
brought up multi-user operation. Still, it appears that recognition 
of an FTDI FT232 usb-serial adapter is impaired as well. It had to 
be unplugged and replugged after booting to be recognized.

Also FWIW, an RPI3 running r355422 seems not to share the difficulty.

Hope this is of some use,

bob prohaska



Re: Rpi3 panic: non-current pmap 0xfffffd001e05b130

2019-11-30 Thread bob prohaska
On Sat, Nov 30, 2019 at 05:16:15PM -0800, bob prohaska wrote:
> A Pi3 running r355024 reported a panic while doing a -j3 make of
> www/chromium:
> 
Ok, another panic, looks like a dying storage device. This time there
was a preamble on the console:

(da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 c3 90 d8 00 00 08 00 
(da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da0:umass-sim0:0:0:0): Error 5, Retries exhausted
swap_pager: I/O error - pageout failed; blkno 1442883,size 4096, error 5
swap_pager: I/O error - pageout failed; blkno 1442884,size 4096, error 5
swap_pager: I/O error - pageout failed; blkno 1442885,size 8192, error 5
swap_pager: I/O error - pageout failed; blkno 1442887,size 4096, error 5
swap_pager: I/O error - pagein failed; blkno 1103209,size 4096, error 5
vm_fault: pager read error, pid 681 (devd)
swap_pager: I/O error - pagein failed; blkno 1130270,size 4096, error 5
vm_fault: pager read error, pid 2362 (c++)
Nov 30 17:37:34 www kernel: Failed to fully fault in a core file segment at VA 
0x4040 with size 0x60b000 to be written at offset 0x32b000 for process devd
panic: vm_page_assert_unbusied: page 0xfd0030f8af80 busy @ 
/usr/src/sys/vm/vm_object.c:777
cpuid = 3
time = 1575164255

Earlier panics didn't have any proximate warnings on the console, but they're
probably the same story.

Apologies for the noise!

bob prohaska



> panic: non-current pmap 0xfffffd001e05b130
> cpuid = 0
> time = 1575161361
> KDB: stack backtrace:
> db_trace_self() at db_trace_self_wrapper+0x28
>pc = 0x00729e4c  lr = 0x001066c8
>sp = 0x59f3e2b0  fp = 0x59f3e4c0
> 
> db_trace_self_wrapper() at vpanic+0x18c
>pc = 0x001066c8  lr = 0x00400d7c
>sp = 0x59f3e4d0  fp = 0x59f3e580
> 
> vpanic() at panic+0x44
>pc = 0x00400d7c  lr = 0x00400b2c
>sp = 0x59f3e590  fp = 0x59f3e610
> 
> panic() at pmap_remove_pages+0x8d4
>pc = 0x00400b2c  lr = 0x0074154c
>sp = 0x59f3e620  fp = 0x59f3e6e0
> 
> pmap_remove_pages() at vmspace_exit+0xc0
>pc = 0x0074154c  lr = 0x006c9c00
>sp = 0x59f3e6f0  fp = 0x59f3e720
> 
> vmspace_exit() at exit1+0x4f8
>pc = 0x006c9c00  lr = 0x003bc2a4
>sp = 0x59f3e730  fp = 0x59f3e7a0
> 
> exit1() at sys_sys_exit+0x10
>pc = 0x003bc2a4  lr = 0x003bbda8
>sp = 0x59f3e7b0  fp = 0x59f3e7b0
> 
> sys_sys_exit() at do_el0_sync+0x514
>pc = 0x003bbda8  lr = 0x00747aa4
>sp = 0x59f3e7c0  fp = 0x59f3e860
> 
> do_el0_sync() at handle_el0_sync+0x90
>pc = 0x00747aa4  lr = 0x0072ca14
>sp = 0x59f3e870  fp = 0x59f3e980
> 
> handle_el0_sync() at 0x404e6d60
>pc = 0x0072ca14  lr = 0x404e6d60
>sp = 0x59f3e990  fp = 0xd590
> 
> KDB: enter: panic
> [ thread pid 94966 tid 100145 ]
> Stopped at  0x40505460: undefined   5442
> db> bt
> Tracing pid 94966 tid 100145 td 0xfd002552b000
> db_trace_self() at db_stack_trace+0xf8
>  pc = 0x00729e4c  lr = 0x00103b0c
>  sp = 0x59f3de80  fp = 0x59f3deb0
> 
> db_stack_trace() at db_command+0x228
>  pc = 0x00103b0c  lr = 0x00103784
>  sp = 0x59f3dec0  fp = 0x59f3dfa0
> 
> db_command() at db_command_loop+0x58
>  pc = 0x00103784  lr = 0x0010352c
>  sp = 0x59f3dfb0  fp = 0x59f3dfd0
> 
> db_command_loop() at db_trap+0xf4
>  pc = 0x0010352c  lr = 0x00106830
>  sp = 0x59f3dfe0  fp = 0x59f3e200
> 
> db_trap() at kdb_trap+0x1d8
>  pc = 0x00106830  lr = 0x004492fc
>  sp = 0x59f3e210  fp = 0x59f3e2c0
> 
> kdb_trap() at do_el1h_sync+0xf4
>  pc = 0x004492fc  lr = 0x00747418
>  sp = 0x59f3e2d0  fp = 0x59f3e300
> 
> do_el1h_sync() at handle_el1h_sync+0x78
>  pc = 0x00747418  lr = 0x0072c878
>  sp = 0x59f3e310  fp = 0x59f3e420
> 
> handle_el1h_sync() at kdb_enter+0x34
>  pc = 0x0072c878  lr = 0x00448948
>  sp = 0x59f3e430  fp = 0x59f3e4c0
> 
> kdb_enter() at vpanic+0x1a8
>  pc = 0x00448948  lr = 0x00400d98
>  sp = 0x59f3e4d0  fp = 0x59f3e580
> 
> vpanic() at panic+0x44
>  pc 

Rpi3 panic: non-current pmap 0xfffffd001e05b130

2019-11-30 Thread bob prohaska
  page  disks faults   cpu
 r  b  w     avm     fre   flt  re  pi  po    fr    sr  mm0  da0    in    sy    cs us sy id
 0  0 12 4523836   52860  6989 186 715 257  6932 25125 1038 1038 30790  1073 29820 14 26 60
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    1    751    702   4860    4.0     49    251    5.0      0      0    0.0    94.4  mmcsd0
    1    751    702   4860    4.1     49    251    5.1      0      0    0.0    94.7  mmcsd0s2
    2    704    658   4082    1.8     46    235    0.6      0      0    0.0    71.9  da0
    1    751    702   4860    4.1     49    251    5.1      0      0    0.0    94.8  mmcsd0s2b
    2    704    658   4082    1.8     46    235    0.7      0      0    0.0    72.6  da0p6
Sat Nov 30 16:48:26 PST 2019
Device          1K-blocks     Used    Avail Capacity
/dev/mmcsd0s2b    4404252  1959504  2444748    44%
/dev/da0p6        5242880  1957540  3285340    37%
Total             9647132  3917044  5730088    41%
Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114
Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth]
0/1016/1016/19178 mbuf clusters in use (current/cache/total/max)
procs memory   page  disks faults   cpu
 r  b  w     avm     fre   flt  re  pi  po    fr    sr  mm0  da0    in    sy    cs us sy id
 0  0 12 4523868   46872  6989 186 715 257  6932 25123    0    0 30790  1073 29820 14 26 60
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    2    700    681   4888    3.7     19    108    3.7      0      0    0.0    92.1  mmcsd0
    2    700    681   4888    3.8     19    108    3.7      0      0    0.0    92.5  mmcsd0s2
    2    709    687   4314    2.1     22    108    3.4      0      0    0.0    78.2  da0
    2    700    681   4888    3.8     19    108    3.7      0      0    0.0    92.6  mmcsd0s2b
    2    709    687   4314    2.1     22    108    3.4      0      0    0.0    78.7  da0p6
Sat Nov 30 16:48:28 PST 2019
Device          1K-blocks     Used    Avail Capacity
/dev/mmcsd0s2b440

It's clear the machine was heavily loaded, but storage didn't appear to be 
swamped.
I hope the foregoing has been of some interest, thanks for reading!

bob prohaska



Re: Reverting -current by date.

2019-11-20 Thread bob prohaska
On Wed, Nov 20, 2019 at 02:52:22PM -0800, Mark Millard wrote:
> 
> Unfortunately for Bob P., no suggestion can meet his full criteria. So
> he has several suggestions to potentially pick from or to use in
> combination.
> 
This is a most gracious way of saying my expectations are 
unreasonable. Sad but not surprising. At least now I know.

Thanks to everyone for enlightening me,

bob prohaska




> >> 
> >> . . .
> > 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
> 
> 


Re: g_vfs_done():ufs/rootfs[WRITE flood on rpi3

2019-11-20 Thread bob prohaska
On Wed, Nov 20, 2019 at 04:41:46PM -0600, Kyle Evans wrote:
> 
> The revisions noted are a good data point, thanks! Can you try
> upgrading the kernel past r354875 before I revert the most likely
> candidate up to that point? Perhaps with this patch applied, to make
> sure you're not hitting an interrupt race that's hard to deduce from
> logs: https://reviews.freebsd.org/D22430.diff
> 

The r354909 kernel is running now, doing a test compile of www/chromium.
So far no problems are apparent.

> More than willing to build a kernel as described and put up for you to
> download, as well, if you'd accept that.
>
If I understand correctly there's no need; if I'm mistaken please let me 
know.

Thanks for your attention!

bob prohaska
 
> Thanks,
> 
> Kyle Evans


g_vfs_done():ufs/rootfs[WRITE flood on rpi3

2019-11-20 Thread bob prohaska
Setting hostid: 0x5cd40a6a.
warning: total configured swap (1101063 pages) exceeds maximum recommended 
amount (920808 pages).
warning: increase kern.maxswzone or reduce amount of swap.
warning: total configured swap (2411783 pages) exceeds maximum recommended 
amount (920808 pages).
warning: increase kern.maxswzone or reduce amount of swap.
Starting file system checks:

At this point things returned to normal.
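
For scale, the page counts in the swap warnings can be converted to
sizes. The sketch below assumes 4 KiB pages, which the log itself does
not state; pages_to_gib is a made-up helper name.

```python
PAGE_SIZE = 4096  # bytes per page; an assumption, not taken from the log


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a page count to GiB."""
    return pages * page_size / 2**30


if __name__ == "__main__":
    for label, pages in [
        ("first warning", 1101063),
        ("second warning", 2411783),
        ("recommended maximum", 920808),
    ]:
        print(f"{label}: {pages} pages = {pages_to_gib(pages):.2f} GiB")
```

Under that assumption the configured swap works out to roughly 4.2 GiB
and then 9.2 GiB, against a recommended ceiling of about 3.5 GiB.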

Thanks for reading, I hope it's useful.

bob prohaska



Re: Reverting -current by date.

2019-11-20 Thread bob prohaska
On Wed, Nov 20, 2019 at 11:18:41AM -0700, Warner Losh wrote:
> On Wed, Nov 20, 2019 at 10:39 AM bob prohaska  wrote:
> 
> > From time to time it would be handy to revert freebsd-current to
> > an older, well-behaved revision.
> >
> > Is there a mechanism for identifying revision numbers that
> > will at least compile and boot, by date?
> >
> 
> Almost all of them will compile. Almost all of those will boot. While some
> build breakage sneaks through, the default assumption is that it's good.
> That's certainly been my experience randomly updating to -current. There's
> some that are more or less performant, mind you, and some that are more or
> less stable, it is true. But the overwhelming vast majority will compile
> and boot, at least for amd64. I have issues less than 1% of the time when
> updating to whatever is current at the moment I fancy an update.
>

Are commits that depend on one another somehow grouped in a single revision?
 
> There's some hardware that gets broken from time to time, but we don't
> track that specifically. And non-amd64 architectures takes more care and
> planning as any build breakage for those platforms lasts longer, in direct
> proportion to how popular the platform is
>

Point taken. I'm interested in aarch64, which puts me somewhat in the weeds.
 
> It's all in the commit logs. If you run -current you need to read them.
> They will also tell you almost always if you pick revision X if there was a
> subsequent fix that made things compile you should go with.
> 

I take it the strategy would be to go back in the log to a rough date,
then browse forward in time looking for signs of major trouble. When
the commits turn minor/benign, select a revision from that timeframe.
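
A hedged sketch of that workflow: Subversion accepts a date in braces
wherever a revision is expected, so the two steps could look like the
commands built below. The helper names, the /usr/src default and the
example date are assumptions for illustration; the functions only build
argument lists and run nothing.

```python
def log_from_date_cmd(date, limit=20, src="/usr/src"):
    """Build a command listing commits from a date forward.

    date is an ISO string such as "2019-05-20"; svn resolves {DATE} to
    the most recent revision as of the start of that day.
    """
    return ["svnlite", "log", "-r", f"{{{date}}}:HEAD", "-l", str(limit), src]


def update_to_revision_cmd(revision, src="/usr/src"):
    """Build a command that moves the working copy to one revision."""
    return ["svnlite", "update", "-r", str(revision), src]
```

Once a quiet stretch has been spotted in the log output, the second
list can be handed to subprocess.run (or typed by hand) to move the
tree to the chosen revision.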
 
> 
> Study the commit logs? I know I'm harping on that, but when things go
> wrong, that's what I do.
> 

I hoped for a more mechanical approach. For example, snapshots are 
generated from time to time. Presumably, they're vetted in some way
and knowing what revisions made it to the snapshot stage might be a
starting point. The snapshot server does not appear to contain that
information for earlier offerings. 


> Also -DNO_CLEAN builds help a lot if you're worried about it not even
> building, though from time to time you run into issues with a NO_CLEAN
> build due to a recent commit that wasn't appreciated at the time of the
> commit, but was later and fixed.
> 

Does -DNO_CLEAN behave sanely (and usefully) when going backwards in time?
I commonly use it for small forward steps, but time reversal is tricky 8-)

Thanks for replying!

bob prohaska





Reverting -current by date.

2019-11-20 Thread bob prohaska
From time to time it would be handy to revert freebsd-current to
an older, well-behaved revision. 

Is there a mechanism for identifying revision numbers that
will at least compile and boot, by date? 

In my case buildworld seems to be markedly slower than, say,
six months ago. Maybe it's hardware, maybe something else. Is
there a way to pick a revision number to revert to, that's
better than merely guessing? 

Thanks for reading,

bob prohaska



Re: spurious out of swap kills

2019-09-14 Thread bob prohaska
On Fri, Sep 13, 2019 at 10:59:58PM -0700, Mark Millard wrote:
> bob prohaska fbsd at www.zefox.net wrote on
> Fri Sep 13 16:24:57 UTC 2019 :
> 
> > Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB
> > of swap the job completed successfully some months ago, with peak swap 
> > use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition
> > combined with a 4 GB partition. A little over 4GB total seems usable. 
> > 
> > A few days ago the same attempt stopped with a series of OOM kills,
> > but in each case simply restarting allowed the compile to pick up
> > where it left off and continue, eventually finishing with a runnable
> > version of chromium. In this case swap use peaked a little over 4 GB.
> > 
> > Might this suggest the machine isn't freeing swap in a timely manner?
> 
> Are you saying that your increases to:
> 
> vm.pageout_oom_seq
> 
> no longer prove sufficient? What value for vm.pageout_oom_seq were
> you using that got the recent failures?
> 
Correct. Initial value was 2048, later raised to 4096. Far as I could
tell the change didn't help. No explicit -j value was set for make, but
no more than four jobs were observed in top.

A log of storage activity along with swap total and the last two 
console messages is at
http://www.zefox.net/~fbsd/rpi3/swaptests/r351586/swapscript.log
along with a sorted list of total swap use, which can be used as
a sort of index to the log file. 

The initial "out of swap space" at the very beginning
is a relic from before logging started. 

da0 is a Sandisk SDCZ80 USB 3.0 device; mmcsd0 is a Samsung
Evo+ 128 GB device.

The two points of curiosity to me are:
1. Why did swap use increase from 3.5 GB months ago to 4.2 GB now?
2. Why does stopping and restarting make (which would seem to free
un-needed swap) allow the job to finish?

> If more or different configuration/tuning is required, I'm going to
> eventually want to learn about it as well.
> 
You will have some company.

Thanks for reading,

bob prohaska



Re: spurious out of swap kills

2019-09-13 Thread bob prohaska
Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB
of swap the job completed successfully some months ago, with peak swap 
use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition
combined with a 4 GB partition. A little over 4GB total seems usable. 

A few days ago the same attempt stopped with a series of OOM kills,
but in each case simply restarting allowed the compile to pick up
where it left off and continue, eventually finishing with a runnable
version of chromium. In this case swap use peaked a little over 4 GB.

Might this suggest the machine isn't freeing swap in a timely manner?

Thanks for reading,

bob prohaska



Firstboot behavior, was Re: New vm-image size is much smaller than previos

2019-05-04 Thread bob prohaska
[changed subject to follow drift of conversation]

On Sat, May 04, 2019 at 06:03:00AM -0700, Rodney W. Grimes wrote:
> 
> Do we even have install note(s) pages for these things, or a wiki page

Not that I know of.
 
> that documents it, or ? Working around /firstboot does not require
> a serial console, if you know about it ahead of time, you can even
^^
8-)

The statement is true.  The gymnastics required to
> mount the sd image up on another system, and remove firstboot if you
> want, or create a swap partition at the end of the device, make the
> boot partition use up the rest and then iirc growfs on firstboot does
> what you want. (Untested at this time, but that should just work.)
are far from trivial, even for experienced foot-shooters such as myself.

A Pi running Raspbian can download and write the FreeBSD image, but it
can't mount ufs to manipulate files. I'm not sure about Mac OS and Windows.
That's a likely starting scenario for potential users of FreeBSD on the Pi. 

AFAIK it's still necessary to boot single-user, set up the microSD (which is 
a considerable challenge using gpart unless one is in good practice) and then
let the system go to multi-user. Last time I checked, u-boot (or maybe it's
loader) couldn't read the USB keyboard to execute boot -s, so the system
essentially runs away from the user's control. I admit not having checked
in the last few months, but even if it's fixed, asking a new user to start
by using gpart is unlikely to encourage further exploration.

Thanks for your attention!

bob prohaska



Re: New vm-image size is much smaller than previos

2019-05-03 Thread bob prohaska
On Fri, May 03, 2019 at 07:39:00PM -0700, Rodney W. Grimes wrote:
> > On Fri, May 3, 2019, 7:42 PM bob prohaska  wrote:
> > 
> > > On Fri, May 03, 2019 at 11:06:15AM -0700, Rodney W. Grimes wrote:
> > > > -- Start of PGP signed section.
> > > > > On Fri, May 03, 2019 at 10:12:58AM -0700, Enji Cooper wrote:
> > > > > >
> > > > > > > > On May 3, 2019, at 9:57 AM, Alan Somers wrote:
> > > > > > >
> > > > > > > > See r346959.  Before first boot, you should expand the image up to
> > > > > > > > whatever size you want.  growfs(8) will automatically expand the
> > > > > > > > file system.
> > > > > > > -Alan
> > > > > > >
> > > > > > > > On Fri, May 3, 2019 at 10:32 AM David Boyd wrote:
> > > > > > >>
> > > > > > >> The vm-image for 13.0-CURRENT
> > > > > > >>
> > > > > > >> FreeBSD-13.0-CURRENT-amd64-20190503-r347033.vmdk
> > > > > > >>
> > > > > > >> is only 4.0 GB in size.  Previous images were about 31.0 GB.
> > > > > > >>
> > > > > > > >> This smaller image doesn't leave much room to add packages and
> > > > > > > >> other customizations.
> > > > > >
> > > > > > This probably deserves a release note.
> > > > >
> > > > > It will certainly be mentioned in the 11.3 release notes.
> > > >
> > > > And those running head snapshots without reading commit messages
> > > > are likely to have lots of foot shooting.
> > > >
> > > > > Glen
> > > > --
> > > > > Rod Grimes rgri...@freebsd.org
> > >
> > > At the risk of being branded a wishful thinker, a firstboot script that
> > > asked the user for some configuration information would be a great help
> > > to both new and experienced foot-shooters. I'm thinking of Raspberry Pi,
> > > but perhaps it applies to non-embedded platforms also.
> > >
> > 
> > That's not a bad idea... we could press bsdinstall into service for that
> > perhaps... we already expand the partition / filesystem to match the media
> > size...
> 
> As asommers already pointed out a) we already do this for real media
> like on the Raspberry Pis, etc., in that on first boot they do a
> growfs to fill the real media up with the file system.
> 

I misunderstood the significance of "vm-image", thinking it was
the same as a bootable microSD image. Apologies for the blunder.

My thoughts are about physical media. In that situation the default 
growfs on firstboot is a real handicap. It makes difficult any local 
customization of the microSD card, in particular adding a swap partition. 
A Pi2 is sort of usable without swap, a Pi3 is badly hampered with no swap. 

Having the existence of /firstboot trigger a configuration script 
that sets up swap, storage, accounts and network would be a great aid
to new users (and old users with imperfect memories). 

A man page for firstboot would be useful in any case. "What's that empty
file supposed to do?" is a very natural question. Unfortunately, by the
time the question is discovered it's too late to ask, and the user has
to start over. There are references to firstboot in man rc, but that's
a very hard way to answer a relatively simple question. Working around
/firstboot requires a serial console and considerable patience, at least
on a physical Raspberry Pi 2 or 3. 

Thanks for reading,  

bob prohaska





Re: New vm-image size is much smaller than previos

2019-05-03 Thread bob prohaska
On Fri, May 03, 2019 at 11:06:15AM -0700, Rodney W. Grimes wrote:
> -- Start of PGP signed section.
> > On Fri, May 03, 2019 at 10:12:58AM -0700, Enji Cooper wrote:
> > > 
> > > > On May 3, 2019, at 9:57 AM, Alan Somers  wrote:
> > > > 
> > > > See r346959.  Before first boot, you should expand the image up to
> > > > whatever size you want.  growfs(8) will automatically expand the file
> > > > system.
> > > > -Alan
> > > > 
> > > > On Fri, May 3, 2019 at 10:32 AM David Boyd  wrote:
> > > >> 
> > > >> The vm-image for 13.0-CURRENT
> > > >> 
> > > >> FreeBSD-13.0-CURRENT-amd64-20190503-r347033.vmdk
> > > >> 
> > > >> is only 4.0 GB in size.  Previous images were about 31.0 GB.
> > > >> 
> > > >> This smaller image doesn't leave much room to add packages and other
> > > >> customizations.
> > > 
> > > This probably deserves a release note.
> > 
> > It will certainly be mentioned in the 11.3 release notes.
> 
> And those running head snapshots without reading commit messages
> are likely to have lots of foot shooting.
> 
> > Glen
> -- 
> Rod Grimes rgri...@freebsd.org

At the risk of being branded a wishful thinker, a firstboot script that
asked the user for some configuration information would be a great help
to both new and experienced foot-shooters. I'm thinking of Raspberry Pi,
but perhaps it applies to non-embedded platforms also.

The original FreeBSD install program (the one by Jordan Hubbard) did a 
very serviceable job. Could it (the user interface) be resurrected?

Thanks for reading,

bob prohaska
 

 


conflict between revision numbers for update and info

2019-02-14 Thread bob prohaska
The other night I ran svnlite up on /usr/src, which ended with
Updated to revision 344015

Somewhat later I noticed that uname -a reported the same revision, which
seemed odd, since buildworld/buildkernel were still in progress.

The next day I ran svnlite info /usr/src, which reported
Revision: 344113

Any idea what's going on?

Thanks for reading

bob prohaska



Re: CFT: TRIM Consolodation on UFS/FFS filesystems

2018-08-22 Thread bob prohaska
On Tue, Aug 21, 2018 at 06:47:19PM -0700, Mark Millard wrote:
> 
> I've used a SSD both directly via SATA and via a USB enclosure,
> the same partitions/file systems across the uses. Only when it
> was SATA-style-use did TRIM work.
> 
This is likely the key to my question. If USB blocks the TRIM service 
the behavior of the device doesn't matter. 

As an aside, Sandisk now says:
"Please be informed that we have not tested running TRIM commands on USB flash
drive and microSD cards therefore we would not be able to comment on it
explicitly."

Thanks for reading,

bob prohaska



Re: CFT: TRIM Consolodation on UFS/FFS filesystems

2018-08-21 Thread bob prohaska
On Mon, Aug 20, 2018 at 12:40:56PM -0700, Kirk McKusick wrote:
> I have recently added TRIM consolodation support for the UFS/FFS
> filesystem. This feature consolodates large numbers of TRIM commands
> into a much smaller number of commands covering larger blocks of
> disk space. Best described by the commit message:
> 
>   Author: mckusick
>   Date: Sun Aug 19 16:56:42 2018
>   New Revision: 338056
>   URL: https://svnweb.freebsd.org/changeset/base/338056
> 
>   Log:
> Add consolodation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.
> 
> When deleting files on filesystems that are stored on flash-memory
> (solid-state) disk drives, the filesystem notifies the underlying
> disk of the blocks that it is no longer using. The notification
> allows the drive to avoid saving these blocks when it needs to
> flash (zero out) one of its flash pages. These notifications of
> no-longer-being-used blocks are referred to as TRIM notifications.
> In FreeBSD these TRIM notifications are sent from the filesystem
> to the drive using the BIO_DELETE command.
> 
> Until now, the filesystem would send a separate message to the drive
> for each block of the file that was deleted. Each Gigabyte of file
> size resulted in over 3000 TRIM messages being sent to the drive.
> This burst of messages can overwhelm the drive's task queue causing
> multiple second delays for read and write requests.
> 
> This implementation collects runs of contiguous blocks in the file
> and then consolodates them into a single BIO_DELETE command to the
> drive. The BIO_DELETE command describes the run of blocks as a
> single large block being deleted. Each Gigabyte of file size can
> result in as few as two BIO_DELETE commands and is typically less
> than ten.  Though these larger BIO_DELETE commands take longer to
> run, they do not clog the drive task queue, so read and write
> commands can intersperse effectively with them.
> 
> Though this new feature has been throughly reviewed and tested, it
> is being added disabled by default so as to minimize the possibility
> of disrupting the upcoming 12.0 release. It can be enabled by running
> ``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
> If no problems arise, we will consider requesting that it be enabled
> by default for 12.0.
> 
> Reviewed by:  kib
> Tested by:Peter Holm
> Sponsored by: Netflix
> 
> This support is off by default, but I am hoping that I can get enough
> testing to ensure that it (a) works, and (b) is helpful that it will
> be reasonable to have it turned on by default in 12.0. The cutoff for
> turning it on by default in 12.0 is September 19th. So I am requesting
> your testing feedback in the near-term. Please let me know if you have
> managed to use it successfully (or not) and also if it provided any
> performance difference (good or bad).
> 
> To enable TRIM consolodation either use `sysctl vfs.ffs.dotrimcons=1'
> or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.
> 
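
The run-collecting idea in the commit message can be pictured with a small sketch. This is Python for illustration only and is not the kernel code in sys/ufs/ffs/ffs_alloc.c; the function name and the block numbers are made up. It simply turns a list of freed block numbers into (start, count) runs, so each run could be reported with one delete request instead of one request per block.

```python
def coalesce_runs(blocks):
    """Merge block numbers into (start, count) runs of contiguous blocks."""
    runs = []
    for block in sorted(set(blocks)):
        if runs and block == runs[-1][0] + runs[-1][1]:
            # The block directly follows the current run: extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # A gap (or the first block): start a new run.
            runs.append((block, 1))
    return runs


# Hypothetical freed blocks: seven blocks collapse into three runs.
freed = [100, 101, 102, 103, 200, 201, 500]
print(coalesce_runs(freed))  # [(100, 4), (200, 2), (500, 1)]
```

The fewer gaps a file has on disk, the fewer runs come out, which matches the commit message's remark that a large file can need only a handful of commands.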

Will the new feature be active on a Raspberry Pi 3 using flash
on microSD and USB for file systems and swap? 

Can the feature be turned on using one of the conf files in /etc? 


According to Sandisk,
"All microSD or USB drives are flash memory and does support the TRIM command,
however, you will not notice any difference after running TRIM command on
memory cards or USB drives. TRIM command is basically used for SSD and Hard
drives."

The "you will not notice any difference" qualification makes me slightly
uncertain the reply was well-informed, but if there's any hope of success
I'd like to try it. From time to time there seem to be traffic jams among
flash devices on the RPI3; it would be a pleasant surprise if this feature
helps.

Thanks for reading!

bob prohaska



Re: building LLVM threads gets killed

2018-08-20 Thread bob prohaska
On Mon, Aug 20, 2018 at 07:33:32PM +0200, Dimitry Andric wrote:
> On 20 Aug 2018, at 16:26, Rodney W. Grimes wrote:
> > 
> >> It is running out of RAM while running multiple parallel link jobs.  If
> >> you are building using WITH_DEBUG, turn that off, it consumes large
> >> amounts of memory.  If you must have debug info, try adding the
> >> following flag to the CMake command line:
> >> 
> >> -D LLVM_PARALLEL_LINK_JOBS:STRING="1"
> >> 
> >> That will limit the amount of parallel link jobs to 1, even if you
> >> specify -j 8 to gmake or ninja.
> >> 
> >> Brooks, it would not be a bad idea to always use this CMake flag in the
> >> llvm ports. :)
> > 
> > And this may also fix the issues that all the small
> > memory (aka, RPI*) builders are facing when trying
> > to do -j4?
> 
> Possibly, as linking is usually the most memory-consuming part of the
> build process (and more so, if debugging is enabled).  Are there build
> logs available somewhere for those RPI builders?
> 

There is a collection of RPI3 buildworld logs in
http://www.zefox.net/~fbsd/rpi3/swaptests/
The more recent experiments are sorted by revision first, then swap config
and then other modifications.

If I can do anything to make the records more useful please let me know.

hth,

bob prohaska
 




Re: ntpd as ntpd user question

2018-07-24 Thread bob prohaska
On Mon, Jul 23, 2018 at 09:28:41PM -0700, Kevin Oberman wrote:
> On Mon, Jul 23, 2018 at 7:25 PM, Ian Lepore  wrote:
> 
> > On Mon, 2018-07-23 at 18:54 -0700, bob prohaska wrote:
> > > On Mon, Jul 23, 2018 at 09:34:26PM +0200, Herbert J. Skuhra wrote:
> > > >
> > > >
> > > > Yes, first you press m. Then you will see differences of installed
> > > > file (left) and new file (right). Then you press either l or
> > > > r:
> > > >
> > > > l | 1:  choose left diff
> > > > r | 2:  choose right diff
> > > >
> > > > > If the diff tries to remove/add too many lines you can:
> > > >
> > > > el: edit left diff
> > > > er: edit right diff
> > > >
> > > > And if done you can view the merged file (v) before installing (i)
> > > > it.
> > > >
> > > > I am sure, someone can explain it better! :)
> > > >
> > > > Perhaps, but you've made the essential point. Your reply let me
> > > > understand that mergemaster does not really "master" the merge, it
> > > > rather identifies files needing to be merged and then starts sdiff
> > > > to let me modify files. Never having even looked at sdiff, the
> > > > learning curve proved very steep. Too steep, in fact.
> > > I'm going to try a more incremental approach.
> > >
> > > Thank you _very_ much!
> > >
> > > bob prohaska
> >
> > Your reaction to mergemaster is about the same as mine was when I first
> > encountered it very long ago, and re-discovered when I tried it a
> > couple years ago. It just seems like more trouble than it's worth, I
> > can usually figure out what's broken and fix it by hand faster than
> > messing with all the merge stuff.
> >
> > But, someone told me that if you give mergemaster the right flags it
> > can potentially be intervention-free. Those apparently aren't the flag
> > or two that're suggested at the bottom of UPDATING. So I didn't really
> > dig into that any deeper, but I toss it out there in case someone can
> > expand on it.
> >
> > It certainly makes some sense that it could be done intervention-free.
> > When doing other diff-based merges (like 'svn update') you only have to
> > intervene when there's an actual conflict between some local change
> > you've made and the incoming changes.
> >
> >
> It gets a LOT simpler if you use "mergemaster -iPUF" Only those files you
> have modified will show up. In most cases, it just zips right by. In most
> that it does not, the use of 'r' or 'l' in merge is all you need and always
> 'r' except on lines you have modified, yourself, so you should know about
> them.
> 
I realize your comments are directed to Ian and not me, so please take these
$.02 for no more than they're worth. 

My problems with mergemaster are _not_ with mergemaster. They're with sdiff.
The window presented, along with the prompts, are simply bewildering. I suspect
that somebody truly fluent with vi would recognize what's going on at once and
have no trouble. I've used vi for a long time, but only in the most naive way,
and sdiff's man page is little help for a newcomer. Even a Web search for
tutorials found nothing very useful, at least not quickly.

A plain-language description of what sdiff does and how might make the minutiae
of the man page comprehensible to non-experts.
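
For what it's worth, the l/r choice described earlier in the thread can be modeled in a few lines. This is a Python sketch of the idea only, not how sdiff is implemented; the file contents and the function name are invented for illustration. Lines common to both files are kept as they are, and for every stretch that differs a callback answers "l" (keep the installed side) or "r" (take the new side).

```python
import difflib


def merge_by_choice(left, right, choose):
    """Build a merged file: shared lines pass through, and each differing
    hunk is taken from the left or right side as `choose` decides."""
    merged = []
    matcher = difflib.SequenceMatcher(a=left, b=right, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(left[i1:i2])
        elif choose(left[i1:i2], right[j1:j2]) == "l":
            merged.extend(left[i1:i2])
        else:
            merged.extend(right[j1:j2])
    return merged


# Hypothetical contents: the installed file has a local entry,
# the new file has an entry the update wants to add.
installed = ["root", "daemon", "bob"]
incoming = ["root", "daemon", "ntpd"]

# Always answering "r" reproduces the incoming file and drops the local entry.
print(merge_by_choice(installed, incoming, lambda l, r: "r"))
```

The example also hints at why the edit options exist: when one hunk holds both a local change and a wanted new line, neither a plain "l" nor a plain "r" keeps both.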

Apologies if I'm belaboring the obvious, and thanks for reading!

bob prohaska



Re: ntpd as ntpd user question

2018-07-23 Thread bob prohaska
On Mon, Jul 23, 2018 at 08:25:59PM -0600, Ian Lepore wrote:
> On Mon, 2018-07-23 at 18:54 -0700, bob prohaska wrote:
> > at sdiff, the learning curve proved very steep. Too steep, in fact.
> > 
> > I'm going to try a more incremental approach.
> > 
> > Thank you _very_ much!
> > 
> > bob prohaska
> 
> Your reaction to mergemaster is about the same as mine was when I first
> encountered it very long ago, and re-discovered when I tried it a
> couple years ago. It just seems like more trouble than it's worth, I
> can usually figure out what's broken and fix it by hand faster than
> messing with all the merge stuff.
> 
Your suggestion to use vipw seems to have worked. Copied the required
line, ran /usr/sbin/pwd_mkdb -p /etc/master.passwd and installworld ran
without issue.

The machine has now rebooted and ntp has set the clock correctly. 
I don't see ntpd in a ps -aux output.

It's unclear what I need to do next, but at least I'm over the first
hurdle. I'll go back to your earlier email and attempt the rest of the
updates by hand.

Thanks for all your help!

bob prohaska


> But, someone told me that if you give mergemaster the right flags it
> can potentially be intervention-free. Those apparently aren't the flag
> or two that're suggested at the bottom of UPDATING. So I didn't really
> dig into that any deeper, but I toss it out there in case someone can
> expand on it.
> 
> It certainly makes some sense that it could be done intervention-free.
> When doing other diff-based merges (like 'svn update') you only have to
> intervene when there's an actual conflict between some local change
> you've made and the incoming changes.
> 
> -- Ian
> 


Re: ntpd as ntpd user question

2018-07-23 Thread bob prohaska
On Mon, Jul 23, 2018 at 09:34:26PM +0200, Herbert J. Skuhra wrote:
> 
> Yes, first you press m. Then you will see differences of installed
> file (left) and new file (right). Then you press either l or
> r:
> 
> l | 1:  choose left diff
> r | 2:  choose right diff
> 
> If the diff tries to remove/add too many lines you can:
> 
> el:   edit left diff
> er:   edit right diff
> 
> And if done you can view the merged file (v) before installing (i) it.
> 
> I am sure, someone can explain it better! :)
> 

Perhaps, but you've made the essential point. Your reply let me understand
that mergemaster does not really "master" the merge, it rather identifies
files needing to be merged and then starts sdiff to let me modify files.
Never having even looked at sdiff, the learning curve proved very steep.
Too steep, in fact.

I'm going to try a more incremental approach. 

Thank you _very_ much!

bob prohaska
 




Re: ntpd as ntpd user question

2018-07-22 Thread bob prohaska
On Sun, Jul 22, 2018 at 01:49:41AM +0200, Herbert J. Skuhra wrote:
> On Sat, Jul 21, 2018 at 03:09:26PM -0700, bob prohaska wrote:

> > The failure is a little surprising, is ntpd a reserved name?
> 
> Why? You obviously entered the string "ntpd" instead of an integer when
> asked for the uid!?
> 

Sigh...you're right. Must have been sleepier than I thought.

> > The machine is re-running buildworld/installworld from a clean start,
> > so presumably it'll halt over the same error again. When that happens, 
> > what's the simplest way to recover? Mergemaster is a big hammer, something
> > less comprehensive might suffice, even manual editing of files.
> 
> In this case 'mergemaster -p' is enough.
> 

An example or two on the use of mergemaster might be a considerable help.
There's something very basic that I don't understand.

What is the correct response to the prompts for this simple case? The output
displayed is said to be differences, so the "temporary" file's nature isn't
self-evident. It looks as if the most obvious option is m, followed by eb, but
that leaves me editing by hand, which is what I thought mergemaster was
supposed to avoid.


Thanks for reading, and apologies for being dense.

bob prohaska




Re: ntpd as ntpd user question

2018-07-21 Thread bob prohaska
On Sat, Jul 21, 2018 at 12:14:10PM -0600, Ian Lepore wrote:
> 
> I can't see any way that installkernel would lead to the complaint
> about the ntpd user not existing; that check is tied to the
> installworld target.
> 
My mistake. I was sleepy and in a hurry. The error message was in installworld
and my attempt to adduser ntpd concluded with an error:
Locked : yes
OK? (yes/no): yes
pw: Bad id 'ntpd': invalid
adduser: ERROR: There was an error adding user (ntpd).
On reboot the old ntpd set the clock and I thought all was well.

The failure is a little surprising, is ntpd a reserved name?

The machine is re-running buildworld/installworld from a clean start,
so presumably it'll halt over the same error again. When that happens, 
what's the simplest way to recover? Mergemaster is a big hammer, something
less comprehensive might suffice, even manual editing of files.  

There's minimal customization on the machine, basically /etc/fstab, 
/etc/rc.conf and /etc/passwd. Nothing else of real value, so if I kill 
it in the attempt it won't be a disaster.


Thanks for waking me to my blunder...

bob prohaska
 


Re: ntpd as ntpd user question

2018-07-21 Thread bob prohaska
On Sat, Jul 21, 2018 at 11:14:45AM -0600, Ian Lepore wrote:
> 
> There's a "pre-world" stage of mergemaster (-Fp option I think) which
> isn't needed often, but one of the times it is needed is apparently
> when new user ids are added. (So I've been told, I've never much used
> mergemaster myself). I think there are some words about it at the very
> bottom of UPDATING.
> 

FWIW, installkernel stopped with the note about needing an ntpd user/group.
Never having been successful with mergemaster (couldn't make heads nor tails
of the "what to do" prompts) I just ran adduser, creating a locked ntpd user
and group. Nothing else special done. The machine is up to r336567 on arm64.

Installkernel ran, I didn't touch anything in /etc manually and reboot looked
normal. For now it seems ignorance is bliss.

If there's something special I should do (beyond locking) to secure the ntpd
account please warn me.

Thanks for reading,

bob prohaska



Re: use of undeclared identifier 'DW_LANG_C11'

2018-06-09 Thread bob prohaska
On Sat, Jun 09, 2018 at 04:13:08PM -0400, Mark Johnston wrote:
> On Sat, Jun 09, 2018 at 01:07:24PM -0700, bob prohaska wrote:
> > I'm seeing persistent 
> > --- dwarf.o ---
> > /usr/src/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c:1980:8: error: use of undeclared identifier 'DW_LANG_C11'
> > case DW_LANG_C11:
> >  ^
> > errors very early in buildworld attempts on 334890 
> > 
> > I've tried "make clean" in /usr/src to no avail, is there something else
> > to try, or should I just wait for a source update?
> 
> Please give r334892 a try.

Thank you,  

bob prohaska



use of undeclared identifier 'DW_LANG_C11'

2018-06-09 Thread bob prohaska
I'm seeing persistent 
--- dwarf.o ---
/usr/src/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c:1980:8: error: use of undeclared identifier 'DW_LANG_C11'
case DW_LANG_C11:
 ^
errors very early in buildworld attempts on 334890 

I've tried "make clean" in /usr/src to no avail, is there something else
to try, or should I just wait for a source update?

Thanks for reading,

bob prohaska




Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137

2018-06-07 Thread bob prohaska
On Thu, Jun 07, 2018 at 05:13:45PM -0700, bob prohaska wrote:
> 
> I'll try again, this time with USB swap turned off.

The circle closed, back to the original panic in the subject line.
Console, top and buildworld.log files are at 
http://www.zefox.net/~fbsd/rpi3/crashes/20180607

Thanks for reading,

bob prohaska



Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137

2018-06-07 Thread bob prohaska
On Wed, Jun 06, 2018 at 10:28:58PM -0700, Mark Millard wrote:
> 
> Looks like there has been another stab at avoiding some
> unnecessary Out Of Memory killing of processes:
> 
> Author: alc
> Date: Thu Jun  7 02:54:11 2018
> New Revision: 334752
> URL: https://svnweb.freebsd.org/changeset/base/334752
> 
> 
> Log:
>   . . .  One visible
>   effect of this error was that processes were being killed by the
>   virtual memory system's OOM killer when in fact there was plentiful
>   free memory.
> 

An RPI3 kernel at 334800 still reported 
 Jun  7 16:28:21 www kernel: pid 71329 (c++), uid 0, was killed: out of swap space

during a -j4 buildworld.
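
When a long build produces many of these, it can help to pull the fields out of the console log mechanically. The following is a minimal Python sketch, assuming the message always has the shape shown above; the function name and the returned field names are made up for illustration.

```python
import re

# Matches the tail of the kernel message shown above.
KILL_RE = re.compile(r"pid (\d+) \(([^)]+)\), uid (\d+), was killed: (.+)$")


def parse_kill_message(line):
    """Return pid, command, uid and reason from a 'was killed' line, or None."""
    match = KILL_RE.search(line)
    if match is None:
        return None
    pid, command, uid, reason = match.groups()
    return {
        "pid": int(pid),
        "command": command,
        "uid": int(uid),
        "reason": reason.strip(),
    }


line = ("Jun  7 16:28:21 www kernel: pid 71329 (c++), uid 0, "
        "was killed: out of swap space")
print(parse_kill_message(line))
```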

I wasn't watching top at the time, so I don't know how much swap was in
use. Total available was 4 GB, which certainly seems like it ought to be
enough. The swap was on both microSD and USB flash.

I've run make clean in /usr/src/lib/clang/libllvm and restarted
a -j4 buildworld with the -DNO_CLEAN option, and also set
sysctl vm.pageout_update_period=0 to see what would happen.

Within a few minutes buildworld stopped, the tail of the log file
contained


--- X86GenEVEX2VEXTables.inc ---
llvm-tblgen -gen-x86-EVEX2VEX-tables  -I /usr/src/contrib/llvm/include -I 
/usr/src/contrib/llvm/lib/Target/X86  -d X86GenEVEX2VEXTables.inc.d -o 
X86GenEVEX2VEXTables.inc  /usr/src/contrib/llvm/lib/Target/X86/X86.td
--- X86GenFastISel.inc ---
llvm-tblgen -gen-fast-isel  -I /usr/src/contrib/llvm/include -I 
/usr/src/contrib/llvm/lib/Target/X86  -d X86GenFastISel.inc.d -o 
X86GenFastISel.inc  /usr/src/contrib/llvm/lib/Target/X86/X86.td
--- X86GenDAGISel.inc ---
Killed
*** [X86GenDAGISel.inc] Error code 137

make[6]: stopped in /usr/src/lib/clang/libllvm
1 error

make[6]: stopped in /usr/src/lib/clang/libllvm
*** [all_subdir_lib/clang/libllvm] Error code 2

make[5]: stopped in /usr/src/lib/clang
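
As a side note on reading that log: by the usual shell convention an exit status above 128 means the process was terminated by a signal, with the signal number being the status minus 128. So "Error code 137" together with the "Killed" line points at signal 9, SIGKILL, which is consistent with the process having been killed from outside rather than failing on its own. A small Python sketch of that arithmetic, with a made-up function name:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited with status {status}"


print(describe_exit_status(137))
```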

I'll try again, this time with USB swap turned off.



Thanks for reading!

bob prohaska



Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137

2018-06-06 Thread bob prohaska
On Wed, Jun 06, 2018 at 08:55:39PM +0200, Ronald Klop wrote:
> On Sat, 02 Jun 2018 13:40:27 +0200, Ronald Klop wrote:
> 
> 
> How do you ever run a -j4 buildworld? My RPI3 starts building clang/llvm  
> with sometimes 500 MB+ per process so everything starts swapping like hell  
> and takes forever to run.
> 

Lately, never 8-)

When I started playing with an RPI3, in late 2016, -j4 buildworlds 
worked usably well.  Early in 2018 problems appeared, including  
Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed, among others.

Things didn't really go to pot until somewhat later when the swap frenzy 
issue reared its head and haven't improved much.

Sadly, when the swap frenzy workaround of using
sysctl vm.pageout_update_period=0 was suggested,
a -j4 buildworld then reverted to the old td_lock
issue, so it looks as if both bugs are alive and
kicking.

Just to complicate matters, I was in the habit of
using a USB flash drive as both an outboard file
system (/usr/, /var/ and /tmp/) and as a swap device.
A very common reaction was to blame the flash device
for the trouble, though so far as I can tell a Sandisk
Extreme USB flash drive isn't much slower, if any, than
a mechanical hard disk for random writes. The same USB
flash devices on a Pi2 running 11-Stable seem to be fine.

However, turning off the USB flash swap device does seem
to reduce the number of "indefinite wait buffer" messages
on the console (they're usually not fatal) so I think there
is still something amiss. Whether it's the flash, the USB
or the VM system is unclear to me.

For now the workarounds are to run buildworld with no explicit
-j value (presumably equivalent to -j1), to use only swap on
the microSD card and to use the -DNO_CLEAN option for most
buildworld sessions, doing an explicit "make clean" or
"rm -rf /usr/obj/usr" when necessary. In a few cases it
seemed helpful to start with "make kernel-toolchain" then
follow with "make -DNO_CLEAN buildworld" but I didn't keep
good enough records to be certain of the benefits. 
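
Put together, the sequence looks roughly like the sketch below. It only
prints the commands rather than running them. The commands themselves are
the ones named above; the function name and the "clean"/"keep" argument
are illustrative, and /usr/src is assumed as the source location.

```shell
#!/bin/sh
# Print (do not run) the build steps for the workaround described above.
# $1 = "clean" to remove the object tree first; anything else keeps it
# and relies on -DNO_CLEAN to resume where the last build stopped.
plan_build() {
    if [ "$1" = "clean" ]; then
        echo "rm -rf /usr/obj/usr"
    fi
    echo "cd /usr/src"
    echo "make kernel-toolchain"
    # No -j value is given, so make runs one job at a time.
    echo "make -DNO_CLEAN buildworld"
}

plan_build keep
```

Piping the output to sh would run the steps. Printing them first makes it
easier to keep the records I didn't keep.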

Apologies for the length, HTH 

bob prohaska



Re: Module compiles looking in /usr/src when alternate src tree is in use [actually the arm_neon.h and stdint.h issue]

2018-04-09 Thread bob prohaska
On Mon, Apr 09, 2018 at 06:04:24AM -0700, Mark Millard wrote:
> On 2018-Apr-8, at 10:08 PM, bob prohaska  wrote:
> >> . . .
> > On my RPi3 
> > root@www:/usr/src # ls -l /usr/lib/clang/6.0.0/include/stdint.h
> > -rw-r--r--  1 root  wheel  23387 Feb 16 07:37 
> > /usr/lib/clang/6.0.0/include/stdint.h
> > 
> > Every other file in that directory is dated January 22nd.
> > 
> > 
> >> . . .
> > 
> > Looks like it's close enough 8-)
> > Removing /usr/lib/clang/6.0.0/include/stdint.h has allowed make kernel
> > to proceed past its former point of failure. 
> > 
> 
> Looks like you copied the file there. Its presence is not a
> build problem. See below.
> 
> From Feb 16 Email from you:
> 
> From: bob prohaska 
> Subject: Re: RPI3 can't build kernel-toolchain
> Date: February 16, 2018 at 9:09:27 AM PST
> To: Mark Millard 
> Cc: freebsd-arm at freebsd.org, bob prohaska 
> . . .
> Running
> cp ./contrib/llvm/tools/clang/lib/Headers/stdint.h 
> /usr/lib/clang/6.0.0/include
> didn't solve the problem. 
> 


I remembered the experiment that worked, and forgot the one that didn't.

Thank you!

bob prohaska



Re: Module compiles looking in /usr/src when alternate src tree is in use [actually the arm_neon.h and stdint.h issue]

2018-04-08 Thread bob prohaska
On Sun, Apr 08, 2018 at 09:51:19PM -0700, Mark Millard wrote:
> Rodney W. Grimes freebsd-rwg at pdx.rh.CN85.dnsmgr.net wrote on
> Mon Apr 9 03:54:50 UTC 2018 :
> 
> > Something for some reason included arm_neon.h?
> 
> 
> # grep -r arm_neon.h /usr/src/sys/ | more
> /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:#include <arm_neon.h>
> 
> arm_neon.h is something that the kernel source itself has a reference
> to. [But the stdint.h that was in the error messages was found where
> it should not exist as far as I can tell, see below.]
> 
> # find /usr/src -name .svn -prune -o -name arm_neon.h -print
> 
> finds nothing. But . . .
> 
> # find /usr/lib -name arm_neon.h -print
> /usr/lib/clang/6.0.0/include/arm_neon.h
> 
> This matches the error message report and is the only
> copy around in the system areas to find. (Ignoring
> ports materials and /usr/local/ .)
> 
> In turn that arm_neon.h has:
> 
> # grep stdint.h /usr/lib/clang/6.0.0/include/arm_neon.h
> #include <stdint.h>
> 
> Looking in a tree that I have (from an amd64 -> arm64 cross
> build for what is a Cortex-A53 intended use):
> 
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/
> 
> where I did an installworld for arm64:
> 
> # find /usr/obj/DESTDIRs/clang-cortexA53-installworld -name stdint.h
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/c++/v1/stdint.h
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/c++/v1/tr1/stdint.h
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/sys/stdint.h
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/include/stdint.h
> 
> There is no stdint.h under that tree's /usr/lib/ area:
> 
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/lib/
> 
> was not listed anywhere.
> 
> For reference relative to arm_neon.h and this tree:
> 
> # find /usr/obj/DESTDIRs/clang-cortexA53-installworld -name arm_neon.h | more
> /usr/obj/DESTDIRs/clang-cortexA53-installworld/usr/lib/clang/6.0.0/include/arm_neon.h
> 
> I conclude that:
> 
> /usr/lib/clang/6.0.0/include/stdint.h
> 
> should not have been created in the first place.
> 
> [Does that stdint.h have file-system dates/times matching
> the other files from the build? Or does it look to be
> mismatched and possibly just needs to be deleted?]
> 
On my RPi3 
root@www:/usr/src # ls -l /usr/lib/clang/6.0.0/include/stdint.h
-rw-r--r--  1 root  wheel  23387 Feb 16 07:37 /usr/lib/clang/6.0.0/include/stdint.h

Every other file in that directory is dated January 22nd.
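
A quicker way to spot such an odd one out than reading ls output is to ask
find for anything newer than a file known to come from the install. This
is a sketch; the function name is made up, and using arm_neon.h as the
reference file is my assumption.

```shell
#!/bin/sh
# List regular files under directory $1 that were modified more recently
# than reference file $2. Files installed together normally share a
# timestamp, so anything listed was probably added or changed later.
newer_than() {
    find "$1" -type f -newer "$2" -print
}

# Example (the reference file is an assumption):
#   newer_than /usr/lib/clang/6.0.0/include /usr/lib/clang/6.0.0/include/arm_neon.h
```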



> 
> For reference, all the above is based on source for head -r332293:
> 
> # uname -apKU
> FreeBSD FBSDFSSD 12.0-CURRENT FreeBSD 12.0-CURRENT  r332293M  amd64 amd64 
> 1200061 1200061
> 
> # svnlite info /usr/src | grep "Re[plv]"
> Relative URL: ^/head
> Repository Root: svn://svn.freebsd.org/base
> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> Revision: 332293
> Last Changed Rev: 332293
> 
> 
> I do not have an arm64 system that is anywhere near up to
> date at this time so the above evidence is not from a
> self-hosted build: My context is not a full-match.
> 
Looks like it's close enough 8-)
Removing /usr/lib/clang/6.0.0/include/stdint.h has allowed make kernel
to proceed past its former point of failure. 

Thank you!

bob prohaska



Re: Module compiles looking in /usr/src when alternate src tree is in use

2018-04-08 Thread bob prohaska
On Sun, Apr 08, 2018 at 05:40:55PM -0700, Rodney W. Grimes wrote:
> > On Sun, Apr 08, 2018 at 12:00:52PM -0700, Rodney W. Grimes wrote:
> > > I am having a compile time issue for a patch that compiled fine on my
> > > r329294 system, but now fails to compile with what looks like a wrong
> > > header being included.
> > > 
> > Might this be a cousin to the problem reported at
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227274 ?
> > 
> > In that kernel compile (on an RPi3) the compiler complains
> > 
> > In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46:
> > In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31:
> > /usr/lib/clang/6.0.0/include/stdint.h:228:25: error: typedef redefinition 
> > with different types ('int16_t' (aka 'short') vs '__int_fast16_t' (aka 
> > 'int'))
> > typedef __int_least16_t int_fast16_t;
> > 
> > The reference to /usr/lib/clang/... seems a bit strange; isn't a major 
> > purpose of the kernel build procedure to minimize reliance on the
> > host system's (already-stale) software?
> 
> Are you building in /usr/src, or are your sources located some place else?
> 
This is a straightforward self-hosted build on an RPi3. Sources are in
/usr/src. There are no modifications to the source directories. 



> Really need the log that includes the cc command line, as that has the
> telltale -I/usr/src/sys in it.  That component is totally bogus!  At
> no time should a src tree rooted at /usr/src-topo be trying to use files
> from /usr/src/.
> 
Should files _outside_ /usr/src or /usr/obj _ever_ be referenced during
a world or kernel build? I thought the answer was "no".  

The line leading up to the error message is:

--- armv8_crypto_wrap.o ---
cc -target aarch64-unknown-freebsd12.0 --sysroot=/usr/obj/usr/src/arm64.aarch6
4/tmp -B/usr/obj/usr/src/arm64.aarch64/tmp/usr/bin -c -O3 -pipe -fno-strict-al
iasing -Werror -D_KERNEL -DKLD_MODULE -DHAVE_KERNEL_OPTION_HEADERS -include /u
sr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG/opt_global.h -I. -I/usr/src/s
ys -fno-common -g -fPIC -I/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG -
ffixed-x18 -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundan
t-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-ar
ith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kpri
ntf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -W
no-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equ
ality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-nega
tive-value -Wno-error-address-of-packed-member -std=iso9899:1999  -Werror   -m
arch=armv8-a+crypto /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c
In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46:
In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31:

There's a "-I/usr/src/sys" in the fourth line, which in my case makes sense,
but where does the reference to /usr/lib/clang/ come from, and is it 
appropriate?
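
One way to see which directories the compiler itself searches, apart from
the -I options on the command line, is its verbose preprocessor output. As
far as I know, clang keeps builtin headers such as arm_neon.h in its own
resource directory, so a /usr/lib/clang/... path may be expected. The
sketch below only filters that output; the function name is made up.

```shell
#!/bin/sh
# Extract the include search directories from the verbose output of
# `cc -v -E -x c /dev/null`, read on stdin. The directories are the
# lines between the two marker lines that clang and gcc print.
include_dirs() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p' |
        sed -e '1d' -e '$d' -e 's/^[[:space:]]*//'
}

# Example use (the compiler writes this report to stderr):
#   cc -v -E -x c /dev/null 2>&1 | include_dirs
#   cc -print-resource-dir     # clang only: parent of its builtin include dir
```

Adding the same --sysroot and -B options as in the failing command would
show whether they change the list.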

> > If the two problems are related, should the subject line on the bug
> > report be changed?
> 
> It could be, but more info would be needed.
> 
Please let me know what additional information is needed. 

Thanks for reading,

bob prohaska



Re: Module compiles looking in /usr/src when alternate src tree is in use

2018-04-08 Thread bob prohaska
On Sun, Apr 08, 2018 at 12:00:52PM -0700, Rodney W. Grimes wrote:
> I am having a compile time issue for a patch that compiled fine on my
> r329294 system, but now fails to compile with what looks like a wrong
> header being included.
> 
Might this be a cousin to the problem reported at
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227274 ?

In that kernel compile (on an RPi3) the compiler complains

In file included from /usr/src/sys/crypto/armv8/armv8_crypto_wrap.c:46:
In file included from /usr/lib/clang/6.0.0/include/arm_neon.h:31:
/usr/lib/clang/6.0.0/include/stdint.h:228:25: error: typedef redefinition with 
different types ('int16_t' (aka 'short') vs '__int_fast16_t' (aka 'int'))
typedef __int_least16_t int_fast16_t;

The reference to /usr/lib/clang/... seems a bit strange; isn't a major 
purpose of the kernel build procedure to minimize reliance on the
host system's (already-stale) software?

If the two problems are related, should the subject line on the bug
report be changed?

Thanks for reading,

bob prohaska


