Re: Kerberos doc needs an update

2022-12-27 Thread Larry Rosenman



Can you gen a PR with a patch?  I'm sure the doc folks would appreciate 
it.


On 12/27/2022 6:34 pm, Rick Macklem wrote:


Hi,

I just set up a KDC, which is easy once you realize
that Sec. 14.5 of the FreeBSD handbook is out of date.
(I was a dummy and spent several hours installing stuff
from ports before I realized it was all in the system,
but the startup file in /etc/rc.d is called "kdc" and
not "kerberos".)

In 14.5.1, kerberos5_server_enable and kadmind5_server_enable
have been renamed, although these old names still work.

Further down in 14.5.1, it says "service kerberos start",
which doesn't work. It is now "service kdc start".
(This was the one that sent me on a wild goose chase.;-)

Maybe someone that can do so could patch the handbook?

Thanks, rick
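
For anyone following along, the change Rick describes might look roughly like the fragment below. This is only a sketch: the thread says the old variables were renamed but does not give the new names, so kdc_enable and kadmind_enable are an assumption here (they match the rc.d script name "kdc" that Rick mentions). rc.conf is read as plain sh variable assignments, so the fragment is valid shell.

```shell
# /etc/rc.conf fragment. The names kdc_enable and kadmind_enable are an
# assumption; the thread only says that the old names were renamed and
# that the old names still work.
kdc_enable="YES"        # older name: kerberos5_server_enable
kadmind_enable="YES"    # older name: kadmind5_server_enable

# Then start the KDC with the command Rick reports as working:
#   service kdc start
```

The service command is left as a comment so that the fragment can be pasted into rc.conf as-is.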


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

BUILD BREAK:

2022-10-25 Thread Larry Rosenman
Building 
/usr/obj/usr/src/amd64.amd64/tests/sys/net/routing/test_rtsock_l3.debug

--- all_subdir_tests/sys/kern ---
--- subr_physmem_test ---
--- subr_physmem_test.o ---
In file included from /usr/src/tests/sys/kern/subr_physmem_test.c:34:
/usr/obj/usr/src/amd64.amd64/tmp/usr/include/sys/physmem.h:57:1: error: unknown type name 'bool'
bool physmem_excluded(vm_paddr_t pa, vm_size_t sz);
^



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Build Break?

2022-10-02 Thread Larry Rosenman



On 10/02/2022 11:44 am, Larry Rosenman wrote:


On 10/02/2022 11:27 am, Alexander V. Chernikov wrote:
02.10.2022, 17:18, "Larry Rosenman" :

On 10/02/2022 8:12 am, Alexander V. Chernikov wrote:

On 1 Oct 2022, at 22:57, Larry Rosenman wrote:


--- all_subdir_nfscommon ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/nfscommon/nfs_commonkrpc.o
--- all_subdir_netgraph ---
--- all_subdir_netgraph/deflate ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/deflate/offset.inc
--- all_subdir_netgraph/device ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/device/i386
--- all_subdir_netgraph/echo ---
===> netgraph/echo (all)
--- all_subdir_netlink ---
--- netlink_io.o ---
/usr/src/sys/netlink/netlink_io.c:146:2: error: implicit declaration
of function 'mtx_lock' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
NLP_LOCK(nlp);

That's interesting. netlink_io.c includes sys/mutex.h which defines
mutex_lock() / mutex_unlock().
Could you share the diff between GENERIC and LER-MINIMAL?


I sent the diff in another message, but here is LER-MINIMAL.

Thank you!
So it's non-networking config. I'll make netlink build conditional on
INET || INET6 today/tomorrow.


I actually kldload a bunch of stuff.
kld_list="aesni coretemp filemon linux ichsmb ichwd cpuctl cryptodev dtraceall ipmi "
kld_list="$kld_list if_bridge bridgestp if_tuntap hwpmc tcp_rack mfip ioat"
kld_list="$kld_list if_bce usb ukbd usb_quirk usb_template ums uhci xhci ehci ohci"
kld_list="$kld_list efirt nfscl nfscommon nfsd nfslockd nfssvc"
kld_list="$kld_list ataintel geom_label"
#kld_list="$kld_list geom_label"
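
The kld_list lines above build one space-separated list by repeatedly appending to the same variable. A minimal sketch of that pattern, using a shortened subset of the modules from the list, with a small helper to check whether a given module is in it. The helper name in_kld_list is made up for illustration and is not part of rc.conf or rc.subr.

```shell
# Shortened subset of the kld_list from the message above.
kld_list="aesni coretemp filemon"
kld_list="$kld_list if_bridge bridgestp"
kld_list="$kld_list nfscl nfscommon nfsd"

# Hypothetical helper: succeed if the module named in $1 is in kld_list.
in_kld_list() {
    for m in $kld_list; do
        [ "$m" = "$1" ] && return 0
    done
    return 1
}

in_kld_list nfscommon && echo "nfscommon is in kld_list"
in_kld_list netlink || echo "netlink is not in kld_list"
```

Because each line expands the previous value of $kld_list, commenting one line out (as with the last geom_label line above) drops only that group of modules.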


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



also MINIMAL (which I INCLUDE) does have INET/INET6...

PFA MINIMAL
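
A quick way to double-check the point about INET/INET6 is to scan the config text for those option lines. The sketch below does that on a short excerpt of MINIMAL. It is a plain text match only: it does not follow "include" lines or honor "nooptions" the way config(8) does, and the function names are made up for illustration.

```shell
# Short excerpt of the MINIMAL config attached to this message.
minimal_excerpt() {
cat <<'EOF'
options SCHED_ULE   # ULE scheduler
options INET        # InterNETworking
options INET6       # IPv6 communications protocols
options TCP_OFFLOAD # TCP offload
EOF
}

# Hypothetical helper: succeed if the config text on stdin has an
# "options INET" or "options INET6" line.
has_inet() {
    grep -Eq '^options[[:space:]]+INET6?([[:space:]]|$)'
}

if minimal_excerpt | has_inet; then
    echo "INET or INET6 present"
fi
```

Note that the pattern deliberately requires whitespace or end of line after the option name, so an option that merely starts with "INET" would not count.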

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

#
# MINIMAL -- Mostly Minimal kernel configuration file for FreeBSD/amd64
#
# Many definitions of minimal are possible. The one this file follows is
# GENERIC, minus all functionality that can be replaced by loading kernel
# modules.
#
# Exceptions:
# o While UFS is buildable as a module, the current module lacks
#   some features (ACL, GJOURNAL) that GENERIC includes.
# o acpi as a module has been reported flakey and not well tested, so
#   is included in the kernel.
# o (non-loaded) random is included due to uncertainty...
# o Many networking things are included
#
# For now, please run changes to these list past i...@freebsd.org
#
# For more information on this file, please read the config(5) manual page,
# and/or the handbook section on Kernel Configuration Files:
#
#    https://docs.freebsd.org/en/books/handbook/kernelconfig/#kernelconfig-config
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ../../conf/NOTES and NOTES files.
# If you are in doubt as to the purpose or necessity of a line, check first
# in NOTES.
#
# $FreeBSD$

cpu HAMMER
ident   MINIMAL

makeoptions DEBUG=-g            # Build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1          # Run ctfconvert(1) for DTrace support

options SCHED_ULE               # ULE scheduler
options NUMA                    # Non-Uniform Memory Architecture support
options PREEMPTION              # Enable kernel thread preemption
options INET                    # InterNETworking
options INET6                   # IPv6 communications protocols
options TCP_OFFLOAD             # TCP offload
options SCTP_SUPPORT            # Allow kldload of SCTP
options FFS                     # Berkeley Fast Filesystem
options SOFTUPDATES             # Enable FFS soft updates support
options UFS_ACL                 # Support for access control lists
options UFS_DIRHASH             # Improve performance on big directories
options UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options QUOTA                   # Enable disk quotas for UFS
options MD_ROOT                 # MD is a potential root device
options COMPAT_FREEBSD32        # Compatible with i386 binaries
o

Re: Build Break?

2022-10-02 Thread Larry Rosenman



On 10/02/2022 11:27 am, Alexander V. Chernikov wrote:


02.10.2022, 17:18, "Larry Rosenman" :

On 10/02/2022 8:12 am, Alexander V. Chernikov wrote:

On 1 Oct 2022, at 22:57, Larry Rosenman wrote:


--- all_subdir_nfscommon ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/nfscommon/nfs_commonkrpc.o
--- all_subdir_netgraph ---
--- all_subdir_netgraph/deflate ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/deflate/offset.inc
--- all_subdir_netgraph/device ---
Building
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/device/i386
--- all_subdir_netgraph/echo ---
===> netgraph/echo (all)
--- all_subdir_netlink ---
--- netlink_io.o ---
/usr/src/sys/netlink/netlink_io.c:146:2: error: implicit declaration
of function 'mtx_lock' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
NLP_LOCK(nlp);

That's interesting. netlink_io.c includes sys/mutex.h which defines
mutex_lock() / mutex_unlock().
Could you share the diff between GENERIC and LER-MINIMAL?


I sent the diff in another message, but here is LER-MINIMAL.

Thank you!
So it's non-networking config. I'll make netlink build conditional on
INET || INET6 today/tomorrow.


I actually kldload a bunch of stuff.
kld_list="aesni coretemp filemon linux ichsmb ichwd cpuctl cryptodev dtraceall ipmi "
kld_list="$kld_list if_bridge bridgestp if_tuntap hwpmc tcp_rack mfip ioat"
kld_list="$kld_list if_bce usb ukbd usb_quirk usb_template ums uhci xhci ehci ohci"
kld_list="$kld_list efirt nfscl nfscommon nfsd nfslockd nfssvc"
kld_list="$kld_list ataintel geom_label"
#kld_list="$kld_list geom_label"


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Build Break?

2022-10-02 Thread Larry Rosenman

On 10/02/2022 8:12 am, Alexander V. Chernikov wrote:

On 1 Oct 2022, at 22:57, Larry Rosenman  wrote:

--- all_subdir_nfscommon ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/nfscommon/nfs_commonkrpc.o

--- all_subdir_netgraph ---
--- all_subdir_netgraph/deflate ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/deflate/offset.inc

--- all_subdir_netgraph/device ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/device/i386

--- all_subdir_netgraph/echo ---
===> netgraph/echo (all)
--- all_subdir_netlink ---
--- netlink_io.o ---
/usr/src/sys/netlink/netlink_io.c:146:2: error: implicit declaration 
of function 'mtx_lock' is invalid in C99 
[-Werror,-Wimplicit-function-declaration]

   NLP_LOCK(nlp);
That’s interesting. netlink_io.c includes sys/mutex.h which defines 
mutex_lock() / mutex_unlock().

 Could you share the diff between GENERIC and LER-MINIMAL?


I sent the diff in another message, but here is LER-MINIMAL.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
# LER-MINIMAL  -- kernel config based on MINIMAL

include MINIMAL
ident   LER-MINIMAL

nooptions   WITNESS             # Enable checks to detect deadlocks and cycles
nooptions   WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
options     KDB_UNATTENDED
#options    DEBUG_MEMGUARD
#options    DEBUG_REDZONE
makeoptions WITH_EXTRA_TCP_STACKS=1
options     TCPHPTS
device      mfi
options     TCP_RFC7413
# Kernel dump features.
options     EKCD                # Support for encrypted kernel dumps
options     GZIO                # gzip-compressed kernel and user dumps
options     ZSTDIO              # zstd-compressed kernel and user dumps
options     NETDUMP             # netdump(4) client support
# ipsec support
options     IPSEC_SUPPORT
device      crypto

#netgraph debug
options     NETGRAPH_DEBUG

#tcp ratelimit
options     RATELIMIT

## INVARIANTS
options     INVARIANT_SUPPORT
#options    INVARIANTS


Re: Build Break?

2022-10-02 Thread Larry Rosenman

On 10/02/2022 8:12 am, Alexander V. Chernikov wrote:

On 1 Oct 2022, at 22:57, Larry Rosenman  wrote:

--- all_subdir_nfscommon ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/nfscommon/nfs_commonkrpc.o

--- all_subdir_netgraph ---
--- all_subdir_netgraph/deflate ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/deflate/offset.inc

--- all_subdir_netgraph/device ---
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/device/i386

--- all_subdir_netgraph/echo ---
===> netgraph/echo (all)
--- all_subdir_netlink ---
--- netlink_io.o ---
/usr/src/sys/netlink/netlink_io.c:146:2: error: implicit declaration 
of function 'mtx_lock' is invalid in C99 
[-Werror,-Wimplicit-function-declaration]

   NLP_LOCK(nlp);
That’s interesting. netlink_io.c includes sys/mutex.h which defines 
mutex_lock() / mutex_unlock().

 Could you share the diff between GENERIC and LER-MINIMAL?


attached.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
--- GENERIC	2022-08-18 14:44:35.576844000 -0500
+++ LER-MINIMAL	2022-10-02 10:46:41.308926000 -0500
@@ -1,401 +1,32 @@
-#
-# GENERIC -- Generic kernel configuration file for FreeBSD/amd64
-#
-# For more information on this file, please read the config(5) manual page,
-# and/or the handbook section on Kernel Configuration Files:
-#
-#https://docs.freebsd.org/en/books/handbook/kernelconfig/#kernelconfig-config
-#
-# The handbook is also available locally in /usr/share/doc/handbook
-# if you've installed the doc distribution, otherwise always see the
-# FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the
-# latest information.
-#
-# An exhaustive list of options and more detailed explanations of the
-# device lines is also present in the ../../conf/NOTES and NOTES files.
-# If you are in doubt as to the purpose or necessity of a line, check first
-# in NOTES.
-#
-# $FreeBSD$
+# LER-MINIMAL  -- kernel config based on MINIMAL
 
-cpu		HAMMER
-ident		GENERIC
+include		MINIMAL
+ident		LER-MINIMAL
 
-makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols
-makeoptions	WITH_CTF=1		# Run ctfconvert(1) for DTrace support
-
-options 	SCHED_ULE		# ULE scheduler
-options 	NUMA			# Non-Uniform Memory Architecture support
-options 	PREEMPTION		# Enable kernel thread preemption
-options 	VIMAGE			# Subsystem virtualization, e.g. VNET
-options 	INET			# InterNETworking
-options 	INET6			# IPv6 communications protocols
-options 	IPSEC_SUPPORT		# Allow kldload of ipsec and tcpmd5
-options		ROUTE_MPATH		# Multipath routing support
-options		FIB_ALGO		# Modular fib lookups
-options 	TCP_OFFLOAD		# TCP offload
-options 	TCP_BLACKBOX		# Enhanced TCP event logging
-options 	TCP_HHOOK		# hhook(9) framework for TCP
-options		TCP_RFC7413		# TCP Fast Open
-options 	SCTP_SUPPORT		# Allow kldload of SCTP
-options		KERN_TLS		# TLS transmit & receive offload
-options 	FFS			# Berkeley Fast Filesystem
-options 	SOFTUPDATES		# Enable FFS soft updates support
-options 	UFS_ACL			# Support for access control lists
-options 	UFS_DIRHASH		# Improve performance on big directories
-options 	UFS_GJOURNAL		# Enable gjournal-based UFS journaling
-options 	QUOTA			# Enable disk quotas for UFS
-options 	MD_ROOT			# MD is a potential root device
-options 	NFSCL			# Network Filesystem Client
-options 	NFSD			# Network Filesystem Server
-options 	NFSLOCKD		# Network Lock Manager
-options 	NFS_ROOT		# NFS usable as /, requires NFSCL
-options 	MSDOSFS			# MSDOS Filesystem
-options 	CD9660			# ISO 9660 Filesystem
-options 	PROCFS			# Process filesystem (requires PSEUDOFS)
-options 	PSEUDOFS		# Pseudo-filesystem framework
-options 	TMPFS			# Efficient memory filesystem
-options 	GEOM_RAID		# Soft RAID functionality.
-options 	GEOM_LABEL		# Provides labelization
-options 	EFIRT			# EFI Runtime Services support
-options 	COMPAT_FREEBSD32	# Compatible with i386 binaries
-options 	COMPAT_FREEBSD4		# Compatible with FreeBSD4
-options 	COMPAT_FREEBSD5		# Compatible with FreeBSD5
-options 	COMPAT_FREEBSD6		# Compatible with FreeBSD6
-options 	COMPAT_FREEBSD7		# Compatible with FreeBSD7
-options 	COMPAT_FREEBSD9		# Compatible with FreeBSD9
-options 	COMPAT_FREEBSD10	# Compatible with FreeBSD10
-options 	COMPAT_FREEBSD11	# Compatible with FreeBSD11
-options 	COMPAT_FREEBSD12	# Compatible with FreeBSD12
-options 	COMPAT_FREEBSD13	# Compatible with FreeBSD13
-options 	SCSI_DELAY=5000		# Delay (in ms) before probing SCSI
-options 	KTRACE			# ktrace(1) support
-options 	STACK			# stack(9) support
-options 	SYSVSHM			# SYSV-style shared memory
-options 	SYSVMSG			# SYSV-style message queues
-options 	SYSVSEM			# SYSV-style semaphores
-options 	_KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
-options 	PRINTF_BUFR_SIZE=128	# Prevent printf outpu

Re: BOOT CRASH -- Current -CURRENT

2022-10-01 Thread Larry Rosenman



On 10/01/2022 10:43 pm, Larry Rosenman wrote:


On 10/01/2022 10:08 pm, Warner Losh wrote:

On Sat, Oct 1, 2022 at 9:06 PM Larry Rosenman  wrote:

On 10/01/2022 10:04 pm, Warner Losh wrote:

Do  you have a /boot tarball that can be loaded in a VM that recreates 
the problem (along with a clean hash)?


But before you try that, have you tried a completely clean rebuild of 
the kernel to preclude the possibility that something is somehow cross 
threaded?


Warner

On Sat, Oct 1, 2022 at 8:39 PM Larry Rosenman  wrote:
❯ more info.11
Dump header from device: /dev/mfid0p3
Architecture: amd64
Architecture Version: 2
Dump Length: 126748815
Blocksize: 512
Compression: zstd
Dumptime: 2022-10-01 21:26:40 -0500
Hostname:
Magic: FreeBSD Kernel Dump
Version String: FreeBSD 14.0-CURRENT #168
ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct  1 21:13:01 CDT
2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
Panic String: page fault
Dump Parity: 501115454
Bounds: 11
Dump Status: good

I do have source and debug stuff, BUT kgdb croaks on me.

I *CAN* give access to the machine.

the console backtrace showed something about the kld load of
dependencies.

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


let me wipe /usr/obj, and rebuild everything (I *DO* use meta-mode).

I've had fewer problems with it than non-meta mode, but this looks like 
a 'corruption' or 'cross threaded' crash I've chased in the past that 
went away with a rebuild. So it's better to be sure...


Warner

Still breaks -- did someone(tm) forget to make netlink a module?

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

❯ sudo kgdb -c vmcore.12 /mnt/usr/lib/debug/boot/kernel/kernel.debug
GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/usr/lib/debug/boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
---<>---
Copyright (c) 1992-2022 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #0 ler/freebsd-main-changes-n258354-6cdd871ebc4: 
Sat Oct  1 22:30:48 CDT 2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL 
amd64
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git 
llvmorg-14.0.5-0-gc12386ae247c)

VT(efifb): resolution 640x480
CPU: Intel(R) Xeon(R) CPU   X5660  @ 2.80GHz (2793.16-MHz 
K8-class CPU)

  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  
Features=0xbfebfbff
  
Features2=0x29ee3ff

  AMD Features=0x2c100800
  AMD Features2=0x1
  Structured Extended Features3=0x9c00
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 137438953472 (131072 MB)
avail memory = 133789515776 (127591 MB)
CPU microcode: no matching update found
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 24 CPUs
FreeBSD/SMP: 2 package(s) x 6 core(s) x 2 hardware threads
random: unblocking device.
ioapic1: MADT APIC ID 1 != hw id 0
ioapic0  irqs 0-23
ioapic1  irqs 32-55
Launching APs: 1 14 12 21 2 6 17 10 18 15 4 19 7 3 8 20 13 5 23 11 9 16 
22

TCP_ratelimit: Is now initialized
TCP Hpts created 24 swi interrupt threads and bound 24 to NUMA domains
random: entropy device external interface
kbd1 at kbdmux0
acpi0: 
acpi0: Power Button (fixed)
apei0:  on acpi0
cpu0:  on acpi0
atrtc0:  port 0x70-0x7f irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0:  port 0x40-0x5f irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 143181

Re: BOOT CRASH -- Current -CURRENT

2022-10-01 Thread Larry Rosenman



On 10/01/2022 10:08 pm, Warner Losh wrote:


On Sat, Oct 1, 2022 at 9:06 PM Larry Rosenman  wrote:

On 10/01/2022 10:04 pm, Warner Losh wrote:

Do  you have a /boot tarball that can be loaded in a VM that recreates 
the problem (along with a clean hash)?


But before you try that, have you tried a completely clean rebuild of 
the kernel to preclude the possibility that something is somehow cross 
threaded?


Warner

On Sat, Oct 1, 2022 at 8:39 PM Larry Rosenman  wrote:
❯ more info.11
Dump header from device: /dev/mfid0p3
Architecture: amd64
Architecture Version: 2
Dump Length: 126748815
Blocksize: 512
Compression: zstd
Dumptime: 2022-10-01 21:26:40 -0500
Hostname:
Magic: FreeBSD Kernel Dump
Version String: FreeBSD 14.0-CURRENT #168
ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct  1 21:13:01 CDT
2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
Panic String: page fault
Dump Parity: 501115454
Bounds: 11
Dump Status: good

I do have source and debug stuff, BUT kgdb croaks on me.

I *CAN* give access to the machine.

the console backtrace showed something about the kld load of
dependencies.

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


let me wipe /usr/obj, and rebuild everything (I *DO* use meta-mode).

I've had fewer problems with it than non-meta mode, but this looks like 
a 'corruption' or 'cross threaded' crash I've chased in the past that 
went away with a rebuild. So it's better to be sure...


Warner

Still breaks -- did someone(tm) forget to make netlink a module?

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: BOOT CRASH -- Current -CURRENT

2022-10-01 Thread Larry Rosenman



On 10/01/2022 10:04 pm, Warner Losh wrote:

Do  you have a /boot tarball that can be loaded in a VM that recreates 
the problem (along with a clean hash)?


But before you try that, have you tried a completely clean rebuild of 
the kernel to preclude the possibility that something is somehow cross 
threaded?


Warner

On Sat, Oct 1, 2022 at 8:39 PM Larry Rosenman  wrote:


❯ more info.11
Dump header from device: /dev/mfid0p3
Architecture: amd64
Architecture Version: 2
Dump Length: 126748815
Blocksize: 512
Compression: zstd
Dumptime: 2022-10-01 21:26:40 -0500
Hostname:
Magic: FreeBSD Kernel Dump
Version String: FreeBSD 14.0-CURRENT #168
ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct  1 21:13:01 CDT
2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
Panic String: page fault
Dump Parity: 501115454
Bounds: 11
Dump Status: good

I do have source and debug stuff, BUT kgdb croaks on me.

I *CAN* give access to the machine.

the console backtrace showed something about the kld load of
dependencies.

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


let me wipe /usr/obj, and rebuild everything (I *DO* use meta-mode).

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: BOOT CRASH -- Current -CURRENT

2022-10-01 Thread Larry Rosenman

On 10/01/2022 9:39 pm, Larry Rosenman wrote:

❯ more info.11
Dump header from device: /dev/mfid0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 126748815
  Blocksize: 512
  Compression: zstd
  Dumptime: 2022-10-01 21:26:40 -0500
  Hostname:
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 14.0-CURRENT #168 
ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct  1 21:13:01 CDT 
2022

r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
  Panic String: page fault
  Dump Parity: 501115454
  Bounds: 11
  Dump Status: good

I do have source and debug stuff, BUT kgdb croaks on me.

I *CAN* give access to the machine.

the console backtrace showed something about the kld load of 
dependencies.


Here's the BT:
❯ sudo kgdb -c vmcore.11 /mnt/usr/lib/debug/boot/kernel/kernel.debug
GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/usr/lib/debug/boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
---<>---
Copyright (c) 1992-2022 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #168 ler/freebsd-main-changes-n258354-6cdd871ebc4: 
Sat Oct  1 21:13:01 CDT 2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL 
amd64
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git 
llvmorg-14.0.5-0-gc12386ae247c)

VT(efifb): resolution 640x480
CPU: Intel(R) Xeon(R) CPU   X5660  @ 2.80GHz (2793.07-MHz 
K8-class CPU)

  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  
Features=0xbfebfbff
  
Features2=0x29ee3ff

  AMD Features=0x2c100800
  AMD Features2=0x1
  Structured Extended Features3=0x9c00
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 137438953472 (131072 MB)
avail memory = 133789515776 (127591 MB)
CPU microcode: no matching update found
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 24 CPUs
FreeBSD/SMP: 2 package(s) x 6 core(s) x 2 hardware threads
random: unblocking device.
ioapic1: MADT APIC ID 1 != hw id 0
ioapic0  irqs 0-23
ioapic1  irqs 32-55
Launching APs: 1 8 7 5 2 12 15 17 14 20 3 18 13 4 19 10 22 11 6 9 16 23 
21

TCP_ratelimit: Is now initialized
TCP Hpts created 24 swi interrupt threads and bound 24 to NUMA domains
random: entropy device external interface
kbd1 at kbdmux0
acpi0: 
acpi0: Power Button (fixed)
apei0:  on acpi0
cpu0:  on acpi0
atrtc0:  port 0x70-0x7f irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0:  port 0x40-0x5f irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci1:  at device 0.0 (no driver attached)
pci1:  at device 0.1 (no driver attached)
pcib2:  at device 3.0 on pci0
pci2:  on pcib2
pci2:  at device 0.0 (no driver attached)
pci2:  at device 0.1 (no driver attached)
pcib3:  at device 4.0 on pci0
pci3:  on pcib3
mfi0:  port 0xfc00-0xfcff mem 
0xdf1bc000-0xdf1b,0xdf1c-0xdf1f irq 33 at device 0.0 on pci3

mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: FW MaxCmds = 1008, limiting to 128
mfi0: 55158 (717992596s/0x0020/info) - Shutdown command received from 
host
pcib4: mfi0: 55159 (boot + 33s/0x0020/info) - 
Firmware initialization started (PCI ID 007

BOOT CRASH -- Current -CURRENT

2022-10-01 Thread Larry Rosenman



❯ more info.11
Dump header from device: /dev/mfid0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 126748815
  Blocksize: 512
  Compression: zstd
  Dumptime: 2022-10-01 21:26:40 -0500
  Hostname:
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 14.0-CURRENT #168 
ler/freebsd-main-changes-n258354-6cdd871ebc4: Sat Oct  1 21:13:01 CDT 
2022

r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
  Panic String: page fault
  Dump Parity: 501115454
  Bounds: 11
  Dump Status: good
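
The fields in the info.11 header above can be pulled out with a short script, which is handy when several dumps pile up in the crash directory. This is a minimal sketch: the sample text is a subset of the header quoted above, and the function names info_sample and info_field are made up for illustration.

```shell
# Subset of the info.11 header quoted above.
info_sample() {
cat <<'EOF'
Dump header from device: /dev/mfid0p3
  Architecture: amd64
  Compression: zstd
  Panic String: page fault
  Bounds: 11
  Dump Status: good
EOF
}

# Hypothetical helper: print the value of the "Key: value" field named
# in $1, reading the info file text from stdin.
info_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p"
}

info_sample | info_field "Panic String"
info_sample | info_field "Bounds"
```

In this thread the Bounds value (11) matches the numeric suffix of the info.11 and vmcore.11 files.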

I do have source and debug stuff, BUT kgdb croaks on me.

I *CAN* give access to the machine.

the console backtrace showed something about the kld load of 
dependencies.




--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Build Break?

2022-10-01 Thread Larry Rosenman
unction 'mtx_lock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_LOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:79:25: note: expanded from macro 'NLP_LOCK'
#define NLP_LOCK(_nlp)  mtx_lock(&((_nlp)->nl_lock))
^
/usr/src/sys/netlink/netlink_io.c:357:3: error: implicit declaration of function 'mtx_unlock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_UNLOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:80:26: note: expanded from macro 'NLP_UNLOCK'
#define NLP_UNLOCK(_nlp)        mtx_unlock(&((_nlp)->nl_lock))
^
/usr/src/sys/netlink/netlink_io.c:369:3: error: implicit declaration of function 'mtx_unlock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_UNLOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:80:26: note: expanded from macro 'NLP_UNLOCK'
#define NLP_UNLOCK(_nlp)        mtx_unlock(&((_nlp)->nl_lock))
^
/usr/src/sys/netlink/netlink_io.c:395:2: error: implicit declaration of function 'mtx_unlock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_UNLOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:80:26: note: expanded from macro 'NLP_UNLOCK'
#define NLP_UNLOCK(_nlp)        mtx_unlock(&((_nlp)->nl_lock))
^
/usr/src/sys/netlink/netlink_io.c:519:3: error: implicit declaration of function 'mtx_lock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_LOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:79:25: note: expanded from macro 'NLP_LOCK'
#define NLP_LOCK(_nlp)  mtx_lock(&((_nlp)->nl_lock))
^
/usr/src/sys/netlink/netlink_io.c:521:3: error: implicit declaration of function 'mtx_unlock' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
NLP_UNLOCK(nlp);
^
/usr/src/sys/netlink/netlink_var.h:80:26: note: expanded from macro 'NLP_UNLOCK'
#define NLP_UNLOCK(_nlp)        mtx_unlock(&((_nlp)->nl_lock))
^
16 errors generated.
--- all_subdir_netgraph ---
--- all_subdir_netgraph/device ---
--- i386 ---
i386 -> /usr/src/sys/i386/include
Building 
/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netgraph/device/vnode_if_newproto.h

--- all_subdir_netlink ---
*** [netlink_io.o] Error code 1

make[4]: stopped in /usr/src/sys/modules/netlink
.ERROR_TARGET='netlink_io.o'
.ERROR_META_FILE='/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL/modules/usr/src/sys/modules/netlink/netlink_io.o.meta'
.MAKE.LEVEL='4'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes 
verbose'

5.79 real30.04 user 9.35 sys

make[1]: stopped in /usr/src

make: stopped in /usr/src

ler in  borg in src on  ler/freebsd-main-changes:main [⇡] on ☁️  
(us-east-1) took 1m56s

❯

ler in  borg in src on  ler/freebsd-main-changes:main [⇡] on ☁️  
(us-east-1)

❯



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Beadm can't create snapshot

2022-08-28 Thread Larry Rosenman

On 08/28/2022 2:28 pm, Ryan Moeller wrote:

On 8/17/22 12:16 PM, Ryan Moeller wrote:


On 8/17/22 12:05 PM, Ryan Moeller wrote:


On 8/17/22 10:35 AM, Thomas Laus wrote:
I attempted to create a ZFS snapshot after upgrading this morning 
and received this error


# beadm create n257443
cannot create 'zroot/ROOT/n257443': 'snapshots_changed' is readonly
#



This looks like a bug in beadm. It must be trying to set the 
snapshots_changed property when cloning the snapshot for the BE, but 
the property is of course readonly.


-Ryan



I took a closer look at what beadm is doing and this appears to be a 
bug in the property after all. beadm filters by source "local" or 
"received" and for "snapshots_changed" the source is "local" when it 
should be "-" like other readonly properties. We'll get this fixed 
ASAP.
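
To make the filtering described above concrete, here is a minimal Python sketch. It is not beadm's actual code (beadm is a shell script); it only assumes input shaped like the tab-separated output of `zfs get -H -o property,value,source all <dataset>`. The function name and the sample rows are hypothetical, made up for illustration.

```python
def settable_properties(zfs_get_output):
    """Return {property: value} for rows whose source is local or received.

    Expects one "property<TAB>value<TAB>source" row per line.
    """
    keep = {}
    for line in zfs_get_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        prop, value, source = fields
        if source in ("local", "received"):
            keep[prop] = value
    return keep


# Hypothetical rows: the buggy case reports source "local" for the
# readonly property, the fixed case reports "-".
SAMPLE_BUGGY = (
    "compression\tlz4\tlocal\n"
    "snapshots_changed\tWed Aug 17 08:15:27 2022\tlocal\n"
    "used\t1.2G\t-\n"
)
SAMPLE_FIXED = SAMPLE_BUGGY.replace("2022\tlocal", "2022\t-")

if __name__ == "__main__":
    print(sorted(settable_properties(SAMPLE_BUGGY)))
    print(sorted(settable_properties(SAMPLE_FIXED)))
```

With the buggy source value, the readonly property passes the filter and would be handed on to the clone; once the source is "-", it drops out.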


-Ryan



Now fixed as of
https://github.com/openzfs/zfs/commit/518b4876022eee58b14903da09b99c01b8caa754



That doesn't look right?  It's about arc_c_max, and not properties?

-Ryan






My version info:

14.0-CURRENT FreeBSD 14.0-CURRENT #9 main-n257443-f7413197245: Wed 
Aug 17 08:15:27 EDT 2022


There was no information in UPDATING about any required ZFS 
configuration changes, nor any ZFS flags that listed 
'snapshots_changed' as something that needed a new value.  There is 
actually a new snapshot created, but 'beadm list' does not show it 
and the boot menu does not have it listed.


Tom







--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Hangs in bacula / NFS? on recent Current

2022-08-19 Thread Larry Rosenman




On 08/19/2022 10:36 am, Rick Macklem wrote:

On 08/18/2022 9:49 am, Larry Rosenman wrote:

I didn't get all my mail on my bacula backups today (they backup to
NFS mounted TrueNAS).
Also a df hangs.

Here are procstat -kk's for all:
ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ ps auxxxwww|grep bacula
bacula    2067  0.0  0.0  63188  13652  -  Is   11:30
0:17.49 /usr/local/sbin/bacula-sd -u bacula -g bacula -v -c
/usr/local/etc/bacula/bacula-sd.conf
root      2072  0.0  0.0  59280  31276  -  Is   11:30
0:00.31 /usr/local/sbin/bacula-fd -u root -g wheel -v -c
/usr/local/etc/bacula/bacula-fd.conf
bacula    2075  0.0  0.0  86992  19352  -  Is   11:30
0:56.95 /usr/local/sbin/bacula-dir -u bacula -g bacula -v -c
/usr/local/etc/bacula/bacula-dir.conf
postgres 50241  0.0  0.1 285764 160244  -  Is   23:05
0:00.38 postgres: bacula bacula [local]  (postgres)
postgres 50244  0.0  0.1 298784  74448  -  Ds   23:05
0:00.67 postgres: bacula bacula [local]  (postgres)
ler      66595  0.0  0.0  12888   2600  3  S+   09:46
0:00.00 grep --color=auto bacula


At the end, I'll list which options are needed for ps; all of its
output is needed as well. See the end of the email.


ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2067
  PID    TID COMM        TDNAME  KSTACK
 2067 100742 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2067 101036 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2067 101038 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2067 124485 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x15c
kern_poll_kfds+0x457 kern_poll+0x9f sys_poll+0x50 amd64_syscall+0x12e
fast_syscall_common+0xf8

ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2072
  PID    TID COMM        TDNAME  KSTACK
 2072 100677 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2072 101039 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2072 101040 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2072 124490 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x15c
kern_poll_kfds+0x457 kern_poll+0x9f sys_poll+0x50 amd64_syscall+0x12e
fast_syscall_common+0xf8

ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2075
  PID    TID COMM        TDNAME  KSTACK
 2075 101007 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_sleep+0x29b umtxq_sleep+0x242 do_wait+0x26b __umtx_op_wait+0x53
sys__umtx_op+0x7e amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 101041 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2075 101045 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 101046 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2075 101047 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d

Re: Lots of port failures today?

2022-08-19 Thread Larry Rosenman

On 08/19/2022 10:03 am, Tomoaki AOKI wrote:

On Fri, 19 Aug 2022 09:06:11 -0400
Charlie Li  wrote:


Mateusz Guzik wrote:
> On 8/18/22, Mateusz Guzik  wrote:
>> On 8/18/22, Larry Rosenman  wrote:
>>> 
https://home.lerctr.org:/build.html?mastername=live-host_ports=2022-08-18_13h12m51s
>>>
>>> circa 97ecdc00ac5 on main
>>> Ideas?
>>>
>>
>> try with 9ac6eda6c6a36db6bffa01be7faea24f8bb92a0f reverted
>>
>
> I'm pretty sure it will be fixed with  URL:
> 
https://cgit.FreeBSD.org/src/commit/?id=545db925c3d5408e71e21432895770cd49fd2cf3
>
Seems to be fixed with this commit, at least for graphics/jpeg-turbo,
whose configure failed with something about platform not supporting 
SIMD.


--
Charlie Li
…nope, still don't have an exit line.


The same goes for base /usr/bin/xz (through a pipe) and ports lang/ruby30.

The former caused x11/linux-nvidia-libs to fail on extract,
and the latter caused ports-mgmt/portupgrade (including portsclean) to
fail on start.

Both are fixed at the commit.
Thanks!


and all my unexplained failures are fixed as well.

Thanks, Mateusz!
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Lots of port failures today?

2022-08-18 Thread Larry Rosenman

On 08/18/2022 4:25 pm, Mateusz Guzik wrote:

On 8/18/22, Mateusz Guzik  wrote:

On 8/18/22, Larry Rosenman  wrote:

https://home.lerctr.org:/build.html?mastername=live-host_ports=2022-08-18_13h12m51s

circa 97ecdc00ac5 on main
Ideas?



try with 9ac6eda6c6a36db6bffa01be7faea24f8bb92a0f reverted



I'm pretty sure it will be fixed with  URL:
https://cgit.FreeBSD.org/src/commit/?id=545db925c3d5408e71e21432895770cd49fd2cf3

Should I un-revert 9ac6eda6c6a36db6bffa01be7faea24f8bb92a0f and pick up 
a new pull?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Lots of port failures today?

2022-08-18 Thread Larry Rosenman

https://home.lerctr.org:/build.html?mastername=live-host_ports=2022-08-18_13h12m51s

circa 97ecdc00ac5 on main
Ideas?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Hangs in bacula / NFS? on recent Current

2022-08-18 Thread Larry Rosenman

On 08/18/2022 9:49 am, Larry Rosenman wrote:

I didn't get all my mail on my bacula backups today (they backup to
NFS mounted TrueNAS).
Also a df hangs.

Here are procstat -kk's for all:
ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ ps auxxxwww|grep bacula
bacula    2067  0.0  0.0  63188  13652  -  Is   11:30
0:17.49 /usr/local/sbin/bacula-sd -u bacula -g bacula -v -c
/usr/local/etc/bacula/bacula-sd.conf
root      2072  0.0  0.0  59280  31276  -  Is   11:30
0:00.31 /usr/local/sbin/bacula-fd -u root -g wheel -v -c
/usr/local/etc/bacula/bacula-fd.conf
bacula    2075  0.0  0.0  86992  19352  -  Is   11:30
0:56.95 /usr/local/sbin/bacula-dir -u bacula -g bacula -v -c
/usr/local/etc/bacula/bacula-dir.conf
postgres 50241  0.0  0.1 285764 160244  -  Is   23:05
0:00.38 postgres: bacula bacula [local]  (postgres)
postgres 50244  0.0  0.1 298784  74448  -  Ds   23:05
0:00.67 postgres: bacula bacula [local]  (postgres)
ler      66595  0.0  0.0  12888   2600  3  S+   09:46
0:00.00 grep --color=auto bacula

ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2067
  PID    TID COMM        TDNAME  KSTACK
 2067 100742 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2067 101036 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2067 101038 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2067 124485 bacula-sd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x15c
kern_poll_kfds+0x457 kern_poll+0x9f sys_poll+0x50 amd64_syscall+0x12e
fast_syscall_common+0xf8

ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2072
  PID    TID COMM        TDNAME  KSTACK
 2072 100677 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2072 101039 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2072 101040 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2072 124490 bacula-fd   -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x15c
kern_poll_kfds+0x457 kern_poll+0x9f sys_poll+0x50 amd64_syscall+0x12e
fast_syscall_common+0xf8

ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 2075
  PID    TID COMM        TDNAME  KSTACK
 2075 101007 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_sleep+0x29b umtxq_sleep+0x242 do_wait+0x26b __umtx_op_wait+0x53
sys__umtx_op+0x7e amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 101041 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2075 101045 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9
_cv_wait_sig+0x137 kern_select+0x9fe sys_select+0x56
amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 101046 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d umtxq_sleep+0x242 do_wait+0x26b
__umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e amd64_syscall+0x12e
fast_syscall_common+0xf8
 2075 101047 bacula-dir  -   mi_switch+0x157
sleepq_switch+0x107 sleepq_catch_signals+0x266
sleepq_timedwait_sig+0x12 _sleep+0x27d kern_clock_nanosleep+0x1d1
sys_nanosleep+0x3b amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 124479 bacula-dir  -   mi_switch+0x157

Hangs in bacula / NFS? on recent Current

2022-08-18 Thread Larry Rosenman
  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9 
_cv_wait_sig+0x137 kern_poll_kfds+0x48c kern_poll+0x9f sys_poll+0x50 
amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 124480 bacula-dir  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_timedwait_sig+0x12 
_sleep+0x27d kern_clock_nanosleep+0x1d1 sys_nanosleep+0x3b 
amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 124481 bacula-dir  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_timedwait_sig+0x12 
_sleep+0x27d kern_clock_nanosleep+0x1d1 sys_nanosleep+0x3b 
amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 124489 bacula-dir  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_timedwait_sig+0x12 
_cv_timedwait_sig_sbt+0x15c kern_poll_kfds+0x457 kern_poll+0x9f 
sys_poll+0x50 amd64_syscall+0x12e fast_syscall_common+0xf8
 2075 124506 bacula-dir  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_catch_signals+0x266 sleepq_timedwait_sig+0x12 
_sleep+0x27d kern_clock_nanosleep+0x1d1 sys_nanosleep+0x3b 
amd64_syscall+0x12e fast_syscall_common+0xf8


ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯ sudo procstat -kk 66390
  PID    TID COMM        TDNAME  KSTACK
66390 101514 df  -   mi_switch+0x157 
sleepq_switch+0x107 sleepq_timedwait+0x4b _sleep+0x28e 
clnt_reconnect_call+0x809 newnfs_request+0xa95 nfscl_request+0x5a 
nfsrpc_statfs+0x19d nfs_statfs+0x148 vfs_statfs_sigdefer+0x2e 
kern_getfsstat+0x1f1 sys_getfsstat+0x22 amd64_syscall+0x12e 
fast_syscall_common+0xf8
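
For anyone sifting through a pile of these stacks, a small Python sketch like the one below can pull out the frames that point at the NFS client. It is only an illustration: the prefix list is my own guess at what is interesting, the function names are hypothetical, and it simply scans for `name+0xoffset` tokens, so it does not care how the mailer wrapped the lines.

```python
import re

# One kernel stack frame as printed by procstat -kk: symbol+0xoffset.
FRAME = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-f]+")


def frames(kstack_text):
    """Return the function names of all frames found in the text, in order."""
    return FRAME.findall(kstack_text)


def nfs_frames(kstack_text, prefixes=("nfs", "newnfs", "clnt_")):
    """Return only the frames whose name starts with one of the prefixes.

    The default prefixes are an assumption chosen for this thread.
    """
    return [name for name in frames(kstack_text) if name.startswith(prefixes)]


# The df stack from the message above, pasted as wrapped by the mailer.
DF_STACK = """mi_switch+0x157
sleepq_switch+0x107 sleepq_timedwait+0x4b _sleep+0x28e
clnt_reconnect_call+0x809 newnfs_request+0xa95 nfscl_request+0x5a
nfsrpc_statfs+0x19d nfs_statfs+0x148 vfs_statfs_sigdefer+0x2e
kern_getfsstat+0x1f1 sys_getfsstat+0x22 amd64_syscall+0x12e
fast_syscall_common+0xf8"""

if __name__ == "__main__":
    print(nfs_frames(DF_STACK))
```

Run against the df stack it lists the RPC reconnect and NFS statfs frames, while the bacula threads sleeping in select, poll or umtx produce an empty list.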


ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯

this was built yesterday:
❯ uname -a
FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #142 
ler/freebsd-main-changes-n257453-175a127a72f: Wed Aug 17 09:23:32 CDT 
2022 
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL amd64


ler in  borg in ~ via C v14.0.5-clang on ☁️  (us-east-1)
❯

What else do we need?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Updating EFI boot loader results in boot hangup

2022-08-12 Thread Larry Rosenman



boot off a memstick?

On 08/12/2022 3:25 pm, Nuno Teixeira wrote:


The problem is if boot is failing, how to mount and rename it?

I'm looking for a way, if possible, to boot the backup bootx64 loader 
directly in case of failure.

I was hoping to find it in loader(8) or uefi(8)...

Larry Rosenman wrote on Friday, 12/08/2022 at 21:09:


I would assume just rename the bootx64.old to bootx64.efi

and/or put it in a different directory that EFI can see

On 08/12/2022 3:03 pm, Nuno Teixeira wrote:

I've been searching without success for a way to load a backup loader 
in case of boot failure.


The upgrade process will be like:
---
mount -t msdosfs /dev/nvd0p1 /mnt
cp /mnt/efi/boot/bootx64.efi /mnt/efi/boot/bootx64.old
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
---
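
The same back-up-then-replace sequence can be sketched in Python, which makes it easy to try against a scratch directory first. This is only an illustration under assumptions: the function name and defaults are hypothetical, the ESP is assumed to be mounted already, and it does not answer the question of how to boot the backup copy.

```python
import shutil
from pathlib import Path


def update_loader(new_loader, esp_boot_dir,
                  name="bootx64.efi", backup="bootx64.old"):
    """Save the current loader as a backup, then install the new one.

    Only file contents are copied, no permissions or timestamps.
    Returns the path of the backup file.
    """
    esp_boot_dir = Path(esp_boot_dir)
    current = esp_boot_dir / name
    saved = esp_boot_dir / backup
    if current.exists():
        shutil.copyfile(current, saved)   # cp bootx64.efi bootx64.old
    shutil.copyfile(new_loader, current)  # cp /boot/loader.efi bootx64.efi
    return saved


if __name__ == "__main__":
    # Paths taken from the commands above; needs the ESP mounted on /mnt.
    print(update_loader("/boot/loader.efi", "/mnt/efi/boot"))
```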

I can't find the right docs to load bootx64.old.
Could you tell me what you did to solve your boot?

Thanks

Yasuhiro Kimura wrote on Friday, 12/08/2022 at 18:45:

From: Nuno Teixeira
Subject: Re: Updating EFI boot loader results in boot hangup
Date: Fri, 12 Aug 2022 18:26:11 +0100


Hello Yasu,

Do we need to update the boot loader every time we upgrade 
current?


No, you need not.

The only time I updated was a month ago because of a zfs upgrade, 
and I need to practice how to boot the backup loader file :)


I update the boot loader every time because I like to do it :-).
And sometimes a problem hits me, like this time, and I contribute to
debugging the base system :-):-).

---
Yasuhiro Kimura

--

Nuno Teixeira
FreeBSD Committer (ports)


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

--

Nuno Teixeira
FreeBSD Committer (ports)

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: Updating EFI boot loader results in boot hangup

2022-08-12 Thread Larry Rosenman



I would assume just rename the bootx64.old to bootx64.efi

and/or put it in a different directory that EFI can see

On 08/12/2022 3:03 pm, Nuno Teixeira wrote:

I've been searching without success for a way to load a backup loader 
in case of boot failure.


The upgrade process will be like:
---
mount -t msdosfs /dev/nvd0p1 /mnt
cp /mnt/efi/boot/bootx64.efi /mnt/efi/boot/bootx64.old
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
---

I can't find the right docs to load bootx64.old.
Could you tell me what you did to solve your boot?

Thanks

Yasuhiro Kimura wrote on Friday, 12/08/2022 at 18:45:



From: Nuno Teixeira 
Subject: Re: Updating EFI boot loader results in boot hangup
Date: Fri, 12 Aug 2022 18:26:11 +0100


Hello Yasu,

Do we need to update the boot loader every time we upgrade 
current?


No, you need not.

The only time I updated was a month ago because of a zfs upgrade, 
and I need to practice how to boot the backup loader file :)


I update the boot loader every time because I like to do it :-).
And sometimes a problem hits me, like this time, and I contribute to
debugging the base system :-):-).

---
Yasuhiro Kimura


--

Nuno Teixeira
FreeBSD Committer (ports)


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: limits.conf/stacksize doesn't seem to work?

2022-07-15 Thread Larry Rosenman

On 07/15/2022 5:32 pm, Mark Johnston wrote:

On Fri, Jul 15, 2022 at 05:26:09PM -0500, Larry Rosenman wrote:

On 07/15/2022 5:24 pm, Mark Johnston wrote:
> On Fri, Jul 15, 2022 at 05:21:27PM -0500, Larry Rosenman wrote:
>> On 07/15/2022 5:18 pm, Mark Johnston wrote:
>> > On Fri, Jul 15, 2022 at 05:04:18PM -0500, Larry Rosenman wrote:
>> >> I'm using the following kernel config:
>> >> [...]
>> >> and the following login.conf:
>> >> [...]
>> >> bacula_dir:\
>> >>   :stacksize-max=68719476736:\
>> >>   :stacksize-cur=68719476736:\
>> >>   :tc=daemon:
>> >> [...]
>> >> I've updated my (ler) password entry to reference bacula_dir:
>> >> ler::1001:1001:bacula_dir:0:0:Larry
>> >> Rosenman:/home/ler:/usr/local/bin/zsh
>> >>
>> >>
>> >> when I ssh in, the stacklimit is still:
>> >> ❯ ulimit -H -s
>> >> 2097152
>> >
>> > What is the value of the kern.maxssiz sysctl on this system?
>> >
>> >> ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
>> >> ☁️  (us-east-1)
>> >> ❯ ulimit -S -s
>> >> 2097152
>> >>
>> >> ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
>> >> ☁️  (us-east-1)
>> >> ❯
>> >>
>> >> Where does this number come from?  What am I missing here?
>> >
>> > The stack limit cannot be set to an arbitrarily large number.  It will
>> > silently be clamped to maxssiz.
>>
>> ❯ sysctl kern.maxssiz
>> kern.maxssiz: 2147483648
>
> Then what you're seeing is expected.  The kernel is clamping the stack
> segment limit to 2GB.

I assume this is the default for MAXSSIZ?  and if I change that in the
kernel config, it will
allow bigger?  Where is this default defined?


The default value is platform dependent.  On amd64 it's 512MB, so I'm
not sure where your value is coming from.  It's defined in a header.
You can set it in the kernel configuration, or as a tunable or sysctl.


ok, so I had (back when, heaven only knows) set it in /boot/loader.conf:
kern.maxssiz="2147483648"

thank you.
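
For the archive, the arithmetic behind the numbers in this thread can be checked with a few lines of Python. This is only a sketch: the clamp is modelled as a plain min(), which is my reading of "silently clamped to maxssiz" and not the kernel's code, and the function names are hypothetical. It also assumes, as zsh documents, that `ulimit -s` reports kilobytes.

```python
import resource


def effective_stack_limit(requested_bytes, maxssiz_bytes):
    """Model the clamp: the limit can never exceed kern.maxssiz."""
    return min(requested_bytes, maxssiz_bytes)


def as_ulimit_kb(limit_bytes):
    """Convert a byte limit to the kilobyte figure that ulimit -s prints."""
    return limit_bytes // 1024


REQUESTED = 68719476736  # stacksize-max / stacksize-cur from login.conf (64 GiB)
MAXSSIZ = 2147483648     # kern.maxssiz reported by sysctl (2 GiB)

if __name__ == "__main__":
    print(as_ulimit_kb(effective_stack_limit(REQUESTED, MAXSSIZ)))
    # The limits of the current process, in bytes, for comparison.
    print(resource.getrlimit(resource.RLIMIT_STACK))
```

The 64 GiB request clamped to 2 GiB and divided by 1024 gives 2097152, which is exactly what both `ulimit -H -s` and `ulimit -S -s` showed.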

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: limits.conf/stacksize doesn't seem to work?

2022-07-15 Thread Larry Rosenman

On 07/15/2022 5:24 pm, Mark Johnston wrote:

On Fri, Jul 15, 2022 at 05:21:27PM -0500, Larry Rosenman wrote:

On 07/15/2022 5:18 pm, Mark Johnston wrote:
> On Fri, Jul 15, 2022 at 05:04:18PM -0500, Larry Rosenman wrote:
>> I'm using the following kernel config:
>> [...]
>> and the following login.conf:
>> [...]
>> bacula_dir:\
>>:stacksize-max=68719476736:\
>>:stacksize-cur=68719476736:\
>>:tc=daemon:
>> [...]
>> I've updated my (ler) password entry to reference bacula_dir:
>> ler::1001:1001:bacula_dir:0:0:Larry
>> Rosenman:/home/ler:/usr/local/bin/zsh
>>
>>
>> when I ssh in, the stacklimit is still:
>> ❯ ulimit -H -s
>> 2097152
>
> What is the value of the kern.maxssiz sysctl on this system?
>
>> ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
>> ☁️  (us-east-1)
>> ❯ ulimit -S -s
>> 2097152
>>
>> ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
>> ☁️  (us-east-1)
>> ❯
>>
>> Where does this number come from?  What am I missing here?
>
> The stack limit cannot be set to an arbitrarily large number.  It will
> silently be clamped to maxssiz.

❯ sysctl kern.maxssiz
kern.maxssiz: 2147483648


Then what you're seeing is expected.  The kernel is clamping the stack
segment limit to 2GB.


I assume this is the default for MAXSSIZ?  and if I change that in the 
kernel config, it will allow bigger?  Where is this default defined?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: limits.conf/stacksize doesn't seem to work?

2022-07-15 Thread Larry Rosenman

On 07/15/2022 5:18 pm, Mark Johnston wrote:

On Fri, Jul 15, 2022 at 05:04:18PM -0500, Larry Rosenman wrote:

I'm using the following kernel config:
[...]
and the following login.conf:
[...]
bacula_dir:\
:stacksize-max=68719476736:\
:stacksize-cur=68719476736:\
:tc=daemon:
[...]
I've updated my (ler) password entry to reference bacula_dir:
ler::1001:1001:bacula_dir:0:0:Larry
Rosenman:/home/ler:/usr/local/bin/zsh


when I ssh in, the stacklimit is still:
❯ ulimit -H -s
2097152


What is the value of the kern.maxssiz sysctl on this system?


ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
☁️  (us-east-1)
❯ ulimit -S -s
2097152

ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on
☁️  (us-east-1)
❯

Where does this number come from?  What am I missing here?


The stack limit cannot be set to an arbitrarily large number.  It will
silently be clamped to maxssiz.


❯ sysctl kern.maxssiz
kern.maxssiz: 2147483648


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



limits.conf/stacksize doesn't seem to work?

2022-07-15 Thread Larry Rosenman
:priority=0:\
#   :requirehome@:\
#   :umask=022:\
#   :tc=auth-defaults:
#
#
##
## standard - standard user defaults
##
#standard:\
#   :copyright=/etc/COPYRIGHT:\
#   :welcome=/var/run/motd:\
#   :setenv=BLOCKSIZE=K:\
#   :mail=/var/mail/$:\
#   :path=~/bin /bin /usr/bin /usr/local/bin:\
#   :manpath=/usr/share/man /usr/local/man:\
#   :nologin=/var/run/nologin:\
#   :cputime=1h30m:\
#   :datasize=8M:\
#   :vmemoryuse=100M:\
#   :stacksize=2M:\
#   :memorylocked=4M:\
#   :memoryuse=8M:\
#   :filesize=8M:\
#   :coredumpsize=8M:\
#   :openfiles=24:\
#   :maxproc=32:\
#   :priority=0:\
#   :requirehome:\
#   :passwordtime=90d:\
#   :umask=002:\
#   :ignoretime@:\
#   :tc=default:
#
#
##
## users of X (needs more resources!)
##
#xuser:\
#   :manpath=/usr/share/man /usr/local/man:\
#   :cputime=4h:\
#   :datasize=12M:\
#   :vmemoryuse=infinity:\
#   :stacksize=4M:\
#   :filesize=8M:\
#   :memoryuse=16M:\
#   :openfiles=32:\
#   :maxproc=48:\
#   :tc=standard:
#
#
##
## Staff users - few restrictions and allow login anytime
##
#staff:\
#   :ignorenologin:\
#   :ignoretime:\
#   :requirehome@:\
#   :accounted@:\
#   :path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
#   :umask=022:\
#   :tc=standard:
#
#
##
## root - fallback for root logins
##
#root:\
#   :path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
#   :cputime=infinity:\
#   :datasize=infinity:\
#   :stacksize=infinity:\
#   :memorylocked=infinity:\
#   :memoryuse=infinity:\
#   :filesize=infinity:\
#   :coredumpsize=infinity:\
#   :openfiles=infinity:\
#   :maxproc=infinity:\
#   :memoryuse-cur=32M:\
#   :maxproc-cur=64:\
#   :openfiles-cur=1024:\
#   :priority=0:\
#   :requirehome@:\
#   :umask=022:\
#   :tc=auth-root-defaults:
#
#
##
## Settings used by /etc/rc
##
#daemon:\
#   :coredumpsize@:\
#   :coredumpsize-cur=0:\
#   :datasize=infinity:\
#   :datasize-cur@:\
#   :maxproc=512:\
#   :maxproc-cur@:\
#   :memoryuse-cur=64M:\
#   :memorylocked-cur=64M:\
#   :openfiles=1024:\
#   :openfiles-cur@:\
#   :stacksize=16M:\
#   :stacksize-cur@:\
#   :tc=default:
#
#
##
## Settings used by news subsystem
##
#news:\
#   :path=/usr/local/news/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
#   :cputime=infinity:\
#   :filesize=128M:\
#   :datasize-cur=64M:\
#   :stacksize-cur=32M:\
#   :coredumpsize-cur=0:\
#   :maxmemorysize-cur=128M:\
#   :memorylocked=32M:\
#   :maxproc=128:\
#   :openfiles=256:\
#   :tc=default:
#
#
##
## The dialer class should be used for a dialup PPP account
## Welcome messages/news suppressed
##
#dialer:\
#   :hushlogin:\
#   :requirehome@:\
#   :cputime=unlimited:\
#   :filesize=2M:\
#   :datasize=2M:\
#   :stacksize=4M:\
#   :coredumpsize=0:\
#   :memoryuse=4M:\
#   :memorylocked=1M:\
#   :maxproc=16:\
#   :openfiles=32:\
#   :tc=standard:
#
#
##
## Site full-time 24/7 PPP connection
## - no time accounting, restricted to access via dialin lines
##
#site:\
#   :ignoretime:\
#   :passwordtime@:\
#   :refreshtime@:\
#   :refreshperiod@:\
#   :sessionlimit@:\
#   :autodelete@:\
#   :expireperiod@:\
#   :graceexpire@:\
#   :gracetime@:\
#   :warnexpire@:\
#   :warnpassword@:\
#   :idletime@:\
#   :sessiontime@:\
#   :daytime@:\
#   :weektime@:\
#   :monthtime@:\
#   :warntime@:\
#   :accounted@:\
#   :tc=dialer:\
#   :tc=staff:
#
#
##
## Example standard accounting entries for subscriber levels
##
#
#subscriber|Subscribers:\
#   :accounted:\
#   :refreshtime=180d:\
#   :refreshperiod@:\
#   :sessionlimit@:\
#   :autodelete=30d:\
#   :expireperiod=180d:\
#   :graceexpire=7d:\
#   :gracetime=10m:\
#   :warnexpire=7d:\
#   :warnpassword=7d:\
#   :idletime=30m:\
#   :sessiontime=4h:\
#   :daytime=6h:\
#   :weektime=40h:\
#   :monthtime=120h:\
#   :warntime=4h:\
#   :tc=standard:
#
#
##
## Subscriber accounts. These accounts have their login times
## accounted and have access limits applied.
##
#subppp|PPP Subscriber Accounts:\
#   :tc=dialer:\
#   :tc=subscriber:
#
#
#subshell|Shell Subscriber Accounts:\
#   :tc=subscriber:
#
##
## If you want some of the accounts to use traditional UNIX DES based
## password hashes.
##
#des_users:\
#   :passwd_format=des:\
#   :tc=default:

ler in  borg in sys/amd64/conf on  ler/freebsd-main-changes:main on 
☁️  (us-east-1)

❯

I've updated my (ler) password entry to reference bacula_dir:
ler::1001:1001:bacula_dir:0:0:Larry 
Rosenman:/home/ler:/usr/local/bin/zsh



when I ssh in, the stacklimit is still:
❯ uli

Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-22 Thread Larry Rosenman



On 06/18/2022 8:30 am, Michael Gmelin wrote:





That certainly sounds promising.
Best
Michael





got the new controllers, and no sweat -- saw all 8TB on the drives 
(modulo one bad drive -- seller is replacing).


thanks all.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: MCE: Does this look possibly like a slot issue?

2022-06-21 Thread Larry Rosenman

On 06/21/2022 1:23 pm, Chris wrote:

On 2022-06-20 17:23, Larry Rosenman wrote:

I'm seeing them constantly:

FWIW it looks like a sync(ing) problem between your
RAM && CPU cache. Are your clocks set correctly
for your CPU && RAM? Is your CPU too hot? Is the CPU
cache ECC?


root@freenas[~]# mcelog --dmi


[snip]

Hrm.  IIRC all the BIOS parameters are default (I could be mistaken).
It's a SuperMicro X8DTN+ motherboard with:
CPU: Intel(R) Xeon(R) CPU   E5645  @ 2.40GHz (2400.22-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  Features=0xbfebfbff
  Features2=0x29ee3ff
  AMD Features=0x2c100800
  AMD Features2=0x1
  Structured Extended Features3=0x9c00
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 77309411328 (73728 MB)
avail memory = 75186962432 (71703 MB)
(2 packages, 6 cores and 12 threads each) and 18 4GB sticks.
this ONE slot seems to be a problem.

How would you recommend looking for an issue modulo pulling the 2 cpu 
packages?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: MCE: Does this look possibly like a slot issue?

2022-06-21 Thread Larry Rosenman
-BE
Hardware event. This is not a software error.
MCE 6
CPU 12 BANK 8 TSC 5f6cbe9ef2bc
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 20 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 7
CPU 14 BANK 8 TSC 64ba63c66e52
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 8
CPU 14 BANK 8 TSC 659878c17622
MISC ac29890200040282 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 9
CPU 14 BANK 8 TSC 66b71c1dccf6
MISC ac29890200040183 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 10
CPU 14 BANK 8 TSC 6be0988610ce
MISC ac29890200040682 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 11
CPU 14 BANK 8 TSC 6be0995926f8
MISC ac29890200044000 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
root@freenas[~]#

On 06/21/2022 11:06 am, Rodney W. Grimes wrote:



Swapped 2 DIMMS, now we wait for the ZFS ARC to fill and start using all
the memory.


Depending on the results of that, one thing that is often overlooked
when trying to troubleshoot memory systems in modern Intel systems
is the fact that the DIMM now talks directly to the CPU chip that
has the memory controller built into it.  THUS these "slot" related
ECC/Parity/blowup errors can actually be the CPU and/or the CPU
socket and/or the seating of the CPU in the socket.

So if the error sticks with the DIMM slot and not the DIMM
module, the next thing I would try would be a CPU chip reseat,
including a good inspection of the socket for a damaged
pin.  Also look at the lands on the CPU chip itself, and you
can even try swapping CPU chips to see if it follows the
CPU or the socket, much as you do with a DIMM.
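
Applied to the mcelog reports in this thread, the slot-versus-module test comes down to whether the errors stay at the same Device Locator once a module with a different serial number sits there. A minimal Python sketch of that comparison follows; the function name and the (locator, serial) input shape are illustrative assumptions, not anything mcelog provides.

```python
def error_follows(before, after):
    """Classify the result of a DIMM swap.

    before/after: sets of (device_locator, serial_number) pairs taken from
    ECC error reports before and after the swap (hypothetical input shape).
    """
    slots_before = {loc for loc, _ in before}
    slots_after = {loc for loc, _ in after}
    serials_before = {sn for _, sn in before}
    serials_after = {sn for _, sn in after}

    same_slot = bool(slots_before & slots_after)
    same_module = bool(serials_before & serials_after)

    if same_slot and not same_module:
        return "slot"      # suspect the slot, memory channel, CPU, or CPU socket
    if same_module and not same_slot:
        return "module"    # suspect the DIMM itself
    return "inconclusive"


# Values copied from the mcelog reports in this thread.
before = {("P2-DIMM2C", "40F3C20F")}   # Hyundai module, 20 June
after = {("P2-DIMM2C", "642264CD")}    # Nanya module, 21 June
print(error_follows(before, after))
```

mcelog itself warns that SMBIOS data is often unreliable, so a "slot" result is a hint to go on to the CPU reseat and socket inspection, not proof.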




On 06/20/2022 7:59 pm, Larry Rosenman wrote:

> Sup

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Larry Rosenman



Swapped 2 DIMMS, now we wait for the ZFS ARC to fill and start using all 
the memory.


On 06/20/2022 7:59 pm, Larry Rosenman wrote:


SuperMicro X8DTN+

2 Processors, 6-core/12-Thread. CPU: Intel(R) Xeon(R) CPU   
E5645  @ 2.40GHz (2400.20-MHz K8-class CPU)


I'll bring it down and swap DIMMS around

On 06/20/2022 7:57 pm, Ultima wrote:

Hey Larry,

One red flag I am seeing is that the error is being produced on
the same CPU/bank with each error you have provided so far.
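
One way to check the "same CPU/bank" observation without reading every record is to tally the mcelog text by CPU, bank, address, and DIMM locator. The Python sketch below assumes the output has been saved as plain text in the format pasted in this thread; the `tally` helper is hypothetical, and the sample is abbreviated from the records above.

```python
import re
from collections import Counter

# Two abbreviated records in the format mcelog printed in this thread.
SAMPLE = """\
Hardware event. This is not a software error.
MCE 0
CPU 22 BANK 8 TSC 20aab486464a
MISC ac29890200046444 ADDR ee2f6e800
Device Locator: P2-DIMM2C
Hardware event. This is not a software error.
MCE 1
CPU 22 BANK 8 TSC 296dfcc82582
MISC ac29890200041381 ADDR ee2f6e800
Device Locator: P2-DIMM2C
"""


def tally(text):
    """Count MCE records per (cpu, bank, addr, locator)."""
    counts = Counter()
    # Each record starts with the "Hardware event." banner line.
    for record in text.split("Hardware event.")[1:]:
        cpu_bank = re.search(r"CPU (\d+) BANK (\d+)", record)
        if not cpu_bank:
            continue
        addr = re.search(r"ADDR ([0-9a-f]+)", record)
        loc = re.search(r"Device Locator: (\S+)", record)
        key = (
            int(cpu_bank.group(1)),
            int(cpu_bank.group(2)),
            addr.group(1) if addr else None,   # scrub records carry no ADDR
            loc.group(1) if loc else None,     # and may carry no DMI block
        )
        counts[key] += 1
    return counts


for key, n in tally(SAMPLE).items():
    print(key, n)
```

A single dominant key, as in the records posted here, points at one physical location rather than scattered failures.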

Can you try and follow my original recommendation and swap
currently installed DIMM with the problem DIMM slot and see
if anything changes?

Can you also provide the motherboard model? Also, do you
have multiple CPUs installed in this system?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman  wrote:

Yes and Yes.

On 06/20/2022 7:37 pm, Ultima wrote:

Are you sure that the module you replaced it with was good?
Are you sure you replaced the correct module?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman  wrote:

I'm seeing them constantly:

root@freenas[~]# mcelog --dmi
Hardware event. This is not a software error.
MCE 0
CPU 22 BANK 8 TSC 20aab486464a
MISC ac29890200046444 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 44
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 1
CPU 22 BANK 8 TSC 296dfcc82582
MISC ac29890200041381 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 2
CPU 22 BANK 8 TSC 2a5604a6a070
MISC ac29890200044281
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 884200cf MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 3
CPU 22 BANK 8 TSC 31e141418eb8
MISC ac29890200046a4a ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 4a
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 4
CPU 22 BANK 8 TSC 3a014afee106
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 5
CPU 22 BANK 8 TSC 41d1dbef1a6a
MISC ac29890200046141 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 41
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Larry Rosenman



SuperMicro X8DTN+

2 Processors, 6-core/12-Thread. CPU: Intel(R) Xeon(R) CPU   
E5645  @ 2.40GHz (2400.20-MHz K8-class CPU)


I'll bring it down and swap DIMMS around

On 06/20/2022 7:57 pm, Ultima wrote:


Hey Larry,

One red flag I am seeing is that the error is being produced on
the same CPU/bank with each error you have provided so far.

Can you try and follow my original recommendation and swap
currently installed DIMM with the problem DIMM slot and see
if anything changes?

Can you also provide the motherboard model? Also, do you
have multiple CPUs installed in this system?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman  wrote:

Yes and Yes.

On 06/20/2022 7:37 pm, Ultima wrote:

Are you sure that the module you replaced it with was good?
Are you sure you replaced the correct module?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman  wrote:

I'm seeing them constantly:

root@freenas[~]# mcelog --dmi
Hardware event. This is not a software error.
MCE 0
CPU 22 BANK 8 TSC 20aab486464a
MISC ac29890200046444 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 44
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 1
CPU 22 BANK 8 TSC 296dfcc82582
MISC ac29890200041381 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 2
CPU 22 BANK 8 TSC 2a5604a6a070
MISC ac29890200044281
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 884200cf MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 3
CPU 22 BANK 8 TSC 31e141418eb8
MISC ac29890200046a4a ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 4a
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 4
CPU 22 BANK 8 TSC 3a014afee106
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 5
CPU 22 BANK 8 TSC 41d1dbef1a6a
MISC ac29890200046141 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 41
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 6
CPU 22 BANK 8 TSC

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Larry Rosenman



Yes and Yes.

On 06/20/2022 7:37 pm, Ultima wrote:


Are you sure that the module you replaced it with was good?
Are you sure you replaced the correct module?

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman  wrote:

I'm seeing them constantly:

root@freenas[~]# mcelog --dmi
Hardware event. This is not a software error.
MCE 0
CPU 22 BANK 8 TSC 20aab486464a
MISC ac29890200046444 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 44
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 1
CPU 22 BANK 8 TSC 296dfcc82582
MISC ac29890200041381 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 2
CPU 22 BANK 8 TSC 2a5604a6a070
MISC ac29890200044281
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 884200cf MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 3
CPU 22 BANK 8 TSC 31e141418eb8
MISC ac29890200046a4a ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 4a
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 4
CPU 22 BANK 8 TSC 3a014afee106
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 5
CPU 22 BANK 8 TSC 41d1dbef1a6a
MISC ac29890200046141 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 41
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 6
CPU 22 BANK 8 TSC 4a1b1ecef446
MISC ac29890200046a4a ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 4a
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 7
CPU 22 BANK 8 TSC 527bc27db776
MISC

Re: MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Larry Rosenman
: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
Hardware event. This is not a software error.
MCE 8
CPU 22 BANK 8 TSC 5aa4ecdd795a
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655770989 Mon Jun 20 19:23:09 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9
root@freenas[~]#

and I replaced the DIMM yesterday :(

On 06/20/2022 7:19 pm, Ultima wrote:


Hey Larry,

It is possible it's the motherboard itself, but it's rare. The way I
would determine this is to swap the DIMM module with another
populated slot on the motherboard and see if the error migrated
to the new slot or not. Also, this error doesn't necessarily mean
there is a problem that needs to be addressed. If you have been
running the system for many months and you see ECC errors a
handful of times, it can probably be safely ignored.

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman  wrote:


I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
DIMM a couple of times, and still the MCE's continue.
Is it possible it's a motherboard slot issue?

Hardware event. This is not a software error.
MCE 8
CPU 22 BANK 8 TSC 5aa4ecdd795a
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655762472 Mon Jun 20 17:01:12 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


MCE: Does this look possibly like a slot issue?

2022-06-20 Thread Larry Rosenman
I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this 
DIMM a couple of times, and still the MCE's continue.

Is it possible it's a motherboard slot issue?

Hardware event. This is not a software error.
MCE 8
CPU 22 BANK 8 TSC 5aa4ecdd795a
MISC ac29890200046646 ADDR ee2f6e800
TIME 1655762472 Mon Jun 20 17:01:12 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 46
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c41009f MCGSTATUS 0
MCGCAP 1c09 APICID 34 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Hyundai
Serial Number: 40F3C20F
Asset Tag:
Part Number: HMT151R7BFR4C-H9



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-18 Thread Larry Rosenman



On 06/18/2022 8:30 am, Michael Gmelin wrote:


On 18. Jun 2022, at 15:10, Larry Rosenman  wrote:



On 06/18/2022 3:54 am, Michael Gmelin wrote:

[SNIP]

Subvendor is Fujitsu Siemens - so I guess this is integrated into a 
system by them.


Seems like flashing the 2108 to an IT firmware isn't an option (based 
on what I found online). You could check if there are firmware updates 
available though. How did you configure the drives in the megaraid 
utility (ctrl-h after boot)? Did you create a RAID-0 for each disk? 
And what capacity is shown in there?


Based on [0], 2108 based controllers don't support 4kn. IT mode would 
help (true passthrough), but as written above, I don't think it's an 
option for this model.


-m
[0] https://bitdeals.tech/blogs/news/4kn-lsi-compatibility-list

as I said earlier in the thread, I've bought 2 of these:
https://www.ebay.com/itm/194910024856

which if I'm reading that chart right should work with the 4Kn drives.


That certainly sounds promising.
Best
Michael


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


And I realized I didn't answer the question about how stuff was
configured: each disk is RAID-0.  The current pool is 10x3T disks, and
I'm adding 6x8T since the pool is 70% full (Bacula backups, Time
Machine backups, random other stuff).


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-18 Thread Larry Rosenman



On 06/18/2022 3:54 am, Michael Gmelin wrote:

[SNIP]

Subvendor is Fujitsu Siemens - so I guess this is integrated into a 
system by them.


Seems like flashing the 2108 to an IT firmware isn't an option (based on 
what I found online). You could check if there are firmware updates 
available though. How did you configure the drives in the megaraid 
utility (ctrl-h after boot)? Did you create a RAID-0 for each disk? And 
what capacity is shown in there?


Based on [0], 2108 based controllers don't support 4kn. IT mode would 
help (true passthrough), but as written above, I don't think it's an 
option for this model.


-m
  [0] https://bitdeals.tech/blogs/news/4kn-lsi-compatibility-list

as I said earlier in the thread, I've bought 2 of these:
https://www.ebay.com/itm/194910024856

which if I'm reading that chart right should work with the 4Kn drives.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-17 Thread Larry Rosenman

On 06/17/2022 6:20 pm, Michael Gmelin wrote:

On 18. Jun 2022, at 00:57, Larry Rosenman  wrote:
On 06/17/2022 5:48 pm, Michael Gmelin wrote:

On 18. Jun 2022, at 00:31, Alexander Motin  wrote:


On 17.06.2022 18:24, Alexander Motin wrote:

On 17.06.2022 18:16, Larry Rosenman wrote:
On 06/17/2022 5:08 pm, Alexander Motin wrote:

On 17.06.2022 11:59, Larry Rosenman wrote:
I'm looking to upgrade the controllers in my TrueNAS box to 
something that will
support 8TB drives because apparently my LSI 2108 controllers do 
not support 8TB drives.

What's the community's recommendation?
needs to support SFF connectors for a total of 4 SFF connectors, 
as I have 16 slots.

We at iX are still using LSI/Broadcom HBAs, just moved from long
discontinued mps(4) to newer mpr(4).  And I don't believe the problem
is directly related to capacity.  According to my observations it may
be Seagate HDDs of/above certain (8TB) generation.  We do not use
Seagate HDDs in our products, so about that instability I only heard
from forums and TrueNAS community user reports.
This is a mfi(4) set of controllers, and a ST8Nm0045 8TB (CMR) 
drive.

Is this a bad combo?
mfi0: 9973 (708793330s/0x0002/WARN) - PD 00(e0xfc/s3) is not supported
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 3 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 2 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 1 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 0 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Error 5, Retries exhausted
mfi0 Physical Drives:
 0 (  932G) UNCONFIGURED GOOD serial=ZA1AC912> SATA E1:S3
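
For readers unfamiliar with the CDB bytes in the probe messages above: the command being retried is a plain 6-byte SCSI INQUIRY. The Python sketch below decodes it using the standard INQUIRY field layout; the helper name is made up for illustration.

```python
def decode_inquiry_cdb(hex_bytes):
    """Decode a 6-byte SCSI INQUIRY CDB given as space-separated hex."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 6 or cdb[0] != 0x12:
        raise ValueError("not a 6-byte INQUIRY CDB")
    return {
        "opcode": cdb[0],                                   # 0x12 = INQUIRY
        "evpd": bool(cdb[1] & 0x01),                        # vital product data requested?
        "page_code": cdb[2],
        "allocation_length": int.from_bytes(cdb[3:5], "big"),
        "control": cdb[5],
    }


print(decode_inquiry_cdb("12 00 00 00 24 00"))
```

With EVPD clear and an allocation length of 0x24 (36 bytes), this is the basic identification request, so the failure happens before any capacity-specific command is even issued.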
mfi(4) are RAIDs, not HBAs.  We do not recommend RAIDs with TrueNAS 
due to problems with hot-plug, disk identification, etc. and so 
have limited experience with them.  But I know some of LSI RAIDs 
can be reflashed into equivalent HBAs, so if they share the 
hardware, I can speculate that they may share some issues.
I've just noticed "932G" instead of "8000G".  It is obviously a 
bigger problem than what we heard for HBAs.  It looks like the kind of 
problem that should not happen with HBAs, since they should not care 
about disk capacity.

What does `smartctl -a ` report (especially sector sizes)?
-m

--
Alexander Motin

It's not even making a mfid* node (it is a 4Kn disk)


Ok, that’s sad (and explains the wrong size calculation as 4096/512=8).
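
The factor of 8 can be checked with a little arithmetic. The sketch below assumes the drive holds a nominal 8 * 10^12 bytes (the real sector count will differ slightly) and that the controller reads the correct sector count but treats each 4096-byte sector as 512 bytes; both are assumptions for illustration.

```python
NOMINAL_BYTES = 8 * 10**12   # assumption: an "8TB" drive as marketed, in decimal bytes
NATIVE_SECTOR = 4096         # 4Kn logical sector size
ASSUMED_SECTOR = 512         # what a controller without 4Kn support would assume


def misreported_gib(capacity_bytes, native, assumed):
    """Capacity in GiB when the sector count is right but the sector size is wrong."""
    sectors = capacity_bytes // native
    return sectors * assumed / 2**30


print(f"{misreported_gib(NOMINAL_BYTES, NATIVE_SECTOR, ASSUMED_SECTOR):.0f}G")
```

That works out to roughly 931 GiB, in line with the "932G" the controller reported; the small difference comes from the nominal capacity used here.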

Is this in HBA mode? (Like Alexander suggested, re-/crossflashing
using an IT firmware might be an option). What controller / firmware
image version is it?

-m



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



mfi0@pci0:8:0:0:	class=0x010400 rev=0x05 hdr=0x00 vendor=0x1000 device=0x0079 subvendor=0x1734 subdevice=0x1176

vendor = 'Broadcom / LSI'
device = 'MegaRAID SAS 2108 [Liberator]'
class  = mass storage
subclass   = RAID

mfi1@pci0:3:0:0:	class=0x010400 rev=0x05 hdr=0x00 vendor=0x1000 device=0x0079 subvendor=0x1734 subdevice=0x1176

vendor = 'Broadcom / LSI'
device = 'MegaRAID SAS 2108 [Liberator]'
class  = mass storage
subclass   = RAID

mfi0:  port 0xd000-0xd0ff mem 0xfbc9c000-0xfbc9,0xfbcc-0xfbcf irq 26 at device 0.0 on pci3
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: FW MaxCmds = 1008, limiting to 128
mfip0:  on mfi0
mfi0: 10014 (708822708s/0x0020/info) - Shutdown command received from host
mfi0: 10015 (boot + 3s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/1176/1734)
mfi0: 10016 (boot + 3s/0x0020/info) - Firmware version 2.130.353-2727
mfi0: 10017 (boot + 6s/0x0020/info) - Package version 12.12.0-0174
mfi0: 10018 (boot + 6s/0x0020/info) - Board Revision

mfi1:  port 0xc000-0xc0ff mem 0xfba9c000-0xfba9,0xfbac-0xfbaf irq 16 at device 0.0 on pci8
mfi1: Using MSI
mfi1: Megaraid SAS driver Ver 4.23
mfi1: FW MaxCmds = 1008, limiting to 128


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-17 Thread Larry Rosenman

On 06/17/2022 5:48 pm, Michael Gmelin wrote:

On 18. Jun 2022, at 00:31, Alexander Motin  wrote:




On 17.06.2022 18:24, Alexander Motin wrote:

On 17.06.2022 18:16, Larry Rosenman wrote:
On 06/17/2022 5:08 pm, Alexander Motin wrote:

On 17.06.2022 11:59, Larry Rosenman wrote:
I'm looking to upgrade the controllers in my TrueNAS box to 
something that will
support 8TB drives because apparently my LSI 2108 controllers do 
not support 8TB drives.


What's the community's recommendation?
needs to support SFF connectors for a total of 4 SFF connectors, 
as I have 16 slots.


We at iX are still using LSI/Broadcom HBAs, just moved from long
discontinued mps(4) to newer mpr(4).  And I don't believe the problem
is directly related to capacity.  According to my observations it may
be Seagate HDDs of/above certain (8TB) generation.  We do not use
Seagate HDDs in our products, so about that instability I only heard
from forums and TrueNAS community user reports.


This is a mfi(4) set of controllers, and a ST8Nm0045 8TB (CMR) 
drive.


Is this a bad combo?

mfi0: 9973 (708793330s/0x0002/WARN) - PD 00(e0xfc/s3) is not supported
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 3 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 2 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 1 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 0 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Error 5, Retries exhausted

mfi0 Physical Drives:
  0 (  932G) UNCONFIGURED GOOD serial=ZA1AC912> SATA E1:S3
mfi(4) are RAIDs, not HBAs.  We do not recommend RAIDs with TrueNAS 
due to problems with hot-plug, disk identification, etc. and so have 
limited experience with them.  But I know some of LSI RAIDs can be 
reflashed into equivalent HBAs, so if they share the hardware, I can 
speculate that they may share some issues.


I've just noticed "932G" instead of "8000G".  It is obviously a bigger 
problem than what we heard for HBAs.  It looks like the kind of problem 
that should not happen with HBAs, since they should not care about disk 
capacity.




What does `smartctl -a ` report (especially sector sizes)?

-m



--
Alexander Motin


It's not even making a mfid* node (it is a 4Kn disk)
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-17 Thread Larry Rosenman

On 06/17/2022 5:24 pm, Alexander Motin wrote:

On 17.06.2022 18:16, Larry Rosenman wrote:

On 06/17/2022 5:08 pm, Alexander Motin wrote:

On 17.06.2022 11:59, Larry Rosenman wrote:
I'm looking to upgrade the controllers in my TrueNAS box to something
that will support 8TB drives, because apparently my LSI 2108 controllers
do not support them.

What's the community's recommendation?
It needs to support SFF connectors, 4 in total, as I have 16 slots.


We at iX are still using LSI/Broadcom HBAs, just moved from long
discontinued mps(4) to newer mpr(4).  And I don't believe the problem
is directly related to capacity.  According to my observations it may
be Seagate HDDs of/above certain (8TB) generation.  We do not use
Seagate HDDs in our products, so about that instability I only heard
from forums and TrueNAS community user reports.


This is a mfi(4) set of controllers, and a ST8Nm0045 8TB (CMR) 
drive.


Is this a bad combo?

mfi0: 9973 (708793330s/0x0002/WARN) - PD 00(e0xfc/s3) is not supported
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 3 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 2 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 1 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 0 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Error 5, Retries exhausted

mfi0 Physical Drives:
  0 (  932G) UNCONFIGURED GOOD  
SATA E1:S3


mfi(4) are RAIDs, not HBAs.  We do not recommend RAIDs with TrueNAS
due to problems with hot-plug, disk identification, etc. and so have
limited experience with them.  But I know some of LSI RAIDs can be
reflashed into equivalent HBAs, so if they share the hardware, I can
speculate that they may share some issues.

I bought 2 of these:
https://www.ebay.com/itm/194910024856
to replace the 2 mfi(4)'s

Hopefully I can just move the controllers and TrueNAS 13.0-RELEASE will 
just notice them.

and pick up the new 8T disks.

let me know if I'm setting myself up for failure.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-17 Thread Larry Rosenman

On 06/17/2022 5:08 pm, Alexander Motin wrote:

On 17.06.2022 11:59, Larry Rosenman wrote:
I'm looking to upgrade the controllers in my TrueNAS box to something
that will support 8TB drives, because apparently my LSI 2108 controllers
do not support them.

What's the community's recommendation?
It needs to support SFF connectors, 4 in total, as I have 16 slots.


We at iX are still using LSI/Broadcom HBAs, just moved from long
discontinued mps(4) to newer mpr(4).  And I don't believe the problem
is directly related to capacity.  According to my observations it may
be Seagate HDDs of/above certain (8TB) generation.  We do not use
Seagate HDDs in our products, so about that instability I only heard
from forums and TrueNAS community user reports.



This is a mfi(4) set of controllers, and a ST8Nm0045 8TB (CMR) 
drive.


Is this a bad combo?


mfi0: 9973 (708793330s/0x0002/WARN) - PD 00(e0xfc/s3) is not supported
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 3 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 2 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 1 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Retrying command, 0 more tries remain
(probe0:mfi0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:mfi0:0:0:0): CAM status: CCB request completed with an error
(probe0:mfi0:0:0:0): Error 5, Retries exhausted

mfi0 Physical Drives:
 0 (  932G) UNCONFIGURED GOOD  
SATA E1:S3

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106




SAS/SATA controllers: 8 port that support 8TB Drives

2022-06-17 Thread Larry Rosenman
I'm looking to upgrade the controllers in my TrueNAS box to something
that will support 8TB drives, because apparently my LSI 2108 controllers
do not support them.

What's the community's recommendation?
It needs to support SFF connectors, 4 in total, as I have 16 slots.



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Zpool with latest feature com.delpfix:head_errlog can not be booted from.

2022-05-21 Thread Larry Rosenman
@head_errlogactive 
local

root@freebsd:~ #

After reinstalling the boot programs, it does boot; this also works:
root@freebsd:~ # /usr/obj/usr/src/amd64.amd64/stand/userboot/test/test -d /dev/da0


the fix is already pushed.

rgds,
toomas

On 21. May 2022, at 03:56, Larry Rosenman  wrote:

Can you let me know when a replacement binary is available for EFI?  I
have my buildbox/dev system in a non-bootable state.  It's a RAIDZ-1
pool, and I have no place to put another disk.

Thanks for any help.
(If  can email the replacement binary that would be 
wonderful).


On 05/20/2022 4:47 am, Toomas Soome wrote:
I'll look into it. It would be nice to have at least a heads-up message
about such features, or for the zfs code to have a means of blocking
feature upgrades on a boot pool.
Rgds,
Toomas
On 20. May 2022, at 11:39, Johan Hendriks  
wrote:
I upgraded my FreeBSD CURRENT system and, with that, upgraded my
storage pool and my zroot pool.
I also installed the new gptboot code on the disk. After the reboot I
could not boot anymore.
So I reinstalled the OS on one disk of the old zroot mirror pool and
left the second disk untouched.

Then I can import the pools.
If I boot with the latest snapshot ISO
(FreeBSD-14.0-CURRENT-amd64-20220519-716fd348e01-255696-disc1.iso) I
see the following when I boot.

BIOS drive A: is fd0
BIOS drive B: is fd1

BIOS drive K: is disk9
ZFS: unsupported feature: com.delpfix:head_errlog
ZFS: pool zroot is not supported
ZFS: unsupported feature: com.delpfix:head_errlog
ZFS: pool storage is not supported
BIOS 624kB/2000420kB available memory
Then the OS is loaded. If I then go to the shell of the installer and
do a zpool import, I can import the pools zroot and storage. So this
snapshot has the latest ZFS version with the com.delpfix:head_errlog
feature. It looks like the bootloader is not able to use the new
feature, which renders the system unbootable.

regards
Johan


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Links:
--
[1] http://148-52-235-80.sta.estpak.ee/boot.tar

Re: Zpool with latest feature com.delpfix:head_errlog can not be booted from.

2022-05-20 Thread Larry Rosenman
Can you let me know when a replacement binary is available for EFI?  I
have my buildbox/dev system in a non-bootable state.  It's a RAIDZ-1
pool, and I have no place to put another disk.

Thanks for any help.
(If  can email the replacement binary that would be wonderful).


On 05/20/2022 4:47 am, Toomas Soome wrote:

I’ll look into it. It would be nice to have at least a heads-up message
about such features, or for the zfs code to have a means of blocking
feature upgrades on a boot pool.

Rgds,
Toomas


On 20. May 2022, at 11:39, Johan Hendriks  
wrote:


I upgraded my FreeBSD CURRENT system and, with that, upgraded my
storage pool and my zroot pool.
I also installed the new gptboot code on the disk. After the reboot I
could not boot anymore.

So I reinstalled the OS on one disk of the old zroot mirror pool and
left the second disk untouched.

Then I can import the pools.
If I boot with the latest snapshot ISO
(FreeBSD-14.0-CURRENT-amd64-20220519-716fd348e01-255696-disc1.iso) I
see the following when I boot.


BIOS drive A: is fd0
BIOS drive B: is fd1

BIOS drive K: is disk9
ZFS: unsupported feature: com.delpfix:head_errlog
ZFS: pool zroot is not supported
ZFS: unsupported feature: com.delpfix:head_errlog
ZFS: pool storage is not supported
BIOS 624kB/2000420kB available memory

Then the OS is loaded. If I then go to the shell of the installer and
do a zpool import, I can import the pools zroot and storage. So this
snapshot has the latest ZFS version with the com.delpfix:head_errlog
feature. It looks like the bootloader is not able to use the new
feature, which renders the system unbootable.


regards
Johan




--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-27 Thread Larry Rosenman

On 02/27/2022 3:58 pm, Mark Johnston wrote:

On Sun, Feb 27, 2022 at 01:16:44PM -0600, Larry Rosenman wrote:

On 02/26/2022 11:08 am, Larry Rosenman wrote:
> On 02/26/2022 10:57 am, Larry Rosenman wrote:
>> On 02/26/2022 10:37 am, Juraj Lutter wrote:
>>>> On 26 Feb 2022, at 03:03, Larry Rosenman  wrote:
>>>> I'm running this script:
>>>> #!/bin/sh
>>>> for i in $(zfs list -H | awk '{print $1}')
>>>> do
>>>>   FS=$i
>>>>   FN=$(echo ${FS} | sed -e s@/@_@g)
>>>>   sudo zfs send -vecLep ${FS}@REPAIR_SNAP | ssh l...@freenas.lerctr.org cat - \> $FN
>>>> done
>>>>
>>>>
>>>>
>>> I’d put, like:
>>>
>>> echo ${FS}
>>>
>>> before “sudo zfs send”, to get at least a bit of a clue on where it
>>> can get to.
>>>
>>> otis
>>>
>>>
>>> —
>>> Juraj Lutter
>>> o...@freebsd.org
>> I just looked at the destination to see where it died (it did!) and I
>> bectl destroy'd the
>> BE that crashed it, and am running a new scrub -- we'll see whether
>> that was sufficient.
>>
>> Thanks, all!
> Well, it was NOT sufficient. More zfs export fun to come :(

I was able to export the rest of the datasets, and re-install 14-CURRENT
from a recent snapshot, and restore the datasets I care about.

I'm now seeing:
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 48 (zpool), jid 0, uid 0: exited on signal 6
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 54 (zpool), jid 0, uid 0: exited on signal 6

On boot.  Ideas?


That ioctl is DIOCGMEDIASIZE, i.e., something is asking /dev/mfi0, the
controller device node, about the size of a disk.  Presumably this is
the result of some kind of misconfiguration somewhere, and /dev/mfid0
was meant instead.
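
For reference, the request number in the log can be taken apart by hand. The sketch below assumes the usual FreeBSD `<sys/ioccom.h>` packing (direction in the top three bits, a 13-bit parameter length, then the group character and the command number); `decode_ioctl` is a hypothetical helper name, not an existing tool. Group 'd', number 129, with an 8-byte output parameter is consistent with the `_IOR('d', 129, off_t)` definition of DIOCGMEDIASIZE in `<sys/disk.h>`.

```sh
#!/bin/sh
# Split an ioctl request number into the fields packed by <sys/ioccom.h>.
# decode_ioctl is a hypothetical helper written for this illustration.
decode_ioctl() {
    cmd=$(( $1 ))
    dir=$(( (cmd >> 29) & 0x7 ))      # 1 = void, 2 = out (kernel to caller), 4 = in
    len=$(( (cmd >> 16) & 0x1fff ))   # parameter size in bytes
    grp=$(( (cmd >> 8) & 0xff ))      # group, stored as a character code
    num=$(( cmd & 0xff ))             # command number within the group
    # Turn the group code back into its character via an octal escape.
    grpchar=$(printf "\\$(printf '%03o' "$grp")")
    echo "dir=$dir len=$len group=$grpchar num=$num"
}

decode_ioctl 0x40086481
```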



Per advice from markj@ I deleted the /{etc,boot}/zfs/zpool.cache files,
and this issue went away.  They were stale cache files which are no
longer needed.
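
A more cautious variant of the same cleanup, as a sketch only: move the two cache files aside instead of deleting them, so they can be put back if something else turns out to depend on them. `stash_zpool_cache` and the `.stale` suffix are hypothetical choices for this illustration; the optional argument is a root prefix so the function can be tried on a scratch directory first.

```sh
#!/bin/sh
# Move stale zpool.cache files out of the way rather than deleting them.
# stash_zpool_cache and the ".stale" suffix are illustrative assumptions.
stash_zpool_cache() {
    root=${1:-}    # optional prefix, e.g. a scratch directory or an altroot
    for f in "$root/etc/zfs/zpool.cache" "$root/boot/zfs/zpool.cache"; do
        if [ -f "$f" ]; then
            mv "$f" "$f.stale" && echo "moved $f"
        fi
    done
}

# On the live system this would be run as root with no argument:
#   stash_zpool_cache
```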

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-27 Thread Larry Rosenman

On 02/27/2022 3:03 pm, Michael Butler wrote:

[ cc list trimmed ]

On 2/27/22 14:16, Larry Rosenman wrote:


I was able to export the rest of the datasets, and re-install 
14-CURRENT from a recent snapshot, and restore the datasets I care 
about.


I'm now seeing:
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 48 (zpool), jid 0, uid 0: exited on signal 6
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 54 (zpool), jid 0, uid 0: exited on signal 6

On boot.  Ideas?


These messages may or may not be related. I found both the mfi and
mrsas drivers to be 'chatty' in this way - IOCTL complaints. I ended
up setting the debug flag for mrsas in /etc/sysctl.conf ..

dev.mrsas.0.mrsas_debug=0

There's an equivalent for mfi

Michael


I don't see it:
✖1 ❯ sysctl dev.mfi
dev.mfi.0.keep_deleted_volumes: 0
dev.mfi.0.delete_busy_volumes: 0
dev.mfi.0.%parent: pci3
dev.mfi.0.%pnpinfo: vendor=0x1000 device=0x0079 subvendor=0x1028 subdevice=0x1f17 class=0x010400
dev.mfi.0.%location: slot=0 function=0 dbsf=pci0:3:0:0
dev.mfi.0.%driver: mfi
dev.mfi.0.%desc: Dell PERC H700 Integrated
dev.mfi.%parent:

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-27 Thread Larry Rosenman

On 02/26/2022 11:08 am, Larry Rosenman wrote:

On 02/26/2022 10:57 am, Larry Rosenman wrote:

On 02/26/2022 10:37 am, Juraj Lutter wrote:

On 26 Feb 2022, at 03:03, Larry Rosenman  wrote:
I'm running this script:
#!/bin/sh
for i in $(zfs list -H | awk '{print $1}')
do
  FS=$i
  FN=$(echo ${FS} | sed -e s@/@_@g)
  sudo zfs send -vecLep ${FS}@REPAIR_SNAP | ssh l...@freenas.lerctr.org cat - \> $FN
done




I’d put, like:

echo ${FS}

before “sudo zfs send”, to get at least a bit of a clue on where it 
can get to.


otis


—
Juraj Lutter
o...@freebsd.org

I just looked at the destination to see where it died (it did!) and I
bectl destroy'd the
BE that crashed it, and am running a new scrub -- we'll see whether
that was sufficient.

Thanks, all!

Well, it was NOT sufficient. More zfs export fun to come :(


I was able to export the rest of the datasets, and re-install 14-CURRENT 
from a recent snapshot, and restore the datasets I care about.


I'm now seeing:
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 48 (zpool), jid 0, uid 0: exited on signal 6
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
mfi0: IOCTL 0x40086481 not handled
pid 54 (zpool), jid 0, uid 0: exited on signal 6

On boot.  Ideas?



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-26 Thread Larry Rosenman

On 02/26/2022 10:57 am, Larry Rosenman wrote:

On 02/26/2022 10:37 am, Juraj Lutter wrote:

On 26 Feb 2022, at 03:03, Larry Rosenman  wrote:
I'm running this script:
#!/bin/sh
for i in $(zfs list -H | awk '{print $1}')
do
  FS=$i
  FN=$(echo ${FS} | sed -e s@/@_@g)
  sudo zfs send -vecLep ${FS}@REPAIR_SNAP | ssh l...@freenas.lerctr.org cat - \> $FN
done




I’d put, like:

echo ${FS}

before “sudo zfs send”, to get at least a bit of a clue on where it 
can get to.


otis


—
Juraj Lutter
o...@freebsd.org

I just looked at the destination to see where it died (it did!) and I
bectl destroy'd the
BE that crashed it, and am running a new scrub -- we'll see whether
that was sufficient.

Thanks, all!

Well, it was NOT sufficient. More zfs export fun to come :(

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-26 Thread Larry Rosenman

On 02/26/2022 10:37 am, Juraj Lutter wrote:

On 26 Feb 2022, at 03:03, Larry Rosenman  wrote:
I'm running this script:
#!/bin/sh
for i in $(zfs list -H | awk '{print $1}')
do
  FS=$i
  FN=$(echo ${FS} | sed -e s@/@_@g)
  sudo zfs send -vecLep ${FS}@REPAIR_SNAP | ssh l...@freenas.lerctr.org cat - \> $FN
done




I’d put, like:

echo ${FS}

before “sudo zfs send”, to get at least a bit of a clue on where it can 
get to.


otis


—
Juraj Lutter
o...@freebsd.org
I just looked at the destination to see where it died (it did!) and I 
bectl destroy'd the
BE that crashed it, and am running a new scrub -- we'll see whether that 
was sufficient.


Thanks, all!
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-25 Thread Larry Rosenman



On 02/25/2022 2:11 am, Alexander Leidinger wrote:

Quoting Larry Rosenman  (from Thu, 24 Feb 2022 20:19:45 -0600):



I tried a scrub -- it panic'd on a fatal double fault.

Suggestions?


The safest / cleanest (but not fastest) is data export and pool 
re-creation. If you export dataset by dataset (instead of recursively 
all), you can even see which dataset is causing the issue. In case this 
per dataset export narrows down the issue and it is a dataset you don't 
care about (as in: 1) no issue to recreate from scratch or 2) there is 
a backup available) you could delete this (or each such) dataset and 
re-create it in-place (= not re-creating the entire pool).


Bye,
Alexander.

http://www.Leidinger.net alexan...@leidinger.net : PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org   netch...@freebsd.org    : PGP 0x8F31830F9F2772BF


I'm running this script:
#!/bin/sh
for i in $(zfs list -H | awk '{print $1}')
do
  FS=$i
  FN=$(echo ${FS} | sed -e s@/@_@g)
  sudo zfs send -vecLep ${FS}@REPAIR_SNAP | ssh l...@freenas.lerctr.org cat - \> $FN
done

How will I know a "Problem" dataset?
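
One possible way to spot a problem dataset, sketched under assumptions: print each dataset name before its send starts, so that if the box panics mid-stream the last name on the terminal (or the last file on the destination) is the suspect, and flag any send whose pipeline exits non-zero. `DEST`, `real_send` and `export_each` are hypothetical names and the destination host is a placeholder. A plain sh pipeline reports only the exit status of `ssh`, so a failing `zfs send` is more likely to show up in its own error output than in the FAIL line.

```sh
#!/bin/sh
DEST=${DEST:-user@backup.example.org}   # placeholder destination, not from the thread

real_send() {
    # $1 = dataset, $2 = file name to create on the destination
    sudo zfs send -vecLep "$1@REPAIR_SNAP" | ssh "$DEST" "cat - > '$2'"
}

export_each() {
    # $1 = function that sends one dataset; remaining arguments = dataset names
    sender=$1
    shift
    for fs in "$@"; do
        fn=$(printf '%s\n' "$fs" | sed -e 's@/@_@g')
        echo "START $fs -> $fn"      # last START line seen marks the suspect after a panic
        if "$sender" "$fs" "$fn"; then
            echo "OK    $fs"
        else
            echo "FAIL  $fs"
        fi
    done
}

# Real run (not executed here):
#   export_each real_send $(zfs list -H -o name)
```

Passing the sender as an argument allows a dry run with a stand-in function before touching the pool.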

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman



On 02/24/2022 8:07 pm, Larry Rosenman wrote:


On 02/24/2022 1:27 pm, Larry Rosenman wrote:

On 02/24/2022 10:48 am, Rob Wing wrote:

even with those set, I still get the panic. :(

Let me see if I can compile a 14 non-INVARIANTS kernel on the 13-REL 
system.


UGH.


I chroot'd to the pool, and built a no invariants kernel.  It booted and 
seems(!) to be running.


Are there any diagnostics for, or a way of clearing, the crappy ZIL?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

I tried a scrub -- it panic'd on a fatal double fault.

Suggestions?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman



On 02/24/2022 1:27 pm, Larry Rosenman wrote:


On 02/24/2022 10:48 am, Rob Wing wrote:


even with those set, I still get the panic. :(


Let me see if I can compile a 14 non-INVARIANTS kernel on the 13-REL 
system.


UGH.


I chroot'd to the pool, and built a no invariants kernel.  It booted and 
seems(!) to be running.


Are there any diagnostics for, or a way of clearing, the crappy ZIL?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman



On 02/24/2022 10:48 am, Rob Wing wrote:


Yes, I believe so.

On Thu, Feb 24, 2022 at 7:42 AM Larry Rosenman  wrote:

On 02/24/2022 10:36 am, Rob Wing wrote:

You might try setting `sysctl vfs.zfs.recover=1` and `sysctl 
vfs.zfs.spa.load_verify_metadata=0`.


I had a similar error the other day (couple months ago). The best I did 
was being able to import the pool read only. I ended up restoring from 
backup.


Are those tunables that I can set in loader.conf?
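
If they are set at boot, the entries would take the usual loader.conf `name="value"` form. The helper below is only a sketch: `set_loader_tunable` is a hypothetical name, and whether these two names are honoured as loader tunables on a given release is exactly what the thread leaves at "I believe so". The file path is an argument so the function can be tried on a scratch copy before touching /boot/loader.conf.

```sh
#!/bin/sh
# Append name="value" to a loader.conf-style file unless the name is already set.
# set_loader_tunable is an illustrative helper, not an existing utility.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    # The dots in $name act as regex wildcards here, which is loose but harmless.
    if grep -q "^${name}=" "$file" 2>/dev/null; then
        echo "already set: $name"
    else
        printf '%s="%s"\n' "$name" "$value" >> "$file"
    fi
}

# As root, for the two settings suggested above (not executed here):
#   set_loader_tunable /boot/loader.conf vfs.zfs.recover 1
#   set_loader_tunable /boot/loader.conf vfs.zfs.spa.load_verify_metadata 0
```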

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

even with those set, I still get the panic. :(

Let me see if I can compile a 14 non-INVARIANTS kernel on the 13-REL 
system.


UGH.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman



On 02/24/2022 10:36 am, Rob Wing wrote:

You might try setting `sysctl vfs.zfs.recover=1` and `sysctl 
vfs.zfs.spa.load_verify_metadata=0`.


I had a similar error the other day (couple months ago). The best I did 
was being able to import the pool read only. I ended up restoring from 
backup.






Are those tunables that I can set in loader.conf?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman

On 02/24/2022 10:29 am, Alexander Motin wrote:

On 24.02.2022 10:57, Larry Rosenman wrote:

On 02/23/2022 9:27 pm, Larry Rosenman wrote:

It crashes just after root mount (this is the boot pool and only pool
on the system), see:
https://www.lerctr.org/~ler/14-BOOT-Crash.png


Where do I go from here?


I see 2 ways: 1) Since it is only an assertion and 13 is working (so
far), you may just build 14 kernel without INVARIANTS option and later
recreate the pool when you have time.  2) You may treat it as metadata
corruption: import pool read-only and evacuate the data.  If you have
recent enough snapshots you may be able to easily replicate the pool
with all the settings to some other disk.  ZIL is not replicated, so
corruptions there should not be a problem.  If there are no snapshots,
then either copy on file level, or you may be able to create snapshot
for replication in 13 (on 14 without INVARIANTS), importing pool
read-write.
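
The read-only route (option 2) might look roughly like the sketch below when run from the 13 memstick. `POOL`, `ALTROOT` and `RUN` are placeholders introduced for this illustration; the pool name is not given in the thread. With `RUN` left at its default the commands are only printed, so nothing touches the pool until `RUN` is set to an empty string.

```sh
#!/bin/sh
# POOL and ALTROOT are placeholders, not values taken from the thread.
POOL=${POOL:-zroot}
ALTROOT=${ALTROOT:-/mnt}

# RUN defaults to "echo" (dry run: print the command); set RUN= to execute.
RUN=${RUN-echo}

import_readonly() {
    # Import without allowing writes, mounting file systems below $ALTROOT.
    $RUN zpool import -o readonly=on -R "$ALTROOT" "$POOL"
}

export_pool() {
    # Cleanly export again once the data has been copied off.
    $RUN zpool export "$POOL"
}

import_readonly
export_pool
```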


Ugh.  The box is a 6 disk R710, and all 6 disks are in the pool.

I do have a FreeNAS box with enough space to copy the data out.  There 
ARE snaps of MOST filesystems that are taken regularly.


The 13 I'm booting from is the 13 memstick image.

There are ~70 filesystems (IIRC) with poudriere, ports, et al.

I'm not sure how to build the 14 kernel from the 13 booted box.

Ideas?  Methods?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-24 Thread Larry Rosenman

On 02/23/2022 9:27 pm, Larry Rosenman wrote:

On 02/23/2022 9:15 pm, Alexander Motin wrote:

On 23.02.2022 22:01, Larry Rosenman wrote:

On 02/23/2022 8:58 pm, Alexander Motin wrote:

On 23.02.2022 21:52, Larry Rosenman wrote:

On 02/23/2022 8:41 pm, Alexander Motin wrote:

Hi Larry,

The panic you are getting is an assertion, enabled by kernel built
with INVARIANTS option.  On 13 you may just not have that debugging
enabled to hit the issue.  But that may be only a consequence.
Original problem I guess in possibly corrupted ZFS intent log records
(or false positive), that could happen so due to use of -F recovery
option on `zpool import`, that supposed to try import pool at earlier
transaction group if there is some metadata corruption found.  It is
not supposed to work 100% and only a last resort.  Though may be that
assertion is just excessively strict for that specific recovery case.
If as you say pool can be imported and scrubbed on 13, then I'd expect
following clean export should allow later import on 14 without -F.

On 23.02.2022 21:21, Larry Rosenman wrote:


I've got my main dev box that crashes on 14 with the screen shot at
https://www.lerctr.org/~ler/14-zfs-crash.png.

Booting from a 13-REL USB installer it imports and scrubs.

Ideas?

I can either video conference with shared screen or give access 
to the console via my Dominion KVM.


Any help/ideas/etc welcome I really need to get this box back.


How can I import the pool withOUT it mounting the FileSystems so I 
can export it cleanly on the 13 system?


Why do you need to import without mounting file systems?  I think you
may actually wish them to be mounted to replay their ZILs.  Just use
-R option to mount file systems in some different place.


I get the errors shown at:
https://www.lerctr.org/~ler/14-mount-R-output.png

Should I worry?  Or do something(tm) here?


This looks weird, but may possibly depend on mount points topology,
whether /mnt is writable, etc.  What happen if you export it now and
try to import it in normal way on 14 without -F?


It crashes just after root mount (this is the boot pool and only pool
on the system), see:
https://www.lerctr.org/~ler/14-BOOT-Crash.png


Where do I go from here?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-23 Thread Larry Rosenman

On 02/23/2022 9:15 pm, Alexander Motin wrote:

On 23.02.2022 22:01, Larry Rosenman wrote:

On 02/23/2022 8:58 pm, Alexander Motin wrote:

On 23.02.2022 21:52, Larry Rosenman wrote:

On 02/23/2022 8:41 pm, Alexander Motin wrote:

Hi Larry,

The panic you are getting is an assertion, enabled by kernel built
with INVARIANTS option.  On 13 you may just not have that debugging
enabled to hit the issue.  But that may be only a consequence.
Original problem I guess in possibly corrupted ZFS intent log records
(or false positive), that could happen so due to use of -F recovery
option on `zpool import`, that supposed to try import pool at earlier
transaction group if there is some metadata corruption found.  It is
not supposed to work 100% and only a last resort.  Though may be that
assertion is just excessively strict for that specific recovery case.
If as you say pool can be imported and scrubbed on 13, then I'd expect
following clean export should allow later import on 14 without -F.

On 23.02.2022 21:21, Larry Rosenman wrote:


I've got my main dev box that crashes on 14 with the screen shot at
https://www.lerctr.org/~ler/14-zfs-crash.png.

Booting from a 13-REL USB installer it imports and scrubs.

Ideas?

I can either video conference with shared screen or give access to 
the console via my Dominion KVM.


Any help/ideas/etc welcome I really need to get this box back.


How can I import the pool withOUT it mounting the FileSystems so I 
can export it cleanly on the 13 system?


Why do you need to import without mounting file systems?  I think you
may actually wish them to be mounted to replay their ZILs.  Just use
-R option to mount file systems in some different place.


I get the errors shown at:
https://www.lerctr.org/~ler/14-mount-R-output.png

Should I worry?  Or do something(tm) here?


This looks weird, but may possibly depend on mount points topology,
whether /mnt is writable, etc.  What happen if you export it now and
try to import it in normal way on 14 without -F?


It crashes just after root mount (this is the boot pool and only pool
on the system), see:
https://www.lerctr.org/~ler/14-BOOT-Crash.png


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-23 Thread Larry Rosenman

On 02/23/2022 8:58 pm, Alexander Motin wrote:

On 23.02.2022 21:52, Larry Rosenman wrote:

On 02/23/2022 8:41 pm, Alexander Motin wrote:

Hi Larry,

The panic you are getting is an assertion, enabled by kernel built
with INVARIANTS option.  On 13 you may just not have that debugging
enabled to hit the issue.  But that may be only a consequence.
Original problem I guess in possibly corrupted ZFS intent log records
(or false positive), that could happen so due to use of -F recovery
option on `zpool import`, that supposed to try import pool at earlier
transaction group if there is some metadata corruption found.  It is
not supposed to work 100% and only a last resort.  Though may be that
assertion is just excessively strict for that specific recovery case.
If as you say pool can be imported and scrubbed on 13, then I'd expect
following clean export should allow later import on 14 without -F.

On 23.02.2022 21:21, Larry Rosenman wrote:


I've got my main dev box that crashes on 14 with the screen shot at
https://www.lerctr.org/~ler/14-zfs-crash.png.

Booting from a 13-REL USB installer it imports and scrubs.

Ideas?

I can either video conference with shared screen or give access to 
the console via my Dominion KVM.


Any help/ideas/etc welcome I really need to get this box back.


How can I import the pool withOUT it mounting the FileSystems so I can 
export it cleanly on the 13 system?


Why do you need to import without mounting file systems?  I think you
may actually wish them to be mounted to replay their ZILs.  Just use
-R option to mount file systems in some different place.


I get the errors shown at:
https://www.lerctr.org/~ler/14-mount-R-output.png

Should I worry?  Or do something(tm) here?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: ZFS PANIC: HELP.

2022-02-23 Thread Larry Rosenman

On 02/23/2022 8:41 pm, Alexander Motin wrote:

Hi Larry,

The panic you are getting is an assertion, enabled by kernel built
with INVARIANTS option.  On 13 you may just not have that debugging
enabled to hit the issue.  But that may be only a consequence.
Original problem I guess in possibly corrupted ZFS intent log records
(or false positive), that could happen so due to use of -F recovery
option on `zpool import`, that supposed to try import pool at earlier
transaction group if there is some metadata corruption found.  It is
not supposed to work 100% and only a last resort.  Though may be that
assertion is just excessively strict for that specific recovery case.
If as you say pool can be imported and scrubbed on 13, then I'd expect
following clean export should allow later import on 14 without -F.

On 23.02.2022 21:21, Larry Rosenman wrote:


I've got my main dev box that crashes on 14 with the screen shot at
https://www.lerctr.org/~ler/14-zfs-crash.png.

Booting from a 13-REL USB installer it imports and scrubs.

Ideas?

I can either video conference with shared screen or give access to the 
console via my Dominion KVM.


Any help/ideas/etc welcome I really need to get this box back.


How can I import the pool withOUT it mounting the FileSystems so I can 
export it cleanly on the 13 system?



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



ZFS PANIC: HELP.

2022-02-23 Thread Larry Rosenman



I've got my main dev box that crashes on 14 with the screen shot at
https://www.lerctr.org/~ler/14-zfs-crash.png.

Booting from a 13-REL USB installer it imports and scrubs.

Ideas?

I can either video conference with shared screen or give access to the 
console via my Dominion KVM.


Any help/ideas/etc welcome I really need to get this box back.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Panic, CURRENT, yesterday

2022-02-19 Thread Larry Rosenman

On 02/09/2022 10:08 pm, Larry Rosenman wrote:


Another one today:
❯ more /var/crash/core.txt.1
borg.lerctr.org dumped core - see /var/crash/vmcore.1

Wed Feb  9 19:30:43 CST 2022





core is available, and I can give access and/or send the core and
kernel/debug stuff.


True for this one too.


Yet another one:
❯ more core.txt.3
borg.lerctr.org dumped core - see /var/crash/vmcore.3

Sat Feb 19 00:42:59 CST 2022

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #56 
ler/freebsd-main-changes-n253181-c140933ef40: Tue Feb 15 12:26:23 CST 
2022 
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  amd64


panic: ng_snd_item: 42 != 173

GNU gdb (GDB) 11.2 [GDB v11.2 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: ng_snd_item: 42 != 173
cpuid = 0
time = 1645251876
KDB: stack backtrace:
#0 0x80516005 at kdb_backtrace+0x65
#1 0x804cba7f at vpanic+0x17f
#2 0x804cb853 at panic+0x43
#3 0x82c755b7 at ng_snd_item+0x587
#4 0x82c8e263 at ng_ether_output+0xb3
#5 0x805e0e2d at ether_output+0x6cd
#6 0x805f6461 at arpintr+0xd71
#7 0x805e5797 at netisr_dispatch_src+0x97
#8 0x805e112e at ether_demux+0x14e
#9 0x82c8e89c at ng_ether_rcv_upper+0x12c
#10 0x82c75dab at ng_apply_item+0x7eb
#11 0x82c7538d at ng_snd_item+0x35d
#12 0x82c75dab at ng_apply_item+0x7eb
#13 0x82c7538d at ng_snd_item+0x35d
#14 0x82c8e33f at ng_ether_input+0x9f
#15 0x805e23e7 at ether_nh_input+0x217
#16 0x805e5797 at netisr_dispatch_src+0x97
#17 0x805e159d at ether_input+0x5d
Uptime: 2d6h42m17s
Dumping 29172 out of 131023 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=)
at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804cb68f in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804cbaee in vpanic (fmt=0x82c7ed98 "%s: %d != %d",
ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804cb853 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x82c755b7 in ng_snd_item (item=0xf8131de0bd80, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2256
#6  0x82c8e263 in ng_ether_output (ifp=,
ifp@entry=,
mp=0xfe025a044868,
mp@entry=)
at /usr/src/sys/netgraph/ng_ether.c:294
#7  0x805e0e2d in ether_output (ifp=0xf8010cfe0800,
m=0xf81d2e92b000, dst=, ro=)
at /usr/src/sys/net/if_ethersubr.c:427
#8  0x805f6461 in in_arpinput (m=0xf81d2e92b000)
at /usr/src/sys/netinet/if_ether.c:1129
#9  arpintr (m=0xf81d2e92b000,
m@entry=)
at /usr/src/sys/netinet/if_ether.c:739
#10 0x805e5797 in netisr_dispatch_src (proto=4,
source=source@entry=0, m=0xf81d2e92b000)
at /usr/src/sys/net/netisr.c:1153
#11 0x805e5aef in netisr_dispatch (proto=,
m=) at /usr/src/sys/net/netisr.c:1244
#12 0x805e112e in ether_demux (ifp=ifp@entry=0xf8010cfe0800,
m=, m@entry=0xf81d2e92b000)
at /usr/src/sys/net/if_ethersubr.c:926
#13 0x82c8e89c in ng_ether_rcv_upper (hook=,
hook@entry=,
item=0xf8131de0bd80,
item@entry=)
at /usr/src/sys/netgraph/ng_ether.c:742
#14 0x82c75dab in ng_apply_item (node=node@entry=0xf81365630b00,
item=item@entry=0xf8131de0bd80, rw=0)
at /usr/src/sys/netgraph/ng_base.c:2406
#15 0x82c7538d in ng_snd_item (item=0xf8131de0bd80,
item@entry=, flags=0,
flags@entry=)
at /usr/src/sys/netgraph/ng_base.c:2323
#16 0x82c75dab in ng_apply_item (node=node@entry=0xf813660f8500,
item=item@entry=0xf8131de0bd80, rw=0)
at /usr/src/sys/netgraph/ng_base.c:2406
#17 0x82c7538d in ng_snd_item (item=item@entry=0xf8131de0bd80,

Re: Panic, CURRENT, yesterday

2022-02-09 Thread Larry Rosenman

On 02/08/2022 1:51 pm, Larry Rosenman wrote:

I got the following last night while doing a poudriere run as well as
a full bacula backup:

borg.lerctr.org dumped core - see /var/crash/vmcore.0

Mon Feb  7 23:05:48 CST 2022

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #54
ler/freebsd-main-changes-n252969-5e5fd0c788c: Sat Feb  5 14:48:30 CST
2022
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
amd64

panic: ng_snd_item: 42 != 290


Another one today:
❯ more /var/crash/core.txt.1
borg.lerctr.org dumped core - see /var/crash/vmcore.1

Wed Feb  9 19:30:43 CST 2022

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #54 
ler/freebsd-main-changes-n252969-5e5fd0c788c: Sat Feb  5 14:48:30 CST 
2022 
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  amd64


panic: ng_snd_item: 42 != 1414

GNU gdb (GDB) 11.2 [GDB v11.2 for FreeBSD]
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: ng_snd_item: 42 != 1414
cpuid = 0
time = 1644455454
KDB: stack backtrace:
#0 0x80515fc5 at kdb_backtrace+0x65
#1 0x804cbaef at vpanic+0x17f
#2 0x804cb8c3 at panic+0x43
#3 0x82c765b7 at ng_snd_item+0x587
#4 0x82c8f263 at ng_ether_output+0xb3
#5 0x805e0c1d at ether_output+0x6cd
#6 0x805f6251 at arpintr+0xd71
#7 0x805e5587 at netisr_dispatch_src+0x97
#8 0x805e0f1e at ether_demux+0x14e
#9 0x82c8f89c at ng_ether_rcv_upper+0x12c
#10 0x82c76dab at ng_apply_item+0x7eb
#11 0x82c7638d at ng_snd_item+0x35d
#12 0x82c76dab at ng_apply_item+0x7eb
#13 0x82c7638d at ng_snd_item+0x35d
#14 0x82c8f33f at ng_ether_input+0x9f
#15 0x805e21d7 at ether_nh_input+0x217
#16 0x805e5587 at netisr_dispatch_src+0x97
#17 0x805e138d at ether_input+0x5d
Uptime: 1d20h10m31s
Dumping 28528 out of 131023 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=)
at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804cb6ff in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804cbb5e in vpanic (fmt=0x82c7fd98 "%s: %d != %d",
ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804cb8c3 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x82c765b7 in ng_snd_item (item=0xf8132d74f880, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2256
#6  0x82c8f263 in ng_ether_output (ifp=,
ifp@entry=,
mp=0xfe02ba63f868,
mp@entry=)
at /usr/src/sys/netgraph/ng_ether.c:294
#7  0x805e0c1d in ether_output (ifp=0xf80114a43000,
m=0xf81f8203e600, dst=, ro=)
at /usr/src/sys/net/if_ethersubr.c:427
#8  0x805f6251 in in_arpinput (m=0xf81f8203e600)
at /usr/src/sys/netinet/if_ether.c:1129
#9  arpintr (m=0xf81f8203e600,
m@entry=)
at /usr/src/sys/netinet/if_ether.c:739
#10 0x805e5587 in netisr_dispatch_src (proto=4,
source=source@entry=0, m=0xf81f8203e600)
at /usr/src/sys/net/netisr.c:1153
#11 0x805e58df in netisr_dispatch (proto=,
m=) at /usr/src/sys/net/netisr.c:1244
#12 0x805e0f1e in ether_demux (ifp=ifp@entry=0xf80114a43000,
m=, m@entry=0xf81f8203e600)
at /usr/src/sys/net/if_ethersubr.c:926
#13 0x82c8f89c in ng_ether_rcv_upper (hook=,
hook@entry=,
item=0xf8132d74f880,
item@entry=)
at /usr/src/sys/netgraph/ng_ether.c:742
#14 0x82c76dab in ng_apply_item (node=node@entry=0xf812992fe600,
item=item@entry=0xf8132d74f880, rw=0)
at /usr/src/sys/netgraph/ng_base.c:2406
#15 0x82c7638d in ng_snd_item (item=0xf8132d74f880,
item@entry=, flags=0,
flags@entry=)
at /usr/src/sys/netgraph/ng_base.c:2323
#16 0x82c76dab in ng_apply_item (node=node@entry=0xfff

Panic, CURRENT, yesterday

2022-02-08 Thread Larry Rosenman
ther_input_internal (ifp=0xf8010dc57000,
m=0xf81736a3f600) at /usr/src/sys/net/if_ethersubr.c:661
#20 ether_nh_input (m=,
m@entry=)
at /usr/src/sys/net/if_ethersubr.c:742
#21 0x805e5587 in netisr_dispatch_src (proto=proto@entry=5,
source=source@entry=0, m=m@entry=0xf81736a3f600)
at /usr/src/sys/net/netisr.c:1153
#22 0x805e58df in netisr_dispatch (proto=,
proto@entry=5, m=, m@entry=0xf81736a3f600)
at /usr/src/sys/net/netisr.c:1244
#23 0x805e138d in ether_input (ifp=0xf8010dc57000,
m=0xf81736a3f600) at /usr/src/sys/net/if_ethersubr.c:833
#24 0x821a934d in bce_rx_intr (sc=0xfe02a141c000)
at /usr/src/sys/dev/bce/if_bce.c:6721
#25 bce_intr (xsc=) at /usr/src/sys/dev/bce/if_bce.c:7870
#26 0x80490929 in intr_event_execute_handlers (ie=0xf8010dac6900,
p=) at /usr/src/sys/kern/kern_intr.c:1205
#27 ithread_execute_handlers (ie=0xf8010dac6900, p=)
at /usr/src/sys/kern/kern_intr.c:1218
#28 ithread_loop (arg=, arg@entry=0xf8015d3683a0)
at /usr/src/sys/kern/kern_intr.c:1306
#29 0x8048d3a0 in fork_exit (
callout=0x804906d0 , arg=0xf8015d3683a0,
frame=0xfe025a5b2f40) at /usr/src/sys/kern/kern_fork.c:1102
#30 
(kgdb)

core is available, and I can give access and/or send the core and 
kernel/debug stuff.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: My -CURRENT crashes....

2021-12-27 Thread Larry Rosenman
On Mon, Dec 27, 2021 at 09:15:53PM +0200, Konstantin Belousov wrote:
> On Mon, Dec 27, 2021 at 10:58:02AM -0800, Gleb Smirnoff wrote:
> > On Mon, Dec 27, 2021 at 01:43:01PM -0500, Alexander Motin wrote:
> > A> > This allows us to deduct that the callout belongs to proc subsystem and
> > A> > we can retrieve the proc it points to: c_lock - 0x128 = 
> > 0xf8030521e548
> > A> > It is ccache in PRS_NORMAL state. And the "tmp" in our stack frame is 
> > its
> > A> > p_itcallout.
> > A> > 
> > A> > So there is something that would zero out most of the p_itcallout while
> > A> > it is scheduled?
> > A> 
> > A> So carefully zero it, but keep the lock pointer...  The only way that
> > A> comes to mind is callout_init_mtx() in do_fork() if we assume the
> > A> process has completed and the struct proc was reused.  I guess if we
> > A> could somehow leak scheduled callout in exit1().  May be we could add
> > A> some more assertions to try catch callout still being active there.
> > 
> > Note that _callout_stop_safe(p_itcallout) is the only place in kernel where
> > CS_EXECUTING is used.
> 
> I would start asking are there any third-party modules loaded.

Nope.

Id Refs Address       Size Name
 1  239 0x8020      d94b58 kernel
 2    1 0x81441000    f990 ehci.ko
 3   12 0x81451000   3da98 usb.ko
 4    1 0x8148f000  70ae00 zfs.ko
 5    5 0x81b9a000    5338 xdr.ko
 6    1 0x81ba        ccf0 ukbd.ko
 7    7 0x81bad000    5248 hid.ko
 8    1 0x81bb3000    b2c0 uhci.ko
 9    1 0x8203d000    cec8 tmpfs.ko
10    1 0x8204a000    3538 fdescfs.ko
11    2 0x8204e000    3240 procfs.ko
12    3 0x82052000    5778 pseudofs.ko
13    1 0x82058000    9290 aesni.ko
14    1 0x82062000    20f0 coretemp.ko
15    1 0x82065000    3238 filemon.ko
16    1 0x82069000   2dd58 linux.ko
17    4 0x82097000    aea8 linux_common.ko
18    1 0x820a2000    4250 ichsmb.ko
19    2 0x820a7000    2180 smbus.ko
20    1 0x820aa000    4c10 ichwd.ko
21    1 0x820af000    2220 cpuctl.ko
22    1 0x820b2000    4338 cryptodev.ko
23    1 0x820b7000    2238 dtraceall.ko
24    8 0x820ba000    8a60 opensolaris.ko
25    8 0x8220      84a300 dtrace.ko
26    1 0x820c3000    2274 dtmalloc.ko
27    1 0x820c6000    3331 fbt.ko
28    1 0x820ca000   56570 fasttrap.ko
29    1 0x82121000    2258 sdt.ko
30    1 0x82124000    91b4 systrace.ko
31    1 0x8212e000    91b4 systrace_freebsd32.ko
32    1 0x82138000    234c profile.ko
33    1 0x8213b000    8b38 ipmi.ko
34    3 0x82144000    45b0 efirt.ko
35    1 0x82149000    75b0 if_bridge.ko
36    1 0x82151000    50d8 bridgestp.ko
37    1 0x82157000   1662c hwpmc.ko
38    1 0x8216e000   28bb8 tcp_rack.ko
39    1 0x82197000    21b8 mfip.ko
40    2 0x82a4b000   84470 cam.ko
41    1 0x8219a000    7d38 ioat.ko
42    1 0x821a20004        if_bce.ko
43    1 0x82ad17a50        miibus.ko
44    1 0x821eb000    44b0 usb_quirk.ko
45    1 0x821f        b3a8 usb_template.ko
46    1 0x821fc000    3268 ums.ko
47    1 0x82ae8000    92d0 xhci.ko
48    1 0x82af2000    6120 ohci.ko
49    1 0x82af9000   43ef8 nfscl.ko
50    3 0x82b3d000   18cf0 nfscommon.ko
51    3 0x82b56000    2168 nfssvc.ko
52    4 0x82b59000   138a0 krpc.ko
53    1 0x82b6d000   4e638 nfsd.ko
54    1 0x82bbc000    bdc0 nfslockd.ko
55    1 0x82bc8000    4168 ataintel.ko
56    2 0x82bcd000    8358 ata.ko
57    1 0x82bd6000    5388 atapci.ko
58    1 0x82bdc000    4d40 geom_label.ko
59    1 0x82be1000   29f58 linux64.ko
60    1 0x82c0b000    2260 pty.ko
61    1 0x82c0e000    639c linprocfs.ko
62    1 0x82c15000    3284 linsysfs.ko
63    1 0x82c19000    3378 acpi_wmi.ko
64    1 0x82c1d000    2280 uhid.ko
65    1 0x82c2        3320 usbhid.ko
66    1 0x82c24000    31f8 hidbus.ko
67    1 0x82c28000    32c0 wmt.ko
68    1 0x82c2c000   41a38 pf.ko
69    1 0x82c6e000    2a08 mac_ntpd.ko
70    5 0x82c71000    fb28 netgraph.ko
71    1 0x82c81000    63f8 ng_netflow.ko
72    1 0x82c88000    41e8 ng_ksocket.ko
73    1 0x82c8d000    3180 ng_ether.ko
74    1 0x82c91000    3918 ng_socket.ko
75    1 0x82c95000    4708 nullfs.ko
-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-17 Thread Larry Rosenman

On 12/17/2021 1:36 pm, Mark Johnston wrote:

On Fri, Dec 10, 2021 at 10:43:19AM -0600, Larry Rosenman wrote:
14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57


If that's any help


I can't tell what this is saying.  A kernel built on the 7th does not
crash, or...?  Which revision did you update from before you started
seeing crashes?

From a kgdb session it'd be useful to see output from

(kgdb) frame 8
(kgdb) p/x *tmp

to start.
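
A rough sketch of collecting that output non-interactively.  The paths are the
ones quoted elsewhere in this thread; the `kgdb_cmds` helper and the `DRY_RUN`
switch are illustrative additions, and feeding the commands to kgdb on stdin is
an assumption about how it behaves when stdin is not a terminal, not something
confirmed here.

```shell
#!/bin/sh
# Sketch only; paths are the ones quoted in this thread and may differ elsewhere.
VMCORE="${VMCORE:-/var/crash/vmcore.0}"
KERNEL="${KERNEL:-/boot/kernel/kernel}"
# DRY_RUN=1 (the default here) only prints what would be done.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: emits the two commands requested above, one per line.
kgdb_cmds() {
    printf '%s\n' 'frame 8' 'p/x *tmp'
}

if [ "$DRY_RUN" = "1" ]; then
    echo "would run: kgdb -c $VMCORE $KERNEL, then type:"
    kgdb_cmds
else
    # Assumption: kgdb reads its commands from stdin when it is not a terminal.
    kgdb_cmds | kgdb -c "$VMCORE" "$KERNEL"
fi
```

Two things this does not solve, both visible later in the thread: the vmcore
has to be readable by the user running kgdb, and the kernel file needs matching
debug symbols for `frame 8` to resolve.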



Correct, the 7th didn't panic, but the 9th did, and yesterday's too.

Grrr
ler in borg in /mnt on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
Failed to open vmcore: /var/crash/vmcore.0: Permission denied
(kgdb) bt
No stack.
(kgdb) quit

ler in borg in /mnt on ☁️  (us-east-1) took 6s
❯ sudo chmod +r /var/crash/*

ler in borg in /mnt on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr != 
NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr != 
NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Command aborted.
(kgdb) bt
No thread selected.
(kgdb) fr 8
No thread selected.
(kgdb)


On 12/10/2021 10:36 am, Alexander Motin wrote:
> Hi Larry,
>
> This looks like some use-after-free or otherwise corrupted callout
> structure.  Unfortunately the backtrace does not tell what was the
> callout.  When was the previous update to look what could change?
>
> On 10.12.2021 11:24, Larry Rosenman wrote:
>> FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
>> main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
>> r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL 
>> amd64
>>
>> VMCORE *IS* available.
>>
>>
>>
>>
>> Unread portion of the kernel message buffer:
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 20
>> fault virtual address   = 0x0
>> fault code  = supervisor write data, page not present
>> instruction pointer = 0x20:0x804e0db4
>> stack pointer   = 0x0:0xfe0434de4e10
>> frame pointer   = 0x0:0xfe0434de4e70
>> code segment    = base 0x0, limit 0xf, type 0x1b
>>     = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags    = resume, IOPL = 0
>> current process = 82990 (c++)

Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-16 Thread Larry Rosenman

On 12/16/2021 9:03 pm, Larry Rosenman wrote:

On 12/10/2021 10:43 am, Larry Rosenman wrote:
14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57


If that's any help

On 12/10/2021 10:36 am, Alexander Motin wrote:

Hi Larry,

This looks like some use-after-free or otherwise corrupted callout
structure.  Unfortunately the backtrace does not tell what was the
callout.  When was the previous update to look what could change?

On 10.12.2021 11:24, Larry Rosenman wrote:

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  
amd64


VMCORE *IS* available.




Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual address   = 0x0
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x804e0db4
stack pointer   = 0x0:0xfe0434de4e10
frame pointer   = 0x0:0xfe0434de4e70
code segment    = base 0x0, limit 0xf, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = resume, IOPL = 0
current process = 82990 (c++)
trap number = 12
panic: page fault
cpuid = 0
time = 163998
KDB: stack backtrace:
#0 0x8050fc95 at kdb_backtrace+0x65
#1 0x804c468f at vpanic+0x17f
#2 0x804c4503 at panic+0x43
#3 0x807a2195 at trap_fatal+0x385
#4 0x807a21ef at trap_pfault+0x4f
#5 0x80779c78 at calltrap+0x8
#6 0x8045ddb8 at handleevents+0x188
#7 0x8045ea3e at timercb+0x24e
#8 0x807ca9eb at lapic_handle_timer+0x9b
#9 0x8077b9b1 at Xtimerint+0xb1
Uptime: 2h28m57s
Dumping 12829 out of 131023
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at 
/usr/src/sys/amd64/include/pcpu_aux.h:55

#1  doadump (textdump=)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c428c in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804c46fe in vpanic (fmt=0x807e1276 "%s",
    ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804c4503 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x807a2195 in trap_fatal (frame=0xfe0434de4d50, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:946
#6  0x807a21ef in trap_pfault (frame=0xfe0434de4d50,
    usermode=false, signo=, ucode=)
    at /usr/src/sys/amd64/amd64/trap.c:765
#7
#8  0x804e0db4 in callout_process (now=now@entry=38385536922300)
    at /usr/src/sys/kern/kern_timeout.c:488
#9  0x8045ddb8 in handleevents (now=now@entry=38385536922300,
    fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0x8045ea3e in timercb (et=0x80d475e0 ,
    arg=) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0x807ca9eb in lapic_handle_timer (frame=0xfe0434de4f40)
    at /usr/src/sys/x86/x86/local_apic.c:1364
#12 
#13 0x00080df42bb6 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7def2c90
(kgdb)





I got a new crash on today's CURRENT:
❯ more core.txt.1
borg.lerctr.org dumped core - see /var/crash/vmcore.1

Thu Dec 16 17:01:37 CST 2021

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #22
main-n251748-c610426c4de: Thu Dec 16 09:22:52 CST 2021
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
amd64

panic: page fault

GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in ker

Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-16 Thread Larry Rosenman

On 12/10/2021 10:43 am, Larry Rosenman wrote:

14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57

If that's any help

On 12/10/2021 10:36 am, Alexander Motin wrote:

Hi Larry,

This looks like some use-after-free or otherwise corrupted callout
structure.  Unfortunately the backtrace does not tell what was the
callout.  When was the previous update to look what could change?

On 10.12.2021 11:24, Larry Rosenman wrote:

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  
amd64


VMCORE *IS* available.




Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual address   = 0x0
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x804e0db4
stack pointer   = 0x0:0xfe0434de4e10
frame pointer   = 0x0:0xfe0434de4e70
code segment    = base 0x0, limit 0xf, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = resume, IOPL = 0
current process = 82990 (c++)
trap number = 12
panic: page fault
cpuid = 0
time = 163998
KDB: stack backtrace:
#0 0x8050fc95 at kdb_backtrace+0x65
#1 0x804c468f at vpanic+0x17f
#2 0x804c4503 at panic+0x43
#3 0x807a2195 at trap_fatal+0x385
#4 0x807a21ef at trap_pfault+0x4f
#5 0x80779c78 at calltrap+0x8
#6 0x8045ddb8 at handleevents+0x188
#7 0x8045ea3e at timercb+0x24e
#8 0x807ca9eb at lapic_handle_timer+0x9b
#9 0x8077b9b1 at Xtimerint+0xb1
Uptime: 2h28m57s
Dumping 12829 out of 131023
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c428c in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804c46fe in vpanic (fmt=0x807e1276 "%s",
    ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804c4503 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x807a2195 in trap_fatal (frame=0xfe0434de4d50, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:946
#6  0x807a21ef in trap_pfault (frame=0xfe0434de4d50,
    usermode=false, signo=, ucode=)
    at /usr/src/sys/amd64/amd64/trap.c:765
#7
#8  0x804e0db4 in callout_process (now=now@entry=38385536922300)
    at /usr/src/sys/kern/kern_timeout.c:488
#9  0x8045ddb8 in handleevents (now=now@entry=38385536922300,
    fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0x8045ea3e in timercb (et=0x80d475e0 ,
    arg=) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0x807ca9eb in lapic_handle_timer (frame=0xfe0434de4f40)
    at /usr/src/sys/x86/x86/local_apic.c:1364
#12 
#13 0x00080df42bb6 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7def2c90
(kgdb)





I got a new crash on today's CURRENT:
❯ more core.txt.1
borg.lerctr.org dumped core - see /var/crash/vmcore.1

Thu Dec 16 17:01:37 CST 2021

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #22 
main-n251748-c610426c4de: Thu Dec 16 09:22:52 CST 2021 
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  amd64


panic: page fault

GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual add

Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-10 Thread Larry Rosenman

14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57

If that's any help

On 12/10/2021 10:36 am, Alexander Motin wrote:

Hi Larry,

This looks like some use-after-free or otherwise corrupted callout
structure.  Unfortunately the backtrace does not tell what was the
callout.  When was the previous update to look what could change?

On 10.12.2021 11:24, Larry Rosenman wrote:

FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  
amd64


VMCORE *IS* available.




Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual address   = 0x0
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x804e0db4
stack pointer   = 0x0:0xfe0434de4e10
frame pointer   = 0x0:0xfe0434de4e70
code segment    = base 0x0, limit 0xf, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = resume, IOPL = 0
current process = 82990 (c++)
trap number = 12
panic: page fault
cpuid = 0
time = 163998
KDB: stack backtrace:
#0 0x8050fc95 at kdb_backtrace+0x65
#1 0x804c468f at vpanic+0x17f
#2 0x804c4503 at panic+0x43
#3 0x807a2195 at trap_fatal+0x385
#4 0x807a21ef at trap_pfault+0x4f
#5 0x80779c78 at calltrap+0x8
#6 0x8045ddb8 at handleevents+0x188
#7 0x8045ea3e at timercb+0x24e
#8 0x807ca9eb at lapic_handle_timer+0x9b
#9 0x8077b9b1 at Xtimerint+0xb1
Uptime: 2h28m57s
Dumping 12829 out of 131023
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c428c in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804c46fe in vpanic (fmt=0x807e1276 "%s",
    ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804c4503 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x807a2195 in trap_fatal (frame=0xfe0434de4d50, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:946
#6  0x807a21ef in trap_pfault (frame=0xfe0434de4d50,
    usermode=false, signo=, ucode=)
    at /usr/src/sys/amd64/amd64/trap.c:765
#7  
#8  0x804e0db4 in callout_process (now=now@entry=38385536922300)
    at /usr/src/sys/kern/kern_timeout.c:488
#9  0x8045ddb8 in handleevents (now=now@entry=38385536922300,
    fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0x8045ea3e in timercb (et=0x80d475e0 ,
    arg=) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0x807ca9eb in lapic_handle_timer (frame=0xfe0434de4f40)
    at /usr/src/sys/x86/x86/local_apic.c:1364
#12 
#13 0x00080df42bb6 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7def2c90
(kgdb)

--------



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-10 Thread Larry Rosenman
FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15 
main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021 
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL  amd64


VMCORE *IS* available.




Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual address   = 0x0
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x804e0db4
stack pointer   = 0x0:0xfe0434de4e10
frame pointer   = 0x0:0xfe0434de4e70
code segment        = base 0x0, limit 0xf, type 0x1b
                    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = resume, IOPL = 0
current process = 82990 (c++)
trap number = 12
panic: page fault
cpuid = 0
time = 163998
KDB: stack backtrace:
#0 0x8050fc95 at kdb_backtrace+0x65
#1 0x804c468f at vpanic+0x17f
#2 0x804c4503 at panic+0x43
#3 0x807a2195 at trap_fatal+0x385
#4 0x807a21ef at trap_pfault+0x4f
#5 0x80779c78 at calltrap+0x8
#6 0x8045ddb8 at handleevents+0x188
#7 0x8045ea3e at timercb+0x24e
#8 0x807ca9eb at lapic_handle_timer+0x9b
#9 0x8077b9b1 at Xtimerint+0xb1
Uptime: 2h28m57s
Dumping 12829 out of 131023 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=)
at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c428c in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:487
#3  0x804c46fe in vpanic (fmt=0x807e1276 "%s",
ap=) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0x804c4503 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:844
#5  0x807a2195 in trap_fatal (frame=0xfe0434de4d50, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:946
#6  0x807a21ef in trap_pfault (frame=0xfe0434de4d50,
usermode=false, signo=, ucode=)
at /usr/src/sys/amd64/amd64/trap.c:765
#7  
#8  0x804e0db4 in callout_process (now=now@entry=38385536922300)
at /usr/src/sys/kern/kern_timeout.c:488
#9  0x8045ddb8 in handleevents (now=now@entry=38385536922300,
fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0x8045ea3e in timercb (et=0x80d475e0 ,
arg=) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0x807ca9eb in lapic_handle_timer (frame=0xfe0434de4f40)
at /usr/src/sys/x86/x86/local_apic.c:1364
#12 
#13 0x00080df42bb6 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7def2c90
(kgdb)

----

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: installworld: Certificate Error?

2021-12-07 Thread Larry Rosenman



Looks like I picked it up when I installed this box.

Removed.  Sorry for the noise. :(
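
For anyone who hits the same installworld failure: the file above holds a PUBLIC KEY block, not a certificate, which is why the certificate scan rejected it. A minimal sketch of a pre-flight check follows. It assumes the scan only cares about *.pem files and that a usable file contains a CERTIFICATE or TRUSTED CERTIFICATE block; the function name and the glob are my own, not anything from certctl.

```python
import re
from pathlib import Path

# Matches the opening line of a PEM block and captures its label,
# e.g. "CERTIFICATE" or "PUBLIC KEY".
PEM_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)

# Assumption: these are the labels a certificate directory should contain.
CERT_LABELS = {"CERTIFICATE", "TRUSTED CERTIFICATE"}


def non_certificate_pems(cert_dir):
    """Return sorted names of *.pem files in cert_dir with no certificate block."""
    offenders = []
    for path in sorted(Path(cert_dir).glob("*.pem")):
        labels = set(PEM_BEGIN.findall(path.read_text(errors="replace")))
        if not labels & CERT_LABELS:
            offenders.append(path.name)
    return offenders
```

Running non_certificate_pems("/mnt/usr/local/etc/ssl/certs") before installworld should list stray files such as the ecpubkey.pem above, so they can be moved out of the way first.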

On 12/07/2021 5:05 pm, Larry Rosenman wrote:


I have no clue.

On 12/07/2021 5:04 pm, Kyle Evans wrote:

Where did this ecpubkey.pem come from?

On Tue, Dec 7, 2021, 15:28 Larry Rosenman  wrote:
--

Installing everything completed on Tue Dec  7 15:23:33 CST 2021

--
68.45 real   262.43 user    95.61 sys
Scanning /mnt/usr/share/certs/untrusted for certificates...
Scanning /mnt/usr/share/certs/trusted for certificates...
Scanning /mnt/usr/local/share/certs for certificates...
Scanning /mnt/usr/local/etc/ssl/certs for certificates...
unable to load certificate
67912877395968:error:0909006C:PEM routines:get_name:no start line:/usr/src/crypto/openssl/crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
Error: /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
*** [installworld] Error code 1

[I] ➜ cat /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE/ZGaXNnGRqI4vEFFlrs3HNfyWjeL
5HcODD2mLHyvI+948pNZ9ngZl/afkZZZOHwcnlChxcBwNsgPFBXf1ZqKIA==
-----END PUBLIC KEY-----

ler in src  at ler-r610 on  main [?]
[I] ➜

Can someone fix this?

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: installworld: Certificate Error?

2021-12-07 Thread Larry Rosenman



I have no clue.

On 12/07/2021 5:04 pm, Kyle Evans wrote:


Where did this ecpubkey.pem come from?

On Tue, Dec 7, 2021, 15:28 Larry Rosenman  wrote:


--

Installing everything completed on Tue Dec  7 15:23:33 CST 2021

--
68.45 real   262.43 user    95.61 sys
Scanning /mnt/usr/share/certs/untrusted for certificates...
Scanning /mnt/usr/share/certs/trusted for certificates...
Scanning /mnt/usr/local/share/certs for certificates...
Scanning /mnt/usr/local/etc/ssl/certs for certificates...
unable to load certificate
67912877395968:error:0909006C:PEM routines:get_name:no start line:/usr/src/crypto/openssl/crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
Error: /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
*** [installworld] Error code 1

[I] ➜ cat /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE/ZGaXNnGRqI4vEFFlrs3HNfyWjeL
5HcODD2mLHyvI+948pNZ9ngZl/afkZZZOHwcnlChxcBwNsgPFBXf1ZqKIA==
-----END PUBLIC KEY-----

ler in src  at ler-r610 on  main [?]
[I] ➜

Can someone fix this?

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



installworld: Certificate Error?

2021-12-07 Thread Larry Rosenman


--

Installing everything completed on Tue Dec  7 15:23:33 CST 2021

--
   68.45 real   262.43 user    95.61 sys
Scanning /mnt/usr/share/certs/untrusted for certificates...
Scanning /mnt/usr/share/certs/trusted for certificates...
Scanning /mnt/usr/local/share/certs for certificates...
Scanning /mnt/usr/local/etc/ssl/certs for certificates...
unable to load certificate
67912877395968:error:0909006C:PEM routines:get_name:no start line:/usr/src/crypto/openssl/crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE

Error: /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
*** [installworld] Error code 1

[I] ➜ cat /mnt/usr/local/etc/ssl/certs/ecpubkey.pem
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE/ZGaXNnGRqI4vEFFlrs3HNfyWjeL
5HcODD2mLHyvI+948pNZ9ngZl/afkZZZOHwcnlChxcBwNsgPFBXf1ZqKIA==
-----END PUBLIC KEY-----

ler in src  at ler-r610 on  main [?]
[I] ➜

Can someone fix this?


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





Re: NFSv4 client: Doesn't see files/protocol error 10020

2021-10-21 Thread Larry Rosenman

On 10/20/2021 1:58 pm, Larry Rosenman wrote:

I have a -CURRENT box that I upgraded yesterday & today, and it no
longer can read NFS mounts from my TrueNAS 12.0-U6 server.

It mounts, but any access garners:
nfsv4 client/server protocol prob err=10020
nfsv4 client/server protocol prob err=10020

the fstab entries:
freenas.lerctr.org:/mnt/data/TBH /vault/backup/TBH nfs rw,nfsv4,minorversion=1 0 0
freenas.lerctr.org:/mnt/data/BACULA /vault/backup/BACULA nfs rw,nfsv4,minorversion=1 0 0

Ideas?
rmacklem@ helped me diagnose this. The issue (apparently) was that my TrueNAS server somehow screwed up the NFSv4 exports. I re-did them from the GUI, and all is back to normal.

Thanks, as always, rmacklem!
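
For the archives, a small sketch for decoding the numeric status in those "protocol prob" console lines. As far as I can tell from the NFSv4 status list in RFC 7530, 10020 is NFS4ERR_NOFILEHANDLE. The table below is a hand-picked subset, and the function name is hypothetical; it only names the code and says nothing about what went wrong on the server side.

```python
import re

# Assumption: a small, hand-picked subset of NFSv4 status codes (RFC 7530).
NFS4_STATUS = {
    10001: "NFS4ERR_BADHANDLE",
    10016: "NFS4ERR_WRONGSEC",
    10020: "NFS4ERR_NOFILEHANDLE",
    10021: "NFS4ERR_MINOR_VERS_MISMATCH",
    10022: "NFS4ERR_STALE_CLIENTID",
}

# Console message format as it appears in this thread.
ERR_LINE = re.compile(r"nfsv4 client/server protocol prob err=(\d+)")


def decode_nfs4_error(line):
    """Return the symbolic name for the err= value in line, or None if absent."""
    match = ERR_LINE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return NFS4_STATUS.get(code, "unknown (%d)" % code)
```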
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



NFSv4 client: Doesn't see files/protocol error 10020

2021-10-20 Thread Larry Rosenman



I have a -CURRENT box that I upgraded yesterday & today, and it no 
longer can read NFS mounts from my TrueNAS 12.0-U6 server.


It mounts, but any access garners:
nfsv4 client/server protocol prob err=10020
nfsv4 client/server protocol prob err=10020

the fstab entries:
freenas.lerctr.org:/mnt/data/TBH /vault/backup/TBH nfs rw,nfsv4,minorversion=1 0 0
freenas.lerctr.org:/mnt/data/BACULA /vault/backup/BACULA nfs rw,nfsv4,minorversion=1 0 0


Ideas?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



Re: poudriere jail with todays current: Install fail?

2021-10-20 Thread Larry Rosenman

On 10/20/2021 7:11 am, Dimitry Andric wrote:

On 20 Oct 2021, at 13:51, Larry Rosenman  wrote:


On 10/20/2021 6:41 am, Dimitry Andric wrote:

On 20 Oct 2021, at 03:50, Larry Rosenman  wrote:
Anyone else having poudriere jail -u or jail -c fail in the installworld?

log:
https://www.lerctr.org/~ler/jail-install.log

The actual error is pretty far from the bottom of that log:
--- realinstall_subdir_usr.sbin/lpr/chkprintcap ---
install: chkprintcap: No such file or directory
So probably usr.sbin/lpr wasn't built during buildworld? Do you have any
special settings in e.g. src.conf?
-Dimitry


I had
WITHOUT_LPR=yes

in make.conf.  But I've had that in there for a LONG time, and this is the
first time poudriere has complained.

So, I commented that out for now, but I'd like to know why the sudden change.


I haven't been able to find how poudriere jail -c passes any src.conf
settings to its installworld phase. It does seem to have a bunch of
stuff that goes through contortions to put a src.conf into the jail
directory, but only during *buildworld*, not during installworld.

It could very well be that this use case was broken due to a recent
poudriere update. I don't see anything in the recent log of -CURRENT
that indicates some sort of flipping of the MK_LPR default, it has been
"yes" for ages now.

Whatever the case may be, for some reason you now run into a common
problem with the disconnect between buildworld and installworld: if you
run these under even slightly different environments, there can be
unexpected consequences... :)
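
To make that disconnect concrete, here is a rough sketch that compares the WITH_*/WITHOUT_* knobs visible to two phases. It assumes you can get at the make.conf/src.conf text each phase actually sees; the function names are made up for illustration and none of this is how poudriere itself works.

```python
import re

# Matches an uncommented knob assignment such as "WITHOUT_LPR=yes".
# Commented-out lines ("#WITHOUT_LPR=yes") are deliberately not matched.
KNOB = re.compile(
    r"^[ \t]*(WITH(?:OUT)?_[A-Z0-9_]+)[ \t]*[?:+]?=[ \t]*(\S*)",
    re.MULTILINE,
)


def knobs(conf_text):
    """Collect WITH_*/WITHOUT_* assignments from make.conf-style text."""
    return dict(KNOB.findall(conf_text))


def knob_mismatch(build_conf, install_conf):
    """Return knob names that are set in one configuration but not the other."""
    build, install = knobs(build_conf), knobs(install_conf)
    return sorted(set(build) ^ set(install))
```

With WITHOUT_LPR=yes seen by buildworld but not by installworld, knob_mismatch() would report WITHOUT_LPR, which matches the chkprintcap failure: the install step expects a binary that was never built.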

-Dimitry


Thanks, Dimitry!


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





Re: poudriere jail with todays current: Install fail?

2021-10-20 Thread Larry Rosenman

On 10/20/2021 6:41 am, Dimitry Andric wrote:

On 20 Oct 2021, at 03:50, Larry Rosenman  wrote:
Anyone else having poudriere jail -u or jail -c fail in the installworld?


log:
https://www.lerctr.org/~ler/jail-install.log


The actual error is pretty far from the bottom of that log:

--- realinstall_subdir_usr.sbin/lpr/chkprintcap ---
install: chkprintcap: No such file or directory

So probably usr.sbin/lpr wasn't built during buildworld? Do you have any
special settings in e.g. src.conf?

-Dimitry


I had
WITHOUT_LPR=yes

in make.conf.  But I've had that in there for a LONG time, and this is the
first time poudriere has complained.

So, I commented that out for now, but I'd like to know why the sudden change.



--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





poudriere jail with todays current: Install fail?

2021-10-19 Thread Larry Rosenman

Anyone else having poudriere jail -u or jail -c fail in the installworld?


log:
https://www.lerctr.org/~ler/jail-install.log



--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





Fwd: [package - head-amd64-default][sysutils/lsof] Failed for lsof-4.93.2_9,8 in build

2020-03-02 Thread Larry Rosenman
/fs/zfs/sys/zfs_znode.h:33:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:48:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h:73:
In file included from /usr/src/sys/cddl/compat/opensolaris/sys/vfs.h:37:
/usr/src/sys/cddl/compat/opensolaris/sys/vnode.h:243:10: warning: implicit declaration of function 'VOP_FSYNC' is invalid in C99 [-Wimplicit-function-declaration]
error = VOP_FSYNC(vp, MNT_WAIT, curthread);
^
1 warning generated.
--- dproc.o ---
cc   -pipe -fstack-protector-strong -fno-strict-aliasing 
-DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT -DHAS_DUP2 
-DHAS_CLOSEFROM -DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT 
-DHAS_TMPFS -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_NO_IDEV -DHAS_VM_MEMATTR_T -DNEEDS_DEVICE_T -DHAS_CDEV2PRIV 
-DHAS_NO_SI_UDEV -DHAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF 
-DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T 
-DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 -DHASFDESCFS=2 
-DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX -DHAS_STRFTIME 
-DLSOF_VSTR=\"13.0-CURRENT\" -I/usr/src/sys -O2 -c dproc.c -o dproc.o

--- lib/liblsof.a ---
--- lkud.o ---
cc   -pipe -fstack-protector-strong -fno-strict-aliasing 
-DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT -DHAS_DUP2 
-DHAS_CLOSEFROM -DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT 
-DHAS_TMPFS -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_NO_IDEV -DHAS_VM_MEMATTR_T -DNEEDS_DEVICE_T -DHAS_CDEV2PRIV 
-DHAS_NO_SI_UDEV -DHAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF 
-DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T 
-DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 -DHASFDESCFS=2 
-DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX -DHAS_STRFTIME 
-DLSOF_VSTR="13.0-CURRENT" -I/usr/src/sys -O2 -c lkud.c -o lkud.o

--- dproc.o ---
dproc.c:350:24: error: no member named 'fd_cdir' in 'struct filedesc'
if (!ckscko && fd.fd_cdir) {
   ~~ ^
dproc.c:353:25: error: no member named 'fd_cdir' in 'struct filedesc'
process_node((KA_T)fd.fd_cdir);
   ~~ ^
dproc.c:360:24: error: no member named 'fd_rdir' in 'struct filedesc'
if (!ckscko && fd.fd_rdir) {
   ~~ ^
dproc.c:363:25: error: no member named 'fd_rdir' in 'struct filedesc'
process_node((KA_T)fd.fd_rdir);
   ~~ ^
dproc.c:372:24: error: no member named 'fd_jdir' in 'struct filedesc'
if (!ckscko && fd.fd_jdir) {
   ~~ ^
dproc.c:375:25: error: no member named 'fd_jdir' in 'struct filedesc'
process_node((KA_T)fd.fd_jdir);
   ~~ ^
6 errors generated.
*** [dproc.o] Error code 1

make[1]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2
--- lib/liblsof.a ---
A failure has been detected in another branch of the parallel make

make[2]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2/lib

*** [lib/liblsof.a] Error code 2

make[1]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2
2 errors

make[1]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2
===> Compilation failed unexpectedly.
Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
the maintainer.
*** Error code 1

Stop.
make: stopped in /usr/ports/sysutils/lsof

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic with ataintel and not ready CD on a Dell r710@r357958

2020-02-17 Thread Larry Rosenman

On 02/17/2020 3:13 pm, Larry Rosenman wrote:

On 02/17/2020 3:07 pm, Warner Losh wrote:

On Feb 17, 2020, at 1:24 PM, Mateusz Guzik  wrote:

On 2/17/20, Larry Rosenman  wrote:

On 02/17/2020 1:46 pm, Larry Rosenman wrote:

Unread portion of the kernel message buffer:
panic: aprobe1: freed with 1 active CCBs

cpuid = 22
time = 1581771571
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe01fb9a11a0
vpanic() at vpanic+0x185/frame 0xfe01fb9a1200
panic() at panic+0x43/frame 0xfe01fb9a1260
cam_periph_release_locked_buses() at
cam_periph_release_locked_buses+0x372/frame 0xfe01fb9a1780
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame
0xfe01fb9a17a0
probedone() at probedone+0x186/frame 0xfe01fb9a1c60
xpt_done_process() at xpt_done_process+0x358/frame 
0xfe01fb9a1ca0

xpt_done_td() at xpt_done_td+0xf5/frame 0xfe01fb9a1cf0
fork_exit() at fork_exit+0x80/frame 0xfe01fb9a1d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe01fb9a1d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 1m8s
Dumping 6077 out of 131029
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at 
/usr/src/sys/amd64/include/pcpu_aux.h:55

#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:393
#2  0x804bdf80 in kern_reboot (howto=260)
   at /usr/src/sys/kern/kern_shutdown.c:480
#3  0x804be3dd in vpanic (fmt=<optimized out>, ap=<optimized out>)
   at /usr/src/sys/kern/kern_shutdown.c:910
#4  0x804be133 in panic (fmt=)
   at /usr/src/sys/kern/kern_shutdown.c:836
#5  0x823c5bc2 in camperiphfree (periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:685
#6  cam_periph_release_locked_buses (periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:450
#7  0x823c5bfb in cam_periph_release_locked
(periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:461
#8  0x8240dce6 in probedone (periph=0xf80115da2300,
   done_ccb=) at /usr/src/sys/cam/ata/ata_xpt.c:1352
#9  0x823cee08 in xpt_done_process (ccb_h=0xf8015013e800)
   at /usr/src/sys/cam/cam_xpt.c:5488
#10 0x823d0db5 in xpt_done_td (arg=0x8243d780
)
   at /usr/src/sys/cam/cam_xpt.c:5515
#11 0x80483200 in fork_exit (callout=0x823d0cc0
,
   arg=0x8243d780 , frame=0xfe01fb9a1d40)
   at /usr/src/sys/kern/kern_fork.c:1059
#12 
(kgdb)


Core IS available as is the kernel

I do load the ataintel driver as a module.  Removing it allows me to boot.

What info do you all need?


Forgot to include, the previous working version was r356506



Can you try prior to r357647?


I’m pretty sure this is mine… and I’ve already reverted the bad change.


Warner


I've got a world/kernel building at r358050.  I'll post back either way.


and it boots fine and runs with ataintel back in the mix.

Thanks for the quick answer, Warner.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: Panic with ataintel and not ready CD on a Dell r710@r357958

2020-02-17 Thread Larry Rosenman

On 02/17/2020 3:07 pm, Warner Losh wrote:

On Feb 17, 2020, at 1:24 PM, Mateusz Guzik  wrote:

On 2/17/20, Larry Rosenman  wrote:

On 02/17/2020 1:46 pm, Larry Rosenman wrote:

Unread portion of the kernel message buffer:
panic: aprobe1: freed with 1 active CCBs

cpuid = 22
time = 1581771571
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe01fb9a11a0
vpanic() at vpanic+0x185/frame 0xfe01fb9a1200
panic() at panic+0x43/frame 0xfe01fb9a1260
cam_periph_release_locked_buses() at
cam_periph_release_locked_buses+0x372/frame 0xfe01fb9a1780
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame
0xfe01fb9a17a0
probedone() at probedone+0x186/frame 0xfe01fb9a1c60
xpt_done_process() at xpt_done_process+0x358/frame 
0xfe01fb9a1ca0

xpt_done_td() at xpt_done_td+0xf5/frame 0xfe01fb9a1cf0
fork_exit() at fork_exit+0x80/frame 0xfe01fb9a1d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe01fb9a1d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 1m8s
Dumping 6077 out of 131029
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at 
/usr/src/sys/amd64/include/pcpu_aux.h:55

#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:393
#2  0x804bdf80 in kern_reboot (howto=260)
   at /usr/src/sys/kern/kern_shutdown.c:480
#3  0x804be3dd in vpanic (fmt=, ap=)
   at /usr/src/sys/kern/kern_shutdown.c:910
#4  0x804be133 in panic (fmt=)
   at /usr/src/sys/kern/kern_shutdown.c:836
#5  0x823c5bc2 in camperiphfree (periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:685
#6  cam_periph_release_locked_buses (periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:450
#7  0x823c5bfb in cam_periph_release_locked
(periph=0xf80115da2300)
   at /usr/src/sys/cam/cam_periph.c:461
#8  0x8240dce6 in probedone (periph=0xf80115da2300,
   done_ccb=) at /usr/src/sys/cam/ata/ata_xpt.c:1352
#9  0x823cee08 in xpt_done_process (ccb_h=0xf8015013e800)
   at /usr/src/sys/cam/cam_xpt.c:5488
#10 0x823d0db5 in xpt_done_td (arg=0x8243d780
)
   at /usr/src/sys/cam/cam_xpt.c:5515
#11 0x80483200 in fork_exit (callout=0x823d0cc0
,
   arg=0x8243d780 , frame=0xfe01fb9a1d40)
   at /usr/src/sys/kern/kern_fork.c:1059
#12 
(kgdb)


Core IS available as is the kernel

I do load the ataintel driver as a module.  Removing it allows me to boot.

What info do you all need?


Forgot to include, the previous working version was r356506



Can you try prior to r357647?


I’m pretty sure this is mine… and I’ve already reverted the bad change.

Warner


I've got a world/kernel building at r358050.  I'll post back either way.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: Panic with ataintel and not ready CD on a Dell r710@r357958

2020-02-17 Thread Larry Rosenman

On 02/17/2020 1:46 pm, Larry Rosenman wrote:

Unread portion of the kernel message buffer:
panic: aprobe1: freed with 1 active CCBs

cpuid = 22
time = 1581771571
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe01fb9a11a0

vpanic() at vpanic+0x185/frame 0xfe01fb9a1200
panic() at panic+0x43/frame 0xfe01fb9a1260
cam_periph_release_locked_buses() at
cam_periph_release_locked_buses+0x372/frame 0xfe01fb9a1780
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame
0xfe01fb9a17a0
probedone() at probedone+0x186/frame 0xfe01fb9a1c60
xpt_done_process() at xpt_done_process+0x358/frame 0xfe01fb9a1ca0
xpt_done_td() at xpt_done_td+0xf5/frame 0xfe01fb9a1cf0
fork_exit() at fork_exit+0x80/frame 0xfe01fb9a1d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe01fb9a1d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 1m8s
Dumping 6077 out of 131029 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:393
#2  0x804bdf80 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:480
#3  0x804be3dd in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:910
#4  0x804be133 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:836
#5  0x823c5bc2 in camperiphfree (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:685
#6  cam_periph_release_locked_buses (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:450
#7  0x823c5bfb in cam_periph_release_locked (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:461
#8  0x8240dce6 in probedone (periph=0xf80115da2300,
done_ccb=) at /usr/src/sys/cam/ata/ata_xpt.c:1352
#9  0x823cee08 in xpt_done_process (ccb_h=0xf8015013e800)
at /usr/src/sys/cam/cam_xpt.c:5488
#10 0x823d0db5 in xpt_done_td (arg=0x8243d780 
)

at /usr/src/sys/cam/cam_xpt.c:5515
#11 0x80483200 in fork_exit (callout=0x823d0cc0 
,

arg=0x8243d780 , frame=0xfe01fb9a1d40)
at /usr/src/sys/kern/kern_fork.c:1059
#12 
(kgdb)


Core IS available as is the kernel

I do load the ataintel driver as a module.  Removing it allows me to boot.


What info do you all need?


Forgot to include, the previous working version was r356506

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Panic with ataintel and not ready CD on a Dell r710@r357958

2020-02-17 Thread Larry Rosenman

Unread portion of the kernel message buffer:
panic: aprobe1: freed with 1 active CCBs

cpuid = 22
time = 1581771571
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe01fb9a11a0

vpanic() at vpanic+0x185/frame 0xfe01fb9a1200
panic() at panic+0x43/frame 0xfe01fb9a1260
cam_periph_release_locked_buses() at 
cam_periph_release_locked_buses+0x372/frame 0xfe01fb9a1780
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame 
0xfe01fb9a17a0

probedone() at probedone+0x186/frame 0xfe01fb9a1c60
xpt_done_process() at xpt_done_process+0x358/frame 0xfe01fb9a1ca0
xpt_done_td() at xpt_done_td+0xf5/frame 0xfe01fb9a1cf0
fork_exit() at fork_exit+0x80/frame 0xfe01fb9a1d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe01fb9a1d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 1m8s
Dumping 6077 out of 131029 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:393
#2  0x804bdf80 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:480
#3  0x804be3dd in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:910
#4  0x804be133 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:836
#5  0x823c5bc2 in camperiphfree (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:685
#6  cam_periph_release_locked_buses (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:450
#7  0x823c5bfb in cam_periph_release_locked (periph=0xf80115da2300)
at /usr/src/sys/cam/cam_periph.c:461
#8  0x8240dce6 in probedone (periph=0xf80115da2300,
done_ccb=) at /usr/src/sys/cam/ata/ata_xpt.c:1352
#9  0x823cee08 in xpt_done_process (ccb_h=0xf8015013e800)
at /usr/src/sys/cam/cam_xpt.c:5488
#10 0x823d0db5 in xpt_done_td (arg=0x8243d780 
)

at /usr/src/sys/cam/cam_xpt.c:5515
#11 0x80483200 in fork_exit (callout=0x823d0cc0 
,

arg=0x8243d780 , frame=0xfe01fb9a1d40)
at /usr/src/sys/kern/kern_fork.c:1059
#12 
(kgdb)


Core IS available as is the kernel

I do load the ataintel driver as a module.  Removing it allows me to boot.


What info do you all need?
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


sysutils/lsof: Recent changes have broken lsof

2019-12-10 Thread Larry Rosenman
HAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF 
-DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T 
-DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 -DHASFDESCFS=2 
-DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX -DHAS_STRFTIME 
-DLSOF_VSTR="13.0-CURRENT" -I/usr/src/sys -O2 -c isfn.c -o isfn.o

--- dnode2.o ---
1 warning generated.
--- dproc.o ---
cc   -pipe -fstack-protector-strong -fno-strict-aliasing 
-DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT -DHAS_DUP2 
-DHAS_CLOSEFROM -DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT 
-DHAS_TMPFS -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_NO_IDEV -DHAS_VM_MEMATTR_T -DNEEDS_DEVICE_T -DHAS_CDEV2PRIV 
-DHAS_NO_SI_UDEV -DHAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF 
-DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T 
-DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 -DHASFDESCFS=2 
-DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX -DHAS_STRFTIME 
-DLSOF_VSTR=\"13.0-CURRENT\" -I/usr/src/sys -O2 -c dproc.c -o dproc.o

dproc.c:693:23: error: no member named 'next' in 'struct vm_map_entry'
if (!(ka = (KA_T)e->next))
 ~  ^
1 error generated.
*** [dproc.o] Error code 1


from pkg-fallout.

Thanks!

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





ng_snd_item: I thought(?) we fixed this :( r354843

2019-11-25 Thread Larry Rosenman

I thought someone, somewhere fixed this,  but it's hit again.

core *IS* available, and I can give access as well.



Unread portion of the kernel message buffer:
panic: ng_snd_item: 42 != 1414
cpuid = 0
time = 1574707403
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe0215a304d0

vpanic() at vpanic+0x17e/frame 0xfe0215a30530
panic() at panic+0x43/frame 0xfe0215a30590
ng_snd_item() at ng_snd_item+0x482/frame 0xfe0215a305d0
ng_ether_output() at ng_ether_output+0x5e/frame 0xfe0215a30600
ether_output() at ether_output+0x661/frame 0xfe0215a306a0
arpintr() at arpintr+0xf0c/frame 0xfe0215a30840
netisr_dispatch_src() at netisr_dispatch_src+0x94/frame 
0xfe0215a308c0

ether_demux() at ether_demux+0x15e/frame 0xfe0215a308f0
ng_ether_rcv_upper() at ng_ether_rcv_upper+0xb2/frame 0xfe0215a30940
ng_apply_item() at ng_apply_item+0xa4/frame 0xfe0215a309c0
ng_snd_item() at ng_snd_item+0x2b0/frame 0xfe0215a30a00
ng_apply_item() at ng_apply_item+0xa4/frame 0xfe0215a30a80
ng_snd_item() at ng_snd_item+0x2b0/frame 0xfe0215a30ac0
ng_ether_input() at ng_ether_input+0x4c/frame 0xfe0215a30af0
ether_nh_input() at ether_nh_input+0x2c9/frame 0xfe0215a30b40
netisr_dispatch_src() at netisr_dispatch_src+0x94/frame 
0xfe0215a30bc0

ether_input() at ether_input+0x58/frame 0xfe0215a30c10
bce_intr() at bce_intr+0x6b7/frame 0xfe0215a30c90
ithread_loop() at ithread_loop+0x1c6/frame 0xfe0215a30cf0
fork_exit() at fork_exit+0x80/frame 0xfe0215a30d30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0215a30d30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 6d7h53m25s
Dumping 27613 out of 131029 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804bbc20 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804bc076 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:908
#4  0x804bbdd3 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:835
#5  0x8262f442 in ng_snd_item (item=0xf813c2e41280, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2252
#6  0x82643c0e in ng_ether_output (ifp=,
mp=0xfe0215a30658) at /usr/src/sys/netgraph/ng_ether.c:294
#7  0x805c44e1 in ether_output (ifp=,
m=0xf80be7cfe500, dst=0xfe0215a30800, ro=)
at /usr/src/sys/net/if_ethersubr.c:430
#8  0x805ded6c in in_arpinput (m=)
at /usr/src/sys/netinet/if_ether.c:1144
#9  arpintr (m=0xf80be7cfe500) at 
/usr/src/sys/netinet/if_ether.c:747

#10 0x805cff94 in netisr_dispatch_src (proto=4, source=0,
m=0xf80be7cfe500) at /usr/src/sys/net/netisr.c:1127
#11 0x805c47ce in ether_demux (ifp=0xf8107db75800,
m=) at /usr/src/sys/net/if_ethersubr.c:916
#12 0x82644042 in ng_ether_rcv_upper (hook=,
item=) at /usr/src/sys/netgraph/ng_ether.c:744
#13 0x8262f514 in ng_apply_item (node=0xf8106c02c200,
item=0xf813c2e41280, rw=0) at 
/usr/src/sys/netgraph/ng_base.c:2403

#14 0x8262f270 in ng_snd_item (item=0xf813c2e41280, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2320
#15 0x8262f514 in ng_apply_item (node=0xf8012c69de00,
item=0xf813c2e41280, rw=0) at 
/usr/src/sys/netgraph/ng_base.c:2403

#16 0x8262f270 in ng_snd_item (item=0xf813c2e41280, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2320
#17 0x82643c9c in ng_ether_input (ifp=,
mp=0xfe0215a30b18) at /usr/src/sys/netgraph/ng_ether.c:255
#18 0x805c5a59 in ether_input_internal (ifp=0xf8107db75800,
m=0xf80be7cfe500) at /usr/src/sys/net/if_ethersubr.c:654
#19 ether_nh_input (m=) at 
/usr/src/sys/net/if_ethersubr.c:735

#20 0x805cff94 in netisr_dispatch_src (proto=5, source=0,
m=0xf80be7cfe500) at /usr/src/sys/net/netisr.c:1127
#21 0x805c4c48 in ether_input (ifp=0xf8107db75800, m=0x0)
at /usr/src/sys/net/if_ethersubr.c:824
#22 0x82455767 in bce_rx_intr (sc=0xfe0234fc8000)
at /usr/src/sys/dev/bce/if_bce.c:6848
#23 bce_intr (xsc=0xfe0234fc8000) at 
/usr/src/sys/dev/bce/if_bce.c:8017
#24 0x80485166 in intr_event_execute_handlers (p=<optimized out>,
ie=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1148
#25 ithread_execute_handlers (p=, ie=)
at /usr/src/sys/kern/kern_intr.c:1161
#26 ithread_loop (arg=) at 
/usr/src/sys/kern/kern_intr.c:1241

#27 0x80481ca0 in fork_exit (
callout=0x80484fa0 , arg=0xf8107d1ffcc0,
frame=0xfe0215a30d40) at /usr/src/sys/kern/kern_fork.c:1059
#28 
(kgdb)

--
Larry Rosenman http://people.fre

ZFS Panic: Current: r354843: panic: solaris assert: error || lr->lr_length <= size, file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 1324

2019-11-19 Thread Larry Rosenman

Ideas?  Core *IS* available, and I can give access.

Unread portion of the kernel message buffer:
panic: solaris assert: error || lr->lr_length <= size, file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 1324
cpuid = 20
time = 1574159903
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe028c4d1920
vpanic() at vpanic+0x17e/frame 0xfe028c4d1980
panic() at panic+0x43/frame 0xfe028c4d19e0
assfail() at assfail+0x1a/frame 0xfe028c4d19f0
zfs_get_data() at zfs_get_data+0x358/frame 0xfe028c4d1a60
zil_commit_impl() at zil_commit_impl+0xfa5/frame 0xfe028c4d1bb0
zfs_sync() at zfs_sync+0xa2/frame 0xfe028c4d1bd0
sys_sync() at sys_sync+0xf5/frame 0xfe028c4d1c00
amd64_syscall() at amd64_syscall+0x29b/frame 0xfe028c4d1d30
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe028c4d1d30
--- syscall (36, FreeBSD ELF64, sys_sync), rip = 0x80030d7aa, rsp = 0x7fffe138, rbp = 0x7fffe260 ---
Uptime: 4h32m18s
Dumping 24794 out of 131029 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804bbc20 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804bc076 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:908
#4  0x804bbdd3 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:835
#5  0x8177021a in assfail (a=, f=,
l=)
at 
/usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81

#6  0x81418e98 in zfs_get_data (arg=,
lr=0xfe0365716b60, buf=, lwb=0xf813d468a000,
zio=)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1324

#7  0x813e1775 in zil_lwb_commit (zilog=0xf81044baa800,
itx=, lwb=0xf813d468a000)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:1610

#8  zil_process_commit_list (zilog=0xf81044baa800)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:2188

#9  zil_commit_writer (zilog=0xf81044baa800, zcw=)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:2321

#10 zil_commit_impl (zilog=, foid=)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:2835

#11 0x81415752 in zfs_sync (vfsp=,
waitfor=)
at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:331
#12 0x80593e35 in sys_sync (td=<optimized out>, uap=<optimized out>)
at /usr/src/sys/kern/vfs_syscalls.c:142
#13 0x8080c75b in syscallenter (td=0xf816486ce000)
at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#14 amd64_syscall (td=0xf816486ce000, traced=0)
at /usr/src/sys/amd64/amd64/trap.c:1163
#15 
#16 0x00080030d7aa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffe138
(kgdb)

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: My buildfarm member now giving permission denied

2019-10-01 Thread Larry Rosenman

On 10/01/2019 8:27 pm, Larry Rosenman wrote:

FreeBSD SVN rev:
r352600   -  -  1.69G 2019-09-22 13:13
r352873   NR /  43.1G 2019-09-29 16:36

I went from r352600 to r352873 and now I'm getting PostgreSQL permission denied
errors on the check phase of the build.

FreeBSD folks: Any ideas?
PostgreSQL folks: FYI.

latest build log:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus=2019-10-02%2001%3A20%3A14


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


My buildfarm member now giving permission denied

2019-10-01 Thread Larry Rosenman

FreeBSD SVN rev:
r352600   -  -  1.69G 2019-09-22 13:13
r352873   NR /  43.1G 2019-09-29 16:36

I went from r352600 to r352873 and now I'm getting PostgreSQL permission denied
errors on the check phase of the build.

FreeBSD folks: Any ideas?
PostgreSQL folks: FYI.



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: rcv_start < rcv_end

2019-09-10 Thread Larry Rosenman

On 09/10/2019 9:20 am, Michael Tuexen wrote:

On 10. Sep 2019, at 14:37, Yuri Pankov  wrote:

Just seen this almost immediately after booting the system installed 
from amd64-20190906-r351901 snapshot, trying to do initial pkg 
bootstrap.  Sadly, I didn't have the swap/dump device configured at 
the time, so no dump was saved.


But it looks like I'm not alone, seeing the 
https://forums.freebsd.org/threads/kernel-panic-on-bhyve-virtualization.7/ 
topic.  Note that I'm running on bare metal, so bhyve isn't involved. 
My panic screenshot is at https://pasteboard.co/IwLaXXb.jpg.


In (the most likely) case it's not helpful enough, I'm now running 
with dump device configured, and will update if/when the panic 
reproduces.

This panic should be fixed by:
https://svnweb.freebsd.org/changeset/base/352072

Please drop me a note if not.

Best regards
Michael




Is this the same panic as the one reported here?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240471

I *DO* have a core.


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


panic: VOP_UNSET_TEXT returned 22: on r351627

2019-09-07 Thread Larry Rosenman

I got the following panic this AM during a poudriere run.

r351627 is the revision I'm at.

Core *IS* available.

Ideas?



Unread portion of the kernel message buffer:
VNASSERT failed
0xf809e6335960: tag tmpfs, type VREG
usecount 1, writecount 0, refcount 2
flags (VI_ACTIVE)
v_object 0xf81f37227000 ref 2 pages 1063 cleanbuf 0 dirtybuf 0
lock type tmpfs: SHARED (count 1)
tag VT_TMPFS, tmpfs_node 0xf803214f83a0, flags 0x0, links 1
mode 0755, owner 65534, group 0, size 4352808, status 0x0

panic: VOP_UNSET_TEXT returned 22
cpuid = 22
time = 1567862254
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe01bfd618b0
vpanic() at vpanic+0x19d/frame 0xfe01bfd61900
panic() at panic+0x43/frame 0xfe01bfd61960
vm_map_entry_set_vnode_text() at vm_map_entry_set_vnode_text+0x275/frame 0xfe01bfd619b0
vm_map_process_deferred() at vm_map_process_deferred+0x70/frame 0xfe01bfd619d0
vm_map_remove() at vm_map_remove+0xc6/frame 0xfe01bfd61a00
vmspace_exit() at vmspace_exit+0xd8/frame 0xfe01bfd61a40
exit1() at exit1+0x57d/frame 0xfe01bfd61ab0
sys_sys_exit() at sys_sys_exit+0xd/frame 0xfe01bfd61ac0
amd64_syscall() at amd64_syscall+0x29f/frame 0xfe01bfd61bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe01bfd61bf0
--- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x8008326aa, rsp = 0x7fffe1b8, rbp = 0x7fffe1d0 ---
Uptime: 7d15h33m31s
Dumping 23246 out of 131027 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804bcf60 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804bd3d9 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:905
#4  0x804bd113 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:832
#5  0x807644e5 in vm_map_entry_set_vnode_text (entry=<optimized out>,
add=<optimized out>) at /usr/src/sys/vm/vm_map.c:557
#6  0x807645a0 in vm_map_process_deferred ()
at /usr/src/sys/vm/vm_map.c:593
#7  0x8076a1b6 in _vm_map_unlock (map=,
file=, line=3653) at /usr/src/sys/vm/vm_map.c:607
#8  vm_map_remove (map=, start=4096, end=140737488355328)
at /usr/src/sys/vm/vm_map.c:3653
#9  0x80764118 in vmspace_dofree (vm=)
at /usr/src/sys/vm/vm_map.c:335
#10 vmspace_exit (td=0xf8016632c000) at /usr/src/sys/vm/vm_map.c:416
#11 0x8047d27d in exit1 (td=0xf8016632c000, rval=<optimized out>,
signo=0) at /usr/src/sys/kern/kern_exit.c:416
#12 0x8047ccfd in sys_sys_exit (td=<optimized out>, uap=<optimized out>)
at /usr/src/sys/kern/kern_exit.c:195
#13 0x807f13df in syscallenter (td=0xf8016632c000)
at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#14 amd64_syscall (td=0xf8016632c000, traced=0)
at /usr/src/sys/amd64/amd64/trap.c:1180
#15 
#16 0x0008008326aa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffe1b8
(kgdb)

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





Re: sysutils/lsof: VOP_FSYNC definition moved?

2019-08-30 Thread Larry Rosenman

On 08/30/2019 10:20 pm, Yuri Pankov wrote:

Larry Rosenman wrote:

http://home.lerctr.org:/data/live-host-ports/2019-08-30_20h25m06s/logs/errors/lsof-4.93.2_4,8.log

--- dnode2.o ---
In file included from dnode2.c:56:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h:33:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:47:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h:73:
In file included from /usr/src/sys/cddl/compat/opensolaris/sys/vfs.h:37:
/usr/src/sys/cddl/compat/opensolaris/sys/vnode.h:243:10: warning: implicit declaration of function 'VOP_FSYNC' is invalid in C99 [-Wimplicit-function-declaration]
 error = VOP_FSYNC(vp, MNT_WAIT, curthread);
 ^
1 warning generated.
A failure has been detected in another branch of the parallel make


Real error seems to be way above that (see below), and VOP_FSYNC one is
just a fallout from that.  It is likely related to r351594 by
Konstantin, but I didn't look into the details.  You could try defining
_SYS_PCPU_H_ before including   in dlsof.h with _KERNEL
defined -- this seems to fix the lsof build for me.
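
A minimal sketch of the include-guard trick described above, assuming the header wraps its body in a `#ifndef _SYS_PCPU_H_`-style guard. Everything named `DEMO_*` or `demo_*` is a hypothetical stand-in, not lsof or FreeBSD code, and the "header" body is inlined so the example compiles on its own:

```c
#include <assert.h>

/*
 * Step 1: the consumer defines the guard macro *before* the header is
 * pulled in. DEMO_PCPU_H_ is a made-up stand-in for _SYS_PCPU_H_.
 */
#define DEMO_PCPU_H_

/*
 * Step 2: the (inlined, hypothetical) header body. The guard is already
 * defined, so the preprocessor skips everything up to the #endif and the
 * code that would not build in userland never reaches the compiler.
 */
#ifndef DEMO_PCPU_H_
#define DEMO_PCPU_H_
#define DEMO_PCPU_BODY_COMPILED 1
#error "stand-in for inline asm that fails outside the kernel"
#endif

/* Returns 1 when the guarded body was skipped, 0 when it was compiled. */
int
demo_header_body_skipped(void)
{
#ifdef DEMO_PCPU_BODY_COMPILED
	return (0);
#else
	return (1);
#endif
}

/* Small self-check; call it from main() to exercise the sketch. */
void
demo_self_check(void)
{
	assert(demo_header_body_skipped() == 1);
}
```

The trade-off is that skipping a header this way also drops every declaration it would have provided, so it only helps when nothing else in the program depends on them.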

-
In file included from ckkv.c:43:
In file included from ./../lsof.h:221:
In file included from ./../dlsof.h:412:
In file included from /usr/src/sys/sys/file.h:44:
In file included from /usr/src/sys/sys/refcount.h:36:
In file included from /usr/src/sys/sys/systm.h:126:
In file included from /usr/src/sys/sys/pcpu.h:223:
/usr/include/machine/pcpu_aux.h:55:55: error: expected expression
__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
 ^
/usr/include/machine/pcpu_aux.h:56:6: error: use of undeclared
identifier 'pc_curthread'; did you mean '__curthread'?
pc_curthread)));
^
/usr/include/machine/pcpu_aux.h:51:1: note: '__curthread' declared here
__curthread(void)
^
/usr/include/machine/pcpu_aux.h:66:56: error: expected expression
__asm("movq %%gs:%P1,%0" : "=r" (pcb) : "n" (offsetof(struct pcpu,
  ^
/usr/include/machine/pcpu_aux.h:67:6: error: use of undeclared
identifier 'pc_curpcb'; did you mean '__curpcb'?
pc_curpcb)));
^
/usr/include/machine/pcpu_aux.h:62:1: note: '__curpcb' declared here
__curpcb(void)



Thanks, Yuri.  I'd *REALLY* like someone with real kernel knowledge to look
at lsof and help modernize the #ifdef mess.


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


sysutils/lsof: VOP_FSYNC definition moved?

2019-08-30 Thread Larry Rosenman

http://home.lerctr.org:/data/live-host-ports/2019-08-30_20h25m06s/logs/errors/lsof-4.93.2_4,8.log

--- dnode2.o ---
In file included from dnode2.c:56:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h:33:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:47:
In file included from /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h:73:
In file included from /usr/src/sys/cddl/compat/opensolaris/sys/vfs.h:37:
/usr/src/sys/cddl/compat/opensolaris/sys/vnode.h:243:10: warning: implicit declaration of function 'VOP_FSYNC' is invalid in C99 [-Wimplicit-function-declaration]
error = VOP_FSYNC(vp, MNT_WAIT, curthread);
^
1 warning generated.
A failure has been detected in another branch of the parallel make


Can some of the kernel folks help me here?

Thanks!

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





panic... r350849M panic: ng_snd_item: 42 != 1414

2019-08-20 Thread Larry Rosenman

The M in the revision is a patch from rrs@ for the previous crash on m_pullup.

Ideas?  I have a core.


Unread portion of the kernel message buffer:
panic: ng_snd_item: 42 != 1414
cpuid = 0
time = 1566303841
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe01265c0400
vpanic() at vpanic+0x19d/frame 0xfe01265c0450
panic() at panic+0x43/frame 0xfe01265c04b0
ng_snd_item() at ng_snd_item+0x455/frame 0xfe01265c04f0
ng_ether_output() at ng_ether_output+0x5e/frame 0xfe01265c0520
ether_output() at ether_output+0x665/frame 0xfe01265c05c0
arpintr() at arpintr+0xfe3/frame 0xfe01265c0780
netisr_dispatch_src() at netisr_dispatch_src+0x89/frame 0xfe01265c07f0
ether_demux() at ether_demux+0x13b/frame 0xfe01265c0820
ng_ether_rcv_upper() at ng_ether_rcv_upper+0x95/frame 0xfe01265c0840
ng_apply_item() at ng_apply_item+0xf9/frame 0xfe01265c08c0
ng_snd_item() at ng_snd_item+0x2af/frame 0xfe01265c0900
ng_apply_item() at ng_apply_item+0xf9/frame 0xfe01265c0980
ng_snd_item() at ng_snd_item+0x2af/frame 0xfe01265c09c0
ng_ether_input() at ng_ether_input+0x4c/frame 0xfe01265c09f0
ether_nh_input() at ether_nh_input+0x2cd/frame 0xfe01265c0a40
netisr_dispatch_src() at netisr_dispatch_src+0x89/frame 0xfe01265c0ab0
ether_input() at ether_input+0x48/frame 0xfe01265c0ad0
bce_intr() at bce_intr+0x697/frame 0xfe01265c0b50
ithread_loop() at ithread_loop+0x187/frame 0xfe01265c0bb0
fork_exit() at fork_exit+0x84/frame 0xfe01265c0bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe01265c0bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 19h52m2s
Dumping 27502 out of 131026 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu.h:246
246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu.h:246
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804bb950 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804bbdc9 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:905
#4  0x804bbb03 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:832
#5  0x828e1515 in ng_snd_item (item=0xf81769193580, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2252
#6  0x828f2c2e in ng_ether_output (ifp=,
mp=0xfe01265c0578) at /usr/src/sys/netgraph/ng_ether.c:294
#7  0x805c9975 in ether_output (ifp=0xf80122488000,
m=0xf818181a8b00, dst=0xfe01265c0740, ro=)
at /usr/src/sys/net/if_ethersubr.c:430
#8  0x805e2e43 in in_arpinput (m=)
at /usr/src/sys/netinet/if_ether.c:1152
#9  arpintr (m=0xf818181a8b00) at 
/usr/src/sys/netinet/if_ether.c:749

#10 0x805d4959 in netisr_dispatch_src (proto=4,
source=, m=) at 
/usr/src/sys/net/netisr.c:1123
#11 0x805c9c3b in ether_demux (ifp=0xf80122488000, 
m=)

at /usr/src/sys/net/if_ethersubr.c:913
#12 0x828f3045 in ng_ether_rcv_upper (hook=,
item=) at /usr/src/sys/netgraph/ng_ether.c:741
#13 0x828e1639 in ng_apply_item (node=0xf801c6ae3c00,
item=0xf81769193580, rw=0) at 
/usr/src/sys/netgraph/ng_base.c:2403

#14 0x828e136f in ng_snd_item (item=0xf81769193580, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2320
#15 0x828e1639 in ng_apply_item (node=0xf810410a0400,
item=0xf81769193580, rw=0) at 
/usr/src/sys/netgraph/ng_base.c:2403

#16 0x828e136f in ng_snd_item (item=0xf81769193580, flags=0)
at /usr/src/sys/netgraph/ng_base.c:2320
#17 0x828f2cbc in ng_ether_input (ifp=,
mp=0xfe01265c0a18) at /usr/src/sys/netgraph/ng_ether.c:255
#18 0x805cae8d in ether_input_internal (ifp=0xf80122488000,
m=0xf818181a8b00) at /usr/src/sys/net/if_ethersubr.c:654
#19 ether_nh_input (m=) at 
/usr/src/sys/net/if_ethersubr.c:735

#20 0x805d4959 in netisr_dispatch_src (proto=5,
source=, m=) at 
/usr/src/sys/net/netisr.c:1123

#21 0x805ca078 in ether_input (ifp=0xf80122488000, m=0x0)
at /usr/src/sys/net/if_ethersubr.c:823
#22 0x8272f877 in bce_rx_intr (sc=)
at /usr/src/sys/dev/bce/if_bce.c:6848
#23 bce_intr (xsc=0xfe013abc2000) at 
/usr/src/sys/dev/bce/if_bce.c:8017
#24 0x80484997 in intr_event_execute_handlers (p=<optimized out>,
ie=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1148
#25 ithread_execute_handlers (p=, ie=)
at /usr/src/sys/kern/kern_intr.c:1161
#26 ithread_loop (arg=) at 
/usr/src/sys/kern/kern_intr.c:1241

#27 0x80481544 in fork_exit (
callout=0x80484810 , arg=0xf81058226460,
frame=0xfe01265c0c00) at /usr/src/sys/kern/kern_fork.c:1057
#28 
(kgdb)

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 

Panic... r350849 panic: m_copydata, negative off -1

2019-08-17 Thread Larry Rosenman

I do have a core if folks want to look.

r350849

Unread portion of the kernel message buffer:
panic: m_copydata, negative off -1
cpuid = 0
time = 1566090669
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe011a798720
vpanic() at vpanic+0x19d/frame 0xfe011a798770
panic() at panic+0x43/frame 0xfe011a7987d0
m_copydata() at m_copydata+0x17a/frame 0xfe011a798850
rack_output() at rack_output+0x2c00/frame 0xfe011a798a70
tcp_hpts_thread() at tcp_hpts_thread+0x5e6/frame 0xfe011a798b50
ithread_loop() at ithread_loop+0x187/frame 0xfe011a798bb0
fork_exit() at fork_exit+0x84/frame 0xfe011a798bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe011a798bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 6d22h22m59s
Dumping 28815 out of 131026 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu.h:246
246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));

(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu.h:246
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804bb950 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804bbdc9 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:905
#4  0x804bbb03 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:832
#5  0x8054868a in m_copydata (m=, 
off=,
len=, cp=) at 
/usr/src/sys/kern/uipc_mbuf.c:622

#6  0x8268bda0 in rack_output (tp=)
at 
/usr/src/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:7957

#7  0x80679176 in tcp_hptsi (hpts=)
at /usr/src/sys/netinet/tcp_hpts.c:1621
#8  tcp_hpts_thread (ctx=)
at /usr/src/sys/netinet/tcp_hpts.c:1842
#9  0x80484997 in intr_event_execute_handlers (p=<optimized out>,
ie=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1148
#10 ithread_execute_handlers (p=, ie=)
at /usr/src/sys/kern/kern_intr.c:1161
#11 ithread_loop (arg=) at 
/usr/src/sys/kern/kern_intr.c:1241

#12 0x80481544 in fork_exit (
callout=0x80484810 , arg=0xf80106ec01a0,
frame=0xfe011a798c00) at /usr/src/sys/kern/kern_fork.c:1057
#13 

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106





Re: [package - head-i386-default][sysutils/lsof] Failed for lsof-4.93.2_2,8 in build

2019-07-25 Thread Larry Rosenman

On 07/25/2019 1:40 pm, Justin Hibbits wrote:

On Thu, 25 Jul 2019 12:35:32 -0600
Alan Somers  wrote:


On Thu, Jul 25, 2019 at 12:13 PM Larry Rosenman 
wrote:
>
> On 07/25/2019 1:10 pm, Alan Somers wrote:
> > On Thu, Jul 25, 2019 at 12:05 PM Larry Rosenman 
> > wrote:
> >>
> >> Um  Who broke this?

...

> > "svn blame" suggests r350199 by kib.  However, refcount.h should
> > only be included if lsof defines _KERNEL, which normal programs
> > shouldn't. So I think this should be considered a bug in lsof.
> > -Alan
>
>
> we *HAVE* to define _KERNEL, to get at the kernel structures.

Then I think you have to live with this amount of instability.
refcount(9) says that you should include .  Did you do
that?  If so, then this is a man page bug and refcount(9) should also
specify stdbool.h.
-Alan


 includes  already, which typedefs bool.  So
 should suffice to include in lsof.

- Justin
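
The `'true'`/`'false'` errors discussed in this thread arise when a header uses the C99 boolean names before anything has defined them. A minimal userland sketch of the ordering, assuming `<stdbool.h>` (the header Alan mentions) is an acceptable source for those names. `demo_ref_release` is a hypothetical stand-in for an inline helper that lives in a header, not the real refcount(9) code, and this is not necessarily the fix that went into lsof:

```c
#include <assert.h>
#include <stdbool.h>	/* defines bool, true and false; must precede their first use */

/*
 * Hypothetical stand-in for an inline helper that lives in a header and
 * returns true/false. Without the <stdbool.h> include above, a pre-C23
 * compiler rejects the boolean names used below.
 */
static inline bool
demo_ref_release(unsigned int *count)
{
	if (*count == 0)
		return (false);		/* nothing left to release */
	*count -= 1;
	return (*count == 0);		/* true once the last reference is gone */
}

/* Small self-check; call it from main() to exercise the sketch. */
void
demo_bool_self_check(void)
{
	unsigned int refs = 2;

	assert(demo_ref_release(&refs) == false);	/* 2 -> 1 */
	assert(demo_ref_release(&refs) == true);	/* 1 -> 0 */
	assert(demo_ref_release(&refs) == false);	/* already 0 */
}
```

The point is only the ordering: whichever header supplies the boolean names has to be seen before the header whose inline functions use them.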


Thanks all!  I've got a PR into the lsof repo, and I'll fix it there.

If we can't get a release out in the next day or two, I'll patch it in the port.


https://github.com/lsof-org/lsof/pull/70
--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: [package - head-i386-default][sysutils/lsof] Failed for lsof-4.93.2_2,8 in build

2019-07-25 Thread Larry Rosenman

On 07/25/2019 1:10 pm, Alan Somers wrote:
On Thu, Jul 25, 2019 at 12:05 PM Larry Rosenman  
wrote:


Um  Who broke this?

/usr/src/sys/sys/refcount.h:65:12: error: use of undeclared identifier 'false'
 return (false);
 ^
/usr/src/sys/sys/refcount.h:68:12: error: use of undeclared identifier 'true'
 return (true);
 ^
/usr/src/sys/sys/refcount.h:81:11: error: use of undeclared identifier 'false'
 return (false);
 ^
/usr/src/sys/sys/refcount.h:90:10: error: use of undeclared identifier 'true'
 return (true);
 ^
/usr/src/sys/sys/refcount.h:106:12: error: use of undeclared identifier 'false'
 return (false);
 ^
/usr/src/sys/sys/refcount.h:108:12: error: use of undeclared identifier 'true'
 return (true);
 ^
/usr/src/sys/sys/refcount.h:121:12: error: use of undeclared identifier 'false'
 return (false);
 ^
/usr/src/sys/sys/refcount.h:123:12: error: use of undeclared identifier 'true'
 return (true);
 ^
8 errors generated.
*** Error code 1

Stop.
make[2]: stopped in
/wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2/lib
*** Error code 1

Stop.
make[1]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2
*** Error code 1

Stop.
make: stopped in /usr/ports/sysutils/lsof

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



"svn blame" suggests r350199 by kib.  However, refcount.h should only
be included if lsof defines _KERNEL, which normal programs shouldn't.
So I think this should be considered a bug in lsof.
-Alan



we *HAVE* to define _KERNEL, to get at the kernel structures.


--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Fwd: [package - head-i386-default][sysutils/lsof] Failed for lsof-4.93.2_2,8 in build

2019-07-25 Thread Larry Rosenman
==

===>  Building for lsof-4.93.2_2,8
(cd lib; /usr/bin/make DEBUG="-O2" CFGF="-pipe -fstack-protector-strong 
-fno-strict-aliasing -DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT 
-DHAS_DUP2 -DHAS_CLOSEFROM -DHASEFFNLINK=i_effnlink -DHASF_VNODE 
-DHAS_FILEDESCENT -DHAS_TMPFS -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE 
-DHAS_UFS1_2 -DHAS_NO_IDEV -DHAS_VM_MEMATTR_T -DNEEDS_DEVICE_T 
-DHAS_CDEV2PRIV -DHAS_NO_SI_UDEV -DHAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS 
-DHAS_V_LOCKF -DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB 
-DNEEDS_BOOLEAN_T -DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 
-DHASFDESCFS=2 -DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX 
-DHAS_STRFTIME -DLSOF_VSTR=\"13.0-CURRENT\"")
cc   -pipe -fstack-protector-strong -fno-strict-aliasing 
-DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT -DHAS_DUP2 
-DHAS_CLOSEFROM -DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT 
-DHAS_TMPFS -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_NO_IDEV -DHAS_VM_MEMATTR_T -DNEEDS_DEVICE_T -DHAS_CDEV2PRIV 
-DHAS_NO_SI_UDEV -DHAS_SYS_SX_H -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF 
-DHAS_LOCKF_ENTRY -DHAS_NO_6PORT -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T 
-DHAS_SB_CCC -DHAS_FDESCENTTBL -DFREEBSDV=13000 -DHASFDESCFS=2 
-DHASPSEUDOFS -DHASNULLFS -DHASIPv6 -DHASUTMPX -DHAS_STRFTIME 
-DLSOF_VSTR="13.0-CURRENT" -I/usr/src/sys -O2 -c ckkv.c -o ckkv.o

In file included from ckkv.c:43:
In file included from ./../lsof.h:221:
In file included from ./../dlsof.h:406:
In file included from /usr/src/sys/sys/file.h:44:
/usr/src/sys/sys/refcount.h:65:12: error: use of undeclared identifier 'false'
return (false);
^
/usr/src/sys/sys/refcount.h:68:12: error: use of undeclared identifier 'true'
return (true);
^
/usr/src/sys/sys/refcount.h:81:11: error: use of undeclared identifier 'false'
return (false);
^
/usr/src/sys/sys/refcount.h:90:10: error: use of undeclared identifier 'true'
return (true);
^
/usr/src/sys/sys/refcount.h:106:12: error: use of undeclared identifier 'false'
return (false);
^
/usr/src/sys/sys/refcount.h:108:12: error: use of undeclared identifier 'true'
return (true);
^
/usr/src/sys/sys/refcount.h:121:12: error: use of undeclared identifier 'false'
return (false);
^
/usr/src/sys/sys/refcount.h:123:12: error: use of undeclared identifier 'true'
return (true);
^
8 errors generated.
*** Error code 1

Stop.
make[2]: stopped in 
/wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2/lib

*** Error code 1

Stop.
make[1]: stopped in /wrkdirs/usr/ports/sysutils/lsof/work/lsof-4.93.2
*** Error code 1

Stop.
make: stopped in /usr/ports/sysutils/lsof

--
Larry Rosenman http://people.freebsd.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@freebsd.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: vm_page_free_prep: freeing mapped page

2019-07-15 Thread Larry Rosenman

On 07/15/2019 5:10 pm, Larry Rosenman wrote:

On 07/15/2019 3:53 pm, Mark Johnston wrote:

On Mon, Jul 15, 2019 at 07:53:43AM -0500, Larry Rosenman wrote:

I got another panic overnight while my bacula backups were running.
It's probably postgres again.

Mark,
I've uploaded the dump/core.txt, info file to the same place on
freefall (*17*).

What else can I provide to help find this bug and eradicate it?


Just to follow up, the panic should be resolved by r350005.


Updated to r350011.  We'll see how it does.  Thank You for
looking at it.

Well, it made it through the pgbuildfarm run.  I think we're good.

Thanks, all!
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: vm_page_free_prep: freeing mapped page

2019-07-15 Thread Larry Rosenman

On 07/15/2019 3:53 pm, Mark Johnston wrote:

On Mon, Jul 15, 2019 at 07:53:43AM -0500, Larry Rosenman wrote:

I got another panic overnight while my bacula backups were running.
It's probably postgres again.

Mark,
I've uploaded the dump/core.txt, info file to the same place on
freefall (*17*).

What else can I provide to help find this bug and eradicate it?


Just to follow up, the panic should be resolved by r350005.


Updated to r350011.  We'll see how it does.  Thank You for
looking at it.

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: vm_page_free_prep: freeing mapped page

2019-07-15 Thread Larry Rosenman

On 07/15/2019 10:14 am, Mark Johnston wrote:

On Mon, Jul 15, 2019 at 07:53:43AM -0500, Larry Rosenman wrote:

I got another panic overnight while my bacula backups were running.
It's probably postgres again.

Mark,
I've uploaded the dump/core.txt, info file to the same place on
freefall (*17*).

What else can I provide to help find this bug and eradicate it?


What's the last known good revision for this system?  That is, at what
revision was the last kernel where you didn't see the problem on this
system?


r349858 on 2019-07-09.
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: vm_page_free_prep: freeing mapped page

2019-07-15 Thread Larry Rosenman
I got another panic overnight while my bacula backups were running.
It's probably postgres again.

Mark,
I've uploaded the dump/core.txt, info file to the same place on
freefall (*17*).

What else can I provide to help find this bug and eradicate it?



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106


Re: panic: vm_page_free_prep: freeing mapped page

2019-07-13 Thread Larry Rosenman

On 07/13/2019 5:14 pm, Konstantin Belousov wrote:

On Sat, Jul 13, 2019 at 04:50:57PM -0500, Larry Rosenman wrote:

I have cores.  Ideas?
svn rev: r349976

[I] ➜ more core.txt.12
borg.lerctr.org dumped core - see /var/crash/vmcore.12

Sat Jul 13 16:47:03 CDT 2019

FreeBSD borg.lerctr.org 13.0-CURRENT FreeBSD 13.0-CURRENT r349976
LER-MINIMAL  amd64

panic: vm_page_free_prep: freeing mapped page 0xf82031044790

GNU gdb (GDB) 8.3 [GDB v8.3 for FreeBSD]
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
 <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: vm_page_free_prep: freeing mapped page 0xf82031044790
cpuid = 21
time = 1563053382
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe018f9fd890
vpanic() at vpanic+0x19d/frame 0xfe018f9fd8e0
panic() at panic+0x43/frame 0xfe018f9fd940
vm_page_free_prep() at vm_page_free_prep+0x18a/frame
0xfe018f9fd960
vm_page_free_toq() at vm_page_free_toq+0x12/frame 0xfe018f9fd990
vm_object_terminate() at vm_object_terminate+0x1db/frame
0xfe018f9fd9e0
vm_object_deallocate() at vm_object_deallocate+0x412/frame
0xfe018f9fda40
vm_map_process_deferred() at vm_map_process_deferred+0x7f/frame
0xfe018f9fda60
kern_munmap() at kern_munmap+0x181/frame 0xfe018f9fdad0
amd64_syscall() at amd64_syscall+0x25c/frame 0xfe018f9fdbf0
fast_syscall_common() at fast_syscall_common+0x101/frame
0xfe018f9fdbf0
--- syscall (73, FreeBSD ELF64, sys_munmap), rip = 0x80119978a, rsp =
0x7fffce18, rbp = 0x7fffce20 ---
Uptime: 2h27m22s
Dumping 15640 out of 131026
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu.h:246
246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
(OFFSETOF_CURTHREAD));
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu.h:246
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x804b6620 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x804b6a99 in vpanic (fmt=, ap=)
 at /usr/src/sys/kern/kern_shutdown.c:905
#4  0x804b67d3 in panic (fmt=)
 at /usr/src/sys/kern/kern_shutdown.c:832
#5  0x8076c21a in vm_page_free_prep (m=0xf82031044790)
 at /usr/src/sys/vm/vm_page.c:3273
#6  0x80768152 in vm_page_free_toq (m=0xf82031044790)
 at /usr/src/sys/vm/vm_page.c:3483
#7  0x8076321b in vm_object_terminate_pages (object=)
 at /usr/src/sys/vm/vm_object.c:726
#8  vm_object_terminate (object=0xf81b924c9600)
 at /usr/src/sys/vm/vm_object.c:798
#9  0x80762582 in vm_object_deallocate
(object=0xf81b924c9600)
 at /usr/src/sys/vm/vm_object.c:663
#10 0x80756aef in vm_map_entry_deallocate (entry=,
 system_map=0) at /usr/src/sys/vm/vm_map.c:3457
#11 vm_map_process_deferred () at /usr/src/sys/vm/vm_map.c:586
#12 0x807606b1 in kern_munmap (td=0xf80c207e,
 addr0=, size=149200896) at
/usr/src/sys/vm/vm_mmap.c:603
#13 0x807adaac in syscallenter (td=0xf80c207e)
 at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#14 amd64_syscall (td=0xf80c207e, traced=0)
 at /usr/src/sys/amd64/amd64/trap.c:1181
#15 
#16 0x00080119978a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffce18
(kgdb)
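
For anyone reading the trace above, here is a minimal sketch (not part of the original report) that pulls the call chain and the panic time out of the "Unread portion of the kernel message buffer" text. The sample input is copied from the message buffer above with the wrapped frame addresses rejoined onto one line; the names MSGBUF, parse_frames and panic_time_utc are made up for illustration, and the regular expressions assume the exact "func() at func+0xNN/frame 0xADDR" layout shown in this dump.

```python
import re
from datetime import datetime, timezone

# Sample lines copied from the kernel message buffer in this report.
# Assumption: wrapped "frame" addresses have been rejoined onto one line.
MSGBUF = """\
panic: vm_page_free_prep: freeing mapped page 0xf82031044790
cpuid = 21
time = 1563053382
KDB: stack backtrace:
vpanic() at vpanic+0x19d/frame 0xfe018f9fd8e0
panic() at panic+0x43/frame 0xfe018f9fd940
vm_page_free_prep() at vm_page_free_prep+0x18a/frame 0xfe018f9fd960
vm_page_free_toq() at vm_page_free_toq+0x12/frame 0xfe018f9fd990
vm_object_terminate() at vm_object_terminate+0x1db/frame 0xfe018f9fd9e0
vm_object_deallocate() at vm_object_deallocate+0x412/frame 0xfe018f9fda40
vm_map_process_deferred() at vm_map_process_deferred+0x7f/frame 0xfe018f9fda60
kern_munmap() at kern_munmap+0x181/frame 0xfe018f9fdad0
"""

# One backtrace line: "func() at func+0xOFF/frame 0xADDR"
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$")
# The "time = <epoch seconds>" line printed at panic time
TIME_RE = re.compile(r"^time = (\d+)$", re.MULTILINE)


def parse_frames(text):
    """Return [(function, offset)] in the order printed (innermost first)."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group(1), int(m.group(2), 16)))
    return frames


def panic_time_utc(text):
    """Return the panic time as an aware UTC datetime, or None if absent."""
    m = TIME_RE.search(text)
    if m is None:
        return None
    return datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)


frames = parse_frames(MSGBUF)
# Reverse so the chain reads from the syscall entry down to the panic.
call_chain = " -> ".join(name for name, _ in reversed(frames))
print(call_chain)
print(panic_time_utc(MSGBUF).isoformat())
```

Read outermost-first, the chain starts at kern_munmap and ends in vm_page_free_prep, which matches the sys_munmap syscall line in the dump. The epoch value converts to 21:29:42 UTC on 2019-07-13, i.e. 16:29:42 CDT, a little before the core.txt.12 timestamp.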


What was the process which caused the panic ?  Was it threaded ?


I got a second one:
[00:02:04] Building 43 packages using 24 builders
[00:02:04] Starting/Cloning builders
[00:02:07] Hit CTRL+t at any time to see build progress and stats
[00:02:07] [01] [00:00:00] Building devel/llvm80 | llvm80-8.0.0_2
[00:02:07] [02] [00:00:00] Building lang/gcc48 | gcc48-4.8.5_10
[00:02:07] [03] [00:00:00] Building lang/gcc8 | gcc8-8.3.0_2
[00:35:48] [02] [00:33:41] Finished lang/gcc48 | gcc48-4.8.5_10: Success
[00:35:50] [02] [00:00:00] Building sysutils/uefi-edk2-bhyve | uefi-edk2-bhyve-0.2_1,1
packet_write_wait: Connection to 2600:1700:210:b180:a6ba:dbff:fe29:6695 port 22: Broken pipe


[I] ➜ more core.txt.13
borg.lerctr.org dumped core - see /var/crash/vmcore.13

Sat Jul 13 17:

Re: panic: vm_page_free_prep: freeing mapped page

2019-07-13 Thread Larry Rosenman

On 07/13/2019 5:14 pm, Konstantin Belousov wrote:

What was the process which caused the panic?  Was it threaded?


It was a poudriere run, so I don't know for sure.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

