from:"Andriy Gapon"

Re: loader.efi module path vs kernel directory

2022-10-24 Thread Andriy Gapon


On 20/10/2022 13:59, Toomas Soome wrote:

Whoops, meant cli.lua(8), of course.


Thank you very much to everyone who helped!

The commands are indeed available, just not listed by the loader.
I had a recollection that in the past I saw them listed with either '?' or 
'help'.  Maybe that was with the Forth loader.


It would be nice if there was a way to list the extended commands online (via 
loader itself) as loader does not have command completion and it's not always 
possible to get another system for exploring manual pages while being stuck in 
loader.


Anyway, thank you again.

On 20. Oct 2022, at 13:58, Toomas Soome <tso...@me.com> wrote:



The problem with the '?' command is that it only lists commands written in C; 
it does not list scripted commands. cli_lua(8) should list the Lua-specific ones. 
My stable/13 branch seems to confirm that enable-module, 
disable-module, toggle-module and show-module-options should be present 
(defined in /boot/lua/cli.lua). I am also pretty sure Kyle added those when 
13 was current; the Lua version was missing them, the Forth version had them first :)


rgds,
toomas

On 20. Oct 2022, at 13:27, Andriy Gapon <a...@freebsd.org> wrote:


On 20/10/2022 13:20, Toomas Soome wrote:

Also, instead of manual load, you may want to use enable-module.


Emmanuel, Toomas,

thank you very much for the suggestions.

It seems like my installation may be messed up or outdated somehow, see below 
(and sorry about those ^M-s).  I do not seem to have boot-conf or *-module 
commands.


I checked that the EFI partition has exactly the same loader.efi as in /boot, 
but maybe some other files (configuration?) are outdated.

Also, forgot to mention, this is with stable/13, not main / current.

OK ?^M
Available commands:^M
 copy_staging copy staging^M
 staging_slop set staging slop^M
 efi-autoresizecons EFI Auto-resize Console^M
 gop  graphics output protocol^M
 uga  universal graphics adapter^M
 efi-seed-entropy try to get entropy from the EFI RNG^M
 poweroff power off the system^M
 reboot   reboot the system^M
 quit exit the loader^M
 memmap   print memory map^M
 configuration    print configuration tables^M
 mode change or display EFI text modes^M
 lsefi    list EFI handles^M
 chain    chain load file^M
 netserver    change or display netserver URI^M
 loadfont load console font from file^M
 grab_faults  grab faults^M
 ungrab_faults    ungrab faults^M
 fault    generate fault^M
 boot boot a file or loaded kernel^M
 autoboot boot automatically after a delay^M
 help detailed help^M
 ?    list commands^M
 show show variable(s)^M
 set  set a variable^M
 unset    unset a variable^M
 echo echo arguments^M
 read read input from the terminal^M
 more show contents of a file^M
 lsdev    list all devices^M
 readtest Time a file read^M
 include  read commands from a file^M
 ls   list files^M
 load load a kernel or module^M
 unload   unload all modules^M
 lsmod    list loaded modules^M
 pnpmatch list matched modules based on pnpinfo^M
 pnpload  load matched modules based on pnpinfo^M
 pnpautoload  auto load modules based on pnpinfo^M
 nvstore  manage non-volatile data^M
 map-vdisk    map file as virtual disk^M
 unmap-vdisk  unmap virtual disk^M
 bcachestat   get disk block cache stats^M
 lszfs    list child datasets of a zfs dataset^M
 reloadbe refresh the list of ZFS Boot Environments^M
 efi-show print some or all EFI variables^M
 efi-set  set EFI variables^M
 efi-unset    delete / unset EFI variables^M


Sent from my iPhone
On 20. Oct 2022, at 13:08, Emmanuel Vadot <m...@bidouilliste.com> wrote:


On Thu, 20 Oct 2022 13:03:26 +0300
Andriy Gapon <a...@freebsd.org> wrote:



I recently needed to recover a system by manually preloading a driver.
To a bit of surprise, simple 'load $modname' did not work, I had to use 'load
/boot/kernel/$modname.ko'.  I didn't have to do this in a long time, but I
recall that the short command used to work.  Additionally, required modules also
failed to get loaded automatically because loader couldn't find them.

I am not sure what the issue is.  Is it that /boot/kernel is not in module path
(as per /boot/defaults/loader.conf) ? Or is it that /boot/kernel does not get
added to the *effective* module path?

Thanks!
--
Andriy Gapon



If you escape to the prompt directly, the loader has not yet loaded all of its
config, so there is no modulepath defined; you need to run 'boot-conf' to load
the configuration files.
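
To illustrate why a short module name cannot be resolved while no module path is set, here is a minimal Python sketch of a directory search. It is not the loader's code: `resolve_module`, the `mydriver` name and the simplified lookup are assumptions made for illustration, and the real loader also consults linker.hints. The ';' separator follows the module_path setting in /boot/defaults/loader.conf.

```python
import os


def resolve_module(name, module_path, exists=os.path.exists):
    """Return the first file matching `name`, or None if nothing is found.

    Illustrative model only; `exists` is injectable so the search can be
    tried without touching a real /boot directory.
    """
    # A name containing a slash is treated as an explicit path.
    if "/" in name:
        return name if exists(name) else None
    # Otherwise search every directory of the ';'-separated module path.
    for directory in filter(None, module_path.split(";")):
        for candidate in (name, name + ".ko"):
            full = directory.rstrip("/") + "/" + candidate
            if exists(full):
                return full
    return None
```

With an empty module path, `resolve_module("mydriver", "")` finds nothing, while the explicit `/boot/kernel/mydriver.ko` path still works, which matches the behaviour described in the original report.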

Cheers,

--
Emmanuel Vadot <m...@bidouilliste.com> <m...@freebsd.org>




--
Andriy Gapon






--
Andriy Gapon

Re: loader.efi module path vs kernel directory

2022-10-20 Thread Andriy Gapon


On 20/10/2022 13:20, Toomas Soome wrote:

Also, instead of manual load, you may want to use enable-module.


Emmanuel, Toomas,

thank you very much for the suggestions.

It seems like my installation may be messed up or outdated somehow, see below 
(and sorry about those ^M-s).  I do not seem to have boot-conf or *-module commands.


I checked that the EFI partition has exactly the same loader.efi as in /boot, 
but maybe some other files (configuration?) are outdated.

Also, forgot to mention, this is with stable/13, not main / current.

OK ?^M
Available commands:^M
  copy_staging       copy staging^M
  staging_slop       set staging slop^M
  efi-autoresizecons EFI Auto-resize Console^M
  gop                graphics output protocol^M
  uga                universal graphics adapter^M
  efi-seed-entropy   try to get entropy from the EFI RNG^M
  poweroff           power off the system^M
  reboot             reboot the system^M
  quit               exit the loader^M
  memmap             print memory map^M
  configuration      print configuration tables^M
  mode               change or display EFI text modes^M
  lsefi              list EFI handles^M
  chain              chain load file^M
  netserver          change or display netserver URI^M
  loadfont           load console font from file^M
  grab_faults        grab faults^M
  ungrab_faults      ungrab faults^M
  fault              generate fault^M
  boot               boot a file or loaded kernel^M
  autoboot           boot automatically after a delay^M
  help               detailed help^M
  ?                  list commands^M
  show               show variable(s)^M
  set                set a variable^M
  unset              unset a variable^M
  echo               echo arguments^M
  read               read input from the terminal^M
  more               show contents of a file^M
  lsdev              list all devices^M
  readtest           Time a file read^M
  include            read commands from a file^M
  ls                 list files^M
  load               load a kernel or module^M
  unload             unload all modules^M
  lsmod              list loaded modules^M
  pnpmatch           list matched modules based on pnpinfo^M
  pnpload            load matched modules based on pnpinfo^M
  pnpautoload        auto load modules based on pnpinfo^M
  nvstore            manage non-volatile data^M
  map-vdisk          map file as virtual disk^M
  unmap-vdisk        unmap virtual disk^M
  bcachestat         get disk block cache stats^M
  lszfs              list child datasets of a zfs dataset^M
  reloadbe           refresh the list of ZFS Boot Environments^M
  efi-show           print some or all EFI variables^M
  efi-set            set EFI variables^M
  efi-unset          delete / unset EFI variables^M



Sent from my iPhone


On 20. Oct 2022, at 13:08, Emmanuel Vadot  wrote:

On Thu, 20 Oct 2022 13:03:26 +0300
Andriy Gapon  wrote:



I recently needed to recover a system by manually preloading a driver.
To a bit of surprise, simple 'load $modname' did not work, I had to use 'load
/boot/kernel/$modname.ko'.  I didn't have to do this in a long time, but I
recall that the short command used to work.  Additionally, required modules also
failed to get loaded automatically because loader couldn't find them.

I am not sure what the issue is.  Is it that /boot/kernel is not in module path
(as per /boot/defaults/loader.conf) ? Or is it that /boot/kernel does not get
added to the *effective* module path?

Thanks!
--
Andriy Gapon



If you escape to the prompt directly, the loader has not yet loaded all of its
config, so there is no modulepath defined; you need to run 'boot-conf' to load
the configuration files.

Cheers,

--
Emmanuel Vadot  



--
Andriy Gapon

loader.efi module path vs kernel directory

2022-10-20 Thread Andriy Gapon




I recently needed to recover a system by manually preloading a driver.
To a bit of surprise, simple 'load $modname' did not work, I had to use 'load 
/boot/kernel/$modname.ko'.  I didn't have to do this in a long time, but I 
recall that the short command used to work.  Additionally, required modules also 
failed to get loaded automatically because loader couldn't find them.


I am not sure what the issue is.  Is it that /boot/kernel is not in module path 
(as per /boot/defaults/loader.conf) ? Or is it that /boot/kernel does not get 
added to the *effective* module path?


Thanks!
--
Andriy Gapon

UDP port re-use [Was: in_pcbbind_setup: wrong condition regarding INP_REUSEPORT ?]

2022-10-13 Thread Andriy Gapon




Do we have test cases or a document that can be a definitive guide to UDP port 
re-use on FreeBSD?
Including effects of INP_REUSEADDR, INP_REUSEPORT and INP_REUSEPORT_LB socket 
options, socket addresses, socket credentials?
I can find some descriptions on the internet, but they do not seem to be 
complete.  The effect of addresses is under-described and I do not see any 
mention of credentials (UIDs).


Is there a way to tell if some behavior is correct or not?
Is it all in heads of people and in the change history?

On 04/10/2022 14:46, Andriy Gapon wrote:

On 04/10/2022 14:37, Sean Bruno wrote:



On 10/3/22 04:14, Andriy Gapon wrote:


I must admit that the condition in question is fairly long and non-trivial 
and I cannot decipher it, but these two lines look wrong to me:


  (t->inp_flags2 & INP_REUSEPORT) ||
  (t->inp_flags2 & INP_REUSEPORT_LB) == 0) &&


I'd expect that the check would be symmetric with respect to INP_REUSEPORT 
and INP_REUSEPORT_LB.

The problem seems to come from 1a43cff92a20d / r334719 / D11003.



I think you are pointing at this absurd conditional?

https://cgit.freebsd.org/src/tree/sys/netinet/in_pcb.c#n1049

Besides the twisted logic, what problem are you trying to solve?


Yes, that conditional.
I pointed out the part of it that does not make sense to me.

Also, in my tests SO_REUSEPORT does not actually allow sharing a port.
Test scenario:
- create a UDP socket
- setsockopt(SO_REUSEPORT)
- bind the socket to a port and wild card address
- success
- now repeat the previous steps with the same port *under a different user id*
- bind fails
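
The per-process part of the scenario above can be sketched in Python as follows. This is a hedged illustration rather than the test program used in the thread: `bind_udp_reuseport` is a hypothetical helper name, and the port number is whatever the caller chooses.

```python
import socket


def bind_udp_reuseport(port, host="0.0.0.0"):
    """Create a UDP socket, set SO_REUSEPORT, and bind it to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # The option has to be set before bind() for it to affect the bind.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    return sock
```

To reproduce the report, the helper would be run once under each user id with the same port while the first socket is still open; according to the description above, the second bind then fails.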

I wonder if the following would be a correct change.

diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c
index d9247f50d32b..f5e6e3932a96 100644
--- a/sys/netinet/in_pcb.c
+++ b/sys/netinet/in_pcb.c
@@ -1003,6 +1003,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockaddr *nam, in_addr_t *laddrp,

  /*
   * XXX
   * This entire block sorely needs a rewrite.
+ * And a good comment describing the rationale behind the conditions.
   */
  if (t &&
  ((inp->inp_flags2 & INP_BINDMULTI) == 0) &&
@@ -1011,8 +1012,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockaddr *nam, in_addr_t *laddrp,

   ntohl(t->inp_faddr.s_addr) == INADDR_ANY) &&
  (ntohl(sin->sin_addr.s_addr) != INADDR_ANY ||
   ntohl(t->inp_laddr.s_addr) != INADDR_ANY ||
- (t->inp_flags2 & INP_REUSEPORT) ||
- (t->inp_flags2 & INP_REUSEPORT_LB) == 0) &&
+ (t->inp_flags2 & (INP_REUSEPORT | INP_REUSEPORT_LB)) == 0) &&
  (inp->inp_cred->cr_uid !=
   t->inp_cred->cr_uid))
  return (EADDRINUSE);
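
The asymmetry between the two flags can be seen by tabulating the sub-expression before and after the proposed change. The Python sketch below only models those two terms; the bit values are placeholders chosen for the example (the real definitions are in sys/netinet/in_pcb.h), and the function names are made up.

```python
# Placeholder bit values, NOT the kernel's actual definitions.
INP_REUSEPORT = 0x1
INP_REUSEPORT_LB = 0x2


def original_term(flags2):
    # C: (flags2 & INP_REUSEPORT) || (flags2 & INP_REUSEPORT_LB) == 0
    # '==' binds tighter than '||', so this is A || (B == 0).
    return bool(flags2 & INP_REUSEPORT) or (flags2 & INP_REUSEPORT_LB) == 0


def proposed_term(flags2):
    # C: (flags2 & (INP_REUSEPORT | INP_REUSEPORT_LB)) == 0
    return (flags2 & (INP_REUSEPORT | INP_REUSEPORT_LB)) == 0


def truth_table():
    """Return (flags2, original, proposed) for all four flag combinations."""
    combos = (0, INP_REUSEPORT, INP_REUSEPORT_LB,
              INP_REUSEPORT | INP_REUSEPORT_LB)
    return [(f, original_term(f), proposed_term(f)) for f in combos]
```

In the original form the term stays true when only INP_REUSEPORT is set on the existing socket but becomes false when only INP_REUSEPORT_LB is set; in the proposed form either flag makes it false. Since the term is one alternative in the group that, together with the uid mismatch, leads to EADDRINUSE, this is the difference the patch is about.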



--
Andriy Gapon

Re: usbhid panic when switching vt-s (invariants+witness enabled)

2022-09-30 Thread Andriy Gapon


On 26/09/2022 18:13, Hans Petter Selasky wrote:

On 9/23/22 23:43, Hans Petter Selasky wrote:

vpanic() at 0x808f4c84 = vpanic+0x184/frame 0xfe003590e900
panic() at 0x808f4a33 = panic+0x43/frame 0xfe003590e960
sleepq_add() at 0x809521ab = sleepq_add+0x37b/frame 0xfe003590e9b0
_sleep() at 0x80902118 = _sleep+0x238/frame 0xfe003590ea40
usbhid_sync_xfer() at 0x8532e071 = usbhid_sync_xfer+0x171/frame 0xfe003590eaa0
usbhid_set_report() at 0x8532db26 = usbhid_set_report+0x96/frame 0xfe003590eae0
hid_set_report() at 0x80686caa = hid_set_report+0x6a/frame 0xfe003590eb20
hidbus_write() at 0x85335a7c = hidbus_write+0x5c/frame 0xfe003590eb50
hid_write() at 0x80686b98 = hid_write+0x58/frame 0xfe003590eb80
hkbd_set_leds() at 0x85c1cfe6 = hkbd_set_leds+0x206/frame 0xfe003590ebc0
hkbd_ioctl_locked() at 0x85c1cd6b = hkbd_ioctl_locked+0x33b/frame 0xfe003590ec20
hkbd_ioctl_locked() at 0x85c1caff = hkbd_ioctl_locked+0xcf/frame 0xfe003590ec80
hkbd_ioctl() at 0x85c1ba5a = hkbd_ioctl+0xba/frame 0xfe003590ecc0
kbdmux_ioctl() at 0x80695d3b = kbdmux_ioctl+0x12b/frame 0xfe003590ed00
vt_window_switch() at 0x8079d969 = vt_window_switch+0x229/frame 0xfe003590ed40
vt_switch_timer() at 0x807a15a1 = vt_switch_timer+0x21/frame 0xfe003590ed60


Can you test this patch:
https://reviews.freebsd.org/D36715


Sorry that it took a while.
I cannot reproduce the problem after applying the patch.
I see that you already committed the change, but I thought that I'd let you 
know.
Thank you very much!

--
Andriy Gapon

usbhid panic when switching vt-s (invariants+witness enabled)

2022-09-23 Thread Andriy Gapon




It seems that the problem may be related to different keyboard LED states 
between the VTs.  The system is a fresh stable/13.  The panic looks like an 
attempt to sleep while in an interrupt thread (a callout?).


panic: sleepq_add: td 0xf80006af to sleep on wchan 0xf802ea752e08 with sleeping prohibited
cpuid = 5
time = 1663940484
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8061555b = db_trace_self_wrapper+0x2b/frame 0xfe003590e7f0
kdb_backtrace() at 0x80942637 = kdb_backtrace+0x37/frame 0xfe003590e8a0
vpanic() at 0x808f4c84 = vpanic+0x184/frame 0xfe003590e900
panic() at 0x808f4a33 = panic+0x43/frame 0xfe003590e960
sleepq_add() at 0x809521ab = sleepq_add+0x37b/frame 0xfe003590e9b0
_sleep() at 0x80902118 = _sleep+0x238/frame 0xfe003590ea40
usbhid_sync_xfer() at 0x8532e071 = usbhid_sync_xfer+0x171/frame 0xfe003590eaa0
usbhid_set_report() at 0x8532db26 = usbhid_set_report+0x96/frame 0xfe003590eae0
hid_set_report() at 0x80686caa = hid_set_report+0x6a/frame 0xfe003590eb20
hidbus_write() at 0x85335a7c = hidbus_write+0x5c/frame 0xfe003590eb50
hid_write() at 0x80686b98 = hid_write+0x58/frame 0xfe003590eb80
hkbd_set_leds() at 0x85c1cfe6 = hkbd_set_leds+0x206/frame 0xfe003590ebc0
hkbd_ioctl_locked() at 0x85c1cd6b = hkbd_ioctl_locked+0x33b/frame 0xfe003590ec20
hkbd_ioctl_locked() at 0x85c1caff = hkbd_ioctl_locked+0xcf/frame 0xfe003590ec80
hkbd_ioctl() at 0x85c1ba5a = hkbd_ioctl+0xba/frame 0xfe003590ecc0
kbdmux_ioctl() at 0x80695d3b = kbdmux_ioctl+0x12b/frame 0xfe003590ed00
vt_window_switch() at 0x8079d969 = vt_window_switch+0x229/frame 0xfe003590ed40
vt_switch_timer() at 0x807a15a1 = vt_switch_timer+0x21/frame 0xfe003590ed60
softclock_call_cc() at 0x809127c4 = softclock_call_cc+0x244/frame 0xfe003590ee20
softclock() at 0x80912c1c = softclock+0x7c/frame 0xfe003590ee50
ithread_loop() at 0x808b662a = ithread_loop+0x2da/frame 0xfe003590eef0
fork_exit() at 0x808b2f85 = fork_exit+0xc5/frame 0xfe003590ef30
fork_trampoline() at 0x80c084fe = fork_trampoline+0xe/frame 0xfe003590ef30



--
Andriy Gapon

Re: Beadm can't create snapshot

2022-08-22 Thread Andriy Gapon


On 2022-08-22 12:29, Peter Jeremy wrote:

On 2022-Aug-22 10:56:51 +0200, "Patrick M. Hausen"  wrote:

On 22.08.2022 at 10:45, Peter Jeremy wrote:
On 2022-Aug-17 18:07:20 +0200, "Patrick M. Hausen"  wrote:

Isn't beadm retired in favour of bectl?


2) "bectl activate" doesn't update /boot/loader.conf so the wrong
   root filesystem is mounted.


You mean the vfs.root.mountfrom option? I thought that, too, was deprecated and
replaced by the bootfs property of the zpool.


I've looked through mailing list archives and searched the 'net and
haven't found anything saying vfs.root.mountfrom is deprecated.
loader(8) mentions that it will fall back to using "currdev" if there's
no root entry in /etc/fstab and vfs.root.mountfrom isn't set.


It's not deprecated, but it's a manual override of the _normal_ ZFS boot flow.
If you mount root via fstab or you override it via vfs.root.mountfrom, 
then you should know what you are doing and why.



At the very least, it's an undocumented incompatibility between beadm
and bectl: I can't take an existing system that's using beadm and just
switch to using bectl.


Yeah, but I would blame beadm for doing things in a backwards fashion.

--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

Re: pkg: Newer FreeBSD version for package... but why?

2022-08-01 Thread Andriy Gapon


On 2022-07-13 21:33, John Baldwin wrote:

On 7/13/22 3:17 AM, Andriy Gapon wrote:

On 2022-07-13 13:09, Michael Gmelin wrote:



On Wed, 13 Jul 2022 10:29:06 +0300
Andriy Gapon  wrote:


# uname -U
1400063

# uname -K
1400063

# pkg upgrade
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    5 MiB   4.8MB/s    00:01
Processing entries:   0%
Newer FreeBSD version for package zyre:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1400063
- running kernel: 1400051
Ignore the mismatch and continue? [y/N]:

Does anyone know why this would happen?
Where does pkg get its notion of the running kernel version?



If I'm reading the sources correctly, it's determining the OS version
by looking at the elf headers of various files in this order:

  getenv("ABI_FILE")
  /usr/bin/uname
  /bin/sh

So I would assume that `file /usr/bin/uname` shows 1400051 on your
system.


Thank you very much!  That's it:
# file /usr/bin/uname
/usr/bin/uname: ELF 32-bit LSB executable, ARM, EABI5 version 1
(FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1,
FreeBSD-style, for FreeBSD 14.0 (1400051), stripped
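
A quick way to compare binaries against `uname -U` is to pull the version number out of file(1) output like the above. The Python sketch below is only an illustration of the steps described in this thread, not pkg's implementation: the function names are made up, and the candidate order is simply the one quoted above (ABI_FILE, then /usr/bin/uname, then /bin/sh).

```python
import re


def abi_file_candidates(environ):
    """List the files to inspect, in the order described in the thread."""
    candidates = []
    if environ.get("ABI_FILE"):
        candidates.append(environ["ABI_FILE"])
    candidates += ["/usr/bin/uname", "/bin/sh"]
    return candidates


def osversion_from_file_output(text):
    """Extract the number file(1) prints in 'for FreeBSD X.Y (NNNNNNN)'."""
    match = re.search(r"for FreeBSD\s+[\d.]+\s+\((\d+)\)", text)
    return int(match.group(1)) if match else None
```

Feeding the output shown above to `osversion_from_file_output` yields 1400051, the same value pkg reported as the "running kernel" version.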


You can point it to checking another file by setting ABI_FILE[0] in the
environment or ignore the check by setting IGNORE_OSVERSION (like
advised). The "running kernel:" label seems a bit misleading.


Indeed.

Now the next thing (for me) to research is why the binaries were built
"for FreeBSD 14.0 (1400051)" when the source tree has 1400063 and uname
-U also reports 1400063.
FWIW, this was a cross-build, maybe that played a role too.


If you do a NO_CLEAN=yes build, we don't relink binaries just because
crt*.o changed (where the note is stored).



I see.  It was a NO_CLEAN build indeed.
Thanks.

--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

Re: problem with bhyve, ryzen 5800x, freebsd guest

2022-08-01 Thread Andriy Gapon


On 2022-07-10 20:28, Gleb Smirnoff wrote:

On Thu, Jul 07, 2022 at 03:29:04PM +0300, Andriy Gapon wrote:
A> I have a strange issue with running an 'appliance' image based on
A> FreeBSD 12 in bhyve on a machine with Ryzen 5800x processor.
A>
A> The problem is that the guest would run for a while and then the host
A> would suddenly reset itself.  It appears like a triple fault or
A> something with similar consequences.
A>
A> The time may be from a few dozens of minutes to many hours.
A>
A> Just to be clear, no such thing occurs if I do not run the guest.
A> Also, I have an older AMD system (pre-Zen), the problem does not happen
A> there.
A> A vanilla FreeBSD 12.3 installation that just sits idle also does not
A> cause the problem.
A>
A> Does anyone have an idea what the problem could be?
A> What workaround or diagnostics to try?
A> Anybody else seen something like this?
A>
A> Since it's the host that resets it would be hard to capture any traces.

I also run bhyve on Ryzen since late 2021 and never had such an issue.
But not FreeBSD 12, I run the head.



Thank you everyone who responded.  It seems that the problem was with 
some BIOS configuration changes, probably related to the power settings.
Once I reset everything to factory defaults (plus some minimum "safe" 
and well-understood changes) the problem went away.
It's really surprising that I saw it only with bhyve and only with the 
particular kind of VMs.  Perhaps there was a workload pattern that 
triggered a hardware bug or overloaded some specific module.


Anyways, sorry for the noise and thank you for the help.

--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

Re: pkg: Newer FreeBSD version for package... but why?

2022-07-13 Thread Andriy Gapon


On 2022-07-13 13:09, Michael Gmelin wrote:



On Wed, 13 Jul 2022 10:29:06 +0300
Andriy Gapon  wrote:


# uname -U
1400063

# uname -K
1400063

# pkg upgrade
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    5 MiB   4.8MB/s    00:01
Processing entries:   0%
Newer FreeBSD version for package zyre:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1400063
- running kernel: 1400051
Ignore the mismatch and continue? [y/N]:

Does anyone know why this would happen?
Where does pkg get its notion of the running kernel version?



If I'm reading the sources correctly, it's determining the OS version
by looking at the elf headers of various files in this order:

 getenv("ABI_FILE")
 /usr/bin/uname
 /bin/sh

So I would assume that `file /usr/bin/uname` shows 1400051 on your
system.


Thank you very much!  That's it:
# file /usr/bin/uname
/usr/bin/uname: ELF 32-bit LSB executable, ARM, EABI5 version 1 
(FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, 
FreeBSD-style, for FreeBSD 14.0 (1400051), stripped



You can point it to checking another file by setting ABI_FILE[0] in the
environment or ignore the check by setting IGNORE_OSVERSION (like
advised). The "running kernel:" label seems a bit misleading.


Indeed.

Now the next thing (for me) to research is why the binaries were built 
"for FreeBSD 14.0 (1400051)" when the source tree has 1400063 and uname 
-U also reports 1400063.

FWIW, this was a cross-build, maybe that played a role too.


--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

pkg: Newer FreeBSD version for package... but why?

2022-07-13 Thread Andriy Gapon




# uname -U
1400063

# uname -K
1400063

# pkg upgrade
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    5 MiB   4.8MB/s    00:01
Processing entries:   0%
Newer FreeBSD version for package zyre:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1400063
- running kernel: 1400051
Ignore the mismatch and continue? [y/N]:

Does anyone know why this would happen?
Where does pkg get its notion of the running kernel version?

--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

problem with bhyve, ryzen 5800x, freebsd guest

2022-07-07 Thread Andriy Gapon




I have a strange issue with running an 'appliance' image based on 
FreeBSD 12 in bhyve on a machine with Ryzen 5800x processor.


The problem is that the guest would run for a while and then the host 
would suddenly reset itself.  It appears like a triple fault or 
something with similar consequences.


The time may be from a few dozens of minutes to many hours.

Just to be clear, no such thing occurs if I do not run the guest.
Also, I have an older AMD system (pre-Zen), the problem does not happen 
there.
A vanilla FreeBSD 12.3 installation that just sits idle also does not 
cause the problem.


Does anyone have an idea what the problem could be?
What workaround or diagnostics to try?
Anybody else seen something like this?

Since it's the host that resets it would be hard to capture any traces.
Thank you.
--
Andriy Gapon


https://standforukraine.com
https://razomforukraine.org

Re: rc.d/zpool should require (rw) root?

2022-02-04 Thread Andriy Gapon


On 04/02/2022 13:23, Daniel Braniss wrote:




On 4 Feb 2022, at 12:07, Andriy Gapon  wrote:


It seems that in some cases zpool import -c requires read/write access to the zpool.cache 
file.  So, it probably makes sense to import "other" pools (non-root) after 
upgrading / to rw.
What do you think?



what if root is ro? i.e: diskless?


Then nothing changes.  rc.d/root would leave / alone.

--
Andriy Gapon

rc.d/zpool should require (rw) root?

2022-02-04 Thread Andriy Gapon




It seems that in some cases zpool import -c requires read/write access to the 
zpool.cache file.  So, it probably makes sense to import "other" pools 
(non-root) after upgrading / to rw.

What do you think?

--
Andriy Gapon

Re: POLLHUP detected on devd socket

2022-01-23 Thread Andriy Gapon


On 23/01/2022 07:40, Daniel O'Connor wrote:

It is very strange that devd dying would kill anything else running


Because most likely it's a correlation, not causation.
Many things die and among them devd.

--
Andriy Gapon

gmirror: read failed when one disk (of two) failed

2021-12-29 Thread Andriy Gapon




I have a gmirror-ed swap partition:
# swapinfo -h
Device              Size     Used    Avail Capacity
/dev/mirror/swap    8.0G     2.6G     5.4G    33%

# gmirror list
Geom name: swap
State: COMPLETE
Components: 2
Balance: prefer
Slice: 4096
Flags: NONE
GenID: 4
SyncID: 20
ID: 1722474567
Type: AUTOMATIC
Providers:
1. Name: mirror/swap
   Mediasize: 8589934080 (8.0G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e0
Consumers:
1. Name: ada0p2
   Mediasize: 8589934592 (8.0G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: (null)
   GenID: 4
   SyncID: 20
   ID: 1654937889
2. Name: ada1p2
   Mediasize: 8589934592 (8.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 655360
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: (null)
   GenID: 4
   SyncID: 20
   ID: 1289215755

Recently I had an accident where one of the disks got lost, with some 
"struggle".

I see two things in the logs that concern me.
One is mildly concerning and the other seems to be serious.

The disk was known as ada1, it was attached to ahcich5.

So, while the failing disk was struggling, I got messages like:
ahcich5: AHCI reset: device not ready after 31000ms (tfd = 0080)
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096
ahcich5: Timeout on slot 18 port 0
ahcich5: is  cs 0004 ss  rs 0004 tfd 80 serr  cmd 0020f217

(aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich5:0:0:0): CAM status: Command timeout
(aprobe0:ahcich5:0:0:0): Retrying command, 0 more tries remain
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096
ahcich5: AHCI reset: device not ready after 31000ms (tfd = 0080)
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096
ahcich5: Timeout on slot 19 port 0
ahcich5: is  cs 0008 ss  rs 0008 tfd 80 serr  cmd 0020f317

(aprobe0:ahcich5:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich5:0:0:0): CAM status: Command timeout
(aprobe0:ahcich5:0:0:0): Error 5, Retries exhausted
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1175296, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 268223, size: 4096

Ideally, it would be nice if gmirror could instruct a chosen member disk to 
"fail fast", so that a different disk could be tried ASAP in case of any 
trouble.  Only if all members cannot provide the requested data the mirror 
should switch to "try hard" mode.
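
The policy suggested above could look roughly like the following Python sketch. It is purely hypothetical and does not reflect how gmirror or GEOM work internally: `mirrored_read` and the `read(offset, length, fail_fast=...)` member interface are invented for the example.

```python
def mirrored_read(members, offset, length):
    """Try every member in fail-fast mode first, then retry in try-hard mode.

    `members` is a list of hypothetical objects whose read() raises OSError
    on failure; the last error is re-raised if no member can serve the data.
    """
    last_error = None
    for fail_fast in (True, False):
        for member in members:
            try:
                return member.read(offset, length, fail_fast=fail_fast)
            except OSError as error:
                # Remember the failure and move on to the next member.
                last_error = error
    if last_error is None:
        raise OSError("mirror has no members")
    raise last_error
```

The point of the ordering is that a struggling disk costs only one quick failure before a healthy member is tried, and the slow retries happen only when no member answered quickly.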


The more serious issue happened after the struggle was over:
ada1 at ahcich5 bus 0 scbus6 target 0 lun 0
ada1:  s/n GY3049963 detached
GEOM_MIRROR: Request failed (error=6). ada1p2[READ(offset=1098641408, length=4096)]
pass4 at ahcich5 bus 0 scbus6 target 0 lun 0
pass4:  s/n GY3049963 detached
GEOM_MIRROR: Device swap: provider ada1p2 disconnected.
swap_pager: I/O error - pagein failed; blkno 1175296,size 4096, error 6
vm_fault: pager read error, pid 1310 (devd)
pid 1310 (devd), jid 0, uid 0: exited on signal 10 (core dumped)

It looks like the mirror did not do its job properly?
It got ENXIO from one disk and it passed that up to a consumer (swap).

I suspect that there was a race between an I/O request getting ENXIO from a 
failed disk and an orphan event from the same disk.  With only one member left, 
the mirror may not realize that the ENXIO came from the detached disk, so it may 
think that there is no device left to retry the request.


--
Andriy Gapon

schedgraph.d experience, per-CPU buffers, pipes

2021-12-24 Thread Andriy Gapon




I would like to share some experience or maybe rather a warning about using 
DTrace for tracing scheduling events.  Unlike KTR which has a global circular 
buffer, DTrace with bufpolicy=ring uses per-CPU circular buffers.  So, if there 
is an asymmetry in processor load, the buffers will fill and wrap-around at 
different speeds.  In the end, they might have approximately equal numbers of 
events but those may cover very different time intervals.  So, some additional 
post-processing is required to find the latest event among first ones of each 
per-CPU buffer.  Any traces from before that would have information gaps 
("missing" processors) and would be very confusing.
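
The post-processing step described above can be sketched in Python as follows. The data layout is an assumption made for the example (a mapping from CPU id to a list of `(timestamp, payload)` tuples, oldest first), not the format that schedgraph.d actually emits, and `trim_to_common_window` is a made-up name.

```python
def trim_to_common_window(per_cpu_events):
    """Keep only events from the interval that every CPU buffer covers.

    The cutoff is the latest timestamp among the first (oldest) events of
    the per-CPU buffers; anything older is missing data from some CPU.
    CPUs with empty buffers are ignored in this sketch.
    """
    non_empty = {cpu: ev for cpu, ev in per_cpu_events.items() if ev}
    if not non_empty:
        return []
    cutoff = max(events[0][0] for events in non_empty.values())
    merged = [
        (timestamp, cpu, payload)
        for cpu, events in non_empty.items()
        for timestamp, payload in events
        if timestamp >= cutoff
    ]
    # Order by time, using the CPU id to break ties deterministically.
    merged.sort(key=lambda item: (item[0], item[1]))
    return merged
```

For example, if CPU 0's buffer starts at timestamp 10 and CPU 1's starts at 70, everything before 70 is dropped because CPU 1's earlier events have already been overwritten.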


Also, I noticed that processes passing a lot of data through pipes produce a lot 
of scheduling events as they seem to get blocked and unblocked every few 
microseconds (on a modern performant system with the default pipe sizing 
configuration).  That contributes to a quick wrap-around of circular buffers.


--
Andriy Gapon

observations on Ryzen 5xxx (Zen 3) processors

2021-12-22 Thread Andriy Gapon

There have been some reports on strange / unexpected things with Ryzen 5xxx
processors. I think I have seen 5950X, 5900X and 5800X mentioned, not sure
about others.

Since I have 5800X myself I looked into a couple of issues that have
straightforward demonstrators. I would like to share my findings and
observations on those issues.

Issue 1. High wake-up latency for CPU idle states.

This seems to be related to the so called CC6 idle state.
The official information on it is very sparse.
The state is not explicitly exposed to the OS, at least not through the ACPI
interfaces that FreeBSD currently supports.

In my tests I see that if all logical processors enter an idle state then an
external interrupt can be delayed by 500+ us. Specifically, I observed this
with an MSI-X interrupt from a discrete network chip. Interrupts from internal
components seem to be affected as well, but to a lesser degree.

The deep state in question can be entered regardless of whether C2 (via I/O) is
enabled, C1 (via hlt) is sufficient. In fact, with machdep.idle=hlt it works
the same.

The state is not entered if at least one logical CPU is not idle.
The state is not entered if machdep.idle=mwait is used. Apparently, the
processors do not attempt to automatically enter as deep idle modes with mwait
as they do with hlt.
Finally, the state is not entered if the zenstates.py utility is used to disable
the C6 / CC6 state via a publicly undocumented MSR.

For me personally that state does not cause any annoyances but anyone who
experiences problems related to "stuttering", "jitter", latency might want to
look into this.

Issue 2. Uneven performance of CPU intensive tasks, especially with SCHED_ULE,
when SMT is enabled.

I found out that at least on my hardware all even numbered logical CPUs can
perform much better than odd numbered logical CPUs. It seems that hardware
threads within a core are not equal. Maybe this is related to the ability to use
boosted frequencies, but maybe something else, I am not sure.
From a brief look at the ULE code it looks like the selection of a hw thread
within a core is intentionally random when all other things are equal.
I suspect that the hardware + firmware may actually describe that performance
disparity via ACPI CPPC (_CPC object, etc), but right now we do not support
querying that or making use of it.

It would be interesting to see if other owners of similar processors can confirm
or provide counter-examples to my observations.

Simple tests for issue 1:
- ping a host attached to the same switch (so, with very low expected latency)
- ping 127.0.0.1

For issue 2: take some CPU intensive single-threaded task and bind it (with
cpuset -l) to different logical CPUs. Multiple such tasks can be run
concurrently on different logical CPUs.
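
The issue 2 test can be scripted. Below is a minimal Python sketch, assuming a
FreeBSD host where cpuset(1) is in the PATH; the workload string and the helper
names (cpuset_command, time_on_cpu, compare) are made up for illustration and
are not part of any existing tool.

```python
import subprocess
import sys
import time

# Hypothetical CPU-bound workload: a fixed amount of integer work
# executed in a child Python interpreter.
BURN = "s = 0\nfor i in range(30_000_000):\n    s += i * i\n"


def cpuset_command(cpu, argv):
    """Build a command line that runs argv bound to one logical CPU (cpuset -l)."""
    return ["cpuset", "-l", str(cpu)] + list(argv)


def time_on_cpu(cpu):
    """Run the workload pinned to the given logical CPU; return wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(cpuset_command(cpu, [sys.executable, "-c", BURN]), check=True)
    return time.monotonic() - start


def compare(cpus):
    """Time the same workload on each listed logical CPU, one at a time."""
    return {cpu: time_on_cpu(cpu) for cpu in cpus}
```

For example, compare([0, 1]) would time an even and an odd numbered logical
CPU back to back. Whether CPUs 0 and 1 are threads of the same core is an
assumption that should be checked against the actual topology.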

References:
- https://forums.freebsd.org/threads/variable-ping-latency-on-ryzen-setup.82791/
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256594
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254040
- https://github.com/r4m0n/ZenStates-Linux
- https://github.com/meowthink/ZenStates-FreeBSD -- has a bug
- https://github.com/avg-I/ZenStates-FreeBSD -- has a fix
- https://www.kernel.org/doc/html/latest/admin-guide/acpi/cppc_sysfs.html
- https://static.linaro.org/connect/lvc21/presentations/lvc21-219.pdf
- https://uefi.org/specs/ACPI/6.4/14_Platform_Communications_Channel/Platform_Comm_Channel.html

--
Andriy Gapon

Re: CURRENT: ZFS freezes system beyond reboot

2021-12-15 Thread Andriy Gapon


On 15/12/2021 19:55, FreeBSD User wrote:

It is spooky, if not to say "buggy", if ZFS is capable of freezing the whole 
box even if
the essential operating system stuff is isolated on a dedicated UFS filesystem.


I do not think that this is the case.
Commands that do not access anything on ZFS or anything related to ZFS should be 
unaffected.


--
Andriy Gapon

Re: CURRENT: ZFS freezes system beyond reboot

2021-12-12 Thread Andriy Gapon


On 12/12/2021 18:45, Alan Somers wrote:

You need to look at what's causing those errors.  What kind of disks
are you using, with what HBA?  It's not surprising that any access to
ZFS hangs; that's what it's designed to do when a pool is suspended.


However, a pool does not have to be suspended on errors.
failmode property provides a couple of alternatives:
 wait  Blocks all I/O access until the device connectivity is
   recovered and the errors are cleared.  This is the
   default behavior.

 continue  Returns EIO to any new write I/O requests but allows
   reads to any of the remaining healthy devices.  Any
   write requests that have yet to be committed to disk
   would be blocked.

 panic Prints out a message to the console and generates a
   system crash dump.

But neither does any magic.
The errors will still be there.

--
Andriy Gapon

Re: 14-current: unable to boot after upgrade (installworld)

2021-12-09 Thread Andriy Gapon


On 09/12/2021 15:36, Sergey V. Dyatko wrote:

Hi,

Yesterday I tried to upgrade old 13-current (svn rev r368473) to fresh
14-current from git; it looked like this:
1) git pull https://git.freebsd.org/src.git /usr/src
2) cd /usr/src ; make buildworld; make kernel
3) shutdown -r now
after that I _successfully_ booted into 14-current and continued with
etcupdate -p
make installworld
etcupdate -B
shutdown -r now

but after that server doesn't come back. After I connected to this server via
IPMI ip-kvm I saw following (sorry for external link):
https://i.imgur.com/jH6MHd2.png

Well. There was a migration to zol between r368473 and current 'main' branch so
I decided to install fresh 14-current from snapshot
FreeBSD-14.0-CURRENT-amd64-20211202-610d908f8a6-251253 in order to avoid
possible problems

and again, after make kernel and reboot OS runs, but after installworld I ended 
up in the same situation

thoughts ?


Try to update boot blocks.  Not sure what you use, like gptzfsboot, etc.


--
Andriy Gapon

Re: ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

2021-11-30 Thread Andriy Gapon


On 27/11/2021 10:42, Andriy Gapon wrote:

On 26/11/2021 21:48, Mark Johnston wrote:

On Fri, Nov 26, 2021 at 02:00:27PM -0500, Mark Johnston wrote:

Thanks, I can reproduce it now.

Our libdwarf is complaining that the first compilation unit header in
.debug_info contains an unsupported DWARF version number (libdwarf only
supports 2, 3 and 4).  In files compiled by clang it ends up being zero.
For instance, compiling bin/cat and dumping the .debug_info section:

gcc10:
  c125 0400 0801 
    ^ DWARF version
clang:
   0100  4e23 

llvm-dwarfdump and binutils readelf are somehow still able to find a
valid-looking unit header, but I haven't yet been able to figure out how
they do that from reading the DWARF 4/5 specs or the LLVM sources.
Ah, we recently started configuring clang to compress debug sections by 
default, and our libdwarf doesn't know how to handle that. As an interim 
workaround this could simply be disabled when WITH_CTF is configured:


Oh wow, you were very fast at figuring this out.
Thank you very much!

I'll give the build change a whirl first and then test D33139 a bit later.


Tested both (individually) and both do the job just as expected.

--
Andriy Gapon

Re: ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

2021-11-27 Thread Andriy Gapon


On 26/11/2021 21:48, Mark Johnston wrote:

On Fri, Nov 26, 2021 at 02:00:27PM -0500, Mark Johnston wrote:

Thanks, I can reproduce it now.

Our libdwarf is complaining that the first compilation unit header in
.debug_info contains an unsupported DWARF version number (libdwarf only
supports 2, 3 and 4).  In files compiled by clang it ends up being zero.
For instance, compiling bin/cat and dumping the .debug_info section:

gcc10:
  c125 0400 0801 
^ DWARF version
clang:
   0100  4e23 

llvm-dwarfdump and binutils readelf are somehow still able to find a
valid-looking unit header, but I haven't yet been able to figure out how
they do that from reading the DWARF 4/5 specs or the LLVM sources.
Ah, we recently started configuring clang to compress debug sections by 
default, and our libdwarf doesn't know how to handle that. As an interim 
workaround this could simply be disabled when WITH_CTF is configured:


Oh wow, you were very fast at figuring this out.
Thank you very much!

I'll give the build change a whirl first and then test D33139 a bit later.

--
Andriy Gapon

Re: ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

2021-11-26 Thread Andriy Gapon


On 26/11/2021 18:06, Mark Johnston wrote:

On Thu, Nov 25, 2021 at 10:48:36PM +0200, Andriy Gapon wrote:


I've just finished builds of yesterday's CURRENT / main for arm and arm64.
In both builds I got lots of messages from ctfconvert:
ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

I got the impression that there was a message for each object file, that's how
many of them there were.

I don't recall seeing those messages before.

Should I be concerned?
Maybe I am doing something wrong or have an unusual configuration?
Any way to fix the issue?

Thanks!

P.S.
The builds were done on stable/13, so maybe there is an issue with host tools
not being able to grok something new.


I haven't seen this before, for what it's worth.  I presume this is from
a kernel build?  Does the configuration enable generation of debug info
with, e.g., "makeoptions DEBUG=-g"?


This is actually from buildworld.
buildkernel is silent.

I have WITH_CTF=yes in src.conf for both arm and arm64.

For completeness, here are all other options in the src.conf (many of them are 
probably obsolete):


WITHOUT_ACCT=yes
WITHOUT_ACPI=yes
WITHOUT_AMD=yes
WITHOUT_APM=yes
WITHOUT_ATM=yes
WITHOUT_BLACKLIST=yes
WITHOUT_BLACKLIST_SUPPORT=yes
WITHOUT_BLUETOOTH=yes
WITHOUT_BOOTPARAMD=yes
WITHOUT_BOOTPD=yes
WITHOUT_CCD=yes
WITHOUT_CUSE=yes
WITHOUT_CXGBETOOL=yes
WITHOUT_EXAMPLES=yes
WITHOUT_FINGER=yes
WITHOUT_FLOPPY=yes
WITHOUT_GOOGLETEST=yes
WITHOUT_HAST=yes
WITHOUT_HTML=yes
WITHOUT_HYPERV=yes
WITHOUT_IPFILTER=yes
WITHOUT_KERBEROS=yes
WITHOUT_KERBEROS_SUPPORT=yes
WITHOUT_LOADER_GELI=yes
WITHOUT_LPR=yes
WITHOUT_MLX5TOOL=yes
WITHOUT_NDIS=yes
WITHOUT_PROFILE=yes
WITHOUT_RBOOTD=yes
WITHOUT_ROUTED=yes
WITHOUT_SHAREDOCS=yes
WITHOUT_TALK=yes
WITHOUT_TESTS=yes
WITHOUT_TESTS_SUPPORT=yes
WITHOUT_USB_GADGET_EXAMPLES=yes
WITHOUT_ZONEINFO=yes # comes from the misc/zoneinfo port


--
Andriy Gapon

ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

2021-11-25 Thread Andriy Gapon




I've just finished builds of yesterday's CURRENT / main for arm and arm64.
In both builds I got lots of messages from ctfconvert:
  ctfconvert: rc = 1 Unsupported version [_dwarf_info_load(229)]

I got the impression that there was a message for each object file, that's how 
many of them there were.


I don't recall seeing those messages before.

Should I be concerned?
Maybe I am doing something wrong or have an unusual configuration?
Any way to fix the issue?

Thanks!

P.S.
The builds were done on stable/13, so maybe there is an issue with host tools 
not being able to grok something new.


--
Andriy Gapon

Re: thread on sleepqueue does not wake up after timeout

2021-11-10 Thread Andriy Gapon


On 10/11/2021 11:30, Andriy Gapon wrote:

On 09/11/2021 17:56, Andriy Gapon wrote:
So, as I was saying, when the delta is large the calculations in tc_windup and 
bintime_off give slightly different results and that can lead to a 
discontinuity of the time when timehands are switched.


A quick follow-up.
I think that both tc_windup and bintime_off have fundamentally correct 
calculations but with different precision.  Both seem to produce values slightly 
greater than a "true" value where the bintime fractional delta would be 
calculated as tc_delta * 2^64 / tc_frequency.  That's because of how th_scale is 
calculated.


When the timecounter delta is greater than the frequency then the value in 
tc_windup is closer to the true value because it accounts for whole seconds 
precisely: a tc_frequency number of timecounter ticks is equal to one second. 
bintime_off, however, converts both whole seconds and fractions using th_scale. 
  So, its result is consistently greater when the delta is longer than a second.


E.g., in my environment: tc_frequency = 14318180, th_scale = 1288420532460.
For a delta of 14318180 (== tc_frequency) tc_windup calculates a one second 
advance, bt = { 1, 0 }.

bintime_off for the same delta will produce bt = { 1, 1093027638570944 }.
The difference is minuscule, just 59 ppm in relative terms.
But it's 59 microseconds of "jumping back in time".

I think that the precision of bintime_off is sufficient and its calculations are 
faster, so I think that it's better to use the same calculations in tc_windup as 
well.  Especially given that they are identical for sub-second deltas and longer 
deltas should be extremely rare.


I am working on a patch to implement this.


The promised patch: https://people.freebsd.org/~avg/kern-tc-add-delta.diff

--
Andriy Gapon

Re: thread on sleepqueue does not wake up after timeout

2021-11-10 Thread Andriy Gapon


On 09/11/2021 17:56, Andriy Gapon wrote:
So, as I was saying, when the delta is large the calculations in tc_windup and 
bintime_off give slightly different results and that can lead to a discontinuity 
of the time when timehands are switched.


A quick follow-up.
I think that both tc_windup and bintime_off have fundamentally correct 
calculations but with different precision.  Both seem to produce values slightly 
greater than a "true" value where the bintime fractional delta would be 
calculated as tc_delta * 2^64 / tc_frequency.  That's because of how th_scale is 
calculated.


When the timecounter delta is greater than the frequency then the value in 
tc_windup is closer to the true value because it accounts for whole seconds 
precisely: a tc_frequency number of timecounter ticks is equal to one second. 
bintime_off, however, converts both whole seconds and fractions using th_scale. 
 So, its result is consistently greater when the delta is longer than a second.


E.g., in my environment: tc_frequency = 14318180, th_scale = 1288420532460.
For a delta of 14318180 (== tc_frequency) tc_windup calculates a one second 
advance, bt = { 1, 0 }.

bintime_off for the same delta will produce bt = { 1, 1093027638570944 }.
The difference is minuscule, just 59 ppm in relative terms.
But it's 59 microseconds of "jumping back in time".
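
The one-second example can be checked with a few lines of integer arithmetic.
This is a minimal Python sketch, not kernel code; the function names are made
up for illustration. It treats scale * delta as a 64.64 fixed-point value, the
way the sub-second path of bintime_off does. Note that the exact fraction
quoted above is reproduced with a scale of 1288420532592 (the ths[0] value in
the kgdb dump elsewhere in this thread), while 1288420532460 gives a slightly
different fraction; why the two differ here is not something I can tell from
the message. Both come out at about 59 ppm.

```python
TC_FREQUENCY = 14318180
MASK64 = (1 << 64) - 1


def scaled_advance(scale, delta):
    """Return (sec, frac) for scale * delta read as 64.64 fixed point."""
    total = scale * delta
    return total >> 64, total & MASK64


def excess_ppm(scale, delta, frequency):
    """Excess of scale * delta over the exact delta / frequency, in ppm."""
    exact = delta / frequency            # true elapsed seconds
    sec, frac = scaled_advance(scale, delta)
    approx = sec + frac / 2**64          # seconds as computed via the scale
    return (approx - exact) / exact * 1e6


# One full second worth of timecounter ticks, for both scale values
# that appear in this thread.
for scale in (1288420532460, 1288420532592):
    sec, frac = scaled_advance(scale, TC_FREQUENCY)
    ppm = excess_ppm(scale, TC_FREQUENCY, TC_FREQUENCY)
    print(scale, (sec, frac), round(ppm, 2))
```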

I think that the precision of bintime_off is sufficient and its calculations are 
faster, so I think that it's better to use the same calculations in tc_windup as 
well.  Especially given that they are identical for sub-second deltas and longer 
deltas should be extremely rare.


I am working on a patch to implement this.

--
Andriy Gapon

Re: thread on sleepqueue does not wake up after timeout

2021-11-09 Thread Andriy Gapon


On Tue, Nov 09, 2021 at 11:58:30AM +0200, Andriy Gapon wrote:

Here is an explanation for the numbers reported in the panic message (sorted
from earliest to latest):
190543869603412 - 'now' as seen in sleepq_timeout(), returned by sbinuptime();
190543869738008 - td_sleeptimo, also c_time in the callout;
190543869798505 - 'now' as captured when the LAPIC timer fired, seen in the
stack trace and also recorded as c_exec_time in the callout.


Kostik,

thank you very much for the pointers!

I spent some more time staring at the code and at the timehands data from the 
dump (which I neglected to share earlier).  I am starting to think that there is 
a bug in the FreeBSD code (at least in the copy that we use) unless I got 
confused somewhere or made a mistake in calculations.


So, I think that there is a discrepancy between how "large deltas" are handled 
in tc_windup and bintime_off.
As large deltas happen very rarely, especially on good hardware, the bug should 
be very rare as well.


Now to the data.

(kgdb) p *timehands->th_counter
$28 = {tc_get_timecount = 0x809de380 , tc_poll_pps = 
0x0, tc_counter_mask = 4294967295, tc_frequency = 14318180, tc_name = 
0x80b0ff97 "HPET", tc_quality = 950, tc_flags = 0,
  tc_priv = 0xfe0010916000, tc_next = 0x810f6c30 , 
tc_fill_vdso_timehands = 0x809dc6b0 , 
tc_fill_vdso_timehands32 = 0x0}


(kgdb) p timehands_count
$76 = 2
(kgdb) p timehands
$75 = (struct timehands * volatile) 0x8109e1a0 
(kgdb) p &ths[0]
$77 = (struct timehands *) 0x8109e120 
(kgdb) p &ths[1]
$78 = (struct timehands *) 0x8109e1a0 

(kgdb) p ths[0]
$79 = {th_counter = 0xfe0010916060, th_adjustment = 254493021346896, 
th_scale = 1288420532592, th_large_delta = 14317331, th_offset_count = 
3817197766, th_offset = {sec = 44363, frac = 7084573033620442688}, th_bintime = {
sec = 1636195324, frac = 14622574300909856022}, th_microtime = {tv_sec = 
1636195324, tv_usec = 792691}, th_nanotime = {tv_sec = 1636195324, tv_nsec = 
792691341}, th_boottime = {sec = 1636150961, frac = 7538001267289413334},

  th_generation = 2204358, th_next = 0x8109e1a0 }

(kgdb) p ths[1]
$80 = {th_counter = 0xfe0010916060, th_adjustment = 254492583661022, 
th_scale = 1288420532460, th_large_delta = 14317331, th_offset_count = 
3832485779, th_offset = {sec = 44364, frac = 8334125784005739824}, th_bintime = {
sec = 1636195325, frac = 15872127051295153158}, th_microtime = {tv_sec = 
1636195325, tv_usec = 860429}, th_nanotime = {tv_sec = 1636195325, tv_nsec = 
860429731}, th_boottime = {sec = 1636150961, frac = 7538001267289413334},

  th_generation = 2204358, th_next = 0x8109e120 }



th_offset_count difference between the hands is 15288013.
It's a bit above tc_frequency of 14318180, so before the latest wind-up there 
hasn't been a wind-up for more than a second, a rare situation indeed.

The difference is also greater than th_large_delta of 14317331.

I redid the th_offset calculations in tc_windup by hand and arrived at exactly 
the same value of ths[1].th_offset as seen in kgdb using ths[0].th_offset and 
the delta as inputs.  So, this is consistent.


Then I did a thought experiment: what would binuptime() return at exactly the 
same moment when tc_windup was called?  That binuptime() would still use ths[0] 
as the timehands because the hands have not been switched yet and it would also 
see exactly the same timecounter delta.


So, starting conditions:
delta = 15288013
th_large_delta = 14317331
th_offset = {sec = 44363, frac = 7084573033620442688}
th_scale = 1288420532592

The calculations in the code (bintime_off) are:
if (__predict_false(delta >= large_delta)) {
        /* Avoid overflow for scale * delta. */
        x = (scale >> 32) * delta;
        bt->sec += x >> 32;
        bintime_addx(bt, x << 32);
        bintime_addx(bt, (scale & 0xffffffff) * delta);
} else {


My manual calculations:
x = (1288420532592 >> 32) * 15288013 == 4571115887
bt->sec += 4571115887 >> 32 == 1
  bt = { 44364, 7084573033620442688 }
bt += 4571115887 << 32 == 1186049167181479936 (after truncation to 64 bits)
  bt = { 44364, 8270622200801922624 }
bt += (scale & 0xffffffff) * delta == 4225311088 * 15288013 == 64596610842388144
  bt = { 44364, 8335218811644310768 }

So, compared to ths[1].th_offset the resulting time has a larger 'frac' part:
8335218811644310768 - 8334125784005739824 == 1093027638570944

So, IMO, this means that at the moment of the hands switch the binuptime (and 
all other times) would jump backwards.


Converting both times to sbintime_t I got:
190543869814104 is sbinuptime using ths[0] (and delta of 15288013)
190543869559614 is sbinuptime using ths[1] (and delta of 0)
This is a jump backwards by 254490 parts.
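
The manual calculation above can be re-checked mechanically. The following is
a minimal Python sketch, not the kernel code: it mimics the large-delta branch
of bintime_off with explicit 64-bit wrap-around and uses the ths[0] and ths[1]
values from the dump. The helper names (bintime_off_large, to_sbintime) are
made up for illustration, and the sbintime_t conversion is assumed to be the
usual 32.32 fixed-point one.

```python
MASK64 = (1 << 64) - 1


def bintime_addx(sec, frac, x):
    """Add a 64-bit fraction x to (sec, frac), carrying into sec on wrap-around."""
    total = frac + (x & MASK64)
    return sec + (total >> 64), total & MASK64


def bintime_off_large(sec, frac, scale, delta):
    """Mimic the large-delta branch of bintime_off quoted in the message."""
    x = (scale >> 32) * delta
    sec += x >> 32
    # x << 32 is truncated to 64 bits, as it would be for a uint64_t in C.
    sec, frac = bintime_addx(sec, frac, (x << 32) & MASK64)
    # Low 32 bits of the scale times delta.
    sec, frac = bintime_addx(sec, frac, (scale & 0xFFFFFFFF) * delta)
    return sec, frac


def to_sbintime(sec, frac):
    """Convert a (sec, frac) bintime to 32.32 fixed point (assumed sbintime_t layout)."""
    return (sec << 32) + (frac >> 32)


# Inputs taken from ths[0]; delta is the th_offset_count difference between the hands.
old_offset = (44363, 7084573033620442688)
scale = 1288420532592
delta = 3832485779 - 3817197766

# What binuptime() would compute from the old hand at the moment of the wind-up,
# versus the offset stored in the new hand (ths[1]).
via_old_hand = bintime_off_large(*old_offset, scale, delta)
new_offset = (44364, 8334125784005739824)

frac_jump = via_old_hand[1] - new_offset[1]
sbt_jump = to_sbintime(*via_old_hand) - to_sbintime(*new_offset)
print(delta, via_old_hand, frac_jump, sbt_jump)
```

The printed values can be compared with the figures in the message: the delta,
the resulting bintime, the difference in 'frac' and the backwards jump in
sbintime units.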

If I put these times together with the times found in the crash dump stack 
(quoted at the start), then I get:


190543869559614 - hand

Re: thread on sleepqueue does not wake up after timeout

2021-11-09 Thread Andriy Gapon

 now > td_sleeptimo there.
But sleepq_timeout() thought that it was premature as sbinuptime() < 
td_sleeptimo there.


In the above case the eventtimer is obviously LAPIC, the timecounter is HPET.
But we also saw the same issue when we changed the timecounter to ACPI-fast.
I think that we haven't tried any other choices because:
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.TSC-low.quality: -100

I see three possibilities of why sbinuptime() could go backwards:
- a bug in FreeBSD time keeping code that gets triggered only under very 
specific conditions that don't normally happen

- a bug in VMWare, e.g. in HPET and ACPI timer emulation
- something exotic, like correct C code miscompiled to incorrect machine code, a 
weird memory model violation, etc.


I'd appreciate any suggestions on additional diagnostics to narrow down / rule 
out the possibilities.


P.S.
As a workaround I could modify sleepq_timeout() to get "current time" from 
c_exec_time (added by us) instead of sbinuptime().  c_exec_time is the value of 
'now' in callout_process() when it decides that the callout should fire.

But I'd like to get to the bottom of the issue.

--
Andriy Gapon

NO_ROOT+DESTDIR: Permission denied?

2021-10-01 Thread Andriy Gapon




When installing world with NO_ROOT and DESTDIR set I see a handful of permission 
denied errors like below:


--
>>> Installing everything started on Fri Oct  1 13:09:32 EEST 2021
--
make[3]: 
"/usr/obj/apu2c4/usr/devel/git/apu2c4/amd64.amd64/toolchain-metadata.mk" line 1: 
Using cached toolchain metadata from build at trant on Mon 27 Sep 2021 16:29:14 EEST

===> lib (install)
===> bin (install)
===> cddl (install)
===> libexec (install)
===> gnu (install)
make[4] warning: /bin: Permission denied.
===> include (install)
make[4] warning: /lib: Permission denied.
make[4] warning: /libexec: Permission denied.

installworld works despite those and the result looks fine, but the messages 
are a little bit concerning.


Perhaps, DESTDIR is not honored in some place?

The above is for installing CURRENT on a stable/13 host.
--
Andriy Gapon

Re: latest current fails to boot.

2021-09-25 Thread Andriy Gapon


On 25/09/2021 19:10, Johan Hendriks wrote:
For me, I had kern.sched.steal_thresh=1 in my sysctl as I use this machine mainly 
for tests and so on.
By removing this sysctl the system boots again. I already used the latest 
snapshot and that booted fine.


Might have something to do with
https://cgit.FreeBSD.org/src/commit/?id=bd84094a51c4648a7c97ececdaccfb30bc832096

--
Andriy Gapon

Re: installworld with NO_ROOT produces paths with .. for man pages

2021-09-23 Thread Andriy Gapon


On 28/08/2021 17:28, Andriy Gapon wrote:


This seems to be related to the recent change to install manual pages for all 
platforms.


My method of creating a cross-platform installation image is to install with 
NO_ROOT and then to tar up with @METALOG argument.
On the destination I simply untar the archive into a destination directory 
(typically a fresh ZFS BE).


Today I noticed some complaints when extracting the archive, here are a few:
./usr/share/man/man4/i386/../smapi.4.gz: Path contains '..'
./usr/share/man/man4/i386/../vpd.4.gz: Path contains '..'
./usr/share/man/man4/powerpc/../adb.4.gz: Path contains '..'
./usr/share/man/man4/powerpc/../akbd.4.gz: Path contains '..'

This is not a big deal, but it would be nice to "straighten" the installation 
paths when installing such manual pages.


P.S.
NO_ROOT does not seem to be documented outside of the source code.



I think that it would be nice to fix that .. issue.
Any suggestions?

--
Andriy Gapon

installworld with NO_ROOT produces paths with .. for man pages

2021-08-28 Thread Andriy Gapon




This seems to be related to the recent change to install manual pages for all 
platforms.


My method of creating a cross-platform installation image is to install with 
NO_ROOT and then to tar up with @METALOG argument.
On the destination I simply untar the archive into a destination directory 
(typically a fresh ZFS BE).


Today I noticed some complaints when extracting the archive, here are a few:
./usr/share/man/man4/i386/../smapi.4.gz: Path contains '..'
./usr/share/man/man4/i386/../vpd.4.gz: Path contains '..'
./usr/share/man/man4/powerpc/../adb.4.gz: Path contains '..'
./usr/share/man/man4/powerpc/../akbd.4.gz: Path contains '..'

This is not a big deal, but it would be nice to "straighten" the installation 
paths when installing such manual pages.


P.S.
NO_ROOT does not seem to be documented outside of the source code.

--
Andriy Gapon

Re: drm-kmod kernel crash fatal trap 12

2021-06-11 Thread Andriy Gapon


On 10/06/2021 18:13, Bakul Shah wrote:

On Jun 10, 2021, at 7:13 AM, Thomas Laus  wrote:

The drm-kmod module is the latest from the pkg server.  It all
worked this past Monday after the recent drm-kmod update.


This is what I did:

git clone https://github.com/freebsd/drm-kmod
ln -s $PWD/drm-kmod /usr/local/sys/modules

Now it gets compiled every time you do make buildkernel.
If things break you can do a git pull in the drm-kmod dir
and rebuild.


I did approximately the same, but instead of a symlink I use LOCAL_MODULES_DIR 
and LOCAL_MODULES.


--
Andriy Gapon

Re: ZFS rename with associated snapshot present: odd error message

2021-05-05 Thread Andriy Gapon


On 05/05/2021 01:59, Mark Millard via freebsd-current wrote:

I had a:

# zfs list -tall
NAME                                                  USED  AVAIL  REFER  MOUNTPOINT
. . .
zroot/DESTDIRs/13_0R-CA72-instwrld-norm              1.44G   117G    96K  /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm
zroot/DESTDIRs/13_0R-CA72-instwrld-norm@dirty-style  1.44G      -  1.44G  -
. . .

(copied/pasted from somewhat earlier) and then attempted:

# zfs rename zroot/DESTDIRs/13_0R-CA72-instwrld-norm 
zroot/DESTDIRs/13_0R-CA72-instwrld-alt-0
cannot open 'zroot/DESTDIRs/13_0R-CA72-instwrld-norm@dirty-style': snapshot 
delimiter '@' is not expected here

Despite the "cannot open" message, the result looks like:

# zfs list -tall
NAME                                                   USED  AVAIL  REFER  MOUNTPOINT
. . .
zroot/DESTDIRs/13_0R-CA72-instwrld-alt-0              1.44G   114G    96K  /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt-0
zroot/DESTDIRs/13_0R-CA72-instwrld-alt-0@dirty-style  1.44G      -  1.44G  -
. . .

Still, it leaves me wondering if everything is okay
given that internal attempt to use the old name with
@dirty-style when it was apparently no longer
available under that naming.

For reference:

# uname -apKU
FreeBSD CA72_4c8G_ZFS 13.0-RELEASE FreeBSD 13.0-RELEASE #0 
releng/13.0-n244733-ea31abc261ff-dirty: Thu Apr 29 21:53:20 PDT 2021 
root@CA72_4c8G_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1300139 1300139


Cannot reproduce here (but with much simpler names and on stable/13):
zfs create testz/test
zfs snapshot testz/test@snap1
zfs rename testz/test testz/test2

All worked.

--
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: stable/13, vm page counts do not add up

2021-04-14 Thread Andriy Gapon


On 14/04/2021 16:32, Mark Johnston wrote:

On Wed, Apr 14, 2021 at 02:21:44PM +0300, Andriy Gapon wrote:

On 14/04/2021 00:18, Mark Johnston wrote:

fbt::vm_page_unwire:entry
/args[0]->oflags & 0x4/
{
@unwire[stack()] = count();
}


Unrelated report, dtrace complains about this probe on my stable/13 system:
  failed to resolve translated type for args[0]

And I do not have any idea why...


There was a regression, see PR 253440.  I think you have the fix
already, but perhaps not.  Could you show output from
"dtrace -lv -n fbt::vm_page_unwire:entry"?


dtrace -lv -n fbt::vm_page_unwire:entry
   ID   PROVIDER            MODULE                          FUNCTION NAME
54323        fbt            kernel                    vm_page_unwire entry

Probe Description Attributes
Identifier Names: Private
Data Semantics:   Private
Dependency Class: Unknown

Argument Attributes
Identifier Names: Private
Data Semantics:   Private
Dependency Class: ISA

Argument Types
args[0]: (unknown)
args[1]: (unknown)

It seems that I should have the fix, but somehow I still have the problem.
I've been doing NO_CLEAN builds for a long while, so maybe some stale file 
didn't get re-created...


It looks like dt_lex.c under /usr/obj is rather dated.

... I've removed that file and rebuilt libdtrace and everything is okay now.
Thank you.


  From ctfdump:
[27290] FUNC (vm_page_unwire) returns: 38 args: (1463, 3)

<1463> TYPEDEF vm_page_t refers to 778
<778> POINTER (anon) refers to 3575
<3575> STRUCT vm_page (104 bytes)
  plinks type=3563 off=0
  listq type=3558 off=128
  object type=3564 off=256
  pindex type=3565 off=320
  phys_addr type=42 off=384
  md type=3571 off=448
  ref_count type=31 off=640
  busy_lock type=31 off=672
  a type=3573 off=704
  order type=3 off=736
  pool type=3 off=744
  flags type=3 off=752
  oflags type=3 off=760
  psind type=2167 off=768
  segind type=2167 off=776
  valid type=3574 off=784
  dirty type=3574 off=792

--
Andriy Gapon



--
Andriy Gapon

Re: stable/13, vm page counts do not add up

2021-04-14 Thread Andriy Gapon


On 14/04/2021 00:18, Mark Johnston wrote:

fbt::vm_page_unwire:entry
/args[0]->oflags & 0x4/
{
@unwire[stack()] = count();
}


Unrelated report, dtrace complains about this probe on my stable/13 system:
failed to resolve translated type for args[0]

And I do not have any idea why...

From ctfdump:
  [27290] FUNC (vm_page_unwire) returns: 38 args: (1463, 3)

  <1463> TYPEDEF vm_page_t refers to 778
  <778> POINTER (anon) refers to 3575
  <3575> STRUCT vm_page (104 bytes)
plinks type=3563 off=0
listq type=3558 off=128
object type=3564 off=256
pindex type=3565 off=320
phys_addr type=42 off=384
md type=3571 off=448
ref_count type=31 off=640
busy_lock type=31 off=672
a type=3573 off=704
order type=3 off=736
pool type=3 off=744
flags type=3 off=752
oflags type=3 off=760
psind type=2167 off=768
segind type=2167 off=776
valid type=3574 off=784
dirty type=3574 off=792

--
Andriy Gapon

Re: stable/13, vm page counts do not add up

2021-04-13 Thread Andriy Gapon

On 07/04/2021 23:56, Mark Johnston wrote:
> I don't know what might be causing it then.  It could be a page leak.
> The kernel allocates wired pages without adjusting the v_wire_count
> counter in some cases, but the ones I know about happen at boot and
> should not account for such a large disparity.  I do not see it on a few
> systems that I have access to.

Mark or anyone,

do you have a suggestion on how to approach hunting for the potential page leak?
It's been a long while since I worked with that code and it changed a lot.

Here is some additional info.
I had approximately 2 million unaccounted pages.
I rebooted the system and that number became 20 thousand which is more
reasonable and could be explained by those boot-time allocations that you 
mentioned.
After 30 hours of uptime the number became 60 thousand.

I monitored the number and so far I could not correlate it with any activity.

P.S.
I have not been running any virtual machines.
I do use nvidia graphics driver.

-- 
Andriy Gapon

Re: stable/13, vm page counts do not add up

2021-04-07 Thread Andriy Gapon

On 07/04/2021 22:54, Mark Johnston wrote:
> On Wed, Apr 07, 2021 at 10:42:57PM +0300, Andriy Gapon wrote:
>>
>> I regularly see that the top's memory line does not add up (and by a lot).
>> That can be seen with vm.stats as well.
>>
>> For example:
>> $ sysctl vm.stats | fgrep count
>> vm.stats.vm.v_cache_count: 0
>> vm.stats.vm.v_user_wire_count: 3231
>> vm.stats.vm.v_laundry_count: 262058
>> vm.stats.vm.v_inactive_count: 3054178
>> vm.stats.vm.v_active_count: 621131
>> vm.stats.vm.v_wire_count: 1871176
>> vm.stats.vm.v_free_count: 187777
>> vm.stats.vm.v_page_count: 8134982
>>
>> $ bc
>>>>> 187777 + 1871176 + 621131 + 3054178 + 262058
>> 5996320
>>>>> 8134982 - 5996320
>> 2138662
>>
>> As you can see, it's not a small number of pages either.
>> Approximately 2 million pages, 8 gigabytes or 25% of the whole memory on this
>> system.
>>
>> This is 47c00a9835926e96, 13.0-STABLE amd64.
>> I do not think that I saw anything like that when I used (much) older 
>> FreeBSD.
> 
> One relevant change is that vm_page_wire() no longer removes pages from
> LRU queues, so the count of pages in the queues can include wired pages.
> If the page daemon runs, it will dequeue any wired pages that are
> encountered.

Maybe I misunderstand how that works, but I would expect that the sum of all
counters could be greater than v_page_count at times.  But in my case it's less.

> This was done to reduce queue lock contention, operations like
> sendfile() which transiently wire pages would otherwise trigger two
> queue operations per page.  Now that queue operations are batched this
> might not be as important.
> 
> We could perhaps add a new flavour of vm_page_wire() which is not lazy
> and would be suited for e.g., the buffer cache.  What is the primary
> source of wired pages in this case?

It should be ZFS, I guess.

-- 
Andriy Gapon

stable/13, vm page counts do not add up

2021-04-07 Thread Andriy Gapon



I regularly see that the top's memory line does not add up (and by a lot).
That can be seen with vm.stats as well.

For example:
$ sysctl vm.stats | fgrep count
vm.stats.vm.v_cache_count: 0
vm.stats.vm.v_user_wire_count: 3231
vm.stats.vm.v_laundry_count: 262058
vm.stats.vm.v_inactive_count: 3054178
vm.stats.vm.v_active_count: 621131
vm.stats.vm.v_wire_count: 1871176
vm.stats.vm.v_free_count: 187777
vm.stats.vm.v_page_count: 8134982

$ bc
>>> 187777 + 1871176 + 621131 + 3054178 + 262058
5996320
>>> 8134982 - 5996320
2138662

As you can see, it's not a small number of pages either.
Approximately 2 million pages, 8 gigabytes or 25% of the whole memory on this
system.
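
The size of the gap can be double-checked with a short calculation. This is a
minimal Python sketch; it takes the total page count and the sum of the
per-queue counters as reported by bc above, and it assumes the amd64 base page
size of 4096 bytes. The function name is made up for illustration.

```python
PAGE_SIZE = 4096  # assumed amd64 base page size, in bytes


def unaccounted(page_count, accounted_pages, page_size=PAGE_SIZE):
    """Return (missing pages, missing GiB, missing share of all pages in percent)."""
    missing = page_count - accounted_pages
    gib = missing * page_size / 2**30
    percent = 100.0 * missing / page_count
    return missing, gib, percent


# v_page_count and the counter sum printed by bc in the message.
pages, gib, percent = unaccounted(8134982, 5996320)
print(f"{pages} pages, {gib:.2f} GiB, {percent:.1f}% of all pages")
```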

This is 47c00a9835926e96, 13.0-STABLE amd64.
I do not think that I saw anything like that when I used (much) older FreeBSD.

-- 
Andriy Gapon

Re: freebsd 13 ryzen micro stutter

2021-04-05 Thread Andriy Gapon

On 27/03/2021 12:54, Santiago Martinez wrote:
> Hi, I have the same output as @Nils B. If I run with steal = 2 and dtrace, the
> micro stutter doesn't happen, but as soon as I stop the dtrace script the
> stutters come back again.

It seems that DTrace creates some extra CPU load that masks the problem.
So, I guess that DTrace-produced traces won't have many clues, if any at all.
I wonder if KTR tracing would be better in this respect.

-- 
Andriy Gapon

Re: Strange behavior after running under high load

2021-03-28 Thread Andriy Gapon

On 28/03/2021 17:39, Stefan Esser wrote:
> After a period of high load, my now idle system needs 4 to 10 seconds to
> run any trivial command - even after 20 minutes of no load ...
> 
> 
> I have run some Monte-Carlo simulations for a few hours, with initially 35
> processes running in parallel for some 10 seconds each.

I saw somewhat similar symptoms with 13-CURRENT some time ago.
To me it looked like even small kernel memory allocations took a very long time.
But it was hard to properly diagnose that as my favorite tool, dtrace, was also
affected by the same problem.

> The load decreased over time since some parameter sets were faster to process.
> All in all 63000 processes ran within some 3 hours.
> 
> When the system became idle, interactive performance was very bad. Running
> any trivial command (e.g. uptime) takes some 5 to 10 seconds. Since I have
> to have this system working, I plan to reboot it later today, but will keep
> it in this state for some more time to see whether this state persists or
> whether the system recovers from it.
> 
> Any ideas what might cause such a system state???
> 
> 
> The system has a Ryzen 5 3600 CPU (6 core/12 threads) and 32 GB of RAM.
> 
> The following are a few commands that I have tried on this now practically
> idle system:
> 
> $ time vmstat -n 1
>   procs    memory    page  disks faults   cpu
>   r  b  w  avm  fre  flt  re  pi  po   fr   sr nv0   in   sy   cs us sy id
>   2  0  0  26G 922M 1.2K   1   4   0 1.4K  239   0  482 7.2K  934 11  1 88
> 
> real    0m9,357s
> user    0m0,001s
> sys    0m0,018
> 
>  wait 1 minute 
> 
> $ time vmstat -n 1
>   procs    memory    page  disks faults   cpu
>   r  b  w  avm  fre  flt  re  pi  po   fr   sr nv0   in   sy   cs us sy id
>   1  0  0  26G 925M 1.2K   1   4   0 1.4K  239   0  482 7.2K  933 11  1 88
> 
> real    0m9,821s
> user    0m0,003s
> sys    0m0,389s
> 
> $ systat -vm
> 
>  4 users    Load  0.10  0.72  3.57  Mar 28 16:15
>     Mem usage:  97%Phy 55%Kmem   VN PAGER   SWAP PAGER
> Mem:  REAL   VIRTUAL in   out in  out
>     Tot   Share Tot    Share Free   count
> Act  2387M    460K  26481M 460K 923M   pages
> All  2605M    218M  27105M 572M    ioflt  Interrupts
> Proc:  cow 132 total
>    r   p   d    s   w   Csw  Trp  Sys  Int  Sof  Flt    52 zfod 96 hpet0:t0
>   316   356   39  225  132   21   53   ozfod nvme0:admi
>   %ozfod nvme0:io0
>   0.1%Sys   0.0%Intr  0.0%User  0.0%Nice 99.9%Idle daefr nvme0:io1
> |    |    |    |    |    |    |    |    |    |    |    prcfr nvme0:io2
>    totfr nvme0:io3
>     dtbuf  react nvme0:io4
> Namei  Name-cache   Dir-cache    620370 maxvn  pdwak nvme0:io5
>     Calls    hits   %    hits   %    627486 numvn  168 pdpgs    27 xhci0 66
>    18  14  78    65 frevn  intrn ahci0 67
>     17539M wire xhci1 68
> Disks  nvd0  ada0  ada1  ada2  ada3  ada4   cd0   430M act   9 re0 69
> KB/t   0.00  0.00  0.00  0.00  0.00  0.00  0.00 12696M inact hdac0 76
> tps   0 0 0 0 0 0 0 54276K laund vgapci0 78
> MB/s   0.00  0.00  0.00  0.00  0.00  0.00  0.00   923M free
> %busy 0 0 0 0 0 0 0  0 buf
> 
>  5 minutes later 
> 
> $ time vmstat -n 1
>  procs    memory    page  disks faults   cpu
>  r  b  w  avm  fre  flt  re  pi  po   fr   sr nv0   in   sy   cs us sy id
>  1  0  0  26G 922M 1.2K   1   4   0 1.4K  239   0  481 7.2K  931 11  1 88
> 
> real    0m4,270s
> user    0m0,000s
> sys    0m0,019s
> 
> $ time uptime
> 16:20  up 23:23, 4 users, load averages: 0,17 0,39 2,68
> 
> real    0m10,840s
> user    0m0,001s
> sys    0m0,374s
> 
> $ time uptime
> 16:37  up 23:40, 4 users, load averages: 0,29 0,27 0,96
> 
> real    0m9,273s
> user    0m0,000s
> sys    0m0,020s
> 


-- 
Andriy Gapon

Re: freebsd 13 ryzen micro stutter

2021-03-25 Thread Andriy Gapon

On 23/03/2021 16:54, Nils B. wrote:
> Hi,
> 
> On 23.03.21 10:34, myfreeweb wrote:
>> None of these should be an issue, but:
>>
>> sysctl kern.sched.steal_thresh=1
>>
>> For some reason with the default value of 2, I'm seeing weird stuttering in
>> youtube videos, games, etc. on a 5950X system. 1 (or 0, IIRC) works fine.
> 
> yes, finally... Using a Ryzen 1700, Asrock AB350 Pro4 and Radeon RX460 and
> got that awful micro stuttering all the time; not only under FreeBSD
> 13.0-ALPHA3 now, but also under FreeBSD 12-STABLE in the past.
> 
> Occurrences were during listening to music using MPV
> (one-second-*krk*-loops); watching YouTube videos (video hangs for a second
> but audio continues) and often simply during mouse movements where even
> MouseKeyPress- and MouseKeyRelease-events just didn't reach the system at
> all.
> 
> Setting
> 
> kern.sched.steal_thresh=0
> 
> eliminates these micro stutterings in the whole system.
> 
> 
> I also would really, really like to know the reason why this parameter has
> such an impact...

It's been a long time since I looked at that corner of the code.
I think that in theory there should not be any difference between steal_thresh
of zero, one and two.  For a thread to be stolen there should be at least one
thread that's runnable, but not running.  That also should imply that there is
a thread that's currently running.  So, values less than or equal to two should
mean the same thing.

The only practical difference I can think of is a situation where a processor
has a runnable thread but does not "realize" it, so the processor stays idle
when it actually has work to do.
If such a thread is not stolen then it may take some time for the processor to
actually start running it.  If it's stolen then the thread may start executing
sooner on a different processor that was about to become idle.

That's just a hypothesis though.
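
The argument above can be illustrated with a toy model. This is only a sketch of the reasoning, not the scheduler's code: it assumes that a CPU's load counts both the running thread and the threads waiting in its run queue, and that another CPU may steal from it only when that load reaches steal_thresh and at least one thread is waiting. The names `Cpu` and `can_steal_from` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    running: int   # 1 if a thread is currently executing on this CPU, else 0
    waiting: int   # runnable threads queued on this CPU but not running

    @property
    def load(self) -> int:
        # Assumption of this model: load counts running + waiting threads.
        return self.running + self.waiting


def can_steal_from(cpu: Cpu, steal_thresh: int) -> bool:
    """Toy steal condition: enough load and a thread that is actually waiting."""
    return cpu.load >= steal_thresh and cpu.waiting > 0


busy = Cpu(running=1, waiting=1)    # normal case: one running, one queued
stuck = Cpu(running=0, waiting=1)   # hypothesised case: an idle CPU that has
                                    # not noticed its runnable thread

for thresh in (0, 1, 2):
    print(thresh, can_steal_from(busy, thresh), can_steal_from(stuck, thresh))
```

In this model the three thresholds behave identically for the normal case. They differ only for the hypothesised "stuck" CPU, where a threshold of two blocks the steal and the waiting thread has to wait for its own processor to wake up.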

If it's correct, then there can be a number of explanations.  From a problem
with inter-processor communication (e.g., related to mwait) to a slow wakeup of
a core from a deep idle state to a problem with interrupt delivery.

There are some tools in the tools/sched/ directory.
schedgraph.py can be used for visual inspection of scheduling traces collected
using KTR.  The file has instructions on how to collect them.
Alternatively, schedgraph.d can be used to collect such traces.
If anyone affected can gather a short sample that captures the problem, then
there might be someone who would be willing to look at it.

-- 
Andriy Gapon

Re: console: no USB keyboard!

2021-03-14 Thread Andriy Gapon

On 13/03/2021 21:01, Hartmann, O. wrote:
> Running 14-CURRENT on several boxes (i.e. FreeBSD 14.0-CURRENT #49
> main-n245422-cecfaf9bede9: Fri Mar 12 16:08:09 CET 2021 amd64) with custom
> and/or GENERIC kernel and USB-only equipment (mouse if available, keyboard).
> In multiuser mode, there is no problem using the USB keyboard. On the single
> user console (for maintenance purposes), no USB keyboard is available. The
> same is true while booting and while the rc scripts are being processed.
> Usually, one can hit the enter key and insert a newline; this doesn't work
> anymore until the box is completely up!
> 
> I do not know when this problem was introduced; the very same config has
> been used since the early days of 13-CURRENT and has been modified
> accordingly, but I can't see obvious changes which would explain the broken
> behaviour now.
> 
> I became aware of this problem when a small mistake in /etc/fstab rendered a
> box unbootable: I had to head for the datacenter and wasn't even capable of
> interrupting the stuck system. Checking on other boxes running recent
> 14-CURRENT revealed the same problem.
> 
> The interesting part is that as long as those boxes are at the loader
> (all boxes are UEFI booting!), the USB keyboard works as expected and I'm
> able to select kernel/kernel.old and so on.
> 
> How to fix this?

Can't help with fixing the problem, but here's some info.
When you are at the loader prompt, BIOS provides emulation of a standard /
legacy keyboard for the USB keyboard.  That's why loader can work even though it
doesn't know much about USB.
When a FreeBSD driver for the USB controller takes over then the BIOS emulation
stops.  Until a FreeBSD peripheral driver like ukbd attaches, it's not possible
to use the keyboard, unfortunately.  You can check your dmesg to see when that
happens.

Personally, I try to avoid "legacy free" solutions and always have a PS/2
keyboard (even if it's really a USB one using a PS/2 <-> USB adapter).

Of course, it would be great to reduce the dead window for USB keyboards and I
think that it is doable.

-- 
Andriy Gapon

Re: FreeBSD 13-RC1 PVSCSI install error with VMware

2021-03-06 Thread Andriy Gapon

On 06/03/2021 14:38, Fabien via freebsd-current wrote:
> Hello,
> 
> Quick feedback related to a VMware install of FreeBSD 13-RC1 with a PVSCSI
> drive.
> At the end of the install it fails with the following error:
> https://e.pcloud.link/publink/show?code=XZBzIkZS2Sx31RK3k0L91Hz4I8p70F82iHV
> 
> Is it planned to be fixed before release ?

Please see if this helps:
https://lists.freebsd.org/pipermail/freebsd-current/2020-December/077859.html
Note that you don't have to recompile, kern.maxphys in loader.conf or at the
loader prompt should work as well.

But it would be ideal to fix the issue in the driver.

-- 
Andriy Gapon

Re: uchcom update

2021-02-28 Thread Andriy Gapon

On 28/02/2021 11:03, KOT MATPOCKuH wrote:
> Hello!
> 
> I'm using FreeBSD 12.2-STABLE r368656 and got usb-to-rs232 with this
> controller.
> I see /dev/cuaU0 after plugging in adapter, I can attach to serial line
> using cu, but after sending any symbol to device I have device reconnection:
> uchcom0 on uhub0
> uchcom0:  on usbus0
> uchcom0: CH340 detected
> uchcom0: at uhub0, port 9, addr 17 (disconnected)
> uchcom0: detached
> uchcom0 on uhub0
> uchcom0:  on usbus0
> uchcom0: CH340 detected

I have this in my loader.conf:

# Ignore result of "clear stall" (clearing halt on endpoints)
# CH340 USB<->RS232 requires this
# and it seems that Linux and Windows do this by default
hw.usb.no_cs_fail=1

I recall that without that tuning I had a similar problem.

> Tue, 5 Jun 2018 at 15:05, Ian FREISLICH :
> 
>> On 05/22/2018 09:44 AM, Andriy Gapon wrote:
>>> Yesterday I committed some changes to uchcom (so far, only in CURRENT).
>>> Commits are r333997 - r334002.
>>>
>>> If you have a CH340/341 based USB<->RS232 adapter and it works for you,
>>> could you please test that it still does?
>>> If you tried your adapter in the past and it did not work, there is a
>>> chance it might start working now.  Could you please test that as well?
>>
>> ugen5.4:  at usbus5, cfg=0 md=HOST spd=FULL
>> (12Mbps) pwr=ON (96mA)
>> ugen5.4.0: uchcom0: 
>>
>> It's not made it any worse.  I'm not using this adapter by choice - it's
>> a USB to Maxim (Dallas) one-wire bus adapter.  The manual used to state
>> that these are possibly the worst chips ever.  Is that still the
>> prevailing opinion?


-- 
Andriy Gapon

Re: panic: condition seqc_in_modify(_vp->v_seqc) not met at zfs_acl.c:1147 (zfs_acl_chown_setattr)

2021-02-27 Thread Andriy Gapon

On 16/02/2021 22:38, Mateusz Guzik wrote:
> I think for future proofing it would be best if all vnodes going there
> had seqc marked, thus I think this should do the trick:
> 
> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> index d5f0da9ecd4b..8172916c4329 100644
> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> @@ -2756,7 +2756,9 @@ zfs_setattr(znode_t *zp, vattr_t *vap, int flags, cred_t *cr)
> err = zfs_acl_chown_setattr(zp);
> ASSERT(err == 0);
> if (attrzp) {
> +   vn_seqc_write_begin(ZTOV(attrzp));
> err = zfs_acl_chown_setattr(attrzp);
> +   vn_seqc_write_end(ZTOV(attrzp));
> ASSERT(err == 0);
> }
> }
> 
> I don't see other calls to the routine.


This patch works perfectly for me.
Thank you!

> On 2/16/21, Andriy Gapon  wrote:
>> On 15/02/2021 11:45, Andriy Gapon wrote:
>>> On 15/02/2021 10:22, Andriy Gapon wrote:
>>>>
>>>> I've got this panic once when copying a couple of files.
>>>> The system is stable/13 as of 1996360d7338d, a custom kernel
>>>> configuration, but no local source code modifications.
>>>>
>>>> Unread portion of the kernel message buffer:
>>>> VNASSERT failed: ({ seqc_t __seqc = (_vp->v_seqc);
>>>> __builtin_expect((__seqc &
>>>> 1), 0); }) not true at
>>>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>>>> (zfs_acl_chown_setattr)
>>>> 0xf8013e4e85b8: type VDIR
>>>> usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
>>>> hold count flags ()
>>>> flags ()
>>>> lock type zfs: EXCL by thread 0xfe01dd1cd560 (pid 30747,
>>>> kdeinit5, tid
>>>> 159911)
>>>> panic: condition seqc_in_modify(_vp->v_seqc) not met at
>>>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>>>> (zfs_acl_chown_setattr)
>>>>
>>>> Any ideas, suggestions, hints?
>>>> Thanks!
>>>>
>>> ...
>>>> #4  0x8036fd21 in zfs_acl_chown_setattr (zp=0xf801ccd203b0)
>>>> at
>>>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>>>> #5  0x8037e52d in zfs_setattr (zp=0xf8024b04f760,
>>>> vap=vap@entry=0xfe029a36c870, flags=flags@entry=0,
>>>> cr=, cr@entry=0xf8003ecedc00)
>>>> at
>>>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:2758
>>>
>>> So, this is actually the second zfs_acl_chown_setattr call here:
>>> err = zfs_acl_chown_setattr(zp);
>>> ASSERT(err == 0);
>>>     if (attrzp) {
>>> err = zfs_acl_chown_setattr(attrzp);
>>> ASSERT(err == 0);
>>> }
>>>
>>> I am not sure if the assertion is actually applicable to attrzp (extended
>>> attributes "directory").
>>> At least I do not see any seq calls for it.
>>>
>>
>> So, I think that the problem should be reproducible by simply chown-ing a
>> file
>> with an extended attribute.  The kernel should be compiled with both
>> DEBUG_VFS_LOCKS and INVARIANTS.
>>
>> --
>> Andriy Gapon
>>
> 
> 


-- 
Andriy Gapon

panic: sackhint bytes rtx >= 0

2021-02-23 Thread Andriy Gapon

akrate_thr = 0, ts_offset = 71449215,
  rfbuf_ts = 162235772, rcv_numsacks = 0, t_tsomax = 0, t_tsomaxsegcount = 0,
t_tsomaxsegsize = 0, rcv_nxt = 2281639092, rcv_adv = 2281705332, rcv_wnd =
66240, t_flags2 = 1030, t_srtt = 876, t_rttvar = 33, ts_recent = 0,
  snd_scale = 8 '\b', rcv_scale = 6 '\006', snd_limited = 2 '\002',
request_r_scale = 6 '\006', last_ack_sent = 2281639092, t_rcvtime = 2309118641,
rcv_up = 2281639092, t_segqlen = 0, t_segqmbuflen = 0, t_segq = {
tqh_first = 0x0, tqh_last = 0xf80754818880}, t_in_pkt = 0x0, t_tail_pkt
= 0x0, t_timers = 0xf80754818a78, t_vnet = 0x0, snd_ssthresh = 31680,
snd_wl1 = 2281639092, snd_wl2 = 3846347980, irs = 2281631223,
  iss = 3840447913, t_acktime = 0, t_sndtime = 2309118613, ts_recent_age = 0,
snd_recover = 3846415660, cl4_spare = 0, t_oobflags = 0 '\000', t_iobc = 0
'\000', t_rxtcur = 270, t_rxtshift = 1, t_rtttime = 2309118613,
  t_rtseq = 3846415660, t_starttime = 2309086941, t_fbyte_in = 2309087188,
t_fbyte_out = 2309087159, t_pmtud_saved_maxseg = 0, t_blackhole_enter = 0,
t_blackhole_exit = 0, t_rttmin = 30, t_rttbest = 845, t_softerror = 0,
  max_sndwnd = 237568, snd_cwnd_prev = 64800, snd_ssthresh_prev = 8640,
snd_recover_prev = 3846347980, t_sndzerowin = 0, t_rttupdated = 368,
snd_numholes = 2, t_badrxtwin = 0, snd_holes = {tqh_first = 0xf8013da5a320,
tqh_last = 0xf8013da5a230}, snd_fack = 3846415660, sackblks = {{start =
2281632180, end = 2281632690}, {start = 0, end = 0}, {start = 0, end = 0},
{start = 0, end = 0}, {start = 0, end = 0}, {start = 0, end = 0}},
  sackhint = {nexthole = 0xf8013da5a220, sack_bytes_rexmit = -1440,
last_sack_ack = 3846415660, delivered_data = 1440, sacked_bytes = 61920,
recover_fs = 67680, prr_delivered = 1440, _pad = {0}}, t_rttlow = 25,
  rfbuf_cnt = 0, tod = 0x0, t_sndrexmitpack = 520, t_rcvoopack = 0, t_toe = 0x0,
cc_algo = 0x80ef2530 , ccv = 0xf80754818bc0, osd =
0x0, t_bytes_acked = 11520, t_maxunacktime = 0, t_keepinit = 0,
  t_keepidle = 0, t_keepintvl = 0, t_keepcnt = 0, t_dupacks = 4, t_lognum = 0,
t_loglimit = 0, t_pacing_rate = -1, t_logs = {stqh_first = 0x0, stqh_last =
0x0}, t_lin = 0x0, t_lib = 0x0, t_output_caller = 0x0, t_stats = 0x0,
  t_logsn = 0, gput_ts = 0, gput_seq = 0, gput_ack = 0, t_stats_gput_prev = 0,
t_tfo_client_cookie_len = 0 '\000', t_end_info_status = 0, t_tfo_pending = 0x0,
t_tfo_cookie = {client = '\000' , server = 0}, {
t_end_info_bytes = "\000\000\000\000\000\000\000", t_end_info = 0}}

(kgdb) p *tp@entry->sackhint.nexthole
$8 = {start = 3846396940, end = 3846398380, rxmit = 3846398380, scblink =
{tqe_next = 0x0, tqe_prev = 0xf8013da5a330}}

-- 
Andriy Gapon

Re: panic: condition seqc_in_modify(_vp->v_seqc) not met at zfs_acl.c:1147 (zfs_acl_chown_setattr)

2021-02-16 Thread Andriy Gapon

On 15/02/2021 11:45, Andriy Gapon wrote:
> On 15/02/2021 10:22, Andriy Gapon wrote:
>>
>> I've got this panic once when copying a couple of files.
>> The system is stable/13 as of 1996360d7338d, a custom kernel configuration,
>> but no local source code modifications.
>>
>> Unread portion of the kernel message buffer:
>> VNASSERT failed: ({ seqc_t __seqc = (_vp->v_seqc); __builtin_expect((__seqc &
>> 1), 0); }) not true at
>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>> (zfs_acl_chown_setattr)
>> 0xf8013e4e85b8: type VDIR
>> usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
>> hold count flags ()
>> flags ()
>> lock type zfs: EXCL by thread 0xfe01dd1cd560 (pid 30747, kdeinit5, 
>> tid
>> 159911)
>> panic: condition seqc_in_modify(_vp->v_seqc) not met at
>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>> (zfs_acl_chown_setattr)
>>
>> Any ideas, suggestions, hints?
>> Thanks!
>>
> ...
>> #4  0x8036fd21 in zfs_acl_chown_setattr (zp=0xf801ccd203b0)
>> at 
>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
>> #5  0x8037e52d in zfs_setattr (zp=0xf8024b04f760,
>> vap=vap@entry=0xfe029a36c870, flags=flags@entry=0,
>> cr=, cr@entry=0xf8003ecedc00)
>> at
>> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:2758
> 
> So, this is actually the second zfs_acl_chown_setattr call here:
> err = zfs_acl_chown_setattr(zp);
> ASSERT(err == 0);
> if (attrzp) {
> err = zfs_acl_chown_setattr(attrzp);
> ASSERT(err == 0);
> }
> 
> I am not sure if the assertion is actually applicable to attrzp (extended
> attributes "directory").
> At least I do not see any seq calls for it.
> 

So, I think that the problem should be reproducible by simply chown-ing a file
with an extended attribute.  The kernel should be compiled with both
DEBUG_VFS_LOCKS and INVARIANTS.

-- 
Andriy Gapon

Re: panic: condition seqc_in_modify(_vp->v_seqc) not met at zfs_acl.c:1147 (zfs_acl_chown_setattr)

2021-02-15 Thread Andriy Gapon

On 15/02/2021 10:22, Andriy Gapon wrote:
> 
> I've got this panic once when copying a couple of files.
> The system is stable/13 as of 1996360d7338d, a custom kernel configuration,
> but no local source code modifications.
> 
> Unread portion of the kernel message buffer:
> VNASSERT failed: ({ seqc_t __seqc = (_vp->v_seqc); __builtin_expect((__seqc &
> 1), 0); }) not true at
> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
> (zfs_acl_chown_setattr)
> 0xf8013e4e85b8: type VDIR
> usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
> hold count flags ()
> flags ()
> lock type zfs: EXCL by thread 0xfe01dd1cd560 (pid 30747, kdeinit5, tid
> 159911)
> panic: condition seqc_in_modify(_vp->v_seqc) not met at
> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
> (zfs_acl_chown_setattr)
> 
> Any ideas, suggestions, hints?
> Thanks!
> 
...
> #4  0x8036fd21 in zfs_acl_chown_setattr (zp=0xf801ccd203b0)
> at 
> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
> #5  0x8037e52d in zfs_setattr (zp=0xf8024b04f760,
> vap=vap@entry=0xfe029a36c870, flags=flags@entry=0,
> cr=, cr@entry=0xf8003ecedc00)
> at
> /usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:2758

So, this is actually the second zfs_acl_chown_setattr call here:
err = zfs_acl_chown_setattr(zp);
ASSERT(err == 0);
if (attrzp) {
err = zfs_acl_chown_setattr(attrzp);
ASSERT(err == 0);
}

I am not sure if the assertion is actually applicable to attrzp (extended
attributes "directory").
At least I do not see any seq calls for it.
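
For context, the failed assertion tests the low bit of the vnode's sequence counter. Below is a minimal model of that convention, assuming the usual seqlock-style scheme in which the counter is odd while a modification is in progress. The class and function names are invented for illustration and this is not the kernel implementation.

```python
class SeqCounter:
    """Toy sequence counter: odd while a write is in progress, even otherwise."""

    def __init__(self):
        self.value = 0

    def in_modify(self):
        return (self.value & 1) == 1

    def write_begin(self):
        assert not self.in_modify(), "nested writers are not modelled"
        self.value += 1

    def write_end(self):
        assert self.in_modify(), "write_end without write_begin"
        self.value += 1


def chown_setattr(seqc):
    """Stand-in for a routine that asserts the counter is in 'modify' state."""
    if not seqc.in_modify():
        raise AssertionError("condition seqc_in_modify not met")
    return 0


def setattr_bracketed(seqc):
    """Caller that marks the counter around the call."""
    seqc.write_begin()
    try:
        return chown_setattr(seqc)
    finally:
        seqc.write_end()


zp_seqc = SeqCounter()
attrzp_seqc = SeqCounter()

print(setattr_bracketed(zp_seqc))      # bracketed call passes the check
try:
    chown_setattr(attrzp_seqc)         # unbracketed call trips it
except AssertionError as exc:
    print("failed:", exc)
```

In the model, the object whose counter was never marked as being modified is the one that trips the check, which is the situation suspected for attrzp here.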

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

panic: condition seqc_in_modify(_vp->v_seqc) not met at zfs_acl.c:1147 (zfs_acl_chown_setattr)

2021-02-15 Thread Andriy Gapon



I've got this panic once when copying a couple of files.
The system is stable/13 as of 1996360d7338d, a custom kernel configuration, but
no local source code modifications.

Unread portion of the kernel message buffer:
VNASSERT failed: ({ seqc_t __seqc = (_vp->v_seqc); __builtin_expect((__seqc &
1), 0); }) not true at
/usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
(zfs_acl_chown_setattr)
0xf8013e4e85b8: type VDIR
usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
hold count flags ()
flags ()
lock type zfs: EXCL by thread 0xfe01dd1cd560 (pid 30747, kdeinit5, tid
159911)
panic: condition seqc_in_modify(_vp->v_seqc) not met at
/usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
(zfs_acl_chown_setattr)

Any ideas, suggestions, hints?
Thanks!

(kgdb) #0  doadump (textdump=textdump@entry=1)
at /usr/devel/git/trant/sys/kern/kern_shutdown.c:399
#1  0x8083bea2 in kern_reboot (howto=260)
at /usr/devel/git/trant/sys/kern/kern_shutdown.c:486
#2  0x8083c4f7 in vpanic (
fmt=0x80c33e58 "condition %s not met at %s:%d (%s)",
ap=0xfe029a36c2c0)
at /usr/devel/git/trant/sys/kern/kern_shutdown.c:919
#3  0x8083c0a3 in panic (fmt=)
at /usr/devel/git/trant/sys/kern/kern_shutdown.c:843
#4  0x8036fd21 in zfs_acl_chown_setattr (zp=0xf801ccd203b0)
at 
/usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_acl.c:1147
#5  0x8037e52d in zfs_setattr (zp=0xf8024b04f760,
vap=vap@entry=0xfe029a36c870, flags=flags@entry=0,
cr=, cr@entry=0xf8003ecedc00)
at
/usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:2758
#6  0x803817ee in zfs_freebsd_setattr (ap=)
at
/usr/devel/git/trant/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:4918
#7  0x80ba6087 in VOP_SETATTR_APV (
vop=0x80e59280 , a=a@entry=0xfe029a36ca00)
at vnode_if.c:927
#8  0x80915a89 in VOP_SETATTR (vp=vp@entry=0xf8016524d5b8,
vap=vap@entry=0xfe029a36ca30, cred=,
cred@entry=0xf8003ecedc00) at ./vnode_if.h:485
#9  0x80915d67 in setfown (td=,
cred=0xf8003ecedc00, vp=0xf8016524d5b8, uid=uid@entry=4294967295,
gid=gid@entry=20) at /usr/devel/git/trant/sys/kern/vfs_syscalls.c:2942
#10 0x80915eb6 in kern_fchownat (td=0xfe01dd1cd560,
fd=fd@entry=-100,
path=0x803697858 ,
pathseg=pathseg@entry=UIO_USERSPACE, uid=-1, gid=, flag=0)
at /usr/devel/git/trant/sys/kern/vfs_syscalls.c:3002
#11 0x80915db6 in sys_chown (td=, uap=)
at /usr/devel/git/trant/sys/kern/vfs_syscalls.c:2962
#12 0x80b25b69 in syscallenter (td=0xfe01dd1cd560)
at /usr/devel/git/trant/sys/amd64/amd64/../../kern/subr_syscall.c:189
#13 0x80b25845 in amd64_syscall (td=0xfe01dd1cd560, traced=0)
at /usr/devel/git/trant/sys/amd64/amd64/trap.c:1156

-- 
Andriy Gapon

Re: panic in drm or vt or deadlock on mutex or ...

2021-02-03 Thread Andriy Gapon

On 03/02/2021 07:08, Steve Kargl wrote:
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0xc374
> fault code= supervisor read data, page not present
> instruction pointer   = 0x20:0xef411f
> stack pointer = 0x28:0x4074e97c
> frame pointer = 0x28:0x4074e988
> code segment  = base 0x0, limit 0xf, type 0x1b
>   = DPL 0, pres 1, def32 1, gran 1
> processor eflags  = interrupt enabled, resume, IOPL = 0
> current process   = 91696 (chrome)
> trap number   = 12
...
> panic: page fault
> cpuid = 0
> time = 1612328062
> KDB: stack backtrace:
> db_trace_self_wrapper(2,4074e93c,c,0,4074e800,...) at db_trace_self_wrapper+0x28/frame 0x4074e7d4
> vpanic(f9d603,4074e80c,4074e80c,4074e834,f6e2b7,...) at vpanic+0x11a/frame 0x4074e7ec
> panic(f9d603,fe16b8,0,f,c39b,...) at panic+0x14/frame 0x4074e800
> trap_fatal(1327100,0,c95893,78f03f4,4074e860,...) at trap_fatal+0x347/frame 0x4074e834
> trap_pfault(c374,0,0) at trap_pfault+0x30/frame 0x4074e864
> trap(4074e93c,8,28,28,0,...) at trap+0x381/frame 0x4074e930
> calltrap() at 0xffc0319f/frame 0x4074e930
> --- trap 0xc, eip = 0xef411f, esp = 0x4074e97c, ebp = 0x4074e988 ---
> vm_radix_lookup(28c7b884,2,0) at vm_radix_lookup+0x7f/frame 0x4074e988
> vm_page_lookup(28c7b854,2,0) at vm_page_lookup+0x15/frame 0x4074e99c
> vm_fault(24ed8d58,3488b000,2,0,4074eab0) at vm_fault+0x839/frame 0x4074ea48
> vm_fault_quick_hold_pages(24ed8d58,34889f00,8000,2,4074eaa8,12) at vm_fault_quick_hold_pages+0x122/frame 0x4074ea88
> vn_io_fault1(247f4380) at vn_io_fault1+0x214/frame 0x4074eb44
> vn_io_fault(2f5a58e8,4074ebc8,262c1e00,0,247f4380) at vn_io_fault+0x1c4/frame 0x4074eb7c
> dofileread(2f5a58e8,4074ebc8,,,0) at dofileread+0x6d/frame 0x4074ebac
> sys_read(247f4380,247f4618,343fb000,247f4380,40516068,...) at sys_read+0x67/frame 0x4074ec00
> syscall(4074ece8,3b,3b,3b,2d130d1c,...) at syscall+0x17d/frame 0x4074ecdc
> Xint0x80_syscall() at 0xffc033f9/frame 0x4074ecdc
> --- syscall (881410048), eip = 0x2d086faf, esp = 0xfa1e339c, ebp = 0xfa1e33c8 ---
> KDB: enter: panic

This is the crash.
The DRM mutex noise is just noise (but it would be good to get rid of it).

-- 
Andriy Gapon

Re: Laptop ACPI poweroff failed after main-c255826 -> main-c255850

2021-01-13 Thread Andriy Gapon

On 2021-01-13 16:13, David Wolfskill wrote:
> On Wed, Jan 13, 2021 at 04:07:35PM +0200, Andriy Gapon wrote:
>> ...
>>> I believe that this is evidence in favor of a "race condition" diagnosis.
>>> (In precisely what, I don't know.)
>>
>> I haven't followed source changes too closely recently.
>> It might be a good idea to check for recent imports of ACPICA updates.
>> 
> 
> Most recent of those in head was:
> 
> | commit fbde34778ba0ba31fcae99e992f353d989433dba
> | Merge: a2fe464c81de 960614968e0d
> | Author: Jung-uk Kim 
> | Date:   Fri Nov 13 22:45:26 2020 +
> | 
> | MFV:r367652
> | 
> | Merge ACPICA 20201113.
> | 
> | Notes:
> | svn path=/head/; revision=367654
> 
> and I certainly had not been seeing the symptom at all until I
> mentioned it on 11 January.  (And I have been tracking head daily,
> including the "poweroff" at the end).

Another "wild" idea: some sort of a change related to signal delivery or
checking.

As I understand, the whole kernel shutdown procedure is executed in the
context of a userland process (init? shutdown?).  And I guess that the
process gets a signal at some point during the shutdown.
Now, our implementation of the ACPI mutex is such that it would abort /
fail if msleep(PCATCH) in it returns EINTR.  I was concerned about that
for a long time and I think that it is wrong, but it didn't cause many
problems before.  Also, I should note that that applies not only to
mutexes declared in AML but also to ACPICA's mutexes that protect its
internal states (such as ACPI_MTX_Caches / ACPI_MTX_CACHES which appears
in your output).

So, if that mutex is uncontested then it can be acquired even when a
signal is pending and everything is okay.  But if the mutex happens to
be held by some other thread, then the signal gets checked and the
operation fails because of EINTR.

This is the only failure mode that I can think of for that mutex.
But again, I have no idea what could have changed recently with respect
to signal delivery / signal checking.
Or perhaps it's something else, something that creates concurrent ACPI
activity that increases the likelihood of that mutex being contested.
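
The failure mode described above can be written down as a small decision table. This is a toy model of the described behaviour only, not the actual ACPICA OS-layer code; the function name `acquire` and its two boolean inputs are made up for illustration.

```python
import errno


def acquire(held_by_other: bool, signal_pending: bool) -> int:
    """Return 0 on success or an errno value, following the description above."""
    if not held_by_other:
        # Uncontested: taken immediately, the thread never sleeps,
        # so a pending signal is never looked at.
        return 0
    if signal_pending:
        # Contested: the thread has to sleep interruptibly, the pending
        # signal is noticed and the acquisition is abandoned.
        return errno.EINTR
    # Contested, no signal: the thread sleeps until the holder releases.
    return 0


for held in (False, True):
    for sig in (False, True):
        result = acquire(held, sig)
        outcome = "ok" if result == 0 else errno.errorcode[result]
        print(f"held_by_other={held!s:5} signal_pending={sig!s:5} -> {outcome}")
```

Only one of the four combinations fails, which would fit a problem that shows up on some shutdowns and not on others.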

-- 
Andriy Gapon

Re: Laptop ACPI poweroff failed after main-c255826 -> main-c255850

2021-01-13 Thread Andriy Gapon

On 2021-01-13 16:03, David Wolfskill wrote:
> On Tue, Jan 12, 2021 at 07:37:28AM -0800, David Wolfskill wrote:
>> On Tue, Jan 12, 2021 at 05:31:30PM +0200, Andriy Gapon wrote:
>>> On 2021-01-11 14:55, David Wolfskill wrote:
>>>> pci3: unknown notify 0x2
>>>> ACPI Error: AE_ERROR, Thread 12 could not acquire Mutex [ACPI_MTX_Caches] (0c4) (20201113/utmutex-434)
>>>
>>> Looks like that was some sort of a race or otherwise transient condition
>>> that led to the _PTS (prepare-to-sleep) failure.
>>>
>>>> ACPI Error: Aborting method \_PTS due to previous error (AE_NO_MEMORY) (20201113/psparse-689)
>>>> acpi0: AcpiEnterSleepStatePrep failed - AE_NO_MEMORY
>>> 
>>
>> That's certainly plausible -- as I noted a bit earlier today, there was
>> no recurrence  after this morning's main-c255850-g16079c7233be ->
>> main-c255894-g8b1839548750 update.
>>
>> Should I encounter a recurrence, I will plan to get another screenshot,
>> then bring the machine back up and re-try the poweroff (and then report
>> my findings).
>> 
> 
> I had a recurrence this morning, after the update from:
> 
> FreeBSD g1-55.catwhisker.org 13.0-CURRENT FreeBSD 13.0-CURRENT #120 
> main-c255894-g8b1839548750-dirty: Tue Jan 12 05:23:50 PST 2021 
> r...@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY  
> amd64 1300134 1300134
> 
> to:
> 
> FreeBSD g1-55.catwhisker.org 13.0-CURRENT FreeBSD 13.0-CURRENT #121 
> main-c255921-gec2700e01532-dirty: Wed Jan 13 05:06:22 PST 2021 
> r...@g1-55.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY  
> amd64 1300135 1300135
> 
> 
> New screenshot is in https://www.catwhisker.org/~david/FreeBSD/head/c255921;
> the previous one is in https://www.catwhisker.org/~david/FreeBSD/head/c255850.
> 
> They look quite similar to me.
> 
> After grabbing the screenshot, I rebooted again, but the poweroff
> just worked normally on re-try.
> 
> I believe that this is evidence in favor of a "race condition" diagnosis.
> (In precisely what, I don't know.)

I haven't followed source changes too closely recently.
It might be a good idea to check for recent imports of ACPICA updates.


-- 
Andriy Gapon

Re: Laptop ACPI poweroff failed after main-c255826 -> main-c255850

2021-01-12 Thread Andriy Gapon

On 2021-01-11 14:55, David Wolfskill wrote:
> pci3: unknown notify 0x2
> ACPI Error: AE_ERROR, Thread 12 could not acquire Mutex [ACPI_MTX_Caches] 
> (0c4) (20201113/utmutex-434)

Looks like that was some sort of a race or otherwise transient condition
that led to the _PTS (prepare-to-sleep) failure.

> ACPI Error: Aborting method \_PTS due to previous error (AE_NO_MEMORY) 
> (20201113/psparse-689)
> acpi0: AcpiEnterSleepStatePrep failed - AE_NO_MEMORY


-- 
Andriy Gapon

Re: git and the loss of revision numbers

2020-12-29 Thread Andriy Gapon

On 2020-12-29 17:11, monochrome wrote:
> ok, this appears to be what I was looking for
> 
> example:
> git reset --hard f20c0e331
> then:
> git pull --ff-only
> is again able to update as normal
> 
> I should point out also that this is from the point of view of any
> random person just building freebsd from source, not a developer, so
> there are no local changes. Though it does blow away changes to the conf
> file, that's a lesser issue to deal with.

git stash [save] and git stash pop can be used to try[*] to preserve
minor local changes.

[*] there can be merge conflicts after stash pop if the same file(s) are
changed upstream as well.

-- 
Andriy

Re: git and the loss of revision numbers

2020-12-29 Thread Andriy Gapon

On 2020-12-29 02:56, Pete Wright wrote:
> 
> On 12/28/20 4:38 PM, monochrome wrote:
>> what would be the git command for reverting source to a previous
>> version using these numbers? for example, with svn and old numbers:
>> svnlite update -r367627 /usr/src
>>
> I will generally just checkout the short git hash like so in my local
> checkout:
> $ git checkout gb81783dc98e6
> 
> you can quickly get the hashes by running "git log" from your checkout.

I think that git checkout <commit> is the wrong tool here.
I personally would use git reset --hard <commit>.
Note that that command would also revert any local uncommitted changes!

My view of the difference between the commands:
- checkout: stage[*] a change that would modify the current state of the
branch to the selected commit's state
- reset: change the current branch (its head) to point to the selected
commit

[*] by stage I mean modify the working copy and the index.
That is, if after git checkout you would run git commit then you would
commit a change that reverts the current branch to the selected point.

-- 
Andriy



Re: installation on pvscsi fails with "The request was too large for this host"

2020-12-17 Thread Andriy Gapon

On 17/12/2020 07:02, Yuri Pankov wrote:
> Trying to install latest snapshot (20201210) on a VMware ESXi/Workstation VMs
> with pvscsi fails on bootloader step, and the following is in dmesg:
> 
> pvscsi0: pvscsi_execute_ccb error 27
> pvscsi0: pvscsi_execute_ccb error 27
> (da0:pvscsi0:0:0:0): WRITE(10). CDB: 2a 00 00 00 00 28 00 04 00
> (da0:pvscsi0:0:0:0): CAM status: The request was too large for this host
> (da0:pvscsi0:0:0:0): Error 22, Unretryable error
> (da0:pvscsi0:0:0:0): WRITE(10). CDB: 2a 00 00 00 00 28 00 04 00
> (da0:pvscsi0:0:0:0): CAM status: The request was too large for this host
> (da0:pvscsi0:0:0:0): Error 22, Unretryable error
> 
> This is the first time I'm trying to install on pvscsi since it was integrated,
> so no idea if it worked previously.  If it did, I have not tried to bisect this
> yet, hoping that it could be identified as related to any of the recent changes.
> 
> The VMs in question are set with 8-64 GB RAM, and 100 GB boot disks.

Not an expert in this area, but that command tried to transfer 0x400 / 1024
blocks, which is 512KB of data.
Could it be that the problem is revealed by the MAXPHYS increase?
There might be a bug in pvscsi where it does not respect or correctly advertise
some limit.  There could be a similar issue with VMware itself (its emulation of
a disk / target).
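
For reference, this is how I read the CDB from the dmesg output, as a small
sketch.  The field offsets are the standard WRITE(10) layout; the 512-byte
block size and the helper names are my assumptions, and the tenth (control)
byte is taken to be zero because only nine bytes are printed.

```c
#include <stdint.h>

/* Assumed logical block size; the dmesg output does not state it. */
#define ASSUMED_BLOCK_SIZE 512u

/* Logical block address: bytes 2..5 of a WRITE(10) CDB, big-endian. */
static uint32_t
write10_lba(const uint8_t *cdb)
{
	return ((uint32_t)cdb[2] << 24 | (uint32_t)cdb[3] << 16 |
	    (uint32_t)cdb[4] << 8 | (uint32_t)cdb[5]);
}

/* Transfer length in blocks: bytes 7..8, big-endian. */
static uint32_t
write10_blocks(const uint8_t *cdb)
{
	return ((uint32_t)cdb[7] << 8 | (uint32_t)cdb[8]);
}

/* Transfer size in bytes under the assumed block size. */
static uint64_t
write10_bytes(const uint8_t *cdb)
{
	return ((uint64_t)write10_blocks(cdb) * ASSUMED_BLOCK_SIZE);
}
```

Fed the bytes 2a 00 00 00 00 28 00 04 00, these helpers give LBA 0x28,
1024 blocks and 524288 bytes.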


-- 
Andriy Gapon

Re: rand() is racy in multi-threaded programs?

2020-12-02 Thread Andriy Gapon

On 03/12/2020 01:20, Conrad Meyer wrote:
> Hi Andriy,
> 
> Rand(3) is explicitly unsafe to use from concurrent threads without some
> external serialization, even after initialization. I’d suggest using a
> different API.

Conrad,

thank you!
Just want to check, unsafe in terms of bogus results (with respect to
randomness) or unsafe as in may crash?
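
For context, the kind of external serialization mentioned above could be as
simple as one lock around every call.  This is only a sketch with a
hypothetical wrapper name (locked_rand); it helps only if every rand() caller
in the process goes through it, the very first call included.

```c
#include <pthread.h>
#include <stdlib.h>

/* One process-wide lock guarding every use of rand(). */
static pthread_mutex_t rand_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical wrapper: because the first call also runs under the lock,
 * any lazy set-up inside rand() cannot be observed half-done by a
 * second thread that uses the same wrapper.
 */
static int
locked_rand(void)
{
	int value;

	pthread_mutex_lock(&rand_lock);
	value = rand();
	pthread_mutex_unlock(&rand_lock);
	return (value);
}
```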


> On Wed, Dec 2, 2020 at 13:53 Andriy Gapon <a...@freebsd.org> wrote:
> 
> 
> Specifically, concurrent "first" calls to rand().
> There can be a moment when rand3_state is allocated but not completely set up
> with initstate_r().
> Is this a known / documented issue?
> Should we try to do better?
> 
> P.S.
> I am seeing this issue from time to time when running ztest program (from ZFS).
> I guess that it uses rand() just because that's what OpenZFS did / does on
> illumos and Linux.
> 
> P.P.S.
> Just realized that the problem can be relatively recent.
> https://svnweb.freebsd.org/base?view=revision&revision=357382
> 
> -- 
> Andriy Gapon
> 


-- 
Andriy Gapon

rand() is racy in multi-threaded programs?

2020-12-02 Thread Andriy Gapon



Specifically, concurrent "first" calls to rand().
There can be a moment when rand3_state is allocated but not completely set up
with initstate_r().
Is this a known / documented issue?
Should we try to do better?

P.S.
I am seeing this issue from time to time when running ztest program (from ZFS).
I guess that it uses rand() just because that's what OpenZFS did / does on
illumos and Linux.

P.P.S.
Just realized that the problem can be relatively recent.
https://svnweb.freebsd.org/base?view=revision&revision=357382

-- 
Andriy Gapon

Re: dtrace: give %'d a chance?

2020-12-02 Thread Andriy Gapon

On 02/12/2020 18:52, Mark Johnston wrote:
> On Mon, Nov 30, 2020 at 03:50:53PM +0200, Andriy Gapon wrote:
>> On 19/11/2020 16:57, Mark Johnston wrote:
>>> On Thu, Nov 19, 2020 at 01:28:56PM +0200, Andriy Gapon wrote:
>>>>
>>>> what do people think about adding
>>>> setlocale(LC_NUMERIC, "");
>>>> to dtrace's main function?
>>>
>>> That seems reasonable to me.
>>>
>>>> My primary interest is to (pretty-)print some numbers with a thousands 
>>>> separator.
>>>>
>>>> Not sure if any other LC_ types are worth bothering.
>>>
>>> Maybe LC_TIME?  libdtrace has a couple of date formatters, %T and %Y.  A
>>> locale-aware formatter might be worth having.
>>
>> FWIW, I've just discovered that despite what
>> http://dtrace.org/guide/chp-fmt.html says about %Y its output is not
>> dependent on locale settings.
>> A quick look at the code confirms that -- pfprint_time uses ctime_r.
>> But %T (undocumented at the above link) indeed depends on LC_TIME as
>> pfprint_time822 uses strftime("%a, %d %b %G %T %Z").
>>
>> Sample output in C locale:
>> 1000
>> Mon, 30 Nov 2020 13:47:24 UTC
>> 2020 Nov 30 13:47:24
>>
>> The same formats (%'d, %T, %Y) in uk_UA locale:
>> 10 000 000
>> Пн, 30 лист. 2020 13:43:11 UTC
>> 2020 Nov 30 13:43:11
> 
> So to be clear, there is nothing that needs to be done for time locales?

Sorry, it was I who was not clear.  The above output is after adding setlocale()
calls.  Stock dtrace always operates in C locale.

> In any case, I'm fine with adding the %'d formatter.

It's already there and it delegates the work to the C printf.
Hence the need for setlocale.
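
A minimal illustration of the point, assuming a libc whose printf supports
the POSIX ' flag (FreeBSD and glibc do).  The helper names are made up for
the example and this is not the dtrace code.

```c
#include <locale.h>
#include <stdio.h>

/*
 * Adopt the LC_NUMERIC setting from the environment; until this is
 * called a C program runs in the "C" locale.  Returns 0 on success.
 */
static int
use_env_numeric_locale(void)
{
	return (setlocale(LC_NUMERIC, "") != NULL ? 0 : -1);
}

/* Format with the grouping flag; returns what snprintf() returns. */
static int
format_grouped(char *buf, size_t len, long value)
{
	return (snprintf(buf, len, "%'ld", value));
}
```

Without the setlocale() call the C locale has no thousands separator, so the
' flag is accepted but changes nothing; after the call the separator comes
from the user's locale.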

-- 
Andriy Gapon

Re: dtrace: give %'d a chance?

2020-11-30 Thread Andriy Gapon

On 19/11/2020 16:57, Mark Johnston wrote:
> On Thu, Nov 19, 2020 at 01:28:56PM +0200, Andriy Gapon wrote:
>>
>> what do people think about adding
>> setlocale(LC_NUMERIC, "");
>> to dtrace's main function?
> 
> That seems reasonable to me.
> 
>> My primary interest is to (pretty-)print some numbers with a thousands 
>> separator.
>>
>> Not sure if any other LC_ types are worth bothering.
> 
> Maybe LC_TIME?  libdtrace has a couple of date formatters, %T and %Y.  A
> locale-aware formatter might be worth having.

FWIW, I've just discovered that despite what
http://dtrace.org/guide/chp-fmt.html says about %Y its output is not dependent
on locale settings.
A quick look at the code confirms that -- pfprint_time uses ctime_r.
But %T (undocumented at the above link) indeed depends on LC_TIME as
pfprint_time822 uses strftime("%a, %d %b %G %T %Z").
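
To show where the locale dependence comes from, here is a small stand-alone
sketch that uses the same strftime format string.  format_time822 is a
hypothetical helper, not the libdtrace function, and it uses UTC via gmtime()
purely to keep the example simple.

```c
#include <stddef.h>
#include <time.h>

/*
 * Format a timestamp with the format string quoted above.  The weekday
 * and month names produced by %a and %b come from LC_TIME, which stays
 * "C" until the program calls setlocale().
 */
static size_t
format_time822(time_t when, char *buf, size_t len)
{
	const struct tm *tm;

	/* gmtime() returns static storage; fine for a single-threaded sketch. */
	tm = gmtime(&when);
	if (tm == NULL)
		return (0);
	return (strftime(buf, len, "%a, %d %b %G %T %Z", tm));
}
```

For example, 1606744044 is 2020-11-30 13:47:24 UTC, so in the C locale the
result starts with "Mon, 30 Nov 2020 13:47:24"; the zone abbreviation printed
by %Z depends on the libc.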

Sample output in C locale:
1000
Mon, 30 Nov 2020 13:47:24 UTC
2020 Nov 30 13:47:24

The same formats (%'d, %T, %Y) in uk_UA locale:
10 000 000
Пн, 30 лист. 2020 13:43:11 UTC
2020 Nov 30 13:43:11

-- 
Andriy Gapon

Re: buildworld: lib/libc: install: short write to libc.so.7.debug: [_libinstall] Error code 71

2020-11-23 Thread Andriy Gapon

On 22/11/2020 18:06, Kyle Evans wrote:
> On Sun, Nov 22, 2020 at 6:00 AM Dimitry Andric  wrote:
>> I'd guess it's an unintended side-effect of
>> https://svnweb.freebsd.org/base?view=revision&revision=366697
>> ("install(1): Avoid unncessary fstatfs() calls and use mmap() based on
>> size").
>>
> 
> Almost certainly -- before, we would never attempt to mmap() on ZFS.
> 
> Tossing arichardson@ into CC as the committer
> Tossing mmacy@ and freqlabs@ into CC as ZFS folks
> 
> Looking through sys/contrib/openzfs, there's special handling for mmap
> on linux because it bypasses the page cache and relies on caching in
> ARC. AFAICT the FreeBSD side seems to handle write() to mmap'd
> regions, but doesn't do anything with VOP_MMAPPED which might be
> needed to sync the file when it's mmap'd for reading like this. My
> understanding of how this is supposed to actually work on FreeBSD is
> limited, though, so I defer...

Last time I checked, mmap worked correctly with ZFS, but that was before the
switch to OpenZFS.
Perhaps there was an undetected issue; this can be tested, e.g., by applying
the install change to stable/12.
Or perhaps the OpenZFS switch came with a regression.


-- 
Andriy Gapon

Re: dtrace: give %'d a chance?

2020-11-19 Thread Andriy Gapon

On 19/11/2020 15:31, Ash Gokhale wrote:
> I'm not a fan of reading nanosecond timestamps; however, this would add work
> to downstream scripts that have to toss the prettyprint later;
> s|.,()||g downstream. Think of the wee awk scripts.
> Could we gate the behaviour behind an environment DTRACE_LOCALE or whatever?
> 
> Eh It's getting harder to live in the C locale anyway, the immigration  rules
> seem to be tightening. 

Sorry, but you don't have to use %'d.
You can keep using %d.

> On Thu, Nov 19, 2020 at 6:29 AM Andriy Gapon <a...@freebsd.org> wrote:
> 
> 
> what do people think about adding
>     setlocale(LC_NUMERIC, "");
> to dtrace's main function?
> 
> My primary interest is to (pretty-)print some numbers with a thousands
> separator.
> 
> Not sure if any other LC_ types are worth bothering.
> 
> -- 
> Andriy Gapon
> ___
> freebsd-dtr...@freebsd.org <mailto:freebsd-dtr...@freebsd.org> mailing 
> list
> https://lists.freebsd.org/mailman/listinfo/freebsd-dtrace
> To unsubscribe, send any mail to "freebsd-dtrace-unsubscr...@freebsd.org
> <mailto:freebsd-dtrace-unsubscr...@freebsd.org>"
> 


-- 
Andriy Gapon

dtrace: give %'d a chance?

2020-11-19 Thread Andriy Gapon



what do people think about adding
setlocale(LC_NUMERIC, "");
to dtrace's main function?

My primary interest is to (pretty-)print some numbers with a thousands 
separator.

Not sure if any other LC_ types are worth bothering.

-- 
Andriy Gapon

Re: panic: VERIFY(ZFS_TEARDOWN_READ_HELD(zfsvfs)) failed

2020-11-09 Thread Andriy Gapon

On 07/11/2020 19:00, Mateusz Guzik wrote:
> Fixed as of r367454 (also see r367453).

Thank you!

> On 11/6/20, Mateusz Guzik  wrote:
>> I think I have an idea how to keep this. In the meantime you can just
>> comment it out.
>>
>> On 11/6/20, Mateusz Guzik  wrote:
>>> On 11/6/20, Andriy Gapon  wrote:
>>>> On 06/11/2020 22:58, Mateusz Guzik wrote:
>>>>> Note the underlying primitive was recently replaced.
>>>>>
>>>>> One immediate thing to check would be exact state of the lock.
>>>>> READ_HELD checks for reading only, fails if you have this
>>>>> write-locked, which is a plausible explanation if you are coming in
>>>>> from less likely codepath.
>>>>>
>>>>> iow what's the backtrace and can you print both rms->readers and
>>>>> rms->owner (+ curthread)
>>>>
>>>> Unfortunately, I do not have a vmcore, only a picture of the screen.
>>>>
>>>> ZFS code looks correct, the lock should be held in read mode, so indeed
>>>> I suspect that the problem is with rms.
>>>>
>>>> It looks like rms_rlock() does not change rmslock::readers, but
>>>> rms_rowned() checks it?
>>>>
>>>> That's just from a first, super-quick look at the code.
>>>>
>>>
>>> Heh, now that you mention it, I remember wanting to just remove the
>>> arguably spurious assert. Linux is never doing it for reading. The
>>> only state asserts made are for writing which works fine.
>>>
>>> As for reading assertions, there is no performant way to make it work
>>> and I don't think it is worth it as it is.
>>>
>>> As such, I vote for just removing these 2 asserts. They really don't
>>> buy anything to begin with.
>>>
>>> --
>>> Mateusz Guzik 
>>>
>>
>>
>> --
>> Mateusz Guzik 
>>
> 
> 


-- 
Andriy Gapon

(perceived) regression for panics during boot

2020-11-06 Thread Andriy Gapon



Having tried an upgrade from r365296 to r367410 I ran into a problem.
If a system panics during boot, then it does not automatically reboot.
It prints:
  Automatic reboot in 15 seconds - press a key on the console to abort
  --> Press a key on the console to reboot,
  --> or switch off the system now.

So, it looks like the system detects a (phantom) key press after it prints the
first line and then it (correctly) does not see any more key presses.

The result is that unattended systems just hang at the prompt.

I do not recall having this problem earlier.
Not sure where the regression could be.
Perhaps something changed in the keyboard initialization?

-- 
Andriy Gapon

Re: panic: VERIFY(ZFS_TEARDOWN_READ_HELD(zfsvfs)) failed

2020-11-06 Thread Andriy Gapon

On 06/11/2020 22:58, Mateusz Guzik wrote:
> Note the underlying primitive was recently replaced.
> 
> One immediate thing to check would be exact state of the lock.
> READ_HELD checks for reading only, fails if you have this
> write-locked, which is a plausible explanation if you are coming in
> from less likely codepath.
> 
> iow what's the backtrace and can you print both rms->readers and
> rms->owner (+ curthread)

Unfortunately, I do not have a vmcore, only a picture of the screen.

ZFS code looks correct, the lock should be held in read mode, so indeed I
suspect that the problem is with rms.

It looks like rms_rlock() does not change rmslock::readers, but rms_rowned()
checks it?

That's just from a first, super-quick look at the code.


> On 11/6/20, Andriy Gapon  wrote:
>>
>> The subject panic happens for me with r367410 when mounting root
>> filesystem.
>> The panic is in zfs_freebsd_cached_lookup -> zfs_lookup -> zfs_dirlook.
>> I have a picture of the screen with a little bit more details, I'll share it
>> later.

-- 
Andriy Gapon

panic: VERIFY(ZFS_TEARDOWN_READ_HELD(zfsvfs)) failed

2020-11-06 Thread Andriy Gapon



The subject panic happens for me with r367410 when mounting root filesystem.
The panic is in zfs_freebsd_cached_lookup -> zfs_lookup -> zfs_dirlook.
I have a picture of the screen with a little bit more details, I'll share it 
later.

-- 
Andriy Gapon

Re: Zpool doesn't boot anymore after FreeBSD 12.1

2020-10-22 Thread Andriy Gapon

On 22/10/2020 16:39, Cassiano Peixoto wrote:
> Hi Andriy,
> 
> I've just tried copying my zfsloader from 11.2-STABLE (R350026) to FreeBSD
> 12.1 and 12.2 (STABLE) and fixed the issue.
> 
> I also tried to use zfsloader of 11.3 but didn't work and the same issue
> happened.
> 
> So it seems that something has changed on zfsloader after 11.2 that brings
> this issue.
> 
> My question is: Should it be expected or is it a bug to be fixed?
> 

In my opinion it's a bug.
zfsloader should not require disks to be partitioned.

-- 
Andriy Gapon

Re: Zpool doesn't boot anymore after FreeBSD 12.1

2020-10-22 Thread Andriy Gapon

On 21/10/2020 15:20, Cassiano Peixoto wrote:
> Hi there,
> 
> Anyone can help please? I've many servers with this same issue. Thanks

Can you try to replace /boot/zfsloader with zfsloader from other FreeBSD
versions?
E.g., 12.0, 12.2-RC, 11.4, or a recent snapshot of CURRENT?

> On Fri, Oct 16, 2020 at 10:24 AM Cassiano Peixoto 
> wrote:
> 
>> Hi there,
>>
>> I have a FreeBSD 12.1-STABLE running on VMWARE with one disk. Then I added
>> two more disks to expand my pool. BTW I already did it many times with no
>> issues.
>>
>> I ran:
>>
>> # zpool status
>>   pool: zroot
>>  state: ONLINE
>> status: Some supported features are not enabled on the pool. The pool can
>>  still be used, but some features are unavailable.
>> action: Enable all features using 'zpool upgrade'. Once this is done,
>>  the pool may no longer be accessible by software that does not support
>>  the features. See zpool-features(7) for details.
>>   scan: none requested
>> config:
>>
>>         NAME         STATE     READ WRITE CKSUM
>>         zroot        ONLINE       0     0     0
>>           gpt/disk0  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> # zpool add -f zroot da1
>> # zpool add -f zroot da2
>> # zpool status
>>   pool: zroot
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME         STATE     READ WRITE CKSUM
>>         zroot        ONLINE       0     0     0
>>           gpt/disk0  ONLINE       0     0     0
>>           da1        ONLINE       0     0     0
>>           da2        ONLINE       0     0     0
>>
>> errors: No known data errors
>> # reboot
>>
>> Then my system doesn’t boot anymore, i got the following error:
>>
>> gptzfsboot: error 4 lba 2038346899
>> gptzfsboot: error 4 lba 1361327267
>> /boot/config: -Dh
>>
>> BTX loader 1.00  BTX version is 1.02
>> Consoles: internal video/keyboard serial port
>> BIOS drive A: is fd0
>> BIOS drive C: is disk0
>> BIOS drive D: is disk1
>> BIOS drive E: is disk2
>> BIOS drive F: is disk3
>> BIOS drive G: is disk4
>> BIOS drive H: is disk5
>> ZFS: i/o error - all block copies unavailable
>> ZFS: failed to read pool zroot directory object
>> BIOS 638kB/3143616kB available memory
>>
>> FreeBSD/x86 bootstrap loader, Revision 1.1
>> ERROR: cannot open /boot/lua/loader.lua: invalid argument.
>>
>> Type '?' for list of commands, 'help' for more detailed help.
>> OK
>>
>> I can import my pool with no problems using the lived, but I could not fix 
>> it.
>>
>> Seems a bug after 12.1-STABLE. Please, can anyone take a look at that?
>>
>> Thanks.
>>
>>
>>
>>
> ___
> freebsd-sta...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
> 


Re: clang build buggy code with certain CPUTYPE setting

2020-09-28 Thread Andriy Gapon

On 26/09/2020 22:55, Marek Zarychta wrote:
> Thank you for the information and for the fix. Sadly I must admit it
> doesn't work for me. I have tried two builds with fresh sources today to
> be certain and it looks like the bug is still present on FreeBSD
> 13-CURRENT r366186. Either the upstream fixed it only partially or it is
> another bug. As a workaround, I will build worlds without
> CPUTYPE?=amdfam10 for a while. I hope the problem will be resolved
> before clang 11 is MFCed to 12-STABLE.

Can you disassemble the faulting instruction in the core dump?
Can you provide full CPU ID / features information from dmesg?

-- 
Andriy Gapon

Re: Plans for git

2020-09-20 Thread Andriy Gapon

Just my +100500 to this.

On 20/09/2020 18:03, Christian Weisgerber wrote:
> On 2020-09-19, Zaphod Beeblebrox  wrote:
> 
>> Hrm.  Maybe what I hear others saying, tho, and not entirely being replied
>> to is just a nice concise document of the why.  What I hear you saying is
>> that GIT has momentum and that it's popular... (and I accept that --- it is
>> evidently true), but then I hear handwaving about features, but no list of
> features that are a clear win/lose.
> 
> How about the very basics (that Warner appears to have lost sight
> of)?
> 
> Git is a distributed version control system.  You clone a repository
> and apart from pulling and pushing changes to another repository,
> all your work happens with the local repository.  Subversion has a
> central repository and needs to talk to the server all the time.
> Laptop on a plane?  No change of workflow with Git.
> 
> And since it's your repository, you can cheaply create your own
> branches, where you can commit your work and have a versioned history
> of it instead of just a flat diff.  I can't overstate the value of
> that.  Whether you work on something that will be pushed back
> upstream or just your private changes, it has a full commit history.
> You can easily revert commits, you can upstream it one by one, you
> can upstream it with history.
> 
> When FreeBSD switched from CVS to SVN, there was hope or promise
> of lightweight branches, but that never materialized.  Developers
> still can't have private branches in the FreeBSD repository.  For
> a while, a lot of development happened in a Perforce repository--a
> commercial version control system, whose company had donated a
> license--which offered this feature.  Nowadays, everybody who does
> any but the most trivial development does so in a private Git
> repository anyway.  It only makes sense to interface this directly
> with the FreeBSD repository instead of going through a SVN<>Git
> media break.
> 
>> Certainly the only clear things a quick search turns up that seem relevant
>> is that GIT is GPL2.0 and SVN is Apache2.0.  This was enough for LLVM vs
>> GCC and the repository is a core function, but I suppose not a necessary
>> function for forked projects that can't abide, so...
> 
> There is a bit of historical precedent: The original BSD work at
> Berkeley was kept in a SCCS repository, a proprietary version control
> system at the time.
> 
> And of course the fact that significant FreeBSD development has
> effectively happened in Perforce, then in Git for a long time and
> is just merged back into the Subversion repository.  To put it
> bluntly, the people doing the work have voted with their feet years
> ago.
> 


-- 
Andriy Gapon

Re: Wake from sleep kinda broken-ish? (ThinkPad Carbon X1 6th gen)

2020-09-16 Thread Andriy Gapon

On 16/09/2020 10:05, Warner Losh wrote:
> 
> 
> On Wed, Sep 16, 2020 at 12:31 AM Andriy Gapon <a...@freebsd.org> wrote:
> 
> On 15/09/2020 23:13, Eirik Øverby wrote:
> > On 9/15/20 9:50 PM, Andriy Gapon wrote:
> >> On 15/09/2020 22:36, Eirik Øverby wrote:
> >>> Now, since I updated from r365358 to r365688, I have not once been able
> >>> to wake from sleep.
> >>
> >> Is that the only thing that changed?
> >> Any port / package upgrades?
> >
> > There have been updates to packages, yes - but it didn't even occur to me
> > that these could impact the resume process at such an early stage. Not sure
> > which that would be; obviously the drm module has been rebuilt each time I
> > upgraded, but I don't have any other kernel modules installed from packages.
> 
> Yes, I specifically had drm modules in mind.
> 
> 
> I too can report this for my Lenovo Yoga running code as of September 13, but
> with manu's latest drm...  It used to work fine, but my last build on the
> system was from May. Most likely a new panic in that code path, but I've not
> chased down further...

One thing to check is to set debug.acpi.suspend_bounce=1 before suspending.
This will run suspend (and then resume) methods of all drivers just like for a
normal suspend, but will skip the actual ACPI suspend.

-- 
Andriy Gapon

Re: Wake from sleep kinda broken-ish? (ThinkPad Carbon X1 6th gen)

2020-09-16 Thread Andriy Gapon

On 15/09/2020 23:13, Eirik Øverby wrote:
> On 9/15/20 9:50 PM, Andriy Gapon wrote:
>> On 15/09/2020 22:36, Eirik Øverby wrote:
>>> Now, since I updated from r365358 to r365688, I have not once been able to 
>>> wake from sleep.
>>
>> Is that the only thing that changed?
>> Any port / package upgrades?
> 
> There have been updates to packages, yes - but it didn't even occur to me 
> that these could impact the resume process at such an early stage. Not sure 
> which that would be; obviously the drm module has been rebuilt each time I 
> upgraded, but I don't have any other kernel modules installed from packages.

Yes, I specifically had drm modules in mind.


-- 
Andriy Gapon

Re: Wake from sleep kinda broken-ish? (ThinkPad Carbon X1 6th gen)

2020-09-15 Thread Andriy Gapon

On 15/09/2020 22:36, Eirik Øverby wrote:
> Now, since I updated from r365358 to r365688, I have not once been able to 
> wake from sleep.

Is that the only thing that changed?
Any port / package upgrades?

-- 
Andriy Gapon

Re: [openzfs] r365058 arm64 uefi boot fails with "unknown filesystem" after launching kernel

2020-09-03 Thread Andriy Gapon

On 03/09/2020 10:01, Dave Cottlehuber wrote:
> On Thu, 3 Sep 2020, at 06:48, Andriy Gapon wrote:
>> On 03/09/2020 00:01, Navdeep Parhar wrote:
>>> Load cryptodev manually from the loader to boot and then add
>>> cryptodev_load="YES" to your loader.conf.
>>
>> I think that this shouldn't be needed *if* zfs module has a dependency on
>> cryptodev module (MODULE_DEPEND in the code).
>> The loader knows how to load dependencies.
> 
> emaste mentioned that this dependency walking doesn't work on aarch64 yet,
> until after loader stage is complete.

Sorry for the noise then!
I didn't know about that bug.


-- 
Andriy Gapon

Re: [openzfs] r365058 arm64 uefi boot fails with "unknown filesystem" after launching kernel

2020-09-03 Thread Andriy Gapon

On 03/09/2020 00:01, Navdeep Parhar wrote:
> Load cryptodev manually from the loader to boot and then add
> cryptodev_load="YES" to your loader.conf.

I think that this shouldn't be needed *if* zfs module has a dependency on
cryptodev module (MODULE_DEPEND in the code).
The loader knows how to load dependencies.

-- 
Andriy Gapon

Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-13 Thread Andriy Gapon

On 14/07/2020 03:39, Willem Jan Withagen wrote:
> And what I read from the manual page, mrsas plays even nicer with CAM which
> is a plus.

If by "nicer" you mean that mfi does not integrate with CAM at all, then you are
right :-)
Also, last I looked mfi has some pretty serious bugs in its direct interface to
GEOM.  We've seen all kinds of crashes with mfi at work.

Whatever the reason why mrsas is not always preferred over mfi, it must be
pretty nebulous, like POLA for existing users.  From a technical point of view,
mrsas appears to be superior.

-- 
Andriy Gapon

Re: driver for cp2112 (USB GPIO and I2C gadget)

2020-07-08 Thread Andriy Gapon

On 19/06/2020 17:14, Andriy Gapon wrote:
> 
> If anyone interested in reviewing a new driver please help yourself to:
> https://reviews.freebsd.org/D25359
> https://reviews.freebsd.org/D25360
> What might be curious about it is that there are usb, i2c and gpio mixed together.

Any interest at all?
I am still torn about which of the approaches to take.

-- 
Andriy Gapon

Re: weird Ctrl-T debug messages

2020-06-27 Thread Andriy Gapon

On 27/06/2020 10:44, Li-Wen Hsu wrote:
> On Sat, Jun 27, 2020 at 3:04 PM Hartmann, O.  wrote:
>>
>> Running poudriere on recent CURRENT with (recent) 12-STABLE and CURRENT
>> jails reveals a weird behaviour recently when hitting Ctrl-T:
> ...
>> Is this debug fallout from /bin/sh?
> 
> It's because kern.tty_info_kstacks is on by default now:
> 
> https://svnweb.freebsd.org/changeset/base/362141

May I suggest that the stack trace be printed procstat -kk style (single line)?
I think that the more compact output would be more convenient.


-- 
Andriy Gapon

[HEADSUP] snd/hda interrupt handling change

2020-06-18 Thread Andriy Gapon



If you get any problems with the HDA sound driver, please be aware of r362294.
Please let me know about any problems that appear to be related to that commit.
It would be helpful to test if reverting the commit helps.
Thanks.

-- 
Andriy Gapon

Re: Building modules gives error: "invalid output constraint '=@cce' in asm"

2020-06-17 Thread Andriy Gapon

On 17/06/2020 04:53, Rajesh Kumar wrote:
> Then, I am trying to compile the driver modules and hit the
> compilation error.  I haven't done "install world" as I don't want the base
> 12.0 to be disturbed.

You should do `make buildenv` and then do the module build in the subshell.
This way you will be using a compiler (toolchain, in general) from the
buildworld, not the currently installed compiler.

-- 
Andriy Gapon

Re: MRSAS Panic during Install.

2020-06-09 Thread Andriy Gapon

On 09/06/2020 03:42, Santiago Martinez wrote:
> Hi Everyone, today I tested with 12.1 and it works without any issues (at least
> for now).
> 
> I will sync against current and see if it fails.
> 
> Santiago
> 
> On 2020-06-08 17:41, Santiago Martinez wrote:
>> Hi there, tried again and now i got it with UFS also.. that make sense.. right...
>>
>>
>> On 2020-06-08 15:20, Santiago Martinez wrote:
>>> Hi Everyone,
>>>
>>> I'm installing FreeBSD current(361567) snapshot on a Lenovo SR655 server.
>>> After selecting ZFS, and the installer tries to make the partitions, etc I
>>> get the following panic.
>>>
>>> I tried selecting UFS and its works.
>>>
>>> I uploaded a screenshot as I only have KVM access to it:
>>>
>>> https://0bin.net/paste/4yn33GkSKiYto6m4#h78yCE6h80-3DsApbXa1XLW9+bhoKhOr3MVS+NRgA5A
>>>
>>>
>>> The server is a ThinkSystem SR655, with the following controller, RAID 930-8i
>>> 2GB Flash PCIe 12Gb Adapter

Lousy OCR of the picture:
...
nic: nutex mrsas_sin_lock not ouned at /usr/src/sys/kern/kern_nutex.c:284
...
b_trace_self_urapper () at db_trace_self_urapper+8x2b/frane BxfeB33c44a918
anic() at vpanic+Bx182/frane BxfeA33c44ad68
nic() at panic+Bx43/frame BxfeB33c44adcd
_mtx_assert() at __mtx_assert+@xb@/frane Bxfed33c44a9dd
callout_stop_safe() at _callout_stop_safe+Bx82/frane Bxfe33c44aac
rsas_conplete_cnd() at mrsas_complete_cnd+8x1b8/frane BxfeB33c4daaed
ithread_loop() at ithread_loop+@x279/frame BxfeB33c44ah78

This looks like fallout from r342064.
cm_callout is initialized like this:
callout_init_mtx(>cm_callout, >sim_lock, 0);
but in mrsas_complete_cmd() it's stopped without holding the lock.

-- 
Andriy Gapon

Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]

2020-05-27 Thread Andriy Gapon

On 27/05/2020 16:27, John Baldwin wrote:
> The "solution" I think is to have resume be multi-pass and to resume all the bridges
> first before trying to resume leaf devices (including timers), but that's a fair bit
> of work.  It might be that we just need to resume timer interrupts later after the
> new-bus resume (I think we currently do it before?), though the reason for that was
> to allow resume methods in devices to sleep (I'm not sure if any do).

But it's not only about timers.
{sbin,bin,micro,etc}uptime() calls can return garbage as well and confuse their
callers.

-- 
Andriy Gapon

Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]

2020-05-27 Thread Andriy Gapon

On 27/05/2020 11:13, Andriy Gapon wrote:
> I added more diagnostics and it seems to support the idea that the problem is
> related to I/O cycles and bridges.
> 
> ACPI timer suddenly starts returning 0x and that lasts for tens of
> microseconds before the timer goes back to returning normal values with an
> expected increase.
> AMD provides a proprietary way to access ACPI registers via MMIO (0xfed808xx).
> That mechanism is unaffected, ACPI timer register always returns good values.
> 
> The problem seems to happen when restoring configuration of a particular PCI
> bridge.  What's interesting is that the bridge decodes one memory range and one
> I/O range.
> 
> Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND so
> early.  Could it be that after the resume the bridge is configured with a wrong
> I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that decoding?
> So, the bridge steals I/O cycles destined for ACPI support hardware.  If there
> is nothing behind the bridge to handle those ports, then we get those bad readings.
> Once the bridge configuration is fully restored, the I/O handling goes back to
> normal.

From what I see, this looks like a BIOS bug.
Upon resume, it swaps window configurations of pcib1 and pcib2 (until FreeBSD
restores them).  pcib1 originally does not have an I/O window.  So, BIOS
programs both base and limit of pcib2 I/O window to zero.   When FreeBSD writes
its command register to enable I/O decoding it starts claiming 0x0 - 0xFFF I/O
port range.  That covers the ACPI ports at 0x8xx.

Some printf-s.
From (verbose) boot time:
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0xfea0-0xfeaf
pcib2:   domain            0
pcib2:   secondary bus     2
pcib2:   subordinate bus   2
pcib2:   I/O decode        0xf000-0x
pcib2:   memory decode     0xfe90-0xfe9f

My printf-s from resume time:
pcib1: old I/O base (low): 0xf1
pcib1: old I/O base (high): 0x0
pcib1: old I/O limit (low): 0x1
pcib1: old I/O limit (high): 0x0
pcib2: old I/O base (low): 0x1
pcib2: old I/O base (high): 0x0
pcib2: old I/O limit (low): 0x1
pcib2: old I/O limit (high): 0x0
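
As an illustration of the window arithmetic described above, here is a minimal
userland sketch in C of how the I/O Base and I/O Limit registers of a
PCI-to-PCI bridge decode into a port range.  The names decode_io_window and
window_claims_port are hypothetical helpers written for this note, not FreeBSD
functions, and port 0x808 in the usage note is only an example of an address
in the 0x8xx range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded I/O window of a PCI-to-PCI bridge. */
struct io_window {
    uint32_t base;   /* first I/O port claimed by the bridge */
    uint32_t limit;  /* last I/O port claimed by the bridge */
};

/*
 * Decode the I/O Base / I/O Limit registers of a PCI-to-PCI bridge.
 * The upper nibble of each low register holds address bits 15:12 and
 * the "upper 16" registers hold bits 31:16.  The low 12 bits are
 * implied: 0x000 for the base and 0xfff for the limit.
 */
static struct io_window
decode_io_window(uint8_t base_lo, uint16_t base_hi,
    uint8_t limit_lo, uint16_t limit_hi)
{
    struct io_window w;

    w.base = ((uint32_t)base_hi << 16) | ((uint32_t)(base_lo & 0xf0) << 8);
    w.limit = ((uint32_t)limit_hi << 16) |
        ((uint32_t)(limit_lo & 0xf0) << 8) | 0xfff;
    return (w);
}

/* A window with base > limit is closed; otherwise it claims [base, limit]. */
static bool
window_claims_port(struct io_window w, uint32_t port)
{
    return (w.base <= w.limit && port >= w.base && port <= w.limit);
}
```

With the resume-time values printed for pcib2 (low base 0x1, low limit 0x1,
both high halves 0), the sketch yields the range 0x0 - 0xfff, which contains
every 0x8xx port.  The values printed for pcib1 (0xf1 / 0x1) give a base above
the limit, that is, a closed window.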

-- 
Andriy Gapon

Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]

2020-05-27 Thread Andriy Gapon

On 27/05/2020 01:14, John Baldwin wrote:
> On 5/26/20 11:55 AM, Konstantin Belousov wrote:
>> On Tue, May 26, 2020 at 06:22:13PM +0300, Andriy Gapon wrote:
>>> I am not sure if this is just a coincidence but it appears as if a write to some
>>> PCI configuration register could temporarily interfere with access to the PM
>>> timer I/O port.
>>> Is that plausible?
>> If something disabled a BAR, then typical response of x86 chipset for timed
>> out read from PCIe is 0xf... . 
> 
> And the ACPI timer might be "behind" the isab0 bridge device which would indeed
> cause this.

I added more diagnostics and it seems to support the idea that the problem is
related to I/O cycles and bridges.

ACPI timer suddenly starts returning 0x and that lasts for tens of
microseconds before the timer goes back to returning normal values with an
expected increase.
AMD provides a proprietary way to access ACPI registers via MMIO (0xfed808xx).
That mechanism is unaffected, ACPI timer register always returns good values.

The problem seems to happen when restoring configuration of a particular PCI
bridge.  What's interesting is that the bridge decodes one memory range and one
I/O range.

Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND so
early.  Could it be that after the resume the bridge is configured with a wrong
I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that decoding?
So, the bridge steals I/O cycles destined for ACPI support hardware.  If there
is nothing behind the bridge to handle those ports, then we get those bad
readings.
Once the bridge configuration is fully restored, the I/O handling goes back to
normal.

Is this possible?

P.S.
pci_cfg_restore() also attempts to restore PCIR_INTPIN, but it's read-only?

-- 
Andriy Gapon

Re: iflib and options RSS is a no go for igbX

2020-05-26 Thread Andriy Gapon

On 26/05/2020 12:03, Hans Petter Selasky wrote:
> Hi,
> 
> I just found a bug where outgoing TCP connections over igb0 don't work because
> likely the software computed hash is wrong, so the incoming packets get dropped
> because they are received on the wrong queue.
> 
> It is the management port, so I'm just using this hack for now:
> 
> dev.igb.0.iflib.override_nrxqs=1
> dev.igb.0.iflib.override_ntxqs=1

This is a very common problem in our drivers.
E.g., https://reviews.freebsd.org/D24733


-- 
Andriy Gapon

acpi timer reads all ones [Was: efirtc + atrtc at the same time]

2020-05-26 Thread Andriy Gapon

On 25/05/2020 11:37, Andriy Gapon wrote:
> Also, there is another issue related to atrtc.
> When I have both drivers attached, and also when I have only atrtc attached
> (efi.rt.disabled=1), system clock jumps 10 minutes forward after each suspend /
> resume cycle (S0 -> S3 -> S0).  That does not happen for reboot and shutdown
> cycles.  I haven't investigated this deeper, but it is a curious problem.

Actually, I was wrong.  The problem can also occur with efirtc alone.
Also, sometimes there is a different problem where there are no callouts for a
period of time on the order of minutes.  I tracked it to cc_lastscan being set
to a value greater than the current uptime.  So, any scheduled callout gets
scheduled at cc_lastscan and it is a while before the uptime catches up.
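
The effect of a stale cc_lastscan can be sketched as follows.  This is a
simplified userland model in C, not the kernel code: times are plain seconds,
and clamp_deadline and wait_until_fire are hypothetical names.  It assumes only
what is described above, namely that a callout is never scheduled earlier than
cc_lastscan.

```c
#include <assert.h>
#include <stdint.h>

/* A callout cannot be scheduled before the last scan time. */
static int64_t
clamp_deadline(int64_t requested, int64_t lastscan)
{
    return (requested < lastscan ? lastscan : requested);
}

/* How long, measured from 'now', until the callout actually fires. */
static int64_t
wait_until_fire(int64_t now, int64_t delay, int64_t lastscan)
{
    int64_t deadline = clamp_deadline(now + delay, lastscan);

    /* Clamping can only postpone a callout, never advance it. */
    assert(deadline >= now + delay);
    return (deadline - now);
}
```

With a sane cc_lastscan in the past, a one-second callout fires after one
second.  If cc_lastscan is 600 seconds ahead of the current uptime, the same
callout waits the full 600 seconds.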

It seemed that both issues were connected and were a result of the uptime
jumping forward by some minutes and then jumping back to a sane value.
If something important happened during the weird period, like getting time of
day from hardware or invoking a callout, it led to the observed effects.

So, that gave me some ideas where to add debugging checks.
What I determined is that the ACPI timer (ACPI-fast) could produce a reading of all
1-s, as happens when there is no hardware response.
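
A rough sketch, in userland C, of why a single all-ones reading matters.  It
assumes that the timecounter advances time by the masked difference between
consecutive counter reads (a simplification of the real code) and that the
ACPI PM timer runs at its nominal 3.579545 MHz.  counter_delta and
delta_seconds are hypothetical names, and whether the timer on this machine is
24 or 32 bits wide is not stated here, so the usage note covers both masks.

```c
#include <assert.h>
#include <stdint.h>

#define ACPI_PM_TIMER_HZ 3579545u  /* nominal ACPI PM timer frequency */

/* Ticks elapsed between two counter reads, modulo the counter width. */
static uint32_t
counter_delta(uint32_t prev, uint32_t now, uint32_t mask)
{
    assert(mask != 0);
    return ((now - prev) & mask);
}

/* Whole seconds represented by a tick delta. */
static uint32_t
delta_seconds(uint32_t delta)
{
    return (delta / ACPI_PM_TIMER_HZ);
}
```

With a previous read of 0x1000, an all-ones read gives a delta of 0xffffefff
under a 32-bit mask, which is about 1199 seconds.  Under a 24-bit mask the same
read gives a delta of about 4 seconds.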

I caught one such instance and got a stack trace for it (but no crash dump
because devices had not resumed yet):
tc_windup() at tc_windup+0x318/frame 0xfe00a7a19300
tc_ticktock() at tc_ticktock+0x4b/frame 0xfe00a7a19320
hardclock() at hardclock+0x107/frame 0xfe00a7a19360
handleevents() at handleevents+0xb3/frame 0xfe00a7a193a0
timercb() at timercb+0x196/frame 0xfe00a7a193f0
lapic_handle_timer() at lapic_handle_timer+0x98/frame 0xfe00a7a19420
Xtimerint() at Xtimerint+0xb1/frame 0xfe00a7a19420
--- interrupt, rip = 0x80b34500, rsp = 0xfe00a7a194f8, rbp = 0xfe00a7a19540 ---
acpi_pcib_write_config() at acpi_pcib_write_config/frame 0xfe00a7a19540
pci_cfg_restore() at pci_cfg_restore+0x2cc/frame 0xfe00a7a195a0
pci_resume_child() at pci_resume_child+0xee/frame 0xfe00a7a195e0
pci_resume() at pci_resume+0x49/frame 0xfe00a7a19630
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19650
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19680
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196a0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a196d0
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196f0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19720
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19740
root_resume() at root_resume+0x29/frame 0xfe00a7a19770
acpi_EnterSleepState() at acpi_EnterSleepState+0x73b/frame 0xfe00a7a197f0
acpi_AckSleepState() at acpi_AckSleepState+0x144/frame 0xfe00a7a19820
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe00a7a19870
vn_ioctl() at vn_ioctl+0x132/frame 0xfe00a7a19980
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfe00a7a199a0
kern_ioctl() at kern_ioctl+0x27b/frame 0xfe00a7a19a00
sys_ioctl() at sys_ioctl+0x123/frame 0xfe00a7a19ad0
amd64_syscall() at amd64_syscall+0x140/frame 0xfe00a7a19bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00a7a19bf0

I am not sure if this is just a coincidence but it appears as if a write to some
PCI configuration register could temporarily interfere with access to the PM
timer I/O port.
Is that plausible?

I'll try to dig up more data.

-- 
Andriy Gapon

efirtc + atrtc at the same time

2020-05-25 Thread Andriy Gapon



I see that on my laptop both efirtc and atrtc get attached.
The latter is via an ACPI attachment:
efirtc0: 
efirtc0: registered as a time-of-day clock, resolution 1.00s
atrtc0:  port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s

I am not sure if this is a problem by itself, but it certainly seems redundant
to have two drivers controlling the same(?) hardware via different platform
mechanisms.
Maybe there is a nice way to automatically disable (or "neutralize") one of the
drivers?

Also, there is another issue related to atrtc.
When I have both drivers attached, and also when I have only atrtc attached
(efi.rt.disabled=1), system clock jumps 10 minutes forward after each suspend /
resume cycle (S0 -> S3 -> S0).  That does not happen for reboot and shutdown
cycles.  I haven't investigated this deeper, but it is a curious problem.

-- 
Andriy Gapon

Re: zfs deadlock on r360452 relating to busy vm page

2020-05-14 Thread Andriy Gapon

On 13/05/2020 17:42, Mark Johnston wrote:
> On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote:
>> On 13/05/2020 10:35, Andriy Gapon wrote:
>>> In r329363 I re-worked zfs_getpages and introduced range locking to it.
>>> At the time I believed that it was safe and maybe it was, please see the commit
>>> message.
>>> There, indeed, have been many performance / concurrency improvements to the VM
>>> system and r358443 is one of them.
>>
>> Thinking more about it, it could be r352176.
>> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not
>> equivalent to the code that they replaced.
>> The original code would check valid field before any locking and it would
>> attempt any locking / busying only if a page is invalid.  The object was required to
>> be locked though.
>> The new code tries to busy the page in any case.
>>
>>> I am not sure how to resolve the problem best.  Maybe someone who knows the
>>> latest VM code better than me can comment on my assumptions stated in the commit
>>> message.
> 
> The general trend has been to use the page busy lock as the single point
> of synchronization for per-page state.  As you noted, updates to the
> valid bits were previously interlocked by the object lock, but this is
> coarse-grained and hurts concurrency.  I think you are right that the
> range locking in getpages was ok before the recent change, but it seems
> preferable to try and address this in ZFS.
> 
>>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in
>>> this corner of the code because of a similar deadlock a long time ago.
> 
> Do they just not implement readahead?

I think so, but not 100% sure.
I recall seeing a comment in illumos code that they do not care about read-ahead
because there is ZFS prefetch and the data will be cached in ARC.  That makes
sense from the I/O point of view, but it does not help with page faults.

> Can you explain exactly what the
> range lock accomplishes here - is it entirely to ensure that znode block
> size remains stable?

As far as I can recall, this is the reason indeed.


-- 
Andriy Gapon

Re: zfs deadlock on r360452 relating to busy vm page

2020-05-13 Thread Andriy Gapon

On 13/05/2020 10:35, Andriy Gapon wrote:
> On 13/05/2020 01:47, Bryan Drewery wrote:
>> Trivial repro:
>>
>> dd if=/dev/zero of=blah & tail -F blah
>> ^C
>> load: 0.21  cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
>> #0 0x80bce615 at mi_switch+0x155
>> #1 0x80c1cfea at sleepq_switch+0x11a
>> #2 0x80b57f0a at _cv_wait+0x15a
>> #3 0x829ddab6 at rangelock_enter+0x306
>> #4 0x829ecd3f at zfs_freebsd_getpages+0x14f
>> #5 0x810e3ab9 at VOP_GETPAGES_APV+0x59
>> #6 0x80f349e7 at vnode_pager_getpages+0x37
>> #7 0x80f2a93f at vm_pager_get_pages+0x4f
>> #8 0x80f054b0 at vm_fault+0x780
>> #9 0x80f04bde at vm_fault_trap+0x6e
>> #10 0x8106544e at trap_pfault+0x1ee
>> #11 0x81064a9c at trap+0x44c
>> #12 0x8103a978 at calltrap+0x8
> 
> In r329363 I re-worked zfs_getpages and introduced range locking to it.
> At the time I believed that it was safe and maybe it was, please see the commit
> message.
> There, indeed, have been many performance / concurrency improvements to the VM
> system and r358443 is one of them.

Thinking more about it, it could be r352176.
I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not
equivalent to the code that they replaced.
The original code would check valid field before any locking and it would
attempt any locking / busying only if a page is invalid.  The object was required to
be locked though.
The new code tries to busy the page in any case.

> I am not sure how to resolve the problem best.  Maybe someone who knows the
> latest VM code better than me can comment on my assumptions stated in the commit
> message.
> 
> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in
> this corner of the code because of a similar deadlock a long time ago.
> 
>> On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>>>> panic: deadlres_td_sleep_q: possible deadlock detected for 
>>>> 0xfe25eefa2e00 (find), blocked for 1802392 ticks
> ...
>>>> (kgdb) backtrace
>>>> #0  sched_switch (td=0xfe255eac, flags=) at 
>>>> /usr/src/sys/kern/sched_ule.c:2147
>>>> #1  0x80bce615 in mi_switch (flags=260) at 
>>>> /usr/src/sys/kern/kern_synch.c:542
>>>> #2  0x80c1cfea in sleepq_switch (wchan=0xf810fb57dd48, pri=0) 
>>>> at /usr/src/sys/kern/subr_sleepqueue.c:625
>>>> #3  0x80b57f0a in _cv_wait (cvp=0xf810fb57dd48, 
>>>> lock=0xf80049a99040) at /usr/src/sys/kern/kern_condvar.c:146
>>>> #4  0x82434ab6 in rangelock_enter_reader (rl=0xf80049a99018, 
>>>> new=0xf8022cadb100) at 
>>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:429
>>>> #5  rangelock_enter (rl=0xf80049a99018, off=, 
>>>> len=, type=) at 
>>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477
>>>> #6  0x82443d3f in zfs_getpages (vp=, 
>>>> ma=0xfe259f204b18, count=, rbehind=0xfe259f204ac4, 
>>>> rahead=0xfe259f204ad0) at 
>>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4500
>>>> #7  zfs_freebsd_getpages (ap=) at 
>>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567
>>>> #8  0x810e3ab9 in VOP_GETPAGES_APV (vop=0x8250a1e0 
>>>> , a=0xfe259f2049f0) at vnode_if.c:2644
>>>> #9  0x80f349e7 in VOP_GETPAGES (vp=, m=>>> out>, count=, rbehind=, rahead=) at 
>>>> ./vnode_if.h:1171
>>>> #10 vnode_pager_getpages (object=, m=, 
>>>> count=, rbehind=, rahead=) at 
>>>> /usr/src/sys/vm/vnode_pager.c:743
>>>> #11 0x80f2a93f in vm_pager_get_pages (object=0xf806cb637c60, 
>>>> m=0xfe259f204b18, count=1, rbehind=, 
>>>> rahead=) at /usr/src/sys/vm/vm_pager.c:305
>>>> #12 0x80f054b0 in vm_fault_getpages (fs=, nera=0, 
>>>> behindp=, aheadp=) at 
>>>> /usr/src/sys/vm/vm_fault.c:1163
>>>> #13 vm_fault (map=, vaddr=, 
>>>> fault_type=, fault_flags=, m_hold=>>> out>) at /usr/src/sys/vm/vm_fault.c:1394
>>>> #14 0x80f04bde in vm_fault_trap (map=0xfe25653949e8, 
>>>> vaddr=, fault_type=, fault_flags=0, 
>>>> signo=0xfe259f204d04, ucode=0xfe259f204d00) at 
>>>> /usr/src/sys/vm/vm_fault.c:589
>>>> #15 0x8106544e in trap_pfault (frame=0xfe259f204d40,

Re: zfs deadlock on r360452 relating to busy vm page

2020-05-13 Thread Andriy Gapon

rev = 
>>> 0x0}, rvq = {lh_first = 0x0}, handle = 0xf80571f29500, un_pager = {vnp 
>>> = {vnp_size = 4499568,
>>>   writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, 
>>> tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 
>>> 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = 
>>> {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0}
>>> (kgdb) frame 5
>>> #5  vm_page_acquire_unlocked (object=0xf806cb637c60, pindex=1098, 
>>> prev=, mp=0xfe2717fc6730, allocflags=21504) at 
>>> /usr/src/sys/vm/vm_page.c:4469
>>> 4469if (!vm_page_grab_sleep(object, m, pindex, "pgnslp",
>>> (kgdb) p *m
>>> $9 = {plinks = {q = {tqe_next = 0x, tqe_prev = 
>>> 0x}, s = {ss = {sle_next = 0x}}, memguard = 
>>> {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 
>>> 0x, zone = 0x}}, listq = {tqe_next = 0x0, 
>>> tqe_prev = 0xf806cb637ca8},
>>>   object = 0xf806cb637c60, pindex = 1098, phys_addr = 18988408832, md = 
>>> {pv_list = {tqh_first = 0x0, tqh_last = 0xfe001cbca888}, pv_gen = 
>>> 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = 
>>> {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, 
>>> order = 13 '\r',
>>>   pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', 
>>> segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'}
>>
>> Pretty sure this thread is holding the rangelock from zfs_write() that
>> tail is waiting on. So what is this thread (101255) waiting on exactly
>> for? I'm not sure the way to track down what is using vm object
>> 0xf806cb637c60. If the tail thread busied the page then they are
>> waiting on each other I guess. If that's true then r358443 removing the
>> write lock on the object in update_pages() could be a problem.
>>
>>
>> Not sure the rest is interesting. I think they are just waiting on the
>> locked vnode but I give it here in case I missed something.


-- 
Andriy Gapon

lkpi: print stack trace in WARN_ON ?

2020-05-13 Thread Andriy Gapon



Just to get a bigger exposure: https://reviews.freebsd.org/D24779
I think that this is a good idea and, if I am not mistaken, it should match the
Linux behavior.

-- 
Andriy Gapon

Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-10 Thread Andriy Gapon

On 09/05/2020 19:50, Konstantin Belousov wrote:
> On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote:
>> On 09/05/2020 19:13, Konstantin Belousov wrote:
>>> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote:
>>>> I tried this change:
>>>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
>>>> index 4deed86a76d1a..b834b7f0388b7 100644
>>>> --- a/sys/amd64/amd64/pmap.c
>>>> +++ b/sys/amd64/amd64/pmap.c
>>>> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
>>>>  #define   NPV_LIST_LOCKS  MAXCPU
>>>>
>>>>  #define   PHYS_TO_PV_LIST_LOCK(pa)\
>>>> -  (_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
>>>> +  (_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
>>>>  #endif
>>>>
>>>>  #define   CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa)  do {\
>>>>
>>>> It fixed the original problem, but I got a new panic.
>>>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
>>>> I guess that !NUMA variant does not get much testing, so I'll probably just
>>>> stick with the default.
>>> Why didn't you just remove the KASSERT from pa_index ?
>>
>> Well, I thought it might be useful in the NUMA case.
>> pa_index() definition is shared between both cases.
> Maybe define the macro two times, for NUMA/non-NUMA.  non-NUMA case
> does not need the assert, because users take it mod NPV_LIST_LOCKS.

Yes, this works.
Thank you!

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 4deed86a76d1a..8dd236acc8205 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -323,12 +323,12 @@ pmap_pku_mask_bit(pmap_t pmap)
 #endif

 #undef pa_index
+#ifdef NUMA
 #define pa_index(pa)    ({                                      \
         KASSERT((pa) <= vm_phys_segs[vm_phys_nsegs - 1].end,    \
             ("address %lx beyond the last segment", (pa)));     \
         (pa) >> PDRSHIFT;                                       \
 })
-#ifdef NUMA
 #define pa_to_pmdp(pa)  (_table[pa_index(pa)])
 #define pa_to_pvh(pa)   (&(pa_to_pmdp(pa)->pv_page))
 #define PHYS_TO_PV_LIST_LOCK(pa)        ({                      \
@@ -340,6 +340,7 @@ pmap_pku_mask_bit(pmap_t pmap)
         _lock;                                                  \
 })
 #else
+#define pa_index(pa)    ((pa) >> PDRSHIFT)
 #define pa_to_pvh(pa)   (_table[pa_index(pa)])

 #define NPV_LIST_LOCKS  MAXCPU
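
Why the non-NUMA case can do without the KASSERT can be shown with a small
userland sketch in C.  It assumes PDRSHIFT is 21 (2 MB superpages on amd64) and
uses 256 as an illustrative value for NPV_LIST_LOCKS (MAXCPU depends on the
kernel configuration).  pv_list_lock_index is a hypothetical helper, and the
1 TB address in the usage note is just an example of a physical address beyond
real memory.

```c
#include <assert.h>
#include <stdint.h>

#define PDRSHIFT        21   /* log2 of the 2 MB superpage size on amd64 */
#define NPV_LIST_LOCKS  256  /* illustrative stand-in for MAXCPU */

/* Non-NUMA pa_index(): no bounds assertion, just the superpage index. */
static uint64_t
pa_index(uint64_t pa)
{
    return (pa >> PDRSHIFT);
}

/* Index of the PV list lock that covers physical address pa. */
static unsigned
pv_list_lock_index(uint64_t pa)
{
    unsigned idx = (unsigned)(pa_index(pa) % NPV_LIST_LOCKS);

    /* The modulo keeps the index in range for any address. */
    assert(idx < NPV_LIST_LOCKS);
    return (idx);
}
```

For a fictitious page at 1 TB the superpage index is 524288, far beyond any
array sized for real memory, but the lock index is 524288 % 256 = 0, which is
still a valid slot in the lock array.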


-- 
Andriy Gapon

Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-09 Thread Andriy Gapon

80bb921c in pmap_remove (pmap=, sva=34523316224,
eva=) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:5506
#44 0x80b301a0 in vm_map_delete (map=0xfe00a4cdb9e8,
start=, end=) at
/usr/devel/git/motil/sys/vm/vm_map.c:3804
#45 0x80b3856e in kern_munmap (td=0xfe009c7be800, addr0=, size=2097152) at /usr/devel/git/motil/sys/vm/vm_mmap.c:624
#46 0x80bcff00 in syscallenter (td=) at
/usr/devel/git/motil/sys/amd64/amd64/../../kern/subr_syscall.c:162
#47 amd64_syscall (td=0xfe009c7be800, traced=0) at
/usr/devel/git/motil/sys/amd64/amd64/trap.c:1161

-- 
Andriy Gapon

Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-09 Thread Andriy Gapon

On 09/05/2020 19:50, Konstantin Belousov wrote:
> On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote:
>> On 09/05/2020 19:13, Konstantin Belousov wrote:
>>> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote:
>>>> On 08/05/2020 19:15, Konstantin Belousov wrote:
>>>>> On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote:
>>>>>>
>>>>>> I have a reproducible panic with a custom kernel without option NUMA while using
>>>>>> amdgpu driver from linuxkpi-based drm:
>>>>>>
>>>>>> panic: address 41ec0 beyond the last segment
>>>>>>
>>>>>> I did some quick debugging and the panic happens when Xorg server tries to
>>>>>> access a frame buffer (or something like that).  There is a page fault that gets
>>>>>> satisfied by ttm with a fictitious page.
>>>>>>
>>>>>> The stack trace is:
>>>>>> #11 0x808031a3 in panic (fmt=0x8119a998 
>>>>>> "5\003ʀ\377\377\377\377") at 
>>>>>> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
>>>>>> #12 0x80bbc552 in pmap_enter (pmap=, 
>>>>>> va=34504441856,
>>>>>> m=, prot=, flags=, 
>>>>>> psind=>>>>> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035
>>>>>> #13 0x80b288be in vm_fault_populate (fs=) at
>>>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:519
>>>>>> #14 vm_fault_allocate (fs=) at
>>>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:1032
>>>>>> #15 vm_fault (map=, vaddr=, 
>>>>>> fault_type=>>>>> out>, fault_flags=, m_hold=) at
>>>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:1342
>>>>>> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8,
>>>>>> vaddr=, fault_type=, fault_flags=0,
>>>>>> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at
>>>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:589
>>>>>> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00,
>>>>>> usermode=, signo=, ucode=0x80853250
>>>>>> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821
>>>>>> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at
>>>>>> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34
>>>>>>
>>>>>>
>>>>>> The line number in pmap_enter() is incorrect, I guess because of optimizations.
>>>>>> The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS ->
>>>>>> PHYS_TO_PV_LIST_LOCK -> pa_index().
>>>>>>
>>>>>> The panic is correct in that the page is fictitious and its physical address is
>>>>>> beyond the end of real physical memory.
>>>>>> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but !NUMA one
>>>>>> is not.
>>>>>
>>>>> I think you can remove this assert.  pa_index() is always taken by
>>>>> % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
>>>>>
>>>>> Try that and commit if it works for you.
>>>>
>>>> I tried this change:
>>>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
>>>> index 4deed86a76d1a..b834b7f0388b7 100644
>>>> --- a/sys/amd64/amd64/pmap.c
>>>> +++ b/sys/amd64/amd64/pmap.c
>>>> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
>>>>  #define   NPV_LIST_LOCKS  MAXCPU
>>>>
>>>>  #define   PHYS_TO_PV_LIST_LOCK(pa)\
>>>> -  (_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
>>>> +  (_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
>>>>  #endif
>>>>
>>>>  #define   CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa)  do {\
>>>>
>>>> It fixed the original problem, but I got a new panic.
>>>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
>>>> I guess that !NUMA variant does not get much testing, so I'll probably just
>>>> stick with the default.
>>> Why didn't you just remove the KASSERT from pa_index ?
>>
>> Well, I thought it might be useful in the NUMA case.
>> pa_index() definition is shared between both cases.
> Maybe define the macro twice, for NUMA and non-NUMA.  The non-NUMA case
> does not need the assert, because its users take the result mod
> NPV_LIST_LOCKS.
> 

I still don't see how that could help with the "DI already started" panic.
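
For illustration only, the suggestion in the quoted reply (two definitions
of pa_index(), with the range check kept only for NUMA) could look roughly
like the userland sketch below.  The PDRSHIFT value, the last_segment_end
variable and the use of assert() in place of KASSERT are assumptions made
for the sketch, not the actual pmap.c code.  It only concerns the assert
and says nothing about the "DI already started" panic.

```c
#include <assert.h>
#include <stdint.h>

#define PDRSHIFT 21	/* assumed: 2 MB superpage shift on amd64 */

#ifdef NUMA
/* Hypothetical end of the last physical segment (stand-in value). */
static uint64_t last_segment_end = 0x100000000ULL;

/*
 * NUMA sketch: the index is assumed to be used directly as a table
 * index, so the range check is kept.
 */
static inline uint64_t
pa_index(uint64_t pa)
{
	assert(pa <= last_segment_end);
	return (pa >> PDRSHIFT);
}
#else
/*
 * !NUMA sketch: callers reduce the result mod NPV_LIST_LOCKS, so no
 * range check is made here.
 */
static inline uint64_t
pa_index(uint64_t pa)
{
	return (pa >> PDRSHIFT);
}
#endif
```

Built without -DNUMA, pa_index() accepts any address, including one beyond
the end of real physical memory.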

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-09 Thread Andriy Gapon

On 09/05/2020 19:13, Konstantin Belousov wrote:
> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote:
>> On 08/05/2020 19:15, Konstantin Belousov wrote:
>>> On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote:
>>>>
>>>> I have a reproducible panic with a custom kernel without option NUMA
>>>> while using amdgpu driver from linuxkpi-based drm:
>>>>
>>>> panic: address 41ec0 beyond the last segment
>>>>
>>>> I did some quick debugging and the panic happens when Xorg server tries to
>>>> access a frame buffer (or something like that).  There is a page fault
>>>> that gets satisfied by ttm with a fictitious page.
>>>>
>>>> The stack trace is:
>>>> #11 0x808031a3 in panic (fmt=0x8119a998 
>>>> "5\003ʀ\377\377\377\377") at 
>>>> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
>>>> #12 0x80bbc552 in pmap_enter (pmap=<optimized out>, va=34504441856,
>>>> m=<optimized out>, prot=<optimized out>, flags=<optimized out>,
>>>> psind=<optimized out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035
>>>> #13 0x80b288be in vm_fault_populate (fs=<optimized out>) at
>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:519
>>>> #14 vm_fault_allocate (fs=<optimized out>) at
>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:1032
>>>> #15 vm_fault (map=<optimized out>, vaddr=<optimized out>,
>>>> fault_type=<optimized out>, fault_flags=<optimized out>,
>>>> m_hold=<optimized out>) at
>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:1342
>>>> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8,
>>>> vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
>>>> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at
>>>> /usr/devel/git/motil/sys/vm/vm_fault.c:589
>>>> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00,
>>>> usermode=<optimized out>, signo=<optimized out>, ucode=0x80853250
>>>> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821
>>>> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at
>>>> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34
>>>>
>>>>
>>>> The line number in pmap_enter() is incorrect, I guess because of
>>>> optimizations.
>>>> The assert seems to be reached via pmap_enter ->
>>>> CHANGE_PV_LIST_LOCK_TO_PHYS -> PHYS_TO_PV_LIST_LOCK -> pa_index().
>>>>
>>>> The panic is correct in that the page is fictitious and its physical
>>>> address is beyond the end of real physical memory.
>>>> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but
>>>> !NUMA one is not.
>>>
>>> I think you can remove this assert.  pa_index() is always taken by
>>> % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
>>>
>>> Try that and commit if it works for you.
>>
>> I tried this change:
>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
>> index 4deed86a76d1a..b834b7f0388b7 100644
>> --- a/sys/amd64/amd64/pmap.c
>> +++ b/sys/amd64/amd64/pmap.c
>> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
>>  #define NPV_LIST_LOCKS  MAXCPU
>>
>>  #define PHYS_TO_PV_LIST_LOCK(pa)\
>> -(&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
>> +(&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
>>  #endif
>>
>>  #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa)  do {\
>>
>> It fixed the original problem, but I got a new panic.
>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
>> I guess that !NUMA variant does not get much testing, so I'll probably just
>> stick with the default.
> Why didn't you just remove the KASSERT from pa_index()?

Well, I thought it might be useful in the NUMA case.
pa_index() definition is shared between both cases.


-- 
Andriy Gapon

Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-09 Thread Andriy Gapon

On 08/05/2020 19:15, Konstantin Belousov wrote:
> On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote:
>>
>> I have a reproducible panic with a custom kernel without option NUMA
>> while using amdgpu driver from linuxkpi-based drm:
>>
>> panic: address 41ec0 beyond the last segment
>>
>> I did some quick debugging and the panic happens when Xorg server tries to
>> access a frame buffer (or something like that).  There is a page fault that
>> gets satisfied by ttm with a fictitious page.
>>
>> The stack trace is:
>> #11 0x808031a3 in panic (fmt=0x8119a998 
>> "5\003ʀ\377\377\377\377") at 
>> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
>> #12 0x80bbc552 in pmap_enter (pmap=<optimized out>, va=34504441856,
>> m=<optimized out>, prot=<optimized out>, flags=<optimized out>,
>> psind=<optimized out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035
>> #13 0x80b288be in vm_fault_populate (fs=<optimized out>) at
>> /usr/devel/git/motil/sys/vm/vm_fault.c:519
>> #14 vm_fault_allocate (fs=<optimized out>) at
>> /usr/devel/git/motil/sys/vm/vm_fault.c:1032
>> #15 vm_fault (map=<optimized out>, vaddr=<optimized out>,
>> fault_type=<optimized out>, fault_flags=<optimized out>,
>> m_hold=<optimized out>) at
>> /usr/devel/git/motil/sys/vm/vm_fault.c:1342
>> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8,
>> vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
>> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at
>> /usr/devel/git/motil/sys/vm/vm_fault.c:589
>> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00,
>> usermode=<optimized out>, signo=<optimized out>, ucode=0x80853250
>> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821
>> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at
>> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34
>>
>>
>> The line number in pmap_enter() is incorrect, I guess because of
>> optimizations.
>> The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS
>> -> PHYS_TO_PV_LIST_LOCK -> pa_index().
>>
>> The panic is correct in that the page is fictitious and its physical address
>> is beyond the end of real physical memory.
>> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but !NUMA
>> one is not.
> 
> I think you can remove this assert.  pa_index() is always taken by
> % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
> 
> Try that and commit if it works for you.

I tried this change:
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 4deed86a76d1a..b834b7f0388b7 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
 #define NPV_LIST_LOCKS  MAXCPU

 #define PHYS_TO_PV_LIST_LOCK(pa)        \
-        (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
+        (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
 #endif

 #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa)  do {    \

It fixed the original problem, but I got a new panic.
"DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
I guess that !NUMA variant does not get much testing, so I'll probably just
stick with the default.
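
A small userland sketch of why the replacement expression in the diff
cannot index past the lock array: whatever the physical address is, the
result is reduced modulo NPV_LIST_LOCKS.  PDRSHIFT = 21 and
NPV_LIST_LOCKS = 256 are assumed values here, and pv_lock_index() is a
hypothetical helper, not a function in pmap.c.

```c
#include <assert.h>
#include <stdint.h>

#define PDRSHIFT	21	/* assumed: 2 MB superpage shift on amd64 */
#define NPV_LIST_LOCKS	256	/* assumed stand-in for MAXCPU */

/*
 * Hypothetical helper: same index expression as the "+" line of the
 * diff, ((pa) >> PDRSHIFT) % NPV_LIST_LOCKS, with no range check on pa.
 */
static unsigned int
pv_lock_index(uint64_t pa)
{
	unsigned int idx;

	idx = (unsigned int)((pa >> PDRSHIFT) % NPV_LIST_LOCKS);
	/* Holds for every 64-bit pa, real or fictitious. */
	assert(idx < NPV_LIST_LOCKS);
	return (idx);
}
```

For example, pv_lock_index(UINT64_MAX) still yields an index below
NPV_LIST_LOCKS, so an address beyond the last segment selects a valid
lock.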


-- 
Andriy Gapon

CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?

2020-05-08 Thread Andriy Gapon


I have a reproducible panic with a custom kernel without option NUMA while using
amdgpu driver from linuxkpi-based drm:

panic: address 41ec0 beyond the last segment

I did some quick debugging and the panic happens when Xorg server tries to
access a frame buffer (or something like that).  There is a page fault that gets
satisfied by ttm with a fictitious page.

The stack trace is:
#11 0x808031a3 in panic (fmt=0x8119a998 
"5\003ʀ\377\377\377\377") at /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
#12 0x80bbc552 in pmap_enter (pmap=<optimized out>, va=34504441856,
m=<optimized out>, prot=<optimized out>, flags=<optimized out>,
psind=<optimized out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035
#13 0x80b288be in vm_fault_populate (fs=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:519
#14 vm_fault_allocate (fs=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:1032
#15 vm_fault (map=<optimized out>, vaddr=<optimized out>,
fault_type=<optimized out>, fault_flags=<optimized out>,
m_hold=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:1342
#16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8,
vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at
/usr/devel/git/motil/sys/vm/vm_fault.c:589
#17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00,
usermode=<optimized out>, signo=<optimized out>, ucode=0x80853250
) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821
#18 0x80bceeec in trap (frame=0xfe00a810dc00) at
/usr/devel/git/motil/sys/amd64/amd64/trap.c:34


The line number in pmap_enter() is incorrect, I guess because of optimizations.
The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS ->
PHYS_TO_PV_LIST_LOCK -> pa_index().

The panic is correct in that the page is fictitious and its physical address is
beyond the end of real physical memory.
It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but !NUMA one
is not.
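
One way a lock lookup can be made aware of fictitious pages is to fall
back to a single shared lock for any address past the last real physical
address.  The sketch below only illustrates that idea; the table size, the
names and the lock type are hypothetical and not taken from pmap.c.

```c
#include <assert.h>
#include <stdint.h>

#define PDRSHIFT	21	/* assumed: 2 MB superpage shift on amd64 */
#define NENTRIES	8	/* hypothetical number of per-superpage locks */
#define LAST_PA		(((uint64_t)NENTRIES << PDRSHIFT) - 1)

struct sketch_lock {		/* stand-in for a kernel rwlock */
	int	unused;
};

static struct sketch_lock table_locks[NENTRIES];
static struct sketch_lock fallback_lock;

/* Return the lock covering pa, or the shared fallback when out of range. */
static struct sketch_lock *
lock_for_pa(uint64_t pa)
{
	uint64_t idx;

	if (pa > LAST_PA)
		return (&fallback_lock);
	idx = pa >> PDRSHIFT;
	assert(idx < NENTRIES);
	return (&table_locks[idx]);
}
```

With this shape, an address beyond the end of real physical memory never
reaches the table index at all.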

-- 
Andriy Gapon
