14 stable Guided Root-on-ZFS with 2 disk mirror

2021-10-03 Thread Mikhail Holt

Hello All,

- Fresh installation of FreeBSD 14.0-CURRENT from this snapshot image:
  FreeBSD-14.0-CURRENT-amd64-20210930-9aa29457d55-249761-memstick.img.xz

  Took the 'Guided Root-on-ZFS' installation with the following options:
  - Pool type:        Mirror - 2 disks
  - Partition scheme: GPT (UEFI)
  - Mirror Swap?      No
  - Encrypt Swap?     No

- On reboot after the initial installation the system works as it should.
  The zpool is mirrored and the two disks are synced.

- After shutting down and disconnecting the first disk, the system will
  not boot.

  The UEFI BIOS reports that there are no bootable disks.
  On reconnecting the first disk the system boots and the zpool is
  resilvered.


- It appears that the duplicate EFI partition on the second disk is created,
  but no file system is created on it and hence the boot files are not
  installed.


- The system will boot from just the second disk after copying the EFI
  partition from the first disk to the second.


- Is this known/expected behaviour?
  I was expecting to be able to boot from the second disk without
  having to copy the EFI partition.



Thanks
Mikhail
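
For reference, the workaround described above (copying the EFI partition
from the first disk to the second) might look roughly like the sketch
below.  It is not the installer's method.  The device names ada0p1 and
ada1p1 are assumptions and must be checked against `gpart show` first, and
the `run` helper is hypothetical: it only prints each command unless
APPLY=1 is set.

```shell
#!/bin/sh
# Assumed device names; verify with `gpart show` before running anything.
SRC=${SRC:-/dev/ada0p1}   # EFI partition on the first disk (assumption)
DST=${DST:-/dev/ada1p1}   # EFI partition on the second disk (assumption)

# Hypothetical helper: print the command by default, execute it only
# when APPLY=1 is set (which requires root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Show the partition tables so the efi partitions can be identified.
run gpart show

# Block-copy the first disk's EFI partition onto the second disk's.
# Assumes both partitions have the same size and that DST is not mounted.
run dd if="$SRC" of="$DST" bs=1m
```

Running the script as-is only prints the two commands; running it again
with APPLY=1 as root performs the copy.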


Re: witness_lock_list_get: witness exhausted

2021-10-03 Thread Alan Somers
On Mon, Jan 8, 2018 at 5:31 PM Mateusz Guzik wrote:
>
> On Tue, Jan 9, 2018 at 12:41 AM, Michael Jung wrote:
>
> > On 2018-01-08 13:39, John Baldwin wrote:
> >
> >> On Tuesday, November 28, 2017 02:46:03 PM Michael Jung wrote:
> >>
> >>> Hi!
> >>>
> >>> I've recently up'd my processor count on our poudriere box and have
> >>> started noticing the error "witness_lock_list_get: witness exhausted"
> >>> on the console.  The kernel *DOES NOT* crash but I thought the report
> >>> may be useful to someone.
> >>>
> >>> $ uname -a
> >>> FreeBSD poudriere 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r325999: Sun Nov 19 18:41:20 EST 2017 mikej@poudriere:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> >>>
> >>> The machine is pretty busy running four poudriere build instances.
> >>>
> >>> last pid: 76584;  load averages: 115.07, 115.96, 98.30  up 6+07:32:59  14:44:03
> >>> 763 processes: 117 running, 581 sleeping, 2 zombie, 63 lock
> >>> CPU: 59.0% user,  0.0% nice, 40.7% system,  0.1% interrupt,  0.1% idle
> >>> Mem: 12G Active, 2003M Inact, 44G Wired, 29G Free
> >>> ARC: 28G Total, 11G MFU, 16G MRU, 122M Anon, 359M Header, 1184M Other
> >>>   25G Compressed, 32G Uncompressed, 1.24:1 Ratio
> >>>
> >>> Let me know what additional information I might supply.
> >>>
> >>
> >> This just means that WITNESS stopped working because it ran out of
> >> pre-allocated objects.  In particular the objects used to track how
> >> many locks are held by how many threads:
> >>
> >> /*
> >>  * XXX: This is somewhat bogus, as we assume here that at most 2048 threads
> >>  * will hold LOCK_NCHILDREN locks.  We handle failure ok, and we should
> >>  * probably be safe for the most part, but it's still a SWAG.
> >>  */
> >> #define LOCK_NCHILDREN  5
> >> #define LOCK_CHILDCOUNT 2048
> >>
> >> Probably the '2048' (max number of concurrent threads) needs to scale with
> >> MAXCPU.  2048 threads is probably a bit low on big x86 boxes.
> >>
> >
> >
> > Thank you for your explanation.  We are expanding our ESXi cluster, and
> > even though with the standard edition I can only assign 64 vCPUs to a
> > guest (and as much RAM as I want), I do like to help with edge cases if I
> > can make them occur by pushing boundaries, as far as I can, towards
> > additional improvements in FreeBSD.
> >
>
> Can you apply this and re-run the test?
>
> https://people.freebsd.org/~mjg/witness.diff
>
> It bumps the counters to be "high enough" but also starts tracking usage.
> If you get the message again, bump the values even higher.
>
> Once you get a complete poudriere run which did not result in the
> problem, do:
> $ sysctl debug.witness.list_used debug.witness.list_max_used
>
> to dump the actual usage.

This is a nice little patch.  Can we commit it to head?  Even better
would be if LOCK_CHILDCOUNT could be a tunable.  On my largish system,
here's what I get shortly after boot:

debug.witness.list_max_used: 8432
debug.witness.list_used: 8420

-Alan
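
To put the figures above in context, the small sketch below compares a
reported peak against the stock LOCK_CHILDCOUNT of 2048 quoted earlier in
the thread.  It assumes that debug.witness.list_max_used counts entries
taken from the pool sized by LOCK_CHILDCOUNT; the thread does not spell
this out.  The `check_pool` function is a hypothetical helper, and the
sysctl names exist only on a kernel with the patch applied.

```shell
#!/bin/sh
# Stock pool size from the quoted witness code (LOCK_CHILDCOUNT).
LOCK_CHILDCOUNT=2048

# Hypothetical helper: report whether a peak usage figure would have fit
# in the stock pool, and by how much it overshoots if it would not.
check_pool() {
    max_used=$1
    if [ "$max_used" -gt "$LOCK_CHILDCOUNT" ]; then
        echo "peak $max_used exceeds stock pool of $LOCK_CHILDCOUNT by $((max_used - LOCK_CHILDCOUNT))"
    else
        echo "peak $max_used fits in stock pool of $LOCK_CHILDCOUNT"
    fi
}

# On a patched kernel the argument could instead come from
#   sysctl -n debug.witness.list_max_used
# Here the figure reported in this message is used.
check_pool 8432
```

With the reported peak of 8432 the overshoot is 6384 entries, which under
the stated assumption suggests the stock value would be exhausted on such
a system shortly after boot.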