On 2018-Oct-21, at 8:30 PM, Warner Losh <imp at bsdimp.com> wrote:

> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh <imp at bsdimp.com> wrote:
> 
> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable 
> <freebsd-sta...@freebsd.org> wrote:
>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> after installing the build, Hyper-V based boots are
>> working.]
>> 
>> On 2018-Oct-20, at 2:09 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>> > On 2018-Oct-20, at 1:39 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> > 
>> >> I attempted to jump from head -r334014 to -r339076
>> >> on a threadripper 1950X board and the boot fails.
>> >> This is both native booting and under Hyper-V,
>> >> same machine and root file system in both cases.
>> > 
>> > I did my investigation under Hyper-V after seeing
>> > a boot failure native.
>> > 
>> > Looks like the native failure is even earlier,
>> > before db> is even possible, possibly during
>> > early loader activity.
>> > 
>> > So this report is really for running under
>> > Hyper-V: -r338804 boots and -r338810 does
>> > not. By contrast -r338804 does not boot native.
>> > (But I've little information for that context.)
>> > 
>> > Sorry for the confusion. I rushed the report
>> > in hopes of getting to sleep. It was not to be.
>> > 
>> >> It fails just after the FreeBSD/SMP lines,
>> >> reporting "kernel trap 9 with interrupts disabled".
>> >> 
>> >> It fails in pmap_force_invalidate_cache_range at
>> >> a clflushl (%rax) instruction that produces a
>> >> "Fatal trap 9: general protection fault while
>> >> in kernel mode". cpuid = 0; apic id = 00
>> >> 
>> >> I used kernel.txz files from:
>> >> 
>> >> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/
>> >> 
>> >> to narrow the range of kernel builds for working -> failing
>> >> and got:
>> >> 
>> >> -r338804 boots fine
>> >> (no amd64 kernel builds between to try)
>> >> -r338810+ fails (any that I tried, anyway)
>> >> 
>> >> In that range is -r338807 :
>> >> 
>> >> QUOTE
>> >> Author: kib
>> >> Date: Wed Sep 19 19:35:02 2018
>> >> New Revision: 338807
>> >> URL: https://svnweb.freebsd.org/changeset/base/338807
>> >> 
>> >> 
>> >> Log:
>> >> Convert x86 cache invalidation functions to ifuncs.
>> >> 
>> >> This simplifies the runtime logic and reduces the number of
>> >> runtime-constant branches.
>> >> 
>> >> Reviewed by: alc, markj
>> >> Sponsored by:        The FreeBSD Foundation
>> >> Approved by: re (gjb)
>> >> Differential revision: https://reviews.freebsd.org/D16736
>> >> 
>> >> Modified:
>> >> head/sys/amd64/amd64/pmap.c
>> >> head/sys/amd64/include/pmap.h
>> >> head/sys/dev/drm2/drm_os_freebsd.c
>> >> head/sys/dev/drm2/i915/intel_ringbuffer.c
>> >> head/sys/i386/i386/pmap.c
>> >> head/sys/i386/i386/vm_machdep.c
>> >> head/sys/i386/include/pmap.h
>> >> head/sys/x86/iommu/intel_utils.c
>> >> END QUOTE
>> >> 
>> >> There do seem to be changes associated with
>> >> clflush(...) use. Looking at:
>> >> 
>> >> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/pmap.c?annotate=339432
>> >> 
>> >> it appears that pmap_force_invalidate_cache_range has not
>> >> changed since -r338807.
>> >> 
>> >> It seems that -r338806 and -r338810 would be unlikely
>> >> contributors.
>> > 
>> 
>> I went after my native-boot loader problem first because I
>> could switch kernels via the loader for booting FreeBSD under
>> Hyper-V. Switching loaders is more of a problem.
>> 
>> In order to avoid the loader-time crash I switched to building
>> and installing based on WITHOUT_ZFS= . I've had no active use of
>> ZFS in years. (The old official-build loaders that worked were
>> non-ZFS ones.)
>> 
>> This took care of the native-boot loader-crash --and, to my
>> surprise, also the Hyper-V-boot kernel-time crash.
>> 
>> My private builds now boot the 1950X in both contexts just
>> fine.
>> 
>> During my early investigation I did pick up specific changes
>> from after -r339076 that seemed to be tied to Ryzen and such.
>> (They made no difference to the boot problems at the time
>> but I saw no reason to remove them.)
>> 
>> # uname -apKU
>> FreeBSD FBSDFSSD 12.0-ALPHA8 FreeBSD 12.0-ALPHA8 #5 r339076:339432M: Sun Oct 
>> 21 16:44:25 PDT 2018     
>> markmi@FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG
>>   amd64 amd64 1200084 1200084
>> 
>> (stupid gmail) 
> 
> The phrase "no active use" bothers me. What does that mean? Are there any ZFS 
> pools or any disks that have any whiff of a ZFSish thing on them at all? Clearly, 
> something in the zfs boot loader is freaking out over something on 
> your system, but absent that information I can't help you.

No ZFS pools: Strictly UFS for FreeBSD file systems
for the last few years, UFS before I had access to
the 1950X system.

I've never before bothered to use WITHOUT_ZFS= in
my builds. So the system had the ZFS support,
such as kernel modules, over all the time that
this system had been in use.

Prior to the recent versions I saw no such problems.
But the default loader was not ZFS capable.


As seen in the under-Hyper-V use-context:

# gpart show -p
=>       40  937703008    da0  GPT  (447G)
         40       1024  da0p1  freebsd-boot  (512K)
       1064  746586112  da0p2  freebsd-ufs  (356G)
  746587176   31457280  da0p3  freebsd-swap  (15G)
  778044456  159383552  da0p4  freebsd-swap  (76G)
  937428008     275040         - free -  (134M)

=>       40  937703008    da1  GPT  (447G)
         40       1024  da1p1  freebsd-boot  (512K)
       1064  369098752  da1p2  freebsd-ufs  (176G)
  369099816  406846424  da1p3  freebsd-swap  (194G)
  775946240  130024488         - free -  (62G)
  905970728   31457280  da1p4  freebsd-swap  (15G)
  937428008     275040         - free -  (134M)

=>       40  419430320    da2  GPT  (200G)
         40       4056         - free -  (2.0M)
       4096  419426263  da2p1  freebsd-ufs  (200G)
  419430359          1         - free -  (512B)

=>        40  2000409184    da3  GPT  (954G)
          40        1024  da3p1  freebsd-boot  (512K)
        1064  2000408159  da3p2  freebsd-ufs  (954G)
  2000409223           1         - free -  (512B)

So no ZFS pools.
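
A minimal sketch of how that check could be scripted rather than
read off by eye. The helper name list_part_types is hypothetical
(nothing in FreeBSD provides it); it only assumes the column layout
of the "gpart show -p" output shown above. It looks at partition
types only, so it would not notice a pool placed on a whole,
unpartitioned disk.

```shell
# list_part_types (hypothetical helper): print the distinct partition
# types found in "gpart show -p" output read from stdin.
list_part_types() {
    awk '
        $1 == "=>" { next }   # per-disk header line
        NF == 0    { next }   # blank separator between disks
        $3 == "-"  { next }   # "- free -" rows
        { print $4 }          # columns: start size name TYPE (size)
    ' | sort -u
}
```

Possible use on the running system:

    gpart show -p | list_part_types | grep -x freebsd-zfs \
        || echo "no freebsd-zfs partitions"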

The above context never had the ZFS-capable loader
problem but did have the kernel problem. I was
booting the 356G freebsd-ufs partition: the only
one that I have updated the FreeBSD version on
so far.


With FreeBSD booted natively, more drives are seen in
gpart show, some not from/for FreeBSD. But the
above drives are present and I was booting from
the same partition of the same drive: the 356G
freebsd-ufs partition. Still no ZFS pools
anywhere:

# gpart show -p
=>        34  4000797293    nvd0  GPT  (1.9T)
          34      262144  nvd0p1  ms-reserved  (128M)
      262178        2014          - free -  (1.0M)
      264192  3600451584  nvd0p2  ms-basic-data  (1.7T)
  3600715776   400081551          - free -  (191G)

=>       40  937703008    nvd1  GPT  (447G)
         40       1024  nvd1p1  freebsd-boot  (512K)
       1064  746586112  nvd1p2  freebsd-ufs  (356G)
  746587176   31457280  nvd1p3  freebsd-swap  (15G)
  778044456  159383552  nvd1p4  freebsd-swap  (76G)
  937428008     275040          - free -  (134M)

=>       40  937703008    nvd2  GPT  (447G)
         40       1024  nvd2p1  freebsd-boot  (512K)
       1064  369098752  nvd2p2  freebsd-ufs  (176G)
  369099816  406846424  nvd2p3  freebsd-swap  (194G)
  775946240  130024488          - free -  (62G)
  905970728   31457280  nvd2p4  freebsd-swap  (15G)
  937428008     275040          - free -  (134M)

=>        34  2000409197    nvd3  GPT  (954G)
          34        2014          - free -  (1.0M)
        2048     1021952  nvd3p1  ms-recovery  (499M)
     1024000      202752  nvd3p2  efi  (99M)
     1226752       32768  nvd3p3  ms-reserved  (16M)
     1259520  1859119104  nvd3p4  ms-basic-data  (886G)
  1860378624   140030607          - free -  (67G)

=>        40  2000409184    nvd4  GPT  (954G)
          40        1024  nvd4p1  freebsd-boot  (512K)
        1064  2000408159  nvd4p2  freebsd-ufs  (954G)
  2000409223           1          - free -  (512B)

=>        63  2000409201    ada0  MBR  (954G)
          63        1985          - free -  (993K)
        2048        4096  ada0s1  linux-data  (2.0M)
        6144     2093056          - free -  (1.0G)
     2099200  1998309376  ada0s2  linux-lvm  (953G)
  2000408576         688          - free -  (344K)

=>        34  2000409197    ada1  GPT  (954G)
          34      262144  ada1p1  ms-reserved  (128M)
      262178  2000147053          - free -  (954G)

=>        34  2000409197    ada2  GPT  (954G)
          34      262144  ada2p1  ms-reserved  (128M)
      262178  2000147053          - free -  (954G)

=>        34  1953497022    da0  GPT  (932G)
          34      262144  da0p1  ms-reserved  (128M)
      262178        2014         - free -  (1.0M)
      264192  1953230848  da0p2  ms-basic-data  (931G)
  1953495040        2016         - free -  (1.0M)

=>       1  60062499    da1  MBR  (29G)
         1        31         - free -  (16K)
        32  60062468  da1s1  fat32lba  (29G)

The 356G freebsd-ufs partition is the only one
of the freebsd-ufs partitions updated so far.

This is the context that had the problem with
the ZFS-capable loaders --but no later kernel
problem when a not-ZFS-capable loader was used
(via copying over an older one --until I did the
WITHOUT_ZFS= build/install).

As for the ZFS-capable loader: Maybe it has
problems when it sees one or more of:

ms-reserved (on GPT)
ms-basic-data (on GPT) (NTFS file system)
ms-recovery (on GPT)
efi (on GPT)
linux-data (on MBR)
linux-lvm (on MBR)
fat32lba (on MBR)

(given that none of these is available in
the Hyper-V context as the virtual machine
has been configured).
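
A rough sketch of how such a list can be derived mechanically,
assuming the two "gpart show -p" outputs were saved to files (the
file names and the function name types_only_in_first are
hypothetical). It prints the partition types present in the first
file but absent from the second, based only on the column layout
of the output above.

```shell
# types_only_in_first A B (hypothetical helper): print partition types
# that appear in "gpart show -p" output saved in file A but not in the
# output saved in file B.
types_only_in_first() {
    awk '
        # skip per-disk headers, blank lines, and "- free -" rows
        $1 == "=>" || NF == 0 || $3 == "-" { next }
        FNR == NR { first[$4] = 1; next }   # pass over file A: remember type
        { delete first[$4] }                # pass over file B: drop shared type
        END { for (t in first) print t }
    ' "$1" "$2" | sort
}
```

Possible use, with the native output in native.txt and the Hyper-V
output in hyperv.txt:

    types_only_in_first native.txt hyperv.txt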

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
