head -r355777: Lack of USB keyboard input during loader

2019-12-17 Thread Mark Millard via freebsd-amd64
I normally use a PS2 keyboard on a Gigabyte Aorus Gaming 7
(with a ThreadRipper 1950X) during its boot sequence. (Most
use is via ssh once booted.) But I've used a USB keyboard as
well on rare occasion. (Unsure when that last was.)

Well, I happened to try using a USB keyboard for head -r355777
and:

A) The USB keyboard worked fine for controlling the BIOS
   operation (BIOS version F12h)
B) It was ignored by the loader (so: booted via the timeout)
C) It worked fine for logging-in on the console and later
   activity there

This was repeatable.

By contrast, for the PS2 keyboard, all 3 stages worked fine
when tested, also repeatable.

Overall I can avoid (B) in my context, but I thought that the
status might be of interest to someone.

The -r355777 build is a non-debug build (with symbols).

At this point I've not set up any other head -r355777
(or later) instances on any other hardware, and I do not
normally use USB keyboards for booting, other than on old
PowerMacs. So I've no evidence for how the issue might
generalize. It may be the weekend (or longer) before I
find out.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



FreeBSD head vs. ThreadRipper 1950X X399 AORUS gaming 7's EtherNet : Am I the only one with it not working?

2019-11-24 Thread Mark Millard via freebsd-amd64
I (sometimes) have access to a Threadripper 1950X based X399 AORUS
Gaming 7 system. (Not used for gaming.) The notes below are about
that context. Currently, I'm mostly checking whether my context is
unique for some reason.

I'll start off noting that Fedora (currently 31) and Windows 10 Pro
x64 (1903) have had no troubles using the Ethernet or WiFi on
this board, simply rebooting the machine in the same physical and
networking context. Such is still true. The FreeBSD configuration
tends to be near the simplest possible, and the same is true for the
other OSes. Nearly all network activity is just local area network
activity unless I'm updating software.

Historically I've used the FreeBSD drive booted under Hyper-V a lot,
in part because the networking always worked well in FreeBSD in that
context.

I conclude that the hardware is okay and that FreeBSD is the odd-ball
thing involved, at least for native booting. (But I've no useful
detail of how it is odd-ball at this point.)

I'll note that the Threadripper system is my only native FreeBSD
amd64 context and it is the only context that I've been having
FreeBSD networking problems in. The Cortex-A7, Cortex-A53,
Cortex-A57, and old PowerMac contexts seem to be doing fine for
such activity.


In this note I focus on Ethernet, since it seems to be effectively
non-functional. (WiFi is also odd, but somewhat functional. When
FreeBSD is native-booted I depend on the WiFi, despite poor
performance. Again, Fedora and Windows 10 do not show problems.)
I recently jumped from -r352341 to -r355027, but the Ethernet
behavior has been the same.

I count dhclient not being able to get an address assignment as an
example of being non-functional. (Again, no such problems rebooting
using the Fedora or Windows 10 drives.) I deleted FreeBSD's very old
IPv4 fallback address information file in order to make it hard
to miss when DHCP activity was not assigning an address.

FYI, in case of similar Ethernet hardware on other boards:

alc0:  port 0x1000-0x107f mem 
0xba00-0xba03 irq 27 at device 0.0 numa-domain 0 on pci5
alc0: 11776 Tx FIFO, 12032 Rx FIFO
alc0: Using 1 MSIX message(s).
alc0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
miibus0:  numa-domain 0 on alc0
atphy0:  PHY 0 on miibus0
atphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 
1000baseT-FDX-master, auto, auto-flow
alc0: Using defaults for TSO: 65518/35/2048
alc0: Ethernet address: . . .

Has anyone else had such problems in a somewhat similar context?
Is having NUMA domains fairly unique to my context?

My time for such things is currently rather limited, but if there
are basic things to check on I'd eventually use any notes to help
isolate what to look at in more detail. (Jumping directly to a
solution seems unlikely: more stages/steps.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-28 Thread Mark Millard via freebsd-amd64



On 2019-Sep-27, at 15:22, Mark Millard  wrote:

> On 2019-Sep-27, at 13:52, Mark Millard  wrote:
> 
>> On 2019-Sep-27, at 12:24, Mark Johnston  wrote:
>> 
>>> On Thu, Sep 26, 2019 at 08:37:39PM -0700, Mark Millard wrote:
 
 
 On 2019-Sep-26, at 17:05, Mark Millard  wrote:
 
> On 2019-Sep-26, at 13:29, Mark Johnston  wrote:
>> One possibility is that these are kernel memory allocations occurring in
>> the context of the benchmark threads.  Such allocations may not respect
>> the configured policy since they are not private to the allocating
>> thread.  For instance, upon opening a file, the kernel may allocate a
>> vnode structure for that file.  That vnode may be accessed by threads
>> from many processes over its lifetime, and may be recycled many times
>> before its memory is released back to the allocator.
> 
> For -l0-15 -n prefer:1 :
> 
> Looks like this reports sys_thr_new activity, sys_cpuset
> activity, and 0x80bc09bd activity (whatever that
> is). Mostly sys_thr_new activity, over 1300 of them . . .
> 
> dtrace: pid 13553 has exited
> 
> 
> kernel`uma_small_alloc+0x61
> kernel`keg_alloc_slab+0x10b
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`thread_init+0x22
> kernel`keg_alloc_slab+0x259
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`thread_alloc+0x23
> kernel`thread_create+0x13a
> kernel`sys_thr_new+0xd2
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>   2
> 
> kernel`uma_small_alloc+0x61
> kernel`keg_alloc_slab+0x10b
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`cpuset_setproc+0x65
> kernel`sys_cpuset+0x123
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>   2
> 
> kernel`uma_small_alloc+0x61
> kernel`keg_alloc_slab+0x10b
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`uma_zfree_arg+0x36a
> kernel`thread_reap+0x106
> kernel`thread_alloc+0xf
> kernel`thread_create+0x13a
> kernel`sys_thr_new+0xd2
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>   6
> 
> kernel`uma_small_alloc+0x61
> kernel`keg_alloc_slab+0x10b
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`uma_zfree_arg+0x36a
> kernel`vm_map_process_deferred+0x8c
> kernel`vm_map_remove+0x11d
> kernel`vmspace_exit+0xd3
> kernel`exit1+0x5a9
> kernel`0x80bc09bd
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>   6
> 
> kernel`uma_small_alloc+0x61
> kernel`keg_alloc_slab+0x10b
> kernel`zone_import+0x1d2
> kernel`uma_zalloc_arg+0x62b
> kernel`thread_alloc+0x23
> kernel`thread_create+0x13a
> kernel`sys_thr_new+0xd2
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>  22
> 
> kernel`vm_page_grab_pages+0x1b4
> kernel`vm_thread_stack_create+0xc0
> kernel`kstack_import+0x52
> kernel`uma_zalloc_arg+0x62b
> kernel`vm_thread_new+0x4d
> kernel`thread_alloc+0x31
> kernel`thread_create+0x13a
> kernel`sys_thr_new+0xd2
> kernel`amd64_syscall+0x3ae
> kernel`0x811b7600
>1324
 
 With sys_thr_new not respecting -n prefer:1 for
 -l0-15 (especially for the thread stacks), I
 looked some at the generated integration kernel
 code and it makes significant use of %rsp based
 memory accesses (read and write).
 
 That would get both memory controllers going in
 parallel (kernel vectors accesses to the preferred
 memory domain), so not slowing down as expected.
 
 If round-robin is not respected for thread stacks,
 and if threads migrate cpus across memory domains
 at times, there could be considerable variability
 for that context as well. (This may not be the
 only way to have different/extra variability for
 this context.)
 
 Overall: I'd be surprised if this was not
 contributing to what I thought was odd about
 the benchmark results.

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-27 Thread Mark Millard via freebsd-amd64



On 2019-Sep-27, at 13:52, Mark Millard  wrote:

> On 2019-Sep-27, at 12:24, Mark Johnston  > wrote:
> 
>> On Thu, Sep 26, 2019 at 08:37:39PM -0700, Mark Millard wrote:
>>> 
>>> 
>>> On 2019-Sep-26, at 17:05, Mark Millard >> > wrote:
>>> 
 On 2019-Sep-26, at 13:29, Mark Johnston >>> > wrote:
> One possibility is that these are kernel memory allocations occurring in
> the context of the benchmark threads.  Such allocations may not respect
> the configured policy since they are not private to the allocating
> thread.  For instance, upon opening a file, the kernel may allocate a
> vnode structure for that file.  That vnode may be accessed by threads
> from many processes over its lifetime, and may be recycled many times
> before its memory is released back to the allocator.
 
 For -l0-15 -n prefer:1 :
 
 Looks like this reports sys_thr_new activity, sys_cpuset
 activity, and 0x80bc09bd activity (whatever that
 is). Mostly sys_thr_new activity, over 1300 of them . . .
 
 dtrace: pid 13553 has exited
 
 
 kernel`uma_small_alloc+0x61
 kernel`keg_alloc_slab+0x10b
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`thread_init+0x22
 kernel`keg_alloc_slab+0x259
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`thread_alloc+0x23
 kernel`thread_create+0x13a
 kernel`sys_thr_new+0xd2
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
   2
 
 kernel`uma_small_alloc+0x61
 kernel`keg_alloc_slab+0x10b
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`cpuset_setproc+0x65
 kernel`sys_cpuset+0x123
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
   2
 
 kernel`uma_small_alloc+0x61
 kernel`keg_alloc_slab+0x10b
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`uma_zfree_arg+0x36a
 kernel`thread_reap+0x106
 kernel`thread_alloc+0xf
 kernel`thread_create+0x13a
 kernel`sys_thr_new+0xd2
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
   6
 
 kernel`uma_small_alloc+0x61
 kernel`keg_alloc_slab+0x10b
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`uma_zfree_arg+0x36a
 kernel`vm_map_process_deferred+0x8c
 kernel`vm_map_remove+0x11d
 kernel`vmspace_exit+0xd3
 kernel`exit1+0x5a9
 kernel`0x80bc09bd
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
   6
 
 kernel`uma_small_alloc+0x61
 kernel`keg_alloc_slab+0x10b
 kernel`zone_import+0x1d2
 kernel`uma_zalloc_arg+0x62b
 kernel`thread_alloc+0x23
 kernel`thread_create+0x13a
 kernel`sys_thr_new+0xd2
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
  22
 
 kernel`vm_page_grab_pages+0x1b4
 kernel`vm_thread_stack_create+0xc0
 kernel`kstack_import+0x52
 kernel`uma_zalloc_arg+0x62b
 kernel`vm_thread_new+0x4d
 kernel`thread_alloc+0x31
 kernel`thread_create+0x13a
 kernel`sys_thr_new+0xd2
 kernel`amd64_syscall+0x3ae
 kernel`0x811b7600
1324
>>> 
>>> With sys_thr_new not respecting -n prefer:1 for
>>> -l0-15 (especially for the thread stacks), I
>>> looked some at the generated integration kernel
>>> code and it makes significant use of %rsp based
>>> memory accesses (read and write).
>>> 
>>> That would get both memory controllers going in
>>> parallel (kernel vectors accesses to the preferred
>>> memory domain), so not slowing down as expected.
>>> 
>>> If round-robin is not respected for thread stacks,
>>> and if threads migrate cpus across memory domains
>>> at times, there could be considerable variability
>>> for that context as well. (This may not be the
>>> only way to have different/extra variability for
>>> this context.)
>>> 
>>> Overall: I'd be surprised if this was not
>>> contributing to what I thought was odd about
>>> the benchmark results.
>> 
>> Your tracing refers to kernel thread stacks though, not the stacks used
>> by threads when 

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-27 Thread Mark Millard via freebsd-amd64



On 2019-Sep-27, at 12:24, Mark Johnston  wrote:

> On Thu, Sep 26, 2019 at 08:37:39PM -0700, Mark Millard wrote:
>> 
>> 
>> On 2019-Sep-26, at 17:05, Mark Millard  wrote:
>> 
>>> On 2019-Sep-26, at 13:29, Mark Johnston  wrote:
 One possibility is that these are kernel memory allocations occurring in
 the context of the benchmark threads.  Such allocations may not respect
 the configured policy since they are not private to the allocating
 thread.  For instance, upon opening a file, the kernel may allocate a
 vnode structure for that file.  That vnode may be accessed by threads
 from many processes over its lifetime, and may be recycled many times
 before its memory is released back to the allocator.
>>> 
>>> For -l0-15 -n prefer:1 :
>>> 
>>> Looks like this reports sys_thr_new activity, sys_cpuset
>>> activity, and 0x80bc09bd activity (whatever that
>>> is). Mostly sys_thr_new activity, over 1300 of them . . .
>>> 
>>> dtrace: pid 13553 has exited
>>> 
>>> 
>>> kernel`uma_small_alloc+0x61
>>> kernel`keg_alloc_slab+0x10b
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`thread_init+0x22
>>> kernel`keg_alloc_slab+0x259
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`thread_alloc+0x23
>>> kernel`thread_create+0x13a
>>> kernel`sys_thr_new+0xd2
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>   2
>>> 
>>> kernel`uma_small_alloc+0x61
>>> kernel`keg_alloc_slab+0x10b
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`cpuset_setproc+0x65
>>> kernel`sys_cpuset+0x123
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>   2
>>> 
>>> kernel`uma_small_alloc+0x61
>>> kernel`keg_alloc_slab+0x10b
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`uma_zfree_arg+0x36a
>>> kernel`thread_reap+0x106
>>> kernel`thread_alloc+0xf
>>> kernel`thread_create+0x13a
>>> kernel`sys_thr_new+0xd2
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>   6
>>> 
>>> kernel`uma_small_alloc+0x61
>>> kernel`keg_alloc_slab+0x10b
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`uma_zfree_arg+0x36a
>>> kernel`vm_map_process_deferred+0x8c
>>> kernel`vm_map_remove+0x11d
>>> kernel`vmspace_exit+0xd3
>>> kernel`exit1+0x5a9
>>> kernel`0x80bc09bd
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>   6
>>> 
>>> kernel`uma_small_alloc+0x61
>>> kernel`keg_alloc_slab+0x10b
>>> kernel`zone_import+0x1d2
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`thread_alloc+0x23
>>> kernel`thread_create+0x13a
>>> kernel`sys_thr_new+0xd2
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>  22
>>> 
>>> kernel`vm_page_grab_pages+0x1b4
>>> kernel`vm_thread_stack_create+0xc0
>>> kernel`kstack_import+0x52
>>> kernel`uma_zalloc_arg+0x62b
>>> kernel`vm_thread_new+0x4d
>>> kernel`thread_alloc+0x31
>>> kernel`thread_create+0x13a
>>> kernel`sys_thr_new+0xd2
>>> kernel`amd64_syscall+0x3ae
>>> kernel`0x811b7600
>>>1324
>> 
>> With sys_thr_new not respecting -n prefer:1 for
>> -l0-15 (especially for the thread stacks), I
>> looked some at the generated integration kernel
>> code and it makes significant use of %rsp based
>> memory accesses (read and write).
>> 
>> That would get both memory controllers going in
>> parallel (kernel vectors accesses to the preferred
>> memory domain), so not slowing down as expected.
>> 
>> If round-robin is not respected for thread stacks,
>> and if threads migrate cpus across memory domains
>> at times, there could be considerable variability
>> for that context as well. (This may not be the
>> only way to have different/extra variability for
>> this context.)
>> 
>> Overall: I'd be surprised if this was not
>> contributing to what I thought was odd about
>> the benchmark results.
> 
> Your tracing refers to kernel thread stacks though, not the stacks used
> by threads when executing in user mode.  My understanding is that a HINT
> implementation would spend virtually all of its time in user mode, so it
> shouldn't matter much or at all if kernel thread stacks are backed by
> memory from the "wrong" domain.

Looks 

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-26 Thread Mark Millard via freebsd-amd64



On 2019-Sep-26, at 17:05, Mark Millard  wrote:

> On 2019-Sep-26, at 13:29, Mark Johnston  wrote:
> 
>> On Wed, Sep 25, 2019 at 10:03:14PM -0700, Mark Millard wrote:
>>> 
>>> 
>>> On 2019-Sep-25, at 20:27, Mark Millard  wrote:
>>> 
>>>> On 2019-Sep-25, at 19:26, Mark Millard  wrote:
>>>> 
>>>>> On 2019-Sep-25, at 10:02, Mark Johnston  wrote:
>>>>> 
>>>>>> On Mon, Sep 23, 2019 at 01:28:15PM -0700, Mark Millard via freebsd-amd64 
>>>>>> wrote:
>>>>>>> Note: I have access to only one FreeBSD amd64 context, and
>>>>>>> it is also my only access to a NUMA context: 2 memory
>>>>>>> domains. A Threadripper 1950X context. Also: I have only
>>>>>>> a head FreeBSD context on any architecture, not 12.x or
>>>>>>> before. So I have limited compare/contrast material.
>>>>>>> 
>>>>>>> I present the below basically to ask if the NUMA handling
>>>>>>> has been validated, or if it is going to be, at least for
>>>>>>> contexts that might apply to ThreadRipper 1950X and
>>>>>>> analogous contexts. My results suggest they are not (or
>>>>>>> libc++'s now times get messed up such that it looks like
>>>>>>> NUMA mishandling since this is based on odd benchmark
>>>>>>> results that involve mean time for laps, using a median
>>>>>>> of such across multiple trials).
>>>>>>> 
>>>>>>> I ran a benchmark on both Fedora 30 and FreeBSD 13 on this
>>>>>>> 1950X got got expected  results on Fedora but odd ones on
>>>>>>> FreeBSD. The benchmark is a variation on the old HINT
>>>>>>> benchmark, spanning the old multi-threading variation. I
>>>>>>> later tried Fedora because the FreeBSD results looked odd.
>>>>>>> The other architectures I tried FreeBSD benchmarking with
>>>>>>> did not look odd like this. (powerpc64 on a old PowerMac 2
>>>>>>> socket with 2 cores per socket, aarch64 Cortex-A57 Overdrive
>>>>>>> 1000, CortextA53 Pine64+ 2GB, armv7 Cortex-A7 Orange Pi+ 2nd
>>>>>>> Ed. For these I used 4 threads, not more.)
>>>>>>> 
>>>>>>> I tend to write in terms of plots made from the data instead
>>>>>>> of the raw benchmark data.
>>>>>>> 
>>>>>>> FreeBSD testing based on:
>>>>>>> cpuset -l0-15  -n prefer:1
>>>>>>> cpuset -l16-31 -n prefer:1
>>>>>>> 
>>>>>>> Fedora 30 testing based on:
>>>>>>> numactl --preferred 1 --cpunodebind 0
>>>>>>> numactl --preferred 1 --cpunodebind 1
>>>>>>> 
>>>>>>> While I have more results, I reference primarily DSIZE
>>>>>>> and ISIZE being unsigned long long and also both being
>>>>>>> unsigned long as examples. Variations in results are not
>>>>>>> from the type differences for any LP64 architectures.
>>>>>>> (But they give an idea of benchmark variability in the
>>>>>>> test context.)
>>>>>>> 
>>>>>>> The Fedora results solidly show the bandwidth limitation
>>>>>>> of using one memory controller. They also show the latency
>>>>>>> consequences for the remote memory domain case vs. the
>>>>>>> local memory domain case. There is not a lot of
>>>>>>> variability between the examples of the 2 type-pairs used
>>>>>>> for Fedora.
>>>>>>> 
>>>>>>> Not true for FreeBSD on the 1950X:
>>>>>>> 
>>>>>>> A) The latency-constrained part of the graph looks to
>>>>>>> normally be using the local memory domain when
>>>>>>> -l0-15 is in use for 8 threads.
>>>>>>> 
>>>>>>> B) Both the -l0-15 and the -l16-31 parts of the
>>>>>>> graph for 8 threads that should be bandwidth
>>>>>>> limited show mostly examples that would have to
>>>>>>> involve both memory controllers for the bandwidth
>>>>>>> to get the results shown as far as I can tell.
>>>>>>> There is also wide variability ranging bet

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-25 Thread Mark Millard via freebsd-amd64



On 2019-Sep-25, at 20:27, Mark Millard  wrote:

> On 2019-Sep-25, at 19:26, Mark Millard  wrote:
> 
>> On 2019-Sep-25, at 10:02, Mark Johnston  wrote:
>> 
>>> On Mon, Sep 23, 2019 at 01:28:15PM -0700, Mark Millard via freebsd-amd64 
>>> wrote:
>>>> Note: I have access to only one FreeBSD amd64 context, and
>>>> it is also my only access to a NUMA context: 2 memory
>>>> domains. A Threadripper 1950X context. Also: I have only
>>>> a head FreeBSD context on any architecture, not 12.x or
>>>> before. So I have limited compare/contrast material.
>>>> 
>>>> I present the below basically to ask if the NUMA handling
>>>> has been validated, or if it is going to be, at least for
>>>> contexts that might apply to ThreadRipper 1950X and
>>>> analogous contexts. My results suggest they are not (or
>>>> libc++'s now times get messed up such that it looks like
>>>> NUMA mishandling since this is based on odd benchmark
>>>> results that involve mean time for laps, using a median
>>>> of such across multiple trials).
>>>> 
>>>> I ran a benchmark on both Fedora 30 and FreeBSD 13 on this
>>>> 1950X got got expected  results on Fedora but odd ones on
>>>> FreeBSD. The benchmark is a variation on the old HINT
>>>> benchmark, spanning the old multi-threading variation. I
>>>> later tried Fedora because the FreeBSD results looked odd.
>>>> The other architectures I tried FreeBSD benchmarking with
>>>> did not look odd like this. (powerpc64 on a old PowerMac 2
>>>> socket with 2 cores per socket, aarch64 Cortex-A57 Overdrive
>>>> 1000, CortextA53 Pine64+ 2GB, armv7 Cortex-A7 Orange Pi+ 2nd
>>>> Ed. For these I used 4 threads, not more.)
>>>> 
>>>> I tend to write in terms of plots made from the data instead
>>>> of the raw benchmark data.
>>>> 
>>>> FreeBSD testing based on:
>>>> cpuset -l0-15  -n prefer:1
>>>> cpuset -l16-31 -n prefer:1
>>>> 
>>>> Fedora 30 testing based on:
>>>> numactl --preferred 1 --cpunodebind 0
>>>> numactl --preferred 1 --cpunodebind 1
>>>> 
>>>> While I have more results, I reference primarily DSIZE
>>>> and ISIZE being unsigned long long and also both being
>>>> unsigned long as examples. Variations in results are not
>>>> from the type differences for any LP64 architectures.
>>>> (But they give an idea of benchmark variability in the
>>>> test context.)
>>>> 
>>>> The Fedora results solidly show the bandwidth limitation
>>>> of using one memory controller. They also show the latency
>>>> consequences for the remote memory domain case vs. the
>>>> local memory domain case. There is not a lot of
>>>> variability between the examples of the 2 type-pairs used
>>>> for Fedora.
>>>> 
>>>> Not true for FreeBSD on the 1950X:
>>>> 
>>>> A) The latency-constrained part of the graph looks to
>>>> normally be using the local memory domain when
>>>> -l0-15 is in use for 8 threads.
>>>> 
>>>> B) Both the -l0-15 and the -l16-31 parts of the
>>>> graph for 8 threads that should be bandwidth
>>>> limited show mostly examples that would have to
>>>> involve both memory controllers for the bandwidth
>>>> to get the results shown as far as I can tell.
>>>> There is also wide variability ranging between the
>>>> expected 1 controller result and, say, what a 2
>>>> controller round-robin would be expected produce.
>>>> 
>>>> C) Even the single threaded result shows a higher
>>>> result for larger total bytes for the kernel
>>>> vectors. Fedora does not.
>>>> 
>>>> I think that (B) is the most solid evidence for
>>>> something being odd.
>>> 
>>> The implication seems to be that your benchmark program is using pages
>>> from both domains despite a policy which preferentially allocates pages
>>> from domain 1, so you would first want to determine if this is actually
>>> what's happening.  As far as I know we currently don't have a good way
>>> of characterizing per-domain memory usage within a process.
>>> 
>>> If your benchmark uses a large fraction of the system's memory, you
>>> could use the vm.phys_free sysctl to get a sense of

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-25 Thread Mark Millard via freebsd-amd64



On 2019-Sep-25, at 19:26, Mark Millard  wrote:

> On 2019-Sep-25, at 10:02, Mark Johnston  wrote:
> 
>> On Mon, Sep 23, 2019 at 01:28:15PM -0700, Mark Millard via freebsd-amd64 
>> wrote:
>>> Note: I have access to only one FreeBSD amd64 context, and
>>> it is also my only access to a NUMA context: 2 memory
>>> domains. A Threadripper 1950X context. Also: I have only
>>> a head FreeBSD context on any architecture, not 12.x or
>>> before. So I have limited compare/contrast material.
>>> 
>>> I present the below basically to ask if the NUMA handling
>>> has been validated, or if it is going to be, at least for
>>> contexts that might apply to ThreadRipper 1950X and
>>> analogous contexts. My results suggest they are not (or
>>> libc++'s now times get messed up such that it looks like
>>> NUMA mishandling since this is based on odd benchmark
>>> results that involve mean time for laps, using a median
>>> of such across multiple trials).
>>> 
>>> I ran a benchmark on both Fedora 30 and FreeBSD 13 on this
>>> 1950X got got expected  results on Fedora but odd ones on
>>> FreeBSD. The benchmark is a variation on the old HINT
>>> benchmark, spanning the old multi-threading variation. I
>>> later tried Fedora because the FreeBSD results looked odd.
>>> The other architectures I tried FreeBSD benchmarking with
>>> did not look odd like this. (powerpc64 on a old PowerMac 2
>>> socket with 2 cores per socket, aarch64 Cortex-A57 Overdrive
>>> 1000, CortextA53 Pine64+ 2GB, armv7 Cortex-A7 Orange Pi+ 2nd
>>> Ed. For these I used 4 threads, not more.)
>>> 
>>> I tend to write in terms of plots made from the data instead
>>> of the raw benchmark data.
>>> 
>>> FreeBSD testing based on:
>>> cpuset -l0-15  -n prefer:1
>>> cpuset -l16-31 -n prefer:1
>>> 
>>> Fedora 30 testing based on:
>>> numactl --preferred 1 --cpunodebind 0
>>> numactl --preferred 1 --cpunodebind 1
>>> 
>>> While I have more results, I reference primarily DSIZE
>>> and ISIZE being unsigned long long and also both being
>>> unsigned long as examples. Variations in results are not
>>> from the type differences for any LP64 architectures.
>>> (But they give an idea of benchmark variability in the
>>> test context.)
>>> 
>>> The Fedora results solidly show the bandwidth limitation
>>> of using one memory controller. They also show the latency
>>> consequences for the remote memory domain case vs. the
>>> local memory domain case. There is not a lot of
>>> variability between the examples of the 2 type-pairs used
>>> for Fedora.
>>> 
>>> Not true for FreeBSD on the 1950X:
>>> 
>>> A) The latency-constrained part of the graph looks to
>>>  normally be using the local memory domain when
>>>  -l0-15 is in use for 8 threads.
>>> 
>>> B) Both the -l0-15 and the -l16-31 parts of the
>>>  graph for 8 threads that should be bandwidth
>>>  limited show mostly examples that would have to
>>>  involve both memory controllers for the bandwidth
>>>  to get the results shown as far as I can tell.
>>>  There is also wide variability ranging between the
>>>  expected 1 controller result and, say, what a 2
>>>  controller round-robin would be expected produce.
>>> 
>>> C) Even the single threaded result shows a higher
>>>  result for larger total bytes for the kernel
>>>  vectors. Fedora does not.
>>> 
>>> I think that (B) is the most solid evidence for
>>> something being odd.
>> 
>> The implication seems to be that your benchmark program is using pages
>> from both domains despite a policy which preferentially allocates pages
>> from domain 1, so you would first want to determine if this is actually
>> what's happening.  As far as I know we currently don't have a good way
>> of characterizing per-domain memory usage within a process.
>> 
>> If your benchmark uses a large fraction of the system's memory, you
>> could use the vm.phys_free sysctl to get a sense of how much memory from
>> each domain is free.
> 
> The ThreadRipper 1950X has 96 GiBytes of ECC RAM, so 48 GiBytes per memory
> domain. I've never configured the benchmark such that it even reaches
> 10 GiBytes on this hardware. (It stops for a time constraint first,
> based on the values in use for the "adjustable" items.)
> 
> . . . (much omitted material) . . .

Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-25 Thread Mark Millard via freebsd-amd64



On 2019-Sep-25, at 10:02, Mark Johnston  wrote:

> On Mon, Sep 23, 2019 at 01:28:15PM -0700, Mark Millard via freebsd-amd64 
> wrote:
>> Note: I have access to only one FreeBSD amd64 context, and
>> it is also my only access to a NUMA context: 2 memory
>> domains. A Threadripper 1950X context. Also: I have only
>> a head FreeBSD context on any architecture, not 12.x or
>> before. So I have limited compare/contrast material.
>> 
>> I present the below basically to ask if the NUMA handling
>> has been validated, or if it is going to be, at least for
>> contexts that might apply to ThreadRipper 1950X and
>> analogous contexts. My results suggest they are not (or
>> libc++'s now times get messed up such that it looks like
>> NUMA mishandling since this is based on odd benchmark
>> results that involve mean time for laps, using a median
>> of such across multiple trials).
>> 
>> I ran a benchmark on both Fedora 30 and FreeBSD 13 on this
>> 1950X got got expected  results on Fedora but odd ones on
>> FreeBSD. The benchmark is a variation on the old HINT
>> benchmark, spanning the old multi-threading variation. I
>> later tried Fedora because the FreeBSD results looked odd.
>> The other architectures I tried FreeBSD benchmarking with
>> did not look odd like this. (powerpc64 on a old PowerMac 2
>> socket with 2 cores per socket, aarch64 Cortex-A57 Overdrive
>> 1000, CortextA53 Pine64+ 2GB, armv7 Cortex-A7 Orange Pi+ 2nd
>> Ed. For these I used 4 threads, not more.)
>> 
>> I tend to write in terms of plots made from the data instead
>> of the raw benchmark data.
>> 
>> FreeBSD testing based on:
>> cpuset -l0-15  -n prefer:1
>> cpuset -l16-31 -n prefer:1
>> 
>> Fedora 30 testing based on:
>> numactl --preferred 1 --cpunodebind 0
>> numactl --preferred 1 --cpunodebind 1
>> 
>> While I have more results, I reference primarily DSIZE
>> and ISIZE being unsigned long long and also both being
>> unsigned long as examples. Variations in results are not
>> from the type differences for any LP64 architectures.
>> (But they give an idea of benchmark variability in the
>> test context.)
>> 
>> The Fedora results solidly show the bandwidth limitation
>> of using one memory controller. They also show the latency
>> consequences for the remote memory domain case vs. the
>> local memory domain case. There is not a lot of
>> variability between the examples of the 2 type-pairs used
>> for Fedora.
>> 
>> Not true for FreeBSD on the 1950X:
>> 
>> A) The latency-constrained part of the graph looks to
>>   normally be using the local memory domain when
>>   -l0-15 is in use for 8 threads.
>> 
>> B) Both the -l0-15 and the -l16-31 parts of the
>>   graph for 8 threads that should be bandwidth
>>   limited show mostly examples that would have to
>>   involve both memory controllers for the bandwidth
>>   to get the results shown as far as I can tell.
>>   There is also wide variability ranging between the
>>   expected 1 controller result and, say, what a 2
>>   controller round-robin would be expected produce.
>> 
>> C) Even the single threaded result shows a higher
>>   result for larger total bytes for the kernel
>>   vectors. Fedora does not.
>> 
>> I think that (B) is the most solid evidence for
>> something being odd.
> 
> The implication seems to be that your benchmark program is using pages
> from both domains despite a policy which preferentially allocates pages
> from domain 1, so you would first want to determine if this is actually
> what's happening.  As far as I know we currently don't have a good way
> of characterizing per-domain memory usage within a process.
> 
> If your benchmark uses a large fraction of the system's memory, you
> could use the vm.phys_free sysctl to get a sense of how much memory from
> each domain is free.

The ThreadRipper 1950X has 96 GiBytes of ECC RAM, so 48 GiBytes per memory
domain. I've never configured the benchmark such that it even reaches
10 GiBytes on this hardware. (It stops for a time constraint first,
based on the values in use for the "adjustable" items.)
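
For reference, a minimal sketch of reading that sysctl from a program
follows (not part of the benchmark; it assumes vm.phys_free returns a
printable text summary, and it is untested):

#include <sys/types.h>
#include <sys/sysctl.h>
#include <cstdio>
#include <vector>

int main() {
    // Ask for the size of the vm.phys_free text, then fetch and print it,
    // to eyeball per-domain free memory before and after a benchmark run.
    size_t len = 0;
    if (sysctlbyname("vm.phys_free", nullptr, &len, nullptr, 0) != 0) {
        std::perror("sysctlbyname");
        return 1;
    }
    std::vector<char> buf(len + 1, '\0');
    if (sysctlbyname("vm.phys_free", buf.data(), &len, nullptr, 0) != 0) {
        std::perror("sysctlbyname");
        return 1;
    }
    std::fwrite(buf.data(), 1, len, stdout);
    return 0;
}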

The benchmark runs the Hierarchical INTegration kernel for a sequence
of larger and larger numbers of cells in the grid that it uses. Each
size is run in isolation before the next is tried, and each gets its own
timings. Each size gets its own kernel vector allocations (and
deallocations), with the trials and the laps within a trial reusing the
same memory. Each lap in each trial gets its own thread creations (and
completions). The main thread combines the results wh

head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?

2019-09-23 Thread Mark Millard via freebsd-amd64
Note: I have access to only one FreeBSD amd64 context, and
it is also my only access to a NUMA context: 2 memory
domains. A Threadripper 1950X context. Also: I have only
a head FreeBSD context on any architecture, not 12.x or
before. So I have limited compare/contrast material.

I present the below basically to ask if the NUMA handling
has been validated, or if it is going to be, at least for
contexts that might apply to ThreadRipper 1950X and
analogous contexts. My results suggest it has not been (or
libc++'s now() times get messed up in a way that looks like
NUMA mishandling, since this is based on odd benchmark
results that involve the mean time per lap, using a median
of such means across multiple trials).

I ran a benchmark on both Fedora 30 and FreeBSD 13 on this
1950X and got the expected results on Fedora but odd ones on
FreeBSD. The benchmark is a variation on the old HINT
benchmark, including the old multi-threading variation. I
later tried Fedora because the FreeBSD results looked odd.
The other architectures I tried FreeBSD benchmarking on
did not look odd like this. (powerpc64 on an old PowerMac with
2 sockets and 2 cores per socket, aarch64 Cortex-A57 Overdrive
1000, Cortex-A53 Pine64+ 2GB, armv7 Cortex-A7 Orange Pi+ 2nd
Ed. For these I used 4 threads, not more.)

I tend to write in terms of plots made from the data instead
of the raw benchmark data.

FreeBSD testing based on:
cpuset -l0-15  -n prefer:1
cpuset -l16-31 -n prefer:1

Fedora 30 testing based on:
numactl --preferred 1 --cpunodebind 0
numactl --preferred 1 --cpunodebind 1

While I have more results, I reference primarily DSIZE
and ISIZE being unsigned long long and also both being
unsigned long as examples. Variations in results are not
from the type differences for any LP64 architectures.
(But they give an idea of benchmark variability in the
test context.)

The Fedora results solidly show the bandwidth limitation
of using one memory controller. They also show the latency
consequences for the remote memory domain case vs. the
local memory domain case. There is not a lot of
variability between the examples of the 2 type-pairs used
for Fedora.

Not true for FreeBSD on the 1950X:

A) The latency-constrained part of the graph looks to
   normally be using the local memory domain when
   -l0-15 is in use for 8 threads.

B) Both the -l0-15 and the -l16-31 parts of the
   graph for 8 threads that should be bandwidth
   limited show mostly examples that would have to
   involve both memory controllers for the bandwidth
   to get the results shown as far as I can tell.
   There is also wide variability ranging between the
   expected 1 controller result and, say, what a 2
   controller round-robin would be expected to produce.

C) Even the single threaded result shows a higher
   result for larger total bytes for the kernel
   vectors. Fedora does not.

I think that (B) is the most solid evidence for
something being odd.



For reference for FreeBSD:

# cpuset -g -d 1
domain 1 mask: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31

-r352341 allows -prefer:0 but I happen to have
used -prefer:1 in these experiments.
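
For completeness, the prefer policy can also be requested from inside a
process via cpuset_setdomain(2) rather than via the cpuset command. A
minimal, hedged sketch follows (untested; the header and macro names are
my recollection of the API, not something taken from the benchmark):

#include <sys/param.h>
#include <sys/cpuset.h>
#include <sys/domainset.h>
#include <cstdio>

int main() {
    // Request "prefer memory domain 1" for the current process, roughly
    // what "cpuset -n prefer:1" arranges externally.
    domainset_t mask;
    DOMAINSET_ZERO(&mask);
    DOMAINSET_SET(1, &mask);
    if (cpuset_setdomain(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
                         sizeof(mask), &mask,
                         DOMAINSET_POLICY_PREFER) != 0) {
        std::perror("cpuset_setdomain");
        return 1;
    }
    return 0;
}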

The benchmark was built via devel/g++9 but linked with
system libraries, including libc++. Unfortunately, I'm
not yet ready to distribute the benchmark's source,
but I expect to at some point. I do not expect to ever
distribute binaries. The source code for normal builds
involves just standard C++17 code. Such builds are what
is involved here.

[The powerpc64 context is a system-clang 8, ELFv1 based
system context, not the usual gcc 4.2.1 based one.]

More notes:

In the 'kernel vectors: total Bytes' vs. 'QUality
Improvement Per Second' graphs, the left-hand side of
the curve is latency limited; the right-hand side is
bandwidth limited for LP64. (The total-Bytes axis uses
log base 2 scaling in the graphs.) Thread creation has
latency, so the 8-thread curves are mostly of interest
for kernel-vectors total bytes of 1 MiByte or more (say),
so that the thread creations are not a large part of the
total contribution to the measured time.

The thread creations are via std::async use.
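
A minimal, illustrative C++17 sketch of that per-lap std::async use
follows; the names and structure here are assumptions for illustration,
not the actual benchmark source:

#include <cstddef>
#include <cstdio>
#include <future>
#include <vector>

// Stand-in for the real per-thread integration work over a cell range.
static double run_chunk(std::size_t begin, std::size_t end) {
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i)
        sum += 1.0 / static_cast<double>(i + 1);
    return sum;
}

// One lap: create a worker thread per chunk, then have the main thread
// combine the per-thread results.
static double run_lap(std::size_t cells, unsigned nthreads) {
    std::vector<std::future<double>> futures;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t begin = cells * t / nthreads;
        std::size_t end = cells * (t + 1) / nthreads;
        // std::launch::async requests a new thread per worker per lap.
        futures.push_back(std::async(std::launch::async, run_chunk, begin, end));
    }
    double total = 0.0;
    for (auto &f : futures)
        total += f.get();
    return total;
}

int main() {
    std::printf("%f\n", run_lap(std::size_t(1) << 20, 8));
    return 0;
}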

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



FreeBSD-head-amd64-gcc builds are broken since 2019-Aug-17 or so: no previous declaration for '__ashldi3' (stand/i386/boot2 context)

2019-08-23 Thread Mark Millard via freebsd-amd64


https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/11176/console
is for -r351411 and shows:

15:43:33 
--- ashldi3.o ---

15:43:33 
/usr/local/bin/x86_64-unknown-freebsd12.0-gcc 
--sysroot=/tmp/obj/workspace/src/amd64.amd64/tmp 
-B/usr/local/x86_64-unknown-freebsd12.0/bin/  -O2 -pipe   
-I/workspace/src/stand/i386/btx/lib -nostdinc 
-I/tmp/obj/workspace/src/amd64.amd64/stand/libsa32 -I/workspace/src/stand/libsa 
-D_STANDALONE -I/workspace/src/sys -Ddouble=jagged-little-pill 
-Dfloat=floaty-mcfloatface -DLOADER_GELI_SUPPORT 
-I/workspace/src/stand/libsa/geli -DLOADER_DISK_SUPPORT -m32 -ffreestanding 
-mno-mmx -mno-sse  -msoft-float -march=i386 -I. -fomit-frame-pointer  -mrtd  
-mregparm=3  -DUFS1_AND_UFS2  -DFLAGS=0x80  -DSIOPRT=0x3f8  -DSIOFMT=0x3  
-DSIOSPD=9600  -I/workspace/src/stand/common  -Wall -Waggregate-return 
-Wbad-function-cast -Wno-cast-align  -Wmissing-declarations 
-Wmissing-prototypes -Wnested-externs  -Wpointer-arith -Wshadow 
-Wstrict-prototypes -Wwrite-strings  -Winline -g  -std=gnu99 
-Wno-format-zero-length -Wsystem-headers -Werror -Wno-pointer-sign 
-Wno-error=address -Wno-error=array-bounds -Wno-error=att
 ributes -Wno-error=bool-compare -Wno-error=cast-align -Wno-error=clobbered 
-Wno-error=enum-compare -Wno-error=extra -Wno-error=inline 
-Wno-error=logical-not-parentheses -Wno-error=strict-aliasing 
-Wno-error=uninitialized -Wno-error=unused-but-set-variable 
-Wno-error=unused-function -Wno-error=unused-value 
-Wno-error=misleading-indentation -Wno-error=nonnull-compare 
-Wno-error=shift-negative-value -Wno-error=tautological-compare 
-Wno-error=unused-const-variable   -Os -mpreferred-stack-boundary=2 -Os  
-fno-asynchronous-unwind-tables  --param max-inline-insns-single=100 
-Wno-missing-prototypes   -c 
/workspace/src/contrib/compiler-rt/lib/builtins/ashldi3.c -o ashldi3.o

15:43:33 
/workspace/src/contrib/compiler-rt/lib/builtins/ashldi3.c:22:1: error: no 
previous declaration for '__ashldi3' [-Werror=missing-declarations]

15:43:33 
 __ashldi3(di_int a, si_int b)

15:43:33 
 ^
15:43:33 
--- all_subdir_kerberos5 ---
. . .
15:43:33 
--- all_subdir_stand ---

15:43:33 
*** [ashldi3.o] Error code 1

15:43:33 
15:43:33 
make[5]: stopped in /workspace/src/stand/i386/boot2

15:43:33 
1 error

This error first showed up back at:

https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/11080/

which is for -r351138 .

The prior build was for -r351133 and it built okay.
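
For context, a minimal illustration of the class of warning involved
follows (a hypothetical fragment, not the compiler-rt source, which is C):
with GCC's -Wmissing-declarations promoted to an error by -Werror, defining
an external function with no prior declaration is rejected, while a
preceding prototype keeps the definition acceptable.

// Without the declaration on the next line, -Wmissing-declarations warns
// about the definition below (and -Werror turns that into a failure).
long long my_shift(long long a, int b);

long long my_shift(long long a, int b) {
    return a << b;
}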


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



amd64 head -r351153 self-built but via devel/llvm90: 'objcopy: elf_update() failed: Layout constraint violation' for gptboot.bin

2019-08-16 Thread Mark Millard via freebsd-amd64
I upgraded to head -r351153 and then attempted a buildworld
buildkernel via devel/llvm90 (rc2 via ports head -r509054),
but that (from scratch) build attempt got:

--- gptboot.bin ---
objcopy: elf_update() failed: Layout constraint violation
*** [gptboot.bin] Error code 1
make[5]: *** gptboot.bin removed

make[5]: stopped in /usr/src/stand/i386/gptboot
.ERROR_TARGET='gptboot.bin'
.ERROR_META_FILE='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/stand/i386/gptboot/gptboot.bin.meta'
.MAKE.LEVEL='5'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
_ERROR_CMD='objcopy -S -O binary gptboot.out gptboot.bin;'
.CURDIR='/usr/src/stand/i386/gptboot'
.MAKE='make'
.OBJDIR='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/stand/i386/gptboot'
.TARGETS='all'
DESTDIR='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp'
LD_LIBRARY_PATH=''
MACHINE='amd64'
MACHINE_ARCH='amd64'
MAKEOBJDIRPREFIX=''
MAKESYSPATH='/usr/src/share/mk'
MAKE_VERSION='20181221'
PATH='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/usr/sbin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/usr/bin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/bin::/sbin:/bin:/usr/sbin:/usr/bin'
SRCTOP='/usr/src'
OBJTOP='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64'
.MAKE.MAKEFILES='/usr/src/share/mk/sys.mk /usr/src/share/mk/local.sys.env.mk 
/usr/src/share/mk/src.sys.env.mk 
/root/src.configs/src.conf.amd64-xtoolchain-llvm.amd64-host 
/usr/src/share/mk/bsd.mkopt.mk /usr/src/share/mk/src.sys.obj.mk 
/usr/src/share/mk/auto.obj.mk /usr/src/share/mk/bsd.suffixes.mk 
/root/src.configs/make.conf /usr/src/share/mk/local.sys.mk 
/usr/src/share/mk/src.sys.mk /dev/null /usr/src/stand/i386/gptboot/Makefile 
/usr/src/share/mk/bsd.init.mk /usr/src/share/mk/bsd.opts.mk 
/usr/src/share/mk/bsd.cpu.mk /usr/src/share/mk/local.init.mk 
/usr/src/share/mk/src.init.mk /usr/src/stand/i386/gptboot/../Makefile.inc 
/usr/src/share/mk/bsd.linker.mk /usr/src/stand/i386/gptboot/../../Makefile.inc 
/usr/src/stand/i386/gptboot/../../defs.mk /usr/src/share/mk/src.opts.mk 
/usr/src/share/mk/bsd.own.mk /usr/src/share/mk/bsd.compiler.mk 
/usr/src/share/mk/bsd.compiler.mk /usr/src/share/mk/bsd.prog.mk 
/usr/src/share/mk/bsd.libnames.mk /usr/src/share/mk/src.libnames.mk 
/usr/src/share/mk/bsd.nl
 s.mk /usr/src/share/mk/bsd.confs.mk /usr/src/share/mk/bsd.files.mk 
/usr/src/share/mk/bsd.dirs.mk /usr/src/share/mk/bsd.incs.mk 
/usr/src/share/mk/bsd.links.mk /usr/src/share/mk/bsd.man.mk 
/usr/src/share/mk/bsd.dep.mk /usr/src/share/mk/bsd.clang-analyze.mk 
/usr/src/share/mk/bsd.obj.mk /usr/src/share/mk/bsd.subdir.mk 
/usr/src/share/mk/bsd.sys.mk'
.PATH='. /usr/src/stand/i386/gptboot /usr/src/stand/i386/boot2 
/usr/src/stand/i386/common /usr/src/stand/libsa'
1 error

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



head -r351102 amd64 rebuilding itself but via devel/xtoolchain-llvm90 ( rc2: ports head -r509054 ) fails for boot2.out: ld.lld: error: undefined symbol: __ashldi3

2019-08-15 Thread Mark Millard via freebsd-amd64
My attempt to have -r351102 rebuild itself via devel/llvm90 (rc2)
got:

--- all_subdir_stand ---
--- boot2.out ---
ld.lld: error: undefined symbol: __ashldi3
>>> referenced by ufsread.c:234 (/usr/src/stand/libsa/ufsread.c:234)
>>>   boot2.o:(fsread)
>>> referenced by ufsread.c:270 (/usr/src/stand/libsa/ufsread.c:270)
>>>   boot2.o:(fsread)
>>> referenced by ufsread.c:295 (/usr/src/stand/libsa/ufsread.c:295)
>>>   boot2.o:(fsread)
>>> referenced by ufsread.c:297 (/usr/src/stand/libsa/ufsread.c:297)
>>>   boot2.o:(fsread)
*** [boot2.out] Error code 1

make[5]: stopped in /usr/src/stand/i386/boot2
.ERROR_TARGET='boot2.out'
.ERROR_META_FILE='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/stand/i386/boot2/boot2.out.meta'
.MAKE.LEVEL='5'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
_ERROR_CMD='/usr/local/llvm90/bin/ld.lld -m elf_i386_fbsd -static -N 
--gc-sections -Ttext 0x2000 -o boot2.out 
/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/stand/i386/btx/lib/crt0.o
 boot2.o sio.o;'
.CURDIR='/usr/src/stand/i386/boot2'
.MAKE='make'
.OBJDIR='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/stand/i386/boot2'
.TARGETS='all'
DESTDIR='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp'
LD_LIBRARY_PATH=''
MACHINE='amd64'
MACHINE_ARCH='amd64'
MAKEOBJDIRPREFIX=''
MAKESYSPATH='/usr/src/share/mk'
MAKE_VERSION='20181221'
PATH='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/usr/sbin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/usr/bin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64/tmp/legacy/bin::/sbin:/bin:/usr/sbin:/usr/bin'
SRCTOP='/usr/src'
OBJTOP='/usr/obj/amd64_xtoolchain-llvm/amd64.amd64/usr/src/amd64.amd64'
.MAKE.MAKEFILES='/usr/src/share/mk/sys.mk /usr/src/share/mk/local.sys.env.mk 
/usr/src/share/mk/src.sys.env.mk 
/root/src.configs/src.conf.amd64-xtoolchain-llvm.amd64-host 
/usr/src/share/mk/bsd.mkopt.mk /usr/src/share/mk/src.sys.obj.mk 
/usr/src/share/mk/auto.obj.mk /usr/src/share/mk/bsd.suffixes.mk 
/root/src.configs/make.conf /usr/src/share/mk/local.sys.mk 
/usr/src/share/mk/src.sys.mk /dev/null /usr/src/stand/i386/boot2/Makefile 
/usr/src/share/mk/bsd.init.mk /usr/src/share/mk/bsd.opts.mk 
/usr/src/share/mk/bsd.cpu.mk /usr/src/share/mk/local.init.mk 
/usr/src/share/mk/src.init.mk /usr/src/stand/i386/boot2/../Makefile.inc 
/usr/src/share/mk/bsd.linker.mk /usr/src/stand/i386/boot2/../../Makefile.inc 
/usr/src/stand/i386/boot2/../../defs.mk /usr/src/share/mk/src.opts.mk 
/usr/src/share/mk/bsd.own.mk /usr/src/share/mk/bsd.compiler.mk 
/usr/src/share/mk/bsd.compiler.mk /usr/src/share/mk/bsd.prog.mk 
/usr/src/share/mk/bsd.libnames.mk /usr/src/share/mk/src.libnames.mk 
/usr/src/share/mk/bsd.nls.mk /us
 r/src/share/mk/bsd.confs.mk /usr/src/share/mk/bsd.files.mk 
/usr/src/share/mk/bsd.dirs.mk /usr/src/share/mk/bsd.incs.mk 
/usr/src/share/mk/bsd.links.mk /usr/src/share/mk/bsd.man.mk 
/usr/src/share/mk/bsd.dep.mk /usr/src/share/mk/bsd.clang-analyze.mk 
/usr/src/share/mk/bsd.obj.mk /usr/src/share/mk/bsd.subdir.mk 
/usr/src/share/mk/bsd.sys.mk'
.PATH='. /usr/src/stand/i386/boot2'
1 error
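
For context, a hypothetical illustration follows (not the ufsread.c code):
on a 32-bit i386 target, shifting a 64-bit value by a run-time amount is
commonly lowered by the compiler to a call to the libgcc/compiler-rt helper
__ashldi3, which the freestanding boot2 link above fails to resolve.

// Candidate for an __ashldi3 libcall when compiled for i386: the shift
// amount is not a compile-time constant and the operand is 64 bits wide.
unsigned long long shift_block(unsigned long long value, unsigned amount) {
    return value << amount;
}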

FYI:

# uname -apKU
FreeBSD FBSDFHUGE 13.0-CURRENT FreeBSD 13.0-CURRENT #29 r351102M: Thu Aug 15 
14:22:00 PDT 2019 
markmi@FBSDFHUGE:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1300039 1300039


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



amd64 head -r349794 (under Hyper-V): "panic: spin lock held too long" during a buildworld buildkernel

2019-07-06 Thread Mark Millard via freebsd-amd64
Looks like pmap_invalidate_range is using smp_targeted_tlb_shootdown, which
in turn is using _mtx_lock_spin_cookie.

I'll note that I had no trouble with -r349444 building world or kernel
repeatedly, including when I originally built -r349794 to upgrade.

The below is from my 2nd buildworld buildkernel under -r349794 (but the 2
builds were not from-scratch ones).


# more /var/crash/core.txt.2 
FBSDFSSD dumped core - see /var/crash/vmcore.2

Sun Jul  7 02:26:36 PDT 2019

FreeBSD FBSDFSSD 13.0-CURRENT FreeBSD 13.0-CURRENT #24 r349794M: Sun Jul  7 
01:55:57 PDT 2019 
markmi@FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG
  amd64

panic: spin lock held too long

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
spin lock 0x829540e0 (smp rendezvous) held by 0xf80ae7ebb5a0 (tid 
100669) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 15
time = 1562491248
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00db5263c0
vpanic() at vpanic+0x19d/frame 0xfe00db526410
panic() at panic+0x43/frame 0xfe00db526470
_mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x6d/frame 
0xfe00db526480
_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfe00db5264f0
smp_targeted_tlb_shootdown() at smp_targeted_tlb_shootdown+0x3de/frame 
0xfe00db526560
pmap_invalidate_range() at pmap_invalidate_range+0x25c/frame 0xfe00db5265f0
vm_thread_stack_dispose() at vm_thread_stack_dispose+0x2c/frame 
0xfe00db526640
thread_reap() at thread_reap+0x106/frame 0xfe00db526660
proc_reap() at proc_reap+0x788/frame 0xfe00db5266a0
proc_to_reap() at proc_to_reap+0x463/frame 0xfe00db5266f0
kern_wait6() at kern_wait6+0x34c/frame 0xfe00db526790
sys_wait4() at sys_wait4+0x78/frame 0xfe00db526980
amd64_syscall() at amd64_syscall+0x36e/frame 0xfe00db526ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00db526ab0
--- syscall (7, FreeBSD ELF64, sys_wait4), rip = 0x80038f7fa, rsp = 
0x7fffb168, rbp = 0x7fffb1b0 ---
KDB: enter: panic

Reading symbols from /boot/kernel/intpm.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/intpm.ko.debug...done.
done.
Loaded symbols for /boot/kernel/intpm.ko
Reading symbols from /boot/kernel/smbus.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/smbus.ko.debug...done.
done.
Loaded symbols for /boot/kernel/smbus.ko
Reading symbols from /boot/kernel/mac_ntpd.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/mac_ntpd.ko.debug...done.
done.
Loaded symbols for /boot/kernel/mac_ntpd.ko
Reading symbols from /boot/kernel/imgact_binmisc.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/imgact_binmisc.ko.debug...done.
done.
Loaded symbols for /boot/kernel/imgact_binmisc.ko
Reading symbols from /boot/kernel/filemon.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/filemon.ko.debug...done.
done.
Loaded symbols for /boot/kernel/filemon.ko
#0  doadump (textdump=0) at src/sys/amd64/include/pcpu.h:246
246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" 
(OFFSETOF_CURTHREAD));
(kgdb) #0  doadump (textdump=0) at src/sys/amd64/include/pcpu.h:246
#1  0x804a152b in db_dump (dummy=, 
dummy3=, dummy4=)
at /usr/src/sys/ddb/db_command.c:575
#2  0x804a12f9 in db_command (cmd_table=, 
dopager=1) at /usr/src/sys/ddb/db_command.c:482
#3  0x804a1074 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:535
#4  0x804a42cf in db_trap (type=, 
code=) at /usr/src/sys/ddb/db_main.c:252
#5  0x80c502bc in kdb_trap (type=3, code=0, tf=)
at /usr/src/sys/kern/subr_kdb.c:692
#6  0x810e1e2c in trap (frame=0xfe00db5262f0)
at /usr/src/sys/amd64/amd64/trap.c:621
#7  0x810bb8b5 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:232
#8  0x80c4f9cb in kdb_enter (why=0x8134eb81 "panic", 
msg=) at src/sys/amd64/include/cpufunc.h:65
#9  0x80c0226a in vpanic (fmt=, 
ap=) at /usr/src/sys/kern/kern_shutdown.c:894
#10 0x80c020a3 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:832
#11 0x80be128d in _mtx_lock_indefinite_check (m=, 
ldap=) at /usr/src/sys/kern/kern_mutex.c:1222
#12 0x80be0dd5 in _mtx_lock_spin_cookie (c=0x829540f8, 
v=) at /usr/src/sys/kern/kern_mutex.c:748
#13 0x8125b32e in smp_targeted_tlb_shootdown (mask=
  {__bits = 0xfe00db526570}, vector=246, pmap=0x82a682f8, 
addr1=18446741878369513472, addr2=18446741878369529856)
at /usr/src/sys/x86/x86/mp_x86.c:1671
#14 

Does head well-support the old MacPro3,1 (2 sockets of Xeon E5472 Quad-Core Processors)?

2019-04-07 Thread Mark Millard via freebsd-amd64
Does someone know the status of head relative to supporting the MacPro3,1?
If yes, please comment on the status. I may be able to get access to one
if it is well supported.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: head -r345758 Ryzen Threadripper 1950X vs. amdtemp.ko : dev.cpu.31 missing

2019-04-06 Thread Mark Millard via freebsd-amd64



On 2019-Apr-6, at 09:50, Konstantin Belousov  wrote:

> On Fri, Apr 05, 2019 at 11:47:58AM -0700, Mark Millard wrote:
>> 
>> 
>> On 2019-Apr-5, at 04:46, Konstantin Belousov  wrote:
>> 
>>> On Thu, Apr 04, 2019 at 04:58:15PM -0700, Mark Millard via freebsd-amd64 
>>> wrote:
>>>> On a:
>>>> 
>>>> CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.70-MHz K8-class 
>>>> CPU)
>>>> Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>>>> Features=0x178bfbff
>>>> Features2=0x7ed8320b
>>>> AMD Features=0x2e500800
>>>> AMD 
>>>> Features2=0x35c233ff
>>>> Structured Extended 
>>>> Features=0x209c01a9
>>>> XSAVE Features=0xf
>>>> AMD Extended Feature Extensions ID 
>>>> EBX=0x1007
>>>> SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
>>>> TSC: P-state invariant, performance statistics
>>>> 
>>>> after "kldload amdtemp" the following is seen:
>>>> 
>>>> # sysctl dev.cpu.31
>>>> sysctl: unknown oid 'dev.cpu.31'
>>>> 
>>>> # sysctl dev.cpu.30
>>>> dev.cpu.30.temperature: 62.1C
>>>> dev.cpu.30.cx_method: C1/hlt C2/io
>>>> dev.cpu.30.cx_usage_counters: 0 0
>>>> dev.cpu.30.cx_usage: 0.00% 0.00% last 100us
>>>> dev.cpu.30.cx_lowest: C1
>>>> dev.cpu.30.cx_supported: C1/1/0 C2/2/100
>>>> dev.cpu.30.%parent: acpi0
>>>> dev.cpu.30.%pnpinfo: _HID=none _UID=0
>>>> dev.cpu.30.%location: handle=\_PR_.C01F
>>>> dev.cpu.30.%driver: cpu
>>>> dev.cpu.30.%desc: ACPI CPU
>>>> 
>>>> . . . 
>>> 
>>> In the output of devinfo(8), how many CPUs do you see ?  Is there cpu31,
>>> and does it have amdtemp child ?
>> 
>> (I only used 'sysctl -a | grep "temp.*[0-9]C$"' as a short
>> way to show one line per dev.cpu.N so show the others  were
>> all present.)
>> 
>> cpu31 is missing in the devinfo output. The amdtempM's are under
>> pcibX > pciY > hostbZ , not per cpuN .
>> 
>> Shortended output but showing all the cpuN and amdtmpM
>> and their "parents" and "childern":
>> 
>> # devinfo
>> nexus0
>>  cryptosoft0
>>  vtvga0
>>  apic0
>>  ram0
>>  acpi0
>>cpu0
>>  hwpstate0
>>  cpufreq0
>>cpu1
>>cpu2
>>cpu3
>>cpu4
>>cpu5
>>cpu6
>>cpu7
>>cpu8
>>cpu9
>>cpu10
>>cpu11
>>cpu12
>>cpu13
>>cpu14
>>cpu15
>>cpu16
>>cpu17
>>cpu18
>>cpu19
>>cpu20
>>cpu21
>>cpu22
>>cpu23
>>cpu24
>>cpu25
>>cpu26
>>cpu27
>>cpu28
>>cpu29
>>cpu30
>>pcib0
>>  pci0
>>hostb0
>>  amdsmn0
>>  amdtemp0
>> . . ,
>>pcib12
>>  pci12
>>hostb23
>>  amdsmn1
>>  amdtemp1
>> . . .
> 
> Ok, I see, it was unexpected to see amdtemp to attach under the host
> bridge instead of cpu device.  Please post complete output of devinfo -r
> and pciconf -lvcb somewhere.

Sure. Thanks. I created:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237063

and added the two files with the output as attachments.


Note: I'm not going to have more access to the system for
a few(?) days. Hopefully the 2 files are sufficient
evidence for now.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Re: head -r345758 Ryzen Threadripper 1950X vs. amdtemp.ko : dev.cpu.31 missing

2019-04-05 Thread Mark Millard via freebsd-amd64



On 2019-Apr-5, at 04:46, Konstantin Belousov  wrote:

> On Thu, Apr 04, 2019 at 04:58:15PM -0700, Mark Millard via freebsd-amd64 
> wrote:
>> On a:
>> 
>> CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.70-MHz K8-class 
>> CPU)
>>  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>> Features=0x178bfbff
>> Features2=0x7ed8320b
>>  AMD Features=0x2e500800
>>  AMD 
>> Features2=0x35c233ff
>>  Structured Extended 
>> Features=0x209c01a9
>>  XSAVE Features=0xf
>>  AMD Extended Feature Extensions ID EBX=0x1007
>>  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
>>  TSC: P-state invariant, performance statistics
>> 
>> after "kldload amdtemp" the following is seen:
>> 
>> # sysctl dev.cpu.31
>> sysctl: unknown oid 'dev.cpu.31'
>> 
>> # sysctl dev.cpu.30
>> dev.cpu.30.temperature: 62.1C
>> dev.cpu.30.cx_method: C1/hlt C2/io
>> dev.cpu.30.cx_usage_counters: 0 0
>> dev.cpu.30.cx_usage: 0.00% 0.00% last 100us
>> dev.cpu.30.cx_lowest: C1
>> dev.cpu.30.cx_supported: C1/1/0 C2/2/100
>> dev.cpu.30.%parent: acpi0
>> dev.cpu.30.%pnpinfo: _HID=none _UID=0
>> dev.cpu.30.%location: handle=\_PR_.C01F
>> dev.cpu.30.%driver: cpu
>> dev.cpu.30.%desc: ACPI CPU
>> 
>> . . . 
> 
> In the output of devinfo(8), how many CPUs do you see ?  Is there cpu31,
> and does it have amdtemp child ?

(I only used 'sysctl -a | grep "temp.*[0-9]C$"' as a short
way to show one line per dev.cpu.N, so as to show that the
others were all present.)

cpu31 is missing in the devinfo output. The amdtempM's are under
pcibX > pciY > hostbZ, not per cpuN.

Shortened output, but showing all the cpuN and amdtempM
devices and their "parents" and "children":

# devinfo
nexus0
  cryptosoft0
  vtvga0
  apic0
  ram0
  acpi0
cpu0
  hwpstate0
  cpufreq0
cpu1
cpu2
cpu3
cpu4
cpu5
cpu6
cpu7
cpu8
cpu9
cpu10
cpu11
cpu12
cpu13
cpu14
cpu15
cpu16
cpu17
cpu18
cpu19
cpu20
cpu21
cpu22
cpu23
cpu24
cpu25
cpu26
cpu27
cpu28
cpu29
cpu30
pcib0
  pci0
hostb0
  amdsmn0
  amdtemp0
. . .
pcib12
  pci12
hostb23
  amdsmn1
  amdtemp1
. . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


head -r345758 Ryzen Threadripper 1950X vs. amdtemp.ko : dev.cpu.31 missing

2019-04-04 Thread Mark Millard via freebsd-amd64
On a:

CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.70-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
 
Features=0x178bfbff
 
Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD 
Features2=0x35c233ff
  Structured Extended 
Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x1007
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics

after "kldload amdtemp" the following is seen:

# sysctl dev.cpu.31
sysctl: unknown oid 'dev.cpu.31'

# sysctl dev.cpu.30
dev.cpu.30.temperature: 62.1C
dev.cpu.30.cx_method: C1/hlt C2/io
dev.cpu.30.cx_usage_counters: 0 0
dev.cpu.30.cx_usage: 0.00% 0.00% last 100us
dev.cpu.30.cx_lowest: C1
dev.cpu.30.cx_supported: C1/1/0 C2/2/100
dev.cpu.30.%parent: acpi0
dev.cpu.30.%pnpinfo: _HID=none _UID=0
dev.cpu.30.%location: handle=\_PR_.C01F
dev.cpu.30.%driver: cpu
dev.cpu.30.%desc: ACPI CPU

# sysctl -a | grep "temp.*[0-9]C$"
dev.amdtemp.1.core0.sensor0: 62.0C
dev.amdtemp.0.core0.sensor0: 62.1C
dev.cpu.30.temperature: 62.1C
dev.cpu.29.temperature: 62.1C
dev.cpu.28.temperature: 62.1C
dev.cpu.27.temperature: 62.1C
dev.cpu.26.temperature: 62.1C
dev.cpu.25.temperature: 62.1C
dev.cpu.24.temperature: 62.1C
dev.cpu.23.temperature: 62.1C
dev.cpu.22.temperature: 62.1C
dev.cpu.21.temperature: 62.1C
dev.cpu.20.temperature: 62.1C
dev.cpu.19.temperature: 62.1C
dev.cpu.18.temperature: 62.1C
dev.cpu.17.temperature: 62.1C
dev.cpu.16.temperature: 62.1C
dev.cpu.15.temperature: 62.1C
dev.cpu.14.temperature: 62.1C
dev.cpu.13.temperature: 62.1C
dev.cpu.12.temperature: 62.1C
dev.cpu.11.temperature: 62.1C
dev.cpu.10.temperature: 62.1C
dev.cpu.9.temperature: 62.1C
dev.cpu.8.temperature: 62.1C
dev.cpu.7.temperature: 62.1C
dev.cpu.6.temperature: 62.1C
dev.cpu.5.temperature: 62.1C
dev.cpu.4.temperature: 62.1C
dev.cpu.3.temperature: 62.1C
dev.cpu.2.temperature: 62.1C
dev.cpu.1.temperature: 62.1C
dev.cpu.0.temperature: 62.1C
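
As an aside: with amdtemp.ko loaded, the same values can be read from C
via sysctlbyname(3). A minimal sketch (illustration only, not code from
base; it assumes the usual sysctl IK encoding, i.e. an int holding tenths
of a degree Kelvin, with a zero-Celsius offset near 2731):

/*
 * cputemp.c (hypothetical name): read dev.cpu.0.temperature and
 * convert it from the IK encoding to Celsius.  Illustration only.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>

int
main(void)
{
        int ik;
        size_t len = sizeof(ik);

        if (sysctlbyname("dev.cpu.0.temperature", &ik, &len, NULL, 0) == -1) {
                perror("sysctlbyname(dev.cpu.0.temperature)");
                return (1);
        }
        printf("dev.cpu.0.temperature: %.1fC\n", (ik - 2731) / 10.0);
        return (0);
}

Built with, say, cc cputemp.c -o cputemp , its output should roughly
match the sysctl figures listed above.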


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Re: Ryzen Threadripper 1950X based on head -r340287: sysctl dev.cpu: 0-30 but no 31? (top shows all 0-31 "CPU"s) [subject corrected]

2018-11-17 Thread Mark Millard via freebsd-amd64
[Fixing dumb, confusing subject typo. No change below.]

On 2018-Nov-17, at 12:54, Mark Millard  wrote:


> For some reason there is no dev.cpu.31 listed for the 1950X that
> I use. This is a native boot, not a run under Hyper-V. For
> illustration I list:
> 
> # sysctl dev.cpu | grep "desc"
> dev.cpu.30.%desc: ACPI CPU
> dev.cpu.29.%desc: ACPI CPU
> dev.cpu.28.%desc: ACPI CPU
> dev.cpu.27.%desc: ACPI CPU
> dev.cpu.26.%desc: ACPI CPU
> dev.cpu.25.%desc: ACPI CPU
> dev.cpu.24.%desc: ACPI CPU
> dev.cpu.23.%desc: ACPI CPU
> dev.cpu.22.%desc: ACPI CPU
> dev.cpu.21.%desc: ACPI CPU
> dev.cpu.20.%desc: ACPI CPU
> dev.cpu.19.%desc: ACPI CPU
> dev.cpu.18.%desc: ACPI CPU
> dev.cpu.17.%desc: ACPI CPU
> dev.cpu.16.%desc: ACPI CPU
> dev.cpu.15.%desc: ACPI CPU
> dev.cpu.14.%desc: ACPI CPU
> dev.cpu.13.%desc: ACPI CPU
> dev.cpu.12.%desc: ACPI CPU
> dev.cpu.11.%desc: ACPI CPU
> dev.cpu.10.%desc: ACPI CPU
> dev.cpu.9.%desc: ACPI CPU
> dev.cpu.8.%desc: ACPI CPU
> dev.cpu.7.%desc: ACPI CPU
> dev.cpu.6.%desc: ACPI CPU
> dev.cpu.5.%desc: ACPI CPU
> dev.cpu.4.%desc: ACPI CPU
> dev.cpu.3.%desc: ACPI CPU
> dev.cpu.2.%desc: ACPI CPU
> dev.cpu.1.%desc: ACPI CPU
> dev.cpu.0.%desc: ACPI CPU
> 
> # sysctl dev.cpu.0
> dev.cpu.0.temperature: 57.1C
> dev.cpu.0.cx_method: C1/hlt C2/io
> dev.cpu.0.cx_usage_counters: 0 0
> dev.cpu.0.cx_usage: 0.00% 0.00% last 100us
> dev.cpu.0.cx_lowest: C1
> dev.cpu.0.cx_supported: C1/1/0 C2/2/100
> dev.cpu.0.freq_levels: 3400/-1 2800/-1 2200/-1
> dev.cpu.0.freq: 3400
> dev.cpu.0.%parent: acpi0
> dev.cpu.0.%pnpinfo: _HID=none _UID=0
> dev.cpu.0.%location: handle=\_PR_.C001
> dev.cpu.0.%driver: cpu
> dev.cpu.0.%desc: ACPI CPU
> 
> # sysctl dev.cpu.31
> sysctl: unknown oid 'dev.cpu.31'
> 
> By contrast, top's output shows all of CPUs 0-31:
> 
> CPU 0:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 6:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 10:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 12:  0.0% user,  0.0% nice,  0.0% system,  1.1% interrupt, 98.9% idle
> CPU 13:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 15:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 16:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 17:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 18:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 19:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 20:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 21:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 22:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 23:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 24:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 25:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 26:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 27:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 28:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 29:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 30:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 31:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Ryzen Threadripper 1950X based on head -r340287: sysctl dev.cpu: 0-30 but no 31? (top shows all 31 "CPU"s)

2018-11-17 Thread Mark Millard via freebsd-amd64


For some reason there is no dev.cpu.31 listed for the 1950X that
I use. This is a native boot, not a run under Hyper-V. For
illustration I list:

# sysctl dev.cpu | grep "desc"
dev.cpu.30.%desc: ACPI CPU
dev.cpu.29.%desc: ACPI CPU
dev.cpu.28.%desc: ACPI CPU
dev.cpu.27.%desc: ACPI CPU
dev.cpu.26.%desc: ACPI CPU
dev.cpu.25.%desc: ACPI CPU
dev.cpu.24.%desc: ACPI CPU
dev.cpu.23.%desc: ACPI CPU
dev.cpu.22.%desc: ACPI CPU
dev.cpu.21.%desc: ACPI CPU
dev.cpu.20.%desc: ACPI CPU
dev.cpu.19.%desc: ACPI CPU
dev.cpu.18.%desc: ACPI CPU
dev.cpu.17.%desc: ACPI CPU
dev.cpu.16.%desc: ACPI CPU
dev.cpu.15.%desc: ACPI CPU
dev.cpu.14.%desc: ACPI CPU
dev.cpu.13.%desc: ACPI CPU
dev.cpu.12.%desc: ACPI CPU
dev.cpu.11.%desc: ACPI CPU
dev.cpu.10.%desc: ACPI CPU
dev.cpu.9.%desc: ACPI CPU
dev.cpu.8.%desc: ACPI CPU
dev.cpu.7.%desc: ACPI CPU
dev.cpu.6.%desc: ACPI CPU
dev.cpu.5.%desc: ACPI CPU
dev.cpu.4.%desc: ACPI CPU
dev.cpu.3.%desc: ACPI CPU
dev.cpu.2.%desc: ACPI CPU
dev.cpu.1.%desc: ACPI CPU
dev.cpu.0.%desc: ACPI CPU

# sysctl dev.cpu.0
dev.cpu.0.temperature: 57.1C
dev.cpu.0.cx_method: C1/hlt C2/io
dev.cpu.0.cx_usage_counters: 0 0
dev.cpu.0.cx_usage: 0.00% 0.00% last 100us
dev.cpu.0.cx_lowest: C1
dev.cpu.0.cx_supported: C1/1/0 C2/2/100
dev.cpu.0.freq_levels: 3400/-1 2800/-1 2200/-1
dev.cpu.0.freq: 3400
dev.cpu.0.%parent: acpi0
dev.cpu.0.%pnpinfo: _HID=none _UID=0
dev.cpu.0.%location: handle=\_PR_.C001
dev.cpu.0.%driver: cpu
dev.cpu.0.%desc: ACPI CPU

# sysctl dev.cpu.31
sysctl: unknown oid 'dev.cpu.31'

By contrast, top's output shows all of CPUs 0-31:

CPU 0:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system,  1.1% interrupt, 98.9% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 19:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 20:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 21:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 22:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 23:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 24:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 25:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 26:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 27:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 28:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 29:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 30:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 31:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
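
For what it's worth, a small self-contained C sketch (illustration only,
not anything from base) that compares hw.ncpu against which dev.cpu.N
oids actually exist; on this box it should list dev.cpu.0 through
dev.cpu.30 and then report dev.cpu.31 as missing, matching the mismatch
with top shown above:

/*
 * cpu_oids.c (hypothetical name): for N in 0..hw.ncpu-1, report whether
 * the dev.cpu.N.%desc oid exists.  Illustration only.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>

int
main(void)
{
        int i, ncpu;
        size_t len = sizeof(ncpu);
        char name[32], desc[64];

        if (sysctlbyname("hw.ncpu", &ncpu, &len, NULL, 0) == -1) {
                perror("sysctlbyname(hw.ncpu)");
                return (1);
        }
        printf("hw.ncpu: %d\n", ncpu);
        for (i = 0; i < ncpu; i++) {
                snprintf(name, sizeof(name), "dev.cpu.%d.%%desc", i);
                len = sizeof(desc);
                if (sysctlbyname(name, desc, &len, NULL, 0) == -1)
                        printf("dev.cpu.%d: missing\n", i);
                else
                        printf("dev.cpu.%d: %s\n", i, desc);
        }
        return (0);
}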



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Re: svn commit: r335873 - in head: . sys/amd64/amd64 sys/amd64/include sys/conf sys/i386/i386 sys/i386/include sys/sys sys/vm

2018-07-31 Thread Mark Millard via freebsd-amd64
> Author: mmacy
> Date: Mon Jul  2 19:48:38 2018
> New Revision: 335873
> URL: 
> https://svnweb.freebsd.org/changeset/base/335873
> 
> 
> Log:
>   inline atomics and allow tied modules to inline locks
>   
>   - inline atomics in modules on i386 and amd64 (they were always
> inline on other arches)
>   - allow modules to opt in to inlining locks by specifying
> MODULE_TIED=1 in the makefile

I recently found the following about ABI incompatibilities
between clang and gcc relative to C11 language-based
atomics:

https://bugs.llvm.org/show_bug.cgi?id=26462

26462 – GCC/clang C11 _Atomic incompatibility


So are there implications for building the kernel
vs. modules when the toolchains end up mixed once
modules are loaded? Do the toolchains need to match,
at least for the amd64 and i386 TARGET_ARCHs?



For reference as an introduction to the material
in llvm's 26462 . . .

It appears that the normal sources of platform ABI definitions
are not explicit/detailed in this area and so allow for such
incompatibilities. clang and gcc made differing choices absent
being constrained to match.

An example (a powerpc64 context was indicated):

struct A16 { char val[16]; }; 
_Atomic struct A16 a16; 
// GCC:
_Static_assert(_Alignof(a16) == 16, ""); 
// Clang:
_Static_assert(_Alignof(a16) == 1, ""); 


Non-power-of-2 sizes are a general problem
(not specific to a powerpc64 context, from what
I can tell):

struct A3 { char val[3]; };
_Atomic struct A3 a3;
// GCC:
_Static_assert(sizeof(a3) == 3, "");
_Static_assert(_Alignof(a3) == 1, "");
// Clang:
_Static_assert(sizeof(a3) == 4, "");
_Static_assert(_Alignof(a3) == 4, "");
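
To make the divergence easy to check locally, here is a small,
hypothetical C test program (a sketch of my own, not taken from the llvm
bug report); building it once with gcc and once with clang for the same
target and comparing the output should reproduce the disagreements
quoted above:

/*
 * atomic_abi_check.c (hypothetical name): print the layout this
 * compiler picks for some _Atomic struct types.  No atomic operations
 * are performed, so no libatomic linkage is needed.
 */
#include <stdio.h>

struct A3  { char val[3];  };
struct A16 { char val[16]; };

int
main(void)
{
        printf("_Atomic struct A3 : size %zu align %zu\n",
            sizeof(_Atomic struct A3), _Alignof(_Atomic struct A3));
        printf("_Atomic struct A16: size %zu align %zu\n",
            sizeof(_Atomic struct A16), _Alignof(_Atomic struct A16));
        return (0);
}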


Comment 6 (by John McCall) is relevant:

QUOTE
Anyway, while I prefer the Clang rule, the GCC rule is defensible, as are any 
number of other rules.  The important point, however, is that having this 
discussion is not the right approach to solving this problem.  The layout of 
_Atomic(T) is ABI.  ABI rules are not generally determined by compiler 
implementors making things up as they go along, or at least they shouldn't be.  
The Darwin ABI for _Atomic is the rule implemented in Clang, which we actually 
did think about carefully when we adopted it.  Other platforms need to make 
their own call, and it probably shouldn't just be "whatever's implemented in 
GCC", especially on other platforms where GCC is not the system compiler.
END QUOTE


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Why https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc builds are failing . . .

2018-04-21 Thread Mark Millard via freebsd-amd64
/usr/local/bin/x86_64-freebsd-ld: unrecognized option '--no-rosegment'

is the message that reports what stops the build. I think this traces
back to:

/usr/src/share/mk/bsd.sys.mk:LDFLAGS+=  ${LDFLAGS.${LINKER_TYPE}}

being incorrect for an amd64-gcc / x86_64-freebsd-ld based build.



The details for how I got to that follow.

Looking around . . .

# grep -r rosegment /usr/src/* | more
/usr/src/contrib/llvm/tools/lld/ELF/Writer.cpp:  //   -no-rosegment option is used.
/usr/src/contrib/llvm/tools/lld/ELF/Options.td:def no_rosegment: F<"no-rosegment">,
/usr/src/contrib/llvm/tools/lld/ELF/ScriptParser.cpp:  // -no-rosegment is used to avoid placing read only non-executable sections in
/usr/src/contrib/llvm/tools/lld/ELF/SyntheticSections.cpp:  // for that case, which happens only when -no-rosegment is given.
/usr/src/contrib/llvm/tools/lld/ELF/Driver.cpp:  Config->SingleRoRx = Args.hasArg(OPT_no_rosegment);
/usr/src/stand/i386/Makefile.inc:LDFLAGS.lld+=  -Wl,--no-rosegment
/usr/src/usr.bin/clang/lld/ld.lld.1:.It Fl -no-rosegment

Note the /usr/src/stand/i386/Makefile.inc line: LDFLAGS.lld+=  -Wl,--no-rosegment
This seems to be the only place that --no-rosegment could have come from.

The error report detail is:

===> stand/i386/mbr (all)
--- machine ---
machine -> /workspace/src/sys/i386/include
--- x86 ---
x86 -> /workspace/src/sys/x86/include
--- mbr.o ---
as  --defsym FLAGS=0x80 --32  -o mbr.o /workspace/src/stand/i386/mbr/mbr.s
--- mbr ---
/usr/local/bin/x86_64-unknown-freebsd11.1-gcc -isystem 
/workspace/obj/workspace/src/amd64.amd64/tmp/usr/include 
-L/workspace/obj/workspace/src/amd64.amd64/tmp/usr/lib 
-B/workspace/obj/workspace/src/amd64.amd64/tmp/usr/lib 
--sysroot=/workspace/obj/workspace/src/amd64.amd64/tmp 
-B/workspace/obj/workspace/src/amd64.amd64/tmp/usr/bin -O2 -pipe 
-I/workspace/src/stand/i386/btx/lib -nostdinc 
-I/workspace/obj/workspace/src/amd64.amd64/stand/libsa32 
-I/workspace/src/stand/libsa -D_STANDALONE -I/workspace/src/sys 
-Ddouble=jagged-little-pill -Dfloat=floaty-mcfloatface -DLOADER_DISK_SUPPORT 
-m32 -mcpu=i386 -ffreestanding -mno-mmx -mno-sse -msoft-float -march=i386 -I. 
-std=gnu99 -Wsystem-headers -Werror -Wno-pointer-sign -Wno-error=address 
-Wno-error=array-bounds -Wno-error=attributes -Wno-error=bool-compare 
-Wno-error=cast-align -Wno-error=clobbered -Wno-error=enum-compare 
-Wno-error=extra -Wno-error=inline -Wno-error=logical-not-parentheses 
-Wno-error=strict-aliasing -Wno-error=uninitialized 
-Wno-error=unused-but-set-variable -Wno-error=unused-function 
-Wno-error=unused-value -Wno-error=misleading-indentation 
-Wno-error=nonnull-compare -Wno-error=shift-negative-value 
-Wno-error=tautological-compare -Wno-error=unused-const-variable 
-mpreferred-stack-boundary=2  -e start -Ttext 0x600 -Wl,-N,-S,--oformat,binary 
-nostdlib -Wl,--no-rosegment -o mbr mbr.o  
x86_64-unknown-freebsd11.1-gcc: warning: '-mcpu=' is deprecated; use '-mtune=' 
or '-march=' instead
/usr/local/bin/x86_64-freebsd-ld: unrecognized option '--no-rosegment'
/usr/local/bin/x86_64-freebsd-ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
*** [mbr] Error code 1


It appears that, for an amd64-gcc build that is using
/usr/local/bin/x86_64-freebsd-ld
via x86_64-unknown-freebsd11.1-gcc, for some reason LDFLAGS.lld content is
being used.

This suggests that in:

/usr/src/share/mk/bsd.sys.mk:LDFLAGS+=  ${LDFLAGS.${LINKER_TYPE}}

LINKER_TYPE is "lld". In turn that would seem to be from:

/usr/src/share/mk/bsd.linker.mk:${X_}LINKER_TYPE=   lld

Or, with more context (indented, and inside a
".for ld X_ in LD $${_empty_var_} XLD X_" loop):

.if ${ld} == "LD" || (${ld} == "XLD" && ${XLD} != ${LD})
  .if !defined(${X_}LINKER_TYPE) || !defined(${X_}LINKER_VERSION)
_ld_version!=   (${${ld}} --version || echo none) | head -n 1
. . .
.elif ${_ld_version:[1]} == "LLD"
  ${X_}LINKER_TYPE=   lld
  _v= ${_ld_version:[2]}
.else
. . .
  .endif
.else
  . . .
.endif  # ${ld} == "LD" || (${ld} == "XLD" && ${XLD} != ${LD})


So it seems that ${${ld}} picked out lld when assigning _ld_version.
In turn, the following is what apparently executed:

.elif ${_ld_version:[1]} == "LLD"
  ${X_}LINKER_TYPE=   lld
  _v= ${_ld_version:[2]}

But for (again):

/usr/src/share/mk/bsd.sys.mk:LDFLAGS+=  ${LDFLAGS.${LINKER_TYPE}}

that implies that the case involved is:

ld X_ in LD $${_empty_var_}

It looks like the LDFLAGS+= should not be using that for this type
of build.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


Ryzen Threadripper, Hyper-V, and NUMA vs. DIMMs: 3 DIMMs on each side seems to always be in "Local" mode, not "Distributed"

2018-04-08 Thread Mark Millard via freebsd-amd64
Context: Ryzen Threadripper 1950X under Windows 10 Pro
with Hyper-V (used to run FreeBSD).

In experimenting with switching a Threadripper 1950X to
have ECC RAM I discovered:

A) The maximum ECC memory it would put to use was 96 GiBytes
   (3 DIMMs on each side, a 4th on each side was recognized
   but was ignored/disabled if present).

B) AMD Ryzen Master classified the 96 GiByte configurations
   (with or without the ignored DIMMs) as "Local", without an
   ability to switch to "Distributed".

C) The downloaded Windows CoreInfo.exe utility agreed on there
   being 2 NUMA nodes.

D) As did the result of the User Hardware Topology button
   in the Hyper-V Processor > NUMA settings:

   On a single virtual non-uniform memory architecture node:
   Maximum number of processors          : 16
   Maximum amount of memory (MB)         : 48070
   Maximum NUMA nodes allowed on a socket: 2

   Only 1 socket.

E) The CoreInfo.exe quick "Approximate Cross-NUMA Node Access
   Cost (relative to fastest)" tends to show the 4 numbers
   varying from 1.0 to 1.7 when retried repeatedly. An oddity
   is that the 1.0 is often in the cross-node cell between 00
   and 01 (in fact that seems to be the usual case), and
   normally at most one 1.0 exists. The 00 row seems to always
   have the smaller numbers. An example:

        00   01
   00:  1.2  1.0
   01:  1.3  1.5


I had no original intent of playing with NUMA, but I figured
that the Threadripper being configurable for such (and even
having configurations that apparently require such, as far as
AMD Ryzen Master is concerned) could be of interest and of
possible use to folks testing FreeBSD NUMA support.

Since I'd done nothing to build a kernel with NUMA enabled,
FreeBSD 12.0 under Hyper-V did not see the NUMA structure
from (D). One thing that did show during booting was
getting 4 lines of: "SRAT: Ignoring memory at addr" instead
of 2.
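
(As a hedged illustration only, assuming the kernel in use exposes the
vm.ndomains sysctl: once a NUMA-enabled kernel is booted, a quick C
check of how many memory domains FreeBSD set up could look like the
following.)

/*
 * numa_domains.c (hypothetical name): report how many VM domains the
 * running kernel set up.  Illustration only.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>

int
main(void)
{
        int ndomains;
        size_t len = sizeof(ndomains);

        if (sysctlbyname("vm.ndomains", &ndomains, &len, NULL, 0) == -1) {
                perror("sysctlbyname(vm.ndomains)");
                return (1);
        }
        printf("vm.ndomains: %d\n", ndomains);
        return (0);
}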

===
Mark Millard
marklmi26-fbsd at yahoo.com
( dsl-only.net went
away in early 2018-Mar)






___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


"Could not allocate I/O space" and "intsmb0 attach returned 6" in a under-Hyper-V context on Ryzen Threadripper: Is this expected?

2018-04-01 Thread Mark Millard via freebsd-amd64
For:

# uname -apKU
FreeBSD FBSDHUGE 12.0-CURRENT FreeBSD 12.0-CURRENT  r331831M  amd64 amd64 
1200060 1200060

I get:

. . .
pci0:  at device 7.3 (no driver attached)
. . .
intsmb0:  at device 7.3 on pci0
intsmb0: Could not allocate I/O space
device_attach: intsmb0 attach returned 6

on a Ryzen Threadripper 1950X where FreeBSD is being run under
Hyper-V (on a Windows 10 Pro machine).

Is this expected? Did I misconfigure something in Hyper-V?

This may have been true for a long time and I just
had not noticed until now.

For reference:

# pciconf -l
hostb0@pci0:0:0:0:  class=0x06 card=0x chip=0x71928086 rev=0x03 hdr=0x00
isab0@pci0:0:7:0:   class=0x060100 card=0x1414 chip=0x71108086 rev=0x01 hdr=0x00
atapci0@pci0:0:7:1: class=0x010180 card=0x chip=0x71118086 rev=0x01 hdr=0x00
none0@pci0:0:7:3:   class=0x068000 card=0x chip=0x71138086 rev=0x02 hdr=0x00
vgapci0@pci0:0:8:0: class=0x03 card=0x chip=0x53531414 rev=0x00 hdr=0x00

# pciconf -l -v 0:0:7:3
none0@pci0:0:7:3:   class=0x068000 card=0x chip=0x71138086 rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = '82371AB/EB/MB PIIX4 ACPI'
class  = bridge

And . . .

Hyper-V Version: 10.0.16299 [SP0]
  
Features=0x2e7f
  PM Features=0x0 [C2]
  Features3=0xbed7b2
Timecounter "Hyper-V" frequency 1000 Hz quality 2000
CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.73-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  
Features=0x1783fbff
  
Features2=0xfed83203
  AMD Features=0x2e500800
  AMD Features2=0x3f3
  Structured Extended 
Features=0x201c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x4
Hypervisor: Origin = "Microsoft Hv"
real memory  = 53687091200 (51200 MB)
avail memory = 52206305280 (49787 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 29 CPUs
FreeBSD/SMP: 1 package(s) x 29 core(s)



The local changes to /usr/src/ are mostly tied to
powerpc64 and powerpc experimental activity, but
there is some arm64 and arm material:

# svnlite status /usr/src/ | sort
?   /usr/src/nohup.out
?   /usr/src/sys/amd64/conf/GENERIC-DBG
?   /usr/src/sys/amd64/conf/GENERIC-NODBG
?   /usr/src/sys/arm/conf/GENERIC-DBG
?   /usr/src/sys/arm/conf/GENERIC-NODBG
?   /usr/src/sys/arm64/conf/GENERIC-DBG
?   /usr/src/sys/arm64/conf/GENERIC-NODBG
?   /usr/src/sys/dts/arm/a83t.dtsi
?   /usr/src/sys/dts/arm/sinovoip-bpi-m3.dts
?   /usr/src/sys/dts/arm/sun8i-a83t-sinovoip-bpi-m3.dts
?   /usr/src/sys/dts/arm/sun8i-a83t.dtsi
?   /usr/src/sys/powerpc/conf/GENERIC64vtsc-DBG
?   /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODBG
?   /usr/src/sys/powerpc/conf/GENERICvtsc-DBG
?   /usr/src/sys/powerpc/conf/GENERICvtsc-NODBG
M   /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
M   /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp
M   /usr/src/crypto/openssl/crypto/armcap.c
M   /usr/src/lib/libkvm/kvm_powerpc.c
M   /usr/src/lib/libkvm/kvm_private.c
M   /usr/src/stand/defs.mk
M   /usr/src/stand/powerpc/boot1.chrp/Makefile
M   /usr/src/stand/powerpc/kboot/Makefile
M   /usr/src/sys/arm64/arm64/identcpu.c
M   /usr/src/sys/conf/kmod.mk
M   /usr/src/sys/conf/ldscript.powerpc
M   /usr/src/sys/kern/subr_pcpu.c
M   /usr/src/sys/modules/dtb/allwinner/Makefile
M   /usr/src/sys/powerpc/aim/mmu_oea64.c
M   /usr/src/sys/powerpc/ofw/ofw_machdep.c
M   /usr/src/sys/powerpc/powerpc/interrupt.c
M   /usr/src/sys/powerpc/powerpc/mp_machdep.c
M   /usr/src/sys/powerpc/powerpc/trap.c
M   /usr/src/usr.bin/top/machine.c

I've modified top to show "MaxObsUsed" (Maximum Observed
Used) for Swap when it is positive:

Swap: 194G Total, 4235M Used, 4235M MaxObsUsed, 190G Free, 2% Inuse, 416K In


===
Mark Millard
marklmi26-fbsd at yahoo.com
( dsl-only.net went
away in early 2018-Mar)






___
freebsd-amd64@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
To unsubscribe, send any mail to "freebsd-amd64-unsubscr...@freebsd.org"


head -r331499 amd64/threadripper panic in vm_page_free_prep during "poudriere bulk -a", after 14h 22m or so.

2018-03-25 Thread Mark Millard via freebsd-amd64
FreeBSD panic'd while attempting to see if a "poudriere bulk -w -a"
would hit the "unnecessary swapping" problem in my UFS-only context,
-r331499 (non-debug but with symbols), under Hyper-V. This is a
Ryzen Threadripper context, but I've no clue if that is important
to the problem. This was after 14 hours or so of building:

. . .
[14:22:05] [18] [00:01:16] Finished devel/p5-Test-HTML-Tidy | 
p5-Test-HTML-Tidy-1.00_1: Success
[14:22:08] [18] [00:00:00] Building devel/ocaml-camlp5 | ocaml-camlp5-6.16

So I've no clue if or how to repeat this.

Unfortunately dump was unsuccessful. So all I have is the
backtrace, hand-typed from a screen shot of the console
window:

cpuid = 18
time = 1521986594
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00f2e132a0
vpanic() at vpanic+0x18d/frame 0xfe00f2e13300
panic() at panic+0x43/frame 0xfe00f2e13360
vm_page_free_prep() at vm_page_free_prep+0x174/frame 0xfe00f2e13390
vm_page_free_toq() at vm_page_free_toq+0x11/frame 0xfe00f2e133b0
unlock_and_deallocate() at unlock_and_deallocate+0xbb/frame 0xfe00f2e133d0
vm_fault_hold() at vm_fault_hold+0x1d04/frame 0xfe00f2e13500
proc_rwmem() at proc_rwmem+0x8d/frame 0xfe00f2e13570
proc_readmem() at proc_readmem+0x46/frame 0xfe00f2e135d0
get_proc_vector() at get_proc_vector+0x16e/frame 0xfe00f2e13660
proc_getauxv() at proc_getauxv+0x26/frame 0xfe00f2e136a0
elf64_note_procstat_auxv() at elf64_note_procstat_auxv+0x1ee/frame 
0xfe00f2e136f0
elf64_coredump() at elf64_coredump+0x57c7/frame 0xfe00f2e137c0
sigexit() at sigexit+0x76f/frame 0xfe00f2e139b0
postsig() at postsig+0x289/frame 0xfe00f2e13a70
ast() at ast+0x357/frame 0xfe00f2e13ab0
doreti_ast() at doreti_ast+0x1f/frame 0x706d6f6320432041
KDB: enter: panic
[ thread pid 61836 tid 101063 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why


The Hyper-V/Ryzen-Threadripper context was/is:

FreeBSD 12.0-CURRENT  r331499M amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 
6.0.0)
SRAT: Ignoring memory at addr 0x1b2820
VT(vga): text 80x25
Hyper-V Version: 10.0.16299 [SP0]
  
Features=0x2e7f
  PM Features=0x0 [C2]
  Features3=0xbed7b2
Timecounter "Hyper-V" frequency 1000 Hz quality 2000
CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.73-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  
Features=0x1783fbff
  
Features2=0xfed83203
  AMD Features=0x2e500800
  AMD Features2=0x3f3
  Structured Extended 
Features=0x201c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x4
Hypervisor: Origin = "Microsoft Hv"
real memory  = 115964116992 (110592 MB)
avail memory = 112847249408 (107619 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 29 CPUs
FreeBSD/SMP: 1 package(s) x 29 core(s)

(I leave 3 hardware threads and some of the 128 GiBytes
of memory for Windows 10 Pro x64.)

FreeBSD and its swap are directly on NVMe SSDs, not in
NTFS file(s).


The M in -r331499M is for powerpc64/powerpc/arm64/armv7
related experiments, not amd64:

# svnlite status /usr/src/ | sort
?   /usr/src/nohup.out
?   /usr/src/sys/amd64/conf/GENERIC-DBG
?   /usr/src/sys/amd64/conf/GENERIC-NODBG
?   /usr/src/sys/arm/conf/GENERIC-DBG
?   /usr/src/sys/arm/conf/GENERIC-NODBG
?   /usr/src/sys/arm64/conf/GENERIC-DBG
?   /usr/src/sys/arm64/conf/GENERIC-NODBG
?   /usr/src/sys/dts/arm/a83t.dtsi
?   /usr/src/sys/dts/arm/sinovoip-bpi-m3.dts
?   /usr/src/sys/dts/arm/sun8i-a83t-sinovoip-bpi-m3.dts
?   /usr/src/sys/dts/arm/sun8i-a83t.dtsi
?   /usr/src/sys/powerpc/conf/GENERIC64vtsc-DBG
?   /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODBG
?   /usr/src/sys/powerpc/conf/GENERICvtsc-DBG
?   /usr/src/sys/powerpc/conf/GENERICvtsc-NODBG
M   /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
M   /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp
M   /usr/src/crypto/openssl/crypto/armcap.c
M   /usr/src/lib/libkvm/kvm_powerpc.c
M   /usr/src/lib/libkvm/kvm_private.c
M   /usr/src/stand/defs.mk
M   /usr/src/stand/powerpc/boot1.chrp/Makefile
M   /usr/src/stand/powerpc/kboot/Makefile
M   /usr/src/sys/arm64/arm64/identcpu.c
M   /usr/src/sys/conf/kmod.mk
M   /usr/src/sys/conf/ldscript.powerpc
M   /usr/src/sys/kern/subr_pcpu.c
M   /usr/src/sys/modules/dtb/allwinner/Makefile
M   /usr/src/sys/powerpc/aim/mmu_oea64.c
M