On 2019-Sep-27, at 12:24, Mark Johnston <markj at FreeBSD.org> wrote:

> On Thu, Sep 26, 2019 at 08:37:39PM -0700, Mark Millard wrote:
>> 
>> 
>> On 2019-Sep-26, at 17:05, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> On 2019-Sep-26, at 13:29, Mark Johnston <markj at FreeBSD.org> wrote:
>>>> One possibility is that these are kernel memory allocations occurring in
>>>> the context of the benchmark threads.  Such allocations may not respect
>>>> the configured policy since they are not private to the allocating
>>>> thread.  For instance, upon opening a file, the kernel may allocate a
>>>> vnode structure for that file.  That vnode may be accessed by threads
>>>> from many processes over its lifetime, and may be recycled many times
>>>> before its memory is released back to the allocator.
>>> 
>>> For -l0-15 -n prefer:1 :
>>> 
>>> Looks like this reports sys_thr_new activity, sys_cpuset
>>> activity, and 0xffffffff80bc09bd activity (whatever that
>>> is). Mostly sys_thr_new activity, over 1300 of them . . .
>>> 
>>> dtrace: pid 13553 has exited
>>> 
>>> 
>>>             kernel`uma_small_alloc+0x61
>>>             kernel`keg_alloc_slab+0x10b
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`thread_init+0x22
>>>             kernel`keg_alloc_slab+0x259
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`thread_alloc+0x23
>>>             kernel`thread_create+0x13a
>>>             kernel`sys_thr_new+0xd2
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>               2
>>> 
>>>             kernel`uma_small_alloc+0x61
>>>             kernel`keg_alloc_slab+0x10b
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`cpuset_setproc+0x65
>>>             kernel`sys_cpuset+0x123
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>               2
>>> 
>>>             kernel`uma_small_alloc+0x61
>>>             kernel`keg_alloc_slab+0x10b
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`uma_zfree_arg+0x36a
>>>             kernel`thread_reap+0x106
>>>             kernel`thread_alloc+0xf
>>>             kernel`thread_create+0x13a
>>>             kernel`sys_thr_new+0xd2
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>               6
>>> 
>>>             kernel`uma_small_alloc+0x61
>>>             kernel`keg_alloc_slab+0x10b
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`uma_zfree_arg+0x36a
>>>             kernel`vm_map_process_deferred+0x8c
>>>             kernel`vm_map_remove+0x11d
>>>             kernel`vmspace_exit+0xd3
>>>             kernel`exit1+0x5a9
>>>             kernel`0xffffffff80bc09bd
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>               6
>>> 
>>>             kernel`uma_small_alloc+0x61
>>>             kernel`keg_alloc_slab+0x10b
>>>             kernel`zone_import+0x1d2
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`thread_alloc+0x23
>>>             kernel`thread_create+0x13a
>>>             kernel`sys_thr_new+0xd2
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>              22
>>> 
>>>             kernel`vm_page_grab_pages+0x1b4
>>>             kernel`vm_thread_stack_create+0xc0
>>>             kernel`kstack_import+0x52
>>>             kernel`uma_zalloc_arg+0x62b
>>>             kernel`vm_thread_new+0x4d
>>>             kernel`thread_alloc+0x31
>>>             kernel`thread_create+0x13a
>>>             kernel`sys_thr_new+0xd2
>>>             kernel`amd64_syscall+0x3ae
>>>             kernel`0xffffffff811b7600
>>>            1324
>> 
>> With sys_thr_new not respecting -n prefer:1 for
>> -l0-15 (especially for the thread stacks), I
>> looked some at the generated integration kernel
>> code and it makes significant use of %rsp based
>> memory accesses (read and write).
>> 
>> That would get both memory controllers going in
>> parallel (kernel vectors accesses to the preferred
>> memory domain), so not slowing down as expected.
>> 
>> If round-robin is not respected for thread stacks,
>> and if threads migrate cpus across memory domains
>> at times, there could be considerable variability
>> for that context as well. (This may not be the
>> only way to have different/extra variability for
>> this context.)
>> 
>> Overall: I'd be surprised if this was not
>> contributing to what I thought was odd about
>> the benchmark results.
> 
> Your tracing refers to kernel thread stacks though, not the stacks used
> by threads when executing in user mode.  My understanding is that a HINT
> implementation would spend virtually all of its time in user mode, so it
> shouldn't matter much or at all if kernel thread stacks are backed by
> memory from the "wrong" domain.

Looks like I was trying to think about it when I should have been sleeping.
You are correct.

> This also doesn't really explain some of the disparities in the plots
> you sent me.  For instance, you get a much higher peak QUIS on FreeBSD
> than on Fedora with 16 threads and an interleave/round-robin domain
> selection policy.

True. One possibility is that steady_clock's now() returns odd results in this
kind of context, so that the measured durations come out on the short side in
the cases where the results look different.
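
A rough way to look for that kind of oddity is to take back-to-back
steady_clock::now() readings and summarize the steps between them. The sketch
below is only an illustration, not code from the benchmark: ClockCheck and
check_steady_clock are hypothetical names, and the sample count is arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the benchmark): summary of the steps
// between consecutive steady_clock::now() readings.
struct ClockCheck {
    std::int64_t min_nonzero_ns;  // smallest positive step seen, 0 if none
    std::int64_t max_ns;          // largest step seen
    std::size_t backward_steps;   // negative steps (expected to be 0)
};

ClockCheck check_steady_clock(std::size_t samples) {
    assert(samples > 0);
    using clock = std::chrono::steady_clock;
    ClockCheck r{0, 0, 0};
    auto prev = clock::now();
    for (std::size_t i = 0; i < samples; ++i) {
        auto cur = clock::now();
        std::int64_t d =
            std::chrono::duration_cast<std::chrono::nanoseconds>(cur - prev)
                .count();
        if (d < 0) {
            ++r.backward_steps;  // a steady clock should never go backwards
        }
        if (d > 0 && (r.min_nonzero_ns == 0 || d < r.min_nonzero_ns)) {
            r.min_nonzero_ns = d;  // rough estimate of usable granularity
        }
        if (d > r.max_ns) {
            r.max_ns = d;  // large values hint at preemption or migration
        }
        prev = cur;
    }
    return r;
}
```

Calling check_steady_clock(1000000) on each system, pinned the same way as the
benchmark, and comparing min_nonzero_ns and max_ns would at least show whether
the clock behaves very differently between the two.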

But the left hand side of the single-thread results (smaller memory sizes for
the vectors that the integration kernel uses) does not show such a rescaling.
(The single-thread time measurements are taken strictly inside the thread of
execution: no thread creation or the like is counted for any size.) The right
hand side of the single-thread results (larger memory use, making the smaller
cache levels fairly ineffective) does generally show some rescaling, but not
as drastic as in the multi-threaded results.

Both round-robin and prefer:1 showed this for the single-threaded runs.
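
To make "strictly inside the thread of execution" concrete, here is a minimal
sketch that times the same work two ways: inside the worker thread, and from
outside it so that thread creation and join are included. This is not the
benchmark's code. sum_vector is a hypothetical stand-in for the integration
kernel, and Timing and time_in_thread are names made up for the illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for the integration kernel's work.
double sum_vector(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

struct Timing {
    std::chrono::nanoseconds inside;   // measured within the worker thread
    std::chrono::nanoseconds outside;  // also covers thread creation and join
    double result;                     // keeps the work from being optimized out
};

Timing time_in_thread(std::size_t n) {
    assert(n > 0);
    using clock = std::chrono::steady_clock;
    std::vector<double> data(n, 1.0);
    Timing t{};

    auto outer_start = clock::now();
    std::thread worker([&] {
        // Only the work itself falls between these two readings.
        auto start = clock::now();
        t.result = sum_vector(data);
        t.inside = std::chrono::duration_cast<std::chrono::nanoseconds>(
            clock::now() - start);
    });
    worker.join();
    t.outside = std::chrono::duration_cast<std::chrono::nanoseconds>(
        clock::now() - outer_start);
    return t;
}
```

For small n the gap between outside and inside is dominated by thread
creation, which is why a measurement taken inside the thread should not be
affected by where the kernel thread stack was allocated.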

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net <http://dsl-only.net/> went
away in early 2018-Mar)

