On Aug 2, 2014, at 6:55 AM, Chris Ferebee via smartos-discuss
<[email protected]> wrote:
> Robert,
>
> The script you suggested outputs the following, with the server mostly idle
> and one OS X client spinning in the Finder, doing its thing to cause the
> flood of getcwd() calls.
>
> # dtrace -n 'fbt::dnlc_reverse_lookup:entry{ self->t = timestamp; }' -n
> 'fbt::dnlc_reverse_lookup:return/self->t/{ @[arg1 == 0 ? 0 : 1] =
> quantize(timestamp - self->t); self->t = 0; }'
> dtrace: description 'fbt::dnlc_reverse_lookup:entry' matched 1 probe
> dtrace: description 'fbt::dnlc_reverse_lookup:return' matched 1 probe
>
> ^C
>
> 1
> value ------------- Distribution ------------- count
> 1048576 | 0
> 2097152 |@@@@@@@@ 240
> 4194304 | 8
> 8388608 |@@@@@@@@@@@@@@@@@@@@@ 636
> 16777216 |@@@@@@@@@@@ 348
> 33554432 | 0
>
> 0
> value ------------- Distribution ------------- count
> 8388608 | 0
> 16777216 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 971
> 33554432 | 2
> 67108864 | 0
>
>
> I don't yet understand how this relates to the results from Ralph's afp.d
> Dtrace script, which shows getcwd() taking either ~ 50 us or 250 ms, with a
> 3:1 bimodal distribution. Does getcwd() call dnlc_reverse_lookup always, or
> just sometimes?
>
> How might the size of the dnlc affect things? I understand it can be adjusted
> by putting
>
> set ncsize=nnn
>
> into /etc/system, but I have also seen suggestions that it can be dynamically
> tuned like this:
>
> # echo dnlc_max_nentries/W0t10485760 | mdb -kw
> # echo ncsize/W0t5240000 | mdb -kw
>
> (Is that safe or advisable? Does it work? I'm hesitant to try, but it would
> be convenient not to reboot the server, not to mention the difficulty of
> modifying /etc/system on SmartOS.)
>
> Currently, ncsize is set to 444,719 (SmartOS default?), and dnlc_nentries
> seems to slowly fluctuate between 200,000 and 300,000.
>
> # echo ncsize/D | mdb -kw
> ncsize: 444719 # and similarly for
> dnlc_nentries: 272139
> dnlc_max_nentries: 889438
> dnlc_nentries_low_water: 440316
>
> Dnlc cache hits seem to be around 50% per
>
> # vmstat -s
> 5079608454 total name lookups (cache hits 50%)
>
> and
>
> # kstat -n dnlcstats
> module: unix instance: 0
> name: dnlcstats class: misc
> crtime 8,600642349
> dir_add_abort 0
> dir_add_max 0
> dir_add_no_memory 0
> dir_cached_current 2
> dir_cached_total 2
> dir_entries_cached_current 1094
> dir_fini_purge 0
> dir_hits 0
> dir_misses 19498
> dir_reclaim_any 0
> dir_reclaim_last 0
> dir_remove_entry_fail 0
> dir_remove_space_fail 0
> dir_start_no_memory 0
> dir_update_fail 0
> double_enters 14
> enters 1858063
> hits 2546253549
> misses 2531297654
> negative_cache_hits 22475516
> pick_free 0
> pick_heuristic 995275
> pick_last 3997
pick_heuristic and pick_last are the import indicators that your DLNC could be
too small. Ideally, these are 0.
Somewhere around here I've got a dnlcstat... there is one in the dtrace toolkit,
but you can also do one using just kstats instead.
-- richard
> purge_all 0
> purge_fs1 0
> purge_total_entries 1786
> purge_vfs 44
> purge_vp 67
> snaptime 3016218,120526680
>
> Thanks for insights so far!
>
> Best,
> Chris
>
>
> Am 01.08.2014 um 17:07 schrieb Robert Mustacchi via smartos-discuss
> <[email protected]>:
>
>> On 07/31/2014 11:15 PM, Chris Ferebee via smartos-discuss wrote:
>>> Hi Robert,
>>>
>>> The flame graph is here:
>>>
>>> <http://fere.be/p/netatalk-flamegraph.svg>
>>>
>>> with corresponding output from Ralph's Dtrace script here:
>>>
>>> <http://fere.be/p/netatalk.afp.d.out>
>>>
>>> The raw kernel Dtrace output of
>>>
>>> dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); }
>>> tick-60s { exit(0); }' -o netatalk.kern_stacks-dtrace.out
>>>
>>> is here:
>>>
>>> <http://fere.be/p/netatalk.kern_stacks-dtrace.out>
>>>
>>> So if I understand things up to this point, the kernel is spending a lot of
>>> time in dnlc_reverse_lookup. Does that tell you anything?
>>
>> Hey Chris,
>>
>> I'm not terribly familiar with how the dnlc is designed. So what'd be
>> interesting here is to correlate this with the number of hits versus
>> misses in the dnlc.
>>
>> The questions I want to answer are something along the lines of:
>>
>> o Are we spending a lot of time in the dnlc just to then miss?
>> o Are we spending a lot of time in the dnlc scanning it because it's
>> just really large and we're not hashing across it well?
>> o Is something else entirely going on?
>>
>> Of course, the comment of the function has the following:
>>
>> /*
>> * Perform a reverse lookup in the DNLC. This will find the first
>> occurrence of
>> * the vnode. If successful, it will return the vnode of the parent,
>> and the
>> * name of the entry in the given buffer. If it cannot be found, or the
>> buffer
>> * is too small, then it will return NULL. Note that this is a highly
>> * inefficient function, since the DNLC is constructed solely for forward
>> * lookups.
>> */
>>
>> Which doesn't bode well. But perhaps, what I might go and do is
>> something like:
>>
>> dtrace -n 'fbt::dnlc_reverse_lookup:entry{ self->t = timestamp; }' -n
>> 'fbt::dnlc_reverse_lookup:return/self->t/{ @[arg1 == 0 ? 0 : 1] =
>> quantize(timestamp - self->t); self->t = 0; }'
>>
>> Robert
>>
>>>> Am 31.07.2014 um 17:29 schrieb Robert Mustacchi <[email protected]>:
>>>>
>>>>> On 07/31/2014 08:15 AM, Chris Ferebee via smartos-discuss wrote:
>>>>> I'm trying to debug performance issues on a SmartOS
>>>>> joyent_20140613T024634Z server that is running netatalk in a zone.
>>>>>
>>>>> After some investigation with the help of netatalk developer Ralph Böhme,
>>>>> I find that the netatalk afpd is spending a very long time in the syscall
>>>>>
>>>>> getcwd()
>>>>>
>>>>> in some cases. Ralph provided a Dtrace script which breaks down the time
>>>>> spent in various syscalls, and getcwd() seems to be the major bottleneck
>>>>> here. I've attached an excerpt of the Dtrace analysis below.
>>>>>
>>>>> Is there any reason why getcwd() should take a long time? Disk I/O does
>>>>> not appear to be an issue. For reference, I got similar results before
>>>>> and after starting a zpool scrub on the zones pool, and the scrub is
>>>>> proceeding at 940 MB/s.
>>>>>
>>>>> Further background on the issue is discussed here:
>>>>>
>>>>> <http://sourceforge.net/p/netatalk/mailman/message/32660961/>
>>>>>
>>>>> and the full output of the Dtrace script is here:
>>>>>
>>>>> <https://gist.github.com/ferebee/99016ccd293b3f616c77>
>>>>>
>>>>> All sugestions are most welcome!
>>>>
>>>> Hey Chris,
>>>>
>>>> I'd suggest gathering a kernel flame graph while you're running the
>>>> test. It's not immediately obvious to me why getcwd() would be hot, but
>>>> what we're seeing is that with getcwd() you have a bimodal distribution.
>>>> So you have the ~100 entries that are taking about 250ms (I'm fairly
>>>> certain I read the script correctly and it saves everything in us).
>>>>
>>>> It also appears that the kernel caches the value of the current working
>>>> directory, therefore it might be the case that these longer ones are
>>>> when you have that cache miss because the working directory changed.
>>>>
>>>> What I might suggest as a next step is making a kernel flame graph to
>>>> correspond with this workload and run it concurrently with the script.
>>>> That might allow us to better understand what's going on. You can
>>>> generate a flame graph with https://github.com/brendangregg/flamegraph.
>>>>
>>>> Robert
>>>
>>>
>>>
>>> -------------------------------------------
>>> smartos-discuss
>>> Archives: https://www.listbox.com/member/archive/184463/=now
>>> RSS Feed:
>>> https://www.listbox.com/member/archive/rss/184463/21483261-4b78dd38
>>> Modify Your Subscription: https://www.listbox.com/member/?&
>>> Powered by Listbox: http://www.listbox.com
>>>
>>
>>
>>
>> -------------------------------------------
>> smartos-discuss
>> Archives: https://www.listbox.com/member/archive/184463/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/184463/24804823-eebbfb1e
>> Modify Your Subscription: https://www.listbox.com/member/?&
>> Powered by Listbox: http://www.listbox.com
>
>
>
> -------------------------------------------
> smartos-discuss
> Archives: https://www.listbox.com/member/archive/184463/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/184463/21953302-fd56db47
> Modify Your Subscription: https://www.listbox.com/member/?&
> Powered by Listbox: http://www.listbox.com
-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription:
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com