Hey Samuel,
These:
osd op threads = 8
osd disk threads = 2
filestore op threads = 8
They increased performance on my test-cluster, but seemed to have the
opposite effect on the much more heavily loaded production-environment.
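
In ceph.conf terms these would normally sit in the [osd] section, roughly
like this (just a sketch of the syntax, with the values listed above):

    [osd]
        osd op threads = 8
        osd disk threads = 2
        filestore op threads = 8
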
Regards,
Oliver
On 27-08-13 16:37, Samuel Just wrote:
> What options were you using?
> -Sam
>
> On Tue, Aug 27, 2013 at 7:35 AM, Oliver Daudey <[email protected]> wrote:
>> Hey Ian, Samuel,
>>
>> FYI: I still had some attempted optimization-options in place on the
>> production-cluster, which might have skewed my results a bit. OSD
>> version 0.67.2-16-geeb1f86 seems to be a lot less hard on the CPU in the
>> configuration I used for all the other tests. I haven't yet sufficiently
>> verified whether this is accompanied by a speed-increase as well. On the
>> test-cluster, I didn't see any difference in speed, but that may not
>> mean much, as the load-pattern on production is totally different.
>> Sorry for that mixup.
>>
>> Updated `perf top'-output, extra options removed, under current load,
>> which should be higher than in my previous mail:
>>  18.08%  [kernel]              [k] intel_idle
>>   5.87%  [kernel]              [k] find_busiest_group
>>   4.92%  kvm                   [.] 0xcefe2
>>   3.24%  [kernel]              [k] native_write_msr_safe
>>   2.92%  [kernel]              [k] default_send_IPI_mask_sequence_phys
>>   2.66%  [kernel]              [k] _raw_spin_lock
>>   1.50%  [kernel]              [k] native_write_cr0
>>   1.36%  libleveldb.so.1.9     [.] 0x3cebc
>>   1.27%  [kernel]              [k] __hrtimer_start_range_ns
>>   1.17%  [kernel]              [k] hrtimer_interrupt
>>   1.10%  libc-2.11.3.so        [.] memcmp
>>   1.07%  [kernel]              [k] apic_timer_interrupt
>>   1.00%  [kernel]              [k] find_next_bit
>>   0.99%  [kernel]              [k] cpumask_next_and
>>   0.99%  [kernel]              [k] __schedule
>>   0.97%  [kernel]              [k] clockevents_program_event
>>   0.97%  [kernel]              [k] _raw_spin_unlock_irqrestore
>>   0.90%  [kernel]              [k] fget_light
>>   0.85%  [kernel]              [k] do_select
>>   0.84%  [kernel]              [k] reschedule_interrupt
>>   0.83%  [kvm_intel]           [k] vmx_vcpu_run
>>   0.79%  [kernel]              [k] _raw_spin_lock_irqsave
>>   0.78%  [kernel]              [k] try_to_wake_up
>>   0.70%  libc-2.11.3.so        [.] memcpy
>>   0.66%  [kernel]              [k] copy_user_generic_string
>>   0.63%  [kernel]              [k] sync_inodes_sb
>>   0.61%  [kernel]              [k] load_balance
>>   0.61%  [kernel]              [k] tg_load_down
>>   0.56%  [kernel]              [k] irq_entries_start
>>   0.56%  libc-2.11.3.so        [.] 0x73612
>>   0.54%  libpthread-2.11.3.so  [.] pthread_mutex_lock
>>   0.51%  [kernel]              [k] rcu_needs_cpu
>>
>> Regards,
>>
>> Oliver
>>
>> On 27-08-13 16:04, Ian Colle wrote:
>>>
>>> On Aug 27, 2013, at 2:08, Oliver Daudey <[email protected]> wrote:
>>>
>>>> Hey Samuel,
>>>>
>>>> The "PGLog::check()" is no longer visible in profiling, so the patch
>>>> helped there. Unfortunately, it doesn't seem to have brought down the
>>>> OSD's CPU-loading much; leveldb still uses far more CPU than under
>>>> Cuttlefish. On my test-cluster, I didn't notice any difference in the
>>>> RBD bench-results either, so I have to assume it didn't help
>>>> performance much.
>>>>
>>>> Here's the `perf top' output I took just now on my production-cluster
>>>> with your new version, under regular load. Note also the "memcmp" and
>>>> "memcpy", which don't show up when running a Cuttlefish-OSD:
>>>
>>> memcpy is in fact also present in your Cuttlefish OSD, just a bit further
>>> down the list (it increased from 0.7% to 1.4%).
>>>
>>> memcmp definitely looks suspicious and is something we're looking into.
>>>
>>>>  15.65%  [kernel]              [k] intel_idle
>>>>   7.20%  libleveldb.so.1.9     [.] 0x3ceae
>>>>   6.28%  libc-2.11.3.so        [.] memcmp
>>>>   5.22%  [kernel]              [k] find_busiest_group
>>>>   3.92%  kvm                   [.] 0x2cf006
>>>>   2.40%  libleveldb.so.1.9     [.] leveldb::InternalKeyComparator::Compar
>>>>   1.95%  [kernel]              [k] _raw_spin_lock
>>>>   1.69%  [kernel]              [k] default_send_IPI_mask_sequence_phys
>>>>   1.46%  libc-2.11.3.so        [.] memcpy
>>>>   1.17%  libleveldb.so.1.9     [.] leveldb::Block::Iter::Next()
>>>>   1.16%  [kernel]              [k] hrtimer_interrupt
>>>>   1.07%  [kernel]              [k] native_write_cr0
>>>>   1.01%  [kernel]              [k] __hrtimer_start_range_ns
>>>>   1.00%  [kernel]              [k] clockevents_program_event
>>>>   0.93%  [kernel]              [k] find_next_bit
>>>>   0.93%  libstdc++.so.6.0.13   [.] std::string::_M_mutate(unsigned long,
>>>>   0.89%  [kernel]              [k] cpumask_next_and
>>>>   0.87%  [kernel]              [k] __schedule
>>>>   0.85%  [kernel]              [k] _raw_spin_unlock_irqrestore
>>>>   0.85%  [kernel]              [k] do_select
>>>>   0.84%  [kernel]              [k] apic_timer_interrupt
>>>>   0.80%  [kernel]              [k] fget_light
>>>>   0.79%  [kernel]              [k] native_write_msr_safe
>>>>   0.76%  [kernel]              [k] _raw_spin_lock_irqsave
>>>>   0.66%  libc-2.11.3.so        [.] 0xdc6d8
>>>>   0.61%  libpthread-2.11.3.so  [.] pthread_mutex_lock
>>>>   0.61%  [kernel]              [k] tg_load_down
>>>>   0.59%  [kernel]              [k] reschedule_interrupt
>>>>   0.59%  libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*,
>>>>   0.56%  libstdc++.so.6.0.13   [.] std::string::append(char const*, unsig
>>>>   0.54%  [kvm_intel]           [k] vmx_vcpu_run
>>>>   0.53%  [kernel]              [k] copy_user_generic_string
>>>>   0.53%  [kernel]              [k] load_balance
>>>>   0.50%  [kernel]              [k] rcu_needs_cpu
>>>>   0.45%  [kernel]              [k] fput
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Oliver
>>>>
>>>> On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
>>>>> I just pushed a patch to wip-dumpling-log-assert (based on current
>>>>> dumpling head). I had disabled most of the code in PGLog::check() but
>>>>> left an (I thought) innocuous assert. It seems that with (at least)
>>>>> g++ 4.6.3, stl list::size() is linear in the size of the list, so that
>>>>> assert actually traverses the pg log on each operation. The patch in
>>>>> wip-dumpling-log-assert should disable that assert as well by default.
>>>>> Let me know if it helps.
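>>>>>
>>>>> (As an illustration only -- a minimal stand-alone sketch, not the
>>>>> actual Ceph code, with made-up names -- of why an assert on
>>>>> list::size() is so costly with that libstdc++:)
>>>>>
>>>>>     #include <cassert>
>>>>>     #include <cstddef>
>>>>>     #include <list>
>>>>>
>>>>>     struct LogEntry {};  // stand-in for a pg log entry
>>>>>
>>>>>     void check_log(const std::list<LogEntry> &log, std::size_t expected) {
>>>>>       // With g++ 4.6.3's libstdc++, size() is O(n): it walks every
>>>>>       // node, so an assert that runs on each op re-traverses the
>>>>>       // whole pg log every time.
>>>>>       assert(log.size() == expected);
>>>>>       // O(1) alternatives: keep a separate counter, or only test
>>>>>       // emptiness, e.g. assert(expected != 0 || log.empty());
>>>>>     }
>>>>>
>>>>>     int main() {
>>>>>       std::list<LogEntry> log(10000);
>>>>>       check_log(log, 10000);  // each call costs a full traversal here
>>>>>     }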
>>>>>
>>>>> It should be built within an hour of this email.
>>>>> -Sam
>>>>>
>>>>> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson
>>>>> <[email protected]> wrote:
>>>>>> Hi Guys,
>>>>>>
>>>>>> I'm having the same problem as Oliver with 0.67.2. CPU usage is around
>>>>>> double that of the 0.61.8 OSDs in the same cluster, which appears to be
>>>>>> causing the performance decrease.
>>>>>>
>>>>>> I did a perf comparison (not sure if I did it right, but it seems ok).
>>>>>> Both hosts are the same spec running Ubuntu 12.04.1 (3.2 kernel), journal
>>>>>> and OSD data are on an SSD, the OSDs are in the same pool with the same
>>>>>> weight, and the perf tests were run at the same time on a real-world load
>>>>>> consisting of RBD traffic only.
>>>>>>
>>>>>> Dumpling -
>>>>>>
>>>>>> Events: 332K cycles
>>>>>>  17.93%  ceph-osd  libc-2.15.so          [.] 0x15d523
>>>>>>  17.03%  ceph-osd  ceph-osd              [.] 0x5c2897
>>>>>>   4.66%  ceph-osd  ceph-osd              [.] leveldb::InternalKeyComparator::Compare(leveldb::Slice const&, level
>>>>>>   3.46%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Next()
>>>>>>   2.70%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
>>>>>>   2.60%  ceph-osd  ceph-osd              [.] PGLog::check()
>>>>>>   2.57%  ceph-osd  [kernel.kallsyms]     [k] __ticket_spin_lock
>>>>>>   2.49%  ceph-osd  ceph-osd              [.] ceph_crc32c_le_intel
>>>>>>   1.93%  ceph-osd  libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*, char*)
>>>>>>   1.53%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::append(char const*, unsigned long)
>>>>>>   1.47%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator new(unsigned long)
>>>>>>   1.33%  ceph-osd  [kernel.kallsyms]     [k] copy_user_generic_string
>>>>>>   0.98%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator delete(void*)
>>>>>>   0.90%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::assign(char const*, unsigned long)
>>>>>>   0.75%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_replace_safe(unsigned long, unsigned long, char cons
>>>>>>   0.58%  ceph-osd  [kernel.kallsyms]     [k] wait_sb_inodes
>>>>>>   0.55%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Valid() const
>>>>>>   0.51%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::
>>>>>>   0.50%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::CentralFreeList::FetchFromSpans()
>>>>>>   0.47%  ceph-osd  libstdc++.so.6.0.16   [.] 0x9ebc8
>>>>>>   0.46%  ceph-osd  libc-2.15.so          [.] vfprintf
>>>>>>   0.45%  ceph-osd  [kernel.kallsyms]     [k] find_busiest_group
>>>>>>   0.45%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::resize(unsigned long, char)
>>>>>>   0.43%  ceph-osd  libpthread-2.15.so    [.] pthread_mutex_unlock
>>>>>>   0.41%  ceph-osd  [kernel.kallsyms]     [k] iput_final
>>>>>>   0.40%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Seek(leveldb::Slice const&)
>>>>>>   0.39%  ceph-osd  libc-2.15.so          [.] _IO_vfscanf
>>>>>>   0.39%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::key() const
>>>>>>   0.39%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>>>>>>   0.37%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_ostream<char, std::char_traits<char> >& std::__ostream_in
>>>>>>
>>>>>>
>>>>>> Cuttlefish -
>>>>>>
>>>>>> Events: 160K cycles
>>>>>>   7.53%  ceph-osd  [kernel.kallsyms]     [k] __ticket_spin_lock
>>>>>>   6.26%  ceph-osd  libc-2.15.so          [.] 0x89115
>>>>>>   3.06%  ceph-osd  ceph-osd              [.] ceph_crc32c_le
>>>>>>   2.66%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator new(unsigned long)
>>>>>>   2.46%  ceph-osd  [kernel.kallsyms]     [k] find_busiest_group
>>>>>>   1.80%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator delete(void*)
>>>>>>   1.42%  ceph-osd  [kernel.kallsyms]     [k] try_to_wake_up
>>>>>>   1.27%  ceph-osd  ceph-osd              [.] 0x531fb6
>>>>>>   1.21%  ceph-osd  libstdc++.so.6.0.16   [.] 0x9ebc8
>>>>>>   1.14%  ceph-osd  [kernel.kallsyms]     [k] wait_sb_inodes
>>>>>>   1.02%  ceph-osd  libc-2.15.so          [.] _IO_vfscanf
>>>>>>   1.01%  ceph-osd  [kernel.kallsyms]     [k] update_shares
>>>>>>   0.98%  ceph-osd  [kernel.kallsyms]     [k] filemap_fdatawait_range
>>>>>>   0.90%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_ostream<char, std::char_traits<char> >& std
>>>>>>   0.89%  ceph-osd  [kernel.kallsyms]     [k] iput_final
>>>>>>   0.79%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_string<char, std::char_traits<char>, std::a
>>>>>>   0.79%  ceph-osd  [kernel.kallsyms]     [k] copy_user_generic_string
>>>>>>   0.78%  ceph-osd  libc-2.15.so          [.] vfprintf
>>>>>>   0.70%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc:
>>>>>>   0.69%  ceph-osd  [kernel.kallsyms]     [k] __d_lookup_rcu
>>>>>>   0.69%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::CentralFreeList::FetchFromSpans()
>>>>>>   0.66%  ceph-osd  [kernel.kallsyms]     [k] igrab
>>>>>>   0.63%  ceph-osd  [kernel.kallsyms]     [k] update_cfs_load
>>>>>>   0.63%  ceph-osd  [kernel.kallsyms]     [k] link_path_walk
>>>>>>
>>>>>> If you'd like some more tests run, just let me know; I'm more than happy
>>>>>> to help.
>>>>>>
>>>>>> Thanks
>>>>>> -Matt
>>>>
>>>>
>>>
>>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com