On Thu, 13 Nov 2014, Alexandre DERUMIER wrote:
> >>I think we need to figure out why so much time is being spent
> >>mallocing/freeing memory. Got to get those symbols resolved!
>
> Ok, I don't know why, but if I remove all the ceph -dbg packages, I'm seeing
> the rbd and rados symbols now...
>
> I have updated the files:
>
> http://odisoweb1.odiso.net/cephperf/perf-librbd/report.txt
Ran it through c++filt:
https://gist.github.com/88ba9409f5d201b957a1
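FWIW, the demangling can be done straight on the saved text report or
piped inline while generating it (file names here are just examples,
not the exact ones from the link above):

  c++filt < report.txt > report-demangled.txt
  perf report --stdio | c++filt | less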
I'm a bit surprised by some of the items near the top
(bufferlist.clear() callers). I'm sure several of those can be
streamlined to avoid temporary bufferlists. I don't see any super
egregious users of the allocator, though.
The memcpy callers might be a good place to start...
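If it helps, the call chains for individual symbols can be pulled out of
the same perf.data like this (symbol names taken from the report below;
adjust them if they differ on your build):

  perf report --stdio -S __memcpy_ssse3 | c++filt | less
  perf report --stdio -S malloc,free,_int_malloc | c++filt | less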
sage
>
>
>
>
> ----- Original Message -----
>
> From: "Mark Nelson" <[email protected]>
> To: "Alexandre DERUMIER" <[email protected]>, "Ceph Devel"
> <[email protected]>
> Cc: "Mark Nelson" <[email protected]>, "Sage Weil" <[email protected]>,
> "Somnath Roy" <[email protected]>
> Sent: Thursday, 13 November 2014 15:20:40
> Subject: Re: client cpu usage : krbd vs librbd perf report
>
> On 11/13/2014 05:15 AM, Alexandre DERUMIER wrote:
> > Hi,
> >
> > I have redone the perf capture with dwarf call graphs:
> >
> > perf record -g --call-graph dwarf -a -F 99 -- sleep 60
> >
> > I have put perf reports, ceph conf, fio config here:
> >
> > http://odisoweb1.odiso.net/cephperf/
> >
> > test setup
> > -----------
> > client cpu config : 8 x Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz
> > ceph cluster : 3 nodes (same cpu as the client) with 2 osds each (intel ssd
> > s3500), test pool with replication x1
> > rbd volume size : 10G (almost all reads are served from the osd buffer cache)
> >
> > benchmark with fio 4k randread, with 1 rbd volume. (also tested with 20 rbd
> > volumes, the results are the same).
> > debian wheezy - kernel 3.17 - and ceph packages from master on gitbuilder
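> >
> > (The exact fio config is in the link above; a roughly equivalent
> > command-line invocation using fio's rbd ioengine, with the pool and
> > volume names as placeholders, would be something like:
> >
> > fio --name=rbd-4k-randread --ioengine=rbd --clientname=admin \
> >     --pool=test --rbdname=vol1 --rw=randread --bs=4k --iodepth=32 \
> >     --numjobs=1 --runtime=60 --time_based
> > )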
> >
> > (BTW, I have installed the librbd/rados dbg packages but I still have
> > missing symbols?)
>
> I think if you run perf report with verbose enabled it will tell you
> which symbols are missing:
>
> perf report -v 2>&1 | less
>
> If you have them but perf isn't detecting them properly, you can clean out
> the cache or even manually reassign the symbols, but it's annoying.
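>
> Something like this usually does it (the perf build-id cache lives
> under ~/.debug by default; the library paths below are just examples):
>
> perf buildid-list -i perf.data
> perf buildid-cache --remove /usr/lib/librbd.so.1.0.0
> perf buildid-cache --add /usr/lib/librbd.so.1.0.0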
>
> >
> >
> >
> > Global results:
> > ---------------
> > librbd : 60000 iops : 98% cpu
> > krbd : 90000 iops : 32% cpu
> >
> >
> > So, librbd cpu usage is about 4.5x that of krbd for the same io throughput.
> >
> > The difference seems quite huge, is it expected?
>
> This is kind of the wild west. With that many IOPS we are running into
> new bottlenecks. :)
>
> >
> >
> >
> >
> > librbd perf report:
> > -------------------------
> > top cpu usage
> > --------------
> > 25.71% fio libc-2.13.so
> > 17.69% fio librados.so.2.0.0
> > 12.38% fio librbd.so.1.0.0
> > 27.99% fio [kernel.kallsyms]
> > 4.19% fio libpthread-2.13.so
> >
> >
> > libc-2.13.so (it seems that malloc/free use a lot of cpu here)
> > ------------
> > 21.05%-- _int_malloc
> > 14.36%-- free
> > 13.66%-- malloc
> > 9.89%-- __lll_unlock_wake_private
> > 5.35%-- __clone
> > 4.38%-- __poll
> > 3.77%-- __memcpy_ssse3
> > 1.64%-- vfprintf
> > 1.02%-- arena_get2
> >
>
> I think we need to figure out why so much time is being spent
> mallocing/freeing memory. Got to get those symbols resolved!
>
> > fio [kernel.kallsyms] : seems to have a lot of futex functions here
> > -----------------------
> > 5.27%-- _raw_spin_lock
> > 3.88%-- futex_wake
> > 2.88%-- __switch_to
> > 2.74%-- system_call
> > 2.70%-- __schedule
> > 2.52%-- tcp_sendmsg
> > 2.47%-- futex_wait_setup
> > 2.28%-- _raw_spin_lock_irqsave
> > 2.16%-- idle_cpu
> > 1.66%-- enqueue_task_fair
> > 1.57%-- native_write_msr_safe
> > 1.49%-- hash_futex
> > 1.46%-- futex_wait
> > 1.40%-- reschedule_interrupt
> > 1.37%-- try_to_wake_up
> > 1.28%-- account_entity_enqueue
> > 1.25%-- copy_user_enhanced_fast_string
> > 1.25%-- futex_requeue
> > 1.24%-- __fget
> > 1.24%-- update_curr
> > 1.20%-- tcp_write_xmit
> > 1.14%-- wake_futex
> > 1.08%-- scheduler_ipi
> > 1.05%-- select_task_rq_fair
> > 1.01%-- dequeue_task_fair
> > 0.97%-- do_futex
> > 0.97%-- futex_wait_queue_me
> > 0.83%-- cpuacct_charge
> > 0.82%-- tcp_transmit_skb
> > ...
> >
> >
> > Regards,
> >
> > Alexandre
> >
> >
> >
> >