Am 16.02.2015 um 23:18 schrieb Alexandre DERUMIER: >>> Just tested write. This might be the result of higher CPU load of the >>> ceph-osd processes under firefly. >>> >>> Dumpling 180% per process vs. firefly 220% > > Oh yes, indeed, that's what I think too. (and more cpu -> less ios in qemu > because of single iothread) > > > I think that perf report should show the difference, this is strange that you > have almost same result in your perf reports.
Not that easy as you can't downgrade from firefly to dumpling. Here is perf output of firefly vs giant. While they have a difference of about 6k iops i do not see any differences in the perf output. This is the perf output of the osd side with firefly (27k iops): 5,44% libc-2.13.so [.] 0x1370b0 4,02% libtcmalloc.so.4.1.0 [.] operator new(unsigned long) 2,84% libtcmalloc.so.4.1.0 [.] operator delete(void*) 2,50% ceph-osd [.] 0x5aed30 1,40% libstdc++.so.6.0.17 [.] 0xbeacc 1,39% [kernel] [k] intel_idle 1,34% libleveldb.so.1.12 [.] 0x24673 1,30% libc-2.13.so [.] vfprintf 1,20% [kernel] [k] __schedule 1,00% [kernel] [k] __switch_to 0,99% libleveldb.so.1.12 [.] leveldb::SkipList<char const*, leveldb::MemTable::KeyComparator>::FindGreaterOrEqual(char const* 0,96% libpthread-2.13.so [.] pthread_mutex_trylock 0,86% [kernel] [k] _raw_spin_lock 0,83% libstdc++.so.6.0.17 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string 0,82% ceph-osd [.] ceph::buffer::list::append(char const*, unsigned int) 0,81% libsnappy.so.1.1.3 [.] snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int) 0,74% libpthread-2.13.so [.] __pthread_mutex_unlock_usercnt 0,73% ceph-osd [.] Mutex::Lock(bool) 0,72% [kernel] [k] __d_lookup_rcu 0,65% libtcmalloc.so.4.1.0 [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int 0,65% ceph-osd [.] ceph::buffer::ptr::release() 0,64% libleveldb.so.1.12 [.] leveldb::crc32c::Extend(unsigned int, char const*, unsigned long) 0,62% [kernel] [k] page_fault 0,58% ceph-osd [.] ceph::buffer::list::iterator::copy(unsigned int, char*) 0,55% [kernel] [k] link_path_walk 0,53% libstdc++.so.6.0.17 [.] std::string::append(char const*, unsigned long) 0,51% ceph-osd [.] ceph::buffer::list::iterator::advance(int) 0,46% ceph-osd [.] ceph::buffer::ptr::append(char const*, unsigned int) 0,45% libpthread-2.13.so [.] pthread_cond_wait@@GLIBC_2.3.2 0,43% libsnappy.so.1.1.3 [.] snappy::RawUncompress(snappy::Source*, char*) 0,43% libstdc++.so.6.0.17 [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long) 0,43% ceph-osd [.] ceph::buffer::list::append(ceph::buffer::ptr const&, unsigned int, unsigned int) 0,42% [kernel] [k] copy_user_enhanced_fast_string and this is giant having (33k iops): 5,65% libc-2.13.so [.] 0x1376a4 4,36% libtcmalloc.so.4.1.0 [.] operator new(unsigned long) 2,74% ceph-osd [.] 0x48011d 2,49% libtcmalloc.so.4.1.0 [.] operator delete(void*) 1,42% libleveldb.so.1.12 [.] 0x24673 1,27% libleveldb.so.1.12 [.] leveldb::SkipList<char const*, leveldb::MemTable::KeyComparator>::FindGreaterOrEqual(char cons 1,19% libstdc++.so.6.0.17 [.] 0xbfacb 0,99% libc-2.13.so [.] vfprintf 0,91% libsnappy.so.1.1.3 [.] snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int) 0,88% [kernel] [k] __schedule 0,88% libpthread-2.13.so [.] pthread_mutex_trylock 0,81% ceph-osd [.] ceph::buffer::list::append(char const*, unsigned int) 0,76% [kernel] [k] __d_lookup_rcu 0,76% [kernel] [k] __switch_to 0,74% ceph-osd [.] Mutex::Lock(bool) 0,74% libtcmalloc.so.4.1.0 [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, 0,72% [kernel] [k] intel_idle 0,72% ceph-osd [.] ceph::buffer::ptr::release() 0,70% [kernel] [k] _raw_spin_lock 0,66% libleveldb.so.1.12 [.] leveldb::crc32c::Extend(unsigned int, char const*, unsigned long) 0,66% ceph-osd [.] ceph::buffer::list::iterator::copy(unsigned int, char*) 0,64% libstdc++.so.6.0.17 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::stri 0,64% [kernel] [k] page_fault 0,59% [kernel] [k] link_path_walk 0,58% libstdc++.so.6.0.17 [.] std::string::append(char const*, unsigned long) 0,57% libpthread-2.13.so [.] __pthread_mutex_unlock_usercnt 0,56% ceph-osd [.] ceph::buffer::list::iterator::advance(int) 0,44% ceph-osd [.] ceph::buffer::ptr::append(char const*, unsigned int) Stefan > > > > ----- Mail original ----- > De: "Stefan Priebe" <[email protected]> > À: "aderumier" <[email protected]> > Cc: "Mark Nelson" <[email protected]>, "ceph-devel" > <[email protected]> > Envoyé: Lundi 16 Février 2015 23:08:37 > Objet: Re: speed decrease since firefly,giant,hammer the 2nd try > > Am 16.02.2015 um 23:02 schrieb Alexandre DERUMIER <[email protected]>: > >>>> This results in fio-rbd showing avg 26000 iop/s instead of 30500 iop/s >>>> while running dumpling... >> >> Is it for write only ? >> or do you see same decrease for read too > > Just tested write. This might be the result of higher CPU load of the > ceph-osd processes under firefly. > > Dumpling 180% per process vs. firefly 220% > > Stefan > >> ? >> >> >> ----- Mail original ----- >> De: "Stefan Priebe" <[email protected]> >> À: "Mark Nelson" <[email protected]>, "ceph-devel" >> <[email protected]> >> Envoyé: Lundi 16 Février 2015 22:22:01 >> Objet: Re: speed decrease since firefly,giant,hammer the 2nd try >> >> I've now upgraded server side and client side to latest upstream/firefly. >> >> This results in fio-rbd showing avg 26000 iop/s instead of 30500 iop/s >> while running dumpling... >> >> Greets, >> Stefan >>> Am 15.02.2015 um 19:40 schrieb Stefan Priebe: >>> Hi Mark, >>> >>> what's next? >>> >>> I've this test cluster only for 2 more days. >>> >>> Here some perf Details: >>> >>> dumpling: >>> 12,65% libc-2.13.so [.] 0x79000 >>> 2,86% libc-2.13.so [.] malloc >>> 2,80% kvm [.] 0xb59c5 >>> 2,59% libc-2.13.so [.] free >>> 2,35% [kernel] [k] __schedule >>> 2,16% [kernel] [k] _raw_spin_lock >>> 1,92% [kernel] [k] __switch_to >>> 1,58% [kernel] [k] lapic_next_deadline >>> 1,09% [kernel] [k] update_sd_lb_stats >>> 1,08% [kernel] [k] _raw_spin_lock_irqsave >>> 0,91% librados.so.2.0.0 [.] ceph_crc32c_le_intel >>> 0,91% libpthread-2.13.so [.] pthread_mutex_trylock >>> 0,87% [kernel] [k] resched_task >>> 0,72% [kernel] [k] cpu_startup_entry >>> 0,71% librados.so.2.0.0 [.] crush_hash32_3 >>> 0,66% [kernel] [k] leave_mm >>> 0,65% librados.so.2.0.0 [.] Mutex::Lock(bool) >>> 0,64% [kernel] [k] idle_cpu >>> 0,62% libpthread-2.13.so [.] __pthread_mutex_unlock_usercnt >>> 0,59% [kernel] [k] try_to_wake_up >>> 0,56% [kernel] [k] wake_futex >>> 0,50% librados.so.2.0.0 [.] ceph::buffer::ptr::release() >>> >>> firefly: >>> 12,56% libc-2.13.so [.] 0x7905d >>> 2,82% libc-2.13.so [.] malloc >>> 2,64% libc-2.13.so [.] free >>> 2,61% kvm [.] 0x34322f >>> 2,33% [kernel] [k] __schedule >>> 2,14% [kernel] [k] _raw_spin_lock >>> 1,83% [kernel] [k] __switch_to >>> 1,62% [kernel] [k] lapic_next_deadline >>> 1,17% [kernel] [k] _raw_spin_lock_irqsave >>> 1,09% [kernel] [k] update_sd_lb_stats >>> 1,08% libpthread-2.13.so [.] pthread_mutex_trylock >>> 0,85% libpthread-2.13.so [.] __pthread_mutex_unlock_usercnt >>> 0,77% [kernel] [k] resched_task >>> 0,74% librbd.so.1.0.0 [.] 0x71b73 >>> 0,72% librados.so.2.0.0 [.] Mutex::Lock(bool) >>> 0,68% librados.so.2.0.0 [.] crush_hash32_3 >>> 0,67% [kernel] [k] idle_cpu >>> 0,65% [kernel] [k] leave_mm >>> 0,65% [kernel] [k] cpu_startup_entry >>> 0,59% [kernel] [k] try_to_wake_up >>> 0,51% librados.so.2.0.0 [.] ceph::buffer::ptr::release() >>> 0,51% [kernel] [k] wake_futex >>> >>> Stefan >>> >>>> Am 11.02.2015 um 06:42 schrieb Stefan Priebe: >>>> >>>>> Am 11.02.2015 um 05:45 schrieb Mark Nelson: >>>>>> On 02/10/2015 04:18 PM, Stefan Priebe wrote: >>>>>> >>>>>>> Am 10.02.2015 um 22:38 schrieb Mark Nelson: >>>>>>>> On 02/10/2015 03:11 PM, Stefan Priebe wrote: >>>>>>>> >>>>>>>> mhm i installed librbd1-dbg and librados2-dbg - but the output still >>>>>>>> looks useless to me. Should i upload it somewhere? >>>>>>> >>>>>>> Meh, if it's all just symbols it's probably not that helpful. >>>>>>> >>>>>>> I've summarized your results here: >>>>>>> >>>>>>> 1 concurrent 4k write (libaio, direct=1, iodepth=1) >>>>>>> >>>>>>> IOPS Latency >>>>>>> wb on wb off wb on wb off >>>>>>> dumpling 10870 536 ~100us ~2ms >>>>>>> firefly 10350 525 ~100us ~2ms >>>>>>> >>>>>>> So in single op tests dumpling and firefly are far closer. Now let's >>>>>>> see each of these cases with iodepth=32 (still 1 thread for now). >>>>>> >>>>>> >>>>>> dumpling: >>>>>> >>>>>> file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=32 >>>>>> 2.0.8 >>>>>> Starting 1 thread >>>>>> Jobs: 1 (f=1): [w] [100.0% done] [0K/72812K /s] [0 /18.3K iops] [eta >>>>>> 00m:00s] >>>>>> file1: (groupid=0, jobs=1): err= 0: pid=3011 >>>>>> write: io=2060.6MB, bw=70329KB/s, iops=17582 , runt= 30001msec >>>>>> slat (usec): min=1 , max=3517 , avg= 3.42, stdev= 7.30 >>>>>> clat (usec): min=93 , max=7475 , avg=1815.72, stdev=233.43 >>>>>> lat (usec): min=219 , max=7477 , avg=1819.27, stdev=233.52 >>>>>> clat percentiles (usec): >>>>>> | 1.00th=[ 1480], 5.00th=[ 1576], 10.00th=[ 1608], 20.00th=[ >>>>>> 1672], >>>>>> | 30.00th=[ 1704], 40.00th=[ 1752], 50.00th=[ 1800], 60.00th=[ >>>>>> 1832], >>>>>> | 70.00th=[ 1896], 80.00th=[ 1960], 90.00th=[ 2064], 95.00th=[ >>>>>> 2128], >>>>>> | 99.00th=[ 2352], 99.50th=[ 2448], 99.90th=[ 4704], 99.95th=[ >>>>>> 5344], >>>>>> | 99.99th=[ 7072] >>>>>> bw (KB/s) : min=59696, max=77840, per=100.00%, avg=70351.27, >>>>>> stdev=4783.25 >>>>>> lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, >>>>>> 1000=0.53% >>>>>> lat (msec) : 2=85.02%, 4=14.31%, 10=0.13% >>>>>> cpu : usr=1.96%, sys=6.71%, ctx=22791, majf=0, minf=133 >>>>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >>>>>>> =64=0.0% >>>>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>>>> =64=0.0% >>>>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >>>>>>> =64=0.0% >>>>>> issued : total=r=0/w=527487/d=0, short=r=0/w=0/d=0 >>>>>> >>>>>> Run status group 0 (all jobs): >>>>>> WRITE: io=2060.6MB, aggrb=70329KB/s, minb=70329KB/s, maxb=70329KB/s, >>>>>> mint=30001msec, maxt=30001msec >>>>>> >>>>>> Disk stats (read/write): >>>>>> sdb: ios=166/526079, merge=0/0, ticks=24/890120, in_queue=890064, >>>>>> util=98.73% >>>>>> >>>>>> firefly: >>>>>> >>>>>> file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=32 >>>>>> 2.0.8 >>>>>> Starting 1 thread >>>>>> Jobs: 1 (f=1): [w] [100.0% done] [0K/69096K /s] [0 /17.3K iops] [eta >>>>>> 00m:00s] >>>>>> file1: (groupid=0, jobs=1): err= 0: pid=2982 >>>>>> write: io=1784.9MB, bw=60918KB/s, iops=15229 , runt= 30002msec >>>>>> slat (usec): min=1 , max=1389 , avg= 3.43, stdev= 5.32 >>>>>> clat (usec): min=117 , max=8235 , avg=2096.88, stdev=396.30 >>>>>> lat (usec): min=540 , max=8258 , avg=2100.43, stdev=396.61 >>>>>> clat percentiles (usec): >>>>>> | 1.00th=[ 1608], 5.00th=[ 1720], 10.00th=[ 1768], 20.00th=[ >>>>>> 1832], >>>>>> | 30.00th=[ 1896], 40.00th=[ 1944], 50.00th=[ 2008], 60.00th=[ >>>>>> 2064], >>>>>> | 70.00th=[ 2160], 80.00th=[ 2256], 90.00th=[ 2512], 95.00th=[ >>>>>> 2896], >>>>>> | 99.00th=[ 3600], 99.50th=[ 3792], 99.90th=[ 5088], 99.95th=[ >>>>>> 6304], >>>>>> | 99.99th=[ 6752] >>>>>> bw (KB/s) : min=36717, max=73712, per=99.94%, avg=60879.92, >>>>>> stdev=8302.27 >>>>>> lat (usec) : 250=0.01%, 750=0.01% >>>>>> lat (msec) : 2=48.56%, 4=51.18%, 10=0.26% >>>>>> cpu : usr=2.03%, sys=5.48%, ctx=20440, majf=0, minf=133 >>>>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >>>>>>> =64=0.0% >>>>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>>>> =64=0.0% >>>>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >>>>>>> =64=0.0% >>>>>> issued : total=r=0/w=456918/d=0, short=r=0/w=0/d=0 >>>>>> >>>>>> Run status group 0 (all jobs): >>>>>> WRITE: io=1784.9MB, aggrb=60918KB/s, minb=60918KB/s, maxb=60918KB/s, >>>>>> mint=30002msec, maxt=30002msec >>>>>> >>>>>> Disk stats (read/write): >>>>>> sdb: ios=166/455574, merge=0/0, ticks=12/897748, in_queue=897696, >>>>>> util=98.96% >>>>> >>>>> Ok, so it looks like as you increase concurrency the effect increases >>>>> (ie contention?). Does the same thing happen without cache enabled? >>>> >>>> here again without rbd cache: >>>> >>>> dumpling: >>>> file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=32 >>>> 2.0.8 >>>> Starting 1 thread >>>> Jobs: 1 (f=1): [w] [100.0% done] [0K/83488K /s] [0 /20.9K iops] [eta >>>> 00m:00s] >>>> file1: (groupid=0, jobs=1): err= 0: pid=3000 >>>> write: io=2449.2MB, bw=83583KB/s, iops=20895 , runt= 30005msec >>>> slat (usec): min=1 , max=975 , avg= 4.50, stdev= 5.25 >>>> clat (usec): min=364 , max=80566 , avg=1525.87, stdev=1194.57 >>>> lat (usec): min=519 , max=80568 , avg=1530.51, stdev=1194.44 >>>> clat percentiles (usec): >>>> | 1.00th=[ 660], 5.00th=[ 780], 10.00th=[ 876], 20.00th=[ >>>> 1032], >>>> | 30.00th=[ 1144], 40.00th=[ 1240], 50.00th=[ 1304], 60.00th=[ >>>> 1384], >>>> | 70.00th=[ 1480], 80.00th=[ 1640], 90.00th=[ 2096], 95.00th=[ >>>> 2960], >>>> | 99.00th=[ 6816], 99.50th=[ 7840], 99.90th=[11712], >>>> 99.95th=[13888], >>>> | 99.99th=[18816] >>>> bw (KB/s) : min=47184, max=95432, per=100.00%, avg=83639.19, >>>> stdev=7973.92 >>>> lat (usec) : 500=0.01%, 750=3.82%, 1000=14.40% >>>> lat (msec) : 2=70.57%, 4=7.91%, 10=3.11%, 20=0.17%, 50=0.01% >>>> lat (msec) : 100=0.01% >>>> cpu : usr=3.12%, sys=11.49%, ctx=74951, majf=0, minf=133 >>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >>>>> =64=0.0% >>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>> =64=0.0% >>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >>>>> =64=0.0% >>>> issued : total=r=0/w=626979/d=0, short=r=0/w=0/d=0 >>>> >>>> Run status group 0 (all jobs): >>>> WRITE: io=2449.2MB, aggrb=83583KB/s, minb=83583KB/s, maxb=83583KB/s, >>>> mint=30005msec, maxt=30005msec >>>> >>>> Disk stats (read/write): >>>> sdb: ios=168/625292, merge=0/0, ticks=144/916096, in_queue=916128, >>>> util=99.93% >>>> >>>> >>>> firefly: >>>> >>>> fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --numjobs=1 >>>> --thread --iodepth=32 --ioengine=libaio --runtime=30 --group_reporting >>>> --name=file1 >>>> file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=32 >>>> 2.0.8 >>>> Starting 1 thread >>>> Jobs: 1 (f=1): [w] [100.0% done] [0K/90044K /s] [0 /22.6K iops] [eta >>>> 00m:00s] >>>> file1: (groupid=0, jobs=1): err= 0: pid=2970 >>>> write: io=2372.9MB, bw=80976KB/s, iops=20244 , runt= 30006msec >>>> slat (usec): min=1 , max=4047 , avg= 4.36, stdev= 7.17 >>>> clat (usec): min=197 , max=76656 , avg=1575.29, stdev=1165.74 >>>> lat (usec): min=523 , max=76660 , avg=1579.79, stdev=1165.59 >>>> clat percentiles (usec): >>>> | 1.00th=[ 676], 5.00th=[ 804], 10.00th=[ 916], 20.00th=[ >>>> 1096], >>>> | 30.00th=[ 1224], 40.00th=[ 1304], 50.00th=[ 1384], 60.00th=[ >>>> 1448], >>>> | 70.00th=[ 1544], 80.00th=[ 1704], 90.00th=[ 2128], 95.00th=[ >>>> 2736], >>>> | 99.00th=[ 6752], 99.50th=[ 7904], 99.90th=[12096], >>>> 99.95th=[14656], >>>> | 99.99th=[18560] >>>> bw (KB/s) : min=47800, max=91952, per=99.91%, avg=80900.88, >>>> stdev=7234.98 >>>> lat (usec) : 250=0.01%, 500=0.01%, 750=2.95%, 1000=11.38% >>>> lat (msec) : 2=73.81%, 4=8.81%, 10=2.85%, 20=0.19%, 50=0.01% >>>> lat (msec) : 100=0.01% >>>> cpu : usr=2.99%, sys=10.60%, ctx=66549, majf=0, minf=133 >>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >>>>> =64=0.0% >>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>> =64=0.0% >>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >>>>> =64=0.0% >>>> issued : total=r=0/w=607445/d=0, short=r=0/w=0/d=0 >>>> >>>> Run status group 0 (all jobs): >>>> WRITE: io=2372.9MB, aggrb=80976KB/s, minb=80976KB/s, maxb=80976KB/s, >>>> mint=30006msec, maxt=30006msec >>>> >>>> Disk stats (read/write): >>>> sdb: ios=170/605440, merge=0/0, ticks=156/916492, in_queue=916560, >>>> util=99.93% >>>> >>>> Stefan >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to [email protected] >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to [email protected] > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
