Hi Mark -- thanks for the detailed description!  Here are my latency numbers
(local ping) on identical hardware:

Ubuntu 14.04 LTS:  rtt min/avg/max/mdev = 0.025/0.026/0.030/0.005 ms
RHEL 7:            rtt min/avg/max/mdev = 0.008/0.009/0.022/0.003 ms

So I am seeing a similar network stack latency difference.  Also, all the
tests I did were with debug logging off (but with other features such as
message signing and CRC checks still enabled).  Maybe we could have a quick
discussion on which settings are best to use when trying to get comparable
numbers with memstore or all-flash setups.
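
To make that concrete, here's roughly the ceph.conf fragment I have in mind
for the 'debug off' runs -- a sketch, with option names from memory, so worth
double-checking against the release in question:

    [global]
    # quiet the chattiest debug subsystems during latency runs
    debug ms = 0/0
    debug osd = 0/0
    debug filestore = 0/0
    debug journal = 0/0
    debug auth = 0/0
    # message signing and CRC were left at their (enabled) defaults in my runs
    cephx sign messages = true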

As far as the high concurrency test goes, peak IOPS will be reached at much
lower concurrency (probably somewhere around t=8), and at that point (the
'knee' of the latency/throughput curve) there's a pretty substantial latency
difference.  Once it gets to t=256 I imagine the latency was 10+ ms for both
platforms.
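
(Back-of-envelope sanity check, assuming Little's law holds at saturation:
average latency ~= concurrency / IOPS, so at t=256 and ~15K write IOPS that's
256 / 15000 ~= 17 ms, and more like 35 ms at the ~7K debug-on rate -- either
way well past the knee, which is why the per-op latency difference gets
hidden at that depth.)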

Since the last direct comparison was on older code, and since the builds mixed
libnss/cryptopp, I think I need to rerun the comparison (at least one last
time!) between the two distros on a more recent version of the code.

Thanks,

Stephen



-----Original Message-----
From: Mark Nelson [mailto:[email protected]] 
Sent: Tuesday, January 27, 2015 2:03 PM
To: Blinick, Stephen L; Ceph Development
Subject: Re: Memstore performance improvements v0.90 vs v0.87

Hi Stephen,

Took a little longer than I wanted it to, but I finally got some results 
looking at RHEL7 and Ubuntu 14.04 in our test lab.  This is with a recent 
master pull.

Tests are with rados bench to a single memstore OSD on localhost.
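
Invocations along these lines should reproduce the setup (a sketch -- pool
name and runtime are placeholders):

    # single-op latency: queue depth 1, 4KB objects, keep objects for reads
    rados bench -p bench 60 write -b 4096 -t 1 --no-cleanup
    # read numbers from a sequential-read pass against the same objects
    rados bench -p bench 60 seq -t 1

with -t 256 for the high-concurrency runs further down.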

Single Op Avg Write Latency:

Ubuntu 14.04:            0.91ms
Ubuntu 14.04 (no debug): 0.67ms
RHEL 7:                  0.49ms
RHEL 7 (no debug):       0.31ms

Single Op Avg Read Latency:

Ubuntu 14.04:            0.58ms
Ubuntu 14.04 (no debug): 0.33ms
RHEL 7:                  0.32ms
RHEL 7 (no debug):       0.17ms

I then checked avg network latency to localhost using ping for 120s:

Ubuntu 14.04: 0.025 ms
RHEL 7:       0.015 ms
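
(That measurement is just stock iputils ping, e.g. one sample per second for
120s, averaged from the rtt summary line:

    ping -c 120 -i 1 localhost

)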

So looking at your results, I see similar latency numbers, though not quite as
dramatic (i.e., Ubuntu isn't quite so bad for me).  I wanted to know whether
the latency difference would be hidden if enough IOs were thrown at the
problem, so I increased concurrent IOs to 256:

256 concurrent op Write IOPS:

Ubuntu 14.04:             7199 IOPS
Ubuntu 14.04 (no debug): 14613 IOPS
RHEL 7:                   7784 IOPS
RHEL 7 (no debug):       17907 IOPS

256 concurrent op Read IOPS:

Ubuntu 14.04:             9887 IOPS
Ubuntu 14.04 (no debug): 20489 IOPS
RHEL 7:                  10832 IOPS
RHEL 7 (no debug):       21257 IOPS

So on one hand I'm seeing an effect similar to what you saw, but once I throw
enough concurrency at the problem it seems like other things take over as the
bottleneck.  With default debug logging levels the latency difference is mostly
masked, but with debugging off we see a fairly substantial difference, at least
for writes.

I collected some system utilization data during the tests and will go back and 
see if I can discover anything more with perf as well.  I think the two big 
takeaways at this point are:

1) There is definitely something interesting going on with Ubuntu vs RHEL
(maybe network related).
2) Our debug logging has become a major bottleneck in high IOPS scenarios
(though we already kind of knew this; see the note below).
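
(A quick aside on that second point: debug levels can also be changed at
runtime, which makes before/after comparisons easy.  A sketch, assuming a
single local OSD named osd.0:

    ceph tell osd.0 injectargs '--debug-ms 0/0 --debug-osd 0/0 --debug-filestore 0/0'

No restart needed, so the same rados bench run can be repeated with logging
on and off.)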

Mark

On 01/14/2015 05:39 PM, Blinick, Stephen L wrote:
> Haha :)  Well, my intuition is still pointing to something I've configured
> wrong (or had wrong)... but it will be interesting to see what it is.
>
> -----Original Message-----
> From: Mark Nelson [mailto:[email protected]]
> Sent: Wednesday, January 14, 2015 3:43 PM
> To: Blinick, Stephen L; Ceph Development
> Subject: Re: Memstore performance improvements v0.90 vs v0.87
>
> On 01/14/2015 04:32 PM, Blinick, Stephen L wrote:
>> I went back and grabbed v0.87 and built it on RHEL7 as well, and performance
>> is also similar (much better).  I've also run it on a few systems
>> (dual-socket 10-core E5 v2, dual-socket 6-core E5 v3).  So it's related to my
>> switch to RHEL7, and not to the code changes between v0.90 and v0.87.  Will
>> post when I get more data.
>
> Stephen, you are practically writing press releases for the RHEL guys 
> here! ;)
>
> Mark
>
>>
>> Thanks,
>>
>> Stephen
>>
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Blinick, 
>> Stephen L
>> Sent: Wednesday, January 14, 2015 12:06 AM
>> To: Ceph Development
>> Subject: Memstore performance improvements v0.90 vs v0.87
>>
>> In the process of moving to a new cluster (RHEL7 based) I grabbed v0.90,
>> compiled RPMs, and re-ran the simple local-node memstore test I've run on
>> v0.80 through v0.87.  It's a single memstore OSD and a single rados bench
>> client locally on the same node, increasing queue depth and measuring
>> latency/IOPS.  So far, the measurements have been consistent across different
>> hardware and code releases (with about a 30% improvement from the OpWQ
>> sharding changes that came in after Firefly).
>>
>> These are just very early results, but I'm seeing a very large improvement
>> in latency and throughput with v0.90 on RHEL7.  Next I'm working to get
>> lttng installed and working on RHEL7 to determine where the improvement is.
>> On previous levels, these measurements have been roughly the same using a
>> real (fast) backend (i.e. NVMe flash), and I will verify that here as well.
>> Just wondering if anyone else has measured similar improvements?
>>
>>
>> 100% Reads or Writes, 4K Objects, Rados Bench
>>
>> ========================
>> V0.87: Ubuntu 14.04 LTS
>>
>> *Writes*
>> #Thr IOPS    Latency(ms)
>> 1    618.80          1.61
>> 2    1401.70         1.42
>> 4    3962.73         1.00
>> 8    7354.37         1.10
>> 16   7654.67         2.10
>> 32   7320.33         4.37
>> 64   7424.27         8.62
>>
>> *Reads*
>> #thr IOPS    Latency(ms)
>> 1    837.57          1.19
>> 2    1950.00         1.02
>> 4    6494.03         0.61
>> 8    7243.53         1.10
>> 16   7473.73         2.14
>> 32   7682.80         4.16
>> 64   7727.10         8.28
>>
>>
>> ========================
>> V0.90: RHEL7
>>
>> *Writes*
>> #Thr IOPS    Latency(ms)
>> 1    2558.53         0.39
>> 2    6014.67         0.33
>> 4    10061.33        0.40
>> 8    14169.60        0.56
>> 16   14355.63        1.11
>> 32   14150.30        2.26
>> 64   15283.33        4.19
>>
>> *Reads*
>> #Thr IOPS    Latency(ms)
>> 1    4535.63         0.22
>> 2    9969.73         0.20
>> 4    17049.43        0.23
>> 8    19909.70        0.40
>> 16   20320.80        0.79
>> 32   19827.93        1.61
>> 64   22371.17        2.86