Re: [ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-17 Thread Christian Balzer


Hello,

On Wed, 17 Feb 2016 21:47:31 +0530 Swapnil Jain wrote:

> Thanks Christian,
> 
> 
> 
> > On 17-Feb-2016, at 7:25 AM, Christian Balzer  wrote:
> > 
> > 
> > Hello,
> > 
> > On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote:
> > 
> >> For most of you, CEPH on ARMv7 might not sound good. This is our setup
> >> and our FIO testing report. I am not able to understand ….
> >> 
> > Just one OSD per Microserver as in your case should be fine.
> > As always, use atop (or similar) on your storage servers when running
> > these tests to see where your bottlenecks are (HDD/network/CPU).
> > 
> >> 1) Are these results good or bad?
> >> 2) Write is much better than read, whereas read should be better.
> >> 
> > Your testing is flawed, more below.
> > 
> >> Hardware:
> >> 
> >> 8 x ARMv7 MicroServer with 4 x 10G Uplink
> >> 
> >> Each MicroServer with:
> >> 2GB RAM
> > Barely OK for one OSD, not enough if you run MONs as well on it (as you
> > do).
> > 
> >> Dual Core 1.6 GHz processor
> >> 2 x 2.5 Gbps Ethernet (1 for Public / 1 for Cluster Network)
> >> 1 x 3TB SATA HDD
> >> 1 x 128GB MSata Flash
> > Exact model/maker please.
> 
> It's a Seagate ST3000NC000 & a Phison mSATA
>
There seems to be quite a large number of Phison mSATA drive models
available, and their specifications don't mention endurance, DWPD or TBW...

Anyway, you will want to look into this to see if they are a good match
for Ceph journals:
http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
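The key test there is, roughly, single-threaded 4k sequential writes with
O_DSYNC against the raw flash device. Something along these lines (the device
path is a placeholder, and the run overwrites data on it, so don't point it
at a device that is already in use):

  fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based \
      --group_reporting --name=journal-test

A drive that only manages a few hundred IOPS here will likely make a poor
journal.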

 
> 
> > 
> >> 
> >> Software:
> >> Debian 8.3 32bit
> >> ceph version 9.2.0-25-gf480cea
> >> 
> >> Setup:
> >> 
> >> 3 MON (Shared with 3 OSD)
> >> 8 OSD
> >> Data on 3TB SATA with XFS
> >> Journal on 128GB MSata Flash
> >> 
> >> pool with replica 1
> > Not a very realistic test of course.
> > For a production, fault resilient cluster you would have to divide your
> > results by 3 (at least).
> > 
> >> 500GB image with 4M object size
> >> 
> >> FIO command: fio --name=unit1 --filename=/dev/rbd1 --bs=4k
> >> --runtime=300 --readwrite=write
> >> 
> > 
> > If that is your base FIO command line, I'm assuming you mounted that
> > image on the client via the kernel RBD module?
> 
> Yes, it's via the kernel RBD module
> 
> 
> > 
> > Either way, the main reason you're seeing writes being faster than
> > reads is that with this command line (no direct=1 flag) fio will use
> > the page cache on your client host for writes, speeding things up
> > dramatically. To get a realistic idea of your cluster's ability, use
> > direct=1 and also look into rados bench.
> > 
> > Another reason for the slow reads is that Ceph (RBD) does badly with
> > regard to read-ahead; setting /sys/block/rbd1/queue/read_ahead_kb to
> > something like 2048 should improve things.
> > 
> > That all being said, your read values look awfully low.
> 
> Thanks again for the suggestion. Below are some results using rados
> bench; here read looks much better than write. Still, is it good, or can
> it be better?

rados bench with default settings operates on 4MB blocks, which matches the
Ceph object size, meaning it is optimized for giving the best performance
figures in terms of throughput.
In real-life situations you're likely to be more interested in IOPS than
MB/s.
If you run it with "-b 4096" (i.e. 4KB blocks) you're likely to see with
atop that your CPUs are getting a much, MUCH bigger workout.
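For example (the pool name "rbd" and the 60 second runtime are placeholders
here, and the default of 16 concurrent operations applies):

  # default 4MB objects, throughput-oriented
  rados bench -p rbd 60 write --no-cleanup
  rados bench -p rbd 60 seq
  rados -p rbd cleanup
  # 4KB objects, far more IOPS- and CPU-bound
  rados bench -p rbd 60 write -b 4096 --no-cleanup
  rados bench -p rbd 60 seq
  rados -p rbd cleanup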

> I also checked atop and couldn't see any bottleneck, except that
> the sda disk was busy 80-90% of the time during the test.
> 
Well, if that is true (on average) for all your nodes, then you found the
bottleneck. 
Also, which one is "sda", the HDD or the SSD?
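A quick way to check on the node itself (ROTA=1 means a rotational disk):

  lsblk -o NAME,SIZE,ROTA,MODEL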

> 
> WRITE Throughput (MB/sec): 297.544
> WRITE Average Latency (s): 0.21499
> 
> READ Throughput (MB/sec):  478.026
> READ Average Latency (s):  0.133818
> 
These are pretty good numbers for this setup indeed. 
But again, with a replication size of 1 they're not representative of
reality at all.

Regards,

Christian
> —
> Swapnil
> 
> > 
> > Christian
> >> Client:
> >> 
> >> Ubuntu on Intel 24core/16GB RAM 10G Ethernet
> >> 
> >> Result for different tests
> >> 
> >> 128k-randread.txt:  read : io=2587.4MB, bw=8830.2KB/s, iops=68, runt=300020msec
> >> 128k-randwrite.txt:  write: io=48549MB, bw=165709KB/s, iops=1294, runt=35msec
> >> 128k-read.txt:  read : io=26484MB, bw=90397KB/s, iops=706, runt=32msec
> >> 128k-write.txt:  write: io=89538MB, bw=305618KB/s, iops=2387, runt=34msec
> >> 16k-randread.txt:  read : io=383760KB, bw=1279.2KB/s, iops=79, runt=31msec
> >> 16k-randwrite.txt:  write: io=8720.7MB, bw=29764KB/s, iops=1860, runt=32msec
> >> 16k-read.txt:  read : io=27444MB, bw=93676KB/s, iops=5854, runt=31msec
> >> 16k-write.txt:  write: io=87811MB, bw=299726KB/s, iops=18732, runt=31msec
> >> 1M-randread.txt:  read : io=10439MB, bw=35631KB/s, iops=34, runt=38msec
> >> 1M-randwrite.txt:  write: io=98943MB, bw=337721KB/s, iops=329,

Re: [ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-17 Thread Swapnil Jain
Thanks Christian,



> On 17-Feb-2016, at 7:25 AM, Christian Balzer  wrote:
> 
> 
> Hello,
> 
> On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote:
> 
>> For most of you, CEPH on ARMv7 might not sound good. This is our setup
>> and our FIO testing report. I am not able to understand ….
>> 
> Just one OSD per Microserver as in your case should be fine.
> As always, use atop (or similar) on your storage servers when running
> these tests to see where your bottlenecks are (HDD/network/CPU).
> 
>> 1) Are these results good or bad?
>> 2) Write is much better than read, whereas read should be better.
>> 
> Your testing is flawed, more below.
> 
>> Hardware:
>> 
>> 8 x ARMv7 MicroServer with 4 x 10G Uplink
>> 
>> Each MicroServer with:
>> 2GB RAM
> Barely OK for one OSD, not enough if you run MONs as well on it (as you
> do).
> 
>> Dual Core 1.6 GHz processor
>> 2 x 2.5 Gbps Ethernet (1 for Public / 1 for Cluster Network)
>> 1 x 3TB SATA HDD
>> 1 x 128GB MSata Flash
> Exact model/maker please.

It's a Seagate ST3000NC000 & a Phison mSATA


> 
>> 
>> Software:
>> Debian 8.3 32bit
>> ceph version 9.2.0-25-gf480cea
>> 
>> Setup:
>> 
>> 3 MON (Shared with 3 OSD)
>> 8 OSD
>> Data on 3TB SATA with XFS
>> Journal on 128GB MSata Flash
>> 
>> pool with replica 1
> Not a very realistic test of course.
> For a production, fault resilient cluster you would have to divide your
> results by 3 (at least).
> 
>> 500GB image with 4M object size
>> 
>> FIO command: fio --name=unit1 --filename=/dev/rbd1 --bs=4k --runtime=300
>> --readwrite=write
>> 
> 
> If that is your base FIO command line, I'm assuming you mounted that image
> on the client via the kernel RBD module?

Yes, it's via the kernel RBD module


> 
> Either way, the main reason you're seeing writes being faster than reads
> is that with this command line (no direct=1 flag) fio will use the page
> cache on your client host for writes, speeding things up dramatically.
> To get a realistic idea of your cluster's ability, use direct=1 and also
> look into rados bench.
> 
> Another reason for the slow reads is that Ceph (RBD) does badly with
> regard to read-ahead; setting /sys/block/rbd1/queue/read_ahead_kb to
> something like 2048 should improve things.
> 
> That all being said, your read values look awfully low.

Thanks again for the suggestion. Below are some results using rados bench; here
read looks much better than write. Still, is it good, or can it be better? I also
checked atop and couldn't see any bottleneck, except that the sda disk was busy
80-90% of the time during the test.


WRITE Throughput (MB/sec): 297.544
WRITE Average Latency (s): 0.21499

READ Throughput (MB/sec):  478.026
READ Average Latency (s):  0.133818

—
Swapnil

> 
> Christian
>> Client:
>> 
>> Ubuntu on Intel 24core/16GB RAM 10G Ethernet
>> 
>> Result for different tests
>> 
>> 128k-randread.txt:  read : io=2587.4MB, bw=8830.2KB/s, iops=68, runt=300020msec
>> 128k-randwrite.txt:  write: io=48549MB, bw=165709KB/s, iops=1294, runt=35msec
>> 128k-read.txt:  read : io=26484MB, bw=90397KB/s, iops=706, runt=32msec
>> 128k-write.txt:  write: io=89538MB, bw=305618KB/s, iops=2387, runt=34msec
>> 16k-randread.txt:  read : io=383760KB, bw=1279.2KB/s, iops=79, runt=31msec
>> 16k-randwrite.txt:  write: io=8720.7MB, bw=29764KB/s, iops=1860, runt=32msec
>> 16k-read.txt:  read : io=27444MB, bw=93676KB/s, iops=5854, runt=31msec
>> 16k-write.txt:  write: io=87811MB, bw=299726KB/s, iops=18732, runt=31msec
>> 1M-randread.txt:  read : io=10439MB, bw=35631KB/s, iops=34, runt=38msec
>> 1M-randwrite.txt:  write: io=98943MB, bw=337721KB/s, iops=329, runt=34msec
>> 1M-read.txt:  read : io=25717MB, bw=87779KB/s, iops=85, runt=37msec
>> 1M-write.txt:  write: io=74264MB, bw=253487KB/s, iops=247, runt=31msec
>> 4k-randread.txt:  read : io=116920KB, bw=399084B/s, iops=97, runt=32msec
>> 4k-randwrite.txt:  write: io=5579.2MB, bw=19043KB/s, iops=4760, runt=34msec
>> 4k-read.txt:  read : io=27032MB, bw=92271KB/s, iops=23067, runt=31msec
>> 4k-write.txt:  write: io=92955MB, bw=317284KB/s, iops=79320, runt=31msec
>> 64k-randread.txt:  read : io=1400.2MB, bw=4778.2KB/s, iops=74, runt=300020msec
>> 64k-randwrite.txt:  write: io=27676MB, bw=94467KB/s, iops=1476, runt=35msec
>> 64k-read.txt:  read : io=27805MB, bw=94909KB/s, iops=1482, runt=32msec
>> 64k-write.txt:  write: io=95484MB, bw=325917KB/s, iops=5092, runt=33msec
>> 
>> 
>> —
>> Swapnil Jain | swap...@linux.com
>> Solution Architect & Red Hat Certified Instructor
>> RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE
>> 
>> 
> 
> 
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Rakuten Communications
> http://www.gol.com/ 



Re: [ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-16 Thread Christian Balzer

Hello,

On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote:

> For most of you, CEPH on ARMv7 might not sound good. This is our setup
> and our FIO testing report. I am not able to understand ….
>
Just one OSD per Microserver as in your case should be fine.
As always, use atop (or similar) on your storage servers when running
these tests to see where your bottlenecks are (HDD/network/CPU).
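For example, on each storage node while a benchmark is running:

  atop 2

and keep an eye on the DSK, CPU and NET lines for anything sitting close to
100% busy.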
 
> 1) Are these results good or bad?
> 2) Write is much better than read, whereas read should be better.
> 
Your testing is flawed, more below.

> Hardware:
> 
> 8 x ARMv7 MicroServer with 4 x 10G Uplink
> 
> Each MicroServer with:
> 2GB RAM
Barely OK for one OSD, not enough if you run MONs as well on it (as you
do).

> Dual Core 1.6 GHz processor
> 2 x 2.5 Gbps Ethernet (1 for Public / 1 for Cluster Network)
> 1 x 3TB SATA HDD
> 1 x 128GB MSata Flash
Exact model/maker please.

> 
> Software:
> Debian 8.3 32bit
> ceph version 9.2.0-25-gf480cea
> 
> Setup:
> 
> 3 MON (Shared with 3 OSD)
> 8 OSD
> Data on 3TB SATA with XFS
> Journal on 128GB MSata Flash
> 
> pool with replica 1
Not a very realistic test of course.
For a production, fault resilient cluster you would have to divide your
results by 3 (at least).
 
> 500GB image with 4M object size
> 
> FIO command: fio --name=unit1 --filename=/dev/rbd1 --bs=4k --runtime=300
> --readwrite=write
>

If that is your base FIO command line, I'm assuming you mounted that image
on the client via the kernel RBD module? 

Either way, the main reason you're seeing writes being faster than reads
is that with this command line (no direct=1 flag) fio will use the page
cache on your client host for writes, speeding things up dramatically.
To get a realistic idea of your cluster's ability, use direct=1 and also
look into rados bench.
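For example, your original command with the page cache taken out of the
picture:

  fio --name=unit1 --filename=/dev/rbd1 --bs=4k --runtime=300 --direct=1 \
      --readwrite=write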

Another reason for the slow reads is that Ceph (RBD) does badly with
regard to read-ahead; setting /sys/block/rbd1/queue/read_ahead_kb to
something like 2048 should improve things.
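For example, on the client for the mapped device (rbd1, as in your fio
command line):

  echo 2048 > /sys/block/rbd1/queue/read_ahead_kb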

That all being said, your read values look awfully low.

Christian
> Client:
> 
> Ubuntu on Intel 24core/16GB RAM 10G Ethernet
> 
> Result for different tests
> 
> 128k-randread.txt:  read : io=2587.4MB, bw=8830.2KB/s, iops=68, runt=300020msec
> 128k-randwrite.txt:  write: io=48549MB, bw=165709KB/s, iops=1294, runt=35msec
> 128k-read.txt:  read : io=26484MB, bw=90397KB/s, iops=706, runt=32msec
> 128k-write.txt:  write: io=89538MB, bw=305618KB/s, iops=2387, runt=34msec
> 16k-randread.txt:  read : io=383760KB, bw=1279.2KB/s, iops=79, runt=31msec
> 16k-randwrite.txt:  write: io=8720.7MB, bw=29764KB/s, iops=1860, runt=32msec
> 16k-read.txt:  read : io=27444MB, bw=93676KB/s, iops=5854, runt=31msec
> 16k-write.txt:  write: io=87811MB, bw=299726KB/s, iops=18732, runt=31msec
> 1M-randread.txt:  read : io=10439MB, bw=35631KB/s, iops=34, runt=38msec
> 1M-randwrite.txt:  write: io=98943MB, bw=337721KB/s, iops=329, runt=34msec
> 1M-read.txt:  read : io=25717MB, bw=87779KB/s, iops=85, runt=37msec
> 1M-write.txt:  write: io=74264MB, bw=253487KB/s, iops=247, runt=31msec
> 4k-randread.txt:  read : io=116920KB, bw=399084B/s, iops=97, runt=32msec
> 4k-randwrite.txt:  write: io=5579.2MB, bw=19043KB/s, iops=4760, runt=34msec
> 4k-read.txt:  read : io=27032MB, bw=92271KB/s, iops=23067, runt=31msec
> 4k-write.txt:  write: io=92955MB, bw=317284KB/s, iops=79320, runt=31msec
> 64k-randread.txt:  read : io=1400.2MB, bw=4778.2KB/s, iops=74, runt=300020msec
> 64k-randwrite.txt:  write: io=27676MB, bw=94467KB/s, iops=1476, runt=35msec
> 64k-read.txt:  read : io=27805MB, bw=94909KB/s, iops=1482, runt=32msec
> 64k-write.txt:  write: io=95484MB, bw=325917KB/s, iops=5092, runt=33msec
> 
> 
> —
> Swapnil Jain | swap...@linux.com 
> Solution Architect & Red Hat Certified Instructor
> RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE
> 
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/


[ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-15 Thread Swapnil Jain
For most of you, CEPH on ARMv7 might not sound good. This is our setup and our
FIO testing report. I am not able to understand ….

1) Are these results good or bad?
2) Write is much better than read, whereas read should be better.

Hardware:

8 x ARMv7 MicroServer with 4 x 10G Uplink

Each MicroServer with:
2GB RAM
Dual Core 1.6 GHz processor
2 x 2.5 Gbps Ethernet (1 for Public / 1 for Cluster Network)
1 x 3TB SATA HDD
1 x 128GB MSata Flash

Software:
Debian 8.3 32bit
ceph version 9.2.0-25-gf480cea

Setup:

3 MON (Shared with 3 OSD)
8 OSD
Data on 3TB SATA with XFS
Journal on 128GB MSata Flash

pool with replica 1
500GB image with 4M object size

FIO command: fio --name=unit1 --filename=/dev/rbd1 --bs=4k --runtime=300 
--readwrite=write

Client:

Ubuntu on Intel 24core/16GB RAM 10G Ethernet

Result for different tests

128k-randread.txt:  read : io=2587.4MB, bw=8830.2KB/s, iops=68, runt=300020msec
128k-randwrite.txt:  write: io=48549MB, bw=165709KB/s, iops=1294, runt=35msec
128k-read.txt:  read : io=26484MB, bw=90397KB/s, iops=706, runt=32msec
128k-write.txt:  write: io=89538MB, bw=305618KB/s, iops=2387, runt=34msec
16k-randread.txt:  read : io=383760KB, bw=1279.2KB/s, iops=79, runt=31msec
16k-randwrite.txt:  write: io=8720.7MB, bw=29764KB/s, iops=1860, runt=32msec
16k-read.txt:  read : io=27444MB, bw=93676KB/s, iops=5854, runt=31msec
16k-write.txt:  write: io=87811MB, bw=299726KB/s, iops=18732, runt=31msec
1M-randread.txt:  read : io=10439MB, bw=35631KB/s, iops=34, runt=38msec
1M-randwrite.txt:  write: io=98943MB, bw=337721KB/s, iops=329, runt=34msec
1M-read.txt:  read : io=25717MB, bw=87779KB/s, iops=85, runt=37msec
1M-write.txt:  write: io=74264MB, bw=253487KB/s, iops=247, runt=31msec
4k-randread.txt:  read : io=116920KB, bw=399084B/s, iops=97, runt=32msec
4k-randwrite.txt:  write: io=5579.2MB, bw=19043KB/s, iops=4760, runt=34msec
4k-read.txt:  read : io=27032MB, bw=92271KB/s, iops=23067, runt=31msec
4k-write.txt:  write: io=92955MB, bw=317284KB/s, iops=79320, runt=31msec
64k-randread.txt:  read : io=1400.2MB, bw=4778.2KB/s, iops=74, runt=300020msec
64k-randwrite.txt:  write: io=27676MB, bw=94467KB/s, iops=1476, runt=35msec
64k-read.txt:  read : io=27805MB, bw=94909KB/s, iops=1482, runt=32msec
64k-write.txt:  write: io=95484MB, bw=325917KB/s, iops=5092, runt=33msec


—
Swapnil Jain | swap...@linux.com 
Solution Architect & Red Hat Certified Instructor
RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE



