Re: [ceph-users] improve single job sequential read performance.

2018-03-08 Thread Cassiano Pilipavicius
Hi Alex... thank you for the tips! Yesterday I did a lot of testing and
it seems that my network really is what is holding the speed down. I
would just like to confirm that there is no problem or misconfiguration
in my cluster that would be masked by the network upgrade. The cache
makes things much better: the second time I read a file, even if I drop
the caches in the guest OS, the data is read at the network speed limit.
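
For reference, the cached re-read check was roughly this, run inside the
guest (the file name and block size are just placeholders):

    # drop the guest page cache, then re-read the same file;
    # the second pass still comes in at the 1gbe line rate
    sync; echo 3 > /proc/sys/vm/drop_caches
    dd if=/path/to/bigfile of=/dev/null bs=4M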


I don't know if it is normal, but in the tests with fio, KRBD shows a
large performance boost over librbd (50MB/s with KRBD vs. 28MB/s with
librbd).
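
In case it helps to reproduce, the two fio runs were along these lines
(pool, image and device names are placeholders; single job, 4M
sequential reads):

    # librbd path: fio drives the image through the rbd ioengine
    fio --name=seqread --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=testimg --rw=read --bs=4M --iodepth=1 --numjobs=1

    # krbd path: map the image and read the kernel block device
    rbd map rbd/testimg          # shows up as e.g. /dev/rbd0
    fio --name=seqread --ioengine=libaio --direct=1 --filename=/dev/rbd0 \
        --rw=read --bs=4M --iodepth=1 --numjobs=1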


To check how much network latency is slowing things down, I created an
SSD-only pool (4 SSDs, size 2) and set the OSDs on one host to
primary-affinity 0, so reads did not have to cross the 1gbe link. When I
ran the test with this config, data was read at 900MB/s and clat stayed
under 1ms. When I set primary-affinity back to 1 and ran the same test,
the bandwidth dropped to only 100MB/s, with the highest clat at 250ms and
the average around 70ms.
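
The primary-affinity flip itself was just something like this (osd.4 and
osd.5 stand in for the OSDs on that one host):

    # keep these OSDs from acting as primaries during the test
    ceph osd primary-affinity osd.4 0
    ceph osd primary-affinity osd.5 0

    # and back to normal afterwards
    ceph osd primary-affinity osd.4 1
    ceph osd primary-affinity osd.5 1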


I will post the difference in speeds next week, once the network is
upgraded, in case anyone would like to see the results.



On 3/7/2018 10:38 PM, Alex Gorbachev wrote:

On Wed, Mar 7, 2018 at 8:37 PM, Alex Gorbachev wrote:

On Wed, Mar 7, 2018 at 9:43 AM, Cassiano Pilipavicius wrote:

Hi all, this issue has already been discussed in older threads and I've
already tried most of the solutions proposed there.


I have a small and old ceph cluster (started on hammer and upgraded up to
luminous 12.2.2), connected through a single shared 1gbe link (I know this
is not optimal, but for my workload it handles the load reasonably well).
I use RBD for small VMs in libvirt/qemu.

My problem is: if I need to copy a large file (cp, dd, tar), the read
speed is very low (15MB/s). I've tested the write speed of a single job
with dd from /dev/zero (direct) to a file, and the speed is good enough
for my environment (80MB/s).
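
The single-job write test was roughly this (path and sizes are just
placeholders):

    # single-stream direct write, bypassing the guest page cache
    dd if=/dev/zero of=/mnt/testfile bs=1M count=2048 oflag=direct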

If I run parallel jobs, I can saturate the network connection; the speed
scales with the number of jobs. I've tried setting read-ahead in
ceph.conf and in the guest OS.
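
The kind of read-ahead settings I mean are along these lines (the values
here are only examples, not tuned recommendations):

    # guest OS: raise the block-device read-ahead (e.g. for /dev/vda)
    blockdev --setra 4096 /dev/vda

    # ceph.conf on the client, [client] section: librbd read-ahead
    rbd readahead trigger requests = 10
    rbd readahead max bytes = 4194304
    rbd readahead disable after bytes = 0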

I've never heard any report of a cluster using a single 1gbe link; maybe
this speed is what I should expect? Next week I will be upgrading the
network to 2 x 10gbe (private and public), but I would like to know if
there is any issue I need to address before that, as the problem could be
masked by the network upgrade.

If anyone can throw some light on this, point me in any direction, or
tell me this is what I should expect, I would really appreciate it. If
anyone needs more info, please let me know.

Workarounds I have heard of or used:

1. Use fancy striping and parallelize that way (see the sketch after
this list):
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017744.html

2. Use lvm and set up a striped volume over multiple RBDs

3. Weird, but we have seen improvement in sequential speeds with a
larger object size (16 MB) in the past (also covered in the sketch
after this list)

4. Caching solutions may help smooth out peaks and valleys of IO -
bcache, flashcache; we have successfully used EnhanceIO in
writethrough mode

5. Better SSD journals help if using filestore

6. Caching controllers, e.g. Areca
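
A minimal sketch of items 1 and 3, assuming a fresh image is created
with a larger object size and fancy striping (the pool/image name and
the exact geometry are only examples):

    # 16 MB objects, striped 64K at a time across 8 objects, so a single
    # sequential stream fans out over more OSDs in parallel
    rbd create rbd/striped-img --size 102400 \
        --object-size 16M --stripe-unit 65536 --stripe-count 8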

--
Alex Gorbachev
Storcium



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

