Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-19 Thread Maged Mokhtar
Hi Nick, 

Interesting note on PG locking, but I would be surprised if its effect is that
bad. I would think that in your example the 2 ms is the total latency and the
lock is probably held for only a small portion of it, so the concurrent
operations are not serialized for the entire time; but again, I may be wrong.
Also, if the lock were that costly, we should see 4K sequential writes being
much slower than random ones in general testing, which is not the case. 

Another thing that may help with the VM migration you describe is reducing the
RBD stripe size to a couple of times smaller than 2M (the 32 x 64K that ESXi
keeps in flight), so that the parallel writes land on more than one object. 
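
To make the effect concrete, here is a rough back-of-the-envelope sketch in
Python. It only models how an aligned, contiguous 2M burst (Nick's 32 x 64K
in-flight writes) maps onto RBD objects for a few candidate object/stripe
sizes; the sizes are illustrative assumptions, not measurements.

import math

IO_SIZE = 64 * 1024           # ESXi migration IO size from Nick's example
IN_FLIGHT = 32                # parallel IOs ESXi keeps outstanding
BURST = IO_SIZE * IN_FLIGHT   # 2M of contiguous data in flight at once

# Candidate object/stripe sizes; the RBD default object size is assumed 4M.
for object_size in (4 * 1024**2, 2 * 1024**2, 1024**2, 512 * 1024, 256 * 1024):
    # An aligned, contiguous 2M burst spans at most this many objects,
    # i.e. this many PG locks that could be taken independently.
    objects_touched = math.ceil(BURST / object_size)
    print("object size %5d KB -> burst spread over ~%d object(s)"
          % (object_size // 1024, objects_touched))

With the default 4M objects the whole burst queues behind one object, while
256K stripes would let up to 8 of the 32 writes proceed against different
objects, which is the effect I would hope to see. 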

Maged 

On 2017-08-16 16:12, Nick Fisk wrote:

> Hi Matt, 
> 
> Well-behaved applications are the problem here. ESXi sends all writes as sync 
> writes. So although guest OSs will still do their own buffering, any ESXi-level 
> operation is done entirely as sync. This is probably most visible when 
> migrating VMs between datastores: everything gets done as sync 64KB IOs, 
> meaning copying a 1TB VM can often take nearly 24 hours. 
> 
> Osama, can you describe the difference in performance you see between 
> Openstack and ESXi, and what type of operations are these? Sync writes should 
> be the same no matter the client, except in the NFS case you will have an 
> extra network hop and potentially a little bit of PG congestion around the FS 
> journal on the RBD device. 
> 
> Osama, you can't compare Ceph to a SAN. Just in terms of network latency you 
> have an extra 2 hops. In an ideal scenario you might be able to get Ceph write 
> latency down to 0.5-1ms for a 4KB IO, compared to about 0.1-0.3ms for a 
> storage array. However, what you will find with Ceph is that other things 
> start to increase this average long before you would start to see this on 
> storage arrays. 
> 
> The migration is a good example of this. As I said, ESXi migrates a VM in 
> 64KB IOs, but does 32 of these blocks in parallel at a time. On storage 
> arrays, these 64KB IOs are coalesced in the battery-protected write cache 
> into bigger IOs before being persisted to disk. The storage array can also 
> accept all 32 of these requests at once. 
> 
> A similar thing happens in Ceph/RBD/NFS via the Ceph filestore journal, but 
> that coalescing is now an extra 2 hops away and with a bit of extra latency 
> introduced by the Ceph code, we are already a bit slower. But here's the 
> killer, PG locking!!! You can't write 32 IO's in parallel to the same 
> object/PG, each one has to be processed sequentially because of the locks. 
> (Please someone correct me if I'm wrong here). If your 64KB write latency is 
> 2ms, then you can only do 500 64KB IO's a second. 64KB*500=~30MB/s vs a 
> Storage Array which would be doing the operation in the hundreds of MB/s 
> range. 
> 
> Note: When proper iSCSI for RBD support is finished, you might be able to use 
> the VAAI offloads, which would dramatically increase performance for 
> migrations as well. 
> 
> Also once persistent SSD write caching for librbd becomes available, a lot of 
> these problems will go away, as the SSD will behave like a storage array's 
> write cache and will only be 1 hop away from the client as well. 
> 
> FROM: Matt Benjamin [mailto:mbenj...@redhat.com] 
> SENT: 16 August 2017 14:49
> TO: Osama Hasebou <osama.hase...@csc.fi>
> CC: n...@fisk.me.uk; ceph-users <ceph-users@lists.ceph.com>
> SUBJECT: Re: [ceph-users] VMware + Ceph using NFS sync/async ? 
> 
> Hi Osama, 
> 
> I don't have a clear sense of the application workflow here--and Nick 
> appears to--but I thought it worth noting that NFSv3 and NFSv4 clients 
> shouldn't normally need the sync mount option to achieve i/o stability with 
> well-behaved applications.  In both versions of the protocol, an application 
> write that is synchronous (or, more typically, the equivalent application 
> sync barrier) should not succeed until an NFS-protocol COMMIT (or in some 
> cases w/NFSv4, WRITE w/stable flag set) has been acknowledged by the NFS 
> server.  If the NFS i/o stability model is insufficient for your workflow, 
> moreover, I'd be worried that -osync writes (which might be incompletely 
> applied during a failure event) may not be correctly enforcing your 
> invariant, either. 
> 
> Matt 
> 
> On Wed, Aug 16, 2017 at 8:33 AM, Osama Hasebou <osama.hase...@csc.fi> wrote:
> 
>> Hi Nick, 
>> 
>> Thanks for replying! If Ceph is combined with Openstack then, does that mean 
>> that actually when openstack writes are happening, it is not fully sync'd 
>> (as in written to disks) before it starts receiving more data, so acting as 
>> async ? In that scenario there is a chance for data loss if things go bad, 
>> i.e power outage or someth

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Adrian Saul
> I'd be interested in details of this small versus large bit.

The smaller shares are simply to distribute the workload over more RBDs so that
the RBD device does not become the bottleneck. The size itself doesn't
particularly matter; the idea is just to distribute VMs across many shares
rather than a few large datastores.

We originally started with 10TB shares, just because we had the space - but we
found performance was running out before capacity did.  It has become apparent
that the limitation is at the RBD level, particularly with writes.  So under
heavy usage, with say VMware snapshot backups, VMs get impacted by higher
latency to the point that some become unresponsive for short periods.  The Ceph
cluster itself has plenty of performance available and handles far higher
workload periods, but individual RBD devices just seem to hit the wall.

For example, one of our shares will sit there all day happily doing 300-400
read IOPS at very low latencies.  During the backup period we get heavier
writes as snapshots are created and cleaned up.  That increased write activity
pushes the RBD to 100% busy, and read latencies go up from 1-2ms to 20-30ms
even though the number of reads doesn't change much.  The devices can handle
more, though; I can see periods of up to 1800 read IOPS and 800 write.

There is probably more tuning that can be applied at the XFS/NFS level, but for 
the moment that’s the direction we are taking - creating more shares.

>
> Would you say that the IOPS starvation is more an issue of the large
> filesystem than the underlying Ceph/RBD?

As above - I think it's more to do with an IOPS limitation at the RBD device
level, likely due to sync write latency limiting the number of effective IOs.
That might be XFS as well, but I have not had the chance to dial that in further.

> With a cache-tier in place I'd expect all hot FS objects (inodes, etc) to be
> there and thus be as fast as it gets from a Ceph perspective.

Yeah - the cache tier takes a fair bit of the heat and improves the response
considerably for the SATA environments - it makes a significant difference.
The SSD-only pool images behave in a similar way but reach a much higher
performance level before they start showing issues.

> OTOH lots of competing accesses to same journal, inodes would be a
> limitation inherent to the FS.

It's likely there is tuning that could improve the XFS performance, but the
stats of the RBD devices show the latencies going up. There might be more
impact further up the stack, but the underlying device is where the change in
performance shows.

>
> Christian
>
> >
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Osama Hasebou
> > Sent: Wednesday, 16 August 2017 10:34 PM
> > To: n...@fisk.me.uk
> > Cc: ceph-users <ceph-users@lists.ceph.com>
> > Subject: Re: [ceph-users] VMware + Ceph using NFS sync/async ?
> >
> > Hi Nick,
> >
> > Thanks for replying! If Ceph is combined with Openstack then, does that
> mean that actually when openstack writes are happening, it is not fully sync'd
> (as in written to disks) before it starts receiving more data, so acting as 
> async
> ? In that scenario there is a chance for data loss if things go bad, i.e power
> outage or something like that ?
> >
> > As for the slow operations, reading is quite fine when I compare it to a SAN
> storage system connected to VMware. It is writing data, small chunks or big
> ones, that suffer when trying to use the sync option with FIO for
> benchmarking.
> >
> > In that case, I wonder, is no one using CEPH with VMware in a production
> environment ?
> >
> > Cheers.
> >
> > Regards,
> > Ossi
> >
> >
> >
> > Hi Osama,
> >
> > This is a known problem with many software defined storage stacks, but
> potentially slightly worse with Ceph due to extra overheads. Sync writes
> have to wait until all copies of the data are written to disk by the OSD and
> acknowledged back to the client. The extra network hops for replication and
> NFS gateways add significant latency which impacts the time it takes to carry
> out small writes. The Ceph code also takes time to process each IO request.
> >
> > What particular operations are you finding slow? Storage vmotions are just
> bad, and I don’t think there is much that can be done about them as they are
> split into lots of 64kb IO’s.
> >
> > One thing you can try is to force the CPU’s on your OSD nodes to run at C1
> cstate and force their minimum frequency to 100%. This can have quite a
> large impact on latency. Also you don’t specify your network, but 10G is a
> must.
> >
> > Nick
> >
> >
> > From: ceph-users [ma

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Christian Balzer

Hello,

On Thu, 17 Aug 2017 00:13:24 + Adrian Saul wrote:

> We are using Ceph on NFS for VMWare – we are using SSD tiers in front of SATA 
> and some direct SSD pools.  The datastores are just XFS file systems on RBD 
> managed by a pacemaker cluster for failover.
> 
> Lessons so far are that large datastores quickly run out of IOPS and compete 
> for performance – you are better off with many smaller RBDs (say 1TB) to 
> spread out workloads.  Also tuning up NFS threads seems to help.
> 
I'd be interested in details of this small versus large bit.

Would you say that the IOPS starvation is more an issue of the large
filesystem than the underlying Ceph/RBD?

With a cache-tier in place I'd expect all hot FS objects (inodes, etc) to
be there and thus be as fast as it gets from a Ceph perspective. 

OTOH lots of competing accesses to same journal, inodes would be a
limitation inherent to the FS.

Christian

> 
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
> Osama Hasebou
> Sent: Wednesday, 16 August 2017 10:34 PM
> To: n...@fisk.me.uk
> Cc: ceph-users <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] VMware + Ceph using NFS sync/async ?
> 
> Hi Nick,
> 
> Thanks for replying! If Ceph is combined with Openstack then, does that mean 
> that actually when openstack writes are happening, it is not fully sync'd (as 
> in written to disks) before it starts receiving more data, so acting as async 
> ? In that scenario there is a chance for data loss if things go bad, i.e 
> power outage or something like that ?
> 
> As for the slow operations, reading is quite fine when I compare it to a SAN 
> storage system connected to VMware. It is writing data, small chunks or big 
> ones, that suffer when trying to use the sync option with FIO for 
> benchmarking.
> 
> In that case, I wonder, is no one using CEPH with VMware in a production 
> environment ?
> 
> Cheers.
> 
> Regards,
> Ossi
> 
> 
> 
> Hi Osama,
> 
> This is a known problem with many software defined storage stacks, but 
> potentially slightly worse with Ceph due to extra overheads. Sync writes have 
> to wait until all copies of the data are written to disk by the OSD and 
> acknowledged back to the client. The extra network hops for replication and 
> NFS gateways add significant latency which impacts the time it takes to carry 
> out small writes. The Ceph code also takes time to process each IO request.
> 
> What particular operations are you finding slow? Storage vmotions are just 
> bad, and I don’t think there is much that can be done about them as they are 
> split into lots of 64kb IO’s.
> 
> One thing you can try is to force the CPU’s on your OSD nodes to run at C1 
> cstate and force their minimum frequency to 100%. This can have quite a large 
> impact on latency. Also you don’t specify your network, but 10G is a must.
> 
> Nick
> 
> 
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
> Osama Hasebou
> Sent: 14 August 2017 12:27
> To: ceph-users <ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>>
> Subject: [ceph-users] VMware + Ceph using NFS sync/async ?
> 
> Hi Everyone,
> 
> We started testing the idea of using Ceph storage with VMware, the idea was 
> to provide Ceph storage through open stack to VMware, by creating a virtual 
> machine coming from Ceph + Openstack , which acts as an NFS gateway, then 
> mount that storage on top of VMware cluster.
> 
> When mounting the NFS exports using the sync option, we noticed a huge 
> degradation in performance which makes it very slow to use in production. The 
> async option makes it much better, but then there is the risk that if a 
> failure happens, some data might be lost in that scenario.
> 
> Now I understand that some people in the ceph community are using Ceph with 
> VMware using NFS gateways, so if you can kindly shed some light on your 
> experience, and if you do use it in production purpose, that would be great 
> and how did you mitigate the sync/async options and keep write performance.
> 
> 
> Thanks you!!!
> 
> Regards,
> Ossi
> 
> 

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Adrian Saul

We are using Ceph on NFS for VMWare – we are using SSD tiers in front of SATA 
and some direct SSD pools.  The datastores are just XFS file systems on RBD 
managed by a pacemaker cluster for failover.

Lessons so far are that large datastores quickly run out of IOPS and compete 
for performance – you are better off with many smaller RBDs (say 1TB) to spread 
out workloads.  Also tuning up NFS threads seems to help.


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Osama 
Hasebou
Sent: Wednesday, 16 August 2017 10:34 PM
To: n...@fisk.me.uk
Cc: ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] VMware + Ceph using NFS sync/async ?

Hi Nick,

Thanks for replying! If Ceph is combined with Openstack then, does that mean 
that actually when openstack writes are happening, it is not fully sync'd (as 
in written to disks) before it starts receiving more data, so acting as async ? 
In that scenario there is a chance for data loss if things go bad, i.e power 
outage or something like that ?

As for the slow operations, reading is quite fine when I compare it to a SAN 
storage system connected to VMware. It is writing data, small chunks or big 
ones, that suffer when trying to use the sync option with FIO for benchmarking.

In that case, I wonder, is no one using CEPH with VMware in a production 
environment ?

Cheers.

Regards,
Ossi



Hi Osama,

This is a known problem with many software defined storage stacks, but 
potentially slightly worse with Ceph due to extra overheads. Sync writes have 
to wait until all copies of the data are written to disk by the OSD and 
acknowledged back to the client. The extra network hops for replication and NFS 
gateways add significant latency which impacts the time it takes to carry out 
small writes. The Ceph code also takes time to process each IO request.

What particular operations are you finding slow? Storage vmotions are just bad, 
and I don’t think there is much that can be done about them as they are split 
into lots of 64kb IO’s.

One thing you can try is to force the CPU’s on your OSD nodes to run at C1 
cstate and force their minimum frequency to 100%. This can have quite a large 
impact on latency. Also you don’t specify your network, but 10G is a must.

Nick


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Osama 
Hasebou
Sent: 14 August 2017 12:27
To: ceph-users <ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>>
Subject: [ceph-users] VMware + Ceph using NFS sync/async ?

Hi Everyone,

We started testing the idea of using Ceph storage with VMware, the idea was to 
provide Ceph storage through open stack to VMware, by creating a virtual 
machine coming from Ceph + Openstack , which acts as an NFS gateway, then mount 
that storage on top of VMware cluster.

When mounting the NFS exports using the sync option, we noticed a huge 
degradation in performance which makes it very slow to use in production. The 
async option makes it much better, but then there is the risk that if a 
failure happens, some data might be lost in that scenario.

Now I understand that some people in the ceph community are using Ceph with 
VMware using NFS gateways, so if you can kindly shed some light on your 
experience, and if you do use it in production purpose, that would be great and 
how did you mitigate the sync/async options and keep write performance.


Thanks you!!!

Regards,
Ossi




Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Nick Fisk
Hi Matt,

 

Well-behaved applications are the problem here. ESXi sends all writes as sync 
writes. So although guest OSs will still do their own buffering, any ESXi-level 
operation is done entirely as sync. This is probably most visible when 
migrating VMs between datastores: everything gets done as sync 64KB IOs, 
meaning copying a 1TB VM can often take nearly 24 hours.

 

Osama, can you describe the difference in performance you see between Openstack 
and ESXi, and what type of operations are these? Sync writes should be the same 
no matter the client, except in the NFS case you will have an extra network hop 
and potentially a little bit of PG congestion around the FS journal on the RBD 
device.

 

Osama, you can’t compare Ceph to a SAN. Just in terms of network latency you 
have an extra 2 hops. In an ideal scenario you might be able to get Ceph write 
latency down to 0.5-1ms for a 4KB IO, compared to about 0.1-0.3ms for a 
storage array. However, what you will find with Ceph is that other things start 
to increase this average long before you would start to see this on storage 
arrays. 

 

The migration is a good example of this. As I said, ESXi migrates a VM in 64KB 
IOs, but does 32 of these blocks in parallel at a time. On storage arrays, 
these 64KB IOs are coalesced in the battery-protected write cache into bigger 
IOs before being persisted to disk. The storage array can also accept all 32 
of these requests at once.

 

A similar thing happens in Ceph/RBD/NFS via the Ceph filestore journal, but 
that coalescing is now an extra 2 hops away, and with a bit of extra latency 
introduced by the Ceph code we are already a bit slower. But here’s the 
killer: PG locking!!! You can’t write 32 IOs in parallel to the same 
object/PG; each one has to be processed sequentially because of the locks. 
(Please someone correct me if I’m wrong here.) If your 64KB write latency is 
2ms, then you can only do 500 64KB IOs a second. 64KB * 500 = ~30MB/s, vs a 
storage array which would be doing the operation in the hundreds of MB/s range.

 

Note: When proper iSCSI for RBD support is finished, you might be able to use 
the VAAI offloads, which would dramatically increase performance for migrations 
as well.

 

Also once persistent SSD write caching for librbd becomes available, a lot of 
these problems will go away, as the SSD will behave like a storage array’s 
write cache and will only be 1 hop away from the client as well.

 

From: Matt Benjamin [mailto:mbenj...@redhat.com] 
Sent: 16 August 2017 14:49
To: Osama Hasebou <osama.hase...@csc.fi>
Cc: n...@fisk.me.uk; ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] VMware + Ceph using NFS sync/async ?

 

Hi Osama,

I don't have a clear sense of the application workflow here--and Nick 
appears to--but I thought it worth noting that NFSv3 and NFSv4 clients 
shouldn't normally need the sync mount option to achieve i/o stability with 
well-behaved applications.  In both versions of the protocol, an application 
write that is synchronous (or, more typically, the equivalent application sync 
barrier) should not succeed until an NFS-protocol COMMIT (or in some cases 
w/NFSv4, WRITE w/stable flag set) has been acknowledged by the NFS server.  If 
the NFS i/o stability model is insufficient for your workflow, moreover, I'd 
be worried that -osync writes (which might be incompletely applied during a 
failure event) may not be correctly enforcing your invariant, either.
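
A minimal sketch of what "well-behaved" means in practice, in Python; the path
is a placeholder for a file on an NFS-mounted datastore, and the point is only
that the stability guarantee comes from the application's own sync barrier
(fsync here, or an O_SYNC/O_DSYNC open), not from the sync mount option:

import os

def stable_write(path, data):
    """Write data and do not return until the NFS server has committed it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)   # may only dirty the NFS client's page cache
        os.fsync(fd)         # blocks until the client's COMMIT (or stable
                             # WRITE) has been acknowledged by the server
    finally:
        os.close(fd)

# hypothetical mount point, for illustration only
stable_write("/mnt/nfs-datastore/vmdata.tmp", b"x" * 65536)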

 

Matt

 

On Wed, Aug 16, 2017 at 8:33 AM, Osama Hasebou <osama.hase...@csc.fi 
<mailto:osama.hase...@csc.fi> > wrote:

Hi Nick,

 

Thanks for replying! If Ceph is combined with Openstack then, does that mean 
that actually when openstack writes are happening, it is not fully sync'd (as 
in written to disks) before it starts receiving more data, so acting as async ? 
In that scenario there is a chance for data loss if things go bad, i.e power 
outage or something like that ?

 

As for the slow operations, reading is quite fine when I compare it to a SAN 
storage system connected to VMware. It is writing data, small chunks or big 
ones, that suffer when trying to use the sync option with FIO for benchmarking.

 

In that case, I wonder, is no one using CEPH with VMware in a production 
environment ?

 

Cheers.

 

Regards,
Ossi

 

 

 

Hi Osama,

 

This is a known problem with many software defined storage stacks, but 
potentially slightly worse with Ceph due to extra overheads. Sync writes have 
to wait until all copies of the data are written to disk by the OSD and 
acknowledged back to the client. The extra network hops for replication and NFS 
gateways add significant latency which impacts the time it takes to carry out 
small writes. The Ceph code also takes time to process each IO request.

 

What particular operations are you finding slow? Storage vmotions are just bad, 
and I don’t think there is much that can be done about them as they are split

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Osama Hasebou
Hi Nick, 

Thanks for replying! If Ceph is combined with OpenStack, does that mean that 
when OpenStack writes are happening, the data is not fully synced (as in 
written to disk) before more data is accepted, i.e. it is acting as async? In 
that scenario there is a chance of data loss if things go bad, e.g. a power 
outage or something like that? 

As for the slow operations, reading is quite fine when I compare it to a SAN 
storage system connected to VMware. It is writing data, in small chunks or big 
ones, that suffers when trying to use the sync option with FIO for benchmarking. 

In that case, I wonder: is no one using Ceph with VMware in a production 
environment? 

Cheers. 

Regards, 
Ossi 







Hi Osama, 



This is a known problem with many software defined storage stacks, but 
potentially slightly worse with Ceph due to extra overheads. Sync writes have 
to wait until all copies of the data are written to disk by the OSD and 
acknowledged back to the client. The extra network hops for replication and NFS 
gateways add significant latency which impacts the time it takes to carry out 
small writes. The Ceph code also takes time to process each IO request. 



What particular operations are you finding slow? Storage vmotions are just bad, 
and I don’t think there is much that can be done about them as they are split 
into lots of 64kb IO’s. 



One thing you can try is to force the CPU’s on your OSD nodes to run at C1 
cstate and force their minimum frequency to 100%. This can have quite a large 
impact on latency. Also you don’t specify your network, but 10G is a must. 



Nick 




From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Osama 
Hasebou 
Sent: 14 August 2017 12:27 
To: ceph-users <ceph-users@lists.ceph.com> 
Subject: [ceph-users] VMware + Ceph using NFS sync/async ? 




Hi Everyone, 





We started testing the idea of using Ceph storage with VMware, the idea was to 
provide Ceph storage through open stack to VMware, by creating a virtual 
machine coming from Ceph + Openstack , which acts as an NFS gateway, then mount 
that storage on top of VMware cluster. 





When mounting the NFS exports using the sync option, we noticed a huge 
degradation in performance which makes it very slow to use in production. The 
async option makes it much better, but then there is the risk that if a 
failure happens, some data might be lost in that scenario. 





Now I understand that some people in the ceph community are using Ceph with 
VMware using NFS gateways, so if you can kindly shed some light on your 
experience, and if you do use it in production purpose, that would be great and 
how did you mitigate the sync/async options and keep write performance. 








Thanks you!!! 





Regards, 
Ossi 






Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-14 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Osama Hasebou
Sent: 14 August 2017 12:27
To: ceph-users <ceph-users@lists.ceph.com>
Subject: [ceph-users] VMware + Ceph using NFS sync/async ?

 

Hi Everyone,

 

We started testing the idea of using Ceph storage with VMware, the idea was
to provide Ceph storage through open stack to VMware, by creating a virtual
machine coming from Ceph + Openstack , which acts as an NFS gateway, then
mount that storage on top of VMware cluster.

 

When mounting the NFS exports using the sync option, we noticed a huge
degradation in performance which makes it very slow to use in production. The
async option makes it much better, but then there is the risk that if a
failure happens, some data might be lost in that scenario.

 

Now I understand that some people in the ceph community are using Ceph with
VMware using NFS gateways, so if you can kindly shed some light on your
experience, and if you do use it in production purpose, that would be great
and how did you mitigate the sync/async options and keep write performance.

 

 

Thanks you!!!

 

Regards,
Ossi

 

Hi Osama,

 

This is a known problem with many software defined storage stacks, but
potentially slightly worse with Ceph due to extra overheads. Sync writes
have to wait until all copies of the data are written to disk by the OSD and
acknowledged back to the client. The extra network hops for replication and
NFS gateways add significant latency which impacts the time it takes to
carry out small writes. The Ceph code also takes time to process each IO
request.

 

What particular operations are you finding slow? Storage vMotions are just
bad, and I don't think there is much that can be done about them, as they are
split into lots of 64KB IOs.

 

One thing you can try is to force the CPUs on your OSD nodes to stay in the C1
C-state and force their minimum frequency to 100%. This can have quite a
large impact on latency. Also, you don't specify your network, but 10G is a
must.
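
One way to approximate that on a Linux OSD node is sketched below in Python; it
assumes root, the usual cpufreq sysfs layout, and that holding
/dev/cpu_dma_latency open (the same mechanism tuned's latency profiles use) is
acceptable on your hosts. Treat it as an illustration of the idea rather than a
recommended procedure:

import glob
import os
import struct
import time

# 1) "Minimum frequency to 100%": raise scaling_min_freq to the maximum.
for cpu_dir in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq"):
    with open(os.path.join(cpu_dir, "cpuinfo_max_freq")) as f:
        max_freq = f.read().strip()
    with open(os.path.join(cpu_dir, "scaling_min_freq"), "w") as f:
        f.write(max_freq)

# 2) "Run at C1": cap the allowed wakeup latency via PM QoS. A few microseconds
#    (roughly the C1 exit latency on many Intel CPUs) disallows the deeper
#    C-states; the constraint only holds while this file descriptor stays open.
fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
os.write(fd, struct.pack("<I", 3))

try:
    while True:          # keep the fd (and the constraint) alive; Ctrl-C to stop
        time.sleep(60)
finally:
    os.close(fd)

Setting intel_idle.max_cstate=1 (or processor.max_cstate=1) on the kernel
command line is the more permanent way to achieve the same thing.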

 

Nick



[ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-14 Thread Osama Hasebou
Hi Everyone, 

We started testing the idea of using Ceph storage with VMware. The idea was to 
provide Ceph storage through OpenStack to VMware by creating a virtual machine 
backed by Ceph + OpenStack which acts as an NFS gateway, and then mounting that 
storage on the VMware cluster. 

When mounting the NFS exports using the sync option, we noticed a huge 
degradation in performance which makes it very slow to use in production. The 
async option makes it much better, but then there is the risk that if a failure 
happens, some data might be lost in that scenario. 

Now I understand that some people in the Ceph community are using Ceph with 
VMware via NFS gateways, so if you could kindly shed some light on your 
experience, whether you use it for production purposes, and how you mitigated 
the sync/async trade-off while keeping write performance, that would be great. 


Thank you!!! 

Regards, 
Ossi 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com