Re: [ceph-users] anyone using CephFS for HPC?

2015-06-15 Thread Barclay Jameson
I am currently implementing Ceph into our HPC environment to handle
SAS temp workspace.
I am starting out with 3 OSD nodes and 1 MON/MDS node.
Each OSD node has 16 4TB HDDs and 4 120GB SSDs.
Each node has a 40Gb Mellanox interconnect to a Mellanox switch.
Each client node has 10Gb to the switch.
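
(Illustrative sketch only, not the exact commands used here: if the four SSDs
per node serve as FileStore journal devices, with four HDD OSDs sharing each
SSD, a ceph-deploy layout of that era would look roughly like the following.
Hostnames, device names and partition numbers are placeholders.)

    # Sketch of one OSD node: 16 HDD OSDs, each journaling to a partition on
    # one of the four 120GB SSDs (4 HDDs per SSD). All names are placeholders.
    ceph-deploy osd create osd1:sdb:/dev/sdq1
    ceph-deploy osd create osd1:sdc:/dev/sdq2
    ceph-deploy osd create osd1:sdd:/dev/sdq3
    ceph-deploy osd create osd1:sde:/dev/sdq4
    # ...and so on for the remaining 12 HDDs, cycling through the other three SSDs.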

I have not done comparisons to Lustre but I have done comparisons to
PanFS which we currently use in production.
I have found that for most workflows Ceph is comparable to PanFS, if not
better; however, PanFS still does better with small IO due to how it
caches small files.
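
(If you want to reproduce the small-IO comparison yourself, that difference is
the kind of thing a metadata/small-file benchmark such as mdtest will show.
The sketch below is generic rather than our exact test; the process count,
file count, write size and path are placeholders.)

    # Generic small-file / metadata benchmark against a CephFS mount.
    # 16 MPI processes, 1000 files each, 4KB written per file; all values are placeholders.
    mpirun -np 16 mdtest -i 3 -n 1000 -w 4096 -d /mnt/cephfs/mdtest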
If you want I can give you some hard numbers.

almightybeeij

On Fri, Jun 12, 2015 at 12:31 AM, Nigel Williams
nigel.d.willi...@gmail.com wrote:
 Wondering if anyone has done comparisons between CephFS and other
 parallel filesystems like Lustre typically used in HPC deployments
 either for scratch storage or persistent storage to support HPC
 workflows?

 thanks.


Re: [ceph-users] anyone using CephFS for HPC?

2015-06-15 Thread Shinobu Kinjo
Thanks for your info.
I would like to know how large the I/O you mentioned was, and what kind of
application you used for the benchmarking.

Sincerely,
Kinjo

On Tue, Jun 16, 2015 at 12:04 AM, Barclay Jameson almightybe...@gmail.com
wrote:

 I am currently implementing Ceph into our HPC environment to handle
 SAS temp workspace.
 I am starting out with 3 OSD nodes and 1 MON/MDS node.
 Each OSD node has 16 4TB HDDs and 4 120GB SSDs.
 Each node has a 40Gb Mellanox interconnect to a Mellanox switch.
 Each client node has 10Gb to the switch.

 I have not done comparisons to Lustre but I have done comparisons to
 PanFS which we currently use in production.
 I have found that for most workflows Ceph is comparable to PanFS, if not
 better; however, PanFS still does better with small IO due to how it
 caches small files.
 If you want I can give you some hard numbers.

 almightybeeij

 On Fri, Jun 12, 2015 at 12:31 AM, Nigel Williams
 nigel.d.willi...@gmail.com wrote:
  Wondering if anyone has done comparisons between CephFS and other
  parallel filesystems like Lustre typically used in HPC deployments
  either for scratch storage or persistent storage to support HPC
  workflows?
 
  thanks.




-- 
Life w/ Linux http://i-shinobu.hatenablog.com/


Re: [ceph-users] anyone using CephFS for HPC?

2015-06-14 Thread Nigel Williams

On 12/06/2015 3:41 PM, Gregory Farnum wrote:

...  and the test evaluation was on repurposed Lustre
hardware so it was a bit odd, ...


Agree, it was old (at least by now) DDN kit (SFA10K?) and not ideally suited for Ceph 
(really high OSD per host ratio).



Sage's thesis or some of the earlier papers will be happy to tell you
all the ways in which Ceph > Lustre, of course, since creating a
successor is how the project started. ;)
-Greg


Thanks Greg, yes those original documents have been well-thumbed; but I was hoping someone 
had done a more recent comparison given the significant improvements over the last couple 
of Ceph releases.


My superficial poking about in Lustre doesn't reveal to me anything particularly 
compelling in the design or typical deployments that would magically yield 
higher-performance than an equally well tuned Ceph cluster. Blair Bethwaite commented that 
Lustre client-side write caching might be more effective than CephFS at the moment.






Re: [ceph-users] anyone using CephFS for HPC?

2015-06-14 Thread Mark Nelson



On 06/14/2015 06:53 PM, Nigel Williams wrote:

On 12/06/2015 3:41 PM, Gregory Farnum wrote:

...  and the test evaluation was on repurposed Lustre
hardware so it was a bit odd, ...


Agree, it was old (at least by now) DDN kit (SFA10K?) and not ideally
suited for Ceph (really high OSD per host ratio).


FWIW, I did most of the performance work on the Ceph side for that 
paper.  Let me know if you are interested in any of the details.  It was 
definitely not ideal, though in the end we did relatively well I think. 
 Ultimately the lack of SSD journals hurt us as we hit the IB limit to 
the SFA10K long before we hit the disk limits, and we were topping out 
at about 6-8GB/s for sequential reads when we should have been able to 
hit 12GB/s.  We have seen some cases where filestore doesn't do large 
reads as quickly as you'd think (newstore seems to do better).
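
(As a rough illustration of how such sequential numbers are usually gathered,
not the actual harness from the paper: the pool name, object size and run
length below are placeholders.)

    # Generic large-object sequential read test against a throwaway pool.
    rados bench -p benchpool 60 write -b 4194304 --no-cleanup   # write 4MB objects first
    rados bench -p benchpool 60 seq                             # read them back sequentially
    rados -p benchpool cleanup                                  # remove the benchmark objects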


The big things that took a lot of effort to figure out during this 
testing were:


- General strangeness.
- Cache mirroring on the SFA10K *really* hurting performance with Ceph
(not sure why it didn't hurt Lustre as badly).
- Back around kernel 3.6 there were some nasty VM compaction issues that
caused major performance problems (a rough way to check for these is
sketched below).
- Somewhat strange mdtest results.  Probably just issues in the MDS back
then.
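
Not from the original test notes, but if anyone wants a quick (and crude)
check for that compaction problem on their own OSD nodes, something like the
following shows whether the kernel is stalling on memory compaction; the
counter names in /proc/vmstat vary a little between kernel versions.

    # Crude check for memory-compaction pressure on a running OSD node.
    grep -E 'compact_(stall|fail|success)' /proc/vmstat

    # Transparent hugepages were a common trigger for compaction storms;
    # the current mode is visible here:
    cat /sys/kernel/mm/transparent_hugepage/enabled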





Sage's thesis or some of the earlier papers will be happy to tell you
all the ways in which Ceph > Lustre, of course, since creating a
successor is how the project started. ;)
-Greg


Thanks Greg, yes those original documents have been well-thumbed; but I
was hoping someone had done a more recent comparison given the
significant improvements over the last couple of Ceph releases.

My superficial poking about in Lustre doesn't reveal to me anything
particularly compelling in the design or typical deployments that would
magically yield higher-performance than an equally well tuned Ceph
cluster. Blair Bethwaite commented that Lustre client-side write caching
might be more effective than CephFS at the moment.


I suspect the big things are:

- Lustre doesn't do asynchronous replication (relies on hardware RAID).
- Lustre may have more tuning issues worked out.
- Lustre doesn't (last I checked) do full data journaling.

Frankly, a well-tuned Lustre configuration is going to do pretty well for
large sequential IO.  That's pretty much its bread and butter.  At
least historically it has not been great at small random IO, and most
Lustre setups use some kind of STONITH setup for node outages, which is
obviously not nearly as nice as how Ceph handles failures.
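
(If you want to see the two regimes side by side on your own cluster, a
generic pair of fio runs against a CephFS mount looks something like the
following; the path, sizes and job count are placeholders, not anything we ran
for the paper.)

    # Large streaming sequential reads (the workload Lustre is built around).
    fio --name=seq-read --directory=/mnt/cephfs/bench --rw=read --bs=4m \
        --size=8g --numjobs=4 --ioengine=libaio --direct=1 --group_reporting

    # Small random reads (historically the harder case).
    fio --name=rand-read --directory=/mnt/cephfs/bench --rw=randread --bs=4k \
        --size=8g --numjobs=4 --ioengine=libaio --direct=1 --group_reporting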








[ceph-users] anyone using CephFS for HPC?

2015-06-11 Thread Nigel Williams
Wondering if anyone has done comparisons between CephFS and other
parallel filesystems like Lustre typically used in HPC deployments
either for scratch storage or persistent storage to support HPC
workflows?

thanks.


Re: [ceph-users] anyone using CephFS for HPC?

2015-06-11 Thread Gregory Farnum
On Thu, Jun 11, 2015 at 10:31 PM, Nigel Williams
nigel.d.willi...@gmail.com wrote:
 Wondering if anyone has done comparisons between CephFS and other
 parallel filesystems like Lustre typically used in HPC deployments
 either for scratch storage or persistent storage to support HPC
 workflows?

Oak Ridge had a paper at Supercomputing a couple years ago about this
from their perspective. I don't remember how many of its concerns are
still up-to-date, and the test evaluation was on repurposed Lustre
hardware so it was a bit odd, but it might give you some stuff to
think about.
Sage's thesis or some of the earlier papers will be happy to tell you
all the ways in which Ceph > Lustre, of course, since creating a
successor is how the project started. ;)
-Greg