Re: [lustre-discuss] Do I need Lustre?

2018-04-27 Thread Philippe Weill



On 27/04/2018 at 19:07, Thackeray, Neil L wrote:
> I’m new to the cluster realm, so I’m hoping for some good advice. We are starting up a new cluster, and I’ve noticed that Lustre
> seems to be used widely in datacenters. The thing is I’m not sure the scale of our cluster will need it.
>
> We are planning a small cluster, starting with 6-8 nodes with 2 GPUs per node. They will be used for Deep Learning, MRI data
> processing, and Matlab among other things. With the size of the cluster we figure that 10Gb networking will be sufficient. We aren’t
> going to allow persistent storage on the cluster. Users will just upload and download data. I’m mostly concerned about I/O speeds. I
> don’t know if NFS would be fast enough to handle the data.
>
> We are hoping that the cluster will grow over time. We are already talking
> about buying more nodes next fiscal year.
>
> Thanks.



Hello,

You didn't say how much filesystem capacity you need or whether you expect to
grow fast.
We also run a small cluster (20 nodes), but for climate modeling results and
satellite atmospheric data analysis. We are growing by at least 300 TB per year
(2 PB now), and it is easier for us to grow with Lustre.


--
Weill Philippe -  System and Network Administrator
CNRS/UPMC/IPSL   LATMOS (UMR 8190)
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Do I need Lustre?

2018-04-27 Thread Patrick Farrell
One factor is budget: Lustre is probably the higher-budget option, in terms of 
both hardware and time investment.  I would guess that at the 6-8 node range 
you don't need its speed, though you might need at least one other trick it 
has:

One thing Lustre gives that NFS does not is the ability for multiple nodes to 
write to the same file in parallel while maintaining consistency.  It's a 
clustered/parallel file system, not just a network file system.  Some codes 
require this if you want to run them across multiple nodes.
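
To make the access pattern concrete, here is a minimal sketch of several 
writers filling disjoint regions of one shared file. It is only an 
illustration: the names (CHUNK, write_chunk, parallel_write) are made up for 
this example, and threads on a single machine stand in for processes on 
separate nodes. On a real cluster the writers would run on different nodes 
(for example through MPI-IO), and whether the result stays consistent depends 
on the file system underneath, which is the point made above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096  # bytes per writer (arbitrary size chosen for the illustration)


def write_chunk(path, rank):
    """Writer `rank` fills its own region of the shared file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite takes an explicit offset, so writers never share a file position.
        os.pwrite(fd, bytes([65 + rank]) * CHUNK, rank * CHUNK)
    finally:
        os.close(fd)


def parallel_write(path, writers=4):
    """Pre-size the file, then let every writer fill its region concurrently."""
    with open(path, "wb") as f:
        f.truncate(writers * CHUNK)
    with ThreadPoolExecutor(max_workers=writers) as pool:
        # list() forces completion and re-raises any exception from a writer.
        list(pool.map(lambda rank: write_chunk(path, rank), range(writers)))
```

Calling parallel_write("/path/to/shared/file") leaves a file in which region 0 
holds "A" bytes, region 1 holds "B" bytes, and so on. Reading it back from 
another client is a quick way to see whether every writer's data arrived 
intact.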

You might start by setting up whatever seems "easy" to you, probably an NFS 
share of a storage appliance you've already got, and then see what happens.  If 
users are happy and you don't seem to be spending a lot of time doing I/O, then 
you're probably OK.  If not, Lustre is more work, but you do get something for 
your labors. :)




Re: [lustre-discuss] Do I need Lustre?

2018-04-27 Thread Brett Lee
Hi Neil,

One of the considerations in using Lustre should be the I/O patterns of
your applications.  Lustre excels with large, sequential reads and writes.

Another is the cost, including hardware, software, support, and coming up
to speed with Lustre.  These components interact.  For example, having
professional support helps with coming up to speed on Lustre. :)

Hey Michael!


On Fri, Apr 27, 2018, 12:22 PM Hebenstreit, Michael <
michael.hebenstr...@intel.com> wrote:

> You can do a simple test. Run a small sample of your application directly
> out of /dev/shm (the ram-disk). Then run it from the NFS file server. If
> you measure significant speedups your application is I/O sensitive and a
> Lustre configured with OPA or other InfiniBand solution will help.


Re: [lustre-discuss] Do I need Lustre?

2018-04-27 Thread Hebenstreit, Michael
You can do a simple test. Run a small sample of your application directly out of 
/dev/shm (the RAM disk), then run it again from the NFS file server. If the 
/dev/shm run is significantly faster, your application is I/O sensitive, and 
Lustre configured with OPA or an InfiniBand solution will help.
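
The test described above could be scripted roughly as follows. This is a 
sketch, not a benchmark tool: the helper names (time_run, io_sensitivity) are 
made up, and it assumes the application reads and writes relative to its 
working directory, so that changing the working directory changes where the 
I/O goes.

```python
import subprocess
import time


def time_run(cmd, workdir):
    """Run `cmd` with `workdir` as its working directory; return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=workdir, check=True)
    return time.perf_counter() - start


def io_sensitivity(cmd, ram_dir, nfs_dir):
    """Time the same command in both directories.

    Returns (ram_seconds, nfs_seconds, nfs_seconds / ram_seconds).
    A ratio well above 1 suggests the run is limited by I/O.
    """
    t_ram = time_run(cmd, ram_dir)
    t_nfs = time_run(cmd, nfs_dir)
    return t_ram, t_nfs, t_nfs / t_ram
```

A call might look like io_sensitivity(["./my_app", "sample.dat"], 
"/dev/shm/test", "/mnt/nfs/test"), where the command and both paths are 
placeholders for your own setup, with the sample input copied into each 
directory first. Repeat the runs a few times, since client-side caching can 
make a single NFS measurement look better than it is.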



[lustre-discuss] Do I need Lustre?

2018-04-27 Thread Thackeray, Neil L
I'm new to the cluster realm, so I'm hoping for some good advice. We are 
starting up a new cluster, and I've noticed that Lustre seems to be used widely 
in datacenters. The thing is I'm not sure the scale of our cluster will need it.

We are planning a small cluster, starting with 6-8 nodes with 2 GPUs per node. 
They will be used for Deep Learning, MRI data processing, and Matlab among 
other things. With the size of the cluster we figure that 10Gb networking will 
be sufficient. We aren't going to allow persistent storage on the cluster. 
Users will just upload and download data. I'm mostly concerned about I/O 
speeds. I don't know if NFS would be fast enough to handle the data.
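
As a rough way to frame the question, the back-of-the-envelope sketch below 
estimates the bandwidth each node would see if all nodes read from one file 
server at once. It assumes a single server with one 10Gb link shared evenly 
and ignores protocol overhead and disk speed; the function name and the 
efficiency parameter are made up for this illustration.

```python
def per_node_mb_per_s(link_gbit, nodes, efficiency=1.0):
    """Even share of one server link per node, in MB/s (decimal units).

    link_gbit  -- server link speed in Gbit/s
    nodes      -- number of nodes doing I/O at the same time
    efficiency -- assumed fraction of line rate actually achieved (0..1)
    """
    link_mb = link_gbit * 1000 / 8  # Gbit/s -> MB/s
    return link_mb * efficiency / nodes


print(per_node_mb_per_s(10, 8))  # 156.25
```

Under those assumptions, 8 nodes sharing one 10Gb server link get about 156 
MB/s each at best. Whether that is enough depends on how fast the 
applications consume data.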

We are hoping that the cluster will grow over time. We are already talking 
about buying more nodes next fiscal year.

Thanks.