Thank you. I am sorry if I was not clear: the metadata pool is entirely on SSDs in the GPFS clusters that we use; it is only the data pool that sits on near-line rotating disks. I understand that AFM might not be able to solve the issue, and I will try file heat to see whether it works for migrating files to a flash tier (I have put a rough policy sketch below, and a rough AFM prefetch sketch after the quoted thread, to show what I have in mind).

You mentioned an all-flash storage pool for heavily used files. Do you mean a different GPFS cluster with just flash storage, with the files copied to it manually whenever needed?

The IO performance I am talking about is predominantly for reads. Are you saying that LROC can work the way I want it to, that is, prefetch entire files into the LROC cache after only a few headers/stubs of data have been read from those files? I thought LROC only keeps the blocks of data that have actually been fetched from disk, and will not prefetch the whole file when just a stub of it is read. Please do let me know if I have understood it wrong.
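For what it is worth, the file heat approach I plan to test looks roughly like the sketch below. This is only a sketch under my own assumptions: the pool names 'flash' and 'nlsas' and the device name 'gpfsdev' are placeholders for our setup, the heat tunables are example values, and the GROUP POOL / WEIGHT(FILE_HEAT) repack idiom is how I understand the documentation, so please correct me if this is not how you meant the file heat feature to be used:

    # enable file heat tracking on the cluster (example values only)
    mmchconfig fileHeatPeriodMinutes=1440,fileHeatLossPercent=10

    # contents of heat.pol -- repack files across the two tiers by heat
    /* hottest files fill 'flash' up to ~90% occupancy, the rest stay on 'nlsas' */
    RULE 'definetiers' GROUP POOL 'tiers' IS 'flash' LIMIT(90) THEN 'nlsas'
    RULE 'repack'      MIGRATE FROM POOL 'tiers' TO POOL 'tiers' WEIGHT(FILE_HEAT)

    # run (or schedule) the repack against the filesystem
    mmapplypolicy gpfsdev -P heat.pol -I yes

My concern is that this would run on a schedule rather than react to individual jobs, which is part of why I was not sure the policy/file-heat route reacts quickly enough to how our jobs touch files.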
On Feb 22, 2018, 4:08 PM -0500, IBM Spectrum Scale <sc...@us.ibm.com>, wrote:
> I do not think AFM is intended to solve the problem you are trying to solve. If I understand your scenario correctly you state that you are placing metadata on NL-SAS storage. If that is true that would not be wise, especially if you are going to do many metadata operations. I suspect your performance issues are partially due to the fact that metadata is being stored on NL-SAS storage. You stated that you did not think the file heat feature would do what you intended, but have you tried to use it to see if it could solve your problem? I would think having metadata on SSD/flash storage combined with an all-flash storage pool for your heavily used files would perform well. If you expect IO usage will be such that there will be far more reads than writes, then LROC should be beneficial to your overall performance.
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
> If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract, please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
>
> The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
>
> From: vall...@cbio.mskcc.org
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 02/22/2018 03:11 PM
> Subject: [gpfsug-discuss] GPFS and Flash/SSD Storage tiered storage
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
>
> Hi All,
>
> I am trying to figure out a GPFS tiering architecture with flash storage in the front end and near-line storage as the backend, for supercomputing.
>
> The backend storage will be a GPFS storage on near-line disks of about 8-10PB. The backend storage will/can be tuned to give out large streaming bandwidth and enough metadata disks to make the stat of all these files fast enough.
>
> I was thinking whether it would be possible to use a GPFS flash cluster or GPFS SSD cluster in the front end that uses AFM and acts as a cache cluster with the backend GPFS cluster.
>
> At the end of this, the workflow that I am targeting is one where:
>
> "If the compute nodes read headers of thousands of large files ranging from 100MB to 1GB, the AFM cluster should be able to bring up enough threads to bring all of the files from the backend to the faster SSD/Flash GPFS cluster. The working set might be about 100T at a time, which I want to be on a faster/low-latency tier, and the rest of the files to stay on the slower tier until they are read by the compute nodes."
>
> The reason I do not want to use GPFS policies to achieve the above is that I am not sure whether policies could be written in a way that files are moved from the slower tier to the faster tier depending on how the jobs interact with the files. I know that policies can be written depending on heat and size/format, but I don't think these policies work in a similar way to the above.
>
> I did try the above architecture, where an SSD GPFS cluster acts as an AFM cache cluster in front of the near-line storage. However, the AFM cluster was really, really slow; it took about a few hours to copy the files from the near-line storage to the AFM cache cluster. I am not sure if AFM is not designed to work this way, or if AFM is not tuned to work as fast as it should.
>
> I have tried LROC too, but it does not behave the same way as I guess AFM works.
>
> Has anyone tried, or does anyone know, whether GPFS supports an architecture where the fast tier can bring up thousands of threads and copy the files almost instantly/asynchronously from the slow tier, whenever the jobs from compute nodes read a few blocks from these files? I understand that with respect to hardware, the AFM cluster should be really fast, as well as the network between the AFM cluster and the backend cluster.
>
> Please do also let me know if the above workflow can be done using GPFS policies and be as fast as it needs to be.
>
> Regards,
> Lohit
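In case we do end up retrying the AFM cache cluster described in the quoted thread above, the other thing I am considering is driving the prefetch explicitly from the job scheduler instead of waiting for on-demand reads. A rough sketch of what I mean, with everything hypothetical: 'gpfs1' as the cache filesystem, 'scratch_cache' as the AFM fileset, and a job-generated file list:

    # list of files the next job will touch, one path per line
    # (in practice this would come from the workflow manager)
    cat > /tmp/job.filelist <<EOF
    /gpfs1/scratch_cache/dataset/file_0001.dat
    /gpfs1/scratch_cache/dataset/file_0002.dat
    EOF

    # ask AFM to pull those files into the cache cluster ahead of the job
    mmafmctl gpfs1 prefetch -j scratch_cache --list-file /tmp/job.filelist

    # check the fileset state while the prefetch queue drains
    mmafmctl gpfs1 getstate -j scratch_cache

If I understand the documentation correctly, the parallel read tunables on the gateway nodes (afmParallelReadThreshold and afmParallelReadChunkSize, if I remember the names right) are also worth checking, since poor gateway parallelism may be part of why our first attempt was so slow.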