Thanks, I will try the file heat feature, but I am really not sure it would 
work, since the code can access cold files too, not only recently 
accessed/hot files.

With respect to LROC, let me explain below:

The use case is as follows:
As a first step, the code reads the headers (a small region of data) from 
thousands of files, for example about 30,000 of them, each about 300 MB to 
500 MB in size.
After that first step, guided by those headers, it mmaps/seeks across 
various regions of a set of files in parallel.
Since these are all small IOs and reading from GPFS over the network directly 
from disk was really slow, our idea was to use AFM, which I believe fetches 
all of a file's data into flash/SSDs once the initial few blocks of the file 
are read.
But again, AFM does not seem to solve the problem, so I want to know whether 
LROC behaves the same way as AFM, where all of the file data is prefetched at 
full block size using all the worker threads once a few blocks of the file 
have been read.
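For concreteness, here is a minimal sketch of the access pattern (the 4 KiB header size and the thread count are placeholders, not our real file format):

```python
import mmap
from concurrent.futures import ThreadPoolExecutor

HEADER_SIZE = 4096  # placeholder; the real header size is format-specific

def read_header(path):
    # Step 1: read only the first few KiB of each file.
    with open(path, "rb") as f:
        return f.read(HEADER_SIZE)

def read_region(path, offset, length):
    # Step 2: mmap the file and touch a region chosen from the header.
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return mm[offset:offset + length]

def scan_headers(paths, workers=32):
    # Thousands of these run concurrently, so every read is a small random IO.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_header, paths))
```

Each call touches only a small slice of a 300-500 MB file, which is why the backend sees small random IO rather than streaming reads.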

Thanks,
Lohit

On Feb 22, 2018, 4:52 PM -0500, IBM Spectrum Scale <sc...@us.ibm.com>, wrote:
> My apologies for not being more clear on the flash storage pool.  I meant 
> that this would be just another GPFS storage pool in the same cluster, so no 
> separate AFM cache cluster.  You would then use the file heat feature to 
> ensure more frequently accessed files are migrated to that all flash storage 
> pool.
>
> As for LROC, could you please clarify what you mean by a few headers/stubs of 
> the file?  In reading the LROC documentation and the LROC variables available 
> in the mmchconfig command, I think you might want to take a look at the 
> lrocDataStubFileSize variable, since it seems to apply to your situation.
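>
> (A hedged sketch of how that variable is set; the node class name 
> `lrocNodes` and the 32 KiB value are only illustrative, and you should 
> check the mmchconfig documentation for the exact semantics:)
>
> ```
> # Illustrative only: cache an initial stub of each file's data in LROC
> mmchconfig lrocDataStubFileSize=32 -N lrocNodes
> ```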
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum Scale 
> (GPFS), then please post it to the public IBM developerWorks Forum at 
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
> If your query concerns a potential software error in Spectrum Scale (GPFS) 
> and you have an IBM software maintenance contract please contact  
> 1-800-237-5511 in the United States or your local IBM Service Center in other 
> countries.
>
> The forum is informally monitored as time permits and should not be used for 
> priority messages to the Spectrum Scale (GPFS) team.
>
>
>
> From:        vall...@cbio.mskcc.org
> To:        gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Cc:        gpfsug-discuss-boun...@spectrumscale.org
> Date:        02/22/2018 04:21 PM
> Subject:        Re: [gpfsug-discuss] GPFS and Flash/SSD Storage tiered storage
> Sent by:        gpfsug-discuss-boun...@spectrumscale.org
>
>
>
> Thank you.
>
> I am sorry if I was not clear, but the metadata pool is all on SSDs in the 
> GPFS clusters that we use. It is just the data pool that is on near-line 
> rotating disks.
> I understand that AFM might not be able to solve the issue, and I will try 
> and see if file heat works for migrating the files to the flash tier.
> You mentioned an all-flash storage pool for heavily used files. Do you mean 
> a different GPFS cluster with just flash storage, with files copied to it 
> manually whenever needed?
> The IO performance I am talking about is predominantly for reads. Are you 
> saying LROC can work the way I want it to, that is, prefetch all the files 
> into the LROC cache after only a few headers/stubs of data are read from 
> those files?
> I thought LROC only keeps the blocks of data that are prefetched from 
> disk, and will not prefetch the whole file if a stub of data is read.
> Please do let me know if I have understood it wrong.
>
> On Feb 22, 2018, 4:08 PM -0500, IBM Spectrum Scale <sc...@us.ibm.com>, wrote:
> I do not think AFM is intended to solve the problem you are trying to solve.  
> If I understand your scenario correctly you state that you are placing 
> metadata on NL-SAS storage.  If that is true that would not be wise 
> especially if you are going to do many metadata operations.  I suspect your 
> performance issues are partially due to the fact that metadata is being 
> stored on NL-SAS storage.  You stated that you did not think the file heat 
> feature would do what you intended but have you tried to use it to see if it 
> could solve your problem?  I would think having metadata on SSD/flash storage 
> combined with an all-flash storage pool for your heavily used files would 
> perform well.  If you expect IO usage will be such that there will be far 
> more reads than writes then LROC should be beneficial to your overall 
> performance.
>
> Regards, The Spectrum Scale (GPFS) team
>
>
>
>
> From:        vall...@cbio.mskcc.org
> To:        gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date:        02/22/2018 03:11 PM
> Subject:        [gpfsug-discuss] GPFS and Flash/SSD Storage tiered storage
> Sent by:        gpfsug-discuss-boun...@spectrumscale.org
>
>
>
> Hi All,
>
> I am trying to figure out a GPFS tiering architecture for supercomputing, 
> with flash storage as the front end and near-line storage as the back end.
>
> The backend storage will be a GPFS system on near-line disks of about 
> 8-10 PB. The backend storage will/can be tuned to deliver large streaming 
> bandwidth, with enough metadata disks to make the stat of all these files 
> fast enough.
>
> I was wondering whether it would be possible to use a GPFS flash or SSD 
> cluster as a front end that uses AFM and acts as a cache cluster with the 
> backend GPFS cluster.
>
> At the end of this, the workflow that I am targeting is one where:
>
>
> “
> If the compute nodes read the headers of thousands of large files ranging 
> from 100 MB to 1 GB, the AFM cluster should be able to bring up enough 
> threads to pull all of those files from the backend to the faster SSD/flash 
> GPFS cluster.
> The working set at any one time might be about 100 TB, which I want on the 
> faster, lower-latency tier, with the rest of the files staying on the slower 
> tier until they are read by the compute nodes.
> ”
>
>
> The reason I do not want to use GPFS policies to achieve the above is that 
> I am not sure policies can be written so that files are moved from the 
> slower tier to the faster tier depending on how the jobs interact with the 
> files.
> I know that policies can be written based on file heat and size/format, but 
> I do not think those policies work in the way described above.
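>
> (For context, a heat-based migration rule, with hypothetical pool names, 
> would look roughly like the following per the GPFS policy rule syntax; it 
> moves files only after they have become hot, not on first access:)
>
> ```
> RULE 'repack' MIGRATE FROM POOL 'nlsas'
>   THRESHOLD(90,80) WEIGHT(FILE_HEAT)
>   TO POOL 'flash'
> ```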
>
> I did try the above architecture, where an SSD GPFS cluster acts as an AFM 
> cache cluster in front of the near-line storage. However, the AFM cluster 
> was really slow; it took a few hours to copy the files from the near-line 
> storage to the AFM cache cluster.
> I am not sure whether AFM is not designed to work this way, or whether AFM 
> is simply not tuned to work as fast as it should.
>
> I have tried LROC too, but it does not behave the same way as I assume AFM 
> does.
>
> Has anyone tried, or does anyone know, whether GPFS supports an architecture 
> where the fast tier can bring up thousands of threads and copy files almost 
> instantly/asynchronously from the slow tier whenever the jobs from compute 
> nodes read a few blocks of those files?
> I understand that, with respect to hardware, the AFM cluster should be 
> really fast, as should the network between the AFM cluster and the backend 
> cluster.
>
> Please do also let me know if the above workflow can be done using GPFS 
> policies and be as fast as it needs to be.
>
> Regards,
> Lohit
>
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
