Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Sven Oehme
>> -- started at 09/07/2018 06:54:54 --
>>
>> mdtest-1.9.3 was launched with 40 total task(s) on 20 node(s)
>> Command line used: mdtest -n 25000 -i 3 -u -d /homebrewed/gh24_4m_4m/mdtest
>> Path: /homebrewed/gh24_4m_4m
>> FS: 10.0 TiB   Used FS: 0.0%   Inodes: 12.0 Mi   Used Inodes: 2.3%
>>
>> 40 tasks, 100 files/directories
>>
>> SUMMARY: (of 3 iterations)
>>   Operation             Max            Min           Mean        Std Dev
>>   ---------             ---            ---           ----        -------
>>   Directory creation:    449160.409     430869.822     437002.187     8597.272
>>   Directory stat    :   6664420.560    5785712.544    6324276.731   385192.527
>>   Directory removal :    398360.058     351503.369     371630.648    19690.580
>>   File creation     :    288985.217     270550.129     279096.800     7585.659
>>   File stat         :   6720685.117    6641301.499    6674123.407    33833.182
>>   File read         :   3055661.372    2871044.881    2945513.966    79479.638
>>   File removal      :    215187.602     146639.435     179898.441    28021.467
>>   Tree creation     :        10.215          3.165          6.603        2.881
>>   Tree removal      :         5.484          0.880          2.418        2.168
>>
>> -- finished at 09/07/2018 06:55:42 --
>>
>>
>>
>>
>> Mit freundlichen Grüßen / Kind regards
>>
>>
>> Olaf Weiser
>>
>> EMEA Storage Competence Center Mainz, German / IBM Systems, Storage
>> Platform,
>>
>> ---
>> IBM Deutschland
>> IBM Allee 1
>> 71139 Ehningen
>> Phone: +49-170-579-44-66
>> E-Mail: olaf.wei...@de.ibm.com
>>
>> -----------------------
>> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
>> Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
>> Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
>> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
>> HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>>
>>
>>
>> From:     "Andrew Beattie"
>> To:       gpfsug-discuss@spectrumscale.org
>> Date:     09/21/2018 02:34 AM
>> Subject:  Re: [gpfsug-discuss] Metadata with GNR code
>> Sent by:  gpfsug-discuss-boun...@spectrumscale.org
>> --
>>
>>
>>
>> Simon,
>>
>> My recommendation is still very much to use SSD for Metadata and NL-SAS
>> for data and
>> the GH14 / GH24 Building blocks certainly make this much easier.
>>
>> Unless your filesystem is massive (Summit sized) you will typically still
>> continue to benefit from the Random IO performance of SSD (even RI SSD) in
>> comparison to NL-SAS.
>>
>> It still makes more sense to me to continue to use 2 copy or 3 copy for
>> Metadata even in ESS / GNR style environments.  The read performance for
>> metadata using 3copy is still significantly better than any other scenario.
>>
>> As with anything there are exceptions to the rule, but my experiences
>> with ESS and ESS with SSD so far still maintain that the standard thoughts
>> on managing Metadata and Small file IO remain the same -- even with the
>> improvements around sub blocks with Scale V5.
>>
>> MDtest is still the typical benchmark for this comparison and MDTest
>> shows some very clear differences  even on SSD when you use a large
>> filesystem block size with more sub blocks vs a smaller block size with
>> 1/32 subblocks
>>
>> This only gets worse if you change the storage media from SSD to NL-SAS
>> Andrew Beattie
>> Software Defined Storage - IT Specialist
>> Phone: 614-2133-7927
>> E-mail: abeat...@au1.ibm.com
>>
>>
>> - Original message -
>> From: Simon Thompson 
>> Sent by: gpfsug-discuss-boun...@spectrumscale.org
>> To: "gpfsug-discuss@spectrumscale.org" 
>> Cc:
>> Subject: [gpfsug-discuss] Metadata with GNR code
>> Date: Fri, Sep 21, 2018 3:29 AM
>>
>> Just wondering if anyone has any strong views/recommendations with
>> metadata when using GNR code?
>>
>>
>>
>> I know in “san” based GPFS, there is a recommendation to have data and
>> metadata split with the metadata on SSD.
>>
>>
>>
>> I’ve also heard that with GNR there isn’t much difference in splitting
>> data and metadata.

Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Jan-Frode Myklebust
That reminds me of a point Sven made when I was trying to optimize mdtest
results with metadata on FlashSystem... He sent me the following:

-- started at 11/15/2015 15:20:39 --
mdtest-1.9.3 was launched with 138 total task(s) on 23 node(s)
Command line used: /ghome/oehmes/mpi/bin/mdtest-pcmpi9131-existingdir -d
/ibm/fs2-4m-02/shared/mdtest-ec -i 1 -n 7 -F -i 1 -w 0 -Z -u
Path: /ibm/fs2-4m-02/shared
FS: 32.0 TiB   Used FS: 6.7%   Inodes: 145.4 Mi   Used Inodes: 22.0%
138 tasks, 966 files
SUMMARY: (of 1 iterations)
   Operation             Max            Min           Mean        Std Dev
   ---------             ---            ---           ----        -------
   File creation     :    650440.486     650440.486     650440.486        0.000
   File stat         :  23599134.618   23599134.618   23599134.618        0.000
   File read         :   2171391.097    2171391.097    2171391.097        0.000
   File removal      :   1007566.981    1007566.981    1007566.981        0.000
   Tree creation     :         3.072          3.072          3.072        0.000
   Tree removal      :         1.471          1.471          1.471        0.000
-- finished at 11/15/2015 15:21:10 --

from a GL6 -- only spinning disks -- pointing out that mdtest doesn't
really require Flash/SSD. The keys to good results are:

a) a large GPFS log (mmchfs -L 128m)

b) a high maxFilesToCache (you need to be able to cache all entries, so for
10 million files across 20 nodes you need at least 750k per node)

c) a fast network -- that's key to handling the token requests and metadata
operations that need to go over the network.
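
A minimal sketch of the settings in (a) and (b), assuming a file system named fs1 and a node class named clientNodes (both hypothetical placeholders); verify the exact behaviour against the documentation for the Scale level in use:

    # (a) larger internal GPFS log file for the file system
    mmchfs fs1 -L 128m

    # (b) raise maxFilesToCache on the nodes driving the metadata workload;
    #     this parameter is not applied dynamically -- restart GPFS
    #     (mmshutdown/mmstartup) on those nodes for it to take effect
    mmchconfig maxFilesToCache=750000 -N clientNodes

    # check the resulting values
    mmlsfs fs1 -L
    mmlsconfig maxFilesToCache

Point (c) is a fabric matter rather than a configuration knob: token and metadata traffic consists of many small RPCs, so latency matters more than raw bandwidth.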



  -jf
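
For anyone reproducing these runs, a hedged reading of the mdtest flags used in this thread, based on the mdtest 1.9.x help text (the mpirun launcher and task count shown are assumptions; check the -h output of your build):

    # e.g. Olaf's run quoted below: 40 MPI tasks across 20 nodes, 3 iterations
    mpirun -np 40 mdtest -n 25000 -i 3 -u -d /homebrewed/gh24_4m_4m/mdtest
    #   -n 25000   files/directories each task creates, stats and removes per iteration
    #   -i 3       number of iterations
    #   -u         unique working directory per task
    #   -d <dir>   directory in which the test runs
    #   -F         (GL6 run above) test files only, no directories
    #   -w 0       (GL6 run above) bytes written to each file after creation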

On Fri, Sep 21, 2018 at 10:22 AM Olaf Weiser  wrote:

> see a mdtest for a default block size file system ...
> 4 MB blocksize..
> mdata is on SSD
> data is on HDD   ... which is not really relevant for this mdtest ;-)
>
>
> -- started at 09/07/2018 06:54:54 --
>
> mdtest-1.9.3 was launched with 40 total task(s) on 20 node(s)
> Command line used: mdtest -n 25000 -i 3 -u -d
> /homebrewed/gh24_4m_4m/mdtest
> Path: /homebrewed/gh24_4m_4m
> FS: 10.0 TiB   Used FS: 0.0%   Inodes: 12.0 Mi   Used Inodes: 2.3%
>
> 40 tasks, 100 files/directories
>
> SUMMARY: (of 3 iterations)
>   Operation             Max            Min           Mean        Std Dev
>   ---------             ---            ---           ----        -------
>   Directory creation:    449160.409     430869.822     437002.187     8597.272
>   Directory stat    :   6664420.560    5785712.544    6324276.731   385192.527
>   Directory removal :    398360.058     351503.369     371630.648    19690.580
>   File creation     :    288985.217     270550.129     279096.800     7585.659
>   File stat         :   6720685.117    6641301.499    6674123.407    33833.182
>   File read         :   3055661.372    2871044.881    2945513.966    79479.638
>   File removal      :    215187.602     146639.435     179898.441    28021.467
>   Tree creation     :        10.215          3.165          6.603        2.881
>   Tree removal      :         5.484          0.880          2.418        2.168
>
> -- finished at 09/07/2018 06:55:42 --
>
>
>
>
> Mit freundlichen Grüßen / Kind regards
>
>
> Olaf Weiser
>
> EMEA Storage Competence Center Mainz, German / IBM Systems, Storage
> Platform,
>
> ---
> IBM Deutschland
> IBM Allee 1
> 71139 Ehningen
> Phone: +49-170-579-44-66
> E-Mail: olaf.wei...@de.ibm.com
>
> ---
> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
> Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
> Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
> HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>
>
>
> From:     "Andrew Beattie"
> To:       gpfsug-discuss@spectrumscale.org
> Date:     09/21/2018 02:34 AM
> Subject:  Re: [gpfsug-discuss] Metadata with GNR code
> Sent by:  gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> Simon,
>
> My recommendation is still very much to use SSD for Metadata and NL-SAS
> for data and
> the GH14 / GH24 Building blocks certainly make this much easier.
>
> Unless your filesystem is massive (Summit sized) you will typically still
> continue to benefit from the Random IO performance of SSD (even RI SSD) in
> comparison to NL-SAS.
>
> It still makes more sense to me to continue to use 2 copy or 3 copy for
> Metadata even in ESS / GNR style environments.  The read performance for
> metadata using 3copy is still significantly better than any other scenario.

Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Olaf Weiser
see a mdtest for a default block size file system ...
4 MB blocksize..
mdata is on SSD
data is on HDD   ... which is not really relevant for this mdtest ;-)

-- started at 09/07/2018 06:54:54 --

mdtest-1.9.3 was launched with 40 total task(s) on 20 node(s)
Command line used: mdtest -n 25000 -i 3 -u -d /homebrewed/gh24_4m_4m/mdtest
Path: /homebrewed/gh24_4m_4m
FS: 10.0 TiB   Used FS: 0.0%   Inodes: 12.0 Mi   Used Inodes: 2.3%

40 tasks, 100 files/directories

SUMMARY: (of 3 iterations)
   Operation             Max            Min           Mean        Std Dev
   ---------             ---            ---           ----        -------
   Directory creation:    449160.409     430869.822     437002.187     8597.272
   Directory stat    :   6664420.560    5785712.544    6324276.731   385192.527
   Directory removal :    398360.058     351503.369     371630.648    19690.580
   File creation     :    288985.217     270550.129     279096.800     7585.659
   File stat         :   6720685.117    6641301.499    6674123.407    33833.182
   File read         :   3055661.372    2871044.881    2945513.966    79479.638
   File removal      :    215187.602     146639.435     179898.441    28021.467
   Tree creation     :        10.215          3.165          6.603        2.881
   Tree removal      :         5.484          0.880          2.418        2.168

-- finished at 09/07/2018 06:55:42 --


Mit freundlichen Grüßen / Kind regards

Olaf Weiser

EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,
---
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.wei...@de.ibm.com
---
IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940


From:     "Andrew Beattie"
To:       gpfsug-discuss@spectrumscale.org
Date:     09/21/2018 02:34 AM
Subject:  Re: [gpfsug-discuss] Metadata with GNR code
Sent by:  gpfsug-discuss-boun...@spectrumscale.org

Simon,

My recommendation is still very much to use SSD for Metadata and NL-SAS for data, and the GH14 / GH24 Building blocks certainly make this much easier.

Unless your filesystem is massive (Summit sized) you will typically still continue to benefit from the Random IO performance of SSD (even RI SSD) in comparison to NL-SAS.

It still makes more sense to me to continue to use 2 copy or 3 copy for Metadata even in ESS / GNR style environments. The read performance for metadata using 3copy is still significantly better than any other scenario.

As with anything there are exceptions to the rule, but my experiences with ESS and ESS with SSD so far still maintain that the standard thoughts on managing Metadata and Small file IO remain the same -- even with the improvements around sub blocks with Scale V5.

MDtest is still the typical benchmark for this comparison, and MDtest shows some very clear differences even on SSD when you use a large filesystem block size with more sub blocks vs a smaller block size with 1/32 subblocks.

This only gets worse if you change the storage media from SSD to NL-SAS.

Andrew Beattie
Software Defined Storage - IT Specialist
Phone: 614-2133-7927
E-mail: abeat...@au1.ibm.com

- Original message -
From: Simon Thompson
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org"
Cc:
Subject: [gpfsug-discuss] Metadata with GNR code
Date: Fri, Sep 21, 2018 3:29 AM

Just wondering if anyone has any strong views/recommendations with metadata when using GNR code?

I know in “san” based GPFS, there is a recommendation to have data and metadata split with the metadata on SSD.

I’ve also heard that with GNR there isn’t much difference in splitting data and metadata.

We’re looking at two systems and want to replicate metadata, but not data (mostly) between them, so I’m not really sure how we’d do this without having separate system pool (and then NSDs in different failure groups)….

If we used 8+2P vdisks for metadata only, would we still see no difference in performance compared to mixed (I guess the 8+2P is still spread over a DA so we’d get half the drives in the GNR system active…).

Or should we stick SSD based storage in as well for the metadata pool? (Which brings an interesting question about RAID code related to the recent discussions on mirroring vs RAID5…)

Thoughts welcome!

Simon

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
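
To make Simon's 8+2P-vs-replication question concrete, here is a hedged sketch of how the two layouts are typically declared with mmvdisk on ESS/GNR systems. The vdisk set, recovery group and file system names are hypothetical, the set sizes are arbitrary, and the option spelling should be checked against the mmvdisk level in use:

    # metadata: replicated vdisk NSDs in the system pool
    mmvdisk vdiskset define --vdisk-set md --recovery-group rg_l,rg_r \
        --code 3WayReplication --block-size 1m --set-size 3% \
        --nsd-usage metadataOnly --storage-pool system

    # data: 8+2p erasure-coded vdisk NSDs in a separate pool
    mmvdisk vdiskset define --vdisk-set data --recovery-group rg_l,rg_r \
        --code 8+2p --block-size 4m --set-size 90% \
        --nsd-usage dataOnly --storage-pool data

    mmvdisk vdiskset create --vdisk-set md,data
    mmvdisk filesystem create --file-system fs1 --vdisk-set md,data

Defining metadata as its own vdisk set (metadataOnly NSDs in the system pool, with a failure group per building block) is also what Simon's "replicate metadata but not data between two systems" scenario would hinge on.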