My 2c ...
Be careful here not to mix up three different effects seen in 
filesystems:

1. Performance degradation as the filesystem approaches 100% full, often due to 
the difficulty of finding the remaining unallocated blocks.
GPFS doesn’t noticeably suffer from this effect compared to its competitors.

2. Performance degradation over time as files become fragmented, causing extra 
movement of an HDD's actuator arm (hence defrag on Windows and the practice of 
short-stroking drives).

3. Performance degradation as blocks are written further from the fastest part 
of a hard disk drive (the outer tracks). SSDs do not show this effect.
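Effects 2 and 3 can be illustrated with a toy model (purely illustrative Python, not GPFS internals): assume throughput falls off linearly from the outer edge of a disk to the inner edge, then compare blocks allocated contiguously in the fastest region (as on a fresh, near-empty filesystem) with blocks scattered uniformly across the whole disk:

```python
import random

# Toy model (NOT GPFS internals): a disk with N blocks, where relative
# transfer speed falls off linearly from the outer edge (block 0) to the
# inner edge (block N-1), roughly mirroring real HDD zone behaviour.
N_BLOCKS = 100_000

def speed(block: int) -> float:
    """Relative throughput at a block position: 1.0 at the edge, 0.5 inside."""
    return 1.0 - 0.5 * (block / N_BLOCKS)

def mean_speed(blocks) -> float:
    """Average relative throughput over a set of block positions."""
    return sum(speed(b) for b in blocks) / len(blocks)

# "Fresh filesystem" case: the first files land in the fastest region,
# so an early benchmark sees only the best blocks.
fresh = range(0, 10_000)

# "Scattered" case: block positions chosen uniformly at random across
# the disk, so performance is the long-term average from day one.
random.seed(42)
scattered = [random.randrange(N_BLOCKS) for _ in range(10_000)]

print(f"fresh (near-empty) benchmark: {mean_speed(fresh):.3f}")
print(f"scattered (long-term)       : {mean_speed(scattered):.3f}")
```

The fresh case benchmarks close to the disk's peak, but drifts toward the scattered figure as the disk fills and fragments; the scattered figure is what you see consistently from day one.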


Benchmarks on newly formatted, empty filesystems are often artificially high 
compared to performance after, say, 12 months, whether or not the filesystem 
is near 90%+ capacity utilisation. The -j scatter option allows more realistic 
performance measurement when designing for long-term usage of the filesystem. 
But this is due to the distributed location of the blocks, not how full the 
filesystem is.
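Note the allocation map type is fixed at file system creation time, so it is worth checking before running benchmarks. A sketch (the device name, stanza file and mount point below are placeholders, not from this thread):

```shell
# Check the allocation map type and node estimate of an existing file system
mmlsfs gpfs01 | egrep 'Block allocation|Estimated number'

# Create a new file system with scatter allocation and a generous node estimate
mmcrfs gpfs01 -F nsd_stanzas.txt -j scatter -n 128 -T /gpfs/gpfs01
```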



Daniel

Dr Daniel Kidger 
IBM Technical Sales Specialist
Software Defined Solution Sales

+ 44-(0)7818 522 266 
[email protected]

> On 15 Nov 2017, at 11:26, Olaf Weiser <[email protected]> wrote:
> 
> To add a comment: very simply, it depends on how you allocate the 
> physical block storage. If you simply use fewer physical resources 
> when reducing the capacity (in the same ratio), then you get what you see.
> 
> So you need to tell us how you allocate your block storage. (Are you using 
> RAID controllers? Where are your LUNs coming from? Are fewer RAID groups 
> involved when you reduce the capacity?)
> 
> GPFS can be configured to deliver pretty much whatever the hardware can 
> deliver. If you reduce the resources you get less; if you enhance your 
> hardware you get more, almost regardless of the total capacity in 
> #blocks.
> 
> 
> 
> 
> 
> 
> From:        "Kumaran Rajaram" <[email protected]>
> To:        gpfsug main discussion list <[email protected]>
> Date:        11/15/2017 11:56 AM
> Subject:        Re: [gpfsug-discuss] Write performances and filesystem size
> Sent by:        [email protected]
> 
> 
> 
> Hi,
> 
> >>Am I missing something? Is this an expected behaviour and someone has an 
> >>explanation for this?
> 
> Based on your scenario, write degradation as the file system is populated is 
> possible if you formatted the file system with "-j cluster". 
> 
> For consistent file-system performance, we recommend mmcrfs "-j scatter" 
> layoutMap.   Also, we need to ensure the mmcrfs "-n"  is set properly.
> 
> [snip from mmcrfs]
> # mmlsfs <fs> | egrep 'Block allocation| Estimated number'
> -j                 scatter                  Block allocation type
> -n                 128                       Estimated number of nodes that 
> will mount file system
> [/snip]
> 
> 
> [snip from man mmcrfs]
> layoutMap={scatter| cluster}
>                  Specifies the block allocation map type. When
>                  allocating blocks for a given file, GPFS first
>                  uses a round‐robin algorithm to spread the data
>                  across all disks in the storage pool. After a
>                  disk is selected, the location of the data
>                  block on the disk is determined by the block
>                  allocation map type. If cluster is
>                  specified, GPFS attempts to allocate blocks in
>                  clusters. Blocks that belong to a particular
>                  file are kept adjacent to each other within
>                  each cluster. If scatter is specified,
>                  the location of the block is chosen randomly.
> 
>                 The cluster allocation method may provide
>                  better disk performance for some disk
>                  subsystems in relatively small installations.
>                  The benefits of clustered block allocation
>                  diminish when the number of nodes in the
>                  cluster or the number of disks in a file system
>                  increases, or when the file system’s free space
>                  becomes fragmented. The cluster
>                  allocation method is the default for GPFS
>                  clusters with eight or fewer nodes and for file
>                  systems with eight or fewer disks.
> 
>                 The scatter allocation method provides
>                  more consistent file system performance by
>                  averaging out performance variations due to
>                  block location (for many disk subsystems, the
>                  location of the data relative to the disk edge
>                  has a substantial effect on performance). This
>                  allocation method is appropriate in most cases
>                  and is the default for GPFS clusters with more
>                  than eight nodes or file systems with more than
>                  eight disks.
> 
>                  The block allocation map type cannot be changed
>                  after the storage pool has been created.
> 
> 
> -n NumNodes
>         The estimated number of nodes that will mount the file
>         system in the local cluster and all remote clusters.
>         This is used as a best guess for the initial size of
>         some file system data structures. The default is 32.
>         This value can be changed after the file system has been
>         created but it does not change the existing data
>         structures. Only the newly created data structure is
>         affected by the new value. For example, new storage
>         pool.
> 
>         When you create a GPFS file system, you might want to
>         overestimate the number of nodes that will mount the
>         file system. GPFS uses this information for creating
>         data structures that are essential for achieving maximum
>         parallelism in file system operations (For more
>         information, see GPFS architecture in IBM Spectrum
>         Scale: Concepts, Planning, and Installation Guide ). If
>         you are sure there will never be more than 64 nodes,
>         allow the default value to be applied. If you are
>         planning to add nodes to your system, you should specify
>         a number larger than the default.
> 
> [/snip from man mmcrfs]
> 
> Regards,
> -Kums
> 
> 
> 
> 
> 
> From:        Ivano Talamo <[email protected]>
> To:        <[email protected]>
> Date:        11/15/2017 11:25 AM
> Subject:        [gpfsug-discuss] Write performances and filesystem size
> Sent by:        [email protected]
> 
> 
> 
> Hello everybody,
> 
> My colleagues and I are currently running some tests on a new 
> DSS G220 system and we see some unexpected behaviour.
> 
> What we see is that write performance (we have not tested reads 
> yet) decreases as the filesystem size decreases.
> 
> I will not go into the details of the tests, but here are some numbers:
> 
> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the 
> sum of the disk activity on the two IO servers;
> - with a filesystem using half of the space we get 10 GB/s;
> - with a filesystem using 1/4 of the space we get 5 GB/s.
> 
> We also saw that performance is not affected by the vdisk layout, i.e. 
> taking the full space with one big vdisk or two half-size vdisks per RG 
> gives the same performance.
> 
> To our understanding the IO should be spread evenly across all the 
> pdisks in the declustered array, and looking at iostat all disks seem to 
> be accessed. So there must be some other factor that affects 
> performance.
> 
> Am I missing something? Is this an expected behaviour and someone has an 
> explanation for this?
> 
> Thank you,
> Ivano
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
> 
> 
> 
> 
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
