256K
Giovanni
On 11/06/20 10:01, Luis Bolinches wrote:
On that RAID 6, what is the logical RAID block size? 128K, 256K, other?
--
Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations / Salutacions
Luis Bolinches
Consultant IT Specialist
IBM Spectrum Scale development
ESS & client adoption teams
Mobile Phone: +358503112585
https://www.youracclaim.com/user/luis-bolinches
Ab IBM Finland Oy
Laajalahdentie 23
00330 Helsinki
Uusimaa - Finland
*"If you always give you will always have" -- Anonymous*
----- Original message -----
From: Giovanni Bracco <giovanni.bra...@enea.it>
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: Jan-Frode Myklebust <janfr...@tanso.net>, gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>
Cc: Agostino Funel <agostino.fu...@enea.it>
Subject: [EXTERNAL] Re: [gpfsug-discuss] very low read performance
in simple spectrum scale/gpfs cluster with a storage-server SAN
Date: Thu, Jun 11, 2020 10:53
Comments and updates in the text:
On 05/06/20 19:02, Jan-Frode Myklebust wrote:
> fre. 5. jun. 2020 kl. 15:53 skrev Giovanni Bracco <giovanni.bra...@enea.it>:
>
>     answer in the text
>
>     On 05/06/20 14:58, Jan-Frode Myklebust wrote:
>     >
>     > Could maybe be interesting to drop the NSD servers, and let all
>     > nodes access the storage via srp?
>
>     no, we cannot: the production clusters' fabric is a mix of a QDR-based
>     cluster and an OPA-based cluster, and the NSD nodes provide the service
>     to both.
>
> You could potentially still do SRP from the QDR nodes, and go via NSD for
> your Omni-Path nodes. Going via NSD seems like a bit of a pointless
> indirection.
Not really: both clusters, the 400 OPA nodes and the 300 QDR nodes, share
the same data lake in Spectrum Scale/GPFS, so the NSD servers provide the
flexibility of the setup.
The NSD servers make use of an IB SAN fabric (a Mellanox FDR switch) to
which, at the moment, three different generations of DDN storage are
connected: 9900/QDR, 7700/FDR and 7990/EDR. The idea was to be able to add
some less expensive storage, to be used when performance is not the first
priority.
>
>     >
>     > Maybe turn off readahead, since it can cause performance degradation
>     > when GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead
>     > always reads too much. This might be the cause of the slow read seen --
>     > maybe you’ll also overflow it if reading from both NSD-servers at the
>     > same time?
>
>     I have switched the readahead off and this produced a small (~10%)
>     increase in performance when reading from an NSD server, but no change
>     in the bad behaviour for the GPFS clients.
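(Assuming the readahead switched off here is the Linux block-device
readahead on the NSD servers, rather than a setting on the DDN controllers,
a minimal sketch of the commands involved would be:

    blockdev --getra /dev/dm-X    # show current readahead, in 512-byte sectors
    blockdev --setra 0 /dev/dm-X  # switch readahead off for that LUN

where /dev/dm-X is a placeholder for each multipath device backing an NSD.)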
>
>     >
>     > Plus.. it’s always nice to give a bit more pagepool to the clients
>     > than the default.. I would prefer to start with 4 GB.
>
>     we'll do that as well and we'll let you know!
>
> Could you show your mmlsconfig? Likely you should set maxMBpS to
> indicate what kind of throughput a client can do (affects GPFS
> readahead/writebehind). Would typically also increase workerThreads on
> your NSD servers.
At the moment, this is the output of mmlsconfig:
# mmlsconfig
Configuration data for cluster GPFSEXP.portici.enea.it:
-------------------------------------------------------
clusterName GPFSEXP.portici.enea.it
clusterId 13274694257874519577
autoload no
dmapiFileHandleSize 32
minReleaseLevel 5.0.4.0
ccrEnabled yes
cipherList AUTHONLY
verbsRdma enable
verbsPorts qib0/1
[cresco-gpfq7,cresco-gpfq8]
verbsPorts qib0/2
[common]
pagepool 4G
adminMode central
File systems in cluster GPFSEXP.portici.enea.it:
------------------------------------------------
/dev/vsd_gexp2
/dev/vsd_gexp3
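(A minimal sketch of the remaining tuning suggested above -- pagepool is
already at 4G in the output; the node classes and values below are
placeholders to be adapted to the hardware, and the changes take effect once
GPFS is restarted on the affected nodes:

    mmchconfig maxMBpS=6000 -N <client_nodes>      # roughly 2x the per-node throughput target
    mmchconfig workerThreads=512 -N <nsd_servers>  # more I/O worker threads on the NSD servers
)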
>
> 1 MB blocksize is a bit bad for your 9+p+q RAID with 256 KB strip size.
> When you write one GPFS block, less than half a RAID stripe is written,
> which means you need to read back some data to calculate the new parities.
> I would prefer 4 MB block size, and maybe also change to 8+p+q so that
> one GPFS block is a multiple of a full 2 MB stripe.
>
> -jf
we have now added another file system based on 2 NSDs on RAID6 8+p+q,
keeping the 1 MB block size just so as not to change too many things at the
same time, but there is no substantial change in the very low read
performance, which is still on the order of 50 MB/s, while write performance
is about 1000 MB/s.
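(For reference, the stripe arithmetic behind the block-size suggestion,
assuming the 256K strip size per data disk confirmed at the top of the
thread:

    9+p+q: 9 data strips x 256 KB = 2304 KB full stripe; a 1 MB GPFS block
           fills less than half a stripe, so writes need a read-modify-write
           to update parity
    8+p+q: 8 data strips x 256 KB = 2048 KB full stripe; a 1 MB block is
           still only half a stripe, while a 4 MB block maps to exactly two
           full stripes and avoids the parity read-back on writes)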
Any other suggestion is welcome!
Giovanni
--
Giovanni Bracco
phone +39 351 8804788
E-mail giovanni.bra...@enea.it
WWW http://www.afs.enea.it/bracco
Ellei edellä ole toisin mainittu: / Unless stated otherwise above:
Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland
--
Giovanni Bracco
phone +39 351 8804788
E-mail giovanni.bra...@enea.it
WWW http://www.afs.enea.it/bracco
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss