Re: [gpfsug-discuss] RAID config for SSD's used for data

2017-04-20 Thread Uwe Falke
Some thoughts: 
You give typical cumulative usage values. However, a fast pool matters 
most for traffic spikes. Do you have spikes that drive your current 
system to the edge? 

Then: using the SSD pool for writes is straightforward (placement); 
using it for reads will only pay off if data are either prefetched into 
the pool somehow, or read more than once before being migrated back to 
the HDD pool(s). As you wrote, write traffic is lower than read traffic. 

RAID1 vs. RAID6: the read-modify-write (RMW) penalty of parity-based 
RAID was mentioned; it hits writes smaller than the full stripe width 
of your RAID. What type of write I/O do you have (or expect)? (This may 
also matter for choosing the endurance class of the SSDs: with RMW in 
mind, you will see a comparably huge amount of data written to the SSD 
devices if your I/O traffic consists of myriads of small I/Os and the 
SSDs are organized as RAID5 or RAID6.)
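
To put rough numbers on it (purely illustrative: a single 4 KiB random 
write against an 8+2 RAID6, assuming no write-back cache absorbs it, 
versus a RAID1 mirror):

   RAID6 RMW:  read old data chunk + both parity chunks (3 reads), then 
               write new data chunk + both parity chunks (3 writes)
   RAID1:      write the block to both mirror legs (2 writes)

So one small application write turns into six device I/Os on RAID6 
versus two on RAID1, and the amount of data physically written to the 
SSDs is also higher (three chunk writes instead of two), which eats 
into SSD endurance.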

I suppose your current system is well set to provide the required 
aggregate throughput. Now, what kind of improvement do you expect? How are 
the clients connected? Would they have sufficient network bandwidth to see 
improvements at all?




 
Mit freundlichen Grüßen / Kind regards

 
Dr. Uwe Falke
 
IT Specialist
High Performance Computing Services / Integrated Technology Services / 
Data Center Services
---
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefa...@de.ibm.com
---
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: 
Andreas Hasse, Thorsten Moehring
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 17122 


gpfsug-discuss-boun...@spectrumscale.org wrote on 04/19/2017 09:53:42 PM:

> From: "Buterbaugh, Kevin L" 
> To: gpfsug main discussion list 
> Date: 04/19/2017 09:54 PM
> Subject: [gpfsug-discuss] RAID config for SSD's used for data
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> 
> Hi All, 
> 
> We currently have what I believe is a fairly typical setup - 
> metadata for our GPFS filesystems is the only thing in the system 
> pool and it's on SSD, while data is on spinning disk (RAID 6 LUNs). 
> Everything connected via 8 Gb FC SAN.  8 NSD servers.  Roughly 1 PB 
> usable space.
> 
> Now let's just say that you have a little bit of money to spend. 
> Your I/O demands aren't great - in fact, they're way on the low end 
> - typical (cumulative) usage is 200 - 600 MB/sec read, less than 
> that for writes.  But while GPFS has always been great and therefore
> you don't need to Make GPFS Great Again, you do want to provide your
> users with the best possible environment.
> 
> So you're considering the purchase of a dual-controller FC storage 
> array with 12 or so 1.8 TB SSDs in it, with the idea being that 
> that storage would be in its own storage pool and that pool would 
> be the default location for I/O for your main filesystem - at least 
> for smaller files.  You intend to use mmapplypolicy nightly to move 
> data to / from this pool and the spinning disk pools.
> 
> Given all that - would you configure those disks as 6 RAID 1 mirrors
> and have 6 different primary NSD servers, or would it be feasible to 
> configure one big RAID 6 LUN?  I'm thinking the latter is not a good
> idea as there could only be one primary NSD server for that one LUN,
> but given that:  1) I have no experience with this, and 2) I have 
> been wrong once or twice before, I'm looking for advice.  Thanks!
> 
> --
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and 
Education
> kevin.buterba...@vanderbilt.edu - (615)875-9633
> 
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RAID config for SSD's used for data

2017-04-20 Thread Jonathan Buzzard
On Wed, 2017-04-19 at 14:23 -0700, Alex Chekholko wrote:
> On 04/19/2017 12:53 PM, Buterbaugh, Kevin L wrote:
> >
> > So you’re considering the purchase of a dual-controller FC storage array
> > with 12 or so 1.8 TB SSD’s in it, with the idea being that that storage
> > would be in its’ own storage pool and that pool would be the default
> > location for I/O for your main filesystem … at least for smaller files.
> >  You intend to use mmapplypolicy nightly to move data to / from this
> > pool and the spinning disk pools.
> 
> We did this and failed in interesting (but in retrospect obvious) ways. 
> You will want to ensure that your users cannot fill your write target 
> pool within a day.  The faster the storage, the more likely that is to 
> happen.  Or else your users will get ENOSPC.

Eh? Seriously, you should have a failover rule so that when your "fast"
pool fills up, allocation spills over to the "slow" pool (nice
descriptive names that are under eight characters including the
terminating character). There are issues when a pool gets close to
completely full, so you need to trigger the failover at a sizeable bit
less than the full size; 95% is a good starting point.

The pool name length is important because if the fast pool's name is
under eight characters but the slow one's is not (say you called it
"nearline", which is nine including the terminating character), then
once the files get moved they get backed up again by TSM, yeah!!!

The 95% figure comes about from this. Imagine you had 12KB left in the
fast pool and you go to write a file. You open the file at 0B in size
and start writing. At 12KB you run out of space in the fast pool, and
as the file can only be in one pool you get an ENOSPC and the file gets
canned. This then repeats on a regular basis.

So if you stop allocating to the fast pool at significantly less than
100% full (say 95%, where that 5% headroom is larger than the largest
file you expect), that file still completes, but all subsequent files
get allocated in the slow pool until you flush the fast pool.

Something like this as the last two rules in your policy should do the
trick.

/* by default new files to the fast disk unless full, then to slow */
RULE 'new' SET POOL 'fast' LIMIT(95)
RULE 'spillover' SET POOL 'slow'

However in general your fast pool needs to have sufficient capacity to
take your daily churn and then some.
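
For completeness, a minimal sketch of how the placement rules and the
nightly flush fit together (the device name gpfs0, pool names,
thresholds and the weighting are illustrative, adjust to taste):

# placement policy, installed once
cat > placement.pol <<'EOF'
RULE 'new' SET POOL 'fast' LIMIT(95)
RULE 'spillover' SET POOL 'slow'
EOF
mmchpolicy gpfs0 placement.pol

# migration policy, run nightly from cron to drain the fast pool,
# least-recently-accessed files first
cat > migrate.pol <<'EOF'
RULE 'flush' MIGRATE FROM POOL 'fast' THRESHOLD(80,60)
     WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'slow'
EOF
mmapplypolicy gpfs0 -P migrate.pol -I yes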

JAB.

-- 
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RAID config for SSD's used for data

2017-04-20 Thread Jonathan Buzzard
On Wed, 2017-04-19 at 20:05 +, Simon Thompson (IT Research Support)
wrote:
> By having many LUNs, you get many I/O queues for Linux to play with. Also, 
> the RAID6 overhead can be quite significant, so it might be better to go with 
> RAID1 anyway, depending on the controller...
> 
> And if only GPFS had some sort of auto-tiering between the pools for hot data 
> or caching :-)
> 
> 

If you have sized the "fast" pool correctly then the "slow" pool will be
spending most of its time doing diddly squat, aka under 10 IOPS, unless
you are flushing the pool of old files to make space. I have graphs that
show this.

Then one of two things happens. If you are just reading the file, fine:
it is probably coming from the cache, or the disks are not very busy
anyway, so you won't notice.

If you happen to *change* the file and start doing things actively with
it again, then the changed version ends up on the fast disk by virtue of
being a new file, because most programs approach this by creating an
entirely new file with a temporary name and then doing a rename-and-delete
shuffle (so that a crash will leave you with a valid file somewhere).
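
i.e. the classic save pattern, which as a side effect re-places the
data (the file name and the editor step are purely illustrative):

tmp=$(mktemp report.XXXXXX)        # brand-new file, so the 'new' placement
                                   # rule puts it in the fast pool
write_modified_contents > "$tmp"   # whatever the program does to save
mv "$tmp" report                   # atomic rename; the old copy sitting
                                   # on the slow pool is unlinked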

JAB.

-- 
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Spectrum Scale Slow to create directories

2017-04-20 Thread Peter Childs
Simon,

We've managed to resolve this issue by switching off quotas, switching them 
back on again, and rebuilding the quota files.
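
Roughly this, from memory, so treat it as a sketch rather than a
verified procedure (substitute your filesystem device name):

mmquotaoff gpfs_device     # switch quotas off on the filesystem
mmquotaon gpfs_device      # switch them back on
mmcheckquota gpfs_device   # re-count usage and rebuild the quota files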

Can I check whether you run quotas on your cluster?

See you in two weeks in Manchester.

Thanks in advance.

Peter Childs
Research Storage Expert
ITS Research Infrastructure
Queen Mary, University of London
Phone: 020 7882 8393


From: gpfsug-discuss-boun...@spectrumscale.org 
 on behalf of Simon Thompson (IT 
Research Support) 
Sent: Tuesday, April 11, 2017 4:55:35 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Spectrum Scale Slow to create directories

We actually saw this for a while on one of our clusters, which was new. But
by the time I'd got round to looking deeper it had gone; maybe we were
using the NSDs more heavily, or possibly we'd upgraded. We are at 4.2.2-2,
so it might be worth trying to bump the version and see if it goes away.

We saw it on the NSD servers directly as well, so it wasn't just a client
struggling to talk to them; maybe there was some buggy code?

Simon

On 11/04/2017, 16:51, "gpfsug-discuss-boun...@spectrumscale.org on behalf
of Bryan Banister"  wrote:

>There are so many things to look at and many tools for doing so (iostat,
>htop, nsdperf, mmdiag, mmhealth, mmlsconfig, mmlsfs, etc).  I would
>recommend a review of the presentation that Yuri gave at the most recent
>GPFS User Group:
>https://drive.google.com/drive/folders/0B124dhp9jJC-UjFlVjJTa2ZaVWs
>
>Cheers,
>-Bryan
>
>-Original Message-
>From: gpfsug-discuss-boun...@spectrumscale.org
>[mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Peter
>Childs
>Sent: Tuesday, April 11, 2017 3:58 AM
>To: gpfsug main discussion list 
>Subject: [gpfsug-discuss] Spectrum Scale Slow to create directories
>
>This is a curious issue which I'm trying to get to the bottom of.
>
>We currently have two Spectrum Scale file systems, both running GPFS
>4.2.1-1; some of the servers have been upgraded to 4.2.1-2.
>
>The older one, which was upgraded from GPFS 3.5, works fine: creating a
>directory is always fast and never an issue.
>
>The new one, which has nice new SSDs for metadata and hence should be
>faster, can take up to 30 seconds to create a directory, though it usually
>takes less than a second. The longer directory creates usually happen on
>busy nodes that have not used the new storage in a while (it's new, so
>we've not moved much of the data over yet), but it can also happen randomly
>anywhere, including from the NSD servers themselves (times of 3-4 seconds
>from the NSD servers have been seen, on a single directory create).
>
>We've been pointed at the network and advised to check all network
>settings, and it's been suggested to build an admin network, but I'm not
>sure I entirely understand why or how that would help. It's a mixed
>1G/10G network with the NSD servers connected at 40G with an MTU of 9000.
>
>However, as I say, the older filesystem is fine, and it does not matter
>whether the nodes are connected to the old GPFS cluster or the new one
>(although the delay is worst on the old GPFS cluster). So I'm really
>playing spot the difference, and the network is not an obvious one.
>
>It's been suggested to look at a trace when it occurs, but as it's
>difficult to recreate, collecting one is difficult.
>
>Any ideas would be most helpful.
>
>Thanks
>
>
>
>Peter Childs
>ITS Research Infrastructure
>Queen Mary, University of London
>___
>gpfsug-discuss mailing list
>gpfsug-discuss at spectrumscale.org
>http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>___
>gpfsug-discuss mailing list
>gpfsug-discuss at spectrumscale.org
>http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] bizarre performance behavior

2017-04-20 Thread Kenneth Waegeman

Hi,


Having an issue that looks the same as this one:

We can do sequential writes to the filesystem at 7.8 GB/s total, which 
is the expected speed for our current storage backend. While we get even 
better performance with sequential reads on raw storage LUNs, using GPFS 
we can only reach 1 GB/s in total (each NSD server seems limited to 
about 0.5 GB/s), independent of the number of clients (1, 2, 4, ...) or 
the way we tested (fio, dd). We played with blockdev parameters, 
maxMBpS, prefetchThreads, hyperthreading, c1e/C-states, etc. as discussed 
in this thread, but nothing seems to impact this read performance.
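
For reference, the GPFS-side knobs we tried were of this form (the
values shown are just examples, not what we settled on):

mmchconfig maxMBpS=2048 -i        # -i applies the change immediately
mmchconfig prefetchThreads=32     # typically needs GPFS restarted on the
                                  # affected nodes before it takes effect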


Any ideas?

Thanks!

Kenneth

On 17/02/17 19:29, Jan-Frode Myklebust wrote:
I just had a similar experience with a SanDisk InfiniFlash system 
SAS-attached to a single host. gpfsperf reported 3.2 GByte/s for writes 
and 250-300 MByte/s on sequential reads!! Random reads were on the order 
of 2 GByte/s.

After a bit of head scratching and fumbling around I found out that 
reducing maxMBpS from 1 to 100 fixed the problem! Digging further 
I found that reducing prefetchThreads from default=72 to 32 also fixed 
it, while leaving maxMBpS at 1. Can now also read at 3.2 GByte/s.


Could something like this be the problem on your box as well?



-jf
On Fri, 17 Feb 2017 at 18:13, Aaron Knister (aaron.s.knis...@nasa.gov) wrote:


Well, I'm somewhat scrounging for hardware. This is in our test
environment :) And yep, it's got the 2U gpu-tray in it although even
without the riser it has 2 PCIe slots onboard (excluding the on-board
dual-port mezz card) so I think it would make a fine NSD server even
without the riser.

-Aaron

On 2/17/17 11:43 AM, Simon Thompson (Research Computing - IT Services)
wrote:
> Maybe its related to interrupt handlers somehow? You drive the
load up on one socket, you push all the interrupt handling to the
other socket where the fabric card is attached?
>
> Dunno ... (Though I am intrigued you use idataplex nodes as NSD
servers, I assume its some 2U gpu-tray riser one or something !)
>
> Simon
> 
> From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Aaron Knister [aaron.s.knis...@nasa.gov]
> Sent: 17 February 2017 15:52
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] bizarre performance behavior
>
> This is a good one. I've got an NSD server with 4x 16GB fibre
> connections coming in and 1x FDR10 and 1x QDR connection going
out to
> the clients. I was having a really hard time getting anything
resembling
> sensible performance out of it (4-5Gb/s writes but maybe 1.2Gb/s for
> reads). The back-end is a DDN SFA12K and I *know* it can do
better than
> that.
>
> I don't remember quite how I figured this out but simply by running
> "openssl speed -multi 16" on the nsd server to drive up the load
I saw
> an almost 4x performance jump which is pretty much goes against
every
> sysadmin fiber in me (i.e. "drive up the cpu load with unrelated
crap to
> quadruple your i/o performance").
>
> This feels like some type of C-states frequency scaling
shenanigans that
> I haven't quite ironed down yet. I booted the box with the following
> kernel parameters "intel_idle.max_cstate=0
processor.max_cstate=0" which
> didn't seem to make much of a difference. I also tried setting the
> frequency governer to userspace and setting the minimum frequency to
> 2.6ghz (it's a 2.6ghz cpu). None of that really matters-- I
still have
> to run something to drive up the CPU load and then performance
improves.
>
> I'm wondering if this could be an issue with the C1E state? I'm
curious
> if anyone has seen anything like this. The node is a dx360 M4
> (Sandybridge) with 16 2.6GHz cores and 32GB of RAM.
>
> -Aaron
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org 
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org 
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org 

Re: [gpfsug-discuss] bizarre performance behavior

2017-04-20 Thread Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]
Interesting. Could you share a little more about your architecture? Is it 
possible to mount the fs on an NSD server and do some dd's from the fs on the 
NSD server? If that gives you decent performance perhaps try NSDPERF next 
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/Testing+network+performance+with+nsdperf
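
For the dd part, something along these lines (illustrative only; the
path, block size and count are made up, and the direct flags sidestep
the page cache):

# sequential read of an existing large file, run directly on the NSD server
dd if=/gpfs/fs0/bigfile of=/dev/null bs=16M count=1024 iflag=direct

# sequential write for comparison
dd if=/dev/zero of=/gpfs/fs0/ddtest bs=16M count=1024 oflag=direct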

-Aaron




On April 20, 2017 at 10:53:47 EDT, Kenneth Waegeman  
wrote:

Hi,


Having an issue that looks the same as this one:

We can do sequential writes to the filesystem at 7,8 GB/s total , which is the 
expected speed for our current storage
backend.  While we have even better performance with sequential reads on raw 
storage LUNS, using GPFS we can only reach 1GB/s in total (each nsd server 
seems limited by 0,5GB/s) independent of the number of clients
(1,2,4,..) or ways we tested (fio,dd). We played with blockdev params, MaxMBps, 
PrefetchThreads, hyperthreading, c1e/cstates, .. as discussed in this thread, 
but nothing seems to impact this read performance.

Any ideas?

Thanks!

Kenneth

On 17/02/17 19:29, Jan-Frode Myklebust wrote:
I just had a similar experience from a sandisk infiniflash system SAS-attached 
to s single host. Gpfsperf reported 3,2 Gbyte/s for writes. and 250-300 Mbyte/s 
on sequential reads!! Random reads were on the order of 2 Gbyte/s.

After a bit head scratching snd fumbling around I found out that reducing 
maxMBpS from 1 to 100 fixed the problem! Digging further I found that 
reducing prefetchThreads from default=72 to 32 also fixed it, while leaving 
maxMBpS at 1. Can now also read at 3,2 GByte/s.

Could something like this be the problem on your box as well?



-jf
On Fri, 17 Feb 2017 at 18:13, Aaron Knister (aaron.s.knis...@nasa.gov) wrote:
Well, I'm somewhat scrounging for hardware. This is in our test
environment :) And yep, it's got the 2U gpu-tray in it although even
without the riser it has 2 PCIe slots onboard (excluding the on-board
dual-port mezz card) so I think it would make a fine NSD server even
without the riser.

-Aaron

On 2/17/17 11:43 AM, Simon Thompson (Research Computing - IT Services)
wrote:
> Maybe its related to interrupt handlers somehow? You drive the load up on one 
> socket, you push all the interrupt handling to the other socket where the 
> fabric card is attached?
>
> Dunno ... (Though I am intrigued you use idataplex nodes as NSD servers, I 
> assume its some 2U gpu-tray riser one or something !)
>
> Simon
> 
> From: 
> gpfsug-discuss-boun...@spectrumscale.org
>  
> [gpfsug-discuss-boun...@spectrumscale.org]
>  on behalf of Aaron Knister 
> [aaron.s.knis...@nasa.gov]
> Sent: 17 February 2017 15:52
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] bizarre performance behavior
>
> This is a good one. I've got an NSD server with 4x 16GB fibre
> connections coming in and 1x FDR10 and 1x QDR connection going out to
> the clients. I was having a really hard time getting anything resembling
> sensible performance out of it (4-5Gb/s writes but maybe 1.2Gb/s for
> reads). The back-end is a DDN SFA12K and I *know* it can do better than
> that.
>
> I don't remember quite how I figured this out but simply by running
> "openssl speed -multi 16" on the nsd server to drive up the load I saw
> an almost 4x performance jump which is pretty much goes against every
> sysadmin fiber in me (i.e. "drive up the cpu load with unrelated crap to
> quadruple your i/o performance").
>
> This feels like some type of C-states frequency scaling shenanigans that
> I haven't quite ironed down yet. I booted the box with the following
> kernel parameters "intel_idle.max_cstate=0 processor.max_cstate=0" which
> didn't seem to make much of a difference. I also tried setting the
> frequency governer to userspace and setting the minimum frequency to
> 2.6ghz (it's a 2.6ghz cpu). None of that really matters-- I still have
> to run something to drive up the CPU load and then performance improves.
>
> I'm wondering if this could be an issue with the C1E state? I'm curious
> if anyone has seen anything like this. The node is a dx360 M4
> (Sandybridge) with 16 2.6GHz cores and 32GB of RAM.
>
> -Aaron
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_

Re: [gpfsug-discuss] bizarre performance behavior

2017-04-20 Thread Uwe Falke
Hi Kenneth, 

is prefetching off or on at your storage backend?
Raw sequential I/O is very different from GPFS sequential I/O at the 
storage device! GPFS does its own prefetching; the storage controller 
cannot know which sectors a sequential read at the GPFS level maps to 
at the storage level.

 
Mit freundlichen Grüßen / Kind regards

 
Dr. Uwe Falke
 
IT Specialist
High Performance Computing Services / Integrated Technology Services / 
Data Center Services
---
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefa...@de.ibm.com
---
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: 
Andreas Hasse, Thorsten Moehring
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 17122 




From:   Kenneth Waegeman 
To: gpfsug main discussion list 
Date:   04/20/2017 04:53 PM
Subject:Re: [gpfsug-discuss] bizarre performance behavior
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi,

Having an issue that looks the same as this one: 
We can do sequential writes to the filesystem at 7,8 GB/s total , which is 
the expected speed for our current storage
backend.  While we have even better performance with sequential reads on 
raw storage LUNS, using GPFS we can only reach 1GB/s in total (each nsd 
server seems limited by 0,5GB/s) independent of the number of clients   
(1,2,4,..) or ways we tested (fio,dd). We played with blockdev params, 
MaxMBps, PrefetchThreads, hyperthreading, c1e/cstates, .. as discussed in 
this thread, but nothing seems to impact this read performance. 
Any ideas?
Thanks!

Kenneth

On 17/02/17 19:29, Jan-Frode Myklebust wrote:
I just had a similar experience from a sandisk infiniflash system 
SAS-attached to s single host. Gpfsperf reported 3,2 Gbyte/s for writes. 
and 250-300 Mbyte/s on sequential reads!! Random reads were on the order 
of 2 Gbyte/s.

After a bit head scratching snd fumbling around I found out that reducing 
maxMBpS from 1 to 100 fixed the problem! Digging further I found that 
reducing prefetchThreads from default=72 to 32 also fixed it, while 
leaving maxMBpS at 1. Can now also read at 3,2 GByte/s.

Could something like this be the problem on your box as well?



-jf
On Fri, 17 Feb 2017 at 18:13, Aaron Knister wrote:
Well, I'm somewhat scrounging for hardware. This is in our test
environment :) And yep, it's got the 2U gpu-tray in it although even
without the riser it has 2 PCIe slots onboard (excluding the on-board
dual-port mezz card) so I think it would make a fine NSD server even
without the riser.

-Aaron

On 2/17/17 11:43 AM, Simon Thompson (Research Computing - IT Services)
wrote:
> Maybe its related to interrupt handlers somehow? You drive the load up 
on one socket, you push all the interrupt handling to the other socket 
where the fabric card is attached?
>
> Dunno ... (Though I am intrigued you use idataplex nodes as NSD servers, 
I assume its some 2U gpu-tray riser one or something !)
>
> Simon
> 
> From: gpfsug-discuss-boun...@spectrumscale.org [
gpfsug-discuss-boun...@spectrumscale.org] on behalf of Aaron Knister [
aaron.s.knis...@nasa.gov]
> Sent: 17 February 2017 15:52
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] bizarre performance behavior
>
> This is a good one. I've got an NSD server with 4x 16GB fibre
> connections coming in and 1x FDR10 and 1x QDR connection going out to
> the clients. I was having a really hard time getting anything resembling
> sensible performance out of it (4-5Gb/s writes but maybe 1.2Gb/s for
> reads). The back-end is a DDN SFA12K and I *know* it can do better than
> that.
>
> I don't remember quite how I figured this out but simply by running
> "openssl speed -multi 16" on the nsd server to drive up the load I saw
> an almost 4x performance jump which is pretty much goes against every
> sysadmin fiber in me (i.e. "drive up the cpu load with unrelated crap to
> quadruple your i/o performance").
>
> This feels like some type of C-states frequency scaling shenanigans that
> I haven't quite ironed down yet. I booted the box with the following
> kernel parameters "intel_idle.max_cstate=0 processor.max_cstate=0" which
> didn't seem to make much of a difference. I also tried setting the
> frequency governer to userspace and setting the minimum frequency to
> 2.6ghz (it's a 2.6ghz cpu). None of that really matters-- I still have
> to run something to drive up the CPU load and then performance improves.
>
> I'm wondering if this could be an issue with the C1E state? I'm curious
> if anyone has seen anything like this. The node is a dx360 M4
> (Sandybridge) with 16 2.6GHz cores and 32GB of RAM.
>
> -Aaron

Re: [gpfsug-discuss] bizarre performance behavior

2017-04-20 Thread Marcus Koenig1

Hi Kenneth,

we also saw similar performance numbers in our tests: native was far
quicker than through GPFS. When we learned, though, that the client had
tested performance on the filesystem with a large block size (512k) and
small files, we were able to speed it up significantly by using a
smaller filesystem block size (obviously we had to recreate the
filesystem).

So it really depends on how you do your tests.
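
E.g. something along these lines when recreating the filesystem with a
smaller block size (the device name, stanza file and the 256K value are
just an example, not a recommendation):

mmcrfs gpfs0 -F nsd.stanza -B 256K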


Cheers,

Marcus Koenig
Lab Services Storage & Power Specialist
IBM Australia & New Zealand Advanced Technical Skills
IBM Systems-Hardware
Mobile: +64 21 67 34 27
E-mail: marc...@nz1.ibm.com
82 Wyndham Street
Auckland, AUK 1010
New Zealand





From:   "Uwe Falke" 
To: gpfsug main discussion list 
Date:   04/21/2017 03:07 AM
Subject:Re: [gpfsug-discuss] bizarre performance behavior
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Kenneth,

is prefetching off or on at your storage backend?
Raw sequential I/O is very different from GPFS sequential I/O at the
storage device! GPFS does its own prefetching; the storage controller
cannot know which sectors a sequential read at the GPFS level maps to
at the storage level.


Mit freundlichen Grüßen / Kind regards


Dr. Uwe Falke

IT Specialist
H