Re: [gpfsug-discuss] Request for folks using encryption on SKLM, run a word count

2020-09-11 Thread J. Eric Wonderley
We have Spectrum Archive with encryption on disk and tape.   We get maybe
100 or so messages like this daily.  It would be nice if the message had
some information about which client is the issue.

We have had client certs expire in the past.  The root cause of that outage
was a network outage... IIRC the certs are cached on the clients.

I don't know what to make of these messages... they do concern me.  I don't
have a very good opinion of the SKLM code... key replication between the
key servers has never worked as expected.


Eric Wonderley


On Tue, Sep 8, 2020 at 7:10 PM Wahl, Edward  wrote:

>  Ran into something a good while back and I'm curious how many others this
> affects.   If folks with encryption enabled could run a quick word count on
> their SKLM server and reply with a rough count I'd appreciate it.
> I've gone round and round with IBM SKLM support over the last year on this
> and it just has me wondering.  This is one of those "morbidly curious about
> making the sausage" things.
>
> Looking to see if this is a normal error message folks are seeing.  Just
> find your daily, rotating audit log and search it.  I'll trust most folks
> to figure this out, but let me know if you need help.
> Normal location is /opt/IBM/WebSphere/AppServer/products/sklm/logs/audit
> If you are on a normal Linux box try something like
>   locate sklm_audit.log | head -1 | xargs -i grep "Server does not trust the client certificate" {} | wc
> or whatever works for you.   If your audit log is fairly fresh, you might
> want to check the previous one.   I do NOT need exact information, just
> 'yeah we get 12 million out of a 500MB file' or 'we get zero', or something
> like that.
>
>  Mostly I'm curious if folks get zero, or a large number.  I've got my
> logs adjusted to 500MB and I get 8-digit numbers out of the previous log.
> Yet things work perfectly.  I've talked to two other SS sites where I know
> the admins personally, and they get larger numbers than I do.  But it's
> such a tiny sample size! LOL
>
> Ed Wahl
> Ohio Supercomputer Center
>
> Apologies for the message formatting issues.  Outlook fought tooth and
> nail against sending it with the path as is, and kept breaking my
> paragraphs.
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] gpfs filesets question

2020-04-16 Thread J. Eric Wonderley
Hi Fred:

I do.  I have 3 pools: system, an SSD data pool (fc_ssd400G) and a spinning
disk pool (fc_8T).

I believe the SSD data pool is empty at the moment and the system pool is
SSD and only contains metadata.
[root@cl005 ~]# mmdf home -P fc_ssd400G
disk                disk size  failure holds    holds              free KB             free KB
name                    in KB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: fc_ssd400G (Maximum disk size allowed is 97 TB)
r10f1e8            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f1e7            1924720640     1001 No       Yes      1924636672 (100%)         17408 ( 0%)
r10f1e6            1924720640     1001 No       Yes      1924636672 (100%)         17664 ( 0%)
r10f1e5            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f6e8            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f1e9            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f6e9            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
                -------------                         -------------------- -------------------
(pool total)      13473044480                            13472497664 (100%)         83712 ( 0%)

More or less empty.

Interesting...


On Thu, Apr 16, 2020 at 1:11 PM Frederick Stock  wrote:

> Do you have more than one GPFS storage pool in the system?  If you do and
> they align with the filesets then that might explain why moving data from
> one fileset to another is causing increased IO operations.
>
> Fred
> __
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
>
>
> ----- Original message -
> From: "J. Eric Wonderley" 
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> To: gpfsug main discussion list 
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] gpfs filesets question
> Date: Thu, Apr 16, 2020 12:32 PM
>
> I have filesets setup in a filesystem...looks like:
> [root@cl005 ~]# mmlsfileset home -L
> Filesets in file system 'home':
> Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
> root                      0          3        --  Tue Jun 30 07:54:09 2015           0  402653184    320946176  root fileset
> hess                      1  543733376         0  Tue Jun 13 14:56:13 2017           0          0            0
> predictHPC                2    1171116         0  Thu Jan  5 15:16:56 2017           0          0            0
> HYCCSIM                   3  544258049         0  Wed Jun 14 10:00:41 2017           0          0            0
> socialdet                 4  544258050         0  Wed Jun 14 10:01:02 2017           0          0            0
> arc                       5    1171073         0  Thu Jan  5 15:07:09 2017           0          0            0
> arcadm                    6    1171074         0  Thu Jan  5 15:07:10 2017           0          0            0
>
> I believe these are dependent filesets, dependent on the root fileset.
> Anyhow a user wants to move a large amount of data from one fileset to
> another.   Would this be a metadata-only operation?  He has attempted to
> move a small amount of data and has noticed some thrashing.
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] gpfs filesets question

2020-04-16 Thread J. Eric Wonderley
I have filesets setup in a filesystem...looks like:
[root@cl005 ~]# mmlsfileset home -L
Filesets in file system 'home':
Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
root                      0          3        --  Tue Jun 30 07:54:09 2015           0  402653184    320946176  root fileset
hess                      1  543733376         0  Tue Jun 13 14:56:13 2017           0          0            0
predictHPC                2    1171116         0  Thu Jan  5 15:16:56 2017           0          0            0
HYCCSIM                   3  544258049         0  Wed Jun 14 10:00:41 2017           0          0            0
socialdet                 4  544258050         0  Wed Jun 14 10:01:02 2017           0          0            0
arc                       5    1171073         0  Thu Jan  5 15:07:09 2017           0          0            0
arcadm                    6    1171074         0  Thu Jan  5 15:07:10 2017           0          0            0

I believe these are dependent filesets, dependent on the root fileset.
Anyhow a user wants to move a large amount of data from one fileset to
another.   Would this be a metadata-only operation?  He has attempted to
move a small amount of data and has noticed some thrashing.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks

2019-03-27 Thread J. Eric Wonderley
mmlspool might suggest there's only 1 system pool per cluster.  We have 2
clusters and it has id=0 on both.

One of our clusters has 2 filesystems that have the same id for two
different data-only pools:
[root@cl001 ~]# mmlspool home all
Name                    Id
system                   0
fc_8T                65537
fc_ssd400G           65538
[root@cl001 ~]# mmlspool work all
Name                    Id
system                   0
sas_6T               65537

I know metadata lives in the system pool, and if you do encryption you can
forget about putting data into your inodes for small files.



On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer  wrote:

> This presentation contains lots of good information about file system
> structure in general, and GPFS in specific, and I appreciate that and
> enjoyed reading it.
>
> However, it states outright (both graphically and in text) that storage
> pools are a feature of the cluster, not of a file system — which I believe
> to be completely incorrect. For example, it states that there is "only one
> system pool per cluster", rather than one per file system.
>
> Given that this was written by IBMers and presented at an actual users’
> group, can someone please weigh in on this? I’m asking because it
> represents a fundamental misunderstanding of a very basic GPFS concept,
> which makes me wonder how authoritative the rest of it is...
>
> --
> Stephen
>
>
>
> On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) 
> wrote:
>
> Hi Marc,
> "Indirect block size" is well explained in this presentation:
>
>
> http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf
>
> pages 37-41
>
> Cheers,
>
>Alvise
>
> --
> *From:* gpfsug-discuss-boun...@spectrumscale.org [
> gpfsug-discuss-boun...@spectrumscale.org] on behalf of Caubet Serrabou
> Marc (PSI) [marc.cau...@psi.ch]
> *Sent:* Tuesday, March 26, 2019 4:39 PM
> *To:* gpfsug main discussion list
> *Subject:* [gpfsug-discuss] GPFS v5: Blocksizes and subblocks
>
> Hi all,
>
> according to several GPFS presentations as well as according to the man
> pages:
>
>  Table 1. Block sizes and subblock sizes
>
> +---+---+
> | Block size| Subblock size |
> +---+---+
> | 64 KiB| 2 KiB |
> +---+---+
> | 128 KiB   | 4 KiB |
> +---+---+
> | 256 KiB, 512 KiB, 1 MiB, 2| 8 KiB |
> | MiB, 4 MiB|   |
> +---+---+
> | 8 MiB, 16 MiB | 16 KiB|
> +---+---+
>
> A block size of 8MiB or 16MiB should contain subblocks of 16KiB.
>
> However, when creating a new filesystem with a 16MiB blocksize, it looks
> like it is using 128KiB subblocks:
>
> [root@merlindssio01 ~]# mmlsfs merlin
> flag  value     description
> ----- --------- ------------------------------------------------------------
>  -f   8192      Minimum fragment (subblock) size in bytes (system pool)
>       131072    Minimum fragment (subblock) size in bytes (other pools)
>  -i   4096      Inode size in bytes
>  -I   32768     Indirect block size in bytes
> .
> .
> .
>  -n   128       Estimated number of nodes that will mount file system
>  -B   1048576   Block size (system pool)
>       16777216  Block size (other pools)
> .
> .
> .
>
> What am I missing? According to the documentation I expected this to be a
> fixed value, or is it not fixed at all?
>
> On the other hand, I don't really understand the concept 'Indirect block
> size in bytes', can somebody clarify or provide some details about this
> setting?
>
> Thanks a lot and best regards,
> Marc
> _
> Paul Scherrer Institut
> High Performance Computing
> Marc Caubet Serrabou
> Building/Room: WHGA/019A
> Forschungsstrasse, 111
> 5232 Villigen PSI
> Switzerland
>
> Telephone: +41 56 310 46 67
> E-Mail: marc.cau...@psi.ch
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org

Re: [gpfsug-discuss] Multihomed nodes and failover networks

2018-10-26 Thread J. Eric Wonderley
Multihoming is accomplished by using subnets...see mmchconfig.
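
Concretely, something along these lines is what I mean (the subnet value is
just an example, and I believe the daemons need a restart to pick it up):

mmchconfig subnets="10.51.0.0"
mmlsconfig subnets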

Failover networks, on the other hand, are not allowed.  Bad network behavior
is dealt with by expelling nodes.  You must have decent/supported network
gear... we have learned that lesson the hard way.

On Fri, Oct 26, 2018 at 10:37 AM Lukas Hejtmanek 
wrote:

> Hello,
>
> does anyone know whether there is a chance to use e.g. 10G Ethernet
> together with an InfiniBand network for multihoming of GPFS nodes?
>
> I mean to set up two different types of networks to mitigate network
> failures.  I read that you can have several networks configured in GPFS
> but it does not provide failover.  Has nothing changed in this as of GPFS
> version 5.x?
>
> --
> Lukáš Hejtmánek
>
> Linux Administrator only because
>   Full Time Multitasking Ninja
>   is not an official job title
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Mixing RDMA Client Fabrics for a single NSD Cluster

2018-07-19 Thread J. Eric Wonderley
Hi Stephan:

I think every node in C1 and in C2 has to see every node in the server
cluster, NSD-[A-D].

We have a 10 node server cluster where 2 nodes do nothing but serve out
NFS.  Since these two are a part of the server cluster, client clusters
wanting to mount the server cluster via GPFS need to see them.

I think both OPA fabrics need to be on all 4 of your server nodes.

Eric

On Thu, Jul 19, 2018 at 10:05 AM, Peinkofer, Stephan <
stephan.peinko...@lrz.de> wrote:

> Dear GPFS List,
>
> does anyone of you know, if it is possible to have multiple file systems
> in a GPFS Cluster that all are served primary via Ethernet but for which
> different “booster” connections to various IB/OPA fabrics exist.
>
> For example let’s say in my central Storage/NSD Cluster, I implement two
> file systems FS1 and FS2. FS1 is served by NSD-A and NSD-B and FS2 is
> served by NSD-C and NSD-D.
> Now I have two client Clusters C1 and C2 which have different OPA fabrics.
> Both Clusters can mount the two file systems via Ethernet, but I now add
> OPA connections for NSD-A and NSD-B to C1’s fabric and OPA connections for
> NSD-C and NSD-D to  C2’s fabric and just switch on RDMA.
> As far as I understood, GPFS will use RDMA if it is available between two
> nodes but switch to Ethernet if RDMA is not available between the two
> nodes. So given just this, the above scenario could work in principle. But
> will it work in reality and will it be supported by IBM?
>
> Many thanks in advance.
> Best Regards,
> Stephan Peinkofer
> --
> Stephan Peinkofer
> Leibniz Supercomputing Centre
> Data and Storage Division
> Boltzmannstraße 1, 85748 Garching b. München
> URL: http://www.lrz.de
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] more than one mlx connectx-4 adapter in same host

2017-12-20 Thread J. Eric Wonderley
Just plain TCP/IP.

We have dual-port ConnectX-4s in our NSD servers.  Upon adding a second
ConnectX-4 HBA, no links go up or show "up".  I have one port on each HBA
configured for Ethernet and ibv_devinfo looks sane.

I cannot find anything indicating that this should not work.  I have a
ticket opened with Mellanox.

On Wed, Dec 20, 2017 at 3:25 PM, Knister, Aaron S. (GSFC-606.2)[COMPUTER
SCIENCE CORP] <aaron.s.knis...@nasa.gov> wrote:

>
>
> We’ve done a fair amount of VPI work but admittedly not with connectx4. Is
> it possible the cards are trying to talk IB rather than Eth? I figured
> you’re Ethernet based because of the mention of Juniper.
>
> Are you attempting to do RoCE or just plain TCP/IP?
>
>
> On December 20, 2017 at 14:40:48 EST, J. Eric Wonderley <
> eric.wonder...@vt.edu> wrote:
>
> Hello:
>
> Does anyone have this type of config?
>
> The host configuration looks sane but we seem to observe link-down on all
> mlx adapters no matter what we do.
>
> Big picture is that we are attempting to do mc(multichassis)-lags to a
> core switch.  I'm somewhat fearful as to how this is implemented in the
> juniper switch we are about to test.
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] mm'add|del'node with ccr enabled

2017-12-08 Thread J. Eric Wonderley
Hello:

If I recall correctly this does not work... correct?  I think the last time
I attempted this was GPFS version <=4.1.  I think I attempted to add a
quorum node.

The process that I remember doing was mmshutdown -a, mmchcluster
--ccr-disable, mmaddnode yadayada, mmchcluster --ccr-enable, mmstartup.
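
Spelled out, that was something like this (node name and role here are
placeholders):

mmshutdown -a
mmchcluster --ccr-disable
mmaddnode -N newnode:quorum
mmchcluster --ccr-enable
mmstartup -a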

I think with CCR disabled mmaddnode can be run with GPFS up.  We would like
to run with CCR enabled, but it does make adding/removing nodes unpleasant.

Would this be required of a non-quorum node?

Any changes concerning this with gpfs version >=4.2?
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] sas avago/lsi hba reseller recommendation

2017-08-28 Thread J. Eric Wonderley
We have several Avago/LSI 9305-16e HBAs that I believe came from Advanced HPC.

Can someone recommend another reseller of these HBAs, or a contact with
Advanced HPC?
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmsetquota produces error

2017-08-17 Thread J. Eric Wonderley
I recently opened a PMR on this issue (24603,442,000)... I'll keep this
thread posted on the results.

On Thu, Aug 17, 2017 at 10:30 AM, James Davis <jamieda...@us.ibm.com> wrote:

> I've also tried on our in-house latest release and cannot recreate it.
>
> I'll ask around to see who's running a 4.2.2 cluster I can look at.
>
>
> - Original message -
> From: "Sobey, Richard A" <r.so...@imperial.ac.uk>
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>,
> "Edward Wahl" <ew...@osc.edu>
> Cc:
> Subject: Re: [gpfsug-discuss] mmsetquota produces error
> Date: Thu, Aug 17, 2017 10:20 AM
>
>
> I’ve just done exactly that and can’t reproduce it in my prod environment.
> Running 4.2.3-2 though.
>
>
>
> [root@icgpfs01 ~]# mmlsfileset gpfs setquotafoo -L
>
> Filesets in file system 'gpfs':
>
> Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
>
> setquotafoo              25   18408295         0  Thu Aug 17 15:17:18 2017           0          0            0
>
>
>
> *From:* gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-
> boun...@spectrumscale.org] *On Behalf Of *J. Eric Wonderley
> *Sent:* 17 August 2017 15:14
> *To:* Edward Wahl <ew...@osc.edu>
> *Cc:* gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> *Subject:* Re: [gpfsug-discuss] mmsetquota produces error
>
>
>
> The error is very repeatable...
> [root@cl001 ~]# mmcrfileset home setquotafoo
> Fileset setquotafoo created with id 61 root inode 3670407.
> [root@cl001 ~]# mmlinkfileset home setquotafoo -J /gpfs/home/setquotafoo
> Fileset setquotafoo linked at /gpfs/home/setquotafoo
> [root@cl001 ~]# mmsetquota home:setquotafoo --block 10T:10T --files
> 10M:10M
> tssetquota: Could not get id of fileset 'setquotafoo' error (22): 'Invalid
> argument'.
> mmsetquota: Command failed. Examine previous error messages to determine
> cause.
> [root@cl001 ~]# mmlsfileset home setquotafoo -L
> Filesets in file system 'home':
> Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
> setquotafoo              61    3670407         0  Thu Aug 17 10:10:54 2017           0          0            0
>
>
>
> On Thu, Aug 17, 2017 at 9:43 AM, Edward Wahl <ew...@osc.edu> wrote:
>
> On Fri, 4 Aug 2017 01:02:22 -0400
> "J. Eric Wonderley" <eric.wonder...@vt.edu> wrote:
>
> > 4.2.2.3
> >
> > I want to think maybe this started after expanding inode space
>
> What does 'mmlsfileset home nathanfootest -L'   say?
>
> Ed
>
>
>
> >
> > On Thu, Aug 3, 2017 at 9:11 AM, James Davis <jamieda...@us.ibm.com>
> wrote:
> >
> > > Hey,
> > >
> > > Hmm, your invocation looks valid to me. What's your GPFS level?
> > >
> > > Cheers,
> > >
> > > Jamie
> > >
> > >
> > > - Original message -
> > > From: "J. Eric Wonderley" <eric.wonder...@vt.edu>
> > > Sent by: gpfsug-discuss-boun...@spectrumscale.org
> > > To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> > > Cc:
> > > Subject: [gpfsug-discuss] mmsetquota produces error
> > > Date: Wed, Aug 2, 2017 5:03 PM
> > >
> > > for one of our home filesystem we get:
> > > mmsetquota home:nathanfootest --block 10T:10T --files 10M:10M
> > > tssetquota: Could not get id of fileset 'nathanfootest' error (22):
> > > 'Invalid argument'.
> > >
> > >
> > > mmedquota -j home:nathanfootest
> > > does work however
> > > ___
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> > >
> > >
> > >
> > >
> > > ___
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> > >
> > >
>
>
> --
>
> Ed Wahl
> Ohio Supercomputer Center
> 614-292-9302
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmsetquota produces error

2017-08-17 Thread J. Eric Wonderley
The error is very repeatable...
[root@cl001 ~]# mmcrfileset home setquotafoo
Fileset setquotafoo created with id 61 root inode 3670407.
[root@cl001 ~]# mmlinkfileset home setquotafoo -J /gpfs/home/setquotafoo
Fileset setquotafoo linked at /gpfs/home/setquotafoo
[root@cl001 ~]# mmsetquota home:setquotafoo --block 10T:10T --files 10M:10M
tssetquota: Could not get id of fileset 'setquotafoo' error (22): 'Invalid
argument'.
mmsetquota: Command failed. Examine previous error messages to determine
cause.
[root@cl001 ~]# mmlsfileset home setquotafoo -L
Filesets in file system 'home':
Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
setquotafoo              61    3670407         0  Thu Aug 17 10:10:54 2017           0          0            0


On Thu, Aug 17, 2017 at 9:43 AM, Edward Wahl <ew...@osc.edu> wrote:

> On Fri, 4 Aug 2017 01:02:22 -0400
> "J. Eric Wonderley" <eric.wonder...@vt.edu> wrote:
>
> > 4.2.2.3
> >
> > I want to think maybe this started after expanding inode space
>
> What does 'mmlsfileset home nathanfootest -L'   say?
>
> Ed
>
>
> >
> > On Thu, Aug 3, 2017 at 9:11 AM, James Davis <jamieda...@us.ibm.com>
> wrote:
> >
> > > Hey,
> > >
> > > Hmm, your invocation looks valid to me. What's your GPFS level?
> > >
> > > Cheers,
> > >
> > > Jamie
> > >
> > >
> > > - Original message -
> > > From: "J. Eric Wonderley" <eric.wonder...@vt.edu>
> > > Sent by: gpfsug-discuss-boun...@spectrumscale.org
> > > To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> > > Cc:
> > > Subject: [gpfsug-discuss] mmsetquota produces error
> > > Date: Wed, Aug 2, 2017 5:03 PM
> > >
> > > for one of our home filesystem we get:
> > > mmsetquota home:nathanfootest --block 10T:10T --files 10M:10M
> > > tssetquota: Could not get id of fileset 'nathanfootest' error (22):
> > > 'Invalid argument'.
> > >
> > >
> > > mmedquota -j home:nathanfootest
> > > does work however
> > > ___
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> > >
> > >
> > >
> > >
> > > ___
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> > >
> > >
>
>
>
> --
>
> Ed Wahl
> Ohio Supercomputer Center
> 614-292-9302
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] data integrity documentation

2017-08-04 Thread J. Eric Wonderley
I actually hit this assert and turned it in to support on this version:
Build branch "4.2.2.3 efix6 (987197)".

I was told to do exactly what Sven mentioned.

I thought it strange that I did NOT hit the assert in a "no" pass but hit it
in a "yes" pass.

On Thu, Aug 3, 2017 at 9:06 AM, Sven Oehme  wrote:

> a trace during a mmfsck with the checksum parameters turned on would
> reveal it.
> the support team should be able to give you specific triggers to cut a
> trace during checksum errors , this way the trace is cut when the issue
> happens and then from the trace on server and client side one can extract
> which card was used on each side.
>
> sven
>
> On Wed, Aug 2, 2017 at 2:53 PM Stijn De Weirdt 
> wrote:
>
>> hi steve,
>>
>> > The nsdChksum settings for none GNR/ESS based system is not officially
>> > supported.It will perform checksum on data transfer over the network
>> > only and can be used to help debug data corruption when network is a
>> > suspect.
>> i'll take not officially supported over silent bitrot any day.
>>
>> >
>> > Did any of those "Encountered XYZ checksum errors on network I/O to NSD
>> > Client disk" warning messages resulted in disk been changed to "down"
>> > state due to IO error?
>> no.
>>
>>  If no disk IO error was reported in GPFS log,
>> > that means data was retransmitted successfully on retry.
>> we suspected as much. as sven already asked, mmfsck now reports clean
>> filesystem.
>> i have an ibdump of 2 involved nsds during the reported checksums, i'll
>> have a closer look if i can spot these retries.
>>
>> >
>> > As sven said, only GNR/ESS provids the full end to end data integrity.
>> so with the silent network error, we have high probabilty that the data
>> is corrupted.
>>
>> we are now looking for a test to find out what adapters are affected. we
>> hoped that nsdperf with verify=on would tell us, but it doesn't.
>>
>> >
>> > Steve Y. Xiao
>> >
>> >
>> >
>> >
>> >
>> >
>> > ___
>> > gpfsug-discuss mailing list
>> > gpfsug-discuss at spectrumscale.org
>> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> >
>> ___
>> gpfsug-discuss mailing list
>> gpfsug-discuss at spectrumscale.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmsetquota produces error

2017-08-03 Thread J. Eric Wonderley
4.2.2.3

I want to think maybe this started after expanding the inode space.

On Thu, Aug 3, 2017 at 9:11 AM, James Davis <jamieda...@us.ibm.com> wrote:

> Hey,
>
> Hmm, your invocation looks valid to me. What's your GPFS level?
>
> Cheers,
>
> Jamie
>
>
> ----- Original message -
> From: "J. Eric Wonderley" <eric.wonder...@vt.edu>
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Cc:
> Subject: [gpfsug-discuss] mmsetquota produces error
> Date: Wed, Aug 2, 2017 5:03 PM
>
> for one of our home filesystem we get:
> mmsetquota home:nathanfootest --block 10T:10T --files 10M:10M
> tssetquota: Could not get id of fileset 'nathanfootest' error (22):
> 'Invalid argument'.
>
>
> mmedquota -j home:nathanfootest
> does work however
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] data integrity documentation

2017-08-02 Thread J. Eric Wonderley
No guarantee... unless you are using an ESS/GSS solution.

A crappy network will get you loads of expels and occasional fscks, which I
guess beats data loss and recovery from backup.

You probably have a network issue... they can be subtle.  GPFS is an
extremely thorough network tester.


Eric

On Wed, Aug 2, 2017 at 11:57 AM, Stijn De Weirdt 
wrote:

> hi all,
>
> is there any documentation wrt data integrity in spectrum scale:
> assuming a crappy network, does gpfs garantee somehow that data written
> by client ends up safe in the nsd gpfs daemon; and similarly from the
> nsd gpfs daemon to disk.
>
> and wrt crappy network, what about rdma on crappy network? is it the same?
>
> (we are hunting down a crappy infiniband issue; ibm support says it's
> network issue; and we see no errors anywhere...)
>
> thanks a lot,
>
> stijn
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Quota and hardlimit enforcement

2017-07-31 Thread J. Eric Wonderley
Hi Renar:

What does 'mmlsquota -j fileset filesystem' report?

I did not think you would get a grace period of none unless the
hardlimit=softlimit.
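
i.e. I'd only expect "none" in the grace column with soft set equal to hard,
something like (values made up):

mmsetquota filesystem:fileset --block 250G:250G --files 5M:5M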

On Mon, Jul 31, 2017 at 1:44 PM, Grunenberg, Renar <
renar.grunenb...@huk-coburg.de> wrote:

> Hello All,
> we are on version 4.2.3.2 and see some misunderstanding in the enforcement
> of hard-limit definitions on a fileset quota. What we see: we put some 200
> GB files on the following quota definitions:  quota 150 GB, limit 250 GB,
> grace none.
> After creating one 200 GB file we hit the soft quota limit; that's ok. But
> after the second file was created we expected an I/O error, but it didn't
> happen. We defined all the well-known parameters (-Q, ...) on the
> filesystem. Is this a bug or a feature? mmcheckquota was already run first.
> Regards Renar.
>
>
>
> Renar Grunenberg
> Abteilung Informatik – Betrieb
>
> HUK-COBURG
> Bahnhofsplatz
> 96444 Coburg
> Telefon: 09561 96-44110
> Telefax: 09561 96-44104
> E-Mail: renar.grunenb...@huk-coburg.de
> Internet: www.huk.de
>
> --
> HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter
> Deutschlands a. G. in Coburg
> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
> Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav
> Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas (stv.).
> --
> Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte
> Informationen.
> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich
> erhalten haben,
> informieren Sie bitte sofort den Absender und vernichten Sie diese
> Nachricht.
> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht
> ist nicht gestattet.
>
> This information may contain confidential and/or privileged information.
> If you are not the intended recipient (or have received this information
> in error) please notify the
> sender immediately and destroy this information.
> Any unauthorized copying, disclosure or distribution of the material in
> this information is strictly forbidden.
> --
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] gui related connection fail in gpfs logs

2017-06-20 Thread J. Eric Wonderley
These types of messages repeat often in our logs:

2017-06-20_09:25:13.676-0400: [E]
An%20attempt%20to%20send%20notification%20to%20the%20GUI%20subsystem%20failed%2E%20response%3Dcurl%3A%20%287%29%20Failed%20connect%20to%20arproto2%2Ear%2Enis%2Eisb%2Einternal%3A443%3B%20Connection%20refused%20rc%3D7
rc=1
2017-06-20_09:25:24.292-0400: [E]
An%20attempt%20to%20send%20notification%20to%20the%20GUI%20subsystem%20failed%2E%20response%3Dcurl%3A%20%287%29%20Failed%20connect%20to%20arproto2%2Ear%2Enis%2Eisb%2Einternal%3A443%3B%20Connection%20refused%20rc%3D7
rc=1
2017-06-20_10:00:25.935-0400: [E]
An%20attempt%20to%20send%20notification%20to%20the%20GUI%20subsystem%20failed%2E%20response%3Dcurl%3A%20%287%29%20Failed%20connect%20to%20arproto2%2Ear%2Enis%2Eisb%2Einternal%3A443%3B%20Connection%20refused%20rc%3D7
rc=1

Is there any way to tell if it is a misconfiguration or communications
issue?
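
For what it's worth, they all decode to "An attempt to send notification to
the GUI subsystem failed. response=curl: (7) Failed connect to
arproto2.ar.nis.isb.internal:443; Connection refused rc=7".  A one-liner
like this does the decoding in bulk (python3 assumed; log path is the usual
one on our nodes):

python3 -c 'import sys,urllib.parse; print(urllib.parse.unquote(sys.stdin.read()))' < /var/adm/ras/mmfs.log.latest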
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] help with multi-cluster setup: Network is unreachable

2017-05-08 Thread J. Eric Wonderley
Hi Jamie:

I think typically you want to keep the clients ahead of the servers in
version.  I would advance the version of your client nodes.

New clients can communicate with older versions of server NSDs.  Vice
versa... not so much.
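
A quick way to check what each side is actually running (sketch):

mmdiag --version
mmlsconfig minReleaseLevel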
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] snapshots & tiering in a busy filesystem

2017-03-22 Thread J. Eric Wonderley
The filesystem I'm working with has about 100M files and 80TB of data.

What kind of metadata latency do you observe?
I did an mmdiag --iohist, pulled out just the metadata (md) devices, and
averaged over reads and writes.  I'm seeing ~0.28ms on a one-off dump.  The
Pure array we have is 10G iSCSI connected and reports an average of 0.25ms.
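
Something along these lines gets at the number (the buffer-type match and
the "time ms" field position are assumptions from memory, so sanity-check
against your own mmdiag output):

mmdiag --iohist | awk '$3 ~ /inode|metadata|logData|IndBlock/ {sum += $6; n++} END {if (n) printf "%d I/Os, avg %.2f ms\n", n, sum/n}'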

On Wed, Mar 22, 2017 at 6:47 AM, Sobey, Richard A <r.so...@imperial.ac.uk>
wrote:

> We’re also snapshotting 4 times a day. Filesystem isn’t tremendously busy
> at all but we’re creating snaps for each fileset.
>
>
>
> [root@cesnode tmp]# mmlssnapshot gpfs | wc -l
>
> 6916
>
>
>
> *From:* gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-
> boun...@spectrumscale.org] *On Behalf Of *J. Eric Wonderley
> *Sent:* 20 March 2017 14:03
> *To:* gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> *Subject:* [gpfsug-discuss] snapshots & tiering in a busy filesystem
>
>
>
> I found this link and it didn't give me much hope for doing snapshots &
> backup in a home(busy) filesystem:
>
> http://www.spectrumscale.org/pipermail/gpfsug-discuss/2013-
> February/000200.html
>
> I realize this is dated and I wondered if QOS etc. have made it a tolerable
> thing to do now.  GPFS I think was barely above v3.5 in mid-2013.
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] snapshots & tiering in a busy filesystem

2017-03-20 Thread J. Eric Wonderley
I found this link and it didn't give me much hope for doing snapshots &
backup in a home(busy) filesystem:
http://www.spectrumscale.org/pipermail/gpfsug-discuss/2013-
February/000200.html

I realize this is dated and I wondered if QOS etc. have made it a tolerable
thing to do now.  GPFS I think was barely above v3.5 in mid-2013.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Fw: mmbackup examples using policy

2017-02-15 Thread J. Eric Wonderley
Hi Steven:

Yes, that is more or less what we want to do.  We have Tivoli here for
backup so I'm somewhat familiar with inclexcl files.  The filesystem I want
to back up is a shared home.

Right now I do have a policy... mmlspolicy home -L does return a policy.  So
if I did not want to back up core and cache files, I could create a backup
policy using /var/mmfs/mmbackup/.mmbackupRules.home and place something like
this in it?:
EXCLUDE "/gpfs/home/.../core"
EXCLUDE "/igpfs/home/.../.opera/cache4"
EXCLUDE "/gpfs/home/.../.netscape/cache/.../*"
EXCLUDE "/gpfs/home/.../.mozilla/default/.../Cache"
EXCLUDE "/gpfs/home/.../.mozilla/.../Cache/*"
EXCLUDE "/gpfs/home/.../.mozilla/.../Cache"
EXCLUDE "/gpfs/home/.../.cache/mozilla/*"
EXCLUDE.DIR "/gpfs/home/.../.mozilla/firefox/.../Cache"

I did a test run of mmbackup and I noticed I got a template put in that
location:
[root@cl002 ~]# ll -al /var/mmfs/mmbackup/
total 12
drwxr-xr-x  2 root root 4096 Feb 15 07:43 .
drwxr-xr-x 10 root root 4096 Jan  4 10:42 ..
-r  1 root root 1177 Feb 15 07:43 .mmbackupRules.home

So I can copy this off into /var/mmfs/etc, for example, and use it next time
with my edits.

What is normally used to schedule the mmbackup?   Cronjob?   dsmcad?
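
If cron, I'm picturing something like this (the rules-file path and log path
are just placeholders; the -P file would be my edited copy of the generated
rules):

0 1 * * * /usr/lpp/mmfs/bin/mmbackup /gpfs/home -t incremental -P /var/mmfs/etc/mmbackupRules.home >/var/log/mmbackup.home.log 2>&1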

Thanks much.

On Tue, Feb 14, 2017 at 11:21 AM, Steven Berman <sber...@us.ibm.com> wrote:

> Eric,
>What specifically do you wish to accomplish?It sounds to me like
> you want to use mmbackup to do incremental backup of parts or all of your
> file system.   But your question did not specify what specifically other
> than "whole file system incremental"  you want to accomplish.   Mmbackup by
> default, with "-t incremental"  will back up the whole file system,
> including all filesets of either variety, and without regard to storage
> pools.   If you wish to back up only a sub-tree of the file system, it must
> be in an independent fileset (--inode-space=new)  and the current product
> supports doing the backup of just that fileset.If you want to backup
> parts of the file system but exclude things in certain storage pools, from
> anywhere in the tree, you can either use "include exclude rules"  in your
> Spectrum Protect (formerly TSM) configuration file, or you can hand-edit
> the policy rules for mmbackup, which can be copied from
> /var/mmfs/mmbackup/.mmbackupRules.<file system name> (only persistent
> during mmbackup execution).  Copy that file to a new location, hand-edit it
> and run mmbackup next time with -P <path to policy rules file>.  Is there
> something else you want to accomplish?
>
> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.2/
> com.ibm.spectrum.scale.v4r22.doc/bl1adv_semaprul.htm
> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.2/
> com.ibm.spectrum.scale.v4r22.doc/bl1adm_backupusingmmbackup.htm
>
> Steven Berman Spectrum Scale / HPC General Parallel File
> System Dev.
> Pittsburgh, PA  (412) 667-6993   Tie-Line 989-6993
>sber...@us.ibm.com
> ----Every once in a while, it is a good idea to call out, "Computer, end
> program!"  just to check.  --David Noelle
> All Your Base Are Belong To Us.  --CATS
>
>
>
>
>
> From:"J. Eric Wonderley" <eric.wonder...@vt.edu>
> To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
>
> Date:02/13/2017 10:28 AM
> Subject:[gpfsug-discuss] mmbackup examples using policy
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> Anyone have any examples of this?  I have a filesystem that has 2 pools
> and several filesets and would like daily progressive incremental backups
> of its contents.
>
> I found some stuff here(nothing real close to what I wanted however):
> /usr/lpp/mmfs/samples/ilm
>
> I have the tsm client installed on the server nsds.
>
> Thanks much___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] mmbackup examples using policy

2017-02-13 Thread J. Eric Wonderley
Anyone have any examples of this?  I have a filesystem that has 2 pools and
several filesets and would like daily progressive incremental backups of
its contents.

I found some stuff here(nothing real close to what I wanted however):
/usr/lpp/mmfs/samples/ilm

I have the tsm client installed on the server nsds.

Thanks much
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] proper gpfs shutdown when node disappears

2017-02-03 Thread J. Eric Wonderley
Well, we got it into the down state using mmsdrrestore -p to recover stuff
into /var/mmfs/gen on cl004.

Anyhow we ended up with unknown for cl004 when it powered off.  Short of
removing the node, unknown is the state you get.

Unknown seems stable for a hopefully short outage of cl004.


Thanks

On Thu, Feb 2, 2017 at 4:28 PM, Olaf Weiser  wrote:

> many ways lead to Rome .. and I agree .. mmexpelnode is a nice command ..
> another approach...
> power it off .. (not reachable by ping) .. mmdelnode ... power on/boot ...
> mmaddnode ..
>
>
>
> From:Aaron Knister 
> To:
> Date:02/02/2017 08:37 PM
> Subject:Re: [gpfsug-discuss] proper gpfs shutdown when node
> disappears
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> You could forcibly expel the node (one of my favorite GPFS commands):
>
> mmexpelnode -N $nodename
>
> and then power it off after the expulsion is complete and then do
>
> mmexpelnode -r -N $nodename
>
> which will allow it to join the cluster next time you try and start up
> GPFS on it. You'll still likely have to go through recovery but you'll
> skip the part where GPFS wonders where the node went prior to it
> expelling it.
>
> -Aaron
>
> On 2/2/17 2:28 PM, valdis.kletni...@vt.edu wrote:
> > On Thu, 02 Feb 2017 18:28:22 +0100, "Olaf Weiser" said:
> >
> >> but the /var/mmfs DIR is obviously damaged/empty .. what ever.. that's
> why you
> >> see a message like this..
> >> have you reinstalled that node / any backup/restore thing ?
> >
> > The internal RAID controller died a horrid death and basically took
> > all the OS partitions with it.  So the node was just sort of limping
> along,
> > where the mmfsd process was still coping because it wasn't doing any
> > I/O to the OS partitions - but 'ssh bad-node mmshutdown' wouldn't work
> > because that requires accessing stuff in /var.
> >
> > At that point, it starts getting tempting to just use ipmitool from
> > another node to power the comatose one down - but that often causes
> > a cascade of other issues while things are stuck waiting for timeouts.
> >
> >
> > ___
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] proper gpfs shutdown when node disappears

2017-02-02 Thread J. Eric Wonderley
Is there a way to accomplish this so the rest of the cluster knows it's down?

My state now:
[root@cl001 ~]# mmgetstate -aL
cl004.cl.arc.internal:  mmremote: determineMode: Missing file /var/mmfs/gen/mmsdrfs.
cl004.cl.arc.internal:  mmremote: This node does not belong to a GPFS cluster.
mmdsh: cl004.cl.arc.internal remote shell process had return code 1.

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state   Remarks
------------------------------------------------------------------------------
       1      cl001           5         7            8  active       quorum node
       2      cl002           5         7            8  active       quorum node
       3      cl003           5         7            8  active       quorum node
       4      cl004           0         0            8  unknown      quorum node
       5      cl005           5         7            8  active       quorum node
       6      cl006           5         7            8  active       quorum node
       7      cl007           5         7            8  active       quorum node
       8      cl008           5         7            8  active       quorum node

cl004 we think has an internal raid controller blowout
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Path to NSD lost when host_sas_address changed on port

2017-01-20 Thread J. Eric Wonderley
Maybe multipath is not seeing all of the WWNs?

Does multipath -v3 | grep ^51866 look ok?

For some unknown reason multipath does not see our SanDisk array... we have
to add them to the end of the /etc/multipath/wwids file.


On Fri, Jan 20, 2017 at 10:32 AM, David D. Johnson 
wrote:

> We have most of our GPFS NSD storage set up as pairs of RAID boxes served
> by failover pairs of servers.
> Most of it is FibreChannel, but the newest four boxes and servers are
> using dual port SAS controllers.
> Just this week, we had one server lose one out of the paths to one of the
> raid boxes. Took a while
> to realize what happened, but apparently the port2 ID changed from
> 51866da05cf7b001 to
> 51866da05cf7b002 on the fly, without rebooting.  Port1 is still
> 51866da05cf7b000, which is the card ID (host_add).
>
> We’re running gpfs 4.2.2.1 on RHEL7.2 on these hosts.
>
> Has anyone else seen this kind of behavior?
> First noticed these messages, 3 hours 13 minutes after boot:
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from
> build_and_issue_cmd
>
> The multipath daemon was sending lots of log messages like:
> Jan 10 13:49:22 storage043 multipathd: mpathw: load table [0 4642340864
> multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1
> 1 8:64 1]
> Jan 10 13:49:22 storage043 multipathd: mpathaa: load table [0 4642340864
> multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1
> 1 8:96 1]
> Jan 10 13:49:22 storage043 multipathd: mpathx: load table [0 4642340864
> multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1
> 1 8:128 1]
>
> Currently worked around problem by including 00 01 and 02 for all 8 SAS
> cards when mapping LUN/volume to host groups.
>
> Thanks,
>  — ddj
> Dave Johnson
> Brown University CCV
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] rmda errors scatter thru gpfs logs

2017-01-17 Thread J. Eric Wonderley
I have messages like these frequently in my logs:
Tue Jan 17 11:25:49.731 2017: [E] VERBS RDMA rdma write error
IBV_WC_REM_ACCESS_ERR to 10.51.10.5 (cl005) on mlx5_0 port 1 fabnum 0
vendor_err 136
Tue Jan 17 11:25:49.732 2017: [E] VERBS RDMA closed connection to
10.51.10.5 (cl005) on mlx5_0 port 1 fabnum 0 due to RDMA write error
IBV_WC_REM_ACCESS_ERR index 23

Any ideas on the cause?
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Authorized Key Messages

2017-01-13 Thread J. Eric Wonderley
Our intent was to have CCR turned off since all nodes are quorum nodes in
the server cluster.

Considering this:
[root@cl001 ~]# mmfsadm dump config | grep -i ccr
 ! ccrEnabled 0
   ccrMaxChallengeCheckRetries 4
 ccr :  0   (cluster configuration repository)
 ccr :  1   (cluster configuration repository)

Will this disable ccr?

On Fri, Jan 13, 2017 at 5:58 PM, Felipe Knop  wrote:

> Brian,
>
> I had to check again whether the fix in question was in 4.2.0.0 (as
> opposed to a newer mod release), but confirmed that it seems to be.   So
> this could be a new or different problem than the one I was thinking about.
>
> Researching a bit further, I found another potential match (internal
> defect number 981469), but that should be fixed in 4.2.1 as well. I have
> not seen recent reports of this problem.
>
> Perhaps this could be pursued via a PMR.
>
>   Felipe
>
> 
> Felipe Knop k...@us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314  T/L 293-9314
>
>
>
>
>
> From:Brian Marshall 
> To:gpfsug main discussion list 
> Date:01/13/2017 03:21 PM
> Subject:Re: [gpfsug-discuss] Authorized Key Messages
>
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> We are running 4.2.1  (there may be some point fixes we don't have)
>
> Any report of it being in this version?
>
> Brian
>
> On Fri, Jan 13, 2017 at 3:14 PM, Felipe Knop <*k...@us.ibm.com*
> > wrote:
> Brian,
>
> This seems to match a problem which was fixed in 4.1.1.7 and 4.2.0.0.
>
> Regards,
>
>   Felipe
>
> 
> Felipe Knop *k...@us.ibm.com*
> 
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> *(845) 433-9314* <(845)%20433-9314>  T/L 293-9314
>
>
>
>
>
> From:Brian Marshall <*mimar...@vt.edu* >
> To:gpfsug main discussion list <*gpfsug-discuss@spectrumscale.org*
> >
> Date:01/13/2017 02:50 PM
> Subject:[gpfsug-discuss] Authorized Key Messages
> Sent by:*gpfsug-discuss-boun...@spectrumscale.org*
> 
> --
>
>
>
>
> All,
>
> I just saw this message start popping up constantly on one our NSD Servers.
>
> [N] Auth: '/var/mmfs/ssl/authorized_ccr_keys' does not exist
>
> CCR Auth is disabled on all the NSD Servers.
>
> What other features/checks would look for the ccr keys?
>
> Thanks,
> Brian Marshall
> Virginia Tech___
> gpfsug-discuss mailing list
> gpfsug-discuss at *spectrumscale.org* 
> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss*
> 
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at *spectrumscale.org* 
> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss*
> 
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] replication and no failure groups

2017-01-09 Thread J. Eric Wonderley
Hi Yaron:

We have 5: 4x MD3860f and 1x IF150.

The IF150 requires data replicas=2 to get the HA and protection they
recommend.  We have it presented in a fileset that appears in a user's
work area.
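
For reference, something like this shows the replication settings on that
filesystem (default/max metadata and data replicas):

mmlsfs work -m -M -r -R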

On Mon, Jan 9, 2017 at 3:53 PM, Yaron Daniel <y...@il.ibm.com> wrote:

> Hi
>
> So - are you able to have GPFS replication for the MD failure groups?
>
> I can see that you have 3 failure groups for data (-1, 2012, 2034); how
> many storage subsystems do you have?
>
>
>
>
> Regards
>
>
>
> --
>
>
>
> *Yaron Daniel*  94 Em Ha'Moshavot Rd
> *Server, **Storage and Data Services*
> <https://w3-03.ibm.com/services/isd/secure/client.wss/Somt?eventType=getHomePage=115>*-
> Team Leader*   Petach Tiqva, 49527
> *Global Technology Services*  Israel
> Phone: +972-3-916-5672 <+972%203-916-5672>
> Fax: +972-3-916-5672 <+972%203-916-5672>
> Mobile: +972-52-8395593 <+972%2052-839-5593>
> e-mail: y...@il.ibm.com
> *IBM Israel* <http://www.ibm.com/il/he/>
>
>
>
>
>
>
>
> From:"J. Eric Wonderley" <eric.wonder...@vt.edu>
> To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date:01/09/2017 10:48 PM
> Subject:Re: [gpfsug-discuss] replication and no failure groups
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> Hi Yaron:
>
> This is the filesystem:
>
> [root@cl005 net]# mmlsdisk work
> disk          driver   sector failure holds    holds                              storage
> name          type       size   group metadata data  status        availability  pool
> ------------- -------- ------ ------- -------- ----- ------------- ------------- ----------
> nsd_a_7       nsd         512      -1 No       Yes   ready         up            system
> nsd_b_7       nsd         512      -1 No       Yes   ready         up            system
> nsd_c_7       nsd         512      -1 No       Yes   ready         up            system
> nsd_d_7       nsd         512      -1 No       Yes   ready         up            system
> nsd_a_8       nsd         512      -1 No       Yes   ready         up            system
> nsd_b_8       nsd         512      -1 No       Yes   ready         up            system
> nsd_c_8       nsd         512      -1 No       Yes   ready         up            system
> nsd_d_8       nsd         512      -1 No       Yes   ready         up            system
> nsd_a_9       nsd         512      -1 No       Yes   ready         up            system
> nsd_b_9       nsd         512      -1 No       Yes   ready         up            system
> nsd_c_9       nsd         512      -1 No       Yes   ready         up            system
> nsd_d_9       nsd         512      -1 No       Yes   ready         up            system
> nsd_a_10      nsd         512      -1 No       Yes   ready         up            system
> nsd_b_10      nsd         512      -1 No       Yes   ready         up            system
> nsd_c_10      nsd         512      -1 No       Yes   ready         up            system
> nsd_d_10      nsd         512      -1 No       Yes   ready         up            system
> nsd_a_11      nsd         512      -1 No       Yes   ready         up            system
> nsd_b_11      nsd         512      -1 No       Yes   ready         up            system
> nsd_c_11      nsd         512      -1 No       Yes   ready         up            system
> nsd_d_11      nsd         512      -1 No       Yes   ready         up            system
> nsd_a_12      nsd         512      -1 No       Yes   ready         up            system
> nsd_b_12      nsd         512      -1 No       Yes   ready         up            system
> nsd_c_12      nsd         512      -1 No       Yes   ready         up            system
> nsd_d_12      nsd         512      -1 No       Yes   ready         up            system
> work_md_pf1_1 nsd         512     200 Yes      No    ready         up            system
> jbf1z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf2z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf3z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf4z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf5z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf6z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf7z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf8z1        nsd        4096    2012 No       Yes   ready         up            sas_ssd4T
> jbf1z2   nsd409

[gpfsug-discuss] nsd not adding with one quorum node down?

2017-01-05 Thread J. Eric Wonderley
I have one quorum node down and attempting to add a nsd to a fs:
[root@cl005 ~]# mmadddisk home -F add_1_flh_home -v no |& tee
/root/adddisk_flh_home.out
Verifying file system configuration information ...

The following disks of home will be formatted on node cl003:
r10f1e5: size 1879610 MB
Extending Allocation Map
Checking Allocation Map for storage pool fc_ssd400G
  55 % complete on Thu Jan  5 14:43:31 2017
Lost connection to file system daemon.
mmadddisk: tsadddisk failed.
Verifying file system configuration information ...
mmadddisk: File system home has some disks that are in a non-ready state.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
mmadddisk: Command failed. Examine previous error messages to determine
cause.

Had to use -v no (this failed once before).  Anyhow I next see:
[root@cl002 ~]# mmgetstate -aL

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state    Remarks
-------------------------------------------------------------------------------
       1      cl001           0         0            8  down          quorum node
       2      cl002           5         6            8  active        quorum node
       3      cl003           5         0            8  arbitrating   quorum node
       4      cl004           5         6            8  active        quorum node
       5      cl005           5         6            8  active        quorum node
       6      cl006           5         6            8  active        quorum node
       7      cl007           5         6            8  active        quorum node
       8      cl008           5         6            8  active        quorum node
[root@cl002 ~]# mmlsdisk home
disk         driver   sector failure holds    holds                              storage
name         type       size   group metadata data  status        availability  pool
------------ -------- ------ ------- -------- ----- ------------- ------------- ----------
r10f1e5      nsd         512    1001 No       Yes   allocmap add  up            fc_ssd400G
r6d2e8       nsd         512    1001 No       Yes   ready         up            fc_8T
r6d3e8       nsd         512    1001 No       Yes   ready         up            fc_8T

Do all quorum nodes have to be up and participating to do these admin-type
operations?
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] strange mmchnsd error?

2017-01-04 Thread J. Eric Wonderley
[root@cl001 ~]# cat chnsd_home_flh
%nsd: nsd=r10f1e5 servers=cl008,cl001,cl002,cl003,cl004,cl005,cl006,cl007
%nsd: nsd=r10f6e5 servers=cl007,cl008,cl001,cl002,cl003,cl004,cl005,cl006
%nsd: nsd=r10f1e6 servers=cl006,cl007,cl008,cl001,cl002,cl003,cl004,cl005
%nsd: nsd=r10f6e6 servers=cl005,cl006,cl007,cl008,cl001,cl002,cl003,cl004
%nsd: nsd=r10f1e7 servers=cl004,cl005,cl006,cl007,cl008,cl001,cl002,cl003
%nsd: nsd=r10f6e7 servers=cl003,cl004,cl005,cl006,cl007,cl008,cl001,cl002
%nsd: nsd=r10f1e8 servers=cl002,cl003,cl004,cl005,cl006,cl007,cl008,cl001
%nsd: nsd=r10f6e8 servers=cl001,cl002,cl003,cl004,cl005,cl006,cl007,cl008
%nsd: nsd=r10f1e9 servers=cl008,cl001,cl002,cl003,cl004,cl005,cl006,cl007
%nsd: nsd=r10f6e9 servers=cl007,cl008,cl001,cl002,cl003,cl004,cl005,cl006
[root@cl001 ~]# mmchnsd -F chnsd_home_flh
mmchnsd: Processing disk r10f6e5
mmchnsd: Processing disk r10f6e6
mmchnsd: Processing disk r10f6e7
mmchnsd: Processing disk r10f6e8
mmchnsd: Node cl005.cl.arc.internal returned ENODEV for disk r10f6e8.
mmchnsd: Node cl006.cl.arc.internal returned ENODEV for disk r10f6e8.
mmchnsd: Node cl007.cl.arc.internal returned ENODEV for disk r10f6e8.
mmchnsd: Node cl008.cl.arc.internal returned ENODEV for disk r10f6e8.
mmchnsd: Error found while processing stanza
%nsd: nsd=r10f6e8
servers=cl001,cl002,cl003,cl004,cl005,cl006,cl007,cl008
mmchnsd: Processing disk r10f1e9
mmchnsd: Processing disk r10f6e9
mmchnsd: Command failed. Examine previous error messages to determine cause.

I comment out the r10f6e8 line and then it completes?

I have some sort of fabric san issue:
[root@cl005 ~]# for i in {1..8}; do ssh cl00$i lsscsi -s | grep 38xx | grep
1.97 | wc -l; done
80
80
80
80
68
72
70
72

but I'm surprised removing one line allows it to complete.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Quotas on Multiple Filesets

2016-12-02 Thread J. Eric Wonderley
Hi Michael:

I was about to ask a similar question about nested filesets.

I have this setup:
[root@cl001 ~]# mmlsfileset home
Filesets in file system 'home':
Name                     Status    Path
root                     Linked    /gpfs/home
group                    Linked    /gpfs/home/group
predictHPC               Linked    /gpfs/home/group/predictHPC


and I see this:
[root@cl001 ~]# mmlsfileset home -L -d
Collecting fileset usage information ...
Filesets in file system 'home':
Name                     Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Data (in KB)  Comment
root                      0          3        --  Tue Jun 30 07:54:09 2015           0  134217728    123805696   63306355456  root fileset
group                     1   67409030         0  Tue Nov  1 13:22:24 2016           0          0            0             0
predictHPC                2  111318203         1  Fri Dec  2 14:05:56 2016           0          0            0     212206080

I would have thought that usage in fileset predictHPC would also count against
the group fileset.
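
A quick way to see where the usage is actually charged (a sketch, assuming
quotas are enabled on 'home') is to compare the filesets directly:

[root@cl001 ~]# mmrepquota -j home            # per-fileset usage for the whole filesystem
[root@cl001 ~]# mmlsquota -j group home       # what is charged to the 'group' fileset
[root@cl001 ~]# mmlsquota -j predictHPC home  # what is charged to 'predictHPC'

My reading is that block usage is charged to the fileset owning the inode rather
than accumulated up through the junction path, which would explain the numbers
above -- but I'd confirm with the commands rather than take my word for it.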

On Tue, Nov 15, 2016 at 4:47 AM, Michael Holliday <
michael.holli...@crick.ac.uk> wrote:

> Hey Everyone,
>
>
>
> I have a GPFS system which contain several groups of filesets.
>
>
>
> Each group has a root fileset, along with a number of other files sets.
> All of the filesets share the inode space with the root fileset.
>
>
>
> The file sets are linked  to create a tree structure as shown:
>
>
>
> Fileset Root -> /root
>
> Fileset a  -> /root/a
>
> Fileset B -> /root/b
>
> Fileset C -> /root/c
>
>
>
>
>
> I have applied a quota of 5TB to the root fileset.
>
>
>
> Could someone tell me if  the quota will only take into account the files
> in the root fileset, or if it would include the sub filesets aswell.   eg
> If have 3TB in A and 2TB in B   - would that hit the 5TB quota on root?
>
>
>
> Thanks
>
> Michael
>
>
>
>
>
> The Francis Crick Institute Limited is a registered charity in England and
> Wales no. 1140062 and a company registered in England and Wales no.
> 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] rpldisk vs deldisk & adddisk

2016-12-02 Thread J. Eric Wonderley
Ah...rpldisk is used to fix a single problem and typically you don't want
to take a long trip thru md for just one small problem.  Likely why it is
seldom if ever used.
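
For reference, the suspend/restripe/delete sequence Matt describes below comes
out to roughly this (a sketch using names from this cluster; the stanza file for
the replacement disk is hypothetical, and flags are worth checking against your
level before running anything):

[root@cl001 ~]# mmchdisk home suspend -d r10f1e5    # stop new allocations on the disk
[root@cl001 ~]# mmrestripefs home -m                # migrate critical data off suspended disks
[root@cl001 ~]# mmdeldisk home r10f1e5              # remove the now-empty disk
[root@cl001 ~]# mmadddisk home -F new_disk.stanza   # add the replacement described in a stanza file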

On Thu, Dec 1, 2016 at 3:28 PM, Matt Weil <mw...@wustl.edu> wrote:

> I always suspend the disk then use mmrestripefs -m to remove the data.
> Then delete the disk with mmdeldisk.
>
>  ‐m
>   Migrates all critical data off of any suspended
>   disk in this file system. Critical data is all
>   data that would be lost if currently suspended
>   disks were removed.
> Can do multiple that way and use the entire cluster to move data if you
> want.
>
> On 12/1/16 1:10 PM, J. Eric Wonderley wrote:
>
> I have a few misconfigured disk groups and I have a few same size
> correctly configured disk groups.
>
> Is there any (dis)advantage to running mmrpldisk over mmdeldisk and
> mmadddisk?  Everytime I have ever run mmdeldisk...it been somewhat
> painful(even with qos) process.
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
> --
>
> The materials in this message are private and may contain Protected
> Healthcare Information or other information of a sensitive nature. If you
> are not the intended recipient, be advised that any unauthorized use,
> disclosure, copying or the taking of any action in reliance on the contents
> of this information is strictly prohibited. If you have received this email
> in error, please immediately notify the sender via telephone or return mail.
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] wanted...gpfs policy that places larger files onto a pool based on size

2016-10-31 Thread J. Eric Wonderley
Placement policy only applies to writes, and I had thought that GPFS did
enough buffering in memory ("pagepool") to figure out the size before
committing the write to a pool.

I also admit I don't know all of the innards of GPFS.  Perhaps being a
copy-on-write type filesystem prevents this from occurring.
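
Since FILE_SIZE can't be evaluated at placement time, the closest workaround I
can think of (a sketch, untested here, with a hypothetical policy file name) is
to place everything on the fast pool and sweep large files out afterwards:

/* Place new files on the fast pool */
RULE 'default' SET POOL 'system'
/* Run periodically via mmapplypolicy: push anything that grew past 16MB to spinning disk */
RULE 'bigfiles_sweep' MIGRATE FROM POOL 'system' TO POOL 'fc_8T' WHERE FILE_SIZE > 16777216

with the sweep driven by something like "mmapplypolicy home -P sweep.pol -I yes"
out of cron.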

On Mon, Oct 31, 2016 at 1:29 PM, Chris Scott <chrisjsc...@gmail.com> wrote:

> Hi Brian
>
> This is exactly what I do with a SSD tier on top of 10K and 7.2K tiers.
>
> HAWC is another recent option that might address Eric's requirement but
> needs further consideration of the read requirements you want from the
> small files.
>
> Cheers
> Chris
>
> On 31 October 2016 at 17:23, Brian Marshall <mimar...@vt.edu> wrote:
>
>> When creating a "fast tier" storage pool in a filesystem is the normal
>> case to create a placement policy that places all files in the fast tier
>> and migrates out old and large files?
>>
>>
>> Brian Marshall
>>
>> On Mon, Oct 31, 2016 at 1:20 PM, Jez Tucker <jez.tuc...@gpfsug.org>
>> wrote:
>>
>>> Hey Bryan
>>>
>>>   There was a previous RFE for path placement from the UG, but Yuri told
>>> me this was not technically possible, as an inode has no knowledge of its
>>> parent dentry (IIRC). You can see this in effect in the C API.  It is
>>> possible to work this out at kernel level, but it's so costly that it
>>> becomes non-viable at scale, performance-wise.
>>>
>>> IBMers please chip in and expand if you will.
>>>
>>> Jez
>>>
>>>
>>> On 31/10/16 17:09, Bryan Banister wrote:
>>>
>>> The File Placement Policy that you are trying to set cannot use the size
>>> of the file to determine the placement of the file in a GPFS Storage Pool.
>>> This is because GPFS has no idea what the file size will be when the file
>>> is open()’d for writing.
>>>
>>>
>>>
>>> Hope that helps!
>>>
>>> -Bryan
>>>
>>>
>>>
>>> PS. I really wish that we could use a path for specifying data placement
>>> in a GPFS Pool, and not just the file name, owner, etc.  I’ll submit a RFE
>>> for this.
>>>
>>>
>>>
>>> *From:* gpfsug-discuss-boun...@spectrumscale.org [
>>> mailto:gpfsug-discuss-boun...@spectrumscale.org
>>> <gpfsug-discuss-boun...@spectrumscale.org>] *On Behalf Of *J. Eric
>>> Wonderley
>>> *Sent:* Monday, October 31, 2016 11:53 AM
>>> *To:* gpfsug main discussion list
>>> *Subject:* [gpfsug-discuss] wanted...gpfs policy that places larger
>>> files onto a pool based on size
>>>
>>>
>>>
>>> I wanted to do something like this...
>>>
>>>
>>> [root@cl001 ~]# cat /opt/gpfs/home.ply
>>> /*Failsafe migration of old small files back to spinning media
>>> pool(fc_8T) */
>>> RULE 'theshold' MIGRATE FROM POOL 'system' THRESHOLD(90,70)
>>> WEIGHT(ACCESS_TIME) TO POOL 'fc_8T'
>>> /*Write files larger than 16MB to pool called "fc_8T" */
>>> RULE 'bigfiles' SET POOL 'fc_8T' WHERE FILE_SIZE>16777216
>>> /*Move anything else to system pool */
>>> RULE 'default' SET POOL 'system'
>>>
>>> Apparently there is no happiness using FILE_SIZE in a placement policy:
>>> [root@cl001 ~]# mmchpolicy home /opt/gpfs/home.ply
>>> Error while validating policy `home.ply': rc=22:
>>> PCSQLERR: 'FILE_SIZE' is an unsupported or unknown attribute or variable
>>> name in this context.
>>> PCSQLCTX: at line 4 of 6: RULE 'bigfiles' SET POOL 'fc_8T' WHERE
>>> {{{FILE_SIZE}}}>16777216
>>> runRemoteCommand_v2: cl002.cl.arc.internal: tschpolicy /dev/home
>>> /var/mmfs/tmp/tspolicyFile.mmchpolicy.113372 -t home.ply   failed.
>>> mmchpolicy: Command failed. Examine previous error messages to determine
>>> cause.
>>>
>>> Can anyone suggest a way to accomplish this using policy?
>>>
>>> --
>>>
>>> Note: This email is for the confidential use of the named addressee(s)
>>> only and may contain proprietary, confidential or privileged information.
>>> If you are not the intended recipient, you are hereby notified that any
>>> review, dissemination or copying of this email is strictly prohibited, and
>>> to please notify the sender immediately and destroy this email and any
>>> attachments. Email transmission cannot be guaranteed to be secure or
>>> error-free. The 

[gpfsug-discuss] wanted...gpfs policy that places larger files onto a pool based on size

2016-10-31 Thread J. Eric Wonderley
I wanted to do something like this...

[root@cl001 ~]# cat /opt/gpfs/home.ply
/*Failsafe migration of old small files back to spinning media pool(fc_8T)
*/
RULE 'theshold' MIGRATE FROM POOL 'system' THRESHOLD(90,70)
WEIGHT(ACCESS_TIME) TO POOL 'fc_8T'
/*Write files larger than 16MB to pool called "fc_8T" */
RULE 'bigfiles' SET POOL 'fc_8T' WHERE FILE_SIZE>16777216
/*Move anything else to system pool */
RULE 'default' SET POOL 'system'

Apparently there is no happiness using FILE_SIZE in a placement policy:
[root@cl001 ~]# mmchpolicy home /opt/gpfs/home.ply
Error while validating policy `home.ply': rc=22:
PCSQLERR: 'FILE_SIZE' is an unsupported or unknown attribute or variable
name in this context.
PCSQLCTX: at line 4 of 6: RULE 'bigfiles' SET POOL 'fc_8T' WHERE
{{{FILE_SIZE}}}>16777216
runRemoteCommand_v2: cl002.cl.arc.internal: tschpolicy /dev/home
/var/mmfs/tmp/tspolicyFile.mmchpolicy.113372 -t home.ply   failed.
mmchpolicy: Command failed. Examine previous error messages to determine
cause.

Can anyone suggest a way to accomplish this using policy?
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] migrate policy vs restripe

2016-10-04 Thread J. Eric Wonderley
We have the need to move data from one set of spindles to another.

Are there any performance or availability considerations when choosing to
do either a migration policy or a restripe to make this move?  I did
discover that a restripe only works within the same pool...even though you
set up two pools in different failure groups.
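
For the pool-to-pool move itself, the policy route would look something like the
following (a sketch with a hypothetical policy file name; pool names as in our
setup, QOS throttling left out):

/* move.pol -- drain one pool into another */
RULE 'drain' MIGRATE FROM POOL 'fc_ssd400G' TO POOL 'fc_8T'

[root@cl001 ~]# mmapplypolicy home -P move.pol -I yes -N cl001,cl002   # -N limits which nodes move data

mmrestripefs, by contrast, rebalances and re-replicates data across the disks of
the pool it already lives in, which matches the behavior you saw.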
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss