We have Spectrum Archive with encryption on disk and tape. We get maybe 100
or so messages like this daily. It would be nice if the message had some
information about which client is the issue.
We have had client certs expire in the past. The root cause of the outage
was a network outage...iirc
I have filesets set up in a filesystem...looks like:
[root@cl005 ~]# mmlsfileset home -L
Filesets in file system 'home':
Name    Id    RootInode    ParentId    Created    InodeSpace    MaxInodes    AllocInodes    Comment
root    0
mmlspool might suggest there's only 1 system pool per cluster. We have 2
clusters and it has id=0 on both.
One of our clusters has 2 filesystems that have the same id for two
different data-only pools:
[root@cl001 ~]# mmlspool home all
Name      Id
system    0
fc_8T     65537
Multihoming is accomplished by using subnets...see mmchconfig.
Failover networks, on the other hand, are not allowed. Bad network behavior
is dealt with by expelling nodes. You must have decent/supported network
gear...we have learned that lesson the hard way.
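A minimal sketch of the subnets approach, assuming a hypothetical 10.51.0.0
network reserved for daemon traffic:

[root@cl001 ~]# mmchconfig subnets="10.51.0.0"
[root@cl001 ~]# mmlsconfig subnets

GPFS will then prefer addresses on that subnet for node-to-node traffic;
there is no notion of failing over to a second network if the first one
misbehaves.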
On Fri, Oct 26, 2018 at 10:37 AM
Hi Stephan:
I think every node in C1 and in C2 has to see every node in the server
cluster NSD-[AD].
We have a 10 node server cluster where 2 nodes do nothing but serve out
nfs. Since these two are a part of the server cluster...client clusters
wanting to mount the server cluster via gpfs need
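For reference, a hedged sketch of the multicluster plumbing implied here
(cluster, node, and device names hypothetical), run from the client cluster
after keys have been exchanged with mmauth:

[root@client01 ~]# mmremotecluster add server.cluster -n nsd-a,nsd-b,nsd-c,nsd-d
[root@client01 ~]# mmremotefs add rhome -f home -C server.cluster -T /gpfs/rhome
[root@client01 ~]# mmmount rhome -a

Every node in the client cluster still needs daemon-network reachability to
every node in the server cluster, NFS-only nodes included.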
than Eth? I figured
> you’re Ethernet based because of the mention of Juniper.
>
> Are you attempting to do RoCE or just plain TCP/IP?
>
>
> On December 20, 2017 at 14:40:48 EST, J. Eric Wonderley <
> eric.wonder...@vt.edu> wrote:
>
> Hello:
>
> Does anyone hav
Hello:
If I recall correctly this does not work...correct? I think the last time
I attempted this was gpfs version <=4.1. I think I attempted to add a
quorum node.
The process I remember doing was: mmshutdown -a, mmchcluster
--ccr-disable, mmaddnode yadayada, mmchcluster --ccr-enable,
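Spelled out as a hedged sketch (the new node name is hypothetical, and the
final mmstartup is my assumption about the cut-off end of the sequence):

[root@cl001 ~]# mmshutdown -a
[root@cl001 ~]# mmchcluster --ccr-disable
[root@cl001 ~]# mmaddnode -N newnode:quorum
[root@cl001 ~]# mmchcluster --ccr-enable
[root@cl001 ~]# mmstartup -a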
We have several Avago/LSI 9305-16e HBAs that I believe came from Advanced
HPC. Can someone recommend another reseller of these HBAs, or a contact at
Advanced HPC?
<ew...@osc.edu> wrote:
> On Fri, 4 Aug 2017 01:02:22 -0400
> "J. Eric Wonderley" <eric.wonder...@vt.edu> wrote:
>
> > 4.2.2.3
> >
> > I want to think maybe this started after expanding inode space
>
> What does 'mmlsfileset home nathanfootest
I actually hit this assert and turned it in to support on this version:
Build branch "4.2.2.3 efix6 (987197)".
I was told to do exactly what Sven mentioned.
I thought it strange that I did NOT hit the assert in a "no" pass but hit
it in a "yes" pass.
On Thu, Aug 3, 2017 at 9:06 AM, Sven Oehme
4.2.2.3
I want to think maybe this started after expanding inode space
On Thu, Aug 3, 2017 at 9:11 AM, James Davis <jamieda...@us.ibm.com> wrote:
> Hey,
>
> Hmm, your invocation looks valid to me. What's your GPFS level?
>
> Cheers,
>
> Jamie
No guarantee...unless you are using an ESS/GSS solution.
A crappy network will get you loads of expels and occasional fscks. Which I
guess beats data loss and recovery from backup.
You probably have a network issue...they can be subtle. Gpfs is an
extremely thorough network tester.
Eric
On
Hi Renar:
What does 'mmlsquota -j fileset filesystem' report?
I did not think you would get a grace period of "none" unless the
hardlimit=softlimit.
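For what it's worth, the hardlimit=softlimit case can be produced like this
(the fileset name "work" is hypothetical):

[root@cl001 ~]# mmsetquota home:work --block 10G:10G
[root@cl001 ~]# mmlsquota -j work home

With soft and hard equal there is no headroom between the two limits, which
I believe is what makes grace show up as "none".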
On Mon, Jul 31, 2017 at 1:44 PM, Grunenberg, Renar <
renar.grunenb...@huk-coburg.de> wrote:
> Hello All,
> we are on Version 4.2.3.2 and see some
These types of messages repeat often in our logs:
2017-06-20_09:25:13.676-0400: [E] An attempt to send notification to the
GUI subsystem failed. response=curl: (7) Failed connect to
arproto2.ar.nis.isb.internal:443; Connection refused rc=7
Hi Jamie:
I think typically you want to keep the clients ahead of the servers in
version, so I would advance the version of your client nodes.
New clients can communicate with older versions of server NSDs. Vice
versa...not so much.
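A quick way to sanity-check what level the cluster as a whole is committed
to (my assumption that this is the relevant knob here):

[root@cl001 ~]# mmlsconfig minReleaseLevel

Nodes can run newer code than minReleaseLevel, but new on-disk and protocol
features only switch on after mmchconfig release=LATEST.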
tmp]# mmlssnapshot gpfs | wc -l
> 6916
I found this link and it didn't give me much hope for doing snapshots &
backup in a home (busy) filesystem:
http://www.spectrumscale.org/pipermail/gpfsug-discuss/2013-February/000200.html
I realize this is dated and I wondered if qos etc have made it a tolerable
thing to do now. Gpfs I think was
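If QoS is part of the answer, a hedged sketch of throttling maintenance I/O
(the IOPS figure is a made-up placeholder):

[root@cl001 ~]# mmchqos home --enable pool=*,maintenance=300IOPS,other=unlimited
[root@cl001 ~]# mmlsqos home

My understanding is that long-running housekeeping commands such as
mmapplypolicy and mmrestripefs get charged to the maintenance class once
QoS is enabled, so snapshot and backup scans stop starving user I/O.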
Anyone have any examples of this? I have a filesystem that has 2 pools and
several filesets and would like daily progressive incremental backups of
its contents.
I found some stuff here (nothing real close to what I wanted, however):
/usr/lpp/mmfs/samples/ilm
I have the tsm client installed on
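For context, a hedged sketch of the invocation I have in mind (TSM server
and node names hypothetical):

[root@cl001 ~]# mmbackup home -t incremental --tsm-servers TSM1 -N cl001,cl002

mmbackup drives the policy engine to find changed files and hands them to
the TSM client, so in principle it covers every pool and fileset in the
filesystem.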
Well, we got it into the down state by using mmsdrrestore -p to recover
stuff into /var/mmfs/gen on cl004.
Anyhow, we ended up with unknown for cl004 when it powered off. Short of
removing the node, unknown is the state you get.
Unknown seems stable for a hopefully short outage of cl004.
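For the record, a hedged sketch of that recovery (run on the broken node,
pulling config from a healthy one):

[root@cl004 ~]# mmsdrrestore -p cl001
[root@cl001 ~]# mmgetstate -aL

After the restore, cl004 reports down rather than unknown.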
Thanks
On Thu,
Is there a way to accomplish this so the rest of the cluster knows it's down?
My state now:
[root@cl001 ~]# mmgetstate -aL
cl004.cl.arc.internal: mmremote: determineMode: Missing file
/var/mmfs/gen/mmsdrfs.
cl004.cl.arc.internal: mmremote: This node does not belong to a GPFS
cluster.
mmdsh:
Maybe multipath is not seeing all of the wwns?
Does multipath -v3 | grep ^51855 look ok?
For some unknown reason multipath does not see our SanDisk array...we have
to add them to the end of the /etc/multipath/wwids file.
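The workaround looks roughly like this (the WWID is a hypothetical
placeholder; the wwids file wants each ID wrapped in slashes, one per line):

[root@cl001 ~]# multipath -v3 2>/dev/null | grep '^51855'
[root@cl001 ~]# echo '/51855xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/' >> /etc/multipath/wwids
[root@cl001 ~]# multipathd -k"reconfigure"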
On Fri, Jan 20, 2017 at 10:32 AM, David D. Johnson
wrote:
I have messages like these frequenting my logs:
Tue Jan 17 11:25:49.731 2017: [E] VERBS RDMA rdma write error
IBV_WC_REM_ACCESS_ERR to 10.51.10.5 (cl005) on mlx5_0 port 1 fabnum 0
vendor_err 136
Tue Jan 17 11:25:49.732 2017: [E] VERBS RDMA closed connection to
10.51.10.5 (cl005) on mlx5_0 port 1
Our intent was to have ccr turned off, since all nodes in the server
cluster are quorum nodes. Consider this:
[root@cl001 ~]# mmfsadm dump config | grep -i ccr
! ccrEnabled 0
ccrMaxChallengeCheckRetries 4
ccr : 0 (cluster configuration repository)
ccr : 1 (cluster
I have one quorum node down and am attempting to add an NSD to a filesystem:
[root@cl005 ~]# mmadddisk home -F add_1_flh_home -v no |& tee
/root/adddisk_flh_home.out
Verifying file system configuration information ...
The following disks of home will be formatted on node cl003:
r10f1e5: size 1879610 MB
[root@cl001 ~]# cat chnsd_home_flh
%nsd: nsd=r10f1e5 servers=cl008,cl001,cl002,cl003,cl004,cl005,cl006,cl007
%nsd: nsd=r10f6e5 servers=cl007,cl008,cl001,cl002,cl003,cl004,cl005,cl006
%nsd: nsd=r10f1e6 servers=cl006,cl007,cl008,cl001,cl002,cl003,cl004,cl005
%nsd: nsd=r10f6e6
Hi Michael:
I was about to ask a similar question about nested filesets.
I have this setup:
[root@cl001 ~]# mmlsfileset home
Filesets in file system 'home':
Name     Status    Path
root     Linked    /gpfs/home
group    Linked    /gpfs/home/group
; disks were removed.
> You can do multiple that way and use the entire cluster to move data if
> you want.
>
> On 12/1/16 1:10 PM, J. Eric Wonderley wrote:
>
> I have a few misconfigured disk groups and I have a few same size
> correctly configured disk groups.
>>> Hope that helps!
>>>
>>> -Bryan
>>> PS. I really wish that we could use a path for specifying data placement
>>> in a GPFS Pool, and not just the file name, owner, etc. I’ll s
I wanted to do something like this...
[root@cl001 ~]# cat /opt/gpfs/home.ply
/*Failsafe migration of old small files back to spinning media pool(fc_8T)
*/
RULE 'threshold' MIGRATE FROM POOL 'system' THRESHOLD(90,70)
WEIGHT(ACCESS_TIME) TO POOL 'fc_8T'
/*Write files larger than 16MB to pool called
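The truncated rule would presumably look something like the sketch below;
the target pool name is hypothetical since the original is cut off, and the
move has to be a MIGRATE rule, because a placement rule runs at file
creation before the file has a size:

RULE 'bigfiles' MIGRATE FROM POOL 'system' TO POOL 'fc_8T'
WHERE KB_ALLOCATED > 16384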
We have the need to move data from one set of spindles to another.
Are there any performance or availability considerations when choosing to
do either a migration policy or a restripe to make this move? I did
discover that a restripe only works within the same pool...even though you
set up two
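A hedged sketch of both approaches (filesystem, pool, and NSD names
hypothetical):

Pool-to-pool via the policy engine:
[root@cl001 ~]# mmapplypolicy home -P move.pol -I yes
(move.pol contains: RULE 'mv' MIGRATE FROM POOL 'old_pool' TO POOL 'new_pool')

Within one pool, suspend the source disks and migrate data off them:
[root@cl001 ~]# mmchdisk home suspend -d "r10f1e5;r10f6e5"
[root@cl001 ~]# mmrestripefs home -m

mmrestripefs -m moves data off suspended disks; -b would instead rebalance
across all disks in each pool.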