Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Kumaran Rajaram
>>Thanks for the info on the releases … can you clarify about 
pitWorkerThreadsPerNode? 

pitWorkerThreadsPerNode -- specifies how many threads per node perform 
restripe, data movement, and similar operations.

>>As I said in my original post, on all 8 NSD servers and the filesystem 
manager it is set to zero.  No matter how many times I add zero to zero I 
don’t get a value > 31!  ;-)  So I take it that zero has some sort of 
unspecified significance?  Thanks…

A value of 0 just indicates that pitWorkerThreadsPerNode takes an internal 
value (which can be 16 or lower), derived from the GPFS setup and file-system 
configuration according to the following formula:

Default: pitWorkerThreadsPerNode = MIN(16, ((numberOfDisks_in_filesystem 
* 4) / numberOfParticipatingNodes_in_mmrestripefs) + 1)

For example, if you have 64 NSDs in your file system and you are using 8 
NSD servers in "mmrestripefs -N", then

pitWorkerThreadsPerNode = MIN(16, ((64 * 4) / 8) + 1) = MIN(16, 33), so 
pitWorkerThreadsPerNode takes the value 16 (i.e., the default of 0 results 
in 16 threads doing restripe work per participating mmrestripefs node).

If you want 8 NSD servers (running 4.2.2.3) to participate in the 
mmrestripefs operation, then set "mmchconfig pitWorkerThreadsPerNode=3 -N 
<8_NSD_Servers>" so that the sum (8 x 3 = 24) stays at or below the limit of 31.

Regards,
-Kums





From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   05/04/2017 12:57 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Kums, 

Thanks for the info on the releases … can you clarify about 
pitWorkerThreadsPerNode?  As I said in my original post, on all 8 NSD 
servers and the filesystem manager it is set to zero.  No matter how many 
times I add zero to zero I don’t get a value > 31!  ;-)  So I take it that 
zero has some sort of unspecified significance?  Thanks…

Kevin

On May 4, 2017, at 11:49 AM, Kumaran Rajaram  wrote:

Hi,

>>I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 
4.2.0.3 and are gradually being upgraded).  What version of GPFS fixes 
this?  With what I’m doing I need the ability to run mmrestripefs.

GPFS version 4.2.3.0 (and above) fixes this issue and allows the "sum of 
pitWorkerThreadsPerNode of the participating nodes (-N parameter to 
mmrestripefs)" to exceed 31.

If you are using 4.2.2.3, then depending on the number of nodes participating 
in the mmrestripefs, the GPFS config parameter pitWorkerThreadsPerNode needs 
to be adjusted so that the sum of pitWorkerThreadsPerNode across the 
participating nodes is <= 31.

For example, if the number of nodes participating in the mmrestripefs is 6, 
then set "mmchconfig pitWorkerThreadsPerNode=5 -N ".  GPFS needs to be 
restarted for this parameter to take effect on the participating nodes 
(verify with "mmfsadm dump config | grep pitWorkerThreadsPerNode").

Regards,
-Kums





From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:05/04/2017 12:08 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Olaf, 

I didn’t touch pitWorkerThreadsPerNode … it was already zero.

I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 
4.2.0.3 and are gradually being upgraded).  What version of GPFS fixes 
this?  With what I’m doing I need the ability to run mmrestripefs.

It seems to me that mmrestripefs could check whether QOS is enabled … 
granted, it would have no way of knowing whether the values used actually 
are reasonable or not … but if QOS is enabled then “trust” it to not 
overrun the system.
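
The QOS throttling referred to here is set up with mmchqos. A rough sketch, 
with an assumed file-system name "gpfs0" and purely illustrative IOPS 
numbers rather than anything from this thread, looks like:

  # Throttle maintenance commands (mmrestripefs, mmdeldisk, ...) to ~1000 IOPS
  # while leaving normal user I/O unlimited -- numbers are illustrative only
  mmchqos gpfs0 --enable pool=*,maintenance=1000IOPS,other=unlimited

  # Check the allocations and observed consumption
  mmlsqos gpfs0

mmrestripefs runs in the "maintenance" QoS class, which is what lets QOS keep 
it from overrunning the system.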

PMR time?  Thanks..

Kevin

On May 4, 2017, at 10:54 AM, Olaf Weiser  wrote:

Hi Kevin, 
the check on the number of NSDs is more or less nonsense .. what matters is 
just that the number of nodes x PITWorker should not exceed the #mutex/FS 
block by too much.
Did you adjust/tune the PitWorker ? ... 

as far as I know.. the fact that the code checks the number of NSDs is 
already considered a defect and will be fixed / is already fixed (I stepped 
into it here as well) 

ps. QOS is the better approach to address this, but unfortunately.. not 
everyone is using it by default... that's why I suspect the development team 
decided to put in a check/limit here .. which in your case (with QOS) 
wouldn't be needed 





From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:05/04/2017 05:44 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Olaf, 

Your explanation mostly makes sense, but...

Failed with 4 nodes … failed with 2 nodes … not gonna try with 1 node.  And 
this filesystem only has 32 disks, which I would imagine is not an 
especially large number compared to what some people reading this e-mail 
have in their filesystems.

Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Buterbaugh, Kevin L
Hi Kums,

Thanks for the info on the releases … can you clarify about 
pitWorkerThreadsPerNode?  As I said in my original post, on all 8 NSD servers 
and the filesystem manager it is set to zero.  No matter how many times I add 
zero to zero I don’t get a value > 31!  ;-)  So I take it that zero has some 
sort of unspecified significance?  Thanks…

Kevin

On May 4, 2017, at 11:49 AM, Kumaran Rajaram wrote:

Hi,

>>I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 
>>4.2.0.3 and are gradually being upgraded).  What version of GPFS fixes this?  
>>With what I’m doing I need the ability to run mmrestripefs.

GPFS version 4.2.3.0 (and above) fixes this issue and allows the "sum of 
pitWorkerThreadsPerNode of the participating nodes (-N parameter to 
mmrestripefs)" to exceed 31.

If you are using 4.2.2.3, then depending on the number of nodes participating 
in the mmrestripefs, the GPFS config parameter pitWorkerThreadsPerNode needs 
to be adjusted so that the sum of pitWorkerThreadsPerNode across the 
participating nodes is <= 31.

For example, if the number of nodes participating in the mmrestripefs is 6, 
then set "mmchconfig pitWorkerThreadsPerNode=5 -N ".  GPFS needs to be 
restarted for this parameter to take effect on the participating nodes 
(verify with "mmfsadm dump config | grep pitWorkerThreadsPerNode").

Regards,
-Kums





From:"Buterbaugh, Kevin L" 
>
To:gpfsug main discussion list 
>
Date:05/04/2017 12:08 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:
gpfsug-discuss-boun...@spectrumscale.org




Hi Olaf,

I didn’t touch pitWorkerThreadsPerNode … it was already zero.

I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 4.2.0.3 
and are gradually being upgraded).  What version of GPFS fixes this?  With what 
I’m doing I need the ability to run mmrestripefs.

It seems to me that mmrestripefs could check whether QOS is enabled … granted, 
it would have no way of knowing whether the values used actually are reasonable 
or not … but if QOS is enabled then “trust” it to not overrun the system.

PMR time?  Thanks..

Kevin

On May 4, 2017, at 10:54 AM, Olaf Weiser wrote:

Hi Kevin,
the check on the number of NSDs is more or less nonsense .. what matters is 
just that the number of nodes x PITWorker should not exceed the #mutex/FS 
block by too much.
Did you adjust/tune the PitWorker ? ...

as far as I know.. the fact that the code checks the number of NSDs is 
already considered a defect and will be fixed / is already fixed (I stepped 
into it here as well)

ps. QOS is the better approach to address this, but unfortunately.. not 
everyone is using it by default... that's why I suspect the development team 
decided to put in a check/limit here .. which in your case (with QOS) 
wouldn't be needed





From:"Buterbaugh, Kevin L" 
>
To:gpfsug main discussion list 
>
Date:05/04/2017 05:44 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:
gpfsug-discuss-boun...@spectrumscale.org




Hi Olaf,

Your explanation mostly makes sense, but...

Failed with 4 nodes … failed with 2 nodes … not gonna try with 1 node.  And 
this filesystem only has 32 disks, which I would imagine is not an especially 
large number compared to what some people reading this e-mail have in their 
filesystems.

I thought that QOS (which I’m using) was what would keep an mmrestripefs from 
overrunning the system … QOS has worked extremely well for us - it’s one of my 
favorite additions to GPFS.

Kevin

On May 4, 2017, at 10:34 AM, Olaf Weiser wrote:

no.. it is just in the code, because we have to avoid running out of 
mutexes / blocks

reducing the number of nodes (-N) down to 4 (2 nodes is even safer) ... is 
the easiest way to solve it for now

I've been told the real root cause will be fixed in one of the next PTFs .. 
within this year ..
this warning message itself should appear every time.. but unfortunately 
someone coded it so that it depends on the number of disks (NSDs).. that's 
why I suspect you didn't see it before
but the fact that we have to make sure not to overrun the system with 
mmrestripe remains.. so please lower the -N number of nodes to 4, or better 2

(even though we know.. the mmrestripe will then take longer)


From:"Buterbaugh, Kevin L" 

Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Kumaran Rajaram
Hi,

>>I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 
4.2.0.3 and are gradually being upgraded).  What version of GPFS fixes 
this?  With what I’m doing I need the ability to run mmrestripefs.

GPFS version 4.2.3.0 (and above) fixes this issue and allows the "sum of 
pitWorkerThreadsPerNode of the participating nodes (-N parameter to 
mmrestripefs)" to exceed 31.

If you are using 4.2.2.3, then depending on the number of nodes participating 
in the mmrestripefs, the GPFS config parameter pitWorkerThreadsPerNode needs 
to be adjusted so that the sum of pitWorkerThreadsPerNode across the 
participating nodes is <= 31.

For example, if the number of nodes participating in the mmrestripefs is 6, 
then set "mmchconfig pitWorkerThreadsPerNode=5 -N ".  GPFS needs to be 
restarted for this parameter to take effect on the participating nodes 
(verify with "mmfsadm dump config | grep pitWorkerThreadsPerNode").

Regards,
-Kums





From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   05/04/2017 12:08 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Olaf, 

I didn’t touch pitWorkerThreadsPerNode … it was already zero.

I’m running 4.2.2.3 on my GPFS servers (some clients are on 4.2.1.1 or 
4.2.0.3 and are gradually being upgraded).  What version of GPFS fixes 
this?  With what I’m doing I need the ability to run mmrestripefs.

It seems to me that mmrestripefs could check whether QOS is enabled … 
granted, it would have no way of knowing whether the values used actually 
are reasonable or not … but if QOS is enabled then “trust” it to not 
overrun the system.

PMR time?  Thanks..

Kevin

On May 4, 2017, at 10:54 AM, Olaf Weiser  wrote:

Hi Kevin, 
the check on the number of NSDs is more or less nonsense .. what matters is 
just that the number of nodes x PITWorker should not exceed the #mutex/FS 
block by too much.
Did you adjust/tune the PitWorker ? ... 

as far as I know.. the fact that the code checks the number of NSDs is 
already considered a defect and will be fixed / is already fixed (I stepped 
into it here as well) 

ps. QOS is the better approach to address this, but unfortunately.. not 
everyone is using it by default... that's why I suspect the development team 
decided to put in a check/limit here .. which in your case (with QOS) 
wouldn't be needed 





From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:05/04/2017 05:44 PM
Subject:Re: [gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Olaf, 

Your explanation mostly makes sense, but...

Failed with 4 nodes … failed with 2 nodes … not gonna try with 1 node. And 
this filesystem only has 32 disks, which I would imagine is not an 
especially large number compared to what some people reading this e-mail 
have in their filesystems.

I thought that QOS (which I’m using) was what would keep an mmrestripefs 
from overrunning the system … QOS has worked extremely well for us - it’s 
one of my favorite additions to GPFS.

Kevin

On May 4, 2017, at 10:34 AM, Olaf Weiser  wrote:

no.. it is just in the code, because we have to avoid running out of 
mutexes / blocks

reducing the number of nodes (-N) down to 4 (2 nodes is even safer) ... is 
the easiest way to solve it for now

I've been told the real root cause will be fixed in one of the next PTFs .. 
within this year .. 
this warning message itself should appear every time.. but unfortunately 
someone coded it so that it depends on the number of disks (NSDs).. that's 
why I suspect you didn't see it before
but the fact that we have to make sure not to overrun the system with 
mmrestripe remains.. so please lower the -N number of nodes to 4, or 
better 2 

(even though we know.. the mmrestripe will then take longer)


From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:05/04/2017 05:26 PM
Subject:[gpfsug-discuss] Well, this is the pits...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

Another one of those, “I can open a PMR if I need to” type questions…

We are in the process of combining two large GPFS filesystems into one new 
filesystem (for various reasons I won’t get into here).  Therefore, I’m 
doing a lot of mmrestripe’s, mmdeldisk’s, and mmadddisk’s.

Yesterday I did an “mmrestripefs  -r -N ” (after 
suspending a disk, of course).  Worked like it should.

Today I did a “mmrestripefs  -b -P capacity -N ” and got:

mmrestripefs: The total number of PIT worker threads of all participating 
nodes has been exceeded to safely restripe the file system.  The total 
number of PIT worker threads, which is the sum of 

Re: [gpfsug-discuss] Replace SSL cert in GUI - need guidance

2017-05-04 Thread Sobey, Richard A
Never mind - /usr/lpp/mmfs/java/jre/bin is where it's at.

Richard
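
For completeness, a sketch of the keystore-generation step using that 
bundled JRE might look like the following; the alias, key size, and keystore 
path are placeholders, not values from the knowledge-center article:

  cd /usr/lpp/mmfs/java/jre/bin

  # Generate a Java keystore (.jks) containing a new key pair for the GUI
  ./keytool -genkeypair -alias guicert -keyalg RSA -keysize 2048 \
      -keystore /tmp/gui-keystore.jks -storetype JKS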

From: gpfsug-discuss-boun...@spectrumscale.org 
[mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Sobey, Richard A
Sent: 04 May 2017 16:46
To: 'gpfsug-discuss@spectrumscale.org' 
Subject: [gpfsug-discuss] Replace SSL cert in GUI - need guidance

Hi all,

I'm going through the steps outlined in the following article:

https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_managecertforgui.htm

Will this work for 4.2.1 installations?

Only because in step 5, "Generate a Java(tm) keystore file (.jks) by using the 
keytool. It is stored in the following directory:", the given directory - 
/opt/ibm/wlp/java/jre/bin - does not exist. Only the path up to and including 
wlp is on my GUI server.

I can't imagine the instructions being so different between 4.2.1 and 4.2 but 
I've seen it happen..

Cheers
Richard

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Olaf Weiser
Hi Kevin,
the check on the number of NSDs is more or less nonsense .. what matters is
just that the number of nodes x PITWorker should not exceed the #mutex/FS
block by too much.
Did you adjust/tune the PitWorker ? ...

as far as I know.. the fact that the code checks the number of NSDs is
already considered a defect and will be fixed / is already fixed (I stepped
into it here as well)

ps. QOS is the better approach to address this, but unfortunately.. not
everyone is using it by default... that's why I suspect the development team
decided to put in a check/limit here .. which in your case (with QOS)
wouldn't be needed


From:    "Buterbaugh, Kevin L"
To:      gpfsug main discussion list
Date:    05/04/2017 05:44 PM
Subject: Re: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org


Hi Olaf,

Your explanation mostly makes sense, but...

Failed with 4 nodes … failed with 2 nodes … not gonna try with 1 node.  And
this filesystem only has 32 disks, which I would imagine is not an
especially large number compared to what some people reading this e-mail
have in their filesystems.

I thought that QOS (which I’m using) was what would keep an mmrestripefs
from overrunning the system … QOS has worked extremely well for us - it’s
one of my favorite additions to GPFS.

Kevin

On May 4, 2017, at 10:34 AM, Olaf Weiser wrote:

no.. it is just in the code, because we have to avoid running out of
mutexes / blocks

reducing the number of nodes (-N) down to 4 (2 nodes is even safer) ... is
the easiest way to solve it for now

I've been told the real root cause will be fixed in one of the next PTFs ..
within this year ..
this warning message itself should appear every time.. but unfortunately
someone coded it so that it depends on the number of disks (NSDs).. that's
why I suspect you didn't see it before
but the fact that we have to make sure not to overrun the system with
mmrestripe remains.. so please lower the -N number of nodes to 4, or better 2

(even though we know.. the mmrestripe will then take longer)


From:    "Buterbaugh, Kevin L"
To:      gpfsug main discussion list
Date:    05/04/2017 05:26 PM
Subject: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org


Hi All,

Another one of those, “I can open a PMR if I need to” type questions…

We are in the process of combining two large GPFS filesystems into one new
filesystem (for various reasons I won’t get into here).  Therefore, I’m
doing a lot of mmrestripe’s, mmdeldisk’s, and mmadddisk’s.

Yesterday I did an “mmrestripefs  -r -N ” (after suspending a disk, of
course).  Worked like it should.

Today I did a “mmrestripefs  -b -P capacity -N ” and got:

mmrestripefs: The total number of PIT worker threads of all participating
nodes has been exceeded to safely restripe the file system.  The total
number of PIT worker threads, which is the sum of pitWorkerThreadsPerNode
of the participating nodes, cannot exceed 31.  Reissue the command with a
smaller set of participating nodes (-N option) and/or lower the
pitWorkerThreadsPerNode configure setting.  By default the file system
manager node is counted as a participating node.
mmrestripefs: Command failed. Examine previous error messages to determine
cause.

So there must be some difference in how the “-r” and “-b” options calculate
the number of PIT worker threads.  I did an “mmfsadm dump all | grep
pitWorkerThreadsPerNode” on all 8 NSD servers and the filesystem manager
node … they all say the same thing:

   pitWorkerThreadsPerNode 0

Hmmm, so 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 > 31?!?  I’m confused...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Replace SSL cert in GUI - need guidance

2017-05-04 Thread Sobey, Richard A
Hi all,

I'm going through the steps outlined in the following article:

https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_managecertforgui.htm

Will this work for 4.2.1 installations?

Only because in step 5, "Generate a Java(tm) keystore file (.jks) by using the 
keytool. It is stored in the following directory:", the given directory - 
/opt/ibm/wlp/java/jre/bin - does not exist. Only the path up to and including 
wlp is on my GUI server.

I can't imagine the instructions being so different between 4.2.1 and 4.2 but 
I've seen it happen..

Cheers
Richard

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Olaf Weiser
no.. it is just in the code, because we have to avoid running out of
mutexes / blocks

reducing the number of nodes (-N) down to 4 (2 nodes is even safer) ... is
the easiest way to solve it for now

I've been told the real root cause will be fixed in one of the next PTFs ..
within this year ..
this warning message itself should appear every time.. but unfortunately
someone coded it so that it depends on the number of disks (NSDs).. that's
why I suspect you didn't see it before
but the fact that we have to make sure not to overrun the system with
mmrestripe remains.. so please lower the -N number of nodes to 4, or better 2

(even though we know.. the mmrestripe will then take longer)


From:    "Buterbaugh, Kevin L"
To:      gpfsug main discussion list
Date:    05/04/2017 05:26 PM
Subject: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org


Hi All,

Another one of those, “I can open a PMR if I need to” type questions…

We are in the process of combining two large GPFS filesystems into one new
filesystem (for various reasons I won’t get into here).  Therefore, I’m
doing a lot of mmrestripe’s, mmdeldisk’s, and mmadddisk’s.

Yesterday I did an “mmrestripefs  -r -N ” (after suspending a disk, of
course).  Worked like it should.

Today I did a “mmrestripefs  -b -P capacity -N ” and got:

mmrestripefs: The total number of PIT worker threads of all participating
nodes has been exceeded to safely restripe the file system.  The total
number of PIT worker threads, which is the sum of pitWorkerThreadsPerNode
of the participating nodes, cannot exceed 31.  Reissue the command with a
smaller set of participating nodes (-N option) and/or lower the
pitWorkerThreadsPerNode configure setting.  By default the file system
manager node is counted as a participating node.
mmrestripefs: Command failed. Examine previous error messages to determine
cause.

So there must be some difference in how the “-r” and “-b” options calculate
the number of PIT worker threads.  I did an “mmfsadm dump all | grep
pitWorkerThreadsPerNode” on all 8 NSD servers and the filesystem manager
node … they all say the same thing:

   pitWorkerThreadsPerNode 0

Hmmm, so 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 > 31?!?  I’m confused...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Buterbaugh, Kevin L
Hi All,

Another one of those, “I can open a PMR if I need to” type questions…

We are in the process of combining two large GPFS filesystems into one new 
filesystem (for various reasons I won’t get into here).  Therefore, I’m doing a 
lot of mmrestripe’s, mmdeldisk’s, and mmadddisk’s.

Yesterday I did an “mmrestripefs  -r -N ” (after 
suspending a disk, of course).  Worked like it should.

Today I did a “mmrestripefs  -b -P capacity -N ” and got:

mmrestripefs: The total number of PIT worker threads of all participating nodes 
has been exceeded to safely restripe the file system.  The total number of PIT 
worker threads, which is the sum of pitWorkerThreadsPerNode of the 
participating nodes, cannot exceed 31.  Reissue the command with a smaller set 
of participating nodes (-N option) and/or lower the pitWorkerThreadsPerNode 
configure setting.  By default the file system manager node is counted as a 
participating node.
mmrestripefs: Command failed. Examine previous error messages to determine 
cause.

So there must be some difference in how the “-r” and “-b” options calculate the 
number of PIT worker threads.  I did an “mmfsadm dump all | grep 
pitWorkerThreadsPerNode” on all 8 NSD servers and the filesystem manager node … 
they all say the same thing:

   pitWorkerThreadsPerNode 0

Hmmm, so 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 > 31?!?  I’m confused...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - 
(615)875-9633



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] HAWC question

2017-05-04 Thread Sven Oehme
Let me clarify and get back; I am not 100% sure about a cross-cluster setup.
I think the main point was that the FS manager for that fs should be
reassigned (which could also happen via mmchmgr) and then the individual
clients that mount that fs restarted, but I will double check and reply
later.


On Thu, May 4, 2017 at 6:39 AM Simon Thompson (IT Research Support) <
s.j.thomp...@bham.ac.uk> wrote:

> Which cluster though? The client and storage are separate clusters, so all
> the nodes on the remote cluster or storage cluster?
>
> Thanks
>
> Simon
> 
> From: gpfsug-discuss-boun...@spectrumscale.org [
> gpfsug-discuss-boun...@spectrumscale.org] on behalf of oeh...@gmail.com [
> oeh...@gmail.com]
> Sent: 04 May 2017 14:28
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] HAWC question
>
> well, it's a bit complicated, which is why the message is there in the
> first place.
>
> The reason is, there is no easy way to tell except by dumping the stripe
> group on the filesystem manager, checking which log group your particular
> node is assigned to, and then checking the size of that log group.
>
> As soon as the client node gets restarted it should in most cases pick up
> a new log group, and that should be at the new size, but to be 100% sure
> we say all nodes need to be restarted.
>
> You also need to turn HAWC on as well; I assume you just left this out of
> the email, as just changing the log size doesn't turn it on :-)
>
> On Thu, May 4, 2017 at 6:15 AM Simon Thompson (IT Research Support) <
> s.j.thomp...@bham.ac.uk> wrote:
> Hi,
>
> I have a question about HAWC, we are trying to enable this for our
> OpenStack environment, system pool is on SSD already, so we try to change
> the log file size with:
>
> mmchfs FSNAME -L 128M
>
> This says:
>
> mmchfs: Attention: You must restart the GPFS daemons before the new log
> file
> size takes effect. The GPFS daemons can be restarted one node at a time.
> When the GPFS daemon is restarted on the last node in the cluster, the new
> log size becomes effective.
>
>
> We multi-cluster the file-system, so do we have to restart every node in
> all clusters, or just in the storage cluster?
>
> And how do we tell once it has become active?
>
> Thanks
>
> Simon
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] HAWC question

2017-05-04 Thread Simon Thompson (IT Research Support)
Which cluster though? The client and storage are separate clusters, so all the 
nodes on the remote cluster or storage cluster?

Thanks

Simon 

From: gpfsug-discuss-boun...@spectrumscale.org 
[gpfsug-discuss-boun...@spectrumscale.org] on behalf of oeh...@gmail.com 
[oeh...@gmail.com]
Sent: 04 May 2017 14:28
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] HAWC question

well, it's a bit complicated, which is why the message is there in the first 
place.

The reason is, there is no easy way to tell except by dumping the stripe 
group on the filesystem manager, checking which log group your particular 
node is assigned to, and then checking the size of that log group.

As soon as the client node gets restarted it should in most cases pick up a 
new log group, and that should be at the new size, but to be 100% sure we 
say all nodes need to be restarted.

You also need to turn HAWC on as well; I assume you just left this out of 
the email, as just changing the log size doesn't turn it on :-)

On Thu, May 4, 2017 at 6:15 AM Simon Thompson (IT Research Support) wrote:
Hi,

I have a question about HAWC, we are trying to enable this for our
OpenStack environment, system pool is on SSD already, so we try to change
the log file size with:

mmchfs FSNAME -L 128M

This says:

mmchfs: Attention: You must restart the GPFS daemons before the new log
file
size takes effect. The GPFS daemons can be restarted one node at a time.
When the GPFS daemon is restarted on the last node in the cluster, the new
log size becomes effective.


We multi-cluster the file-system, so do we have to restart every node in
all clusters, or just in the storage cluster?

And how do we tell once it has become active?

Thanks

Simon

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] HAWC question

2017-05-04 Thread Sven Oehme
well, it's a bit complicated, which is why the message is there in the first
place.

The reason is, there is no easy way to tell except by dumping the stripe
group on the filesystem manager, checking which log group your particular
node is assigned to, and then checking the size of that log group.

As soon as the client node gets restarted it should in most cases pick up a
new log group, and that should be at the new size, but to be 100% sure we
say all nodes need to be restarted.

You also need to turn HAWC on as well; I assume you just left this out of
the email, as just changing the log size doesn't turn it on :-)
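
To make the whole sequence concrete, a rough sketch for one file system 
might be the following (the name "gpfs0" and the 64K threshold are 
placeholder/example values; HAWC itself is enabled by giving the file system 
a non-zero write cache threshold):

  # Enlarge the recovery log (the system pool is already on SSD here)
  mmchfs gpfs0 -L 128M

  # Turn HAWC on by setting a non-zero write cache threshold
  mmchfs gpfs0 --write-cache-threshold 64K

  # Recycle the GPFS daemon node by node so clients pick up new, larger
  # log groups (per the caveat above, all nodes to be 100% sure)
  mmshutdown -N <node>
  mmstartup -N <node>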

On Thu, May 4, 2017 at 6:15 AM Simon Thompson (IT Research Support) <
s.j.thomp...@bham.ac.uk> wrote:

> Hi,
>
> I have a question about HAWC, we are trying to enable this for our
> OpenStack environment, system pool is on SSD already, so we try to change
> the log file size with:
>
> mmchfs FSNAME -L 128M
>
> This says:
>
> mmchfs: Attention: You must restart the GPFS daemons before the new log
> file
> size takes effect. The GPFS daemons can be restarted one node at a time.
> When the GPFS daemon is restarted on the last node in the cluster, the new
> log size becomes effective.
>
>
> We multi-cluster the file-system, so do we have to restart every node in
> all clusters, or just in the storage cluster?
>
> And how do we tell once it has become active?
>
> Thanks
>
> Simon
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Tiebreaker disk question

2017-05-04 Thread Olaf Weiser
this configuration (2 nodes and tiebreaker) is not designed to survive node
and disk failures at the same time... whether it survives depends on where
the cluster manager and the filesystem manager run when a node and half of
the disks disappear at the same time...
for a real active-active configuration you may consider
https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_actact.htm


From:    Jan-Frode Myklebust
To:      gpfsug main discussion list
Date:    05/04/2017 07:27 AM
Subject: Re: [gpfsug-discuss] Tiebreaker disk question
Sent by: gpfsug-discuss-boun...@spectrumscale.org


This doesn't sound like normal behaviour. It shouldn't matter which
filesystem your tiebreaker disks belong to. I think the failure was caused
by something else, but I am not able to guess from the little information
you posted.. The mmfs.log will probably tell you the reason.

-jf

On Wed, May 3, 2017 at 19:08, Shaun Anderson wrote:

We noticed some odd behavior recently.  I have a customer with a small
Scale (with Archive on top) configuration that we recently updated to a
dual node configuration.  We are using CES and set up a very small 3 NSD
shared-root filesystem (gpfssr).  We also set up tiebreaker disks and
figured it would be ok to use the gpfssr NSDs for this purpose.  When we
tried to perform some basic failover testing, both nodes came down.  It
appears from the logs that when we initiated the node failure (via the
mmshutdown command... not great, I know) it unmounts and remounts the
shared-root filesystem.  When it did this, the cluster lost access to the
tiebreaker disks, figured it had lost quorum, and the other node came down
as well.

We got around this by changing the tiebreaker disks to our other normal
gpfs filesystem.  After that, failover worked as expected.  This is
documented nowhere as far as I could find.  I wanted to know if anybody
else had experienced this and if this is expected behavior.  All is well
now and operating as we want, so I don't think we'll pursue a support
request.

Regards,

SHAUN ANDERSON
STORAGE ARCHITECT
O 208.577.2112
M 214.263.7014

NOTICE: This email message and any attachments hereto may contain
confidential information. Any unauthorized review, use, disclosure, or
distribution of such information is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy the
original message and all copies of it.
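
For anyone wanting to make the same change Shaun describes, tiebreaker disks 
are reassigned with mmchconfig; a minimal sketch (the NSD names below are 
placeholders; check mmlsnsd for the real ones, and note that some code 
levels require GPFS to be down cluster-wide to change this) would be:

  # Point the tiebreaker disks at NSDs from the main data filesystem
  mmchconfig tiebreakerDisks="datansd1;datansd2;datansd3"

  # Confirm the setting
  mmlsconfig tiebreakerDisks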

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss