Re: [gpfsug-discuss] IO sizes

2022-02-24 Thread Olaf Weiser
in addition to Achim,
where do you see those "smaller IOs"...
have you checked the IO sizes with mmfsadm dump iohist on each NSD client/server? ... If they look OK on that level.. it's not GPFS
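For reference, a minimal sketch of that check (output columns can differ slightly between releases; run it on an NSD client and on its NSD server while the benchmark is active):

    mmdiag --iohist | tail -40          # supported wrapper around the same data
    mmfsadm dump iohist | tail -40      # as suggested above; compare the size column (sectors) on client vs. server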
 
 
 
Mit freundlichen Grüßen / Kind regards
 Olaf Weiser 
 
 
- Original message -
From: "Achim Rehor"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes
Date: Thu, 24 Feb 2022 13:41
Hi Uwe, first of all, glad to see you back in the GPFS space ;)

Agreed, groups of subblocks being written will end up in IO sizes smaller than the 8MB filesystem blocksize. Also agreed, this cannot be metadata, since their size is MUCH smaller, like 4k or less, mostly.

But why would these grouped subblock reads/writes all end up on the same NSD server, while the others do full block writes? How is your NSD server setup per NSD? Did you 'round-robin' set the preferred NSD server per NSD? Are the client nodes transferring the data in any way doing specifics?

Sorry for not having a solution for you, just sharing a few ideas ;)

Mit freundlichen Grüßen / Kind regards

Achim Rehor
Technical Support Specialist Spectrum Scale and ESS (SME)
Advisory Product Services Professional
IBM Systems Storage Support - EMEA
gpfsug-discuss-boun...@spectrumscale.org wrote on 23/02/2022 22:20:11:

> From: "Andrew Beattie"
> To: "gpfsug main discussion list"
> Date: 23/02/2022 22:20
> Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
>
> Alex,
>
> Metadata will be 4Kib
>
> Depending on the filesystem version you will also have subblocks to
> consider. V4 filesystems have 1/32 subblocks, V5 filesystems have
> 1/1024 subblocks (assuming metadata and data block size is the same).
>
> My first question would be: are you sure that the Linux OS is
> configured the same on all 4 NSD servers?
>
> My second question would be: do you know what your average file size
> is? If most of your files are smaller than your filesystem block
> size, then you are always going to be performing writes using groups
> of subblocks rather than full block writes.
>
> Regards,
>
> Andrew
>
> On 24 Feb 2022, at 04:39, Alex Chekholko  wrote:
>
> Hi,
>
> Metadata I/Os will always be smaller than the usual data block size, right?
> Which version of GPFS?
>
> Regards,
> Alex
>
> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke  wrote:
> Dear all,
>
> sorry for asking a question which seems not directly GPFS related:
>
> In a setup with 4 NSD servers (old-style, with storage controllers in
> the back end), 12 clients and 10 Seagate storage systems, I do see in
> benchmark tests that just one of the NSD servers does send smaller IO
> requests to the storage than the other 3 (that is, both reads and
> writes are smaller).
>
> The NSD servers form 2 pairs, each pair is connected to 5 Seagate boxes
> (one server to the controllers A, the other one to controllers B of the
> Seagates, resp.).
>
> All 4 NSD servers are set up similarly:
>
> kernel: 3.10.0-1160.el7.x86_64 #1 SMP
>
> HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx
>
> driver: mpt3sas 31.100.01.00
>
> max_sectors_kb=8192 (max_hw_sectors_kb=16383, not 16384, as limited by
> mpt3sas) for all sd devices and all multipath (dm) devices built on top.
>
> scheduler: deadline
>
> multipath (actually we do have 3 pat

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Olaf Weiser
keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots..
at a certain level - which depends on #files, #directories, workload, #nodes, #networks etc. - we've seen cases where generating just full snapshots (whole file system) is the better approach, instead of maintaining snapshots for each fileset individually ..
 
sure. this has other side effects , like space consumption etc...
so as always.. it depends..
 
 
 
- Original message -
From: "Jan-Frode Myklebust"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce
Date: Wed, 2 Feb 2022 12:54
Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset.
 
i.e. do:
 
    snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
    mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:$snapname
 
instead of:
 
    mmcrsnapshot gpfs0 fileset1:$snapname
    mmcrsnapshot gpfs0 fileset2:$snapname
    mmcrsnapshot gpfs0 fileset3:$snapname   
 
 
  -jf
  

On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou  wrote:
Ivano,
 
if it happens frequently, I would recommend to open a support case.
 
The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures, afaik. Quiesce is required for nodes in the storage cluster but also for remote clusters. Quiesce means stopping activities (incl. I/O) for a short period of time to get such a consistent image, and also waiting for any in-flight data to be flushed to disk, since unflushed data would prevent a consistent point-in-time image.
 
Nodes receive a quiesce request and acknowledge when ready. When all nodes have acknowledged, the snapshot operation can proceed and I/O can resume immediately. It usually takes a few seconds at most and the operation performed is short, but the time I/O is stopped depends on how long it takes to quiesce the nodes. If some node takes longer to agree to stop its activities, that node will delay the completion of the quiesce and keep I/O paused on the rest.
There could be many reasons why some nodes delay the quiesce ack.
 
The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend opening a ticket for support to try to identify which nodes do not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, the default timeout was 60 seconds, which matches your log message. After such a timeout, the snapshot is considered failed to complete.
 
Support might help you understand the root cause and provide some recommendations if it happens frequently.
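As an illustration only (commands as commonly used, not taken from this thread): while an mmcrsnapshot is pending, the waiters usually point at the nodes that are slow to acknowledge the quiesce:

    mmlsmgr                                                        # find the file system manager node
    mmdiag --waiters                                               # run there; look for long quiesce-related waiters
    mmdsh -N all '/usr/lpp/mmfs/bin/mmdiag --waiters | head -5'    # assumes mmdsh/ssh keys are set up cluster-wide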
 
Best Regards,
--
Jordi Caubet Serrabou
IBM Storage Client Technical Specialist (IBM Spain)
 
- Original message -
From: "Talamo Ivano Giuseppe (PSI)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce
Date: Wed, Feb 2, 2022 11:45 AM
Hello Andrew,
 
Thanks for your questions.
 
We're not experiencing any other issue/slowness during normal activity.
The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only.
 
The two NSD servers have 750GB of RAM each, of which 618GB are configured as pagepool.
 
The issue we see is happening on both of the filesystems we have:
 
- perf filesystem:
 - 1.8 PB size (71% in use)
 - 570 million inodes (24% in use)
 
- tiered filesystem:
 - 400 TB size (34% in use)
 - 230 million files (60% in use)
 
Cheers,
Ivano
 
 
 
__
Paul Scherrer Institut
Ivano Talamo
WHGA/038
Forschungsstrasse 111
5232 Villigen PSI
Schweiz
 
Telefon: +41 56 310 47 11
E-Mail: ivano.tal...@psi.ch 

  

From: gpfsug-discuss-boun...@spectrumscale.org on behalf of Andrew Beattie
Sent: Wednesday, February 2, 2022 10:33 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce
 
Ivano,
 
How big is the filesystem in terms of number of files?
How big is the filesystem in terms of capacity? 
Is the Metadata on Flash or Spinning disk? 
Do you see issues when users do an ls of the filesystem, or only when you are doing snapshots?
 
How much memory do the NSD servers have?
How much is allocated to the OS / Spectrum Scale pagepool?
Regards
 
Andrew Beattie
Technical Specialist - Storage for Big Data & AI
IBM Technology Group
IBM Australia & New Zealand
P. +61 421 337 927
E. abeat...@au1.ibm.com
 
 
 
On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI)  wrote: 

 
Dear all,
 
For a while now we have been experiencing an issue when dealing with snapshots.

[gpfsug-discuss] email format check again for IBM domain send email

2021-12-17 Thread Olaf Weiser
 
Hello Lucas, here we are
this is a regular email, sent from Verse
 
@All, please ignore this email, it is to track  internal email format issues
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Test email format / mail format

2021-12-10 Thread Olaf Weiser
This email is just a test, because we've seen mail format issues in emails sent from IBM.
You can ignore this email; it is just for internal problem determination.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] alternate path between ESS Servers for Datamigration

2021-12-09 Thread Olaf Weiser
Hello Walter,
;-)
yes !AND! no ..
 
for sure, you can specify a subset of nodes to use RDMA while other nodes just communicate via TCP/IP.
But that's only half of the truth.
 
The other half is.. who and how you are going to migrate/copy the data.
In case you'll use mmrestripe, you will have to make sure that only nodes connected (green) and configured for RDMA are doing the work.
Otherwise it will also work to migrate the data, but then the data is sent through the Ethernet as well (as long as all those nodes are in the same cluster).
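A hedged sketch of what that could look like - the node class name is made up here, and -N is what restricts the restripe work to the RDMA-connected servers:

    mmcrnodeclass rdmaNodes -N ess01,ess02,ess07,ess08    # hypothetical class with the green-fabric servers
    mmrestripefs <fsname> -b -N rdmaNodes                 # only these nodes move/rebalance the data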
 
 
laff
 
 
 
 
 
- Original message -
From: "Walter Sklenka"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "'gpfsug-discuss@spectrumscale.org'"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] alternate path between ESS Servers for Datamigration
Date: Thu, 9 Dec 2021 11:04
Dear spectrum scale users!
May I ask you a design question?
We have an IB environment which is very mixed at the moment (ConnectX-3 … ConnectX-6 with FDR, even FDR10, and with the arrival of ESS5000 SC7 now also HDR100 and HDR switches). We still have some big trouble in this fabric when using RDMA; a case at Mellanox and IBM is open.
The environment has 3 old building blocks (2x ESS GL6 and 1x GL4), from where we want to migrate the data to the ESS5000 (mmdelvdisk + QoS).
Due to the current problems with RDMA we thought we could eventually try a workaround:
If you are interested, maybe you can find the attachment?
We built 2 separate fabrics: the ESS I/O servers are attached to both blue and green, and all other cluster members and all remote clusters only to fabric blue.
The daemon interfaces (IPoIB) are on fabric blue.
 
The aim is to set up RDMA only on the ESS I/O servers in the green fabric; in the blue fabric we must use IPoIB (TCP).
Do you think data migration would work between ess01,ess02,… and ess07,ess08 via RDMA?
Or is it principally not possible to make an RDMA network only for a subset of a cluster (though this subset would be reachable via the other fabric)?
 
Thank you very much for any input !
Best regards walter 
 
 
 
Mit freundlichen Grüßen / Kind regards
Walter Sklenka
Technical Consultant
 
EDV-Design Informationstechnologie GmbH
Giefinggasse 6/1/2, A-1210 Wien
Tel: +43 1 29 22 165-31
Fax: +43 1 29 22 165-90
E-Mail: skle...@edv-design.at
Internet: www.edv-design.at
 
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] gpfsgui in a core dump/restart loop

2021-11-30 Thread Olaf Weiser
add this line to the ticket/record information when opening a service ticket:
 
fput failed: Version mismatch on conditional put (err 805)
 
 
- Original message -
From: "Luis Bolinches"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: [EXTERNAL] Re: [gpfsug-discuss] gpfsgui in a core dump/restart loop
Date: Tue, 30 Nov 2021 14:30
Hi
 
Not really a solution ...
 
first disable the systemd service
 
systemctl disable gpfsgui
 
So at least it does not keep going through this loop.
 
This can be indicative of a few different issues: 2 or more nodes trying to modify the same file; removed nodes that were perfmon; "too many" collectors under certain conditions; ... and probably many others.
 
I strongly suggest you get the last round of generated dump data and open a case with IBM (assuming this is IBM; whoever else the vendor is if not). Maybe include a snap with it to speed things up, so there is a clear picture of the cluster, the CCR nodes and the collectors.
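For example (paths are the usual defaults, adjust to your install; this is only a sketch):

    systemctl stop gpfsgui && systemctl disable gpfsgui   # stop the crash loop first
    /usr/lpp/mmfs/bin/gpfs.snap                           # cluster snap to attach to the case
    ls -ltr /var/crash/scalemgmt | tail                   # latest GUI core/javacore dumps to include as well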
 
 
--
Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations / Salutacions
Luis Bolinches
IBM Spectrum Scale development
Mobile Phone: +358503112585
 
https://www.youracclaim.com/user/luis-bolinches
 
Ab IBM Finland Oy
Laajalahdentie 23
00330 Helsinki
Uusimaa - Finland
"If you always give you will always have" --  Anonymous
 
 
 
- Original message -
From: "Losen, Stephen C (scl)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] gpfsgui in a core dump/restart loop
Date: Tue, Nov 30, 2021 14:48
Hi folks,
Our gpfsgui service keeps crashing and restarting. About every three minutes we get files like these in /var/crash/scalemgmt

-rw--- 1 scalemgmt scalemgmt 1067843584 Nov 30 06:54 core.20211130.065414.59174.0001.dmp
-rw-r--r-- 1 scalemgmt scalemgmt    2636747 Nov 30 06:54 javacore.20211130.065414.59174.0002.txt
-rw-r--r-- 1 scalemgmt scalemgmt    1903304 Nov 30 06:54 Snap.20211130.065414.59174.0003.trc
-rw-r--r-- 1 scalemgmt scalemgmt        202 Nov 30 06:54 jitdump.20211130.065414.59174.0004.dmp

The core.*.dmp files are cores from the java command. And the below errors keep repeating in /var/adm/ras/mmsysmonitor.log. Any suggestions? Thanks for any help.

2021-11-30_07:25:09.944-0500: [W] ET_gui          Event=gui_down identifier= arg0=started arg1=stopped
2021-11-30_07:25:09.961-0500: [I] ET_gui          state_change for service: gui to FAILED at 2021.11.30 07.25.09.961572
2021-11-30_07:25:09.963-0500: [I] ClientThread-4  received command: 'thresholds  refresh  collectors  4021694'
2021-11-30_07:25:09.964-0500: [I] ClientThread-4  reload collectors
2021-11-30_07:25:09.964-0500: [I] ClientThread-4  read_collectors
2021-11-30_07:25:10.059-0500: [W] ClientThread-4  QueryHandler: query response has no data results
2021-11-30_07:25:10.059-0500: [W] ClientThread-4  QueryProcessor::execute: Error sending query in execute, quitting
2021-11-30_07:25:10.060-0500: [W] ClientThread-4  QueryHandler: query response has no data results
2021-11-30_07:25:10.060-0500: [W] ClientThread-4  QueryProcessor::execute: Error sending query in execute, quitting
2021-11-30_07:25:10.061-0500: [I] ClientThread-4  _activate_rules_scheduler completed
2021-11-30_07:25:10.147-0500: [I] ET_gui          Event=component_state_change identifier= arg0=GUI arg1=FAILED
2021-11-30_07:25:10.148-0500: [I] ET_gui          StateChange: change_to=FAILED nodestate=DEGRADED CESState=UNKNOWN
2021-11-30_07:25:10.148-0500: [I] ET_gui          Service gui state changed. isInRunningState=True, wasInRunningState=True. New state=4
2021-11-30_07:25:10.148-0500: [I] ET_gui          Monitor: LocalState:FAILED Events:607 Entities:0 RT:  0.83
2021-11-30_07:25:11.975-0500: [W] ET_perfmon      got rc (153) while executing ['/usr/lpp/mmfs/bin/mmccr', 'fput', 'collectors', '/var/mmfs/tmp/tmpq4ac8o', '-c 4021693']
2021-11-30_07:25:11.975-0500: [E] ET_perfmon      fput failed: Version mismatch on conditional put (err 805) - CCRProxy._run_ccr_command:256
2021-09-29_20:03:53.322-0500: [I] MainThread      -
2021-11-30_07:25:04.553-0500: [D] ET_perfmon      File collectors has no newer version than 4021693  - CCRProxy.getFile:119
2021-11-30_07:25:11.975-0500: [W] ET_perfmon      Conditional put for file collectors with version 4021693 failed
2021-11-30_07:25:11.975-0500: [W] ET_perfmon      New version received, start new collectors update cycle
2021-11-30_07:25:11.976-0500: [I] ET_perfmon      read_collectors
2021-11-30_07:25:12.077-0500: [I] ET_perfmon      write_collectors
2021-11-30_07:25:13.333-0500: [I] ClientThread-20 received command: 'thresholds  refresh  collectors  4021695'
2021-11-30_07:25:13.334-0500: [I] ClientThread-20 reload collectors

Re: [gpfsug-discuss] /tmp/mmfs vanishes randomly?

2021-11-08 Thread Olaf Weiser
Hello Heiner,
 
multiple levels of answers..
 
(1st) ... if the directory is not there, the GPFS trace would create it automatically - just like this:
[root@ess5-ems1 ~]# ls -l /tmp/mmfs
ls: cannot access '/tmp/mmfs': No such file or directory
[root@ess5-ems1 ~]# mmtracectl --start -N ems5k.mmfsd.net
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all affected nodes.  This is an asynchronous process.
[root@ess5-ems1 ~]#
[root@ess5-ems1 ~]# ls -l /tmp/mmfs
total 0
-rw-r--r-- 1 root root 0 Nov  8 10:47 lxtrace.trcerr.ems5k
[root@ess5-ems1 ~]#
(2nd) I think - the cleaning of /tmp is something done by the OS -
please check - 
systemctl status systemd-tmpfiles-setup.service
or look at this config file
[root@ess5-ems1 ~]# cat /usr/lib/tmpfiles.d/tmp.conf
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

# See tmpfiles.d(5) for details

# Clear tmp directories separately, to make them easier to override
q /tmp 1777 root root 10d
q /var/tmp 1777 root root 30d

# Exclude namespace mountpoints created with PrivateTmp=yes
x /tmp/systemd-private-%b-*
X /tmp/systemd-private-%b-*/tmp
x /var/tmp/systemd-private-%b-*
X /var/tmp/systemd-private-%b-*/tmp

# Remove top-level private temporary directories on each boot
R! /tmp/systemd-private-*
R! /var/tmp/systemd-private-*
[root@ess5-ems1 ~]#
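If it is indeed the tmpfiles cleanup, an (untested) drop-in can exempt the dump directory, or dataStructureDump can simply be pointed outside /tmp; the target path below is just an example:

    # /etc/tmpfiles.d/mmfs.conf
    x /tmp/mmfs                        # exclude from age-based cleanup
    d /tmp/mmfs 0755 root root -       # recreate at boot, never clean

    mmchconfig dataStructureDump=/var/mmfs/tmp/dumps -i   # or move the dump dir away from /tmp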
 
hope this helps -
cheers
 
 
 
Mit freundlichen Grüßen / Kind regards
Olaf Weiser
IBM Systems, SpectrumScale Client Adoption
---
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.wei...@de.ibm.com
---
IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert Janzen, Markus Koerner, Christian Noll, Nicole Reimer
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
 
 
 
- Original message -
From: "Billich Heinrich Rainer (ID SD)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] /tmp/mmfs vanishes randomly?
Date: Mon, 8 Nov 2021 10:35
Hello,

We use /tmp/mmfs as dataStructureDump directory. Since a while I notice that this directory randomly vanishes. Mmhealth does not complain but just notes that it will no longer monitor the directory. Still I doubt that trace collection and similar will create the directory when needed?

Do you know of any Spectrum Scale internal mechanism that could cause /tmp/mmfs to get deleted?

It happens on ESS nodes, with a plain IBM installation, too. It happens just on one or two nodes at a time, it's no cluster-wide cleanup or similar. We run Scale 5.0.5 and ESS 6.0.2.2 and 6.0.2.2.

Thank you,

Mmhealth message:
local_fs_path_not_found   INFO       The configured dataStructureDump path /tmp/mmfs does not exists. Skipping monitoring.

Kind regards,
Heiner
---
===
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.bill...@id.ethz.ch
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] alphafold and mmap performance

2021-10-19 Thread Olaf Weiser
[...] We have tried a number of things including Spectrum Scale client version 5.0.5-9 [...] in the client code or the server code?

There are multiple improvements going into the code.. continuously... Since your versions 4.2.3 / 5.0.5, a lot of them are in the area of the NSD server/GNR (which is server based), and a lot of enhancements also went into the client part. Some are on both sides.. such as RoCE, or using multiple TCP/IP sockets per communication pair, etc.
All this influences your performance..
 
But I d like to try to give you some answers to  your specific Q -
Only now do I notice a suggestion:    mmchconfig prefetchAggressivenessRead=0 -i
I did not use this.  Would a performance change be expected?
 
YES;-)  .. this parameter should really help .. from the UG expert talk 2020 we shared some numbers/charts on it
https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/
starting ~ 8:30 minutes / just 2 slides  ... 
let us know, if you need more information
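For reference, a sketch of applying and checking it (the node class name is a placeholder):

    mmchconfig prefetchAggressivenessRead=0 -i -N computeNodes   # -i = takes effect immediately
    mmdiag --config | grep -i prefetchAggressiveness             # confirm the active value on a node, if listed at your level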
 
next Q:
Would the pagepool size be involved in this?
The GPFS pagepool can be considered to be pinned in memory. So I doubt - and as you write - [...] echo 3 > /proc/sys/vm/drop_caches) the slow performance returns [...] - that the pagepool is your first choice for further analysis.
However... I can't say anything about your pagepool utilization without seeing any counters, statistics, etc..
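If you want a first look yourself, a hedged starting point (output layout differs between releases) could be:

    mmdiag --memory        # pagepool size and current usage on this node
    mmlsconfig pagepool    # configured pagepool value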
 
let me know, if this helps you .. or if you need more information ...
 
 
 
Mit freundlichen Grüßen / Kind regards ;-)
 Olaf
 
 
- Original message -
From: "Jon Diprose"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] alphafold and mmap performance
Date: Tue, 19 Oct 2021 20:12
Not that it answers Stuart's questions in any way, but we gave up on the same problem on a similar setup, rescued an old fileserver off the scrapheap (RAID6 of 12 x 7.2k rpm SAS on a PERC H710P) and just served the reference data by nfs - good enough to keep the compute busy rather than in cxiWaitEventWait. If there's significant demand for Alphafold then somebody's arm will be twisted for a new server with some NVMe. If I remember right, the reference data is ~2.3TB, ruling out our usual approach of just reading the problematic files into a ramdisk first.

We are also interested in hearing how it might be usably served from GPFS.

Thanks,
Jon

--
Dr. Jonathan Diprose              Tel: 01865 287873
Research Computing Manager
Henry Wellcome Building for Genomic Medicine
Roosevelt Drive, Headington, Oxford OX3 7BN

From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Stuart Barkley [stua...@4gh.net]
Sent: 19 October 2021 18:16
To: gpfsug-discuss@spectrumscale.org
Subject: [gpfsug-discuss] alphafold and mmap performance

Over the years there have been several discussions about performance problems with mmap() on GPFS/Spectrum Scale.

We are currently having problems with mmap() performance on our systems with new alphafold protein folding software. Things look similar to previous times we have had mmap() problems.

The software component "hhblits" appears to mmap a large file with genomic data and then does random reads throughout the file. GPFS appears to be doing 4K reads for each block limiting the performance.

The first run takes 20+ hours to run. Subsequent identical runs complete in just 1-2 hours. After clearing the linux system cache (echo 3 > /proc/sys/vm/drop_caches) the slow performance returns for the next run.

GPFS Server is 4.2.3-5 running on DDN hardware. CentOS 7.3
Default GPFS Client is 4.2.3-22. CentOS 7.9

We have tried a number of things including Spectrum Scale client version 5.0.5-9 which should have Sven's recent mmap performance improvements. Are the recent mmap performance improvements in the client code or the server code?

Only now do I notice a suggestion:
    mmchconfig prefetchAggressivenessRead=0 -i
I did not use this. Would a performance change be expected?

Would the pagepool size be involved in this?

Stuart Barkley
--
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Handling bad file names in policies?

2021-10-05 Thread Olaf Weiser
Hi  Ed,
 
not a ready-to-run solution for "everything".. but just a reminder: there is an ESCAPE statement.
With it you can
 
 cat policy2
 RULE EXTERNAL LIST 'allfiles' EXEC '/var/mmfs/etc/list.exe'  ESCAPE '%/#'
 
and turn a file name into something that a policy can use.
 
I haven't used it for a while , but here is an example from a while ago .. ;-)
 
[root@c25m4n03 stupid_files]# ll
total 0
-rw-r--r-- 1 root root 21 Mar 22 03:44 dämlicher filename
-rw-r--r-- 1 root root  2 Mar 22 03:59 üöä???ßß spacefilen
[root@c25m4n03 stupid_files]#
 
 
policy:
101378 247907919 0   -- /gpfs/fpofs/files/stupid_files/d%C3%A4mlicher%20filename
101381 1945364096 0   -- /gpfs/fpofs/files/stupid_files/%C3%BC%C3%BC%C3%BC%C3%B6%C3%B6%C3%A4%C3%A4%3F%3F%3F%C3%9F%C3%9F%20spacefilename
[I] 2013-03-22@13:12:58.687 Policy execution. 2 files dispatched.
 
verify with policy (ESCAPE '%/ä ')
101378 247907919 0   -- /gpfs/fpofs/files/stupid_files/dämlicher filename
[...]
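Putting it together, a minimal untested sketch; the list name, file system path and output prefix are arbitrary, and EXEC '' with -I defer just writes the escaped list to a file instead of calling an external program:

    cat badnames.pol
    RULE EXTERNAL LIST 'allfiles' EXEC '' ESCAPE '%/'
    RULE 'all' LIST 'allfiles'

    mmapplypolicy /gpfs/fs0 -P badnames.pol -I defer -f /tmp/badnames
    # escaped names end up in /tmp/badnames.list.allfiles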
 
 
hope this helps..
cheers
 
 
 
 
- Original message -
From: "Jonathan Buzzard"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Handling bad file names in policies?
Date: Tue, 5 Oct 2021 01:29
On 04/10/2021 23:23, Wahl, Edward wrote:
> I know I've run into this before way back, but my notes on how I solved
> this aren't getting the job done in Scale 5.0.5.8 and my notes are from
> 3.5.
> Anyone know a way to get a LIST policy to properly feed bad filenames
> into the output or an external script?
>
> When I say bad I mean things like control characters, spaces, etc.   Not
> concerned about the dreaded 'newline' as we force users to fix those or
> the files do not get backed up in Tivoli.
>

Since when? Last time I checked, which was admittedly circa 2008, TSM would backup files with newlines in them no problem. mmbackup on the other hand in that time frame would simply die and backup nothing if there was a single file on the file system with a newline in it.

I would take a look at the mmbackup scripts which can handle such stuff (least ways in >4.2) which would also suggest dsmc can handle it.

As an aside I now think I know how you end up with newlines in file names. Basically you cut and paste the file name complete with newlines (most likely at the end) into a text field when saving the file. Personally I think any program should baulk at that point but what do I know.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RDMA write error IBV_WC_RETRY_EXC_ERR

2021-07-09 Thread Olaf Weiser
smells like a network problem ..
 
IBV_WC_RETRY_EXC_ERR  comes from OFED and clearly says that the data didn't get through successfully,
 
further help .. check
ibstat
iblinkinfo
ibdiagnet
and the sminfo .. (should be the same on all members)
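A rough sketch of that first pass (run on an affected node and on the subnet manager host; flags may differ per OFED level):

    ibstat                                # HCA/port state and rate
    iblinkinfo | grep -i -e down -e err   # links that are down or running degraded
    sminfo                                # active subnet manager - should be identical on all members
    ibdiagnet                             # full fabric sweep; then review the error counters it reports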
 
 
 
 
- Original message -
From: "Iban Cabrillo"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] RDMA write error IBV_WC_RETRY_EXC_ERR
Date: Fri, 9 Jul 2021 13:29
Dear,
    For a couple of hours we have been seeing lots of IB errors in the GPFS logs, on every IB node (gpfs version is 5.0.4-3):
 
  2021-07-09_13:11:40.600+0200: [E] VERBS RDMA closed connection to 10.10.152.73 (node157) on mlx5_0 port 1 fabnum 0 index 251 cookie 648 RDMA write error IBV_WC_RETRY_EXC_ERR
2021-07-09_13:11:40.600+0200: [E] VERBS RDMA closed connection to 10.10.152.18 (node102) on mlx5_0 port 1 fabnum 0 index 227 cookie 687 RDMA write error IBV_WC_RETRY_EXC_ERR
2021-07-09_13:11:40.600+0200: [E] VERBS RDMA closed connection to 10.10.152.17 (node101) on mlx5_0 port 1 fabnum 0 index 298 cookie 693 RDMA write error IBV_WC_RETRY_EXC_ERR
2021-07-09_13:11:40.600+0200: [E] VERBS RDMA closed connection to 10.10.151.6 (node6) on mlx5_0 port 1 fabnum 0 index 18 cookie 696 RDMA write error IBV_WC_RETRY_EXC_ERR
2021-07-09_13:11:40.601+0200: [E] VERBS RDMA closed connection to 10.10.152.46 (node130) on mlx5_0 port 1 fabnum 0 index 254 cookie 680 RDMA write error IBV_WC_RETRY_EXC_ERR
2021-07-09_13:11:40.601+0200: [E] VERBS RDMA closed connection to 10.10.151.81 (node81) on mlx5_0 port 1 fabnum 0 index 289 cookie 679 RDMA read error IBV_WC_RETRY_EXC_ERR
 
and of course long waiters:
 
=== mmdiag: waiters ===
Waiting 34.8493 sec since 13:11:35, ignored, thread 2935 VerbsReconnectThread: delaying for 25.150686000 more seconds, reason: delaying for next reconnect attempt
Waiting 34.6249 sec since 13:11:35, ignored, thread 10198 VerbsReconnectThread: delaying for 25.375072000 more seconds, reason: delaying for next reconnect attempt
Waiting 27.0957 sec since 13:11:43, ignored, thread 10052 VerbsReconnectThread: delaying for 32.904264000 more seconds, reason: delaying for next reconnect attempt
Waiting 14.8909 sec since 13:11:55, monitored, thread 23135 NSDThread: for RDMA write completion fast on node 10.10.151.65 
Waiting 14.8891 sec since 13:11:55, monitored, thread 23109 NSDThread: for RDMA write completion fast on node 10.10.152.32 
Waiting 14.8865 sec since 13:11:55, monitored, thread 23302 NSDThread: for RDMA write completion fast on node 10.10.150.1 
 
[common]
verbsRdma enable
verbsPorts mlx4_0/1/0
[gpfs02,gpfs04,gpfs05,gpfs06,gpfs07,gpfs08]
verbsPorts mlx5_0/1/0
[gpfs01]
verbsPorts mlx5_1/1/0
[gpfs03]
verbsPorts mlx5_0/1/0 mlx5_1/1/0
 
 
[common]
verbsRdma enable
verbsPorts mlx4_0/1/0
[gpfs02,gpfs04,gpfs05,gpfs06,gpfs07,gpfs08,wngpu001,wngpu002,wngpu003,wngpu004,wngpu005]
verbsPorts mlx5_0/1/0
[gpfs01]
verbsPorts mlx5_1/1/0
[gpfs03]
verbsPorts mlx5_0/1/0 mlx5_1/1/0
 
Any advise is welcomed
regards, I
  

 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Filesystem mount attempt hangs GPFS client node

2021-03-30 Thread Olaf Weiser
Hello Olu,
from the log you provide, nothing seems to be faulty... but that does not mean there is no issue ...
 
if you think it is a GPFS problem, start a GPFS trace on a sample node which hits this problem again and again... capture a trace and provide that data to IBM.
I suggest opening a PMR with IBM and collecting a GPFS snap ...
 
personally, I would start debugging the node... make journalctl  persistent
https://access.redhat.com/solutions/696893
and start from there ...
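A hedged sketch of both steps (the node name is a placeholder; the file system name is taken from the log below):

    mkdir -p /var/log/journal && systemctl restart systemd-journald   # persistent journal across the power reset

    mmtracectl --start -N problemnode
    mmmount mmfs1 -N problemnode        # reproduce the hang
    mmtracectl --stop -N problemnode
    gpfs.snap -N problemnode            # collect and attach to the PMR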
 
it smells a bit like a network problem related to RDMA/OFED.. do you use the same OFED version as in the cluster that works fine?
 
 
 
- Original message -
From: "Saula, Oluwasijibomi"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Filesystem mount attempt hangs GPFS client node
Date: Mon, Mar 29, 2021 8:38 PM
Hello Folks,
 
So we are experiencing a mind-boggling issue where just a couple of nodes in our cluster, at GPFS boot up, get hung so badly that the node must be power reset.
 
These AMD client nodes are diskless in nature and have at least 256G of memory. We have other AMD nodes that are working just fine in a separate GPFS cluster albeit on RHEL7.
 
Just before GPFS (or related processes) seize up the node, the following lines of /var/mmfs/gen/mmfslog are noted:
 
2021-03-29_12:47:37.343-0500: [N] mmfsd ready
2021-03-29_12:47:37.426-0500: mmcommon mmfsup invoked. Parameters: 10.12.50.47 10.12.50.242 all
2021-03-29_12:47:37.587-0500: mounting /dev/mmfs1
2021-03-29_12:47:37.590-0500: [I] Command: mount mmfs1
2021-03-29_12:47:37.859-0500: [N] Connecting to 10.12.50.243 tier1-sn-02.pixstor 
2021-03-29_12:47:37.864-0500: [I] VERBS RDMA connecting to 10.12.50.242 (tier1-sn-01.pixstor) on mlx5_0 port 1 fabnum 0 sl 0 index 0
2021-03-29_12:47:37.864-0500: [I] VERBS RDMA connecting to 10.12.50.242 (tier1-sn-01) on mlx5_0 port 1 fabnum 0 sl 0 index 1
2021-03-29_12:47:37.866-0500: [I] VERBS RDMA connected to 10.12.50.242 (tier1-sn-01) on mlx5_0 port 1 fabnum 0 sl 0 index 0
2021-03-29_12:47:37.867-0500: [I] VERBS RDMA connected to 10.12.50.242 (tier1-sn-01) on mlx5_0 port 1 fabnum 0 sl 0 index 1
2021-03-29_12:47:37.868-0500: [I] Connected to 10.12.50.243 tier1-sn-02 
There have been hunches that this might be a network issue, however, other nodes connected to the IB network switch are mounting the filesystem without incident.
 
I'm inclined to believe there's a GPFS/OS-specific setting that might be causing these crashes especially when we note that disabling the automount on the client node doesn't result in the node hanging. However, once we issue mmmount, we see the node seize up shortly...
 
Please let me know if you have any thoughts on where to look for root-causes as I and a few fellows are stuck here 
 
 
 
 
Thanks,
 
Oluwasijibomi (Siji) Saula
HPC Systems Administrator  /  Information Technology
 
Research 2 Building 220B / Fargo ND 58108-6050
p: 701.231.7749 / www.ndsu.edu
 

  

 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Olaf Weiser
[...]  that you would do this over NFSv4 or Samba.[...] 
 
I know this statement, and I've been told the same many times in the projects I did...
I can't promise anything, nor am I able to rate whether it is worth it or not..
all I'm saying... I saw/see a need for setfacl directly on GPFS in multiple projects..
 
so if that item is interesting for many people.. why not give it a try in 2021 ... knowing that really developing such a tool means a lot of discussion and in the end it will most likely become a better "compromise"... But maybe this "compromise" could then be the very best solution we'll ever get ..
 
P.S.
hey - at least the last 2 digits of the year have turned around ;-)  since you opened up this item
cheers
olaf
 
 
 
- Original message -
From: Jonathan Buzzard
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Using setfacl vs. mmputacl
Date: Mon, Mar 1, 2021 5:51 PM
On 01/03/2021 15:18, Olaf Weiser wrote:
> JAB,
> yes-this is in argument ;-) ... and personally I like the idea of having
> smth like setfacl also for GPFS ..  for years...
> *but* it would not take away the generic challenge , what to do, if
> there are competing standards / definitions to meet
> at least that is most likely just one reason, why there's no tool yet
> there are several hits on RFE page for "ACL".. some of them could be
> also addressed with a (mm)setfacl tool
> but I was not able to find a request for a tool itself
> (I quickly  searched  public but  not found it there, maybe there is
> already one in private...)
> So - dependent on how important this item for others  is  ... its time
> to fire an RFE ?!? ...

Well when I asked I was told by an IBM representative that it was by design there was no proper way to set ACLs directly from Linux. The expectation was that you would do this over NFSv4 or Samba.

So filing an RFE would be pointless under those conditions and I have never bothered as a result. This was pre 2012 so IBM's outlook might have changed in the meantime.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Olaf Weiser
JAB,
yes - this is an argument ;-) ... and personally I have liked the idea of having something like setfacl also for GPFS for years...
*but* it would not take away the generic challenge of what to do if there are competing standards / definitions to meet
at least that is most likely just one reason, why there's no tool yet
 
there are several hits on RFE page for "ACL".. some of them could be also addressed with a (mm)setfacl tool
but I was not able to find a request for a tool itself
(I quickly  searched  public but  not found it there, maybe there is already one in private...)
 
So - depending on how important this item is for others ... it's time to file an RFE ?!? ...
 
 
cheers
olaf
 
 
 
- Original message -
From: Jonathan Buzzard
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Using setfacl vs. mmputacl
Date: Mon, Mar 1, 2021 2:14 PM
On 01/03/2021 12:45, Olaf Weiser wrote:
> Hallo Stephen,
> behavior ... or better to say ... predicted behavior for chmod and ACLs
> .. is not an easy thing or only  , if  you stay in either POSIX world or
> NFSv4 world
> to be POSIX compliant, a chmod overwrites ACLs

One might argue that the general rubbishness of the mmputacl command, and if a mmsetfacl command (or similar) existed, would negate messing with Linux utilities to change ACLs on GPFS file systems.

Only been bringing it up for over a decade now ;-)

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Olaf Weiser
Hello Stephen,
 
behavior ... or better to say ... predictable behavior for chmod and ACLs is not an easy thing, unless you stay in either the POSIX world or the NFSv4 world
 
to be POSIX compliant, a chmod overwrites ACLs
 
GPFS was enhanced to ignore overwrites of ACLs on chmod via a parameter.. and I can't remember exactly when, but in your very old version (blink blink, please update) it should already be there... Then (later) it was enhanced even more to better mediate between the two worlds ...
 
You need to keep in mind... in case you use kernel NFS ... the Linux kernel NFS required a so-called lossy mapping ... because at the time the kernel NFS was written, there was no Linux file system available supporting native NFSv4 ACLs... so there was no other way than to "lossy" map NFSv4 ACLs into POSIX ACLs (long time ago) ...
 
but as always.. everything in IT business has some history.. ;-)
 
later in GPFS, we introduced the ability to allow, in a fine-grained way, the behavior of chmod, NFSv4 ACLs, POSIX ACLs or both - and to do that per fileset
 
--allow-permission-change PermissionChangeMode
Specifies the new permission change mode. This mode controls how chmod and ACL operations are handled on objects in the fileset. Valid modes are as follows:

chmodOnly
  Specifies that only the UNIX change mode operation (chmod) is allowed to change access permissions (ACL commands and API will not be accepted).
setAclOnly
  Specifies that permissions can be changed using ACL commands and API only (chmod will not be accepted).
chmodAndSetAcl
  Specifies that chmod and ACL operations are permitted. If the chmod command (or setattr file operation) is issued, the result depends on the type of ACL that was previously controlling access to the object:
  *  If the object had a Posix ACL, it will be modified accordingly.
  *  If the object had an NFSv4 ACL, it will be replaced by the given UNIX mode bits.
  Note: This is the default setting when a fileset is created.
chmodAndUpdateAcl
  Specifies that chmod and ACL operations are permitted. If chmod is issued, the ACL will be updated by privileges derived from UNIX mode bits.
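So, as a small hedged example (file system and fileset names are made up), the mode can be set per fileset like this:

    mmchfileset fs1 myfileset --allow-permission-change chmodAndUpdateAcl
    mmlsfileset fs1 myfileset -L        # the detailed listing should show the permission-change setting on recent levels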
hope this helps ..
 
 
- Original message -
From: "Losen, Stephen C (scl)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Using setfacl vs. mmputacl
Date: Mon, Mar 1, 2021 1:31 PM
Hi folks,

Experimenting with POSIX ACLs on GPFS 4.2 and noticed that the Linux command setfacl clears "c" permissions that were set with mmputacl. So if I have this:

...
group:group1:rwxc
mask::rwxc
...

and I modify a different entry with:

setfacl -m group:group2:r-x dirname

then the "c" permissions above get cleared and I end up with

...
group:group1:rwx-
mask::rwx-
...

I discovered that chmod does not clear the "c" mode. Is there any filesystem option to change this behavior to leave "c" modes in place?

Steve Losen
Research Computing
University of Virginia
s...@virginia.edu   434-924-0640
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-03-01 Thread Olaf Weiser
@all, please note...
 
as has been said, there is a major difference between GNR and native GPFS...
 
one common key is the number of queues in the OS to talk to a disk device,
so if you run a "classical" NSD architecture.. you may check how many IOPS you can fire against your block devices...
GPFS's internal rule is ~ 3 IOs per device; you can adjust it .. or (proceed below)
 
in GNR/ESS.. here... we (IBM) pre-configured everything on the NSD server side.. ready to use ..
in non-GNR.. you have to do this job yourself .. (how many NSD workers ... etc..) ((it's a bit more complex... (other topic)))
 
the IMPORTANT key in both cases for the client is:
ignorePrefetchLunCount=yes
workerThreads = ...your number of IOs you may think is ok...
 
those parameters tell GPFS: use x workers and do your work... and ignore the number of disks when calculating how much IO traffic can be in flight
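As a sketch (the node class name is hypothetical; the workerThreads value is just an example to size for your workload, and depending on the code level a daemon restart may be needed for it to take effect):

    mmchconfig ignorePrefetchLunCount=yes,workerThreads=512 -N nsdClientNodes
    mmlsconfig workerThreads        # check the configured value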
 
for GNR:
as Luis said.. it's more about management of the file system than performance when deciding to deviate from the default number of vdisks
 
 
 
- Original message -
From: "Luis Bolinches"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: [EXTERNAL] Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's
Date: Mon, Mar 1, 2021 10:08 AM
Hi
 
There are other reasons to have more than 1: the management of those NSDs. When you have to add or remove NSDs of a FS, having more than 1 makes it possible to empty some space and move them in and out. Manually, but possible. If you have one big NSD, or even 1 per enclosure, it might be difficult or even not possible, depending on the number of enclosures and the FS utilization.
 
Starting with some ESS version (not DSS, can't comment on that) that I do not recall, but within the last 6 months, we have changed the default (for those that use the default) to 4 NSDs per enclosure for ESS 5000. There is no impact on performance either way on ESS, we tested it. But management of those in the long run should be easier.
--
Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations / Salutacions
Luis Bolinches
Consultant IT Specialist
IBM Spectrum Scale development
Mobile Phone: +358503112585
 
https://www.youracclaim.com/user/luis-bolinches
 
Ab IBM Finland Oy
Laajalahdentie 23
00330 Helsinki
Uusimaa - Finland
"If you always give you will always have" --  Anonymous
 
 
 
- Original message -
From: "Achim Rehor"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's
Date: Mon, Mar 1, 2021 10:16
The reason for having multiple NSDs in legacy NSD (non-GNR) handling is the increased parallelism, that gives you 'more spindles' and thus more performance.

In GNR the drives are used in parallel anyway through the GNR striping. Therefore, you are using all drives of an ESS/GSS/DSS model under the hood in the vdisks anyway.

The only reason for having more NSDs is for using them for different filesystems.

Mit freundlichen Grüßen / Kind regards

Achim Rehor
IBM EMEA ESS/Spectrum Scale Support

gpfsug-discuss-boun...@spectrumscale.org wrote on 01/03/2021 08:58:43:

> From: Jonathan Buzzard
> To: gpfsug-discuss@spectrumscale.org
> Date: 01/03/2021 08:58
> Subject: [EXTERNAL] Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
>
> On 28/02/2021 09:31, Jan-Frode Myklebust wrote:
> >
> > I?ve tried benchmarking many vs. few vdisks per RG, and never could see
> > any performance difference.
>
> That's encouraging.
>
> > Usually we create 1 vdisk per enclosure per RG,   thinking this will
> > allow us to grow with same size vdisks when adding additional enclosures
> > in the future.
> >
> > Don?t think mmvdisk can be told to create multiple vdisks per RG
> > directly, so you have to manually create multiple vdisk sets each with
> > the apropriate size.
> >
>
> Thing is back in the day so GPFS v2.x/v3.x there where strict warnings
> that you needed a minimum of six NSD's for optimal performance. I have
> sat in presentations where IBM employees have said so. What we where
> told back then is that GPFS needs a minimum number of NSD's in order to
> be able to spread the I/O's out. So if an NSD is being pounded for reads
> and a write comes in it can direct it to a less busy NSD.
>
> Now I can imagine that in a ESS/DSS-G that as it's being scattered to
> the winds under the hood this is no longer relevant. But some notes to
> the effect for us old timers would be nice if that is the case to put
> our minds to rest.
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

Re: [gpfsug-discuss] cannot unmount fs

2021-01-27 Thread Olaf Weiser
Hi,
for those so-called "nested" mounts you need to make sure the "upper" FS is mounted first
 
this here may help
--mount-priority Priority
Controls the order in which the individual file systems are mounted at daemon startup or when one of the all keywords is specified on the mmmount command.
File systems with higher Priority numbers are mounted after file systems with lower numbers. File systems that do not have mount priorities are mounted last. A value of zero indicates no priority.
 
 
in addition, now to your question: you try to umount gpfs2 locally on gpfsgui, and there it seems it is not mounted...
this does not mean that the file system is not mounted anywhere else. So before you initiate the mmchfs -T you should make sure to have unmounted it everywhere.
 
use mmlsmount -L to check where it is still mounted ...
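A short sketch using the file system names from the message below (priority numbers are arbitrary; lower priorities mount first):

    mmchfs gpfs  --mount-priority 1     # upper file system first
    mmchfs gpfs2 --mount-priority 2     # nested file system afterwards
    mmlsmount gpfs2 -L                  # see every node where gpfs2 is still mounted
    mmumount gpfs2 -a                   # unmount it everywhere before changing the mount point with mmchfs -T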
 
 
 
- Original message -
From: Iban Cabrillo
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] cannot unmount fs
Date: Wed, Jan 27, 2021 2:20 PM
Dear,
  We have a couple of GPFS file systems: gpfs mounted on /gpfs and gpfs2 mounted on /gpfs/external. The problem is that the mount path of the second fs is sometimes missing.
  I am trying to mmumount this FS in order to change the mount path, but I can't. If I run mmumount gpfs2 or mmumount /gpfs/external I get this error:
 
[root@gpfsgui ~]# mmumount gpfs2
Wed Jan 27 14:11:07 CET 2021: mmumount: Unmounting file systems ...
umount: /gpfs/external: not mounted
 
(/gpfs/external path exists)
 
If I try to mmchfs -T XXX , the system says that the FS is already mounted.
 
But there is no error in the logs. Any Idea?
 
Regards, I
 
 
  

 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Disk in unrecovered state

2021-01-12 Thread Olaf Weiser
Hello Iban,
 
this seems to be a hardware issue
 
 
Input/output error
 
 
just try to / make sure that you really can read from the disk .. all NSDs from all of their NSD servers.
So to say: it's most important that the NSD is accessible on the primary NSD server, as long as this primary NSD server is up and running.
 
either use dd (be careful ;-)   .. only try to READ   )
e.g.   dd if=/dev/$yourblockdevice bs=4k count=2 | od -xc
 
 
or 
mmfsadm test ... (if you are familiar with that - if not.. feel free to open a problem record.. and we'll guide you through ..)
 
 
cheers
olaf
 
 
 
 
- Original message -
From: Iban Cabrillo
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Disk in unrecovered state
Date: Tue, Jan 12, 2021 4:59 PM
Hi Renar,

The version we have installed is 5.0.4-3, and the paths to these wrong disks seem to be fine:

[root@gpfs06 ~]# mmlsnsd -m | grep nsd18jbod1
 nsd18jbod1      0A0A00675EE76CF5   /dev/sds        gpfs05.ifca.es           server node
 nsd18jbod1      0A0A00675EE76CF5   /dev/sdby       gpfs06.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd19jbod1
 nsd19jbod1      0A0A00665EE76CF6   /dev/sdt        gpfs05.ifca.es           server node
 nsd19jbod1      0A0A00665EE76CF6   /dev/sdaa       gpfs06.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd19jbod2
 nsd19jbod2      0A0A00695EE79A12   /dev/sdt        gpfs07.ifca.es           server node
 nsd19jbod2      0A0A00695EE79A12   /dev/sdat       gpfs08.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd24jbod2
 nsd24jbod2      0A0A00685EE79749   /dev/sdbn       gpfs07.ifca.es           server node
 nsd24jbod2      0A0A00685EE79749   /dev/sdcg       gpfs08.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd57jbod1
 nsd57jbod1      0A0A00665F243CE1   /dev/sdbg       gpfs05.ifca.es           server node
 nsd57jbod1      0A0A00665F243CE1   /dev/sdbx       gpfs06.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd61jbod1
 nsd61jbod1      0A0A00665F243CFA   /dev/sdbk       gpfs05.ifca.es           server node
 nsd61jbod1      0A0A00665F243CFA   /dev/sdy        gpfs06.ifca.es           server node
[root@gpfs06 ~]# mmlsnsd -m | grep nsd71jbod1
 nsd71jbod1      0A0A00665F243D38   /dev/sdbu       gpfs05.ifca.es           server node
 nsd71jbod1      0A0A00665F243D38   /dev/sdbv       gpfs06.ifca.es           server node

Trying to start nsd19jbod1 again:

[root@gpfs06 ~]# mmchdisk gpfs2 start -d nsd19jbod1
mmnsddiscover:  Attempting to rediscover the disks.  This may take a while ...
mmnsddiscover:  Finished.
gpfs06.ifca.es:  Rediscovered nsd server access to nsd19jbod1.
gpfs05.ifca.es:  Rediscovered nsd server access to nsd19jbod1.
Failed to open gpfs2.
Log recovery failed.
Input/output error
Initial disk state was updated successfully, but another error may have changed the state again.
mmchdisk: Command failed. Examine previous error messages to determine cause.

Regards, I
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-05 Thread Olaf Weiser
let me add a few comments from some very successful large installations in Europe
 
# InterOP
Even though (as Luis pointed out) there is no support statement for running intermixed DSS/ESS in general, it was - and is, and will be - allowed for short-term purposes, such as e.g. migration.
The reason for not supporting those mixed DSS/ESS configurations in general is simply driven by the fact that different release versions of DSS/ESS potentially (not in every release, but sometimes) come with different driver levels (e.g. MOFED), OS, RDMA settings, GPFS tuning, etc...
Those changes can have an impact, or multiple impacts, and therefore we do not support that in general. Of course - and this would be the advice for everyone - if you are faced with the need to run a mixed configuration for e.g. a migration and/or because you need to temporarily provide space etc... contact your IBM representative and settle on a plan accordingly..
There will likely be some additional requirements/dependencies defined, like driver versions, OS, and/or Scale versions, but you'll get a chance to run a mixed configuration - temporarily, limited to your specific scenario.
 
# Monitoring
No doubt, monitoring is essential and absolutely needed - and/but - IBM wants customers to be very sensitive about what kind of additional software (= workload) gets installed on the ESS I/O servers. BTW, this rule applies as well to any other important GPFS node with special roles (e.g. any other NSD server etc).
But given the fact that customers usually manage and monitor their server farms from a central point of control (any 3rd-party software), it is common / best practice that additional monitoring software (clients/endpoints) has to run on GPFS nodes, and so on ESS nodes too.
 
If that way of acceptance applies for DSS too, you may want to double check with Lenovo ?!
 
 
#additionally GW functions
It would be a hot iron to generally allow routing on I/O nodes. Similar to the mixed-support approach, the field variety behind such a statement would be hard (== impossible) to manage. As we all agree, additional network traffic can (and in fact will) impact GPFS.
In your special case, the expected data rates seem to me more than OK and acceptable to go with your suggested config (as long as workloads remain on that level / you monitor it accordingly, as you are already obviously doing).
Again,to be on the safe side.. contact your IBM representative and I'm sure you 'll find a way..
 
 
 
kind regards
olaf
 
 
- Original message -
From: Jonathan Buzzard
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Services on DSS/ESS nodes
Date: Sun, Oct 4, 2020 12:17 PM
On 04/10/2020 10:29, Luis Bolinches wrote:
> Hi
>
> As stated on the same link you can do remote mounts from each other and
> be a supported setup.
>
> "You can use the remote mount feature of IBM Spectrum Scale to share
> file system data across clusters."
>

You can, but imagine I have a DSS-G cluster with 2PB of storage on it, which is quite modest in 2020. It is now end of life and for whatever reason I decide I want to move to ESS instead.

What any sane storage admin wants to do at this stage is set up the ESS, add the ESS nodes to the existing cluster on the DSS-G, then do a bit of mmadddisk/mmdeldisk and sit back while the data is seamlessly moved from the DSS-G to the ESS. Admittedly this might take a while :-)

Then once all the data is moved, a bit of mmdelnode and bingo, the storage has been migrated from DSS-G to ESS with zero downtime.

As that is not allowed for what I presume are commercial reasons (you could do it in reverse and presumably that is what IBM don't want), then once you are down the rabbit hole of one type of storage you are not going to switch to a different one.

You need to look at it from the perspective of the users. They frankly could not give a monkeys what storage solution you are using. All they care about is having usable storage, and large amounts of downtime to switch from one storage type to another is not really acceptable.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 



Re: [gpfsug-discuss] Checking if a AFM-managed file is still inflight

2020-09-21 Thread Olaf Weiser
Are you looking for something like this:
mmafmlocal ls filename    or    stat filename
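For completeness, a small hedged sketch of how that could be used in practice (the file, fileset and file system names below are made-up examples):

   # mmafmlocal operates on the locally cached blocks only, so comparing its
   # view with a plain stat hints whether the file is fully resident in cache.
   stat -c 'size=%s blocks=%b' /gpfs/cache/fileset1/bigfile
   mmafmlocal stat /gpfs/cache/fileset1/bigfile

   # Fileset-level view: queue length and state of the AFM relationship,
   # which shows whether replication to home is still in progress.
   mmafmctl fs1 getstate -j fileset1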
 
 
 
 
 
- Original message -From: "Dorigo Alvise (PSI)" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] Checking if a AFM-managed file is still inflightDate: Mon, Sep 21, 2020 10:45 AM 
Dear GPFS users,
I know that through a policy one can know if a file is still being transferred from the cache to your home by AFM.
 
I wonder if there is another method @cache or @home side, faster and less invasive (a policy, as far as I know, can put some pressure on the system when there are many files).
I quickly checked mmlsattr, which seems not to be AFM-aware (though there is a flags field that can show several things, like compression status, archive, etc.).
 
Any suggestion ?
 
Thanks in advance,
 
   Alvise
 



Re: [gpfsug-discuss] tsgskkm stuck

2020-08-30 Thread Olaf Weiser
Hello Philipp, it seems your nodes cannot communicate cleanly.
Can you check that gpfs.gskit is at the same level on all nodes? If not, please update it to the same level.
I've seen similar behavior when reverse lookup of host names or wrong entries in /etc/hosts break the setup.
 
If DNS and gskit are correct, please open a PMR.
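A hedged sketch of those checks (the node name and IP address are placeholders):

   # Same gpfs.gskit level everywhere? mmdsh fans the command out to all nodes.
   mmdsh -N all "rpm -q gpfs.gskit gpfs.base"

   # Forward and reverse name resolution must agree for the node being added.
   host hpc-storage-1
   host 10.0.0.10        # replace with the node's real IP; must map back to the same name

   # Entries in /etc/hosts should match DNS on every node.
   mmdsh -N all "grep hpc-storage-1 /etc/hosts"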
 
  

 
 
 
- Original message -
From: Philipp Helo Rehs
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] tsgskkm stuck
Date: Fri, Aug 28, 2020 11:52 AM

Hello,

we have a gpfs v4 cluster running with 4 nsds and i am trying to add some clients:

mmaddnode -N hpc-storage-1-ib:client:hpc-storage-1

This command hangs and does not finish. When I look into the server, I can see the following processes which never finish:

root 38138  0.0  0.0 123048 10376 ?    Ss   11:32   0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote checkNewClusterNode3lc/setupClient%%%%:00_VERSION_LINE::1709:3:1::lc:gpfs3.hilbert.hpc.uni-duesseldorf.de::0:/bin/ssh:/bin/scp:5362040003754711198:lc2:1597757602::HPCStorage.hilbert.hpc.uni-duesseldorf.de:2:1:1:2:A:::central:0.0:%%home%%:20_MEMBER_NODE::5:20:hpc-storage-1
root 38169  0.0  0.0 123564 10892 ?    S    11:32   0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote ccrctl setupClient 2214791=gpfs3-ib.hilbert.hpc.uni-duesseldorf.de:1191,2=gpfs4-ib.hilbert.hpc.uni-duesseldorf.de:1191,4=gpfs6-ib.hilbert.hpc.uni-duesseldorf.de:1191,3=gpfs5-ib.hilbert.hpc.uni-duesseldorf.de:11910 1191
root 38212  100  0.0  35544  5752 ?    R    11:32   9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off

The node is an AMD epyc.

Any idea what could cause the issue?

ssh is possible in both directions and firewall is disabled.

Kind regards
 Philipp Rehs
 
 



Re: [gpfsug-discuss] Spectrum Scale pagepool size with RDMA

2020-07-23 Thread Olaf Weiser
Dear colleagues,
I don't think we can cover a specific customer setup here; you may open a PMR to address your issues or ask your local IBM / Business Partner account team.
But to answer your questions:
 
 
(1) The pagepool can be up to 75% of real memory (by default) and it works well with RDMA, even for a pagepool > 64 GB.
(2) The reason for the high load is hard to analyze by email, and I doubt that reducing it will be achieved by lowering the pagepool.
(3) This should not need to be tuned any more; with recent Mellanox adapters/MOFED the setting is obsolete.
(4) Most likely that verbsPort does not exist or is not active on that node (check with mmlsconfig and ibdev2netdev).
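A hedged sketch of the checks behind answer (4) - configuration queries only, node names omitted:

   # What is configured?
   mmlsconfig pagepool
   mmlsconfig verbsPorts
   mmlsconfig verbsRdma

   # On the suspect node: does the configured verbs port exist and is it up?
   ibdev2netdev                  # ships with the Mellanox OFED stack
   mmfsadm test verbs status     # shows whether RDMA is currently active on this node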
 
 
Mit freundlichen Grüßen / Kind regards
 Olaf Weiser IBM Systems, SpectrumScale Client Adoption---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert Janzen, Markus Koerner, Christian Noll, Nicole ReimerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
 
 
 
- Original message -From: "Frederick Stock" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] Spectrum Scale pagepool size with RDMADate: Thu, Jul 23, 2020 1:15 PM 
And what version of ESS/Scale are you running on your systems (mmdiag --version)?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Yaron Daniel" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Spectrum Scale pagepool size with RDMADate: Thu, Jul 23, 2020 3:09 AM HiWhat is the output for:#mmlsconfig |grep -i verbs #ibstat Regards 
  Yaron Daniel 94 Em Ha'Moshavot RdStorage Architect – IL Lab Services (Storage) Petach Tiqva, 49527IBM Global Markets, Systems HW Sales Israel   Phone:+972-3-916-5672  Fax:+972-3-916-5672   Mobile:+972-52-8395593   e-mail:y...@il.ibm.com   Webex:    https://ibm.webex.com/meet/yardIBM Israel    
    From:        Prasad Surampudi To:        "gpfsug-discuss@spectrumscale.org" Date:        07/23/2020 03:34 AMSubject:        [EXTERNAL] [gpfsug-discuss] Spectrum Scale pagepool size with RDMASent by:        gpfsug-discuss-boun...@spectrumscale.org
Hi,We have an ESS clusters with two CES nodes. The pagepool is set to 128 GB ( Real Memory is 256 GB ) on both ESS NSD servers and CES nodes as well. Occasionally we see the mmfsd process memory usage reaches 90% on NSD servers and CES nodes and stays there until GPFS is recycled. I have couple of questions in this scenario:
 What are the general recommendations of pagepool size for nodes with RDMA enabled? On, IBM knowledge center for RDMA tuning says "If the GPFS pagepool is set to 32 GB, then the mapping of the RDMA for this pagepool must be at least 64 GB."  So, does this mean that the pagepool can't be more than half of real memory with RDMA enabled? Also, Is this the reason why mmfsd memory usage exceeds pagepool size and spikes to almost 90

Re: [gpfsug-discuss] gpfs filesets question

2020-04-20 Thread Olaf Weiser
Hello Stephan, @all,
I think yes, an RFE is the way to go.
The current behavior really is working as designed, even though I see your point: currently a move of a file between filesets is essentially writing a new file and deleting the old one.
So I expect this will always remain the case when moving between different inode spaces (regardless of the storage pool).
 
For moving files between different but dependent filesets (i.e. the same inode space) and within the same storage pool, there might be an alternative way to evaluate and check - so yes, we should have an RFE.
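To see which case applies, a quick hedged check (the file system name is an example):

   # mmlsfileset -L shows, per fileset, whether it is independent (its own
   # inode space) or dependent (it inherits the parent's inode space). A move
   # across inode spaces behaves like copy + delete, as described above.
   mmlsfileset home -L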
 
 
 
 
 
- Original message -From: Stephan Graf Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] gpfs filesets questionDate: Mon, Apr 20, 2020 10:42 AM 
Hi,we recognized this behavior when we tried to move HSM migrated filesbetween filesets. This cases a recall. Very annoying when the data areafterword stored on the same pools and have to be migrated back to tape.@IBM: should we open a RFE to address this?StephanAm 18.04.2020 um 17:04 schrieb Stephen Ulmer:> Is this still true if the source and target fileset are both in the same> storage pool? It seems like they could just move the metadata…> Especially in the case of dependent filesets where the metadata is> actually in the same allocation area for both the source and target.>> Maybe this just doesn’t happen often enough to optimize?>> --> Stephen> On Apr 16, 2020, at 12:50 PM, Oesterlin, Robert>> mailto:robert.oester...@nuance.com>> wrote: Moving data between filesets is like moving files between file>> systems. Normally when you move files between directories, it’s simple>> metadata, but with filesets (dependent or independent) is a full copy>> and delete of the old data.>> Bob Oesterlin>> Sr Principal Storage Engineer, Nuance>> *From:*>> > on behalf of "J.>> Eric Wonderley" mailto:eric.wonder...@vt.edu *Reply-To:*gpfsug main discussion list>> >> > *To:*gpfsug main discussion list >> > I have filesets setup in a filesystem...looks like:>> [root@cl005 ~]# mmlsfileset home -L>> Filesets in file system 'home':>> Name                            Id      RootInode  ParentId Created  >>                    InodeSpace      MaxInodes    AllocInodes Comment>> root                             0              3        -- Tue Jun 30>> 07:54:09 2015        0            402653184      320946176 root fileset>> hess                             1      543733376         0 Tue Jun 13>> 14:56:13 2017        0                    0              0>> predictHPC                       2        1171116         0 Thu Jan  5>> 15:16:56 2017        0                    0              0>> HYCCSIM                          3      544258049         0 Wed Jun 14>> 10:00:41 2017        0                    0              0>> socialdet                        4      544258050         0 Wed Jun 14>> 10:01:02 2017        0                    0              0>> arc                              5        1171073         0 Thu Jan  5>> 15:07:09 2017        0                    0              0>> arcadm                           6        1171074         0 Thu Jan  5>> 15:07:10 2017        0                    0              0>> I beleive these are dependent filesets.  Dependent on the root>> fileset.   Anyhow a user wants to move a large amount of data from one>> fileset to another.   Would this be a metadata only operation?  He has>> attempted to small amount of data and has noticed some thrasing.>> ___>> gpfsug-discuss mailing list>> gpfsug-discuss atspectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>> ___> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org> http://gpfsug.org/mailman/listinfo/gpfsug-discuss>  
 
 



Re: [gpfsug-discuss] AFM Alternative?

2020-02-26 Thread Olaf Weiser
You may consider Watch Folder (a cluster-wide, inotify-like event stream fed into Kafka) and then go from there.
 
 
- Original message -From: Andi Christiansen Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 1:59 PM 
Hi all,
 
Does anyone know of an alternative to AFM ?
 
We have been working on tuning AFM for a few weeks now and see little to no improvement, and now we are searching for an alternative. So if anyone knows of a product that can integrate with Spectrum Scale, I am open to any suggestions :)
 
We have a good mix of files but primarily billions of very small files which AFM does not handle well on long distances.
 
 
Best Regards
A. Christiansen
 



Re: [gpfsug-discuss] Max number of vdisks in a recovery group - is it 64?

2019-12-13 Thread Olaf Weiser
Hello Heiner, Stefan, thanks for this heads-up. We know - it's all GNR, so the answer differs a bit depending on the scenario: on the regular building blocks the RG layout (2 RGs per building block) is different from scale-out ECE (4+ nodes, one RG). The absolute maximum number of vdisks is (I think) 512 per RG, but I would not recommend going that far when designing a data layout if you can avoid it. We can have a short talk next week on that. Trying to keep the number of vdisks as small as you can is the better approach.

From: "Dietrich, Stefan"
To: gpfsug main discussion list
Date: 12/13/2019 02:26 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Max number of vdisks in a recovery group - is it 64?
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello Heiner,
the 64 vdisk limit per RG is still present in the latest ESS docs:
https://www.ibm.com/support/knowledgecenter/SSYSP8_5.3.5/com.ibm.spectrum.scale.raid.v5r04.adm.doc/bl1adv_vdisks.htm
For the other questions, no idea.
Regards,
Stefan

- Original Message -
> From: "Billich Heinrich Rainer (ID SD)"
> To: "gpfsug main discussion list"
> Sent: Thursday, December 12, 2019 3:26:31 PM
> Subject: [gpfsug-discuss] Max number of vdisks in a recovery group - is it 64?
> Hello,
>
> I remember that a GNR/ESS recovery group can hold up to 64 vdisks, but I can't
> find a citation to prove it. Now I wonder if 64 is the actual limit? And where
> is it documented? And did the limit change with versions? Thank you. I did
> spend quite some time searching the documentation, no luck .. maybe you know.
>
> We run ESS 5.3.4.1 and the recovery groups have current/allowable format
> version 5.0.0.0
>
> Thank you,
>
> Heiner
> --
> ===
> Heinrich Billich
> ETH Zürich
> Informatikdienste
> Tel.: +41 44 632 72 56
> heinrich.bill...@id.ethz.ch
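As a hedged aside, on an mmvdisk-managed system the current number of vdisks per recovery group can be checked roughly like this (the recovery group name is an example):

   mmvdisk recoverygroup list
   mmvdisk vdisk list --recovery-group rg_gssio1-hs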



Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-03 Thread Olaf Weiser
Hallo "merging" 2 different GPFS
cluster into one .. is not possible .. for sure you can do "nested"
mounts .. .but that's most likely not, what you want to do .. if you want to add a GL2 (or any other
ESS) ..to an existing (other) cluster... -  you can't preserve ESS's
RG definitions... you need to create the RGs after adding
the IO-nodes to the existing cluster... so if you got a new ESS.. (no data on
it) .. simply unconfigure cluster ..  .. add the nodes to your existing
cluster.. and then start configuring the RGsFrom:      
 "Dorigo Alvise
(PSI)" To:      
 "gpfsug-discuss@spectrumscale.org"
Date:      
 12/03/2019 09:35 AMSubject:    
   [EXTERNAL] [gpfsug-discuss]
How to join GNR nodes to a non-GNR clusterSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgHello everyone,I have: - A NetApp system with hardware RAID - SpectrumScale 4.2.3-13 running
on top of the NetApp - A GL2 system with ESS 5.3.2.1 (Spectrum
Scale 5.0.2-1)What I need to do is to merge the GL2 in
the other GPFS cluster (running on the NetApp) without loosing, of course,
the RecoveryGroup configuration, etc.I'd like to ask the experts1.        whether
it is feasible, considering the difference in the GPFS versions, architectures
differences (x86_64 vs. power)2.        if
yes, whether anyone already did something like this and what is the best
strategy suggested3.        finally:
is there any documentation dedicated to that, or at least inspiring the
correct procedure ?Thank you very much,   Alvise Dorigo___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
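For illustration only, the rough flow Olaf describes might look like the outline below. Node names are placeholders, and the recovery-group re-creation step is deliberately left as a comment, since the exact mmvdisk/ESS commands depend on the ESS level.

   # 1. On the new, empty ESS building block: shut GPFS down and dissolve its
   #    factory-installed cluster (only safe because it holds no data yet).
   mmshutdown -a
   mmdelnode -a

   # 2. From the existing (NetApp-based) cluster: add the ESS I/O nodes and
   #    give them server licenses.
   mmaddnode -N essio1,essio2
   mmchlicense server --accept -N essio1,essio2

   # 3. Re-create the recovery groups on the I/O nodes with the mmvdisk/ESS
   #    tooling, then define vdisk sets / NSDs and add them to the file system.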



Re: [gpfsug-discuss] Compression question

2019-11-28 Thread Olaf Weiser

Hi Alex, I'm not 100% sure about my answer, but as far as I can see it works because of the so-called "ditto resolution": in the snapshot's inode, the pointers to the disk addresses (DAs) point to the next (more recent) inode information. So accessing a file in a snapshot "redirects" the request to the original inode, and there the compression information is given and points to the original DAs (of course, only as long as nobody has changed the file since the snapshot was taken).

From: "Alexander Wolf"
To: gpfsug-discuss@spectrumscale.org
Date: 11/28/2019 07:03 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Compression question
Sent by: gpfsug-discuss-boun...@spectrumscale.org

I see the same behavior of mmlsattr on my
system (with some post 5.0.4 development build). Funny enough if I look
at the file content in the snapshot it gets properly decompressed. Mit freundlichen Grüßen / Kind regards  
   Dr.
Alexander Wolf-ReberSpectrum Scale Release Lead ArchitectDepartment M069 / Spectrum Scale Software Development+49-160-90540880a.wolf-re...@de.ibm.comIBM
Data Privacy StatementIBM Deutschland Research &
Development GmbH / Vorsitzende des Aufsichtsrats: Matthias Hartmann / Geschäftsführung:
Dirk WittkoppSitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart,
HRB 243294  - Original message -From: "Luis Bolinches" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug main discussion list" Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Compression questionDate: Thu, Nov 28, 2019 14:00 Which version are you running?  I was involved on a big for compressed file
sets and snapshots that were related to what you see.   -- Cheers  On 28. Nov 2019, at 14.57, Cregan, Bob 
wrote:  Hi       Sounds logical - except
the snap metadata does not have the compression flag set. So if the inode
now points to a set of compressed blocks how does the client know to decompress
it? After compression of an existing file we
get in the snap -bash-4.2$ mmlsattr -L .snapshots/@GMT-2019.11.27-19.30.14/UserGuide_13.06.pdffile name:        
   .snapshots/@GMT-2019.11.27-19.30.14/UserGuide_13.06.pdfmetadata replication: 2 max 2data replication:     1 max 3immutable:        
   noappendOnly:        
  noflags:          
     storage pool name:    sata1fileset name:        
userdirssnapshot name:        @GMT-2019.11.27-19.30.14creation time:        Tue
Mar  5 16:16:40 2019Misc attributes:      ARCHIVEEncrypted:        
   no  and the original file is definitely
compressed. -bash-4.2$ mmlsattr -L UserGuide_13.06.pdf
file name:        
   UserGuide_13.06.pdfmetadata replication: 2 max 2data replication:     1 max 3immutable:        
   noappendOnly:        
  noflags:          
     storage pool name:    sata1fileset name:        
userdirssnapshot name:        creation time:        Tue
Mar  5 16:16:40 2019Misc attributes:      ARCHIVE
COMPRESSION (library z)Encrypted:        
   noBobBob CreganHPC Systems AnalystInformation & Communication TechnologiesImperial College London, South Kensington Campus London, SW7 2AZT: 07712388129E: b.cre...@imperial.ac.ukW: www.imperial.ac.uk/ict/rcs  @imperialRCS
@imperialRSE            From: gpfsug-discuss-boun...@spectrumscale.org
 on behalf of Daniel Kidger
Sent: 28 November 2019 12:30To: gpfsug-discuss@spectrumscale.org Cc: gpfsug-discuss@spectrumscale.org Subject: Re: [gpfsug-discuss] Compression question Caution
- This email from daniel.kid...@uk.ibm.com
originated outside Imperial  Alexander, Can you then confirm then that the inodes
in the snapshot will now point to fewer but compressed blocks ?Daniel _Daniel KidgerIBM Technical Sales SpecialistSpectrum Scale, Spectrum NAS and IBM Cloud
Object Store+44-(0)7818 522 266 daniel.kid...@uk.ibm.com    - Original message -From: "Alexander Wolf" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Compression questionDate: Thu, Nov 28, 2019 12:21  I just tested this. Compressing a file did
free up space in the file system. Looks like our compression code does
not trigger COW on the snapshot. You can test this yourself by looking
into mmlssnapshot -d (please not on a large production fs, this command
is expensive). Mit freundlichen Grüßen / Kind regards
   Dr.
Alexander Wolf-ReberSpectrum Scale Release Lead ArchitectDepartment M069 / Spectrum Scale Software Development+49-160-90540880a.wolf-re...@de.ibm.comIBM
Data Privacy StatementIBM Deutschland Research &
Development GmbH / Vorsitzende des Aufsichtsrats: Matthias Hartmann / Geschäftsführung:
Dirk WittkoppSitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart,
HRB 243294  - Original message -From: "Luis Bolinches" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] Compression 

Re: [gpfsug-discuss] introduction

2019-11-20 Thread Olaf Weiser
Sorry - this time with the link. Hello Bill, welcome! It is hard to predict what your read "slowness" is about; some baseline tuning seems to be the trick for you:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)

From: "Peters, Bill"
To: "gpfsug-discuss@spectrumscale.org"
Date: 11/20/2019 07:18 PM
Subject: [EXTERNAL] [gpfsug-discuss] introduction
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello,
The welcome email said I should introduce
myself to the group. I’m Bill Peters, a Linux engineer at ATPCO
(Airline Tariff Publishing Company)located in Dulles, Virginia, USA. We process
airline fare data. I've was recently introduced to Spectrum
Scale because we are doing a proof of concept to see if we can move some of our
workload onto virtual machines running in IBM's zVM. Our x86 systems use
Veritas for the network filesystem and since Veritas doesn't support the s390
arcitecture, we are using SpectrumScale. So far it's been great, much easier
to understand than Veritas. We're not doing anything too complicated. The
disks are DASD on SSD. We have 3 clusters with sharing between them. At
this point I expect the POC to be successful so I will probably be working
with Spectrum Scale into the future. The only question I have so far is read
speeds seem to be slower than Veritas. It's not slow enough to be a real problem,
but if anyone has a suggestion for speeding up reads, I'd love to hear it. Other than that, I'll probably just be
lurking on the email list. Thanks,-Bill___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
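Following up on the baseline-tuning pointer above, a hedged example of the kind of parameters that are typically reviewed first for read performance (the values and the node-class name are illustrative only, not recommendations):

   mmlsconfig pagepool
   mmlsconfig maxMBpS
   mmlsconfig workerThreads

   # Example change on the client nodes; -i makes it take effect immediately
   # where the parameter allows that.
   mmchconfig pagepool=8G,maxMBpS=10000 -N clientNodeClass -i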




Re: [gpfsug-discuss] ESS - Considerations when adding NSD space?

2019-10-25 Thread Olaf Weiser
Hi - sorry for the delayed response. As Alex started, let me add a little thought on that.
You said you came from a GL4 to a GL6. An MES update is only supported when converting everything to mmvdisk, so I suspect you did that already.
Next, by going through this MES upgrade all GNR tracks are rebalanced among all (old + new) drives (this comment is for those who are not so familiar with MES updates).
From a performance perspective it then makes no (reasonably measurable) difference if you have vdisks with different sizes. The GNR layer takes care of your I/Os to your NSDs, and through the underlying vdisk layer the GNR code directs them to the final physical disks. Since the vdisks are served from the same I/O nodes anyway - so the same communication channels from client to NSD server - we can say you can relax: a given NSD size of 330T or 380T won't make any difference.
But to add some flavor here ;-): in case you have different failure groups in your configuration, then different NSD sizes do make a difference, because the algorithm that allocates space currently does so in a round-robin fashion over all NSDs. At least that's the default, and I'm not aware that you can change or customize it yet. In a round-robin allocation each disk is given equal allocation priority - you can double-check this with mmlsfs -s.
So by default there is no heuristic to take different NSD sizes into account when allocating new blocks. As said, this only affects you if you have multiple failure groups, and from your email I conclude that's not the case.
After all, I'm sure every administrator these days has multiple things (= too much) to do, so I wouldn't go the extra mile to recreate NSDs (= vdisks) just to have them all the same size, unless you have a strong reason to do so ;-) Just my view on it.
Cheers
Olaf

From: "Alexander Wolf"
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Date: 10/25/2019 07:14 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] ESS - Considerations when adding NSD space?
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Bob, from what you describe I would assume that
you have now "old" vdisks that span four enclosures and "new"
vdisks that span the two new enclosures. So you already are unbalanced
at the vdisk level. From a performance point I would guess the optimum
would be to have the new NSDs being half the size of the old ones. But
honestly I do not know how much of a difference it really makes. Fred is right, if you can you shoud always
go for a homogenous setup. On the other hand if you can't, you can't.  Mit freundlichen Grüßen / Kind regards  
   Dr.
Alexander Wolf-ReberSpectrum Scale Release Lead ArchitectDepartment M069 / Spectrum Scale Software Development+49-160-90540880a.wolf-re...@de.ibm.comIBM Deutschland Research &
Development GmbH / Vorsitzende des Aufsichtsrats: Matthias Hartmann / Geschäftsführung:
Dirk WittkoppSitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart,
HRB 243294  - Original message -From: "Frederick Stock" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] ESS - Considerations when adding
NSD space?Date: Thu, Oct 24, 2019 17:55  Bob as I understand having different size
NSDs is still not considered ideal even for ESS.  I had another customer
recently add storage to an ESS system and they were advised to first check
the size of their current vdisks and size the new vdisks to be the same.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com  - Original message -From: "Oesterlin, Robert" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] ESS - Considerations when adding NSD
space?Date: Thu, Oct 24, 2019 11:34 AM We recently upgraded our GL4 to a GL6 (trouble
free process for those considering FYI). I now have 615T free (raw) in
each of my recovery groups.  I’d like to increase the size of one
of the file systems (currently at 660T, I’d like to add 100T). My first thought was going to be: mmvdisk vdiskset define --vdisk-set fsdata1
--recovery-group rg_gssio1-hs,rg_gssio2-hs --set-size 50T --code 8+2p --block-size
4m --nsd-usage dataOnly --storage-pool datammvdisk vdiskset create --vdisk-set fs1data1
mmvdisk filesystem add --filesystem fs1 --vdisk-set
fs1data1  I know in the past use of mixed size NSDs
was frowned upon, not sure on the ESS.  The other approach would be add two larger
NSDs (current ones are 330T) of 380T, migrate the data to the new ones
using mmrestripe, then delete the old ones. The other benefit of this process
would be to have the file system data better balanced across all the 
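Related to the round-robin allocation point above, a hedged way to see how data is currently spread over the NSDs (the file system name is an example):

   mmlsdisk fs1 -L      # NSDs with their failure groups and status
   mmdf fs1             # free/used capacity per NSD, which makes any imbalance visible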

Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?

2019-09-16 Thread Olaf Weiser

Hello Heiner, usually Spectrum Scale comes with a tuned profile (named "scale"):

[root@nsd01 ~]# tuned-adm active
Current active profile: scale

In there:

[root@nsd01 ~]# cat /etc/tuned/scale/tuned.conf | tail -3
# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
[root@nsd01 ~]#

Depending on what you need to achieve, one might be forced to change that - e.g. for RoCE you need IPv6 to be active. But for all other scenarios with Spectrum Scale (at least those I'm aware of right now) IPv6 can be disabled.

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 09/13/2019 05:02 PM
Subject: [EXTERNAL] [gpfsug-discuss] Ganesha all IPv6 sockets - is this to be expected?
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello,
I just noted that our ganesha daemons offer IPv6 sockets only, IPv4 traffic
gets encapsulated. But all traffic to samba is IPv4; smbd offers both IPv4 and IPv6 sockets.

I just wonder whether this is to be expected? Protocols support IPv4 only, so why run on IPv6 sockets only for ganesha? Did we configure something wrong and should we completely disable IPv6 on the kernel level?

Any comment is welcome.

Cheers,
Heiner
--
===
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.bill...@id.ethz.ch

I did check with
  ss -l -t -4
  ss -l -t -6
Add -p to get the process name, too. Do you get the same results on your CES nodes?

[Compacted ss output from the original post: on the CES node, "ss -l -t -4" shows IPv4 listeners on gpfs, netbios-ssn, 5355, sunrpc, ssh, smtp (127.0.0.1), 4379 (10.250.135.24), 32765 and microsoft-ds; "ss -l -t -6" shows IPv6 listeners on 32767, 32768, 32769, 2049 (NFS/ganesha), 5355 and further entries, truncated in the archive.]

Re: [gpfsug-discuss] Getting which files are store fully in inodes

2019-03-28 Thread Olaf Weiser

Hi, you can use filehist:

-rwxr--r-- 1 root root 1840 Jan 30 02:24 /usr/lpp/mmfs/samples/debugtools/filehist

It gives you a nice report of how many files there are in total, how much space they use, etc.

From: "Dorigo Alvise (PSI)"
To: "gpfsug-discuss@spectrumscale.org"
Date: 03/28/2019 01:52 PM
Subject: [gpfsug-discuss] Getting which files are stored fully in inodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello,
to get the list (and size) of files that fit into inodes, what I do, using a policy, is listing "online" (not evicted) files that have zero allocated KB. Is this correct, or could there be some exception I'm missing?
Does a smarter/faster way exist?

Thanks,
   Alvise
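For reference, a minimal hedged sketch of the policy approach Alvise describes (the list name, paths and file system name are made up; try it on a test system first):

   cat > /tmp/inline.pol <<'EOF'
   RULE EXTERNAL LIST 'inline' EXEC ''
   RULE 'find_inline' LIST 'inline'
        SHOW(VARCHAR(FILE_SIZE))
        WHERE KB_ALLOCATED = 0 AND FILE_SIZE > 0
   EOF
   mmapplypolicy fs1 -P /tmp/inline.pol -I defer -f /tmp/inline_files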



Re: [gpfsug-discuss] Adding to an existing GPFS ACL

2019-03-27 Thread Olaf Weiser
Unfortunately, commands like nfs4_setfacl are not implemented in GPFS yet.
I once helped myself out with a local NFS mount to set ACLs in an automated way - then you can use an NFSv4 client to do the ACL work.

From: "Buterbaugh, Kevin L"
To: gpfsug main discussion list
Date: 03/27/2019 05:19 PM
Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi Jonathan,
Thanks for the response. I did look at mmeditacl,
but unless I’m missing something it’s interactive (kind of like mmedquota
is by default).  If I had only a handful of files / directories to
modify that would be fine, but in this case there are thousands of ACL’s
that need modifying.Am I missing something?  Thanks…Kevin—Kevin Buterbaugh - Senior System AdministratorVanderbilt University - Advanced Computing Center for
Research and Educationkevin.buterba...@vanderbilt.edu- (615)875-9633On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan 
wrote:Try mmeditacl.-- Jonathan FosburghPrincipal Application Systems
AnalystIT Operations Storage TeamThe University of Texas MD
Anderson Cancer Center(713) 745-9346From: gpfsug-discuss-boun...@spectrumscale.org
on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AMTo: gpfsug main discussion listSubject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING:This email originated from outside of MD Anderson. Please validate the
sender's email address before clicking on links or attachments as they
may not be safe. Hi All, First off, I have very limited experience
with GPFS ACL’s, so please forgive me if I’m missing something obvious
here.  AFAIK, this is the first time we’ve hit something like this…We have a fileset where all the files
/ directories have GPFS NFSv4 ACL’s set on them.  However, unlike
most of our filesets where the same ACL is applied to every file / directory
in the share, this one has different ACL’s on different files / directories.
 Now we have the need to add to the existing ACL’s … another group
needs access.  Unlike regular Unix / Linux ACL’s where setfacl can
be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I’m
not seeing where GPFS has a similar command … i.e. mmputacl seems to expect
the _entire_ new ACL to be supplied via either manual entry or an input
file.  That’s obviously problematic in this scenario.So am I missing something?  Is there
an easier solution than writing a script which recurses over the fileset,
gets the existing ACL with mmgetacl and outputs that to a file, edits that
file to add in the new group, and passes that as input to mmputacl?  That
seems very cumbersome and error prone, especially if I’m the one writing
the script!Thanks…Kevin—Kevin Buterbaugh - Senior System AdministratorVanderbilt University - Advanced Computing
Center for Research and Educationkevin.buterba...@vanderbilt.edu- (615)875-9633The information contained in this e-mail
message may be privileged, confidential, and/or protected from disclosure.
This e-mail message may contain protected health information (PHI); dissemination
of PHI should comply with applicable federal and state laws. If you are
not the intended recipient, or an authorized representative of the intended
recipient, any further review, disclosure, use, dissemination, distribution,
or copying of this message or any attachment (or the information contained
therein) is strictly prohibited. If you think that you have received this
e-mail message in error, please notify the sender by return e-mail and
delete all references to it and its contents from your systems.___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttps://nam04.safelinks.protection.outlook.com/?url="">___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
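A hedged sketch of the NFS-mount workaround Olaf mentions at the top of this thread (export path, mount point and group name are examples; nfs4_setfacl comes from the nfs4-acl-tools package):

   # Export the fileset via NFSv4 (CES/NFS or a temporary export), then:
   mount -t nfs4 localhost:/gpfs/fs1/fileset1 /mnt/aclwork

   # Add (-a) an allow ACE for the extra group, recursively (-R), without
   # replacing the existing ACL entries.
   nfs4_setfacl -R -a A:g:newgroup@example.com:rxtncy /mnt/aclwork

   umount /mnt/aclwork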



Re: [gpfsug-discuss] Clarification of mmdiag --iohist output

2019-02-21 Thread Olaf Weiser
From the nsdMaxWorkerThreads of 1024 I would specify nsdMinWorkerThreads the same way, and tell every node in the cluster ignorePrefetchLUNCount=yes.
Adjust the min/max workers to your infrastructure according to your needs: how many IOPS and/or how much bandwidth, with your given block size, do you think your backend can handle? Depending on that, adjust nsdMin/MaxWorkerThreads. So if your backend can manage 10,000 IOPS, roughly divide by the number of NSD servers and adjust the NSD worker threads accordingly.
In addition, check the client settings: sometimes it is helpful to lower workerThreads on the clients to prevent them from overrunning your NSD servers.

From: "Buterbaugh, Kevin L"
To: gpfsug main discussion list
Date: 02/21/2019 01:39 PM
Subject: Re: [gpfsug-discuss] Clarification of mmdiag --iohist output
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,
My thanks to Aaron, Sven, Steve, and whoever responded
for the GPFS team.  You confirmed what I suspected … my example 10
second I/O was _from an NSD server_ … and since we’re in a 8 Gb FC SAN
environment, it therefore means - correct me if I’m wrong about this someone
- that I’ve got a problem somewhere in one (or more) of the following
3 components:1) the NSD servers2) the SAN fabric3) the storage arraysI’ve been looking at all of the above and none of them
are showing any obvious problems.  I’ve actually got a techie from
the storage array vendor stopping by on Thursday, so I’ll see if he can
spot anything there.  Our FC switches are QLogic’s, so I’m kinda
screwed there in terms of getting any help.  But I don’t see any
errors in the switch logs and “show perf” on the switches is showing
I/O rates of 50-100 MB/sec on the in use ports, so I don’t _think_ that’s
the issue.And this is the GPFS mailing list, after all … so let’s
talk about the NSD servers.  Neither memory (64 GB) nor CPU (2 x quad-core
Intel Xeon E5620’s) appear to be an issue.  But I have been looking
at the output of “mmfsadm saferdump nsd” based on what Aaron and then
Steve said.  Here’s some fairly typical output from one of the SMALL
queues (I’ve checked several of my 8 NSD servers and they’re all showing
similar output):    Queue NSD type NsdQueueTraditional [244]:
SMALL, threads started 12, active 3, highest 12, deferred 0, chgSize 0,
draining 0, is_chg 0     requests pending 0, highest pending
73, total processed 4859732     mutex 0x7F3E449B8F10, reqCond 0x7F3E449B8F58,
thCond 0x7F3E449B8F98, queue 0x7F3E449B8EF0, nFreeNsdRequests 29And for a LARGE queue:    Queue NSD type NsdQueueTraditional [8]:
LARGE, threads started 12, active 1, highest 12, deferred 0, chgSize 0,
draining 0, is_chg 0     requests pending 0, highest pending
71, total processed 2332966     mutex 0x7F3E441F3890, reqCond 0x7F3E441F38D8,
thCond 0x7F3E441F3918, queue 0x7F3E441F3870, nFreeNsdRequests 31So my large queues seem to be slightly less utilized than
my small queues overall … i.e. I see more inactive large queues and they
generally have a smaller “highest pending” value.Question:  are those non-zero “highest pending”
values something to be concerned about?I have the following thread-related parameters set:[common]maxReceiverThreads 12nsdMaxWorkerThreads 640nsdThreadsPerQueue 4nsdSmallThreadRatio 3workerThreads 128[serverLicense]nsdMaxWorkerThreads 1024nsdThreadsPerQueue 12nsdSmallThreadRatio 1pitWorkerThreadsPerNode 3workerThreads 1024Also, at the top of the “mmfsadm saferdump nsd” output
I see:Total server worker threads: running 1008, desired 147,
forNSD 147, forGNR 0, nsdBigBufferSize 16777216nsdMultiQueue: 256, nsdMultiQueueType: 1, nsdMinWorkerThreads:
16, nsdMaxWorkerThreads: 1024Question:  is the fact that 1008 is pretty close
to 1024 a concern?Anything jump out at anybody?  I don’t mind sharing
full output, but it is rather lengthy.  Is this worthy of a PMR?Thanks!--Kevin Buterbaugh - Senior System AdministratorVanderbilt University - Advanced Computing Center for
Research and Educationkevin.buterba...@vanderbilt.edu- (615)875-9633On Feb 17, 2019, at 1:01 PM, IBM Spectrum Scale 
wrote:Hi Kevin,The I/O hist shown by the command mmdiag --iohist actually depends on the
node on which you are running this command from.If you are running this on a NSD server node then it will show the time
taken to complete/serve the read or write I/O operation sent from the client
node. And if you are running this on a client (or non NSD server) node then it
will show the complete time taken by the read or write I/O operation requested
by the client node to complete.So in a nut shell for the NSD server case it is just the latency of the
I/O done on disk by the server whereas for the NSD client case it also
the latency of send and receive of I/O request to the NSD server along
with the latency of I/O done on disk by the NSD server.I hope this answers your query.Regards, The Spectrum Scale (GPFS) 

Re: [gpfsug-discuss] Querying size of snapshots

2019-01-29 Thread Olaf Weiser
Hi Jan, yes - but we should highlight that this means an extra/additional copy on writes/changes to a block, so it adds a bit of latency when running in this mode.

From: Jan-Frode Myklebust
To: gpfsug main discussion list
Date: 01/29/2019 08:19 PM
Subject: Re: [gpfsug-discuss] Querying size of snapshots
Sent by: gpfsug-discuss-boun...@spectrumscale.org

You could put snapshot data in a separate storage pool.
Then it should be visible how much space it occupies, but it’s a bit hard
to see how this will be usable/manageable..-jftir. 29. jan. 2019 kl. 20:08 skrev Christopher Black :Thanks for the quick and detailed reply! I had read the
manual and was aware of the warnings about -d (mentioned in my PS).On systems with high churn (lots of temporary files, lots
of big and small deletes along with many new files), I’ve previously used
estimates of snapshot size as a useful signal on whether we can expect
to see an increase in available space over the next few days as snapshots
expire. I’ve used this technique on a few different more mainstream storage
systems, but never on gpfs.I’d find it useful to have a similar way to monitor “space
to be freed pending snapshot deletes” on gpfs. It sounds like there is
not an existing solution for this so it would be a request for enhancement.I’m not sure how much overhead there would be keeping
a running counter for blocks changed since snapshot creation or if that
would completely fall apart on large systems or systems with many snapshots.
If that is a consideration even having only an estimate for the oldest
snapshot would be useful, but I realize that can depend on all the other
later snapshots as well. Perhaps an overall “size of all snapshots” would
be easier to manage and would still be useful to us.I don’t need this number to be 100% accurate, but a low
or floor estimate would be very useful. Is anyone else interested in this? Do other people have
other ways to estimate how much space they will get back as snapshots expire?
Is there a more efficient way of making such an estimate available to admins
other than running an mmlssnapshot -d every night and recording the output? Thanks all!Chris From: 
on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Tuesday, January 29, 2019 at 1:24 PMTo: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Querying size of snapshots 1. First off, let's RTFM ...-d Displays the amount of storage that is used by the snapshot.This operation requires an amount of time that is proportional to the size
of the file system; therefore,it can take several minutes or even hours on a large and heavily-loaded
file system.This optional parameter can impact overall system performance. Avoid running
the mmlssnapshotcommand with this parameter frequently or during periods of high file system
activity.S.. there's that.  2. Next you may ask, HOW is that?Snapshots are maintained with a "COW" strategy -- They are created
quickly, essentially just making a record that the snapshot was created
and at such and such time -- when the snapshot is the same as the "live"
filesystem...Then over time, each change to a block of data in live system requires
that a copy is made of the old data block and that is associated with the
most recently created snapshot   SO, as more and more changes
are made to different blocks over time the snapshot becomes bigger and
bigger.   How big? Well it seems the current implementation does not
keep a "simple counter" of the number of blocks -- but rather,
a list of the blocks that were COW'ed So when you come and ask "How
big"... GPFS has to go traverse the file sytem metadata and count
those COW'ed blocks3. So why not keep a counter?  Well, it's likely not so simple. For
starters GPFS is typically running concurrently on several or many nodes... 
And probably was not deemed worth the effort . IF a convincing case
could be made, I'd bet there is a way... to at least keep approximate numbers,
log records, exact updates periodically, etc, etc -- similar to the way
space allocation and accounting is done for the live file system...This message is for the recipient’s use
only, and may contain confidential, privileged or protected information.
Any unauthorized use or dissemination of this communication is prohibited.
If you received this message in error, please immediately notify the sender
and destroy all copies of this message. The recipient should check this
email and any attachments for the presence of viruses, as we accept no
liability for any damage caused by any virus transmitted by this email.___gpfsug-discuss mailing listgpfsug-discuss at 

Re: [gpfsug-discuss] Filesystem automount issues

2019-01-16 Thread Olaf Weiser
And check "mmlsnode -N waiters -L" for a very long waiter if the file system is still not mounted.

From: "Frederick Stock"
To: gpfsug main discussion list
Date: 01/16/2019 07:38 PM
Subject: Re: [gpfsug-discuss] Filesystem automount issues
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Would it be possible for you to include the
output of "mmlsmount all -L" and "df -k" in your response?Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.comFrom:        KG
To:        gpfsug
main discussion list Date:        01/16/2019
01:15 PMSubject:        Re:
[gpfsug-discuss] Filesystem automount issuesSent by:        gpfsug-discuss-boun...@spectrumscale.orgIt shows that the filesystem is not mountedOn Wed, Jan 16, 2019, 22:03 Frederick Stock To:        gpfsug
main discussion list Date:        01/16/2019
11:19 AMSubject:        [gpfsug-discuss]
Filesystem automount issuesSent by:        gpfsug-discuss-boun...@spectrumscale.orgHi IHAC running Scale 5.x on RHEL 7.5One out of two filesystems (/home) does not get mounted automatically at
boot. (/home is scale filesystem)The scale log does mention that the filesystem is mounted but mount output
says otherwise.There are no entries for /home in fstab since we let scale mount it. Automount
on scale and filesystem both have been set to yes.Any pointers to troubleshoot would be appreciated.___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
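A hedged checklist for this kind of automount problem (file system name "home" as in the report):

   mmlsfs home -A                    # automount setting for the file system
   mmlsmount home -L                 # which nodes really have it mounted
   mmdiag --waiters                  # long waiters can block the mount at startup
   mmmount home                      # try a manual mount and watch the log below
   tail -f /var/adm/ras/mmfs.log.latest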



Re: [gpfsug-discuss] A cautionary tale of upgrades

2019-01-13 Thread Olaf Weiser
Hello Simon, it is a known issue: "tsctl shownodes up" reports the wrong FQDN, and so CES can't retrieve the right information that the node is up and healthy. Once in a while I hit the same thing, and I'm told there should be a fix (soon). The only official way to recover is to bring down the whole cluster and start again from scratch.
To work around it without bringing the whole cluster down, I wrote a little workaround - a wrapper around tsctl that only kicks in for "shownodes up". If anyone else ever hits the same issue, don't hesitate to open a PMR; from the L2 support team you can get this workaround (or the fix, soon).
Cheers

From: Simon Thompson
To: "gpfsug-discuss@spectrumscale.org"
Date: 01/11/2019 03:19 PM
Subject: [gpfsug-discuss] A cautionary tale of upgrades
Sent by: gpfsug-discuss-boun...@spectrumscale.org

I'll start by saying this is our experience,
maybe we did something stupid along the way, but just in case others see
similar issues … We have a cluster which contains protocol
nodes, these were all happily running GPFS 5.0.1-2 code. But the cluster
was a only 4 nodes + 1 quorum node – manager and quorum functions were
handled by the 4 protocol nodes. Then one day we needed to reboot a protocol
node. We did so and its disk controller appeared to have failed. Oh well,
we thought we’ll fix that another day, we still have three other quorum
nodes. As they are all getting a little long in
the tooth and were starting to struggle, we thought, well we have DME,
lets add some new nodes for quorum and token functions. Being shiny and
new they were all installed with GPFS 5.0.2-1 code. All was well. The some-time later, we needed to restart
another of the CES nodes, when we started GPFS on the node, it was causing
havock in our cluster – CES IPs were constantly being assigned, then removed
from the remaining nodes in the cluster. Crap we thought and disabled the
node in the cluster. This made things stabilise and as we’d been having
other GPFS issues, we didn’t want service to be interrupted whilst we
dug into this. Besides, it was nearly Christmas and we had conferences
and other work to content with. More time passes and we’re about to cut
over all our backend storage to some shiny new DSS-G kit, so we plan a
whole system maintenance window. We finish all our data sync’s and then
try to start our protocol nodes to test them. No dice … we can’t get
any of the nodes to bring up IPs, the logs look like they start the assignment
process, but then gave up. A lot of digging in the mm korn shell scripts,
and some studious use of DEBUG=1 when testing, we find that mmcesnetmvaddress
is calling “tsctl shownodes up”. On our protocol nodes, we find output
of the form:bear-er-dtn01.bb2.cluster.cluster,rds-aw-ctdb01-data.bb2.cluster.cluster,rds-er-ctdb01-data.bb2.cluster.cluster,bber-irods-ires01-data.bb2.cluster.cluster,bber-irods-icat01-data.bb2.cluster.cluster,bbaw-irods-icat01-data.bb2.cluster.cluster,proto-pg-mgr01.bear.cluster.cluster,proto-pg-pf01.bear.cluster.cluster,proto-pg-dtn01.bear.cluster.cluster,proto-er-mgr01.bear.cluster.cluster,proto-er-pf01.bear.cluster.cluster,proto-aw-mgr01.bear.cluster.cluster,proto-aw-pf01.bear.cluster.cluster Now our DNS name for these nodes is bb2.cluster
… something is repeating the DNS name. So we dig around, resolv.conf, /etc/hosts
etc all look good and name resolution seems fine. We look around on the manager/quorum nodes
and they don’t do this cluster.cluster thing. We can’t find anything
else Linux config wise that looks bad. In fact the only difference is that
our CES nodes are running 5.0.1-2 and the manager nodes 5.0.2-1. Given
we’re changing the whole storage hardware, we didn’t want to change the
GPFS/NFS/SMB code on the CES nodes, (we’ve been bitten before with SMB
packages not working properly in our environment), but we go ahead and
do GPFS and NFS packages. Suddenly, magically all is working again.
CES starts fine and IPs get assigned OK. And tsctl gives the correct output. So, my supposition is that there is some
incompatibility between 5.0.1-2 and 5.0.2-1 when running CES and the cluster
manager is running on 5.0.2-1. As I said before, I don’t have hard evidence
we did something stupid, but it certainly is fishy. We’re guessing this
same “feature” was the cause of the CES issues we saw when we rebooted
a CES node and the IPs kept deassigning… It looks like all was well as
we added the manager nodes after CES was started, but when a CES node restarted,
things broke. We got everything working again in house
so didn’t raise a PMR, but if you find yourself in this upgrade path,
beware! Simon  ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
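For illustration only, the wrapper idea Olaf describes in his reply above could look roughly like the sketch below. This is not the script provided by L2 support; the duplicated-domain pattern, the renamed binary and the path are assumptions.

   #!/bin/ksh
   # Hypothetical tsctl wrapper: strip a duplicated domain suffix from
   # "tsctl shownodes up" output and pass everything else straight through.
   # Assumes the real binary was moved aside to tsctl.bin beforehand.
   if [ "$1" = "shownodes" ] && [ "$2" = "up" ]; then
       /usr/lpp/mmfs/bin/tsctl.bin shownodes up | sed 's/\.cluster\.cluster/.cluster/g'
   else
       exec /usr/lpp/mmfs/bin/tsctl.bin "$@"
   fi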



Re: [gpfsug-discuss] Status for Alert: remotely mounted filesystem panic on accessing cluster after upgrading the owning cluster first

2018-11-29 Thread Olaf Weiser
Hi Tomer, I sent my workaround wrapper to Renar..
I've seen too little data to be sure that it's the same (tsctl shownodes ...)
issue, but he'll try it and let us know ..

From:
 "Grunenberg, Renar"
To:      
 gpfsug main discussion
list , "Olaf Weiser"
Date:      
 11/29/2018 09:04 AMSubject:    
   AW: [gpfsug-discuss]
Status for Alert: remotely mounted filesystem panic on accessing cluster
after upgrading the owning cluster first

Hallo Tomer,
thanks for this info, but can you explain in which release all these points are fixed now?

Renar Grunenberg
Abteilung Informatik – Betrieb, HUK-COBURG

Von: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] on behalf of Tomer Perry
Sent: Thursday, 29 November 2018 08:45
To: gpfsug main discussion list; Olaf Weiser
Subject: Re: [gpfsug-discuss] Status for Alert: remotely mounted filesystem panic on accessing cluster after upgrading the owning cluster first

Hi,
I remember there was some defect around tsctl and mixed domains - not sure if it was fixed and in what version.
A workaround in the past was to "wrap" tsctl with a script that would strip those.
Olaf might be able to provide more info (I believe he had some sample script).

Regards,
Tomer Perry
Scalable I/O Development (Spectrum Scale)
email: t...@il.ibm.com
1 Azrieli Center, Tel Aviv 67021, Israel
Global Tel: +1 720 3422758, Israel Tel: +972 3 9188625, Mobile: +972 52 2554625

From: "Grunenberg, Renar" <renar.grunenb...@huk-coburg.de>
To: 'gpfsug main discussion list' <gpfsug-discuss@spectrumscale.org>
Date: 29/11/2018 09:29
Subject: Re: [gpfsug-discuss] Status for Alert: remotely mounted filesystem panic on accessing cluster after upgrading the owning cluster first
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hallo All,
in relation to the Alert, I had some questions about experiences with establishing remote clusters with different FQDNs.
What we see here is that the owning cluster (5.0.1.1) and the local cluster (5.0.2.1) have different domain names and both are connected through a firewall. ICMP, port 1191 and the ephemeral ports are open.
If we dump the tscomm component of both daemons, we see connections to nodes that are named [hostname + FQDN of the local cluster + FQDN of the remote cluster]. We analyzed nscd and DNS, made some tcpdumps and so on, and came to the conclusion that tsctl generates this wrong node name; then, if a cluster manager takeover happens because of a shutdown of that daemon (on the owning cluster side), the join protocol rejects these connections.
Are there any comparable experiences in the field? And if yes, what is the solution for that?

Thanks
Renar

Renar Grunenberg
Abteilung Informatik – Betrieb, HUK-COBURG
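Following up on Tomer's remark about wrapping tsctl: a minimal sketch of such a wrapper, assuming the real binary is moved aside first and that the stray suffix looks like the doubled ".cluster" seen earlier in this digest (paths, suffix and option handling are assumptions, not an IBM-provided script):

# one-time setup on the affected nodes (hypothetical): mv /usr/lpp/mmfs/bin/tsctl /usr/lpp/mmfs/bin/tsctl.bin
# then install this wrapper as /usr/lpp/mmfs/bin/tsctl and make it executable:
#!/bin/bash
# Strip a duplicated domain suffix from "shownodes up" output; pass everything else through untouched.
REAL=/usr/lpp/mmfs/bin/tsctl.bin
if [[ "$1" == "shownodes" && "$2" == "up" ]]; then
    "$REAL" shownodes up | sed 's/\.cluster\.cluster/.cluster/g'
else
    exec "$REAL" "$@"
fi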

Re: [gpfsug-discuss] Error with AFM fileset creation with mapping

2018-11-26 Thread Olaf Weiser
Try a dedicated extra "-p" for each attribute.

Sent from my iPhone
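In other words, a hedged rewrite of the command quoted below, keeping the original paths and target; note that the failing command appears to contain a typographic en-dash ("–p") copied from the web page rather than an ASCII "-p", which would also explain the "Incorrect extra argument" message:

mmcrfileset RAW test1 --inode-space new -p afmmode=sw -p afmtarget=afmgw1://gpfs/gpfs2/swhome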

> On 26.11.2018 at 16:50, Dorigo Alvise (PSI) wrote:
> 
> Good evening,
> I'm following this guide: 
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1ins_paralleldatatransfersafm.htm
> to setup AFM parallel transfer.
> 
> Why does the following command (grabbed directly from the web page above) fire
> that error?
> 
> [root@sf-export-3 ~]# mmcrfileset RAW test1 --inode-space new –p 
> afmmode=sw,afmtarget=afmgw1://gpfs/gpfs2/swhome
> mmcrfileset: Incorrect extra argument: –p
> Usage:
>   mmcrfileset Device FilesetName [-p afmAttribute=Value...] [-t Comment]
>  [--inode-space {new [--inode-limit 
> MaxNumInodes[:NumInodesToPreallocate]] | ExistingFileset}]
>  [--allow-permission-change PermissionChangeMode]
> 
> The mapping was correctly created:
> 
> [root@sf-export-3 ~]# mmafmconfig show
> Map name: afmgw1
> Export server map:
> 172.16.1.2/sf-export-2.psi.ch,172.16.1.3/sf-export-3.psi.ch
> 
> Is this a known bug ?
> 
> Thanks,
> Regards.
> 
> Alvise

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Wrong behavior of mmperfmon

2018-11-15 Thread Olaf Weiser
ntp running / time correct ?

From: "Dorigo Alvise (PSI)"
To: "gpfsug-discuss@spectrumscale.org"
Date: 11/15/2018 04:30 PM
Subject: [gpfsug-discuss] Wrong behavior of mmperfmon
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello,
I'm using mmperfmon to get writing stats
on NSD during a write activity on a GPFS filesystem (Lenovo system with
dss-g-2.0a). I use this command:

# mmperfmon query 'sf-dssio-1.psi.ch|GPFSNSDFS|RAW|gpfs_nsdfs_bytes_written' --number-buckets 48 -b 1

to get the stats. What it returns is a list of valid values followed by a longer list of 'null', as shown below:

Legend:
 1: sf-dssio-1.psi.ch|GPFSNSDFS|RAW|gpfs_nsdfs_bytes_written

Row            Timestamp  gpfs_nsdfs_bytes_written
  1  2018-11-15-16:15:57                 746586112
  2  2018-11-15-16:15:58                 704643072
  3  2018-11-15-16:15:59                 805306368
  4  2018-11-15-16:16:00                 754974720
  5  2018-11-15-16:16:01                 754974720
  6  2018-11-15-16:16:02                 763363328
  7  2018-11-15-16:16:03                 746586112
  8  2018-11-15-16:16:04                 746848256
  9  2018-11-15-16:16:05                 780140544
 10  2018-11-15-16:16:06                 679923712
 11  2018-11-15-16:16:07                 746618880
 12  2018-11-15-16:16:08                 780140544
 13  2018-11-15-16:16:09                 746586112
 14  2018-11-15-16:16:10                 763363328
 15  2018-11-15-16:16:11                 780173312
 16  2018-11-15-16:16:12                 721420288
 17  2018-11-15-16:16:13                 796917760
 18  2018-11-15-16:16:14                 763363328
 19  2018-11-15-16:16:15                 738197504
 20  2018-11-15-16:16:16                 738197504
 21  2018-11-15-16:16:17                      null
 ...                                          null
 48  2018-11-15-16:16:44                      null

(rows 21 through 48, 16:16:17 to 16:16:44, are all null)

If I run again and again I still get the
same pattern: valid data (even 0 in case of no write activity) followed by more null data.
Is that normal? If not, is there a way to get only non-null data by fine-tuning the pmcollector's configuration file?
The corresponding ZiMon sensor (GPFSNSDFS) has period=1. The ZiMon version is 4.2.3-7.

Thanks,

   Alvise
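Since Olaf's question above is about time: trailing nulls are exactly what you would see if the node's clock runs ahead of the collector, so a quick hedged check (standard tools only; host names are taken from the post or are placeholders, and the ZiMon config path is the usual default, an assumption here):

# compare epoch seconds on the I/O node and on the pmcollector host
mmdsh -N sf-dssio-1.psi.ch,<pmcollector-host> date +%s
# check NTP state on each node
ntpq -p          # or: chronyc tracking
# confirm the configured period of the GPFSNSDFS sensor
grep -B2 -A4 'GPFSNSDFS' /opt/IBM/zimon/ZIMonSensors.cfg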

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Job vacancy @Birmingham

2018-10-18 Thread Olaf Weiser
Hi Simon .. well - I would love to .. but .. ;-)
hey - what do you think, how long can a citizen from the EU live (and work) in the UK ;-)
don't take me too seriously... see you soon, consider yourself invited for a coffee for my rude comment .. ;-)

olaf

From: Simon Thompson
To: "gpfsug-discuss@spectrumscale.org"
Date: 10/17/2018 11:02 PM
Subject: [gpfsug-discuss] Job vacancy @Birmingham
Sent by: gpfsug-discuss-boun...@spectrumscale.org

We're looking for someone to join our systems
team here at University of Birmingham. In case you didn't realise, we're
pretty reliant on Spectrum Scale to deliver our storage systems. https://atsv7.wcn.co.uk/search_engine/jobs.cgi?amNvZGU9MTc2MzczOSZ2dF90ZW1wbGF0ZT03Njcmb3duZXI9NTAzMjUyMSZvd25lcnR5cGU9ZmFpciZicmFuZF9pZD0wJmxvY2F0aW9uX2NvZGU9MTU0NDUmb2NjX2NvZGU9Njg3NiZwb3N0aW5nX2NvZGU9MTE3=1763739_template=767=5032521=fair_id=0_code=15445_code=6876_code=117https://atsv7.wcn.co.uk/search_engine/jobs.cgi?amNvZGU9MTc2MzczOSZ2dF90ZW1wbGF0ZT03Njcmb3duZXI9NTAzMjUyMSZvd25lcnR5cGU9ZmFpciZicmFuZF9pZD0wJmxvY2F0aW9uX2NvZGU9MTU0NDUmb2NjX2NvZGU9Njg3NiZwb3N0aW5nX2NvZGU9MTE3=1763739_template=767=5032521=fair_id=0_code=15445_code=6876_code=117Such a snappy URL :-)Feel free to email me *OFFLIST* if you have
informal enquiries!

Simon

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Olaf Weiser
d for writing to disk. The writing is not necessarily complete when the sync() call returns to the program.
- A user can enter the sync command, which in turn issues a sync() call. Again, some of the writes may not be complete when the user is prompted for input (or the next command in a shell script is processed).

close() vs fclose():
A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a file system to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored, use fsync(2). (It will depend on the disk hardware at this point.)

Mit freundlichen Grüßen / Kind regards

Alexander Saupp
IBM Systems, Storage Platform, EMEA Storage Competence Center
IBM Deutschland GmbH, Am Weiher 24, 65451 Kelsterbach, Germany
Email: alexander.sa...@de.ibm.com

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] error compiling IOR on GPFS

2018-10-12 Thread Olaf Weiser
I think the step you are missing is this:

./configure LIBS=/usr/lpp/mmfs/lib/libgpfs.so
make

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform

From:
 KG To:      
 gpfsug main discussion
list Date:      
 10/12/2018 12:40 PMSubject:    
   Re: [gpfsug-discuss]
error compiling IOR on GPFSSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgHi JohnYes, I am using latest version from
this link.Do I have to use any additional switches
for compilation? I used following sequence./bootstrap./configure ./make (fails)On Fri, Oct 12, 2018 at 7:51 AM John Bent <johnb...@gmail.com>
wrote:Kiran,Are you using the latest version of IOR?https://github.com/hpc/iorThanks,JohnOn Thu, Oct 11, 2018 at 10:39 PM KG <spectrumsc...@kiranghag.com>
wrote:Hi FolksI am trying to compile IOR on a GPFS filesystem
and running into following errors.Github forum says that "The
configure script does not add -lgpfs to the CFLAGS when it detects GPFS
support."Any help on how to get around
this?mpicc -DHAVE_CONFIG_H -I.     -g -O2
-MT aiori-MPIIO.o -MD -MP -MF .deps/aiori-MPIIO.Tpo -c -o aiori-MPIIO.o
aiori-MPIIO.caiori-MPIIO.c: In function ‘MPIIO_Xfer’:aiori-MPIIO.c:236:24: warning: assignment from incompatible
pointer type [enabled by default]               
 Access = MPI_File_write;               
        ^aiori-MPIIO.c:237:27: warning: assignment from incompatible
pointer type [enabled by default]               
 Access_at = MPI_File_write_at;               
           ^aiori-MPIIO.c:238:28: warning: assignment from incompatible
pointer type [enabled by default]               
 Access_all = MPI_File_write_all;               
            ^aiori-MPIIO.c:239:31: warning: assignment from incompatible
pointer type [enabled by default]               
 Access_at_all = MPI_File_write_at_all;               
               ^mv -f .deps/aiori-MPIIO.Tpo .deps/aiori-MPIIO.Pompicc  -g -O2   -o ior ior.o utilities.o
parse_options.o aiori-POSIX.o aiori-MPIIO.o     -lmaiori-POSIX.o: In function `gpfs_free_all_locks':/gpfs/Aramco_POC/ior-master/src/aiori-POSIX.c:118:
undefined reference to `gpfs_fcntl'aiori-POSIX.o: In function `gpfs_access_start':aiori-POSIX.c:(.text+0x91f): undefined reference to
`gpfs_fcntl'aiori-POSIX.o: In function `gpfs_access_end':aiori-POSIX.c:(.text+0xa04): undefined reference to
`gpfs_fcntl'
collect2: error: ld returned 1 exit status
make[2]: *** [ior] Error 1
make[2]: Leaving directory `/gpfs/Aramco_POC/ior-master/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/gpfs/Aramco_POC/ior-master/src'
make: *** [all-recursive] Error 1

Kiran
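Putting Olaf's hint and the GitHub note together, the build sequence would presumably look like the following (a hedged sketch; pointing configure at libgpfs can be done either via LIBS, as suggested above, or via LDFLAGS plus -lgpfs):

cd ior-master        # path is illustrative
./bootstrap
./configure LIBS=/usr/lpp/mmfs/lib/libgpfs.so
#  or equivalently:  ./configure LDFLAGS=-L/usr/lpp/mmfs/lib LIBS=-lgpfs
make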

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] IBM ESS - certified now for SAP

2018-09-27 Thread Olaf Weiser
Hallo friends and fans of GPFS and Scale ;-)
In case you have an interest in running SAP on Scale: as of this week, we got the final approval and re-certification for all new ESS models to run SAP HANA.
https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/enterprise-storage.html#categories=certified%23International%20Business%20Machines%20Corporation%23NAS%20-%20Distributed%20file%20system=2068

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Olaf Weiser
see below an mdtest for a file system with the default block size ... 4 MB blocksize.. metadata is on SSD, data is on HDD ... which is not really relevant for this mdtest ;-)

-- started at 09/07/2018 06:54:54 --

mdtest-1.9.3 was launched with 40 total task(s) on 20 node(s)
Command line used: mdtest -n 25000 -i 3 -u -d /homebrewed/gh24_4m_4m/mdtest
Path: /homebrewed/gh24_4m_4m
FS: 10.0 TiB   Used FS: 0.0%   Inodes: 12.0 Mi   Used Inodes: 2.3%

40 tasks, 100 files/directories

SUMMARY: (of 3 iterations)
   Operation                   Max            Min           Mean      Std Dev
   ---------                   ---            ---           ----      -------
   Directory creation:   449160.409     430869.822     437002.187     8597.272
   Directory stat    :  6664420.560    5785712.544    6324276.731   385192.527
   Directory removal :   398360.058     351503.369     371630.648    19690.580
   File creation     :   288985.217     270550.129     279096.800     7585.659
   File stat         :  6720685.117    6641301.499    6674123.407    33833.182
   File read         :  3055661.372    2871044.881    2945513.966    79479.638
   File removal      :   215187.602     146639.435     179898.441    28021.467
   Tree creation     :       10.215          3.165          6.603        2.881
   Tree removal      :        5.484          0.880          2.418        2.168

-- finished at 09/07/2018 06:55:42 --

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform

From: "Andrew Beattie"
To: gpfsug-discuss@spectrumscale.org
Date: 09/21/2018 02:34 AM
Subject: Re: [gpfsug-discuss] Metadata with GNR code
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Simon,

My recommendation is still very much to use
SSD for metadata and NL-SAS for data, and the GH14/GH24 building blocks certainly
make this much easier. Unless your filesystem is massive (Summit
sized) you will typically still continue to benefit from the Random IO
performance of SSD (even RI SSD) in comparison to NL-SAS. It still makes more sense to me to continue
to use 2 copy or 3 copy for Metadata even in ESS / GNR style environments.
 The read performance for metadata using 3copy is still significantly
better than any other scenario. As with anything there are exceptions to
the rule, but my experiences with ESS and ESS with SSD so far still maintain
that the standard thoughts on managing Metadata and Small file IO remain
the same -- even with the improvements around sub blocks with Scale V5. MDtest is still the typical benchmark for
this comparison and MDTest shows some very clear differences  even
on SSD when you use a large filesystem block size with more sub blocks
vs a smaller block size with 1/32 subblocks This only gets worse if you change the storage
media from SSD to NL-SASAndrew BeattieSoftware Defined Storage  - IT SpecialistPhone: 614-2133-7927E-mail: abeat...@au1.ibm.com  - Original message -From: Simon Thompson Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [gpfsug-discuss] Metadata with GNR codeDate: Fri, Sep 21, 2018 3:29 AM Just wondering if anyone has any strong views/recommendations
with metadata when using GNR code? I know in “san” based GPFS, there is a recommendation
to have data and metadata split with the metadata on SSD. I’ve also heard that with GNR there isn’t
much difference in splitting data and metadata. We’re looking at two systems and want to
replicate metadata, but not data (mostly) between them, so I’m not really
sure how we’d do this without having separate system pool (and then NSDs
in different failure groups)…. If we used 8+2P vdisks for metadata only,
would we still see no difference in performance compared to mixed (I guess
the 8+2P is still spread over a DA so we’d get half the drives in the
GNR system active…). Or should we stick SSD based storage in as
well for the metadata pool? (Which brings an interesting question about
RAID code related to the recent discussions on mirroring vs RAID5…)

Thoughts welcome!

Simon

Re: [gpfsug-discuss] Top files on GPFS filesystem

2018-09-05 Thread Olaf Weiser
r(CHANGE_TIME)
||            ' M=' || varchar(MODIFICATION_TIME)
)  # mmapplypolicy sasconfig -P policy-file-heat3.txt
-I defer -f teste6  Then we could grep by inode number
and see which file it is:  # grep "^451799 " teste6.list.hotfiles  For privacy reasons I won't show the
result but it found the file. The good thing this list also provides the
UID and GID of the file. We still waiting a feedback from SAS admin to
see it's acceptable.- dstat with --gpfs-ops --top-io-adv|--top-bio|--top-io:
The problem is it only shows one process. That's not enough.- Systemtap: It didn't work. I think it's
because there's no GPFS symbols. If somebody know how to add GPFS symbols
that can be very handy.- QOS: We first enabled QOS to just collect
filesystem statistics:  # mmchqos saswork --enable --fine-stats
60 --pid-stats yes  The the SAS admin started another
SAS job and got the PID. Then we run the following command:  # mmlsqos saswork --fine-stats 2 --seconds
60 | grep SASPID  We never matched the PIDs. When you
get from ps -ef | grep nodms, it return a PID of 5 digits and mmlsqos gives
PIDs of 8 digits. We have a ticket opended to understand what's happening. After all this time trying to figure out
a way to generate this report, I think the problem is more complex. Even
if we get this information what we could do to put a limit in those processes?
I think the best option would have AIX servers running WLM and the saswork
filesystems would need to be local on each server. In that way we not only
could monitor but define classes, shares and limits for I/O. I think RedHat
or Linux in general doesn't have a workload manager like in AIX.  Abraços
/ Regards / Saludos, Anderson NobreAIX & Power ConsultantMaster Certified IT SpecialistIBM Systems Hardware Client Technical Team – IBM Systems Lab Services Phone:55-19-2132-4317E-mail: ano...@br.ibm.com  - Original message -From: "Olaf Weiser" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: Re: [gpfsug-discuss] Top files on GPFS filesystemDate: Mon, Aug 13, 2018 3:10 AM there's no mm* command to get it cluster wide.. you can use fileheat and policy engine to identify most active files
..  and further more... combine it with migration rules ... to replace
those files .. please note.. files, that are accessed very heavily but all requests answered
out of pagepol (cached files) .. fileheat does'nt get increased for cache
hits...  fileheat is only counted for real IOs to the disk... as intended
...From:        "Anderson
Ferreira Nobre" To:        gpfsug-discuss@spectrumscale.orgDate:        08/10/2018
08:10 PMSubject:        [gpfsug-discuss]
Top files on GPFS filesystemSent by:        gpfsug-discuss-boun...@spectrumscale.orgHi all, Does anyone know how to list the top files by throughput and IOPS in a
single GPFS filesystem like filemon in AIX?   Abraços
/ Regards / Saludos,  Anderson NobreAIX & Power ConsultantMaster Certified IT SpecialistIBM Systems Hardware Client Technical Team – IBM Systems Lab Services   Phone:55-19-2132-4317E-mail: ano...@br.ibm.com___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss  ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS Independent Fileset Limit vs Quotas?

2018-08-13 Thread Olaf Weiser
as Dominic said.. .. you are absolutely right .. for mmbackup you need dedicated inode spaces .. so "independent" filesets .. (in case you want to be able to run mmbackup at fileset level, or multiple mmbackups in parallel .. )

From:
 "Peinkofer, Stephan"
To:      
 gpfsug main discussion
list Date:      
 08/13/2018 09:26 AMSubject:    
   Re: [gpfsug-discuss]
GPFS Independent Fileset Limit vs Quotas?Sent by:    
gpfsug-discuss-boun...@spectrumscale.org

Dear Marc,

OK, so let’s give it a try:

[root@datdsst100 pr74qo]# mmlsfileset dsstestfs01
Filesets in file system 'dsstestfs01':
Name                      Status    Path
root                      Linked    /dss/dsstestfs01
...
quota_test_independent    Linked    /dss/dsstestfs01/quota_test_independent
quota_test_dependent      Linked    /dss/dsstestfs01/quota_test_independent/quota_test_dependent

[root@datdsst100 pr74qo]# mmsetquota dsstestfs01:quota_test_independent --user a2822bp --block 1G:1G --files 10:10
[root@datdsst100 pr74qo]# mmsetquota dsstestfs01:quota_test_dependent --user a2822bp --block 10G:10G --files 100:100

[root@datdsst100 pr74qo]# mmrepquota -u -v dsstestfs01:quota_test_independent
*** Report for USR quotas on dsstestfs01
                                       Block Limits                              |           File Limits
Name     fileset                type       KB    quota    limit in_doubt  grace | files  quota  limit in_doubt  grace entryType
a2822bp  quota_test_independent USR         0  1048576  1048576        0   none |     0     10     10        0   none e
root     quota_test_independent USR         0        0        0        0   none |     1      0      0        0   none i

[root@datdsst100 pr74qo]# mmrepquota -u -v dsstestfs01:quota_test_dependent
*** Report for USR quotas on dsstestfs01
                                       Block Limits                              |           File Limits
Name     fileset                type       KB    quota    limit in_doubt  grace | files  quota  limit in_doubt  grace entryType
a2822bp  quota_test_dependent   USR         0 10485760 10485760        0   none |     0    100    100        0   none e
root     quota_test_dependent   USR         0        0        0        0   none |     1      0      0        0   none i

Looks good …

[root@datdsst100 pr74qo]# cd /dss/dsstestfs01/quota_test_independent/quota_test_dependent/
[root@datdsst100 quota_test_dependent]# for foo in `seq 1 99`; do touch file${foo}; chown a2822bp:pr28fa file${foo}; done

[root@datdsst100 quota_test_dependent]# mmrepquota -u -v dsstestfs01:quota_test_dependent
*** Report for USR quotas on dsstestfs01
                                       Block Limits                              |           File Limits
Name     fileset                type       KB    quota    limit in_doubt  grace | files  quota  limit in_doubt  grace entryType
a2822bp  quota_test_dependent   USR         0 10485760 10485760        0   none |    99    100    100        0   none e
root     quota_test_dependent   USR         0        0        0        0   none |     1      0      0        0   none i

[root@datdsst100 quota_test_dependent]# mmrepquota -u -v dsstestfs01:quota_test_independent
*** Report for USR quotas on dsstestfs01
                                       Block Limits                              |           File Limits
Name     fileset                type       KB    quota    limit in_doubt  grace | files  quota  limit in_doubt  grace entryType
a2822bp  quota_test_independent USR         0  1048576  1048576        0   none |     0     10     10        0   none e
root     quota_test_independent USR         0        0        0        0   none |     1      0      0        0   none i

So it seems that per-fileset per-user quota really does not depend on independence. But what is the documentation then meaning with:
>>> User group and user quotas can be tracked at the file system level or per independent fileset.
???

However, there still remains the problem with mmbackup and mmapplypolicy …
And if you look at some of the RFEs, like the one from DESY, they want even more than 10k independent filesets …

Best Regards,
Stephan Peinkofer

--
Stephan Peinkofer
Dipl. Inf. (FH), M. Sc. (TUM)
Leibniz Supercomputing Centre, Data and Storage Division
Boltzmannstraße 1, 85748 Garching b. München
Tel: +49(0)89 35831-8715     Fax: +49(0)89 35831-9700
URL: http://www.lrz.de

On 12. Aug 2018, at 15:05, Marc A Kaplan
wrote:That's interesting, I confess I never read that piece
of documentation.What's also interesting, is that if you look at this doc for 

Re: [gpfsug-discuss] Top files on GPFS filesystem

2018-08-13 Thread Olaf Weiser
there's no mm* command to get it cluster-wide.. you can use file heat and the policy engine to identify the most active files .. and furthermore... combine it with migration rules ... to replace those files ..
please note.. for files that are accessed very heavily but whose requests are all answered out of the pagepool (cached files) .. file heat doesn't get increased for cache hits... file heat is only counted for real IOs to the disk... as intended ...

From:
 "Anderson Ferreira
Nobre" To:      
 gpfsug-discuss@spectrumscale.orgDate:      
 08/10/2018 08:10 PMSubject:    
   [gpfsug-discuss]
Top files on GPFS filesystemSent by:    
gpfsug-discuss-boun...@spectrumscale.org

Hi all,
Does anyone know how to list the top files by throughput and IOPS in a single GPFS filesystem, like filemon in AIX?

Abraços / Regards / Saludos,

Anderson Nobre
AIX & Power Consultant, Master Certified IT Specialist
IBM Systems Hardware Client Technical Team – IBM Systems Lab Services
Phone: 55-19-2132-4317  E-mail: ano...@br.ibm.com
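As a concrete, hedged illustration of Olaf's file-heat suggestion above (file names, thresholds and the file system name are illustrative; file heat accounting must be enabled first, and only takes effect for subsequent I/O):

# enable file heat accounting (example values)
mmchconfig fileHeatPeriodMinutes=1440,fileHeatLossPercent=10
# build a list of files ordered by heat
cat > hotfiles.pol <<'EOF'
RULE EXTERNAL LIST 'hotlist' EXEC ''
RULE 'hot' LIST 'hotlist' WEIGHT(FILE_HEAT) SHOW(varchar(FILE_HEAT) || ' ' || varchar(USER_ID))
EOF
mmapplypolicy <fsname> -P hotfiles.pol -I defer -f /tmp/hot
# /tmp/hot.list.hotlist then contains the candidate files, hottest first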
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS Independent Fileset Limit

2018-08-10 Thread Olaf Weiser
Hallo Stephan,
the limit is not a hard-coded limit - technically speaking, you could raise it easily.
But as always, it is a question of testing and support .. I've seen customer cases where the use of a much smaller number of independent filesets generated a lot of performance issues, hangs ... at least noise and partial trouble ..
It might not be the case with your specific workload, given that you're already running close to 1000 ...
I suspect this number of 1000 filesets - at the time it was introduced - was also just a matter of having to pick a number... It turns out that a general commitment to support > 1000 independent filesets is more or less hard.. because which use cases should we test / support?
I think there might be a good chance for you that, for your specific workload, one would allow and support more than 1000.
Do you still have a PMR on your side for this? - if not - I know .. opening PMRs is additional work ... but could you please .. then we can decide whether raising the limit is an option for you ..

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform

From:
To:      
 gpfsug main discussion
list Cc:      
 Doris Franke ,
Uwe Tron , Dorian Krause Date:      
 08/10/2018 01:29 PMSubject:    
   [gpfsug-discuss]
GPFS Independent Fileset LimitSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgDear IBM and GPFS List,we at the Leibniz Supercomputing Centre
and our GCS Partners from the Jülich Supercomputing Centre will soon be
hitting the current Independent Fileset Limit of 1000 on a number of our
GPFS Filesystems.There are also a number of RFEs from other
users open, that target this limitation:https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=56780https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=120534https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=106530https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=85282I know GPFS Development was very busy fulfilling
the CORAL requirements but maybe now there is again some time to improve
something else. If there are any other users on the list
that are approaching the current limitation in independent filesets, please
take some time and vote for the RFEs above.Many thanks in advance and have a nice
weekend.

Best Regards,
Stephan Peinkofer

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Sven Oehme now at DDN

2018-08-08 Thread Olaf Weiser
dear friends of GPFS, Sven is Sven.. and he is "[...]permanent[...]" ..
it is hard to see him go away to DDN, 'cause as we all agree he contributed very^99 much to GPFS and he's a good friend too.. but hey.. GPFS has a 20-year history.. is very complex ... so it's not a one-man show .. we are still here ;-)
So feel free.. keep on asking questions ... we'll take them ... lots of people around the globe contribute daily to the product.. a mass of planning/evaluation and research is ongoing

cheers
olaf

From:
 Stephen Ulmer To:      
 gpfsug main discussion
list Date:      
 08/09/2018 06:46 AMSubject:    
   Re: [gpfsug-discuss]
Sven Oehme now at DDNSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgBut it still shows him employed at IBM through “present”.
So is he on-loan or is it “permanent”?-- StephenOn Aug 2, 2018, at 11:56 AM, Marc A Kaplan 
wrote:https://www.linkedin.com/in/oehmes/   Apparently, Sven is now "Chief Research Officer at
DDN"___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Sub-block size not quite as expected on GPFS 5 filesystem?

2018-08-03 Thread Olaf Weiser
Can you share your stanza file?

Sent from my iPhone
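For reference, a hedged sketch of what NSD stanza entries for a mixed setup like the one quoted below might contain (all NSD names, devices, servers, failure groups and pool assignments here are illustrative, not Kevin's actual file; block sizes themselves are set at mmcrfs time):

%nsd: nsd=test21A3nsd device=/dev/mapper/test21A3 servers=testnsd1,testnsd2 usage=metadataOnly failureGroup=210 pool=system
%nsd: nsd=test23Ansd  device=/dev/mapper/test23A  servers=testnsd2,testnsd3 usage=dataOnly     failureGroup=230 pool=raid6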

> On 02.08.2018 at 23:15, Buterbaugh, Kevin L wrote:
> 
> OK, so hold on … NOW what’s going on???  I deleted the filesystem … went to 
> lunch … came back an hour later … recreated the filesystem with a metadata 
> block size of 4 MB … and I STILL have a 1 MB block size in the system pool 
> and the wrong fragment size in other pools…
> 
> Kevin
> 
> /root/gpfs
> root@testnsd1# mmdelfs gpfs5
> All data on the following disks of gpfs5 will be destroyed:
> test21A3nsd
> test21A4nsd
> test21B3nsd
> test21B4nsd
> test23Ansd
> test23Bnsd
> test23Cnsd
> test24Ansd
> test24Bnsd
> test24Cnsd
> test25Ansd
> test25Bnsd
> test25Cnsd
> Completed deletion of file system /dev/gpfs5.
> mmdelfs: Propagating the cluster configuration data to all
>   affected nodes.  This is an asynchronous process.
> /root/gpfs
> root@testnsd1# mmcrfs gpfs5 -F ~/gpfs/gpfs5.stanza -A yes -B 4M -E yes -i 
> 4096 -j scatter -k all -K whenpossible -m 2 -M 3 -n 32 -Q yes -r 1 -R 3 -T 
> /gpfs5 -v yes --nofilesetdf --metadata-block-size 4M
> 
> The following disks of gpfs5 will be formatted on node testnsd3:
> test21A3nsd: size 953609 MB
> test21A4nsd: size 953609 MB
> test21B3nsd: size 953609 MB
> test21B4nsd: size 953609 MB
> test23Ansd: size 15259744 MB
> test23Bnsd: size 15259744 MB
> test23Cnsd: size 1907468 MB
> test24Ansd: size 15259744 MB
> test24Bnsd: size 15259744 MB
> test24Cnsd: size 1907468 MB
> test25Ansd: size 15259744 MB
> test25Bnsd: size 15259744 MB
> test25Cnsd: size 1907468 MB
> Formatting file system ...
> Disks up to size 8.29 TB can be added to storage pool system.
> Disks up to size 16.60 TB can be added to storage pool raid1.
> Disks up to size 132.62 TB can be added to storage pool raid6.
> Creating Inode File
>   12 % complete on Thu Aug  2 13:16:26 2018
>   25 % complete on Thu Aug  2 13:16:31 2018
>   38 % complete on Thu Aug  2 13:16:36 2018
>   50 % complete on Thu Aug  2 13:16:41 2018
>   62 % complete on Thu Aug  2 13:16:46 2018
>   74 % complete on Thu Aug  2 13:16:52 2018
>   85 % complete on Thu Aug  2 13:16:57 2018
>   96 % complete on Thu Aug  2 13:17:02 2018
>  100 % complete on Thu Aug  2 13:17:03 2018
> Creating Allocation Maps
> Creating Log Files
>3 % complete on Thu Aug  2 13:17:09 2018
>   28 % complete on Thu Aug  2 13:17:15 2018
>   53 % complete on Thu Aug  2 13:17:20 2018
>   78 % complete on Thu Aug  2 13:17:26 2018
>  100 % complete on Thu Aug  2 13:17:27 2018
> Clearing Inode Allocation Map
> Clearing Block Allocation Map
> Formatting Allocation Map for storage pool system
>   98 % complete on Thu Aug  2 13:17:34 2018
>  100 % complete on Thu Aug  2 13:17:34 2018
> Formatting Allocation Map for storage pool raid1
>   52 % complete on Thu Aug  2 13:17:39 2018
>  100 % complete on Thu Aug  2 13:17:43 2018
> Formatting Allocation Map for storage pool raid6
>   24 % complete on Thu Aug  2 13:17:48 2018
>   50 % complete on Thu Aug  2 13:17:53 2018
>   74 % complete on Thu Aug  2 13:17:58 2018
>   99 % complete on Thu Aug  2 13:18:03 2018
>  100 % complete on Thu Aug  2 13:18:03 2018
> Completed creation of file system /dev/gpfs5.
> mmcrfs: Propagating the cluster configuration data to all
>   affected nodes.  This is an asynchronous process.
> /root/gpfs
> root@testnsd1# mmlsfs gpfs5
> flag                value                    description
> ------------------- ------------------------ -----------------------------------
>  -f 8192 Minimum fragment (subblock) size 
> in bytes (system pool)
> 32768Minimum fragment (subblock) size 
> in bytes (other pools)
>  -i 4096 Inode size in bytes
>  -I 32768Indirect block size in bytes
>  -m 2Default number of metadata 
> replicas
>  -M 3Maximum number of metadata 
> replicas
>  -r 1Default number of data replicas
>  -R 3Maximum number of data replicas
>  -j scatter  Block allocation type
>  -D nfs4 File locking semantics in effect
>  -k all  ACL semantics in effect
>  -n 32   Estimated number of nodes that 
> will mount file system
>  -B 1048576  Block size (system pool)
> 4194304  Block size (other pools)
>  -Q user;group;fileset   Quotas accounting enabled
> user;group;fileset   Quotas enforced
> none Default quotas enabled
>  --perfileset-quota No   Per-fileset quota 

Re: [gpfsug-discuss] mmbackup issue

2018-06-20 Thread Olaf Weiser

Hi Renar, if possible, let's check whether you can identify specific parts of your name space which are affected (fileset, subdir ...). If so .. you can EXCLUDE them from mmbackup and run a 2nd policy in parallel with an EXEC LIST, and call dsmc incr directly upon that list .. I know.. it's not a solution .. it's a workaround ..

cheers

From:
 "Grunenberg, Renar"
To:      
 'gpfsug main discussion
list' Date:      
 06/20/2018 06:00 PMSubject:    
   Re: [gpfsug-discuss]
mmbackup issueSent by:    
gpfsug-discuss-boun...@spectrumscale.org

Hallo Valdis,
first, thanks for the explanation, we understand that. But this problem generates only 2 versions at the TSM server for the same file, in the same directory. This means that with mmbackup and the .shadow... file it is not possible to keep more than 2 backup versions in TSM for the same file in the same directory. The native BA client manages this (there, different inode numbers already exist). But on the TSM server side, the files that are selected by 'ba incr' are merged into the right filespace and will be bound to the management class where >2 versions exist.

Renar Grunenberg
Abteilung Informatik – Betrieb, HUK-COBURG

-----Original Message-----
From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] on behalf of valdis.kletni...@vt.edu
Sent: Wednesday, 20 June 2018 16:45
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] mmbackup issue

On Wed, 20 Jun 2018 14:08:09 -, "Grunenberg, Renar" said:
> There are after each test (change of the content) the file became every time
> a new inode number. This behavior is the reason why the shadowfile think (or the
> policyengine) the old file is never existent

That's because as far as the system is concerned, this is a new file that happens to have the same name.

> At SAS the data file will updated with a xx.data.new file and after the close
> the xx.data.new will be renamed to the original name xx.data again. And the
> miss interpretation of different inodes happen again.

Note that all the interesting information about a file is contained in the inode (the size, the owner/group, the permissions, creation time, disk blocks allocated, and so on). The *name* of the file is pretty much the only thing about a file that isn't in the inode - and that's because it's not a unique value for the file (there can be more than one link to a file). The name(s) of the file are stored in the parent directory as inode/name pairs.

So here's what happens. You have the original file xx.data. It has an inode number 9934 or whatever. In the parent directory, there's an entry "name xx.data -> inode 9934". SAS creates a new file xx.data.new with inode number 83425 or whatever. Different file - the creation time, blocks allocated on disk, etc. are all different than the file described by inode 9934. The directory now has "name xx.data -> 9934" "name xx.data.new -> inode 83425".

SAS then renames xx.data.new - and rename is defined as "change the name entry for this inode, removing any old mappings for the same name". So...

0) 'rename xx.data.new xx.data'.
1) Find 'xx.data.new' in this directory. "xx.data.new -> 83425". So we're working with that inode.
2) Check for occurrences of the new name. Aha. There's 'xxx.data -> 9934'. Remove it.
   (2a) This may or may not actually make the file go away, as there may be other links and/or open file references to it.)
3) The directory now only has '83425 xx.data.new -> 83425'.
4) We now change the name. The directory now has 'xx.data -> 83425'.

And your backup program quite rightly concludes that this is a new file
by 
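The rename behaviour Valdis describes is easy to reproduce from a shell; a minimal hedged demonstration (the path is illustrative) of why mmbackup's shadow file treats the result as a brand-new file:

cd /gpfs/fs1/tmp                       # any GPFS (or POSIX) directory
echo v1 > xx.data;      ls -i xx.data  # note the inode number
echo v2 > xx.data.new
mv xx.data.new xx.data                 # SAS-style update: write a new file, rename it over the old one
ls -i xx.data                          # different inode -> looks like a new file to the backup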

Re: [gpfsug-discuss] NFS on system Z

2018-05-19 Thread Olaf Weiser
Hi, yes.. CES comes along with lots of monitors for status and health checks, and a special NFS (Ganesha) code.. which is optimized for / available only on a limited choice of OS/platforms. So CES is not available for e.g. AIX, and in your case... not available for system Z ... but - of course - you can set up your own NFS server ..

From:
 KG To:      
 gpfsug main discussion
list Date:      
 05/19/2018 06:00 AMSubject:    
   [gpfsug-discuss]
NFS on system ZSent by:    
gpfsug-discuss-boun...@spectrumscale.org

Hi,
The SS FAQ says the following for system Z:
"Cluster Export Service (CES) is not supported. (Monitoring capabilities, Object, CIFS, User space implementation of NFS)"
"Kernel NFS (v3 and v4) is supported. Clustered NFS is not supported."
Does this mean we can only configure OS-based, non-redundant NFS exports from Scale nodes, without CNFS/CES?

Kiran Ghag
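For completeness, a hedged sketch of such a plain kernel-NFS export of a GPFS path on a RHEL-style node (no CES/CNFS, hence no IP failover); the path, client network and fsid are illustrative, and an explicit fsid is commonly recommended when exporting GPFS paths:

echo '/gpfs/fs1/export 10.0.0.0/24(rw,sync,no_root_squash,fsid=745)' >> /etc/exports
systemctl enable --now nfs-server
exportfs -ra
showmount -e localhost   # verify the export is visible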

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] 5.0.1.0 Update issue with python dependencies

2018-05-15 Thread Olaf Weiser
Renar, can you share what GPFS packages you tried to install? I just did a fresh 5.0.1 install and it works fine for me... even though I don't see this IBM python rpm:

[root@tlinc04 ~]# rpm -qa | grep -i openssl
openssl-1.0.2k-12.el7.x86_64
openssl-libs-1.0.2k-12.el7.x86_64
pyOpenSSL-0.13.1-3.el7.x86_64
openssl-devel-1.0.2k-12.el7.x86_64
xmlsec1-openssl-1.2.20-7.el7_4.x86_64

So I assume you installed the GUI, or scale mgmt .. let us know - thx

From:
 "Grunenberg, Renar"
To:      
 "'gpfsug-discuss@spectrumscale.org'"
Date:      
 05/15/2018 08:00 AMSubject:    
   Re: [gpfsug-discuss]
5.0.1.0 Update issue with python dependenciesSent by:    
gpfsug-discuss-boun...@spectrumscale.org

Hallo All,
here are some experiences with the update to 5.0.1.0 (from 5.0.0.2) on RHEL 7.4. After the complete yum update to this version, we had a non-functional yum command. The reason for this is the following package: pyOpenSSL-0.14-1.ibm.el7.noarch. This package breaks the yum commands. The errors are:

Loaded plugins: langpacks, product-id, rhnplugin, search-disabled-repos
Traceback (most recent call last):
  File "/bin/yum", line 29, in 
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py",
line 370, in user_main    errcode = main(args)  File "/usr/share/yum-cli/yummain.py",
line 165, in main    base.getOptionsConfig(args)  File "/usr/share/yum-cli/cli.py", line
261, in getOptionsConfig    self.conf  File "/usr/lib/python2.7/site-packages/yum/__init__.py",
line 1078, in     conf = property(fget=lambda self: self._getConfig(),  File "/usr/lib/python2.7/site-packages/yum/__init__.py",
line 420, in _getConfig    self.plugins.run('init')  File "/usr/lib/python2.7/site-packages/yum/plugins.py",
line 188, in run    func(conduitcls(self, self.base, conf, **kwargs))  File "/usr/share/yum-plugins/rhnplugin.py",
line 141, in init_hook    svrChannels = rhnChannel.getChannelDetails(timeout=timeout)  File "/usr/share/rhn/up2date_client/rhnChannel.py",
line 71, in getChannelDetails    sourceChannels = getChannels(timeout=timeout)  File "/usr/share/rhn/up2date_client/rhnChannel.py",
line 98, in getChannels    up2dateChannels = s.up2date.listChannels(up2dateAuth.getSystemId())  File "/usr/share/rhn/up2date_client/rhnserver.py",
line 63, in __call__    return rpcServer.doCall(method, *args, **kwargs)  File "/usr/share/rhn/up2date_client/rpcServer.py",
line 204, in doCall    ret = method(*args, **kwargs)  File "/usr/lib64/python2.7/xmlrpclib.py",
line 1233, in __call__    return self.__send(self.__name, args)  File "/usr/share/rhn/up2date_client/rpcServer.py",
line 38, in _request1    ret = self._request(methodname, params)  File "/usr/lib/python2.7/site-packages/rhn/rpclib.py",
line 384, in _request    self._handler, request, verbose=self._verbose)  File "/usr/lib/python2.7/site-packages/rhn/transports.py",
line 171, in request    headers, fd = req.send_http(host, handler)  File "/usr/lib/python2.7/site-packages/rhn/transports.py",
line 721, in send_http    self._connection.connect()  File "/usr/lib/python2.7/site-packages/rhn/connections.py",
line 187, in connect    self.sock.init_ssl()  File "/usr/lib/python2.7/site-packages/rhn/SSL.py",
line 90, in init_ssl    self._ctx.load_verify_locations(f)  File "/usr/lib/python2.7/site-packages/OpenSSL/SSL.py",
line 303, in load_verify_locations    raise TypeError("cafile must be None
or a byte string")TypeError: cafile must be None or a byte string My questions now: why does IBM patch here
rhel python-libaries. This goes to a update nirvana. The Dependencies does looks like this!!rpm -e pyOpenSSL-0.14-1.ibm.el7.noarcherror: Failed dependencies:        pyOpenSSL is needed by (installed)
redhat-access-insights-0:1.0.13-2.el7_3.noarch        pyOpenSSL is needed by (installed)
rhnlib-2.5.65-4.el7.noarch        pyOpenSSL
>= 0.14 is needed by (installed) python2-urllib3-1.21.1-1.ibm.el7.noarch Its PMR time. Regards Renar  Renar GrunenbergAbteilung Informatik – BetriebHUK-COBURGBahnhofsplatz96444 CoburgTelefon:09561
96-44110Telefax:09561
96-44104E-Mail:renar.grunenb...@huk-coburg.deInternet:www.huk.deHUK-COBURG Haftpflicht-Unterstützungs-Kasse
kraftfahrender Beamter Deutschlands a. G. in CoburgReg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021Sitz der Gesellschaft: Bahnhofsplatz, 96444 CoburgVorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav
Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.Diese Nachricht enthält vertrauliche und/oder
rechtlich geschützte Informationen.Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich
erhalten haben,informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.Das unerlaubte Kopieren sowie die 
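Two quick, hedged checks that can help in a situation like this (standard rpm queries only; package and path names as shown above):

# which package owns the OpenSSL module that yum is importing?
rpm -qf /usr/lib/python2.7/site-packages/OpenSSL/SSL.py
# what would break if the IBM-supplied pyOpenSSL were removed or swapped back to the RHEL one?
rpm -q --whatrequires pyOpenSSL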

Re: [gpfsug-discuss] Pool migration and replicate

2018-04-26 Thread Olaf Weiser

Hallo Simon,
the replication attributes of a file won't be changed just by the fact that the pool attribute is changed.. or in other words .. if a file gets migrated from POOLA to POOLB, that does not change the replication automatically... even if the pool consists of NSDs with multiple failure groups.
So depending on what you want to achieve.. you may need to additionally specify the REPLICATE clause, or not.. if you want to keep your replication attribute.. it should work without specifying REPLICATE ... as you said.. this statement is true for "move"... an initial placement of a file would still follow the default of the fs

cheers

From:
 "Simon Thompson
(IT Research Support)" To:      
 gpfsug main discussion
list Date:      
 04/26/2018 11:09 AMSubject:    
   [gpfsug-discuss]
Pool migration and replicateSent by:    
gpfsug-discuss-boun...@spectrumscale.org

Hi all,
We'd like to move some data from a non-replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e. just wondering if I need to include a "REPLICATE(1)" clause.
Also, if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files.

Thanks

Simon
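Putting Olaf's answer into a concrete, hedged shape (pool names and the file system name are illustrative; omitting REPLICATE keeps each file's current replication, as noted above):

cat > migrate.pol <<'EOF'
RULE 'mig' MIGRATE FROM POOL 'poolA' TO POOL 'poolB'
/* add REPLICATE(1) only if you want to force single-copy data during the move */
EOF
mmapplypolicy fs1 -P migrate.pol -I test   # dry run; use -I yes to actually move the data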

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS autoload - wait for IB ports tobecomeactive

2018-03-15 Thread Olaf Weiser

you can try:

systemctl enable NetworkManager-wait-online
ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service'

in many cases .. it helps ..

From:
 Jan-Frode Myklebust
To:      
 gpfsug main discussion
list Date:      
 03/15/2018 06:18 PMSubject:    
   Re: [gpfsug-discuss]
GPFS autoload - wait for IB ports to        becomeactiveSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgI found some discussion on this at https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=----14471957=25and there it's claimed that none of the callback events are early enough
to resolve this. That we need a pre-preStartup trigger. Any idea if this
has changed -- or is the callback option then only to do a "--onerror
shutdown" if it has failed to connect IB ?On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock 
wrote:You could also use the GPFS prestartup callback
(mmaddcallback) to execute a script synchronously that waits for the IB
ports to become available before returning and allowing GPFS to continue.
 Not systemd integrated but it should work.Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.comFrom:        david_john...@brown.eduTo:        gpfsug
main discussion list Date:        03/08/2018
07:34 AMSubject:        Re:
[gpfsug-discuss] GPFS autoload - wait for IB ports to become    
   activeSent by:        gpfsug-discuss-boun...@spectrumscale.orgUntil IBM provides a solution, here is my workaround. Add it so it runs
before the gpfs script, I call it from our custom xcat diskless boot scripts.
Based on rhel7, not fully systemd integrated. YMMV!

Regards,
  — ddj
——-
[ddj@storage041 ~]$ cat /etc/init.d/ibready
#! /bin/bash
#
# chkconfig: 2345 06 94
# /etc/rc.d/init.d/ibready
# written in 2016 David D Johnson (ddj  brown.edu)
### BEGIN INIT INFO
# Provides:             ibready
# Required-Start:
# Required-Stop:
# Default-Stop:
# Description: Block until infiniband is ready
# Short-Description: Block until infiniband is ready
### END INIT INFO
RETVAL=0
if [[ -d /sys/class/infiniband ]]
then
        IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
fi
# See how we were called.
case "$1" in
  start)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
                echo -n "Polling for InfiniBand link up: "
                for (( count = 60; count > 0; count-- ))
                do
                        if grep -q ACTIVE $IBDEVICE/state
                        then
                                echo ACTIVE
                                break
                        fi
                        echo -n "."
                        sleep 5
                done
                if (( count <= 0 ))
                then
                        echo DOWN - $0 timed out
                fi
        fi
        ;;
  stop|restart|reload|force-reload|condrestart|try-restart)
        ;;
  status)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
                echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
        else
                echo "No IBDEVICE found"
        fi
        ;;
  *)
        echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
        exit 2
esac
exit ${RETVAL}

  -- ddj
Dave Johnson

On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI)
wrote:Hi all,with autoload = yes we do not ensure that GPFS will be started after the
IB link becomes up. Is there a way to force GPFS waiting to start until
IB ports are up? This can be probably done by adding something like After=network-online.target
and Wants=network-online.target in the systemd file but I would like to
know if this is natively possible from the GPFS configuration.Thanks a lot,Marc                _Paul Scherrer Institut High Performance ComputingMarc Caubet SerrabouWHGA/0365232 Villigen PSISwitzerlandTelephone: +41
56 310 46 67E-Mail: marc.cau...@psi.ch___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttps://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=u-EMob09-dkE6jZbD3dTjBi3vWhmDXtxiOK3nqFyIgY=JCfJgq6pZnKUI6d-rIgJXVcdZh7vmA5ypB1_goP_FFA=___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___gpfsug-discuss mailing listgpfsug-discuss at 
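As a rough illustration of Fred's mmaddcallback suggestion (a sketch only -- the callback name, helper path and timeout below are invented, and as Jan-Frode notes the preStartup event may still fire later than some environments need):

    # hypothetical helper, same polling idea as the ibready script above
    cat > /usr/local/sbin/wait-for-ib.sh <<'EOF'
    #!/bin/bash
    for (( i = 0; i < 60; i++ )); do
            grep -q ACTIVE /sys/class/infiniband/*/ports/1/state 2>/dev/null && exit 0
            sleep 5
    done
    exit 1
    EOF
    chmod +x /usr/local/sbin/wait-for-ib.sh

    # run it synchronously when the daemon starts; --onerror shutdown stops
    # GPFS on that node if the link never comes up
    mmaddcallback ibready-wait --command /usr/local/sbin/wait-for-ib.sh \
            --event preStartup --sync --timeout 300 --onerror shutdown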

Re: [gpfsug-discuss] Underlying LUN mirroring NSD impact

2018-03-14 Thread Olaf Weiser
HI Mark.. yes.. that's possible... at least, I'm sure there was a chapter in the former advanced admin guide of older releases with PPRC .. how to do that.. similar to PPRC, you might use other methods, but from the GPFS perspective this shouldn't make a difference.. and I have had a German customer who was doing this for years... (but it is some years back meanwhile ... hihi time flies...)

From: Mark Bush
To: "gpfsug-discuss@spectrumscale.org"
Date: 03/14/2018 09:11 PM
Subject: [gpfsug-discuss] Underlying LUN mirroring NSD impact
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Is it possible (albeit not advisable) to mirror LUNs that are NSDs to another storage array in another site, basically for DR purposes? Once it's mirrored to a new cluster elsewhere, what would be the steps to get the filesystem back up and running? I know that AFM-DR is meant for this, but in this case my client only has Standard edition and has mirroring software purchased with the underlying disk array.

Is this even doable?

Mark
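For the archive: the mechanism the old PPRC-based DR chapter leans on is mmfsctl syncFSconfig, which pushes the file system definition to the recovery cluster so that the mirrored LUNs can be brought up there. A heavily simplified sketch (device name and node file are placeholders; the actual failover and failback steps depend on the replication product and belong to the DR documentation):

    # on the production cluster, after any file system configuration change:
    # copy the definition of gpfs0 to the contact nodes of the recovery cluster
    mmfsctl gpfs0 syncFSconfig -n /var/mmfs/etc/remote.nodes

    # remote.nodes is just a plain list of recovery-cluster node names, one per line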

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] 100G RoCEE and Spectrum Scale Performance

2018-03-07 Thread Olaf Weiser
HI Doug,
I did some compares with gpfsperf ... between IB and 100GbE .. but we used the 100GbE with RoCE .. so my results might not be representative for you .. (don't wonder about edited hostnames .. it's from a real customer environment..)

so with real data workload.. it is nearly the same... ~ 6 ms

Infiniband:
[root@ ~]# mmdsh -N ,,,n "tsqosperf read seq /gpfs/sase1/testdev/\$(hostname) -n 100g -r 4m -th 8 -dio" | grep "rate was"
n  Data rate was 5057593.05 Kbytes/sec, Op Rate was 1205.82 Ops/sec, Avg Latency was 6.557 milliseconds, thread utilization 0.988, bytesTransferred 107374182400
n  Data rate was 5046234.10 Kbytes/sec, Op Rate was 1203.12 Ops/sec, Avg Latency was 6.576 milliseconds, thread utilization 0.989, bytesTransferred 107374182400
n  Data rate was 4988625.75 Kbytes/sec, Op Rate was 1189.38 Ops/sec, Avg Latency was 6.557 milliseconds, thread utilization 0.975, bytesTransferred 107374182400
n  Data rate was 4136019.23 Kbytes/sec, Op Rate was 986.10 Ops/sec, Avg Latency was 7.995 milliseconds, thread utilization 0.985, bytesTransferred 107374182400

100GbE RoCE:
[root@bb1gssio1 ~]# mmdsh -N c09n1,c09n2,c09n3,c09n4 "tsqosperf read seq /gpfs/gpfs0/\$(hostname) -n 100g -r 4m -th 8 -dio" | grep "rate was"
C09n1  Data rate was 5350528.27 Kbytes/sec, Op Rate was 1275.67 Ops/sec, Avg Latency was 6.242 milliseconds, thread utilization 0.995, bytesTransferred 107374182400
C09n2  Data rate was 4964347.14 Kbytes/sec, Op Rate was 1183.59 Ops/sec, Avg Latency was 6.743 milliseconds, thread utilization 0.998, bytesTransferred 107374182400
C09n3  Data rate was 4857516.69 Kbytes/sec, Op Rate was 1158.12 Ops/sec, Avg Latency was 6.893 milliseconds, thread utilization 0.998, bytesTransferred 107374182400
C09n4  Data rate was 4829485.95 Kbytes/sec, Op Rate was 1151.44 Ops/sec, Avg Latency was 6.929 milliseconds, thread utilization 0.997, bytesTransferred 107374182400

with Mellanox ib_read_lat .. the picture looks different: ~ 2.5 usec versus 4.6 usec ...

Infiniband:
[root@nn ~]# ib_read_lat n
---------------------------------------------------------------------------
                    RDMA_Read Latency Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 1
 Mtu             : 2048[B]
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------
 local address:  LID 0x68 QPN 0xd751 PSN 0xadfaf8 OUT 0x10 RKey 0x1001411b VAddr 0x007fca3197
 remote address: LID 0x58 QPN 0x09d6 PSN 0x6a6c59 OUT 0x10 RKey 0x081392 VAddr 0x003fff879d
---------------------------------------------------------------------------
 #bytes  #iterations  t_min[usec]  t_max[usec]  t_typical[usec]  t_avg[usec]  t_stdev[usec]  99% percentile[usec]  99.9% percentile[usec]
 2       1000         2.40         18.12        2.46             2.48         0.28           2.57                  18.12
---------------------------------------------------------------------------

RoCE:
root@dc18n2:~# ib_read_lat 10.10.4.1
---------------------------------------------------------------------------
                    RDMA_Read Latency Test
 Dual-port       : OFF          Device         : mlx5_1
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 3
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------
 local address:  LID  QPN 0x0e7e PSN 0x51b972 OUT 0x10 RKey 0x089b6f VAddr 0x007fe80cd11000 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:19:02:18
 remote address: LID  QPN 0x0d02 PSN 0xdc9761 OUT 0x10 RKey 0x008142 VAddr 0x003fffb24f GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:10:04:01
---------------------------------------------------------------------------
 #bytes  #iterations  t_min[usec]  t_max[usec]  t_typical[usec]  t_avg[usec]  t_stdev[usec]  99% percentile[usec]  99.9% percentile[usec]
 2       1000         4.55         7.12         4.61             4.62         0.05           4.82                  7.12
---------------------------------------------------------------------------

From: Douglas Duckworth
Date: 03/06/2018 08:00 PM
Subject: [gpfsug-discuss] 100G RoCEE and Spectrum Scale Performance
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi

We are currently running Spectrum Scale over FDR Infiniband.
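If the goal is to run the Scale daemon traffic itself over the 100GbE ports with RoCE (rather than only benchmarking the fabric), the relevant knobs are the verbs settings. A hedged sketch, with the device/port and node class as placeholders and a daemon restart needed for verbsPorts changes:

    mmchconfig verbsRdma=enable,verbsRdmaCm=enable,verbsPorts="mlx5_1/1" -N nsdNodes
    # after restarting mmfsd on those nodes, confirm what was picked up
    mmlsconfig | grep -i verbs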

Re: [gpfsug-discuss] tscCmdPortRange question

2018-03-06 Thread Olaf Weiser
this parameter is just for administrative commands.. "where" to send the output of a command... and for those admin ports .. so called ephemeral ports... it depends how many admin commands (= sessions = sockets) you want to run in parallel. in my experience.. 10 ports is more than enough. we use those in a range of 50000-50010

to be clear .. daemon-to-daemon communication always uses 1191

cheers

From: "Simon Thompson (IT Research Support)"
To: "gpfsug-discuss@spectrumscale.org"
Date: 03/06/2018 06:55 PM
Subject: [gpfsug-discuss] tscCmdPortRange question
Sent by: gpfsug-discuss-boun...@spectrumscale.org

We are looking at setting a value for tscCmdPortRange so that we can apply firewalls to a small number of GPFS nodes in one of our clusters.

The docs don't give an indication on the number of ports that are required to be in the range. Could anyone make a suggestion on this?

It doesn't appear as a parameter for "mmchconfig -i", so I assume that it requires the nodes to be restarted, however I'm not clear if we could do a rolling restart on this?

Thanks

Simon
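A hedged sketch of what that looks like in practice (the range is arbitrary; firewalld syntax assumed):

    # reserve a small ephemeral range for mm command traffic
    mmchconfig tscCmdPortRange=50000-50010
    # daemon-to-daemon traffic stays on 1191/tcp, so allow both
    firewall-cmd --permanent --add-port=1191/tcp
    firewall-cmd --permanent --add-port=50000-50010/tcp
    firewall-cmd --reload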

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] storage-based replication for Spectrum Scale

2018-01-25 Thread Olaf Weiser
yes... to add some more details ... even though it might be very theoretical that only some nodes from the foreign cluster will suffer from connection issues, the rule for reacting upon an expel request is:

A) if the requested node is really unreachable (or in trouble) ... the node will lose the disk lease after the next lease period anyway - so the clsmgr will expel that node

B) if the node is (from the perspective of the clsmgr) still healthy ... then the clsmgr decides something like this: the "remote" nodes get expelled first.

please keep in mind... even your local file system / disk access might get in trouble if token revokes / token messages can't be sent or are delayed ..

From: "Achim Rehor"
To: gpfsug main discussion list
Date: 01/25/2018 11:24 AM
Subject: Re: [gpfsug-discuss] storage-based replication for Spectrum Scale
Sent by: gpfsug-discuss-boun...@spectrumscale.org

John,

yes, they definitely can! Nodes in a remote cluster are to be viewed just as local nodes in terms of taking part in the mechanisms of access to data. Token management will be done just as with local nodes.

So if one node in cluster A recognizes a communication issue with a node in cluster B, it will let the cluster manager know, and that one then decides on whether to expel one or the other.

Having a remote cluster connected relies on a stable and low latency network, just as a local cluster does. If your network is not reliable, you would go for AFM or other replication mechanisms (as the thread title implies ;) )

Mit freundlichen Grüßen / Kind regards

Achim Rehor
Software Technical Support Specialist AIX / EMEA HPC Support
IBM Certified Advanced Technical Expert - Power Systems with AIX
TSCC Software Service, Dept. 7922, Global Technology Services
IBM Deutschland, Am Weiher 24, 65451 Kelsterbach, Germany
achim.re...@de.ibm.com

From: John Hearns
To: gpfsug main discussion list
Date: 25/01/2018 10:53
Subject: Re: [gpfsug-discuss] storage-based replication for Spectrum Scale
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Jan Frode, thank you for that link. I have a general question regarding remote GPFS filesystems.

If we have two clusters, in separate locations on separate Infiniband fabrics, we can set up a remote relationship between filesystems. As Valdis discusses, what happens if the IP link between the clusters goes down or is unstable? Can nodes in one cluster vote out nodes in the other cluster?

From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Jan-Frode Myklebust
Sent: Wednesday, January 24, 2018 8:08 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] storage-based replication for Spectrum Scale

Have you seen https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_dr.htm? Seems to cover what you're looking for..

-jf

On Wed, 24 Jan 2018 at 07:33, Harold Morales wrote:
Thanks for answering. Essentially, the idea being explored is to replicate LUNs between identical storage hardware (HP 3PAR volumes) on both sites. There is an IP connection between the storage boxes but not between the servers on both sites; there is a dark fiber connecting both sites. Here they don't want to explore the idea of a Scale-based solution.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
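Since the two-cluster question keeps coming up in this thread: the remote mount relationship itself is established with mmauth / mmremotecluster / mmremotefs. A compressed, hedged sketch (cluster, file system and node names are invented):

    # on the owning cluster (clusterA)
    mmauth genkey new
    mmauth update . -l AUTHONLY
    mmauth add clusterB -k /tmp/clusterB_id_rsa.pub
    mmauth grant clusterB -f gpfs0 -a ro

    # on the accessing cluster (clusterB)
    mmremotecluster add clusterA -n nodeA1,nodeA2 -k /tmp/clusterA_id_rsa.pub
    mmremotefs add rgpfs0 -f gpfs0 -C clusterA -T /gpfs/rgpfs0
    mmmount rgpfs0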

Re: [gpfsug-discuss] pmcollector and NSD perf

2017-12-19 Thread Olaf Weiser
Hi Mark, I think what you'll need is to set name = "GPFSDisk" - that sensor should report the utilization of the directly attached disks.

cheers
olaf

From: Mark Bush
To: gpfsug main discussion list
Date: 12/19/2017 04:50 PM
Subject: Re: [gpfsug-discuss] pmcollector and NSD perf
Sent by: gpfsug-discuss-boun...@spectrumscale.org

It appears number 3 on your list is the case. My nodes are all SAN connected, and until I get separate CES nodes no NSD is necessary (currently run CES on the NSD servers - just for a test cluster).

Mark

From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Markus Rohwedder
Sent: Tuesday, December 19, 2017 9:24 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] pmcollector and NSD perf

Hello Mark,

the NSD sensor is GPFSNSDDisk. Some things to check:

1. Is the sensor activated?
In a GPFS managed sensor config you should be able to see something like this when you call mmperfmon config show:
{name = "GPFSNSDDisk"
period = 10
restrict = "nsdNodes"
},

2. Perfmon designation
The NSD server nodes should have the perfmon designation.
[root@cache-41 ~]# mmlscluster
GPFS cluster information
GPFS cluster name: gpfsgui-cluster-4.localnet.com
GPFS cluster id: 10583479681538672379
GPFS UID domain: localnet.com
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR

Node  Daemon node name       IP address   Admin node name        Designation
--
1     cache-41.localnet.com  10.0.100.41  cache-41.localnet.com  quorum-perfmon
2     cache-42.localnet.com  10.0.100.42  cache-42.localnet.com  quorum-gateway-perfmon
3     cache-43.localnet.com  10.0.100.43  cache-43.localnet.com  gateway-perfmon

3. Direct disk writes?
One reason why there may be no data on your system is if you are not using the NSD protocol, meaning the clients can directly write to disk as in a SAN environment. In this case the sensor does not catch the transactions.

4. Cross cluster mount
Or maybe you are using a cross cluster mount.

Mit freundlichen Grüßen / Kind regards
Dr. Markus Rohwedder
Spectrum Scale GUI Development
IBM Deutschland, Am Weiher 24, 65451 Kelsterbach, Germany
rohwed...@de.ibm.com

From: Mark Bush
To: "gpfsug-discuss@spectrumscale.org"
Date: 12/19/2017 03:30 PM
Subject: [gpfsug-discuss] pmcollector and NSD perf
Sent by: gpfsug-discuss-boun...@spectrumscale.org

I've noticed this in my test cluster both in 4.2.3.4 and 5.0.0.0: in the GUI, on the monitoring screen with the default view, the NSD Server Throughput graph shows "Performance Collector did not return any data". I've seen that for other items (SMB before, for example) but never for NSD. Is there something that must be enabled in the zimon sensor or collector config file to grab this, or is this a bug?

Mark

___
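A hedged sketch of checking and enabling the sensor Olaf refers to for SAN-attached (non-NSD-protocol) nodes; the sensor names are as in Markus' list, the period value is arbitrary:

    # is the local disk sensor configured, and with which period?
    mmperfmon config show | grep -A 2 GPFSDisk
    # enable it (a period of 0 means the sensor is off)
    mmperfmon config update GPFSDisk.period=10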
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS long waiter

2017-11-16 Thread Olaf Weiser

even though I think this is something to open a PMR for .. you might help yourself by finding pending messages to this node. so check the mmfsadm dump tscomm ... output on that node. if you find pending messages to a specific node.. go to that node and debug further.. if it is not an important node.. you might solve your situation with an mmshutdown/mmstartup of the daemon.

but as said before... I recommend to open a PMR and get help from the support team ..

From: Ahmad El Khouly
To: "gpfsug-discuss@spectrumscale.org"
Date: 11/16/2017 01:41 PM
Subject: [gpfsug-discuss] GPFS long waiter
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello all

I'm facing a long waiter issue and I could not find any way to clear it. I can see all filesystems are responsive and look normal, but I cannot perform any GPFS commands like mmdf, or add or remove any vdisk. Could you please advise how to show more details about this waiter and which pool it is talking about… and any workaround to clear it.

0x7FA0446BF1A0 (  27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery'

Ahmed M. Elkhouly
Systems Administrator, Scientific Computing
Bioinformatics Division
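For the "show more details" part of the question, a hedged sketch of the usual first diagnostics (the grep pattern is only an example):

    # longest waiters on this node
    mmdiag --waiters
    # per Olaf's hint: look for pending RPCs to a specific node
    mmfsadm dump tscomm | grep -i pending
    # which node is currently the file system manager for the affected fs
    mmlsmgr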

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Write performances and filesystem size

2017-11-16 Thread Olaf Weiser
sf_g_01_logTip        2WayReplication   NVR    48 MiB   2 MiB   4096     ok   logTip
sf_g_01_logTipBackup  Unreplicated      SSD    48 MiB   2 MiB   4096     ok   logTipBackup
sf_g_01_logHome       4WayReplication   DA1   144 GiB   2 MiB   4096     ok   log
sf_g_01_vdisk02       3WayReplication   DA1   103 GiB   1 MiB   32 KiB   ok
sf_g_01_vdisk07       3WayReplication   DA1   103 GiB   1 MiB   32 KiB   ok
sf_g_01_vdisk01       8+2p              DA1   270 TiB  16 MiB   32 KiB   ok

config data         declustered array   spare space   remarks
------------------  ------------------  ------------  ----------------------------------
rebuild space       DA1                 68 pdisk      increasing VCD spares is suggested

config data         disk group fault tolerance   remarks
------------------  ---------------------------  --------------------------
rg descriptor       1 node + 3 pdisk             limited by rebuild space
system index        1 node + 3 pdisk             limited by rebuild space

vdisk                 disk group fault tolerance   remarks
--------------------  ---------------------------  --------------------------
sf_g_01_logTip        1 pdisk
sf_g_01_logTipBackup  0 pdisk
sf_g_01_logHome       1 node + 2 pdisk             limited by rebuild space
sf_g_01_vdisk02       1 node + 1 pdisk             limited by rebuild space
sf_g_01_vdisk07       1 node + 1 pdisk             limited by rebuild space
sf_g_01_vdisk01       2 pdisk

Thanks,
Ivano

On 16/11/17 13:03, Olaf Weiser wrote:
> Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks, in each of your tests 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. ..
>
> You mean something about vdisk layout. ..
> So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right?
>
> What about MD .. did you create a separate vdisk for MD / what size then?
>
> Sent from IBM Verse
>
> Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size ---
> From: "Ivano Talamo" <ivano.tal...@psi.ch>
> To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
> Date: Thu 16.11.2017 03:49
> Subject: Re: [gpfsug-discuss] Write performances and filesystem size
>
> Hello Olaf,
>
> yes, I confirm that it is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total.
>
> Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size.
>
> Regarding the layout allocation we used scatter.
>
> The tests were done on the just-created filesystem, so no close-to-full effect. And we ran gpfsperf write seq.
>
> Thanks,
> Ivano
>
> On 16/11/17 04:42, Olaf Weiser wrote:
>> Sure... as long as we assume that really all physical disks are used .. the fact that was told 1/2 or 1/4 might turn out that one / two complete enclosures are eliminated ...? ..that's why I was asking for more details ..
>>
>> I don't see this degradation in my environments. As long as the vdisks are big enough to span over all pdisks (which should be the case for capacity in a range of TB) ... the performance stays the same
>>
>> Sent from IBM Verse
>>
>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size ---
>> From: "Jan-Frode Myklebust" <janfr...@tanso.net>
>> To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
>> Date: Wed 15.11.2017 21:35
>> Subject: Re: [gpfsug-discuss] Write performances and filesystem size
>>
>> Olaf, this looks like a Lenovo «ESS GLxS» version. Should be using the same number of spindles for any size filesystem, so I would also expect them to perform the same.
>>
>> -jf
>>
>> On Wed, 15 Nov 2017 at 11:26, Olaf Weiser <olaf.wei...@de.ibm.com> wrote:
>>
>>     to add a comment ... .. very simply... depending on how you allocate the physical block storage ... if you - simply - use less physical resources when reducing the capacity (in the same ratio) .. you get what you see
>>
>>     so you need to tell us how you allocate your block-storage .. (Do you use RAID controllers, where are your LUNs comin

Re: [gpfsug-discuss] Write performances and filesystem size

2017-11-16 Thread Olaf Weiser
Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks, in each of your tests 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. ..

You mean something about vdisk layout. ..
So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right?

What about MD .. did you create a separate vdisk for MD / what size then?

Sent from IBM Verse

Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size ---
From: "Ivano Talamo" <ivano.tal...@psi.ch>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: Thu 16.11.2017 03:49
Subject: Re: [gpfsug-discuss] Write performances and filesystem size

Hello Olaf,
yes, I confirm that it is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just-created filesystem, so no close-to-full effect. And we ran gpfsperf write seq.
Thanks,
Ivano

On 16/11/17 04:42, Olaf Weiser wrote:
> Sure... as long as we assume that really all physical disks are used .. the fact that was told 1/2 or 1/4 might turn out that one / two complete enclosures are eliminated ...? ..that's why I was asking for more details ..
>
> I don't see this degradation in my environments. As long as the vdisks are big enough to span over all pdisks (which should be the case for capacity in a range of TB) ... the performance stays the same
>
> Sent from IBM Verse
>
> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size ---
> From: "Jan-Frode Myklebust" <janfr...@tanso.net>
> To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
> Date: Wed 15.11.2017 21:35
> Subject: Re: [gpfsug-discuss] Write performances and filesystem size
>
> Olaf, this looks like a Lenovo «ESS GLxS» version. Should be using the same number of spindles for any size filesystem, so I would also expect them to perform the same.
>
> -jf
>
> On Wed, 15 Nov 2017 at 11:26, Olaf Weiser <olaf.wei...@de.ibm.com> wrote:
>
>     to add a comment ... .. very simply... depending on how you allocate the physical block storage ... if you - simply - use less physical resources when reducing the capacity (in the same ratio) .. you get what you see
>
>     so you need to tell us how you allocate your block-storage .. (Do you use RAID controllers, where are your LUNs coming from, are then fewer RAID groups involved when reducing the capacity?...)
>
>     GPFS can be configured to give you pretty much what the hardware can deliver.. if you reduce resources ... you'll get less; if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks ..
>
>     From: "Kumaran Rajaram" <k...@us.ibm.com>
>     To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
>     Date: 11/15/2017 11:56 AM
>     Subject: Re: [gpfsug-discuss] Write performances and filesystem size
>     Sent by: gpfsug-discuss-boun...@spectrumscale.org
>
>     Hi,
>
>     >> Am I missing something? Is this an expected behaviour and someone has an explanation for this?
>
>     Based on your scenario, write degradation as the file system is populated is possible if you had formatted the file system with "-j cluster".
>
>     For consistent file system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly.
>
>     [snip from mmcrfs]
>     # mmlsfs  | egrep 'Block allocation| Estimated number'
>     -j scatter  Block allocation type
>     -n 128      Estimated number of nodes that will mount file system
>     [/snip]
>
>     [snip from man mmcrfs]
>     layoutMap={scatter|cluster}
>       Specifies the block allocation map type. When al
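A hedged, compressed version of that check plus the test used in this thread, for an existing file system (device and path are placeholders; gpfsperf ships as a sample program and has to be built once under /usr/lpp/mmfs/samples/perf):

    # confirm the block allocation map type (-j) and the -n estimate
    mmlsfs gpfs0 -j -n
    # sequential write test similar to the one Ivano ran
    /usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/gpfs0/testfile -n 100g -r 16m -th 8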

Re: [gpfsug-discuss] nsdperf crash testing RDMA between Power BE and Intel nodes

2017-10-24 Thread Olaf Weiser
Hi Falk, can you open a PMR for it .. it should be investigated in detail

From: "Uwe Falke"
To: gpfsug main discussion list
Date: 10/24/2017 06:49 PM
Subject: [gpfsug-discuss] nsdperf crash testing RDMA between Power BE and Intel nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi,

I am about to run nsdperf for testing the IB fabric in a new system comprising ESS (BE) and Intel-based nodes.

nsdperf crashes reliably when invoking ESS nodes and x86-64 nodes in one test using RDMA:

client    server    RDMA    result
x86-64    ppc-64    on      crash
ppc-64    x86-64    on      crash
x86-64    ppc-64    off     success
x86-64    x86-64    on      success
ppc-64    ppc-64    on      success

That implies that the nsdperf RDMA test might struggle with BE vs LE. However, I learned from a talk given at a GPFS workshop in Germany in 2015 that RDMA works between Power-BE and Intel boxes. Has anyone made similar or contrary experiences? Is it an nsdperf issue or more general (I have not yet attempted any GPFS mount)?

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
IBM Deutschland, Rathausstr. 7, 09111 Chemnitz
Phone: +49 371 6978 2165, Mobile: +49 175 575 2877
E-Mail: uwefa...@de.ibm.com
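For anyone wanting to reproduce this: nsdperf ships as source under /usr/lpp/mmfs/samples/net and is driven interactively. A rough, hedged sketch from memory -- the build line and command names may differ slightly by release, so check the header comment in nsdperf.C and the built-in help:

    # build once per architecture; -DRDMA enables the verbs test path
    cd /usr/lpp/mmfs/samples/net
    g++ -O2 -DRDMA -o nsdperf nsdperf.C -lpthread -lrt -libverbs -lrdmacm

    # start it in server mode on the nodes under test
    ./nsdperf -s

    # then, from a driver node, something like:
    ./nsdperf
    > server ess-io1
    > client x86-node1
    > rdma on
    > test
    > quit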

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RoCE not playing ball

2017-09-19 Thread Olaf Weiser
is ib_read_bw working? just test it between the two nodes ...

From: Barry Evans
To: gpfsug main discussion list
Date: 09/20/2017 03:21 AM
Subject: [gpfsug-discuss] RoCE not playing ball
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,

Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up:

2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes
2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized.
2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)).
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node  0 pkey[0] 0x gid[0] subnet 0xFE80 id 0x268A07FFFEF981C0 state ACTIVE
2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node  0 pkey[0] 0x gid[1] subnet 0x id 0xAC106404 state ACTIVE
2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node  0 pkey[0] 0x gid[0] subnet 0xFE80 id 0x248A070001F981E1 state DOWN
2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node  0 pkey[0] 0x gid[0] subnet 0xFE80 id 0x268A07FFFEF981C0 state ACTIVE
2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node  0 pkey[0] 0x gid[1] subnet 0x id 0xAC106404 state ACTIVE
2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node  0 pkey[0] 0x gid[0] subnet 0xFE80 id 0x248A070001F981C1 state ACTIVE
2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node  0 pkey[0] 0x gid[1] subnet 0x id 0x0AC20011 state ACTIVE
2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1
2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error   verbsPort mlx4_0/1   ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set
2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1
2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded.
2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined.

Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time.

Cheers,
Barry
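A hedged sketch of Olaf's suggestion to rule the fabric itself out first (perftest flags from memory: -d device, -i port, -R to go through rdma_cm, which is what verbsRdmaCm uses; the address is a placeholder for the server side's RoCE interface IP):

    # on node A (acts as server)
    ib_read_bw -d mlx4_0 -i 1 -R
    # on node B (client), pointing at node A's RoCE interface address
    ib_read_bw -d mlx4_0 -i 1 -R 172.16.100.4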

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] multicluster security

2017-09-09 Thread Olaf Weiser
HI Aaron, not sure if we are ready to talk / distribute pNFS 4.1 experiences here.. I know one customer doing pNFS, and for myself, we did a lot of testing here. please contact me directly .. let's see how I can help ..

From: Aaron Knister
Date: 09/08/2017 11:14 PM
Subject: Re: [gpfsug-discuss] multicluster security
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Interesting! Thank you for the explanation.

This makes me wish GPFS had a client access model that more closely mimicked parallel NAS, specifically for this reason. That then got me wondering about pNFS support. I've not been able to find much about that, but in theory Ganesha supports pNFS. Does anyone know of successful pNFS testing with GPFS and if so how one would set up such a thing?

-Aaron

On 08/25/2017 06:41 PM, IBM Spectrum Scale wrote:
Hi Aaron,

If cluster A uses the mmauth command to grant a file system read-only access to a remote cluster B, nodes on cluster B can only mount that file system with read-only access. But the only checking being done at the RPC level is the TLS authentication. This should prevent non-root users from initiating RPCs, since TLS authentication requires access to the local cluster's private key. However, a root user on cluster B, having access to cluster B's private key, might be able to craft RPCs that may allow one to work around the checks which are implemented at the file system level.

Regards, The Spectrum Scale (GPFS) team

From: Aaron Knister
To: gpfsug main discussion list
Date: 08/21/2017 11:04 PM
Subject: [gpfsug-discuss] multicluster security
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi Everyone,

I have a theoretical question about GPFS multiclusters and security. Let's say I have clusters A and B. Cluster A is exporting a filesystem as read-only to cluster B.

Where does the authorization burden lay? Meaning, does the security rely on mmfsd in cluster B to behave itself and enforce the conditions of the multi-cluster export? Could someone using the credentials on a compromised node in cluster B just start sending arbitrary nsd read/write commands to the nsds from cluster A (or something along those lines)? Do the NSD servers in cluster A do any sort of sanity or security checking on the I/O requests coming from cluster B to the NSDs they're serving to exported filesystems?

I imagine any enforcement would go out the window with shared disks in a multi-cluster environment since a compromised node could just "dd" over the LUNs.

Thanks!
-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] gpfs filesystem heat reporting, howto setup

2017-06-01 Thread Olaf Weiser
Hi Andreas, one could use the WEIGHT statement ... a simple policy, e.g.:

rule 'repack' MIGRATE FROM POOL 'xx' TO POOL '' WEIGHT(FILE_HEAT)

and then the -I prepare to see what would be done by the policy.. or you use the LIST function .. or .. and so on ..

From: Andreas Landhäußer
To: gpfsug-discuss@spectrumscale.org
Date: 06/01/2017 11:36 AM
Subject: [gpfsug-discuss] gpfs filesystem heat reporting, howto setup
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello all out there,

customer wants to receive periodical reports on the filesystem heat (relative age) of files. We already switched on fileheat using mmchconfig:

mmchconfig fileheatlosspercent=10,fileHeatPeriodMinutes=1440

for the reports, I think I need to know the file usage in a given time period. Are there any how-tos for obtaining this information, or examples of ILM policies to be used as a start?

any help will be highly appreciated.

Best regards

                 Andreas

--
Andreas Landhäußer                +49 151 12133027 (mobile)
alandhae@gmx.de
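A hedged sketch of the LIST variant Olaf mentions, used as a pure report (file system name and output prefix are placeholders; check the exact naming of the generated list file on your release):

    cat > /tmp/heat_report.pol <<'EOF'
    /* write path plus FILE_HEAT for every file into a list; no data is moved */
    RULE 'heatlist' LIST 'heat' SHOW(varchar(FILE_HEAT)) WHERE FILE_SIZE > 0
    EOF

    # -I defer only produces the candidate lists
    mmapplypolicy gpfs0 -P /tmp/heat_report.pol -I defer -f /tmp/heatreport -L 1
    # the report then typically lands in /tmp/heatreport.list.heat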

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Olaf Weiser
HI Kevin, the number of NSDs is more or less nonsense .. it is just that the number of nodes x PITWorker should not exceed the #mutexes/FS block by too much. did you adjust/tune the PITWorker?

... as far as I know.. the fact that the code checks the number of NSDs is already considered a defect and will be fixed / is already fixed (I stepped into it here as well)

ps. QOS is the better approach to address this, but unfortunately.. not everyone is using it by default... that's why I suspect the development decided to put in a check/limit here .. which in your case (with QOS) wouldn't be needed

From: "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 05/04/2017 05:44 PM
Subject: Re: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi Olaf,

Your explanation mostly makes sense, but...

Failed with 4 nodes … failed with 2 nodes … not gonna try with 1 node. And this filesystem only has 32 disks, which I would imagine is not an especially large number compared to what some people reading this e-mail have in their filesystems.

I thought that QOS (which I'm using) was what would keep an mmrestripefs from overrunning the system … QOS has worked extremely well for us - it's one of my favorite additions to GPFS.

Kevin

On May 4, 2017, at 10:34 AM, Olaf Weiser <olaf.wei...@de.ibm.com> wrote:

no.. it is just in the code, because we have to avoid running out of mutexes / block. reduce the number of nodes -N down to 4 (2 nodes is even safer) ... that is the easiest way to solve it for now. I've been told the real root cause will be fixed in one of the next PTFs .. within this year ..

this warning message itself should appear every time.. but unfortunately someone coded it so that it depends on the number of disks (NSDs).. that's why I suspect you didn't see it before. but the fact that we have to make sure not to overrun the system by mmrestripe remains.. so please lower the -N number of nodes to 4 or better 2 (even though we know the mmrestripe will then take longer)

From: "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 05/04/2017 05:26 PM
Subject: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,

Another one of those, "I can open a PMR if I need to" type questions…

We are in the process of combining two large GPFS filesystems into one new filesystem (for various reasons I won't get into here). Therefore, I'm doing a lot of mmrestripefs's, mmdeldisk's, and mmadddisk's.

Yesterday I did an "mmrestripefs  -r -N " (after suspending a disk, of course). Worked like it should.

Today I did an "mmrestripefs  -b -P capacity -N " and got:

mmrestripefs: The total number of PIT worker threads of all participating nodes has been exceeded to safely restripe the file system. The total number of PIT worker threads, which is the sum of pitWorkerThreadsPerNode of the participating nodes, cannot exceed 31. Reissue the command with a smaller set of participating nodes (-N option) and/or lower the pitWorkerThreadsPerNode configure setting. By default the file system manager node is counted as a participating node.
mmrestripefs: Command failed. Examine previous error messages to determine cause.

So there must be some difference in how the "-r" and "-b" options calculate the number of PIT worker threads. I did an "mmfsadm dump all | grep pitWorkerThreadsPerNode" on all 8 NSD servers and the filesystem manager node … they all say the same thing:

   pitWorkerThreadsPerNode 0

Hmmm, so 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 > 31?!? I'm confused...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615) 875-9633
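A hedged sketch of the workaround being described (node names are placeholders; the pitWorkerThreadsPerNode change may need the daemon restarted on those nodes before it takes effect):

    # lower the per-node PIT worker count on the helpers, then restripe with only a few nodes
    mmchconfig pitWorkerThreadsPerNode=8 -N nsd01,nsd02
    mmrestripefs gpfs0 -b -P capacity -N nsd01,nsd02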

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Well, this is the pits...

2017-05-04 Thread Olaf Weiser
no.. it is just in the code, because we have to avoid running out of mutexes / block.

reduce the number of nodes -N down to 4 (2 nodes is even safer) ... that is the easiest way to solve it for now. I've been told the real root cause will be fixed in one of the next PTFs .. within this year ..

this warning message itself should appear every time.. but unfortunately someone coded it so that it depends on the number of disks (NSDs).. that's why I suspect you didn't see it before. but the fact that we have to make sure not to overrun the system by mmrestripe remains.. so please lower the -N number of nodes to 4 or better 2 (even though we know the mmrestripe will then take longer)

From: "Buterbaugh, Kevin L"
To: gpfsug main discussion list
Date: 05/04/2017 05:26 PM
Subject: [gpfsug-discuss] Well, this is the pits...
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,

Another one of those, "I can open a PMR if I need to" type questions…

We are in the process of combining two large GPFS filesystems into one new filesystem (for various reasons I won't get into here). Therefore, I'm doing a lot of mmrestripefs's, mmdeldisk's, and mmadddisk's.

Yesterday I did an "mmrestripefs  -r -N " (after suspending a disk, of course). Worked like it should.

Today I did an "mmrestripefs  -b -P capacity -N " and got:

mmrestripefs: The total number of PIT worker threads of all participating nodes has been exceeded to safely restripe the file system. The total number of PIT worker threads, which is the sum of pitWorkerThreadsPerNode of the participating nodes, cannot exceed 31. Reissue the command with a smaller set of participating nodes (-N option) and/or lower the pitWorkerThreadsPerNode configure setting. By default the file system manager node is counted as a participating node.
mmrestripefs: Command failed. Examine previous error messages to determine cause.

So there must be some difference in how the "-r" and "-b" options calculate the number of PIT worker threads. I did an "mmfsadm dump all | grep pitWorkerThreadsPerNode" on all 8 NSD servers and the filesystem manager node … they all say the same thing:

   pitWorkerThreadsPerNode 0

Hmmm, so 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 > 31?!? I'm confused...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615) 875-9633

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Tiebreaker disk question

2017-05-04 Thread Olaf Weiser
this configuration (2 nodes and tiebreaker) is not designed to survive node and disk failures at the same time... this depends on where the cluster manager and the filesystem manager run .. when a node and half of the disks disappear at the same time...

for a real active-active configuration you may consider https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_actact.htm

From: Jan-Frode Myklebust
To: gpfsug main discussion list
Date: 05/04/2017 07:27 AM
Subject: Re: [gpfsug-discuss] Tiebreaker disk question
Sent by: gpfsug-discuss-boun...@spectrumscale.org

This doesn't sound like normal behaviour. It shouldn't matter which filesystem your tiebreaker disks belong to. I think the failure was caused by something else, but I am not able to guess from the little information you posted.. The mmfs.log will probably tell you the reason.

-jf

On Wed, 3 May 2017 at 19:08, Shaun Anderson wrote:

We noticed some odd behavior recently. I have a customer with a small Scale (with Archive on top) configuration that we recently updated to a dual node configuration. We are using CES and set up a very small 3-NSD shared-root filesystem (gpfssr). We also set up tiebreaker disks and figured it would be ok to use the gpfssr NSDs for this purpose.

When we tried to perform some basic failover testing, both nodes came down. It appears from the logs that when we initiated the node failure (via mmshutdown command... not great, I know) it unmounts and remounts the shared-root filesystem. When it did this, the cluster lost access to the tiebreaker disks, figured it had lost quorum, and the other node came down as well.

We got around this by changing the tiebreaker disks to our other normal gpfs filesystem. After that, failover worked as expected. This is documented nowhere as far as I could find. I wanted to know if anybody else had experienced this and if this is expected behavior. All is well now and operating as we want, so I don't think we'll pursue a support request.

Regards,

SHAUN ANDERSON
STORAGE ARCHITECT
O 208.577.2112
M 214.263.7014
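For reference, a hedged sketch of pointing the tiebreakers at NSDs of a different, always-mounted filesystem, as Shaun ended up doing (NSD names are invented; on older releases changing this may require GPFS to be down on the quorum nodes):

    mmchconfig tiebreakerDisks="data_nsd1;data_nsd2;data_nsd3"
    mmlsconfig tiebreakerDisks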

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] bizarre performance behavior

2017-04-21 Thread Olaf Weiser
pls check workerThreads (assuming you're > 4.2.2) - start with 128 .. increase iteratively
pagepool at least 8 G
ignorePrefetchLunCount=yes (1)

then you won't see a difference and GPFS is as fast or even faster ..

From: "Marcus Koenig1"
To: gpfsug main discussion list
Date: 04/21/2017 03:24 AM
Subject: Re: [gpfsug-discuss] bizarre performance behavior
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi Kennmeth,

we also had similar performance numbers in our tests. Native was far quicker than through GPFS. When we learned though that the client tested the performance on the FS at a big blocksize (512k) with small files - we were able to speed it up significantly using a smaller FS blocksize (obviously we had to recreate the FS).

So it really depends on how you do your tests.

Cheers,

Marcus Koenig
Lab Services Storage & Power Specialist
IBM Australia & New Zealand Advanced Technical Skills
IBM Systems-Hardware

From: "Uwe Falke"
To: gpfsug main discussion list
Date: 04/21/2017 03:07 AM
Subject: Re: [gpfsug-discuss] bizarre performance behavior
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi Kennmeth,

is prefetching off or on at your storage backend?
Raw sequential is very different from GPFS sequential at the storage device! GPFS does its own prefetching; the storage would never know what sectors a sequential read at GPFS level maps to at storage level!

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
IBM Deutschland, Rathausstr. 7, 09111 Chemnitz
E-Mail: uwefa...@de.ibm.com

From: Kenneth Waegeman
To: gpfsug main discussion list
Date: 04/20/2017 04:53 PM
Subject: Re: [gpfsug-discuss] bizarre performance behavior
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi,

Having an issue that looks the same as this one:

We can do sequential writes to the filesystem at 7.8 GB/s total, which is the expected speed for our current storage backend. While we have even better performance with sequential reads on raw storage LUNs, using GPFS we can only reach 1 GB/s in total (each NSD server seems limited to 0.5 GB/s) independent of the number of clients (1,2,4,..) or the ways we tested (fio, dd). We played with blockdev params, MaxMBpS, PrefetchThreads, hyperthreading, c1e/cstates, .. as discussed in this thread, but nothing seems to impact this read performance.

Any ideas?

Thanks!

Kenneth

On 17/02/17 19:29, Jan-Frode Myklebust wrote:
I just had a similar experience from a sandisk infiniflash system SAS-attached to a single host. Gpfsperf reported 3.2 GByte/s for writes and 250-300 MByte/s on sequential reads!! Random reads were on the order of 2 GByte/s.

After a bit of head scratching and fumbling around I found out that reducing maxMBpS from 10000 to 100 fixed the problem! Digging further I found that reducing prefetchThreads from default=72 to 32 also fixed it, while leaving maxMBpS at 10000. Can now also read at 3.2 GByte/s.

Could something like this be the problem on your box as well?

-jf

On Fri, 17 Feb 2017 at 18:13, Aaron Knister wrote:
Well, I'm somewhat scrounging for hardware. This is in our test environment :) And yep, it's got the 2U gpu-tray in it although even without the riser it has 2 PCIe slots onboard (excluding the on-board dual-port mezz card) so I think it would make a fine NSD server even without the riser.

-Aaron

On 2/17/17 11:43 AM, Simon Thompson (Research Computing - IT Services) wrote:
> Maybe it's related to interrupt handlers somehow? You drive the load up on one socket, you push all the interrupt handling to the other socket where the fabric card is attached?
>
> Dunno ... (Though I am intrigued you use idataplex nodes as NSD servers, I assume it's some 2U gpu-tray riser one or something!)
>
> Simon
>
> From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Aaron Knister [aaron.s.knis...@nasa.gov]
> Sent: 17 February 2017 15:52
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] bizarre
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

2017-03-23 Thread Olaf Weiser

the issue is fixed, an APAR will be released soon - IV93100From:      
 Olaf Weiser/Germany/IBM@IBMDETo:      
 "gpfsug main discussion
list" <gpfsug-discuss@spectrumscale.org>Cc:      
 "gpfsug main discussion
list" <gpfsug-discuss@spectrumscale.org>Date:      
 01/31/2017 11:47 PMSubject:    
   Re: [gpfsug-discuss]
CES doesn't assign addresses to nodesSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgYeah... depending on the #nodes you 're affected
or not. .So if your remote ces  cluster is small enough in terms of the #nodes
... you'll neuer hit into this issue  Gesendet von IBM VerseSimon Thompson (Research Computing - IT
Services) --- Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
--- Von:"Simon
Thompson (Research Computing - IT Services)" <s.j.thomp...@bham.ac.uk>An:"gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Datum:Di.
31.01.2017 21:07Betreff:Re:
[gpfsug-discuss] CES doesn't assign addresses to nodesWe use multicluster for our environment, storage systems
in a separate cluster to hpc nodes on a separate cluster from protocol
nodes.According to the docs, this isn't supported, but we haven't seen any issues.
Note unsupported as opposed to broken.SimonFrom: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org]
on behalf of Jonathon A Anderson [jonathon.ander...@colorado.edu]Sent: 31 January 2017 17:47To: gpfsug main discussion listSubject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodesYeah, I searched around for places where ` tsctl shownodes up` appears
in the GPFS code I have access to (i.e., the ksh and python stuff); but
it’s only in CES. I suspect there just haven’t been that many people
exporting CES out of an HPC cluster environment.~jonathonFrom: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf
Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Tuesday, January 31, 2017 at 10:45 AMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodesI ll open a pmr here for my env ... the issue may hurt you in a ces env.
only... but needs to be fixed in core gpfs.base  i thi kGesendet von IBM VerseJonathon A Anderson --- Re: [gpfsug-discuss] CES doesn't assign addresses
to nodes ---Von:"Jonathon A Anderson" <jonathon.ander...@colorado.edu>An:"gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>Datum:Di. 31.01.2017 17:32Betreff:Re: [gpfsug-discuss] CES doesn't assign addresses to nodesNo, I’m having trouble getting this through DDN support because, while
we have a GPFS server license and GRIDScaler support, apparently we don’t
have “protocol node” support, so they’ve pushed back on supporting this
as an overall CES-rooted effort.I do have a DDN case open, though: 78804. If you are (as I suspect) a GPFS
developer, do you mind if I cite your info from here in my DDN case to
get them to open a PMR?Thanks.~jonathonFrom: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf
Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Tuesday, January 31, 2017 at 8:42 AMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodesok.. so obviously ... it seems , that we have several issues..the 3983 characters is obviously a defecthave you already raised a PMR , if so , can you send me the number ?From:        Jonathon A Anderson <jonathon.ander...@colorado.edu>To:        gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date:        01/31/2017 04:14 PMSubject:        Re: [gpfsug-discuss] CES doesn't assign
addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

The tail isn’t the issue; that’s my addition, so that I didn’t have to paste the hundred or so line nodelist into the thread.

The actual command is:

tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile

But you can see in my tailed output that the last hostname listed is cut off halfway through the hostname. Less obvious in the example, but true, is the fact that it’s only showing the first 120 hosts, when we have 403 nodes in our gpfs cluster.

[root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | wc -l
120
[root@sgate2 ~]# mmlscluster | grep '\-opa' | wc -l
403

Perhaps more explicitly, it looks like `tsctl shownodes up` can only transmit 3983 characters.

[root@sgate2 ~]# tsctl shownodes up | wc -c
3983

Again, I’m convinced this is a bug not only because the command doesn’t actually produce a list of all of the up nodes in our cluster; but because the last name listed is incomplete.

[root@sgate2 ~]# tsctl shownodes up | 

Re: [gpfsug-discuss] Running multiple mmrestripefs in a single cluster?

2017-03-15 Thread Olaf Weiser
yes.. and please be careful about the number of nodes doing the job, because of multiple PIT workers hammering against your data. If you limit the restripe to 2 nodes (-N ..) or adjust the PIT workers down to 8 or even 4, you can run multiple restripes without hurting the application workload too much ... but the final duration of your restripe will then be affected.
cheers
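
For illustration only, a minimal sketch of that approach (node and file system names are hypothetical, and the worker parameter name should be verified with mmlsconfig on your level):

# lower the PIT worker count on the two helper nodes
mmchconfig pitWorkerThreadsPerNode=8 -N nsd01,nsd02 -i

# run each restripe restricted to those nodes
mmrestripefs fs1 -b -N nsd01,nsd02
mmrestripefs fs2 -b -N nsd01,nsd02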
 "Oesterlin, Robert"
To:      
 gpfsug main discussion
list Date:      
 03/15/2017 03:27 PMSubject:    
   [gpfsug-discuss]
Running multiple mmrestripefs in a single cluster?Sent by:    
   gpfsug-discuss-boun...@spectrumscale.orgI’m looking at migrating multiple file
systems from one set of NSDs to another. Assuming I put aside any potential
IO bottlenecks, has anyone tried running multiple “mmrestripefs” commands
in a single cluster?

Bob Oesterlin
Sr Principal Storage Engineer, Nuance
507-269-0413

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Reverting to older versions

2017-02-10 Thread Olaf Weiser
as long as you did not change mmchconfig release=latest and the file system version hasn't changed as well, this should work (I did it several times..)
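
A rough, hedged sketch of what such a rollback can look like under exactly those conditions (package names and versions are illustrative; the protocol packages installed on your nodes differ by release, so check what is installed first):

mmshutdown -a                      # or -N <the affected nodes>
rpm -qa | grep ^gpfs               # note the currently installed GPFS packages
rpm -Uvh --oldpackage gpfs.base-4.2.1*.rpm gpfs.gpl-4.2.1*.rpm \
    gpfs.gskit-*.rpm gpfs.msg.en_US-*.rpm gpfs.ext-*.rpm gpfs.docs-*.rpm
mmbuildgpl                         # rebuild the portability layer for the running kernel
mmstartup -a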
 "mark.b...@siriuscom.com"
To:      
 gpfsug main discussion
list Date:      
 02/10/2017 05:33 PMSubject:    
   [gpfsug-discuss]
Reverting to older versionsSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgIs there a documented way to go down a
level of GPFS code? For example, since 4.2.2.x has broken my protocol nodes, is there a straightforward way to revert back to 4.2.1.x? Can I just stop my cluster, remove the RPMs and add the older version RPMs?

Mark R. Bush | Storage Architect
Mobile: 210-237-8415 | Twitter: @bushmr | LinkedIn: /markreedbush
10100 Reunion Place, Suite 500, San Antonio, TX 78216
www.siriuscom.com | mark.b...@siriuscom.com
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

2017-02-09 Thread Olaf Weiser
an APAR number / fix was created at the end of last week .. so for your environment, simply open a PMR so that you'll get the fix for your installed level immediately, once it will be released.
 "mark.b...@siriuscom.com"
<mark.b...@siriuscom.com>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 02/09/2017 03:40 PMSubject:    
   Re: [gpfsug-discuss]
CES doesn't assign addresses to nodesSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgHas any headway been made on this issue?
 I just ran into it as well.  The CES ip addresses just disappeared
from my two protocol nodes (4.2.2.0). From: <gpfsug-discuss-boun...@spectrumscale.org>
on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Thursday, February 2, 2017 at 12:02 PMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes pls contact me directly olaf.wei...@de.ibm.comMit freundlichen Grüßen / Kind regards Olaf WeiserEMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:        Jonathon
A Anderson <jonathon.ander...@colorado.edu>To:        gpfsug
main discussion list <gpfsug-discuss@spectrumscale.org>Date:        02/02/2017
06:45 PMSubject:        Re:
[gpfsug-discuss] CES doesn't assign addresses to nodesSent by:        gpfsug-discuss-boun...@spectrumscale.orgAny chance I can get that PMR# also, so I can reference it in my DDN case? ~jonathon  From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of
Olaf Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Wednesday, February 1, 2017 at 2:28 AMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes Pmr opened... send the # directly to u Gesendet von IBM VerseMathias Dietz --- Re: [gpfsug-discuss] CES doesn't assign addresses to
nodes ---   Von:"Mathias
Dietz" <mdi...@de.ibm.com>An:"gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Datum:Mi.
01.02.2017 10:05Betreff:Re:
[gpfsug-discuss] CES doesn't assign addresses to nodes  >I ll open a pmr here for my env ... the issue may hurt you inralf a
ces env. only... but needs to be fixed in core gpfs.base  i thinkThanks for opening the PMR.The problem is inside the gpfs base code and we are working on a fix right
now.In the meantime until the fix is available we will use the PMR to propose/discuss
potential work arounds.Mit freundlichen Grüßen / Kind regardsMathias DietzSpectrum Scale - Release Lead Architect (4.2.X Release)System Health and Problem Determination Architect IBM Certified Software Engineer--IBM DeutschlandHechtsheimer Str. 255131 MainzPhone: +49-6131-84-2027Mobile: +49-15152801035E-Mail: mdi...@de.ibm.com--IBM Deutschland Research & Development GmbHVorsitzender des Aufsichtsrats: Martina Koederitz, Geschäftsführung: Dirk
WittkoppSitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart,
HRB 243294From:        Olaf
Weiser/Germany/IBM@IBMDETo:        "gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Cc:        "gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Date:        01/31/2017
11:47 PMSubject:        Re:
[gpfsug-discuss] CES doesn't assign addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Yeah... depending on the #nodes you're affected or not. So if your remote CES cluster is small enough in terms of the #nodes, you'll never hit this issue.
Sent from IBM Verse

Simon Thompson (Research Computing - IT Services) --- Re: [gpfsug-discuss] CES doesn't assign addresses to nodes ---
From: "Simon Thompson (Research Computing - IT Services)" <s.j.thomp...@bham.ac.uk>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>

Re: [gpfsug-discuss] Mount of file set

2017-02-03 Thread Olaf Weiser
Hi Ha-Jo, we do the same here .. so no news so far as I know...
Greetings, Olaf
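
For illustration, a hedged sketch of the usual workaround until a native per-fileset mount exists (paths and fileset names are made up):

# bind-mount just the fileset's junction path instead of exposing the whole file system
mount --bind /gpfs/fs1/projects_fileset /export/projects

# or persistently in /etc/fstab:
# /gpfs/fs1/projects_fileset  /export/projects  none  bind  0 0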
From: Hans-Joachim Ehlers
To: gpfsug main discussion list
Date: 02/03/2017 05:14 PM
Subject: [gpfsug-discuss] Mount of file set
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Moin Moin,
is it nowadays possible to mount a GPFS fileset directly? In the old days I mounted the whole GPFS to a mount point with 000 rights and did a sub-mount of the needed fileset. It works but it is ugly.

-- Unix Systems Engineer --
MetaModul GmbH
Süderstr. 12
25336 Elmshorn
HRB: 11873 PI
UstID: DE213701983
Mobil: + 49 177 4393994
Mail: serv...@metamodul.com
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

2017-02-02 Thread Olaf Weiser
pls contact me directly olaf.wei...@de.ibm.comMit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 Jonathon A Anderson
<jonathon.ander...@colorado.edu>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 02/02/2017 06:45 PMSubject:    
   Re: [gpfsug-discuss]
CES doesn't assign addresses to nodesSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgAny chance I can get that PMR# also, so
I can reference it in my DDN case? ~jonathon  From: <gpfsug-discuss-boun...@spectrumscale.org>
on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Wednesday, February 1, 2017 at 2:28 AMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes Pmr opened... send the # directly to u Gesendet von IBM VerseMathias Dietz ---
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes ---  Von:"Mathias
Dietz" <mdi...@de.ibm.com>An:"gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Datum:Mi.
01.02.2017 10:05Betreff:Re:
[gpfsug-discuss] CES doesn't assign addresses to nodes

>I'll open a PMR here for my env ... the issue may hurt you in a CES env only... but needs to be fixed in core gpfs.base, I think

Thanks for opening the PMR.
The problem is inside the gpfs base code and we are working on a fix right
now.In the meantime until the fix is available we will use the PMR to propose/discuss
potential work arounds.Mit freundlichen Grüßen / Kind regardsMathias DietzSpectrum Scale - Release Lead Architect (4.2.X Release)System Health and Problem Determination Architect IBM Certified Software Engineer--IBM DeutschlandHechtsheimer Str. 255131 MainzPhone: +49-6131-84-2027Mobile: +49-15152801035E-Mail: mdi...@de.ibm.com--IBM Deutschland Research & Development GmbHVorsitzender des Aufsichtsrats: Martina Koederitz, Geschäftsführung: Dirk
WittkoppSitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart,
HRB 243294From:        Olaf
Weiser/Germany/IBM@IBMDETo:        "gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Cc:        "gpfsug
main discussion list" <gpfsug-discuss@spectrumscale.org>Date:        01/31/2017
11:47 PMSubject:        Re:
[gpfsug-discuss] CES doesn't assign addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Yeah... depending on the #nodes you're affected or not. So if your remote CES cluster is small enough in terms of the #nodes, you'll never hit this issue.
Sent from IBM Verse

Simon Thompson (Research Computing - IT Services) --- Re: [gpfsug-discuss] CES doesn't assign addresses to nodes ---
From: "Simon Thompson (Research Computing - IT Services)" <s.j.thomp...@bham.ac.uk>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: Tue, 31.01.2017 21:07
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

We use multicluster for our environment, storage systems in a separate
cluster to hpc nodes on a separate cluster from protocol nodes.According to the docs, this isn't supported, but we haven't seen any issues.
Note unsupported as opposed to broken.SimonFrom: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org]
on behalf of Jonathon A Anderson [jonathon.ander...@colorado.edu]Sent: 31 January 2017 17:47To: gpfsug main discussion listSubject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodesYeah, I searched around for places where ` tsctl shownodes up` appears
in the GPFS code I have access to (i.e., the ksh and python stuff); but
it’s only in CES. I suspect there just haven’t been that many people
exporting CES out of an HPC cluster environment.~jonathonFrom: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf
Weiser <olaf.wei...@de.ibm.co

Re: [gpfsug-discuss] proper gpfs shutdown when node disappears

2017-02-02 Thread Olaf Weiser
seems that the node is up and running from the OS point of view .. so one can ping the node / log in to the node... but the /var/mmfs dir is obviously damaged/empty .. whatever.. that's why you see a message like this..
have you reinstalled that node / any backup/restore thing?
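
A hedged sketch of one common recovery path for a node that has lost /var/mmfs (node names are the ones from this thread; verify the exact procedure for your level first):

# from a healthy node: confirm the cluster still sees cl004 as unknown
mmgetstate -aL

# on cl004, once the hardware/OS is repaired, pull the cluster configuration
# back from a good node so /var/mmfs/gen/mmsdrfs is rebuilt
mmsdrrestore -p cl001

# then bring GPFS back up on that node
mmstartup -N cl004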
 "J. Eric Wonderley"
To:      
 gpfsug main discussion
list Date:      
 02/02/2017 06:04 PMSubject:    
   [gpfsug-discuss]
proper gpfs shutdown when node disappearsSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgIs there a way to accomplish this so the rest of cluster
knows it's down?

My state now:

[root@cl001 ~]# mmgetstate -aL
cl004.cl.arc.internal:  mmremote: determineMode: Missing file /var/mmfs/gen/mmsdrfs.
cl004.cl.arc.internal:  mmremote: This node does not belong to a GPFS cluster.
mmdsh: cl004.cl.arc.internal remote shell process had return code 1.

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state   Remarks
       1      cl001         5        7          8        active      quorum node
       2      cl002         5        7          8        active      quorum node
       3      cl003         5        7          8        active      quorum node
       4      cl004         0        0          8        unknown     quorum node
       5      cl005         5        7          8        active      quorum node
       6      cl006         5        7          8        active      quorum node
       7      cl007         5        7          8        active      quorum node
       8      cl008         5        7          8        active      quorum node

cl004 we think has an internal raid controller blowout
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

2017-01-31 Thread Olaf Weiser
Hi ... same thing here.. everything after 10 nodes will be truncated.. though I don't have an issue with it ... I'll open a PMR .. and I recommend you to do the same thing.. ;-)
The reason seems simple.. it is the "| tail" at the end of the command, which truncates the output to the last 10 items... should be easy to fix..
cheers
olaf
From: Jonathon A Anderson
To: "gpfsug-discuss@spectrumscale.org"
Date: 01/30/2017 11:11 PM
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

In trying to figure this out on my own, I’m relatively
certain I’ve found a bug in GPFS related to the truncation of output from
`tsctl shownodes up`. Any chance someone in development can confirm?

Here are the details of my investigation:

## GPFS is up on sgate2
[root@sgate2 ~]# mmgetstate

 Node number  Node name        GPFS state
--
     414      sgate2-opa       active

## but if I tell ces to explicitly put one of our ces addresses on that node, it says that GPFS is down
[root@sgate2 ~]# mmces address move --ces-ip 10.225.71.102 --ces-node sgate2-opa
mmces address move: GPFS is down on this node.
mmces address move: Command failed. Examine previous error messages to determine cause.

## the “GPFS is down on this node” message is defined as code 109 in mmglobfuncs
[root@sgate2 ~]# grep --before-context=1 "GPFS is down on this node." /usr/lpp/mmfs/bin/mmglobfuncs
    109 ) msgTxt=\"%s: GPFS is down on this node."

## and is generated by printErrorMsg in mmcesnetmvaddress when it detects
## that the current node is identified as “down” by getDownCesNodeList
[root@sgate2 ~]# grep --before-context=5 'printErrorMsg 109' /usr/lpp/mmfs/bin/mmcesnetmvaddress
  downNodeList=$(getDownCesNodeList)
  for downNode in $downNodeList
  do
    if [[ $toNodeName == $downNode ]]
    then
      printErrorMsg 109 "$mmcmd"

## getDownCesNodeList is the intersection of all ces nodes with GPFS cluster nodes listed in `tsctl shownodes up`
[root@sgate2 ~]# grep --after-context=16 '^function getDownCesNodeList' /usr/lpp/mmfs/bin/mmcesfuncs
function getDownCesNodeList
{
  typeset sourceFile="mmcesfuncs.sh"
  [[ -n $DEBUG || -n $DEBUGgetDownCesNodeList ]] && set -x
  $mmTRACE_ENTER "$*"
  typeset upnodefile=${cmdTmpDir}upnodefile
  typeset downNodeList

  # get all CES nodes
  $sort -o $nodefile $mmfsCesNodes.dae
  $tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile
  downNodeList=$($comm -23 $nodefile $upnodefile)
  print -- $downNodeList
}  #- end of function getDownCesNodeList

## but not only are the sgate nodes not listed by `tsctl shownodes up`;
## its output is obviously and erroneously truncated
[root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | tail
shas0251-opa.rc.int.colorado.edu
shas0252-opa.rc.int.colorado.edu
shas0253-opa.rc.int.colorado.edu
shas0254-opa.rc.int.colorado.edu
shas0255-opa.rc.int.colorado.edu
shas0256-opa.rc.int.colorado.edu
shas0257-opa.rc.int.colorado.edu
shas0258-opa.rc.int.colorado.edu
shas0259-opa.rc.int.colorado.edu
shas0260-opa.rc.int.col
[root@sgate2 ~]#

## I expect that this is a bug in GPFS, likely related to a maximum output buffer for `tsctl shownodes up`.

On 1/24/17, 12:48 PM, "Jonathon A Anderson" wrote:

    I think I'm having the same issue described here:

    http://www.spectrumscale.org/pipermail/gpfsug-discuss/2016-October/002288.html

    Any advice or further troubleshooting steps would be much appreciated. Full disclosure: I also have a DDN case open. (78804)

    We've got a four-node (snsd{1..4}) DDN gridscaler system. I'm trying to add two CES protocol nodes (sgate{1,2}) to serve NFS.

    Here's the steps I took:

    ---
    mmcrnodeclass protocol -N sgate1-opa,sgate2-opa
    mmcrnodeclass nfs -N sgate1-opa,sgate2-opa
    mmchconfig cesSharedRoot=/gpfs/summit/ces
    mmchcluster --ccr-enable
    mmchnode --ces-enable -N protocol
    mmces service enable NFS
    mmces service start NFS -N nfs
    mmces address add --ces-ip 10.225.71.104,10.225.71.105
    mmces address policy even-coverage
    mmces address move --rebalance
    ---

    This worked the very first time I ran it, but the CES addresses weren't re-distributed after restarting GPFS or a node reboot.

    Things I've tried:

    * disabling ces on the sgate nodes and re-running the above procedure
    * moving the cluster and filesystem managers to different snsd nodes
    * deleting and re-creating the cesSharedRoot directory

    Meanwhile, the following log entry appears in mmfs.log.latest every ~30s:

    ---
    Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found unassigned address 10.225.71.104
    Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found unassigned address 10.225.71.105
    Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: handleNetworkProblem with lock held: assignIP 

Re: [gpfsug-discuss] mmrepquota and group names in GPFS 4.2.2.x

2017-01-19 Thread Olaf Weiser
unfortunately, I don't own a cluster right now which has 4.2.2 to double check... Spectrum Scale should resolve the GID into a name if it finds the name somewhere... but in your case.. I would say.. before we waste too much time in a version-mismatch issue.. finish the rolling migration, especially RHEL .. and then we continue. Meanwhile I'll try to find a way for me here to set up a 4.2.2 cluster.
cheers
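
A hedged sketch of the interim approach Kevin describes below (numeric output plus an explicit lookup; the file system name and the header/field offsets are illustrative and may need adjusting to your output):

mmrepquota -gn fs1 | awk 'NR>3 {print $1}' | sort -u | while read gid; do
    name=$(getent group "$gid" | cut -d: -f1)
    echo "${gid} ${name:-<no name>}"
done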
 "Buterbaugh, Kevin
L" <kevin.buterba...@vanderbilt.edu>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 01/19/2017 04:48 PMSubject:    
   Re: [gpfsug-discuss]
mmrepquota and group names in GPFS 4.2.2.xSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgHi Olaf, The filesystem manager runs on one of our servers, all
of which are upgraded to 4.2.2.x.Also, I didn’t mention this yesterday but our /etc/nsswitch.conf
has “files” listed first for /etc/group.In addition to a mixture of GPFS versions, we also have
a mixture of OS versions (RHEL 6/7).  AFAIK tell with all of my testing
/ experimenting the only factor that seems to change the behavior of mmrepquota
in regards to GIDs versus group names is the GPFS version.Other ideas, anyone?  Is anyone else in a similar
situation and can test whether they see similar behavior?Thanks...KevinOn Jan 19, 2017, at 2:45 AM, Olaf Weiser <olaf.wei...@de.ibm.com>
wrote:have you checked, where th fsmgr runs
as you have nodes with different code levelsmmlsmgr From:        "Buterbaugh,
Kevin L" <kevin.buterba...@vanderbilt.edu>To:        gpfsug
main discussion list <gpfsug-discuss@spectrumscale.org>Date:        01/18/2017
04:57 PMSubject:        [gpfsug-discuss]
mmrepquota and group names in GPFS 4.2.2.xSent by:        gpfsug-discuss-boun...@spectrumscale.orgHi All, We recently upgraded our cluster (well, the servers are all upgraded; the
clients are still in progress) from GPFS 4.2.1.1 to GPFS 4.2.2.1 and there
appears to be a change in how mmrepquota handles group names in its’ output.
 I’m trying to get a handle on it, because it is messing with some
of my scripts and - more importantly - because I don’t understand the
behavior.From one of my clients which is still running GPFS 4.2.1.1 I can run an
“mmrepquota -g ” and if the group exists in /etc/group the
group name is displayed.  Of course, if the group doesn’t exist in
/etc/group, the GID is displayed.  Makes sense.However, on my servers which have been upgraded to GPFS 4.2.2.1 most -
but not all - of the time I see GID numbers instead of group names.  My
question is, what is the criteria GPFS 4.2.2.x is using to decide when
to display a GID instead of a group name?  It’s apparently *not*
the length of the name of the group, because I have output in front of
me where a 13 character long group name is displayed but a 7 character
long group name is *not* displayed - its’ GID is instead (and yes, both
exist in /etc/group).I know that sample output would be useful to illustrate this, but I do
not want to post group names or GIDs to a public mailing list … if you
want to know what those are, you’ll have to ask Vladimir Putin… ;-)I am in the process of updating scripts to use “mmrepquota -gn ”
and then looking up the group name myself, but I want to try to understand
this.  Thanks…Kevin—Kevin Buterbaugh - Senior System AdministratorVanderbilt University - Advanced Computing Center for
Research and Educationkevin.buterba...@vanderbilt.edu- (615)875-9633___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmrepquota and group names in GPFS 4.2.2.x

2017-01-19 Thread Olaf Weiser
have you checked where the fsmgr runs, as you have nodes with different code levels?
mmlsmgr

From: "Buterbaugh, Kevin L"
To: gpfsug main discussion list
Date: 01/18/2017 04:57 PM
Subject: [gpfsug-discuss] mmrepquota and group names in GPFS 4.2.2.x
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,

We recently upgraded our cluster (well, the servers are
all upgraded; the clients are still in progress) from GPFS 4.2.1.1 to GPFS
4.2.2.1 and there appears to be a change in how mmrepquota handles group
names in its’ output.  I’m trying to get a handle on it, because
it is messing with some of my scripts and - more importantly - because
I don’t understand the behavior.From one of my clients which is still running GPFS 4.2.1.1
I can run an “mmrepquota -g ” and if the group exists in /etc/group
the group name is displayed.  Of course, if the group doesn’t exist
in /etc/group, the GID is displayed.  Makes sense.However, on my servers which have been upgraded to GPFS
4.2.2.1 most - but not all - of the time I see GID numbers instead of group
names.  My question is, what is the criteria GPFS 4.2.2.x is using
to decide when to display a GID instead of a group name?  It’s apparently
*not* the length of the name of the group, because I have output in front
of me where a 13 character long group name is displayed but a 7 character
long group name is *not* displayed - its’ GID is instead (and yes, both
exist in /etc/group).I know that sample output would be useful to illustrate
this, but I do not want to post group names or GIDs to a public mailing
list … if you want to know what those are, you’ll have to ask Vladimir
Putin… ;-)I am in the process of updating scripts to use “mmrepquota
-gn ” and then looking up the group name myself, but I want
to try to understand this.  Thanks…Kevin—Kevin Buterbaugh - Senior System AdministratorVanderbilt University - Advanced Computing Center for
Research and Educationkevin.buterba...@vanderbilt.edu- (615)875-9633___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Olaf Weiser
most likely, there's something wrong with your IB fabric ... you say you run ~700 nodes? ... Are you running with verbsRdmaSend enabled? If so, please consider disabling it - and discuss this within the PMR. Another issue you may check: are you running IPoIB in connected mode or datagram ... but as I said, please discuss this within the PMR .. there are too many dependencies to
discuss this here .. cheersMit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 Damir Krstic <damir.krs...@gmail.com>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 01/11/2017 03:39 PMSubject:    
   [gpfsug-discuss]
nodes being ejected out of the clusterSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgWe are running GPFS 4.2 on our cluster (around 700 compute
nodes). Our storage (ESS GL6) is also running GPFS 4.2. Compute nodes and
storage are connected via Infiniband (FDR14). At the time of implementation
of ESS, we were instructed to enable RDMA in addition to IPoIB. Previously
we only ran IPoIB on our GPFS3.5 cluster.Every since the implementation (sometime back in July
of 2016) we see a lot of compute nodes being ejected. What usually precedes
the ejection are following messages:Jan 11 02:03:15 quser13 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port
1 fabnum 0 vendor_err 135 Jan 11 02:03:15 quser13 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 2Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port
1 fabnum 0 vendor_err 135 Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error
IBV_WC_WR_FLUSH_ERR index 1Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port
1 fabnum 0 vendor_err 135 Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 2Jan 11 02:06:38 quser11 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port
1 fabnum 0 vendor_err 135 Jan 11 02:06:38 quser11 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error
IBV_WC_WR_FLUSH_ERR index 400Even our ESS IO server sometimes ends up being ejected
(case in point - yesterday morning):Jan 10 11:23:42 gssio2 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_1 port
1 fabnum 0 vendor_err 135Jan 10 11:23:42 gssio2 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 1 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 3001Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_1 port
2 fabnum 0 vendor_err 135Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 2 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 2671Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_0 port
2 fabnum 0 vendor_err 135Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 2 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 2495Jan 10 11:23:44 gssio2 mmfs: [E] VERBS RDMA rdma send
error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_0 port
1 fabnum 0 vendor_err 135Jan 10 11:23:44 gssio2 mmfs: [E] VERBS RDMA closed connection
to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 1 fabnum 0 due to send error
IBV_WC_RNR_RETRY_EXC_ERR index 3077Jan 10 11:24:11 gssio2 mmfs: [N] Node 172.41.2.1 (gssio1-fdr)
lease renewal is overdue. Pinging to check if it is aliveI've had multiple PMRs open for this issue, and I am told
that our ESS needs code level upgrades in order to fix this issue. Looking
at the errors, I think the issue is Infiniband related, and I am wondering
if anyone on this l
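
For reference, a hedged sketch of the two checks Olaf suggests above (coordinate any change through the PMR; verbs settings generally need a daemon restart to take effect):

# is verbsRdmaSend enabled?
mmlsconfig verbsRdmaSend

# if so, consider disabling it cluster-wide
mmchconfig verbsRdmaSend=no

# IPoIB mode (connected vs. datagram) on a compute or IO node
cat /sys/class/net/ib0/mode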

Re: [gpfsug-discuss] CES ifs-ganashe

2016-12-20 Thread Olaf Weiser
rsize/wsize is set to 1M. However... some current kernel levels (RHEL7) are cutting it down to 256K pieces .. it is solved with 7.3 (I
think/hope)Mit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 Matt Weil <mw...@wustl.edu>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 12/20/2016 09:45 PMSubject:    
   [gpfsug-discuss]
CES ifs-ganasheSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgDoes ganashe have a default read and write max size?
 if so what is it?ThanksMatt___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmchdisk performance/behavior in a stretch cluster config?

2016-11-18 Thread Olaf Weiser
so as you already figured out.. as long as your 10GbE is saturated.. it will take time to transfer all the data over the wire.. "all the data" is the key here..

In older releases / file system versions we only flagged "data update miss" or "metadata update miss" in the MD of a file when something was written to the file while some NSDs were missing/down... meaning: when all the disks are back "active", we scan the file system's metadata to find all files with one of these flags .. and resync the data.. That means even if you only change some bytes of a 10 TB file, the whole 10 TB file needs to be resynced.

With 4.2 we introduced "rapid repair" (note, you need to be on that daemon level and you need to update your file system version). With rapid repair we not only flag MD or D update miss, we write an extra bit for every disk address which has changed.. so that now, after a site failure (or disk outage), we are able to find all files which have changed (like before in the good old days), but now we know which disk addresses have changed and we just need to sync that changed data. In other words, if you've changed 1 MB of a 10 TB file, only that 1 MB is re-synced.

You can check if rapid repair is in place with the mmlsfs command (enable/disable with mmchfs). Of course, if everything (every disk address) has changed while your NSDs have been down, rapid repair won't help.. but depending on the amount of your changed data rate .. it
will definitively  shorten your sync times in the future .. cheersMit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 Valdis Kletnieks <valdis.kletni...@vt.edu>To:      
 gpfsug-discuss@spectrumscale.orgDate:      
 11/18/2016 08:06 PMSubject:    
   [gpfsug-discuss]
mmchdisk performance/behavior in a stretch cluster      
 config?Sent by:    
   gpfsug-discuss-boun...@spectrumscale.orgSo as a basis for our archive solution, we're using
a GPFS clusterin a stretch configuration, with 2 sites separated by about 20ms worthof 10G link.  Each end has 2 protocol servers doing NFS and 3 NSD
servers.Identical disk arrays and LTFS/EE at both ends, and all metadata anduserdata are replicated to both sites.We had a fiber issue for about 8 hours yesterday, and as expected (since
thereare only 5 quorum nodes, 3 local and 2 at the far end) the far end fell
off thecluster and down'ed all the NSDs on the remote arrays.There's about 123T of data at each end, 6 million files in there so far.So after the fiber came back up after a several-hour downtime, Idid the 'mmchdisk archive start -a'.  That was at 17:45 yesterday.I'm now 20 hours in, at:  62.15 % complete on Fri Nov 18 13:52:59 2016  (   4768429
inodes with total  173675926 MB data processed)  62.17 % complete on Fri Nov 18 13:53:20 2016  (   4769416
inodes with total  173710731 MB data processed)  62.18 % complete on Fri Nov 18 13:53:40 2016  (   4772481
inodes with total  173762456 MB data processed)network statistics indicate that the 3 local NSDs are all tossing outpackets at about 400Mbytes/second, which means the 10G pipe is pretty damnedclose to totally packed full, and the 3 remotes are sending back ACKsof all the data.Rough back-of-envelop calculations indicate that (a) if I'm at 62% after20 hours, it will take 30 hours to finish and (b) a 10G link takes about29 hours at full blast to move 123T of data.  So it certainly *looks*like it's resending everything.And that's even though at least 100T of that 123T is test data that waswritten by one of our users back on Nov 12/13, and thus theoretically *should*already have been at the remote site.Any ideas what's going on here?___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
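
A hedged check following the reply above (the file system in this thread is "archive"; the exact attribute spelling may differ by level, so grep the full mmlsfs output rather than relying on a specific flag):

# is rapid repair enabled on this file system?
mmlsfs archive | grep -i rapid

# per the reply, it is toggled with mmchfs once daemon and file system
# versions are new enough - check the mmchfs documentation for your level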


Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances

2016-11-09 Thread Olaf Weiser
let's say you have an RTT of 180 ms. What you then need is your theoretical link speed - let's say 10 Gbit/s ... to keep it simple, let's take 1 GB/s. This means your socket must be able to hold your bandwidth (data stream) during the "first" 180 ms, because it will take at least this time to get back the first ACKs.

So 1 GB/s x 0.180 s = 1024 MB/s x 0.180 s ==> ~185 MB. This means you have to allow the operating system to accept socket sizes in that range... set something like this - but increase these values to 185 MB:

sysctl -w net.ipv4.tcp_rmem="12194304 12194304 12194304"
sysctl -w net.ipv4.tcp_wmem="12194304 12194304 12194304"
sysctl -w net.ipv4.tcp_mem="12194304 12194304 12194304"
sysctl -w net.core.rmem_max=12194304
sysctl -w net.core.wmem_max=12194304
sysctl -w net.core.rmem_default=12194304
sysctl -w net.core.wmem_default=12194304
sysctl -w net.core.optmem_max=12194304

in addition set this:

sysctl -w net.core.netdev_max_backlog=5
sysctl -w net.ipv4.tcp_no_metrics_save=1
sysctl -w net.ipv4.tcp_timestamps=0
sysctl -w net.ipv4.tcp_sack=1
sysctl -w net.core.netdev_max_backlog=5
sysctl -w net.ipv4.tcp_max_syn_backlog=3

you need to "recycle" the sockets.. meaning mmshutdown/mmstartup should fix your issue.

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,
---
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.wei...@de.ibm.com
---
IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 Jan-Frode Myklebust
<janfr...@tanso.net>To:      
 "gpfsug-discuss@spectrumscale.org"
<gpfsug-discuss@spectrumscale.org>Date:      
 11/09/2016 07:05 PMSubject:    
   Re: [gpfsug-discuss]
Tuning AFM for high throughput/high IO over _really_ long distancesSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgMostly curious, don't have experience in such environments, but ... Is
this AFM over NFS or NSD protocol? Might be interesting to try the other
option -- and also check how nsdperf performs over such distance/latency.-jfons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll <jake.carr...@uq.edu.au>:Hi. I’ve got an GPFS to GPFS AFM cache/home (IW) relationship
set up over a really long distance. About 180ms of latency between the
two clusters and around 13,000km of optical path. Fortunately for me, I’ve
actually got near theoretical maximum IO over the NIC’s between the clusters
and I’m iPerf’ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit.
All MTU9000 all the way through. Anyway – I’m finding my AFM traffic to be dragging its
feet and I don’t really understand why that might be. I’ve verified the
links and transports ability as I said above with iPerf, and CERN’s FDT
to near 10Gbit/sec.  I also verified the clusters on both sides in terms of
disk IO and they both seem easily capable in IOZone and IOR tests of multiple
GB/sec of throughput. So – my questions: 1.   Are there very specific
tunings AFM needs for high latency/long distance IO? 2.   Are there very specific
NIC/TCP-stack tunings (beyond the type of thing we already have in place)
that benefits AFM over really long distances and high latency?3.   We are seeing on
the “cache” side really lazy/sticky “ls –als” in the home mount. It
sometimes takes 20 to 30 seconds before the command line will report back
with a long listing of files. Any ideas why it’d take that long to get
a response from “home”. We’ve got our TCP stack setup fairly aggressively, on
all hosts that participate in these two clusters. ethtool -C enp2s0f0 adaptive-rx offifconfig enp2s0f0 txqueuelen 1sysctl -w net.core.rmem_max=536870912sysctl -w net.core.wmem_max=536870912sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456"sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456"sysctl -w net.core.netdev_max_backlog=25sysctl -w net.ipv4.tcp_congestion_control=htcpsysctl -w net.ipv4.tcp_mtu_probing=1 I modified a couple of small things on the AFM “cache”
side to see if it’d make a difference such as: mmchconfig afmNumWriteThreads=4mmchconfig afmNumReadThreads=4 But no difference so far. Thoughts would be appreciated. I’ve done this before over
much shorter distances (30Km) and I’ve flattened a 10GbE wire without
really tuning…anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement
semantics going to hurt here? I really thought AFM
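
For other link speeds or latencies, the same sizing arithmetic can be reproduced quickly (a hedged sketch mirroring the 10 Gbit/s ~ 1 GB/s approximation and 180 ms RTT used above):

awk 'BEGIN {
  mb_per_s = 1024     # assumed usable bandwidth in MB/s (~10 Gbit/s)
  rtt      = 0.180    # round-trip time in seconds
  printf "socket buffer needed ~= %.0f MB\n", mb_per_s * rtt
}'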

Re: [gpfsug-discuss] HAWC and LROC

2016-11-05 Thread Olaf Weiser
You can use both - HAWC, LROC - on the same node... but you need dedicated, independent block devices ...
In addition, for HAWC you could consider replication and use 2 devices, even across 2 nodes. ...
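
For illustration, a hedged sketch of what a dedicated LROC device on a client typically looks like (device, NSD and node names are made up; HAWC would get its own, separate device):

# stanza file, e.g. /tmp/lroc.stanza
%nsd: nsd=client1_lroc device=/dev/nvme0n1 servers=client1 usage=localCache

# create the local read-only cache NSD on the client
mmcrnsd -F /tmp/lroc.stanza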

Sent from IBM Verse


   leslie elliott --- [gpfsug-discuss] HAWC and LROC ---
From: "leslie elliott"
To: "gpfsug main discussion list"
Date: Sat, 05.11.2016 02:09
Subject: [gpfsug-discuss] HAWC and LROC
  

 Hi I am curious if anyone has run these together on a client and whether 
it helped 
   
   If we wanted to have these functions out at the client to optimise 
compute IO in a couple of special cases  
   
   can both exist at the same time on the same nonvolatile hardware or do 
the two functions need independent devices
   
   and what would be the process to disestablish them on the clients as the 
requirement was satisfied 
   
   thanks
   
   leslie
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] [EXTERNAL] Re: CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
ah .. I see. sorry, I should have checked that. So to stay with this example, the IP address 10.30.22.176 is set by CES as a floating service IP .. something is insane.. are the SMB/NFS services running (systemctl ...) and can you access the exports from outside?

From:      
<robert.oester...@nuance.com>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 10/17/2016 12:02 PMSubject:    
   Re: [gpfsug-discuss]
[EXTERNAL] Re: CES: IP address won't assign: "handleNetworkProblem
with lock held"Sent by:    
   gpfsug-discuss-boun...@spectrumscale.orgNo - the :0 and :1 address are floating
addresses *assigned by CES* - it created those interfaces. The issue seems
to be that these are assigned and CES doesn't know it.  Bob OesterlinSr Storage Engineer, Nuance HPC Grid  From: <gpfsug-discuss-boun...@spectrumscale.org>
on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Monday, October 17, 2016 at 1:53 PMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: [EXTERNAL] Re: [gpfsug-discuss] CES: IP address won't assign:
"handleNetworkProblem with lock held" ah .. I see.. seems, that you already
has IP aliases around .. GPFS don't like it... eg. your node tct-gw01.infra.us.grid.nuance.com:
     inet 10.30.22.160/24   has already an alias -  10.30.22.176
... if I understand you answers correctly...from the doc'... [...] you need to provide a static IP adress, that is
not already[...] as an alias [...]  ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
ah .. I see.. seems, that you already has
IP aliases around .. GPFS don't like it... eg. your node tct-gw01.infra.us.grid.nuance.com:
     inet 10.30.22.160/24    has already an alias -  10.30.22.176 ... if I understand you answers correctly...from the doc'... [...] you need to provide
a static IP adress, that is not already[...] as an alias
[...]  Mit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 "Oesterlin, Robert"
<robert.oester...@nuance.com>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>Date:      
 10/17/2016 11:43 AMSubject:    
   [gpfsug-discuss]
CES: IP address won't assign:        "handleNetworkProblem
with lock held"Sent by:    
   gpfsug-discuss-boun...@spectrumscale.orgYes - so interesting - it looks like
the nodes have the addresses assigned but CES doesn’t know that.

[root@tct-gw01 ~]# mmlscluster --ces

GPFS cluster information
  GPFS cluster name:      nrg1-tct.nrg1.us.grid.nuance.com
  GPFS cluster id:        17869514639699411874

Cluster Export Services global parameters
-
  Shared root directory:                /gpfs/fs1/ces
  Enabled Services:                     NFS
  Log level:                            0
  Address distribution policy:          even-coverage

 Node  Daemon node name                   IP address       CES IP address list
---
   4   tct-gw01.infra.us.grid.nuance.com  10.30.22.160     None
   5   tct-gw02.infra.us.grid.nuance.com  10.30.22.161     None
   6   tct-gw03.infra.us.grid.nuance.com  10.30.22.162     None
   7   tct-gw04.infra.us.grid.nuance.com  10.30.22.163     None

[root@tct-gw01 ~]# mmdsh -N cesnodes "ip a | grep 10.30.22." | sort -k 1
tct-gw01.infra.us.grid.nuance.com:      inet 10.30.22.160/24 brd 10.30.22.255 scope global bond0
tct-gw01.infra.us.grid.nuance.com:      inet 10.30.22.176/24 brd 10.30.22.255 scope global secondary bond0:0
tct-gw02.infra.us.grid.nuance.com:      inet 10.30.22.161/24 brd 10.30.22.255 scope global bond0
tct-gw02.infra.us.grid.nuance.com:      inet 10.30.22.177/24 brd 10.30.22.255 scope global secondary bond0:0
tct-gw02.infra.us.grid.nuance.com:      inet 10.30.22.178/24 brd 10.30.22.255 scope global secondary bond0:1
tct-gw03.infra.us.grid.nuance.com:      inet 10.30.22.162/24 brd 10.30.22.255 scope global bond0
tct-gw03.infra.us.grid.nuance.com:      inet 10.30.22.178/24 brd 10.30.22.255 scope global secondary bond0:0
tct-gw04.infra.us.grid.nuance.com:      inet 10.30.22.163/24 brd 10.30.22.255 scope global bond0
tct-gw04.infra.us.grid.nuance.com:      inet 10.30.22.179/24 brd 10.30.22.255 scope global secondary bond0:0

Bob Oesterlin
Sr Storage Engineer, Nuance HPC Grid

From: <gpfsug-discuss-boun...@spectrumscale.org>
on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Date: Monday, October 17, 2016 at 1:27 PMTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Subject: [EXTERNAL] Re: [gpfsug-discuss] CES: IP address won't assign:
"handleNetworkProblem with lock held" ip a | grep 10.30.22.___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
simple question  -sorry for that -
your Nodes.. do they have an IP address in the same subnet as your IP address
listed here ?and if, is this network up n running
so that GPFS can find/detect it ?what tells mmlscluster --ces ?from each node - assuming class C /24
network , do a ip a | grep 10.30.22.cheersFrom:      
 "Oesterlin, Robert"
To:      
 gpfsug main discussion
list Date:      
 10/17/2016 08:00 AMSubject:    
   [gpfsug-discuss]
CES: IP address won't assign:        "handleNetworkProblem
with lock held"Sent by:    
   gpfsug-discuss-boun...@spectrumscale.orgCan anyone help me pinpoint the issue
here? These message repeat and the IP addresses never get assigned. [root@tct-gw01 ~]# tail /var/mmfs/gen/mmfslogMon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
Found unassigned address 10.30.22.178Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
Found unassigned address 10.30.22.179Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
Found unassigned address 10.30.22.177Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
Found unassigned address 10.30.22.176Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
handleNetworkProblem with lock held: assignIP 10.30.22.178_0-_+,10.30.22.179_0-_+,10.30.22.177_0-_+,10.30.22.176_0-_+
1Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
Assigning addresses: 10.30.22.178_0-_+,10.30.22.179_0-_+,10.30.22.177_0-_+,10.30.22.176_0-_+Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor:
moveCesIPs: 10.30.22.178_0-_+,10.30.22.179_0-_+,10.30.22.177_0-_+,10.30.22.176_0-_+

[root@tct-gw01 ~]# mmces state show -a
NODE                                     AUTH      AUTH_OBJ  NETWORK   NFS       OBJ       SMB       CES
tct-gw01.infra.us.grid.nuance.com        DISABLED  DISABLED  HEALTHY   HEALTHY   DISABLED  DISABLED  HEALTHY
tct-gw02.infra.us.grid.nuance.com        DISABLED  DISABLED  HEALTHY   HEALTHY   DISABLED  DISABLED  HEALTHY
tct-gw03.infra.us.grid.nuance.com        DISABLED  DISABLED  HEALTHY   HEALTHY   DISABLED  DISABLED  HEALTHY
tct-gw04.infra.us.grid.nuance.com        DISABLED  DISABLED  HEALTHY   HEALTHY   DISABLED  DISABLED  HEALTHY

[root@tct-gw01 ~]# mmces address list

Address         Node          Group      Attribute
--------------------------------------------------
10.30.22.178    unassigned    none       none
10.30.22.179    unassigned    none       none
10.30.22.177    unassigned    none       none
10.30.22.176    unassigned    none       none

Bob Oesterlin
Sr Storage Engineer, Nuance HPC Grid
507-269-0413

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] SGExceptionLogBufferFullThread waiter

2016-10-15 Thread Olaf Weiser
indeed, it is.. consider the most recent GPFS releases .. lots of enhancements/improvements in terms of file creation
rate 're included here .. cheersMit freundlichen Grüßen / Kind regards Olaf Weiser EMEA Storage Competence Center Mainz, German / IBM Systems, Storage Platform,---IBM DeutschlandIBM Allee 171139 EhningenPhone: +49-170-579-44-66E-Mail: olaf.wei...@de.ibm.com---IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin JetterGeschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert
Janzen, Dr. Christian Keller, Ivo Koerner, Markus KoernerSitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940 From:      
 "Knister, Aaron
S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" <aaron.s.knis...@nasa.gov>To:      
 gpfsug main discussion
list <gpfsug-discuss@spectrumscale.org>, "gpfsug    
   main discussion list" <gpfsug-discuss@spectrumscale.org>Date:      
 10/15/2016 10:07 AMSubject:    
   Re: [gpfsug-discuss]
SGExceptionLogBufferFullThread waiterSent by:    
   gpfsug-discuss-boun...@spectrumscale.orgUnderstood. Thank you for your help. By the way, I was able to figure out by
poking mmpmon gfis that the job is performing 20k a second each of inode
creations, updates and deletions across 64 nodes. There's my 60k iops on
the backend. While I'm impressed and not surprised GPFS can keep up with
this...that's a pretty hefty workload. From: Olaf WeiserSent: 10/15/16, 12:47 PMTo: gpfsug main discussion listSubject: Re: [gpfsug-discuss] SGExceptionLogBufferFullThread waiterwell - hard to say.. 60K IO may or may not
be a problem... it depends on your storage backends.. check the response times to the physical disk on the NSD server... concerning
the output you provided ... check particularly 10.1.53.5  and 10.1.53.7  if they are in the same (bad/ poor) range .. then your storage back
end is in trouble or maybe just too heavily utilized ...  if the response times to physical disks on the NSD server are ok... ..
than maybe the network from client <-->  NSD server is somehow
in trouble .. From:        Aaron
Knister <aaron.s.knis...@nasa.gov>To:        <gpfsug-discuss@spectrumscale.org>Date:        10/15/2016
08:28 AMSubject:        Re:
[gpfsug-discuss] SGExceptionLogBufferFullThread waiterSent by:        gpfsug-discuss-boun...@spectrumscale.orgIt absolutely does, thanks Olaf!The tasks running on these nodes are running on 63 other nodes and generating ~60K iop/s of metadata writes and I *think* about the same in
reads. Do you think that could be contributing to the higher waiter times? I'm not sure quite what the job is up to. It's seemingly doing very little data movement, the cpu %used is very low but the load is rather high.-AaronOn 10/15/16 11:23 AM, Olaf Weiser wrote:> from your file system configuration .. mmfs  -L you'll
find the> size of the LOG> since release 4.x ..you can change it, but you need to re-mount the
FS> on every client , to make the change effective ...>> when a clients initiate writes/changes to GPFS  it needs to update
its> changes to the log -  if this narrows a certain filling degree,
GPFS> triggers so called logWrapThreads to write content to disk and so
free> space>> with your given numbers ... double digit [ms] waiter times .. you
fs> get's probably slowed down.. and there's something suspect with the> storage, because LOG-IOs are rather small and should not take that
long>> to give you an example from a healthy environment... the IO times
are so> small, that you usually don't see waiters for this..>> I/O start time RW    Buf type disk:sectorNum    
nSec  time ms      tag1>      tag2           Disk UID
typ      NSD node context   thread> --- -- --- - -  ---> - - -- --- --- -> --> 06:23:32.358851  W     logData    2:524306424
       8    0.439> 0         0  C0A70D08:57CF40D1 cli  
192.167.20.17 LogData> SGExceptionLogBufferFullThread> 06:23:33.576367  W     logData    1:524257280
       8    0.646> 0         0  C0A70D08:57CF40D0 cli  
192.167.20.16 LogData> SGExceptionLogBufferFullThread> 06:23:32.358851  W     logData    2:524306424
       8    0.439> 0         0  C0A70D08:57CF40D1 cli  
192.167.20.17 LogData> SGExceptionLogBufferFullThread> 06:23:33.576367  W     logData    1:524257280
       8    0.646> 0         0  C0A70D08:57CF40D0 cli  
192.167.20.16 LogData> SGExceptionLogBufferFullThread> 06:23:32.212426  W   iallocSeg    1:524490048
      64    0.733> 2       245  C0A

Re: [gpfsug-discuss] SGExceptionLogBufferFullThread waiter

2016-10-15 Thread Olaf Weiser
well - hard to say.. 60K IO may or may not be a problem... it depends on your storage backends. Check the response times to the physical disks on the NSD servers. Concerning the output you provided: check particularly 10.1.53.5 and 10.1.53.7. If they are in the same (bad/poor) range, then your storage back end is in trouble or maybe just too heavily utilized. If the response times to the physical disks on the NSD servers are ok, then maybe the network from client <--> NSD server is somehow in trouble.

From:        Aaron Knister <aaron.s.knis...@nasa.gov>
To:          <gpfsug-discuss@spectrumscale.org>
Date:        10/15/2016 08:28 AM
Subject:     Re: [gpfsug-discuss] SGExceptionLogBufferFullThread waiter
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

It absolutely does, thanks Olaf!

The tasks running on these nodes are running on 63 other nodes and generating ~60K iop/s of metadata writes and I *think* about the same in reads. Do you think that could be contributing to the higher waiter times? I'm not sure quite what the job is up to. It's seemingly doing very little data movement, the cpu %used is very low but the load is rather high.

-Aaron

On 10/15/16 11:23 AM, Olaf Weiser wrote:
> In your file system configuration (mmlsfs <fs> -L) you'll find the size of the log. Since release 4.x you can change it, but you need to re-mount the FS on every client to make the change effective.
>
> When a client initiates writes/changes to GPFS, it needs to record those changes in the log. Once the log nears a certain fill level, GPFS triggers so-called logWrapThreads to write content to disk and free space.
>
> With your given numbers - double-digit [ms] waiter times - your fs probably gets slowed down, and there's something suspect with the storage, because log IOs are rather small and should not take that long.
>
> To give you an example from a healthy environment: the IO times are so small that you usually don't see waiters for this.
>
> I/O start time   RW  Buf type   disk:sectorNum  nSec  time ms  tag1    tag2  Disk UID           typ  NSD node       context  thread
> ---------------  --  ---------  --------------  ----  -------  ----  ------  -----------------  ---  -------------  -------  ------------------------------
> 06:23:32.358851  W   logData    2:524306424        8    0.439     0       0  C0A70D08:57CF40D1  cli  192.167.20.17  LogData  SGExceptionLogBufferFullThread
> 06:23:33.576367  W   logData    1:524257280        8    0.646     0       0  C0A70D08:57CF40D0  cli  192.167.20.16  LogData  SGExceptionLogBufferFullThread
> 06:23:32.212426  W   iallocSeg  1:524490048       64    0.733     2     245  C0A70D08:57CF40D0  cli  192.167.20.16  Logwrap  LogWrapHelperThread
> 06:23:32.212412  W   logWrap    2:524552192        8    0.755     0  179200  C0A70D08:57CF40D1  cli  192.167.20.17  Logwrap  LogWrapHelperThread
> 06:23:32.212432  W   logWrap    2:525162760        8    0.737     0  125473  C0A70D08:57CF40D1  cli  192.167.20.17  Logwrap  LogWrapHelperThread
> 06:23:32.212416  W   iallocSeg  2:524488384       64    0.763     2     347  C0A70D08:57CF40D1  cli  192.167.20.17  Logwrap  LogWrapHelperThread
> 06:23:32.212414  W   logWrap    2:525266944        8    2.160     0  177664  C0A70D08:57CF40D1  cli  192.167.20.17  Logwrap  LogWrapHelperThread
>
> hope this helps ..
>
> Mit freundlichen Grüßen / Kind regards
>
> Olaf Weiser
> EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
> Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com
>
> From:        Aaron Knister <aaron.s.knis...@nasa.gov>
> To:          gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date:        10/15/2016 07:23 AM
> Subject:     [gpfsug-discuss
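To compare your own nodes against the healthy numbers above, a small sketch that pulls just the log-related IOs out of the recent IO history and summarizes their service times (mmdiag --iohist shows the same history as mmfsadm dump iohist; the awk field positions are assumptions based on the column layout of the sample above):

# Summarize recent log-related IOs (logData / logWrap) on a client or NSD server.
# Assumed columns, per the sample above: $3 = buffer type, $6 = service time in ms.
/usr/lpp/mmfs/bin/mmdiag --iohist | awk '
  /^[0-9][0-9]:[0-9][0-9]:/ && $3 ~ /^log/ {
    n[$3]++; sum[$3] += $6; if ($6 > max[$3]) max[$3] = $6
  }
  END {
    for (t in n)
      printf "%-10s %6d IOs  avg %7.3f ms  max %7.3f ms\n", t, n[t], sum[t]/n[t], max[t]
  }'

If the averages on the suspect nodes sit far above the sub-millisecond values shown above, the delay is below GPFS (storage or path to it), not in the log mechanism itself.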

Re: [gpfsug-discuss] 4K sector NSD support (was: Hardware refresh)

2016-10-11 Thread Olaf Weiser
If your file system was created with i=512 you won't benefit from 4K disk technologies ... some backends emulate it in controller software, but most likely you'll get in trouble when trying to add 4K disks to your filesystem ...

Sent from IBM Verse


   Aaron Knister --- Re: [gpfsug-discuss] 4K sector NSD support (was: Hardware refresh) ---
From:        "Aaron Knister"
To:          gpfsug-discuss@spectrumscale.org
Date:        Tue, 11.10.2016 17:59
Subject:     Re: [gpfsug-discuss] 4K sector NSD support (was: Hardware refresh)

Yuri,

(Sorry for being somewhat spammy) I now understand the limitation after some more testing (I'm a hands-on learner, can you tell?). Given the right code/cluster/fs version levels I can add 4K dataOnly NSDv2 NSDs to a filesystem created with NSDv1 NSDs. What I can't do is seemingly add any metadataOnly or dataAndMetadata 4K LUNs to an fs that is not 4K aligned, which I assume would be any fs originally created with NSDv1 LUNs. It seems possible to move all data away from NSDv1 LUNs in a filesystem behind the scenes using GPFS migration tools and move the data to NSDv2 LUNs. In this case I believe what's missing is a tool to convert just the metadata structures to be 4K aligned, since the data would already be on 4K-based NSDv2 LUNs - is that the case? I'm trying to figure out what exactly I'm asking for in an RFE.

-Aaron

On 10/11/16 7:57 PM, Aaron Knister wrote:
> I think I was a little quick to the trigger. I re-read your last mail after doing some testing and understand it differently. I was wrong about my interpretation -- you can add 4K NSDv2 formatted NSDs to a filesystem previously created with NSDv1 NSDs assuming, as you say, the minReleaseLevel and filesystem version are high enough. That negates about half of my last e-mail. The fs still doesn't show as 4K aligned:
>
> loressd01:~ # /usr/lpp/mmfs/bin/mmlsfs tnb4k --is4KAligned
> flag                value                    description
> ------------------- ------------------------ -----------------------------------
>  --is4KAligned      No                       is4KAligned?
>
> but *shrug* most of the I/O to these disks should be 1MB anyway. If somebody is pounding the FS with smaller than 4K I/O they're gonna get a talkin' to.
>
> -Aaron
>
> On 10/11/16 6:41 PM, Aaron Knister wrote:
>> Thanks Yuri. I'm asking for my own purposes but I think it's still relevant here: we're still at GPFS 3.5 and will be adding dataOnly NSDs with 4K sectors in the near future. We're planning to update to 4.1 before we format these NSDs, though. If I understand you correctly we can't bring these 4K NSDv2 NSDs into a filesystem with 512b-based NSDv1 NSDs? That's a pretty big deal :(
>>
>> Reformatting every few years with 10's of petabytes of data is not realistic for us (it would take years to move the data around). It also goes against my personal preachings about GPFS's storage virtualization capabilities: the ability to perform upgrades and make underlying storage infrastructure changes with behind-the-scenes data migration, eliminating much of the manual hassle of storage administrators doing rsync dances. I guess it's RFE time? It also seems as though AFM could help with automating the migration, although many of our filesystems do not have filesets on them, so we would have to re-think how we lay out our filesystems.
>>
>> This is also curious to me with IBM pitching GPFS as a filesystem for cloud services (the cloud *never* goes down, right?). Granted, I believe this pitch started after the NSDv2 format was defined, but if somebody is building a large cloud with GPFS as the underlying filesystem for an object or an image store, one might think the idea of having to re-format the filesystem to gain access to critical new features is inconsistent with this pitch. It would be hugely impactful. Just my $.02.
>>
>> As you can tell, I'm frustrated there's no online conversion tool :) Not that there couldn't be... you all are brilliant developers.
>>
>> -Aaron
>>
>> On 10/11/16 1:22 PM, Yuri L Volobuev wrote:
>>> This depends on the committed cluster version level (minReleaseLevel) and file system format. Since NSDv2 is an on-disk format change, older code wouldn't be able to understand what it is, and thus if there's a possibility of a downlevel node looking at the NSD, the NSDv1 format is going to be used. The code does NSDv1<->NSDv2 conversions under the covers as needed when adding an empty NSD to a file system.
>>>
>>> I'd strongly recommend getting a fresh start by formatting a new file system. Many things have changed over the course of the last few years. In particular, having a 4K-aligned file system can be a pretty big deal, depending on what hardware one is going to deploy in the future, and this is something that can't be bolted onto an existing file system. Having 4K inodes is very handy for many reasons. New
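For anyone hitting the same wall, a quick pre-flight check before trying to add 4K LUNs could look like this (a sketch only; tnb4k is just the example device from the output above, and the queried attributes are standard mmlsfs / mmlsconfig fields):

# Is the file system 4K aligned, what inode size was it created with,
# and which file system format version is it at?
/usr/lpp/mmfs/bin/mmlsfs tnb4k --is4KAligned -i -V

# Which release level is the cluster committed to (relevant for NSDv2 support)?
/usr/lpp/mmfs/bin/mmlsconfig minReleaseLevel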

Re: [gpfsug-discuss] Fwd: Blocksize

2016-09-29 Thread Olaf Weiser
Hi - let me try to explain.

[...] I.e. 256 / 32 = 8K, so am I reading / writing *2* inodes (assuming 4K inode size) minimum? [...]

Answer: your inodes are written in a separate (hidden) file. With an MD blocksize of 256K you can access 64 inodes with one IO to the file system, so e.g. a policy run needs to initiate 1 IO to read 64 inodes. If your MD blocksize were 1MB, you could access 256 inodes with one IO to the file system metadata (policy runs).

If you write a new regular file, an inode gets created for it and gets written into your inode file. Forget about the MD blocksize here - it gets written directly, so you will see an IO of 8 sectors (512-byte segment size) to the MD, in case your inode size is 4K.

In addition, other metadata is stored in the system pool, like directory blocks or indirect blocks. These blocks are 32K, and so, if you chose a blocksize for MD > 1 MB, you would waste some space because of the rule that 1/32 of the blocksize is the smallest allocatable space.

In one line, my advice: select 1MB blocksize for MD.

[..] disk layout: keep in mind that your #IOPS is most likely limited by your storage backend. With spinning drives you can estimate around 100 IOPS per drive. Even though the metadata is stored in a hidden file, inodes are accessed directly from/to disk during normal operation. Your backend should be able to cache these IOs accordingly, but you won't be able to avoid that inodes have to be flushed to disk and, the other way round, read from disk without accessing a full stripe of your RAID. So depending on the BE, an N-way replication is more efficient here than a RAID6 or 8+2p.

In addition, keep in mind: if you divide 1 MB (the blocksize of the FS) across a RAID6 or 8+2p, the data transfer size to each physical disk is rather small and will hurt your performance.

Last but not least, in terms of IO you can save a lot of IOPS to the physical disk layer if you go with an n-way replication in comparison to RAID6, because every physical disk these days can satisfy a 1MB IO request. So if you initiate 1 IO of 1MB from GPFS, it can be answered with exactly 1 IO from a physical disk (compared to RAID6, where your storage backend would have to satisfy this single 1MB IO with at least 4 or 8 IOs). MD is rather small, so the trade-off (waste of space) can be ignored - so go with RAID1 or n-way replication for MD.

hope this helps..

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com

From:        "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>
To:          gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:        09/29/2016 05:03 PM
Subject:     [gpfsug-discuss] Fwd: Blocksize
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

Resending from the right e-mail address...

Begin forwarded message:

From: gpfsug-discuss-ow...@spectrumscale.org
Subject: Re: [gpfsug-discuss] Blocksize
Date: September 29, 2016 at 10:00:36 AM CDT
To: k...@accre.vanderbilt.edu

You are not allowed to post to this mailing list, and your message has been automatically rejected. If you think that your messages are being rejected in error, contact the mailing list owner at gpfsug-discuss-ow...@spectrumscale.org.

From: "Kevin L. Buterbaugh" <k...@accre.vanderbilt.edu>
Subject: Re: [gpfsug-discuss] Blocksize

Hi Marc and others,

I understand … I guess I did a poor job of wording my question, so I’ll try again. The IBM recommendation for metadata block size seems to be somewhere between 256K - 1 MB, depending on who responds to the question. If I were to hypothetically use a 256K metadata block size, does the “1/32nd of a block” come into play like it does for “not metadata”? I.e. 256 / 32 = 8K, so am I reading / writing *2* inodes (assuming 4K inode size) minimum?

Date: September 29, 2016 at 10:00:29 AM CDT
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>

Hi Marc and others,

I understand … I guess I did a poor job of wording my question, so I’ll try again. The IBM recommendation for metadata block size seems to be somewhere between 256K - 1 MB, depending on who responds to the question. If I were to hypothetically use a 256K m
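To make the arithmetic above concrete, and to show where those values are set at creation time, a small sketch (the device name gpfs01, the stanza file nsd.stanza and the 4M data block size are placeholders; the mmcrfs options used are the standard ones for block size, metadata block size, inode size and metadata replication):

# Inodes read per metadata-block IO = metadata blocksize / inode size:
#   256 KiB / 4 KiB = 64, 1 MiB / 4 KiB = 256
echo $((256*1024/4096)) $((1024*1024/4096))

# Hypothetical file system following the advice above: 1 MiB metadata block size,
# 4 KiB inodes, metadata replicated 2-way (RAID1 / n-way replication on the MD disks).
mmcrfs gpfs01 -F nsd.stanza -B 4M --metadata-block-size 1M -i 4096 -m 2 -M 2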

Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection"

2016-08-30 Thread Olaf Weiser

there're multiple dependencies; the performance of an MD scan is related to the following.

As a rule of thumb, the total amount of IOPS you need to scan your MD is highly dependent on the metadata blocksize, the inode size (assuming the default 4K) and the total number of inodes ;-). The time it takes to answer these IOs depends on your backend(s), on the parallelism and the nodes' hardware resources, and finally on the network connectivity (latency, bandwidth).

To give some directions: we even have clusters using regular (old and spinning) drives that are able to scan > 200 mio files within < 15 minutes.

From:        "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]"
To:          gpfsug main discussion list
Date:        08/31/2016 06:01 AM
Subject:     Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection"
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

Just want to add on to one of the points Sven touched on regarding metadata HW. We have a modest SSD infrastructure for our metadata disks and we can scan 500M inodes in parallel in about 5 hours if my memory serves me right (and I believe we could go faster if we really wanted to). I think having solid metadata disks (no pun intended) will really help with scan times.

From: Sven Oehme
Sent: 8/30/16, 7:25 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection"

so lets start with some simple questions.

when you say mmbackup takes ages, what version of gpfs code are you running?
how do you execute the mmbackup command? exact parameters would be useful.
what HW are you using for the metadata disks?
how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup?

sven

On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote:

Hello,

On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote:
> Find the paper here:
>
> https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection

thank you for the paper, I appreciate it.

However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that at peta scale you have to deal with millions (well, rather hundreds of millions) of files, and this is something where TSM does not scale well.

Could you give some hints:

On the backup side, mmbackup takes ages for:
a) the scan (try to scan 500M files even in parallel)
b) the backup - what if 10 % of files get changed? The backup process can be blocked for several days, as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale?

On the restore side: how can I restore e.g. 40 million files efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used); however, scanning 1000 more files takes several minutes, so the dsmc restore never reaches those 40M files.

Using filelists the situation is even worse. I ran dsmc restore -filelist with a filelist consisting of 2.4M files. It ran for *two* days without restoring even a single file, with dsmc consuming 100 % CPU.

So any hints addressing these issues with a really large number of files would be even more appreciated.

--
Lukáš Hejtmánek
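As a back-of-the-envelope illustration of the rule of thumb above (all numbers are assumptions for illustration, not measurements):

# inodes per metadata-block read: 1 MiB / 4 KiB = 256
echo $((1024*1024 / 4096))
# block reads for the inode file of 500M inodes: 500,000,000 / 256 = 1,953,125
echo $((500000000 / 256))
# at ~100 IOPS per spinning drive and, say, 20 drives behind the metadata NSDs:
# 1,953,125 / (20 * 100) = ~976 s, i.e. roughly 16 minutes for the raw inode reads alone
echo $((500000000 / 256 / (20 * 100))) seconds

Directory blocks, indirect blocks and the per-node parallelism of the scan come on top of that, which is why SSD metadata and more scan nodes shorten the wall clock so noticeably.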

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] iowait?

2016-08-29 Thread Olaf Weiser
try mmfsadm dump iohist

It gives you a nice approach to how long it takes until an IO is processed; the statistic reports the time the IO takes from GPFS <--> to your block devices (including the path to them).

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com

From:        Aaron Knister <aaron.s.knis...@nasa.gov>
To:          <gpfsug-discuss@spectrumscale.org>
Date:        08/29/2016 07:54 PM
Subject:     Re: [gpfsug-discuss] iowait?
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node, of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight.

On 8/29/16 1:50 PM, Alex Chekholko wrote:
> Any reason you can't just use iostat or collectl or any of a number of other standard tools to look at disk utilization?
>
> On 08/29/2016 10:33 AM, Aaron Knister wrote:
>> Hi Everyone,
>>
>> Would it be easy to have GPFS report iowait values in linux? This would be a huge help for us in determining whether a node's low utilization is due to some issue with the code running on it or if it's blocked on I/O, especially in a historical context.
>>
>> I naively tried on a test system changing schedule() in cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this:
>>
>> again:
>>   /* call the scheduler */
>>   if ( waitFlags & INTERRUPTIBLE )
>>     schedule();
>>   else
>>     io_schedule();
>>
>> Seems to actually do what I'm after, but generally bad things happen when I start pretending I'm a kernel developer.
>>
>> Any thoughts? If I open an RFE would this be something that's relatively easy to implement (not asking for a commitment *to* implement it, just that I'm not asking for something seemingly simple that's actually fairly hard to implement)?
>>
>> -Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
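Following that pointer, a small sketch that summarizes the recent IO history per disk on a node (mmdiag --iohist prints the same history without needing mmfsadm; the awk field positions are assumptions based on the iohist column layout shown earlier in this archive):

# Rough per-disk view of recent IO service times on a client or NSD server.
# Assumed columns: $4 = disk:sectorNum, $6 = service time in ms.
/usr/lpp/mmfs/bin/mmdiag --iohist | awk '
  /^[0-9][0-9]:[0-9][0-9]:/ {
    split($4, d, ":"); disk = d[1]
    n[disk]++; sum[disk] += $6; if ($6 > max[disk]) max[disk] = $6
  }
  END {
    for (disk in n)
      printf "disk %-4s %6d IOs  avg %7.3f ms  max %8.3f ms\n", disk, n[disk], sum[disk]/n[disk], max[disk]
  }'

It is not the per-process iowait Aaron is asking for, but it does show whether IOs issued from that node are spending their time in the storage path.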

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Aggregating filesystem performance

2016-07-12 Thread Olaf Weiser
Hi, a simple approach is to use the enhanced dstat statistics on the NSD server side.

Example:

cp /usr/lpp/mmfs/samples/util/dstat_gpfsops.py.dstat.0.7 /usr/share/dstat/dstat_gpfsops.py
export DSTAT_GPFS_WHAT="vio,vflush -c -n -d -M gpfsops --nocolor"
> dstat --gpfsops

Better, and more fully what you want: configure ZIMon / perfmon. This generates statistics like the following.

mmperfmon query compareNodes gpfs_nsdds_bytes_written 10

Legend:
 1:  gss01.frozen|GPFSNSDDisk|c1_LDATA1core|gpfs_nsdds_bytes_written
 2:  gss01.frozen|GPFSNSDDisk|c1_LDATA1fs1|gpfs_nsdds_bytes_written
 3:  gss01.frozen|GPFSNSDDisk|c1_LDATA2core|gpfs_nsdds_bytes_written
 4:  gss01.frozen|GPFSNSDDisk|c1_LDATA2fs1|gpfs_nsdds_bytes_written
 5:  gss01.frozen|GPFSNSDDisk|c1_LMETA1core|gpfs_nsdds_bytes_written
 6:  gss01.frozen|GPFSNSDDisk|c1_LMETA1fs1|gpfs_nsdds_bytes_written
 7:  gss01.frozen|GPFSNSDDisk|c1_LMETA2core|gpfs_nsdds_bytes_written
 8:  gss01.frozen|GPFSNSDDisk|c1_LMETA2fs1|gpfs_nsdds_bytes_written
 9:  gss02.frozen|GPFSNSDDisk|c1_RDATA1core|gpfs_nsdds_bytes_written
10:  gss02.frozen|GPFSNSDDisk|c1_RDATA1fs1|gpfs_nsdds_bytes_written
11:  gss02.frozen|GPFSNSDDisk|c1_RDATA2core|gpfs_nsdds_bytes_written
12:  gss02.frozen|GPFSNSDDisk|c1_RDATA2fs1|gpfs_nsdds_bytes_written
13:  gss02.frozen|GPFSNSDDisk|c1_RMETA1core|gpfs_nsdds_bytes_written
14:  gss02.frozen|GPFSNSDDisk|c1_RMETA1fs1|gpfs_nsdds_bytes_written
15:  gss02.frozen|GPFSNSDDisk|c1_RMETA2core|gpfs_nsdds_bytes_written
16:  gss02.frozen|GPFSNSDDisk|c1_RMETA2fs1|gpfs_nsdds_bytes_written

Row  Timestamp                gss01     gss01  gss01     gss01  gss01  gss01  gss01  gss01  gss02     gss02  gss02     gss02  gss02  gss02  gss02  gss02
  1  2016-02-27-01:22:06          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  2  2016-02-27-01:22:07          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  3  2016-02-27-01:22:08          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  4  2016-02-27-01:22:09          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  5  2016-02-27-01:22:10          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  6  2016-02-27-01:22:11          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0
  7  2016-02-27-01:22:12          0  83886080      0  67108864      0      0      0      0      0  16777216      0  16777216      0      0      0      0
  8  2016-02-27-01:22:13          0 436207616      0 452984832      0      0      0      0      0 436207616      0 419430400      0      0      0      0
  9  2016-02-27-01:22:14          0  16777216      0         0      0      0      0      0      0  67108864      0  83886080      0      0      0      0
 10  2016-02-27-01:22:15          0         0      0         0      0      0      0      0      0         0      0         0      0      0      0      0

You can filter the overall IOs by filesystem, or restrict them to dedicated nodes; it is very, very flexible, e.g.

mmperfmon query compareNodes gpfs_fs_bytes_written,gpfs_fs_bytes_read -n 5 -b 30 --filter gpfs_fs_name=beer

... and so on. You may need some minutes to set it up, but once it is configured, it is very powerful ... have fun ;-)

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com

From:        Brian Marshall <mimar...@vt.edu>
To:          gpfsug-discuss@spectrumscale.org
Date:        07/12/2016 03:13 PM
Subject:     [gpfsug-discuss] Aggregating filesystem performance
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

All,

I have a Spectrum Scale 4.1 cluster serving data to 4 different client clusters (~800 client nodes total). I am looking for ways to monitor filesystem performance to uncover network bottlenecks or job usage patterns affecting performance.

I received this info below from an IBM person. Does anyone have examples of aggregating mmperfmon data? Is anyone doing something different?

"mmpmon does not currently aggregate cluster-wide data. As of SS 4.1.x you can look at "mmperfmon query" as well, but it also primaril
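A crude way to aggregate that per-sensor output into one cluster-wide number per sample (a sketch only; the awk field positions assume the tabular layout shown above, which can differ between releases, so verify against your own output first):

# Sum NSD-server write throughput across all compared nodes, per timestamp bucket.
/usr/lpp/mmfs/bin/mmperfmon query compareNodes gpfs_nsdds_bytes_written -n 10 -b 60 |
awk '/^ *[0-9]+ +20[0-9][0-9]-/ {     # data rows look like: "Row Timestamp value value ..."
       total = 0
       for (i = 3; i <= NF; i++) total += $i
       printf "%s %15d bytes written\n", $2, total
     }'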

Re: [gpfsug-discuss] Migration policy confusion

2016-07-07 Thread Olaf Weiser
Hi, first of all, given the fact that the metadata is stored in the system pool, system should be the "fastest" pool / underlying disks you have. With "slow" access to the MD, access to data is very likely affected (except for cached data, where the MD is cached).

In addition: tell us how "big" the test files are that you moved with mmapplypolicy.

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com

From:        "mark.b...@siriuscom.com" <mark.b...@siriuscom.com>
To:          gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:        07/07/2016 03:00 PM
Subject:     [gpfsug-discuss] Migration policy confusion
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

Hello all, I’m struggling trying to understand tiering and policies in general in SpecScale. I have a single filesystem with two pools defined (system, GOLD). The GOLD pool is made up of some faster disks than the system pool. The policy I’m trying to get working is as follows:

RULE 'go_gold'
    MIGRATE
        FROM POOL 'system'
        TO POOL 'GOLD'
        WHERE (LOWER(NAME) LIKE '%.perf')

I’m simply trying to get the data to move to the NSDs in the GOLD pool. When I do an mmapplypolicy, mmlsattr shows that it’s now in the GOLD pool, but when I do a mmdf the data shows 100% free still. I tried a mmrestripefs as well and no change to the mmdf output. Am I missing something here? Is this just normal behavior and the blocks will get moved at some other time? I guess I was expecting instant gratification and that those files would have been moved to the correct NSDs.

Mark R. Bush | Solutions Architect
Mobile: 210.237.8415 | mark.b...@siriuscom.com
Sirius Computer Solutions | www.siriuscom.com
10100 Reunion Place, Suite 500, San Antonio, TX 78216
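For reference, a sketch of the invocation and verification being discussed (file system, policy file and file names are placeholders; -I yes makes mmapplypolicy move the data immediately, while -I defer only updates the pool assignment and leaves the actual block movement to a later mmrestripefs -p):

# Apply the migration rule and move the data now:
mmapplypolicy fs1 -P gold.pol -I yes

# Or: record the new pool assignment only, then move the blocks in one pass later:
mmapplypolicy fs1 -P gold.pol -I defer
mmrestripefs fs1 -p

# Verify where a file's data now lives, and how full each pool is:
mmlsattr -L /gpfs/fs1/testfile.perf     # shows the assigned storage pool
mmdf fs1 -P GOLD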

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] ESS GL6

2016-06-20 Thread Olaf Weiser
Hi Damir,

mmlsrecoverygroup        --> will show your RGs
mmlsrecoverygroup RG -L  --> will provide capacity information

or you can use the GUI.

With ESS / GNR there's no need any more to create more than one vdisk (=NSD) per RG for a pool. A practical approach/example for you, so a file system consists of:

1 vdisk (NSD) for metadata, RAID: 4WR, BS 1M, in RG "left"
1 vdisk (NSD) for metadata, RAID: 4WR, BS 1M, in RG "right"
1 vdisk (NSD) for data, 8+3p, BS 1...16M (depends on your data/workload), in RG "left"
1 vdisk (NSD) for data, 8+3p, BS 1...16M (depends on your data/workload), in RG "right"

So 4 NSDs to provide everything you need to serve a file system. The size of the vdisks can be up to half of the capacity of your RG.

Please note: if you come from an existing environment and the file system should be migrated to ESS (online), you might hit some limitations, like
- blocksize (can not be changed)
- disk size, depending on the existing storage pools / disk sizes

have fun
cheers

Mit freundlichen Grüßen / Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
Phone: +49-170-579-44-66, E-Mail: olaf.wei...@de.ibm.com

From:        Damir Krstic <damir.krs...@gmail.com>
To:          gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:        06/20/2016 05:10 PM
Subject:     [gpfsug-discuss] ESS GL6
Sent by:     gpfsug-discuss-boun...@spectrumscale.org

Couple of questions regarding Spectrum Scale 4.2 and ESS. We recently got our ESS delivered and are putting it in production this week. Previous to ESS we ran GPFS 3.5 and IBM DCS3700 storage arrays.

My question about ESS and Spectrum Scale has to do with querying available free space and adding capacity to an existing file system. In the old days of GPFS 3.5 I would create LUNs on the 3700, zone them to appropriate hosts, and then see them as multipath devices on NSD servers. After that, I would create NSD disks and add them to the filesystem.

With the ESS, however, I don't think the process is quite the same. The IBM tech that was here installing the system has created all the "LUNs" or the equivalent in the ESS system. How do I query what space is available to add to the existing filesystems, and then how do you actually add space?

I am reading the ESS RedBook but the answers are not obvious.

Thanks,
Damir
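A sketch of the usual GNR workflow for growing an existing file system on ESS (recovery group, declustered array, vdisk, pool and file system names below are made up; verify the stanza fields against the mmcrvdisk documentation for your ESS level before using):

# How much capacity is left per recovery group / declustered array?
mmlsrecoverygroup                       # list the recovery groups
mmlsrecoverygroup rg_gssio1-hs -L       # free space per declustered array in one RG

# Hypothetical stanza adding one more data vdisk per recovery group:
cat > new-vdisks.stanza <<'EOF'
%vdisk: vdiskName=rg1_data2 rg=rg_gssio1-hs da=DA1 blocksize=8m size=100t raidCode=8+3p diskUsage=dataOnly pool=data
%vdisk: vdiskName=rg2_data2 rg=rg_gssio2-hs da=DA1 blocksize=8m size=100t raidCode=8+3p diskUsage=dataOnly pool=data
EOF

mmcrvdisk -F new-vdisks.stanza          # create the vdisks in the recovery groups
mmcrnsd   -F new-vdisks.stanza          # turn them into NSDs
mmadddisk fs1 -F new-vdisks.stanza      # grow the existing file system with them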

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss