Thanks, Olaf! I've sent an email to mramos to open a PMR for this case.
 
 
Abraços / Regards / Saludos,

 

Anderson Nobre
AIX & Power Consultant
Master Certified IT Specialist
IBM Systems Hardware Client Technical Team – IBM Systems Lab Services

 

Phone: 55-19-2132-4317
E-mail: [email protected]
 
 
----- Original message -----
From: "Olaf Weiser" <[email protected]>
Sent by: [email protected]
To: gpfsug main discussion list <[email protected]>
Cc:
Subject: Re: [gpfsug-discuss] Top files on GPFS filesystem
Date: Wed, Sep 5, 2018 12:02 PM
 
Hmm... limiting I/O resources per file is really not an easy thing to do.
Please consider opening a PMR and tell me the number. There is an undocumented parameter that can limit the buffers per file, so indirectly you can influence the number of threads working per file.
But this change will affect all files on that node.
I have had good experiences with that for SAS, which here is mostly read-driven (running on RHEL).





From:        "Anderson Ferreira Nobre" <[email protected]>
To:        [email protected]
Cc:        [email protected]
Date:        09/04/2018 09:40 PM
Subject:        Re: [gpfsug-discuss] Top files on GPFS filesystem
Sent by:        [email protected]



Hi Olaf,
 
Thanks, and sorry for taking so long to reply. We've been testing several ways to provide this information to the user. Let me give you more details.
There's a corporate SASGRID consolidating several SAS applications from several business areas, all of them using the same saswork filesystem. So the idea is to provide a way to identify the top processes or users doing the most I/O in terms of throughput or IOPS. We have tested the following:
-  fileheat and policy engine to identify most active files: We first activated fileheat by executing the command
   # mmchconfig fileHeatLossPercent=25,fileHeatPeriodMinutes=720
   After that, the SAS admin started a job and we created the following policy to see if we could detect the corresponding SAS file:
   rule 'fileheatlist' list 'hotfiles' weight(FILE_HEAT)
  SHOW( HEX( XATTR( 'gpfs.FileHeat' )) ||
     ' A=' || varchar(ACCESS_TIME) ||
     ' K=' || varchar(KB_ALLOCATED) ||
     ' H=' || varchar(FILE_HEAT) ||
     ' U=' || varchar(USER_ID) ||
     ' G=' || varchar(GROUP_ID) ||
     ' FZ=' || varchar(FILE_SIZE) ||
     ' CT=' || varchar(CREATION_TIME) ||
     ' CHT=' || varchar(CHANGE_TIME) ||
     ' M=' || varchar(MODIFICATION_TIME) )

    where FILE_HEAT != 0.0
  Then, we executed the command:
  # mmapplypolicy  -P policy-file-heat.txt -I defer -f test1
   I don't know why, but it always reported that zero files were selected. I don't know what's missing or whether that's just the way it is.
- Combine mmdiag with a list of files generated by the ILM engine: To find the busiest files, we executed the following command:
  # mmdiag --iohist verbose > mmdiag--iohist_verbose.out
  One way to list the top files was this (note that the inode column has to be sorted before uniq -c, otherwise repeated inodes are counted separately):
  # grep data mmdiag--iohist_verbose.out | awk '{print $10}' | sort -n | uniq -c | sort -nr | head
     15 135003
      4 134985
      2 46465
      1 64171
      1 64094
      1 64013

  Another one was sorting by I/O service time:
  # grep data mmdiag--iohist_verbose.out | sort -k6 -nr | head
03:12:11.911813  W        data    2:132768           8   11.782  cli  0AC3C23C:58AEDD53    10.195.194.60    451799         0 Sync      SyncFSWorkerThread
03:12:10.927003  W        data    1:5410160          8   11.086  cli  0AC3C23C:58091F75    10.195.194.60     46465      1319 Sync      SyncFSWorkerThread
03:12:11.927521  W        data    2:113995072        8    7.602  cli  0AC3C23C:58AEDD53    10.195.194.60    451776         1 Sync      SyncFSWorkerThread
03:12:10.999507  W        data    2:149912432       24    3.830  cli  0AC3C23C:58AEDC8D    10.195.194.60    134985         4 Sync      SyncFSWorkerThread
03:12:20.190427  W        data    1:40854976         8    3.058  cli  0AC3C23C:58091F75    10.195.194.60     64013         0 Sync      SyncFSWorkerThread
03:12:11.923742  W        data    2:182741840        8    3.036  cli  0AC3C23C:58AEDD53    10.195.194.60    385976         0 Sync      SyncFSWorkerThread
03:12:20.186045  W        data    1:41352672        16    2.451  cli  0AC3C23C:58091F1B    10.195.194.60    451774         2 Sync      SyncFSWorkerThread
03:12:16.139833  W        data    2:149912416       24    1.595  cli  0AC3C23C:58AEDC8D    10.195.194.60    134985         4 Cleaner   CleanBufferThread
03:12:21.544674  W        data    3:146654840        8    0.873  cli  0AC3C23C:592334F8    10.195.194.60    451780         0 Sync      SyncFSWorkerThread
03:12:10.998636  W        data    2:149912352        8    0.833  cli  0AC3C23C:58AEDC8D    10.195.194.60    134985         4 Sync      SyncFSWorkerThread
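Rather than eyeballing individual records, the two views above can be combined into one aggregation. A small awk sketch, assuming the same column layout as the sample output (service time in ms in column 6, inode number in column 10):

```shell
# Sum I/O service time (ms) and count operations per inode from the saved
# `mmdiag --iohist verbose` output, then list the busiest inodes first.
awk '$3 == "data" { ms[$10] += $6; ops[$10]++ }
     END { for (i in ms) printf "%s %.3f %d\n", i, ms[i], ops[i] }' \
    mmdiag--iohist_verbose.out | sort -k2 -nr | head
```

The output is one line per inode: inode number, total service time in ms, and number of data I/Os.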

  To discover which filesystem that inode number belongs to, we matched the disk id from the iohist output against mmlsnsd:
  # mmlsnsd -L | grep 58AEDD53
sasconfig     nsdconfig0001 0AC3C23C58AEDD53   host1,host2
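Since the low half of the disk id in the iohist records (58AEDD53 here) appears inside the NSD id column, the lookup can be scripted; a sketch, assuming the `mmlsnsd -L` layout shown above:

```shell
# Print the filesystem owning the NSD whose id contains the given fragment
# (the part after the colon in the iohist disk id field).
mmlsnsd -L | awk -v id="58AEDD53" '$3 ~ id { print $1 }'
```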

 Then we could run a policy rule to just list the files. Here is the policy:
  rule 'fileheatlist' list 'hotfiles' weight(FILE_HEAT)
     show( ' U=' || varchar(USER_ID) ||
           ' G=' || varchar(GROUP_ID) ||
           ' A=' || varchar(ACCESS_TIME) ||
           ' K=' || varchar(KB_ALLOCATED) ||
           ' H=' || varchar(computeFileHeat(CURRENT_TIMESTAMP-ACCESS_TIME,xattr('gpfs.FileHeat'),KB_ALLOCATED)) ||
           ' FZ=' || varchar(FILE_SIZE) ||
           ' CT=' || varchar(CREATION_TIME) ||
           ' CHT=' || varchar(CHANGE_TIME) ||
           ' M=' || varchar(MODIFICATION_TIME) )

  # mmapplypolicy sasconfig -P policy-file-heat3.txt -I defer -f teste6
  Then we could grep by inode number and see which file it is:
  # grep "^451799 " teste6.list.hotfiles
  For privacy reasons I won't show the result, but it found the file. The good thing is that this list also provides the UID and GID of the file. We are still waiting for feedback from the SAS admin to see whether this is acceptable.
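The two steps can be chained: take every inode seen in the iohist records and look each one up in the deferred policy list. A sketch, reusing the file names from the runs above and assuming the list file starts each line with the inode number followed by a space:

```shell
# For each inode doing data I/O (column 10 of the iohist records), print its
# entry from the deferred policy list, which carries the UID, GID and path.
grep data mmdiag--iohist_verbose.out | awk '{print $10}' | sort -nu |
while read -r inode; do
    grep "^${inode} " teste6.list.hotfiles
done
```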
- dstat with --gpfs-ops --top-io-adv|--top-bio|--top-io: The problem is that it only shows one process, which is not enough.
- Systemtap: It didn't work. I think it's because there are no GPFS symbols. If somebody knows how to add GPFS symbols, that would be very handy.
- QOS: We first enabled QOS to just collect filesystem statistics:
  # mmchqos saswork --enable --fine-stats 60 --pid-stats yes
  Then the SAS admin started another SAS job and got its PID, and we ran the following command:
  # mmlsqos saswork --fine-stats 2 --seconds 60 | grep SASPID
  We never matched the PIDs: ps -ef | grep nodms returns a 5-digit PID, while mmlsqos reports 8-digit PIDs. We have a ticket opened to understand what's happening.
 
After all this time trying to figure out a way to generate this report, I think the problem is more complex. Even if we get this information, what could we do to limit those processes? I think the best option would be AIX servers running WLM, with the saswork filesystems local on each server. That way we could not only monitor but also define classes, shares, and limits for I/O. I don't think Red Hat, or Linux in general, has a workload manager like AIX's.
 
 
Abraços / Regards / Saludos,

Anderson Nobre
 
----- Original message -----
From: "Olaf Weiser" <[email protected]>
Sent by: [email protected]
To: gpfsug main discussion list <[email protected]>
Cc:
Subject: Re: [gpfsug-discuss] Top files on GPFS filesystem
Date: Mon, Aug 13, 2018 3:10 AM


There's no mm* command to get it cluster-wide.
You can use fileheat and the policy engine to identify the most active files, and furthermore combine it with migration rules to replace those files.
Please note: for files that are accessed very heavily but with all requests answered out of the pagepool (cached files), fileheat does not get increased for cache hits. Fileheat is only counted for real I/Os to disk, as intended.







From:        "Anderson Ferreira Nobre" <[email protected]>
To:        [email protected]
Date:        08/10/2018 08:10 PM
Subject:        [gpfsug-discuss] Top files on GPFS filesystem
Sent by:        [email protected]



Hi all,

Does anyone know how to list the top files by throughput and IOPS in a single GPFS filesystem, like filemon in AIX?


 
Abraços / Regards / Saludos,

Anderson Nobre


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org

http://gpfsug.org/mailman/listinfo/gpfsug-discuss


 