Re: [gpfsug-discuss] Backend corruption
Hi,

I tried to use a policy to find out what files are located on the broken disks. But it is not finding any files or directories (I cleaned some of the output):

    [I] GPFS Current Data Pool Utilization in KB and %
    Pool_Name  KB_Occupied  KB_Total     Percent_Occupied
    V53        173121536    69877104640  0.247751444%
    [I] 29609813 of 198522880 inodes used: 14.915063%.
    [I] Loaded policy rules from test.rule.
    rule 'ListRule' list 'ListName' from pool 'V53'
    [I] Directories scan: 28649029 files, 960844 directories, 0 other objects,
        0 'skipped' files and/or errors.
    [I] Inodes scan: 28649029 files, 960844 directories, 0 other objects,
        0 'skipped' files and/or errors.
    [I] Summary of Rule Applicability and File Choices:
    Rule#  Hit_Cnt  KB_Hit  Chosen  KB_Chosen  KB_Ill  Rule
        0        0       0       0          0       0  RULE 'ListRule' LIST 'ListName' FROM POOL 'V53'
    [I] Filesystem objects with no applicable rules: 29609873.
    [I] A total of 0 files have been migrated, deleted or processed by an
        EXTERNAL EXEC/script; 0 'skipped' files and/or errors.

So the policy is not finding any files, but there is still some data on the V5000 pool?

Stef

On 2020-08-03 17:21, Uwe Falke wrote:
> Hi, Stef,
>
> if just that V5000 has provided the storage for one of your pools
> entirely, and if your metadata are still uncorrupted, an inode scan with
> a suitable policy should yield the list of files on that pool. If I am
> not mistaken, the list policy could look like
>
>     RULE 'list_v5000' LIST 'v5000_filelist' FROM POOL '<poolname>'
>
> Put it into a (policy) file and run that via mmapplypolicy against the
> file system in question; it should produce a file listing in
> /tmp/v5000_filelist. If it doesn't work exactly like that (I might have
> made one or more mistakes), check out the information lifecycle section
> in the Scale admin guide.
>
> If the prereqs for the above are not met, you need to run more expensive
> investigations (using tsdbfs for all block addresses on V5000-provided
> NSDs).
>
> Mit freundlichen Grüßen / Kind regards
>
> Dr. Uwe Falke
> IT Specialist
> Global Technology Services / Project Services Delivery / High Performance Computing
> IBM Deutschland Business & Technology Services GmbH
>
> From: Stef Coene
> To: gpfsug-discuss@spectrumscale.org
> Date: 03/08/2020 16:07
> Subject: [EXTERNAL] [gpfsug-discuss] Backend corruption
>
> Hi,
>
> We have a GPFS file system which uses, among other storage, a V5000 as
> backend. There was an error in the fire detection system in the
> datacenter and a fire alarm was triggered. The result was that the V5000
> had a lot of broken disks. Most of the disks recovered fine after a
> reseat, but some data is corrupted on the V5000. This means that for
> 22 MB of data, the V5000 returns a read error to GPFS.
>
> We migrated most of the data to other disks, but there is still 165 GB
> left on the V5000 pool. When we try to remove the disks with mmdeldisk,
> it fails after a while and marks some of the disks as down. It generated
> a file with inodes; these are two example lines:
>
>     9168519 0:00 1 1 exposed illreplicated illplaced REGULAR_FILE Error: 218 Input/output error
>     9251611 0:00 1 1 exposed illreplicated REGULAR_FILE Error: 218 Input/output error
>
> How can I get a list of files that use data from the V5000 pool? The
> data is written by CommVault. When I have a list of files, I can
> determine the impact on the application.
>
> Stef
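For anyone wanting to reproduce the list-policy run quoted at the top of this message end to end, here is a minimal sketch. The device name gpfs01 and the /tmp/v5000 output prefix are illustrative assumptions; the rule itself matches the one above.

    # test.rule -- list every file whose data lives in pool 'V53'
    RULE 'ListRule' LIST 'ListName' FROM POOL 'V53'

    # Dry run: report what would be selected without acting on anything
    mmapplypolicy gpfs01 -P test.rule -I test -L 2

    # Deferred run: write the matched files to <prefix>.list.<listname>,
    # here /tmp/v5000.list.ListName, instead of invoking an external script
    mmapplypolicy gpfs01 -P test.rule -I defer -f /tmp/v5000

If such a run still selects zero files while the pool shows occupied space, the leftover blocks presumably belong to the damaged "exposed/illreplicated" inodes from the mmdeldisk report, which is where the tsdbfs route mentioned above comes in.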
[gpfsug-discuss] Backend corruption
Hi,

We have a GPFS file system which uses, among other storage, a V5000 as backend. There was an error in the fire detection system in the datacenter and a fire alarm was triggered. The result was that the V5000 had a lot of broken disks. Most of the disks recovered fine after a reseat, but some data is corrupted on the V5000. This means that for 22 MB of data, the V5000 returns a read error to GPFS.

We migrated most of the data to other disks, but there is still 165 GB left on the V5000 pool. When we try to remove the disks with mmdeldisk, it fails after a while and marks some of the disks as down. It generated a file with inodes; these are two example lines:

    9168519 0:00 1 1 exposed illreplicated illplaced REGULAR_FILE Error: 218 Input/output error
    9251611 0:00 1 1 exposed illreplicated REGULAR_FILE Error: 218 Input/output error

How can I get a list of files that use data from the V5000 pool? The data is written by CommVault. When I have a list of files, I can determine the impact on the application.

Stef
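For spot-checking individual files from such an inode report, two standard commands could help; a small sketch (file system name fs1 and the path are illustrative):

    # Show a file's storage pool (the 'storage pool name' field), plus its
    # replication settings and any illplaced/illreplicated flags
    mmlsattr -L /gpfs/fs1/path/to/file

    # Show NSD status for the file system; disks left 'down' by the failed
    # mmdeldisk run show up in the availability column
    mmlsdisk fs1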
Re: [gpfsug-discuss] GUI refresh task error
Ok, thanks for the answer. I will wait for the fix.

Stef

On 2020-07-16 15:25, Roland Schuemann wrote:
> Hi Stef,
>
> we already recognized this error too and opened a PMR/case at IBM. You
> can set this task to inactive, but this is not persistent: after a GUI
> restart it comes back again.
>
> This was the answer from IBM Support:
>
>> This will be fixed in the next release, 5.0.5.2. Right now there is no
>> workaround, but it will not cause issues besides the cosmetic "task
>> failed" message. Is this OK for you?
>
> So we ignore it (the GUI is still degraded) and wait for the fix.
>
> Freundliche Grüße / Kind regards
> Roland Schümann
>
> Infrastructure Engineering (BTE)
> CIO PB Germany
> Deutsche Bank | Technology, Data and Innovation
> Postbank Systems AG
>
> -----Original Message-----
> From: gpfsug-discuss-boun...@spectrumscale.org On behalf of Stef Coene
> Sent: Thursday, 16 July 2020 15:14
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] GUI refresh task error
>
> Hi,
>
> On a brand new 5.0.5 cluster we have the following errors on all nodes:
>
> "The following GUI refresh task(s) failed: WATCHFOLDER"
>
> It also says:
> "Failure reason: Command mmwatch all functional --list-clustered-status failed"
>
> Running mmwatch manually gives:
>
>     mmwatch: The Clustered Watch Folder function is only available in the
>     IBM Spectrum Scale Advanced Edition or the Data Management Edition.
>     mmwatch: Command failed. Examine previous error messages to determine cause.
>
> How can I get rid of this error? I tried to disable the task with:
>
>     chtask WATCHFOLDER --inactive
>     EFSSG1811C The task with the name WATCHFOLDER is already not scheduled.
>
> Stef
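In case someone else hits this before 5.0.5.2: the GUI keeps its own task list, with helper commands on the GUI node. A sketch; /usr/lpp/mmfs/gui/cli is the usual location for these commands, but treat the exact names and availability as release-dependent:

    # On the GUI node: list the GUI refresh tasks and their states
    /usr/lpp/mmfs/gui/cli/lstask

    # Re-run the failing task by hand to see the underlying command error
    /usr/lpp/mmfs/gui/cli/runtask WATCHFOLDER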
[gpfsug-discuss] GUI refresh task error
Hi,

On a brand new 5.0.5 cluster we have the following errors on all nodes:

"The following GUI refresh task(s) failed: WATCHFOLDER"

It also says:
"Failure reason: Command mmwatch all functional --list-clustered-status failed"

Running mmwatch manually gives:

    mmwatch: The Clustered Watch Folder function is only available in the
    IBM Spectrum Scale Advanced Edition or the Data Management Edition.
    mmwatch: Command failed. Examine previous error messages to determine cause.

How can I get rid of this error? I tried to disable the task with:

    chtask WATCHFOLDER --inactive
    EFSSG1811C The task with the name WATCHFOLDER is already not scheduled.

Stef
[gpfsug-discuss] Policy question
Hi,

I have a file system with 2 pools: V51 and NAS01. I want to use pool V51 as the default and migrate the oldest files to pool NAS01 when pool V51 fills up. Whatever rule combination I try, I cannot get this working.

This is the currently defined policy (created by the GUI):

    RULE 'Migration' MIGRATE FROM POOL 'V51' THRESHOLD(95,85)
         WEIGHT(10 - DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))
         TO POOL 'NAS01'

    RULE 'Default to V5000' SET POOL 'V51'

And also, how can I monitor the migration processes?

Stef
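Two things usually bite here, so a hedged sketch may help: a THRESHOLD rule only fires when the policy is actually installed in the file system, and when the lowDiskSpace event is wired to a policy run via a callback. The file system name fs1 is illustrative; the mmstartpolicy helper path follows the IBM documentation example, but verify it on your release:

    # Install the migration + placement rules as the active policy
    mmchpolicy fs1 policy.rules

    # Fire a policy run whenever a pool crosses its threshold; the daemon
    # raises lowDiskSpace periodically while utilization stays above it
    mmaddcallback MIGRATION --command /usr/lpp/mmfs/bin/mmstartpolicy \
        --event lowDiskSpace --parms "%eventName %fsName"

    # Monitoring: per-pool occupancy, and the currently installed policy
    mmdf fs1
    mmlspolicy fs1 -L

The migration run itself logs through mmapplypolicy, so its progress messages end up with the output of whatever command the callback starts.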
Re: [gpfsug-discuss] Toolkit
On 10/04/2016 04:44 PM, Aaron S Palazzolo wrote:
> Hi Stef,
>
> Thanks for your Install Toolkit feature request:
> "Is it possible to recreate the clusterdefinition.txt based on the
> current configuration?"
>
> A couple questions if you don't mind:
>
> 1) Are you using the Install Toolkit for base gpfs installation or
> solely protocol deployment?

I used the Spectrum Scale toolkit for the basic setup. Then I used some mm commands to add disks and change other stuff. But the toolkit is unaware of these changes.

> 2) If using it for base gpfs installation, do you create both NSDs and
> file systems with it or do you do this manually?

Like I said, I used the toolkit for the first few NSDs and later used mm commands for other NSDs. I have no problem using the mm commands, but I don't know what will happen if I want to use the toolkit to create protocol nodes.

Stef
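For readers finding this later: newer releases of the Install Toolkit added a command aimed at exactly this request, repopulating the cluster definition from the live cluster. A sketch, with the caveat that the exact option spelling varies by release (check ./spectrumscale config populate -h; the node name is illustrative):

    # From the installer node: rebuild clusterdefinition.txt from the
    # running cluster's configuration
    ./spectrumscale config populate --node gpfsnode1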
Re: [gpfsug-discuss] Blocksize
On 09/22/2016 08:36 PM, Stef Coene wrote:
> Hi,
>
> Is it needed to specify a different blocksize for the system pool that
> holds the metadata? IBM recommends a 1 MB blocksize for the file system.
> But I wonder whether a smaller blocksize (256 KB or so) for metadata is
> a good idea or not...

I have read the replies and this is what we will do in the end:

Since the back-end storage will be a V5000 with a default stripe size of 256 KB and we use 8 data disks per array, 256 KB * 8 = 2 MB, so a 2 MB block size for data is the best choice.

Since the block size for metadata is not that important in the latest releases, we will also go for a 2 MB block size for metadata. Inode size will be left at the default: 4 KB.

Stef
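A sketch of how those choices might look at file system creation time (device name fs1 and the stanza file name are illustrative):

    # 2 MB data blocks, default 4 KB inodes; NSDs and pools come from the
    # stanza file
    mmcrfs fs1 -F nsd.stanza -B 2M -i 4096

    # Had a smaller metadata block size been chosen instead, mmcrfs accepts
    # --metadata-block-size for a metadata-only system pool, e.g.:
    #   mmcrfs fs1 -F nsd.stanza -B 2M --metadata-block-size 256K -i 4096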
Re: [gpfsug-discuss] Blocksize
On 09/22/2016 09:07 PM, J. Eric Wonderley wrote:
> It defaults to 4k:
>
>     mmlsfs testbs8M -i
>     flag  value  description
>     ----- ------ -------------------
>     -i    4096   Inode size in bytes
>
> I think you can make it as small as 512 bytes. GPFS will store very
> small files in the inode.
>
> Typically you want your average file size to be your blocksize, and your
> filesystem has one blocksize and one inode size.

The files are not small, but around 20 MB on average. So I calculated with IBM that a 1 MB or 2 MB block size is best. But I'm not sure if it's better to use a smaller block size for the metadata.

The file system is not that large (400 TB) and will hold backup data from CommVault.

Stef
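As a back-of-the-envelope check on why a large block size costs little with 20 MB average files (assuming the classic 1/32 subblock factor of the 4.2.x releases):

    block size    = 2 MB
    subblock      = 2 MB / 32 = 64 KB    (smallest allocation unit)
    average file  = 20 MB ~= 10 full blocks
    tail waste    < 64 KB per file, i.e. under 0.4% of a 20 MB file

So large average files lose almost nothing to a 2 MB block size, while the bigger blocks keep sequential backup I/O efficient.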
[gpfsug-discuss] Blocksize
Hi,

Is it needed to specify a different blocksize for the system pool that holds the metadata? IBM recommends a 1 MB blocksize for the file system. But I wonder whether a smaller blocksize (256 KB or so) for metadata is a good idea or not...

Stef
Re: [gpfsug-discuss] Ubuntu client
On 09/20/2016 07:42 PM, Stef Coene wrote:
> Hi,
>
> I just installed 4.2.1 on 2 RHEL 7.2 servers without any issue. But I
> also need 2 clients on Ubuntu 14.04. I installed the GPFS client on the
> Ubuntu server and used mmbuildgpl to build the required kernel modules.
> ssh keys are exchanged between the GPFS servers and the client. But I
> can't add the node:
>
>     [root@gpfs01 ~]# mmaddnode -N client1
>     Tue Sep 20 19:40:09 CEST 2016: mmaddnode: Processing node client1
>     mmremote: The CCR environment could not be initialized on node client1.
>     mmaddnode: The CCR environment could not be initialized on node client1.
>     mmaddnode: mmaddnode quitting. None of the specified nodes are valid.
>     mmaddnode: Command failed. Examine previous error messages to determine cause.
>
> I don't see any error in /var/mmfs on the client or the server. What can
> I try to debug this error?

Pfff, problem solved. I tailed the logs in /var/adm/ras and found out there was a typo in /etc/hosts, so the hostname of the client was unresolvable.

Stef
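A pre-flight check along these lines catches that class of problem before mmaddnode runs (host names are illustrative):

    # Both directions must resolve consistently on every node
    getent hosts client1
    ssh client1 getent hosts gpfs01

    # Passwordless root ssh must work under the exact names GPFS will use
    ssh client1 hostname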
Re: [gpfsug-discuss] Blogs and publications on Spectrum Scale
Hi,

When trying to register on the website, each time I get the error:
"Session expired. Please try again later."

Stef