I would not plan a direct upgrade until Whamcloud fixes the underlying issue. Currently the only viable way seems to be a step-by-step upgrade. I imagine you'd first upgrade to 2.10.8, and then copy all old files to a new place (something like: mkdir .new_copy; rsync -a * .new_copy; rm -rf *; mv .new_copy/* .; rmdir .new_copy) so that all files are re-created with correct information. Knut's script is a hack and a last resort.
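To make the copy step a bit less fragile than the one-liner, here is a rough sketch of it as a shell function. It is untested on a real Lustre file system and rests on some assumptions of mine: the function name recreate_files is made up, it uses cp -a where the one-liner uses rsync -a, and it swaps each top-level entry in as soon as its copy succeeds, so the extra space needed is roughly the size of the largest entry. Unlike a bare *, it also picks up dotfiles at the top level.

```shell
#!/bin/sh
# Sketch only: re-create every entry in a directory by copying it and then
# swapping the copy in for the original. The function name and layout are
# illustrative assumptions, not something taken from LU-13392.

recreate_files() {
    dir=$1
    stage="$dir/.new_copy"        # staging directory, named as in the one-liner

    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    mkdir "$stage" || return 1    # fails if an earlier run left it behind

    # The three patterns cover normal names and dotfiles, but not "." or "..".
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # Skip patterns that matched nothing, and the staging directory itself.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        [ "$entry" = "$stage" ] && continue

        name=${entry##*/}
        # Copy first; the original is removed only after the copy succeeded.
        cp -a -- "$entry" "$stage/" || return 1
        rm -rf -- "$entry" || return 1
        mv -- "$stage/$name" "$dir/" || return 1
    done

    rmdir "$stage"
}
```

You would call it as, for example, recreate_files /path/to/project, as a user allowed to preserve ownership (normally root). Hard links between different top-level entries end up as separate files, and nothing should be writing to the directory while this runs.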

-----Original Message-----
From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> On Behalf Of Patrick Shopbell
Sent: Wednesday, June 24, 2020 12:36
To: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] problem after upgrading 2.10.4 to 2.12.4

Hello all,

I have been following this discussion with interest, as we are in the process of a long-overdue upgrade of our small Lustre system. We are moving everything from

RHEL 6 + Lustre 2.5.2

to

RHEL 7 + Lustre 2.8.0

We are taking this route merely because 2.8.0 supported both RHEL 6 and 7, and so we could keep running, to some extent. (In reality, we have found that v2.8 clients crash our v2.5 MGS on a pretty regular basis.)

Once our OS upgrades are done, the plan is to then take everything to

RHEL 7 + Lustre 2.12.x

From what I gather on this thread, however... I should expect to have some difficulty reading most of my files, since we have been running 2.5 for a long time. And so I should plan on running Knut's 'update_25_objects' on all of my OSTs? Is that correct? Would I need to do that at Lustre 2.8.0, or not until I get to v2.12?

Also, I assume this issue is independent of the underlying filesystem - we are still running lustrefs on our 12 OSTs, rather than ZFS.

Thanks so much. This list is always very helpful and interesting.

--
Patrick

On 6/24/20 1:16 AM, Franke, Knut wrote:
> On Tuesday, 23.06.2020, at 20:03 +0000, Hebenstreit, Michael wrote:
>> Is there any way to stop the scans on the OSTs?
> Yes, by re-mounting them with -o noscrub. This doesn't fix the issue
> though.
>
>> Is there any way to force the file system checks?
> As shown in your second mail, the scrubs are already running.
> Unfortunately, they don't (as of Lustre 2.12.4) fix the issue.
>
>> Has anyone found a workaround for the FID sequence errors?
> Yes, see the script attached to LU-13392. In short:
>
> 0. Make sure you have a backup. This might eat your lunch and fry your
>    cat for afters.
> 1. Enable the canmount property on the backend filesystem. For example:
>    [oss]# zfs set canmount=on mountpoint=/mnt/ostX ${fsname}-ost/ost
> 2. Mount the target as 'zfs'. For example:
>    [oss]# zfs mount ${fsname}-ost/ost
> 3. update_25_objects /mnt/ostX
> 4. unmount and remount the OST as 'lustre'
>
> This will rewrite the extended attributes of OST objects created by
> Lustre 2.4/2.5 to a format compatible with 2.12.
>
>> Can I downgrade from 2.12.4 to 2.10.8 without destroying the FS?
> We've done this successfully, but again - no guarantees.
>
>> Has the error described in https://jira.whamcloud.com/browse/LU-13392
>> been fixed in 2.12.5?
> I don't think so.
>
> Cheers,
> Knut

--
*--------------------------------------------------------------------*
| Patrick Shopbell               Department of Astronomy             |
| p...@astro.caltech.edu         Mail Code 249-17                    |
| (626) 395-4097                 California Institute of Technology  |
| (626) 568-9352 (FAX)           Pasadena, CA 91125                  |
| WWW: http://www.astro.caltech.edu/~pls/                            |
*--------------------------------------------------------------------*

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org