Thanks Rick,

I had always assumed that data resided on the MDT, but your explanation makes sense: these are temporary files used by mysqld, which may have been deleted while they were being migrated. Since they are not needed anyway, I just unlink-ed them (rm stats the file before removal, so it fails outright).
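
For anyone hitting the same symptom, here is a minimal Python sketch of that clean-up, under some assumptions: the function names find_unstatable and unlink_entries are hypothetical, and it treats any directory entry whose lstat() fails with ENOENT as one of the broken files. It only reports by default; removal is a separate, explicit step that calls unlink(2) directly without stat-ing first. I have not verified it against a damaged Lustre directory, so review the reported names before removing anything.

```python
import errno
import os


def find_unstatable(directory):
    """Return names listed in `directory` whose lstat() fails with ENOENT."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                # Listed by readdir, but stat says it does not exist.
                broken.append(name)
            else:
                raise
    return broken


def unlink_entries(directory, names):
    """Call unlink(2) on each name directly, without stat-ing it first."""
    for name in names:
        os.unlink(os.path.join(directory, name))
```

On a healthy directory find_unstatable should return an empty list, unless a file is deleted between the listing and the lstat() call, which is another reason to check the list by hand first.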

IIRC lfsck needs to be done with the whole volume offline?

Best regards,
Angelos

On 04/03/2021 06:10, Mohr, Rick via lustre-discuss wrote:
Angelos,

If a file still existed on the MDS but its data on the OST had somehow been 
removed, then you might see symptoms like those you described (stat fails 
because info can't be retrieved from the OST, but lfs getstripe can still query 
layout info from the MDS).  But if that is the case, I can't really say how it 
might have happened in the first place.

Have you tried running lfsck to look for consistency problems?

--Rick


On 3/2/21, 5:24 AM, "lustre-discuss on behalf of Angelos Ching via lustre-discuss" 
<[email protected] on behalf of [email protected]> 
wrote:

     Dear all,

     I was migrating some OSTs using lfs_migrate and things went mostly
     fine, except for a few files that might have been in use during
     the migration:

     > # ls
     > ls: cannot access ibleTHWm: No such file or directory
     > ls: cannot access ib7rP0qy: No such file or directory
     > ls: cannot access ib3AQ9vK: No such file or directory
     > ls: cannot access ib30N1p9: No such file or directory
     > ib30N1p9  ib3AQ9vK  ib7rP0qy  ibleTHWm
     > # stat ib30N1p9
     > stat: cannot stat ‘ib30N1p9’: No such file or directory
     > # lfs getstripe ib30N1p9
     > ib30N1p9
     > lmm_stripe_count:  1
     > lmm_stripe_size:   1048576
     > lmm_pattern:       raid0
     > lmm_layout_gen:    0
     > lmm_stripe_offset: 1
     >     obdidx         objid         objid         group
     >          1          71909438        0x449403e 0
     The files can't be stat'ed, but lfs getstripe still returns their layout.

     The same error appears on all clients and I've tried unmounting and
     remounting the MDT on the server side already.

     Any idea what might have been corrupted and what could be the fix?

     Cheers,

     --
     Angelos Ching
     ClusterTech Limited

     Tel     : +852-2655-6138
     Fax     : +852-2994-2101
     Address    : Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, Hong Kong

     Got praises or room for improvements? http://bit.ly/TellAngelos

     

     _______________________________________________
     lustre-discuss mailing list
     [email protected]
     http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
