Hsieh,

This sounds similar to a bug with pre-2.5 servers and 2.7 (or newer) clients.  
The client and server have a disagreement about which does the delete, and the 
delete doesn’t happen.  Since you’re running 2.5, I don’t think you should see 
this, but the symptoms are the same.   You can temporarily fix things by 
restarting/remounting your OST(s), which will trigger orphan cleanup.  If 
that works, the only long-term fix is to upgrade your servers to a version that 
is expected to work with your clients.  (The 2.10 maintenance release is a good 
choice if you are not interested in the newest features; otherwise, 2.12 is 
also an option.)

I would also recommend that, where possible, you keep clients and servers on 
the same version. We do interop testing, but running the same version on both 
is much more widely used.

- Patrick
________________________________
From: lustre-discuss <[email protected]> on behalf of 
Tung-Han Hsieh <[email protected]>
Sent: Sunday, March 3, 2019 4:00:17 AM
To: [email protected]
Subject: [lustre-discuss] Data migration from one OST to anther

Dear All,

We have a problem with data migration from one OST to another.

We have installed Lustre-2.5.3 on the MDS and OSS servers, and Lustre-2.8
on the clients. We want to migrate some data from one OST to another in
order to rebalance the space usage among the OSTs. At first we followed
the old method (i.e., the method found in the Lustre-1.8.X manuals) for
the data migration. Suppose we have two OSTs:

root@client# /opt/lustre/bin/lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
chome-OST0028_UUID    7692938224  7246709148    55450156  99% /work[OST:40]
chome-OST002a_UUID   14640306852  7094037956  6813847024  51% /work[OST:42]

and we want to migrate data from chome-OST0028_UUID to chome-OST002a_UUID.
Our procedure is:

1. We deactivate chome-OST0028_UUID:
   root@mds# echo 0 > /opt/lustre/fs/osc/chome-OST0028-osc-MDT0000/active

2. We find all files located in chome-OST0028_UUID:
   root@client# /opt/lustre/bin/lfs find --obd chome-OST0028_UUID /work > list

3. For each file listed in the file "list", we ran:

        cp -a <file> <file>.tmp
        mv <file>.tmp <file>
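
Step 3 can be sketched as a small shell loop. This is only an illustration of
the copy-and-rename procedure described above, not a Lustre tool: the function
name "migrate_list" is hypothetical, and it assumes the list holds one path per
line and that a cp supporting "-a" and "--" (e.g. GNU cp) is available.

```shell
#!/bin/sh
# Sketch of step 3: rewrite each file named in a list file.
# "migrate_list" is a hypothetical helper name (assumption, not from Lustre).
migrate_list() {
    list_file=$1
    while IFS= read -r f; do
        # Skip entries that are not regular files (directories, vanished files).
        [ -f "$f" ] || continue
        # Copy with attributes preserved, as in "cp -a <file> <file>.tmp".
        if cp -a -- "$f" "$f.tmp"; then
            # Rename the copy over the original, as in "mv <file>.tmp <file>".
            mv -- "$f.tmp" "$f"
        else
            # On a failed copy, leave the original alone and drop the partial copy.
            echo "copy failed, original left untouched: $f" >&2
            rm -f -- "$f.tmp"
        fi
    done < "$list_file"
}
```

With the file from step 2, this would be called as "migrate_list list". Files
that are open for writing during the copy are not handled by this sketch.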

During the migration, we did see more and more data being written to
chome-OST002a_UUID, but we did not see any space released on chome-OST0028_UUID.
In Lustre-1.8.X, doing it this way, we saw chome-OST002a_UUID receive
more data while chome-OST0028_UUID gained more and more free space.
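
To make the comparison concrete, the "Used" column for a single OST can be
pulled out of the "lfs df" output before and after the migration. The following
is a minimal sketch: "ost_used_kb" is a hypothetical helper name, and it assumes
the column layout shown in the "lfs df" listing above (UUID first, Used third).

```shell
#!/bin/sh
# Print the Used column (in 1K-blocks) for one OST from "lfs df" output on stdin.
# "ost_used_kb" is a hypothetical helper name (assumption, not a Lustre command).
ost_used_kb() {
    # $1: OST UUID exactly as printed in the first column of "lfs df"
    awk -v uuid="$1" '$1 == uuid { print $3 }'
}
```

For example, "/opt/lustre/bin/lfs df | ost_used_kb chome-OST0028_UUID" run
before and after the copy would show whether the used space on the source OST
actually dropped.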

It looks as if the data files referenced by the MDT have been copied to
chome-OST002a_UUID, but the junk still remains on chome-OST0028_UUID.
Even after we reactivated chome-OST0028_UUID following the migration, the
situation is still the same:

root@mds# echo 1 > /opt/lustre/fs/osc/chome-OST0028-osc-MDT0000/active

Is there any way to cure this problem?


Thanks very much.

T.H.Hsieh
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org