On 2012-10-18, at 16:11, Jason Brooks <brook...@ohsu.edu> wrote:

I suffered an OSS crash where the server had a CPU fault.  I have it running 
again, but I am trying to decommission it.  I am migrating the data off of it 
onto other OSTs using the lfs find command with lfs_migrate.

It's been nearly 36 hours and about 2 terabytes have been moved.  This means I 
am about halfway.  Is this a decent rate?

Depends on how large your files are, and how fast the network is, but I 
wouldn't call it outstanding...
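
As a rough check on the figures quoted above, here is a small sketch of the 
arithmetic.  It assumes decimal terabytes and megabytes and treats "nearly 36 
hours" as exactly 36, so the result is only an approximation:

```shell
# Figures from the message above; decimal units are an assumption.
bytes_moved=$((2 * 1000 * 1000 * 1000 * 1000))   # about 2 TB moved so far
elapsed_s=$((36 * 3600))                         # about 36 hours, in seconds

# Integer division is precise enough for a ballpark figure.
rate_mb_s=$((bytes_moved / elapsed_s / 1000000))
echo "approx ${rate_mb_s} MB/s"                  # prints: approx 15 MB/s
```

That is the combined rate across both lfs_migrate clients, not a per-client 
figure.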

Here are the particulars, which basically are snags.  I know they affect 
things, I just am not certain to what degree:

  1.  I am running lfs_migrate on two systems, migrating different 
subdirectories of the same mount point.

This increases contention on the MDS, but two clients shouldn't be overloading 
the server.  Presumably you are only finding and migrating files which are 
striped over the affected server?
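
For reference, a minimal sketch of restricting the migration to files that 
have objects on one OST.  The mount point and OST UUID below are hypothetical 
placeholders, not values from this thread, and the sketch only prints the 
pipeline for review rather than running it:

```shell
# Hypothetical values: substitute your own mount point and OST UUID.
MOUNT_POINT="/mnt/lustre"
OST_UUID="lustre-OST0004_UUID"

# "lfs find --obd" lists only files with objects on the named OST;
# "lfs_migrate -y" migrates each listed file without prompting.
# Build the pipeline as a string and print it instead of executing it.
migrate_cmd="lfs find --obd ${OST_UUID} ${MOUNT_POINT} | lfs_migrate -y"
echo "${migrate_cmd}"
```

Check the option names against the lfs man page for your Lustre version before 
running anything like this on a live filesystem.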

  2.  All systems are using IP over InfiniBand.

IPoIB is far slower than native IB, both for data and metadata, but the 
middle of a migration is probably not the time to be messing with your network 
configuration.

  3.  None of my client-only systems have lfs or lfs_migrate.  I think this is 
because they are Ubuntu and only the Lustre kernel modules are installed.  Thus 
I can't run it there.

lfs_migrate is just a shell script, so you could have copied it from another node.

  4.  Oh, and that also means that the Lustre filesystem is mounted on the 
OSSes too.

This is not an ideal situation, since the memory usage on the client is 
competing with the memory of the OSS.

  5.  lfs_migrate and lfs did not seem to operate correctly on the OSSes that 
are 1.8.6.  They work OK on 1.8.8 though.

Can't really comment based on this limited information.

  6.  AND the two systems I am running lfs_migrate on are probably the very 
systems with free OST space on them.  In other words, file blocks are being 
written to the very systems that lfs_migrate is being run on and/or there is a 
lot of block write traffic between the two.


Lustre versions:
MDS/MGS: 1.8.6
5 of 7 OSSes: 1.8.6
2 of 7 OSSes: 1.8.8

Clients: 1.8.6, Ubuntu.


_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss