Brian, Andreas,

Thanks for the info.  The benefit of keeping the open file handles and inode 
number is good to understand, but in my immediate case the OSTs have been 
deactivated with max_create_count=0 set for several weeks, so I'm not too 
concerned about any remaining open files. 

If I get anything interesting back from strace, I'll report here.

What about the checksum issue?  It still looks to me like that is only done 
with the rsync method.
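(For reference, the verification in the rsync path boils down to comparing a 
checksum of the source and destination after the copy.  A minimal local sketch 
of that pattern, with cksum standing in for whatever the script actually 
invokes, and the /tmp file names being examples:)

```shell
# Copy a file, then verify with a checksum compare, as the rsync path does.
# cksum reading stdin prints "crc bytes" with no filename, so outputs compare cleanly.
printf 'some file data' > /tmp/mig_src
cp /tmp/mig_src /tmp/mig_dst
src_sum=$(cksum < /tmp/mig_src)
dst_sum=$(cksum < /tmp/mig_dst)
[ "$src_sum" = "$dst_sum" ] && echo "checksum OK" || echo "checksum MISMATCH"
rm -f /tmp/mig_src /tmp/mig_dst
```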

Thanks,
Nathan

________________________________________
From: Dilger, Andreas [andreas.dil...@intel.com]
Sent: Sunday, November 19, 2017 4:01 PM
To: Dauchy, Nathan (ARC-TNC)[CSRA, LLC]
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] lfs_migrate rsync vs. lfs migrate and layout swap

It would be interesting to strace your rsync vs. "lfs migrate" read/write 
patterns so that the copy method of "lfs migrate" can be improved to match 
rsync. Since they are both userspace copy actions they should be about the same 
performance. It may be that "lfs migrate" is using O_DIRECT to minimize client 
cache pollution (I don't have the code handy to check right now).  In the 
future we could use "copyfile()" to avoid this as well.
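A rough way to do that comparison is to trace only read/write and then total 
the transfer sizes from the log.  The strace invocations below are sketches 
(file names, stripe count, and rsync options are examples, not a prescription); 
the awk one-liner is exercised here against a fabricated log line in strace's 
format:

```shell
# Hypothetical tracing runs (shown as comments, not executed here):
#   strace -f -e trace=read,write -o rsync.strace   rsync --inplace bigfile bigfile.new
#   strace -f -e trace=read,write -o migrate.strace lfs migrate -c 4 bigfile
# Totaling write sizes from a log; the echoed line mimics one strace record:
echo 'write(4, "..."..., 1048576) = 1048576' |
  awk -F'[(),= ]+' '/^write/ {bytes += $NF} END {print bytes " bytes written"}'
```

Comparing the per-call sizes between the two logs should show quickly whether 
one tool is issuing much smaller I/Os than the other.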

The main benefit of migrate is that it keeps the open file handles and inode 
number on the MDS. Using rsync is just a copy+rename, which is why it is not 
safe for in-use files.

There is no need to clean up volatile files; they are essentially open-unlinked 
files, so they clean up automatically if the program or client crashes.
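(The open-unlinked behavior is ordinary POSIX semantics, so it can be 
illustrated on any local filesystem; the file name below is arbitrary and 
nothing here is Lustre-specific:)

```shell
# Open a file on fd 3, then unlink it: the inode stays alive while any
# descriptor is open, and the kernel frees the storage on the last close.
f=/tmp/volatile_demo
exec 3<> "$f"               # open read/write (creating it), keep fd 3
rm "$f"                     # unlink: the name disappears immediately
echo "still writable" >&3   # writes still land on the unlinked inode
exec 3>&-                   # last close: storage is reclaimed automatically
[ ! -e "$f" ] && echo "cleaned up"
```

A crash has the same effect as the final close, since the kernel closes all of 
a dead process's descriptors.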

Cheers, Andreas

> On Nov 19, 2017, at 11:31, Dauchy, Nathan (ARC-TNC)[CSRA, LLC] 
> <nathan.dau...@nasa.gov> wrote:
>
> Greetings,
>
> I'm trying to clarify and confirm the differences between lfs_migrate's use 
> of rsync vs. "lfs migrate", with regard to performance, checksumming, 
> and interrupts.  The relevant code changes that introduced the two methods are 
> here:
> https://jira.hpdd.intel.com/browse/LU-2445
> https://review.whamcloud.com/#/c/5620/
>
> The quick testing I have done is with an 8GB file with a stripe count of 4, and 
> included the patch to lfs_migrate from:
> https://review.whamcloud.com/#/c/20621/
> (and client cache was dropped between each test)
>
> $ time ./lfs_migrate -y bigfile
> real    1m13.643s
>
> $ time ./lfs_migrate -y -s bigfile
> real    1m13.194s
>
> $ time ./lfs_migrate -y -f bigfile
> real    0m31.791s
>
> $ time ./lfs_migrate -y -f -s bigfile
> real    0m28.020s
>
> * Performance:  The migrate runs faster when forcing rsync (assuming multiple 
> stripes).  There is also minimal performance benefit to skipping the checksum 
> with the rsync method.  Interestingly, performance with "lfs migrate" as the 
> backend is barely affected (and within the noise when I ran multiple tests) 
> by whether checksumming is enabled or not.  So, my question is whether there is 
> some serialization going on with the layout swap method which causes it to be 
> slower?
>
> * Checksums:  In reading the migrate code in lfs.c, it is not obvious to me 
> that there is any checksumming done at all for "lfs migrate".  That would 
> explain why there is minimal performance difference.  How is data integrity 
> ensured with this method?  Does the file data version somehow capture the 
> checksum too?
>
> * Interrupts:  If the rsync method is interrupted (kill -9, or client reboot) 
> then a ".tmp.XXXXXX" file is left.  This is reasonably easy to search for and 
> clean up.  With the lfs migrate layout swap method, what happens to the 
> "volatile file" and its objects?  Is an lfsck required in order to clean up 
> the objects?
>
> At this point, the "old" method seems preferable.  Are there other benefits 
> to using the lfs migrate layout swap method that I'm missing?
>
> Thanks for any clarifications or other suggestions!
>
> -Nathan
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
