On Sep 11, 2019, at 10:06, Michael Di Domenico wrote:

On Tue, Sep 10, 2019 at 5:48 PM Andreas Dilger wrote:

I don't think "lfs find -xdev" has ever been a priority for Lustre, since it is
rare for Lustre filesystems to be mounted in a nested manner.  Since people
already run multiple "lfs find" tasks in parallel on different clients to get
better performance, it isn't hard to run separate tasks from the top-level
mountpoint of different filesystems.  What is the use case for this?

doesn't xdev keep find from crossing mount points, not necessarily
only in a nested manner but also if there's a link to a directory in a
different filesystem.  i believe 'find' without -xdev will follow and
descend.  but this predicates that my understanding is sound (which it
probably isn't).  i generally add -xdev to my finds as a habit to keep
from scanning nfs volumes.

Yes, -xdev is to avoid crossing mountpoints, but as I wrote it is rare to have
nested Lustre mountpoints, so this wouldn't really be useful in most cases.
The find tree walk does *not* follow symlinks into the target directory; it
only crosses mountpoints.
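
As a rough illustration of the -xdev behaviour described above, here is a
minimal Python sketch, assuming a POSIX system where a mountpoint shows up as a
change in st_dev.  The function name walk_one_filesystem is hypothetical, and
this is not how find or "lfs find" is implemented; it only mimics the idea of
not descending into another filesystem and not following directory symlinks.

```python
import os


def walk_one_filesystem(top):
    """Yield file paths under `top` without descending into other devices."""
    top_dev = os.lstat(top).st_dev
    # followlinks=False: symlinks to directories are never descended into.
    for dirpath, dirnames, filenames in os.walk(top, followlinks=False):
        # Prune subdirectories that live on a different device (a mountpoint),
        # which is the effect -xdev has on the tree walk.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == top_dev
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Without the pruning step the walk would descend into any filesystem mounted
below the starting directory, which is the case -xdev is meant to avoid.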


along the same vein, can anyone state whether there's any actual
performance gain walking the filesystem using find vs lfs find?

The relative performance of "find" vs. "lfs find" depends heavily on the search
parameters.  If the search uses only the filename, they will perform the same.
If it includes some MDT-specific attributes (e.g. uid, gid) then "lfs find" can
be significantly faster (e.g. 3-5x).  If it uses file size, they will be about
the same, unless there are other MDT-only parameters or until LSOM support
lands (hopefully in 2.13).

okay, that's what i thought or recalled correctly from hearing
somewhere else.  in my particular instance i was just using 'find
-type f' and didn't see any appreciable difference in scanning speed
between the two

For Lustre, ext4, and most other filesystems, the file type is also stored in
the directory entry, so that "find" can determine the type without a "stat".
That is safe since the file type cannot be changed after the file is created.

In a scan like "(lfs) find -name '*foo*' -type f" it only needs to read the
directory entries (including the file type) and process each entry.  There is
nothing that "lfs find" can optimize.  With mode, uid, gid, and *some*
timestamp queries, "lfs find" can fetch only the MDS attributes and skip any
OST RPCs for that file if the (non)match can be decided without the OST
attributes.
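
To make the directory-entry point concrete, here is a minimal Python sketch of
a scan roughly equivalent to "find -name '*foo*' -type f".  It assumes a
platform where os.scandir can take the file type from the directory entry
(d_type on Linux); when the filesystem does not supply it, Python falls back to
a stat call.  The function name find_files_by_name is hypothetical, and this is
only an illustration, not how find or "lfs find" is written.

```python
import fnmatch
import os


def find_files_by_name(top, pattern):
    """Yield regular files under `top` whose name matches `pattern`."""
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # is_dir()/is_file() use the type cached in the directory
                # entry when available, so no per-file stat is needed.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif (entry.is_file(follow_symlinks=False)
                      and fnmatch.fnmatchcase(entry.name, pattern)):
                    yield entry.path
```

A predicate on uid, gid, or size would need entry.stat() for every file, which
is the point at which fetching attributes starts to cost extra.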

Once the Lazy Size-on-MDT (LSOM) integration is finished
(https://review.whamcloud.com/35167) it will be possible for "lfs find --lazy"
to use *only* attributes from the MDS (size, timestamps) to speed up scanning
and avoid OST RPC overhead.

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud






_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org