If all you need is filenames and no other attributes, readdir is usually faster 
than getattrlistbulk because it doesn't have to do as much work. However, if 
you need additional attributes, getattrlistbulk is usually much faster. Some of 
that extra work done by getattrlistbulk involves checking to see what 
attributes were requested and packing the results into the result buffer. 
You'll find that lstat is slightly faster than getattrlist (when getattrlist is 
returning the same set of attributes) for the same reason. There's no extra 
code needed in lstat to see what attributes were requested and packing the 
results into the result buffer.

The original implementation of CFURLEnumerator (which is the implementation 
under NSFileManager's directory enumeration) was readdir followed by 
getattrlist requests to get the additional attributes on each item. Before we 
even shipped SnowLeopard, the implementation was changed to use 
getdirentriesattr if the file system supported it (getattrlistbulk was not 
available until several releases later) because of performance improvements.

By the way, I haven't tested this but I would expect 
enumeratorAtURL:includingPropertiesForKeys:options:errorHandler: (followed by a 
"for (NSURL *fileURL in directoryEnumerator)" loop) to be slightly faster than 
contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: because the 
URLs aren't retained in a NSArray. Using CFURLEnumerator may also be slightly 
faster than NSFileManager's directory enumeration. Using POSIX/BSD APIs will be 
the fastest, but that means you have to deal with the different capabilities 
between file systems yourself (although getattrlistbulk helps with that a lot).

- Jim

> On Apr 21, 2019, at 7:35 PM, Thomas Tempelmann <tempelm...@gmail.com> wrote:
> I like to add some info on a thread from 2015:
> I recently worked on my file search tool (FAF) and wanted to make sure that I 
> use the best method to deep-scan directory contents.
> I had expected that getattrlistbulk() would always be the best choice, but it 
> turns out that opendir/readdir perform much better in some cases, oddly (this 
> is about reading just the file names, no other attributes).
> See my blog post: https://blog.tempel.org/2019/04/dir-read-performance.html 
> <https://blog.tempel.org/2019/04/dir-read-performance.html>
> There's also a test project trying out the various methods.
> Any comments, insights, clarifications and bug reports are most welcome.
> Enjoy,
>  Thomas Tempelmann
>> On 12. Jan 2015, at 17:33, Jim Luther <luthe...@apple.com 
>> <mailto:luthe...@apple.com>> wrote:
>> getattrlistbulk() works on all file systems. If the file system supports 
>> bulk enumeration natively, great! If it does not, then the kernel code takes 
>> care of it. In addition, getattrlistbulk() supports all non-volume 
>> attributes (getattrlistbulk only supported a large subset).
>> The API calling convention for getattrlistbulk() is slightly different than 
>> getattrlistbulk() — read the man page carefully. In particular:
>> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
>> ATTR_CMN_NAME allowed us to get rid of the newState argument).
>> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error 
>> conditions for a specific directory entry.
>> • The method for determining when enumeration is complete is different. You 
>> just keep calling getattrlistbulk() until 0 entries are returned.
>> - Jim
>>> On Jan 11, 2015, at 9:31 PM, James Bucanek <subscri...@gloaming.com 
>>> <mailto:subscri...@gloaming.com>> wrote:
>>> Eric,
>>> I would just like to clarify: the new getattrlistbulk() function works on 
>>> all filesystem. We don't have to check the volume's VOL_CAP_INT_READDIRATTR 
>>> capability before calling it, correct?
>>> James Bucanek
>>>>    Eric Tamura     December 10, 2014 at 5:57 PM
>>>> It should be much faster.
>>>> Also note that as of Yosemite, we have added a new API: 
>>>> getattrlistbulk(2), which is like getdirentriesattr(), but supported in 
>>>> VFS for all filesystems. getdirentriesattr() is now deprecated. 
>>>> The main advantage of the bulk call is that we can return results in most 
>>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+ 
>>>> on-disk layout is such that all of the directory entries in a given 
>>>> directory are clustered together and we can get multiple directory entries 
>>>> from the same cached on-disk blocks.
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Filesystem-dev mailing list      (Filesystem-dev@lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/filesystem-dev/luther.j%40apple.com
> This email sent to luthe...@apple.com

Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list      (Filesystem-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:

This email sent to arch...@mail-archive.com

Reply via email to