subject:"readdir vs. getdirentriesattr"

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Thomas Tempelmann

> The volume ID is at a higher layer, but the enumeration code attempts to
> retrieve the value less than once per URL returned. That said, if the
> directory hierarchy has few items per directory, the number of times it is
> retrieved will be higher. You can write a bug report and I'll look to see
> if there are ways to improve the performance.
>

As I just wrote, going with your proposed enumeratorAtURL: method takes
care of that already. I may still write a report, and will let you know if
I do.

Though I still haven't gotten to see if I can speed up recursive search by
using multiple threads for each directory read. If that helps, then I
cannot use enumeratorAtURL for that but would have to revert to classic
recursion, at which point the volumeID checking comes into play again
(but, with only checking it whenever I enter a dir, it'll be less of an
impact).
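
For illustration, here is a minimal C sketch of that "classic recursion" with the volume check done only when a directory is entered. It is not taken from the test project; the helper names scan_dir and scan_volume are made up for this example, and it compares st_dev (from lstat) rather than NSURLVolumeIdentifierKey, on the assumption that the two serve the same purpose here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: recursively counts the entries below `path`.
 * The device check happens once per directory entered, not once per entry. */
static long scan_dir(const char *path, dev_t root_dev)
{
    struct stat st;
    if (lstat(path, &st) != 0 || !S_ISDIR(st.st_mode) || st.st_dev != root_dev)
        return 0;   /* unreadable, not a directory, or on another volume */

    DIR *dir = opendir(path);
    if (dir == NULL)
        return 0;

    long count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
        /* DT_UNKNOWN: the file system gave no type, let scan_dir() find out */
        if (entry->d_type == DT_DIR || entry->d_type == DT_UNKNOWN) {
            char child[PATH_MAX];
            int n = snprintf(child, sizeof(child), "%s/%s", path, entry->d_name);
            if (n > 0 && (size_t)n < sizeof(child))
                count += scan_dir(child, root_dev);
        }
    }
    closedir(dir);
    return count;
}

/* Hypothetical entry point: returns the entry count, or -1 if `root`
 * cannot be examined. Mount points are counted but not descended into. */
long scan_volume(const char *root)
{
    assert(root != NULL);
    struct stat st;
    if (lstat(root, &st) != 0)
        return -1;
    return scan_dir(root, st.st_dev);
}
```

Each level of recursion keeps one directory open, so a very deep tree could run into the per-process file descriptor limit; a real scanner would probably keep its own work list instead.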


> In the meantime, there's something you could do to improve the performance
> (even if our code changes). You can get the volumeIdentifier for the
> directory you start enumerating from. It will be the same for the entire
> enumeration except when directories are seen on other file systems (today,
> that's volume mount points and mount triggers). Like this:
>

I already do that in my actual working code. I was just showing this more
inefficient way of ALWAYS getting the value in order to demonstrate its
performance impact.

> It used to be based on heavily modified fts(3). I rewrote it for Mojave to
> improve the memory footprint. It uses getattrlistbulk() for everything
> except when it sees a mount point, and then it calls getattrlist on the
> mount point path to get the attributes from the other file system's root
> directory.
>

Glad to see you're still on top of it.

Thomas
 ___
Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list  (Filesystem-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/filesystem-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Thomas Tempelmann

Quick update:


> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] also
> supports recursive enumeration (which stops at device boundaries -- you'll
> see mount points but not their contents) so you don't have to do that
> yourself.
>
This is indeed faster than most of the other options, but, if only looking
for file names, not as fast as fts_read. When also looking at file sizes,
it's the fastest, though. Here are run times for "best case" in an APFS
volume ("/System" folder). These times come out quite similarly on repeated
runs.

Target: /System, format: apfs

--- contentsOfDirectoryAtURL ---
3.35s, scanned: 336991, found: 520, size: 0
4.31s, scanned: 336991, found: 520, size: 9184548546

--- getattrlistbulk() ---
3.45s, scanned: 336991, found: 520, size: 0
3.50s, scanned: 336991, found: 520, size: 9184548546

--- readdir() ---
3.05s, scanned: 336991, found: 520, size: 0
8.04s, scanned: 336991, found: 520, size: 9184548546

--- fts ---
2.32s, scanned: 336991, found: 520, size: 0
2.40s, scanned: 336991, found: 520, size: 9184548546

--- enumeratorAtURL ---
1.97s, scanned: 336991, found: 520, size: 0
2.52s, scanned: 336991, found: 520, size: 9184548546

The first of each test type looks for names only (and it extracts them from
the URL, not by getting it as a resource value like the code in
https://developer.apple.com/documentation/foundation/nsfilemanager/1409571-enumeratoraturl
suggests).
The second test also fetches the file size.

Note that on network volumes, readdir may be faster than the others,
though. That also depends on the server (Linux-based NAS vs. macOS).
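
To make the "names only" versus "names plus size" distinction concrete for the readdir() case, here is a small C sketch. It is not the code from the test project; list_dir is a made-up helper that handles a single directory without recursion. The point it tries to show is that readdir() alone delivers the names, while the size needs one extra system call per entry (fstatat() here), which is presumably where the extra time in the readdir() numbers goes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: lists one directory with readdir().
 * If total_size is NULL, only the names are read. Otherwise every entry
 * costs an additional fstatat() call to obtain its size.
 * Returns the number of entries (without "." and ".."), or -1 on error. */
long list_dir(const char *path, long long *total_size)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    if (total_size != NULL)
        *total_size = 0;

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
        if (total_size != NULL) {
            struct stat st;
            /* one extra system call per entry, relative to the open directory */
            if (fstatat(dirfd(dir), entry->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0
                && S_ISREG(st.st_mode))
                *total_size += st.st_size;
        }
    }
    closedir(dir);
    return count;
}
```

Calling list_dir(path, NULL) corresponds to the first run of each test type, and list_dir(path, &size) to the second.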

Thomas

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Jim Luther



> On Apr 29, 2019, at 1:19 PM, Thomas Tempelmann  wrote:
> 
> Jim,
> 
> In contentsOfDirectoryAtURL, instead of "includingPropertiesForKeys:nil", use 
> "includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]" (and add whatever 
> other property keys you know you'll need). The whole purpose of the 
> includingPropertiesForKeys argument is so the enumerator code can pre-fetch 
> the properties you need as efficiently as possible. The enumeration will be a 
> bit slower, but the entire operation of enumerating and getting the 
> properties from the URLs returned will be faster.
> 
> I know. That's the theory, but my benchmarking says it makes no difference in 
> that case. And that's quite logical because the pre-caching is meant for data 
> that has to come from the lowest level, i.e. where the catalog data is 
> fetched - it makes sense to combine multiple property requests into one, just 
> as getdirentriesattr is meant to be used. However, as I explained, the 
> volume ID is not stored in the catalog but at a higher level, and therefore 
> pre-fetching this at the lowest level makes no difference, as it requires no 
> catalog access, right?

The volume ID is at a higher layer, but the enumeration code attempts to 
retrieve the value less than once per URL returned. That said, if the directory 
hierarchy has few items per directory, the number of times it is retrieved will 
be higher. You can write a bug report and I'll look to see if there are ways to 
improve the performance.

In the meantime, there's something you could do to improve the performance 
(even if our code changes). You can get the volumeIdentifier for the directory 
you start enumerating from. It will be the same for the entire enumeration 
except when directories are seen on other file systems (today, that's volume 
mount points and mount triggers). Like this:

NSURL *directoryURL = [NSURL fileURLWithPath:@"/System/Applications/Utilities/"
                                 isDirectory:YES];
// get the volume identifier for most of the enumeration
id mainVolumeIdentifier;
[directoryURL getResourceValue:&mainVolumeIdentifier
                        forKey:NSURLVolumeIdentifierKey error:nil];
NSDirectoryEnumerator *directoryEnumerator =
    [NSFileManager.defaultManager enumeratorAtURL:directoryURL
                       includingPropertiesForKeys:nil options:0 errorHandler:nil];
for (NSURL *url in directoryEnumerator) {
    NSNumber *isVolume;
    NSNumber *isMountTrigger;
    if ( ([url getResourceValue:&isVolume forKey:NSURLIsVolumeKey error:nil]
            && isVolume.boolValue)
      || ([url getResourceValue:&isMountTrigger forKey:NSURLIsMountTriggerKey error:nil]
            && isMountTrigger.boolValue) ) {
        // get the volume identifier for the volume or mount trigger
        // (ask the enumerated url; directoryURL is still on the main volume)
        id otherVolumeIdentifier;
        [url getResourceValue:&otherVolumeIdentifier
                       forKey:NSURLVolumeIdentifierKey error:nil];
    }
}

> My performance tests always run twice in quick succession, so that in the 
> second run, due to caching, all data is ready and does not incur random delays 
> that would give imprecise measurements. Sure, this does not give me the worst 
> case, but it gives me the best case results at least. And these best case 
> results say: Scanning "/System" on my Mac without getting the Volume ID takes 
> less than 3s, but getting it (with or without pre-fetching) takes over 
> 6s. That's TWICE as much time. With smaller dir trees the difference is less, 
> possibly because then there are other caches helping.
> 
> I assume that when I re-run the scan, after having released all NSURLs from 
> the previous scan (even by restarting the test app), the framework creates 
> fresh NSURL objects, right? It's not that there is only one NSURL instance 
> on the entire system per volume item, shared between all processes, or is 
> there? The only caching, once I release an NSURL, is at the volume block 
> cache level, isn't it?
> 
> Also, use -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] 
> instead of 
> -[contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:] unless 
> you really need an NSArray of NSURLs. If your code is just processing all of 
> the URLs and has no need to keep them after processing, there's no reason to 
> add them to an array (which takes time and adds to peak memory pressure).
> 
> Thanks, that makes sense.
> 
> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] also 
> supports recursive enumeration (which stops at device boundaries -- you'll 
> see mount points but not their contents) so you don't have to do that 
> yourself.
> 
> Is that based on fts_read? Because I found that this is much faster on local 
> volumes (not on network vols, though) than all other ways I've tried. And it 
> brings along the st_dev value without time penalty, unlike 
> contentsOfDirectoryAtURL.

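It used to be based on heavily modified fts(3). I rewrote it for Mojave to
improve the memory footprint. It uses getattrlistbulk() for everything except
when it sees a mount point, and then it calls getattrlist on the mount point
path to get the attributes from the other file system's root directory.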

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Thomas Tempelmann

Jim,

> In contentsOfDirectoryAtURL, instead of "includingPropertiesForKeys:nil",
> use "includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]" (and add
> whatever other property keys you know you'll need). The whole purpose of
> the includingPropertiesForKeys argument is so the enumerator code can
> pre-fetch the properties you need as efficiently as possible. The
> enumeration will be a bit slower, but the entire operation of enumerating
> and getting the properties from the URLs returned will be faster.
>

I know. That's the theory, but my benchmarking says it makes no difference
in that case. And that's quite logical because the pre-caching is meant for
data that has to come from the lowest level, i.e. where the catalog data is
fetched - it makes sense to combine multiple property requests into one,
just as getdirentriesattr is meant to be used. However, as I explained, the
volume ID is not stored in the catalog but at a higher level, and therefore
pre-fetching this at the lowest level makes no difference, as it requires no
catalog access, right?

My performance tests always run twice in quick succession, so that in the
second run, due to caching, all data is ready and does not incur random
delays that would give imprecise measurements. Sure, this does not give me
the worst case, but it gives me the best case results at least. And these
best case results say: Scanning "/System" on my Mac without getting the
Volume ID takes less than 3s, but getting it (with or without pre-fetching)
takes over 6s. That's TWICE as much time. With smaller dir trees
the difference is less, possibly because then there are other caches helping.
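
The "run it twice and look at the second run" approach could look roughly like the following C sketch. This is an assumption about the shape of such a harness, not the actual test app: count_entries stands in for whatever scan is being measured, and time_scan_twice and now_seconds are made-up names.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical workload: counts the entries of one directory, -1 on error. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;
    long count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0)
            count++;
    }
    closedir(dir);
    return count;
}

/* Monotonic clock in seconds, unaffected by wall-clock adjustments. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Runs scan(path) twice in a row. seconds[0] receives the duration of the
 * first run, seconds[1] the duration of the second run, which is the one
 * that benefits from whatever the first run left in the caches.
 * Returns the result of the second run. */
long time_scan_twice(long (*scan)(const char *), const char *path,
                     double seconds[2])
{
    assert(scan != NULL && path != NULL && seconds != NULL);
    long result = 0;
    for (int run = 0; run < 2; run++) {
        double start = now_seconds();
        result = scan(path);
        seconds[run] = now_seconds() - start;
    }
    return result;
}
```

Only seconds[1] would be reported as the "best case" figure; seconds[0] depends on what happened to be cached before the test started.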

I assume that when I re-run the scan, after having released all NSURLs from
the previous scan (even by restarting the test app), the framework creates
fresh NSURL objects, right? It's not that there is only one
NSURL instance on the entire system per volume item, shared between all
processes, or is there? The only caching, once I release an NSURL, is at
the volume block cache level, isn't it?

> Also, use
> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] instead
> of -[contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:]
> unless you really need an NSArray of NSURLs. If your code is just
> processing all of the URLs and has no need to keep them after processing,
> there's no reason to add them to an array (which takes time and adds to
> peak memory pressure).
>

Thanks, that makes sense.

> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] also
> supports recursive enumeration (which stops at device boundaries -- you'll
> see mount points but not their contents) so you don't have to do that
> yourself.
>

Is that based on fts_read? Because I found that this is much faster on
local volumes (not on network vols, though) than all other ways I've tried.
And it brings along the st_dev value without time penalty, unlike
contentsOfDirectoryAtURL.

Regardless, I'll give that a try.

-- 
Thomas Tempelmann, http://apps.tempel.org/
Follow me on Twitter: https://twitter.com/tempelorg
Read my programming blog: http://blog.tempel.org/

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Jim Luther

In contentsOfDirectoryAtURL, instead of "includingPropertiesForKeys:nil", use 
"includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]" (and add whatever 
other property keys you know you'll need). The whole purpose of the 
includingPropertiesForKeys argument is so the enumerator code can pre-fetch the 
properties you need as efficiently as possible. The enumeration will be a bit 
slower, but the entire operation of enumerating and getting the properties from 
the URLs returned will be faster.

Also, use -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] 
instead of 
-[contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:] unless 
you really need an NSArray of NSURLs. If your code is just processing all of 
the URLs and has no need to keep them after processing, there's no reason to 
add them to an array (which takes time and adds to peak memory pressure).

-[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] also 
supports recursive enumeration (which stops at device boundaries -- you'll see 
mount points but not their contents) so you don't have to do that yourself.

- Jim

> On Apr 29, 2019, at 8:01 AM, Thomas Tempelmann  wrote:
> 
> Doing more performance tests for directory traversal I ran into a performance 
> issue with [NSFileManager contentsOfDirectoryAtURL:]:
> 
> See this typical code for scanning a directory:
> 
>   NSArray *contentURLs = [fileMgr contentsOfDirectoryAtURL:parentURL 
>       includingPropertiesForKeys:nil options:0 error:nil];
>   for (NSURL *url in contentURLs) {
>     id value;
>     [url getResourceValue:&value forKey:NSURLVolumeIdentifierKey error:nil];
>   }
> 
> I would have expected the call for fetching NSURLVolumeIdentifierKey to be 
> rather fast because the upper file system layer should know which volume this 
> belongs to because it has to know which FS driver it has to pass the calls to. 
> I.e., asking for the volume ID should be much faster than fetching actual 
> directory data such as the file size, for instance.
> 
> However, it turns out that this is just as slow as getting actual data from 
> the lower levels.
> 
> Could it be that the call is not optimized for returning this information as 
> early as possible but that it passes the call down to the lowest level 
> regardless of need?
> 
> I mention this because it degrades the performance of a recursive directory 
> scan significantly in my tests (on both APFS and HFS) - by more than 30%! The 
> only thing even slower would be to call stat() instead (for getting the 
> st_dev value).
> 
> Is this worth having looked at? If so, should I report this via bugreporter 
> (though, when I'm then asked to provide a system profiler report then, it's 
> not going anywhere)?
> 
> Thomas
> 

Re: readdir vs. getdirentriesattr

2019-04-29 Thread Thomas Tempelmann

Doing more performance tests for directory traversal I ran into a
performance issue with [NSFileManager contentsOfDirectoryAtURL:]:

See this typical code for scanning a directory:

  NSArray *contentURLs = [fileMgr contentsOfDirectoryAtURL:parentURL
      includingPropertiesForKeys:nil options:0 error:nil];
  for (NSURL *url in contentURLs) {
    // fetched separately for every URL; this is the call being measured
    id value;
    [url getResourceValue:&value forKey:NSURLVolumeIdentifierKey error:nil];
  }


I would have expected the call for fetching NSURLVolumeIdentifierKey to be
rather fast because the upper file system layer should know which volume
this belongs to because it has to know which FS driver it has to pass the
calls to. I.e., asking for the volume ID should be much faster than
fetching actual directory data such as the file size, for instance.

However, it turns out that this is just as slow as getting actual data from
the lower levels.

Could it be that the call is not optimized for returning this information
as early as possible but that it passes the call down to the lowest level
regardless of need?

I mention this because it degrades the performance of a recursive directory
scan significantly in my tests (on both APFS and HFS) - by more than 30%!
The only thing even slower would be to call stat() instead (for getting the
st_dev value).

Is this worth having looked at? If so, should I report this via bugreporter
(though, when I'm then asked to provide a system profiler report then, it's
not going anywhere)?

Thomas

Re: readdir vs. getdirentriesattr

2019-04-22 Thread Jim Luther

I don’t really have time to look at the current fts implementation, but… it has 
several options that affect performance (in particular, the FTS_NOCHDIR, 
FTS_NOSTAT, FTS_NOSTAT_TYPE, and FTS_XDEV options). If you are trying to 
compare fts to CFURLEnumerator (for example), use FTS_NOCHDIR and FTS_XDEV, but 
don’t use FTS_NOSTAT and FTS_NOSTAT_TYPE.
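
As a rough illustration of that option set, here is a C sketch of an fts(3) traversal opened with FTS_NOCHDIR and FTS_XDEV and without FTS_NOSTAT, so that the stat data (size, st_dev) arrives with every entry. The helper name fts_scan is made up for this example, and FTS_PHYSICAL (do not follow symbolic links) is an added assumption, since fts_open() requires either FTS_PHYSICAL or FTS_LOGICAL.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>

/* Hypothetical helper: walks the tree below `root` with fts(3), without
 * chdir() and without descending into other devices. Counts every entry
 * below root and sums the sizes of regular files.
 * Returns the count, or -1 if the traversal cannot be started.
 * Per-entry errors (FTS_NS, FTS_DNR, FTS_ERR) are not handled here. */
long fts_scan(const char *root, long long *total_size)
{
    assert(root != NULL && total_size != NULL);
    char *const paths[] = { (char *)root, NULL };
    FTS *tree = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR | FTS_XDEV, NULL);
    if (tree == NULL)
        return -1;

    long count = 0;
    *total_size = 0;
    FTSENT *ent;
    while ((ent = fts_read(tree)) != NULL) {
        /* skip the root itself and the second (post-order) directory visit */
        if (ent->fts_level == 0 || ent->fts_info == FTS_DP)
            continue;
        count++;
        if (ent->fts_info == FTS_F)
            *total_size += ent->fts_statp->st_size;  /* st_dev is here too */
    }
    fts_close(tree);
    return count;
}
```

Adding FTS_NOSTAT to the options would skip the stat data, which is why it would not be a fair comparison against an enumerator that returns attributes.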

> On Apr 22, 2019, at 9:59 AM, Thomas Tempelmann  wrote:
> 
> Jim,
> thanks for your comments.
> 
> If all you need is filenames and no other attributes, readdir is usually 
> faster than getattrlistbulk because it doesn't have to do as much work. 
> However, if you need additional attributes, getattrlistbulk is usually much 
> faster. Some of that extra work done by getattrlistbulk involves checking to 
> see what attributes were requested and packing the results into the result 
> buffer. 
> 
> What's interesting is that on HFS+, readdir is not faster in my tests, but on 
> a recent and fast Mac (i.e. not on my MacPro 2010), it can be twice as fast 
> as the others when scanning an APFS volume. I wonder why. Is the 
> implementation for getattrlistbulk in the APFS driver inefficient compared to 
> the one in HFS+? The source code for the APFS FS driver has still not been 
> published, or has it?
> 
> You'll find that lstat is slightly faster than getattrlist (when getattrlist 
> is returning the same set of attributes) for the same reason. There's no 
> extra code needed in lstat to see what attributes were requested and packing 
> the results into the result buffer.
> 
> It's also significantly faster than using NSURL's getResourceValue, even if 
> the NSURL has already been created regardless. That's probably due to all the 
> objc overhead.
> 
> By the way, I haven't tested this but I would expect 
> enumeratorAtURL:includingPropertiesForKeys:options:errorHandler: (followed by 
> a "for (NSURL *fileURL in directoryEnumerator)" loop) to be slightly faster 
> than contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: 
> because the URLs aren't retained in a NSArray. Using CFURLEnumerator may also 
> be slightly faster than NSFileManager's directory enumeration.
> 
> Now, that's something I had not considered, yet. Will try.
>  
> Using POSIX/BSD APIs will be the fastest, but that means you have to deal 
> with the different capabilities between file systems yourself (although 
> getattrlistbulk helps with that a lot).
> 
> Most interesting, though:
> 
> Today someone pointed out fts_read. This does, so far, always beat all other 
> methods, especially if I also need extra attributes (e.g. file size).
> 
> Can you give some more information about the fts implementation? Is this 
> user-library-level or kernel code that's doing this? I had expected that 
> this would only be a convenience userland function that uses readdir or 
> similar BSD functions, but it appears to beat them all, suggesting this is 
> optimized at a lower level.
> 
> 
> I have updated my test project accordingly (with the fts code) in case anyone 
> likes to run their own tests:
> 
>   http://files.tempel.org/Various/DirScanner.zip 
> 
> 
> Also, I am wondering if using concurrent threads will speed up scanning a dir 
> tree on an SSD as well, by distributing each directory read to one thread (or 
> dispatch queue). Will eventually try, but probably not soon. Gotta get my 
> program out of the door soon, first.
> 
> Thomas
> 


Re: readdir vs. getdirentriesattr

2019-04-22 Thread Wim Lewis

On Apr 22, 2019, at 9:59 AM, Thomas Tempelmann  wrote:
> Can you give some more information about the fts implementation? Is this 
> user-library-level or kernel code that's doing this? I had expected that 
> this would only be a convenience userland function that uses readdir or 
> similar BSD functions, but it appears to beat them all, suggesting this is 
> optimized at a lower level.

That is surprising to me also. You can find the fts implementation here — at a 
first glance, it seems to be using both getattrlistbulk() and fstatat(), but 
nothing more exotic than that:

   https://opensource.apple.com/source/Libc/Libc-1272.200.26/gen/fts.c.auto.html


Re: readdir vs. getdirentriesattr

2019-04-22 Thread Thomas Tempelmann

Jim,
thanks for your comments.

> If all you need is filenames and no other attributes, readdir is usually
> faster than getattrlistbulk because it doesn't have to do as much
> work. However, if you need additional attributes, getattrlistbulk is
> usually much faster. Some of that extra work done
> by getattrlistbulk involves checking to see what attributes were requested
> and packing the results into the result buffer.
>

What's interesting is that on HFS+, readdir is not faster in my tests, but
on a recent and fast Mac (i.e. not on my MacPro 2010), it can be twice as
fast as the others when scanning an APFS volume. I wonder why. Is the
implementation for getattrlistbulk in the APFS driver inefficient compared
to the one in HFS+? The source code for the APFS FS driver has still not been
published, or has it?

> You'll find that lstat is slightly faster than getattrlist (when
> getattrlist is returning the same set of attributes) for the same reason.
> There's no extra code needed in lstat to see what attributes were requested
> and packing the results into the result buffer.
>

It's also significantly faster than using NSURL's getResourceValue, even if
the NSURL has already been created regardless. That's probably due to all
the objc overhead.

> By the way, I haven't tested this but I would expect
> enumeratorAtURL:includingPropertiesForKeys:options:errorHandler: (followed
> by a "for (NSURL *fileURL in directoryEnumerator)" loop) to be slightly
> faster than
> contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: because
> the URLs aren't retained in a NSArray. Using CFURLEnumerator may also be
> slightly faster than NSFileManager's directory enumeration.
>

Now, that's something I had not considered, yet. Will try.


> Using POSIX/BSD APIs will be the fastest, but that means you have to deal
> with the different capabilities between file systems yourself (although
> getattrlistbulk helps with that a lot).
>

*Most interesting, though:*

Today someone pointed out *fts_read*. This does, so far, always beat all
other methods, especially if I also need extra attributes (e.g. file size).

Can you give some more information about the fts implementation? Is this
user-library-level or kernel code that's doing this? I had expected that
this would only be a convenience userland function that uses readdir or
similar BSD functions, but it appears to beat them all, suggesting this is
optimized at a lower level.


I have updated my test project accordingly (with the fts code) in case
anyone likes to run their own tests:

  http://files.tempel.org/Various/DirScanner.zip

Also, I am wondering if using concurrent threads will speed up scanning a
dir tree on an SSD as well, by distributing each directory read to one
thread (or dispatch queue). Will eventually try, but probably not soon.
Gotta get my program out of the door soon, first.

Thomas

Re: readdir vs. getdirentriesattr

2019-04-22 Thread Jim Luther

If all you need is filenames and no other attributes, readdir is usually faster 
than getattrlistbulk because it doesn't have to do as much work. However, if 
you need additional attributes, getattrlistbulk is usually much faster. Some of 
that extra work done by getattrlistbulk involves checking to see what 
attributes were requested and packing the results into the result buffer. 
You'll find that lstat is slightly faster than getattrlist (when getattrlist is 
returning the same set of attributes) for the same reason. There's no extra 
code needed in lstat to see what attributes were requested and packing the 
results into the result buffer.
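
For readers who have not used it, here is a hedged C sketch of a getattrlistbulk() loop that only counts the entries of one directory. It follows the calling convention quoted further down in this thread (ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS requested, ATTR_CMN_ERROR optional, loop until 0 entries come back). The helper name bulk_count is made up, the 64 KB buffer size is an arbitrary choice, and the body is only compiled on Apple platforms; elsewhere the function just reports ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#if defined(__APPLE__)
#include <sys/attr.h>

/* Hypothetical helper: counts the entries of one directory with
 * getattrlistbulk(2). Returns the count, or -1 on error. */
long bulk_count(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY, 0);
    if (fd < 0)
        return -1;

    struct attrlist request;
    memset(&request, 0, sizeof(request));
    request.bitmapcount = ATTR_BIT_MAP_COUNT;
    /* the first two are required; ATTR_CMN_ERROR reports per-entry errors */
    request.commonattr = ATTR_CMN_RETURNED_ATTRS | ATTR_CMN_NAME | ATTR_CMN_ERROR;

    char buffer[64 * 1024];
    long count = 0;
    for (;;) {
        int returned = getattrlistbulk(fd, &request, buffer, sizeof(buffer), 0);
        if (returned < 0) { count = -1; break; }
        if (returned == 0) break;            /* enumeration is complete */

        char *entry = buffer;
        for (int i = 0; i < returned; i++) {
            /* every record starts with its own total length; the returned
             * attribute set and the requested attributes follow (see the
             * man page for the exact layout) */
            uint32_t length;
            memcpy(&length, entry, sizeof(length));
            count++;
            entry += length;
        }
    }
    close(fd);
    return count;
}
#else
/* Not available outside Apple platforms; this stub only keeps the sketch
 * compilable elsewhere. */
long bulk_count(const char *path)
{
    (void)path;
    errno = ENOSYS;
    return -1;
}
#endif
```

The per-call work mentioned above (checking which attributes were requested and packing them into the buffer) happens inside getattrlistbulk(); the caller pays for it once per batch of entries rather than once per entry.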

The original implementation of CFURLEnumerator (which is the implementation 
under NSFileManager's directory enumeration) was readdir followed by 
getattrlist requests to get the additional attributes on each item. Before we 
even shipped SnowLeopard, the implementation was changed to use 
getdirentriesattr if the file system supported it (getattrlistbulk was not 
available until several releases later) because of performance improvements.

By the way, I haven't tested this but I would expect 
enumeratorAtURL:includingPropertiesForKeys:options:errorHandler: (followed by a 
"for (NSURL *fileURL in directoryEnumerator)" loop) to be slightly faster than 
contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: because the 
URLs aren't retained in a NSArray. Using CFURLEnumerator may also be slightly 
faster than NSFileManager's directory enumeration. Using POSIX/BSD APIs will be 
the fastest, but that means you have to deal with the different capabilities 
between file systems yourself (although getattrlistbulk helps with that a lot).

- Jim

> On Apr 21, 2019, at 7:35 PM, Thomas Tempelmann  wrote:
> 
> I'd like to add some info on a thread from 2015:
> 
> I recently worked on my file search tool (FAF) and wanted to make sure that I 
> use the best method to deep-scan directory contents.
> 
> I had expected that getattrlistbulk() would always be the best choice, but it 
> turns out that opendir/readdir perform much better in some cases, oddly (this 
> is about reading just the file names, no other attributes).
> 
> See my blog post: https://blog.tempel.org/2019/04/dir-read-performance.html 
> 
> 
> There's also a test project trying out the various methods.
> 
> Any comments, insights, clarifications and bug reports are most welcome.
> 
> Enjoy,
>  Thomas Tempelmann
> 
> 
>> On 12. Jan 2015, at 17:33, Jim Luther wrote:
>> 
>> getattrlistbulk() works on all file systems. If the file system supports 
>> bulk enumeration natively, great! If it does not, then the kernel code takes 
>> care of it. In addition, getattrlistbulk() supports all non-volume 
>> attributes (getdirentriesattr() only supported a large subset).
>> 
>> The API calling convention for getattrlistbulk() is slightly different than 
>> getdirentriesattr() -- read the man page carefully. In particular:
>> 
>> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
>> ATTR_CMN_NAME allowed us to get rid of the newState argument).
>> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error 
>> conditions for a specific directory entry.
>> • The method for determining when enumeration is complete is different. You 
>> just keep calling getattrlistbulk() until 0 entries are returned.
>> 
>> - Jim
>> 
>>> On Jan 11, 2015, at 9:31 PM, James Bucanek wrote:
>>> 
>>> Eric,
>>> 
>>> I would just like to clarify: the new getattrlistbulk() function works on 
>>> all filesystems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR 
>>> capability before calling it, correct?
>>> 
>>> James Bucanek
>>> 
>>>> Eric Tamura December 10, 2014 at 5:57 PM
>>>> It should be much faster.
>>>> 
>>>> Also note that as of Yosemite, we have added a new API: 
>>>> getattrlistbulk(2), which is like getdirentriesattr(), but supported in 
>>>> VFS for all filesystems. getdirentriesattr() is now deprecated. 
>>>> 
>>>> The main advantage of the bulk call is that we can return results in most 
>>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+ 
>>>> on-disk layout is such that all of the directory entries in a given 
>>>> directory are clustered together and we can get multiple directory entries 
>>>> from the same cached on-disk blocks.

Re: readdir vs. getdirentriesattr

2019-04-21 Thread Thomas Tempelmann

I'd like to add some info on a thread from 2015:

I recently worked on my file search tool (FAF) and wanted to make sure that I 
use the best method to deep-scan directory contents.

I had expected that getattrlistbulk() would always be the best choice, but it 
turns out that opendir/readdir perform much better in some cases, oddly (this 
is about reading just the file names, no other attributes).

See my blog post: https://blog.tempel.org/2019/04/dir-read-performance.html 


There's also a test project trying out the various methods.

Any comments, insights, clarifications and bug reports are most welcome.

Enjoy,
 Thomas Tempelmann


> On 12. Jan 2015, at 17:33, Jim Luther  wrote:
> 
> getattrlistbulk() works on all file systems. If the file system supports bulk 
> enumeration natively, great! If it does not, then the kernel code takes care 
> of it. In addition, getattrlistbulk() supports all non-volume attributes 
> (getdirentriesattr() only supported a large subset).
> 
> The API calling convention for getattrlistbulk() is slightly different than 
> getdirentriesattr() -- read the man page carefully. In particular:
> 
> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
> ATTR_CMN_NAME allowed us to get rid of the newState argument).
> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error 
> conditions for a specific directory entry.
> • The method for determining when enumeration is complete is different. You 
> just keep calling getattrlistbulk() until 0 entries are returned.
> 
> - Jim
> 
>> On Jan 11, 2015, at 9:31 PM, James Bucanek  wrote:
>> 
>> Eric,
>> 
>> I would just like to clarify: the new getattrlistbulk() function works on 
>> all filesystems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR 
>> capability before calling it, correct?
>> 
>> James Bucanek
>> 
>>> Eric Tamura December 10, 2014 at 5:57 PM
>>> It should be much faster.
>>> 
>>> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
>>> which is like getdirentriesattr(), but supported in VFS for all 
>>> filesystems. getdirentriesattr() is now deprecated. 
>>> 
>>> The main advantage of the bulk call is that we can return results in most 
>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+ 
>>> on-disk layout is such that all of the directory entries in a given 
>>> directory are clustered together and we can get multiple directory entries 
>>> from the same cached on-disk blocks.

Re: readdir vs. getdirentriesattr

2015-01-13 Thread Thomas Tempelmann

>
> For the most part, all of Apple's code has switched from
> getdirentriesattr() to getattrlistbulk().
>

*applauds*

Thomas

Re: readdir vs. getdirentriesattr

2015-01-13 Thread Jim Luther

In Yosemite (10.10), Carbon's PBGetCatalogInfoBulk and directory enumerators 
created with CoreFoundation's CFURLEnumerator (it's in CFURLEnumerator.h, not 
CFURL.h) both use getattrlistbulk() instead of readdir() or 
getdirentriesattr(). Foundation APIs like -[NSFileManager 
contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:] are layered 
upon CFURLEnumerator, so they use getattrlistbulk(), too.

For the most part, all of Apple's code has switched from getdirentriesattr() to 
getattrlistbulk().

- Jim

> On Jan 13, 2015, at 11:28 AM, Thomas Tempelmann  wrote:
> 
> On Tue, Jan 13, 2015 at 7:21 PM, Eric Tamura wrote:
> HFS, AFP, and SMB all support getattrlistbulk() natively.
>  
> Thanks for the clarification, Eric.
> 
> Do any of the higher level APIs also make use of this call, or do I have to 
> use this call to take advantage of its functionality?
> 
> I guess that NSURL's 
> contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: makes use 
> of it.
> 
> I can't find a CFURL equivalent for this bulk call, though. Has it been 
> omitted or am I just looking at the wrong docs?
> 
> And what about PBGetCatalogInfoBulk and FSOpenIterator with related functions?
> 
> Thomas
>  

Re: readdir vs. getdirentriesattr

2015-01-13 Thread Thomas Tempelmann

On Tue, Jan 13, 2015 at 7:21 PM, Eric Tamura  wrote:

> HFS, AFP, and SMB all support getattrlistbulk() natively.
>

Thanks for the clarification, Eric.

Do any of the higher level APIs also make use of this call, or do I have to
use this call to take advantage of its functionality?

I guess that NSURL's
contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error: makes
use of it.

I can't find a CFURL equivalent for this bulk call, though. Has it been
omitted or am I just looking at the wrong docs?

And what about PBGetCatalogInfoBulk and FSOpenIterator with related
functions?

Thomas

Re: readdir vs. getdirentriesattr

2015-01-13 Thread Eric Tamura

HFS, AFP, and SMB all support getattrlistbulk() natively.

Eric

> On 13 Jan 2015, at 3:52 AM, Thomas Tempelmann  wrote:
> 
> James,
> 
> I have to say, the new getattrlistbulk() function is working very well 
> here.[...] 
> And, I can confirm that it's fast. :)
> 
> Can you or someone else who tried this new function share with us where this 
> improves speed and where not? In particular, do any of the network file 
> systems (CIFS, SMB2, AFP) support this function over the network so that it 
> avoids making separate network inquiries for each item individually?
> 
> Thomas
> 

Re: readdir vs. getdirentriesattr

2015-01-13 Thread Thomas Tempelmann

James,

> I have to say, the new getattrlistbulk() function is working very well
> here.[...]
> And, I can confirm that it's fast. :)

Can you or someone else who tried this new function share with us where
this improves speed and where not? In particular, do any of the network
file systems (CIFS, SMB2, AFP) support this function over the network so
that it avoids making separate network inquiries for each item individually?

Thomas

Re: readdir vs. getdirentriesattr

2015-01-12 Thread James Bucanek


Jim, Eric, list,

I have to say, the new getattrlistbulk() function is working very well 
here. Requiring ATTR_CMN_RETURNED_ATTRS made me reorganize my code a 
bit, but that was minor. Specifically, it's nice to have a single 
routine to call (instead of getdirentriesattr() and a readdir() fallback 
when it's not available). It's nice to have a less complicated use 
pattern (i.e. no count-estimate/returned-count or state 
management/reread code to write). And, I can confirm that it's fast. :)


James


Jim Luther 
January 12, 2015 at 9:33 AM
getattrlistbulk() works on all file systems. If the file system 
supports bulk enumeration natively, great! If it does not, then the 
kernel code takes care of it. In addition, getattrlistbulk() supports 
all non-volume attributes (getdirentriesattr() only supported a large subset).


The API calling convention for getattrlistbulk() is slightly different 
than that of getdirentriesattr(), so read the man page carefully. In particular:


• ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
ATTR_CMN_NAME allowed us to get rid of the newState argument).
• A new attribute, ATTR_CMN_ERROR, can be requested to detect error 
conditions for a specific directory entry.
• The method for determining when enumeration is complete is 
different. You just keep calling getattrlistbulk() until 0 entries are 
returned.


- Jim


James Bucanek 
January 11, 2015 at 10:31 PM
Eric,

I would just like to clarify: the new getattrlistbulk() function works 
on all filesystems. We don't have to check the volume's 
VOL_CAP_INT_READDIRATTR capability before calling it, correct?


James Bucanek

Eric Tamura 
December 10, 2014 at 5:57 PM
It should be much faster.

Also note that as of Yosemite, we have added a new API: 
getattrlistbulk(2), which is like getdirentriesattr(), but supported 
in VFS for all filesystems. getdirentriesattr() is now deprecated.


The main advantage of the bulk call is that we can return results in 
most cases without having to create a vnode in-kernel, which saves on 
I/O: HFS+ on-disk layout is such that all of the directory entries in 
a given directory are clustered together and we can get multiple 
directory entries from the same cached on-disk blocks.


How big are the directories in question? How many times are you 
calling this?


Eric






Re: readdir vs. getdirentriesattr

2015-01-12 Thread Jim Luther

getattrlistbulk() works on all file systems. If the file system supports bulk 
enumeration natively, great! If it does not, then the kernel code takes care of 
it. In addition, getattrlistbulk() supports all non-volume attributes 
(getdirentriesattr() only supported a large subset).

The API calling convention for getattrlistbulk() is slightly different than 
that of getdirentriesattr(), so read the man page carefully. In particular:

• ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
ATTR_CMN_NAME allowed us to get rid of the newState argument).
• A new attribute, ATTR_CMN_ERROR, can be requested to detect error conditions 
for a specific directory entry.
• The method for determining when enumeration is complete is different. You 
just keep calling getattrlistbulk() until 0 entries are returned.
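
The calling convention in the list above can be sketched in C roughly as follows. This is a minimal sketch, not code from the thread: list_dir_names() is a hypothetical helper, the 64 KiB buffer size is an arbitrary choice, and the non-Apple branch is only a readdir() stand-in so the file still compiles on other systems. It assumes the per-entry layout used by the getattrlistbulk(2) man page example (length, returned attribute set, optional error code, then the name reference).

```c
#include <stdio.h>
#include <string.h>
#ifdef __APPLE__
#include <fcntl.h>
#include <stdint.h>
#include <sys/attr.h>
#include <unistd.h>
#else
#include <dirent.h>
#endif

/* Hypothetical helper: prints every entry name in `path` and returns the
 * number of entries seen, or -1 on error. */
long list_dir_names(const char *path)
{
    long total = 0;
#ifdef __APPLE__
    struct attrlist al;
    memset(&al, 0, sizeof al);
    al.bitmapcount = ATTR_BIT_MAP_COUNT;
    /* NAME and RETURNED_ATTRS are required; ERROR is optional. */
    al.commonattr = ATTR_CMN_RETURNED_ATTRS | ATTR_CMN_NAME | ATTR_CMN_ERROR;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    uint64_t storage[8192];              /* 64 KiB, 8-byte aligned */
    char *buf = (char *)storage;
    for (;;) {
        int n = getattrlistbulk(fd, &al, buf, sizeof storage, 0);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)
            break;                       /* 0 entries: enumeration is done */

        char *entry = buf;
        for (int i = 0; i < n; i++) {
            char *field = entry;
            uint32_t length;
            memcpy(&length, field, sizeof length);
            field += sizeof length;

            attribute_set_t returned;
            memcpy(&returned, field, sizeof returned);
            field += sizeof returned;

            if (returned.commonattr & ATTR_CMN_ERROR) {
                uint32_t err;
                memcpy(&err, field, sizeof err);
                field += sizeof err;
                if (err != 0)
                    fprintf(stderr, "entry error: %s\n", strerror((int)err));
            }
            if (returned.commonattr & ATTR_CMN_NAME) {
                attrreference_t ref;
                memcpy(&ref, field, sizeof ref);
                /* The offset is relative to the attrreference_t itself. */
                printf("%s\n", field + ref.attr_dataoffset);
                total++;
            }
            entry += length;             /* advance to the next record */
        }
    }
    close(fd);
#else
    /* Stand-in for systems without getattrlistbulk(). "." and ".." are
     * skipped on the assumption that the bulk call does not report them. */
    DIR *d = opendir(path);
    if (!d)
        return -1;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        printf("%s\n", e->d_name);
        total++;
    }
    closedir(d);
#endif
    return total;
}
```

Calling list_dir_names("/tmp") prints one name per line. Note that there is no state argument and no count estimate to manage: the loop simply ends when the call returns 0.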

- Jim

> On Jan 11, 2015, at 9:31 PM, James Bucanek  wrote:
> 
> Eric,
> 
> I would just like to clarify: the new getattrlistbulk() function works on all 
> filesystems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR 
> capability before calling it, correct?
> 
> James Bucanek
> 
>>  Eric Tamura   December 10, 2014 at 5:57 PM
>> It should be much faster.
>> 
>> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
>> which is like getdirentriesattr(), but supported in VFS for all filesystems. 
>> getdirentriesattr() is now deprecated. 
>> 
>> The main advantage of the bulk call is that we can return results in most 
>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+ 
>> on-disk layout is such that all of the directory entries in a given 
>> directory are clustered together and we can get multiple directory entries 
>> from the same cached on-disk blocks.
>> 

Re: readdir vs. getdirentriesattr

2015-01-11 Thread James Bucanek


Eric,

I would just like to clarify: the new getattrlistbulk() function works 
on all filesystems. We don't have to check the volume's 
VOL_CAP_INT_READDIRATTR capability before calling it, correct?


James Bucanek


Eric Tamura 
December 10, 2014 at 5:57 PM
It should be much faster.

Also note that as of Yosemite, we have added a new API: 
getattrlistbulk(2), which is like getdirentriesattr(), but supported 
in VFS for all filesystems. getdirentriesattr() is now deprecated.


The main advantage of the bulk call is that we can return results in 
most cases without having to create a vnode in-kernel, which saves on 
I/O: HFS+ on-disk layout is such that all of the directory entries in 
a given directory are clustered together and we can get multiple 
directory entries from the same cached on-disk blocks.



Re: readdir vs. getdirentriesattr

2014-12-10 Thread Sean Farley

Jim Luther writes:

> And to clarify... readdir may be faster than getattrlistbulk if all you need 
> are the names. If you call getattrlist (or lstat) on every item you get back 
> from readdir, you'll find that getattrlistbulk is faster.

That is exactly what we are doing, calling lstat per file in the directory:

http://selenic.com/hg/file/416c133145ee/mercurial/osutil.c#l341

Re: readdir vs. getdirentriesattr

2014-12-10 Thread Sean Farley

Eric Tamura writes:

> It should be much faster.
>
> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
> which is like getdirentriesattr(), but supported in VFS for all filesystems.  
> getdirentriesattr() is now deprecated. 

Aha, that is interesting and a good lead. Thanks :-)

> The main advantage of the bulk call is that we can return results in most 
> cases without having to create a vnode in-kernel, which saves on I/O:  HFS+ 
> on-disk layout is such that all of the directory entries in a given directory 
> are clustered together and we can get multiple directory entries from the 
> same cached on-disk blocks.

Thanks a lot for the explanation. So, if I understand correctly,
directories with a large number of files will be sped up using this bulk
call vs. calling lstat one by one.

But perhaps not as much benefit for a large amount of directories with
one file each?

> How big are the directories in question? How many times are you calling this?

Since this is for the Mercurial project, the answer is: depends on the
project. For my tests, I ran this on a handful of repositories (MacPorts
and some others I had lying around). I could generate test repositories
that are of a certain variety (e.g. one root with lots of files per
directory vs. lots of directories with one file) if there is some
insight into what you'd like me to specifically test.
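
The two layouts mentioned above (one directory with many files vs. many directories with one file each) could be generated with something like the following. This is only a sketch under assumptions: make_test_tree() and the dirNNNNN/fileNNNNN naming are hypothetical, and the files are left empty because only enumeration cost is of interest here.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates `ndirs` subdirectories under `base`, each
 * holding `files_per_dir` empty files. Existing directories and files are
 * reused. Returns the number of files created or reused, or -1 on error. */
long make_test_tree(const char *base, int ndirs, int files_per_dir)
{
    char path[4096];
    long made = 0;

    if (mkdir(base, 0755) != 0 && errno != EEXIST)
        return -1;

    for (int d = 0; d < ndirs; d++) {
        int len = snprintf(path, sizeof path, "%s/dir%05d", base, d);
        if (len < 0 || (size_t)len >= sizeof path)
            return -1;
        if (mkdir(path, 0755) != 0 && errno != EEXIST)
            return -1;

        for (int f = 0; f < files_per_dir; f++) {
            len = snprintf(path, sizeof path, "%s/dir%05d/file%05d",
                           base, d, f);
            if (len < 0 || (size_t)len >= sizeof path)
                return -1;
            int fd = open(path, O_CREAT | O_WRONLY, 0644);
            if (fd < 0)
                return -1;
            close(fd);
            made++;
        }
    }
    return made;
}
```

For example, make_test_tree("wide", 1, 10000) gives one directory with many files, while make_test_tree("deep", 10000, 1) gives many directories with one file each.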

As for the number of times we call this: the answer is once per
directory. This code stems from the Linux ext4 world, where we call lstat for
each file in a directory and rely on the kernel to optimize that.

Re: readdir vs. getdirentriesattr

2014-12-10 Thread Jim Luther

And to clarify... readdir may be faster than getattrlistbulk if all you need 
are the names. If you call getattrlist (or lstat) on every item you get back 
from readdir, you'll find that getattrlistbulk is faster.
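
For reference, the readdir()-plus-lstat() pattern being compared here looks roughly like this. It is a minimal sketch, not code from Mercurial or from this thread: count_regular_files() is a hypothetical helper, and the 4096-byte path buffer is an assumption. The point is that every entry costs one extra lstat() call, which is the per-item work a bulk attribute call avoids.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: counts the regular files directly inside `dir`,
 * issuing one lstat() per directory entry. Returns -1 if the directory
 * cannot be opened. */
long count_regular_files(const char *dir)
{
    DIR *d = opendir(dir);
    if (!d)
        return -1;

    long count = 0;
    char path[4096];
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;

        int len = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;                    /* path too long: skip the entry */

        struct stat st;
        /* One lstat() per entry; symlinks are not followed. */
        if (lstat(path, &st) == 0 && S_ISREG(st.st_mode))
            count++;
    }
    closedir(d);
    return count;
}
```

If only the names are needed, the lstat() call disappears and the loop is a plain readdir() scan, which is the case where readdir may come out ahead.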

- Jim

> On Dec 10, 2014, at 4:57 PM, Eric Tamura  wrote:
> 
> It should be much faster.
> 
> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
> which is like getdirentriesattr(), but supported in VFS for all filesystems.  
> getdirentriesattr() is now deprecated. 
> 
> The main advantage of the bulk call is that we can return results in most 
> cases without having to create a vnode in-kernel, which saves on I/O:  HFS+ 
> on-disk layout is such that all of the directory entries in a given directory 
> are clustered together and we can get multiple directory entries from the 
> same cached on-disk blocks.
> 
> How big are the directories in question? How many times are you calling this?
> 
> Eric
> 
> 
> 
>> On 10 Dec 2014, at 4:32 PM, Sean Farley wrote:
>> 
>> Hello HFS+ devs :-)
>> 
>> I was playing around with trying to speed up the status operation in
>> Mercurial on HFS+ filesystems and heard that getdirentriesattr might be
>> faster.
>> 
>> From what I could gather (man pages and online resources), it seems the
>> potential speedup comes from the ability to do a bulk call on the files,
>> though correct me if I'm wrong.
>> 
>> I posted a proof-of-concept patch here:
>> 
>> http://www.selenic.com/pipermail/mercurial-devel/2014-September/061777.html
>> 
>> But got no real results. Experiments tried included: warm cache vs cold
>> cache, numbers of files to batch call, and combinations thereof.
>> 
>> Am I missing something obvious or just looking in the wrong place?

Re: readdir vs. getdirentriesattr

2014-12-10 Thread Eric Tamura

It should be much faster.

Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
which is like getdirentriesattr(), but supported in VFS for all filesystems.  
getdirentriesattr() is now deprecated. 

The main advantage of the bulk call is that we can return results in most cases 
without having to create a vnode in-kernel, which saves on I/O:  HFS+ on-disk 
layout is such that all of the directory entries in a given directory are 
clustered together and we can get multiple directory entries from the same 
cached on-disk blocks.

How big are the directories in question? How many times are you calling this?

Eric

> On 10 Dec 2014, at 4:32 PM, Sean Farley  wrote:
> 
> Hello HFS+ devs :-)
> 
> I was playing around with trying to speed up the status operation in
> Mercurial on HFS+ filesystems and heard that getdirentriesattr might be
> faster.
> 
> From what I could gather (man pages and online resources), it seems the
> potential speedup comes from the ability to do a bulk call on the files,
> though correct me if I'm wrong.
> 
> I posted a proof-of-concept patch here:
> 
> http://www.selenic.com/pipermail/mercurial-devel/2014-September/061777.html
> 
> But got no real results. Experiments tried included: warm cache vs cold
> cache, numbers of files to batch call, and combinations thereof.
> 
> Am I missing something obvious or just looking in the wrong place?

readdir vs. getdirentriesattr

2014-12-10 Thread Sean Farley

Hello HFS+ devs :-)

I was playing around with trying to speed up the status operation in
Mercurial on HFS+ filesystems and heard that getdirentriesattr might be
faster.

From what I could gather (man pages and online resources), it seems the
potential speedup comes from the ability to do a bulk call on the files,
though correct me if I'm wrong.

I posted a proof-of-concept patch here:

http://www.selenic.com/pipermail/mercurial-devel/2014-September/061777.html

But got no real results. Experiments tried included: warm cache vs cold
cache, numbers of files to batch call, and combinations thereof.

Am I missing something obvious or just looking in the wrong place?