mds vs f_fstypename

2019-04-21 Thread Jorgen Lundman

Hello list,

Back when we desperately tried to get ZFS to work with Spotlight, we found
that we had to change vfsstatfs->f_fstypename to return "hfs" instead of
"zfs". I just assumed mds did a simple string compare with "hfs" and that
was just how things were.

So recently we tried to lie and say we were "apfs" instead of "hfs", and
found that mds does not want to work in that case either.

That could suggest that ZFS is doing something wrong, as opposed to mds
ignoring ZFS. (So maybe it isn't doing a string compare after all).

Working with mds is a bit of a black box: it either works or it doesn't, and
it gives no clues as to what could be wrong. (Right? No debug messages or
similar?)

I have made the ZFS replies to statfs, getattrlist(ATTR_VOL_CAPABILITIES),
and the vnop_tables identical to those of apfs, as well as dtruss'ing mds for
clues, but they still don't like each other.

I know there are no apfs, nor mds, sources. But is it possible to get clues
as to what mds does differently when running on "apfs" compared to "hfs"?
Are we accidentally hooking into an hfs compatibility mode in mds?

(We haven't implemented getattrlist or getattrlistbulk, but the XNU kernel
looks to handle that very nicely, and it /looks like/ that shouldn't be the
problem.)
As for not working, it looks a bit like this. When it works:

# zpool create -f tank disk0
# touch /Volumes/tank/test.txt
# mdls /Volumes/tank/test.txt
kMDItemContentCreationDate = 2014-04-17 03:02:53 +
[lots more information dumped here]

And when it does not:

# zpool create -f tank disk0
# touch /Volumes/tank/test.txt
# mdls /Volumes/tank/test.txt
(no output at all)

Anything would be appreciated!


Jorgen Lundman   | 
Unix Administrator   | +81 (0)90-5578-8500
Shibuya-ku, Tokyo| Japan

Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list  (
Help/Unsubscribe/Update your Subscription:

This email sent to

Re: readdir vs. getdirentriesattr

2019-04-21 Thread Thomas Tempelmann
I'd like to add some info to a thread from 2015:

I recently worked on my file search tool (FAF) and wanted to make sure that I 
use the best method to deep-scan directory contents.

I had expected that getattrlistbulk() would always be the best choice, but it 
turns out that opendir/readdir perform much better in some cases, oddly (this 
is about reading just the file names, no other attributes).

See my blog post: 

There's also a test project trying out the various methods.

Any comments, insights, clarifications and bug reports are most welcome.

 Thomas Tempelmann

> On 12. Jan 2015, at 17:33, Jim Luther  wrote:
> getattrlistbulk() works on all file systems. If the file system supports bulk 
> enumeration natively, great! If it does not, then the kernel code takes care 
> of it. In addition, getattrlistbulk() supports all non-volume attributes 
> (getdirentriesattr() only supported a large subset).
> The API calling convention for getattrlistbulk() is slightly different from 
> getdirentriesattr()'s; read the man page carefully. In particular:
> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring 
> ATTR_CMN_NAME allowed us to get rid of the newState argument).
> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error 
> conditions for a specific directory entry.
> • The method for determining when enumeration is complete is different. You 
> just keep calling getattrlistbulk() until 0 entries are returned.
> - Jim
>> On Jan 11, 2015, at 9:31 PM, James Bucanek  wrote:
>> Eric,
>> I would just like to clarify: the new getattrlistbulk() function works on 
>> all file systems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR 
>> capability before calling it, correct?
>> James Bucanek
>>> Eric Tamura December 10, 2014 at 5:57 PM
>>> It should be much faster.
>>> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), 
>>> which is like getdirentriesattr(), but supported in VFS for all 
>>> filesystems. getdirentriesattr() is now deprecated. 
>>> The main advantage of the bulk call is that we can return results in most 
>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+ 
>>> on-disk layout is such that all of the directory entries in a given 
>>> directory are clustered together and we can get multiple directory entries 
>>> from the same cached on-disk blocks.