Hi and thanks for the invaluable responses.

I understand that since I cannot bound the fileid range, and since the
allocated fileids keep ascending, a hash map is the better choice.
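
For reference, a minimal sketch of the hash-map approach, under the
assumption that st_ino as returned by lstat() can stand in for the fileid.
The function name index_by_file_id is only illustrative. Memory grows with
the number of files seen, not with the largest fileid, so gaps in the id
space cost nothing:

```python
import os

def index_by_file_id(root):
    """Map file id -> path using a dict, so sparse ids waste no memory.

    Assumption: st_ino is used as the file id.
    """
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            # Hard links share one id, so the last path seen wins.
            seen[st.st_ino] = path
    return seen
```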

However, my requirement is to be able to resume an interrupted scan of the
volume. This means that if the machine reboots during the scan, I can keep
some sort of pivot and resume scanning from the point where it stopped.

If I scan the files in ascending fileid order, I can simply keep the
last scanned fileid and, upon restart, continue from that point (since
no files are added with a fileid below that value). This is one option; it
could also be creation date, or any other order that lets me differentiate
between the scanned and unscanned files.
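
To make the pivot idea concrete, here is a rough sketch. It assumes the
(fileid, path) pairs are already available from some enumeration step, and
that new files never receive a fileid below the pivot, which is exactly the
assumption in question. The names load_pivot, save_pivot and resume_scan,
and the JSON state file, are hypothetical:

```python
import json
import os

def load_pivot(state_path):
    """Return the last scanned file id, or -1 if no usable checkpoint exists."""
    try:
        with open(state_path) as f:
            return json.load(f)["last_file_id"]
    except (FileNotFoundError, KeyError, ValueError):
        return -1

def save_pivot(state_path, file_id):
    """Persist the pivot atomically so a reboot cannot leave a torn file."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_file_id": file_id}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, state_path)  # atomic rename over the old checkpoint

def resume_scan(entries, state_path, handle):
    """Process (file_id, path) pairs in ascending id order, skipping ids
    at or below the saved pivot."""
    pivot = load_pivot(state_path)
    for file_id, path in sorted(entries):
        if file_id <= pivot:
            continue  # already handled before the interruption
        handle(file_id, path)
        save_pivot(state_path, file_id)
```

The pivot is written only after handle() returns, so a file whose scan was
cut short is scanned again after restart rather than skipped.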

Scanning according to the directory hierarchy is one example that doesn't
satisfy this requirement, since new files might be created during
the scan in folders that I have already covered.

So far I haven't had much luck scanning in this order: the sparse fileid
space makes the /.vol/<fsid>/<fileid> option inefficient.
As for the searchfs option, I haven't seen in the man page any way to
control the order of the files.

Perhaps you can suggest a proper use of searchfs, or any other efficient
way to do this?

Thanks a lot,
Irad K


On Thu, Mar 22, 2018 at 7:08 PM, Kevin Elliott <kelli...@mac.com> wrote:

>
>
> > On Mar 21, 2018, at 1:38 PM, Irad K <iradizat...@gmail.com> wrote:
> >
> > Eric,
> >
> > Thanks for the info, that can explain the offset since I previously
> > upgraded the OS from Sierra which uses HFS+ for its root filesystem.
> >
> > The reason that brought me looking into the fileid values, is some file
> > scanner design I'm currently working on that instead of iterating the files
> > according to their directory structure (i.e. BFS), I iterate according to
> > ascending file-id attribute, where I always assumed that the file-id starts
> > from zero.
> >
> > Using this scanning order, I can halt my scanning and regain to it
> > according to the last scanned file-id (assuming that I can ignore newly
> > created files that got file-id value lower than this last scanned value).
> >
> > I'd be happy if you could tell me what is the file-id allocation policy
> > in APFS or HFS+ in the following aspects
> >
> > 1. Is there any way to extract the current file-id range (minimum to
> > maximum fileid).
>
> No, and it’s unlikely that there will ever be one.  Functionally, the most
> useful way to think about file IDs is as UUIDs.  Their value gives you
> NO useful information about the file, aside from the fact that
> it’s a unique ID for the file.
>
> > 2. I've noticed that there are some gaps in file-id list. meaning that
> > some ids aren't connected to files. How can this happen (I assume it’s due
> > to deleted files), and when creating new file, does it get file-id from the
> > lowest available value or the next file-id after the current maximum value.
>
> File deletion, arbitrary action, dumb luck…  Again, the file ID is
> effectively a unique identifier for the file.  The fact that APFS HAPPENS
> to hand them out in a particular order is an artifact of its
> implementation, not a requirement.
>
>
> > 3. I’d like to use an array that each index represent a file-id.
>
> Why?  I can’t think of any real use for doing this.  Thomas’s idea of
> using a dictionary isn’t a bad one if this is a problem you really need to
> solve, but this seems like a very strange problem to decide you need to
> solve.
>
>
> > Can I assume that the file-ids aren’t sparse (meaning that the gaps of
> > unused id values are small) so that I won't waste too much memory ?
>
> No, not at all.  Over time it would be normal and expected for the list to
> become quite sparse: some files are written out early in the drive’s life
> and basically never change.  Just looking in my /Applications directory,
> the Info.plist of the oldest app on my system was last modified on “July
> 17, 2006”.  A lot of files have come and gone since 2006…
>
> > 4. Do you recommend other, more efficient way to iterate through the
> > files in order of ascending file-id, other than through the /.vol/ drive ?
>
> Well, I’m not sure I’d recommend using .vol either…
>
> What are you trying to do, and why does iterating by file ID seem like a
> good way to do it?
>
> -Kevin
 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Filesystem-dev mailing list      (Filesystem-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/filesystem-dev/archive%40mail-archive.com