What information are you trying to get out of each scan? You will always have a time-of-check vs. time-of-use race condition here: the filesystem is in a perennial state of flux.
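To make that race concrete, here is a minimal Python sketch. It is not taken from any Apple API; `scan_tree` and its return shape are hypothetical names chosen for illustration. It assumes a plain directory walk, records each (device, file ID) pair in a dictionary rather than an array so that sparse IDs cost nothing, and notes any entry that disappears between being listed and being stat'ed.

```python
import os
import stat


def scan_tree(root, seen=None):
    """Walk `root` and map (device, file ID) -> path in a dict.

    Hypothetical helper for illustration only. Entries that vanish
    between the directory listing (the check) and the stat call
    (the use) are collected in `vanished` instead of aborting the scan.
    """
    seen = {} if seen is None else seen
    vanished = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            with os.scandir(directory) as entries:
                for entry in entries:
                    try:
                        info = entry.stat(follow_symlinks=False)
                    except FileNotFoundError:
                        # Deleted after it was listed: the race in action.
                        vanished.append(entry.path)
                        continue
                    # Hard links share a file ID; the last path seen wins.
                    seen[(info.st_dev, info.st_ino)] = entry.path
                    if stat.S_ISDIR(info.st_mode):
                        stack.append(entry.path)
        except (FileNotFoundError, NotADirectoryError, PermissionError):
            # The directory itself went away or became unreadable.
            vanished.append(directory)
    return seen, vanished
```

Passing a previously saved `seen` dictionary back in lets a later pass add to an earlier one, but it cannot tell you what changed in directories that were already visited, which is the underlying problem.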
Eric

> On 22 Mar 2018, at 12:53 PM, Irad K <iradizat...@gmail.com> wrote:
>
> Hi and thanks for the invaluable responses.
>
> I understood that since I cannot limit the file id range, and that the
> allocated fileid keeps ascending, it would be better to use a hash map.
>
> However, my requirements are to be able to scan the volume with reentrancy.
> This means that if the machine undergoes a reboot during the scan, I'd be
> able to start from this point, by keeping some sort of pivot so I can resume
> scanning from the point it stopped.
>
> If I scan the files in ascending fileid order, I can simply keep the last
> scanned fileid, and upon restart, I can start from this point (since no files
> are added with a fileid before that value)... this is one option, it can also
> be creation date, or any other order which allows me to differentiate between
> the scanned and unscanned files.
>
> Scanning according to directory hierarchy is one example which doesn't
> satisfy this requirement, since new files might be created during the
> scan, in folders that I already covered.
>
> So far I haven't had much luck in scanning by this order; the sparse
> filesystem makes the /.vol/<fsid>/<fileid> option inefficient.
> As for the searchfs option, I haven't seen in the man page any way to control
> the order of the files.
>
> Perhaps you can suggest a proper use of searchfs, or any other efficient
> way to do this?
>
> Thanks a lot,
> Irad K
>
>
> On Thu, Mar 22, 2018 at 7:08 PM, Kevin Elliott <kelli...@mac.com> wrote:
>
>
> > On Mar 21, 2018, at 1:38 PM, Irad K <iradizat...@gmail.com> wrote:
> >
> > Eric,
> >
> > Thanks for the info, that can explain the offset since I previously
> > upgraded the OS from Sierra which uses HFS+ for its root filesystem.
> >
> > The reason that brought me to look into the fileid values is a file
> > scanner design I'm currently working on in which, instead of iterating the
> > files according to their directory structure (i.e. BFS), I iterate according
> > to the ascending file-id attribute, where I always assumed that the file-id
> > starts from zero.
> >
> > Using this scanning order, I can halt my scanning and return to it
> > according to the last scanned file-id (assuming that I can ignore newly
> > created files that got a file-id value lower than this last scanned value).
> >
> > I'd be happy if you could tell me what the file-id allocation policy is in
> > APFS or HFS+ in the following aspects
> >
> > 1. Is there any way to extract the current file-id range (minimum to
> > maximum fileid).
>
> No, and it’s unlikely that there will ever be one. Functionally, the most
> useful way to think about file IDs is as UUIDs. Their value gives
> you NO useful information about the file, aside from the fact that it’s a
> unique ID for the file.
>
> > 2. I've noticed that there are some gaps in the file-id list, meaning that
> > some ids aren't connected to files. How can this happen (I assume it’s due
> > to deleted files), and when creating a new file, does it get a file-id from
> > the lowest available value or the next file-id after the current maximum
> > value.
>
> File deletion, arbitrary action, dumb luck… Again, the file ID is
> effectively a unique identifier for the file. The fact that APFS HAPPENS to
> hand them out in a particular order is an artifact of its implementation,
> not a requirement.
>
> > 3. I’d like to use an array in which each index represents a file-id.
>
> Why? I can’t think of any real use for doing this. Thomas’s idea of using a
> dictionary isn’t a bad one if this is a problem you really need to solve, but
> this seems like a very strange problem to decide you need to solve.
>
> > Can I assume that the file-ids aren’t sparse (meaning that the gaps of
> > unused id values are small) so that I won't waste too much memory?
>
> No, not at all. Over time it would be normal and expected for the list to
> become quite sparse: some files are written out early in the drive’s life and
> basically never change. Just looking in my /Applications directory, the
> Info.plist of the oldest app on my system was last modified on “July 17,
> 2006”. A lot of files have come and gone since 2006…
>
> > 4. Do you recommend another, more efficient way to iterate through the
> > files in order of ascending file-id, other than through the /.vol/ drive?
>
> Well, I’m not sure I’d recommend using .vol either…
>
> What are you trying to do, and why does iterating by file ID seem like a good
> way to do it?
>
> -Kevin
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Filesystem-dev mailing list (Filesystem-dev@lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/filesystem-dev/etamura%40apple.com
>
> This email sent to etam...@apple.com