On Fri, Jun 13, 2025 at 08:14:16AM -0400, Greg Troxel wrote:
> > With that said, I wanted to know if there was anyone working on
> > this. I'd also welcome any mentorship or advice in regards to
> > working on this project.
>
> I have not heard that anyone is working on this.

As far as I know, nobody is currently working on it. There have been
several past attempts that have come to grief; it isn't as easy as it
may look.

> From a really high level, it would be good to articulate if this is
> about freeing blocks to allow storing more files, or if it's about
> interacting with TRIM.

The project on the projects page that's been referenced is about
punching holes in files and has nothing to do with device-level
discard or TRIM.

> Personally, my first step would be a survey of other implementations.
> The semantics of fdiscard(2) are a bit messy, with TRIM and undefined
> contents. That's surprising to me, as I see telling the hardware that
> blocks are no longer needed seems separable from a file no longer
> referencing those blocks.

: When fdiscard() is applied to a regular file, a hole is created and any
: data in the affected region is thrown away. Subsequent reads of the
: region return zeros.

Seems very clear. The semantics of discard on devices, which are not
immediately relevant, are a reflection of the hardware-level behavior
of TRIM.

> It looks like fdiscard(2) isn't part of
> POSIX, which raises the question of where it came from

: The posix_fallocate() and fdiscard() function calls appeared in
: NetBSD 7.0.

I made it up to replace an ugly ioctl interface to TRIM that we didn't
want in a release. Extending it to regular files is a natural
generalization and directly comparable to fallocate. All this can be
found in the tech-kern archives.

> [more irrelevant stuff about TRIM]

> Then, I'd write down how fdiscard on ffs is supposed to behave (as in
> external specification), both at the VOP layer, and if it interacts with
> block devices specially.

It does not interact with block devices. Why would it?

> Next, read and understand the code, which is a pretty big job. I'd
> document invariants and improve comments, assuming that this hasn't been
> done already.
> BSD has a history of code having critical invariants that
> aren't really written down; I don't know about the current state of ffs.
> You'll also have to understand wapbl. My impression is that most people
> use wapbl, and thus discard on ffs w/o wapbl isn't so interesting.
>
> Then, a written design of how the code is going to change, both in the
> base case and wapbl case. This is probably "change the list of blocks
> to have holes" but it's all about ordering of writes and invariants.

To expand on that, there's two parts to the problem: implementing
discard itself, and then making it interact correctly with recovery.

Implementing discard itself is relatively straightforward - for each
block, you clear the block entry in the inode or indirect block, mark
same dirty, and free the block in the free block bitmap. It is more or
less the same thing truncate does, except on a range.

For non-wapbl ffs you need to preserve the on-disk invariants that
permit fsck to work. The fsck paper (mckusick85-fsck, that you can
find in /usr/share/doc/papers) is a good place to start for that.

For wapbl you need to write the affected blocks to the journal (wapbl
uses a somewhat dodgy block-based journaling approach) and you'll
probably need to make sure you write each block only once. (Otherwise
you're likely to run out of space in the journal doing large
discards.)

There's been a history of both performance and correctness problems
with truncate in wapbl. Looking into that history (in the tech-kern
archives, the source history, and the bug database) will probably help
illuminate why it works the way it does.

In general because the operation is very similar to truncate, using
the truncate code as a reference is the best plan. Share as much of
the code as possible.

Also beware that doing any of this will require understanding the
rather grotty way cache writes and journaling got implemented for
wapbl. Bluntly, it was hacked on rather than built in properly.

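To make the "clear the entry, mark dirty, free in the bitmap" step
above concrete, here is a toy user-space sketch. It is not ffs code:
struct toy_inode, freemap, blk_free, blk_isfree and toy_discard are all
made-up names, there are only direct blocks, and nothing here deals
with write ordering, fsck invariants, or the journal, which is where
the real work is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NDIRECT 12      /* direct block pointers in the toy inode */
#define NBLOCKS 64      /* blocks in the toy filesystem */
#define BLKSIZE 4096    /* toy block size in bytes */

/* Hypothetical in-memory inode: block number 0 stands for a hole. */
struct toy_inode {
	uint32_t db[NDIRECT];
	bool dirty;
};

/* Hypothetical free-block bitmap: a set bit means the block is free. */
static uint8_t freemap[NBLOCKS / 8];

static void
blk_free(uint32_t b)
{
	assert(b < NBLOCKS);
	freemap[b / 8] |= (uint8_t)(1u << (b % 8));
}

static bool
blk_isfree(uint32_t b)
{
	assert(b < NBLOCKS);
	return (freemap[b / 8] & (1u << (b % 8))) != 0;
}

/*
 * Release every block that lies wholly inside [pos, pos + len).
 * Returns the number of blocks freed.  Partially covered blocks are
 * left alone; a real implementation would have to zero the covered
 * part of them so that later reads return zeros.
 */
static int
toy_discard(struct toy_inode *ip, uint64_t pos, uint64_t len)
{
	uint64_t first = (pos + BLKSIZE - 1) / BLKSIZE; /* round up */
	uint64_t last = (pos + len) / BLKSIZE;  /* round down, exclusive */
	int freed = 0;

	for (uint64_t lbn = first; lbn < last && lbn < NDIRECT; lbn++) {
		uint32_t b = ip->db[lbn];

		if (b == 0)
			continue;       /* already a hole */
		ip->db[lbn] = 0;        /* clear the block entry */
		ip->dirty = true;       /* mark the inode dirty */
		blk_free(b);            /* free it in the bitmap */
		freed++;
	}
	return freed;
}
```

With blocks 10 through 13 at logical blocks 0 through 3, calling
toy_discard(&ino, 4096, 2 * 4096) frees blocks 11 and 12 and leaves
holes at logical blocks 1 and 2. Truncate is the same loop with the
range running to end of file, which is the argument for sharing the
code.
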
> I would suggest writing things up and posting them somewhere for review
> by others.

Indeed.

-- 
David A. Holland
dholl...@netbsd.org