On Fri, 2008-10-17 at 14:24 -0400, Valerie Aurora Henson wrote:
> On Thu, Oct 16, 2008 at 03:30:49PM -0400, Chris Mason wrote:
> > On Thu, 2008-10-16 at 15:25 -0400, Valerie Aurora Henson wrote:
> > >
> > > Both deduplication and compression have an interesting side effect in
> > > which a write to a previously "allocated" block can return ENOSPC.
> > > This is even more exciting when you factor in mmap. Any thoughts on
> > > how to handle this?
> >
> > Unfortunately we'll have a number of places where ENOSPC will jump in
> > where people don't expect it, and this includes any COW overwrite of an
> > existing extent. The old extent isn't freed until snapshot deletion
> > time, which won't happen until after the current transaction commits.
> >
> > Another example is fallocate. The extent will have a little flag that
> > says I'm a preallocated extent, which is how we'll know we're allowed to
> > overwrite it directly instead of doing COW.
> >
> > But, to write to the fallocated extent, we'll have to clear the flag.
> > So, we'll have to cow the block that holds the file extent pointer,
> > which means we can enospc.
>
> I'm sure you know this, but for the peanut gallery: You can avoid some
> of these sort of purely copy-on-write ENOSPC cases. Any operation
> where the space used afterwards is less than or equal to the space
> used before - like in your fallocate case - can avoid ENOSPC as long
> as you reserve a certain amount of space on the fs and break down the
> changes into small enough groups. Most file systems don't let you
> fill up beyond 90-95% anyway because performance goes to hell. You
> also need to do this so you can delete when your file system is full.
>
> In general, it'd be nice to say that if your app can't handle surprise
> ENOSPC, then if you run without snapshots, compression, or data dedup,
> we guarantee you'll only get ENOSPC in the "normal" cases. What do
> you think?

I think I'll have to come back to this after getting ENOSPC to work at
all ;)

You're right that reserved space can do wonders to dig us out of holes.
It has to be reserved at a multiple of the number of procs that I allow
into the transaction. I should be able to go into an emergency
one-writer-at-a-time mode as space gets really tight, but there are
lots of missing pieces that haven't been coded yet in that area.

-chris
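
For anyone following along, the reservation idea in this thread can be
sketched roughly as below. This is only an illustration and not btrfs
code: the function names, the per-writer block count, and the writer
limit are all hypothetical. It assumes each writer joining a transaction
may need some worst-case number of blocks for COW, so the full reserve is
that amount times the number of writers allowed in, and the number of
writers admitted shrinks toward one as free space runs out.

```python
def full_reserve(per_writer_reserve: int, max_writers: int) -> int:
    """Blocks to hold back so every allowed writer can finish its COW.

    Both parameters are hypothetical tunables, not real btrfs values.
    """
    return per_writer_reserve * max_writers


def writers_allowed(free_blocks: int, per_writer_reserve: int,
                    max_writers: int) -> int:
    """How many writers may join the transaction given the free space.

    Returns max_writers when space is plentiful, drops toward 1 as
    space gets tight (the "one writer at a time" case), and returns 0
    when even a single writer's worst case cannot be covered, which is
    where the caller would have to return ENOSPC.
    """
    if per_writer_reserve <= 0 or max_writers <= 0:
        raise ValueError("reserve and writer limit must be positive")
    affordable = max(free_blocks, 0) // per_writer_reserve
    return min(max_writers, affordable)


if __name__ == "__main__":
    # Illustrative numbers only: 10 blocks per writer, up to 8 writers.
    for free in (1000, 25, 10, 9):
        print(free, writers_allowed(free, 10, 8))
```

The point of the sketch is only that the reserve scales with the number
of concurrent writers, so limiting writers is one way to keep making
progress with a smaller reserve.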