On Mon, Mar 26, 2012 at 6:18 AM, Richard Miller <[email protected]> wrote:
> (1) When taking a snapshot, blockWrite in cache.c is called to write
> an updated super block S, which has a pointer to the root block R
> for the new epoch. To maintain consistency on the disk, R must be
> written before S, so blockWrite checks whether R is still in the
> cache and marked dirty. Very rarely, blockWrite finds R locked (eg
> because the flush thread is just now writing it), so it gives up and
> returns zero. The zero return is OK when blockWrite is called by
> the flush thread, because the flush thread can get on with writing
> out other blocks before coming back to try the failed block again.
> But when blockWrite is called by superWrite, there's nothing else to
> do; hence the 10 second sleep and warning message. The solution is
> to add a waitlock parameter to blockWrite, so superWrite can tell it
> to wait for a locked dependent block.
>
> (2) After the new super block S is sent to the disk write queue,
> superWrite removes the previous epoch's root block R' from the
> active file system. This is normally done by attaching a BList
> entry to S in the cache, noting that R' must be marked closed after
> S actually goes to the disk. Rarely, S has already been written by
> the time blistAlloc is called. In this case the correct thing was
> being done (just close R' immediately), but a spurious warning was
> produced.
Thank you for cleaning these up. These are both things that I meant
to come back to some day, but I never did.

Russ
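
For readers following the thread, the idea behind fix (1) can be sketched
roughly as below. This is not the fossil source: the Block struct, blockInit,
blockWriteSketch, and the diskseq counter are hypothetical stand-ins, and a
POSIX mutex is assumed in place of fossil's own block locking. It only
illustrates the difference between giving up on a locked dependent block and
waiting for it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached disk block; not fossil's Block. */
typedef struct Block Block;
struct Block {
	pthread_mutex_t lk;	/* held while some thread is writing the block */
	int dirty;		/* nonzero until the block has reached the "disk" */
	int seq;		/* order in which it was written; 0 means never */
	Block *dep;		/* block that must be written first, or NULL */
};

static int diskseq;		/* counts simulated disk writes */

static void
blockInit(Block *b, Block *dep)
{
	pthread_mutex_init(&b->lk, NULL);
	b->dirty = 1;
	b->seq = 0;
	b->dep = dep;
}

/*
 * Write b, first writing its dependency if that is still dirty.
 * If the dependency is locked:
 *   waitlock == 0: give up and return 0 so the caller can retry later
 *                  (what a flush thread wants);
 *   waitlock != 0: block until the lock is free
 *                  (what a caller with nothing else to do wants).
 * Returns 1 once b has been written.
 */
static int
blockWriteSketch(Block *b, int waitlock)
{
	Block *d;

	assert(b != NULL);
	d = b->dep;
	if(d != NULL){
		if(waitlock)
			pthread_mutex_lock(&d->lk);
		else if(pthread_mutex_trylock(&d->lk) != 0)
			return 0;	/* dependency busy; b stays dirty */
		if(d->dirty){
			d->seq = ++diskseq;	/* dependency goes out first */
			d->dirty = 0;
		}
		pthread_mutex_unlock(&d->lk);
	}
	b->seq = ++diskseq;
	b->dirty = 0;
	return 1;
}
```

In this sketch, a flush-style caller would pass waitlock as 0 and move on to
other blocks when it gets a zero return, while a superWrite-style caller would
pass 1 and so never needs to sleep and retry.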
