e...@math.uni-bonn.de (Edgar Fuß) writes:

>So, while investigating my WAPL performance problems, It looks like I can
>crash the machine (not reliably, but more often that not) with a simple
>	seq 1 3000 | xargs mkdir
>command. I get the following backtrace in ddb (wetware OCR):
>panic: wapbl_register_deallocation: out of resources

That shouldn't really happen with mkdir. If I understand the code
correctly, the deallocation buffer is flushed at the beginning of every
transaction when it is more than half full.

You could try a kernel with WAPBL_DEBUG and set some bits in
wapbl_debug_print, but the console output might influence the timing
and thus hide the bug.

>On reboot, at mounting one file system (NOT the one I was operating on as
>the crash happened), the "replaying log to disk" took several minutes.
>I physically walked to the server to have a look whether the discs were
>actually busy, and there was a strange pattern: Out of the five discs that
>the RAID was built on, four were blinking at ~7Hz while the fifth was idle.

The fifth disk probably wasn't idle. Updating a RAID5 stripe requires
reading everything but the parity block and writing back the whole
stripe including the new parity block. The write completes much faster
than the read, so the activity on that disk may be too brief to notice.

dumpfs(8) lets you view the journal (before mounting and replaying).
You should see a few thousand inode records after such a crash. With
the huge overhead of reading, modifying and writing back a stripe,
possibly once per inode, it is no wonder that the replay takes minutes.

N.B. RAID5 overhead will improve once we teach the filesystem to send
more data than the stripe size to the raid driver. Currently the
filesystem is limited by MAXPHYS (i.e. 64kbyte on most systems), which
is almost always much smaller than the stripe size.

-- 
-- 
                                Michael van Elst
Internet: mlel...@serpens.de
                                "A potential Snark may lurk in every tree."
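
P.S. A rough back-of-the-envelope sketch of the RAID5 arithmetic above,
in Python. It only counts I/Os following the description in this mail
(read everything but parity, write back the whole stripe). The function
names and the 32 KiB stripe unit are made up for illustration and are
not taken from the raid driver. The 3000 updates assume the worst case
of one stripe update per inode record.

```python
MAXPHYS = 64 * 1024  # largest single transfer, 64 KiB on most systems


def raid5_partial_write_ios(n_disks):
    """I/Os for one partial-stripe update on an n_disks RAID5 set.

    Follows the description in the mail: read every unit except
    parity, then write back the whole stripe including new parity.
    """
    reads = n_disks - 1
    writes = n_disks
    return reads, writes


def replay_cost(n_updates, n_disks):
    """Total (reads, writes) if every update hits a stripe separately."""
    reads, writes = raid5_partial_write_ios(n_disks)
    return n_updates * reads, n_updates * writes


def covers_full_stripe(transfer_bytes, stripe_unit_bytes, n_disks):
    """True if one aligned transfer can fill all data units of a stripe."""
    data_per_stripe = stripe_unit_bytes * (n_disks - 1)
    return transfer_bytes >= data_per_stripe


if __name__ == "__main__":
    # Five disks, 3000 inode records (assumed worst case).
    print(replay_cost(3000, 5))
    # Hypothetical 32 KiB stripe unit: 128 KiB of data per stripe,
    # so a MAXPHYS-sized transfer cannot avoid the read-modify-write.
    print(covers_full_stripe(MAXPHYS, 32 * 1024, 5))
```

With these assumed numbers a single transfer never covers a full
stripe, so every update pays the read-before-write cost.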