On Thu, Jun 26, 2008 at 8:34 AM, Jukka Salmi <[EMAIL PROTECTED]> wrote:
> I just had a closer look at this problem and it seems to be caused by
> a race condition between programs started by Amanda's driver program:
> all of these programs (dumper, chunker, taper) and the driver program
> itself write to the same log file. Since these programs run
> simultaneously, their calls to log_add() may result in interleaved
> writes. And this is exactly what happens here almost daily: e.g.
> yesterday's logfile contains
Yuck!

For those playing along at home, note that these are the amdump logs,
which are only used by amreport and amstatus, and not the trace logs,
which form Amanda's catalog. Trace logs are only written by a single
process, so they won't see this kind of overlap. (See
http://wiki.zmanda.com/index.php/Log_Files for the types of logfiles
Amanda uses.)

> To fix this problem correctly, the writing of the log file should be
> synchronised. As a hack, the two calls to fullwrite() in log_add()
> could be replaced by a single call; this would probably cause the
> problem to occur less often. I'd certainly prefer the correct
> solution...

Unfortunately, that file is basically the intermingled freeform stderr
of just about every process spawned by amdump/amflush, so there's no
single interface through which we can funnel all notifications. The
long-term plan is to replace the whole "fleet" of Amanda processes with
a single process, using the transfer architecture to juggle multiple
concurrent data transfers.

Your suggestion seems a good short-term fix. I don't have any immediate
ideas for a mid-term fix, but I'm open to suggestions.

It's interesting that we haven't seen this happen more often. This is
the first time in my memory, but I haven't been around that long :)

Do you want to send along a patch?

Dustin

-- 
Storage Software Engineer
http://www.zmanda.com
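
For illustration only, the "single call" hack discussed above could look
roughly like the sketch below. This is not Amanda's log_add(); the names
log_add_single_write() and open_shared_log(), the record layout, and the
1024-byte buffer are all assumptions made for the example. The idea is
to format the whole record into one buffer and issue a single write() on
a descriptor opened with O_APPEND, so that one record is no longer split
across two separate calls.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open the shared log in append mode, so every
 * write() is positioned at the current end of the file. */
static int open_shared_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}

/* Hypothetical stand-in for log_add(): build the complete record
 * ("<tag> <msg>\n") in one buffer and pass it to the kernel with a
 * single write() instead of two. Returns 0 on success, -1 on error
 * or short write. */
static int log_add_single_write(int fd, const char *tag, const char *msg)
{
    char buf[1024];  /* assumed maximum record size for this sketch */
    int len = snprintf(buf, sizeof(buf), "%s %s\n", tag, msg);

    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof(buf)) {
        /* Record was truncated: keep it newline-terminated. */
        len = (int)sizeof(buf) - 1;
        buf[len - 1] = '\n';
    }

    ssize_t n = write(fd, buf, (size_t)len);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

As noted in the thread, this only makes the overlap less likely. A short
write, or a filesystem that does not honour append semantics across
processes, could still split a record, so proper synchronisation would
be needed for a complete fix.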
