On Fri, 16 Jun 2006 10:37:05 -0600 [EMAIL PROTECTED] (Eric W. Biederman) wrote:
> > The processing of the notifier is to make a SCSI adaptor power off to
> > stop writing in the shared disk completely and then notify to standby-node.
>
> The kernel has called panic no new SCSI operations were execute.
> I'm not saying don't notify your standby-node

As you say, the kernel issues no new SCSI operations once it has panicked.
But many SCSI adaptors, especially RAID cards, flush their cache only a few
seconds after a SCSI write command is issued.
To stop writes completely and immediately, we have to power off the adaptor.

> Please walk me through a real world kernel failure, and show me how
> your millisecond requirement is met.
>
> In the example please answer:
> - What causes the kernel to call panic?
> - From the real failure to the kernel calling panic how long
>   does it take?

For instance, if a file system inconsistency is detected, it takes little
time until panic is invoked.
I have seen various kernel failures so far, and unfortunately they will
keep occurring.

> - What actions does the notifier take to tell the other kernel
>   it is dead.

The notifier only writes to the BMC a few times through the IPMI interface.
The operation is very simple: it just uses outb.

> - Why do we think the kernel taking that action will be reliable?

I agree the notifier may reduce reliability compared with doing nothing.
It depends on the quality of the notifier code.
But I think the notifier is needed because it is more effective.

> - From the point where we call panic() how long does it take until
>   the kdump kernel is active?

On my box it takes about one second, but on an actual enterprise system,
which has many disks (hundreds or more), it takes longer.

Thanks,
-- 
Akiyama, Nobuyuki
