Re: [Linux-HA] heartbeat 2.0.8: causing nfs kernel oops

Alan Robertson Tue, 01 May 2007 05:43:19 -0700

Gerry Reno wrote:
> I'm seeing some very strange things lately.  Whenever heartbeat is
> running there are these messages in the log:
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: write failure on
> bcast eth0.: No such device
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: glib: Unable to
> send bcast [-1] packet(len=214): No such device
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG: Dumping
> message with 10 fields
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[0] :
> [t=NS_ackmsg]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[1] :
> [dest=grp-01-30-02]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[2] :
> [ackseq=40cd2]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[3] :
> [(1)destuuid=0x835cfc8(37 28)]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[4] :
> [src=grp-01-30-01]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[5] :
> [(1)srcuuid=0x8361848(36 27)]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[6] : [hg=a1]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[7] :
> [ts=46367de0]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[8] : [ttl=4]
> Apr 30 19:38:08 grp-01-30-01 heartbeat: [2533]: ERROR: MSG[9] : [auth=1
> dcf0feb393f46354b060306713eb72adc15eecf3]
> 
> Yet in most other respects eth0 seems to behave perfectly normally.
> I even went so far as to swap out the NIC for eth0, with the same
> result.  I can ping, ftp, ssh, etc. using eth0 with no problems.  Where
> I do see a problem is with using NFS.  If I mount a remote NFS mount and
> try to push a compressed tar to the NFS mounted directory, after about
> 1GB of transfer I get a kernel oops in the NFS code.  Now, if I shut down
> heartbeat and perform the same compressed tar it completes correctly
> without any oops.  So I'm baffled by this.  Is there any known problem
> that would cause the above log messages on an otherwise perfectly good
> network connection and also cause some type of interaction with NFS? 
> This problem seems to follow the primary node.  In other words the
> lockup occurs on whichever node has the primary IPaddr.  I can post the
> log, but it's hundreds of megabytes of this same message.


Yes.

Known causes include running DHCP on the network link, taking the link
down manually, and other things that involve messing around with eth0.
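
One way to check whether something is pulling eth0 out from under heartbeat is to poll the interface state while the errors are being logged. The following is only a minimal sketch, assuming a Linux kernel that exposes /sys/class/net/<interface>/operstate (older kernels may not have this file). The function name interface_state and the "missing" return value are illustrative choices, not part of heartbeat.

```python
import os


def interface_state(ifname, sysfs_root="/sys/class/net"):
    """Return the kernel's operstate for ifname, or "missing" if absent.

    Assumption: the kernel exposes <sysfs_root>/<ifname>/operstate.
    "missing" is a label invented for this sketch; it means the file
    could not be found, which is what happens when the interface does
    not exist at that moment.
    """
    path = os.path.join(sysfs_root, ifname, "operstate")
    try:
        with open(path) as f:
            return f.read().strip()
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
```

Calling interface_state("eth0") in a loop and logging any change alongside a timestamp would show whether the interface goes down or disappears around the time of the "No such device" messages. It does not identify what is changing the interface; a DHCP client or a manual ifdown would still have to be tracked down separately.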



-- 
    Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship...  Let me
claim from you at all times your undisguised opinions." - William
Wilberforce
