On Thu, 2010-02-25 at 14:36 -0700, hj lee wrote:
> 
> 
> On Thu, Feb 25, 2010 at 1:02 PM, Steven Dake <[email protected]> wrote:
>         
>         On Thu, 2010-02-25 at 11:56 -0700, hj lee wrote:
>         > On Mon, Feb 22, 2010 at 11:30 AM, Steven Dake
>         <[email protected]>
>         > wrote:
>         >
>         >         On Sun, 2010-02-21 at 21:59 -0700, hj lee wrote:
>         >         > Hi,
>         >         >
>         >         > I am seeing this message from time to time in the
>         >         > log. Does this measure the pause time of corosync
>         >         > correctly? When corosync is scheduled back, how is
>         >         > the memb_join message processed before pause_timer
>         >         > expires? The pause_timer can expire before the
>         >         > memb_join message, and then it cannot measure how
>         >         > long corosync was descheduled.
>         >         >
>         >
>         >
>         >         HJ,
>         >
>         >         I have not seen any process pause detected messages
>         >         with token=1000 at 32 node count.  The pause_timer
>         >         should expire every token/5, which resets the
>         >         pause_timestamp indicating when corosync was last
>         >         scheduled.
>         >
>         >         The way coropoll works, though, is to schedule
>         >         timers after executing delivery of all the UDP
>         >         messages.  If it takes token/2 time to process all
>         >         those UDP messages, it is possible the timer that
>         >         resets the pause_timestamp is being caught behind a
>         >         bunch of messages processed by the poll loop.
>         >
>         >         Could you try the attached patch?  It resets the
>         >         pause timestamp on receipt of the various message
>         >         events that occur, to prevent this theoretical
>         >         condition.
>         >
>         >
>         > Hi,
>         >
>         > Thanks for the patch. I haven't tried it yet. The problem I
>         > had is that the pause detection is logged in my two-node
>         > cluster. Sometimes the log says more than 900 ms paused.
>         > That's OK, but when one node prints this log, the other
>         > node gets a token lost timeout, so the cluster enters
>         > GATHER mode. When a node is paused for more than 900 ms,
>         > there is no mcast message; corosync is pretty much idle
>         > except for token passing. I really cannot understand why
>         > corosync is not running for more than 900 ms!
>         >
>         > Corosync is running with the SCHED_RR real-time scheduling
>         > policy, so it should get CPU. Also, there are a pause timer
>         > (60 ms) and a retransmit timer (130 ms) enabled in
>         > operational mode. Corosync should run at least every 60 ms
>         > if there is no mcast message! It should also wake up on
>         > every token or mcast message arrival. I think corosync is
>         > stuck somewhere during this 900 ms pause, either in the
>         > poll() routine or in a message-processing callback. Do you
>         > have any idea about this kind of pause?
>         >
>         > Thanks very much
>         > hj
>         >
>         
>         
>         The pause detection is designed to detect when the corosync
>         process is not scheduled for long periods of time.  I have
>         seen situations where kernel drivers take spinlocks for long
>         periods and don't release them (disabling scheduling in the
>         process).
>         
> 
> The only driver related to corosync I can think of is the network
> driver. Does this spinlock holding happen in network driver code? If
> so, more specifically, does it happen in the poll() call or in
> read()/write()?
> 

The entire kernel uses spinlocks all the time (writing to the
filesystem, writing to the network, allocating memory, etc.).
Alternatively, it is possible that corosync is somehow blocking during
normal operation.  I really hope this isn't the case, as the protocol
requires corosync to be nonblocking.

Four years ago I worked on these sorts of kernel issues, and we
commonly used a tool called "ltt", the Linux Trace Toolkit.  It would
tell you which processes were scheduled when.  It looks like this
project has moved to a new home called lttng ->

http://lttng.org/

You may find it worthwhile to check whether corosync is indeed
scheduled during the period in which the process pause is detected.
If it is, that would indicate a defect in corosync rather than bad
luck with the kernel.

I can't duplicate your issue here.

Regards
-steve

> Thanks
> hj
> 
> 
>         
>         > --
>         > Peakpoint Service
>         >
>         > Cluster Setup, Troubleshooting & Development
>         > [email protected]
>         > (303) 997-2823
>         
>         
> 
> 
> 
> -- 
> Peakpoint Service
> 
> Cluster Setup, Troubleshooting & Development
> [email protected]
> (303) 997-2823

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
