Hi Steve,

We tried to set up a large-scale corosync cluster, but found that when the node count exceeds 32, the cluster is unstable: a node joining can break the existing cluster and force a re-configuration. In my recent test, the break/re-configuration happened when nodes 46, 51, 59, 60, and 62 tried to join the cluster. Sometimes "service corosync stop" also drives the node to 100% CPU usage, from which it cannot recover; I have to restart the node.
BTW, I am curious about the token loss timeout setting. My understanding is that the token loss timeout exists to detect a lost node or a network partition in the ring. But if the Totem protocol depends on passing a token around the ring, shouldn't the scale of the cluster be relevant to the choice of token loss timeout, and so affect the failure detection time? (See the totem sketch after the quoted thread below.) I will do more investigation and testing to see how corosync scales to a 64-node cluster.

Thanks
Javen

2010/1/13 Steven Dake <[email protected]>
> Untested at this time.
>
> Feel free to try and report your experiences.
>
> I have tested 48 nodes on physical hardware and things work quite well
> with a 1 sec token timeout and 5 second consensus timeout.
>
> Regards
> -steve
>
> On Tue, 2010-01-12 at 12:10 +0800, Javen Wu wrote:
> > Hi Folks,
> >
> > I just realized that Corosync has a limitation of 32 nodes as a
> > maximum.
> > Is it possible to extend the limitation to support 64 nodes? Any
> > technical barrier?
> >
> > thanks
> > --
> > Javen Wu
> > _______________________________________________
> > Openais mailing list
> > [email protected]
> > https://lists.linux-foundation.org/mailman/listinfo/openais

--
Javen Wu
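For concreteness, a minimal totem stanza matching the timeouts Steve mentions might look like the sketch below. Values are in milliseconds, the interface addresses are placeholders for illustration only, and the exact semantics of each directive are described in corosync.conf(5):

    totem {
            version: 2
            # Declare token loss after 1 second without seeing the token
            token: 1000
            # Allow 5 seconds to reach consensus on a new membership
            consensus: 5000
            interface {
                    ringnumber: 0
                    # Placeholder addresses; adjust for your network
                    bindnetaddr: 192.168.1.0
                    mcastaddr: 226.94.1.1
                    mcastport: 5405
            }
    }

My question above, restated against this config: since the token's rotation time around the ring grows with the number of nodes, presumably a fixed token value that works at 32 nodes may need to be raised for 64, at the cost of slower failure detection.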
