Well, it's a complex scenario, so constructing a test case is also complex.
That is why I bothered you with my questions and did not simply test it
out.
What I mean by "cluster fragmentation" is a temporary (partial) loss of
network connectivity between communicating systems without any of the
systems themselves failing.
Let me try to explain it with an example:
Assume we have a moderately complex setup of four communicating systems:
DB-Server-A, DB-Server-B, Client-A and Client-B, with DB-Server-A and
Client-A attached to one switch (named Switch A) and DB-Server-B and
Client-B attached to another switch (named Switch B). Let's assume that
both switches are connected to each other.
                 +----------+
DB-Server-A ---- | Switch A |--- Client-A
                 +----------+
                      |
                 +----------+
DB-Server-B ---- | Switch B |--- Client-B
                 +----------+
During normal operation, both clients will connect to both DB servers.
Now, let's assume that the link between Switch A and Switch B fails for
some time.
                 +----------+
DB-Server-A ---- | Switch A |--- Client-A
                 +----------+

                 +----------+
DB-Server-B ---- | Switch B |--- Client-B
                 +----------+
As a result, Client-A will eventually be disconnected from DB-Server-B
and vice versa. Both clients will think they have lost redundancy but
will happily continue working with their respective DB server.
After the link between the switches is re-established, none of the
components will notice any change ... so there is no component that
will notice the need to run the CreateCluster tool.
Of course, for this special case, I could add some kind of
heartbeat between DB-Server-A and DB-Server-B to notice
the link failure. However, there are even more complex cases
that cannot be caught by such a heartbeat.
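
To make the heartbeat idea a bit more concrete, here is a minimal sketch in
Java of what I have in mind. Everything in it (the class name
PeerHeartbeatMonitor, its methods, the timeout value) is made up for
illustration and is not part of H2. It only tracks when the peer server was
last heard from and reports when a heartbeat arrives after an outage, which
would be the moment to trigger a cluster rebuild. How the heartbeats are
actually transported between the servers is left out.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch, not part of H2: tracks heartbeats from the peer
 * DB server and reports when connectivity comes back after an outage.
 */
class PeerHeartbeatMonitor {
    private final long timeoutMillis;
    private final LongSupplier clock;      // injected so the logic can be tested
    private long lastHeartbeatMillis;
    private boolean peerWasUnreachable = false;

    PeerHeartbeatMonitor(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.lastHeartbeatMillis = clock.getAsLong();
    }

    /** Returns true while the peer has been silent for longer than the timeout. */
    boolean isPeerUnreachable() {
        boolean silent = clock.getAsLong() - lastHeartbeatMillis > timeoutMillis;
        if (silent) {
            peerWasUnreachable = true;     // remember the outage until the next heartbeat
        }
        return silent;
    }

    /**
     * Call for every heartbeat received from the peer. Returns true if this
     * heartbeat ends an outage, i.e. the cluster may need to be rebuilt.
     */
    boolean onHeartbeat() {
        // Check the gap before recording the new heartbeat, so an outage is
        // caught even if isPeerUnreachable() was never polled during it.
        boolean healed = isPeerUnreachable() || peerWasUnreachable;
        lastHeartbeatMillis = clock.getAsLong();
        peerWasUnreachable = false;
        return healed;
    }

    public static void main(String[] args) {
        long[] now = {0};                  // fake clock in milliseconds
        PeerHeartbeatMonitor monitor = new PeerHeartbeatMonitor(1000, () -> now[0]);

        now[0] = 500;
        System.out.println(monitor.onHeartbeat());       // false: regular heartbeat
        now[0] = 5000;
        System.out.println(monitor.isPeerUnreachable()); // true: link is down
        now[0] = 6000;
        System.out.println(monitor.onHeartbeat());       // true: link is back, rebuild needed
    }
}
```

In the example above, this would run on DB-Server-A watching DB-Server-B
(and vice versa). It covers only the simple case of a failed link between
the two servers.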
... And yes, my experience shows that in real-life systems
everything that can go wrong will go wrong ... after a surprisingly
short period of time. So I would not consider these thoughts
purely academic and constructed.
Thanks,
Stephan
On 13.05.2011 22:37, Thomas Mueller wrote:
> Hi,
>
> I was thinking about building a cluster control program that would
> automate the cluster rebuild without any human intervention.
>
>
> That would be great of course!
>
> Yes, I know, in many cases you would not want such an automatism because
> there is so much that can go wrong...
>
>
> Well, if 99.999% of all risks can be eliminated, then automating this
> would be great :-)
>
> However, I need to deal with customers that don't want to control
> their database manually (in fact they don't want to care about these
> 'details').
>
>
> That's understandable.
>
>
> In the case I have described, that system could end up in a situation
> where one client (that was connected when cluster fragmentation occurred)
> works on only one database while another client (that did connect
> when network connectivity was up again) works on both of them ...
> with nobody even noticing that they are running into more
> and more inconsistent databases.
>
>
> I think there is a mechanism that ensures this can't happen. If this
> mechanism doesn't work, then it's a bug.
>
> But first let's define what you mean by "cluster fragmentation",
> because this is a term I have never heard. Do you mean one of the cluster
> nodes (instances) was killed?
>
> Well ... I was hoping you would answer that there already is a
> mechanism in place that would help the clients to safely detect
> the inconsistent situation and force them to reconnect.
>
>
> Yes, there is such a mechanism in the "CreateCluster" tool: it sets the
> exclusive mode and kills other connections ("SET EXCLUSIVE 2"). The
> other connections need to use the auto-reconnect feature. This is
> documented.
>
> If it doesn't work for you please tell me - even better please post a
> simple test case.
>
> Regards,
> Thomas
>
> --
> You received this message because you are subscribed to the Google
> Groups "H2 Database" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/h2-database?hl=en.