On Wed, 5 Sep 2007 10:36:35 +0200
"BESSON-DEBLON, Pierre \(SOGETI HIGH TECH\)"
<[EMAIL PROTECTED]> wrote:

> Hi,
> 
> My architecture is something like this:
>     
>      ------- client --------
>     /                        \
>    /                          \
> server A ----------------- server B
> 
> 
> If the link between the servers is down, the client will still have
> access to both servers.

Yes, this could cause a split brain: each controller loses sight of the
other while the client can still reach both. I am not sure what would
happen if both controllers thought they were active.

Have you also looked at improving the resilience of the physical
network? Some things we have done:

- Add 2 switches per controller, trunked together.
- Instead of using 1 NIC, use bonded cards (2 cards act as one).
- Connect the bonded cards to the two switches, so that if 1 switch
  dies the other is still there. For link monitoring you must choose
  between miimon (monitors link state) and arp_ip_target (ARP-probes a
  destination); unfortunately you cannot use both.
- Choose the type of bonding (refer to bonding.txt); we use mode 5.
- Make sure to add multicast routes on bond0 (route add -net
  224.0.0.0/8 dev bond0, route add -net 225.0.0.0/8 dev bond0). You
  can use the specific IP address of the cluster multicast if you
  wish.
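
To check that both bonded cards are really up after such a setup, a
small status check can help. This is only a sketch: it assumes the
Linux bonding driver and its usual status file
/proc/net/bonding/bond0, and the function names are my own, not part
of any tool.

```python
def parse_bonding_status(text):
    """Return {slave_interface: mii_status} from bonding status text.

    Assumes the "Key: value" layout that the Linux bonding driver
    writes to /proc/net/bonding/<bond>.
    """
    slaves = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without "Key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # An "MII Status" seen before any slave belongs to the
            # bond itself and is skipped.
            slaves[current] = value
    return slaves


def down_slaves(text):
    """List the slave interfaces whose MII status is not "up"."""
    status = parse_bonding_status(text)
    return [name for name, state in status.items() if state != "up"]
```

On a controller you would read the file with
open("/proc/net/bonding/bond0").read() and raise an alert if
down_slaves() returns anything.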


As mentioned earlier in the thread, set the read timeout on the JVM
itself in controller.bat (the value is in milliseconds, so this is 10
seconds):

-Dsun.net.client.defaultReadTimeout=10000

> 
> How will your script-based solution be safe in that case?
> Should I use the preferred controller option in the JDBC URL used by
> the client?
> 
> 
> Cheers,
> 
> 
> Pierre Besson-Deblon
> 
> 
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On behalf of
> Gilles Rayrat
> Sent: 04 September 2007 19:56
> To: Sequoia general mailing list
> Subject: Re: [Sequoia] 2 controllers on broken network
> 
> 
> Hi Pierre,
> The difference is that when the controller is killed, its socket
> connections are closed with an error, so the second controller
> notices the failure immediately.
> When a cable is unplugged, there is no direct notification: you have
> to wait for TCP timeouts, which are generally ~15 min. Note that even
> once the timeout is detected, the behavior won't be really clean and
> you will not be able to keep going with the cluster.
> The solution is to write a little script that watches the network
> on both controllers. Let's say you take a server or a switch as a
> reference: just ping it all the time and, upon failure, kill the
> controller on which the script runs; it is safer. Then, when you
> plug your cable back in, the remaining controller will see the error
> and you will be operational again.
> Hope this helps,
> Gilles.
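
The watchdog Gilles describes could look roughly like the sketch
below. It is not something from Sequoia itself: the function names,
the threshold of 3 consecutive failures and the 5 second interval are
my own assumptions, and the ping flags assume a Linux iputils ping.
How you actually stop the controller is left to the stop_controller
callable you pass in.

```python
import subprocess
import time


def host_is_reachable(host, timeout_s=2):
    """Send one ICMP echo request using the system ping.

    Assumes Linux iputils flags: -c count, -W timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(reference_host, stop_controller, probe=host_is_reachable,
          max_failures=3, interval_s=5, sleep=time.sleep):
    """Probe reference_host until max_failures probes fail in a row,
    then call stop_controller(). Returns the number of probes sent.
    """
    failures = 0
    probes = 0
    while failures < max_failures:
        probes += 1
        if probe(reference_host):
            failures = 0  # any success resets the counter
        else:
            failures += 1
        if failures < max_failures:
            sleep(interval_s)
    stop_controller()
    return probes
```

Run one copy on each controller host, with reference_host set to the
switch or server you chose as the reference. Requiring several
consecutive failures avoids killing a controller on a single lost
ping.
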
> 
> 
> BESSON-DEBLON, Pierre (SOGETI HIGH TECH) wrote:
> > Hi,
> >
> > I have 2 controllers, each on a different server, set to RAIDb-1
> > like this:
> >    <RequestManager>
> >       <RequestScheduler>
> >          <RAIDb-1Scheduler level="passThrough"/>
> >       </RequestScheduler>
> >
> >       <LoadBalancer>
> >          <RAIDb-1>
> >             <WaitForCompletion policy="all"/>
> >             <RAIDb-1-LeastPendingRequestsFirst/>
> >          </RAIDb-1>
> >       </LoadBalancer>
> >
> > 1) When a controller crashes (ctrl+C), the other one detects the
> >    failure.
> > 2) But when I unplug the network between the two servers, nothing
> >    is detected. From that moment, SQL updates get no answer and my
> >    software waits indefinitely for the request to end. The first
> >    controller did the job but is apparently waiting for the second
> >    controller's reply...
> >
> > I don't understand where the difference is between the two cases.
> >
> > Does anyone have the beginning (and maybe the end :) ) of an
> > explanation?
> >
> >
> > Thanks in advance
> >
> > Pierre Besson-Deblon
> >
> > This e-mail is intended only for the above addressee. It may
> > contain privileged information. If you are not the addressee you
> > must not copy, distribute, disclose or use any of the information
> > in it. If you have received it in error please delete it and
> > immediately notify the sender. Security Notice: all e-mail, sent to
> > or from this address, may be accessed by someone other than the
> > recipient, for system management and security reasons. This access
> > is controlled under Regulation of Investigatory Powers Act 2000,
> > Lawful Business Practises.
> >
> >
> >
> > _______________________________________________
> > Sequoia mailing list
> > [email protected]
> > https://forge.continuent.org/mailman/listinfo/sequoia
> >
> >   


-- 
Stuart James
Senior Systems Administrator
PayPoint Internet Payment Services
203 High Street, Tonbridge TN9 1BW, UK
Direct: +44 (0)1732 300205 Mobile: +44 (0)7809 504773 


SECPay Ltd registered offices. 1 The Boulevard, Shire Park, Welwyn
Garden City, Hertfordshire, AL7 1EL

This e-mail message is confidential and for use by the addressee only.
If the message is received by anyone other than the addressee, please
return the message to the sender by replying to it and then delete the
message from your computer. Internet e-mails are not necessarily
secure. SECPay Ltd does not accept responsibility for changes made to
this message after it was sent. Whilst all reasonable care has been
taken to avoid the transmission of viruses, it is the responsibility of
the recipient to ensure that the onward transmission, opening or use of
this message and any attachments will not adversely affect its systems
or data. No responsibility is accepted by SECPay Ltd in this regard and
the recipient should carry out such virus and other checks as it
considers appropriate.
