Hi Lucas,

Please find comments and answers to your questions below.

Lucas McGregor wrote:

> I am trying to figure out how to achieve fail over using MDB and Joram as my
> JMS.
>
> I was wondering if someone could answer a few questions about how they all
> work together. First let me explain my set-up, then ask my questions.
>
> I have two machines solaris2 and solaris3.
> On each machine I start: rmiregistry, joram, then jonas.
>
> My joram a3servers.xml file looks like this:
>
> <?xml version="1.0"?>
> <!DOCTYPE config SYSTEM "../xml/a3config.dtd">
> <config>
>     <domain name="D1"/>
>     <server id="0" name="S0" hostname="solaris2">
>       <network domain="D1" port="11741"/>
>       <service class="fr.dyade.aaa.mom.ConnectionFactory" args="16020"/>
>     </server>
>     <server id="1" name="S1" hostname="solaris3">
>       <network domain="D1" port="11741"/>
>       <service class="fr.dyade.aaa.mom.ConnectionFactory" args="16020"/>
>     </server>
> </config>
>
> I start joram on solaris2 like: java -Dinstall.root=$JONAS_ROOT
> -DTransaction=ATransaction
> -Dfr.dyade.aaa.agent.A3CONF_DIR=$JONAS_ROOT/config "$@"
> fr.dyade.aaa.agent.AgentServer 0 ./s0
>
> on solaris3, like this: java -Dinstall.root=$JONAS_ROOT
> -DTransaction=ATransaction
> -Dfr.dyade.aaa.agent.A3CONF_DIR=$JONAS_ROOT/config "$@"
> fr.dyade.aaa.agent.AgentServer 1 ./s1
>
> Now I have tried just starting Jonas on each machine, letting each jonas
> connect to its local joram and create its own topic. But the topic actually
> seems to contain some routing information, i.e.
> joram://solaris2:16020/#0.0.1027. So the MDB on each machine would each bind
> to a different topic and I wouldn't see JMS messages broadcast to each MDB
> as hoped.

Right, you will get a different topic per JOnAS/Joram server!


>
>
> So then I tried to start up everything on solaris2, then start the
> rmiregistry and joram on solaris3. Now I run a program that gets the Remote
> for the Topic from the rmiregistry on solaris2 and rebinds it to the
> registry on solaris3. I do this because I need to use RMI and not jeremie
> and I want to cluster it so I don't have a single rmiregistry as a single
> point of failure. Then I start up jonas on solaris3, which sees the topic
> already in existence and thus binds its MDB to it.

That is a good way to share a destination between two JOnAS/Joram servers when
using the rmiregistry!
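
For what it is worth, here is a minimal sketch of that copy step, assuming the
topic is bound directly in each rmiregistry as a java.rmi.Remote object (the
registry port 1099 and the name "myTopic" are only illustrative; if your setup
goes through JNDI, the same idea applies with an InitialContext rebind):

    import java.rmi.Naming;
    import java.rmi.Remote;

    public class CopyTopicBinding {
        public static void main(String[] args) throws Exception {
            // Look up the Topic reference that JOnAS bound in the registry on solaris2.
            Remote topic = Naming.lookup("rmi://solaris2:1099/myTopic");

            // Rebind the same reference in the registry on solaris3, so that the
            // JOnAS/MDB on solaris3 resolves the *same* Joram topic.
            Naming.rebind("rmi://solaris3:1099/myTopic", topic);
        }
    }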


>
>
> Now when anyone publishes a message to the topic, it is broadcast to all MDBs. The
> problem is that if any one part crashes, they all lose connection to the
> JMS.
>
> The next thing I tried was to create the topic in a separate program before
> I start any Jonas servers. The program serializes the topic to a file. Then
> I have another program rebind the unmarshalled object in the local
> rmiregistry, and then make copies of its remote to the other registries.
> Now, I have been able to bring down a single machine, and restart them and
> have them handle fail over.

I'm not sure I understand what your program does, i.e. how it differs from
having the topic created by JOnAS ...
But I think it is related to one of the problems we have when using Joram and
JOnAS: we currently store the JMS destination in the rmiregistry, which is not
persistent, while Joram objects (destination "agents") may be persistent
(-DTransaction=ATransaction). So such Joram objects may recover, but we are
unable to rebind them! The solution would be to have a persistent registry to
use with JOnAS ...
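
As an illustration of the workaround you describe (serializing the topic and
rebinding it after a registry restart), here is a rough sketch; it assumes the
object bound in the registry is both Serializable and a java.rmi.Remote, and
the file and registry names are only illustrative:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.rmi.Naming;
    import java.rmi.Remote;

    public class TopicSnapshot {

        // Save the topic reference found in a registry to a local file.
        public static void save(String registryUrl, String file) throws Exception {
            Remote topic = Naming.lookup(registryUrl);
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
            out.writeObject(topic);
            out.close();
        }

        // After a restart, read the reference back and rebind it in the freshly
        // started (hence empty) rmiregistry, so MDBs can find the recovered topic.
        public static void restore(String file, String registryUrl) throws Exception {
            ObjectInputStream in = new ObjectInputStream(new FileInputStream(file));
            Remote topic = (Remote) in.readObject();
            in.close();
            Naming.rebind(registryUrl, topic);
        }
    }

A persistent registry would essentially do the same thing for every binding,
which is why we see it as the proper solution.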


>
>
> There is still a problem. For example, if I create the original topic on
> solaris2, then I have to be able to bring solaris2 up to its original state
> to have everything recover. If solaris2 becomes inoperable, I will not be
> able to recover without restarting the entire cluster.

That seems to be because solaris2 hosts the Joram server on which the topic was
created.


>
>
> So these are my questions:
>
> When jonas creates an MDB, does it bind a connection to a Joram server
> socket? If so, is there any way to get it to try to reconnect when it loses
> a socket connection?

Yes. That's exactly the point we should look at! Opening a JMS connection
corresponds to opening a socket connection to the Joram server, and I think
there is currently no mechanism to reconnect automatically after that
connection is lost ...
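
To illustrate the missing piece, here is a client-side sketch of what such a
reconnection mechanism could look like, based on the standard
javax.jms.ExceptionListener callback. This is not something JOnAS does for MDBs
today, and the JNDI name "JTCF" is only illustrative:

    import javax.jms.ExceptionListener;
    import javax.jms.JMSException;
    import javax.jms.TopicConnection;
    import javax.jms.TopicConnectionFactory;
    import javax.naming.InitialContext;

    public class ReconnectingClient implements ExceptionListener {

        private TopicConnection connection;

        public void start() throws Exception {
            InitialContext ctx = new InitialContext();
            TopicConnectionFactory factory =
                (TopicConnectionFactory) ctx.lookup("JTCF");
            connection = factory.createTopicConnection();
            // Get notified when the socket to the Joram server breaks.
            connection.setExceptionListener(this);
            connection.start();
        }

        public void onException(JMSException e) {
            // The connection to the Joram server is lost: retry until it is back.
            while (true) {
                try {
                    start();   // re-create the connection (and sessions) from scratch
                    return;
                } catch (Exception retryFailed) {
                    try { Thread.sleep(5000); } catch (InterruptedException ignored) {}
                }
            }
        }
    }

For an MDB the connection is managed by the container, so this kind of retry
loop would have to live inside JOnAS itself.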


>
>
> What does jonas/joram do with the URL in the Topic
> (joram://solaris2:16020/#0.0.1027)? Does it use it like routing information?

Yes, Joram uses this URL to know where the topic is located, i.e. that it lives
on the Joram server running on solaris2 and listening on port ...
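
Just to make that explicit, the URL encodes the host and port of the Joram
server that owns the topic, plus the agent identifier after the '#'. A trivial
parsing sketch, assuming the format you quoted:

    import java.net.URI;

    public class TopicUrl {
        public static void main(String[] args) throws Exception {
            URI u = new URI("joram://solaris2:16020/#0.0.1027");
            System.out.println("host  = " + u.getHost());      // solaris2
            System.out.println("port  = " + u.getPort());      // 16020
            System.out.println("agent = " + u.getFragment());  // 0.0.1027
        }
    }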


>
> The reason I ask is that if I create the topic on solaris2
> (joram://solaris2:16020/#0.0.1027), and start up everything on solaris3 and
> bind the local jonas to a local joram, using the topic created on solaris2,
> and then take down joram on solaris2, but not the JVM that holds the topic
> object, then JMS services on solaris3 go down too. I would have guessed that
> since solaris3 jonas has a solaris3 joram and can still reach the original

No, a topic is held by a Joram server, not by the JVM! What you bind in the
registry is only a proxy used to access a "Joram object" representing the topic,
in fact an "agent" that lives within the Joram server.


>
> topic, it would be fine--but I was wrong. If I bring up joram again on
> solaris2, in a short period of time, then JMS on solaris3 starts working
> again (but only if -DTransaction=ATransaction). It seems like all JMS for

Yes, because the JMS administered objects (destinations) are persisted by Joram,
i.e. recovered when Joram restarts, only if "-DTransaction=ATransaction" is set!


>
> that topic goes though the joram on solaris2. But if I take down the joram
> on solaris3, JMS services on solaris3 go down. So it seems to need BOTH
> jorams to be up to run--now I have 2 points of failure.
>
> Are there any established methods (using RMI) for achieving fault tolerance
> and fail over for joram?

I think Joram allows you to recover from a failure, i.e. to recover your
destinations (with their state) after a server failure, but I do not know of
any mechanism to implement fail-over within a cluster.
As we will study clustering features for JOnAS in the first half of 2002, we
will certainly take the JMS behaviour into account (I have already planned this
aspect), but any input is welcome!


>
>
> What are the classes that Jonas uses for handling JMS connections? Is there
> any thought or work being done for handling broken connections?

This is the particular point we should look at first. We will discuss it with
the Joram team to see how we could do it within JOnAS.


>
>
> I have tried redeploying my MDBs and, from what I have been reading and from
> my own experience, it appears that the Hot Deploy feature still doesn't work.
> Is there a work-around or any progress being made on this?

Yes, we are currently working on hot deployment.

>
>
> Thanks in advance for the time you have spent to read this long problem. If
> there is any information I can offer to help, please contact me.
>
>         Thanks,
>         Lucas McGregor
>
>

You're welcome.

I hope these answers help.
To summarize, the points to study in order to solve your problems are the
following:
1) work on restoring a broken JMS connection
2) a persistent registry
3) JOnAS/Joram configuration for clustering/fail-over

I'm afraid only the 1st point could be "short term" ... but any suggestion is
welcome!

Best Regards,

François
--
==================================================================
François EXERTIER         Evidian (Groupe Bull)
     1, rue de Provence,  BP 208,  38432 Echirolles cedex, FRANCE
     mailto:[EMAIL PROTECTED]
     http://www.evidian.com/jonas   http://www.objectweb.org/jonas
     Tel: +33 (0)4 76 29 71 51  -  Fax:   +33 (0)4 76 29 77 30
==================================================================


