Running multihomed database servers has become an issue in our
environment as we make AFS available in small sites where we will not
have dedicated file and/or database servers for AFS.

I am interested in hearing the experiences of others who have
experimented with running the database server processes (and/or the
file server processes) on multihomed machines.  In the configuration
described below, it worked fine.

I have some additional tests to perform, but my biggest problem is
coming up with a multihomed configuration which does NOT
work... much to my surprise.

W. Phillip Moore                                        Phone: (212)-762-2433
Information Technology Department                         FAX: (212)-762-1009
Morgan Stanley and Co.                                 E-mail: [EMAIL PROTECTED]
750 9th Ave. 9F, NY, NY 10019

        "Grant me the serenity to accept the things I cannot change, the
         courage to change the things I can, and the wisdom to hide the
         bodies of the people that I had to kill because they pissed me
         off."
                        -- Anonymous


------- start of forwarded message (RFC 934 encapsulation) -------
From: wpm
To: [EMAIL PROTECTED]
Cc: afswg, cigwg, [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Multihomed Database Servers -- working fine with AFS 3.3a
Date: Tue, 30 May 1995 22:43:27 -0400


One challenge we have at Morgan Stanley is the global rollout of AFS
to small sites.  In most of those small sites, we will be installing
AFS server processes on existing machines, many of which are
multihomed.

As an experiment, I brought up 3 SS2s running AFS 3.3a and SunOS
4.1.3, each of which was dual-homed to 2 Ethernets.

                         144.14.114.128
        |---------------------------------------------------|
                    |.148           |.149           |.150
                    |+              |               |+
                sazafsdb1       sazafsdb2       sazafsdb3
                    |               |+              |
                    |.10            |.11            |.12
        |---------------------------------------------------|
                         144.14.59.0

Both subnets are carved out of a Class B address with a 9-bit subnet
mask (i.e., a 25-bit prefix, netmask 255.255.255.128).  The "+" sign
indicates the primary address of each server.  The host portion of
the address for each interface is also shown.
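
To make the addressing concrete, here is a quick Python sanity check
(just an illustration on my part, nothing running on these hosts)
showing that 9 bits of subnetting on a Class B network gives a 25-bit
prefix, and that the addresses in the diagram land where you would
expect:

import ipaddress

# 16 bits of Class B network + 9 bits of subnet = a 25-bit prefix.
upper = ipaddress.ip_network("144.14.114.128/25")
lower = ipaddress.ip_network("144.14.59.0/25")

print(upper.netmask)        # 255.255.255.128

# The interface addresses from the diagram fall in the expected subnets.
for addr in ("144.14.114.148", "144.14.114.149", "144.14.114.150"):
    assert ipaddress.ip_address(addr) in upper
for addr in ("144.14.59.10", "144.14.59.11", "144.14.59.12"):
    assert ipaddress.ip_address(addr) in lower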

The /usr/afs/etc/CellServDB entries on each of the machines listed
only the primary IP address of each server:

>z.sa.ms.com    #Cell name
144.14.114.148  # sazafsdb1
144.14.59.11    # sazafsdb2
144.14.114.150  # sazafsdb3
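
For anyone scripting against this, the format is simple enough to
parse; here is a throwaway Python sketch (mine, not an AFS tool) that
reads a CellServDB like the one above:

def parse_cellservdb(text):
    """Return {cellname: [(ip, comment), ...]} from CellServDB-style text."""
    cells = {}
    cell = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # ">z.sa.ms.com    #Cell name" -- start of a new cell.
            cell = line[1:].split("#")[0].strip()
            cells[cell] = []
        elif cell is not None:
            ip, _, comment = line.partition("#")
            cells[cell].append((ip.strip(), comment.strip()))
    return cells

# e.g. print(parse_cellservdb(open("/usr/afs/etc/CellServDB").read()))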

Routing in our environment is configured using gated, in such a way
that the above hosts will always have host routes for the secondary
interfaces of machines to which they have direct connections.  In the
above case, for example, the relevant routing entries for sazafsdb1
are:

Routing tables
Destination          Gateway              Flags    Refcnt Use        Interface
144.14.59.10         144.14.114.148       UGH      6      20223      le0
144.14.59.11         144.14.114.149       UGH      1      5202       le0
144.14.59.12         144.14.114.150       UGH      2      243        le0
144.14.114.148       144.14.59.10         UGH      19     19804      le1
144.14.114.149       144.14.114.149       UGH      0      17         le0
144.14.114.150       144.14.114.150       UGH      0      2246       le0
default              144.14.114.129       UG       13     70921      le0
144.14.114.128       144.14.114.148       U        1      590        le0
144.14.59.0          144.14.59.10         U        0      0          le1
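
Since the experiment hinges on those host routes being present, here
is a small Python filter I could imagine using as a sanity check
(purely illustrative; the addresses and the script name are mine):
pipe "netstat -rn" output like the above into it and it reports
whether a host route (flag "H") exists for each address you care
about, e.g. "netstat -rn | python check_routes.py".

import sys

# As seen from sazafsdb1: the secondary interfaces of the other two
# database servers (per-host values, taken from the diagram above).
WANTED = ["144.14.114.149", "144.14.59.12"]

def host_routes(lines):
    """Map destination -> gateway for every route whose flags include H."""
    routes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and "H" in fields[2]:
            routes[fields[0]] = fields[1]
    return routes

routes = host_routes(sys.stdin)
for addr in WANTED:
    print(addr, "->", routes.get(addr, "NO HOST ROUTE"))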

Once all 3 server processes were brought up, quorum was obtained, with
sazafsdb2 as the sync site for all the database processes (in our
environment, just ptserver and vlserver).
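
As an aside, if I understand Ubik's election correctly (and I may well
not), it favors the server with the numerically lowest address listed
in CellServDB, which would explain sazafsdb2 (144.14.59.11) winning
the sync site.  A trivial check of that reading:

import ipaddress

cellservdb = {
    "sazafsdb1": "144.14.114.148",
    "sazafsdb2": "144.14.59.11",
    "sazafsdb3": "144.14.114.150",
}
# Lowest address numerically -> expected sync site under that assumption.
print(min(cellservdb, key=lambda h: ipaddress.ip_address(cellservdb[h])))
# -> sazafsdb2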

I experimented with bringing down the sync site, and again quorum was
obtained, with the sync site moving to one of the other two servers.
Bringing the original sync site back up and then restarting either of
the other servers restored the original state.

This was done to try to understand what doesn't work when database
servers are multihomed, and in this first test, they worked perfectly.
I was able to make database changes in all of the final states
discussed above, so as far as I can tell, multihomed database servers work.

I am obviously missing something.  Can we get an explanation of why
multihomed database servers are NOT supported??  It appears that with a
proper routing configuration, it is possible.  The only drawback I can
see is that CellServDB can only list the primary interface, and thus a
server's AFS database services will not be reachable via its other
interface in the event of a primary interface outage.

Phil



------- end -------
