<snip>

>
>         - Name service
>
>            The best of breed NS that I've seen is in Weblogic server.  It
>            uses a multi-cast group to keep all NS in a cluster in sync.


Realize that this system requires all machines using the IP multicast group
to be on the same Class C subnet.  That limits you to a single bridged
network, so the bridge and that LAN become a single point of failure for the
entire cluster.

That's why Sybase EAServer's cluster sync does not use multicast.

>            The worst choice is using commercial LDAP directory servers
>            for the local store and LDAP replication to keep N instances
>            in sync.  Our current appserver uses this architecture and we
>            hate it for so many reasons.

<snip>

>
>         - Client side Stub:
>                 - All vendors support re-bind and re-dispatch on
>                   IOException
>
>                   is there differentiation between the vendors?

We also support the ability to replay a transaction by having the target
throw org.omg.CORBA.TRANSIENT.  The client stub will then play the entire
call one more time.  This is useful in optimistic locking designs.  Since
TRANSIENT is just a subclass of RuntimeException, it is portable as well.
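
To make the replay idea concrete, here is a minimal sketch of a client-side
wrapper that replays a call once on a transient failure.  It is not
EAServer's stub code.  ReplayOnce and TransientFailure are hypothetical
names, and TransientFailure is only a stand-in for org.omg.CORBA.TRANSIENT
so the example compiles on JDKs that no longer bundle the CORBA classes.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for org.omg.CORBA.TRANSIENT, which is also an
// unchecked exception. Used here so the sketch needs no CORBA classes.
class TransientFailure extends RuntimeException {
    TransientFailure(String message) {
        super(message);
    }
}

// Hypothetical helper: replays the entire call one more time when the
// target signals a transient failure.
final class ReplayOnce {
    static <T> T call(Supplier<T> invocation) {
        try {
            return invocation.get();
        } catch (TransientFailure first) {
            // Exactly one replay; a second failure propagates to the caller.
            return invocation.get();
        }
    }
}

final class ReplayDemo {
    public static void main(String[] args) {
        int[] attempts = {0};
        String result = ReplayOnce.call(() -> {
            // Simulate an optimistic-lock conflict on the first attempt only.
            if (attempts[0]++ == 0) {
                throw new TransientFailure("row version changed");
            }
            return "committed";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Because the whole call is replayed, this only makes sense when the failed
attempt left no partial work behind, which is the case when the target
rolls back before throwing.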

<snip>

>
>         - Load balancing.
>
>                 - Some vendors use a bulletin board of which services are
>                   busiest and do method level load balancing.

We use a mixture of metrics which can be manual or dynamically determined.
The dynamic system measures the box itself, the TCP stack and the app server
process.
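
As a rough illustration of mixing manual and dynamic metrics, the sketch
below scores each server from three dynamic readings (box, TCP stack, app
server process) and divides by an administrator-assigned weight.  The class
names, the normalization to 0.0..1.0 and the coefficients are all
assumptions made for the example, not how any product actually computes it.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical load sample for one server. The three dynamic readings are
// assumed to be normalized to the range 0.0 (idle) .. 1.0 (saturated).
final class ServerLoad {
    final String host;
    final double boxLoad;       // the machine itself, e.g. CPU utilization
    final double tcpLoad;       // the TCP stack, e.g. connection backlog
    final double processLoad;   // the app server process, e.g. busy threads
    final double manualWeight;  // administrator-assigned capacity, must be > 0

    ServerLoad(String host, double boxLoad, double tcpLoad,
               double processLoad, double manualWeight) {
        this.host = host;
        this.boxLoad = boxLoad;
        this.tcpLoad = tcpLoad;
        this.processLoad = processLoad;
        this.manualWeight = manualWeight;
    }
}

final class LeastLoadedPicker {
    // Illustrative coefficients only; they are not taken from any product.
    static double score(ServerLoad s) {
        double dynamic = 0.5 * s.boxLoad + 0.2 * s.tcpLoad + 0.3 * s.processLoad;
        // A larger manual weight makes a server look less loaded.
        return dynamic / s.manualWeight;
    }

    // Returns the server with the lowest score.
    static ServerLoad pick(List<ServerLoad> servers) {
        return servers.stream()
                .min(Comparator.comparingDouble(LeastLoadedPicker::score))
                .orElseThrow(() -> new IllegalArgumentException("no servers"));
    }
}
```

Setting every manualWeight to 1.0 gives a purely dynamic policy, while
feeding constant readings gives a purely manual one.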

>
>                   I'm not sure I want this all the time??  Another source
>                   of latency and bottleneck???

We use a heartbeat mechanism between the servers, which has none of the
network limitations that multicast does.  Obviously the Heisenberg
Uncertainty Principle applies to any system: measuring the load adds a
little load of its own.
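
For readers who have not built one, a heartbeat table can be sketched in a
few lines.  This is only an assumed shape, not our implementation:
HeartbeatTable is a hypothetical name, the transport that delivers the
heartbeats is left out, and the caller passes the current time in so the
logic is easy to test.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping for server heartbeats. How the heartbeats travel
// between servers is out of scope here.
final class HeartbeatTable {
    private final Map<String, Long> lastSeenMillis = new HashMap<>();
    private final long timeoutMillis;

    HeartbeatTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat from the given server arrives.
    synchronized void record(String server, long nowMillis) {
        lastSeenMillis.put(server, nowMillis);
    }

    // A server is alive if it is known and was heard from within the timeout.
    synchronized boolean isAlive(String server, long nowMillis) {
        Long last = lastSeenMillis.get(server);
        return last != null && nowMillis - last <= timeoutMillis;
    }

    // Removes every server that has been silent for longer than the timeout
    // and returns the names that were removed.
    synchronized List<String> scrubDead(long nowMillis) {
        List<String> dead = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeenMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                dead.add(entry.getKey());
                it.remove();
            }
        }
        return dead;
    }
}
```

The timeout is the trade-off knob: a short one detects failures quickly but
costs more heartbeat traffic and risks false positives on a busy network.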

>
>                 - Bind level load balancing.  All vendors that support
>                   clustering at least support this form of load balancing
>                   unless they do method level balancing.
>
> Resiliency / fault tolerance (clustered boxes):
>
>         - NS, object activation and system status must not have single
>           point of failure.
>
>           Any vendor differentiation here or particularly bad designs??
>
>         - Client Stub and NS provide re-dispatch of failed (idempotent)
>           method call.
>
>           I need but haven't found method level timeout.  Any vendors
>           present or future?

What is your concern with a timeout?  The decision to failover can come from
the TCP driver itself.

>
>         - When service / box fails the NS must be quickly scrubbed of dead
>           references
>

We handle this via a heartbeat.  We also provide the ability to place a
timeout on client-side references obtained from naming service lookups.  In
this way you can force a client to get a new reference list every N seconds
to be sure it has the most up-to-date info.  Otherwise, if you cache home
interfaces, you can still hold a stale reference to a server that is now
down, leading to latency while the failure is detected and resolved.
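
If your client code caches home interfaces itself, the same timeout idea can
be approximated by hand.  The sketch below is an assumption about how one
might do that, not our API: ExpiringReference is a hypothetical class, the
lookup is passed in as a Supplier (in practice it would wrap your naming
service lookup), and the clock is injectable so the expiry can be tested.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical client-side cache that re-runs a lookup once the cached
// reference is older than ttlMillis.
final class ExpiringReference<T> {
    private final Supplier<T> lookup;   // e.g. wraps a naming service lookup
    private final long ttlMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private T cached;
    private long fetchedAtMillis;
    private boolean loaded;

    ExpiringReference(Supplier<T> lookup, long ttlMillis, LongSupplier clock) {
        this.lookup = lookup;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached reference, refreshing it first if it has expired.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - fetchedAtMillis >= ttlMillis) {
            cached = lookup.get();
            fetchedAtMillis = now;
            loaded = true;
        }
        return cached;
    }
}
```

A client would construct it once, for example with a 30 second TTL and
System::currentTimeMillis as the clock, and call get() before each use
instead of holding the home interface directly.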


Dave Wolf
Internet Applications Division
Sybase


>
>           Any vendor support this?
> ]
>
> Thanks for your thoughts and experiences!
>
> Curt
> Architect of our next-gen telephone system.
>
>
> Curt Smith
> Z-Tel
> email:  [EMAIL PROTECTED]
> work:   404-237-1166  x182
> FAX:    404-237-1167
>
>
> ===========================================================================
> To unsubscribe, send email to [EMAIL PROTECTED] and include in the body
> of the message "signoff EJB-INTEREST".  For general help, send email to
> [EMAIL PROTECTED] and include in the body of the message "help".
>
>

