We have a load balancer driver which we are in the
process of converting to use the GLDV3 bge interface.

We have tried this load balancer driver on Solaris 10
patch levels 118822-30, 118833-03 and 118833-17 and
have had no problems using the non-GLDV3 hme0 interface.
The load balancer driver works both as a scheduler
(traffic cop) and as a server with no problems.  So
the methods used in the load balancer driver seem
to work with the TCP and IP layers under Solaris 10
with the above patch levels.

The problem with the newly converted load balancer
driver on the GLDV3 bge interface is that the
connection between the client and the server node
cannot be completed.  The server node gets into
SYN_RCVD state but never reaches ESTABLISHED state.
The ACK that the client sends to complete the
connection, after the server has sent its SYN+ACK,
gets interpreted as a duplicate ACK by the TCP layer
on the server node.  We see this by printing the TCP
counters with a modified macro (BUMP_MIB -> PRINT_MIB)
on each packet, before and after it is sent up the IP
stack from the load balancer driver.
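
For reference, the way a TCP receiver typically tells a
new ACK from a duplicate one can be sketched as below.
This follows the RFC 793 style of comparison and is not
taken from the Solaris TCP source; classify_ack() and
the enum names are hypothetical, chosen only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-safe sequence comparison (arithmetic modulo 2^32). */
#define SEQ_LEQ(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) <= 0)

enum ack_class { ACK_DUPLICATE, ACK_NEW, ACK_TOO_HIGH };

/*
 * Hypothetical helper: classify the ACK field of an incoming
 * segment against the local send state.
 *   snd_una = oldest unacknowledged sequence number
 *   snd_nxt = next sequence number to be sent
 */
static enum ack_class
classify_ack(uint32_t snd_una, uint32_t snd_nxt, uint32_t seg_ack)
{
    assert(SEQ_LEQ(snd_una, snd_nxt));

    if (SEQ_LEQ(seg_ack, snd_una))
        return ACK_DUPLICATE;   /* acknowledges nothing new */
    if (SEQ_LEQ(seg_ack, snd_nxt))
        return ACK_NEW;         /* acknowledges outstanding data */
    return ACK_TOO_HIGH;        /* acknowledges data never sent */
}
```

In SYN_RCVD, after a SYN+ACK with initial sequence
number ISS, snd_una is ISS and snd_nxt is ISS+1, so the
client's handshake ACK must carry ISS+1 to count as new.
Under this model, an ACK counted as a duplicate means
the ACK field seen by TCP is not ISS+1, or the
connection state it is checked against is not the one
expected.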

So our conclusion is that this newly converted load
balancer driver is probably at fault, but it's
been kind of a "bear" to isolate this problem.  If
anyone has any more ideas (we are running out of them),
feel free to respond.  Note that the load balancer
conversion involves a new way of delivering received
packets.  No other modifications were made to the
basic load balancer driver methods.

We deliver packets to our load balancer receive
function one packet at a time, in a loop that takes
these packets directly from the ring buffers.  Maybe
there is something that we are not aware of in
taking these packets directly from the ring buffers?
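
One general hazard with reading frames directly out of
a receive ring is that a slot can be reused while a
higher layer still holds a pointer into it.  The sketch
below is a user-space model of the usual defence:
copying each frame into a privately owned buffer before
handing it on.  It is not the bge driver's code, and all
names (rx_ring, rx_slot, pkt, ring_take) are
hypothetical.  Whether this hazard applies to the
converted driver is an assumption to be checked, not a
diagnosis.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RING_SLOTS 4
#define SLOT_SIZE  2048

/* Simplified model of a receive ring; not the bge layout. */
struct rx_slot { uint8_t data[SLOT_SIZE]; size_t len; };
struct rx_ring { struct rx_slot slot[RING_SLOTS]; unsigned head; };

/* A frame owned by the caller, independent of the ring. */
struct pkt { uint8_t *data; size_t len; };

/*
 * Copy the frame at the ring head into a freshly allocated
 * buffer, then advance the head so the slot can be reused.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
ring_take(struct rx_ring *ring, struct pkt *out)
{
    struct rx_slot *s = &ring->slot[ring->head];

    assert(s->len <= SLOT_SIZE);
    out->data = malloc(s->len ? s->len : 1);
    if (out->data == NULL)
        return -1;
    memcpy(out->data, s->data, s->len);
    out->len = s->len;
    ring->head = (ring->head + 1) % RING_SLOTS;
    return 0;
}
```

In a real Solaris driver the copy would normally go into
an mblk_t obtained from allocb(), after a ddi_dma_sync()
on the receive buffer; the model above leaves both out.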

We can use this converted load balancer driver for the
bge GLDV3 interface as a scheduler node (traffic cop)
with no problems.  The converted driver fails only
when it is used as a server node.

Some of the problems seen on the server using the
newly converted load balancer driver for bge GLDV3
interface are below.  We set tcp_trace and tcp_debug and use
strace to see these errors on the server node.

1.  The SYN packet coming from the client into the
    traffic cop node and forwarded to the server node
    is getting an "unacceptable sequence number gap"
    error from the TCP layer on the server.

2.  TCP Duplicate segment counter increments
    from the TCP layer on the server.

3.  TCP Duplicate Ack counter increments
    from the TCP layer on the server.

4.  "bad_ack" in SYN_RCVD state.
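
All four symptoms involve sequence or ACK numbers, so
it may help to log those fields at the point where the
driver hands each packet up and compare them with a
capture taken on the client.  Below is a minimal sketch
of extracting them from a raw IPv4 packet.  The names
tcp_fields, be32 and parse_tcp_fields are hypothetical,
and the buffer is assumed to start at the IP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tcp_fields {
    uint32_t seq;    /* sequence number, host byte order */
    uint32_t ack;    /* acknowledgment number, host byte order */
    uint8_t  flags;  /* 0x02 = SYN, 0x10 = ACK */
};

/* Read a 32-bit big-endian (network order) value. */
static uint32_t
be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Extract seq, ack and flags from an IPv4 packet carrying TCP.
 * Returns 0 on success, -1 if the buffer is too short or the
 * packet is not IPv4/TCP.
 */
static int
parse_tcp_fields(const uint8_t *ip, size_t len, struct tcp_fields *out)
{
    size_t ihl;

    assert(ip != NULL && out != NULL);
    if (len < 20 || (ip[0] >> 4) != 4)
        return -1;
    ihl = (size_t)(ip[0] & 0x0f) * 4;   /* IP header length in bytes */
    if (ihl < 20 || ip[9] != 6 || len < ihl + 20)
        return -1;                      /* protocol 6 = TCP */
    out->seq = be32(ip + ihl + 4);
    out->ack = be32(ip + ihl + 8);
    out->flags = ip[ihl + 13];
    return 0;
}
```

If the values logged here differ from what the client
sent, or the same values appear twice, the fault is
likely below TCP, in how the packet reaches the stack.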

Thanks,
Dan
 
 
This message posted from opensolaris.org