Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-25 Thread Sean Hefty

What really should happen is that the field Local Ack Timeout in REQ
should be (2 * PacketLifeTime + Local CA’s ACK delay) (see 12.7.34)
and then the responder should use this for its QP.


Just to clarify, the value is _based_ on (2 * PacketLifeTime + local CA ack 
delay).  For example, if local CA ack delay is 0, then local ack timeout = 
PacketLifeTime + 1.
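
As a rough illustration of why the result is an exponent rather than a plain sum: these fields encode a time as 4.096 usec * 2^x, so doubling PacketLifeTime adds 1 to the exponent. The helper below is a hypothetical sketch, not the CM's code. Its name and its rounding rule are assumptions: it keeps the larger of the two exponents and adds one only when they are equal, ignoring the smaller term otherwise.

```c
#include <assert.h>

/*
 * Sketch only (hypothetical helper, not taken from the CM).
 * Both inputs are exponents x of the encoding 4.096 usec * 2^x.
 * Returns an approximate exponent for
 * 2 * PacketLifeTime + local CA ack delay.
 */
unsigned int local_ack_timeout_exp(unsigned int packet_life_time,
                                   unsigned int ca_ack_delay)
{
    unsigned int round_trip;
    unsigned int timeout;

    /* This sketch only handles exponents up to 31. */
    assert(packet_life_time <= 31);
    assert(ca_ack_delay <= 31);

    /* 2 * 2^plt == 2^(plt + 1) */
    round_trip = packet_life_time + 1;

    if (round_trip == ca_ack_delay)
        timeout = round_trip + 1;   /* 2^n + 2^n == 2^(n + 1) */
    else                            /* approximation: smaller term dropped */
        timeout = round_trip > ca_ack_delay ? round_trip : ca_ack_delay;

    /* Local Ack Timeout is a 5-bit field, so clamp to 31. */
    return timeout > 31 ? 31 : timeout;
}
```

With a CA ack delay of 0 this returns PacketLifeTime + 1, matching the example above. A more conservative rule would round up whenever the smaller term is not negligible.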



This does not sound too hard - why can't we just fix CM to do this, then?


The work-arounds were only suggestions to use until a fix is in place and to 
verify that this really is the problem.  I do plan on submitting a fix.


- Sean
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-24 Thread Roland Dreier
  As previously stated, IBM HCA will address these issues. However,
  my understanding is that mthca/Topspin adapters also have a problem
  (too high a value for the Local CA Delay Ack). Both HCAs need to be
  fixed for good interoperability.

I think you're misunderstanding what local CA ack delay means.  This
is a property of an HCA that is not (necessarily) subject to tuning --
it is just a property of the HCA, namely the maximum amount of time it
may take to generate an ACK.

So if a certain HCA reports a value of 15, then that means that any
remote HCA talking to it must be prepared for a delay of 4.096 * 2^15
usecs before receiving an ACK.
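
To make the arithmetic concrete, here is a small sketch that converts an encoded value into microseconds using the 4.096 * 2^x formula above. The function name is made up for illustration.

```c
#include <assert.h>

/*
 * Convert an encoded delay/timeout exponent x into microseconds,
 * using time = 4.096 usec * 2^x. Hypothetical helper for illustration.
 */
double encoded_time_usecs(unsigned int exponent)
{
    assert(exponent <= 31);   /* this sketch handles 5-bit values only */
    return 4.096 * (double)(1ULL << exponent);
}
```

For a reported value of 15 this gives roughly 134218 usecs, i.e. about 134 ms.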

If the ACK delays on both sides are not being taken into account
properly when establishing a connection, then I guess that is a bug in
our CM.

 - R.


Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-24 Thread Shirley Ma




Hello Roland,

 If the ACK delays on both sides are not being taken into account
 properly when establishing a connection, then I guess that is a bug in
 our CM.

  - R.
So for each IPoIB connection, the ACK delay could differ depending on the
remote side. How, then, would the TCP retransmission timeout get a
corresponding value?

Thanks
Shirley Ma
IBM Linux Technology Center

Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-24 Thread Pradeep Satyanarayana
Thanks for the clarifications, Roland. There is something that I am still
missing: I presume the Local CA Ack Delay is common across all QPs in the
HCA, and the Local Ack Timeout is specific to each QP. Is that correct?

I tried to change the ib_qp_attr.timeout value (this is the Local Ack
Timeout, right?) to 0xf as the QP transitions from RTR to RTS (page 569 of
the IB spec). A subsequent ib_query_qp() tells me that timeout = 0.
This happens on both ehca and mthca.

There may be a CM bug, but I am guessing something else is incorrect too. I
have not yet narrowed that down.

Pradeep
[EMAIL PROTECTED]





Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-24 Thread Sean Hefty

If the ACK delays on both sides are not being taken into account
properly when establishing a connection, then I guess that is a bug in
our CM.


I looked, and the cm does not take into account the ca ack delay.  This can be 
worked around by bumping up the qp timeout value between calling 
ib_cm_init_qp_attr() and ib_modify_qp(), or by increasing the path record 
packet_life_time.
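
The first work-around might look roughly like the sketch below. It is written against a stand-in struct so that it compiles outside the kernel: struct qp_attr_sketch and bump_qp_timeout() are hypothetical names, and only the timeout field mirrors struct ib_qp_attr. The sketch also assumes that a timeout of 0 means "no timeout" and leaves it alone.

```c
#include <assert.h>

/* Stand-in for the single struct ib_qp_attr field this sketch touches. */
struct qp_attr_sketch {
    unsigned char timeout;   /* Local Ack Timeout exponent; 0 assumed = none */
};

/*
 * Raise attr->timeout to at least min_timeout.
 * Intended spot in real code: after ib_cm_init_qp_attr() has filled in
 * the attributes and before ib_modify_qp() applies them.
 */
void bump_qp_timeout(struct qp_attr_sketch *attr, unsigned char min_timeout)
{
    assert(min_timeout <= 31);   /* timeout is a 5-bit value */

    /* Leave 0 untouched; only raise finite timeouts that are too small. */
    if (attr->timeout != 0 && attr->timeout < min_timeout)
        attr->timeout = min_timeout;
}
```

The value passed as min_timeout would have to account for the CA ack delay; how to pick it is left open here.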


- Sean


Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review

2007-04-24 Thread Michael S. Tsirkin
 Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: [ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V2] patch for review
 
 If the ACK delays on both sides are not being taken into account
 properly when establishing a connection, then I guess that is a bug in
 our CM.
 
 I looked, and the cm does not take into account the ca ack delay.  This can 
 be worked around by bumping up the qp timeout value between calling 
 ib_cm_init_qp_attr() and ib_modify_qp(), or by increasing the path record 
 packet_life_time.

What really should happen is that the field Local Ack Timeout in REQ
should be (2 * PacketLifeTime + Local CA’s ACK delay) (see 12.7.34)
and then the responder should use this for its QP.

This does not sound too hard - why can't we just fix CM to do this, then?

-- 
MST