Re: [networking-discuss] pluggable congestion control design

Artem Kachitchkine Tue, 19 Jan 2010 16:16:06 -0800

Since the kernel interface has a version, will add_cong
return an error if the cc module version number does not
match the kernel version?


Yes.
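
As a rough illustration of that check, here is a minimal sketch of a registration routine that rejects a module built against a different interface version. Everything in it is an assumption made for the example: the names sk_add_cong(), sk_cong_ops_t and SK_CONG_VERSION, the choice of EINVAL as the error, and the fixed-size table are hypothetical and are not taken from the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical interface version the "kernel" side was built with. */
#define	SK_CONG_VERSION	1
#define	SK_CONG_MAX	8

/* Hypothetical ops vector; a real one would also carry the callbacks. */
typedef struct sk_cong_ops {
	int		co_version;	/* version the module was built for */
	const char	*co_name;	/* algorithm name */
} sk_cong_ops_t;

static const sk_cong_ops_t *sk_registered[SK_CONG_MAX];
static size_t sk_nregistered;

/*
 * Register a module. Returns 0 on success, EINVAL if the module's
 * version does not match, ENOSPC if the table is full.
 */
int
sk_add_cong(const sk_cong_ops_t *ops)
{
	assert(ops != NULL);

	if (ops->co_version != SK_CONG_VERSION)
		return (EINVAL);
	if (sk_nregistered == SK_CONG_MAX)
		return (ENOSPC);
	sk_registered[sk_nregistered++] = ops;
	return (0);
}
```

With this shape, a version mismatch is caught once at registration time rather than on the first callback.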

Section 5.1

Is tcpcong_load() called only once when TCP/SCTP wants to
use a cc module?  Does this mean that TCP/SCTP needs to
keep a count of the conns/assocs using a particular cc
module, so that tcpcong_unload() can be called when all of
them are gone?  Or does the design require
that once a cc module is loaded, it should not be unloaded,
except maybe until the netstack is going away?  Or is
tcpcong_load() called for each TCP conn/SCTP assoc?

The latter. tcpcong_load/unload() keep the count and decide when the module should be loaded and when it can be unloaded. Right now, the unloading policy I use in my prototype is the same as for other module types, i.e. on non-DEBUG kernels they will remain loaded unless there is memory shortage, and on DEBUG kernels they are subject to mod_uninstall_interval.
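
The per-connection counting described above could look roughly like the following sketch. It is not the prototype's code: sk_cong_load(), sk_cong_unload(), sk_cong_can_unload() and the sk_module_t table are hypothetical stand-ins, and the actual unload decision (memory pressure, mod_uninstall_interval) is left out and only represented by a "may be unloaded" predicate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-module bookkeeping, for illustration only. */
typedef struct sk_module {
	const char	*sm_name;	/* algorithm name */
	unsigned	sm_refcnt;	/* conns/assocs currently using it */
	int		sm_loaded;	/* nonzero once the module is resident */
} sk_module_t;

static sk_module_t sk_modules[] = {
	{ "newreno", 0, 0 },
	{ "cubic", 0, 0 },
};

/* Called once per connection. Returns a handle, or NULL on failure. */
sk_module_t *
sk_cong_load(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof (sk_modules) / sizeof (sk_modules[0]); i++) {
		sk_module_t *sm = &sk_modules[i];

		if (strcmp(sm->sm_name, name) != 0)
			continue;
		if (sm->sm_refcnt++ == 0)
			sm->sm_loaded = 1;	/* first user brings it in */
		return (sm);
	}
	return (NULL);	/* unknown module */
}

/* Called once per connection teardown; only drops the reference. */
void
sk_cong_unload(sk_module_t *sm)
{
	assert(sm != NULL && sm->sm_refcnt > 0);
	sm->sm_refcnt--;
}

/* The module is merely eligible for unloading when nobody uses it. */
int
sk_cong_can_unload(const sk_module_t *sm)
{
	return (sm->sm_refcnt == 0);
}
```

Note that dropping the last reference does not unload anything in this sketch; it only makes the module a candidate, which matches the lazy unloading policy described above.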

Since a cc module may not support both TCP and SCTP, should
we have a protocol argument to tcpcong_load()?  Otherwise,
TCP/SCTP cannot know if the cc module is OK to use until
maybe when it calls co_state_size().


Good point.

How does tcpcong_load() report an error, say cc module not
found?

Right now it only indicates failure by returning a NULL handle, since the caller has no use for failure type (i.e. there is nothing the caller can do differently for different error codes). This may change later, of course.

It seems that the current design requires a different cc
module binary for different interface version.  Is it
difficult to change it so that one cc module binary can
support multiple interface versions (assuming the algorithm
allows that)?  And depending on the kernel cc interface
version, the binary can export different tcpcong_ops_t.
Just wondering.

I didn't think about that since at this point my proposal is to keep interfaces Consolidation Private, so we always keep modules in sync with the stack. It's probably not hard to support multiple versions though.

5.2

Will co_state_size() return 0?  Or does 0 mean error?


0 is a valid return value, meaning the algorithm requires no state.

What does TCP/SCTP need to set in tcpcong_args_t when
calling co_state_init()?  Just the ca_state?  If this is true,
why not just pass in ca_state directly?  In general, please
describe all the in/out parameters for each ops.

co_state_init() needs ca_state and ca_mss. I will be expanding the proposal with more complete interface descriptions in preparation for the ARC commitment review.
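
To make the two answers above concrete, here is a minimal sketch of how a caller might combine co_state_size() and co_state_init(): allocate private state only when the reported size is nonzero, then hand the module ca_state and ca_mss. The op and field names come from this thread, but the signatures, the sk_args_t and sk_ops_t layouts, sk_attach(), and both example algorithms are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical subset of the argument block. */
typedef struct sk_args {
	void		*ca_state;	/* module-private state, may be NULL */
	uint32_t	ca_mss;		/* maximum segment size, in bytes */
} sk_args_t;

/* Hypothetical subset of the ops vector; signatures are assumed. */
typedef struct sk_ops {
	size_t	(*co_state_size)(void);
	void	(*co_state_init)(sk_args_t *);
} sk_ops_t;

/* Example algorithm that keeps one word of private state. */
typedef struct sk_win_state {
	uint32_t	ws_initial_cwnd;
} sk_win_state_t;

static size_t
win_state_size(void)
{
	return (sizeof (sk_win_state_t));
}

static void
win_state_init(sk_args_t *args)
{
	sk_win_state_t *ws = args->ca_state;

	ws->ws_initial_cwnd = 2 * args->ca_mss;
}

/* Example algorithm that needs no state: size 0 is not an error. */
static size_t
none_state_size(void)
{
	return (0);
}

static void
none_state_init(sk_args_t *args)
{
	(void) args;
}

const sk_ops_t sk_win_ops = { win_state_size, win_state_init };
const sk_ops_t sk_none_ops = { none_state_size, none_state_init };

/* Caller side: returns 0 on success, -1 if allocation fails. */
int
sk_attach(const sk_ops_t *ops, sk_args_t *args, uint32_t mss)
{
	size_t sz;

	assert(ops != NULL && args != NULL);
	sz = ops->co_state_size();
	args->ca_mss = mss;
	args->ca_state = NULL;
	if (sz > 0) {
		args->ca_state = calloc(1, sz);
		if (args->ca_state == NULL)
			return (-1);
	}
	ops->co_state_init(args);
	return (0);
}
```

In this sketch the caller owns the allocation and must release ca_state itself when the connection goes away.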

What is the need to have co_enter_fr() since it will be
called immediately after calling co_loss(TCPCONG_DUPACK)?

Consider NewReno as an example, after the third dupack we want to set ssthresh = cwnd / 2. Then retransmit the missing segment, and only then enter fast recovery and set cwnd = ssthresh + 3. So there is a retransmission between co_loss() and co_enter_fr().

And the current design seems to suggest that between fast
retransmit and end of fast recovery, TCP/SCTP will do its
own stuff and only after that, the cc module will take
control of the cwnd again.  What if there is a cc module
which wants to do something here also?

co_ack() will continue to be called in that period, so the module can make adjustments.

If TCP detects that a retransmission is spurious, say by
F-RTO, is there a call to notify the cc module?


Good point, we should probably accommodate this.

ICMP Source Quench is dead.  I guess we may exclude it.

OK.

There is no idle time field in tcpcong_args_t.  Should
there be one as the cc module may want to change the cwnd
according to how long the idle time is?


Yes, I think we should add that.

5.3

What is the use of ca_ssthresh?  The threshold is normally
checked when setting cwnd.  But the cc module already has
complete control on the cwnd.  Should TCP/SCTP care about the
value?


It is useful for observability, e.g. reporting ssthresh via TCP_INFO.

Does ca_bytes_acked == 0 serve the purpose of dup ack
notification to the cc module?

I suppose it can serve that purpose, though it didn't occur to me until you mentioned it.

5.4

If I have already written a cc module for Linux, what needs
to be done to port it over?  This section suggests that the
interface is actually quite different.  I guess adding some
guidelines/explanation on the differences is a good idea.

I agree, the porting guide sounds like a good idea. I'm not sure if the ARC proposal is the right place for it though. Perhaps we can provide a separate document or an appendix via opensolaris.org.


-Artem
