http://dildimag.blogspot.com/2006/05/window-scaling-in-linux.html

Friday, May 19, 2006

Window Scaling in Linux

This document covers a few details about the implementation of the TCP window scaling option in Linux kernel version 2.4.x.

Let me start with a brief background on why this option was introduced. The window field of the TCP header is 16 bits wide. This limits the transmission rate of the sending host, because the window can grow to a maximum of 65535 bytes (64 KB). To overcome this constraint, the window scaling option was proposed in RFC 1323. Each side announces a shift count (at most 14) in its SYN segment, and the 16-bit window field is then interpreted as shifted left by that count. The window effectively becomes a 32-bit quantity, as opposed to the traditional 16-bit quantity of classical TCP. This allows the window to grow up to 1 GB.
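To make the arithmetic concrete, here is a small sketch in Python (not kernel code; the function name max_window is made up for illustration) that computes the largest window that can be advertised for a given shift count.

```python
MAX_UNSCALED_WINDOW = 0xFFFF   # largest value the 16-bit header field can hold
MAX_WINDOW_SHIFT = 14          # largest shift count RFC 1323 allows


def max_window(shift: int) -> int:
    """Largest window, in bytes, advertisable with the given shift count.

    Hypothetical helper for illustration only.
    """
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return MAX_UNSCALED_WINDOW << shift


print(max_window(0))    # 65535       (classical TCP, just under 64 KB)
print(max_window(7))    # 8388480     (just under 8 MB)
print(max_window(14))   # 1073725440  (just under 1 GB)
```

The cap of 14 keeps the largest possible window below 2^30 bytes.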

Linux implements the window scaling option. To accommodate it, all window variables in Linux are declared as u32. Three variables are used to implement this option:

linux/include/net/sock.h
wscale_ok : boolean flag, set when the remote host's SYN carried the window scaling option
snd_wscale: stores the window scaling factor received from the remote host
rcv_wscale: stores the window scaling factor we advertised to the remote host

Consider the scenario where we receive the first SYN packet, a request to open a connection. The function tcp_v4_conn_request() (linux/net/ipv4/tcp_ipv4.c) is called for each such packet. This function calls tcp_parse_options() (linux/net/ipv4/tcp_input.c) which, as the name says, parses the different TCP options being negotiated. If the SYN packet contains the window scaling option, wscale_ok is set to 1 and snd_wscale is set to the value from the header. snd_wscale is capped at 14, the largest shift count RFC 1323 allows. When creating the SYN-ACK packet, the function tcp_make_synack() (linux/net/ipv4/tcp_output.c) calls tcp_select_initial_window() (linux/include/net/tcp.h) to calculate the window scaling factor to be advertised to the remote host. rcv_wscale is initialized to this calculated value.
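The two steps of the handshake can be sketched in Python. This is an illustration, not a transcription of the kernel functions: parse_wscale and choose_rcv_wscale are hypothetical names, and the rule used for picking our own factor (the smallest shift that lets the available buffer space fit in 16 bits) is an assumption about what tcp_select_initial_window() does. The option layout (kind 3, length 3, one byte of shift count) comes from RFC 1323.

```python
from typing import Optional

TCPOPT_EOL = 0        # end of option list
TCPOPT_NOP = 1        # one-byte padding option
TCPOPT_WINDOW = 3     # window scale option kind
TCPOLEN_WINDOW = 3    # kind + length + shift count
MAX_WINDOW_SHIFT = 14


def parse_wscale(options: bytes) -> Optional[int]:
    """Return the peer's shift count (capped at 14), or None if the option is absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break                      # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                      # malformed length, stop parsing
        if kind == TCPOPT_WINDOW and length == TCPOLEN_WINDOW:
            return min(options[i + 2], MAX_WINDOW_SHIFT)
        i += length
    return None


def choose_rcv_wscale(space: int, wscale_ok: bool) -> int:
    """Pick the shift count we advertise (assumed rule, see the text above)."""
    rcv_wscale = 0
    if wscale_ok:
        while space > 65535 and rcv_wscale < MAX_WINDOW_SHIFT:
            space >>= 1
            rcv_wscale += 1
    return rcv_wscale


# SYN options: MSS 1460, NOP, window scale with shift count 7
syn_options = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7])
snd_wscale = parse_wscale(syn_options)
wscale_ok = snd_wscale is not None
print(snd_wscale, choose_rcv_wscale(87380, wscale_ok))   # 7 1
```

In this sketch our own factor stays 0 when the peer did not send the option, since scaling applies only if both sides offered it.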

Once the SYN packets have been exchanged, the TCP connection enters the established state. For each incoming ACK, the function tcp_ack() (linux/net/ipv4/tcp_input.c) is called. This function calls tcp_ack_update_window() (linux/net/ipv4/tcp_input.c) which, as the name indicates, is used to update our send window. The very first thing done in this function is to shift the advertised window left by snd_wscale bits, that is, to multiply it by 2^snd_wscale. The function tcp_select_window() (linux/net/ipv4/tcp_output.c) is called to calculate the window we would like to advertise to the remote host. The calculated window is shifted right by rcv_wscale bits (divided by 2^rcv_wscale) before the function returns.
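Since the scale factor in RFC 1323 is a shift count, both operations come down to a single shift. A minimal Python sketch follows. The function names are hypothetical, and the clamp to 16 bits is a safety guard added for this example, not something taken from the kernel source.

```python
def scale_received_window(header_window: int, snd_wscale: int) -> int:
    """Turn the 16-bit window field from the peer into a byte count."""
    return header_window << snd_wscale


def window_field_to_advertise(window: int, rcv_wscale: int) -> int:
    """Turn our window in bytes into the 16-bit value placed in the header."""
    return min(window >> rcv_wscale, 0xFFFF)   # clamp is an assumption of this sketch


# We want to advertise 100001 bytes with a shift count of 2.
field = window_field_to_advertise(100_001, 2)
print(field)                               # 25000
# The peer reconstructs the window by shifting back.
print(scale_received_window(field, 2))     # 100000
```

The right shift discards the low bits, so the peer sees a window that is a multiple of 2^rcv_wscale and may be slightly smaller than the one we computed.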

As can be seen, there isn't much to window scaling beyond shifting the advertised window left or right by the appropriate scaling factor variable.

Before I conclude, I want to make a note about the sysctl variable tcp_adv_win_scale. It shows up in functions that deal with the socket buffer space and, despite its name, it has nothing to do with window scaling. In Linux, the same buffer is shared by the network stack and the application, and this variable decides the ratio of the sharing. The sharing of the buffer was chosen to smooth out the transition of packets from the transport layer to the application layer.
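As a rough illustration of that ratio, the kernel's sysctl documentation describes the buffering overhead as bytes/2^tcp_adv_win_scale when the value is positive, and bytes - bytes/2^(-tcp_adv_win_scale) when it is zero or negative. The rest of the buffer is what can be offered as window. The Python sketch below follows that description. The function name mirrors the kernel helper tcp_win_from_space(), but the body is my reading of the documented formula rather than a copy of the source.

```python
def tcp_win_from_space(space: int, adv_win_scale: int) -> int:
    """Part of the buffer space (in bytes) usable as TCP window."""
    if adv_win_scale <= 0:
        # Overhead is space - space / 2^(-scale); the window gets the rest.
        return space >> (-adv_win_scale)
    # Overhead is space / 2^scale; the window gets the rest.
    return space - (space >> adv_win_scale)


for scale in (2, 1, 0, -2):
    print(scale, tcp_win_from_space(87380, scale))
# 2 65535
# 1 43690
# 0 87380
# -2 21845
```

With a buffer of 87380 bytes and a value of 2, a quarter of the buffer is set aside as overhead and 65535 bytes remain for the window.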
posted by Rahul at 12:03 PM
