http://teaching.idallen.com/cst8165/07f/notes/week09notes.txt
-------------------------
Week 09 Notes for CST8165
-------------------------
-Ian! D. Allen - [EMAIL PROTECTED] - www.idallen.com
Remember - knowing how to find out an answer is more important than
memorizing the answer. Learn to fish! RTFM! (Read The Fine Manual)
Keep up on your readings (Course Outline: average 4 hours/week homework)
Review:
------
- Internet Protocol (layer 2)
- ICMP - Internet Control Message Protocol (layer 2)
- Layer Three: TCP and UDP - port numbers
- two major types: UDP (SOCK_DGRAM) or TCP (SOCK_STREAM)
- common port numbers: 20/21, 22, 23, 25, 53, 67/68, 80, 110, 137/139, 443
- socket options, e.g. SO_SNDTIMEO, SO_REUSEADDR
- UDP (layer three) - only three pages on top of IP
- TCP (layer three) - 85 pages on top of IP
- TCP state diagram
- three-way handshake to start a TCP session
- simultaneous open
- pseudo-headers
- testing all three processes in a TCP client/server echo application
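As a reminder of the two socket types and of setting a socket option, here is a minimal Python sketch. The helper name make_sockets is made up for illustration; the socket calls themselves are standard library.

```python
import socket

def make_sockets():
    """Create one UDP and one TCP socket; enable SO_REUSEADDR on the TCP one."""
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # datagram socket = UDP
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # stream socket = TCP
    # SO_REUSEADDR lets a restarted server rebind its port while old
    # connections are still lingering in TIME_WAIT.
    tcp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    return udp, tcp

if __name__ == "__main__":
    udp, tcp = make_sockets()
    print(udp.type, tcp.type)
    udp.close()
    tcp.close()
```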
Q: Describe the sequence of events that happens when your client
receives EOF from the keyboard. How does that EOF result in the
termination of the client and server processes? (see eof_handling.txt)
Q: Why must the client process reading your keyboard call
shutdown(fd,SHUT_WR) on the server socket? Wouldn't calling
close() on the socket also send EOF to the server? Why or why not?
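A rough way to experiment with the half-close question is the Python sketch below. It is an assumption-laden stand-in: socket.socketpair() gives a connected local pair (not a real TCP connection), and the function name half_close_demo is invented for this example. The thing to watch is that the "client" end can still read the echo after it has sent EOF with shutdown().

```python
import socket

def read_until_eof(sock):
    """Read from sock until the peer signals EOF (an empty read)."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk

def half_close_demo():
    client, server = socket.socketpair()   # stand-in for a connected TCP pair
    client.sendall(b"hello\n")
    client.shutdown(socket.SHUT_WR)        # send EOF, keep the read side open
    received = read_until_eof(server)      # server sees the data, then EOF
    server.sendall(received)               # server echoes after seeing EOF
    server.close()                         # server closing sends EOF to client
    echoed = read_until_eof(client)        # client can still read the echo
    client.close()
    return received, echoed
```

Try replacing the shutdown() call with client.close() and see which later call fails.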
------------------------------------------------------------------------------
TCP acknowledgements
--------------------
- after the three-way handshake, an open TCP connection communicates with
ACK always set
- ACK bit says the 32-bit ACK field is valid
- the ACK field contains the sequence number of the next byte expected,
i.e. one past the highest contiguous byte number received so far
- in plain TCP, ACKs are cumulative - the one number indicates the highest
*contiguous* set of successfully received bytes
- basic TCP/IP cannot issue out-of-sequence or selective ACKs (ACK ranges)
- cumulative ACK says *all* previous bytes received OK, cannot selectively
ACK ranges of bytes received (in plain vanilla TCP)
- TCP buffering is possible; use PSH to "push" (flush) data out at either end
- interactive programs need to do this to get good response times
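The cumulative ACK rule can be sketched in a few lines of Python. This is only an illustration: the function name and the (seq, length) tuple format are assumptions of the sketch, and the ACK number here is the next byte the receiver expects, which is one past the highest contiguous byte received.

```python
def cumulative_ack(segments, initial_seq=0):
    """Return the cumulative ACK number for the segments received so far.

    segments: iterable of (seq, length) pairs, in any arrival order.
    The result is the next byte expected; anything after a hole is ignored.
    """
    ack = initial_seq
    for seq, length in sorted(segments):
        if seq > ack:                 # a hole in the byte stream: stop here
            break
        ack = max(ack, seq + length)  # extend the contiguous range
    return ack

if __name__ == "__main__":
    # Bytes 0-199 arrived, 200-299 are missing, 300-399 arrived.
    print(cumulative_ack([(0, 100), (100, 100), (300, 100)]))
```

With bytes 200-299 missing, the receiver cannot say anything about bytes 300-399 using this one number.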
Q: T/F The single 32-bit TCP header Acknowledgement number lets you
know the sequence number of the last successfully received byte of data
Q: What is the purpose of the TCP "PSH" flag? Which kind of programs use it?
TCP Windowing - how it works
----------------------------
- http://www.tcpipguide.com/free/t_TCPSlidingWindowAcknowledgmentSystemForDataTranspo.htm
"It is no exaggeration to say that comprehending how sliding windows
works is critical to understanding just about everything else in TCP."
- TCP stream windows allow multiple packets to be sent while still awaiting
acknowledgement
- better use of bandwidth; less overall waiting for ACKs to come back
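To see why a window saves waiting, here is an idealized Python sketch. It assumes no packet loss and that all ACKs for a window come back one round trip after the window is sent; the function name and parameters are invented for the example.

```python
def round_trips(num_segments, window):
    """Count round trips needed to deliver num_segments when at most
    `window` segments may be sent and unacknowledged at one time."""
    if window < 1:
        raise ValueError("window must be at least 1")
    acked = 0
    trips = 0
    while acked < num_segments:
        in_flight = min(window, num_segments - acked)  # fill the window
        trips += 1                                     # wait one round trip for ACKs
        acked += in_flight                             # window slides forward
    return trips

if __name__ == "__main__":
    print(round_trips(20, 1))   # one segment at a time (stop and wait)
    print(round_trips(20, 5))   # five segments outstanding at once
```

Under these assumptions, 20 segments take 20 round trips with a window of one and 4 round trips with a window of five.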
TCP Reliability and Flow Control Features and Protocol Modifications
--------------------------------------------------------------------
- http://www.tcpipguide.com/free/t_TCPReliabilityandFlowControlFeaturesandProtocolMod.htm
* basic concept of TCP retransmission queue
- http://www.tcpipguide.com/free/t_TCPSegmentRetransmissionTimersandtheRetransmission.htm
* selective acknowledgement (SACK)
- http://www.tcpipguide.com/free/t_TCPNonContiguousAcknowledgmentHandlingandSelective.htm
* adaptive retransmission time-outs
- http://www.tcpipguide.com/free/t_TCPAdaptiveRetransmissionandRetransmissionTimerCal.htm
* window size adjustment and flow control
- http://www.tcpipguide.com/free/t_TCPWindowSizeAdjustmentandFlowControl.htm
- http://www.tcpipguide.com/free/t_TCPWindowManagementIssues.htm
- http://www.tcpipguide.com/free/t_TCPSillyWindowSyndromeandChangesTotheSlidingWindow.htm
* congestion handling and avoidance
- http://www.tcpipguide.com/free/t_TCPCongestionHandlingandCongestionAvoidanceAlgorit.htm
Q: The TCP sliding window feature classes bytes into four categories.
Describe each.
TCP Slow Start
--------------
- TCP "Slow Start" is a mandatory ("MUST") congestion control mechanism:
http://tools.ietf.org/html/rfc2581
http://www.tcpipguide.com/free/t_TCPCongestionHandlingandCongestionAvoidanceAlgorit-2.htm
http://en.wikipedia.org/wiki/Slow-start
http://www.eventhelix.com/RealtimeMantra/Networking/TCP_Slow_Start.pdf
"Beginning transmission into a network with unknown conditions
requires TCP to slowly probe the network to determine the available
capacity, in order to avoid congesting the network with an
inappropriately large burst of data. The slow start algorithm is
used for this purpose at the beginning of a transfer, or after
repairing loss detected by the retransmission timer." - RFC2581
1) start by sending just one TCP "segment"
2) for every successful ACK, add one segment to the window, which doubles
the number of segments sent each round trip
3) stop when you get above a predefined limit (the slow start threshold)
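The three steps above can be sketched in Python as follows. This is a simplified model, assuming no loss and one ACK per segment; the names cwnd and ssthresh follow RFC2581, while the function itself is made up for illustration.

```python
def slow_start(ssthresh, initial_cwnd=1):
    """Return the congestion window (in segments) at each round trip
    while slow start is in effect."""
    cwnd = initial_cwnd
    history = [cwnd]
    while cwnd < ssthresh:
        acks = cwnd        # one ACK comes back for each segment sent
        cwnd += acks       # +1 segment per ACK, so the window doubles
        history.append(cwnd)
    return history

if __name__ == "__main__":
    print(slow_start(16))  # [1, 2, 4, 8, 16]
```

Despite the name, the growth is exponential; it is only "slow" compared with sending a full window immediately.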
Q: What does TCP "slow start" mean? How does it work? Why is it needed?
TCP Selective Acknowledgement option (SACK)
-------------------------------------------
- no provision for missed packet "holes" in the byte stream in vanilla TCP
- you can't say you got packets 1, 2, and 5; all you can say is "up to 2"
and the remote site has to retransmit everything after 2 (including 5)
- "selective ACK (SACK)" capability was added later as a TCP Option
- RFC1072/RFC1323/RFC2018 describe the TCP SACK option
http://tools.ietf.org/html/rfc1072
http://tools.ietf.org/html/rfc1323
http://tools.ietf.org/html/rfc2018
http://www.tcpipguide.com/free/t_TCPNonContiguousAcknowledgmentHandlingandSelective-4.htm
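The "packets 1, 2, and 5" example can be sketched in Python as below. Note the simplifications: real SACK blocks are ranges of byte sequence numbers carried in a TCP option, while this sketch uses whole packet numbers as the notes do, and the function name is hypothetical.

```python
def ack_with_sack(received):
    """Given the set of packet numbers received (numbered from 1),
    return (cumulative_ack, sack_blocks).

    cumulative_ack is the highest packet with no holes before it;
    sack_blocks lists (first, last) ranges received beyond the hole."""
    cumulative = 0
    while cumulative + 1 in received:
        cumulative += 1
    blocks = []
    for n in sorted(p for p in received if p > cumulative):
        if blocks and n == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], n)   # extend the current block
        else:
            blocks.append((n, n))             # start a new block
    return cumulative, blocks

if __name__ == "__main__":
    print(ack_with_sack({1, 2, 5}))  # (2, [(5, 5)])
```

Without SACK only the first value (2) can be reported; with SACK the sender also learns that 5 arrived and needs to resend only 3 and 4.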
Q: T/F Selective Acknowledgment is a TCP option that has to be negotiated
Q: T/F The single 32-bit TCP header Acknowledgement number allows a
machine to selectively acknowledge packets, e.g. byte ranges such
as I got packets 1, 2, and 4 (but not 3).
Q: T/F The original TCP ACK field was sufficient to implement the
new "selective ACK" optional enhancement
- No, the TCP ACK field can only acknowledge a single byte value; you need
multiple byte ranges to implement SACK - these were added as TCP options
-------------------------------------------------------------------------------
Fragmentation Considered Harmful
--------------------------------
* How does IP send large amounts of data, if the wires won't?
Linux command: ifconfig
- shows MTU (Maximum Transmission Unit) size for each interface
- raw Ethernet shows limit of 1,500 bytes MTU
- other protocols will show other limits (e.g. PPP, PPPoE)
Q: Give the MTU for your ethernet card (eth0) and the loopback (lo)
- The IP Layer can split packets into "fragments" to pass them through
routers that can't handle large packets.
- IP packets also have a "Don't Fragment" bit that prevents fragmentation
The "Fragmentation Considered Harmful" paper:
- Google for this: "fragmentation considered harmful"
- www.acm.org/sigs/sigcomm/ccr/archive/1995/jan95/ccr-9501-mogulf1.pdf
- SIGCOMM October 1987
- 1. inefficient use of resources
"Consider a TCP process that tries to send 1024 data bytes across a route
that includes the ARPAnet, which has an MTU of 1006 bytes. The IP and TCP
headers are at least 40 bytes long, leading to a total unfragmented IP
datagram 1064 bytes in length. To cross the ARPAnet, this will be broken
into a 1006 byte fragment, followed by a 78 byte fragment. These short
fragments amortize the fixed overhead per ARPAnet packet over very few
bytes of data, and the total packet count is much higher than needed. If
the sending TCP instead chooses segments that fit in a 1006 byte ARPAnet
packet, the total packet count is minimized, and the total overhead is
as low as possible."
- 2. degraded performance (in reassembly, fragment loss)
"When segments are sent that are large enough to require fragmentation, the
loss of any fragment requires the entire segment to be retransmitted. This
can lead to poorer performance than would have been achieved by originally
sending segments that didn't require fragmentation."
- 3. lack of efficient reassembly
- TCP windowing clearly communicates the size of the receive queue/buffer
- but IP has no indication of how many IP fragments are coming!
- TCP can ACK the bytes received so far and ship the data up the stack
- but TCP works on the *packet* level, not the *fragment* level
- not possible to partially ACK an initial sequence of fragments
- applications must cooperate with the IP layer in minimizing fragmentation
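The ARPAnet arithmetic quoted above can be tried out with a small Python sketch. It assumes a 20-byte IP header with no options, and the function name is invented for the example. One detail the quote glosses over: IPv4 fragment offsets count in 8-byte units, so every fragment except the last carries a multiple of 8 data bytes. Strict arithmetic therefore gives 1004 + 80 bytes rather than 1006 + 78; the total (1084 bytes for a 1064-byte datagram) and the point of the example are unchanged.

```python
def fragment(total_length, mtu, header_len=20):
    """Split an IPv4 datagram for a link with the given MTU.

    Returns a list of (data_offset, fragment_length, more_fragments).
    fragment_length includes the repeated IP header."""
    if total_length <= mtu:
        return [(0, total_length, False)]      # fits: no fragmentation
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8     # round down to 8-byte units
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        more = offset + data < payload         # the MF (More Fragments) bit
        fragments.append((offset, header_len + data, more))
        offset += data
    return fragments

if __name__ == "__main__":
    # 1024 data bytes + 40 bytes of TCP/IP headers across a 1006-byte MTU
    print(fragment(1064, 1006))  # [(0, 1004, True), (984, 80, False)]
```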
Q: Why should IP fragmentation be avoided? Describe two of three
problems with fragmentation.
Q: T/F TCP can ACK each fragment of a packet as it arrives, allowing
retransmission of individual fragments
Path Maximum Transmission Unit discovery PMTU RFC1191
- November 1990 - 19 pages
- http://tools.ietf.org/html/rfc1191
"This memo describes a technique for dynamically discovering the
maximum transmission unit (MTU) of an arbitrary internet path. It
specifies a small change to the way routers generate one type of ICMP
message. For a path that passes through a router that has not been
so changed, this technique might not discover the correct Path MTU,
but it will always choose a Path MTU as accurate as, and in many
cases more accurate than, the Path MTU that would be chosen by
current practice."
"In this memo, we describe a technique for using the Don't Fragment
(DF) bit in the IP header to dynamically discover the PMTU of a path.
The basic idea is that a source host initially assumes that the PMTU
of a path is the (known) MTU of its first hop, and sends all
datagrams on that path with the DF bit set. If any of the datagrams
are too large to be forwarded without fragmentation by some router
along the path, that router will discard them and return ICMP
Destination Unreachable messages with a code meaning "fragmentation
needed and DF set" [7]. Upon receipt of such a message (henceforth
called a "Datagram Too Big" message), the source host reduces its
assumed PMTU for the path."
"Unfortunately, the Datagram Too Big message, as currently specified,
does not report the MTU of the hop for which the rejected datagram
was too big, so the source host cannot tell exactly how much to
reduce its assumed PMTU. To remedy this, we propose that a currently
unused header field in the Datagram Too Big message be used to report
the MTU of the constricting hop. This is the only change specified
for routers in support of PMTU Discovery."
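The discovery loop described in these excerpts can be sketched in Python. This is a toy model, not a network program: the path is just a list of hop MTUs, every router is assumed to implement the RFC1191 change (reporting the MTU of the constricting hop), and the function name is hypothetical.

```python
def discover_pmtu(hop_mtus):
    """Simulate Path MTU discovery over a path.

    hop_mtus: MTU of each hop, first hop first (must not be empty).
    Returns (pmtu, datagrams_sent)."""
    pmtu = hop_mtus[0]       # initial assumption: MTU of the first hop
    sent = 0
    while True:
        sent += 1            # send a datagram of size pmtu with DF set
        # The first hop too small for this datagram discards it and
        # returns "Datagram Too Big" carrying that hop's MTU.
        too_small = next((m for m in hop_mtus if m < pmtu), None)
        if too_small is None:
            return pmtu, sent    # the datagram reached the destination
        pmtu = too_small         # reduce the assumed PMTU and try again

if __name__ == "__main__":
    print(discover_pmtu([1500, 1492, 1006, 1500]))  # (1006, 3)
```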
Q: How does IP PMTU discovery work?
Q: What changes were made to the ICMP "Datagram Too Big" message to
accommodate PMTU?
Congestion Control
------------------
Even if packets aren't fragmented, routers can become congested if more
packets arrive than they can process. When TCP originated, the only
indication that a router was overloaded came when packets started to
drop - you couldn't get any advance warning.
The Addition of Explicit Congestion Notification (ECN) to IP RFC3168
- September 2001 - 63 pages
- http://tools.ietf.org/html/rfc3168
- the Introduction paragraphs (Section 1.) are important
"Since TCP determines the appropriate congestion window to use by
gradually increasing the window size until it experiences a dropped
packet, this causes the queues at the bottleneck router to build up.
With most packet drop policies at the router that are not sensitive
to the load placed by each individual flow (e.g., tail-drop on queue
overflow), this means that some of the packets of latency-sensitive
flows may be dropped. In addition, such drop policies lead to
synchronization of loss across multiple flows."
- vanilla TCP minimizes effect of congestion on *throughput*, not *latency*
- but, the mechanism for detecting congestion is lost packets
- no mechanism for avoiding lost packets in the first place
"Active queue management mechanisms detect congestion before the
queue overflows, and provide an indication of this congestion to
the end nodes. Thus, active queue management can reduce unnecessary
queuing delay for all traffic sharing that queue."
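The difference between tail-drop and marking can be sketched in Python. This is a deliberately crude model: a fixed marking threshold stands in for a real active queue management algorithm, and the function name and parameters are assumptions of the sketch.

```python
def handle_arrival(queue_len, capacity, mark_threshold, ecn_capable):
    """Decide what a router does with one arriving packet.

    Returns "forward", "mark" (set Congestion Experienced), or "drop"."""
    if queue_len >= capacity:
        return "drop"      # queue is full: tail-drop, ECN cannot help
    if queue_len >= mark_threshold:
        # Congestion is building. An ECN-capable packet is marked and
        # still delivered; other packets are dropped as the signal.
        return "mark" if ecn_capable else "drop"
    return "forward"

if __name__ == "__main__":
    for qlen in (2, 6, 10):
        print(qlen, handle_arrival(qlen, capacity=10, mark_threshold=5,
                                   ecn_capable=True))
```

In this model the ECN-capable flow learns about congestion at a queue length of 5 without losing a packet, instead of finding out only when the queue overflows at 10.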
Q: T/F Traditional "drop packet" TCP congestion control mechanisms
are designed to keep overall throughput high
Q: T/F Traditional "drop packet" TCP congestion control mechanisms
also keep packet latency to a minimum
Q: What advantage does ECN have over traditional "drop-packet" methods
for detecting and avoiding congestion?
Datagram Congestion Control Protocol DCCP RFC4340
- NEW! March 2006 - 125 pages
"The Datagram Congestion Control Protocol (DCCP) is a transport
protocol that implements bidirectional, unicast connections of
congestion-controlled, unreliable datagrams."
- this RFC also contains this important port allocation information:
"Port numbers are divided into three ranges. The Well Known Ports are
those from 0 through 1023, the Registered Ports are those from 1024
through 49151, and the Dynamic and/or Private Ports are those from
49152 through 65535. Well Known and Registered Ports are intended
for use by server applications that desire a default contact point
on a system. On most systems, Well Known Ports can only be used
by system (or root) processes or by programs executed by privileged
users, while Registered Ports can be used by ordinary user processes
or programs executed by ordinary users. Dynamic and/or Private Ports
are intended for temporary use, including client-side ports, out-of-
band negotiated ports, and application testing prior to registration
of a dedicated port; they MUST NOT be registered."
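The three ranges in the quote above translate directly into a small Python helper. The function name and the returned labels are made up for illustration; the boundaries are the ones quoted from the RFC.

```python
def port_range(port):
    """Classify a port number into the three ranges quoted from RFC4340."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"

if __name__ == "__main__":
    for p in (80, 8080, 50000):
        print(p, port_range(p))
```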
Q: What range of ports should your experimental application use?
Interpreting the RFC documents and the raw protocols
----------------------------------------------------
* The "Requirements for Internet Hosts" documents: RFC 1122 and 1123
http://tools.ietf.org/html/rfc1122
http://tools.ietf.org/html/rfc1123
- RFC1122 and 1123 are clarifications and examples of how the RFCs work
- RFC1122 deals with "Communication Layers" (e.g. IP, TCP)
- RFC1123 deals with "Application and Support" (e.g. SMTP, FTP)
* The overview discussion document: RFC1127
http://tools.ietf.org/html/rfc1127
- describes the history of the creation of 1122 and 1123
"This group of people struggled with a broad range of issues in host
implementations of the Internet protocols, attempting to reconcile
theoretical and architectural concerns with the sometimes conflicting
imperatives of the real world. The present RFC recaps the results of
this struggle, with the issues that were settled and those that
remain for future work."
"Indeed, many of these are simply restatements or reinforcement of
requirements that are already explicit or implicit in the original
standards RFC's. Some more cynical members of the working group
refer to these as "Read The Manual" provisions. However, they were
included in the HR RFCs because at least one implementation has
failed to abide by these requirements. In addition, many provisions
of the HR RFCs are simply applications of Jon Postel's Robustness
Principle [1.2.2 in either RFC]."
Q: T/F The two "Requirements for Internet Hosts" documents were written to
extend the existing RFCs with new features.
Comments on Application Protocols - RFC1123
-------------------------------------------
* RFC1123 - Requirements for Internet Hosts - Application and Support
http://tools.ietf.org/html/rfc1123
- RFC1123 reviews and clarifies many major protocols and standards:
- TELNET, FTP, TFTP, SMTP, RFC822 (message format), DNS
"This RFC enumerates standard protocols that a host connected to the
Internet must use, and it incorporates by reference the RFCs and
other documents describing the current specifications for these
protocols. It corrects errors in the referenced documents and adds
additional discussion and guidance for an implementor."
"A good-faith implementation of the protocols that was produced after
careful reading of the RFC's and with some interaction with the
Internet technical community, and that followed good communications
software engineering practices, should differ from the requirements
of this document in only minor ways. Thus, in many cases, the
"requirements" in this RFC are already stated or implied in the
standard protocol documents, so that their inclusion here is, in a
sense, redundant. However, they were included because some past
implementation has made the wrong choice, causing problems of
interoperability, performance, and/or robustness."
Q: Why was the RFC1123 "Application and Support" document written?
Overview of the TCP Application Layer (slides)
----------------------------------------------
- from Kurose/Ross:
http://teaching.idallen.com/cst8165/07f/notes/kurose/
*** End of material covered in second midterm test on November 6 ***