This was a really strange (and very wrong) description of how the protocols work. Correcting a few of the worst bits inline below ...
On 11/27/2010 9:42 PM, Ray Dillinger wrote:
> On Sun, 2010-11-28 at 01:38 +0700, Jérôme Prudent wrote:
>> Hi!
>>
>> Sorry for the off topic (and the stupidity of the question), but could
>> you briefly explain to me what's wrong with the use of UDP?
>>
>> Thanks!

> HTTP is not specified as a UDP protocol and extant browsers don't
> generally speak UDP. Aside from that, most of the things
> transmitted by HTTP (coded webpages, client-side programs, and
> compressed graphics) are prone to catastrophic failure from small
> errors, and UDP is prone to small errors.
>
> UDP does not contain any provision for correction of transmission
> errors. When a packet is transmitted over the internet, there is
> a chance that it will be corrupted en route. Electrical noise at
> a switch, a radio station too near a cable, cosmic rays, etc...
> these are all things which can cause a bit or byte or word to be
> different at the receiving end than it was at the transmitting end.

First, studies have shown that hardware is a more likely cause of error
than any of the sources you list:

http://www.cc.gatech.edu/classes/AY2002/cs8803d_spring/papers/checksum.pdf

Most link-layer protocols have checksums such as CRC32 that cause link
frames damaged in the ways you list to be discarded.

Second, UDP does include a checksum. You're technically correct that it
doesn't correct errors, but it does allow damaged datagrams to be
discarded so that they aren't passed up to applications. UDP uses the
same checksum algorithm as TCP. In UDP it is optional for IPv4, but its
use is generally recommended:

http://tools.ietf.org/rfc/rfc5405.txt

This does not correct errors, but it causes the kernel to discard any
UDP datagram whose checksum fails. It's a relatively weak checksum, but
it can be updated when passing through NATs (since the IP pseudoheader
is included), and it works reasonably well in practice. Of course,
applications that use UDP still need to perform their own higher-level
error checking and retransmission if strong reliability of data
delivery and integrity is needed.

> TCP/IP is an error-correcting protocol that uses incremental
> checksums to detect transmission errors, so that the receiving
> end can say, "wait a minute, could you resend packet number 12?"
> or something when it realizes that it missed something or got
> a bad copy of something. This is a good thing because it reduces
> or eliminates errors, but it introduces roundtrip delays as the
> sender and receiver negotiate to make sure all the checksums and
> so on are correct.

There is no negotiation about whether checksums are correct. Either an
incoming segment passes or it fails; if it fails, the segment is
discarded. Retransmission can come about in various ways (e.g. via the
RTO, or via the dupack threshold if there is sufficient further data in
flight). TCP only acknowledges what it has received; it doesn't
specifically request retransmission of individual segments (or byte
ranges).

Note that the TCP checksum algorithm is the same as UDP's. It's only a
16-bit sum, not a CRC or cryptographic hash, and it only covers
individual segments, not whole objects/files/etc. being transferred.
Even in TCP, end-to-end integrity is still up to the application to
ensure.
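For anyone unfamiliar with it, the checksum both protocols use is just
the 16-bit one's complement sum defined in RFC 1071. A minimal Python
sketch of the idea (illustrative only; it leaves out the IP
pseudoheader and the incremental-update tricks real stacks use):

    # 16-bit one's complement "Internet checksum" (RFC 1071 style).
    def internet_checksum(data):
        if len(data) % 2:                  # pad odd-length input with a zero byte
            data += b"\x00"
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]   # sum 16-bit big-endian words
        while total >> 16:                          # fold carry bits back in
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF                      # one's complement of the sum

    segment = b"example payload that a UDP datagram might carry"
    print(hex(internet_checksum(segment)))

It's easy to see from this why it's a much weaker check than a CRC or a
cryptographic hash: reordering 16-bit words, or two errors that cancel,
leave the sum unchanged.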
> HTTP is a TCP/IP protocol that introduces several other mechanisms
> to try to deal with its intensive usage of bandwidth, and one of
> these was (is) "slow-start." "slow-start" says you don't transmit
> more than 4 packets before the receiver notifies you about whether
> the first one was received intact. After the receiver has
> acknowledged the first 4 packets, you can assume that it's okay
> to have 5 packets "in flight" out there. After the receiver has
> acknowledged the next 5 packets, you can assume that it's okay to
> have 6 packets "in flight" out there. And so on. This whole thing
> was an effort not to overrun buffers along the way, not to transmit
> faster than the receiver could buffer and process the data, and
> so on.

This is completely wrong. TCP contains slow-start, not HTTP. RFC 5681
clearly describes this:

http://www.ietf.org/rfc/rfc5681.txt

The numbers you quote are also badly wrong. Slow-start increases the
congestion window exponentially, not linearly: the achievable data rate
roughly doubles every round-trip time until a loss is inferred. Please
read the actual specification.

> Slow-start was a good thing when it was developed, because at the
> time network latency was usually a small fraction of transmission
> time. Waiting for the acknowledgement to arrive usually didn't
> noticeably delay the transmission of additional packets. But
> things have changed. The bandwidth of our pipes, the size of our
> buffers, and the speed of our transmission has gone way up,
> switching speeds have gone up but have not kept pace, the number
> of hops on the trip from transmitter to receiver is now one or
> two jumps longer which eats the gains from faster switching times,
> and the speed of light has remained obnoxiously constant. So at
> this point the servers are sending four packets, then twiddling
> their thumbs for a *LONG* time (relative to the time it takes to
> transmit them) before the acknowledgement packet gets back and
> they can send the next packets.
>
> Google wants to change slow-start by increasing the initial number
> of packets to compensate for the increases in buffering capability
> and pipe length relative to transmission speed. I think this is
> probably a good idea.

They aren't changing slow-start, only increasing the initial window, so
that a couple of round trips can be shaved off some "medium-sized"
transfers. Long transfers do not benefit much.
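To put rough numbers on that: because the window doubles every round
trip during slow-start, the number of RTTs a transfer needs grows with
the log of its size, and the initial window only shifts the starting
point. A toy Python model (my own simplification, ignoring losses, the
receiver window, delayed ACKs, and everything else that matters in
practice) comparing the classic initial window of 4 segments with a
larger one of 10 segments as proposed:

    # Count round trips needed to deliver `segments` segments when the
    # congestion window doubles every RTT (slow-start, no losses).
    def rtts_to_send(segments, initial_window):
        cwnd, sent, rtts = initial_window, 0, 0
        while sent < segments:
            sent += cwnd      # send one full window this round trip
            cwnd *= 2         # exponential growth while in slow-start
            rtts += 1
        return rtts

    for size in (10, 40, 1000):      # made-up transfer sizes, in segments
        for iw in (4, 10):           # classic vs. larger initial window
            print(size, "segments, IW =", iw, "->", rtts_to_send(size, iw), "RTTs")

With those made-up sizes the larger initial window saves a round trip
or two on the small and medium transfers and makes almost no difference
to the long one, which is exactly the point: it helps latency for short
web-style fetches, not bulk throughput.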
