You want to go to the site itself to see the pictures, at least. They're impressive. :-).
Cheers, RAH ------- http://www.caida.org/outreach/papers/2003/sapphire/sapphire.html The Spread of the Sapphire/Slammer Worm By (in alphabetical order) David Moore Vern Paxson Stefan Savage Colleen Shannon Stuart Staniford Nicholas Weaver CAIDA & UCSD CSE ICIR & LBNL UCSD CSE CAIDA Silicon Defense Silicon Defense & UC Berkeley EECS Introduction The Sapphire Worm was the fastest computer worm in history. As it began spreading throughout the Internet, it doubled in size every 8.5 seconds. It infected more than 90 percent of vulnerable hosts within 10 minutes. The worm (also called Slammer) began to infect hosts slightly before 05:30 UTC on Saturday, January 25. Sapphire exploited a buffer overflow vulnerability in computers on the Internet running Microsoft's SQL Server or MSDE 2000 (Microsoft SQL Server Desktop Engine). This weakness in an underlying indexing service was discovered in July 2002; Microsoft released a patch for the vulnerability before it was announced[ 1]. The worm infected at least 75,000 hosts, perhaps considerably more, and caused network outages and such unforeseen consequences as canceled airline flights, interference with elections, and ATM failures. Several disassembled versions of the source code of the worm are available. [ 2]. Figure 1: The geographic spread of Sapphire in the 30 minutes after release. The diameter of each circle is a function of the logarithm of the number of infected machines, so large circles visually underrepresent the number of infected cases in order to minimize overlap with adjacent locations. For some machines, only the country of origin can be determined (rather than a specific city). Propagation speed was Sapphire's novel feature: in the first minute, the infected population doubled in size every 8.5 (±1) seconds. The worm achieved its full scanning rate (over 55 million scans per second) after approximately three minutes, after which the rate of growth slowed down somewhat because significant portions of the network did not have enough bandwidth to allow it to operate unhindered. Most vulnerable machines were infected within 10-minutes of the worm's release. Although worms with this rapid propagation had been predicted on theoretical grounds [ 5], the spread of Sapphire provides the first real incident demonstrating the capabilities of a high-speed worm. By comparison, it was two orders magnitude faster than the Code Red worm, which infected over 359,000 hosts on July 19th, 2001 [ 3]. In comparison, the Code Red worm population had a leisurely doubling time of about 37 minutes. While Sapphire did not contain a malicious payload, it caused considerable harm simply by overloading networks and taking database servers out of operation. Many individual sites lost connectivity as their access bandwidth was saturated by local copies of the worm and there were several reports of Internet backbone disruption [ 4] (although most backbone providers appear to have remained stable throughout the epidemic). It is important to realize that if the worm had carried a malicious payload, had attacked a more widespread vulnerability, or had targeted a more popular service, the effects would likely have been far more severe. This document is a preliminary analysis of the Sapphire worm, principally focused on determining the speed and scope of its spread and the mechanisms that were used to achieve this result. Sapphire: A Random Scanning Worm Sapphire's spreading strategy is based on random scanning -- it selects IP addresses at random to infect, eventually finding all susceptible hosts. Random scanning worms initially spread exponentially rapidly, but the rapid infection of new hosts becomes less effective as the worm spends more effort retrying addresses that are either already infected or immune. Thus as with the Code Red worm of 2001, the proportion of infected hosts follows a classic logistic form of initially exponential growth in a finite system [ 5,3]. We refer to this as the random constant spread (RCS) model. Figure 2: Code Red was a typical scanning worm. This graph shows Code Red's probe rate during its re-emergence on August 1, 2001 as seen by the Chemical Abstract Service, matched against the RCS model of worm behavior. Sapphire's spread initially conformed to the RCS model, but in the later stages it began to saturate networks with its scans, and bandwidth consumption and network outages caused site-specific variations in the observed spread of the worm. A dataset from the DShield project [ 8] fit to a RCS model is shown below. The model fits extremely well up to a certain point when the probe rate abruptly levels out. This change in growth of the probe rate is due to the combined effects of bandwidth saturation and network failure (some networks shut down under the extreme load). Figure 3: The early moments of the DShield dataset, matched against the behavior of a random-scanning worm. Based on analysis of a number of datasets, we estimate the initial compromise rate (the number of hosts that a worm instance can infect per second) was 7 (±1) per minute, equivalent to a global doubling time of 8.5 ±1 seconds. Why Sapphire Was So Fast Sapphire spread nearly two orders of magnitude faster than Code Red, yet it probably infected fewer machines. Both worms used the same basic strategy of scanning to find vulnerable machines and then transferring the exploitive payload; they differed in their scanning constraints. While Code Red was latency limited , Sapphire was bandwidth-limited , allowing it to scan as fast as the compromised computer could transmit packets or the network could deliver them. Sapphire contains a simple, fast scanner in a small worm with a total size of only 376 bytes. With the requisite headers, the payload becomes a single 404-byte UDP packet. This can be contrasted with the 4kb size of Code Red, or the 60kb size of Nimda. Previous scanning worms, such as Code Red, spread via many threads, each invoking connect() to probe random addresses. Thus each thread's scanning rate was limited by network latency, the time required to transmit a TCP-SYN packet and wait for a response or timeout. In principal, worms can compensate for this latency by invoking a sufficiently large number of threads. However, in practice, context switch overhead is significant and there are insufficient resources to create enough threads to counteract the network delays -- the worm quickly stalls and becomes latency limited. In contrast, Sapphire's scanner was limited by each compromised machine's bandwidth to the Internet. Since the SQL Server vulnerability was exploitable using a single packet to UDP port 1434, the worm was able to send these scans without requiring a response from the potential victim. The inner loop is very small, and since modern servers have sufficient network I/O capacity to transmit network data at 100Mbps+, Sapphire was frequently limited by the access bandwidth to the Internet rather than its own ability to generate new copies of itself. In principle, an infected machine with a 100 Mb/s connection to the Internet could produce over 30,000 scans/second. In practice, due to bandwidth limitations and the per-packet overhead, the largest probe rate we directly observed was 26,000 scans/second, with an Internet-wide average of approximately 4,000 scans/second per worm during the early phase of growth. The Sapphire worm's scanning technique was so aggressive that it quickly interfered with its own growth. Consequently, the contribution to the rate of growth from later infections was diminished since these instances were forced to compete with existing infections for scarce bandwidth. Thus Sapphire achieved its maximum Internet-wide scanning rate within minutes. Any future single packet UDP worm will probably have the same property unless its spread is deliberately limited by the author, as a simple loop will create a bandwidth-limited scanner. While a TCP-based worm, such as Code Red, could also employ a bandwidth-limited scanner by sending TCP-SYNs at maximum rate and responding automatically to any replies in another thread, this would require more effort to implement correctly. Sapphire's Pseudo Random Number Generator For a random-scanning worm to be effective, it needs a good source of "random" numbers to select new targets to attack. Sapphire's random number generator turned out to have some interesting deficiencies which both made our analysis difficult, and perhaps had implications for future worms. Sapphire uses a linear congruent, or power residue, pseudo random number generation (PRNG) algorithm. These algorithms are of the form: x' = (x * a + b) mod m , where x' is the new pseudo-random number to be generated, xis the last pseudo-random number generated, mrepresents the range of the result, and both aand bare carefully chosen constants. Linear congruent generators can be implemented very efficiently and with proper values of aand bthey have reasonably good distributional properties (although they are not random from a sequence standpoint). The initial value of the generator is typically "seeded" with a source of higher quality random numbers to ensure that the precise sequence is not identical between runs. The writers of Sapphire intended to use a linear congruent parameterization popularized by Microsoft, x' = (x * 214013 + 2531011) mod 2^32 . However, they made two mistakes in its implementation. First, they substituted their own value for the increment: the hex number 0xffd9613c . This value is equivalent to -2531011 when interpreted as a 2's complement decimal, so it seems likely that either their representation was in error, or that they intended to use the SUB instruction to compensate, but mistakenly used ADD instead. The end result is that the increment is always even. Their second mistake was to misuse the OR instruction, instead of XOR , to clear a key register -- leaving the register's previous contents intact. As a result, the increment is accidentally XORed with the contents of a pointer contained in SqlSort's Import Address Table (IAT). Depending on the version of the SqlSort DLL this "salt" value will differ, although two common values, which we have directly observed, are 0x77f8313c and 0x77e89b18 . EEye also reports seeing 0x77ea094c [2]. These mistakes significantly reduce the quality of the generator's distribution. Since bis even and the salt is always 32-bit aligned, the least-significant two bits are always zero. Interpreted as a big-endian IP address this ensures that the 25th and 26th bits in the scan address (the upper octet) will stay constant in any execution of the worm. Similar weaknesses extend to the 24th bit of the address depending on the value of the uncleared register. Moreover, with the incorrectly chosen increment, any particular worm instance will cycle through a list of addresses significantly smaller than the actual Internet address space. Thus there are many worm instances which will never probe our monitored addresses, because none of these addresses are contained in the cycle which the worm scans. This, combined with the size of our monitored address space [ 6], prevents us from directly measuring the number of infected hosts during the first minutes of the worm's spread. It happens that Sapphire will include or not include entire /16 blocks of addresses in a cycle. We were able to assemble lists of the address blocks in each cycle for each value of the salt (the cycle structure is salt dependent). Fortunately the probability of choosing a particular cycle is directly proportional to the size of the cycle if the initial seed is selected uniformly at random. When considered over many randomly seeded worms, all Internet addresses are equally likely to be probed. Thus we can accurately estimate the scanning rate of the worm during the progress of the infection by monitoring relatively small address ranges. Since the probing will cover all Internet addresses, we can also estimate the percentage of the Internet infected. If not for the initial seed, these flaws would prevent the worm from reaching large portions of the Internet address space, no matter how many hosts were infected. For the same reason, these flaws could also bias our measurements, since even though our data comes from several different networks, there is a small chance that these particular networks were disproportionately more or less likely to be scanned. However, the worm uses an operating system service, GetTickCount , to seed their generator with the number of milliseconds since boot time, which should provide sufficient randomization to ensure that across many instances of the worm, at least one host will probe each address at some point in time. We feel confident that the risk of bias in our measurements is similarly minimized. An interesting feature of this PRNG is that it makes it difficult for the Internet community to assemble a list of the compromised Internet addresses. With earlier worms, it was sufficient to just collect a list of all addresses that probed into a large network. With Sapphire, one would need to monitor networks in every cycle of the random number generator for each salt value to have confidence of good coverage. Measurements of Sapphire's Spread and Operator Response [The remainder snipped for, heh, bandwidth... :-) --RAH] -- ----------------- R. A. Hettinga <mailto: [EMAIL PROTECTED]> The Internet Bearer Underwriting Corporation <http://www.ibuc.com/> 44 Farquhar Street, Boston, MA 02131 USA "... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire' --------------------------------------------------------------------- The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]