So when you create a clustered index on a database table, the point is to make it possible to read a lot of records with some particular value or range of some attribute quickly --- without seeking --- or alternatively to process all the records in a certain order without seeking much.
The trouble is, you can create a clustered index on at most one attribute. (You can create a clustered index on a set of attributes, but as with an ordinary index on a set of attributes, items with identical values for the second and later attributes are likely to be scattered all over the place.) And if you pick the wrong one, you get remarkably bad performance for everything else once your table gets big.

There's a disk performance characteristic of interest here: the product of the sustained data transfer rate and the latency. Typical values for current low-end disks, 20 MB/s * (8 ms seek + 5 ms rotational latency), come to around a quarter of a megabyte. I'll call this the "bytes per seek" characteristic. When the amount of data you read or write sequentially in a single location is small compared to the bytes per seek, the disk spends most of its time seeking; when it's equal to the bytes per seek, the disk divides its time half-and-half between seeking and reading or writing; and when it's large compared to the bytes per seek, the disk spends relatively little time seeking and therefore gets close to its peak throughput.

Barring brain damage, clustered indices don't matter at all (for disk time) when the entire table's data is small compared to the bytes per seek --- there's no reason not to read the whole table into memory once you've done the work of seeking to it, so it doesn't matter very much what the order on disk is. (It might matter for other levels of the memory hierarchy, but I doubt it.) But once your table starts to get large compared to the bytes per seek, clustered indices can matter a lot, down to chunks the size of the bytes per seek. The order of data inside bytes-per-seek-sized chunks generally doesn't matter much, but the number of such chunks a query has to access essentially determines the speed of the query.

There's an alternative to clustered indices for small numbers of attributes.
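That arithmetic is easy to sketch; here's a back-of-the-envelope calculation using the assumed low-end-disk figures above (not measurements):

```python
# "Bytes per seek" for the disk figures quoted above: 20 MB/s sustained
# transfer, 8 ms seek, 5 ms rotational latency.
TRANSFER = 20e6               # bytes per second
LATENCY = 0.008 + 0.005       # seconds per random access

bytes_per_seek = TRANSFER * LATENCY
print(bytes_per_seek)         # 260000.0 --- about a quarter of a megabyte

def read_time(total_bytes, chunk_bytes):
    """Rough time to read total_bytes as randomly placed chunks,
    paying one latency per chunk plus the transfer time."""
    chunks = total_bytes / chunk_bytes
    return chunks * (LATENCY + chunk_bytes / TRANSFER)

# In bytes-per-seek-sized chunks the disk spends half its time seeking,
# so a gigabyte takes about twice the 50 s a sequential read would.
print(round(read_time(1e9, bytes_per_seek)))  # 100
```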
You can think of each attribute as a spatial dimension, and divide the multidimensional space up in a binary k-d tree of bytes-per-seek-sized chunks: each node in the k-d tree has a partition value on some attribute which divides the remaining items roughly half and half. If you have, say, two attributes you want to cluster on, and your data is 64 times the bytes-per-seek number, you need 6 levels of division, and you can read all the records with a particular value for either attribute with about 8 seeks. This compares poorly to the single seek you'd need with a pure clustered index on that attribute, but it compares well to the 64 you'd (probably) need with a clustered index on the other attribute.

This k-d tree technique gets less useful as the number of dimensions grows. If you cluster by 3 attributes, your 64-chunk tree is now 4x4x4 instead of 8x8, so you need about 16 seeks to find all the data with some particular attribute value --- only a factor of 2 faster than reading all the data. And these numbers I've been touting as average cases are probably more like best cases; they could get knackered by attributes with some individual values that account for a large proportion of the data.

To take a larger number: if you have a gigabyte of data --- 4096 or so times the bytes-per-seek number --- you can kd-cluster it 64x64, 16x16x16, 8x8x8x8, or 4x4x4x4x4x4, yielding 64, 256, 512, and 1024 expected seeks, respectively, or concretely, about 1.6, 6.4, 12.8, and 25.6 seconds (considering that each seek has an accompanying equally-long read, so you get about 10 MB/s on average). Reading the entire gigabyte only takes 50 seconds. (On my laptop, it's actually about 80, but that's close enough.)
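Those seek counts fall out of simple arithmetic: with the divisions spread evenly over the attributes, pinning one attribute leaves a slab of chunks one grid-cell thick. A small sketch (the function name is mine, purely illustrative):

```python
def seeks_for_one_attribute(n_chunks, d):
    """Chunks touched by a query pinning one of d clustered attributes,
    when n_chunks bytes-per-seek-sized chunks are split evenly by a
    k-d tree cycling through the d attributes."""
    levels = n_chunks.bit_length() - 1   # log2(n_chunks) levels of division
    side = 2 ** (levels // d)            # cells along each axis of the grid
    return side ** (d - 1)               # fixing one axis leaves a slab

print([seeks_for_one_attribute(64, d) for d in (2, 3)])
# [8, 16] --- the 8x8 and 4x4x4 cases above

print([seeks_for_one_attribute(4096, d) for d in (2, 3, 4, 6)])
# [64, 256, 512, 1024] --- the gigabyte cases above
```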
You can roughly double the speed for one of the attributes you choose by storing bytes-per-seek-sized blocks covering roughly the same range of that attribute consecutively on disk, eliminating seeks between them; or you can distribute the reduction in the number of seeks across all the attributes by just storing the blocks in the order produced by a tree traversal (so adjacent leaves are likely to sit next to each other). You can increase the speed bonus for that attribute linearly, at the cost of slowing everyone else down linearly, by using smaller blocks. Reducing the blocks to one record in size gives you ordinary clustered indexing.

