Rosser Schwarz wrote:
> while you weren't looking, Kevin Brown wrote:
> [reordering bursty reads]
> > In other words, it's a corner case that I strongly suspect
> > isn't typical in situations where SCSI has historically made a big
> > difference.
> [...]
> > But I rather doubt that has to be a huge penalty, if any.  When a
> > process issues an fsync (or even a sync), the kernel doesn't *have* to
> > drop everything it's doing and get to work on it immediately.  It
> > could easily gather a few more requests, bundle them up, and then
> > issue them.
> To make sure I'm following you here, are you or are you not suggesting
> that the kernel could sit on -all- IO requests for some small handful
> of ms before actually performing any IO to address what you "strongly
> suspect" is a "corner case"?

The kernel *can* do so.  Whether or not it's a good idea depends on
the activity in the system.  You'd only consider doing this if you
didn't already have a relatively large backlog of I/O requests to
handle.  You wouldn't do this for every I/O request.

Consider this: I/O operations to a block device are so slow compared
with other (non-I/O) operations on the system that the system can
easily wait for, say, a hundredth of the target device's typical
latency before issuing requests to it, without any real negative
impact on I/O throughput.  A process running on my test system, a 3
GHz Xeon, can issue a million read system calls per second (I've
measured it; I can post the rather trivial source code if you're
interested).  That's the full round trip of issuing the system call
and having the kernel return.  So in the span of a millisecond, the
system could receive 1000 requests if it were busy enough.

If the average latency for a random read from the disk (including head
movement and everything) is 10 milliseconds, and we delay the issuance
of the first I/O request by a tenth of a millisecond (a hundredth of
that latency), the system might receive 100 additional I/O requests,
which it could then put into the queue and sort by block address
before issuing the reads.  As long as the system knows the last block
requested from that physical device, it can order the requests
properly and then begin issuing them.  Since the latency on the target
device is so high, this is likely to be a rather big win for overall
throughput.
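
The ordering step can be sketched as a tiny one-way elevator pass.
This is a hypothetical illustration, not anything resembling actual
kernel code; the function name and the one-way-scan policy are my own
choices for the example:

```python
def order_requests(requests, last_block):
    """One-way elevator pass over a batch of pending reads.

    `requests` is a list of block addresses gathered during the short
    delay window.  Blocks at or beyond the last-serviced block are
    issued first in ascending order (the head keeps sweeping forward),
    then the head seeks back for the blocks behind it.
    """
    ahead = sorted(b for b in requests if b >= last_block)
    behind = sorted(b for b in requests if b < last_block)
    return ahead + behind

# With the head last at block 6, a batch gathered during the window:
print(order_requests([7, 3, 12, 5, 9], last_block=6))
# -> [7, 9, 12, 3, 5]: sweep forward first, then collect the stragglers
```

The point is only that a batch of 100 requests sorted this way costs
far fewer seeks than 100 requests issued in arrival order.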

Kevin Brown                                           [EMAIL PROTECTED]

