On Wed, Jan 28, 2015 at 2:08 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> OTOH, spreading the I/O across multiple files is not a good thing, if you
> don't have a RAID setup like that. With a single spindle, you'll just induce
> more seeks.
>
> Perhaps the OS is smart enough to read in large-enough chunks that the
> occasional seek doesn't hurt much. But then again, why isn't the OS smart
> enough to read in large-enough chunks to take advantage of the RAID even
> when you read just a single file?

Suppose we have N spindles and N worker processes and it just so happens that the amount of computation is such that each spindle can keep one CPU busy. Let's suppose the chunk size is 4MB. If you read from the relation at N staggered offsets, you might be lucky enough that each one of them keeps a spindle busy, and you might be lucky enough to have that stay true as the scans advance. You don't need any particularly large amount of read-ahead; you just need to stay at least one block ahead of the CPU. But if you read the relation in one pass from beginning to end, you need at least N*4MB of read-ahead to have data in cache for all N spindles, and the read-ahead will certainly fail you at the end of every 1GB segment.

The problem here, as I see it, is that we're flying blind. If there's just one spindle, I think it's got to be right to read the relation sequentially. If there are multiple spindles, it might not be, but it seems hard to predict what we should do. We don't know what the RAID chunk size is or how many spindles there are, so any guess as to how to chunk up the relation and divide up the work between workers is just a shot in the dark.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
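
To make the staggered-offset argument above concrete, here is a rough toy model in Python. It is only a sketch under assumptions that are not in the original message: simple RAID-0 striping, the relation treated as one contiguous striped byte range (real 1GB segment files need not be laid out that way), and hypothetical helper names (spindle_for, spindles_hit, sequential_readahead_needed). It shows why a single sequential scan needs about N*4MB of read-ahead, and why staggered starting points may or may not land on distinct spindles depending on a chunk size we cannot see.

```python
MB = 1024 * 1024
GB = 1024 * MB
CHUNK = 4 * MB  # assumed RAID chunk size, taken from the 4MB example above


def spindle_for(offset, n_spindles, chunk=CHUNK):
    """Spindle holding the byte at `offset` under plain RAID-0 striping (assumption)."""
    return (offset // chunk) % n_spindles


def spindles_hit(offsets, n_spindles, chunk=CHUNK):
    """Set of spindles kept busy by readers currently positioned at `offsets`."""
    return {spindle_for(o, n_spindles, chunk) for o in offsets}


def sequential_readahead_needed(n_spindles, chunk=CHUNK):
    """Read-ahead (bytes) one sequential scan needs to keep every spindle busy."""
    return n_spindles * chunk


n = 4

# One pass from beginning to end: N * 4MB of read-ahead.
print(sequential_readahead_needed(n) // MB)  # 16

# Unlucky stagger: workers start 1GB apart; 1GB is a multiple of N * chunk,
# so every worker lands on the same spindle.
unlucky = [i * GB for i in range(n)]
print(sorted(spindles_hit(unlucky, n)))  # [0]

# Lucky stagger: shifting each start by one extra chunk spreads the workers
# across all spindles.
lucky = [i * (GB + CHUNK) for i in range(n)]
print(sorted(spindles_hit(lucky, n)))  # [0, 1, 2, 3]
```

The difference between the "lucky" and "unlucky" cases depends entirely on CHUNK and n, which is exactly the information the planner does not have.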