Re: [HACKERS] Parallel Seq Scan

Robert Haas Wed, 28 Jan 2015 06:03:52 -0800

On Wed, Jan 28, 2015 at 2:08 AM, Heikki Linnakangas
<[email protected]> wrote:
> OTOH, spreading the I/O across multiple files is not a good thing, if you
> don't have a RAID setup like that. With a single spindle, you'll just induce
> more seeks.
>
> Perhaps the OS is smart enough to read in large-enough chunks that the
> occasional seek doesn't hurt much. But then again, why isn't the OS smart
> enough to read in large-enough chunks to take advantage of the RAID even
> when you read just a single file?


Suppose we have N spindles and N worker processes and it just so
happens that the amount of computation is such that a each spindle can
keep one CPU busy.  Let's suppose the chunk size is 4MB.  If you read
from the relation at N staggered offsets, you might be lucky enough
that each one of them keeps a spindle busy, and you might be lucky
enough to have that stay true as the scans advance.  You don't need
any particularly large amount of read-ahead; you just need to stay at
least one block ahead of the CPU.  But if you read the relation in one
pass from beginning to end, you need at least N*4MB of read-ahead to
have data in cache for all N spindles, and the read-ahead will
certainly fail you at the end of every 1GB segment.

The problem here, as I see it, is that we're flying blind.  If there's
just one spindle, I think it's got to be right to read the relation
sequentially.  But if there are multiple spindles, it might not be,
but it seems hard to predict what we should do.  We don't know what
the RAID chunk size is or how many spindles there are, so any guess as
to how to chunk up the relation and divide up the work between workers
is just a shot in the dark.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Seq Scan

Reply via email to