Enhanced TODO:
        
        * Experiment with a multi-threaded backend for better resource utilization
        
          This would allow a single query to make use of multiple CPUs or
          multiple I/O channels simultaneously.  One idea is to create a
          background reader that can pre-fetch sequential and index scan
          pages needed by other backends.  This could be expanded to allow
          concurrent reads from multiple devices in a partitioned table.
        
The issue of parallelism is basically a sliding scale: the question is
what you want to do concurrently.  In this case, we are trying to do I/O
and CPU work concurrently.  Another idea is to do I/O for partitioned
tables concurrently, and of course there are many CPU-concurrency cases,
such as sorting.

---------------------------------------------------------------------------

Tom Lane wrote:
> Myron Scott <[EMAIL PROTECTED]> writes:
> > Gregory Maxwell wrote:
> >> There are other cases where it is useful to perform parallel I/O
> >> without parallel processing..
> 
> > I have done some testing more along these lines with an old fork of
> > postgres code (2001).  In my tests, I used a thread to delegate out
> > the actual heap scan of the SeqScan.  The job of the "slave" thread
> > was to fault in buffer pages and determine the time validity of
> > the tuples.  ItemPointers are passed back to the "master" thread via a
> > common memory area guarded by mutex locking.
> 
> I was considering a variant idea in the shower this morning: suppose
> that we invent one or more "background reader" processes that have
> basically the same infrastructure as the background writer, but have
> the responsibility of causing buffer reads to happen at useful times
> (whereas the writer causes writes to happen at useful times).  The
> idea would be for backends to signal the readers when they know they
> will need a given block soon, and then hopefully when they need it
> it'll already be in shared buffers.  For instance, in a seqscan it'd be
> pretty trivial to request block N+1 just after reading block N, and then
> doing our actual processing on block N while (we hope) some reader
> process is faulting in N+1.  Bitmap indexscans could use this structure
> too; I'm less sure about whether plain indexscans could do much with it
> though.
> 
> The major issues I can see are:
> 
> 1. We'd need a shared-memory queue of read requests, probably much like
> the queue of fsync requests.  We've already seen problems with
> contention for the fsync queue, IIRC, and that's used much less heavily
> than the read request queue would be.  So there might be some
> performance issues with getting the block requests sent over to the
> readers.
> 
> 2. There are some low-level assumptions that no one reads in pages of
> a relation without having some kind of lock on the relation (consider
> eg the case where the relation is being dropped).  A bgwriter-like
> process wouldn't be able to hold lmgr locks, and we wouldn't really want
> it to be thrashing the lmgr shared data structures for each read anyway.
> So you'd have to design some interlock to guarantee that no backend
> abandons a query (and releases its own lmgr locks) while an async read
> request it made is still pending.  Ugh.
> 
>                       regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
