> On 2014-10-03 10:35:42 +0300, Heikki Linnakangas wrote:
> > On 10/03/2014 07:08 AM, Kouhei Kaigai wrote:
> > > Hello,
> > >
> > > I recently ran into trouble while developing my extension, which uses
> > > the shared buffers; the trouble occurs when it releases each buffer page.
> > >
> > > This extension transfers the contents of shared buffers to a GPU
> > > device using DMA, then launches a device kernel.
> > Wow, that sounds crazy.
> Agreed. I doubt that pinning that many buffers is a sane thing to do. At
> the very least you'll heavily interfere with vacuum and such.
My assumption is that this extension is used for OLAP-type workloads,
so there is relatively little write traffic to the database.
Sorry, I failed to mention that.
> > > Once a backend or extension calls ReadBuffer(), resowner.c tracks which
> > > buffers are referenced by the current resource owner, to ensure these
> > > buffers are released at the end of the transaction.
> > > However, it seems to me that the implementation of resowner.c did not
> > > assume that many buffers would be referenced by a particular resource owner.
> > > It manages the buffer indexes in an expandable array, and looks up
> > > the target buffer by a sequential walk starting from the tail, because
> > > recently pinned buffers tend to be released first.
> > > This caused trouble in my case. My extension pinned several thousand
> > > buffers, so owner->buffers grew large and became expensive
> > > to walk.
> > > In my measurement, ResourceOwnerForgetBuffer() takes 36 seconds in
> > > total while hash-joining 2M rows, even though the hash join itself
> > > takes less than 16 seconds.
> > > What is the best way to solve the problem?
> > How about creating a separate ResourceOwner for these buffer pins, and
> > doing a wholesale ResourceOwnerRelease() on it when you're done?
> Or even just unpinning them in reverse order? That should already fix the
> performance issues?
When multiple chunks (note: a chunk contains thousands of buffers and is
the unit of device kernel execution) are running asynchronously, the order
in which GPU jobs complete is not predictable.
So it does not help my situation if one resource owner tracks all the
buffers.
Heikki probably meant creating a separate resource owner per chunk.
In that case, all the buffers in a particular chunk are released
at the same time, so calling ReleaseBuffer() in reverse order makes sense.
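
To illustrate why the release order matters, here is a minimal standalone
sketch. It is not the resowner.c code: ToyOwner, remember_buffer(),
forget_buffer() and release_cost() are hypothetical names of my own, and the
model only assumes what is described above, namely an array that is searched
from the tail when a buffer is forgotten.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer array kept by a resource owner. */
typedef struct
{
    int  *buffers;   /* remembered buffer ids, in pin order */
    int   nbuffers;  /* number of valid entries */
    long  steps;     /* array slots visited by forget_buffer() so far */
} ToyOwner;

static void
remember_buffer(ToyOwner *owner, int buf)
{
    owner->buffers[owner->nbuffers++] = buf;
}

/* Search from the tail; on a hit, close the gap by shifting later entries. */
static int
forget_buffer(ToyOwner *owner, int buf)
{
    for (int i = owner->nbuffers - 1; i >= 0; i--)
    {
        owner->steps++;
        if (owner->buffers[i] == buf)
        {
            for (int j = i; j < owner->nbuffers - 1; j++)
                owner->buffers[j] = owner->buffers[j + 1];
            owner->nbuffers--;
            return 1;
        }
    }
    return 0;                   /* buffer was not remembered */
}

/*
 * Pin n buffers, then release all of them, either in pin order
 * (reverse = 0) or in reverse pin order (reverse = 1).
 * Returns the number of array slots visited while searching.
 */
long
release_cost(int n, int reverse)
{
    ToyOwner owner = {malloc(sizeof(int) * (size_t) n), 0, 0};

    assert(owner.buffers != NULL);
    for (int i = 0; i < n; i++)
        remember_buffer(&owner, i);
    for (int i = 0; i < n; i++)
        forget_buffer(&owner, reverse ? n - 1 - i : i);
    free(owner.buffers);
    return owner.steps;
}
```

In this model, release_cost(4000, 0) visits n(n+1)/2 = 8,002,000 slots,
while release_cost(4000, 1) visits only 4000. With unpredictable completion
order the cost falls somewhere in between, which is why keeping each owner's
array small (one owner per chunk) should help as well.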
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>