> On 2014-10-03 10:35:42 +0300, Heikki Linnakangas wrote:
> > On 10/03/2014 07:08 AM, Kouhei Kaigai wrote:
> > > Hello,
> > >
> > > I recently got a trouble on development of my extension that
> > > utilizes the shared buffer when it released each buffer page.
> > >
> > > This extension transfers contents of the shared buffers to GPU
> > > device using DMA feature, then kicks a device kernel code.
> >
> > Wow, that sounds crazy.
>
> Agreed. I doubt that pinning that many buffers is a sane thing to do. At
> the very least you'll heavily interfere with vacuum and such.
>
My assumption is that this extension handles OLAP-type workloads, so
write traffic to the database is relatively light. Sorry, I neglected
to mention that.

> > > Once backend/extension calls ReadBuffer(), resowner.c tracks which
> > > buffer was referenced by the current resource owner, to ensure these
> > > buffers being released at end of the transaction.
> > > However, it seems to me implementation of resowner.c didn't assume
> > > many buffers are referenced by a particular resource owner
> > > simultaneously.
> > > It manages the buffer index using an expandable array, then looks up
> > > the target buffer by sequential walk but from the tail because
> > > recently pinned buffer tends to be released first.
> > > It made a trouble in my case. My extension pinned multiple thousands
> > > buffers, so owner->buffers[] were enlarged and takes expensive cost
> > > to walk on.
> > > In my measurement, ResourceOwnerForgetBuffer() takes 36 seconds in
> > > total during hash-joining 2M rows; even though hash-joining itself
> > > takes less than 16 seconds.
> > >
> > > What is the best way to solve the problem?
> >
> > How about creating a separate ResourceOwner for these buffer pins, and
> > doing a wholesale ResourceOwnerRelease() on it when you're done?
>
> Or even just unpinning them in reverse order? That should already fix the
> performance issues?
>
When multiple chunks run asynchronously (a chunk holds thousands of
buffers and is the unit of device kernel execution), the order in which
the GPU jobs complete is not predictable. So unpinning in reverse order
does not help in my situation if a single resource owner tracks all the
buffers.

Heikki was probably suggesting a separate resource owner per chunk. In
that case, all the buffers of a given chunk are released at the same
time, so calling ReleaseBuffer() in reverse order makes sense.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
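
For illustration only, the lookup cost discussed in this thread can be
sketched in plain C. This is a toy model, not the code in resowner.c:
ToyOwner, remember_buffer(), forget_buffer() and release_cost() are
hypothetical names invented here, and the model simply assumes an
append-only array that is searched from the tail on release. It counts
how many comparisons are spent when n pinned buffers are released either
newest-first or oldest-first.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a resource owner's pin list; not PostgreSQL's struct. */
typedef struct {
    int  *buffers;   /* pinned buffer ids, in pin order */
    int   nbuffers;  /* entries currently in use */
    long  steps;     /* comparisons made by forget_buffer() */
} ToyOwner;

/* Append a pin; the caller has preallocated enough room in this sketch. */
static void remember_buffer(ToyOwner *owner, int buffer)
{
    owner->buffers[owner->nbuffers++] = buffer;
}

/* Drop a pin, scanning from the tail because recent pins usually go first. */
static void forget_buffer(ToyOwner *owner, int buffer)
{
    int last = owner->nbuffers - 1;

    for (int i = last; i >= 0; i--) {
        owner->steps++;
        if (owner->buffers[i] == buffer) {
            /* Close the hole by shifting the later entries down. */
            for (; i < last; i++)
                owner->buffers[i] = owner->buffers[i + 1];
            owner->nbuffers = last;
            return;
        }
    }
}

/* Pin n buffers, release them all, and return the comparisons spent. */
long release_cost(int n, int reverse_order)
{
    assert(n > 0);

    ToyOwner owner = { malloc(sizeof(int) * (size_t) n), 0, 0 };
    if (owner.buffers == NULL)
        return -1;

    for (int b = 0; b < n; b++)
        remember_buffer(&owner, b);

    /* reverse_order != 0: newest pin first; otherwise oldest pin first. */
    for (int k = 0; k < n; k++)
        forget_buffer(&owner, reverse_order ? n - 1 - k : k);

    free(owner.buffers);
    return owner.steps;
}
```

In this model, release_cost(4000, 1) spends 4000 comparisons, while
release_cost(4000, 0) spends 4000 * 4001 / 2 = 8002000. That gap is why a
single owner holding thousands of pins is cheap to unwind newest-first
but expensive when pins are dropped in an unpredictable order, and why a
resource owner per chunk keeps each scan short.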