On Wed, Jan 15, 2014 at 3:41 PM, Stephen Frost <sfr...@snowman.net> wrote:
> * Claudio Freire (klaussfre...@gmail.com) wrote:
>> But, still, the implementation is very similar to what postgres needs:
>> sharing a physical page for two distinct logical pages, efficiently,
>> with efficient copy-on-write.
>
> Agreed, except that KSM seems like it'd be slow/lazy about it and I'm
> guessing there's a reason the pagecache isn't included normally..

KSM does active de-duplication: it scans memory looking for identical
pages, and that's slow. This would leverage the KSM structures in the
kernel (page sharing) but without any of the de-duplication logic.

>
>> So it'd be just a matter of removing that limitation regarding page
>> cache and shared pages.
>
> Any idea why that limitation is there?

No, but I'm guessing it's because nobody bothered to implement the
required copy-on-write in the page cache, which would be a PITA to
write - think of all the complexities with privilege checks and
everything - even though the benefits for many kinds of applications
would be important.

>> If you asked me, I'd implement it as copy-on-write on the page cache
>> (not the user page). That ought to be low-overhead.
>
> Not entirely sure I'm following this- if it's a shared page, it doesn't
> matter who starts writing to it, as soon as that happens, it needs to get
> copied.  Perhaps you mean that the application should keep the
> "original" and that the page-cache should get the "copy" (or, really,
> perhaps just forget about the page existing at that point- we won't want
> it again...).
>
> Would that be a way to go, perhaps?  This does go back to the "make it
> act like mmap, but not *be* mmap", but the idea would be:
> open(..., O_ZEROCOPY_READ)
> read() - Goes to PG's shared buffers, pagecache and PG share the page
> page fault (PG writes to it) - pagecache forgets about the page
> write() / fsync() - operate as normal

Yep.
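
For reference, the closest thing available today is a private file
mapping, which already gives share-until-written behaviour. Below is a
minimal sketch of that, assuming a Unix-like system where Python's
mmap.ACCESS_COPY maps the file privately. O_ZEROCOPY_READ above is only
a proposed flag and does not exist, so this does not use it, and the
helper name cow_demo is mine. Note one difference from the proposal:
here the page cache keeps the original page and the writer gets the
copy, rather than the page cache forgetting the page.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def cow_demo():
    """Show share-until-written semantics with a private file mapping."""
    fd, path = tempfile.mkstemp()
    try:
        # Put one page of data in the file so there is something to map.
        os.write(fd, b"A" * PAGE)
        os.fsync(fd)

        # ACCESS_COPY gives a copy-on-write view: reads are served from
        # the file's pages, writes land in a private copy.
        view = mmap.mmap(fd, PAGE, access=mmap.ACCESS_COPY)
        before = view[:4]

        # First write to the page: the process gets its own copy and the
        # file contents are left untouched.
        view[:4] = b"BBBB"
        after = view[:4]

        # Read through the file descriptor to see what the file holds.
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 4)

        view.close()
        return before, after, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(cow_demo())
```

The mapping sees the new bytes while the file still holds the old ones,
so getting the change to disk takes an explicit write() / fsync(), the
same as the last step in the list above.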


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
