On 10 December 2015 at 16:47, Robert Haas <robertmh...@gmail.com> wrote:

> On Thu, Dec 10, 2015 at 11:36 AM, Andres Freund <and...@anarazel.de>
> wrote:
> >> In fact, having no way to get the relation length other than scanning
> >> 1000 files doesn't seem like an especially good choice even if we used
> >> a better data structure.  Putting a header page in the heap would make
> >> getting the length of a relation O(1) instead of O(segments), and for
> >> a bonus, we'd be able to reliably detect it if a relation file
> >> disappeared out from under us.  That's a difficult project and
> >> definitely not my top priority, but this code is old and crufty all
> >> the same.)
> >
> > The md layer doesn't really know whether it's dealing with a heap, or
> > with an index, or ... So handling this via a metapage doesn't seem
> > particularly straightforward.
>
> It's not straightforward, but I don't think that's the reason.  What
> we could do is look at the call sites that use
> RelationGetNumberOfBlocks() and change some of them to get the
> information some other way instead.  I believe get_relation_info() and
> initscan() are the primary culprits, accounting for some enormous
> percentage of the system calls we do on a read-only pgbench workload.
> Those functions certainly know enough to consult a metapage if we had
> such a thing.
>

It looks pretty straightforward to me...

The number of relations with more than one segment file is likely to be
fairly small, so we can just have an in-memory array to record them. At 8
bytes per relation >1 GB that isn't going to take much shmem, and we can
extend using dynshmem as needed. We can sequentially scan the array at
relcache build time and invalidate the relcache when we extend. WAL-log any
extension into a new segment and write the table to disk at checkpoint.
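
Purely as an illustrative sketch of what I mean (all of these names --
MultiSegRelEntry, MultiSegRelArray, lookup_multiseg_length -- are invented
for the example, not existing backend APIs):

#include "postgres.h"
#include "storage/lwlock.h"

/*
 * One entry per relation that has grown past its first 1GB segment;
 * 8 bytes of payload per entry, as above.
 */
typedef struct MultiSegRelEntry
{
    Oid         relNode;        /* relfilenode of the relation */
    uint32      numBlocks;      /* cached total length in blocks */
} MultiSegRelEntry;

typedef struct MultiSegRelArray
{
    LWLock     *lock;           /* protects numEntries and entries[] */
    int         numEntries;
    MultiSegRelEntry entries[FLEXIBLE_ARRAY_MEMBER];
} MultiSegRelArray;

static MultiSegRelArray *multiSegArray;     /* set up at shmem init */

/*
 * Sequential scan of the array at relcache build time; returns false
 * for the common case of a single-segment relation, which then falls
 * back to the existing lseek()-based path.
 */
static bool
lookup_multiseg_length(Oid relNode, uint32 *numBlocks)
{
    bool        found = false;
    int         i;

    LWLockAcquire(multiSegArray->lock, LW_SHARED);
    for (i = 0; i < multiSegArray->numEntries; i++)
    {
        if (multiSegArray->entries[i].relNode == relNode)
        {
            *numBlocks = multiSegArray->entries[i].numBlocks;
            found = true;
            break;
        }
    }
    LWLockRelease(multiSegArray->lock);
    return found;
}

Extending into a new segment would take the lock exclusively, add or
update the entry, and fire the relcache invalidation; the checkpointer
writes the array out with the rest of its state.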

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
