Re: Removing unneeded self joins

Tom Lane Wed, 16 May 2018 19:13:03 -0700

David Rowley <[email protected]> writes:
> On 17 May 2018 at 11:00, Andres Freund <[email protected]> wrote:
>> Wonder if we shouldn't just cache an estimated relation size in the
>> relcache entry till then. For planning purposes we don't need to be
>> accurate, and usually activity that drastically expands relation size
>> will trigger relcache activity before long. Currently there's plenty
>> of workloads where the lseeks(SEEK_END) show up pretty prominently.


> While I'm in favour of speeding that up, I think we'd get complaints
> if we used a stale value.

Yeah, that scares me too.  We'd then be in a situation where (arguably)
any relation extension should force a relcache inval.  Not good.
I do not buy Andres' argument that the value is noncritical, either ---
particularly during initial population of a table, where the size could
go from zero to something-significant before autoanalyze gets around
to noticing.
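
To make the concern concrete, here is a rough sketch (Python rather than
backend C, and not how the relcache actually works) of a size estimate
that is cached once and never invalidated, compared with a fresh
lseek(SEEK_END) probe after the file has been populated.  The names
size_in_blocks, CachedSize and demo are hypothetical, and the 8192-byte
block size is only an assumption matching the usual default.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed block size, matching the common default


def size_in_blocks(fd):
    """Ask the kernel for the current file size via lseek(SEEK_END)."""
    return os.lseek(fd, 0, os.SEEK_END) // BLOCK_SIZE


class CachedSize:
    """Hypothetical cache: filled once at creation, never invalidated."""

    def __init__(self, fd):
        self.fd = fd
        self.blocks = size_in_blocks(fd)


def demo():
    """Return (cached estimate, fresh probe) after populating the file."""
    fd, path = tempfile.mkstemp()
    try:
        cached = CachedSize(fd)  # file is still empty at this point
        os.write(fd, b"\0" * (BLOCK_SIZE * 100))  # "initial population"
        fresh = size_in_blocks(fd)
        return cached.blocks, fresh
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    stale, fresh = demo()
    print(f"cached estimate: {stale} blocks, lseek(SEEK_END): {fresh} blocks")
```

Nothing in the sketch tells the cache that the file grew, so its estimate
stays at zero blocks while the fresh probe sees all 100.  Closing that gap
would take some kind of invalidation on extension, which is the cost
described above.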

I'm a bit skeptical of the idea of maintaining an accurate relation
size in shared memory, too.  AIUI, a lot of the problem we see with
lseek(SEEK_END) has to do with contention inside the kernel for access
to the single-point-of-truth where the file's size is kept.  Keeping
our own copy would eliminate kernel-call overhead, which can't hurt,
but it won't improve the contention angle.

                        regards, tom lane
