Re: Removing unneeded self joins

Konstantin Knizhnik Thu, 17 May 2018 01:17:21 -0700


On 17.05.2018 05:19, Andres Freund wrote:

On 2018-05-16 22:11:22 -0400, Tom Lane wrote:

David Rowley <david.row...@2ndquadrant.com> writes:

On 17 May 2018 at 11:00, Andres Freund <and...@anarazel.de> wrote:

Wonder if we shouldn't just cache an estimated relation size in the
relcache entry till then. For planning purposes we don't need to be
accurate, and usually activity that drastically expands relation size
will trigger relcache activity before long. Currently there's plenty
workloads where the lseeks(SEEK_END) show up pretty prominently.

While I'm in favour of speeding that up, I think we'd get complaints
if we used a stale value.

Yeah, that scares me too.  We'd then be in a situation where (arguably)
any relation extension should force a relcache inval.  Not good.
I do not buy Andres' argument that the value is noncritical, either ---
particularly during initial population of a table, where the size could
go from zero to something-significant before autoanalyze gets around
to noticing.

I don't think every extension needs to force a relcache inval. It'd
instead be perfectly reasonable to define a rule that an inval is
triggered whenever crossing a 10% relation size boundary. Which'll lead
to invalidations for the first few pages, but much less frequently
later.
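
A minimal sketch of how such a 10% rule might look, assuming a
hypothetical per-relation struct RelSizeEstimate and helper
note_extension(); neither name exists in PostgreSQL, and the real
invalidation call is only indicated by the return value:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cached estimate; not an actual PostgreSQL structure. */
typedef struct RelSizeEstimate
{
    uint32_t cached_blocks;     /* size last published to planners */
} RelSizeEstimate;

/*
 * Return true when actual_blocks differs from the cached value by more
 * than 10% of the cached value.  With a cached value of 0 any growth
 * counts, so the first few pages each cross the boundary.
 */
static bool
size_crossed_boundary(const RelSizeEstimate *est, uint32_t actual_blocks)
{
    uint64_t cached = est->cached_blocks;
    uint64_t diff = actual_blocks > cached ? actual_blocks - cached
                                           : cached - actual_blocks;

    /* diff > cached / 10, written without integer division */
    return diff * 10 > cached;
}

/*
 * To be called after a relation extension.  Returns true when the
 * caller would need to send an invalidation (assumed, not shown here),
 * and refreshes the cached value in that case.
 */
static bool
note_extension(RelSizeEstimate *est, uint32_t actual_blocks)
{
    if (!size_crossed_boundary(est, actual_blocks))
        return false;
    est->cached_blocks = actual_blocks;
    return true;
}
```

Under this sketch a relation cached at 1000 blocks triggers nothing
until it reaches 1101 blocks, while an empty relation triggers on its
first page.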

I'm a bit skeptical of the idea of maintaining an accurate relation
size in shared memory, too.  AIUI, a lot of the problem we see with
lseek(SEEK_END) has to do with contention inside the kernel for access
to the single-point-of-truth where the file's size is kept.  Keeping
our own copy would eliminate kernel-call overhead, which can't hurt,
but it won't improve the contention angle.

A syscall is several hundred instructions. An unlocked read - which'll
be sufficient in many cases, given that the value can quickly be out
of date anyway - is a few cycles. Even with a barrier you're talking a
few dozen cycles.  So I can't see how it'd not improve the contention.

But the main reason for keeping it in shmem is less the lseek avoidance
- although that's nice, context switches aren't great - but to make
relation extension need far less locking.

Greetings,

Andres Freund

I completely agree with Andres. In my multithreaded Postgres prototype,
the file descriptor cache (shared by all threads) becomes a bottleneck
exactly because each query execution requires access to the file system
(lseek) to provide the optimizer's estimate of the relation size,
despite the fact that the whole database fits in memory.
Well, this is certainly specific to the shared descriptor pool in my
prototype, but the fact that we have to perform lseek at each query
compilation seems to be annoying in any case.

And there is really no problem if the cached relation size estimate is
not precise. It can be invalidated only when the relation size has
changed by more than some threshold value (1 MB?) or when the lease
time for the cached value has expired.
Maybe it is reasonable to implement a specific invalidation for the
relation size estimate, to avoid complete invalidation and
reconstruction of the relation descriptor and all dependent objects.
In this case time-based invalidation seems to be the easiest choice to
implement. Repeating lseek every 10 seconds or every second seems to
have no noticeable impact on performance, and the relation size cannot
change dramatically during this time.
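
To make the time-based variant concrete, here is a rough sketch under
my own assumptions: the struct CachedRelSize, the function
cached_relation_size() and the one-second lease constant are all
hypothetical names for illustration, not existing PostgreSQL code. The
current time is passed in by the caller so the lease logic is easy to
exercise.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical lease length; the text above suggests 1 or 10 seconds. */
#define RELSIZE_LEASE_SECONDS 1

/* Hypothetical per-relation cache entry. */
typedef struct CachedRelSize
{
    off_t   size_bytes;     /* last size returned by lseek */
    time_t  fetched_at;     /* when size_bytes was read */
    bool    valid;          /* false until the first successful lseek */
} CachedRelSize;

/*
 * Return the cached size while the lease is still running; otherwise
 * ask the kernel again.  Note that lseek(SEEK_END) moves the file
 * position of fd as a side effect.  Returns -1 if lseek fails.
 */
static off_t
cached_relation_size(CachedRelSize *c, int fd, time_t now)
{
    if (!c->valid || now - c->fetched_at >= RELSIZE_LEASE_SECONDS)
    {
        off_t sz = lseek(fd, 0, SEEK_END);

        if (sz < 0)
            return -1;      /* leave the cache untouched on error */
        c->size_bytes = sz;
        c->fetched_at = now;
        c->valid = true;
    }
    return c->size_bytes;
}
```

With this shape the planner pays for at most one lseek per relation per
lease period, and a stale value is never older than the lease.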




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
