Robert,

For all our not understanding each other, we seem to be in violent
agreement. ;)
* Robert Haas (robertmh...@gmail.com) wrote:
> I think you might be confused, or else I'm confused, because I don't
> believe we have any such thing as an extent lock.

The relation extension lock is what I was referring to.  Apologies for
any confusion there.

> What we do have is a relation extension lock, but the size of the
> segment on disk has nothing to do with that: there's only one for the
> whole relation, and you hold it when adding a block to the relation.

Yes, which is farrr too small.  I'm certainly aware that the segments
on disk are dealt with in the storage layer- currently.  My proposal
was to consider how we might change that, a bit, to allow improved
throughput when there are multiple writers.

Consider this, for example- when we block on the relation extension
lock, rather than sit and wait or continue to compete with the other
backends, simply tell the storage layer to give us a dedicated file to
work with.  Once we're ready to commit, move that file into place as
the next segment (through some command to the storage layer), using an
atomic operation to ensure that it either works and doesn't overwrite
anything, or fails and we try again by moving the segment number up.
We would need to work out, at the storage layer, how to handle cases
where the file is less than 1G and realize that we should just skip
over those blocks on disk as being known-to-be-empty.  Those blocks
would also then be put in the free space map and used by later
processes which need to find somewhere to put new data, etc.

> But that having been said, it just so happens that I was recently
> playing around with ways of trying to fix the relation extension
> bottleneck.  One thing I tried was: every time a particular backend
> extends the relation, it extends the relation by more than 1 block at
> a time before releasing the relation extension lock.

Right, exactly.  One idea that I was discussing w/ Greg was to do this
using some log(relation-size) approach or similar.
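The "move it into place atomically, or bump the segment number and
retry" step above could be sketched roughly like so.  This is a minimal
illustration only, not PostgreSQL code- the function and path names are
hypothetical, and it leans on link(2) failing with EEXIST rather than
overwriting an existing target:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical sketch: atomically publish a privately built file as
 * segment "segno" of a relation whose segments are named
 * "<relpath>.<segno>".  link(2) refuses to overwrite an existing
 * target (EEXIST), so if another backend claimed that segment number
 * first, we simply try the next one.  Returns the segment number
 * actually claimed, or -1 on a real error.
 */
static int
publish_segment(const char *private_path, const char *relpath, int segno)
{
	char		segpath[1024];

	for (;;)
	{
		snprintf(segpath, sizeof(segpath), "%s.%d", relpath, segno);

		if (link(private_path, segpath) == 0)
		{
			unlink(private_path);	/* drop the private name */
			return segno;
		}
		if (errno != EEXIST)
			return -1;			/* real failure, give up */
		segno++;				/* lost the race; bump and retry */
	}
}
```

The point of using link() rather than rename() here is exactly the
no-overwrite guarantee: two backends racing for the same segment number
can't clobber each other, one of them just retries at segno+1.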
> This does help... but at least in my tests, extending by 2 blocks
> instead of 1 was the big winner, and after that you didn't get much
> further relief.

How many concurrent writers did you have, and what kind of filesystem
was backing this?  Was it a temp filesystem where writes are
essentially to memory, making the relation extension lock much more
contentious?

> Another thing I tried was pre-extending the relation to the estimated
> final size.  That worked a lot better, and might be worth doing (e.g.
> ALTER TABLE zorp SET MINIMUM SIZE 1GB) but a less manual solution
> would be preferable if we can come up with one.

Slightly confused here- above you said that '2' was way better than
'1', but you implied that "more than 2 wasn't really much better"- yet
"wayyy more than 2 is much better"?  Did I follow that right?  I can
certainly understand such a case, just want to understand it and make
sure it's what you meant.  What "small-number" options did you try?

> After that, I ran out of time for investigation.

Too bad!  Thanks much for the work in this area- it'd really help our
data warehouse users, in particular, if we could improve this.

	Thanks!

		Stephen
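For what it's worth, the log(relation-size) idea mentioned above could
look something like this- again purely a sketch, with an arbitrary
defensive cap that is my assumption and not any existing PostgreSQL
setting:

```c
#include <stddef.h>

/*
 * Hypothetical sketch: decide how many blocks to add when extending a
 * relation, growing roughly with log2 of the current size so that
 * large, hot relations get extended in bigger chunks while small ones
 * still grow one block at a time.  The cap of 512 blocks is a purely
 * defensive, made-up bound.
 */
static int
extension_amount(size_t current_blocks)
{
	int			amount = 1;

	while (current_blocks > 1)
	{
		current_blocks >>= 1;
		amount++;				/* one extra block per doubling */
	}
	return (amount < 512) ? amount : 512;
}
```

So a 1-block relation still extends by a single block, while a
1024-block (8MB) relation would extend by 11 blocks at a time under
this particular policy.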