On 2011-03-22 21:43, Robert Haas wrote:
> I took a crack at implementing the first approach described above,
> which seems to be by far the simplest idea we've come up with to date.
> Patch attached.  It doesn't seem to be that complicated, which could
> mean either that it's not that complicated or that I'm missing
> something.  Feel free to point and snicker in the latter case.

Looks simple, but there is no benefit on the usage side in the patch yet, so it isn't
really "testable"? I would love to spend some time testing once that is doable
(even with rough corners).

I'm still a bit puzzled about how it would end up working with a page-level
visibility map bit for index scans. There is a clear drop-off in usability as the
change rate of the table goes up, which may or may not be relevant, but I cannot
really judge, since I don't even have a ballpark figure for how much table churn
it would take to disable, say, 50% of the usage.
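
For what it's worth, this is roughly how I would try to put numbers on the churn
myself, against the standard statistics views (just a sketch; 'seq' is the table
from the example further down, and you would sample it twice over an interval to
get a rate):

-- rough churn numbers from the stats collector; sample twice and diff
-- to get tuples modified per unit of time
SELECT relname,
       n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd,
       n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'seq';

With roughly 8 tuples per page (see below), every non-HOT insert/update/delete can
in the worst case take the all-visible status away from one page (an update from
two: the old and the new page), which at least gives an upper bound on how fast
the bits disappear.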

= Really naive suggestion approaching =
Another layout might be to simply drag t_xmin and t_xmax out per row (8 bytes)
into a table of their own. That table would be way bigger than the one-bit-per-page
map, but could be WAL-logged like any other change in the system?

It would, by definition, make the visibility testing work (way faster than today),
no matter how fast the underlying table changes.
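
Purely to illustrate the layout I have in mind (this is not meant as an actual
user-level table, just a sketch of what the 8 bytes per heap tuple would hold):

-- conceptual sketch only: one entry per heap tuple, addressed by its TID
CREATE TABLE seq_visibility_sketch (
    blockno integer  NOT NULL,  -- heap block the tuple lives in
    lp      smallint NOT NULL,  -- line pointer (offset) within that block
    t_xmin  xid      NOT NULL,  -- inserting transaction, 4 bytes
    t_xmax  xid      NOT NULL   -- deleting/locking transaction, 4 bytes
);

In reality the TID part would presumably be implicit from the position in the map
(like in the FSM / visibility map forks), so only the 8 bytes of t_xmin/t_xmax
would actually be stored per tuple.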

The state of affairs today (PG 8.4) is that a query like this:
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 5863.600 ms
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 659.832 ms
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 1005.765 ms

Somewhere around 15 µs / matched tuple (1005.765 ms / 69753 rows), which is not bad at all.
(The first run was probably only "half warm".)

The average number of tuples per page is somewhere between 4 and 8 for this table.
Assuming 8, and assuming the 69K matches are randomly distributed among the 16M other
tuples (a fair assumption in this case), each match lands on its own page. The
600-1000 ms of the warm-cache runs is then the time it takes to walk
69753 * 8192 bytes (page size) = ~571 MB of heap pages for visibility testing alone,
with all of those pages already sitting in main memory. Packing 16M tuples at
8 bytes / tuple into a map would be around 128 MB.

Given 8 bytes/row and a random distribution of data, that map would require us to read
all 128 MB, so a speedup of roughly x4 on this example; and it would let us count the
entire table in about the same time.
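
To spell that arithmetic out (same numbers as above: 8192-byte pages, 16M rows,
69753 matches; just a sanity check of my own figures):

-- heap bytes touched today vs. the size of an 8-byte-per-tuple map
SELECT 69753::bigint * 8192                                      AS heap_bytes_touched,  -- ~571 MB
       16000000::bigint * 8                                      AS xid_map_bytes,       -- 128 MB
       round(69753::numeric * 8192 / (16000000::numeric * 8), 1) AS rough_speedup;       -- ~4.5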

With regard to disk vs. memory hotness: those 128 MB compare to a table size of 32 GB (with a TOAST table next to it of 64 GB, which shouldn't be touched by the above query).

The ns/tuple number (today) on a "thin" table in my system is approaching 1ns / tuple.

If the page-level bitmap would get set again "quite fast" on a fairly busy system anyway,
then the above is just noise, but I currently have no feel for that, and there is some
math in there I have trouble putting reliable ballpark numbers on.

There is, with any of these approaches, room for significant improvement in visibility
testing for a huge range of installations.

Can I pull numbers on the "frozenness" of tuples out of my current systems to feed into
the discussion? (And if so, how?)
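
Something along these lines is what I had in mind (a sketch only; I am assuming that
age() reports the 2^31-1 maximum for an already-frozen xmin, and 'seq' is the table
from the example above):

-- catalog-only: everything with an xmin older than relfrozenxid is
-- guaranteed frozen; age() says how far back that cutoff lies
SELECT relname, age(relfrozenxid) AS xid_cutoff_age
  FROM pg_class
 WHERE relname = 'seq';

-- per-tuple check (full table scan!): already-frozen tuples report the
-- maximum possible xmin age
SELECT age(xmin) = 2147483647 AS frozen, count(*)
  FROM testdb.seq
 GROUP BY 1;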

Jesper
