On 2011-03-22 21:43, Robert Haas wrote:
> I took a crack at implementing the first approach described above,
> which seems to be by far the simplest idea we've come up with to date.
> Patch attached.  It doesn't seem to be that complicated, which could
> mean either that it's not that complicated or that I'm missing
> something.  Feel free to point and snicker in the latter case.

Looks simple, but there is no benefit on the usage side in the patch yet, so it isn't
really "testable"? I would love to spend some time testing once that is doable
(even with rough corners).

I'm still a bit puzzled about how it would end up working with a page-level
visibility map bit for index scans. There is a clear drop-off in usability as the
change rate of the table goes up, which may or may not be relevant, but I cannot
really judge, since I don't even have a ballpark figure for how much table churn
it would take to disable, say, 50% of the usage.
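
For what it's worth, this is roughly how I would try to put numbers on the churn
myself, against the standard statistics views (just a sketch; 'seq' is the table
from the example further down, and you would sample it twice over an interval to
get a rate):

-- rough churn numbers from the stats collector; sample twice and diff
-- to get tuples modified per unit of time
SELECT relname,
       n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd,
       n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'seq';

With roughly 8 tuples per page (see below), every non-HOT insert/update/delete can
in the worst case take the all-visible status away from one page (an update from
two: the old and the new page), which at least gives an upper bound on how fast
the bits disappear.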

= Really naive suggestion approaching =
Another layout might be to simply drag t_xmin and t_xmax out per row (8 bytes)
into a table of their own. That table would be way bigger than the one-bit-per-page
map, but could be WAL-logged like any other change in the system?

It would, by definition, make the visibility testing work (way faster than today),
no matter how fast the underlying table changes.
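
Purely to illustrate the layout I have in mind (this is not meant as an actual
user-level table, just a sketch of what the 8 bytes per heap tuple would hold):

-- conceptual sketch only: one entry per heap tuple, addressed by its TID
CREATE TABLE seq_visibility_sketch (
    blockno integer  NOT NULL,  -- heap block the tuple lives in
    lp      smallint NOT NULL,  -- line pointer (offset) within that block
    t_xmin  xid      NOT NULL,  -- inserting transaction, 4 bytes
    t_xmax  xid      NOT NULL   -- deleting/locking transaction, 4 bytes
);

In reality the TID part would presumably be implicit from the position in the map
(like in the FSM / visibility map forks), so only the 8 bytes of t_xmin/t_xmax
would actually be stored per tuple.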

The state of affairs today (PG 8.4) is that a query like this:
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 5863.600 ms
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 659.832 ms
testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 1005.765 ms

Somewhere around 15 µs / matched tuple (1005.765 ms / 69753 rows), which is not bad at all.
(The first run was probably only "half warm".)

The average number of tuples per page is somewhere between 4 and 8 for this table.
Assuming 8, and assuming the 69K matches are randomly distributed among the 16M other
tuples (a fair assumption in this case), each match lands on its own page. The
600-1000 ms of the warm-cache runs is then the time it takes to walk
69753 * 8192 bytes (page size) = ~571 MB of heap pages for visibility testing alone,
with all of those pages already sitting in main memory. Packing 16M tuples at
8 bytes / tuple into a map would be around 128 MB.

Given 8 bytes/row and a random distribution of data, that map would require us to read
all 128 MB, so a speedup of roughly x4 on this example; and it would let us count the
entire table in about the same time.
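
To spell that arithmetic out (same numbers as above: 8192-byte pages, 16M rows,
69753 matches; just a sanity check of my own figures):

-- heap bytes touched today vs. the size of an 8-byte-per-tuple map
SELECT 69753::bigint * 8192                                      AS heap_bytes_touched,  -- ~571 MB
       16000000::bigint * 8                                      AS xid_map_bytes,       -- 128 MB
       round(69753::numeric * 8192 / (16000000::numeric * 8), 1) AS rough_speedup;       -- ~4.5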

With regard to disk vs. memory hotness: those 128 MB compare to a table size of 32 GB (with a TOAST table next to it of 64 GB, which shouldn't be touched by the above query).

The ns/tuple number (today) on a "thin" table in my system is approaching 1ns / tuple.

If the page-level bitmap would get set again "quite fast" on a fairly busy system anyway,
then the above is just noise, but I currently have no feel for that, and there is some
math in there I have trouble putting reliable ballpark numbers on.

There is, with any of these approaches, room for significant improvement in visibility
testing for a huge range of installations.

Can I pull numbers on the "frozenness" of tuples out of my current systems to feed into
the discussion? (And if so, how?)
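
Something along these lines is what I had in mind (a sketch only; I am assuming that
age() reports the 2^31-1 maximum for an already-frozen xmin, and 'seq' is the table
from the example above):

-- catalog-only: everything with an xmin older than relfrozenxid is
-- guaranteed frozen; age() says how far back that cutoff lies
SELECT relname, age(relfrozenxid) AS xid_cutoff_age
  FROM pg_class
 WHERE relname = 'seq';

-- per-tuple check (full table scan!): already-frozen tuples report the
-- maximum possible xmin age
SELECT age(xmin) = 2147483647 AS frozen, count(*)
  FROM testdb.seq
 GROUP BY 1;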

Jesper
