Could we please stop the ad-hominem stuff from all sides? We want to
solve the issue not to make it bigger.

On 2013-04-30 04:29:26 -0700, Kevin Grittner wrote:
> Let's look at the "corner" this supposedly paints us into.  If a
> later major release creates a better mechanism, current pg_dump and
> load will already use it, based on the way matviews are created
> empty and REFRESHed by pg_dump.  Worst case, we need to modify the
> behavior of pg_dump running with the switch used by pg_upgrade to
> use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
> syntax is chosen) -- a command we would probably want at that point
> anyway.  I'm not seeing cause for panic here.

I don't think panic is appropriate either, but I think there are some
valid concerns around this.

1) vacuum can truncate the table to an empty relation already if there is
   no data returned by the view's query which then leads to the wrong
   scannability state.

   S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < 
-1;
   S2: S2: SELECT * FROM matview_empty; -- works
   S1: VACUUM matview_empty;
   S2: SELECT * FROM matview_empty; -- still works
   S3: SELECT * FROM matview_empty; -- errors out

   So we need to make vacuum skip cleaning out the last page. Once we
   get incrementally updated matviews there are more situations to get
   into this than just a query not returning anything.
   I remember this being discussed somewhere already, but couldn't find
   it quickly enough.

   Imo this is quite an indicator that the idea of using the filesize is
   just plain wrong. Adding logic to vacuum not to truncate data because
   its a flag for matview scannability is quite the layer violation and
   a sign for a bad design decision. Such a hack has already been added
   to copy_heap_data(), while not as bad, shows that it is hard to find
   all the places where we need to add it.

2) Since we don't have a metapage to represent scannability in 9.3 we
   cannot easily use one in 9.4+ without pg_upgrade emptying all
   matviews unless we only rely on the catalogs which we currently
   cannot. This will either slow down development or make users
   unhappy. Alternatively we can add yet another fork, but that has its
   price (quite a bit more open files during normal operation, slower
   CREATE DATABASE).

   This is actually an argument for not releasing matviews without such
   an external state. Going from disk-based state to catalog is easy,
   the other way round: not so much.

3) Using the filesize as a flag will make other stuff like preallocating
   on-disk data in bigger chunks and related things more complicated.

4) doing the check for scannability in the executor imo is a rather bad
   idea. Note that we e.g. don't see an error about a matview which
   won't be scanned because of constraint exclusion or one-time
   filters.

   S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false 
WITH NO DATA;
   S1: SELECT * FROM matview_unit_false;

You can get there in far more reasonable cases than WHERE false.

5) I have to agree with Kevin that the scannability is an important thing
   to track though.

   a) we cannot remove support for WITH NO DATA because of pg_dump order
      and unlogged relations. So even without unlogged relations the
      problem exists although its easier to represent.
   b) Just giving wrong responses is bad [tm]. Outdated data is something
      completely different (it has existed in that state in the (near)
      past) than giving an empty response (might never have been a visible
      state, and more likely not so in any reasonably near
      past). Consider an application behind a pooler suddently getting
      an empty response from a SELECT * FROM unlogged_matview; It won't
      notice anything without a unscannable state since it probably
      won't even notice the database restart.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now* which sucks, but ...

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to