On Wed, Jun 15, 2016 at 2:59 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote: > Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas <robertmh...@gmail.com> wrote: >> >> > The expression index case is the one to worry about; if there is a >> > problem, that's where it is. What bothers me is that a function used >> > in an expression index could do anything at all - it can read any >> > table in the database. >> >> It *can*, but then you are lying to the database when you call it >> IMMUTABLE. Such an index can easily become corrupted through >> normal DML. Without DML the ANALYZE has no problem. So you seem >> to be concerned that if someone is lying to the database engine to >> force it accept a function as IMMUTABLE when it actually isn't, and >> then updating the referenced rows (which is very likely to render >> the index corrupted), that statistics might also become stale. > > We actually go quite some lengths to support this case, even when it's > the opinion of many that we shouldn't. For example VACUUM doesn't try > to find index entries using the values in each deleted tuple; instead we > remember the TIDs and then scan the indexes (possibly many times) to > find entries that match those TIDs -- which is much slower. Yet we do > it this way to protect the case that somebody is doing the > not-really-IMMUTABLE function. > > In other words, I don't think we consider the position you argued as > acceptable.
Well, I actually don't think there's a giant problem here. I'm just trying to follow the chain of the argument to its (illogical) conclusion. I mean, if ANALYZE itself fails to see a tuple subjected to early pruning, that should be fine. And if some query run by a supposedly-but-not-actually immutable function errors out because snapshot_too_old is set, that also seems more or less fine. The statistics might not get updated, but oh well: either make your supposedly-immutable function actually immutable, or else turn off snapshot_too_old, or else live with the fact that ANALYZE will fail some percentage of the time. Supporting people who cheat and do things that technically aren't allowed is one thing; saying that every new feature must never have any downsides for such people is something else. If we took the latter approach, parallel query would be right out, because you sure can break things by mislabeling functions as PARALLEL SAFE. I *do* think it *must* be possible to get an ANALYZE to do something funky - either error out, or give wrong answers - if you set up a strange enough set of circumstances, but I don't think that's necessarily something we need to do anything about. I think this whole discussion of snapshot too old has provoked far too many bitter emails -- on all sides. I entirely believe that there are legitimate reasons to have concerns about this feature, and I entirely suspect that it has problems we haven't found yet, and I also entirely believe that there will be some future bugs that stem from this feature that we would not have had otherwise. I think it is entirely legitimate to have concerns about those things. On the other hand, I *also* believe that Kevin is a pretty careful guy and that he's done his best to make this patch safe and that he did not just go off and commit something half-baked without due reflection. We have to expect that if people who are committers don't get much review of their patches, they will eventually commit them anyway. The "I can't believe you committed this" reactions seem out of line to me. This feature is not perfect. Nor is it the worst thing anybody's ever committed. But it's definitely provoked more ire than most. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers