>>>>> "Isaac" == Isaac Morland <isaac.morl...@gmail.com> writes:
Isaac> So it is as if checking the whole tuple for NULL requires Isaac> reading the PDF bytea columns, but checking just the primary key Isaac> for NULL or even reading the lengths of the PDFs does not. That is almost certainly exactly what happens. If the PDF columns are of type bytea, or if they are of type text and the database encoding is single-byte, then length() does not need to detoast the column in order to get the size (the byte size of the toasted datum is stored in the toast pointer). However, constructing a whole-row datum does require detoasting any externally-stored columns (this used not to be the case, but that caused a lot of bugs). Isaac> For the moment I'm going to fix it by just using Isaac> "y.primary_key_column IS NULL" instead of "y IS NULL" where I Isaac> want to check whether I have a row from y corresponding to a Isaac> given row in x. What you should actually use in these cases for your IS NULL check is one of the columns of the join condition. That allows the planner to detect that the query is in fact an anti-join, and optimize accordingly. The other, and IMO better, way to write anti-join queries is to use an explicit NOT EXISTS. (Note, do _not_ use NOT IN, since that has its own issues with NULL handling.) Isaac> 1) when checking an entire row for null, This isn't a very common operation and the SQL-standard semantics for it are actually quite weird (for example, x IS NULL is not the opposite condition to x IS NOT NULL). So I don't think we need to encourage it. Isaac> start with a primary key field or other NOT NULL field. In the Isaac> common case of checking what happened with a left join, this is Isaac> all that needs to be done - either there is a row, in which case Isaac> the field cannot be NULL, or there is no row and all the other Isaac> fields must also be NULL. The planner can do even better than this if you apply the IS NULL test _specifically to one of the join columns_. When a join condition is strict (which it almost always is), then testing the column from the nullable side is provably (to the planner) equivalent to testing the existence of the matching row, which allows it to transform the join from an outer join to an anti-join. -- Andrew (irc:RhodiumToad)