Stefan Nobis <stefan...@snobis.de> writes: > Eric Abrahamsen <e...@ericabrahamsen.net> writes: > >> I was confused in part because the "where exists (select *..." looks >> like its main purpose is to return rows. > > Indeed that's the purpose: Restrict the set of rows upon which update > acts on. Here I tried to reformat the statement a bit in order to > emphasize its structure:
Right -- I should have phrased that as "looks like its main purpose is to return data from rows", which as you clarify below, isn't its main purpose. > #+begin_src sql > UPDATE bookreview > SET rating = (select rating from updates > where bookreview.id = updates.id) > WHERE EXISTS (select * from updates > where updates.id = bookreview.id); > #+end_src > > The subselect of the "SET rating" part is a correlated subquery. So if > you imagine UPDATE as a kind of loop over the table, the subquery of > the SET part is executed once for every row UPDATE acts on (maybe the > SQL execution engine optimizes this in some kind, but the mental model > here is: run the subquery for every row we visit on our journey > throught the table). > > Only the WHERE EXISTS clause belonging directly to the UPDATE > statement will reduce the set of rows to act on. > >> Will the select subquery actually restrict the values that are >> available for updating/comparison in the update statement? > > No. > >> Or does the "exists" mean the subquery is treated as a plain yes/no >> boolean, and the update still has access to anything it likes? We >> could write "where exists (select <foo>" to the same effect? > > Yes. The SELECT clause of an EXISTS subquery (as in the above example) > is rather meaningless. So somethimes you see constructs like "where > exists (select 1 from ...)". Some SQL engines are not very clever and > execute the subquery of such an EXISTS clause unchanged - meaning that > way too much data is fetched for the intermediate result (unnecessary > IO and maybe polluting caches). Thus the "select 1" as a workaround > for those unclever engines. But current engines should have no > problems with optimizing these EXISTS subqueries and in that case it > does not matter how the select clause looks like - it will be ignored. > >> In essence, the "where exists" is acting as an "inner join"... > > Yes, effectively we are simulating an inner join at this point. Sadly, > many SQL engines are not able to update rows of join constructs (or at > least have quite severe constraints in these cases). Thus we need to > build these kinds of workarounds to change data in more complex cases. > > SQL is quite a capable language, but it has also has some rough edges. > :) Really interesting! Thanks again for the in-depth explanation.