On Thu, Aug 27, 2015 at 11:55 AM, Jean-Christophe Deschamps < jcd at antichoc.net> wrote:
> > I can see both sides of this debate, whether or not random() should be >> evaluated twice in this context: >> >> select random() from blah order by random() >> > > There are 2 distinct and volontary function invokations, so I don't see > how SQL<whatever> engine would decide not to perform the second call. Agreed, though I'm pretty sure I've read messages in this thread at advocate the same function should return the same value when called multiple times in a single select statement. > So let me pose a question. What should the following query generate? >> >> select random(), random() from blah order by random() >> > > Ditto, but with now 3 function direct invokations, hence in the case of > random(), 3 probably distinct values. That a more recent post shows that > SQLite issues an error about ambiguousness of random() is even more > worrisome! > Well, 3.8.11 (at least) does not issue such an error. I'm thinking it was a "this is what should be done IMO" comment vs a "this is what actually happens". > This is completely different from refering to an *-already computed > value-* like in > select random() rr where rr < 0 or rr >= 0 group by rr order by rr > > Here: only 1 function invokation and multiple references to an already > computed column. > What must occur here is (paraphrased): > > create temp table tmp as select random() rr; > select * from tmp where rr < 0 or rr >= 0 group by rr order by rr; -- > providing the result set > drop table tmp; > > Now let's broaden the picture: if rr is in fact a large, complex subquery, > taking 45 hours to compute, then should "order by rr" rerun the subquery > for every row comparison? Sounds ridiculous to me. > > I don't know SQLite code enough to start digging but I doubt that having, > where, group by and order clauses to have any right and need to > introspection into the column definition and private interpretation of > what's there, lest re-running what produced them. This is none of their > business. Using sqlfiddle.com I'm testing the following: create table blah(a int); insert into blah values(1),(2),(3); select rand(), rand() as b from blah order by b; where rand() is whatever random number function the platform in question provides. SQL Server 2008: same row three times; rand() is computed at the beginning of the query and cached. Which makes for decidedly non-random numbers and doesn't seem very useful IMO. MySQL 5.6 returns three different rows, and it seems to be sorted on b. Six invocations of the query always resulted in the set being sorted on b, so I'm going to assume it worked as expected without generating a third random number. PostgreSQL 9.3: same result as MySQL 5.6. Oracle 11g R2: same result as MySQL 5.6. SQLite: we already know. I think it still comes back to my earlier comment: Would changing it to behave more like the most common / expected outcome above be a breaking change? -- Scott Robison