On Thu, Aug 27, 2015 at 11:55 AM, Jean-Christophe Deschamps <
jcd at antichoc.net> wrote:

>
> I can see both sides of this debate, whether or not random() should be
>> evaluated twice in this context:
>>
>> select random() from blah order by random()
>>
>
> There are 2 distinct and volontary function invokations, so I don't see
> how SQL<whatever> engine would decide not to perform the second call.


Agreed, though I'm pretty sure I've read messages in this thread at
advocate the same function should return the same value when called
multiple times in a single select statement.


> So let me pose a question. What should the following query generate?
>>
>> select random(), random() from blah order by random()
>>
>
> Ditto, but with now 3 function direct invokations, hence in the case of
> random(), 3 probably distinct values. That a more recent post shows that
> SQLite issues an error about ambiguousness of random() is even more
> worrisome!
>

Well, 3.8.11 (at least) does not issue such an error. I'm thinking it was a
"this is what should be done IMO" comment vs a "this is what actually
happens".


> This is completely different from refering to an *-already computed
> value-* like in
> select random() rr where rr < 0 or rr >= 0 group by rr order by rr
>
> Here: only 1 function invokation and multiple references to an already
> computed column.
> What must occur here is (paraphrased):
>
> create temp table tmp as select random() rr;
> select * from tmp where rr < 0 or rr >= 0 group by rr order by rr; --
> providing the result set
> drop table tmp;
>
> Now let's broaden the picture: if rr is in fact a large, complex subquery,
> taking 45 hours to compute, then should "order by rr" rerun the subquery
> for every row comparison? Sounds ridiculous to me.
>
> I don't know SQLite code enough to start digging but I doubt that having,
> where, group by and order clauses to have any right and need to
> introspection into the column definition and private interpretation of
> what's there, lest re-running what produced them. This is none of their
> business.


Using sqlfiddle.com I'm testing the following:

create table blah(a int);
insert into blah values(1),(2),(3);
select rand(), rand() as b from blah order by b;

where rand() is whatever random number function the platform in question
provides.

SQL Server 2008: same row three times; rand() is computed at the beginning
of the query and cached. Which makes for decidedly non-random numbers and
doesn't seem very useful IMO.

MySQL 5.6 returns three different rows, and it seems to be sorted on b. Six
invocations of the query always resulted in the set being sorted on b, so
I'm going to assume it worked as expected without generating a third random
number.

PostgreSQL 9.3: same result as MySQL 5.6.

Oracle 11g R2: same result as MySQL 5.6.

SQLite: we already know.

I think it still comes back to my earlier comment: Would changing it to
behave more like the most common / expected outcome above be a breaking
change?

-- 
Scott Robison

Reply via email to