rand([optional seed]) is loaded from Hive's UDFs, while random() is Drill's own implementation.
Drill has logic to reuse the previous function container, so the previous result is reused as well. I would say this is a bug in the random generator. The fix should allow some functions to opt out of holding the previous result, so that each call returns a new random value. We also need to decide whether to keep both 'rand' and 'random'.

Could you open a bug for this?

Thanks,
Chunhui

On Mon, Apr 11, 2016 at 9:13 AM, Ted Dunning <[email protected]> wrote:

> I am trying to generate some random numbers. I have a large base file (foo)
> this is what I get:
>
> 0: jdbc:drill:> select floor(1000*random()) as x, floor(1000*random()) as
> y, floor(1000*rand()) as z from (select * from maprfs.tdunning.foo) a limit
> 20;
> +--------+--------+--------+
> |   x    |   y    |   z    |
> +--------+--------+--------+
> | 556.0  | 556.0  | 618.0  |
> | 564.0  | 564.0  | 618.0  |
> | 129.0  | 129.0  | 618.0  |
> | 48.0   | 48.0   | 618.0  |
> | 696.0  | 696.0  | 618.0  |
> | 642.0  | 642.0  | 618.0  |
> | 535.0  | 535.0  | 618.0  |
> | 440.0  | 440.0  | 618.0  |
> | 894.0  | 894.0  | 618.0  |
> | 24.0   | 24.0   | 618.0  |
> | 508.0  | 508.0  | 618.0  |
> | 28.0   | 28.0   | 618.0  |
> | 816.0  | 816.0  | 618.0  |
> | 717.0  | 717.0  | 618.0  |
> | 334.0  | 334.0  | 618.0  |
> | 978.0  | 978.0  | 618.0  |
> | 646.0  | 646.0  | 618.0  |
> | 787.0  | 787.0  | 618.0  |
> | 260.0  | 260.0  | 618.0  |
> | 711.0  | 711.0  | 618.0  |
> +--------+--------+--------+
>
> On this page, https://drill.apache.org/docs/math-and-trig/, the rand
> function is described and random() is not. But it appears that rand()
> delivers a constant instead (although a different constant each time the
> query is run) and it appears that random() delivers the same value when
> used multiple times in each returned value.
>
> This seems very, very wrong.
>
> The fault does not seem to be related to my querying a table:
>
> 0: jdbc:drill:> select rand(), random(), random() from (values (1),(2),(3))
> x;
> +---------------------+-----------------------+-----------------------+
> |       EXPR$0        |        EXPR$1         |        EXPR$2         |
> +---------------------+-----------------------+-----------------------+
> | 0.1347749257216052  | 0.36724556209765014   | 0.36724556209765014   |
> | 0.1347749257216052  | 0.006087161689924625  | 0.006087161689924625  |
> | 0.1347749257216052  | 0.09417099142512142   | 0.09417099142512142   |
> +---------------------+-----------------------+-----------------------+
>
> For reference, postgres doesn't have rand() and does the right thing with
> random().
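
To illustrate the reuse problem described at the top of this message, here is a minimal Python sketch. It is not Drill's code: ToyProjector, register, evaluate_row, and the deterministic flag are hypothetical names, made up only to show why reusing the result of an identical call is fine for deterministic functions but wrong for a random generator. It models only the duplicated random() columns within a row, not the constant rand() value across rows.

```python
import random
from typing import Callable, Dict, List, Tuple


class ToyProjector:
    """Evaluates a list of zero-argument function calls for one row.

    Hypothetical stand-in for an engine that reuses the result of an
    identical call expression; this is an assumption, not Drill's real code.
    """

    def __init__(self) -> None:
        # name -> (implementation, is_deterministic)
        self._functions: Dict[str, Tuple[Callable[[], float], bool]] = {}

    def register(self, name: str, fn: Callable[[], float],
                 deterministic: bool = True) -> None:
        self._functions[name] = (fn, deterministic)

    def evaluate_row(self, calls: List[str]) -> List[float]:
        cache: Dict[str, float] = {}  # results reused within this row only
        row: List[float] = []
        for name in calls:
            fn, deterministic = self._functions[name]
            if deterministic:
                # Safe to reuse: the same call always gives the same result.
                if name not in cache:
                    cache[name] = fn()
                row.append(cache[name])
            else:
                # Never reuse: every call site gets a fresh evaluation.
                row.append(fn())
        return row


projector = ToyProjector()
# Treating the generator as deterministic duplicates the value within a row.
projector.register("random_buggy", random.random, deterministic=True)
# Flagging it as non-deterministic forces one evaluation per call site.
projector.register("random_fixed", random.random, deterministic=False)

print(projector.evaluate_row(["random_buggy", "random_buggy"]))
print(projector.evaluate_row(["random_fixed", "random_fixed"]))
```

The first printed row always holds two equal values, which mirrors the x and y columns in the quoted output; the second row draws a value per call. The per-function flag is one possible design for the fix suggested above; whether Drill should expose it this way is an open question for the bug report.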
