rand([seed optional]) is loaded from Hive's UDFs, whereas 'random()'
is Drill's own implementation.

Drill has logic that reuses the previous function container, so the
previous result is reused as well. I would say this is a bug for a random
generator. The fix should be to allow some functions not to hold the previous
result, so that each call of the function returns a new random value. We also
need to decide whether we want to keep both 'rand' and 'random'.
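
To make the idea concrete, here is a rough sketch in Python of the kind of
fix I have in mind. This is not Drill's actual code: ExprEvaluator,
eval_row and the non_deterministic set are hypothetical names, and the
per-row cache only stands in for the function container reuse described
above.

```python
class ExprEvaluator:
    """Toy evaluator that reuses the result of identical calls within a row.

    Hypothetical illustration only; it does not mirror Drill's internals.
    """

    def __init__(self, functions, non_deterministic=()):
        # functions: maps a function name to a zero-argument callable
        self.functions = functions
        # names of functions whose previous result must never be reused
        self.non_deterministic = set(non_deterministic)

    def eval_row(self, calls):
        cache = {}  # previous results, keyed by function name
        out = []
        for name in calls:
            if name in self.non_deterministic or name not in cache:
                # evaluate again: first use, or flagged as non-deterministic
                cache[name] = self.functions[name]()
            out.append(cache[name])
        return out
```

With nothing flagged, two 'random' calls in one row share a single value,
which matches the x and y columns in your output. With 'random' listed in
non_deterministic, each call is evaluated separately.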

Could you open a bug for this?

Thanks,

Chunhui




On Mon, Apr 11, 2016 at 9:13 AM, Ted Dunning <[email protected]> wrote:

> I am trying to generate some random numbers. I have a large base file (foo),
> and this is what I get:
>
> 0: jdbc:drill:>  select floor(1000*random()) as x, floor(1000*random()) as
> y, floor(1000*rand()) as z from (select * from maprfs.tdunning.foo) a limit
> 20;
> +--------+--------+--------+
> |   x    |   y    |   z    |
> +--------+--------+--------+
> | 556.0  | 556.0  | 618.0  |
> | 564.0  | 564.0  | 618.0  |
> | 129.0  | 129.0  | 618.0  |
> | 48.0   | 48.0   | 618.0  |
> | 696.0  | 696.0  | 618.0  |
> | 642.0  | 642.0  | 618.0  |
> | 535.0  | 535.0  | 618.0  |
> | 440.0  | 440.0  | 618.0  |
> | 894.0  | 894.0  | 618.0  |
> | 24.0   | 24.0   | 618.0  |
> | 508.0  | 508.0  | 618.0  |
> | 28.0   | 28.0   | 618.0  |
> | 816.0  | 816.0  | 618.0  |
> | 717.0  | 717.0  | 618.0  |
> | 334.0  | 334.0  | 618.0  |
> | 978.0  | 978.0  | 618.0  |
> | 646.0  | 646.0  | 618.0  |
> | 787.0  | 787.0  | 618.0  |
> | 260.0  | 260.0  | 618.0  |
> | 711.0  | 711.0  | 618.0  |
> +--------+--------+--------+
>
> On this page, https://drill.apache.org/docs/math-and-trig/, the rand
> function is described and random() is not. But it appears that rand()
> delivers a constant (although a different constant each time the
> query is run), and that random() delivers the same value when it is
> used multiple times within each returned row.
>
> This seems very, very wrong.
>
> The fault does not seem to be related to my querying a table:
>
> 0: jdbc:drill:> select rand(), random(), random() from (values (1),(2),(3))
> x;
> +---------------------+-----------------------+-----------------------+
> |       EXPR$0        |        EXPR$1         |        EXPR$2         |
> +---------------------+-----------------------+-----------------------+
> | 0.1347749257216052  | 0.36724556209765014   | 0.36724556209765014   |
> | 0.1347749257216052  | 0.006087161689924625  | 0.006087161689924625  |
> | 0.1347749257216052  | 0.09417099142512142   | 0.09417099142512142   |
> +---------------------+-----------------------+-----------------------+
>
> For reference, postgres doesn't have rand() and does the right thing with
> random().
>
