Re: Recommended way of random sampling

Stéphane Thibaud Mon, 20 May 2019 03:29:05 -0700

Hello Ilya,

Thank you for that suggestion. On a traditional database, I know that
approach does not scale well, because a random number first has to be
assigned to every row (so the cost grows linearly with the number of rows,
if I am not mistaken). Do you think this would be different for Ignite?
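
For illustration, here is a minimal sketch of the "random indexed field"
idea, using SQLite from Python as a stand-in. It does not show how Ignite
behaves. The table name `items`, the index name and the helper functions
are hypothetical; only `rand_field` comes from the thread. The random value
is stored once at insert time and indexed, so each sample can be answered
by an index range lookup rather than by assigning a fresh random number to
every row. The query is a variant of the one quoted below: it adds ORDER BY
so that the row nearest the threshold is returned, and it wraps around when
the threshold exceeds every stored value.

```python
import random
import sqlite3


def build_table(n_rows, seed=0):
    """Create an in-memory table whose rows carry a precomputed random value."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    # `items` and `idx_items_rand` are hypothetical names for this sketch.
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, rand_field REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO items (id, rand_field) VALUES (?, ?)",
        ((i, rng.random()) for i in range(n_rows)),
    )
    # The index is what lets the lookup avoid touching every row.
    conn.execute("CREATE INDEX idx_items_rand ON items (rand_field)")
    conn.commit()
    return conn


def sample_one(conn, rng):
    """Return one (id, rand_field) row near a freshly drawn threshold."""
    threshold = rng.random()
    row = conn.execute(
        "SELECT id, rand_field FROM items "
        "WHERE rand_field >= ? ORDER BY rand_field LIMIT 1",
        (threshold,),
    ).fetchone()
    if row is None:
        # Threshold was above every stored value: wrap to the smallest one.
        row = conn.execute(
            "SELECT id, rand_field FROM items ORDER BY rand_field LIMIT 1"
        ).fetchone()
    return row


if __name__ == "__main__":
    conn = build_table(1000)
    print(sample_one(conn, random.Random()))
```

One caveat: the draw is not exactly uniform, because a row is picked with
probability equal to the gap between its `rand_field` and the previous
stored value. That may or may not matter for a given use case.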



Kind regards,

Stéphane Thibaud

On Mon, 20 May 2019 at 15:53, Ilya Kasnacheev <[email protected]> wrote:

> Hello!
>
> You can have a random indexed field in your table and run queries like
> SELECT * FROM table WHERE rand_field < RAND() LIMIT 1; to sample a random
> item.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Mon, 20 May 2019 at 04:50, Stéphane Thibaud <[email protected]> wrote:
>
>> As a small addition: it would really help if Ignite had a hashing
>> function for this, but I only see AES encryption.
>>
>>
>> Kind regards,
>>
>> Stéphane Thibaud
>>
>> On Sun, 19 May 2019 at 20:59, Stéphane Thibaud <[email protected]> wrote:
>>
>>> Hello Ignite users,
>>>
>>> I am considering random sampling over large amounts of data, and I was
>>> wondering what the most efficient way to do this would be. Right now, I
>>> think I might need cluster-based randomness using a MOD function, as
>>> described here: https://www.alandix.com/academic/topics/random/sampling-SQL.html
>>>
>>> I currently have a UUID column (uuid4), which I think can be used for
>>> this, but I might need some bit manipulation to strip the non-random
>>> parts out of the UUID.
>>> Do you think this is indeed the most straightforward way to do it?
>>>
>>>
>>> Kind regards,
>>>
>>> Stéphane Thibaud
>>>
>>
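
As a rough client-side illustration of the UUID idea from the original
question: a version 4 UUID has 122 random bits, plus 4 fixed version bits
and 2 fixed variant bits. The sketch below strips those 6 fixed bits and
applies a MOD test, in the spirit of the approach on the linked page. The
function names `random_bits` and `in_sample` are hypothetical. Whether the
same bit operations can be expressed in Ignite SQL is not something this
sketch shows.

```python
import uuid


def random_bits(u):
    """Return the 122 random bits of a version 4 UUID as an integer."""
    n = u.int
    high = n >> 80             # 48 bits above the version nibble
    mid = (n >> 64) & 0x0FFF   # 12 bits just below the version nibble
    low = n & ((1 << 62) - 1)  # 62 bits below the 2 variant bits
    return (high << 74) | (mid << 62) | low


def in_sample(u, modulus):
    """Keep roughly 1 row in `modulus`, deterministically per UUID."""
    return random_bits(u) % modulus == 0


if __name__ == "__main__":
    ids = [uuid.uuid4() for _ in range(100_000)]
    kept = sum(in_sample(u, 100) for u in ids)
    print(f"kept {kept} of {len(ids)} rows (about 1% expected)")
```

Because the decision depends only on the UUID, the same rows are selected
on every run. That is useful for repeatable samples, but it means a fresh
sample needs a different remainder or modulus.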
