[sqlite] picking random subset of rows
Hi, I am trying to use a correlated subquery with an 'order by random() limit 2' to pick upto two random rows for each value in the outer query. I am not sure if I am doing this correctly, but the number of rows I am getting seems to vary randomly which doesn't make sense to me. If i replace the order by random() with order by product I always get the expected number of rows. I have tried to create a simplified version of the code below to replicate the issue. I am experiencing the problem on sqlite 3.22.0, but I have tried on sqliteonline.com which I think is using 3.27.2 and am seeing similar results. Thanks, Kevin --- create table if not exists test_productattribs (product text primary key, attr, val); insert or ignore into test_productattribs values ('1', 'area', 'a'), ('2', 'area', 'b'), ('3', 'area', 'a'), ('4', 'area', 'a') ; --In the real query, this is done inside the with, but it does not seem relevant --to the issue. create table if not exists test_productarea as select product, val as area from test_productattribs where attr='area' ; --I have two areas, 'a' and 'b'. I limit to two random products from each area --As area 'b' only has one product, I always expect to get 3 rows, 1 for area 'b', and 2 for --area 'a'. with dareas as (select distinct area from test_productarea) select
Re: [sqlite] picking random subset of rows
On 18/3/62 17:36, Kevin Martin wrote: Hi, I am trying to use a correlated subquery with an 'order by random() limit 2' to pick upto two random rows for each value in the outer query. I am not sure if I am doing this correctly, but the number of rows I am getting seems to vary randomly which doesn't make sense to me. If i replace the order by random() with order by product I always get the expected number of rows. I have tried to create a simplified version of the code below to replicate the issue. with dareas as (select distinct area from test_productarea) select da.area, pa.product from dareas as da left join test_productarea as pa --if I don't order by random, but instead by product, I always get the correct number of rows on pa.product in (select product from test_productarea where da.area=area order by random() limit 2) In SQLite, a correlated sub-query on the RHS of an IN(...) operator may be rerun every time the IN(...) test is required. And if that sub-query contains "random()" it might return a different result every time. Your words suggest that you are hoping it will be run once for each different value of "da.area", with different results each time. But it will not. Dan. -- In the real query, this order by is required to see a different number of rows to the number in the -- limit. However, it seems it is not required in the small test dataset. -- order by --da.area, --pa.product ; ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] picking random subset of rows
> On 18 Mar 2019, at 16:15, Dan Kennedy wrote: > > > In SQLite, a correlated sub-query on the RHS of an IN(...) operator may be > rerun every time the IN(...) test is required. And if that sub-query contains > "random()" it might return a different result every time. > > Your words suggest that you are hoping it will be run once for each different > value of "da.area", with different results each time. But it will not. Ah yes, this makes complete sense now, thanks. I was going off a stack overflow post as to how to emulate outer apply in sqlite3. I didn't really think through what I was writing. In my case, I only need a different ordering each time I create the database, as opposed to each time the query is run. So, I can remove the random() from the subquery, and instead create a new table populated by a trigger mapping product to a random number and order by that in the subquery instead. I would be interested if there is a solution for sqlite 3.22 (i.e. no window functions) where it can be done so that the query gives a possibly different result each time it is executed. Thanks, Kevin ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users