Have you looked at the pherf module in Phoenix? You can predefined rules and have it generate data based on that.
Sent from my iPhone > On Sep 9, 2015, at 3:55 AM, James Heather <[email protected]> wrote: > > I've been trying to create a table and then bung lots of random data into it, > but without generating the data client-side and then upserting it. I'd rather > find a way to have it generated on the server. > > My table has three columns, all BIGINT, and the first is the primary key. > > I'd ideally like to write something like > > UPSERT into linktab (id, first_val, second_val) SELECT next value for > linktab_next_id, random_func, random_func ... > > and complete the query so that the SELECT generates its values without > needing to read from a table. But there doesn't seem to be a way of writing a > SELECT that doesn't need a table parameter. > > So I've ended up doing this. First I upsert a single row into the table; > then, I double the number of rows in it by a self-referential UPSERT SELECT: > > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 8 rows affected (0.193 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 16 rows affected (0.202 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 32 rows affected (0.205 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 64 rows affected (0.235 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 128 rows affected (0.28 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 256 rows affected (0.378 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 512 rows affected (0.53 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 1,024 rows affected (0.848 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 2,048 rows affected (1.513 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 4,096 rows affected (2.817 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 8,192 rows affected (5.314 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 16,384 rows affected (10.807 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 32,768 rows affected (21.306 seconds) > 0: jdbc:phoenix:172.31.31.143> upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > 65,536 rows affected (45.713 seconds) > 0: jdbc:phoenix:172.31.31.143> explain upsert into linktab (id, first_val, > second_val) select next value for linktab_next_id, > round(rand()*1000000000000), round(rand()*1000000000000) from linktab; > +------------------------------------------+ > | PLAN | > +------------------------------------------+ > | UPSERT SELECT | > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER LINKTAB | > | SERVER FILTER BY FIRST KEY ONLY | > | CLIENT RESERVE VALUES FROM 1 SEQUENCE | > +------------------------------------------+ > 4 rows selected (0.192 seconds) > 0: jdbc:phoenix:172.31.31.143> > > > This works, but as you can see from the timings, it's getting quite slow, and > this seems to be because it has to do a full scan of the table to retrieve > the existing rows in order to construct the dummy data to put in. This seems > a bit of a shame when it doesn't actually need the data! > > Is there a way round this? > > Could we have a SELECT ... LIMIT n command that doesn't require a table to be > specified as long as none of the columns comes from a table? > > James
