Hi, I was a little confusing. The query orders of the order by columns (salary is project out into the temporary table) but, later ob_hash is used. This is because the frame extends to all the values the order the same, and comparing multiple columns is hard, so a hash is used instead.
--Justin On Tue, Mar 3, 2015 at 1:28 PM, Justin Swanhart <[email protected]> wrote: > Hi, > > Well here is how Shard-Query does them. I assume you would do something > very similarly internally with a temp table. > > a) it create a temporary table for the query reserving null rows for the > window functions > b) it adds to the temporary a unique id for each row of the resultset. > This is used for framing. > c) it adds a hash of the order by columns for the window function for > ordering > d) it adds a hash of the partition columns for partitioning > > After the regular resultset is stored in the temp table, a function sweeps > the table for > each window function, calculating the result of the function based on the > framing clause, > then the column in the resultset is updated to reflect the computed value. > > Finally, the resultset is returned to the client. > > Here is the SQL log of the following query: > mysql> call shard_query.sq_helper("SELECT depname, empno, salary, > cume_dist() OVER (PARTITION BY depname ORDER by salary rows between 1 > following and 1 following) ss FROM empsalary", "", 'test','testtab',1,1); > +-----------+-------+--------+------------------+ > | depname | empno | salary | ss | > +-----------+-------+--------+------------------+ > | develop | 7 | 4200 | 0.2 | > | develop | 9 | 4500 | 0.4 | > | develop | 11 | 5200 | 0.8 | > | develop | 10 | 5200 | 0.8 | > | develop | 8 | 6000 | 1 | > | sales | 3 | 4800 | 0.66666666666667 | > | sales | 4 | 4800 | 0.66666666666667 | > | sales | 1 | 5000 | 1 | > | personnel | 5 | 3500 | 0.5 | > | personnel | 2 | 3900 | 1 | > +-----------+-------+--------+------------------+ > 10 rows in set (0.08 sec) > > Query OK, 0 rows affected (0.09 sec) > > 150303 13:12:05 1533 Query call shard_query.sq_helper("SELECT depname, > empno, salary, cume_dist() OVER (PARTITION BY depname ORDER by salary rows > between 1 following and 1 following) ss FROM empsalary", "", > 'test','testtab',1,1) > > -- get meta data for resultset (notice 0=1 in where clause) > 1553 Query SELECT NULL as wf_rownum, depname AS expr$0,empno AS > expr$1,salary AS expr$2,NULL as wf0,depname,SHA1(CONCAT_WS('#',depname)) as > wf0_hash,salary,SHA1(CONCAT_WS('#',salary)) as wf0_obhash > FROM empsalary AS `empsalary` WHERE 1=1 LIMIT 0 > > -- create temp table > 1553 Query CREATE TABLE aggregation_tmp_45403179 (wf_rownum bigint > auto_increment primary key,expr$0 VARCHAR(255),expr$1 VARCHAR(255),expr$2 > VARCHAR(255),wf0 VARCHAR(255),depname VARCHAR(255),wf0_hash > VARCHAR(255),salary VARCHAR(255),wf0_obhash VARCHAR(255)) ENGINE=MYISAM > > -- get resultset > 1556 Query SELECT NULL as wf_rownum, depname AS expr$0,empno AS > expr$1,salary AS expr$2,NULL as wf0,depname,SHA1(CONCAT_WS('#',depname)) as > wf0_hash,salary,SHA1(CONCAT_WS('#',salary)) as wf0_obhash > FROM empsalary AS `empsalary` WHERE 1=1 > > -- store resultset > 1557 Query INSERT INTO `aggregation_tmp_45403179` VALUES > (NULL,'develop',11,5200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,'develop',7,4200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',4200,'c79fce75b1583ddd36a96178757e0d8d0ac91228'),(NULL,'develop',9,4500,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',4500,'97a87b470fe9ed5ff51ff9b8543e937e6016d48c'),(NULL,'develop',8,6000,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',6000,'31d9ddeaa80bc88c1f3117b9724726ebcc7fc72d'),(NULL,'develop',10,5200,NULL,'develop','418a6bc4deccf0f7d5182192d51a54e504b3f3c9',5200,'2fad4efb8e6aaaa53ecdf638137d390e002f9783'),(NULL,'personnel',5,3500,NULL,'personnel','fc08ce9ebee8734c2dc883c0dbd607686bdce8f3',3500,'65609286cc04ece831a844984a6bc9eb80450cf7'),(NULL,'personnel',2,3900,NULL,'personnel','fc08ce9ebee8734c2dc883c0dbd607686bdce8f3',3900,'5446569e8572251dcac168152d6c37074427eae3'),(NULL,'sales',3,4800,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95'),(NULL,'sales',1,5000,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',5000,'f8237d8959e03355010bb85cc3dc46a46fb31110'),(NULL,'sales',4,4800,NULL,'sales','59248c4dae276a021cb296d2ee0e6a0c962a8d7f',4800,'98e9f55262269a05dd4f1ed788626580fdef2e95') > > -- handle window function 0 (the only one in this case) > > -- get the hashes for each partition > 1553 Query SELECT distinct wf0_hash h from aggregation_tmp_45403179 > ORDER BY salary asc > > -- compute the values for each partition (three in this case) > > -- process the window and update the temp table (see code at the end for > wf_cume) for each partition in turn. > -- as you can see there is a select followed by updates > > 1553 Query SELECT *,NULL as wf0 > FROM `aggregation_tmp_45403179` where > wf0_hash='fc08ce9ebee8734c2dc883c0dbd607686bdce8f3' ORDER BY salary asc > > > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.5 WHERE wf_rownum > in (6) > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum > in (7) > 1553 Query SELECT *,NULL as wf0 > FROM `aggregation_tmp_45403179` where > wf0_hash='59248c4dae276a021cb296d2ee0e6a0c962a8d7f' ORDER BY salary asc > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.66666666666667 > WHERE wf_rownum in (8,10) > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum > in (9) > 1553 Query SELECT *,NULL as wf0 > FROM `aggregation_tmp_45403179` where > wf0_hash='418a6bc4deccf0f7d5182192d51a54e504b3f3c9' ORDER BY salary asc > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.2 WHERE wf_rownum > in (2) > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.4 WHERE wf_rownum > in (3) > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 0.8 WHERE wf_rownum > in (1,5) > 1553 Query UPDATE aggregation_tmp_45403179 SET wf0 = 1 WHERE wf_rownum > in (4) > > -- return resultset to client > > 1559 Query SELECT expr$0 AS `depname`,expr$1 AS `empno`,expr$2 AS > `salary`,wf0 as `ss` > FROM `aggregation_tmp_45403179` ORDER BY wf0_hash,salary asc > > -- remove temp table > > 1559 Query DROP TABLE IF EXISTS aggregation_tmp_45403179 > 1559 Quit > 1550 Quit > > 150303 13:12:12 1533 Query set global general_log=0 > > > > protected function wf_cume_dist($num,$state) { > static $sum; > $win = $state->windows[$num]; > if(empty($win['order'])) { > if($percent) > $sql = "update " . $state->table_name . " set wf{$num}=1"; > else > $sql = "update " . $state->table_name . " set wf{$num}=0"; > > $state->DAL->my_query($sql); > if($err = $state->DAL->my_error()) { > $this->errors[] = $err; > return false; > } > return true; > } else { > /* running sum*/ > $sql = "SELECT distinct wf{$num}_hash h from " . $state->table_name > . " ORDER BY " . $win['order_by']; > $stmt = $state->DAL->my_query($sql); > if($err = $state->DAL->my_error()) { > $this->errors[] = $err; > return false; > } > $last_hash = ""; > $hash = ""; > $last_ob_hash = ""; > $ob_hash = ""; > while($row = $state->DAL->my_fetch_assoc($stmt)) { > #$sql = "select * from " . $state->table_name . " where > wf{$num}_hash='" . $row['h'] . "' ORDER BY " . $win['order_by']; > $sql = "SELECT *," . $state->winfunc_sql . " where > wf{$num}_hash='" . $row['h'] . "' ORDER BY " . $win['order_by']; > $stmt2 = $state->DAL->my_query($sql); > if($err = $state->DAL->my_error()) { > $this->errors[] = $err; > return false; > } > $done=array(); > $rows=array(); > while($row2=$state->DAL->my_fetch_assoc($stmt2)) { > $rows[] = $row2; > } > $last_hash = ""; > $last_ob_hash = ""; > $i = 0; > $rowlist=""; > $rank = 0; > > while($i<count($rows)) { > $row2 = $rows[$i]; > ++$rank; > $ob_hash = $row2["wf{$num}_obhash"]; > $rowlist=$row2['wf_rownum']; > for($n=$i+1;$n<count($rows);++$n) { > $row3 = $rows[$n]; > $new_ob_hash = $row3["wf{$num}_obhash"]; > if($new_ob_hash != $ob_hash) { > break; > } > $rowlist .= "," . $row3['wf_rownum']; > ++$i; > ++$rank; > } > $dist = $rank/count($rows); > $sql = "UPDATE " . $state->table_name . " SET wf{$num} = {$dist} > WHERE wf_rownum in ({$rowlist})"; > $state->DAL->my_query($sql); > if($err = $state->DAL->my_error()) { > $this->errors[] = $err; > return false; > } > ++$i; > } > } > } > return true; > } > > > > On Tue, Mar 3, 2015 at 12:55 PM, Igor Babaev <[email protected]> wrote: > >> On 03/03/2015 10:25 AM, Sergei Golubchik wrote: >> > Hi, Igor! >> > >> > On Mar 03, Igor Babaev wrote: >> >>> >> >>>> I'd also like to discuss window functions too. I've implemented them >> >>>> in shard-query and have ideas about how to implement them in the >> >>>> server, but pluggable parser would be really useful here. >> >>> >> >>> Window functions have a good chance of being in 10.2, it's MDEV-6115. >> >>> But I don't think that somebody is working on MDEV-6115 yet. >> >> >> >> I started working on MDEV-6115 some time ago. >> > >> > Ah, great. Sorry, I didn't know it. >> > Could you then discuss it with Justin, please? >> > See above, he has some ideas about the implementation. >> >> >> Justin, >> How do you prefer discussing your ideas? >> On IRC (#maria) or by phone? (I don't have skype at the moment) >> >> Regards, >> Igor. >> >> > >> > Regards, >> > Sergei >> >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~maria-discuss >> Post to : [email protected] >> Unsubscribe : https://launchpad.net/~maria-discuss >> More help : https://help.launchpad.net/ListHelp >> > >
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

