gortiz commented on issue #12949:
URL: https://github.com/apache/pinot/issues/12949#issuecomment-2079590396
I was able to replicate a similar plan in MultiStageQuickStart using:
```sql
select
playerID,
numberOfGamesAsBatter,
playerName,
runs,
intentionalWalks,
CAST(hits as BIGINT) as Zeit
from baseballStats a
join dimBaseballTeams b on a.teamID = b.teamID
where not exists ( select 1 from baseballStats c where c.teamID = a.teamID
and c.yearID < a.yearID)
order by Zeit desc
```
Which generates
```
Execution Plan
LogicalSort(sort0=[$5], dir0=[DESC])
PinotLogicalSortExchange(distribution=[hash], collation=[[5 DESC]],
isSortOnSender=[false], isSortOnReceiver=[true])
LogicalProject(playerID=[$3], numberOfGamesAsBatter=[$2],
playerName=[$4], runs=[$5], intentionalWalks=[$1], Zeit=[CAST($0):BIGINT NOT
NULL])
LogicalFilter(condition=[IS NULL($11)])
LogicalJoin(condition=[AND(=($6, $9), =($7, $10))], joinType=[left])
PinotLogicalExchange(distribution=[hash[6, 7]])
LogicalJoin(condition=[=($6, $8)], joinType=[inner])
PinotLogicalExchange(distribution=[hash[6]])
LogicalProject(hits=[$9], intentionalWalks=[$12],
numberOfGamesAsBatter=[$15], playerID=[$16], playerName=[$17], runs=[$19],
teamID=[$25], yearID=[$27])
LogicalTableScan(table=[[default, baseballStats]])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(teamID=[$3])
LogicalTableScan(table=[[default, dimBaseballTeams]])
PinotLogicalExchange(distribution=[hash[0, 1]])
LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
PinotLogicalExchange(distribution=[hash[0, 1]])
LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
LogicalProject(teamID0=[$2], yearID0=[$3], $f0=[true])
LogicalJoin(condition=[AND(=($0, $2), <($1, $3))],
joinType=[inner])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(teamID=[$25], yearID=[$27])
LogicalTableScan(table=[[default, baseballStats]])
PinotLogicalExchange(distribution=[hash[0]])
LogicalAggregate(group=[{0, 1}])
PinotLogicalExchange(distribution=[hash[0, 1]])
LogicalAggregate(group=[{0, 1}])
LogicalJoin(condition=[=($0, $2)],
joinType=[inner])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(teamID=[$25], yearID=[$27])
LogicalTableScan(table=[[default,
baseballStats]])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(teamID=[$3])
LogicalTableScan(table=[[default,
dimBaseballTeams]])
```
But at lest with the numbers we have in this quickstart, the rows returned
by this version are smaller than the number returned without the where.
I'll take a look next week trying to understand the rules that have been
applied to generate that plan.
Meanwhile, can you try to use LAST_WITH_TIME instead? IICU your query, that
should be semantically equivalent to:
```
select a.acctuniquesessionid,
a.acctstatustype,
a.ts,
b.acctstatustype,
b.ts,
(cast(b.ts as BIGINT) - cast(a.ts as BIGINT)) / 1000 as Zeit
from (
select acctuniquesessionid,
LAST_WITH_TIME(acctstatustype, ts, "STRING") as acctstatustype,
MAX(ts) as ts
from radius_start
group by acctuniquesessionid
) as a
join radius_stop b on a.acctuniquesessionid = b.acctuniquesessionid
order by Zeit desc
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]