RalfJL commented on issue #12949:
URL: https://github.com/apache/pinot/issues/12949#issuecomment-2079194061
to reduce the complexity of the execution plan I removed the subselect in
the select. This subselect is not necessary but was there only for debugging.
The result stays the same. With the filter to a specific acctuniquesessionid
the result is correct, without the filter, the result is incorrect
this is the reduced query:
```
-- explain plan for
select a.acctuniquesessionid, (a.eventtime - b.eventtime)/1000 as Zeit,
a.acctstatustype, b.acctstatustype, a.eventtime, b.eventtime
-- ,( select count(c.eventtime) from radius_json c where
b.acctuniquesessionid = c.acctuniquesessionid
-- and a.acctuniquesessionid =
c.acctuniquesessionid
-- and a.eventtime > 0 and b.eventtime > 0
-- and (c.acctstatustype = 'Stop' or
c.acctstatustype = 'Start' )
-- and c.eventtime between b.eventtime+1 and
a.eventtime-1
-- )
from radius_json a
join radius_json b
on b.acctuniquesessionid = a.acctuniquesessionid
and a.acctstatustype = 'Stop' and b.acctstatustype = 'Start' -- and
a._3gppimsi <> 'null'
and a.eventtime > b.eventtime
where
not exists ( select 1 from radius_json c where b.acctuniquesessionid =
c.acctuniquesessionid and a.acctuniquesessionid = c.acctuniquesessionid
and a.eventtime > 0 and b.eventtime > 0
and (c.acctstatustype = 'Stop' or
c.acctstatustype = 'Start' )
and c.eventtime between b.eventtime+1 and
a.eventtime-1
)
-- and a.acctuniquesessionid =
'fac07ae7853810946d87e868e463af2c'
-- and a.acctuniquesessionid =
'ac6baa744130522c1eb1eec161114d1b'
-- and a.acctuniquesessionid =
'da0f86138ec36b2364889f048bf2ac82'
order by Zeit desc
```
The query plan is:
```
Execution Plan
LogicalSort(sort0=[$1], dir0=[DESC])
PinotLogicalSortExchange(distribution=[hash], collation=[[1 DESC]],
isSortOnSender=[false], isSortOnReceiver=[true])
LogicalProject(acctuniquesessionid=[$14], Zeit=[/(-($17, $36), 1000)],
acctstatustype=[$12], acctstatustype0=[$31], eventtime=[$17], eventtime0=[$36])
LogicalFilter(condition=[IS NULL($48)])
LogicalJoin(condition=[AND(=($14, $42), =($33, $43), =($38, $44),
=($39, $45), =($40, $46), =($41, $47))], joinType=[left])
PinotLogicalExchange(distribution=[hash[14, 33, 38, 39, 40, 41]])
LogicalProject($docId=[$0], $hostName=[$1], $segmentName=[$2],
USERNAME=[$3], _3gppimsi=[$4], _3gppimsimccmnc=[$5], _3gppsgsnmccmnc=[$6],
acctinputoctets=[$7], acctinputpackets=[$8], acctoutputoctets=[$9],
acctoutputpackets=[$10], acctsessiontime=[$11], acctstatustype=[$12],
acctterminatecause=[$13], acctuniquesessionid=[$14], calledstationid=[$15],
callingstationid=[$16], eventtime=[$17], eventtimestamp=[$18], $docId0=[$21],
$hostName0=[$22], $segmentName0=[$23], USERNAME0=[$24], _3gppimsi0=[$25],
_3gppimsimccmnc0=[$26], _3gppsgsnmccmnc0=[$27], acctinputoctets0=[$28],
acctinputpackets0=[$29], acctoutputoctets0=[$30], acctoutputpackets0=[$31],
acctsessiontime0=[$32], acctstatustype0=[$33], acctterminatecause0=[$34],
acctuniquesessionid0=[$35], calledstationid0=[$36], callingstationid0=[$37],
eventtime0=[$38], eventtimestamp0=[$39], $f40=[$19], $f41=[$40], $f42=[$41],
$f43=[$20])
LogicalJoin(condition=[AND(=($35, $14), >($17, $38))],
joinType=[inner])
PinotLogicalExchange(distribution=[hash[14]])
LogicalProject($docId=[$0], $hostName=[$1],
$segmentName=[$2], USERNAME=[$3], _3gppimsi=[$4], _3gppimsimccmnc=[$5],
_3gppsgsnmccmnc=[$6], acctinputoctets=[$7], acctinputpackets=[$8],
acctoutputoctets=[$9], acctoutputpackets=[$10], acctsessiontime=[$11],
acctstatustype=[$12], acctterminatecause=[$13], acctuniquesessionid=[$14],
calledstationid=[$15], callingstationid=[$16], eventtime=[$17],
eventtimestamp=[$18], $f40=[>($17, 0)], $f43=[-($17, 1)])
LogicalFilter(condition=[=($12, _UTF-8'Stop')])
LogicalTableScan(table=[[radius_json]])
PinotLogicalExchange(distribution=[hash[14]])
LogicalProject($docId=[$0], $hostName=[$1],
$segmentName=[$2], USERNAME=[$3], _3gppimsi=[$4], _3gppimsimccmnc=[$5],
_3gppsgsnmccmnc=[$6], acctinputoctets=[$7], acctinputpackets=[$8],
acctoutputoctets=[$9], acctoutputpackets=[$10], acctsessiontime=[$11],
acctstatustype=[$12], acctterminatecause=[$13], acctuniquesessionid=[$14],
calledstationid=[$15], callingstationid=[$16], eventtime=[$17],
eventtimestamp=[$18], $f41=[>($17, 0)], $f42=[+($17, 1)])
LogicalFilter(condition=[=($12, _UTF-8'Start')])
LogicalTableScan(table=[[radius_json]])
PinotLogicalExchange(distribution=[hash[0, 1, 2, 3, 4, 5]])
LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}], agg#0=[MIN($6)])
PinotLogicalExchange(distribution=[hash[0, 1, 2, 3, 4, 5]])
LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}], agg#0=[MIN($6)])
LogicalProject(acctuniquesessionid0=[$2],
acctuniquesessionid00=[$3], $f40=[$4], $f41=[$5], $f42=[$6], $f43=[$7],
$f0=[true])
LogicalJoin(condition=[AND(=($3, $0), =($2, $0), >=($1,
$6), <=($1, $7))], joinType=[inner])
PinotLogicalExchange(distribution=[hash[0, 0]])
LogicalProject(acctuniquesessionid=[$14],
eventtime=[$17])
LogicalFilter(condition=[OR(=($12, _UTF-8'Start'),
=($12, _UTF-8'Stop'))])
LogicalTableScan(table=[[radius_json]])
PinotLogicalExchange(distribution=[hash[0, 1]])
LogicalProject(acctuniquesessionid=[$0],
acctuniquesessionid0=[$3], $f40=[$1], $f41=[$4], $f42=[$5], $f43=[$2])
LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}])
PinotLogicalExchange(distribution=[hash[0, 1, 2,
3, 4, 5]])
LogicalAggregate(group=[{0, 2, 3, 4, 6, 7}])
LogicalJoin(condition=[AND(=($4, $0), >($1,
$5))], joinType=[inner])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(acctuniquesessionid=[$14], eventtime=[$17], $f40=[>($17, 0)],
$f43=[-($17, 1)])
LogicalFilter(condition=[AND(=($12,
_UTF-8'Stop'), >($17, 0))])
LogicalTableScan(table=[[radius_json]])
PinotLogicalExchange(distribution=[hash[0]])
LogicalProject(acctuniquesessionid=[$14], eventtime=[$17], $f41=[>($17, 0)],
$f42=[+($17, 1)])
LogicalFilter(condition=[AND(=($12,
_UTF-8'Start'), >($17, 0))])
LogicalTableScan(table=[[radius_json]])
```
The overall goal is to compute the Bytes sent and received by a device in a
defined timeframe from freeradius accounting data.
So, for e.g., we want to know how many bytes where transfered by device
"XYZ" in the month march.
We will have to do this for more than 1000000 devices.
This is the easiest request out of 4 requests. The other 3 requests will be
more complicated
We are still in the POC phase to see if Pinot is the rigth tool for it
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]