Hello! PUBLIC.EVENTTHETA_PARENTS2CELLID_ASC_S2CELLID_ASC_EVENTDATE_ASC_EVENTHOUR_ASC_IDX Looks like this index is not selective enough for efficient filtering by EVENTDATE. So you have to scan a lot of entries regardless of date range.
You can set local=true on SQL query to only return results from local node. Compute tasks can't use SQL indexes outside of SQL queries so it might not be so efficient. Regards, Regards, -- Ilya Kasnacheev пн, 17 дек. 2018 г. в 07:02, kellan <[email protected]>: > I'm trying to understand the mechanics of collocated joins in Ignite (I'm > running 2.7). I have two queries that are only differentiated by the date > range that return in the same amount of time, although one pulls data for > thirty days and the other for only one. > > > SELECT S2CellCovering.coveringId, COUNT(*) > FROM S2CellCovering > INNER JOIN EventTheta > ON EventTheta.parentS2CellId = S2CellCovering.parentS2CellId > AND EventTheta.s2CellId BETWEEN S2CellCovering.minS2CellId AND > S2CellCovering.maxS2CellId > AND EventTheta.eventDate BETWEEN '2018-10-02' AND '2018-10-02' > AND EventTheta.eventHour BETWEEN -1 AND -1 > WHERE S2CellCovering.coveringId = 166 > GROUP BY S2CellCovering.coveringId; > > COVERINGID 166 > COUNT(*) 5629 > > COVERINGID 166 > COUNT(*) 6407 > > COVERINGID 166 > COUNT(*) 9030 > > COVERINGID 166 > COUNT(*) 8965 > > 4 rows selected (12.389 seconds) > > > SELECT S2CellCovering.coveringId, COUNT(*) > FROM S2CellCovering > INNER JOIN EventTheta > ON EventTheta.parentS2CellId = S2CellCovering.parentS2CellId > AND EventTheta.s2CellId BETWEEN S2CellCovering.minS2CellId AND > S2CellCovering.maxS2CellId > AND EventTheta.eventDate BETWEEN '2018-10-02' AND '2018-10-31' > AND EventTheta.eventHour BETWEEN -1 AND -1 > WHERE S2CellCovering.coveringId = 166 > GROUP BY S2CellCovering.coveringId; > > COVERINGID 166 > COUNT(*) 149690 > > COVERINGID 166 > COUNT(*) 161104 > > COVERINGID 166 > COUNT(*) 215174 > > COVERINGID 166 > COUNT(*) 218183 > > 4 rows selected (12.578 seconds) > > > When I run EXPLAIN it looks like my rows are all properly indexed, as far > as > I can tell: > > > PLAN SELECT > __Z0.COVERINGID AS __C0_0, > COUNT(*) AS __C0_1 > FROM PUBLIC.S2CELLCOVERING __Z0 > /* PUBLIC.S2CELLCOVERING_COVERINGID_ASC_IDX: COVERINGID = 166 */ > /* WHERE __Z0.COVERINGID = 166 > */ > INNER JOIN PUBLIC.EVENTTHETA __Z1 > /* > > PUBLIC.EVENTTHETA_PARENTS2CELLID_ASC_S2CELLID_ASC_EVENTDATE_ASC_EVENTHOUR_ASC_IDX: > EVENTHOUR >= -1 > AND EVENTHOUR <= -1 > AND EVENTDATE >= DATE '2018-10-02' > AND EVENTDATE <= DATE '2018-10-31' > AND PARENTS2CELLID = __Z0.PARENTS2CELLID > AND S2CELLID >= __Z0.MINS2CELLID > AND S2CELLID <= __Z0.MAXS2CELLID > */ > ON 1=1 > WHERE (__Z0.COVERINGID = 166) > AND (((__Z1.EVENTHOUR >= -1) > AND (__Z1.EVENTHOUR <= -1)) > AND (((__Z1.EVENTDATE >= DATE '2018-10-02') > AND (__Z1.EVENTDATE <= DATE '2018-10-31')) > AND ((__Z1.PARENTS2CELLID = __Z0.PARENTS2CELLID) > AND ((__Z0.MINS2CELLID <= __Z1.S2CELLID) > AND (__Z0.MAXS2CELLID >= __Z1.S2CELLID))))) > GROUP BY __Z0.COVERINGID > /* group sorted */ > > PLAN SELECT > __C0_0 AS COVERINGID, > __C0_1 AS __C0_1 > FROM PUBLIC.__T0 > /* PUBLIC."merge_scan" */ > > > The EVENTTHETA and S2CELLCOVERING keys should be collocated on the same > nodes. Here are my keys: > > > case class EventThetaKey( > @(AffinityKeyMapped@field) > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_eventtheta", > order = 4, > descending = true > ) > ) > ) parentS2CellId: Long, > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_eventtheta", > order = 3, > descending = true > ) > ) > ) s2CellId: Long, > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_eventtheta", > order = 2, > descending = true > ) > ) > ) eventDate: Date, > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_eventtheta", > order = 1, > descending = true > ) > ) > ) eventHour: Byte > ) > > > case class S2CellCoveringKey( > @(AffinityKeyMapped@field) > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_s2cellcovering", > order = 3, > descending = true > ) > ) > ) parentS2CellId: Long, > @(QuerySqlField@field)( > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_s2cellcovering", > order = 2, > descending = true > ) > ) > ) s2CellId: Long, > @(QuerySqlField@field)( > index = true, > orderedGroups = Array( > new (QuerySqlField.Group@field)( > name = "pk_public_s2cellcovering", > order = 1, > descending = true > ) > ) > ) coveringId: Long > ) > > > Based on the fact that I am able to scan 30x more rows in roughly the same > amount of time, I gather that at one of my indices is set up properly, but > both queries are taking over 12s to run with plenty of compute available. > What am I missing here? I'm assuming that since the tables are joined on a > shared affinity key the join is collocated, but maybe I'm wrong. > > On another note, is there a way to run this query in a compute task such > that I'm only returning results from the local node? > > > > > > > > -- > Sent from: http://apache-ignite-users.70518.x6.nabble.com/ >
