Hello! As I said earlier, the problem is most likely index selectivity. The SQL engine has to walk every record where coveringId = 166, perform a join for each of those records using the index on parentS2CellId, and only then can it filter by date.
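To illustrate the selectivity point against the local query quoted below, here is a minimal sketch of a composite index. The index name is hypothetical, and the column order is only an assumption based on the predicates in that query; I do not know the actual schema or which indexes already exist.

```sql
-- Hypothetical index name; columns are taken from the quoted query.
-- The equality column comes first, followed by the range columns.
CREATE INDEX IF NOT EXISTS idx_eventtheta_cell_date
    ON EventTheta (parentS2CellId, s2CellId, eventDate, eventHour);

-- Inspect the plan to see which index the engine actually picks.
-- Bind the same parameters you use for the real query.
EXPLAIN
SELECT EventTheta.theta
FROM EventTheta
WHERE parentS2CellId = ?
  AND s2CellId BETWEEN ? AND ?
  AND eventDate BETWEEN ? AND ?
  AND eventHour BETWEEN ? AND ?;
```

Keep in mind that in a B-tree index only the first range condition (here s2CellId) bounds the scan. The eventDate and eventHour conditions are then checked on every entry inside that range, so the number of entries visited can still grow with the number of days stored. If the date range is the more selective condition, it may be worth trying an order with eventDate ahead of s2CellId and comparing the plans.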
Regards,
--
Ilya Kasnacheev

Wed, Dec 19, 2018 at 23:07, kellan <[email protected]>:

> I haven't increased parallelism yet, but that's not a solution to my
> problem.
>
> I was able to speed up the query by running a ComputeTask that distributes
> work to the nodes in my cluster based on affinity key parentS2CellId, and
> then runs this local query for each matching parentS2CellId, s2CellId:
>
> SELECT EventTheta.theta
> FROM EventTheta
> WHERE parentS2CellId = ?
> AND s2CellId BETWEEN ? AND ?
> AND eventDate BETWEEN ? AND ?
> AND eventHour BETWEEN ? AND ?;
>
> With seven days of data in the database I get results back in about 750ms,
> which is on target, but when I increase my data set size to thirty days and
> run the same query (for both 7 days of data and 30 days of data), I'm up to
> 2-3s.
>
> 7 Days: 1913 ms => 8312 rows
> 30 Days: 1965 ms => 39038 rows
>
> The query execution time seems to be growing at roughly O(n) not O(log(n))
> time in relation to the size of the data set. I need to find a way to
> preserve my affinity key (parentS2CellId), while growing out the size of my
> data set. Is the problem with the order of the index, with the range
> queries on the index or something else?
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
