I haven't increased parallelism yet, but that's not a solution to my problem.
I was able to speed up the query by running a ComputeTask that distributes work to the nodes in my cluster based on the affinity key parentS2CellId, and then runs this local query for each matching parentS2CellId, s2CellId:

    SELECT EventTheta.theta
    FROM EventTheta
    WHERE parentS2CellId = ?
      AND s2CellId BETWEEN ? AND ?
      AND eventDate BETWEEN ? AND ?
      AND eventHour BETWEEN ? AND ?;

With seven days of data in the database I get results back in about 750ms, which is on target. But when I increase my data set size to thirty days and run the same query (for both 7 days of data and 30 days of data), I'm up to 2-3s:

7 Days: 1913 ms => 8312 rows
30 Days: 1965 ms => 39038 rows

The query execution time seems to be growing at roughly O(n), not O(log(n)), in relation to the size of the data set. I need to find a way to preserve my affinity key (parentS2CellId) while growing the size of my data set.

Is the problem with the order of the index, with the range queries on the index, or something else?
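To make the "range queries on the index" suspicion concrete, here is a minimal, self-contained sketch. It does not use Ignite at all. It models a composite index as a sorted set ordered by (s2CellId, eventDay), and it assumes parentS2CellId is already fixed by the equality predicate. The index order, the class name IndexScanSketch, and the synthetic data (one entry per cell per day) are all assumptions for illustration, not the actual schema. Under that assumption, a range on the leading column means the day filter can only be applied entry by entry, so the number of entries visited grows with the number of days stored even when the number of matching rows stays the same.

```java
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical model of a composite index on (s2CellId, eventDay).
// Not Ignite code: a sorted set stands in for the B-tree.
final class IndexScanSketch {
    // Entries are {s2CellId, eventDay}, ordered by cell first, then by day.
    static final Comparator<long[]> ORDER =
        Comparator.<long[]>comparingLong(k -> k[0]).thenComparingLong(k -> k[1]);

    // Builds a synthetic index with one entry per cell per stored day.
    static NavigableSet<long[]> build(int cells, int days) {
        NavigableSet<long[]> index = new TreeSet<>(ORDER);
        for (long cell = 0; cell < cells; cell++) {
            for (long day = 0; day < days; day++) {
                index.add(new long[] {cell, day});
            }
        }
        return index;
    }

    // Models: s2CellId BETWEEN cellLo AND cellHi AND eventDay BETWEEN dayLo AND dayHi.
    // Returns {entries visited, entries matched}.
    static long[] scan(NavigableSet<long[]> index,
                       long cellLo, long cellHi, long dayLo, long dayHi) {
        long visited = 0;
        long matched = 0;
        // Only the leading range (cell) narrows the scan; every entry inside it
        // is visited, and the day predicate is checked per entry.
        for (long[] k : index.subSet(new long[] {cellLo, Long.MIN_VALUE}, true,
                                     new long[] {cellHi, Long.MAX_VALUE}, true)) {
            visited++;
            if (k[1] >= dayLo && k[1] <= dayHi) {
                matched++;
            }
        }
        return new long[] {visited, matched};
    }

    public static void main(String[] args) {
        // Same query (10 cells, first 7 days) against 7 and 30 days of stored data.
        for (int days : new int[] {7, 30}) {
            long[] r = scan(build(100, days), 10, 19, 0, 6);
            System.out.println(days + " days stored: visited=" + r[0] + ", matched=" + r[1]);
        }
    }
}
```

In this toy model the same 7-day query matches 70 entries in both cases, but visits 70 entries with 7 days stored and 300 entries with 30 days stored. Whether the real index behaves this way depends on its actual column order and on the plan the SQL engine chooses, which EXPLAIN on the query should show.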
