Can one extrapolate future performance expectations for an ever-growing table from a given (non-trivial) data set, and if so, with what curve? Corollary: what curve should one expect query execution time to follow as a function of the number of data rows (hardware and load staying constant)?
I have user and group information on system usage, and I would like to be able to do year-to-date counts per user for a given group id. But with only one business quarter of data, the query is already taking between 10 and 60+ seconds, depending both on the size of the group and on the group's total usage. Groups typically have 10-100 users and account for 20K-80K records in a 9M-record data set. The group_id column is indexed, but it is not the primary index. (Sad note: two pseudo-groups account for 50 percent of the total records, IIRC; those two will never be used in the usage-by-group query below.)

This is a single-table query:

    select user_id, element_type, count(*)
      from dataset
     where group_id = N
     group by user_id, element_type
     order by user_id, element_type

Is this the sort of situation that might benefit from increasing the number of histogram bins (i.e., alter table ... alter column group_id set statistics N, for some N > 10)?

Any and all pointers appreciated,
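P.S. To make the statistics question concrete, the change I am contemplating is roughly the following, assuming PostgreSQL-style syntax (the target value of 100 is just an example):

    -- raise the per-column statistics target so that analyze collects
    -- more histogram bins / most-common-value entries for group_id,
    -- then re-analyze so the planner actually sees the new statistics
    alter table dataset alter column group_id set statistics 100;
    analyze dataset;

The thought is that with two pseudo-groups holding half the rows, a finer-grained histogram might give the planner better row estimates for the small groups and make it more willing to use the group_id index.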
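P.P.S. For the extrapolation question, my current plan is to time the same query against progressively larger slices of the real data and see which curve the timings follow. A rough sketch, where dataset_1m is a made-up name and 42 a placeholder group id:

    -- copy a slice of the real table, index it the same way, and
    -- refresh its statistics so the comparison is fair
    create table dataset_1m as select * from dataset limit 1000000;
    create index dataset_1m_group_id_idx on dataset_1m (group_id);
    analyze dataset_1m;

    -- time the query at this size; repeat at 2M, 4M, ... rows
    explain analyze
    select user_id, element_type, count(*)
      from dataset_1m
     where group_id = 42
     group by user_id, element_type
     order by user_id, element_type;

(I realize an unordered limit may not preserve each group's share of the rows, so the slices would only be approximate.)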