Hi *Requirement :- * We need to retrieve latest health of around 1.5 million objects for a given time.
*Implementation :-* We are storing hourly data of each object in single row. Given below is the schema :- *CREATE TABLE health_timeseries (* * mobid text NOT NULL, hour bigint NOT NULL, health real[] );* mobId - Object ID hour - Epoch hour health - Array of health values for a given hour of that object. Each object has 2 hours of health data (i.e. 2 rows for each object) so total no. of rows is around 3 million. With the above approach the query to retrieve the latest health of all objects for a given time duration is taking around *85 seconds*. I have tried to increase the work_mem, effective_cache, shared_buffer to 4 GB of PostgreSQL but still there was no improvement in the query execution time. *Query :-* *select distinct on (health_timeseries.mobid) mobid, health_timeseries.health, health_timeseries.hour from health_timeseries where hour >=(1505211054000/(3600*1000))-1 and hour <= 1505211054000/(3600*1000) ORDER BY health_timeseries.mobid DESC, health_timeseries.hour DESC;* *Hardware Configuration of PostgreSQL VM :-* 1. OS - Centos. 2. Postgresql version - 9.6.2 3. RAM - 16 GB RAM 4. CPU - 8 vCPU Please let us know the hardware configuration of PostgreSQL for such huge dataset? And also let us know if there is any better schema/query to retrieve this data? Thanks and Regards Subramaniam