I currently have a database doing something very similar. I setup partition tables with predictable names based on the the data's timestamp week number eg: (Data_YYYY_WI).
I have a tigger on the parent partition table to redirect data to the correct partition( tablename:='Data_' || to_char('$NEW(ts)'::timestamptz, 'IYYY_IW') ) . then I use dynamic sql to do the insert. I did some optimization by writting it in pl/TCL and using global variables to store prepared insert statements. Most queries for me are based on the date and we have decent performance with our current setup. For last/current sensor data we just store the last dataID in the sensor record. I haven't thought of a better way yet. After batch inserts we caculate the last reading for each participating sensorID inserted. With partition tables we struggled with the query to get the lastest data : select * from "Data" where "sensorID"=x order by ts limit 1 -- for parition tables. See ( http://archives.postgresql.org/pgsql-performance/2008-11/msg00284.php) On Thu, May 28, 2009 at 7:55 AM, Ivan Voras <ivo...@freebsd.org> wrote: > 2009/5/28 Heikki Linnakangas <heikki.linnakan...@enterprisedb.com>: > > Ivan Voras wrote: > >> > >> I need to store data about sensor readings. There is a known (but > >> configurable) number of sensors which can send update data at any time. > >> The "current" state needs to be kept but also all historical records. > >> I'm trying to decide between these two designs: > >> > >> 1) create a table for "current" data, one record for each sensor, update > >> this table when a sensor reading arrives, create a trigger that would > >> transfer old record data to a history table (of basically the same > >> structure) > >> 2) write only to the history table, use relatively complex queries or > >> outside-the-database magic to determine what the "current" values of the > >> sensors are. > > > > 3) write only to the history table, but have an INSERT trigger to update > the > > table with "current" data. This has the same performance characteristics > as > > 1, but let's you design your application like 2. > > Excellent idea! > > > I think I'd choose this approach (or 2), since it can handle out-of-order > or > > delayed arrival of sensor readings gracefully (assuming they are > timestamped > > at source). > > It seems like your approach is currently the winner. > > > If you go with 2, I'd recommend to still create a view to encapsulate the > > complex query for the current values, to make the application development > > simpler. And if it gets slow, you can easily swap the view with a table, > > updated with triggers or periodically, without changing the application. > > > >> The volume of sensor data is potentially huge, on the order of 500,000 > >> updates per hour. Sensor data is few numeric(15,5) numbers. > > > > Whichever design you choose, you should also consider partitioning the > data. > > I'll look into it, but we'll first see if we can get away with > limiting the time the data needs to be available. > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >