Let me make my problem clearer. The requirement is to log data from a set of objects continuously. For example, each object may be a mobile phone that reports its location every 30 seconds. To record the historical trace, I created a table like this:

    CREATE TABLE log_table (
        id         integer NOT NULL,
        data_type  integer NOT NULL,
        data_value double precision,
        ts         timestamp with time zone NOT NULL,
        CONSTRAINT log_table_pkey PRIMARY KEY (id, data_type, ts)
    );

In my location-log example, data_type distinguishes longitude from latitude.
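To make the data model concrete, a single report from one phone becomes two rows, one per coordinate. The id, coordinates, and timestamp below are made-up values, and I am assuming for the sake of example that data_type 1 stands for longitude and 2 for latitude:

    INSERT INTO log_table (id, data_type, data_value, ts)
    VALUES (42, 1, 121.5654, '2010-07-20 10:00:00+08'),  -- longitude
           (42, 2,  25.0330, '2010-07-20 10:00:00+08');  -- latitude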
I created the primary key (id, data_type, ts) to make my queries more efficient. The main type of query asks for the latest data_value of a given data_type for a given id at or before a given timestamp. For this kind of query I use the following SQL statement:

    SELECT *
    FROM log_table
    WHERE id = [given id]
      AND data_type = [longitude code]
      AND ts = (SELECT max(ts)
                FROM log_table
                WHERE id = [given id]
                  AND data_type = [longitude code]
                  AND ts <= [given timestamp]);

According to my evaluation, its performance is acceptable. However, I am more concerned about the performance of the insert operations. As I mentioned, log_table keeps growing, so I decided to partition it. Currently I partition it by date and keep only 60 days of data. This partitioning helps. But when I also partition by data_type (in my case the number of data_type values is limited, say 10), insert performance degrades. I guess this is caused by multiple vacuum/analyze runs on these per-data_type partitions. On the other hand, if I keep all data_type logs together, I expect insert performance will also degrade once I expand the system to support more mobile phones or more data_type values.

This is my current situation. Please give me some hints on improving the performance (especially the insert part). Some concrete SQL sketches of what I described above are appended below the quoted message.

kuopo.

On Mon, Jul 19, 2010 at 11:37 PM, Jorge Montero <
jorge_mont...@homedecorators.com> wrote:

> Large tables, by themselves, are not necessarily a problem. The problem is
> what you might be trying to do with them. Depending on the operations you
> are trying to do, partitioning the table might help performance or make it
> worse.
>
> What kind of queries are you running? How many days of history are you
> keeping? Could you post an explain analyze output of a query that is being
> problematic?
> Given the amount of data you hint about, your server configuration, and
> custom statistic targets for the big tables in question would be useful.
>
> >>> kuopo <sp...@cs.nctu.edu.tw> 7/19/2010 1:27 AM >>>
> Hi,
>
> I have a situation to handle a log table which would accumulate a
> large amount of logs. This table only involves insert and query
> operations. To limit the table size, I tried to split this table by
> date. However, the number of the logs is still large (46 million
> records per day). To further limit its size, I tried to split this log
> table by log type. However, this action does not improve the
> performance. It is much slower than the big table solution. I guess
> this is because I need to pay more cost on the auto-vacuum/analyze for
> all split tables.
>
> Can anyone comment on this situation? Thanks in advance.
>
>
> kuopo.
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
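First, the latest-value query. As far as I understand, an equivalent formulation lets the planner walk the (id, data_type, ts) primary-key index backwards and stop at the first matching row; placeholders are as above, with [longitude code] standing for whatever integer encodes longitude:

    SELECT *
    FROM log_table
    WHERE id = [given id]
      AND data_type = [longitude code]
      AND ts <= [given timestamp]
    ORDER BY ts DESC
    LIMIT 1;

Since (id, data_type, ts) is unique, this should return the same row as the max(ts) subquery form.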
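Second, the daily partitioning. What I mean is roughly the inheritance-based scheme from the PostgreSQL documentation; the child-table name, date range, and trigger below are a simplified illustration rather than my exact setup:

    -- one child table per day, with its own primary key and a CHECK
    -- constraint so constraint_exclusion can skip it at query time
    CREATE TABLE log_table_20100720 (
        CHECK (ts >= TIMESTAMP WITH TIME ZONE '2010-07-20 00:00:00+08'
           AND ts <  TIMESTAMP WITH TIME ZONE '2010-07-21 00:00:00+08'),
        CONSTRAINT log_table_20100720_pkey PRIMARY KEY (id, data_type, ts)
    ) INHERITS (log_table);

    -- route rows inserted into the parent to the current day's child
    CREATE OR REPLACE FUNCTION log_table_insert_trigger()
    RETURNS trigger AS $$
    BEGIN
        IF NEW.ts >= TIMESTAMP WITH TIME ZONE '2010-07-20 00:00:00+08'
           AND NEW.ts < TIMESTAMP WITH TIME ZONE '2010-07-21 00:00:00+08' THEN
            INSERT INTO log_table_20100720 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'ts % is out of range of existing partitions', NEW.ts;
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER log_table_insert
        BEFORE INSERT ON log_table
        FOR EACH ROW EXECUTE PROCEDURE log_table_insert_trigger();

Dropping a day is then just a DROP TABLE on that child. If the per-row trigger overhead matters for insert rate, inserting directly into the current day's child table from the application avoids it.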
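Third, on the vacuum/analyze cost I suspect. I assume the per-table autovacuum storage parameters (available since 8.4) would be the knob to make autovacuum/analyze less aggressive on the hot partitions; the thresholds below are only placeholders, not something I have tested:

    -- trigger autovacuum/analyze on absolute row counts instead of a
    -- fraction of a large table; the numbers are illustrative
    ALTER TABLE log_table_20100720 SET (
        autovacuum_vacuum_scale_factor  = 0.0,
        autovacuum_vacuum_threshold     = 5000000,
        autovacuum_analyze_scale_factor = 0.0,
        autovacuum_analyze_threshold    = 1000000
    );

For an insert-only partition, autovacuum mostly triggers analyze rather than vacuum, so running a manual ANALYZE once a partition stops receiving data might be another option.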