>>>Ryan Bradetich said:
 > the table would look like:
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
 > 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
 > etc...
 > So I do not need the anomaly to be part of the index, I only need it to 
 > I agree with you, that I would not normally add the anomaly to the
 > index, except for the unique row requirement.  Thinking about it now,
 > maybe I should guarantee unique rows via a check constraint...
 > Thanks for making me think about this in a different way!

(sorry this is a bit long)


I use somewhat similarly structured data (archived records of various events), 
and when the database was set up (back when this baby was called postgres95), I 
too used indexes on all possible fields.

My database consists of an 'operations' table, which holds data for the last x 
days (for example), and several tables of archived records (per month or per 
year - see below). The operations table gets frequent updates, which add new 
data. Data is never modified, but lookups are made often. The archived tables 
are generated once and for all from the operations table (possibly to be merged 
in the future, but I haven't yet made up my mind on this) - after that, access 
is read-only, although sufficiently frequent.

What I have found over the many years of operating this database on different 
PostgreSQL versions and hardware is that indexes have considerable cost. :)
So does the need not to miss anything from the operations table (that is, to 
collect data from many places and have it all in there).

I ended up with only a few indexes on the operations table, because the 
processes that fill it up do minimal lookups to see if data is already in the 
table, and insert it if not. Then, at regular intervals, the table is cleaned 
up - that is, a process that removes the duplicates is run. This unfortunately 
costs OIDs, but I found no other reasonable way to do fast inserts. Perhaps the 
best way is to create the table WITHOUT OIDS (but wouldn't this still waste 
OIDs?), use COPY, and then clean up afterwards?
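To illustrate, the cleanup I mean amounts to something like this (the column 
names here are made up; keep the oldest copy, i.e. the lowest oid, in each 
group of duplicates):

    -- Hypothetical column names, for illustration only.
    -- Keep the row with the smallest oid in each duplicate group.
    DELETE FROM operations
    WHERE oid NOT IN (
        SELECT min(oid)
        FROM operations
        GROUP BY host, stamp, code, message
    );

This is also why the table cannot simply be created WITHOUT OIDS as things 
stand - the oid is what tells two otherwise identical rows apart.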

The archived tables are generated, then cleaned up. Then, as Tom suggested, 
indexes are put on the archived tables, only for the fields that are used in 
queries. Once such a table is created, there is no way duplicated data will 
appear, as nothing is inserted into it. Therefore there is no need for a 
UNIQUE index.
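Roughly, a per-month archive is built like this (table and column names are 
made up; your field list will differ):

    -- Build the archive once from the operations table...
    CREATE TABLE ops_2003_02 AS
        SELECT * FROM operations
        WHERE stamp >= '2003-02-01' AND stamp < '2003-03-01';

    -- ...then add plain (non-unique) indexes, only on the
    -- fields actually used in queries.
    CREATE INDEX ops_2003_02_stamp_idx ON ops_2003_02 (stamp);
    CREATE INDEX ops_2003_02_code_idx  ON ops_2003_02 (code);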

If you need to have one large 'history' table, then perhaps you will just have 
to do a (slow :)) select for each record before each insert, or just insert the 
data and then run the cleanup process.
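The select-before-insert variant can be folded into a single statement, 
something like this (names and values are only an example):

    -- Insert the record only if an identical one is not already there.
    INSERT INTO history (host, stamp, code, message)
    SELECT 1, now(), 'p101', 'user x has an invalid shell.'
    WHERE NOT EXISTS (
        SELECT 1 FROM history
        WHERE host = 1
          AND code = 'p101'
          AND message = 'user x has an invalid shell.'
    );

This still pays for a scan of history on every insert unless the matched 
fields are indexed.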

