I have a table that has over 100K rows of GIS data, including a raster and an
insertdatetime timestamp columns. This table is continually loaded with data
with processes on the back side querying the data and populating other tables
depending on characteristics of the data. Today a row is read, processed,
then deleted. Vacuums occur frequently and are quite time consuming.
I figured a scheme of partitioning the table into 7 child tables, one for each
day of the week, the day derived from the 'insertdatetime' value. Thus, there
is the master, 'incoming_grid', and 7 children, 'incoming_grid_sun',
'incoming_grid_mon', 'incoming_grid_tue', etc. A job would be kicked off each
night after midnight that could then skip the current and the previous days and
truncate the tables for the remaining 5 days, speeding the whole process up
since there would be no deletes or frequent vacuums. I even figured on
creating a separate tablespace for the child tables. Here's my DDLs -
CREATE TABLE incoming_grid
(
rid integer NOT NULL DEFAULT nextval('incoming_grid_rid_seq'::regclass), --
record id
rast raster,
model character varying(80) NOT NULL,
parameter character varying(80) NOT NULL,
forecast bigint NOT NULL,
level character varying(128) NOT NULL,
insertdatetime timestamp without time zone DEFAULT (now())::timestamp without
time zone,
rundatetime timestamp without time zone NOT NULL,
CONSTRAINT incoming_grid_pkey PRIMARY KEY (forecast, parameter, level, model,
rundatetime)
);
CREATE INDEX "incoming_grid_Index1"
ON incoming_grid
USING btree
(parameter, forecast, level, rundatetime);
CREATE TABLE incoming_grid_sun
(
CHECK (to_char(insertdatetime, 'dy') = 'sun')
) INHERITS (incoming_grid);
CREATE INDEX "incoming_grid_sun_Index1"
ON incoming_grid_sun
USING btree
(parameter , forecast, level, rundatetime);
...
CREATE TABLE incoming_grid_sat
(
CHECK (to_char(insertdatetime, 'dy') = 'sat')
) INHERITS (incoming_grid);
CREATE INDEX "incoming_grid_sat_Index1"
ON incoming_grid_sat
USING btree
(parameter , forecast, level, rundatetime);
CREATE TRIGGER incoming_grid_load_trigger
BEFORE INSERT OR UPDATE
ON incoming_grid
FOR EACH ROW
EXECUTE PROCEDURE incoming_grid_load();
CREATE OR REPLACE FUNCTION incoming_grid_load()
RETURNS trigger AS
$BODY$
DECLARE
schema TEXT='children';
tablename TEXT;
query TEXT;
BEGIN
tablename = 'incoming_grid_' || to_char(NEW.insertdatetime, 'dy');
IF TG_OP = 'INSERT' THEN
query = 'INSERT INTO ' || tablename || ' SELECT $1.*;';
EXECUTE(query) USING NEW;
END IF;
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE STRICT
COST 100;
So far, so good. I loaded the master with 100 rows and they all went to their
correct child table.
My problem now is that SELECTs scan all child tables - the CHECK constraint
doesn't factor in. I tried the following to create another index on the
master, but it didn't work, either - all partitions are scanned. Any ideas on
how I can resolve this?
CREATE FUNCTION custom_to_char(timestamp) RETURNS text AS
$$ select to_char($1, 'dy'); $$
LANGUAGE sql immutable;
CREATE INDEX "incoming_grid_Index2"
ON incoming_grid
USING btree
(custom_to_char(insertdatetime));
Thanks,
Steve Erickson
NOTICE: This email message is for the sole use of the intended recipient(s) and
may contain confidential and privileged information. Any unauthorized use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all copies of
the original message.