Folks:

I'm running a query designed to populate a foreign key column in a table of approximately 10 million records (I mentioned this in an earlier posting). The table is called "indethom", and each row contains a single word from the works of St. Thomas Aquinas, along with grammatical data about the word form and (most importantly for my current problem) a set of columns identifying the particular work/section/paragraph in which the word appears.

This database is completely unnormalized, and I'm working on some basic normalization, beginning with a new table called "s2.sectiones" which (naturally) contains a complete listing of all the sections of all the works of St. Thomas. I will then eliminate this information from the original "indethom" table, replacing it with the foreign key I am currently generating.
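For context, the follow-up steps I have in mind once sectref is fully populated look roughly like the sketch below. This assumes I first make s2.sectiones.clavis a primary key (it isn't one yet), and the column drops would of course be repeated for the rest of the refere1a through refere4b columns:

-- sketch only: enforce the relationship, then drop the redundant columns
alter table s2.sectiones add primary key (clavis);

alter table indethom
    add constraint indethom_sectref_fkey
    foreign key (sectref) references s2.sectiones (clavis);

alter table indethom drop column refere4a;
alter table indethom drop column refere4b;
-- ... and so on for the remaining reference columns shared with s2.sectiones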

** My question is whether I am already getting maximal speed out of PostgreSQL, or whether I need to perform further optimizations. I am currently getting about 200,000 updates per hour, so updating all 10 million rows will take roughly 50 hours, which seems a bit much.
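For what it's worth, the 200,000-per-hour figure comes from bumping a sequence on every updated row (that's the query_counter assignment in the query below) and polling that sequence from a second session while the update runs:

select last_value from s2.query_counter_seq;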

Here's the query I am running:
update indethom
   set query_counter = nextval('s2.query_counter_seq'),  -- just for keeping track of how fast the query is running
       sectref = (select clavis
                    from s2.sectiones
                   where s2.sectiones.nomeoper = indethom.nomeoper
                     and s2.sectiones.refere1a = indethom.refere1a and s2.sectiones.refere1b = indethom.refere1b
                     and s2.sectiones.refere2a = indethom.refere2a and s2.sectiones.refere2b = indethom.refere2b
                     and s2.sectiones.refere3a = indethom.refere3a and s2.sectiones.refere3b = indethom.refere3b
                     and s2.sectiones.refere4a = indethom.refere4a and s2.sectiones.refere4b = indethom.refere4b);

Here's the query plan:
QUERY PLAN
-------------------------------------------------------------------------------------
 Seq Scan on indethom  (cost=0.00..1310352.72 rows=10631972 width=212)
   SubPlan
     ->  Index Scan using sectiones_ndx on sectiones  (cost=0.00..6.03 rows=1 width=4)
           Index Cond: ((nomeoper = $0) AND (refere1a = $1) AND (refere1b = $2) AND (refere2a = $3) AND (refere2b = $4) AND (refere3a = $5) AND (refere3b = $6) AND (refere4a = $7) AND (refere4b = $8))
(4 rows)

Note: I have just performed a VACUUM ANALYZE on the indethom table, as suggested on this list.
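One thing I am not sure about is whether rewriting the correlated subselect as a join (UPDATE ... FROM) would plan any better than the above. If it helps the discussion, this is roughly what I had in mind (untested, and note that unlike the subselect version it only touches rows that actually have a match in s2.sectiones):

update indethom
   set query_counter = nextval('s2.query_counter_seq'),
       sectref = s.clavis
  from s2.sectiones s
 where s.nomeoper = indethom.nomeoper
   and s.refere1a = indethom.refere1a and s.refere1b = indethom.refere1b
   and s.refere2a = indethom.refere2a and s.refere2b = indethom.refere2b
   and s.refere3a = indethom.refere3a and s.refere3b = indethom.refere3b
   and s.refere4a = indethom.refere4a and s.refere4b = indethom.refere4b;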

Here's the structure of the s2.sectiones table:
it=> \d s2.sectiones
Table "s2.sectiones"
Column | Type | Modifiers
----------+--------------+-----------
nomeoper | character(3) |
refere1a | character(2) |
refere1b | character(2) |
refere2a | character(2) |
refere2b | character(2) |
refere3a | character(2) |
refere3b | character(2) |
refere4a | character(2) |
refere4b | character(2) |
clavis | integer |
Indexes: sectiones_ndx btree (nomeoper, refere1a, refere1b, refere2a, refere2b, refere3a, refere3b, refere4a, refere4b)

Finally, here is the structure of indethom (some non-relevant columns not shown):
it=> \d indethom
Table "public.indethom"
Column | Type | Modifiers
---------------+-----------------------+-----------
numeoper | smallint | not null
nomeoper | character(3) | not null
editcrit | character(1) |
refere1a | character(2) |
refere1b | character(2) |
refere2a | character(2) |
refere2b | character(2) |
refere3a | character(2) |
refere3b | character(2) |
refere4a | character(2) |
refere4b | character(2) |
refere5a | character(2) | not null
refere5b | smallint | not null
referen6 | smallint | not null
... several columns skipped ...
verbum | character varying(22) | not null
... other columns skipped ...
poslinop | integer | not null
posverli | smallint | not null
posverop | integer | not null
clavis | integer | not null
articref | integer |
sectref | integer |
query_counter | integer |
Indexes: indethom_pkey primary key btree (clavis),
indethom_articulus_ndx btree (nomeoper, refere1a, refere1b, refere2a, refere2b, refere3a, refere3b),
indethom_sectio_ndx btree (nomeoper, refere1a, refere1b, refere2a, refere2b, refere3a, refere3b, refere4a, refere4b),
verbum_ndx btree (verbum)

Thanks for your assistance!
-Erik Norvelle
