Re: [GENERAL] Queries never returning...

2005-12-29 Thread John McCawley
It looks like my primary slowdown on that query was the timestamp 
trigger.  However, even after removing that trigger, and ensuring that 
all of my referencing tables had their foreign keys indexed, a simple 
update of one column on 244451 records took 14 minutes.  Given the specs 
I mentioned in the earlier email, is this to be expected?


I have also modified my entire schema to use the more updated constraint 
syntax.  I wrote a php script which can be used on a pg_dump.  I have 
attached it here in case anyone else ends up needing this:


-file fixkey.php---

#!/usr/bin/php
?
echo This script modifies a schema file generated by pg_dump and 
converts any pre 7.3 foreign key triggers to proper foreign key 
constraint syntax.  I have only tested it on a dumpfile generated by 
Postgres 8.0.3.  I have no idea if this will work on any other version, 
or with other people's wacky schemas.  This worked for me and that's all 
I can say.  Don't blame me if this script burns down your house.\n\n;


if( $argc != 3 ) {
   die(Usage: fixkey.php schemafile.db outfile.db\n);
}


$fp = fopen($argv[1], r);
$fpout = fopen($argv[2], w);

if( !$fp ) {
   die(Error opening ' . $argv[1] . ' for read\n);
}
if( !$fpout ) {
   die(Error opening ' . $argv[2] . ' for write\n);
}

while( $line = fgets($fp, 5000) )  {

   if( strstr($line, ConstraintTrigger_ ) ) {
   //echo Skipping comment $line\n;
   }
   else if( strstr($line, CREATE CONSTRAINT TRIGGER ) ) {
   $keyname = substr($line, strlen(CREATE CONSTRAINT TRIGGER ) );
   $keyname = trim($keyname);

   //Get 5 lines after declaration for foreign key info
   $line2 = fgets($fp, 5000);
   $line3 = fgets($fp, 5000);
   $line4 = fgets($fp, 5000);
   $line5 = fgets($fp, 5000);
   $line6 = fgets($fp, 5000);

   //Foreign keys are apparently made of up 3 triggers...we only 
care about the first one
   //I assume that the subsequent ones will be implicitly created 
by the new syntax

   if( !$key_array[$keyname] ) {
   //Store key name so we don't process it again
   $key_array[$keyname] = 1;

   //Referencing table name is in line 2
   $table = explode( , $line2);
   $table = $table[count($table)-1];
   $table = trim($table);

   //Referenced table is in line 3
   $parent = explode( , $line3);
   $parent = $parent[count($parent)-1];
   $parent = trim($parent);

   //Referencing column is on line 6
   $column = explode(,, $line6);
   $column = $column[4];
   $column = str_replace(', , $column);
   $column = trim($column);

   //Referenced column is on line 6
   $parentcolumn = explode(,, $line6);
   $parentcolumn = $parentcolumn[5];
   $parentcolumn = str_replace(', , $parentcolumn);
   $parentcolumn = str_replace(), , $parentcolumn);
   $parentcolumn = str_replace(;, , $parentcolumn);
   $parentcolumn = trim($parentcolumn);

   $sKeySQL = ALTER TABLE $table ADD CONSTRAINT $keyname 
FOREIGN KEY ($column) REFERENCES $parent ($parentcolumn) MATCH FULL;\n;


   echo $sKeySQL;
   fputs($fpout, \n\n . $sKeySQL . \n\n);
   }

   }
   else {
   fputs($fpout, $line);
   }
}

fclose($fp);
fclose($fpout);
?


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Queries never returning...

2005-12-28 Thread John McCawley
I am currently having a problem with a query never finishing (or at 
least not in a reasonable amount of time.)  I have had similar problems 
to this in Postgres over the past several years, and I have always found 
workarounds.  This time I'd like to actually understand what is happening.


I have two tables, tbl_claim and tbl_insured.  tbl_claim has a column 
insured_id which references the primary key in tbl_insured, also named 
insured_id.  Originally they were to have a one to many relationship 
(one tbl_insured record could potentionally be referenced by multiple 
records in tbl_claim).  In practice, however, they're essentially 1 to 
1.  There are a few stragglers, but it isn't really necessary to keep 
the integrity.  I want to move all data from tbl_insured into 
tbl_claim.  I added all pertinent columns to tbl_insured.  However, when 
I run the following query:



UPDATE tbl_claim SET
ins_lname = tbl_insured.ins_lname,
ins_fname = tbl_insured.ins_fname,
ins_mi = tbl_insured.ins_mi,
ins_add1 = tbl_insured.ins_add1,
ins_add2 = tbl_insured.ins_add2,
ins_city = tbl_insured.ins_city,
ins_state = tbl_insured.ins_state,
ins_zip = tbl_insured.ins_zip,
ins_phone = tbl_insured.ins_phone,
ins_altphone =tbl_insured.ins_altphone,
ins_cell = tbl_insured.ins_cell,
ins_pager = tbl_insured.ins_pager,
ins_fax = tbl_insured.ins_fax,
ins_email = tbl_insured.ins_email
FROM tbl_insured WHERE tbl_claim.insured_id = tbl_insured.insured_id;

it never comes back.  I have left it running for 30 minutes or so, but 
it never comes back (It is simply not acceptable to run this query for 
30 minutes, as this is part of a much larger system overhaul...I can't 
have my production system down for this long)


Here are my record counts:

select count(*) FROM tbl_insured;
count

242083
(1 row)

select count(*) FROM tbl_claim;
count

243121

select count(*) FROM tbl_claim INNER JOIN tbl_insured ON 
tbl_claim.insured_id = tbl_insured.insured_id;

count

243117

Here is my explain of the update query:

Hash Join  (cost=11033.04..94030.41 rows=313765 width=596)
  Hash Cond: (outer.insured_id = inner.insured_id)
  -  Seq Scan on tbl_claim  (cost=0.00..16240.29 rows=445829 width=442)
  -  Hash  (cost=4871.83..4871.83 rows=242083 width=158)
-  Seq Scan on tbl_insured  (cost=0.00..4871.83 rows=242083 
width=158)

(5 rows)

When I try an explain analyze, it just spins forever as well (I've never 
let it run for more than 30 minutes).


I could easily write a little PHP script to copy over the data, but I'd 
like to do it in SQL, and if not at least know WHY it isn't working.


I can't quite see why this is taking so long, and I don't know how I 
would go about diagnosing the problem, since the explain doesn't really 
seem like it should take too long, and explain analyze is hanging as 
well.  Note:  I did try a VACUUM FULL ANALYZE prior to running the query.










---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Queries never returning...

2005-12-28 Thread Tom Lane
John McCawley [EMAIL PROTECTED] writes:
 I am currently having a problem with a query never finishing (or at 
 least not in a reasonable amount of time.)  I have had similar problems 
 to this in Postgres over the past several years, and I have always found 
 workarounds.  This time I'd like to actually understand what is happening.

If that join SELECT comes back in a reasonable period of time, then the
UPDATE shouldn't take too long either --- they're both doing about the
same thing as far as performing the join goes.  The differential would
have to be index updates or triggers fired by the UPDATE.  I'd bet on
the latter, but since you've told us zip about your schema or what PG
version this is, it's impossible to speculate further...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Queries never returning...

2005-12-28 Thread Ian Harding
On 12/28/05, John McCawley [EMAIL PROTECTED] wrote:
 I am currently having a problem with a query never finishing (or at
 least not in a reasonable amount of time.)  I have had similar problems
 to this in Postgres over the past several years, and I have always found
 workarounds.  This time I'd like to actually understand what is happening.


I bet it would go faster if you dropped the RI constraints and any
other triggers first.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Queries never returning...

2005-12-28 Thread John McCawley

Tom Lane wrote:


The differential would have to be index updates or triggers fired by the 
UPDATE.  I'd bet on
the latter, but since you've told us zip about your schema or what PG
version this is, it's impossible to speculate further...
 



This is my development machine.  I'm running PostgreSQL 8.0.3 on a 
Pentium 4 3GHZ Gentoo machine with a 2.6.12 kernel, 1 gig of RAM.  
Everything is running on one big partition on a SATA drive.


You're right, it looks to be trigger related.  I did have a timestamp 
trigger, which I have removed, however it still has a bunch of foreign 
key triggers on it.  Even if I run:


update tbl_claim SET ins_lname = NULL;

I get the same problem. 

In looking at the \d tbl_claim output, there is something odd I 
notice.  I have many foreign keys (the claim_id in tbl_claim is 
referenced by 12 or so other tables, and tbl_claim references about 6 or 
so tables by their _id)  What is strange is that two of my newer foreign 
keys are shown as follows:


Foreign-key constraints:
   fk_tbl_claim_tbl_stormgroup_stormgroup_id FOREIGN KEY 
(stormgroup_id) REFERENCES tbl_stormgroup(stormgroup_id) MATCH FULL
   fk_tbl_claim_emp_id FOREIGN KEY (emp_id) REFERENCES 
tbl_employee(emp_id) MATCH FULL


Which matches the syntax I used to create them, however all of my older 
foreign keys are under the Triggers section and are defined as follows:


   RI_ConstraintTrigger_23354821 AFTER INSERT OR UPDATE ON tbl_claim 
FROM tbl_agents NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE 
PROCEDURE RI_FKey_check_ins('fk_tbl_claim_tbl_agents_fk', 'tbl_claim', 
'tbl_agents', 'UNSPECIFIED', 'agent_id', 'agent_id')


Why are they different?  Should all of my foreign keys look like the 
first two, or are they logically identical?  I assume the difference is 
because the older keys were initially created in a 7.x version of 
Postgres, and got into 8.x from a pg_dumpall


Below is a full \d dump of this table.  Obviously, there are a ton of 
triggers on it, but how would I avoid this in a heavily referenced 
table? (tbl_claim is the core table of this entire system).


Certainly, dropping all of the triggers, indexes, etc. would solve the 
problem and allow me to update, but I'd like a more elegant solution.  I 
don't have THAT many records in this table, and I wouldn't expect a 
simple update of a column to hang everything.  Should I modify my 
foreign key triggers?


-


Table public.tbl_claim
 Column  |Type |
Modifiers
--+-+--
claim_id | integer | not null default 
nextval('tbl_claim_claim_id_key'::text)

worlfilenum  | character varying(12)   |
createby | integer |
claimnum | character varying(50)   |
insured_id   | integer |
comaster_id  | integer |
clntmaster_id| integer |
agent_id | integer |
storm_id | integer |
claim_createdate | timestamp with time zone|
claim_lossdate   | timestamp with time zone|
claim_mailer | timestamp with time zone|
claim_contdate   | timestamp with time zone|
claim_inpecdate  | timestamp with time zone|
claim_closedate  | timestamp with time zone|
claim_clntnum| character varying(25)   |
claim_deductible | double precision|
clmtype_id   | integer |
subrogation  | character varying(10)   |
peril_id | integer |
rcv  | double precision|
policydate   | timestamp with time zone|
limita   | double precision|
limitb   | double precision|
limitc   | double precision|
limitd   | double precision|
deductible   | double precision|
riskadd  | character varying(100)  |
riskcity | character varying(50)   |
riskstate| character varying(50)   |
riskzip  | character varying(50)   |
secinjury| character varying(10)   |
searchtext   | character varying(32)   |
lossreserves | double precision|
expensereserves  | double precision|
notes| character varying(512)  |
active   | integer | default 1
policyexpiredate | timestamp with time zone|
deductible2  | double precision|
salvage  | integer |
siu  | integer |
policynum| character varying(32)   |
groupnumber  | integer |
stormgroup_id| integer |
printed  | 

Re: [GENERAL] Queries never returning...

2005-12-28 Thread Tom Lane
John McCawley [EMAIL PROTECTED] writes:
 In looking at the \d tbl_claim output, there is something odd I 
 notice.  I have many foreign keys (the claim_id in tbl_claim is 
 referenced by 12 or so other tables, and tbl_claim references about 6 or 
 so tables by their _id)

It seems a good bet that the poor performance is due to lack of indexes
on the columns that reference tbl_claim from other tables.  PG enforces
an index on the referenced side of an FK constraint, but not on the
referencing side.  This is OK if you mostly update the referencing
table, but it hurts for updates and deletes on the referenced table.
Try creating those indexes.  (You'll likely need to start a fresh
psql session afterwards to make sure that the RI mechanism notices
the new indexes.)

 Which matches the syntax I used to create them, however all of my older 
 foreign keys are under the Triggers section and are defined as follows:

 RI_ConstraintTrigger_23354821 AFTER INSERT OR UPDATE ON tbl_claim 
 FROM tbl_agents NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE 
 PROCEDURE RI_FKey_check_ins('fk_tbl_claim_tbl_agents_fk', 'tbl_claim', 
 'tbl_agents', 'UNSPECIFIED', 'agent_id', 'agent_id')

These are probably inherited from some pre-7.3-or-so schema?  I'd
suggest dropping those triggers and recreating the constraints with
ALTER TABLE ADD CONSTRAINT.  You could also look at contrib/adddepend/
which is alleged to fix such things automatically (but I wouldn't
trust it too much, because it's not been maintained since 7.3).
This won't make any difference to performance, but it'll clean up your
schema into a more future-proof form.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend