Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Albe Laurenz
Clemens Schwaighofer wrote:
 I sometimes have a problem with conversion of encodings, e.g. from UTF-8
 to ShiftJIS:

 ERROR:  character 0xf0a0aeb7 of encoding UTF8 has no
 equivalent in SJIS

 I have no idea what character this is, I cannot view it in my
 browser, etc.

It translates to Unicode 10BB7, which is not defined.
I guess that is not intended; can you guess what the character(s) should be?

 If I run the conversion through PHP with mb_convert_encoding it works,
 perhaps it is ignoring the character.

 Is there a way to do a similar thing, like ignoring this character in
 postgres too?

As far as I know, no.
You'll have to fix the data before you import them.

Yours,
Laurenz Albe



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Clemens Schwaighofer
On 04/22/2008 05:37 PM, Albe Laurenz wrote:
 Clemens Schwaighofer wrote:
 I sometimes have a problem with conversion of encodings, e.g. from UTF-8
 to ShiftJIS:

 ERROR:  character 0xf0a0aeb7 of encoding UTF8 has no
 equivalent in SJIS

 I have no idea what character this is, I cannot view it in my
 browser, etc.
 
 It translates to Unicode 10BB7, which is not defined.
 I guess that is not intended; can you guess what the character(s) should be?

To be honest, no idea. It's some Chinese character; I have no idea how the
user input this, because this is a Japanese page.

I actually found the character, but only my Mac OS X can show it. It
looks similar to a Japanese character used for a name, but how the
Chinese one got selected is a mystery to me ...

 If I run the conversion through PHP with mb_convert_encoding it works,
 perhaps it is ignoring the character.

 Is there a way to do a similar thing, like ignoring this character in
 postgres too?
 
 As far as I know, no.
 You'll have to fix the data before you import them.

Well, the web page data is in UTF-8, so I never see this issue, unless I
were to write a method that detects illegal Shift-JIS characters, and
that's difficult.

The reporting is only done in CSV ... so I am not sure it is worth
spending too much time here.

thanks for the tip.

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]



[GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread D. Dante Lorenso

All,

In the past I have used foreign keys to lookup tables for small lists of 
values, a job that I now think ENUM could do.  I was hoping that by 
using ENUM, I would avoid having to do joins in my queries, and that I'd 
be making the data more robust and faster.


I used to have a table for account_status:

  A | Active
  B | Billing Failed
  C | Closed
  D | Deactivated

  account.acct_type CHAR references account_type.acct_type CHAR

But, now I've converted that to an ENUM:

  ACTIVE
  BILLING_FAILED
  CLOSED
  DEACTIVATED

  account.acct_type ENUM account_type

The problem is that once I create a column in my account table that uses 
this 'account_type' datatype, I can't seem to change or add to it any 
more.  I want to add a new value or edit/delete an existing one.


How do you make changes to an ENUM datatype that is already in use?

-- Dante






Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Albe Laurenz
Clemens Schwaighofer wrote:
 I sometimes have a problem with conversion of encodings, e.g. from UTF-8
 to ShiftJIS:

 ERROR:  character 0xf0a0aeb7 of encoding UTF8 has no
 equivalent in SJIS

 I have no idea what character this is, I cannot view it in my
 browser, etc.
 
 It translates to Unicode 10BB7, which is not defined.
 I guess that is not intended; can you guess what the character(s) should be?
 
 To be honest, no idea. It's some Chinese character; I have no idea how the
 user input this, because this is a Japanese page.
 
 I actually found the character, but only my Mac OS X can show it. It
 looks similar to a Japanese character used for a name, but how the
 Chinese one got selected is a mystery to me ...

Are you sure that your Mac OS X computer interprets the character as
UTF-8?

Yours,
Laurenz Albe



[GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Thomas Kellerer

Hi,

in my office we were discussing the various ways statement level read 
consistency is implemented in different databases, namely Oracle and Postgres.

I am interested in the technical details of how PG determines that a block needs to be 
read from some other place than the data block because another 
transaction has updated the data block.

I'm referring to the scenario that Tom Kyte describes here:
www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html
and how PG would detect that row 342,023 has been modified by a different transaction 
(and thus reads the correct value, unlike the other database)

Oracle assigns an SCN (system change number) to each transaction; each block 
contains the SCN for which it is valid, and any number higher than the current SCN 
indicates that the block has to be taken from the rollback segment.

How is this test implemented in Postgres? 
Does it have a similar concept (SCN) or does WAL imply a completely different way?



Thanks in advance
Thomas





Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Pavan Deolasee
On Tue, Apr 22, 2008 at 4:11 PM, Thomas Kellerer [EMAIL PROTECTED] wrote:


  I am interested in the technical details on how PG determines that a block
 needs to be read from from some other place than the data block because
 another transaction has updated the data block.


Postgres uses MVCC for transaction consistency. When a row is updated,
a new copy of the row is created and the old version is marked for
deletion (though it's not removed immediately). The old readers
continue to read from the old copy whereas new transactions see the
new copy. This is all managed by XMIN and XMAX, which are the transaction
ids of the inserting and deleting transactions respectively and
control the visibility of the different versions.
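
A minimal sketch that makes those hidden columns visible (table and column
names here are made up for illustration):

CREATE TABLE t (id int, val text);
INSERT INTO t VALUES (1, 'a');
SELECT xmin, xmax, id, val FROM t;  -- xmax = 0: no transaction has deleted this version
UPDATE t SET val = 'b';
SELECT xmin, xmax, id, val FROM t;  -- a new row version carrying the updater's xid as xmin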

If you want to understand MVCC, please see this excellent
documentation by Tom Lane:
http://www.postgresql.org/files/developer/transactions.pdf

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com



Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Martijn van Oosterhout
On Tue, Apr 22, 2008 at 12:41:08PM +0200, Thomas Kellerer wrote:
 I'm referring to the scenario that Tom Kyte describes here:
 www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html
 and how PG would detect that row 342,023 has been modified by a different 
 transaction (and thus reads the correct value, unlike the other database)

Check the docs, but the basic ideas are:
- Every transaction has a transaction ID and a snapshot which
identifies the transactions whose results it is allowed to see.
- Every row has a start and end transaction ID which identifies who
created it and who deleted it.
- Each row also has a ctid pointer which points to the newer version of
that row.

What will happen in this case is that the updating transaction will
duplicate the row, setting the end xid of the old version and the
start xid of the new version to its own xid. Any parallel scan will see
either the new version or the old version depending on its snapshot.

If you end up trying to update a row someone else updated in the
meantime, the ctid is followed to the new version, the tuple is
retested for matching and then updated.
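
As a minimal sketch of how the ctid changes across an update (hypothetical
table, single session):

CREATE TABLE demo (id int, val text);
INSERT INTO demo VALUES (1, 'x');
SELECT ctid, xmin, xmax, val FROM demo;  -- e.g. (0,1): physical location of this row version
UPDATE demo SET val = 'y';
SELECT ctid, xmin, xmax, val FROM demo;  -- the visible version now lives at a new ctid, e.g. (0,2)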

See http://www.postgresql.org/docs/8.3/interactive/mvcc.html

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Thomas Kellerer

Pavan Deolasee, 22.04.2008 12:57:

On Tue, Apr 22, 2008 at 4:11 PM, Thomas Kellerer [EMAIL PROTECTED] wrote:


 I am interested in the technical details on how PG determines that a block
needs to be read from from some other place than the data block because
another transaction has updated the data block.


Postgres uses MVCC for transaction consistency. When a row is updated,
a new copy of the row is created and the old version is marked for
deletion (though its not removed immediately). The old readers
continue to read from the old copy whereas new transactions see the
new copy. This is all managed by XMIN and XMAX which are transaction
ids of the inserting and deleting transactions respectively and
control the visibility of the different versions.

If you want to understand MVCC, please see this excellent
documentation by Tom Lane:
http://www.postgresql.org/files/developer/transactions.pdf


The xmin, xmax info is basically what I was looking for. 

As far as I can tell (from the PDF and your quote) Postgres uses a concept very similar to Oracle's. 
Each transaction has a unique number, and each tuple carries the information about which transaction numbers it is visible to. 


Thanks for the quick answer.
Thomas






Re: [GENERAL] Schema migration tools?

2008-04-22 Thread Kerri Reno
Christophe -

We use LiquiBase: http://www.liquibase.org/home

We don't use it quite the same way as what you're proposing, but I think you
could use it that way.  When we found it, we did 'Generate Changelog' to
start, then made new changes to the ChangeLog, and applied it to each
database as needed.  It has really helped us, because we kept forgetting to
apply needed changes to the production database.

Hope this helps!
Kerri

On 4/21/08, Christophe [EMAIL PROTECTED] wrote:

 Greetings,

 We have the traditional three servers:

dev -- staging -- production

 each with a PostgreSQL instance and the same schema, at least over time.
  Right now, we handle schema migration (updating the schema for new
 releases) by manually-created scripts that apply the changes to staging and
 production.

 I'd like a tool that would automatically create these scripts, and I
 wondered if anything like this existed.  The theory would be that it would
 consider two databases a and b, and produce the appropriate script to change
 b's schema to match a.

 Does anything like this exist?  If not, I might have a new project...

 Thanks!
 -- Christophe





-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.


Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Kerri Reno
Thanks to all who responded.  I now know why execute will help this problem,
but then it causes a new problem.  The example I sent you was trivial,
trying to get to the bottom of the issue.  What I'm really trying to do is get
past the restriction that EXECUTE cannot do SELECT INTO.  That's why I created a
temporary table, so that that command could be dynamic, and then did a SELECT
INTO from that table.  Because of the planning issue that won't work.  I
can't upgrade to 8.3 at this time (long story).  Any ideas how to do this?
Here is a section of my code.

begin
query = 'create temp table schedrec as select ' ||
'salary_schedule, pay_column, step from ' || tbl ||
' where cp_id = ' || to_char(tcp_id,'');
raise notice 'query: %', query;
execute query;
select into relid distinct(attrelid) from pg_attribute where
attrelid='schedrec'::regclass;
raise notice 'relid: %', relid;
raise notice 'about to do select';
select into arow * from schedrec limit 1;
drop table schedrec;
return arow;
end;

Thanks so much!
Kerri

On 4/21/08, Adrian Klaver [EMAIL PROTECTED] wrote:

  -- Original message --
 From: Kerri Reno [EMAIL PROTECTED]
  Adrian,
 
  I don't understand.  Why do I need to use execute?  It runs fine the
 first
  time.  The second time it bombs, because it's not seeing schedrec
  correctly.  Which part should be in an execute query statement?

 plpgsql caches query plans. In versions prior to 8.3 this meant that the
 first time you ran a function the plans for the statements were cached for
 use by later runs of the function in the same session. The error you are
 getting about a missing OID means the function is looking for the OID of the
 temp table as it was cached in the first run and not finding it. To get
 around this you need to EXECUTE the create temp table statement. This causes
 the plan not to be cached but built anew for each call of the function. If you
 follow the link I included in the previous email you will see some examples.



-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.


Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Roberts, Jon
 Pavan Deolasee, 22.04.2008 12:57:
  On Tue, Apr 22, 2008 at 4:11 PM, Thomas Kellerer
[EMAIL PROTECTED]
 wrote:
 
   I am interested in the technical details on how PG determines that
a
 block
  needs to be read from from some other place than the data block
 because
  another transaction has updated the data block.
 
  Postgres uses MVCC for transaction consistency. When a row is
updated,
  a new copy of the row is created and the old version is marked for
  deletion (though its not removed immediately). The old readers
  continue to read from the old copy whereas new transactions see the
  new copy. This is all managed by XMIN and XMAX which are transaction
  ids of the inserting and deleting transactions respectively and
  control the visibility of the different versions.
 
  If you want to understand MVCC, please see this excellent
  documentation by Tom Lane:
  http://www.postgresql.org/files/developer/transactions.pdf
 
 The xmin, xmax info is basically what I was looking for.
 
 As far as I can tell (from the PDF and your quote) Postgres uses a
very
 similar concept as Oracle. .
 Each transaction has a uniqe number and each tuple contains the
 information for which transaction number it is visible.

Oracle moves the old row(s) to the rollback segment instead of putting a
new row in the table as PostgreSQL does.  The new value goes right in
place of the old row, and a pointer to the rollback segment is added.

A commit removes the pointer, and a rollback forces the db to move the
old row from the rollback segment back in place of the updated row.
Oracle's rollback segment can be tuned so that it retains data up to a
certain age and then it is flushed.  If the rollback segment is too small, you
can get a "snapshot too old" error.

Back to PostgreSQL, when a vacuum is issued, the old deleted rows are
removed.  This can be done with the autovacuum feature or manually.  
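
For illustration, a minimal sketch ("orders" is a made-up table name):

VACUUM VERBOSE orders;   -- manual vacuum of one table; reports removed row versions
-- or let autovacuum handle it; in postgresql.conf:
--   autovacuum = on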

-Does vacuum make sure there are no long running queries referencing the
deleted rows before it attempts to remove the old rows?  
-With autovacuum, does it skip these rows still being referenced in a
transaction or does it wait?

I'm guessing that you will never get the "snapshot too old" error in
PostgreSQL, based on the architecture.  


Jon



Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Craig Ringer

D. Dante Lorenso wrote:


The problem is that once I create a column in my account table that 
uses this 'account_type' datatype, I can't seem to change or add to it 
any more.  I want to add a new value or edit/delete an existing one.


How do you make changes to an ENUM datatype that is already in use?

As far as I know ENUM is not well suited to uses where new enumeration 
members may be added later. A lookup table and a foreign key is probably 
better for this sort of use.


You *can* add elements to an enum type - sort of - by creating a new 
type and converting columns. It's ugly, though, and will be hard to get 
right when the column of interest is referenced by foreign keys and such.


One way to do it if you really must:

-- Starting state

CREATE TYPE et1 AS ENUM('yes','no');
CREATE TABLE testtab (
   a et1
);
INSERT INTO testtab (a) values ('yes');

-- Change
CREATE TYPE et2 AS ENUM('yes','no','filenotfound');
ALTER TABLE testtab ALTER COLUMN a TYPE et2 USING (
  CASE a WHEN 'yes' THEN 'yes'::et2 WHEN 'no' THEN 'no'::et2 END
);

-- Alternative ALTER that's suitable if you're not removing anything 
from the enum

ALTER TABLE testtab ALTER COLUMN a TYPE et2 USING ( a::text::et2 );

Personally, though, I'd stick to the good ol' lookup table and foreign key.
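
For reference, a minimal sketch of that lookup-table approach (names borrowed
from the original post, details hypothetical):

CREATE TABLE account_type (
    acct_type text PRIMARY KEY        -- 'ACTIVE', 'BILLING_FAILED', ...
);
CREATE TABLE account (
    acct_id   serial PRIMARY KEY,
    acct_type text NOT NULL REFERENCES account_type (acct_type)
);
-- adding a new status later is just an INSERT, no type surgery needed:
INSERT INTO account_type VALUES ('SUSPENDED');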

--
Craig Ringer



Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Thomas Kellerer

Roberts, Jon, 22.04.2008 14:56:

As far as I can tell (from the PDF and your quote) Postgres uses a

very

similar concept as Oracle. .
Each transaction has a uniqe number and each tuple contains the
information for which transaction number it is visible.


Oracle moves the old row(s) to the rollback segment instead of putting a
new row in the table as PostgreSQL does.  The new value goes right in
place of the old row and it adds a pointer to the rollback segment.



A commit removes the pointer and a rollback forces the db to move the
old row from the rollback segment back in place of the row updated.
Oracle's rollback segment can be tuned so that retains data up to a
certain age and then it is flushed.  If rollback is too small, you can
get a snapshot too old error.


Thanks, this was exactly the answer I was looking for!


Cheers
Thomas




Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Alvaro Herrera
Roberts, Jon wrote:

 Back to PostgreSQL, when a vacuum is issued, the old deleted rows are
 removed.  This can be done with the autovacuum feature or manually.  
 
 -Does vacuum make sure there are no long running queries referencing the
 deleted rows before it attempts to remove the old rows?  

It does.

 -With autovacuum, does it skip these rows still being referenced in a
 transaction or does it wait?

It skips them, the idea being that a future vacuum will remove them.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Pavel Stehule
Hello

On 22/04/2008, Kerri Reno [EMAIL PROTECTED] wrote:
 Thanks to all who responded.  I now know why execute will help this problem,
 but then it causes a new problem.  The example I sent you was trivial,
 trying to get to the bottom of the issue.  What I'm really trying to is get
 past the restriction of execute to do SELECT INTO.  That's why I created a
 temporary table, so that that command could be dynamic, and then do a SELECT
 INTO from that table.  Because of the planning issue that won't work.  I
 can't upgrade to 8.3 at this time (long story).  Any ideas how to do this?
 Here is a section of my code.


you don't need to upgrade to 8.3. Just use dynamic statements, like:

BEGIN
  EXECUTE 'CREATE TEMP TABLE a ...';
  a)
EXECUTE 'SELECT * FROM a WHERE ...' INTO somevars;
  b)
FOR vars IN EXECUTE 'SELECT * FROM .. ' LOOP
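
A fuller sketch of option a), reusing the column and parameter names from
Kerri's snippet (the function name is made up), for a server whose EXECUTE
accepts an INTO clause:

CREATE OR REPLACE FUNCTION get_sched(tcp_id integer, tbl text)
  RETURNS record AS $$
DECLARE
    sched text;
    cl    text;
    st    integer;
    arow  record;
BEGIN
    -- build the statement dynamically, then fetch straight into variables
    EXECUTE 'SELECT salary_schedule, pay_column, step FROM ' || tbl
            || ' WHERE cp_id = ' || tcp_id
            INTO sched, cl, st;
    SELECT sched, cl, st INTO arow;   -- pack the values into the record to return
    RETURN arow;
END;
$$ LANGUAGE plpgsql;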

Regards
Pavel Stehule

 begin
 query = 'create temp table schedrec as select ' ||
 'salary_schedule, pay_column, step from ' || tbl ||
 ' where cp_id = ' || to_char(tcp_id,'');
  raise notice 'query: %', query;
 execute query;
 select into relid distinct(attrelid) from pg_attribute where
 attrelid='schedrec'::regclass;
 raise notice 'relid: %', relid;
  raise notice 'about to do select';
 select into arow * from schedrec limit 1;
 drop table schedrec;
 return arow;
 end;

 Thanks so much!
 Kerri

 On 4/21/08, Adrian Klaver [EMAIL PROTECTED] wrote:
 
   -- Original message --
  From: Kerri Reno [EMAIL PROTECTED]
 
   Adrian,
  
   I don't understand.  Why do I need to use execute?  It runs fine the
 first
   time.  The second time it bombs, because it's not seeing schedrec
   correctly.  Which part should be in an execute query statement?
 
 
  plpgsql caches query plans. In versions prior to 8.3 this meant that the
 first time you ran a function the plans for the statements where cached for
 use by later runs of the function in the same session. The error you are
 getting about OID missing means the function is looking for the OID of the
 temp table as it was cached in the first run and not finding it. To get
 around this you need to EXECUTE the create temp table statement. This causes
 the plan not be cached but run anew for each call of the function. If you
 follow the link I included in the previous email you will see some examples.
 
 


 --
 Yuma Educational Computer Consortium
 Compass Development Team
 Kerri Reno
 [EMAIL PROTECTED]  (928) 502-4240
  .·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Roberts, Jon
Can you explain what you mean by the restriction to do SELECT INTO? 

 

Why are you using a temp table to begin with? 

 

 

 

Jon

 



From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kerri Reno
Sent: Tuesday, April 22, 2008 7:55 AM
To: pgsql-general@postgresql.org
Subject: Re: FW: Re: [GENERAL] create temp in function

 

Thanks to all who responded.  I now know why execute will help this problem, 
but then it causes a new problem.  The example I sent you was trivial, trying 
to get to the bottom of the issue.  What I'm really trying to is get past the 
restriction of execute to do SELECT INTO.  That's why I created a temporary 
table, so that that command could be dynamic, and then do a SELECT INTO from 
that table.  Because of the planning issue that won't work.  I can't upgrade to 
8.3 at this time (long story).  Any ideas how to do this?  Here is a section of 
my code.

begin
query = 'create temp table schedrec as select ' ||
'salary_schedule, pay_column, step from ' || tbl ||
' where cp_id = ' || to_char(tcp_id,'');
raise notice 'query: %', query;
execute query;
select into relid distinct(attrelid) from pg_attribute where 
attrelid='schedrec'::regclass;
raise notice 'relid: %', relid;
raise notice 'about to do select';
select into arow * from schedrec limit 1;
drop table schedrec;
return arow;
end;

Thanks so much!
Kerri

On 4/21/08, Adrian Klaver [EMAIL PROTECTED] wrote:

 -- Original message --
From: Kerri Reno [EMAIL PROTECTED]

 Adrian,

 I don't understand.  Why do I need to use execute?  It runs fine the first
 time.  The second time it bombs, because it's not seeing schedrec
 correctly.  Which part should be in an execute query statement?

plpgsql caches query plans. In versions prior to 8.3 this meant that the first 
time you ran a function the plans for the statements where cached for use by 
later runs of the function in the same session. The error you are getting about 
OID missing means the function is looking for the OID of the temp table as it 
was cached in the first run and not finding it. To get around this you need to 
EXECUTE the create temp table statement. This causes the plan not be cached but 
run anew for each call of the function. If you follow the link I included in 
the previous email you will see some examples.






-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·. 



Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Roberts, Jon
  Back to PostgreSQL, when a vacuum is issued, the old deleted rows
are
  removed.  This can be done with the autovacuum feature or manually.
 
  -Does vacuum make sure there are no long running queries referencing
the
  deleted rows before it attempts to remove the old rows?
 
 It does.
 
  -With autovacuum, does it skip these rows still being referenced in
a
  transaction or does it wait?
 
 It skips them, the idea being that a future vacuum will remove them.
 

Awesome.  In a large data warehouse, the snapshot too old error is very
annoying and I'm glad PostgreSQL is superior to Oracle in this regard.
:)



Jon



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Kerri Reno
http://www.postgresql.org/docs/8.2/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
says
SELECT INTO is not currently supported within EXECUTE.

I was using a temp table to get around the above problem.

On 4/22/08, Roberts, Jon [EMAIL PROTECTED] wrote:

  Can you explain what you mean by the restriction to do SELECT INTO?



 Why are you using a temp table to begin with?







 Jon


   --

 *From:* [EMAIL PROTECTED] [mailto:
 [EMAIL PROTECTED] *On Behalf Of *Kerri Reno
 *Sent:* Tuesday, April 22, 2008 7:55 AM
 *To:* pgsql-general@postgresql.org
 *Subject:* Re: FW: Re: [GENERAL] create temp in function



 Thanks to all who responded.  I now know why execute will help this
 problem, but then it causes a new problem.  The example I sent you was
 trivial, trying to get to the bottom of the issue.  What I'm really trying
 to is get past the restriction of execute to do SELECT INTO.  That's why I
 created a temporary table, so that that command could be dynamic, and then
 do a SELECT INTO from that table.  Because of the planning issue that won't
 work.  I can't upgrade to 8.3 at this time (long story).  Any ideas how to
 do this?  Here is a section of my code.

 begin
 query = 'create temp table schedrec as select ' ||
 'salary_schedule, pay_column, step from ' || tbl ||
 ' where cp_id = ' || to_char(tcp_id,'');
 raise notice 'query: %', query;
 execute query;
 select into relid distinct(attrelid) from pg_attribute where
 attrelid='schedrec'::regclass;
 raise notice 'relid: %', relid;
 raise notice 'about to do select';
 select into arow * from schedrec limit 1;
 drop table schedrec;
 return arow;
 end;

 Thanks so much!
 Kerri

 On 4/21/08, *Adrian Klaver* [EMAIL PROTECTED] wrote:

  -- Original message --
 From: Kerri Reno [EMAIL PROTECTED]

  Adrian,
 
  I don't understand.  Why do I need to use execute?  It runs fine the
 first
  time.  The second time it bombs, because it's not seeing schedrec
  correctly.  Which part should be in an execute query statement?

 plpgsql caches query plans. In versions prior to 8.3 this meant that the
 first time you ran a function the plans for the statements where cached for
 use by later runs of the function in the same session. The error you are
 getting about OID missing means the function is looking for the OID of the
 temp table as it was cached in the first run and not finding it. To get
 around this you need to EXECUTE the create temp table statement. This causes
 the plan not be cached but run anew for each call of the function. If you
 follow the link I included in the previous email you will see some examples.




 --
 Yuma Educational Computer Consortium
 Compass Development Team
 Kerri Reno
 [EMAIL PROTECTED]  (928) 502-4240
 .·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.




-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.


Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Alvaro Herrera
Roberts, Jon wrote:

   -With autovacuum, does it skip these rows still being referenced
   in a transaction or does it wait?
  
  It skips them, the idea being that a future vacuum will remove them.
 
 Awesome.  In a large data warehouse, the snapshot too old error is very
 annoying and I'm glad PostgreSQL is superior to Oracle in this regard.
 :)

Well, the disadvantage of the PostgreSQL way is that it keeps dead rows
around for longer than they're actually needed, and so it causes some
problems in pathological conditions -- for example when setting up large
replication sets with Slony, or during a pg_dump, no dead rows can be
removed.  Since the Slony thing can take a very long time, dead rows
start to pile up in a way that can really harm performance.

We're currently working on it so that it is less of a problem for 8.4 ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Kerri Reno
I tried that, too, and got a different error.
NOTICE:  query: select salary_schedule, pay_column, step from saltab07 where
cp_id =32

ERROR: syntax error at or near "$2"
SQL state: 42601
Context: PL/pgSQL function get_salary_schedule line 15 at execute
statement

In the PG log file, it says:
syntax error at or near "$2"
SELECT $1 into $2, $3, $4

My code is below

CREATE OR REPLACE FUNCTION private.get_salary_schedule(integer, text)
  RETURNS record AS
$BODY$
declare
tcp_id alias for $1;
tfy alias for $2;
tbl text := 'saltab' || tfy;
arow record;
sched text;
cl text;
st integer;
query text;
begin
query = 'select ' || 'salary_schedule, pay_column, step from ' || tbl ||
' where cp_id = ' || to_char(tcp_id,'');
raise notice 'query: %', query;
execute query into sched, cl, st;
arow := (sched, cl, st);
return arow;
end;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE SECURITY DEFINER;

I also tried:
execute query into (sched, cl, st);
which gave me  'syntax error on ('

and
execute query into arow;
which gave me 'record arow is not assigned yet'

Thanks!
Kerri
On 4/22/08, Pavel Stehule [EMAIL PROTECTED] wrote:

 Hello



 you don't need upgrade to 8.3. Just use dynamic statements. Like:

 BEGIN
   EXECUTE 'CREATE TEMP TABLE a ...';
   a)
 EXECUTE 'SELECT * FROM a WHERE ...' INTO somevars;
   b)
 FOR vars IN EXECUTE 'SELECT * FROM .. ' LOOP

 Regards

 Pavel Stehule




-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.


Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Adrian Klaver
On Tuesday 22 April 2008 6:26 am, Kerri Reno wrote:
 http://www.postgresql.org/docs/8.2/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
 says
 SELECT INTO is not currently supported within EXECUTE.

In 8.2, EXECUTE ... INTO is supported:

The INTO clause specifies where the results of a SQL command returning rows 
should be assigned. If a row or variable list is provided, it must exactly 
match the structure of the query's results (when a record variable is used, 
it will configure itself to match the result structure automatically). If 
multiple rows are returned, only the first will be assigned to the INTO 
variable. If no rows are returned, NULL is assigned to the INTO variable. If 
no INTO clause is specified, the query results are discarded.
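
As a minimal sketch of the record form described above (the function name is
made up; the table name and value come from the NOTICE shown earlier in the
thread):

CREATE OR REPLACE FUNCTION sched_demo() RETURNS text AS $$
DECLARE
    r record;
BEGIN
    EXECUTE 'SELECT salary_schedule, pay_column, step FROM saltab07 WHERE cp_id = 32'
        INTO r;   -- r configures itself to match the result row
    RETURN r.salary_schedule;
END;
$$ LANGUAGE plpgsql;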



 I was using a temp table to get around the above problem.

 On 4/22/08, Roberts, Jon [EMAIL PROTECTED] wrote:
   Can you explain what you mean by the restriction to do SELECT INTO?
 
 
 
  Why are you using a temp table to begin with?
 
 
 
 
 
 
 
  Jon
 
 
--
 
  *From:* [EMAIL PROTECTED] [mailto:
  [EMAIL PROTECTED] *On Behalf Of *Kerri Reno
  *Sent:* Tuesday, April 22, 2008 7:55 AM
  *To:* pgsql-general@postgresql.org
  *Subject:* Re: FW: Re: [GENERAL] create temp in function
 
 
 
  Thanks to all who responded.  I now know why execute will help this
  problem, but then it causes a new problem.  The example I sent you was
  trivial, trying to get to the bottom of the issue.  What I'm really
  trying to is get past the restriction of execute to do SELECT INTO. 
  That's why I created a temporary table, so that that command could be
  dynamic, and then do a SELECT INTO from that table.  Because of the
  planning issue that won't work.  I can't upgrade to 8.3 at this time
  (long story).  Any ideas how to do this?  Here is a section of my code.
 
  begin
  query = 'create temp table schedrec as select ' ||
  'salary_schedule, pay_column, step from ' || tbl ||
  ' where cp_id = ' || to_char(tcp_id,'');
  raise notice 'query: %', query;
  execute query;
  select into relid distinct(attrelid) from pg_attribute where
  attrelid='schedrec'::regclass;
  raise notice 'relid: %', relid;
  raise notice 'about to do select';
  select into arow * from schedrec limit 1;
  drop table schedrec;
  return arow;
  end;
 
  Thanks so much!
  Kerri
 
  On 4/21/08, *Adrian Klaver* [EMAIL PROTECTED] wrote:
 
   -- Original message --
  From: Kerri Reno [EMAIL PROTECTED]
 
   Adrian,
  
   I don't understand.  Why do I need to use execute?  It runs fine the
 
  first
 
   time.  The second time it bombs, because it's not seeing schedrec
   correctly.  Which part should be in an execute query statement?
 
  plpgsql caches query plans. In versions prior to 8.3 this meant that the
  first time you ran a function the plans for the statements where cached
  for use by later runs of the function in the same session. The error you
  are getting about OID missing means the function is looking for the OID
  of the temp table as it was cached in the first run and not finding it.
  To get around this you need to EXECUTE the create temp table statement.
  This causes the plan not be cached but run anew for each call of the
  function. If you follow the link I included in the previous email you
  will see some examples.
 
 
 
 
  --
  Yuma Educational Computer Consortium
  Compass Development Team
  Kerri Reno
  [EMAIL PROTECTED]  (928) 502-4240
  .·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.

-- 
Adrian Klaver
[EMAIL PROTECTED]



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Roberts, Jon
I'll ask again in a different way.  

 

What is the purpose of your dynamic SQL and/or temp table?  Don't tell me 
anything about using SELECT INTO.  

 

What is the business purpose of the function?  An appropriate answer would be 
"I'm trying to calculate x" or "I'm trying to determine y by looking at x".

 

It looks like you are trying to use a temp table when an inline view would be 
more appropriate.

 

 

Jon

 



From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kerri Reno
Sent: Tuesday, April 22, 2008 8:27 AM
To: Roberts, Jon
Cc: pgsql-general@postgresql.org
Subject: Re: FW: Re: [GENERAL] create temp in function

 

http://www.postgresql.org/docs/8.2/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
says
SELECT INTO is not currently supported within EXECUTE.

I was using a temp table to get around the above problem.

On 4/22/08, Roberts, Jon [EMAIL PROTECTED] wrote:

Can you explain what you mean by the restriction to do SELECT INTO? 

 

Why are you using a temp table to begin with? 

 

 

 

Jon

 



From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kerri Reno
Sent: Tuesday, April 22, 2008 7:55 AM
To: pgsql-general@postgresql.org
Subject: Re: FW: Re: [GENERAL] create temp in function

 

Thanks to all who responded.  I now know why execute will help this problem, 
but then it causes a new problem.  The example I sent you was trivial, trying 
to get to the bottom of the issue.  What I'm really trying to is get past the 
restriction of execute to do SELECT INTO.  That's why I created a temporary 
table, so that that command could be dynamic, and then do a SELECT INTO from 
that table.  Because of the planning issue that won't work.  I can't upgrade to 
8.3 at this time (long story).  Any ideas how to do this?  Here is a section of 
my code.

begin
query = 'create temp table schedrec as select ' ||
'salary_schedule, pay_column, step from ' || tbl ||
' where cp_id = ' || to_char(tcp_id,'');
raise notice 'query: %', query;
execute query;
select into relid distinct(attrelid) from pg_attribute where 
attrelid='schedrec'::regclass;
raise notice 'relid: %', relid;
raise notice 'about to do select';
select into arow * from schedrec limit 1;
drop table schedrec;
return arow;
end;

Thanks so much!
Kerri

On 4/21/08, Adrian Klaver [EMAIL PROTECTED] wrote:

 -- Original message --
From: Kerri Reno [EMAIL PROTECTED]

 Adrian,

 I don't understand.  Why do I need to use execute?  It runs fine the first
 time.  The second time it bombs, because it's not seeing schedrec
 correctly.  Which part should be in an execute query statement?

plpgsql caches query plans. In versions prior to 8.3 this meant that the first 
time you ran a function the plans for the statements where cached for use by 
later runs of the function in the same session. The error you are getting about 
OID missing means the function is looking for the OID of the temp table as it 
was cached in the first run and not finding it. To get around this you need to 
EXECUTE the create temp table statement. This causes the plan not be cached but 
run anew for each call of the function. If you follow the link I included in 
the previous email you will see some examples.





-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·. 




-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·. 



Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Kerri Reno
So the reason I'm getting the error is that I'm running it in 8.0.  Thanks
so much for your help!
Kerri

On 4/22/08, Adrian Klaver [EMAIL PROTECTED] wrote:

 On Tuesday 22 April 2008 6:26 am, Kerri Reno wrote:
 
  http://www.postgresql.org/docs/8.2/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
  says
  SELECT INTO is not currently supported within EXECUTE.


 In 8.2  EXECUTE INTO is supported.;

 The INTO clause specifies where the results of a SQL command returning
 rows
 should be assigned. If a row or variable list is provided, it must exactly
 match the structure of the query's results (when a record variable is
 used,
 it will configure itself to match the result structure automatically). If
 multiple rows are returned, only the first will be assigned to the INTO
 variable. If no rows are returned, NULL is assigned to the INTO variable.
 If
 no INTO clause is specified, the query results are discarded.



 
  I was using a temp table to get around the above problem.
 
  On 4/22/08, Roberts, Jon [EMAIL PROTECTED] wrote:
Can you explain what you mean by the restriction to do SELECT INTO?
  
  
  
   Why are you using a temp table to begin with?
  
  
  
  
  
  
  
   Jon
  
  
 --
  
   *From:* [EMAIL PROTECTED] [mailto:
   [EMAIL PROTECTED] *On Behalf Of *Kerri Reno
   *Sent:* Tuesday, April 22, 2008 7:55 AM
   *To:* pgsql-general@postgresql.org
   *Subject:* Re: FW: Re: [GENERAL] create temp in function
  
  
  
   Thanks to all who responded.  I now know why execute will help this
   problem, but then it causes a new problem.  The example I sent you was
   trivial, trying to get to the bottom of the issue.  What I'm really
   trying to is get past the restriction of execute to do SELECT INTO.
   That's why I created a temporary table, so that that command could be
   dynamic, and then do a SELECT INTO from that table.  Because of the
   planning issue that won't work.  I can't upgrade to 8.3 at this time
   (long story).  Any ideas how to do this?  Here is a section of my
 code.
  
   begin
   query = 'create temp table schedrec as select ' ||
   'salary_schedule, pay_column, step from ' || tbl ||
   ' where cp_id = ' || to_char(tcp_id,'');
   raise notice 'query: %', query;
   execute query;
   select into relid distinct(attrelid) from pg_attribute where
   attrelid='schedrec'::regclass;
   raise notice 'relid: %', relid;
   raise notice 'about to do select';
   select into arow * from schedrec limit 1;
   drop table schedrec;
   return arow;
   end;
  
   Thanks so much!
   Kerri
  
   On 4/21/08, *Adrian Klaver* [EMAIL PROTECTED] wrote:
  
-- Original message --
   From: Kerri Reno [EMAIL PROTECTED]
  
Adrian,
   
I don't understand.  Why do I need to use execute?  It runs fine the
  
   first
  
time.  The second time it bombs, because it's not seeing schedrec
correctly.  Which part should be in an execute query statement?
  
   plpgsql caches query plans. In versions prior to 8.3 this meant that
 the
   first time you ran a function the plans for the statements where
 cached
   for use by later runs of the function in the same session. The error
 you
   are getting about OID missing means the function is looking for the
 OID
   of the temp table as it was cached in the first run and not finding
 it.
   To get around this you need to EXECUTE the create temp table
 statement.
   This causes the plan not be cached but run anew for each call of the
   function. If you follow the link I included in the previous email you
   will see some examples.
  
  
  
  
   --
   Yuma Educational Computer Consortium
   Compass Development Team
   Kerri Reno
   [EMAIL PROTECTED]  (928) 502-4240
   .·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.

 --
 Adrian Klaver
 [EMAIL PROTECTED]




-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·.


Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread Joris Dobbelsteen

David Wilson wrote:

On Mon, Apr 21, 2008 at 7:55 PM, Joris Dobbelsteen
[EMAIL PROTECTED] wrote:

 If you want to clean up the staging table I have some concerns about
the advisory lock. I think you mean an exclusive table lock.


Either works, really. An advisory lock is really just a lock over
which you have control of the meaning, as long as you're using it in
the appropriate places. Also, an advisory lock on just the processes
doing staging-to-log moves would allow writes into the staging table
to continue concurrently with the staging-to-log transaction (whereas
an exclusive lock would unnecessarily prevent them).


Describe the mechanism, because I don't really believe it yet. I think 
you need to take an advisory lock around every commit of every transaction 
that writes to the log table.


If you are only using the advisory lock in the staging-to-log 
transaction, how would this prevent newly committed tuples from showing 
up during this process? (You can't both delete and insert in a single 
statement, I believe, in which case you won't have a problem anyway.)



Also, while Vance appears to have chosen to have a dedicated
staging-to-log process, even that isn't necessary- each reader can
simply do the lock/clear staging/unlock before any attempt to read-
unless you're polling that log table at truly crazy rates, the
overhead should be negligible and will ensure that the staging table
is simply cleared out whenever necessary while removing the
complexity of a separate process.


Using the serializable isolation level for the staging-to-log process seems to be the 
most efficient method, as it won't even block writers.


- Joris



[GENERAL] better error handling for COPY from stdin

2008-04-22 Thread hubert depesz lubaczewski
if you have a .sql file with COPY table FROM stdin;
and the table doesn't exist, psql prints a lot of errors for lines
which contain only data for the copy.

to reproduce:
= echo -ne "COPY non_existing_table FROM stdin;\n1\n\\N\n2\n\\.\n" | psql
ERROR:  relation "non_existing_table" does not exist
invalid command \N
invalid command \.
ERROR:  syntax error at or near "1"
LINE 1: 1
^

would it be possible to make it smarter, so it will not treat data lines from
a failed copy as statements?

I see a lot of questions claiming that postgresql says command \N is invalid - which
is of course rubbish, but making psql a bit smarter would make all of these
questions go away.

regards,

depesz

-- 
quicksil1er: postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV! :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)



Re: [GENERAL] How is statement level read consistency implemented?

2008-04-22 Thread Erik Jones


On Apr 22, 2008, at 8:35 AM, Alvaro Herrera wrote:


Roberts, Jon wrote:


-With autovacuum, does it skip these rows still being referenced
in a transaction or does it wait?


It skips them, the idea being that a future vacuum will remove them.


Awesome.  In a large data warehouse, the snapshot too old error is  
very
annoying and I'm glad PostgreSQL is superior to Oracle in this  
regard.

:)


Well, the disadvantage of the PostgreSQL way is that it keeps dead  
rows

around for longer than they're actually needed, and so it causes some
problems in pathological conditions -- for example when setting up  
large

replication sets with Slony, or during a pg_dump, no dead rows can be
removed.  Since the Slony thing can take a very long time, dead rows
start to pile up in a way that can really harm performance.


In addition, or rather as another potential issue: if you have a REALLY  
long transaction running then you risk transaction id wraparound.


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com






Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 9:52 AM, Joris Dobbelsteen
[EMAIL PROTECTED] wrote:

  Describe the mechanism, because I don't really believe it yet. I think you
 need to do a advisory lock around every commit of every transaction that
 writes to the log table.

Consider some number of reader processes and some number of writer processes.

Writer processes touch only the staging table, and solely do inserts
into it. As a result, writer processes cannot interfere with each
other in any way and do not require any synchronization beyond that
provided by MVCC.

Reader processes are interested in polling the logging table at
intervals. In the process, they also act as staging-to-log movers.
This act (because it is destructive and because we require serialized
inserts for id generation in the log table) must take a lock that
prevents other readers from attempting the same work at the same time.

Each reader process therefore has a loop that appears as follows:
1) Obtain advisory lock.
2) Begin transaction.
3) For each row in staging table, insert copy into log table.
4) Delete all visible rows from staging table.
5) Commit transaction.
6) Release advisory lock.
7) Handle not-yet-seen rows in the logging table (This is the primary
work of the readers)
8) Sleep for the desired interval and return to 1).

We require two types of synchronization and the above takes care of both:
1) The advisory lock prevents multiple readers from doing simultaneous
staging-to-log moves.
2) The transaction block ensures that the reader will see a consistent
state on the staging table while writers may write at the same time;
writes that occur during the reader's transaction block will simply be
ignored during this round of reading.

You need both types of synchronization to avoid problems- taking an
exclusive lock would simply be the sledgehammer method of doing the
synchronization, since it would take the place of both the advisory
lock and the transaction at the same time but would also block
writers.
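
A minimal SQL sketch of one pass of that loop, using the serializable mode
Joris suggested so that steps 3 and 4 share one snapshot (table names and
the lock key are hypothetical):

SELECT pg_advisory_lock(42);                     -- steps 1/6: serialize the movers; writers are unaffected
BEGIN ISOLATION LEVEL SERIALIZABLE;              -- one snapshot for both statements below
INSERT INTO log_table SELECT * FROM staging;     -- step 3: copy the rows visible in this snapshot
DELETE FROM staging;                             -- step 4: removes exactly those same rows
COMMIT;
SELECT pg_advisory_unlock(42);
-- steps 7/8: process the new log_table rows, sleep, repeat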

-- 
- David T. Wilson
[EMAIL PROTECTED]



Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Andreas 'ads' Scherbaum
On Tue, 22 Apr 2008 20:58:30 +0800 Craig Ringer wrote:

 D. Dante Lorenso wrote:
 
  The problem is that once I create a column in my account table that 
  uses this 'account_type' datatype, I can't seem to change or add to it 
  any more.  I want to add a new value or edit/delete an existing one.
 
  How do you make changes to an ENUM datatype that is already in use?
 
 As far as I know ENUM is not well suited to uses where new enumeration 
 members may be added later. A lookup table and a foreign key is probably 
 better for this sort of use.

I remember the discussions before PG implemented ENUMs at all - some
people voted against this feature because they knew that questions
about modifying the enum values would pop up sooner or later.


 You *can* add elements to an enum type - sort of - by creating a new 
 type and converting columns. It's ugly, though, and will be hard to get 
 right when the column of interest is referenced by foreign keys and such.

If there's one request to modify a specific ENUM column, other requests
will follow, because the enum set is not as static as it seems in the
first place. So I would argue that the only advice should be to change the
column to a foreign key construct. Converting columns to new data types is
much more overhead anyway.


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] can't cast char to in

2008-04-22 Thread blackwater dev
I have a table with a mileage column that is a character varying (please
don't ask why :).

I need to do a query where mileage > 500

select * from cars where mileage > 500

So I need to cast it but everything I try throws an error such as:

ERROR:  invalid input syntax for integer: "+"

How can I cast this?



Thanks!


Re: [GENERAL] can't cast char to in

2008-04-22 Thread Erik Jones


On Apr 22, 2008, at 10:34 AM, blackwater dev wrote:

I have a table with a mileage column that is a character varying  
(please don't ask why :).


I need to do a query where mileage > 500

select * from cars where mileage > 500

So I need to cast it but everything I try throws an error such as:

ERROR:  invalid input syntax for integer: "+"

How can I cast this?


Well, you didn't really give any real information on the format of the  
data in your mileage column.  However, my guess is that you've got at  
least one row with just '+' in the mileage column which is not the  
same thing as '+0'.  You'll probably need to do a little data cleaning  
and, once that's done you should definitely consider switching that to  
an integer/numeric data type.
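
Until the data is cleaned, one way to run the comparison is to guard the
cast so only values that look numeric are converted (a sketch, assuming an
optional '+' followed by digits):

SELECT *
FROM cars
WHERE CASE WHEN mileage ~ '^[+]?[0-9]+$'   -- only cast rows that look numeric
           THEN mileage::integer > 500
           ELSE false
      END;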


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com






Re: [GENERAL] can't cast char to in

2008-04-22 Thread blackwater dev
Yeah, it was my being stupid, I got it going now.

Thanks!

On Tue, Apr 22, 2008 at 11:42 AM, Erik Jones [EMAIL PROTECTED] wrote:


 On Apr 22, 2008, at 10:34 AM, blackwater dev wrote:

  I have a table with a mileage column that is a character varying (please
  don't ask why :).
 
  I need to do a query where mileage  500
 
  select * from cars where mileage500
 
  So I need to cast it but everything I try throws an error such as :
 
  ERROR:  invalid input syntax for integer: +
 
  How can I cast this?
 

 Well, you didn't really give any real information on the format of the
 data in your mileage column.  However, my guess is that you've got at least
 one row with just '+' in the mileage column which is not the same thing as
 '+0'.  You'll probably need to do a little data cleaning and, once that's
 done you should definitely consider switching that to an integer/numeric
 data type.

 Erik Jones

 DBA | Emma(R)
 [EMAIL PROTECTED]
 800.595.4401 or 615.292.5888
 615.292.0777 (fax)

 Emma helps organizations everywhere communicate  market in style.
 Visit us online at http://www.myemma.com






Re: [GENERAL] can't cast char to in

2008-04-22 Thread Richard Huxton

blackwater dev wrote:
I have a table with a mileage column that is a character varying (please 
don't ask why :).


Why? :-)


I need to do a query where mileage > 500

select * from cars where mileage > 500

So I need to cast it but everything I try throws an error such as:

ERROR:  invalid input syntax for integer: "+"


Once you've cleaned your data, I would do one of two things:

1. Add a constraint to restrict the values the mileage column will accept:
ALTER TABLE cars ADD CONSTRAINT valid_mileage
CHECK (mileage ~ '^[+]?[0-9]+$');

2. You can alter the type on-the-fly too:
ALTER TABLE mileage_test ALTER COLUMN mileage TYPE integer
USING (mileage::int);

Note that you'll need to remove the constraint from #1 if you've applied 
that.
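
Before either step, the offending rows can be listed with the same pattern
(a sketch):

SELECT mileage, count(*)
FROM cars
WHERE mileage !~ '^[+]?[0-9]+$'
GROUP BY mileage;    -- distinct bad values and how often they occur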


--
  Richard Huxton
  Archonet Ltd



[GENERAL] Can not restart postgres: Panic could not locate a valid checkpoint record

2008-04-22 Thread Mary Ellen Fitzpatrick

Hi,
I am running postgres-8.1 on a Centos5 system.
This morning when I came in there was no more disk space on the 
partition that holds pgsql: /var/lib/pgsql.
I believe postgres had crashed at that point.  I stupidly removed all of 
the files under pg_xlog to free disk space, not knowing that those 
are the files needed to start the database.  I quickly realized the 
error I had made and restored the pg_xlog data/files from our system 
backup that ran at 1am last night (4/22/08).


Now when I try to restart the database, I get the following error.   It 
is looking for file 000100130064, which I do not have. 


LOG:  database system was shut down at 2008-04-22 10:38:59 EDT
LOG:  could not open file pg_xlog/ (log file 19, segment 100): No such 
file or directory

LOG:  invalid primary checkpoint record
LOG:  could not open file pg_xlog/000100130064 (log file 
19, segment 100): No such file or directory

LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 6800) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  logger shutting down


I am in the process of having the WAL dir restored as well.  Again, in 
my haste, I deleted that dir as well.  This data will be from 4/22/08 at 
1pm as well, time of the system backup.



If I do not have the files that startup script is asking for, what is 
the best way to restart the database.
Any advice (other than not deleting important dirs) would be greatly 
appreciated.



--
Thanks
Mary Ellen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can not restart postgres: Panic could not locate a valid checkpoint record

2008-04-22 Thread BRUSSER Michael
Some time ago I had a similar problem with Postgresql 7.3.
If I remember correctly running pg_resetxlog helped; check out
http://www.postgresql.org/docs/8.3/static/app-pgresetxlog.html
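
For example (just a sketch; the data directory path is a guess at the
usual CentOS location, the server must be stopped, and it should be run
as the postgres user):

  # dry run: reports the values it would use without changing anything
  pg_resetxlog -n /var/lib/pgsql/data

  # if those look sane, run it for real and then start the server
  pg_resetxlog /var/lib/pgsql/data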

Michael.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mary Ellen
Fitzpatrick
Sent: Tuesday, April 22, 2008 1:14 PM
To: PGSQL General
Subject: [GENERAL] Can not restart postgres: Panic could not locate a
valid checkpoint record

Hi,
I am running postgres-8.1 on a Centos5 system.
this morning when I came in there was no more disk space on the 
partition that holds pgsql: /var/lib/pgsql.
I believe postgres had crashed at this point.  I stupidly removed all of

the files under pg_xlog to create disk space.  Not knowing that those 
are the files I need to start the database.  I  quickly realized the 
error I had made and restored the  pg_xlog data/files from our system 
backup that ran at 1am last night (4/22/08).

Now when I try to restart the database, I get the following error.   It 
is looking for file 000100130064, which I do not have. 

LOG:  database system was shut down at 2008-04-22 10:38:59 EDT
LOG:  could not open file pg_xlog/ (log file 19, segment 100): No such

file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file pg_xlog/000100130064 (log file 
19, segment 100): No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 6800) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  logger shutting down

 
I am in the process of having the WAL dir restored as well.  Again, in 
my haste, I deleted that dir as well.  This data will be from 4/22/08 at

1pm as well, time of the system backup.


If I do not have the files that startup script is asking for, what is 
the best way to restart the database.
Any advice (other than not deleting important dirs) would be greatly 
appreciated.


-- 
Thanks
Mary Ellen


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread Joris Dobbelsteen

David Wilson wrote:

On Tue, Apr 22, 2008 at 9:52 AM, Joris Dobbelsteen
[EMAIL PROTECTED] wrote:

 Describe the mechanism, because I don't really believe it yet. I think you
need to do an advisory lock around every commit of every transaction that
writes to the log table.


Consider some number of reader processes and some number of writer processes.

Writer processes touch only the staging table, and solely do inserts
into it. As a result, writer processes cannot interfere with each
other in any way and do not require any synchronization beyond that
provided by MVCC.

Reader processes are interested in polling the logging table at
intervals. In the process, they also act as staging-to-log movers.
This act (because it is destructive and because we require serialized
inserts for id generation in the log table) must take a lock that
prevents other readers from attempting the same work at the same time.

Each reader process therefore has a loop that appears as follows:
1) Obtain advisory lock.
2) Begin transaction.
3) For each row in staging table, insert copy into log table.
4) Delete all visible rows from staging table.


Ah, yes, all visible rows...
My point is that, unless you use a transaction with serializable 
isolation, this "all visible rows" for the second statement might be 
different from those that you copied into the log table.


With the normal Read committed isolation level you suffer from a 
possible nonrepeatable read that might change tuple visibility between 
different statements.



5) Commit transaction.
6) Release advisory lock.
7) Handle not-yet-seen rows in the logging table (This is the primary
work of the readers)
8) Sleep for the desired interval and return to 1).

We require two types of synchronization and the above takes care of both:
1) The advisory lock prevents multiple readers from doing simultaneous
staging-to-log moves.
2) The transaction block ensures that the reader will see a consistent
state on the staging table while writers may write at the same time;
writes that occur during the reader's transaction block will simply be
ignored during this round of reading.


See above; you demand it's impossible for nonrepeatable reads to occur.


You need both types of synchronization to avoid problems- taking an
exclusive lock would simply be the sledgehammer method of doing the
synchronization, since it would take the place of both the advisory
lock and the transaction at the same time but would also block
writers.


I agree with you on this, but it does guarantee the impossibility of a 
nonrepeatable read at the cost of concurrency. There seems to be a 
better solution indeed.


- Joris


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 2:48 PM, Joris Dobbelsteen
[EMAIL PROTECTED] wrote:

  Ah, yes, all visible rows...
  My point is that, unless you use a transaction with serializable isolation,
 this all visible rows for the second statement might be different from those
 that you copied into the log table.

  With the normal Read committed isolation level you suffer from a possible
 nonrepeatable read that might change tuple visibility between different
 statements.

That depends on implementation. A select into ... to do the initial
copy followed by a delete where... with the where clause referencing
the log table itself to ensure that we delete only things that now
exist in the log table, or a row by row  insert/delete pair. Either
would provide the appropriate level of protection from accidental
deletion of more things than you intended without harming concurrency.
The delete referencing the log table might require that the log table
be indexed for performance, but it's likely that such indexing would
be done anyway for general log use.
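
For instance, a minimal sketch of that, with the advisory lock from the
earlier recipe still held around the whole thing (table and column names
here are made up; "payload" stands in for whatever the log rows carry):

BEGIN;
-- copy whatever is visible in staging right now
INSERT INTO log_table (payload, created)
  SELECT payload, created FROM staging ORDER BY created;
-- delete only what the log table now contains
DELETE FROM staging
  WHERE EXISTS (SELECT 1 FROM log_table l
                WHERE l.payload = staging.payload
                  AND l.created = staging.created);
COMMIT;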
-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can not restart postgres: Panic could not locate a valid checkpoint record

2008-04-22 Thread Mary Ellen Fitzpatrick
I ran the pg_resetxlog -n to test and it found the correct values.  Ran 
it for real and it worked.

Thank you for the info/help
Mary Ellen

BRUSSER Michael wrote:

Some time ago I had a similar problem with Postgresql 7.3.
If I remember correctly running pg_resetxlog helped; check out
http://www.postgresql.org/docs/8.3/static/app-pgresetxlog.html

Michael.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mary Ellen
Fitzpatrick
Sent: Tuesday, April 22, 2008 1:14 PM
To: PGSQL General
Subject: [GENERAL] Can not restart postgres: Panic could not locate a
valid checkpoint record

Hi,
I am running postgres-8.1 on a Centos5 system.
this morning when I came in there was no more disk space on the 
partition that holds pgsql: /var/lib/pgsql.

I believe postgres had crashed at this point.  I stupidly removed all of

the files under pg_xlog to create disk space.  Not knowing that those 
are the files I need to start the database.  I  quickly realized the 
error I had made and restored the  pg_xlog data/files from our system 
backup that ran at 1am last night (4/22/08).


Now when I try to restart the database, I get the following error.   It 
is looking for file 000100130064, which I do not have. 


LOG:  database system was shut down at 2008-04-22 10:38:59 EDT
LOG:  could not open file pg_xlog/ (log file 19, segment 100): No such

file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file pg_xlog/000100130064 (log file 
19, segment 100): No such file or directory

LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 6800) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  logger shutting down

 
I am in the process of having the WAL dir restored as well.  Again, in 
my haste, I deleted that dir as well.  This data will be from 4/22/08 at


1pm as well, time of the system backup.


If I do not have the files that startup script is asking for, what is 
the best way to restart the database.
Any advice (other than not deleting important dirs) would be greatly 
appreciated.



  


--
Thanks
Mary Ellen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread Gurjeet Singh
On Wed, Apr 23, 2008 at 12:29 AM, David Wilson [EMAIL PROTECTED]
wrote:

 On Tue, Apr 22, 2008 at 2:48 PM, Joris Dobbelsteen
 [EMAIL PROTECTED] wrote:
 
   Ah, yes, all visible rows...
   My point is that, unless you use a transaction with serializable
 isolation,
  this all visible rows for the second statement might be different from
 those
  that you copied into the log table.
 
   With the normal Read committed isolation level you suffer from a
 possible
  nonrepeatable read that might change tuple visibility between different
  statements.

 That depends on implementation. A select into ... to do the initial
 copy followed by a delete where... with the where clause referencing
 the log table itself to ensure that we delete only things that now
 exist in the log table, or a row by row  insert/delete pair. Either
 would provide the appropriate level of protection from accidental
 deletion of more things than you intended without harming concurrency.
 The delete referencing the log table might require that the log table
 be indexed for performance, but it's likely that such indexing would
 be done anyway for general log use.


I think this plpgsql function would solve the problem of atomic
read-and-delete operation...

create or replace function log_rotate() returns void as $$
declare
  rec record;
begin

for rec in delete from t1 returning * loop
insert into t2 values( rec.a, rec.b );
end loop;

end;
$$ language 'plpgsql';

select log_rotate();

Best regards,
-- 
[EMAIL PROTECTED]

[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [GENERAL] Can not restart postgres: Panic could not locate a valid checkpoint record

2008-04-22 Thread Scott Marlowe
On Tue, Apr 22, 2008 at 1:00 PM, Mary Ellen Fitzpatrick [EMAIL PROTECTED] 
wrote:
 I ran the pg_resetxlog -n to test and it found the correct values.  Ran it
 for real and it worked.

Note that on most unix systems, some percentage of disk storage is
reserved for use by root only.  On linux with ext2/3 you can change
this % with the tune2fs command...

tune2fs -m 0 /dev/sdb2

would set the drive mounted from /dev/sdb2 to have 0 reserved % for
root, and may be enough in the future to get you out of this
situation.  Be sure to set it back to 1 or 2 % after you've cleaned up
enough so you have the free space again when you need it.
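
For example, to put the usual reserve back afterwards (same device as
above):

tune2fs -m 1 /dev/sdb2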

The same thing can be done with quotas...

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table as log (multiple writers and readers)

2008-04-22 Thread Joris Dobbelsteen

Gurjeet Singh wrote:
On Wed, Apr 23, 2008 at 12:29 AM, David Wilson [EMAIL PROTECTED] 
mailto:[EMAIL PROTECTED] wrote:


On Tue, Apr 22, 2008 at 2:48 PM, Joris Dobbelsteen
[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
wrote:
 
   Ah, yes, all visible rows...
   My point is that, unless you use a transaction with serializable
isolation,
  this all visible rows for the second statement might be different
from those
  that you copied into the log table.
 
   With the normal Read committed isolation level you suffer from a
possible
  nonrepeatable read that might change tuple visibility between
different
  statements.

That depends on implementation. A select into ... to do the initial
copy followed by a delete where... with the where clause referencing
the log table itself to ensure that we delete only things that now
exist in the log table, or a row by row  insert/delete pair. Either
would provide the appropriate level of protection from accidental
deletion of more things than you intended without harming concurrency.
The delete referencing the log table might require that the log table
be indexed for performance, but it's likely that such indexing would
be done anyway for general log use.


Of course, that is another way to define visibility in this 
context: present in the log table. The point is that a suitable definition is needed.


I think this plpgsql function would solve the problem of atomic 
read-and-delete operation...


create or replace function log_rotate() returns void as $$
declare
  rec record;
begin

for rec in delete from t1 returning * loop
insert into t2 values( rec.a, rec.b );
end loop;

end;
$$ language 'plpgsql';

select log_rotate();


Don't forget ordering; this was important before...

START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT ... INTO log FROM staging ORDER BY ...;
DELETE FROM staging;
COMMIT;

Don't know if that ORDER BY works. It should in this case.

- Joris

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Schema migration tools?

2008-04-22 Thread John DeSoi


On Apr 21, 2008, at 10:44 PM, Christophe wrote:

I'd like a tool that would automatically create these scripts, and I  
wondered if anything like this existed.  The theory would be that it  
would consider two databases a and b, and produce the appropriate  
script to change b's schema to match a.


Does anything like this exist?  If not, I might have a new project...



However it happens, you need to *capture* the schema changes that need  
to be applied to each database. I don't think you can just compare  
schemas and guarantee that database A will be transformed in exactly  
the same way as database B. For example, suppose the last text column  
name in database B was different from A. How can you determine if the  
column was renamed or if the column was dropped and a new column was  
added? The semantics of that difference could be very important.


It would be nice if PostgreSQL had some kind of unique reference for  
the column, but I think columns are just numbered sequentially as they  
are added. It would also be neat to have a built-in way to log the  
schema changes.



John DeSoi, Ph.D.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
I have a fairly simple table (a dozen real/integer columns, a few
indexes, one foreign key reference) with ~120m rows. Periodically the
table is truncated or dropped and recreated and the data is
regenerated (slightly different data, of course, or the exercise would
be rather pointless). The regeneration occurs in batches of ~4000 data
points at a time, which are inserted into the table via COPY, and are
coming from several simultaneous processes.

The first several million data points are quite quick (the copy
executes in well under a quarter second). By the time the table
reaches 10-15m rows, however, each individual COPY is taking 20
seconds to execute. Is there anything I can do to improve this
performance? I can't drop/recreate the indices because some of the
data points rely on points generated already in the run, and dropping
the indices would make the required joins ridiculously expensive once
the table starts growing. The foreign key reference *is* droppable for
this regeneration, but I wouldn't expect it to be a performance
problem.

The autovacuum daemon is running in the background, with these
settings: (All autovacuum-specific settings are still at defaults)
vacuum_cost_delay = 50  # 0-1000 milliseconds
vacuum_cost_page_hit = 1    # 0-10000 credits
vacuum_cost_page_miss = 10  # 0-10000 credits
vacuum_cost_page_dirty = 20 # 0-10000 credits
vacuum_cost_limit = 200     # 1-10000 credits

My gut feeling is that better autovacuum settings would help, but I'm
not really sure what to modify to get the improvement I'm looking for.

-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Scott Marlowe
On Tue, Apr 22, 2008 at 2:31 PM, David Wilson [EMAIL PROTECTED] wrote:
 I have a fairly simple table (a dozen real/integer columns, a few
  indexes, one foreign key reference) with ~120m rows. Periodically the
  table is truncated or dropped and recreated and the data is
  regenerated (slightly different data, of course, or the exercise would
  be rather pointless). The regeneration occurs in batches of ~4000 data
  points at a time, which are inserted into the table via COPY, and are
  coming from several simultaneous processes.

  The first several million data points are quite quick (the copy
  executes in well under a quarter second). By the time the table
  reaches 10-15m rows, however, each individual COPY is taking 20
  seconds to execute. Is there anything I can do to improve this
  performance? I can't drop/recreate the indices because some of the

The best bet is to issue an analyze table (with your table name in
there, of course) and see if that helps.  Quite often the real issue
is that pgsql is using a method to insert rows when you have 10million
of them that made perfect sense when you had 100 rows, but no longer
is the best way.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Schema migration tools?

2008-04-22 Thread Jonathan Bond-Caron
If you’re looking for a PHP solution, there’s:

 

http://ezcomponents.org/docs/api/latest/introduction_DatabaseSchema.html

 

It’s very limited but works for simple schemas, I begun extending it to
support more rigorously SQL-92 schema information (will be released BSD
license at openmv.com).  

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Kerri Reno
Sent: April 22, 2008 8:44 AM
To: Christophe
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Schema migration tools?

 

Christophe -

We use LiquiBase: http://www.liquibase.org/home

We don't use it quite the same as what you're proposing, but I think you
could use it that way.  When we found it, we did 'Generate Changelog' to
start, then made new changes to the ChangeLog, and applied it to each
database as needed.   It has really helped us, because we were forgetting to
apply the needed changes to the production database.

Hope this helps!
Kerri

On 4/21/08, Christophe [EMAIL PROTECTED] wrote:

Greetings,

We have the traditional three servers:

   dev -- staging -- production

each with a PostgreSQL instance and the same schema, at least over time.
Right now, we handle schema migration (updating the schema for new releases)
by manually-created scripts that apply the changes to staging and
production.

I'd like a tool that would automatically create these scripts, and I
wondered if anything like this existed.  The theory would be that it would
consider two databases a and b, and produce the appropriate script to change
b's schema to match a.

Does anything like this exist?  If not, I might have a new project...

Thanks!
-- Christophe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




-- 
Yuma Educational Computer Consortium
Compass Development Team
Kerri Reno
[EMAIL PROTECTED]  (928) 502-4240
.·:*¨¨*:·.   .·:*¨¨*:·.   .·:*¨¨*:·. 



Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread D. Dante Lorenso

Andreas 'ads' Scherbaum wrote:

On Tue, 22 Apr 2008 20:58:30 +0800 Craig Ringer wrote:

D. Dante Lorenso wrote:
The problem is that once I create a column in my account table that 
uses this 'account_type' datatype, I can't seem to change or add to it 
any more.  I want to add a new value or edit/delete an existing one.

How do you make changes to an ENUM datatype that is already in use?
As far as I know ENUM is not well suited to uses where new enumeration 
members may be added later. A lookup table and a foreign key is probably 
better for this sort of use.

I remember the discussions before PG implemented ENUMs at all - some
people voted against this feature because they knew that questions
about modifing the enum values would pop up sooner or later.
You *can* add elements to an enum type - sort of - by creating a new 
type and converting columns. It's ugly, though, and will be hard to get 
right when the column of interest is referenced by foreign keys and such.
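
A rough sketch of that conversion, using the account/acct_type example
from this thread (SUSPENDED is just an invented label; defaults, views
and anything else referencing the old type need the same treatment, and
the final rename may not be available on older releases):

CREATE TYPE account_type_new AS ENUM
    ('ACTIVE', 'BILLING_FAILED', 'CLOSED', 'DEACTIVATED', 'SUSPENDED');

ALTER TABLE account
    ALTER COLUMN acct_type TYPE account_type_new
    USING (acct_type::text::account_type_new);

DROP TYPE account_type;
ALTER TYPE account_type_new RENAME TO account_type;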

If there's one request to modify a specific ENUM column, other requests
will follow because the enum set is not as static as it seems in the
first place. So i beg that the only advise should be change the column
to a foreign key construct. Converting columns to new data types is
much more overhead anyway.


So, the advice here is don't use ENUM?

I was really hoping that it would be more efficient to not have to do 
all the foreign keys and joins for tables that may have 4-5 enum types.


Just being able to:

  SELECT *
  FROM tablename

would be nice if my columns contained enums instead of doing:

  SELECT *
  FROM tablename, lookuptable
  WHERE tablename.some_id = lookuptable.some_id

Isn't the join more expensive?

-- Dante

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Christophe

On Apr 22, 2008, at 1:45 PM, D. Dante Lorenso wrote:

So, the advice here is don't use ENUM?


I think it's more Don't use ENUM for a type that you are planning to  
extend.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Schema migration tools?

2008-04-22 Thread Erik Jones


On Apr 22, 2008, at 3:09 PM, John DeSoi wrote:

It would be nice if PostgreSQL had some kind of unique reference for  
the column, but I think columns are just numbered sequentially as  
they are added. It would also be neat to have a built-in way to log  
the schema changes.


It does: log_statement set to either 'all' or 'ddl' will do the trick.
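
That is, something like this in postgresql.conf (or per-database with
ALTER DATABASE ... SET log_statement = 'ddl'; the setting needs superuser):

log_statement = 'ddl'        # 'all' also works, but is much noisier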

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 4:38 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
  The best bet is to issue an analyze table (with your table name in
  there, of course) and see if that helps.  Quite often the real issue
  is that pgsql is using a method to insert rows when you have 10million
  of them that made perfect sense when you had 100 rows, but no longer
  is the best way.


This has caused the behavior to be... erratic. That is, individual
copies are now taking anywhere from 2 seconds (great!) to 30+ seconds
(back where we were before). I also clearly can't ANALYZE the table
after every 4k batch; even if that resulted in 2 second copies, the
analyze would take up as much time as the copy otherwise would have
been. I could conceivably analyze after every ~80k (the next larger
unit of batching; I'd love to be able to batch the copies at that
level but dependencies ensure that I can't), but it seems odd to have
to analyze so often.

Oh, barring COPY delays I'm generating the data at a rate of something
like a half million rows every few minutes, if that's relevant.
-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Scott Marlowe
On Tue, Apr 22, 2008 at 2:59 PM, David Wilson [EMAIL PROTECTED] wrote:
 On Tue, Apr 22, 2008 at 4:38 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
The best bet is to issue an analyze table (with your table name in
there, of course) and see if that helps.  Quite often the real issue
is that pgsql is using a method to insert rows when you have 10million
of them that made perfect sense when you had 100 rows, but no longer
is the best way.
  

  This has caused the behavior to be... erratic. That is, individual
  copies are now taking anywhere from 2 seconds (great!) to 30+ seconds
  (back where we were before). I also clearly can't ANALYZE the table
  after every 4k batch; even if that resulted in 2 second copies, the
  analyze would take up as much time as the copy otherwise would have
  been. I could conceivably analyze after every ~80k (the next larger
  unit of batching; I'd love to be able to batch the copies at that
  level but dependencies ensure that I can't), but it seems odd to have
  to analyze so often.

Normally, after the first 50,000 or so the plan won't likely change
due to a new analyze, so you could probably just analyze after 50k or
so and get the same performance, if the problem is a bad plan for the
inserts / copies.

also, non-indexed foreign keyed fields can cause this problem.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Gurjeet Singh
On Tue, Apr 22, 2008 at 3:31 PM, D. Dante Lorenso [EMAIL PROTECTED] wrote:

 All,

 In the past I have used foreign keys to lookup tables for small lists of
 values that I now think ENUM could do the job of.  I was hoping that by
 using ENUM, I would avoid having to do joins in my queries, and that I'd be
 making the data more robust and faster.

 I used to have a table for account_status:

  A | Active
  B | Billing Failed
  C | Closed
  D | Deactivated

  account.acct_type CHAR references account_type.acct_type CHAR

 But, now I've converted that to an ENUM:

  ACTIVE
  BILLING_FAILED
  CLOSED
  DEACTIVATED

  account.acct_type ENUM account_type

 The problem is that once I create a column in my account table that uses
 this 'account_type' datatype, I can't seem to change or add to it any more.
  I want to add a new value or edit/delete an existing one.

 How do you make changes to an ENUM datatype that is already in use?


I agree with others that ENUMs stop looking pretty when you need to modify
them...

Here's a thread from recent past where this exact problem was discussed...
maybe it'll interest you...

http://archives.postgresql.org/pgsql-general/2007-12/msg01363.php

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Scott Marlowe
On Tue, Apr 22, 2008 at 2:45 PM, D. Dante Lorenso [EMAIL PROTECTED] wrote:

  So, the advice here is don't use ENUM?

  I was really hoping that it would be more efficient to not have to do all
 the foreign keys and joins for tables that may have 4-5 enum types.

  Just being able to:

   SELECT *
   FROM tablename

If you use a lookup table methodology you still get that.  Try this:

smarlowe=# create table choices (color text primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
choices_pkey for table choices
CREATE TABLE
smarlowe=# insert into choices values ('red'),('yellow'),('green'),('blue');
INSERT 0 4
smarlowe=# create table mystuff (id serial primary key, usenam text,
mycolor text references choices(color));
NOTICE:  CREATE TABLE will create implicit sequence mystuff_id_seq
for serial column mystuff.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
mystuff_pkey for table mystuff
CREATE TABLE
smarlowe=# insert into mystuff (usenam, mycolor) values ('scott','red');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('darren','blue');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('dan','green');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('steve','green');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('mike','black');
ERROR:  insert or update on table mystuff violates foreign key
constraint mystuff_mycolor_fkey
DETAIL:  Key (mycolor)=(black) is not present in table choices.
smarlowe=# select * from mystuff;
 id | usenam | mycolor
----+--------+---------
  1 | scott  | red
  2 | darren | blue
  3 | dan| green
  4 | steve  | green
(4 rows)

tada!  No enum, and no join.  But you can't insert illegal values in mycolor...

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 5:04 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
  Normally, after the first 50,000 or so the plan won't likely change
  due to a new analyze, so you could probably just analyze after 50k or
  so and get the same performance.  If the problem is a bad plan for the
  inserts / copies.

  also, non-indexed foreign keyed fields can cause this problem.


Analyzing after the first 50k or so is easy enough, then; thanks for
the suggestion.

Foreign keys are definitely indexed (actually referencing a set of
columns that the foreign table is UNIQUE on).

Any other suggestions? COPY times alone are pretty much quadrupling my
table-rebuild runtime, and I can interrupt the current rebuild to try
things pretty much at a whim (nothing else uses the DB while a rebuild
is happening), so I'm pretty much game to try any reasonable
suggestions anyone has.

-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Scott Marlowe
On Tue, Apr 22, 2008 at 3:15 PM, David Wilson [EMAIL PROTECTED] wrote:
 On Tue, Apr 22, 2008 at 5:04 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
Normally, after the first 50,000 or so the plan won't likely change
due to a new analyze, so you could probably just analyze after 50k or
so and get the same performance.  If the problem is a bad plan for the
inserts / copies.
  
also, non-indexed foreign keyed fields can cause this problem.
  

  Analyzing after the first 50k or so is easy enough, then; thanks for
  the suggestion.

  Foreign keys are definitely indexed (actually referencing a set of
  columns that the foreign table is UNIQUE on).

  Any other suggestions? COPY times alone are pretty much quadrupling my
  table-rebuild runtime, and I can interrupt the current rebuild to try
  things pretty much at a whim (nothing else uses the DB while a rebuild
  is happening), so I'm pretty much game to try any reasonable
  suggestions anyone has.

Try upping your checkpoint segments.  Some folks find fairly large
numbers like 50 to 100 to be helpful.  Each segment = 16Megs, so be
sure not to run your system out of drive space while increasing it.
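
Something along these lines in postgresql.conf, for instance (the
numbers are only illustrative, and checkpoint_completion_target needs
8.3; the other two knobs are related and often tuned together):

checkpoint_segments = 50             # each segment is 16MB on disk
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9   # spread checkpoint I/O out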

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Joshua D. Drake
On Tue, 22 Apr 2008 15:45:39 -0500
D. Dante Lorenso [EMAIL PROTECTED] wrote:


 I was really hoping that it would be more efficient to not have to do 
 all the foreign keys and joins for tables that may have 4-5 enum
 types.
 
 Just being able to:
 
SELECT *
FROM tablename
 
 would be nice if my columns contained enums instead of doing:
 
SELECT *
FROM tablename, lookuptable
WHERE tablename.some_id = lookuptable.some_id
 
 Isn't the join more expensive?

You were using natural keys, so the join would not be required.

Joshua D. Drake

 
 -- Dante
 


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Schema migration tools?

2008-04-22 Thread John DeSoi


On Apr 22, 2008, at 4:53 PM, Erik Jones wrote:

It would be nice if PostgreSQL had some kind of unique reference  
for the column, but I think columns are just numbered sequentially  
as they are added. It would also be neat to have a built-in way to  
log the schema changes.


It does: log_statement set to either 'all' or 'ddl' will do the trick.


If I do this, is there a way to get a transaction consistent log of  
just the necessary commands to transform another copy of the database?  
In other words, I assume this approach will log each DDL command even  
if the transaction is rolled back. Correct?


Thanks,



John DeSoi, Ph.D.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread D. Dante Lorenso

Scott Marlowe wrote:

On Tue, Apr 22, 2008 at 2:45 PM, D. Dante Lorenso [EMAIL PROTECTED] wrote:

 So, the advice here is don't use ENUM?
 I was really hoping that it would be more efficient to not have to do all
the foreign keys and joins for tables that may have 4-5 enum types.
 Just being able to:
  SELECT *
  FROM tablename

If you use a lookup table methodology you still get that.  Try this:
smarlowe=# create table choices (color text primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
choices_pkey for table choices
CREATE TABLE
smarlowe=# insert into choices values ('red'),('yellow'),('green'),('blue');
INSERT 0 4
smarlowe=# create table mystuff (id serial primary key, usenam text,
mycolor text references choices(color));
NOTICE:  CREATE TABLE will create implicit sequence mystuff_id_seq
for serial column mystuff.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
mystuff_pkey for table mystuff
CREATE TABLE
smarlowe=# insert into mystuff (usenam, mycolor) values ('scott','red');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('darren','blue');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('dan','green');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('steve','green');
INSERT 0 1
smarlowe=# insert into mystuff (usenam, mycolor) values ('mike','black');
ERROR:  insert or update on table mystuff violates foreign key
constraint mystuff_mycolor_fkey
DETAIL:  Key (mycolor)=(black) is not present in table choices.
smarlowe=# select * from mystuff;
 id | usenam | mycolor
----+--------+---------
  1 | scott  | red
  2 | darren | blue
  3 | dan| green
  4 | steve  | green
(4 rows)
tada!  No enum, and no join.  But you can't insert illegal values in mycolor...


This approach is so old-school, I seem to have overlooked the obvious.

Here you've elected to use the foreign key to just control the possible 
values inserted but not really to look up the value.


Seems you are storing the values in text form which goes against all the 
normalization techniques I've learned in school.  I see this might be a 
problem with storage since you will need to store the TEXT value for 
every row in the 'mystuff' table instead of just storing the reference 
to the lookup table as an INTEGER.  Over millions of rows, perhaps this 
would become a concern?
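
Just to make the comparison concrete, the surrogate-integer variant I
have in mind would look something like this (names borrowed from the
example above; note the join coming back):

CREATE TABLE choices (
    choice_id integer PRIMARY KEY,
    color     text NOT NULL UNIQUE
);

CREATE TABLE mystuff (
    id         serial PRIMARY KEY,
    usenam     text,
    mycolor_id integer REFERENCES choices (choice_id)
);

-- reading the label back now requires the lookup
SELECT m.id, m.usenam, c.color
  FROM mystuff m
  JOIN choices c ON c.choice_id = m.mycolor_id;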


What is the general consensus by the community about this approach?  Is 
this de-normalization frowned upon, or is there a performance advantage 
here that warrants the usage?


-- Dante











--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 5:18 PM, Scott Marlowe [EMAIL PROTECTED] wrote:

  Try upping your checkpoint segments.  Some folks find fairly large
  numbers like 50 to 100 to be helpful.  Each segment = 16Megs, so be
  sure not to run your system out of drive space while increasing it.


Ahh, much more progress. Upping the segments to 50, timeout to 30m and
completion target to 0.9 has improved average copy time to between 2
and 10 seconds, which is definitely an improvement. Thanks for the
help. Any other random thoughts while you're at it? :)

-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Schema migration tools?

2008-04-22 Thread Erik Jones


On Apr 22, 2008, at 4:33 PM, John DeSoi wrote:



On Apr 22, 2008, at 4:53 PM, Erik Jones wrote:

It would be nice if PostgreSQL had some kind of unique reference  
for the column, but I think columns are just numbered sequentially  
as they are added. It would also be neat to have a built-in way to  
log the schema changes.


It does: log_statement set to either 'all' or 'ddl' will do the  
trick.


If I do this, is there a way to get a transaction consistent log of  
just the necessary commands to transform another copy of the  
database? In other words, I assume this approach will log each DDL  
command even if the transaction is rolled back. Correct?



Right.  It's not something you'd want to lift directly and run  
somewhere else.


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Erik Jones


On Apr 22, 2008, at 4:46 PM, David Wilson wrote:

On Tue, Apr 22, 2008 at 5:18 PM, Scott Marlowe [EMAIL PROTECTED] 
 wrote:


Try upping your checkpoint segments.  Some folks find fairly large
numbers like 50 to 100 to be helpful.  Each segment = 16Megs, so be
sure not to run your system out of drive space while increasing it.



Ahh, much more progress. Upping the segments to 50, timeout to 30m and
completion target to 0.9 has improved average copy time to between 2
and 10 seconds, which is definitely an improvement. Thanks for the
help. Any other random thoughts while you're at it? :)


Has anyone yet pointed out the standard advice:  drop indexes and foreign  
keys and rebuild them once the entire data import is finished?


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Tom Lane
David Wilson [EMAIL PROTECTED] writes:
 Foreign keys are definitely indexed (actually referencing a set of
 columns that the foreign table is UNIQUE on).

Are you loading any tables that are the targets of foreign key
references from other tables being loaded?  If so, I'd bet on
Scott's theory being correct with respect to the plans for checks
of those FK constraints.  The COPY itself hasn't got any plan,
and inserting rows into a table should be constant-time in itself,
so it seems to me there are only two possibilities for a progressive
slowdown:

* the cost of updating the indexes, which for standard btree indexes
ought to grow at about log(N) when there are already N entries

* bad plans in either foreign-key triggers or user-defined triggers
attached to the tables.

You failed to mention what PG version this is (tut tut) but if it's
less than 8.3 then ANALYZE alone won't fix bad plans in triggers;
you'd need to analyze and then start a fresh database session.

You said that you don't need to have the FK constraints present,
so I'd strongly suggest trying it without ...

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Jeff Davis
On Tue, 2008-04-22 at 16:34 -0500, D. Dante Lorenso wrote:
 I see this might be a 
 problem with storage since you will need to store the TEXT value for 
 every row in the 'mystuff' table instead of just storing the reference 
 to the lookup table as an INTEGER.  Over millions of rows, perhaps this 
 would become a concern?

It does use additional storage to store the full text value, rather than
a fixed-size integer. However, the difference is not much when the
average string length is short.

If you store an integer reference instead, joins are not necessarily
expensive. If the number of distinct values is small (which is the
normal use case for ENUM), I would expect the joins to be quite cheap.
Beware of running into bad plans however, or making the optimizer work
too hard (if you have a lot of other joins, too).

I don't think the performance concerns are major, but worth considering
if you have millions of rows.

 What is the general consensus by the community about this approach?  Is 
 this de-normalization frowned upon, or is there a performance advantage 
 here that warrants the usage?

This is not de-normalization, at all. Normalization is a formal process,
and if this were de-normalization, you could find a specific rule that
is violated by this approach.

Look here:
http://en.wikipedia.org/wiki/Third_normal_form

If you go to higher normal forms, you will not find any violations
there, either. There is nothing about normalization that requires the
use of surrogate keys.

The approach suggested by Scott Marlowe is normalized as well as being
quite natural and simple. I think often this is overlooked as being too
simple, but it's a quite good design in many cases.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 6:10 PM, Tom Lane [EMAIL PROTECTED] wrote:
 David Wilson [EMAIL PROTECTED] writes:

  Are you loading any tables that are the targets of foreign key
  references from other tables being loaded?  If so, I'd bet on
  Scott's theory being correct with respect to the plans for checks
  of those FK constraints.  The COPY itself hasn't got any plan,
  and inserting rows into a table should be constant-time in itself,
  so it seems to me there are only two possibilities for a progressive
  slowdown:

  * the cost of updating the indexes, which for standard btree indexes
  ought to grow at about log(N) when there are already N entries

  * bad plans in either foreign-key triggers or user-defined triggers
  attached to the tables.

Only one table is being regenerated, and it's not the target of any
foreign key checks itself; it merely has a single FK reference out to
one unchanging table. There are no triggers on the table.

  You failed to mention what PG version this is (tut tut) but if it's
  less than 8.3 then ANALYZE alone won't fix bad plans in triggers;
  you'd need to analyze and then start a fresh database session.

PG is 8.3.1.

I certainly expect some slowdown, given that I have indices that I
can't drop (as you indicate above). Having been watching it now for a
bit, I believe that the checkpoint settings were the major cause of
the problem, however. Changing those settings has dropped the copy
times back down toward what I'd expect; I have also now dropped the FK
constraint, but that has made no perceptible difference in time.

My guess at this point is that I'm just running into index update
times and checkpoint IO. The only thing that still seems strange is
the highly variable nature of the COPY times- anywhere from 1.0
seconds to 20 seconds, with an average probably around 8ish. I can
live with that, but I'm still open to any other suggestions anyone
has!

Thanks for the help so far.

-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Tom Lane
David Wilson [EMAIL PROTECTED] writes:
 My guess at this point is that I'm just running into index update
 times and checkpoint IO. The only thing that still seems strange is
 the highly variable nature of the COPY times- anywhere from 1.0
 seconds to 20 seconds, with an average probably around 8ish. I can
 live with that, but I'm still open to any other suggestions anyone
 has!

What have you got shared_buffers set to?  If it's not enough to cover
the working set for your indexes, that might be the (other) problem.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: FW: Re: [GENERAL] create temp in function

2008-04-22 Thread Klint Gore

Kerri Reno wrote:
So the reason I'm getting the error is that I'm running it in 8.0.  
Thanks so much for your help!

Kerri

use FOR instead

CREATE OR REPLACE FUNCTION private.get_salary_schedule(integer, text)
 RETURNS record AS
$BODY$
declare
   tcp_id alias for $1;
   tfy alias for $2;
   tbl text := 'saltab' || tfy;
   arow record;
   query text;
begin
   query = 'select ' || 'salary_schedule, pay_column, step from ' || tbl ||
   ' where cp_id = ' || to_char(tcp_id,'');
   raise notice 'query: %', query;
   for arow in execute query loop
   return arow;
   end loop;
   return null;
end;
$BODY$
 LANGUAGE 'plpgsql' STABLE SECURITY DEFINER;

you probably need to protect tfy from sql injection too.  see quote_ident.
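
e.g. something like this in the declaration (a sketch; quote_ident only
adds quotes when they are actually needed, so the rest of the query
stays the same):

   tbl text := quote_ident('saltab' || tfy);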

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789  
Fax: 02 6773 3266

EMail: [EMAIL PROTECTED]


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Clemens Schwaighofer
On 04/22/2008 07:30 PM, Albe Laurenz wrote:
 Clemens Schwaighofer wrote:
 I sometimes have a problem with conversion of encodings eg from UTF-8
 tio ShiftJIS:

 ERROR:  character 0xf0a0aeb7 of encoding UTF8 has no
 equivalent in SJIS

 I have no idea what character this is, I cannot view it in my
 browser, etc.
 It translates to Unicode 10BB7, which is not defined.
 I guess that is not intended; can you guess what the character(s) should be?
 to be honest no idea. its some chinese character, I have no idea how the
 user input this, because this is a japanese page.

 I actually found the carachter, but only my Mac OS X can show it. It
 looks similar to a japanese character used for a name, but how the
 chinese one got selected is a mystery to me ...
 
 Are you sure that your Mac OS X computer interprets the character as
 UTF-8?

Of that I cannot be sure; I just searched through a page that has a
complete list. OS X can render it, Linux cannot; I have not tried Windows.

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread David Wilson
On Tue, Apr 22, 2008 at 7:33 PM, Tom Lane [EMAIL PROTECTED] wrote:

  What have you got shared_buffers set to?  If it's not enough to cover
  the working set for your indexes, that might be the (other) problem.


shared_buffers = 1536MB

Is there a way to get the size of a specific index, on that note?
There seem to be access functions for the relation + indices, and for
the relation by itself, but not a specific index out of possibly
several. I could increase shared_buffers some, but client apps on the
same machine occasionally also have hefty memory requirements (not
during these regeneration runs, but it seems like restarting the
server with a new shared_buffers value before and after the
regeneration is a bit of overkill).
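
(For the index size question: pg_relation_size() will take an index name
directly, so something like the following should work; the index name is
just a placeholder.)

SELECT pg_size_pretty(pg_relation_size('tablename_col_idx'));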

-- 
- David T. Wilson
[EMAIL PROTECTED]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to modify ENUM datatypes?

2008-04-22 Thread Merlin Moncure
On Tue, Apr 22, 2008 at 6:11 PM, Jeff Davis [EMAIL PROTECTED] wrote:
  If you store an integer reference instead, joins are not necessarily
  expensive. If the number of distinct values is small (which is the
  normal use case for ENUM), I would expect the joins to be quite cheap.
  Beware of running into bad plans however, or making the optimizer work
  too hard (if you have a lot of other joins, too).

Necessarily being the operative word here.  Think about an enum as
part of a composite key for example.  It's a lot nicer to rely on enum
for natural ordering than doing something like a functional index.

Anyways, it's pretty easy to extend an enum...you can manually insert
an entry into pg_enum (see the relevant docs).  Just watch out for oid
overlap.  One thing that is currently very difficult to do
is to alter the order of the enum elements.  The current state of
things is pretty workable though.

Scott's color/mystuff example is generally preferred for a lot of
cases.  I _really_ prefer this to surrogate style enums where you have
color_id...this approach makes your database unreadable IMO.  A decent
hybrid approach which I have been using lately is "char" (not char)
where the choices set is reasonably small, well represented by a
single character, and the intrinsic ordering property is not too
important (where an enum might be better).  In many cases though, the
pure natural approach is simply the best.  The enum, though, with its
intrinsic ordering and more efficient indexing, has an important niche.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Rapidly decaying performance repopulating a large table

2008-04-22 Thread Greg Smith

On Tue, 22 Apr 2008, David Wilson wrote:


My guess at this point is that I'm just running into index update
times and checkpoint IO. The only thing that still seems strange is
the highly variable nature of the COPY times- anywhere from 1.0
seconds to 20 seconds, with an average probably around 8ish.


Have you turned on log_checkpoints to see whether those are correlated 
with the slow ones?  Given that you've had an improvement by increasing 
checkpoint_segments, it's not out of the question to think that maybe 
you're still getting nailed sometimes during the more stressful portions 
of the checkpoint cycle (usually right near the end).  The 1 second ones 
might just happen to be ones that start just as the previous 
checkpoint finished.  To make lining those up easier, you might turn on 
logging of long statements with log_min_duration_statement to see both 
bits of data in the same log file.  That might get you some other 
accidental enlightenment as well (like if there's some other statement 
going on that's colliding with this load badly).


This is a bit out of my area, but after reading the rest of this thread I 
wonder whether raising the default_statistics_target parameter a bit might 
reduce the instances of bad plans showing up.
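
i.e. something like this in postgresql.conf (the duration threshold is
arbitrary, and the last line is only worth trying if the plans look off;
the 8.3 default is 10):

log_checkpoints = on
log_min_duration_statement = 1000    # milliseconds
#default_statistics_target = 100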


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general