Re: [HACKERS] Why do we want to %Remove behavior of postmaster -o
On Saturday, 3 June 2006 04:27, Tom Lane wrote:
> Actually, the TODO item is very badly worded.  The idea is to get rid of
> the spelling differences between postmaster and postgres options, and then
> there will be no need for '-o' because you'll just say what you want ---
> that is, "-o -foo" and "-foo" will be interchangeable.

This is already done.  I suppose that the idea was that the -o option should be phased out over a couple of releases.

-- Peter Eisentraut  http://developer.postgresql.org/~petere/
[HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE constraint
I have a table that I am using to store email token data for DSPAM.  I'm noticing that a handful (4-16) of rows with duplicate columns (uid,token) are sneaking into the table every day despite the primary key constraint.  The server currently processes a few thousand emails per day, and this particular table currently has about 4.5 million rows in it.

I feel as though I must be missing something here, because I have always strongly assumed that postgresql prevents this sort of chaos from happening by default.

When these duplicate pairs make their way into the table, all havoc breaks loose ;)  The rows with the duplicate pairs seem to become land mines.  The postgresql process handling a query that subsequently runs across one of these rows dies, taking down the DSPAM daemon with it, and sometimes corrupting the postgresql shared memory enough that the postmaster has to shut down the other processes and restart everything anew [1].

I am usually able to clean things up by running the following, but once or twice I've had to drop the unique constraint before postgresql would process the request without choking:

    delete from dspam_token_data
      where row(uid,token) in
        (select uid, token from dspam_token_data
          group by uid, token having count(*) > 1);

(I don't worry about preserving one of the duplicates here.)

I'm running postgresql-8.1.3.  Here is the table in question:

    CREATE TABLE dspam.dspam_token_data
    (
      uid int4 NOT NULL,
      token int8 NOT NULL,
      spam_hits int4,
      innocent_hits int4,
      last_hit date,
      CONSTRAINT dspam_token_data_pkey PRIMARY KEY (uid, token)
    ) WITHOUT OIDS; [2]

What steps can I take to fix this?  Is there any additional information I can provide?  I've cleaned the table many times now.  I then run VACUUM ANALYZE.  My next step will probably be to hack the DSPAM sources to make the application more careful about not trying to insert rows that would violate the unique constraint.  Even still, it seems that would only reduce the frequency of these occurrences, not eliminate them completely.

Thanks!

Cheers,
-- Travis

Notes:
[1] A condensed log file showing off exactly what happens here is attached.
[2] Previously, the table lacked a primary key and instead used a unique constraint and index.  This yielded the exact same results I am currently seeing using a two-column primary key, as above.
The old table schema was:

    CREATE TABLE dspam.dspam_token_data
    (
      uid int4,
      token int8,
      spam_hits int4,
      innocent_hits int4,
      last_hit date,
      CONSTRAINT dspam_token_data_uid_key UNIQUE (uid, token)
    ) WITHOUT OIDS;

Jun 5 17:58:14 shadow postgres[28775]: [21-1] PANIC: right sibling's left-link doesn't match
Jun 5 17:58:14 shadow postgres[28775]: [21-2] STATEMENT: PREPARE dspam_update_plan (bigint) AS UPDATE dspam_token_data SET last_hit = CURRENT_DATE, innocent_hits =
Jun 5 17:58:14 shadow postgres[28775]: [21-3] innocent_hits + 1 WHERE uid = '1' AND token = $1;PREPARE dspam_insert_plan (bigint, int, int) AS INSERT INTO dspam_token_data
Jun 5 17:58:14 shadow postgres[28775]: [21-4] (uid, token, spam_hits, innocent_hits, last_hit) VALUES (1, $1, $2, $3, CURRENT_DATE);
Jun 5 17:58:14 shadow dspam[7780]: query error: (null): see sql.errors for more details
Jun 5 17:58:14 shadow postgres[10566]: [21-1] LOG: server process (PID 28775) was terminated by signal 6
Jun 5 17:58:14 shadow postgres[10566]: [22-1] LOG: terminating any other active server processes
Jun 5 17:58:14 shadow postgres[7219]: [21-1] WARNING: terminating connection because of crash of another server process
Jun 5 17:58:14 shadow postgres[7219]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jun 5 17:58:14 shadow postgres[7219]: [21-3] process exited abnormally and possibly corrupted shared memory.
Jun 5 17:58:14 shadow postgres[7525]: [24-1] LOG: database system was interrupted at 2006-06-05 17:58:01 UTC
Jun 5 17:58:14 shadow postgres[7525]: [25-1] LOG: checkpoint record is at 16/DA1D6868
Jun 5 17:58:14 shadow postgres[7525]: [26-1] LOG: redo record is at 16/DA0081D8; undo record is at 0/0; shutdown FALSE
Jun 5 17:58:14 shadow postgres[7525]: [27-1] LOG: next transaction ID: 5691748; next OID: 148615
Jun 5 17:58:14 shadow postgres[7525]: [28-1] LOG: next MultiXactId: 4; next MultiXactOffset: 10
Jun 5 17:58:14 shadow postgres[7525]: [29-1] LOG: database system was not properly shut down; automatic recovery in progress
Jun 5 17:58:14 shadow postgres[7525]: [30-1] LOG: redo starts at 16/DA0081D8
Jun 5 17:58:14 shadow postgres[7525]: [31-1] LOG: record with zero length at 16/DC2858F8
Jun 5 17:58:14 shadow postgres[7525]: [32-1] LOG: redo done at 16/DC2858C8
Jun 5 17:58:16 shadow postgres[7542]: [24-1] FATAL: the database system is starting up
Jun 5 17:58:16 shadow postgres[7543]: [24-1] FATAL: the database
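[Editor's note: for readers in the same spot, a cleanup variant that keeps one merged survivor per (uid, token) pair, instead of discarding both duplicates, is to rebuild the table; this is only a hedged sketch against the schema above, and it has the side benefit of recreating the index from scratch:

    BEGIN;
    -- collapse each duplicate pair into a single row
    CREATE TABLE dspam.dspam_token_data_clean AS
        SELECT uid, token,
               max(spam_hits)     AS spam_hits,
               max(innocent_hits) AS innocent_hits,
               max(last_hit)      AS last_hit
          FROM dspam.dspam_token_data
         GROUP BY uid, token;
    DROP TABLE dspam.dspam_token_data;
    ALTER TABLE dspam.dspam_token_data_clean RENAME TO dspam_token_data;
    -- re-add the primary key; this rebuilds the (possibly corrupt) index
    ALTER TABLE dspam.dspam_token_data
        ADD CONSTRAINT dspam_token_data_pkey PRIMARY KEY (uid, token);
    COMMIT;

Taking max() of the hit counters is an arbitrary merge policy; pick whatever suits the application.]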
[HACKERS] AIX check in datetime.h
Can someone please explain why in include/utils/datetime.h (struct datetkn) there is a check for _AIX that either initializes a char* pointer or a char array?  Is there any advantage of a char array except for warnings from some compilers if the initialization string is too long?  Apart from that, I doubt that AIX cannot handle token[TOKMAXLEN], because similar declarations can be found in other headers without the _AIX check.  The struct definition is more than 9 years old and seems to show up first in Attic/dt.h.

    #define TOKMAXLEN 10   /* only this many chars are stored in datetktbl */

    /* keep this struct small; it gets used a lot */
    typedef struct
    {
    #if defined(_AIX)
        char       *token;
    #else
        char        token[TOKMAXLEN];
    #endif   /* _AIX */
        char        type;
        char        value;      /* this may be unsigned, alas */
    } datetkn;

Joachim
Re: [HACKERS] fillfactor using WITH syntax
On Tue, 2006-06-06 at 11:45 +0900, ITAGAKI Takahiro wrote:
> Hi Hackers, I'm rewriting the fillfactor patch, per the following discussion:
>   http://archives.postgresql.org/pgsql-hackers/2006-03/msg00287.php
> Now fillfactor can be set using WITH syntax:
>   - CREATE INDEX index ON table USING btree (columns) WITH (...)
>   - CREATE TABLE table (i integer PRIMARY KEY WITH (...))
>   - ALTER TABLE table ADD PRIMARY KEY (columns) WITH (...)

Sounds good.  This is important in other situations too, e.g.
http://archives.postgresql.org/pgsql-hackers/2005-09/msg00851.php

> The settings are stored in pg_class.relfillfactor and the last value will
> be used on the next REINDEX.  The WITH parameter is a list of DefElems,
> so we can use it to pass additional parameters to index access methods.

Are you implementing the array of parameters on pg_index as Tom suggested, or pg_class.relfillfactor?  Why not implement an array of option parameters on pg_class, so both heaps and indexes can be given additional parameters?  That way you wouldn't need a specific relfillfactor attribute.  That would allow us to keep CREATE TABLE free of additional keywords also.

-- Simon Riggs  EnterpriseDB  http://www.enterprisedb.com
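[Editor's note: for concreteness, usage under the proposed patch would look something like the lines below; this is a sketch of the syntax under discussion, and the parameter name "fillfactor" is assumed from the thread, not from released documentation:

    CREATE INDEX i_foo ON foo USING btree (bar) WITH (fillfactor = 70);
    CREATE TABLE t (i integer PRIMARY KEY WITH (fillfactor = 90));
    ALTER TABLE t2 ADD PRIMARY KEY (i) WITH (fillfactor = 90);

A lower fillfactor leaves free space in each index page for later insertions, trading disk space for fewer page splits.]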
[HACKERS] How to avoid transaction ID wrap
OK, here's my problem: I have a nature study where we have about 10 video cameras taking 15 frames per second.  For each frame we make a few transactions on a PostgreSQL database.  We want to keep about a year's worth of data at any specific time.  We have triggers that fire if something interesting is found on insert.  We want this thing to run for a long time.

From the numbers, you can see the PostgreSQL database is VERY loaded.  Running VACUUM may not always be possible without losing data.  The numbers I have amount to 466,560,000 transactions per month, lasting a maximum of about 9 months until XID wrap.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number of transactions, COPY, etc., so I'm not dead in the water, but I would be interested in any observations you may have.
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
On Monday, 2006-06-05 at 14:10, Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
> > > The general case cannot be applied for all particular cases.  E.g. you
> > > cannot use cursors from shell scripts.
> >
> > This could be fixed by adding an option to psql to transparently produce
> > SELECT result sets via a cursor.
>
> I think this is an excellent idea.
>
>     psql --cursor --fetchby 1 -c "select ..." | myprogram
>
> Note of course that such a thing would push the incomplete-result problem
> further upstream.  For instance in (hypothetical --cursor switch)
>     psql --cursor -c "select ..." | myprogram
> there would be no very good way for myprogram to find out that it'd been
> sent an incomplete result due to error partway through the SELECT.

would it not learn about it at the point of error ?

even without --cursor there is still no very good way to find out when something else goes wrong, like the result inside libpq taking up all memory, so that psql runs out of memory while formatting some longer lines.

-- Hannu Krosing, Database Architect, Skype Technologies OÜ
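[Editor's note: for context, this is the cursor pattern that the hypothetical --cursor switch would automate; a sketch with made-up names, easy from an interactive session but awkward to drive from a shell script, which is what started the thread:

    BEGIN;
    DECLARE bigres CURSOR FOR SELECT * FROM bigtable;
    FETCH 1000 FROM bigres;   -- repeat until it returns fewer rows than asked
    CLOSE bigres;
    COMMIT;

Fetching in batches keeps only one batch of rows in client memory at a time, rather than materializing the whole result set in libpq.]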
Re: [HACKERS] How to avoid transaction ID wrap
On Tue, 6 Jun 2006, Mark Woodward wrote:
> OK, here's my problem: I have a nature study where we have about 10 video
> cameras taking 15 frames per second.  For each frame we make a few
> transactions on a PostgreSQL database.

I would suggest doing all the inserts of one frame in one transaction.  Maybe even multiple frames in one transaction.  That should bring down the number of transactions significantly.

> We want to keep about a year's worth of data at any specific time.  We
> have triggers that fire if something interesting is found on insert.  We
> want this thing to run for a long time.  From the numbers, you can see
> the PostgreSQL database is VERY loaded.  Running VACUUM may not always be
> possible without losing data.

Why not?

> The numbers I have amount to 466,560,000 transactions per month, lasting
> a maximum of about 9 months until XID wrap.

If you can get that maximum up above one year (which was how long you want to keep the data), you won't need to freeze the records to avoid ID wraparound.

- Heikki
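[Editor's note: a minimal sketch of that suggestion, with a hypothetical table and columns.  Every statement outside an explicit transaction block is its own transaction, so wrapping a batch of inserts in BEGIN/COMMIT consumes one XID per batch instead of one per statement:

    BEGIN;
    INSERT INTO frame_events (camera_id, frame_no, event) VALUES (3, 101, 'motion');
    INSERT INTO frame_events (camera_id, frame_no, event) VALUES (3, 102, 'motion');
    -- ... the rest of the rows for this batch of frames ...
    COMMIT;

Batching, say, a second's worth of frames per camera cuts XID consumption by roughly the batch size, pushing the wraparound horizon out proportionally.]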
Re: [HACKERS] How to avoid transaction ID wrap
On Tuesday, 2006-06-06 at 08:42, Mark Woodward wrote:
> OK, here's my problem: I have a nature study where we have about 10 video
> cameras taking 15 frames per second.  For each frame we make a few
> transactions on a PostgreSQL database.  We want to keep about a year's
> worth of data at any specific time.

partition by month, then you have better chances of removing old data without causing overload/data loss;

> We have triggers that fire if something interesting is found on insert.
> We want this thing to run for a long time.  From the numbers, you can see
> the PostgreSQL database is VERY loaded.  Running VACUUM may not always be
> possible without losing data.

why ?  just run it with very friendly delay settings.

> The numbers I have amount to 466,560,000 transactions per month, lasting
> a maximum of about 9 months until XID wrap.

actually 4.5 months, as you will start having problems at 2G xacts.

> I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
> of transactions, COPY, etc., so I'm not dead in the water, but I would be
> interested in any observations you may have.

-- Hannu Krosing, Database Architect, Skype Technologies OÜ
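[Editor's note: a sketch of the month-partitioning idea using 8.1-style inheritance, with all names hypothetical; retiring a month of data then becomes a cheap DROP or TRUNCATE of one child table rather than a huge DELETE followed by VACUUM:

    CREATE TABLE frames (
        captured_at  timestamptz NOT NULL,
        camera_id    integer,
        payload      bytea
    );

    CREATE TABLE frames_2006_06 (
        CHECK (captured_at >= '2006-06-01' AND captured_at < '2006-07-01')
    ) INHERITS (frames);

    -- with constraint_exclusion = on, queries against frames skip children
    -- whose CHECK constraint rules them out; old data goes away as a
    -- catalog operation:
    DROP TABLE frames_2005_06;

Inserts must be directed at the right child (by the application or a trigger); queries can go through the parent.]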
Re: [HACKERS] How to avoid transaction ID wrap
Mark Woodward wrote:
> OK, here's my problem: I have a nature study where we have about 10 video
> cameras taking 15 frames per second.  For each frame we make a few
> transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O savings could be enough to buy you the ability to vacuum once in a while.

Or consider buffering somehow -- save the data elsewhere, and have some sort of daemon to put it into the database.  This would make it possible to cope with the I/O increase during vacuum.

-- Alvaro Herrera  http://www.CommandPrompt.com/  PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Why do we want to %Remove behavior of postmaster -o
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> On Saturday, 3 June 2006 04:27, Tom Lane wrote:
> > Actually, the TODO item is very badly worded.  The idea is to get rid of
> > the spelling differences between postmaster and postgres options, and
> > then there will be no need for '-o' because you'll just say what you
> > want --- that is, "-o -foo" and "-foo" will be interchangeable.
>
> This is already done.

Oh, right.  The patch you still have uncommitted had to do with getting rid of the separate postmaster and postgres executables, right?  Is that going anywhere, or did you decide it's not worth the trouble?

regards, tom lane
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
Hannu Krosing <[EMAIL PROTECTED]> writes:
> On Monday, 2006-06-05 at 14:10, Tom Lane wrote:
> > Note of course that such a thing would push the incomplete-result
> > problem further upstream.  For instance in (hypothetical --cursor switch)
> >     psql --cursor -c "select ..." | myprogram
> > there would be no very good way for myprogram to find out that it'd been
> > sent an incomplete result due to error partway through the SELECT.
>
> would it not learn about it at the point of error ?

No, it would merely see EOF after some number of result rows.  (I'm assuming you're also using -A -t so that the output is unadorned.)

regards, tom lane
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Mon, Jun 05, 2006 at 11:02:33PM -0400, Tom Lane wrote:
> Just got this rather surprising result:
> [snip bogus explain output]
> The "Total runtime" is correct AFAICT, which puts the top node's actual
> time rather far out in left field.  This is pretty repeatable on my old
> slow HPPA machine.  A new Xeon shows less of a discrepancy, but it's
> still claiming top node actual > total, which is not right.

Weird.  Can you get the output of *instr in each call of InstrEndLoop?  Preferably after it does the calculation but before it clears the values, so we get an idea of the number of samples and what it guesses.  SampleOverhead would be good too.

I know my version produced sensible results on my machine and the handful of people testing, so I'll try it again with your changes and see how it looks...

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org>  http://svana.org/kleptog/
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE constraint
Travis Cross <[EMAIL PROTECTED]> writes:
> I'm noticing that a handful (4-16) of rows with duplicate columns
> (uid,token) are sneaking into the table every day despite the primary key
> constraint.

Corrupt index, looks like ... you might try reindexing the index.

I don't believe that the PANIC you show has anything directly to do with duplicate entries.  It is a symptom of corrupt index structure.  Now a corrupt index might also explain failure to notice duplications, but changing your application isn't going to fix whatever is causing it.  You need to look for server-side causes.  Any database or system crashes on this server (before this problem started)?  Do you *know* that the disk drive will not lie about write complete?  What is the platform and storage system, anyway?

regards, tom lane
Re: [HACKERS] fillfactor using WITH syntax
Simon Riggs <[EMAIL PROTECTED]> writes:
> Why not implement an array of option parameters on pg_class, so both
> heaps and indexes can be given additional parameters?  That way you
> wouldn't need a specific relfillfactor attribute.  That would allow us to
> keep CREATE TABLE free of additional keywords also.

None of this should go anywhere near pg_class.  IIRC the solutions we discussed involved adding some sort of array to pg_index.  A solution that only works for FILLFACTOR is missing the point, too.

regards, tom lane
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
On Tue, Jun 06, 2006 at 09:48:43AM -0400, Tom Lane wrote:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > On Monday, 2006-06-05 at 14:10, Tom Lane wrote:
> > > Note of course that such a thing would push the incomplete-result
> > > problem further upstream.  For instance in (hypothetical --cursor switch)
> > >     psql --cursor -c "select ..." | myprogram
> > > there would be no very good way for myprogram to find out that it'd
> > > been sent an incomplete result due to error partway through the SELECT.
> >
> > would it not learn about it at the point of error ?
>
> No, it would merely see EOF after some number of result rows.  (I'm
> assuming you're also using -A -t so that the output is unadorned.)

So if an error occurs partway through reading a cursor, no error message is generated?  That certainly sounds like a bug to me...

-- Jim C. Nasby, Sr. Engineering Consultant, Pervasive Software  http://pervasive.com
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
Jim C. Nasby <[EMAIL PROTECTED]> writes:
> On Tue, Jun 06, 2006 at 09:48:43AM -0400, Tom Lane wrote:
> > >     psql --cursor -c "select ..." | myprogram
> > > there would be no very good way for myprogram to find out that it'd
> > > been sent an incomplete result due to error partway through the SELECT.
>
> So if an error occurs partway through reading a cursor, no error message
> is generated?  That certainly sounds like a bug to me...

Sure an error is generated.  But it goes to stderr.  The guy at the downstream end of the stdout pipe cannot see either the error message, or the nonzero status that psql will (hopefully) exit with.  You can theoretically deal with this by having the shell script calling this combination check psql's exit status and discard the results of myprogram on failure, but it's not easy or simple.

regards, tom lane
Re: [HACKERS] COPY (query) TO file
In article <[EMAIL PROTECTED]>, Andrew Dunstan <[EMAIL PROTECTED]> writes:

> Mark Woodward wrote:
> > Tom had posted a question about file compression with copy.  I thought
> > about it, and I want to throw this out and see if anyone thinks it is a
> > good idea.  Currently, the COPY command only copies a table; what if it
> > could operate with a query, as:
> >     COPY (select * from mytable where foo='bar') as BAR TO stdout
>
> Isn't this already being worked on?  The TODO list says:
>     Allow COPY to output from views
> IIRC Karel Zak posted a patch for that.  Another idea would be to allow
> actual SELECT statements in a COPY.  Personally I strongly favor the
> second option as being more flexible than the first.

How so?  I see that

    psql -h somehost somedb -c "copy 'somequery' to stdout" > localfile

would be more terse than

    psql -h somehost somedb -c "create temp view tmp as somequery; copy tmp to stdout" > localfile

but what's more flexible there?
Re: [HACKERS] COPY (query) TO file
On Tue, Jun 06, 2006 at 04:47:40PM +0200, Harald Fuchs wrote:
> In article <[EMAIL PROTECTED]>, Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > Isn't this already being worked on?  The TODO list says:
> >     Allow COPY to output from views
> > IIRC Karel Zak posted a patch for that.  Another idea would be to allow
> > actual SELECT statements in a COPY.  Personally I strongly favor the
> > second option as being more flexible than the first.
>
> How so?  I see that
>     psql -h somehost somedb -c "copy 'somequery' to stdout" > localfile
> would be more terse than
>     psql -h somehost somedb -c "create temp view tmp as somequery; copy tmp to stdout" > localfile
> but what's more flexible there?

Flexibility aside, doing this via a temporary view is a very non-intuitive way to go about it.  AFAIK CREATE TEMP VIEW is also regular DDL, which means more overhead in the system catalogs, along with more need to vacuum.

I really fail to see why we shouldn't support copying from a query unless there's some serious technical challenge.  If there was a serious technical challenge that using a temporary view solved, we should do the work of creating the temporary view for the user.

-- Jim C. Nasby, Sr. Engineering Consultant, Pervasive Software  http://pervasive.com
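[Editor's note: until COPY accepts a query or a view, the only way to spell this against a stock 8.1 server needs a real relation, since COPY there handles only plain tables.  A sketch:

    -- materialize the query, copy it out, then clean up
    CREATE TEMP TABLE tmp_out AS SELECT * FROM mytable WHERE foo = 'bar';
    COPY tmp_out TO STDOUT;
    DROP TABLE tmp_out;

The temp table vanishes at session end anyway; the explicit DROP just keeps long-lived sessions tidy.]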
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE constraint
On 6/6/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> Travis Cross <[EMAIL PROTECTED]> writes:
> > I'm noticing that a handful (4-16) of rows with duplicate columns
> > (uid,token) are sneaking into the table every day despite the primary
> > key constraint.
>
> Corrupt index, looks like ... you might try reindexing the index.
>
> I don't believe that the PANIC you show has anything directly to do with
> duplicate entries.  It is a symptom of corrupt index structure.  Now a
> corrupt index might also explain failure to notice duplications, but
> changing your application isn't going to fix whatever is causing it.  You
> need to look for server-side causes.  Any database or system crashes on
> this server (before this problem started)?  Do you *know* that the disk
> drive will not lie about write complete?  What is the platform and
> storage system, anyway?

FWIW I've seen similar behaviour to this (PostgreSQL processes exiting abnormally, index corruption with duplicate primary keys) on servers with defective RAM chips.

Ian Barwick
Re: [HACKERS] fillfactor using WITH syntax
On Tue, 2006-06-06 at 10:27 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Why not implement an array of option parameters on pg_class, so both
> > heaps and indexes can be given additional parameters?  That way you
> > wouldn't need a specific relfillfactor attribute.  That would allow us
> > to keep CREATE TABLE free of additional keywords also.
>
> None of this should go anywhere near pg_class.  IIRC the solutions we
> discussed involved adding some sort of array to pg_index.

Itagaki had suggested adding options to heaps also, so clearly we'd need to add that to pg_class, rather than pg_index, in that case.  PCTFREE would be useful for heaps as well as indexes, but there could be other options also.  Extending the thought for the general case, I see no reason why we would want to permanently exclude heaps from having a more flexible set of options when we aim to provide that for indexes.

-- Simon Riggs  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE
Kenneth Marshall wrote:
> We have used postgresql 7.4, 8.0, and 8.1 with DSPAM and have never had a
> single problem like you are describing.  In the past on this mailing
> list, these sorts of issues have been caused by hardware problems on the
> DB server in some cases.  Good luck with tracking it down.

Thanks.  It is always good to know that something *should* and *does* work well.  That gives me a good base for tracking down the issue.

Cheers,
-- Travis
Re: [HACKERS] AIX check in datetime.h
Wow, that is strange.  We could remove it for 8.2 and see how testing goes.

---------------------------------------------------------------------------

Joachim Wieland wrote:
> Can someone please explain why in include/utils/datetime.h (struct
> datetkn) there is a check for _AIX that either initializes a char*
> pointer or a char array?  Is there any advantage of a char array except
> for warnings from some compilers if the initialization string is too
> long?  Apart from that, I doubt that AIX cannot handle token[TOKMAXLEN],
> because similar declarations can be found in other headers without the
> _AIX check.  The struct definition is more than 9 years old and seems to
> show up first in Attic/dt.h.
>
>     #define TOKMAXLEN 10   /* only this many chars are stored in datetktbl */
>
>     /* keep this struct small; it gets used a lot */
>     typedef struct
>     {
>     #if defined(_AIX)
>         char       *token;
>     #else
>         char        token[TOKMAXLEN];
>     #endif   /* _AIX */
>         char        type;
>         char        value;      /* this may be unsigned, alas */
>     } datetkn;

-- Bruce Momjian  http://candle.pha.pa.us  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] AIX check in datetime.h
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Joachim Wieland wrote:
> > Can someone please explain why in include/utils/datetime.h (struct
> > datetkn) there is a check for _AIX that either initializes a char*
> > pointer or a char array?
>
> Wow, that is strange.  We could remove it for 8.2 and see how testing goes.

It looks like a workaround for some ancient compiler problem.  [ digs for awhile... ]  Very ancient: we inherited that hack from Berkeley, see
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/Attic/nabstime.h

I bet we can remove it.

regards, tom lane
Re: [HACKERS] adding new field
hey, let me clarify something.

1. Whenever you run a query, its result or some other thing gets cached.  I want to flush this cache.  Also let me know when this cache gets flushed automatically.

2. After adding a new field in the Resdom structure and making necessary changes in outfuncs.c, copyfuncs.c, etc., I *sometimes* get a garbage value for the field added.  The field is initialized to zero in makefuncs.c.

3. **IMP*** Is there something in postgres that allocates memory for only N fields of a structure, i.e., do I need to modify this N somewhere to reflect the addition of a new field?

And that *weird* thing is:
1. I ran some query, didn't get results.
2. Then I commented the part which I modified [i.e., commented out that new field in the Resdom structure], ran the same query again, got results.
3. Next, I uncommented that part and ran the same query again.  I GOT THE RESULTS.

That's how my prog is behaving, behaving in a *weird* way.  It seems like some memory probs somewhere.  I can't figure it out.  Yes, I'm re-compiling and running initdb whenever I make some modifications in the code.  I'm using release 8.0.3 and due to some reasons can't shift to some other release.

Regds,
Ranbeer

On 6/6/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Mon, Jun 05, 2006 at 11:24:09PM +0530, ranbeer makin wrote:
> > > 2. I have added a new field in the RESDOM structure, made necessary
> > > modifications in outfuncs.c, copy, read, equalfuncs.c, but my prog is
> > > behaving in a weird way, seems like some memory probs... What other
> > > modifications need to be done to reflect the addition of this new field?
> >
> > Did you remember to recompile *everything* affected?  (--enable-depend
> > is useful for this).  You also have to initdb again.
>
> Also, if you're working on a patch you hope to someday contribute, you
> should be starting from CVS HEAD or some reasonable approximation of it.
> Resdom disappeared more than a year ago:
> http://archives.postgresql.org/pgsql-committers/2005-04/msg00060.php
>
> regards, tom lane
Re: [HACKERS] AIX check in datetime.h
I wrote:
> It looks like a workaround for some ancient compiler problem.  [ digs for
> awhile... ]  Very ancient: we inherited that hack from Berkeley,

In fact, now that I know where to look, I find the same thing in the postgres-v4r2 tarball, which means the hack is pre-1994.  I don't have anything older to look at.

regards, tom lane
Re: [HACKERS] AIX check in datetime.h
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Joachim Wieland wrote:
> > > Can someone please explain why in include/utils/datetime.h (struct
> > > datetkn) there is a check for _AIX that either initializes a char*
> > > pointer or a char array?
> >
> > Wow, that is strange.  We could remove it for 8.2 and see how testing
> > goes.
>
> It looks like a workaround for some ancient compiler problem.  [ digs for
> awhile... ]  Very ancient: we inherited that hack from Berkeley, see
> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/Attic/nabstime.h
> I bet we can remove it.

OK, removed.  Let's see if we get failure feedback.

-- Bruce Momjian  http://candle.pha.pa.us  EnterpriseDB  http://www.enterprisedb.com
[HACKERS] AGGREGATE FUNCTIONS
Hello, I would like to know where in the source code of postgres the code of the aggregate functions min, max, and avg is located.  I wish to develop more statistical aggregate functions, and I prefer to use C rather than writing them in PL/R.  Thanks
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE
Tom Lane wrote:
> Travis Cross <[EMAIL PROTECTED]> writes:
> > I'm noticing that a handful (4-16) of rows with duplicate columns
> > (uid,token) are sneaking into the table every day despite the primary
> > key constraint.
>
> Corrupt index, looks like ... you might try reindexing the index.

I probably should have mentioned that I have indeed done a REINDEX on the table a couple of times in the past, suspecting that issue, and having seen it resolve similar issues on this list.  Upon your suggestion, I'm running one right now, and I will probably dump and reload the entire database after hours, unless anyone thinks that would be a bad idea (or unproductive in tracking this down).

> I don't believe that the PANIC you show has anything directly to do with
> duplicate entries.  It is a symptom of corrupt index structure.  Now a
> corrupt index might also explain failure to notice duplications, but
> changing your application isn't going to fix whatever is causing it.  You
> need to look for server-side causes.

Indeed, you are correct.  I should also mention that the problem seems to build over time, in the sense that everything will run fine for awhile (a few days), and then will crash repeatedly.  Deleting the duplicate rows seems to reset the counter -- of course, I cannot run a successful REINDEX until I have deleted those duplicate rows.

> Any database or system crashes on this server (before this problem started)?

No.  In fact, this box, and a sister box running similar hardware, have been models of system stability.  My uptimes are 46 and 87 days, respectively, representing the time since I've done a kernel upgrade and the time since I plugged the boxes into the rack.  The sister box is running real-time voice services.

> Do you *know* that the disk drive will not lie about write complete?

"Know" is such a strong word ;)  Honestly, I have very little idea.  I understand the nature of the problem this presents, as I've read the very fine PostgreSQL manual many times over the years.  Because the drives I use are specifically designed to operate well in a RAID environment, I would 'hope' that the drives perform honest write operations.  I wonder if there is a utility to perform a deterministic test of this...

> What is the platform and storage system, anyway?

The platform is: Linux 2.6.16.9 (w/o loadable modules), Supermicro PDSMi (a single-processor P-D board), 2G ECC DDRII SDRAM.

The storage system is: on-board SATA ICH7R controller, 2 x WD3200SD hard drives running in a Linux RAID 1 configuration.  That is to say: Western Digital 320G SATA 'enterprise' drives.  The drives have a somewhat unique feature: time-limited error recovery, which is supposed to let the RAID controller/software deal with errors after a certain point (7 seconds), rather than continuing to block and causing the drive to fall out of the array.  The drive:
http://www.westerndigital.com/en/products/products.asp?driveid=114&language=en

I'll run file system consistency checks tonight to see if I can pick out a proximal cause for all this chaos.  I really do appreciate the assistance.

Cheers,
-- Travis
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE
Ian Barwick wrote:
> FWIW I've seen similar behaviour to this (PostgreSQL processes exiting
> abnormally, index corruption with duplicate primary keys) on servers
> with defective RAM chips.

That's a good thought, and also a possibility.  I tend to distrust RAM inherently.  If nothing else seems verifiable as the cause, I may have to take the system down on a Saturday night for a good thorough run through memtest86.

Cheers,
-- Travis
Re: [HACKERS] AGGREGATE FUNCTIONS
On Tue, 6 Jun 2006, Roberto Rezende de Assis wrote:
> Hello, I would like to know where in the source code of postgres the
> code of the aggregate functions min, max, and avg is located.

They're in src/backend/utils/adt/numeric.c

> I wish to develop more statistical aggregate functions, and I prefer to
> use C rather than writing them in PL/R.

Make sure you read section 32.10, "User-Defined Aggregates", in the manual:
http://www.postgresql.org/docs/current/interactive/xaggr.html

Also, take a look at contrib/intagg for another example.

- Heikki
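[Editor's note: to make the moving parts concrete, here is a pure-SQL sketch of a user-defined aggregate with invented names; a C version differs only in that the transition and final functions are C-language functions:

    -- transition function: add the square of each input to the running state
    CREATE FUNCTION sum_sq_accum(float8, float8) RETURNS float8
        AS 'SELECT $1 + $2 * $2'
        LANGUAGE sql IMMUTABLE STRICT;

    -- the aggregate itself (8.1-era CREATE AGGREGATE syntax)
    CREATE AGGREGATE sum_squares (
        basetype = float8,
        sfunc    = sum_sq_accum,
        stype    = float8,
        initcond = '0'
    );

    SELECT sum_squares(x) FROM measurements;   -- hypothetical table

Because the transition function is STRICT, NULL inputs are simply skipped, matching the behavior of the built-in sum().]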
Re: [HACKERS] AGGREGATE FUNCTIONS
Roberto Rezende de Assis wrote:
> Hello, I would like to know where in the source code of postgres the code
> of the aggregate functions min, max, and avg is located.  I wish to
> develop more statistical aggregate functions, and I prefer to use C
> rather than writing them in PL/R.

http://projects.commandprompt.com/public/pgsql/browser/trunk/pgsql/src/backend/utils/adt
http://projects.commandprompt.com/public/pgsql/browser/trunk/pgsql/src/backend/utils/adt/numeric.c

That will give you an easy interface to view the code; everything after "browser" is the CVS source tree, so you can look for yourself within your copy of HEAD or 8.1 or whatever.

Sincerely,
Joshua D. Drake

-- The PostgreSQL Company: Command Prompt, Inc.  http://www.commandprompt.com/  Sales/Support: +1.503.667.4564
Re: [HACKERS] AIX check in datetime.h
In an attempt to throw the authorities off his trail, pgman@candle.pha.pa.us (Bruce Momjian) transmitted:
> Tom Lane wrote:
> > It looks like a workaround for some ancient compiler problem.  [ digs
> > for awhile... ]  Very ancient: we inherited that hack from Berkeley, see
> > http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/Attic/nabstime.h
> > I bet we can remove it.
>
> OK, removed.  Let's see if we get failure feedback.

I haven't been monitoring CVS HEAD, but you can be sure this will get tried out when 8.2 gets anywhere vaguely close to release...

-- Christopher Browne  http://linuxdatabases.info/info/slony.html
Signs of a Klingon Programmer - 8. "Debugging?  Klingons do not debug.  Our software does not coddle the weak.  Bugs are good for building character in the user."
Re: [HACKERS] How to avoid transaction ID wrap
Clinging to sanity, [EMAIL PROTECTED] (Hannu Krosing) mumbled into her beard:
> On Tuesday, 2006-06-06 at 08:42, Mark Woodward wrote:
> > OK, here's my problem: I have a nature study where we have about 10
> > video cameras taking 15 frames per second.  For each frame we make a
> > few transactions on a PostgreSQL database.  We want to keep about a
> > year's worth of data at any specific time.
>
> partition by month, then you have better chances of removing old data
> without causing overload/data loss;

It's certainly worth something to be able to TRUNCATE an elderly partition; that cleans things out very nicely...

> > We have triggers that fire if something interesting is found on insert.
> > We want this thing to run for a long time.  From the numbers, you can
> > see the PostgreSQL database is VERY loaded.  Running VACUUM may not
> > always be possible without losing data.
>
> why ?  just run it with very friendly delay settings.

Friendly delay settings can have adverse effects; it is likely to make vacuum run on the order of 3x as long, which means that if you have a very large table that takes 12h to VACUUM, vacuum delay will increase that to 36h, which means you'll have a transaction open for 36h.  That'll be very evil, to be sure...

> > The numbers I have amount to 466,560,000 transactions per month,
> > lasting a maximum of about 9 months until XID wrap.
>
> actually 4.5 months, as you will start having problems at 2G xacts.

Right.

> > I am thinking about a few workarounds, BEGIN/COMMIT to reduce the
> > number of transactions, COPY, etc., so I'm not dead in the water, but
> > I would be interested in any observations you may have.

Grouping work together to diminish the number of transactions is almost always something of a win...

-- Christopher Browne  http://linuxdatabases.info/info/rdbms.html
Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2
Since we are on the topic, is there a timeline/plans for openfts being brought into core?  If not, I'll continue my work on bringing it into Gentoo Portage.

John

On 6/5/06, Oleg Bartunov <oleg@sai.msu.su> wrote:
> On Mon, 5 Jun 2006, Teodor Sigaev wrote:
> > Sorry, it isn't mentioned on the page, but this example of code works
> > only with pre-8.1 versions.  In 8.1 the interface to dictionaries was
> > changed.  Try the attached dict_tmpl.c
> > 2Oleg: place file on site, pls
>
> done
>
> Regards, Oleg
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
Re: [HACKERS] How to avoid transaction ID wrap
On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:
> Clinging to sanity, [EMAIL PROTECTED] (Hannu Krosing) mumbled into her beard:
> > partition by month, then you have better chances of removing old data
> > without causing overload/data loss;
>
> It's certainly worth something to be able to TRUNCATE an elderly
> partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent table while the truncate runs on the child (happens with constraint exclusion as well).  If your transactions are short then it won't be an issue.  If you have mixed-length transactions (many short with the occasional long select) then it becomes tricky, since those short transactions will be blocked.

> > why ?  just run it with very friendly delay settings.
>
> Friendly delay settings can have adverse effects; it is likely to make
> vacuum run on the order of 3x as long, which means that if you have a
> very large table that takes 12h to VACUUM, vacuum delay will increase
> that to 36h, which means you'll have a transaction open for 36h.

Sounds like this is almost strictly inserts and selects, though.  If there is limited garbage collection (updates, deletes, rollbacks of inserts) required, then it isn't all that bad.

--
Re: [HACKERS] How to avoid transaction ID wrap
Rod Taylor wrote:
> With one potential snafu -- it blocks new SELECTs against the parent
> table while the truncate runs on the child (happens with constraint
> exclusion as well).  If your transactions are short then it won't be an
> issue.  If you have mixed-length transactions (many short with the
> occasional long select) then it becomes tricky, since those short
> transactions will be blocked.

One idea is to try to acquire the lock before issuing the TRUNCATE itself.  If the LOCK TABLE times out, you know you should wait for a long-running transaction ...

-- Alvaro Herrera  http://www.CommandPrompt.com/  The PostgreSQL Company - Command Prompt, Inc.
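[Editor's note: a sketch of that idea; the partition name is hypothetical, and statement_timeout stands in for a lock timeout, since LOCK TABLE itself takes no timeout argument:

    BEGIN;
    SET LOCAL statement_timeout = 5000;   -- milliseconds; give up if the lock
                                          -- isn't granted within 5 seconds
    LOCK TABLE frames_2005_06 IN ACCESS EXCLUSIVE MODE;
    SET LOCAL statement_timeout = 0;      -- don't let the timeout hit TRUNCATE
    TRUNCATE frames_2005_06;
    COMMIT;

If the LOCK times out, the transaction aborts and can simply be retried later, without ever having queued behind a long-running reader.]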
Re: [HACKERS] adding new field
On Tue, Jun 06, 2006 at 09:44:04PM +0530, ranbeer makin wrote:
> 1. Whenever you run a query, its result or some other thing gets cached.
> I want to flush this cache.  Also let me know when this cache gets
> flushed automatically.

What cache?  Query results are not cached, neither are plans.  Please be specific.

> 2. After adding a new field in the Resdom structure and making necessary
> changes in outfuncs.c, copyfuncs.c, etc., I *sometimes* get a garbage
> value for the field added.  The field is initialized to zero in
> makefuncs.c.

Check for other places in the code that create that structure.  makefuncs is more a convenience than anything else.

> 3. **IMP*** Is there something in postgres that allocates memory for only
> N fields of a structure, i.e., do I need to modify this N somewhere to
> reflect the addition of a new field?

Postgres uses sizeof() always, so as long as all the files are compiled the same way, everything will work.

> And that *weird* thing is:
> 1. I ran some query, didn't get results.
> 2. Then I commented the part which I modified, ran the same query again,
> got results.
> 3. Next, I uncommented that part and ran the same query again.  I GOT
> THE RESULTS.

Did you rerun "make install" and restart the postmaster between each run?

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org>  http://svana.org/kleptog/
Re: [HACKERS] AIX check in datetime.h
Christopher Browne wrote:
> I haven't been monitoring CVS HEAD, but you can be sure this will get
> tried out when 8.2 gets anywhere vaguely close to release...

The whole point of having a buildfarm is that we shouldn't have to wait; we should be able to see very quickly if we have broken something.  We currently have AIX coverage for 5.2/ppc with both gcc and (I think) IBM cc.  If we need more coverage then feel free to add other AIX machines.

cheers
andrew
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Martijn van Oosterhout <kleptog@svana.org> writes:
> Weird.  Can you get the output of *instr in each call of InstrEndLoop?
> Preferably after it does the calculation but before it clears the values,
> so we get an idea of the number of samples and what it guesses.
> SampleOverhead would be good too.

The problem looks to be an underestimation of SampleOverhead, and on reflection it's clear why: what CalculateSampleOverhead is measuring isn't the total overhead, but the time between the two gettimeofday calls.  Which is probably about half the true overhead.  What we ought to do is iterate InstrStartNode/InstrStopNode N times, and *separately* measure the total elapsed time spent.

It occurs to me that what we really want to know is not so much the total time spent in InstrStartNode/InstrStopNode, as the difference in the time spent when sampling is on vs when it is off.  I'm not quite sure if the time spent when it's off is negligible.  Off to do some measuring ...

regards, tom lane
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE
On Tue, Jun 06, 2006 at 12:58:10PM -0400, Travis Cross wrote:
> > Do you *know* that the disk drive will not lie about write complete?
>
> "Know" is such a strong word ;)  Honestly, I have very little idea.  I
> understand the nature of the problem this presents, as I've read the
> very fine PostgreSQL manual many times over the years.  Because the
> drives I use are specifically designed to operate well in a RAID
> environment, I would 'hope' that the drives perform honest write
> operations.  I wonder if there is a utility to perform a deterministic
> test of this...

Brad from livejournal.com wrote a utility that does just that, though it requires 2 machines to run the test.

-- Jim C. Nasby, Sr. Engineering Consultant, Pervasive Software  http://pervasive.com
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
I wrote:
> The problem looks to be an underestimation of SampleOverhead, and on
> reflection it's clear why: what CalculateSampleOverhead is measuring
> isn't the total overhead, but the time between the two gettimeofday
> calls.  Which is probably about half the true overhead.

On further thought, I take that back: the true overhead is not the point here.  The time elapsed during a plan node execution when sampling can be broken down into three phases:

    time before first gettimeofday call
    time between gettimeofday calls
    time after second gettimeofday call

Only the second part is actually measured by the instrumentation system; the other parts are overhead that has never been counted by EXPLAIN ANALYZE, sampling version or no.  Moreover, most of the runtime of InstrStartNode and InstrStopNode falls into the first or third parts.  What we would actually like to set SampleOverhead to is the portion of the second-part runtime that doesn't occur when sampling = false.

Assuming that gettimeofday() has consistent runtime and the actual time reported is measured at a consistent instant within that runtime, I believe that we should take the SampleOverhead as just equal to the runtime of a single gettimeofday() call.  The added or removed second-part time within InstrStartNode is just the tail time of gettimeofday, and the added or removed second-part time within InstrStopNode is basically just the head time of gettimeofday.  (To make this as true as possible, we need to change the order of operations so that gettimeofday is invoked *immediately* after the "if (sampling)" test, but that's easy.)

So this line of thought leads to the conclusion that CalculateSampleOverhead is actually overestimating SampleOverhead a bit, and we should simplify it to just time INSTR_TIME_SET_CURRENT().

But that still leaves me with a problem because my machine is clearly overestimating the correction needed.  I added some printouts and got

    raw totaltime = 0.370937
    per_iter = 0.000156913, SampleOverhead = 3.28e-06
    adj totaltime = 1.82976
    sampling = 0
    starttime = 0/00
    counter = 0/370937
    firsttuple = 0.258321
    tuplecount = 1
    itercount = 10001
    samplecount = 704
    nextsample = 10011
    startup = 0.258321
    total = 1.82976
    ntuples = 1
    nloops = 1

on a run with an actual elapsed time near 750 msec.  Clearly the sampling adjustment is wrong, but why?

I have a theory about this, and it's not pleasant at all.  What I think is that we have a Heisenberg problem here: the act of invoking gettimeofday() actually changes what is measured.  That is, the runtime of the second part of ExecProcNode is actually longer when we sample than when we don't, not merely due to the extra time spent in gettimeofday().  It's not very hard to guess at reasons why, either.  The kernel entry is probably flushing some part of the CPU's state, such as virtual/physical address mapping for the userland address space.  After returning from the kernel call, the time to reload that state shows up as more execution time within the second part.

This theory explains two observations that otherwise are hard to explain.  One, that the effect is platform-specific: your machine may avoid flushing as much state during a kernel call as mine does.  And two, that upper plan nodes seem much more affected than lower ones.  That makes sense because the execution cycle of an upper node will involve touching more userspace data than a lower node, and therefore more of the flushed TLB entries will need to be reloaded.
If this theory is correct, then the entire notion of EXPLAIN ANALYZE sampling has just crashed and burned.  We can't ship a measurement tool that is accurate on some platforms and not others.

I'm wondering if it's at all practical to go over to a gprof-like measurement method for taking EXPLAIN ANALYZE runtime measurements; that is, take an interrupt every so often and bump the count for the currently active plan node.  This would spread the TLB-flush overhead more evenly and thus avoid introducing that bias.  There may be too much platform dependency in this idea, though, and I also wonder if it'd break the ability to do normal gprof profiling of the backend.

regards, tom lane
Re: [HACKERS] AIX check in datetime.h
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> The whole point of having a buildfarm is that we shouldn't have to wait;
> we should be able to see very quickly if we have broken something.

And in fact kookaburra seems happy ...

regards, tom lane
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Tue, Jun 06, 2006 at 04:50:28PM -0400, Tom Lane wrote:
> I have a theory about this, and it's not pleasant at all.  What I think
> is that we have a Heisenberg problem here: the act of invoking
> gettimeofday() actually changes what is measured.  That is, the runtime
> of the second part of ExecProcNode is actually longer when we sample
> than when we don't, not merely due to the extra time spent in
> gettimeofday().  It's not very hard to guess at reasons why, either.
> The kernel entry is probably flushing some part of the CPU's state, such
> as virtual/physical address mapping for the userland address space.
> After returning from the kernel call, the time to reload that state
> shows up as more execution time within the second part.
>
> This theory explains two observations that otherwise are hard to
> explain.  One, that the effect is platform-specific: your machine may
> avoid flushing as much state during a kernel call as mine does.  And
> two, that upper plan nodes seem much more affected than lower ones.
> That makes sense because the execution cycle of an upper node will
> involve touching more userspace data than a lower node, and therefore
> more of the flushed TLB entries will need to be reloaded.

If that's the case, then maybe a more sophisticated method of measuring the overhead would work.  My thought is that on the second call to pull a tuple from a node (second because the first probably has some anomalies due to startup), we measure the overhead for that node.  This would probably mean doing the following:

    get start time    # not referring to this as gettimeofday, to avoid confusion
    gettimeofday()    # this is the gettimeofday call that will happen during normal operation
    get end time

Hopefully, there's no caching effect that would come into play from not actually touching any of the data structures after the gettimeofday() call.  If that's not the case, it makes measuring the overhead more complex, but I think it should still be doable...

-- Jim C. Nasby, Sr. Engineering Consultant, Pervasive Software  http://pervasive.com
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Jim C. Nasby [EMAIL PROTECTED] writes: If that's the case, then maybe a more sophisticated method of measuring the overhead would work. I think you missed the point: the time spent in gettimeofday() itself is not the major overhead of EXPLAIN ANALYZE. At least it appears that this is the case on my machine. I'm thinking that interrupt-driven sampling might work OK though. My previous worries were based on trying to use the ITIMER_PROF timer, which might not be portable and would conflict with gprof anyway. But EXPLAIN ANALYZE has always been interested in real time rather than CPU time, so the correct interrupt to use is ITIMER_REAL. That means we only have to share with our own existing usages of that interrupt, which turns it from a portability issue into just a Small Matter Of Programming. regards, tom lane
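A toy standalone sketch of the ITIMER_REAL idea follows. Here current_node and the tick table are invented stand-ins for the executor's notion of the currently active plan node, and the 1 ms period is arbitrary; this only illustrates the mechanism, not how it would be wired into the backend:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>

    #define NUM_NODES 2

    static volatile sig_atomic_t current_node = 0;
    static volatile long ticks[NUM_NODES];

    /* Each timer interrupt credits one tick to the active node. */
    static void sample_handler(int signo)
    {
        (void) signo;
        ticks[current_node]++;
    }

    int main(void)
    {
        struct sigaction sa;
        struct itimerval it;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sample_handler;
        sigaction(SIGALRM, &sa, NULL);

        memset(&it, 0, sizeof(it));
        it.it_interval.tv_usec = 1000;      /* fire every millisecond */
        it.it_value.tv_usec = 1000;
        setitimer(ITIMER_REAL, &it, NULL);

        /* Pretend to execute two plan nodes back to back. */
        for (int node = 0; node < NUM_NODES; node++)
        {
            current_node = node;
            for (volatile long i = 0; i < 50000000L; i++)
                ;                           /* busy work standing in for node execution */
        }

        for (int node = 0; node < NUM_NODES; node++)
            printf("node %d: %ld ticks\n", node, ticks[node]);
        return 0;
    }

Because the interrupts land at (roughly) uniform real-time intervals regardless of what the code is doing, the kernel-entry cost is spread evenly over all nodes instead of being charged only to the sampled executions.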
[HACKERS] SERIAL problems?
Hi, I just saw these in the TODO list:

    o %Disallow changing DEFAULT expression of a SERIAL column?
      This should be done only if the existing SERIAL problems cannot be fixed.
    o %Disallow ALTER SEQUENCE changes for SERIAL sequences
      because pg_dump does not dump the changes

What are the existing problems? I am asking because I am experimenting with implementing the SQL2003-compliant form of the serial type, to be able to specify the underlying sequence parameters:

    SERIAL [ GENERATED [ ALWAYS | BY DEFAULT ] AS IDENTITY
        ( [ INCREMENT [ BY ] increment ]
          [ MINVALUE minvalue | NO MINVALUE ]
          [ MAXVALUE maxvalue | NO MAXVALUE ]
          [ START [ WITH ] start ]
          [ CACHE cache ]
          [ [ NO ] CYCLE ] ) ]

Best regards, Zoltán Böszörményi
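Under the grammar above, a column definition might look like this (hypothetical, since the syntax is only an experiment at this point; stock PostgreSQL of this era accepts only plain SERIAL):

    CREATE TABLE frames (
        id      SERIAL GENERATED BY DEFAULT AS IDENTITY
                    (START WITH 1000 INCREMENT BY 10 NO CYCLE),
        camera  int4,
        taken   timestamp
    );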
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Tue, Jun 06, 2006 at 04:50:28PM -0400, Tom Lane wrote: But that still leaves me with a problem because my machine is clearly overestimating the correction needed. I added some printouts and got

    raw totaltime = 0.370937
    per_iter = 0.000156913, SampleOverhead = 3.28e-06
    adj totaltime = 1.82976
    sampling = 0
    starttime = 0/00
    counter = 0/370937
    firsttuple = 0.258321
    tuplecount = 1
    itercount = 10001
    samplecount = 704
    nextsample = 10011
    startup = 0.258321
    total = 1.82976
    ntuples = 1
    nloops = 1

on a run with an actual elapsed time near 750 msec. Clearly the sampling adjustment is wrong, but why?

This doesn't make any sense at all. How can a sampling run that only sampled 7% of the actual tuples end up with an actual measured time that's more than 50% of the actual final runtime? Can you get an estimate of the actual overhead (ie compare wall clock time of EXPLAIN ANALYZE vs the actual query)? The cost of gettimeofday() is on the order of a few hundred cycles; I'm not sure about the cost of TLB flushes (that's probably highly architecture-dependent).

To be honest, I wonder about caching effects, but for the disks. The first few cycles of any plan (like an index scan) are going to incur costs that won't happen later on. Because we sample much more heavily at the beginning than at the end, this will bias towards higher numbers. You should be able to check this by seeing whether queries that don't require disk access fare better. That would suggest a much more careful correction method that works for non-linear timing patterns...

Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] SERIAL problems?
Zoltan Boszormenyi [EMAIL PROTECTED] writes: What are the existing problems? Please read the archives; this has been discussed recently. There's a lot of disagreement about what ALTER should allow and what pg_dump should do with an altered sequence. I am asking because I am experimenting with implementing the SQL2003-compliant form of the serial type, to be able to specify the underlying sequence parameters: Be aware that the big problem with SQL2003 is that it expects an action-at-a-distance behavior whereby different references to a generator all return the same result if executed within the same query cycle. This makes the construct not equivalent to either nextval() or currval(), but some hybrid with hidden state; and changing of that state would have to tie into core parts of the executor. It looks pretty messy :-( regards, tom lane
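To illustrate the action-at-a-distance behavior (NEXT VALUE FOR is the standard's spelling and is not a PostgreSQL feature; myseq and some_table are stand-ins):

    -- SQL:2003 semantics: a and b must be the SAME value within each row
    SELECT NEXT VALUE FOR myseq AS a,
           NEXT VALUE FOR myseq AS b
    FROM some_table;

    -- PostgreSQL semantics: the sequence advances twice per row,
    -- so a and b differ
    SELECT nextval('myseq') AS a,
           nextval('myseq') AS b
    FROM some_table;

It is that per-row "all references agree" rule that forces the hidden state Tom describes, since neither nextval() nor currval() alone can reproduce it.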
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Martijn van Oosterhout kleptog@svana.org writes: This doesn't make any sense at all. How can a sampling run that only sampled 7% of the actual tuples end up with an actual measured time that's more than 50% of the actual final runtime?

AFAICS, the only conclusion is that the sampled executions are in fact taking longer than supposedly-equivalent unsampled ones, and by a pretty good percentage too. I'm growing unsure again about the mechanism responsible for that, however.

Can you get an estimate of the actual overhead (ie compare wall clock time of EXPLAIN ANALYZE vs actual query)?

Here's some examples. Keep in mind I've already determined that gettimeofday() takes about 3 usec on this hardware ...

    regression=# explain analyze select count(*) from
                 (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2 offset 0) ss;

     Aggregate  (cost=2609.00..2609.01 rows=1 width=0) (actual time=797.412..797.416 rows=1 loops=1)
       ->  Limit  (cost=825.00..2484.00 rows=10000 width=488) (actual time=208.208..2576.528 rows=10000 loops=1)
             ->  Hash Join  (cost=825.00..2484.00 rows=10000 width=488) (actual time=208.190..2082.577 rows=10000 loops=1)
                   Hash Cond: (a.unique1 = b.unique2)
                   ->  Seq Scan on tenk1 a  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.082..3.718 rows=10000 loops=1)
                   ->  Hash  (cost=458.00..458.00 rows=10000 width=244) (actual time=207.933..207.933 rows=10000 loops=1)
                         ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.017..3.583 rows=10000 loops=1)
     Total runtime: 805.036 ms
    (8 rows)

    Time: 816.463 ms

    regression=# select count(*) from
                 (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2 offset 0) ss;
     count
    -------
     10000
    (1 row)

    Time: 816.970 ms

The actual elapsed time for EXPLAIN ANALYZE seems to jump around quite a bit, probably because of the random variation we're using in the sampling interval. This particular combination was unusually close. But in any case, the *actual* overhead of EXPLAIN ANALYZE is clearly pretty small here; the problem is that we're incorrectly extrapolating the measured runtime to the unmeasured executions. What's especially interesting is that the excess time doesn't seem to show up if I form the query in a way that doesn't require pushing as much data around:

    regression=# explain analyze select count(*) from
                 (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2) ss;

     Aggregate  (cost=1341.00..1341.01 rows=1 width=0) (actual time=212.313..212.317 rows=1 loops=1)
       ->  Hash Join  (cost=483.00..1316.00 rows=10000 width=0) (actual time=88.061..160.886 rows=10000 loops=1)
             Hash Cond: (a.unique1 = b.unique2)
             ->  Seq Scan on tenk1 a  (cost=0.00..458.00 rows=10000 width=4) (actual time=0.071..4.068 rows=10000 loops=1)
             ->  Hash  (cost=458.00..458.00 rows=10000 width=4) (actual time=87.862..87.862 rows=10000 loops=1)
                   ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=10000 width=4) (actual time=0.031..4.780 rows=10000 loops=1)
     Total runtime: 221.022 ms
    (7 rows)

    Time: 229.377 ms

    regression=# select count(*) from
                 (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2) ss;
     count
    -------
     10000
    (1 row)

    Time: 202.531 ms

(Without the OFFSET 0, the planner flattens the subquery and discovers that it doesn't actually need to fetch any of the non-join-key table columns.) Note the only plan nodes showing whacked-out timings are the ones returning wide tuples (large width values). I'm not entirely sure what to make of this.
It could be interpreted as evidence for my theory about TLB reloads during userspace data access being the problem. But I'm getting a bit disenchanted with that theory after running the same test case in 8.1:

    regression=# explain analyze select count(*) from
                 (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2 offset 0) ss;

     Aggregate  (cost=2609.00..2609.01 rows=1 width=0) (actual time=1033.866..1033.870 rows=1
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Tom Lane [EMAIL PROTECTED] writes: And two, that upper plan nodes seem much more affected than lower ones. That makes sense because the execution cycle of an upper node will involve touching more userspace data than a lower node, and therefore more of the flushed TLB entries will need to be reloaded. I would have expected the opposite effect. If you only execute one instruction then the cache miss can make it take many times longer than normal. But as the number of instructions grows the cache gets repopulated and the overhead levels off and becomes negligible relative to the total time. The other option aside from gprof-like profiling would be to investigate those cpu timing instructions again. I know some of them are unsafe on multi-cpu systems but surely there's a solution out there. It's not like there aren't a million games, music playing, and other kewl kid toys that depend on accurate low overhead timing these days. -- greg
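The CPU timing instructions in question are presumably things like the x86 time-stamp counter. A minimal sketch of reading it (x86- and gcc-specific, and subject to exactly the cross-CPU skew mentioned above, since raw TSC values are not synchronized between processors):

    #include <stdint.h>
    #include <stdio.h>

    /* Read the x86 time-stamp counter via the RDTSC instruction.
     * Costs tens of cycles rather than the hundreds-plus of a
     * gettimeofday() kernel entry. */
    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
        return ((uint64_t) hi << 32) | lo;
    }

    int main(void)
    {
        uint64_t t0 = rdtsc();
        /* ... code under measurement ... */
        uint64_t t1 = rdtsc();
        printf("elapsed: %llu cycles\n", (unsigned long long) (t1 - t0));
        return 0;
    }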
Re: [HACKERS] More thoughts about planner's cost estimates
Tom Lane wrote: One objection to this is that after moving off the gold standard of 1.0 = one page fetch, there is no longer any clear meaning to the cost estimate units; you're faced with the fact that they're just an arbitrary scale. I'm not sure that's such a bad thing, though. It seems to me the appropriate gold standard is Time, in microseconds or milliseconds. The default postgresql.conf can come with a set of hardcoded values that reasonably approximate some real-world system; and if that's documented in the file, someone reading it can say hey, my CPU's about the same but my disk subsystem is much faster, so I know in which direction to change things. And another person may say ooh, now I know that my 4GHz machines should have about twice the number here as my 2GHz box. People who *really* care a lot (HW vendors?) could eventually make measurements on their own systems.
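A fragment of that hypothetical postgresql.conf might read as follows. The parameter names are real planner knobs, but the millisecond calibrations in the comments are invented purely for illustration:

    # Planner cost constants. Suppose 1.0 = 1 ms on the reference system
    # (a mid-range server of this era); rescale if your hardware differs.
    random_page_cost     = 4.0     # ~4 ms per nonsequential page fetch
    cpu_tuple_cost       = 0.01    # ~10 us to process one tuple
    cpu_index_tuple_cost = 0.001   # ~1 us per index entry examined
    cpu_operator_cost    = 0.0025  # ~2.5 us per operator/function call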
Re: [HACKERS] AIX check in datetime.h
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Andrew Dunstan) wrote: Christopher Browne wrote: I haven't been monitoring CVS HEAD, but you can be sure this will get tried out when 8.2 gets anywhere vaguely close to release... The whole point of having a buildfarm is that we shouldn't have to wait, we should be able to see very quickly if we have broken something. We currently have AIX coverage for 5.2/ppc with both gcc and (I think) IBM cc. If we need more coverage then feel free to add other AIX machines. I'd very much like to add an AIX 5.3 system; that's awaiting some sysadmin activity that is more urgent... -- output = (cbbrowne @ gmail.com) http://linuxdatabases.info/info/internet.html ``What this means is that when people say, The X11 folks should have done this, done that, or included this or that, they really should be saying Hey, the X11 people were smart enough to allow me to add this, that and the other myself.'' -- David B. Lewis [EMAIL PROTECTED]
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: And two, that upper plan nodes seem much more affected than lower ones. That makes sense because the execution cycle of an upper node will involve touching more userspace data than a lower node, and therefore more of the flushed TLB entries will need to be reloaded. I would have expected the opposite effect. If you only execute one instruction then the cache miss can make it take many times longer than normal. But as the number of instructions grows the cache gets repopulated and the overhead levels off and becomes negligible relative to the total time. Well, none of our plan nodes are in the one instruction regime ;-). I was thinking that the total volume of data accessed was the critical factor. Right at the moment I'm disillusioned with the TLB-access theory though. Something I'm noticing right now is that it seems like only hash joins are really seriously misestimated --- nest and merge joins have some small issues but only the hash is way out there. What's going on?? Can anyone else reproduce this? The other option aside from gprof-like profiling would be to investigate those cpu timing instructions again. I know some of them are unsafe on multi-cpu systems but surely there's a solution out there. It's not like there aren't a million games, music playing, and other kewl kid toys that depend on accurate low overhead timing these days. Yeah, and they all work only on Windoze and Intel chips :-( regards, tom lane
Re: [HACKERS] AGGREGATE FUNCTIONS
Hello, I would like to know where in the postgres source code the aggregate functions min, max, and avg are located. I wish to develop more statistical aggregate functions, and I prefer to use C rather than writing them in PL/R. There is a library in contrib called intagg. I wrote it a few years ago, and I have to laugh at the README file because I must have been stoned or something, because I can't understand it. Anyways, if you want to make an aggregate function, it covers what you need. Feel free to ignore the array stuff, because you probably won't need to deal with it. Aggregates basically have two functions: a state-transition function that is called for every row of the query (or GROUP BY group), and a final function that is called once at the end. The PostgreSQL docs are pretty good as well.
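For illustration, here is the two-function structure at the SQL level, using the 8.1-era CREATE AGGREGATE syntax. The names are invented for this example, and the support functions are written in SQL for brevity where a production statistical aggregate would likely be C:

    -- State-transition function: called once per input row,
    -- threading the running state through as the first argument.
    CREATE FUNCTION sum_sq_step(double precision, double precision)
    RETURNS double precision AS
      'SELECT $1 + $2 * $2'
    LANGUAGE SQL STRICT;

    -- Final function: called once at the end to produce the result.
    CREATE FUNCTION sum_sq_final(double precision)
    RETURNS double precision AS
      'SELECT sqrt($1)'
    LANGUAGE SQL STRICT;

    -- Root-sum-of-squares aggregate built from the two pieces.
    CREATE AGGREGATE root_sum_squares (
        basetype  = double precision,
        sfunc     = sum_sq_step,
        stype     = double precision,
        finalfunc = sum_sq_final,
        initcond  = '0'
    );

    -- Usage:
    --   SELECT root_sum_squares(x) FROM measurements;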
Re: [HACKERS] How to avoid transaction ID wrap
Mark Woodward wrote: OK, here's my problem, I have a nature study where we have about 10 video cameras taking 15 frames per second. For each frame we make a few transactions on a PostgreSQL database. Maybe if you grouped multiple operations into bigger transactions, the I/O savings could be enough to buy you the ability to vacuum once in a while. Or consider buffering somehow -- save the data elsewhere, and have some sort of daemon to put it into the database. This would allow you to cope with the I/O increase during vacuum. The problem is sufficiently large that any minor modification can easily hide the problem for a predictable amount of time. My hope was that someone would have a real long-term workaround.
[HACKERS] DROP INHERITS
I have a question about pg_depend, namely how do you identify the dependency tied to a given relationship. Specifically, to handle DROP INHERITS: is there any other reason there might be a dependency between two tables other than inheritance? If there were, how would you tell the dependencies apart? Also, it seems to me the INHRELID syscache is entirely pointless. There isn't a single consumer of it throughout the source tree. Nor can I think of any reason anyone would want to look up the nth parent of a table. Do I misunderstand what purpose this syscache serves? On the other hand, I see a few places where a syscache for a particular child-parent pair might be useful. Would it make sense to create an index and syscache for that? I suppose it makes more sense to optimize this on the basis of what's used in the planner and executor rather than ALTER TABLE commands, though. I don't know what would be helpful there. -- greg
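For concreteness, the kind of catalog lookup in question might look like the query below. This is a hypothetical sketch: 'kid' and 'dad' are stand-in table names, the column list is pg_depend's actual schema, and whether an inheritance link even gets such an entry is exactly the open question here:

    -- Find pg_depend entries linking child table 'kid' to parent 'dad'.
    SELECT classid::regclass, objid::regclass,
           refclassid::regclass, refobjid::regclass, deptype
    FROM pg_depend
    WHERE classid    = 'pg_class'::regclass
      AND objid      = 'kid'::regclass
      AND refclassid = 'pg_class'::regclass
      AND refobjid   = 'dad'::regclass;

If more than one kind of relationship between two tables could produce a row here, the deptype column alone would not be enough to tell them apart, which is the ambiguity being asked about.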
Re: [HACKERS] DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes: I have a question about pg_depends, namely how do you identify the dependency tied to a given relationship. Specifically to handle DROP INHERITS. Is there any other reason there might be a dependency between two tables other than inheritance? If there was how would you tell the dependencies apart? I can't think of another reason --- if there is more than one reason for a dependency between particular objects, we would have to take a harder look at the issues. For the moment you could trawl the sources for recordDependencyOn calls to see if there is more than one reason. Also, it seems to me the INHRELID syscache is entirely pointless. I don't think anyone has checked for useless syscaches for a long time :-(. Please check the others while you are at it. On the other hand I see a few places where a syscache for a particular child-parent pair might be useful. Would it make sense to create an index and syscache for that? Adding an index where none exists now has a very definite cost. You need to argue why the usefulness of the cache exceeds the cost of the index maintenance. This goes both ways of course; there may well be syscaches/indexes we don't have today that would be worth having. I suppose it makes more sense to optimize this on the basis of what's used in the planner and executor rather than ALTER TABLE commands though. No, definitely not. Syscaches only exist to support hard-wired lookups in the backend C code. Indexes on system catalogs are of interest to the planner, but not syscaches. (So it is legitimate to have indexes with no associated syscache. The other way is not possible, though, because the syscache mechanism depends upon having a matching index.) regards, tom lane
Re: [HACKERS] How to avoid transaction ID wrap
I once proposed a patch for 64-bit transaction IDs, but this adds some overhead to each tuple (XMIN and XMAX). Pgbench with 64-bit transaction IDs pays about a couple of percent in performance. If a 64-bit transaction ID is a reasonable fix, I've already posted the patch; anyone can apply it to later versions. Mark Woodward wrote: Mark Woodward wrote: OK, here's my problem, I have a nature study where we have about 10 video cameras taking 15 frames per second. For each frame we make a few transactions on a PostgreSQL database. Maybe if you grouped multiple operations into bigger transactions, the I/O savings could be enough to buy you the ability to vacuum once in a while. Or consider buffering somehow -- save the data elsewhere, and have some sort of daemon to put it into the database. This would allow you to cope with the I/O increase during vacuum. The problem is sufficiently large that any minor modification can easily hide the problem for a predictable amount of time. My hope was that someone would have a real long-term workaround. -- Koichi Suzuki
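For a rough picture of where that per-tuple cost comes from, here is a simplified sketch. The types and struct names are invented, not the real HeapTupleHeaderData (which has several more fields); the point is just that widening xmin and xmax from 32 to 64 bits adds 8 bytes to every tuple header:

    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t TransactionId32;
    typedef uint64_t TransactionId64;

    struct TupleHeaderSketch32 {
        TransactionId32 xmin;   /* inserting transaction ID */
        TransactionId32 xmax;   /* deleting transaction ID  */
        /* ... command IDs, infomask bits, etc. ... */
    };

    struct TupleHeaderSketch64 {
        TransactionId64 xmin;   /* now 8 bytes each ...      */
        TransactionId64 xmax;   /* ... 8 extra bytes per row */
        /* ... */
    };

    int main(void)
    {
        printf("32-bit XIDs: %zu-byte sketch header\n",
               sizeof(struct TupleHeaderSketch32));
        printf("64-bit XIDs: %zu-byte sketch header\n",
               sizeof(struct TupleHeaderSketch64));
        return 0;
    }

On narrow rows those extra bytes mean measurably more I/O per transaction, which is consistent with the couple-of-percent pgbench slowdown reported above.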
Re: [HACKERS] DROP INHERITS
Tom Lane [EMAIL PROTECTED] writes: I suppose it makes more sense to optimize this on the basis of what's used in the planner and executor rather than ALTER TABLE commands though. No, definitely not. Syscaches only exist to support hard-wired lookups in the backend C code. Indexes on system catalogs are of interest to the planner, but not syscaches. (So it is legitimate to have indexes with no associated syscache. The other way is not possible, though, because the syscache mechanism depends upon having a matching index.) I imagine the planner and/or executor have to execute hard-wired lookups in C code all day long to check for children of tables before they can execute queries on those tables. I meant that the performance of those lookups was undoubtedly more critical than the performance of DDL. -- greg