[GENERAL] unique indices without pg_constraint rows
I've been looking at unique indices in a PostgreSQL 8.3.x cluster. Some unique indices clearly have a corresponding row in pg_constraint, while others appear to have no corresponding row in pg_constraint at all. Why is this? What determines whether a unique index will also have a row in pg_constraint? Thanks in advance. Ed
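[For the archives: the usual explanation is that a pg_constraint row exists only for unique indexes created through a constraint (a UNIQUE or PRIMARY KEY clause in CREATE TABLE, or ALTER TABLE ... ADD CONSTRAINT); an index made directly with CREATE UNIQUE INDEX enforces uniqueness on its own and gets no pg_constraint entry. A catalog query along these lines — an untested sketch; note that on 8.3 the index-to-constraint link goes through pg_depend, since pg_constraint.conindid only appears in later releases — should separate the two cases:]

```sql
-- List unique indexes with the constraint backing them, if any.
-- On 8.3 a constraint-backed index has an internal ('i') pg_depend
-- entry pointing from the index to its pg_constraint row.
SELECT c.relname  AS index_name,
       con.conname AS constraint_name   -- NULL => plain CREATE UNIQUE INDEX
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
LEFT JOIN pg_depend d
       ON d.classid = 'pg_class'::regclass
      AND d.objid = i.indexrelid
      AND d.refclassid = 'pg_constraint'::regclass
      AND d.deptype = 'i'
LEFT JOIN pg_constraint con ON con.oid = d.refobjid
WHERE i.indisunique;
```

[Rows with a NULL constraint_name should be the indexes that were created with CREATE UNIQUE INDEX directly.]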
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 15:41, Ed Loehr (LoehrTech.com) wrote: Ed L. pg...@bluepolka.net writes: (gdb) bt #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 #1 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 #2 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 #3 0x00346f8689cb in getc () from /lib64/libc.so.6 #4 0x00531ee8 in next_token (fp=0x5b90f20, buf=0x7fff59bef330 , bufsz=4096) at hba.c:128 #5 0x00532233 in tokenize_file (filename=0x5b8f3f0 global, file=0x5b90f20, lines=0x7fff59bef5c8, line_nums=0x7fff59bef5c0) at hba.c:232 #6 0x005322e9 in tokenize_file (filename=0x5b8f3d0 global/pg_auth, file=0x5b90ce0, lines=0x98b168, line_nums=0x98b170) at hba.c:358 #7 0x005327ff in load_role () at hba.c:959 #8 0x0057f300 in reaper (postgres_signal_arg=value optimized out) at postmaster.c:2145 #9 signal handler called #10 0x00346f8cb323 in __select_nocancel () from /lib64/libc.so.6 #11 0x0057cc33 in ServerLoop () at postmaster.c:1236 #12 0x0057dfdf in PostmasterMain (argc=6, argv=0x5b73fe0) at postmaster.c:1031 #13 0x005373de in main (argc=6, argv=value optimized out) at main.c:188 The postmaster seems to be stuck trying to read $PGDATA/global/pg_auth (which would be an expected thing for it to do at this point in the startup sequence). Does that file exist? Is it an ordinary file? Do its contents look sane (a list of your userids and their passwords and group memberships)? This just happened again ~24 hours after full reload from backup. Arrrgh. Backtrace looks the same again, same file, same __read_nocancel(). $PGDATA/global/pg_auth looks fine to me, permissions are 600, entries are 3 or more double-quoted items per line each separated by a space, items 3 and beyond being groups. Any clues? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 15:46, Ed L. wrote: On Monday 01 March 2010 @ 15:41, Ed Loehr (LoehrTech.com) wrote: Ed L. pg...@bluepolka.net writes: (gdb) bt #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 #1 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 #2 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 #3 0x00346f8689cb in getc () from /lib64/libc.so.6 #4 0x00531ee8 in next_token (fp=0x5b90f20, buf=0x7fff59bef330 , bufsz=4096) at hba.c:128 #5 0x00532233 in tokenize_file (filename=0x5b8f3f0 global, file=0x5b90f20, lines=0x7fff59bef5c8, line_nums=0x7fff59bef5c0) at hba.c:232 #6 0x005322e9 in tokenize_file (filename=0x5b8f3d0 global/pg_auth, file=0x5b90ce0, lines=0x98b168, line_nums=0x98b170) at hba.c:358 #7 0x005327ff in load_role () at hba.c:959 #8 0x0057f300 in reaper (postgres_signal_arg=value optimized out) at postmaster.c:2145 #9 signal handler called #10 0x00346f8cb323 in __select_nocancel () from /lib64/libc.so.6 #11 0x0057cc33 in ServerLoop () at postmaster.c:1236 #12 0x0057dfdf in PostmasterMain (argc=6, argv=0x5b73fe0) at postmaster.c:1031 #13 0x005373de in main (argc=6, argv=value optimized out) at main.c:188 The postmaster seems to be stuck trying to read $PGDATA/global/pg_auth (which would be an expected thing for it to do at this point in the startup sequence). Does that file exist? Is it an ordinary file? Do its contents look sane (a list of your userids and their passwords and group memberships)? This just happened again ~24 hours after full reload from backup. Arrrgh. Backtrace looks the same again, same file, same __read_nocancel(). $PGDATA/global/pg_auth looks fine to me, permissions are 600, entries are 3 or more double-quoted items per line each separated by a space, items 3 and beyond being groups. Any clues? Watching the server logs, the system is continuing to process data on existing connections. Just can't get any new ones. 
Here's a backtrace for a hung psql -c select version(): $ gdb `which psql` GNU gdb Fedora (6.8-37.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu... (gdb) attach 9631 Attaching to program: /opt/pgsql/installs/postgresql-8.3.9/bin/psql, process 9631 Reading symbols from /opt/pgsql/installs/postgresql-8.3.9/lib/libpq.so.5...done. Loaded symbols for /opt/pgsql/installs/postgresql-8.3.9/lib/libpq.so.5 Reading symbols from /usr/lib64/libz.so.1...done. Loaded symbols for /usr/lib64/libz.so.1 Reading symbols from /usr/lib64/libreadline.so.5...done. Loaded symbols for /usr/lib64/libreadline.so.5 Reading symbols from /lib64/libtermcap.so.2...done. Loaded symbols for /lib64/libtermcap.so.2 Reading symbols from /lib64/libcrypt.so.1...done. Loaded symbols for /lib64/libcrypt.so.1 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libnss_files.so.2...done. 
Loaded symbols for /lib64/libnss_files.so.2 0x00346f8c92af in poll () from /lib64/libc.so.6 (gdb) bt #0 0x00346f8c92af in poll () from /lib64/libc.so.6 #1 0x2b03826e5e6f in pqSocketCheck (conn=0x655eef0, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1046 #2 0x2b03826e5f10 in pqWaitTimed (forRead=1, forWrite=-1, conn=0x655eef0, finish_time=-1) at fe-misc.c:920 #3 0x2b03826e1752 in connectDBComplete (conn=0x655eef0) at fe-connect.c:930 #4 0x2b03826e2c60 in PQsetdbLogin (pghost=0x0, pgport=0x0, pgoptions=0x0, pgtty=0x0, dbName=0x0, login=0x0, pwd=0x0) at fe-connect.c:678 #5 0x0040e319 in main (argc=value optimized out, argv=0x7fff283ce6e8) at startup.c:195
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 15:59, Ed L. wrote: This just happened again ~24 hours after full reload from backup. Arrrgh. Backtrace looks the same again, same file, same __read_nocancel(). $PGDATA/global/pg_auth looks fine to me, permissions are 600, entries are 3 or more double-quoted items per line each separated by a space, items 3 and beyond being groups. Any clues? Also seeing lots of postmaster zombies (190 and growing)... Ed
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 16:03, Ed L. wrote: On Monday 01 March 2010 @ 15:59, Ed L. wrote: This just happened again ~24 hours after full reload from backup. Arrrgh. Backtrace looks the same again, same file, same __read_nocancel(). $PGDATA/global/pg_auth looks fine to me, permissions are 600, entries are 3 or more double-quoted items per line each separated by a space, items 3 and beyond being groups. Any clues? Also seeing lots of postmaster zombies (190 and growing)... While new connections are hanging, top shows postmaster using 100% of cpu. SIGTERM/SIGQUIT do nothing. Here's a backtrace of this busy postmaster: (gdb) bt #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 #1 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 #2 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 #3 0x00346f8689cb in getc () from /lib64/libc.so.6 #4 0x00531ee8 in next_token (fp=0x10377ae0, buf=0x7fff32230e60 , bufsz=4096) at hba.c:128 #5 0x00532233 in tokenize_file (filename=0x10359b70 global, file=0x10377ae0, lines=0x7fff322310f8, line_nums=0x7fff322310f0) at hba.c:232 #6 0x005322e9 in tokenize_file (filename=0x2b1c8cbf5800 global/pg_auth, file=0x103767a0, lines=0x98b168, line_nums=0x98b170) at hba.c:358 #7 0x005327ff in load_role () at hba.c:959 #8 0x0057f878 in sigusr1_handler (postgres_signal_arg=value optimized out) at postmaster.c:3830 #9 signal handler called #10 0x00346f8cb323 in __select_nocancel () from /lib64/libc.so.6 #11 0x0057cc33 in ServerLoop () at postmaster.c:1236 #12 0x0057dfdf in PostmasterMain (argc=6, argv=0x1033f000) at postmaster.c:1031 #13 0x005373de in main (argc=6, argv=value optimized out) at main.c:188 ...and more from the server logs, fwiw: 2010-03-01 17:30:24.213 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:30:31.250 CST [32236]DEBUG: transaction log switch forced (archive_timeout=300) 2010-03-01 17:31:24.216 CST [32238]WARNING: worker took too long to start; cancelled 
2010-03-01 17:32:24.219 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:33:24.222 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:34:24.225 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:35:19.061 CST [32236]LOG: checkpoint starting: time 2010-03-01 17:35:19.185 CST [32236]DEBUG: recycled transaction log file 0001001C0071 2010-03-01 17:35:19.185 CST [32236]LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.028 s, sync=0.000 s, total=0.124 s 2010-03-01 17:35:24.328 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:35:31.224 CST [32236]DEBUG: transaction log switch forced (archive_timeout=300) 2010-03-01 17:36:44.332 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:37:44.434 CST [32238]WARNING: worker took too long to start; cancelled 2010-03-01 17:37:47.378 CST [3692] dba 10(42816) dba LOG: could not receive data from client: Connection timed out 2010-03-01 17:37:47.378 CST [3692] dba 10(42816) dba LOG: unexpected EOF on client connection 2010-03-01 17:37:47.380 CST [3692] dba 10(42816) dba LOG: disconnection: session time: 2:11:15.303 user=dba database=dba host=... port=428
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 16:57, Tom Lane wrote: Now that I look more closely at those line numbers, it looks like the thing thinks it is processing an include file. Are there any @ signs in your global/pg_auth file? Yes, indeed, there are many. My user names are u...@host form, and have been for years. Would that be a problem now? Ed
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:15, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 16:57, Tom Lane wrote: Now that I look more closely at those line numbers, it looks like the thing thinks it is processing an include file. Are there any @ signs in your global/pg_auth file? Yes, indeed, there are many. My user names are u...@host form, and have been for years. Would that be a problem now? u...@host shouldn't be a problem, but if there were an @ by itself or starting a token, it might possibly cause something like this. There is one, looks like a typo got in. How do I fix it?
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 16:49, Tom Lane wrote: Oh, for some reason I thought it was sitting idle. That sounds more like an infinite loop. Try reattaching to the postmaster, confirm the stack trace, and then see how many times you can do fin before it doesn't return control. That will tell us which level of subroutine is looping. Yeah, it took me a bit to notice it was at 100% cpu. Here's gdb output with the 'fin's you suggested: $ gdb `which postgres` GNU gdb Fedora (6.8-37.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu... (gdb) attach 21670 Attaching to program: /opt/pgsql/installs/postgresql-8.3.9/bin/postgres, process 21670 Reading symbols from /lib64/libcrypt.so.1...done. Loaded symbols for /lib64/libcrypt.so.1 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libnss_files.so.2...done. 
Loaded symbols for /lib64/libnss_files.so.2 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 (gdb) bt #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 #1 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 #2 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 #3 0x00346f8689cb in getc () from /lib64/libc.so.6 #4 0x00531ee8 in next_token (fp=0x11da7f30, buf=0x7fff5cb9e300 , bufsz=4096) at hba.c:128 #5 0x00532233 in tokenize_file (filename=0x11da5940 global, file=0x11da7f30, lines=0x7fff5cb9e598, line_nums=0x7fff5cb9e590) at hba.c:232 #6 0x005322e9 in tokenize_file (filename=0x11da5920 global/pg_auth, file=0x11da7cf0, lines=0x98b168, line_nums=0x98b170) at hba.c:358 #7 0x005327ff in load_role () at hba.c:959 #8 0x0057f300 in reaper (postgres_signal_arg=value optimized out) at postmaster.c:2145 #9 signal handler called #10 0x00346f8cb323 in __select_nocancel () from /lib64/libc.so.6 #11 0x0057cc33 in ServerLoop () at postmaster.c:1236 #12 0x0057dfdf in PostmasterMain (argc=6, argv=0x11d8afb0) at postmaster.c:1031 #13 0x005373de in main (argc=6, argv=value optimized out) at main.c:188 (gdb) fin Run till exit from #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 (gdb) fin Run till exit from #0 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 (gdb) fin Run till exit from #0 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 0x00346f8689cb in getc () from /lib64/libc.so.6 (gdb) fin Run till exit from #0 0x00346f8689cb in getc () from /lib64/libc.so.6 0x00531ee8 in next_token (fp=0x11da7f30, buf=0x7fff5cb9e300 , bufsz=value optimized out) at hba.c:128 128 while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ',')) (gdb) fin Run till exit from #0 0x00531ee8 in next_token (fp=0x11da7f30, buf=0x7fff5cb9e300 , bufsz=value optimized out) at 
hba.c:128 0x00532233 in tokenize_file (filename=0x11da5940 global, file=0x11da7f30, lines=0x7fff5cb9e598, line_nums=0x7fff5cb9e590) at hba.c:232 232 if (!next_token(file, buf, sizeof(buf))) Value returned is $1 = 0 '\0' (gdb) fin Run till exit from #0 0x00532233 in tokenize_file (filename=0x11da5940 global, file=0x11da7f30, lines=0x7fff5cb9e598, line_nums=0x7fff5cb9e590) at hba.c:232
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:18, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 17:15, Tom Lane wrote: u...@host shouldn't be a problem, but if there were an @ by itself or starting a token, it might possibly cause something like this. There is one, looks like a typo got in. How do I fix it? Hmm, a user named @, or what? Yes, a bogus user: @
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:23, Ed L. wrote: On Monday 01 March 2010 @ 17:18, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 17:15, Tom Lane wrote: u...@host shouldn't be a problem, but if there were an @ by itself or starting a token, it might possibly cause something like this. There is one, looks like a typo got in. How do I fix it? Hmm, a user named @, or what? Yes, a bogus user: @ Correction. Here's the line: @ agent_group
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:25, Ed L. wrote: On Monday 01 March 2010 @ 17:23, Ed L. wrote: On Monday 01 March 2010 @ 17:18, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 17:15, Tom Lane wrote: u...@host shouldn't be a problem, but if there were an @ by itself or starting a token, it might possibly cause something like this. There is one, looks like a typo got in. How do I fix it? Hmm, a user named @, or what? Yes, a bogus user: @ Correction. Here's the line: @ agent_group It is the first line in the pg_auth file.
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:26, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 17:18, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: There is one, looks like a typo got in. How do I fix it? Hmm, a user named @, or what? Yes, a bogus user: @ Mph. We really ought to fix things so that a quoted @ doesn't get taken as an include file reference. Odd that it's never come up before. Anyway, if you still have any open superuser sessions, the best thing would be an ALTER USER RENAME. If you don't, you'll have to resort to manually editing the pg_auth file, and then rename the user as soon as you can get in. Killed the stuck postmaster with SIGKILL, edited the file, restarted postmaster, and it re-wrote the file with the bogus entry. I don't have any superuser sessions open. Is there another route? Ed
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:36, Ed L. wrote: On Monday 01 March 2010 @ 17:26, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: On Monday 01 March 2010 @ 17:18, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: There is one, looks like a typo got in. How do I fix it? Hmm, a user named @, or what? Yes, a bogus user: @ Mph. We really ought to fix things so that a quoted @ doesn't get taken as an include file reference. Odd that it's never come up before. Anyway, if you still have any open superuser sessions, the best thing would be an ALTER USER RENAME. If you don't, you'll have to resort to manually editing the pg_auth file, and then rename the user as soon as you can get in. Killed the stuck postmaster with SIGKILL, edited the file, restarted postmaster, and it re-wrote the file with the bogus entry. I don't have any superuser sessions open. Is there another route? I have source code and can patch and rebuild/reinstall if I can get the right patch.
[SOLVED] Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:58, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: Killed the stuck postmaster with sigkill, edited the file, restarted postmaster, and it re-wrote the file with the bogus entry. I don't have any superuser sessions open. Is there another route? What you're going to need to do is stop the postmaster, start a standalone backend (see the postgres man page entry if you never did that before) and issue the ALTER USER in the standalone backend. Then you can restart normal operations. That did the trick. Thank you very much, Sensei. Ed
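[For the archives, the fix boils down to something like the following sketch; the data directory path and the new role name here are placeholders. With the postmaster stopped, start a standalone backend with `postgres --single -D /path/to/data postgres`, then issue:]

```sql
-- Run inside the standalone (single-user) backend.
-- "agent" is a placeholder; use whatever the role was meant to be called.
ALTER ROLE "@" RENAME TO agent;
```

[When the standalone backend exits (Ctrl-D), the flat pg_auth file should be rewritten without the bare @ token, after which the postmaster can be started normally.]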
Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 17:57, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: Correction. Here's the line: @ agent_group It is the first line in the pg_auth file. BTW, there seems to be some other contributing factor here besides the weird username, because I don't see any looping when I try CREATE USER @. What's your platform exactly, and what type of filesystem is $PGDATA on? This is CentOS 5.2, Linux 2.6.18-92.1.22.el5 #1 SMP x86_64 GNU/Linux. I think the PGDATA filesystem is LVM running on a RAID 5 setup, but not 100% sure. # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-LogVol00 267422192 132412984 121205688 53% / ...
Re: [SOLVED] Re: [GENERAL] Hung postmaster (8.3.9)
On Monday 01 March 2010 @ 18:29, Tom Lane wrote: Ed L. pg...@bluepolka.net writes: That did the trick. Thank you very much, Sensei. I'd still like to know about platform etc. I see that we shouldn't be allowing a username to trigger @-file expansion, but even with that it's not clear how it turned into an infinite loop; and as I said, I can't reproduce that here. The original problem occurred on that CentOS box, but I just reproduced it on a new cluster on Ubuntu running LVM but no RAID. Here's the transcript: e...@duke:~/test$ uname -a Linux duke 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:53:52 UTC 2009 x86_64 GNU/Linux e...@duke:~/test$ export PGPORT=9000 e...@duke:~/test$ w^C e...@duke:~/test$ svi e...@duke:~/test$ export PGDATA=`pwd`/data e...@duke:~/test$ initdb --locale=C -D $PGDATA The files belonging to this database system will be owned by user ed. This user must also own the server process. The database cluster will be initialized with locale C. The default database encoding has accordingly been set to SQL_ASCII. The default text search configuration will be set to english. creating directory /home/ed/test/data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers/max_fsm_pages ... 32MB/204800 creating configuration files ... ok creating template1 database in /home/ed/test/data/base/1 ... ok initializing pg_authid ... ok initializing dependencies ... ok creating system views ... ok loading system objects' descriptions ... ok creating conversions ... ok creating dictionaries ... ok setting privileges on built-in objects ... ok creating information schema ... ok vacuuming database template1 ... ok copying template1 to template0 ... ok copying template1 to postgres ... ok WARNING: enabling trust authentication for local connections You can change this by editing pg_hba.conf or using the -A option the next time you run initdb. Success. 
You can now start the database server using: postgres -D /home/ed/test/data or pg_ctl -D /home/ed/test/data -l logfile start e...@duke:~/test$ pg_ctl -D /home/ed/test/data -l logfile start server starting e...@duke:~/test$ psql -c select version() psql: FATAL: database ed does not exist e...@duke:~/test$ create^C e...@duke:~/test$ createdb e...@duke:~/test$ psql -c select version() version - PostgreSQL 8.3.9 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.1-4ubuntu8) 4.4.1 (1 row) e...@duke:~/test$ createuser @ with login createuser: too many command-line arguments (first is with) Try createuser --help for more information. e...@duke:~/test$ psql -c create user \@\ with login CREATE ROLE e...@duke:~/test$ cat data/global/ 1136 1214 1260 2396 2672 2694 2698 2844 2847 pg_database 1137 1232 1261 2397 2676 2695 2842 2845 pg_auth pgstat.stat 1213 1233 1262 2671 2677 2697 2843 2846 pg_control e...@duke:~/test$ cat data/global/pg_auth @ ed e...@duke:~/test$ pg_ctl -D $PGDATA stop waiting for server to shut down...^C e...@duke:~/test$ ^C e...@duke:~/test$ ^C e...@duke:~/test$ ^C e...@duke:~/test$ pg_ctl -D $PGDATA -m fast stop waiting for server to shut down.^C e...@duke:~/test$ ^C e...@duke:~/test$ ps augxwwwf | grep postma ed8419 0.0 0.0 7336 868 pts/4S+ 18:36 0:00 \_ grep postma ohsdba6115 0.0 0.3 503692 28028 pts/0S18:03 0:00 postmaster -D /users/ohsdba/dbclusters/ohs/postgresql-8.3.x/data -i -p 9001 e...@duke:~/test$ gdb `which postgre^C e...@duke:~/test$ pg_ctl -D /home/ed/test/data -l logfile start pg_ctl: another server might be running; trying to start server anyway pg_ctl: could not start server Examine the log output. 
e...@duke:~/test$ cat logfile LOG: database system was shut down at 2010-03-01 18:34:25 MST LOG: autovacuum launcher started LOG: database system is ready to accept connections FATAL: database ed does not exist WARNING: worker took too long to start; cancelled FATAL: lock file postmaster.pid already exists HINT: Is another postmaster (PID 8203) running in data directory /home/ed/test/data? e...@duke:~/test$ ps augxwwwf | grep 8203 ed8465 0.0 0.0 7336 868 pts/4S+ 18:37 0:00 \_ grep 8203 ed8203 73.7 0.0 63080 4248 pts/4R18:34 2:07 /opt/pgsql/installs/postgresql-8.3.9/bin/postgres -D /home/ed/test/data e...@duke:~/test$ gdb `which postgres` GNU gdb (GDB) 7.0-ubuntu Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
[GENERAL] Hung postmaster (8.3.9)
Need some help. My PostgreSQL 8.3.6 and now 8.3.9 postmaster is hanging. No idea why, been running like a top for a year. Can't do select version(), even hung after system reboot. SIGINT/QUIT/TERM have no effect, only SIGKILL can stop it. This is Linux 2.6.18-92.1.22.el5 SMP x86_64 Here's the log after restarting from SIGKILL: 2010-02-27 20:11:10.426 CST [23134]LOG: database system was interrupted; last known up at 2010-02-27 20:08:46 CST 2010-02-27 20:11:10.426 CST [23134]DEBUG: checkpoint record is at 8BC/63F5FC0 2010-02-27 20:11:10.426 CST [23134]DEBUG: redo record is at 8BC/63F5FC0; shutdown TRUE 2010-02-27 20:11:10.426 CST [23134]DEBUG: next transaction ID: 0/488483782; next OID: 1619873999 2010-02-27 20:11:10.426 CST [23134]DEBUG: next MultiXactId: 130401682; next MultiXactOffset: 830996786 2010-02-27 20:11:10.426 CST [23134]LOG: database system was not properly shut down; automatic recovery in progress 2010-02-27 20:11:10.497 CST [23134]LOG: record with zero length at 8BC/63F6020 2010-02-27 20:11:10.497 CST [23134]LOG: redo is not required 2010-02-27 20:11:10.503 CST [23134]LOG: checkpoint starting: shutdown immediate 2010-02-27 20:11:10.581 CST [23134]LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.045 s, sync=0.000 s, total=0.083 s 2010-02-27 20:11:10.587 CST [23134]DEBUG: transaction ID wrap limit is 2435542396, limited by database ohsdba 2010-02-27 20:11:10.589 CST [23134]DEBUG: proc_exit(0) 2010-02-27 20:11:10.589 CST [23134]DEBUG: shmem_exit(0) 2010-02-27 20:11:10.592 CST [23134]DEBUG: exit(0) Here's the backtrace: $ gdb `which postgres` GNU gdb Fedora (6.8-37.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. 
This GDB was configured as x86_64-redhat-linux-gnu... (gdb) attach 3776 Attaching to program: /opt/pgsql/installs/postgresql-8.3.9/bin/postgres, process 3776 Reading symbols from /lib64/libcrypt.so.1...done. Loaded symbols for /lib64/libcrypt.so.1 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libnss_files.so.2...done. Loaded symbols for /lib64/libnss_files.so.2 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 (gdb) p debug_query_string $1 = 0x0 (gdb) bt #0 0x00346f8c43a0 in __read_nocancel () from /lib64/libc.so.6 #1 0x00346f86c747 in _IO_new_file_underflow () from /lib64/libc.so.6 #2 0x00346f86d10e in _IO_default_uflow_internal () from /lib64/libc.so.6 #3 0x00346f8689cb in getc () from /lib64/libc.so.6 #4 0x00531ee8 in next_token (fp=0x5b90f20, buf=0x7fff59bef330 , bufsz=4096) at hba.c:128 #5 0x00532233 in tokenize_file (filename=0x5b8f3f0 global, file=0x5b90f20, lines=0x7fff59bef5c8, line_nums=0x7fff59bef5c0) at hba.c:232 #6 0x005322e9 in tokenize_file (filename=0x5b8f3d0 global/pg_auth, file=0x5b90ce0, lines=0x98b168, line_nums=0x98b170) at hba.c:358 #7 0x005327ff in load_role () at hba.c:959 #8 0x0057f300 in reaper (postgres_signal_arg=value optimized out) at postmaster.c:2145 #9 signal handler called #10 0x00346f8cb323 in __select_nocancel () from /lib64/libc.so.6 #11 0x0057cc33 in ServerLoop () at postmaster.c:1236 #12 0x0057dfdf in PostmasterMain (argc=6, argv=0x5b73fe0) at postmaster.c:1031 #13 0x005373de in main (argc=6, argv=value optimized out) at main.c:188 (gdb) quit The program is running. Quit anyway (and detach it)? 
(y or n) y Detaching from program: /opt/pgsql/installs/postgresql-8.3.9/bin/postgres, process 3776 Thanks in advance for any help. Ed
Re: [GENERAL] Hung postmaster (8.3.9)
On Saturday 27 February 2010 @ 22:18, Greg Smith wrote: Ed L. wrote: 2010-02-27 20:11:10.426 CST [23134]LOG: database system was not properly shut down; automatic recovery in progress 2010-02-27 20:11:10.497 CST [23134]LOG: record with zero length at 8BC/63F6020 2010-02-27 20:11:10.497 CST [23134]LOG: redo is not required If the server is hung at this point, it was doing something: recovery from a crash, and during that period it will not respond to statements until it's finished, as you noted. That can take some time, could run into the minutes. And if you kill it, next time the process starts all over again having made no progress. I'd try starting the server again, confirm it's only after this point that it pauses, watch whether it's doing something with top -c, and grab some backtraces of what it's doing then. In the sample you gave, the shutdown was too fast for the server to have done much recovery before it was killed off. It may just be that you need to let it sit there for a while to finish recovery and then it will be fine again; can't tell from the data you've provided so far. Good point about my example, but it hung like that for 8 hours, and seems to be ready to do so again.
[GENERAL] commit performance anomaly
I've been tracking the performance of our DB query statements across a number of fairly high-volume pg clusters for several years (combined 2700 tps, ~1.3TB). Last year, we started migrating off HP-UX IA64 servers running pg 8.1.x onto Linux quadcore x86_64 Blade servers running pg 8.3.x while running on a high-grade SAN. Our average, overall query performance has improved by a very pleasant ~75%. But I'm curious why 'commit' statements (as well as certain update statements) seem to have actually degraded (1ms vs 5ms on avg, 2ms vs 14ms in the 95th percentile, etc). Any ideas? TIA. Ed
[GENERAL] Floating-point software assist fault?
We're seeing gobs of these via dmesg in PostgreSQL 8.3.3 on ia64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-8), kernel 2.6.9-55.EL:

postmaster(13144): floating-point assist fault at ip 403a9382, isr 0408

It appears to be an Itanium-specific issue with floating-point normalization; here is a document describing the issue: http://i-cluster2.inrialpes.fr/doc/misc/fpswa.txt

“The Intel Itanium does not fully support IEEE denormals and requires software assistance to handle them. Without further information, the ia64 GNU/Linux kernel triggers a fault when denormals are computed. This is the floating-point software assist fault (FPSWA) in the kernel messages. It is the user's task to clearly design his program to prevent such cases.”

“To conclude, I'd like to stress the fact that the programmer has to be careful when dealing with floating-point numbers. Even with high precision, it is easy to produce denormals and get strange behaviour.”

Any thoughts? TIA. Ed
Re: [GENERAL] Floating-point software assist fault?
On Thursday 08/07/08 @ 5:43 pm MDT, I received this from Ed L. [EMAIL PROTECTED]: We're seeing gobs of these via dmesg in PostgreSQL 8.3.3 on ia64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-8), kernel 2.6.9-55.EL: postmaster(13144): floating-point assist fault at ip 403a9382, isr 0408 These are coming lately exclusively from the writer process... TIA. Ed
Re: [GENERAL] Floating-point software assist fault?
On Thursday 08/07/08 @ 5:46 pm MDT, I received this from Ed L. postmaster(13144): floating-point assist fault at ip 403a9382, isr 0408 These are coming lately exclusively from the writer process... Actually, the machine has been up for 45 days and dmesg doesn't have timestamps, so I'm not sure if those pids have any relation to the ones currently in use. Ed
[GENERAL] large table vacuum issues
We need some advice on how to handle some large table autovacuum issues. One of our 8.1.2 autovacuums is launching a DB-wide vacuum on our 270GB database to prevent xid wrap-around, but is getting hung up and/or bogged down for hours on a 40GB table and taking the server performance down with it, apparently due to an IO bottleneck. The autovac child process becomes completely unresponsive to SIGTERM/SIGINT; only a SIGKILL restart, with the autovac daemon disabled, gets us back to adequate performance for now. We are discussing how to partition the table (difficult due to existing foreign keys in other tables), and archiving/clearing data. Are there any other tricks to get it past this large table for the time being and still get the xid wraparound fix? TIA. Ed ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] large table vacuum issues
On Friday 04 January 2008 6:21 pm, Scott Marlowe wrote: Have you tried adjusting the

#vacuum_cost_delay = 0          # 0-1000 milliseconds
#vacuum_cost_page_hit = 1       # 0-10000 credits
#vacuum_cost_page_miss = 10     # 0-10000 credits
#vacuum_cost_page_dirty = 20    # 0-10000 credits
#vacuum_cost_limit = 200        # 0-10000 credits

settings to something so as to make vacuum less intrusive? Might be the easiest fix. Any particular suggested changes for these parameters? Ed
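The cost-based delay only engages when vacuum_cost_delay is nonzero. A hedged sketch of non-default values (my illustration, not from this thread) meant to throttle vacuum's I/O on a busy server:

```
# postgresql.conf sketch -- illustrative values, to be tuned per system
vacuum_cost_delay = 20      # sleep 20 ms each time the cost limit is reached
vacuum_cost_limit = 200     # accumulated page-cost that triggers the sleep
# the page_hit/page_miss/page_dirty weights (1/10/20) can usually stay put
```

With these settings a vacuum proceeds in short bursts instead of saturating the disk, trading a longer vacuum runtime for less interference with foreground queries.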
Re: [GENERAL] large table vacuum issues
On Friday 04 January 2008 6:21 pm, Scott Marlowe wrote: On Jan 4, 2008 6:38 PM, Ed L. [EMAIL PROTECTED] wrote: We need some advice on how to handle some large table autovacuum issues. One of our 8.1.2 First of all, update your 8.1 install to 8.1.10. Failing to keep up with bug fixes is negligent. who knows, you might be getting bitten by a bug that was fixed between 8.1.2 and 8.1.10 Could be. But like you said, who knows. In some environments, downtime for upgrading costs money (and more), too, sometimes even enough to make it negligent to take downtime to keep up with bug fixes (and, of course, the new bugs), which may or may not be a factor at hand. While the time required to restart a DB may be negligible, there are often upstream/downstream dependencies that greatly expand the actual downtime for the customer. How much would downtime need to cost before you thought it negligent to upgrade immediately? It's a tradeoff, not well-supported by simple pronouncements, and one the customer and provider are best qualified to make. Ed
[GENERAL] view management
I have a question about view management... I often have need for views that reference views that reference views, and so on. When I need to make a small update to one of the views, I am faced with having to drop and recreate all dependent views even if the driving change just adds another column to the view, for example. I might have to drop and recreate many tens of views just to make a change to a single view. What a PITA. How do others manage this? TIA. Ed
Re: [GENERAL] view management
On Friday 16 November 2007 1:57 pm, Ed L. wrote: I have a question about view management... I often have need for views that reference views that reference views, and so on. When I need to make a small update to one of the views, I am faced with having to drop and recreate all dependent views even if the driving change just adds another column to the view, for example. I might have to drop and recreate many tens of views just to make a change to a single view. What a PITA. How do others manage this? And before you tell me all about DROP VIEW ... CASCADE, please note I'm not talking about that. I'm talking about the difficulties of having to recreate all views in the entire subtree of view dependencies just to change one minor aspect of an independent view. TIA... Ed
Re: [GENERAL] view management
On Friday 16 November 2007 2:09 pm, Merlin Moncure wrote: you have to rig a build system. if you have a lot of views (which is good), and keeping them up to date is a pain, you have to automate their creation. simplest way to do that is to rig a build system around sql scripts. when you create a view the first time, save its creation script in a .sql file and replay that when you need it. if you like to get fancy, you can always work solutions around make, etc on top of this. there are other tricks...for example you could grep objects dropped by the database and replay them. That looks about as ugly as can be. Ugh. What it appears to boil down to is that views become unusable unless you are willing to invest the effort in a complex build system. The DB should handle this issue automatically. Does Oracle? There is a slightly related todo item: --- # Allow VIEW/RULE recompilation when the underlying tables change Another issue is whether underlying table changes should be reflected in the view, e.g. should SELECT * show additional columns if they are added after the view is created. --- Looks like a tricky problem. What I think is needed is some sort of column-level linkage tracking between views, rather than just view-level dependencies. For example:

create table foo (id integer, msg varchar);
create table bar (id integer, msg varchar);
create view fooview as select id from foo;
create view barview as
    select b.*, f.id as fooid
    from bar b join fooview f on b.id = f.id;

When barview is created, f.id would need to be noted as depending on fooview.id rather than just noting that barview depends on fooview. Then, if someone decides to recreate fooview to include foo.msg, it can be discerned that the change will not impact barview at all. That approach looks far too simple to have not been done already.
Re: [GENERAL] view management
On Friday 16 November 2007 2:48 pm, Scott Marlowe wrote: On Nov 16, 2007 3:43 PM, Ed L. [EMAIL PROTECTED] wrote: That looks about as ugly as can be. Ugh. What it appears to boil down to is that views become unusable unless you are willing to invest the effort in a complex build system. The DB should handle this issue automatically. Does Oracle? Really? I find a build system to be a pretty necessary part of enterprise development. I can't imagine trying to keep track of what I've done to my db without using some kind of simple .sql scripts with all my ddl in them. And I use views and user defined functions a lot. The overall schema upgrade management system is not the difficult part. I find the difficulty comes with, for example, 5 levels of view dependencies. The view you want to update requires you to rebuild 15 others, which in turn require you to trace back another 15 views, and so on until you reach the leaves of the tree. You don't know those dependencies when you create the first few views. Maybe you just manually discover all these dependency paths each time you decide to change a view with dependencies. That's the part I'm griping about and for which I was hoping for a better way.
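For the discovery step, the dependency information is in the catalogs. Here is a hedged sketch (mine, not from this thread) of listing the views that directly depend on a given view, by walking pg_depend through each dependent view's rewrite rule; catalog names are per the 8.x era and worth verifying on your version:

```sql
-- List views that directly reference 'fooview' (name illustrative).
-- View dependencies are recorded against the dependent view's ON SELECT
-- rule in pg_rewrite, hence the join through pg_depend.objid.
SELECT DISTINCT dep.relname AS dependent_view
FROM pg_depend d
JOIN pg_rewrite r  ON r.oid = d.objid
JOIN pg_class  dep ON dep.oid = r.ev_class
WHERE d.refclassid = 'pg_class'::regclass
  AND d.refobjid = 'fooview'::regclass
  AND dep.oid <> 'fooview'::regclass;
```

Re-running the query on each result walks the tree one level at a time; recursive CTEs that could do it in one statement only arrive in 8.4.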
Re: [GENERAL] view management
Thanks, Justin. On Friday 16 November 2007 4:38 pm, Justin Pasher wrote: We have a system that has quite a few views to access some of the data (although we purposely tried to avoid views that pulled from other views due to some performance issues), but when we had all of the view interdependencies, we had a simple shell script that ran through a list of SQL files and imported them one after the other. If we ever had to drop a view that cascaded to other dependent views, we would just do the DROP VIEW ... CASCADE, then run the shell script to recreate all of the views. In your situation, the time-consuming part would be the initial creation of the script to get the SQL files in the correct order. After that is done, it's just a matter of proper maintenance to keep it working.
Re: [GENERAL] Converting non-null unique idx to pkey
On Tuesday 21 August 2007 11:40 pm, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: Are there any known or obvious gotchas associated with transforming a unique index on a non null column into a primary key via this sql? update pg_index set indisprimary = 't' where indexrelid = my non-null unique index oid The problem with that is there won't be any pg_constraint entry, nor any pg_depend entries linking to/from the constraint... Thanks. Sounds messy enough, I'll try another route. Ed
[GENERAL] Converting non-null unique idx to pkey
I'm preparing a fairly large 7.4.6 DB for trigger-based replication. I'm looking for ways to minimize my impact on the existing schema data and uptime. This replication solution requires every table to have a primary key. Rather than adding a new key column and index for the pkey, it's appealing to just reuse existing unique indices on non-null columns. Are there any known or obvious gotchas associated with transforming a unique index on a non null column into a primary key via this sql? update pg_index set indisprimary = 't' where indexrelid = my non-null unique index oid TIA. Ed
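For comparison, the catalog-free route would look roughly like this (a sketch with illustrative names; it builds a second unique index rather than reusing the existing one, so it costs an index build and the associated lock):

```sql
-- Supported way to get a pkey plus the pg_constraint/pg_depend entries:
ALTER TABLE foo ALTER COLUMN id SET NOT NULL;  -- no-op if already NOT NULL
ALTER TABLE foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id);
-- The now-redundant unique index can then be dropped:
DROP INDEX foo_id_unique_idx;
```

The appeal of the pg_index hack is precisely avoiding that rebuild on a large table, which is what makes the missing pg_constraint/pg_depend rows the gotcha to weigh.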
Re: [GENERAL] history table
On Tuesday 21 August 2007 1:22 pm, Robin Helgelin wrote: Yes, this is where I'm too new to postgresql, how do I tell the database which user is logged in to the webapp? A session parameter? There will be connection pooling, but if I know how to solve the previous question I don't think it's hard to get it working with the pool. Tablelog looks pretty cool. One way to handle your user ID issue would be to initiate a user session by storing a session record (for example: id, username, starttime), then have your app pass that session ID to your updates for history. Then you could store the user ID in an update_session_id column and tablelog would help keep track of the history. Ed
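A minimal sketch of that session-record idea (all names are illustrative, not from tablelog itself):

```sql
-- One row per webapp login session.
CREATE TABLE app_session (
    id        serial PRIMARY KEY,
    username  varchar NOT NULL,
    starttime timestamp NOT NULL DEFAULT now()
);

-- At login, the webapp inserts a row and remembers the generated id:
INSERT INTO app_session (username) VALUES ('jdoe');

-- Audited tables carry the session id on every write, so the history
-- tool can attribute each change to a web user even through a pool:
ALTER TABLE mytable ADD COLUMN update_session_id integer
    REFERENCES app_session (id);
```

Because the session id travels with the data rather than with the database connection, it survives connection pooling, where one backend serves many web users.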
Re: [GENERAL] history table
On Tuesday 21 August 2007 1:42 pm, Ed L. wrote: Then you could store the user ID in an update_session_id column and tablelog would help track of the history. s/user ID/session ID/g; Ed
Re: [GENERAL] Converting non-null unique idx to pkey
On Tuesday 21 August 2007 1:45 pm, Scott Marlowe wrote: If you have a large db in 7.4.6, you should do two things. 1: Update to 7.4.19 or whatever the latest flavor of 7.4 is, right now. There are a few known data eating bugs in 7.4.6. Sounds like good advice from a strictly technical viewpoint. Unfortunately, in our particular real world, there are also political, financial, and resource constraints and impacts from downtime that at times outweigh the technical merits of upgrading 'right now'. update pg_index set indisprimary = 't' where indexrelid = my non-null unique index oid I wouldn't bet on that working right. Others know the internals of the db better than me, but that looks like a foot gun. I'd still love to hear from any who know the internals well enough to say if this should work or if it's a bad idea. It appears to work in some cursory testing. TIA. Ed
[GENERAL] Using oid as pkey
What are the concerns with using oid as the column for a primary key declaration for use in trigger-based replication? TIA, Ed
[GENERAL] 8.1.2 select for update issue
We're seeing some unexpected behavior in one particular 64-bit Pgsql 8.1.2 running on HP-UX 11.23 and Itanium 2, built with --enable-thread-safety. We think we are seeing concurrent select-for-updates of the same rows by multiple concurrent backends, contrary to our understanding of select-for-update semantics. The rows are selected by each client process as follows:

SELECT * from foo where eventprocessed = 'f' and inprogress = 'f' and eventstructure is not NULL order by key asc for update limit 25;

Once the rows are selected, they are then updated within the same transaction, for example, as follows:

update foo set inprogress = 't' where key in (10169339);

We think the rows should be locked, unselectable for update, and that the update above should remove them from selection in any subsequent select-for-updates like the one above. However, we see one backend selecting and locking a set of rows, and while it presumably has them locked and is chugging through doing updates like the one above, we see another backend select-for-update grabbing some of the same rows and performing updates. We're unable to reproduce this scenario on demand, but it does consistently happen about 1/3 of the time on this busy system whenever we turn on the second process. Any suggestions on where to hunt? TIA, Ed
[GENERAL] more select-for-update questions
I ran a simple select-for-update test on 8.1.2 and was curious as to why the semantics are what they are. Specifically, when you have multiple select-for-update-limit queries on the same rows, why are rows selected by the blocked query before knowing if some of those rows will be removed/eliminated by a preceding select-for-update-limit? Here's an example. I created this table:

create table foo (
    id serial,
    done boolean not null default false,
    msg varchar);

Then I inserted some data:

select * from foo;
 id | done |             msg
----+------+------------------------------
  1 | f    | Mon Aug 6 12:09:11 MDT 2007
  2 | f    | Mon Aug 6 12:09:12 MDT 2007
  3 | f    | Mon Aug 6 12:09:13 MDT 2007
  4 | f    | Mon Aug 6 12:09:14 MDT 2007
  5 | f    | Mon Aug 6 12:09:15 MDT 2007
(5 rows)

Then in transaction A:

begin;
select * from foo where not done for update limit 3;
 id | done |             msg
----+------+------------------------------
  1 | f    | Mon Aug 6 12:09:11 MDT 2007
  2 | f    | Mon Aug 6 12:09:12 MDT 2007
  3 | f    | Mon Aug 6 12:09:13 MDT 2007
(3 rows)

update foo set done = 't' where id < 4;
UPDATE 3
select * from foo;
 id | done |             msg
----+------+------------------------------
  4 | f    | Mon Aug 6 12:09:14 MDT 2007
  5 | f    | Mon Aug 6 12:09:15 MDT 2007
  1 | t    | Mon Aug 6 12:09:11 MDT 2007
  2 | t    | Mon Aug 6 12:09:12 MDT 2007
  3 | t    | Mon Aug 6 12:09:13 MDT 2007
(5 rows)

Then in transaction B, before committing transaction A:

begin;
select * from foo where not done for update limit 3;
(this blocks transaction B awaiting transaction A commit)

Then, just after commit in transaction A, I see the previously-blocked query in transaction B return:

select * from foo where not done for update limit 3;
 id | done | msg
----+------+-----
(0 rows)

It returns zero rows when I expected it to return two (ids 4 and 5). 
If I immediately run the same query again in transaction B, I see what I expected to see in the preceding query:

select * from foo where not done for update limit 3;
 id | done |             msg
----+------+------------------------------
  4 | f    | Mon Aug 6 12:09:14 MDT 2007
  5 | f    | Mon Aug 6 12:09:15 MDT 2007
(2 rows)

So, B is selecting rows for update and applying the limit prior to knowing which rows will be excluded by A's updates. I know that is well-documented behavior. It just seems pretty unintuitive. I'm just wondering if there is some good reason for it. TIA. Ed
Re: [GENERAL] 8.1.2 select for update issue
On Monday 06 August 2007 1:22 pm, you wrote: Ed L. [EMAIL PROTECTED] writes: We're seeing some unexpected behavior in one particular 64-bit Pgsql 8.1.2 running on HP-UX 11.23 and Itanium 2, built with --enable-thread-safety. We think we are seeing concurrent select-for-updates of the same rows by multiple concurrent backends, contrary to our understanding of select-for-update semantics. You really ought to be using something newer than 8.1.2. Perhaps. But we have yet to find a way to make major version upgrades of 100+ GB, 100+ tps databases sufficiently inexpensive and painless in terms of SAN space, performance costs, and customer downtime on heavily loaded systems. So we put them off until there is a clear, directly compelling reason to upgrade. You do have a transaction block established around this whole process? Row locks only last as long as the current transaction ... Of course. This is grasping at straws, but I was wondering if perhaps anyone saw anything in this behavior that might suggest a threadsafe-related anomaly? TIA. Ed
Re: [GENERAL] 8.1.2 select for update issue
On Monday 06 August 2007 2:11 pm, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: On Monday 06 August 2007 1:22 pm, you wrote: You really ought to be using something newer than 8.1.2. Perhaps. But we have yet to find a way to make major version upgrades of 100+ GB, I did not suggest a major version upgrade. My mistaken assumption. We are considering an upgrade to 8.1.9. I see the list of bugfixes between 8.1.2 and 8.1.9 is lengthy. Ed
Re: [GENERAL] query log corrupted-looking entries
On Friday 08 June 2007 10:30 am, George Pavlov wrote: It is very hard to tease these apart because now that I look at it closely it is a total mess; there are multiple interruptions and interruptions inside of interruptions... The interruption can happen anywhere, including the leading log timestamp This describes my experiences as well on our high-volume clusters. Ed
Re: [GENERAL] query log corrupted-looking entries
On Friday 01 June 2007 3:09 pm, George Pavlov wrote: On 5/29/2007 10:19 AM, Ed L. wrote: On Wednesday 23 May 2007 1:04 pm, George Pavlov wrote: FWIW, I've also been seeing this sort of query log corruption for as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64), Linux on intel, amd... Do you have any tricks for dealing with the problem from a query analyzer perspective? That is, if you use something like pgfouine do you have any quick and easy way to remove those lines (and the affected lines around them)? Or someway to fix the corrupted lines? I'd say that lately ~40% of my daily query logs are suffering from this problem making query analysis very hard. Not really. I filter by perl regular expressions, but I wouldn't say that is reliable. However, our query volumes are high enough that losing a few here and there is no big deal in analysis. I long ago realized that reliably replaying query logs was infeasible due in part to this issue. Regex filtering is problematic and error-prone (for example, how do you reliably tell the difference between log timestamp and DB data?). Perhaps others have better answers. Ed
Re: [GENERAL] query log corrupted-looking entries
On Friday 01 June 2007 3:36 pm, Tom Lane wrote: George Pavlov [EMAIL PROTECTED] writes: On 5/29/2007 10:19 AM, Ed L. wrote: FWIW, I've also been seeing this sort of query log corruption for as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64), Linux on intel, amd... What *exactly* is the logging setup you guys use, and have you tried alternatives? As best I can tell, the variant with a downstream syslogger process ought to be OK for messages up to PIPEBUF bytes long on most standard Unixen. Of course that's no help if you've got lots of very verbose queries, but the examples posted so far didn't seem to indicate that. On older versions, we use stderr piped to Apache's rotatelogs. On newer versions with built-in logging, we use it. We do have considerably verbose queries (sometimes a page or so). No idea what PIPEBUF equals. Ed
Re: [GENERAL] query log corrupted-looking entries
On Friday 01 June 2007 3:36 pm, Tom Lane wrote: What *exactly* is the logging setup you guys use, and have you tried alternatives?

redirect_stderr = on                        # Enable capturing of stderr into log
log_directory = '/users/.../logs'           # Directory where log files are written
log_filename = 'server_log.%a.%H:00-%H:59'  # Log file name pattern.
log_truncate_on_rotation = on               # If on, any existing log file of the same
log_rotation_age = 60min                    # Automatic rotation of logfiles will
log_min_messages = notice                   # Values, in order of decreasing detail:
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%m [%p] %u %r %d '       # Special values:
log_statement = 'all'                       # none, ddl, mod, all

I have not tried alternatives. The status quo is adequate. Ed
Re: [GENERAL] query log corrupted-looking entries
On Wednesday 23 May 2007 1:04 pm, George Pavlov wrote: Hoping to resurrect this thread. I am seeing more and more of this as the database gets more usage and it really messes up query log analysis. A quick summary: when I posted this, I was getting corrupted query log entries. I still am. They look like this (the problem line plus one line before and after):

2007-05-15 14:24:52 PDT [3859]: [968-1] LOG: duration: 0.071 ms
2007-05-15 14:24:52 PDT [3859]: [969-1] LOG: statement2007-05-15 14:24:52 PDT [3885]: [208-1] LOG: duration: 11.800 ms

FWIW, I've also been seeing this sort of query log corruption for as long as I can remember, 7.1 through 8.2, HPUX (parisc, ia64), Linux on intel, amd... Ed
Re: [GENERAL] HP/Pgsql/DBD::Pg issue
On Thursday 26 April 2007 9:42 am, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: After a reboot (and usually after an OS patch) on our HP-UX 11.23 64-bit Itanium DB servers, our libpq/DBD::Pg libraries cease to work. Instead, they give the standard message you get when the DB cluster is not running. Try ktrace'ing the client to see what it's doing at the kernel-call level. (I think HPUX's equivalent is just called trace btw.) Attached is a small tar.gz file containing a short perl DBI connection script that repeatedly demonstrates this problem. There are also two log files containing tusc output (an HP syscall trace utility), one for the 32-bit run (which works) and another for the 64-bit run (which fails). I haven't made much sense of it yet, so any help deciphering is appreciated. TIA. Ed

Attachment: connfail.tar.gz (application/tgz)
Re: [GENERAL] HP/Pgsql/DBD::Pg issue
On Tuesday 01 May 2007 2:23 pm, Tom Lane wrote: Well, it's going wrong here:

socket(AF_INET, SOCK_STREAM, 0) .................... = 4
setsockopt(4, 0x6, TCP_NODELAY, 0x9fffe210, 4) ..... = 0
fcntl(4, F_SETFL, 65536) ........................... = 0
fcntl(4, F_SETFD, 1) ............................... = 0
connect(4, 0x60416ea0, 16) ......................... = 0
getsockopt(4, SOL_SOCKET, SO_ERROR, 0x9fffe32c, 0x9fffe338) = 0
close(4) ........................................... = 0

The close() indicates we're into the failure path, so evidently the getsockopt returned a failure indication (though it's hard to tell what --- strerror() isn't providing anything useful). What strikes me as odd about this is that the connect() really should have returned EINPROGRESS or some other failure code, because we're doing it in nonblock mode. A zero return implies that the connection is already made, which it shouldn't be if you're connecting to some other machine (if this is a local connection then maybe it's sane, but I don't see that here when testing loopback TCP connections). So I wonder if connect() is blowing it here and claiming the connection is ready when it's not quite yet. Another possibility is that getsockopt() is returning bad data, which smells a bit more like the sort of thing that might go wrong in 64 vs 32 bit mode. It is indeed a local connection using PGHOST=`hostname`. That name maps to one of the external NIC IPs, not to the normal 127.0.0.1 loopback address. For context, I've seen this a number of times over the past couple years, from pgsql 7.3.x to 8.1.x, HPUX 11.00 to 11.23, 32-bit-only and 32/64 Itaniums, always via a local connection using `hostname` mapping to an external NIC. What it is about the reboots that triggers this remains a mystery. Ed
Re: [GENERAL] HP/Pgsql/DBD::Pg issue
On Tuesday 01 May 2007 2:46 pm, Ed L. wrote: It is indeed a local connection using PGHOST=`hostname`. That name maps to one of the external NIC IPs, not to the normal 127.0.0.1 loopback address. For context, I've seen this a number of times over the past couple years, from pgsql 7.3.x to 8.1.x, HPUX 11.00 to 11.23, 32-bit-only and 32/64 Itaniums, always via a local connection using `hostname` mapping to an external NIC. What it is about the reboots that triggers this remains a mystery. Not to create a red herring, I should add it also fails for PGHOST=localhost/127... Only relinking/reinstalling with 32-bit perl seems to fix it. I will see if I can tweak fe-connect.c. Ed
[GENERAL] HP/Pgsql/DBD::Pg issue
After a reboot (and usually after an OS patch) on our HP-UX 11.23 64-bit Itanium DB servers, our libpq/DBD::Pg libraries cease to work. Instead, they give the standard message you get when the DB cluster is not running. But we *know* it is running and all access paths are working. We have found a workaround: switch from 64-bit perl to 32-bit perl, build a 32-bit pgsql, and rebuild the perl DBD module using 32-bit perl, linked with the 32-bit pgsql. But the fact we're having to do that is a problem for us. I don't understand this problem and am at a loss as to where to look. Any ideas? TIA. Ed
Re: [GENERAL] HP/Pgsql/DBD::Pg issue
On Thursday 26 April 2007 8:50 am, Ed L. wrote: After a reboot (and usually after an OS patch) on our HP-UX 11.23 64-bit Itanium DB servers, our libpq/DBD::Pg libraries cease to work. Instead, they give the standard message you get when the DB cluster is not running. But we *know* it is running and all access paths are working. We have found a workaround by switching from 64-bit perl to 32-bit perl, build a 32-bit pgsql, and rebuild the perl DBD module using 32-bit perl and linking with the 32-bit pgsql. But the fact we're having to do that is a problem for us. I don't understand this problem and am at a loss as to where to look. Any ideas? I should add that it is only these client apps that are running on the DB server itself. DBD apps connecting remotely don't have any problems. TIA. Ed
[GENERAL] dropping role w/dependent objects
This is pgsql 8.2.3:

% psql -c "drop role mygroup"
ERROR:  role "mygroup" cannot be dropped because some objects depend on it
DETAIL:  227 objects in this database

How do I identify what these dependent objects are? I've removed all of the users from this group, turned up server logging to debug5, and searched docs to no avail. TIA. Ed
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 4:35 pm, Alvaro Herrera wrote: Note that you can give the objects owned by that role to someone else with REASSIGN OWNED, and drop the objects with DROP OWNED (note that they act differently regarding grants; see the docs) Yes, but how do I identify what they are so that I know if I want to DROP OWNED them? TIA. Ed
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 4:39 pm, Ed L. wrote: On Wednesday April 4 2007 4:35 pm, Alvaro Herrera wrote: Note that you can give the objects owned by that role to someone else with REASSIGN OWNED, and drop the objects with DROP OWNED (note that they act differently regarding grants; see the docs) Yes, but how do I identify what they are so that I know if I want to DROP OWNED them? The REASSIGN OWNED appears to be insufficient:

% psql -c "reassign owned by mygroup to mydba"
REASSIGN OWNED
% psql -c "drop group mygroup"
ERROR:  role "mygroup" cannot be dropped because some objects depend on it
DETAIL:  225 objects in this database

Thanks, Ed
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 4:41 pm, Ed L. wrote: On Wednesday April 4 2007 4:39 pm, Ed L. wrote: On Wednesday April 4 2007 4:35 pm, Alvaro Herrera wrote: Note that you can give the objects owned by that role to someone else with REASSIGN OWNED, and drop the objects with DROP OWNED (note that they act differently regarding grants; see the docs) Yes, but how do I identify what they are so that I know if I want to DROP OWNED them? The REASSIGN OWNED appears to be insufficient: % psql -c "reassign owned by mygroup to mydba" REASSIGN OWNED % psql -c "drop group mygroup" ERROR: role "mygroup" cannot be dropped because some objects depend on it DETAIL: 225 objects in this database I did a before and after dump to compare. The objects it is complaining about are GRANTs to that group. Hmm... it seems like you shouldn't have to drop every grant for a group you're dropping. Thanks, Ed
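[Editor's note for later readers: since the leftover dependencies turned out to be GRANTs, the documented way to clear them is DROP OWNED, which (per the 8.2 reference page) removes privileges granted to the role as well as objects it owns. A sketch of the full sequence, using the thread's example role names:]

```sql
-- REASSIGN OWNED moves ownership but leaves the role's grants behind;
-- DROP OWNED then revokes those grants, after which the role can go.
REASSIGN OWNED BY mygroup TO mydba;  -- transfer object ownership
DROP OWNED BY mygroup;               -- revokes privileges granted to mygroup
DROP ROLE mygroup;
```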
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 4:48 pm, Alvaro Herrera wrote: Yes, but how do I identify what they are so that I know if I want to DROP OWNED them? There's no way AFAICT, short of peeking at the catalogs (or information_schema). Try pg_shdepend. I guess if the bug were fixed, it'd be a non-issue. Thanks. Ed
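[Editor's note: following the pg_shdepend pointer, a rough query along these lines can enumerate the dependent entries. Column interpretation is from the 8.2-era catalogs (deptype 'a' = ACL/grant dependency, 'o' = ownership); the role name is the thread's example:]

```sql
-- Sketch: list shared-dependency entries referencing a role.
-- classid identifies which catalog the dependent object lives in;
-- objid is its OID there.  dbid 0 covers shared objects.
SELECT d.deptype, c.relname AS dep_catalog, d.objid
FROM pg_shdepend d
JOIN pg_class c ON c.oid = d.classid
WHERE d.refobjid = (SELECT oid FROM pg_roles WHERE rolname = 'mygroup')
  AND d.dbid IN (0, (SELECT oid FROM pg_database
                     WHERE datname = current_database()));
```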
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 5:02 pm, Alvaro Herrera wrote: I guess if the bug were fixed, it'd be a non-issue. Sure, please submit a patch. It should not be too difficult. Perhaps this could be added to the TODO list? I won't get to it anytime soon. Ed
Re: [GENERAL] dropping role w/dependent objects
On Wednesday April 4 2007 5:37 pm, Bruce Momjian wrote: Perhaps this could be added to the TODO list? I won't get to it anytime soon. Yes. What should the TODO text be? See if the attached patch is acceptable. If not, perhaps the TODO text should be: Enable end user to identify dependent objects when the following error is encountered: ERROR: role "mygroup" cannot be dropped because some objects depend on it DETAIL: 227 objects in this database

Index: ./src/backend/catalog/pg_shdepend.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/catalog/pg_shdepend.c,v
retrieving revision 1.17
diff -C1 -r1.17 pg_shdepend.c
*** ./src/backend/catalog/pg_shdepend.c   3 Mar 2007 19:32:54 -   1.17
--- ./src/backend/catalog/pg_shdepend.c   5 Apr 2007 00:05:56 -
***************
*** 484,488 ****
   * We try to limit the number of reported dependencies to something sane,
!  * both for the user's sake and to avoid blowing out memory.
   */
! #define MAX_REPORTED_DEPS 100
--- 484,497 ----
   * We try to limit the number of reported dependencies to something sane,
!  * both for the user's sake and to avoid blowing out memory.  But since
!  * this is the only way for an end user to easily identify the dependent
!  * objects, make the limit pretty big.  Generously assuming each object
!  * description is 64 chars long, and assuming we add some commentary of
!  * up to 15 chars in storeObjectDescription(), that's ~80 chars per
!  * object.  If we allow 2000, that's 160Kb, which is reasonable.  If the
!  * installer gets wild and uses 128 character names, that's still only
!  * 320Kb.  These sorts of high numbers of dependencies are reached quite
!  * easily when a sizeable schema of hundreds of tables has specific grants
!  * on each relation.
   */
! #define MAX_REPORTED_DEPS 2000
Re: [GENERAL] daylight savings patches needed?
We have a 7.4.6 cluster which has been running on an HP B.11.00 box for quite some time. The IT group applied daylight savings patches to the OS, but the cluster is still showing the incorrect timezone: $ psql -c "select now()" now --- 2007-03-12 13:46:11.369583-05 $ date Mon Mar 12 14:46:18 EDT 2007 It appears that newly initialized 7.4.6 clusters get the proper timezone. I didn't see anything in the FAQ about restarting. Does this mean that we need to restart these clusters in order to get the timezone updates from the OS? Are they cached in the postmaster? TIA. Ed
Re: [GENERAL] daylight savings patches needed?
On Monday March 12 2007 1:07 pm, Ed L. wrote: Does this mean that we need to restart these clusters in order to get the timezone updates from the OS? Are they cached in the postmaster? Nevermind. I just found it via googling. Would I be correct in understanding that every pre-8.0 cluster must be restarted in order for the OS changes to take effect?!? Perhaps it should be a part of the FAQ answer at http://www.postgresql.org/docs/faqs.FAQ.html#item1.14 Ed
Re: [GENERAL] daylight savings patches needed?
On Monday March 12 2007 4:08 pm, Martijn van Oosterhout wrote: On Mon, Mar 12, 2007 at 01:13:42PM -0600, Ed L. wrote: Would I be correct in understanding that every pre-8.0 cluster must be restarted in order for the OS changes to take effect?!? Possibly, I imagine many C libraries would cache the timezone data over a fork and might not notice the changes... That was a nasty little surprise. Thanks, Ed
Re: [GENERAL] vacuum error
On Tuesday March 6 2007 11:52 pm, Peter Eisentraut wrote: Ed L. wrote: Right. I'm asking if the fix for this problem is in the new 8.1.8 software, or in the new DB structure resulting from the initdb, or perhaps both. There is no new DB structure in 8.1.8, which is why you can update without initdb. Consult the release notes for details. Perhaps my question was not clear enough. Let me rephrase: Does the fix for this problem come from a *fresh* DB structure resulting from the initdb, or from a software fix in 8.1.8, or both? The answer makes a big difference with a 200gb database. If it's in the software alone, we can simply restart. If it's in the DB structure, we have to migrate 200gb of data from one PGDATA to another. Ed
Re: [GENERAL] invalid page header in pg_statistic
On Wednesday February 7 2007 9:01 am, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: How do I fix this 7.4.6 issue short of initdb? invalid page header in block 110 of relation "pg_statistic" I looked at the block via pg_filedump (included below), and it does not appear to me to be corrupted, so not sure what I would zero out, if anything. Seems odd to me too ... but pg_statistic doesn't contain any irreplaceable data. See if you can TRUNCATE it. If so, do an ANALYZE to repopulate it. I don't recall if 7.4 has any special hoops you'd have to jump through to truncate a system catalog ... Scheduled downtime finally arrived last night, and I tried unsuccessfully to truncate this table: postgres -D $PGDATA -O -o standalone_log mydb truncate pg_statistic; ^D pg_ctl start psql -d mydb -c "vacuum analyze" The truncate showed no errors. The vacuum analyze showed the same error in block 110 of the pg_statistic table. Any ideas what I'm missing here? Is there another acceptable way to truncate that table, perhaps rm file; touch file? Ed
Re: [GENERAL] vacuum error
On Wednesday March 7 2007 3:13 am, Martijn van Oosterhout wrote: On Wed, Mar 07, 2007 at 02:29:08AM -0700, Ed L. wrote: Perhaps my question was not clear enough. Let me rephrase: Does the fix for this problem comes from a *fresh* DB structure resulting from the initdb, or from a software fix in 8.1.8, or both? The answer makes a big difference with a 200gb database. If it's in the software alone, we can simply restart. If it's in the DB structure, we have to migrate 200gb of data from one PGDATA to another. In your case you could also get running by creating the missing file, but whether that's appropriate depends on the cause. I'd check the release notes to see if any related bugs have been fixed since then. How would I go about correctly creating the missing file? That sounds appealing, as if it were something I could do without taking downtime. Is it? In reviewing the release notes between 8.1.2 and 8.1.8, there are a number of vacuum fixes along with many others. It is unclear if any of them are related. Given this error is occurring while doing a VACUUM FREEZE of the template0 database, I wonder if that creates any options for me? Frankly, I'd be happy to simply drop the template0 DB if I could, since I'm not aware that we ever need it for anything. Ed
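[Editor's note: the usual list recipe for "creating the missing file" is a zero-filled clog segment. This is a hypothetical sketch only, with a demo path standing in for the real location, and it is not risk-free:]

```shell
# Hypothetical sketch: recreate a missing clog segment as zero-filled pages.
# CAUTION: zeroed clog bytes mark the affected transactions as uncommitted,
# so rows written by those transactions will be treated as dead; whether
# that is acceptable depends entirely on how the file went missing.
# The demo path below is illustrative -- the real file lives under
# $PGDATA/pg_clog/ and the server should be stopped before touching it.
CLOG_DIR="${CLOG_DIR:-/tmp/pg_clog_demo}"
mkdir -p "$CLOG_DIR"
# One clog segment is 32 pages x 8 kB = 262144 bytes.
dd if=/dev/zero of="$CLOG_DIR/0C07" bs=8192 count=32 2>/dev/null
```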
[GENERAL] vacuum error
I am seeing the following error in pgsql 8.1.2:

2007-03-05 10:00:51.106 PST [9834] DEBUG: vacuuming "pg_toast.pg_toast_1260"
2007-03-05 10:00:51.106 PST [9834] DEBUG: index "pg_toast_1260_index" now contains 0 row versions in 1 pages
2007-03-05 10:00:51.106 PST [9834] DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
2007-03-05 10:00:51.106 PST [9834] DEBUG: "pg_toast_1260": found 0 removable, 0 nonremovable row versions in 0 pages
2007-03-05 10:00:51.106 PST [9834] DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec.
2007-03-05 10:00:51.106 PST [9834] DEBUG: vacuuming "pg_catalog.pg_statistic"
2007-03-05 10:00:51.107 PST [9834] ERROR: could not access status of transaction 3229475082
2007-03-05 10:00:51.107 PST [9834] DETAIL: could not open file "pg_clog/0C07": No such file or directory

What does it mean, and what should I do about it? TIA. Ed
Re: [GENERAL] vacuum error
On Tuesday March 6 2007 12:20 pm, Peter Eisentraut wrote: Ed L. wrote: I am seeing the following error in pgsql 8.1.2: ERROR: could not access status of transaction 3229475082 DETAIL: could not open file "pg_clog/0C07": No such file or directory What does it mean, and what should I do about it? 1. Read this thread: http://archives.postgresql.org/pgsql-general/2007-02/msg00820.php 2. Upgrade to the latest 8.1.* release. 3. If that doesn't help, check your system for faulty hardware, in particular for bad RAM. This is a 200gb DB with ~300 transactions/second serving 5 busy facilities, so downtime comes at a premium. We have some maintenance downtime planned for 2 weeks from now. I'm trying to understand if this can wait that long. It appears the only failure occurs during an autovacuum-initiated VACUUM FREEZE on template0 when it hits the pg_statistic table. However, that abort appears to be causing autovacuum to skip all its other duties as it endlessly restarts and fails again. Do I care if template0 gets a VACUUM FREEZE? Assuming not, is there a simple way to make autovacuum skip over template0 so it can tend to the important data in the other databases? Is restarting with 8.1.8 a known solution for this problem? Or is an initdb required to fix it? If initdb is required, we might as well move to the latest stable 8.2 version. I understand my options to minimize downtime to be limited to async replication. Other ideas? BTW, the RAM looks good. TIA. Ed
Re: [GENERAL] vacuum error
On Tuesday March 6 2007 3:53 pm, Joshua D. Drake wrote: Is restarting with 8.1.8 a known solution for this problem? Or is an initdb required to fix it? You can update to 8.1.8 (if you are running 8.1.x) without an initdb. Right. I'm asking if the fix for this problem is in the new 8.1.8 software, or in the new DB structure resulting from the initdb, or perhaps both. Ed
[GENERAL] db stats vs table stats
I've been periodically collecting the stats stored in pg_statio_all_tables and pg_stat_database for ~30 different clusters, and have noticed a curiosity. I would have thought that for a given period, the change in pg_stat_database.blks_read would be &lt;= the sum of the changes in pg_statio_user_tables.heap_blks_read + pg_statio_user_tables.idx_blks_read + pg_statio_user_tables.toast_blks_read + pg_statio_user_tables.tidx_blks_read. In short, the total would be &lt;= heap + idx + toast + tidx for user tables. It does not appear that way. The table-level IO stats appear to be typically 1-2 orders of magnitude larger than the db-level stats. Can anyone explain that? TIA. Ed
Re: [GENERAL] db stats vs table stats
Oops, typo: I reversed the inequality. I've corrected it below. On Friday February 23 2007 2:02 pm, Ed L. wrote: I've been periodically collecting the stats stored in pg_statio_all_tables and pg_stat_database for ~30 different clusters, and have noticed a curiosity. I would have thought that for a given period, the change in pg_stat_database.blks_read would be &gt;= the sum of the changes in pg_statio_user_tables.heap_blks_read + pg_statio_user_tables.idx_blks_read + pg_statio_user_tables.toast_blks_read + pg_statio_user_tables.tidx_blks_read. In short, the total would be &gt;= heap + idx + toast + tidx for user tables. It does not appear that way. The table-level IO stats appear to be typically 1-2 orders of magnitude larger than the db-level stats. Can anyone explain that? TIA. Ed
Re: [GENERAL] db stats vs table stats
On Friday February 23 2007 3:06 pm, Ed L. wrote: I've been periodically collecting the stats stored in pg_statio_all_tables and pg_stat_database for ~30 different clusters, and have noticed a curiosity... The table-level IO stats appear to be typically 1-2 orders of magnitude larger than the db-level stats. Can anyone explain that? Here's an example of how I'm calculating the deltas. Perhaps someone can spot an error or mistaken assumption. In this case, the deltas are not orders of magnitude out of sync with each other, but they grew from about 3% out of sync to 45% out of sync in ~35 minutes on a DB with 500 transactions/second.

drop table s;
create table s as
  select now(),
         blks_read as db_blks_read,
         sum(case when heap_blks_read ISNULL then 0 else heap_blks_read end
           + case when idx_blks_read ISNULL then 0 else idx_blks_read end
           + case when toast_blks_read ISNULL then 0 else toast_blks_read end
           + case when tidx_blks_read ISNULL then 0 else tidx_blks_read end) as table_blks_read
  from pg_stat_database sd, pg_database d, pg_class c, pg_statio_all_tables st
  where sd.datname = d.datname
    and d.datname = current_database()
    and c.oid = st.relid
  group by blks_read;

create or replace view delta_view as
  select now() - s.now as delta,
         blks_read - s.db_blks_read as db_blks_read_delta,
         sum(case when heap_blks_read ISNULL then 0 else heap_blks_read end
           + case when idx_blks_read ISNULL then 0 else idx_blks_read end
           + case when toast_blks_read ISNULL then 0 else toast_blks_read end
           + case when tidx_blks_read ISNULL then 0 else tidx_blks_read end) - s.table_blks_read as table_blks_read_delta
  from pg_stat_database sd, pg_database d, pg_class c, pg_statio_all_tables st, s
  where sd.datname = d.datname
    and d.datname = current_database()
    and c.oid = st.relid
  group by blks_read, s.now, db_blks_read, table_blks_read;

select * from delta_view;
      delta      | db_blks_read_delta | table_blks_read_delta
-----------------+--------------------+-----------------------
 00:32:51.007703 |             384243 |                556212
(1 row)
[GENERAL] invalid page header in pg_statistic
How do I fix this 7.4.6 issue short of initdb? invalid page header in block 110 of relation "pg_statistic" I looked at the block via pg_filedump (included below), and it does not appear to me to be corrupted, so not sure what I would zero out, if anything. TIA. Ed

***********************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 3.0
* File: 16408.backup-20070207-01:11:58PST
* Options used: -y -f -R 110 110
* Dump created on: Wed Feb 7 01:56:04 2007
***********************************************************

Block 110
Header -----
 Block Offset: 0x000dc000    Offsets: Lower 160 (0x00a0)  Upper 8192 (0x2000)
 Block: Size 8192  Version 1
 LSN: logid 178 recoff 0xf577b4e8    Special 8192 (0x2000)
 Items: 35    Free Space: 8032
 Length (including item array): 164

[raw hex dump of the page omitted; it was garbled in the archive transcript]

Data -----
 Item  1 -- Length: 0  Offset: 7888 (0x1ed0)  Flags: 0x00
 Item  2 -- Length: 0  Offset: 7536 (0x1d70)  Flags: 0x00
 Item  3 -- Length: 0  Offset: 7064 (0x1b98)  Flags: 0x00
 Item  4 -- Length: 0  Offset: 6760 (0x1a68)  Flags: 0x00
 Item  5 -- Length: 0  Offset: 6408 (0x1908)  Flags: 0x00
 Item  6 -- Length: 0  Offset: 5936 (0x1730)  Flags: 0x00
 Item  7 -- Length: 0  Offset: 5632 (0x1600)  Flags: 0x00
 Item  8 -- Length: 0  Offset: 5280 (0x14a0)  Flags: 0x00
 Item  9 -- Length: 0  Offset: 4808 (0x12c8)  Flags: 0x00
 Item 10 -- Length: 0  Offset: 4504 (0x1198)  Flags: 0x00
 Item 11 -- Length: 0  Offset: 4152 (0x1038)  Flags: 0x00
 Item 12 -- Length: 0  Offset: 3680 (0x0e60)  Flags: 0x00
 Item 13 -- Length: 0  Offset: 3376 (0x0d30)  Flags: 0x00
 Item 14 -- Length: 0  Offset: 3024 (0x0bd0)  Flags: 0x00
 Item 15 -- Length: 0  Offset: 2552 (0x09f8)  Flags: 0x00
 Item 16 -- Length: 0  Offset: 2248 (0x08c8)  Flags: 0x00
 Item 17 -- Length: 0  Offset: 1896 (0x0768)  Flags: 0x00
 Item 18 -- Length: 0  Offset: 1424 (0x0590)  Flags: 0x00
 Item 19 -- Length: 0  Offset: 1120 (0x0460)  Flags: 0x00
 Item 20 -- Length: 0  Offset:  768 (0x0300)  Flags: 0x00
 Item 21 -- Length: 0  Offset:  296 (0x0128)  Flags: 0x00
 Item 22 -- Length: 0  Offset: 3552 (0x0de0)  Flags: 0x00
 Item 23 -- Length: 0  Offset: 3296 (0x0ce0)  Flags: 0x00
 Item 24 -- Length: 0  Offset: 3144 (0x0c48)  Flags: 0x00
 Item 25 -- Length: 0  Offset: 2976 (0x0ba0)  Flags: 0x00
 Item 26 -- Length: 0  Offset: 2808 (0x0af8)  Flags: 0x00
 Item 27 -- Length: 0  Offset: 2640 (0x0a50)  Flags: 0x00
 Item 28 -- Length: 0  Offset: 2400 (0x0960)  Flags: 0x00
 Item 29 -- Length: 0  Offset: 7888 (0x1ed0)  Flags: 0x00
 Item 30 -- Length: 0  Offset: 7632 (0x1dd0)  Flags: 0x00
 Item 31 -- Length: 0  Offset: 7480 (0x1d38)  Flags: 0x00
 Item 32 -- Length: 0  Offset: 7312 (0x1c90)  Flags: 0x00
 Item 33 -- Length: 0  Offset: 7144 (0x1be8)  Flags: 0x00
 Item 34 -- Length: 0  Offset: 6976 (0x1b40)  Flags: 0x00
 Item 35 -- Length: 0  Offset: 6736 (0x1a50)  Flags: 0x00

*** End of Requested Range Encountered.
Last Block Read: 110 ***
Re: [GENERAL] invalid page header in pg_statistic
On Wednesday February 7 2007 9:01 am, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: How do I fix this 7.4.6 issue short of initdb? invalid page header in block 110 of relation "pg_statistic" I looked at the block via pg_filedump (included below), and it does not appear to me to be corrupted, so not sure what I would zero out, if anything. Seems odd to me too ... but pg_statistic doesn't contain any irreplaceable data. See if you can TRUNCATE it. If so, do an ANALYZE to repopulate it. I don't recall if 7.4 has any special hoops you'd have to jump through to truncate a system catalog ... $ psql -c "truncate table pg_statistic" ERROR: permission denied: "pg_statistic" is a system catalog I suppose this means single-user mode is required? Any other trick to do this during uptime? TIA. Ed
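[Editor's note: the single-user route would look roughly like the sketch below. This is untested against 7.4 - its standalone backend is invoked directly as `postgres`, with `-O` allowing system-table changes - and the follow-up elsewhere in this thread reports the error persisted even after the truncate succeeded, so treat it as a sketch of the mechanism only:]

```shell
# Sketch (7.4-era): truncate a system catalog in a standalone backend.
pg_ctl stop -D "$PGDATA"
echo "truncate pg_statistic;" | postgres -D "$PGDATA" -O mydb
pg_ctl start -D "$PGDATA"
psql -d mydb -c "analyze"   # repopulate the statistics
```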
[GENERAL] daylight savings patches needed?
From the FAQ: 1.14) Will PostgreSQL handle recent daylight saving time changes in various countries? PostgreSQL versions prior to 8.0 use the operating system's timezone database for daylight saving information. All current versions of PostgreSQL 8.0 and later contain up-to-date timezone information. In this context, assuming we have applied the relevant OS patches (RHEL, Debian, HPUX 11.11, 11.23), does the phrase "contain up-to-date timezone information" mean that all 7.[234].x and 8.x installations are prepared to properly handle the March 1, 2007 DST changes in the US? If not, what is needed? TIA. Ed
Re: [GENERAL] Index bloat of 4x
We have a large number (50+) of pre-8.2 clusters. How can I best/most easily identify those indices most bloated and in need of reindex/rebuilding? Ed
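[Editor's note: a crude first pass is to compare each index's on-disk size against its table's. This sketch uses `pg_relation_size()`, which exists from 8.1 on; on older clusters in the fleet the same idea works with `pg_class.relpages`. The one-half threshold is arbitrary, and real bloat estimation needs per-column width math:]

```sql
-- Rough heuristic: flag indexes whose size looks disproportionate
-- relative to their table (threshold is arbitrary, not a bloat proof).
SELECT ic.relname AS index_name,
       tc.relname AS table_name,
       pg_relation_size(ic.oid)      AS index_bytes,
       pg_relation_size(i.indrelid)  AS table_bytes
FROM pg_index i
JOIN pg_class ic ON ic.oid = i.indexrelid
JOIN pg_class tc ON tc.oid = i.indrelid
WHERE pg_relation_size(ic.oid) > pg_relation_size(i.indrelid) / 2
ORDER BY pg_relation_size(ic.oid) DESC;
```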
Re: [GENERAL] Index bloat of 4x
On Friday January 19 2007 2:11 am, Csaba Nagy wrote: I'm afraid I don't see how any of the answers I saw discussed fit a 24x7 operation. Reindex, drop index, vacuum full, ... they all block production queries of one sort or another for significant periods of time (minutes) on large (multi/tens of GB) tables, and thus are infeasible for true 24x7 operations.[snip] This is not completely true, as of 8.2 there is an online index build, and if that could be used in a background thread to rebuild the index and replace the bloated one once it's finished, that would be a non-blocking operation which could be done in 24x7 situations. Online index creation definitely helps us toward 24x7. But wouldn't we still have to drop the old index, thus blocking production queries? Ed
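[Editor's note on the drop question: yes, the old index still has to be dropped, but unlike the rebuild it is a near-instant catalog operation, so the exclusive lock is held only momentarily. A sketch of the swap on 8.2, with hypothetical index and column names:]

```sql
-- Build a replacement index without blocking writes (8.2+), then swap.
CREATE INDEX CONCURRENTLY foo_col_idx_new ON foo (col);
DROP INDEX foo_col_idx;                         -- brief AccessExclusiveLock
ALTER INDEX foo_col_idx_new RENAME TO foo_col_idx;
```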
Re: [GENERAL] Index bloat of 4x
On Thursday January 18 2007 6:07 am, Bill Moran wrote: Right. It doesn't _look_ that way from the graph, but that's because I only graph total DB size. I expect if I graphed data and index size separately, it would be evident. pg_total_relation_size() might give you what you want there. At this point, I'm going to assume that my question of, Is this 4x bloat strange enough to warrant further investigation is no. It seems like this amount of bloat isn't terribly unusual, and that the people working on improving this sort of thing already have enough examples of it. I'm afraid I don't see how any of the answers I saw discussed fit a 24x7 operation. Reindex, drop index, vacuum full, ... they all block production queries of one sort or another for significant periods of time (minutes) on large (multi/tens of GB) tables, and thus are infeasible for true 24x7 operations. What it seems we really need is something to remove the bloat without blocking production DML queries, while under significant query load, with very large tables. This bloat issue is by far our biggest headache on the DB side. Ed
[GENERAL] lock query
I wonder if anyone might help me generate a SQL query that peers into pg_locks, pg_stat_activity, etc and tells in plain language exactly *who* each backend is blocked *on* while awaiting lock(s). Here's what I'm looking at now: SELECT date_trunc('second', now()) as now, a.client_addr as ip, l.pid, l.locktype, l.mode, l.granted, r.relname, l.page, l.tuple, l.transactionid, a.query_start, a.current_query as sql FROM pg_locks l LEFT OUTER JOIN pg_class r ON r.oid = l.relation LEFT OUTER JOIN pg_stat_activity a ON l.pid = a.procpid ORDER BY a.query_start ASC; For busy systems with hundreds of backends and hundreds of queries per second, I find the output of this query makes it very difficult to quickly see who is holding the key lock(s) on which blocked backends wait. What would be really helpful is a query that generated output along the lines of: Backend pid 123 is blocked awaiting pid 456 lock on 'sessions' relation. Perhaps this function already exists? If not, what is needed to get there? TIA, Ed
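[Editor's note: in the 8.1/8.2-era catalogs (no pg_blocking_pids() yet) the usual approach is a self-join of pg_locks, pairing each ungranted lock with a granted lock on the same target. A rough sketch - matching only on relation or transactionid is approximate, and `procpid` is the pre-9.2 column name:]

```sql
-- Pair each waiting lock with a granted lock on the same target
-- to name the likely blocker.
SELECT w.pid AS waiting_pid,
       h.pid AS holding_pid,
       r.relname,
       wa.current_query AS waiting_sql
FROM pg_locks w
JOIN pg_locks h
  ON h.granted
 AND w.pid <> h.pid
 AND (w.relation = h.relation
      OR w.transactionid = h.transactionid)
LEFT JOIN pg_class r ON r.oid = w.relation
LEFT JOIN pg_stat_activity wa ON wa.procpid = w.pid
WHERE NOT w.granted;
```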
Re: [GENERAL] autovac hung/blocked
On Thursday November 16 2006 3:33 am, Richard Huxton wrote: Ed L. wrote: One idea would be to partition the table some how such that the chunks getting vacuumed are much smaller and thus not such an impact. On the app side, I suppose we could break the table into multiple tables on some dimension (time) to make the vacuum impacts smaller. You're running on bigger datasets than I'm used to, but that would be my approach. Did you notice the constraint partitioning introduced in 8.1? http://www.postgresql.org/docs/8.1/static/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION Thanks for the tip. We have avoided use of inheritance in order to stay closer to the beaten path, but the partitioning implementation ideas are useful. Ed
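[Editor's note: the 8.1-style partitioning being referenced can be sketched for a time-keyed log table like the one in this thread. Table and column names here are hypothetical; the point is that each child can be vacuumed on its own instead of the whole 15 GB at once:]

```sql
-- Children inherit from a parent; CHECK constraints let the planner
-- (with constraint_exclusion = on) skip irrelevant partitions.
CREATE TABLE log (logged_at timestamptz NOT NULL, msg text);
CREATE TABLE log_2006_11 (
    CHECK (logged_at >= '2006-11-01' AND logged_at < '2006-12-01')
) INHERITS (log);
CREATE TABLE log_2006_12 (
    CHECK (logged_at >= '2006-12-01' AND logged_at < '2007-01-01')
) INHERITS (log);

-- SET constraint_exclusion = on;   -- needed for partition pruning
VACUUM log_2006_11;  -- touches one partition, not the whole table
```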
Re: [GENERAL] autovac hung/blocked
On Wednesday November 15 2006 6:30 am, Alvaro Herrera wrote: The table in question appears to be the pathological case for vacuum: very large with lots of frequent UPDATEs. It's essentially a log table. A big log table where the log entries are being updated? Certainly sounds like a recipe for vacuum headaches. I'm curious to know how others are dealing with this problem, and how pgsql might support this issue. In our case, we have a 1.8GB OS cache, a 30GB DB cache, serving around 200 transactions/second from a 110GB DB, and this problematic table is 15GB on disk. So when it is vacuumed, I suspect it essentially flushes the OS cache and half the DB cache, severely impacting performance in an already cpu-bottlenecked machine. I have attempted to adjust autovac to spread out its I/O impact, but then it takes so long to run that other smaller frequently-updated tables are not vacuumed/analyzed in the meantime and performance starts to suffer. Suppose there simply are no off-hours periods when you can vacuum a very large table with many frequent updates. (There is never a good time to flush the caches.) How do you manage such a table in a 24x7 environment? One idea would be to partition the table some how such that the chunks getting vacuumed are much smaller and thus not such an impact. On the app side, I suppose we could break the table into multiple tables on some dimension (time) to make the vacuum impacts smaller. But a pgsql solution in the future would be nice. I don't know the pgsql code well, but what if the freespace map was divided into smaller sized sectors such that individual sectors could be vacuumed without having to hit the entire table? Or even simply breaking up the 15GB vacuum job into 1GB chunks with some spacing between would help. Of course, it'd be important to keep the smaller tables vacuumed/analyzed as needed in the in-between time. I don't know what the best answer is here, just groping for ideas. TIA. Ed
Re: [GENERAL] Transaction id wraparound problem
On Wednesday November 15 2006 4:18 pm, Morris Goldstein wrote: If I'm vacuuming every day (or two), and not running anywhere near 1 billion transactions a day, why am I running into transaction id wraparound problems? Is this just complaining that template0 and template1 haven't been vacuumed in the over 2 billion transactions encountered by testdb? (I never touch template0 and template1.) If that's what's going on, I take it that I have no risk of data loss? And is there some reason to vacuum these databases, (other than to avoid the scary messages)? I don't believe template0 ever gets vacuumed since it doesn't allow connections. Template1 can take a vacuum command, and I think that might be where your message is coming from. If you vacuum it, do your messages go away? Your testdb looks like it was vacuumed about 10,000 transactions ago. Ed
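[Editor's note: a quick way to see which databases are driving the wraparound warnings is to check per-database XID age. This works on 8.1-era servers; column availability differs on some older releases:]

```sql
-- Databases whose datfrozenxid age approaches ~2 billion need a
-- database-wide VACUUM.  template1 can be vacuumed directly;
-- template0 normally does not accept connections.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
```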
[GENERAL] autovac hung/blocked
I have an 8.1.2 autovac which appears to be hanging/blocking every few days or so, but we don't understand what's causing it. I wasn't able to catch a backtrace before we killed it. I do not see autovac locks in the pg_locks view. Will running 8.1.5 buy me anything in terms of being able to see what it has locked and what locks it might be awaiting? Or maybe in terms of bugs that might explain it? TIA. Ed
Re: [GENERAL] autovac hung/blocked
On Tuesday November 14 2006 12:49 pm, Jim C. Nasby wrote: On Tue, Nov 14, 2006 at 11:20:30AM -0700, Ed L. wrote: I have an 8.1.2 autovac which appears to be hanging/blocking every few days or so, but we're don't understand what's causing it. I wasn't able to catch a backtrace before we killed it. I do not see autovac locks in the pg_locks view. Will running 8.1.5 buy me anything in terms of being able to see what it has locked and what locks it might be awaiting? Or maybe in terms of bugs that might explain it? If there's no locks, why do you think it's blocked? I had wondered if all autovac locks were in the pg_locks view, but I now see them. A gdb backtrace a few days ago had autovac waiting on a semaphore, and I assumed that was waiting on a lock. Ed
Re: [GENERAL] autovac hung/blocked
On Tuesday November 14 2006 12:56 pm, Jim C. Nasby wrote: You don't have the vacuum cost delay settings set unreasonably high, do you? I'm not sure. Here's what we're running:

#vacuum_cost_delay = 0                  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 0-10000 credits
autovacuum = on                         # enable autovacuum subprocess?
#autovacuum_naptime = 60                # time between autovacuum runs, in secs
autovacuum_naptime = 600                # changed by CW 9/11/06 to minimize
                                        # interference with application
autovacuum_vacuum_threshold = 1000      # min # of tuple updates before vacuum
autovacuum_analyze_threshold = 500      # min # of tuple updates before analyze
autovacuum_vacuum_scale_factor = 0.4    # fraction of rel size before vacuum
autovacuum_analyze_scale_factor = 0.2   # fraction of rel size before analyze
autovacuum_vacuum_cost_delay = 500      # default vacuum cost delay for
                                        # vacuum_cost_delay
autovacuum_vacuum_cost_limit = 200      # default vacuum cost limit for
                                        # vacuum_cost_limit

The table on which it seems to be struggling is updated many times per second and has 7.2M rows over 15GB total. I'm trying to figure out how often it would be napping on that...? Maybe we're just seeing a very conservative vacuum on a large table...? Ed
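[Editor's note: the 500 ms `autovacuum_vacuum_cost_delay` above is at the extreme end of the allowed range - vacuum sleeps half a second every time it accumulates `autovacuum_vacuum_cost_limit` cost points, which on a 15 GB table stretches the run enormously. The values below are purely illustrative, not a recommendation for this workload:]

```
# Illustrative sketch only -- appropriate numbers depend on I/O headroom.
# Most tuning advice keeps the cost delay in the tens of milliseconds.
autovacuum_vacuum_cost_delay = 20   # ms per cost-limit batch
autovacuum_vacuum_cost_limit = 200
autovacuum_naptime = 60             # check for work every minute
```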
[GENERAL] autovac state persistence
Does autovac maintain its state/counters across restarts as to what needs to be vacuumed/analyzed? Or does killing autovac cause it to reset the counters for the vacuum/analyze thresholds? TIA. Ed
Re: [GENERAL] autovac hung/blocked
On Tuesday November 14 2006 1:02 pm, Ed L. wrote: On Tuesday November 14 2006 12:56 pm, Jim C. Nasby wrote: You don't have the vacuum cost delay settings set unreasonably high, do you? I'm not sure. Here's what we're running: Well, I think we clearly have an HPUX CPU bottleneck (long pri queue, high cpu utilization, high user cpu %, lots of processes blocked on pri). It seems to get worst and slow all queries down across the board when autovac tries to vacuum a 15GB table. I'm guessing this is flushing the OS/DB caches, exacerbating the CPU bottleneck. I'm also not sure what to do about it beyond the customer buying some politically/financially expensive CPUs. The table in question appears to be the pathological case for vacuum: very large with lots of frequent UPDATEs. It's essentially a log table. So the other options seem to be figure out how to partition the table so as to minimize size of the data needing vacuuming, or to redesign if possible to replace UPDATEs with INSERTs. Other ideas? Ed
Re: [GENERAL] autovac state persistence
On Tuesday November 14 2006 11:51 pm, Matthew T. O'Connor wrote: Ed L. wrote: Does autovac maintain its state/counters across restarts as to which tables need to be vacuumed/analyzed? Or does killing autovac cause it to reset the counters for the vacuum/analyze thresholds? Depends on the version. The contrib autovacuum does not maintain state through a restart. The integrated version (8.0 and 8.1) can maintain state as long as you don't have the stat system configured to reset on restart.

I'm talking about 8.1.2, thanks. Ed
Re: [GENERAL] 8.1.2 locking issues
I have a few questions on pgsql locking terms and such... I created the following view to make viewing the locks a little easier:

CREATE OR REPLACE VIEW locksview AS
SELECT l.*, r.*, a.*, now() - a.query_start AS query_age,
       substring(replace(current_query, '\n', ' '), 1, 30) AS sql30,
       CASE WHEN granted THEN 'granted' ELSE 'WAITING' END AS status,
       CASE l.mode WHEN 'AccessShareLock'          THEN 1
                   WHEN 'RowShareLock'             THEN 2
                   WHEN 'RowExclusiveLock'         THEN 3
                   WHEN 'ShareUpdateExclusiveLock' THEN 4
                   WHEN 'ShareLock'                THEN 5
                   WHEN 'ShareRowExclusiveLock'    THEN 6
                   WHEN 'ExclusiveLock'            THEN 7
                   WHEN 'AccessExclusiveLock'      THEN 8
                   ELSE 100 END AS exclusivity
  FROM pg_locks l
       LEFT OUTER JOIN pg_class r ON r.oid = l.relation
       LEFT OUTER JOIN pg_stat_activity a ON l.pid = a.procpid;

I then started two transactions that try to update the same row, and then ran the following query on the view above:

SELECT now(), client_addr, pid, query_age, status, locktype, mode,
       relname, page, tuple AS tup, current_query
  FROM locksview
 WHERE now() - query_start < '3 minute'::INTERVAL
 ORDER BY query_age DESC, exclusivity DESC;

              now              | client_addr | pid  |    query_age    | status  |   locktype    |       mode       | relname | page | tup |              current_query
-------------------------------+-------------+------+-----------------+---------+---------------+------------------+---------+------+-----+--------------------------------------------
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6484 | 00:01:25.830264 | granted | transactionid | ExclusiveLock    |         |      |     | <IDLE> in transaction
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6484 | 00:01:25.830264 | granted | relation      | RowExclusiveLock | foo     |      |     | <IDLE> in transaction
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6484 | 00:01:25.830264 | granted | relation      | AccessShareLock  | foo     |      |     | <IDLE> in transaction
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6508 | 00:01:18.862623 | granted | tuple         | ExclusiveLock    | foo     |    0 |   7 | update foo set msg = 'maybe' where id = 3;
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6508 | 00:01:18.862623 | granted | transactionid | ExclusiveLock    |         |      |     | update foo set msg = 'maybe' where id = 3;
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6508 | 00:01:18.862623 | WAITING | transactionid | ShareLock        |         |      |     | update foo set msg = 'maybe' where id = 3;
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6508 | 00:01:18.862623 | granted | relation      | RowExclusiveLock | foo     |      |     | update foo set msg = 'maybe' where id = 3;
 2006-11-09 20:37:25.680662-08 | 10.0.1.82   | 6508 | 00:01:18.862623 | granted | relation      | AccessShareLock  | foo     |      |     | update foo set msg = 'maybe' where id = 3;
(8 rows)

In this case, pid 6484 updated the row first but did not commit, then 6508 tried to update the same row and naturally blocked. I understand at a very basic level why this would block, so that's not the spirit of my questions. I also understand each transaction holds its acquired locks until the end of its transaction. But I'd like to better understand the pg_locks view and pgsql locking terms.

First, I don't see rhyme or reason in the transactionid locks. Can someone explain why 6508 holds a transactionid ExclusiveLock, but is now waiting on a transactionid ShareLock? That seems unintuitive. It would seem that if you hold a more exclusive lock, getting a less exclusive lock would not be a problem. Is there rhyme or reason for this? From the docs, I'd guess the ExclusiveLock is given to block select for updates that might want to grab that row after 6508 already got in line, and that it is grabbing the ShareLock for... what?

Second, 6508, the *blocked* process, also holds a tuple ExclusiveLock on the page and tuple of foo on which it is actually blocked. Again, is there rhyme or reason I'm missing?

Third, what is it that causes values to show up in the page and tuple columns?

Finally, both processes, the blocked and the blocking, hold relation RowExclusiveLocks, which again hardly seems intuitive. It would seem that the blocked process would not have that lock until it actually had access to update the row in question. The docs do not say RowExclusiveLock conflicts with RowExclusiveLock, but I guess I'm expecting to see that conflict since one process is blocking on the other for updating the same row. TIA. Ed
[GENERAL] killing autovac
Can I kill -SIGINT autovac in 8.1.2 without taking down all the other backends? Thanks, Ed
[GENERAL] 8.1.2 postmaster died
One of our 8.1.2 postmasters on HPUX 11.23 ia64 just received a SIGKILL signal from unknown origins. After reviewing all command history files for the DBA and root, I do not believe anyone manually sent it, and we have no scripts etc. that would do that, at least none that we can find or imagine. The machine had ample RAM available. Any ideas where this SIGKILL could have come from? Thanks, Ed
Re: [GENERAL] Table and Field namestyle best practices?
On Wednesday November 8 2006 11:31 am, novnov wrote: Yes, I've already pretty much decided to use lowercase for all namestyles, I mentioned that in the first post. Using lowercase invokes a set of other issues, which I'm asking for options on...namely, conventions like org_id, and emp_org_id, or simply using org_id as the fk pointer.

Not sure there is a best way, but I prefer org_id due to brevity and no concern for quoting. Your emp_org_id is nice for views because you don't have to disambiguate columns.
[GENERAL] 8.1.2 locking issues
We are having locking issues in 8.1.2 on HP 11.23 ia64. I'm trying to better understand how to conclusively identify who is waiting on whom and why. We have a series of select for updates on our 'sessions' table. One of those queries is stuck waiting for a transactionid locktype ShareLock. How can I tell what it's actually waiting for? TIA. Ed
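One way to pair a waiter with its blocker in 8.1 (a sketch, not from the thread; it relies on the fact that a blocked row updater waits for a ShareLock on the blocking transaction's XID, and on pg_locks calling that column "transaction" in this release, renamed "transactionid" in 8.3):

```sql
-- For each ungranted transactionid lock, find the backend that holds
-- the matching XID: that is the transaction being waited on.
SELECT w.pid         AS waiting_pid,
       h.pid         AS blocking_pid,
       w.transaction AS waited_xid
  FROM pg_locks w
  JOIN pg_locks h ON h.transaction = w.transaction
                 AND h.granted
 WHERE w.locktype = 'transactionid'
   AND NOT w.granted;
```

Joining the resulting pids against pg_stat_activity (procpid) then shows what the blocking transaction is actually running.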
[GENERAL] DROP INDEX performance/locking
We're taking some politically expensive downtime to drop an index on a very large, heavily used table, because dropping it live blocks too many users for too long: tens of seconds or more on a system doing 200 transactions/second. That's because nearly every user interaction touches this particular table. It seems like one ought to be able to drop an index in a much shorter amount of time and/or not block queries while it's dropped, but I don't know the code well enough to say how hard this would be. Would a faster or non-blocking drop be worthy of a todo item? Ed
[GENERAL] failing 8.1.4 autovacuum
I have an 8.1.4 autovac process running on HP-UX 11.23 IA64 and repeatedly failing with the following error:

ERROR:  failed to re-find parent key in "audit_idx1"

Will a reindex or drop index make this problem go away? Is there anything I can do to help identify the underlying issue? (This is an intensely used production system, so I have very limited room to operate.) TIA. Ed
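If a rebuild does turn out to be the workaround, a sketch of the options (the index name is from the error message; the table name and column list are unknown here and marked as such; the locking caveats are general REINDEX/CREATE INDEX behavior and worth re-checking against the 8.1 docs before running on a production system):

```sql
-- Rebuild the suspect index in place. REINDEX locks out writes to the
-- parent table and blocks queries that would use this index, so it
-- needs a quiet window on a busy system.
REINDEX INDEX audit_idx1;

-- Alternatively, drop and recreate it. Plain CREATE INDEX also blocks
-- writes for the duration. The table name and column list below are
-- placeholders; pull the real definition from pg_indexes or \d first.
-- DROP INDEX audit_idx1;
-- CREATE INDEX audit_idx1 ON audit (...);
```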
Re: [GENERAL] failing 8.1.4 autovacuum
On Monday October 23 2006 4:03 pm, Ed L. wrote: I have an 8.1.4 autovac process running on HP-UX 11.23 IA64 and repeatedly failing with the following error: ERROR: failed to re-find parent key in audit_idx1 Will a reindex or drop index make this problem go away?

BTW, it'd be nice if autovac didn't simply abort and restart with the prior tables that were successfully vacuumed, but rather was able to continue on past the error to the next table. As it is, the failing table starves the others while autovac just keeps banging its head against the wall, consuming lots of I/O and accomplishing little. Maybe that's part of the 8.2 fixes I haven't kept up with? Ed
Re: [GENERAL] troubleshooting 8.1.2
On Tuesday July 11 2006 3:16 pm, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: We are wondering if our swap space was too small, and when the swap reservation failed, the OS was sending SIGINT?? You'd have to check your OS documentation ... I thought HPUX would just return ENOMEM to brk() for such cases. It doesn't do memory overcommit does it?

ENOMEM is correct for our brk(), too. We're running with pseudoswap, but I guess our swapspace was too small, and that appears to be what we ran into. The SIGINT is still a mystery. Our truss output for one of these SIGINTs is at the bottom of this message, for what it's worth.

BTW, here's a conversation of possible interest that conflicts with advice I've heard here of keeping shared_buffers small and letting the OS do all the caching: http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1042336 Their argument appears to be that there are HPUX kernel inefficiencies for OS caches larger than 1.5gb. You once argued that it would be unreasonable to expect user-space shared memory to be any more efficient than the kernel cache. I don't know one way or the other, and solid benchmarking that simulates our loads appears troublesome. I guess I could write a little C program to measure shared memory random access times as the size of the cache grows...

Anyway, here's the truss output:

( Attached to process 20787 ("postmaster -D /users/...") [64-bit] )
select(7, 0x9fffe670, NULL, NULL, 0x9fffe640) [sleeping]
Received signal 2, SIGINT, in select(), [caught], no siginfo
sigprocmask(SIG_SETMASK, 0x600708c0, NULL) = 0
gettimeofday(0x9fff9460, NULL) = 0
stat("/usr/lib/tztab", 0x9fff9300) = 0
open("/usr/lib/tztab", O_RDONLY|0x800, 01210) = 9
mmap(NULL, 13197, PROT_READ, MAP_PRIVATE, 9, 0) = 0x9fffbb14c000
close(9) = 0
write(2, "2 0 0 6 - 0 7 - 1 1   1 3 : 5 5"..., 76) = 76
kill(20793, SIGUSR2) = 0
kill(20794, SIGQUIT) = 0
...

Ed
[GENERAL] Log actual params for prepared queries: TO-DO item?
We'd like to attempt some log replay to simulate real loads, but in 8.1.2, it appears the formal parameters ($1, $2, ...) are logged instead of the actuals for prepared queries, e.g.:

EXECUTE unnamed [PREPARE: UPDATE sessions SET a_session = $1 WHERE id = $2]

Thoughts on making this a to-do item? Ed
[GENERAL] troubleshooting 8.1.2
We have 4 8.1.2 clusters running on an HP-UX 11.23 Itanium, repeatedly dying with the following log message:

2006-07-11 12:52:27 EDT [21582] LOG:  received fast shutdown request
2006-07-11 12:52:27 EDT [21591] LOG:  shutting down
2006-07-11 12:52:27 EDT [21591] LOG:  database system is shut down
2006-07-11 12:52:27 EDT [21584] LOG:  logger shutting down

We can't figure out why it is shutting down. Nobody here is sending the signal. We don't have any cron jobs doing that sort of thing. We've also seen out of memory errors when this first started happening, though glance had not shown GBL_MEM_UTIL above 90% (with OS buffer cache max/min percents at 10%/3%). The box has 64gb of RAM, so that would seem to mean there was ~6GB of RAM available when it got the out of memory errors. Just in case, we shut down several clusters and restarted them, and now even with plentiful memory, they're dying with the same message. Any ideas? Ed
Re: [GENERAL] troubleshooting 8.1.2
On Tuesday July 11 2006 1:17 pm, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: We have 4 8.1.2 clusters running on an HP-UX 11.23 Itanium, repeatedly dying with the following log message: 2006-07-11 12:52:27 EDT [21582] LOG: received fast shutdown request *Something* is sending SIGINT to the postmaster --- it's simply not possible to reach that elog call any other way. How are you launching the postmaster? If from a terminal window, are you sure it's entirely disconnected from the terminal's process group? If not, typing control-C in that window could SIGINT the postmaster.

We use a shell function to start the postmaster:

dbstart() {
    pg_ctl start -D "$PGDATA" -m smart -o "-i -p $PGPORT" -p postmaster
}

We are wondering if our swap space was too small, and when the swap reservation failed, the OS was sending SIGINT?? Ed
Re: [GENERAL] Why won't it index scan?
On Tuesday May 23 2006 4:55 pm, Jim C. Nasby wrote: Well, I did find one reason not to go ape with this: the number of pages analyzed scales with the number of buckets, so doubling the statistics target will roughly double the ANALYZE time for any table over 6000 pages (though the effect isn't linear, see below). There is a small increase in time for a small table, but I doubt it's enough for anyone to care:

Are you accounting for your well-primed OS and DB caches? I'd think a more realistic test would clear those of the target table between measurements. Ed
Re: [GENERAL] pg_dump index/constraint creation order
On Monday May 15 2006 11:14 am, Vivek Khera wrote: On May 14, 2006, at 12:27 AM, Ed L. wrote: While watching a 9-hour 60GB network load from 7.4.6 pg_dump into 8.1.2, I noticed the order in which indices and constraints are created appears to be their creation order. If you use the 8.1.2 pg_dump to make the dump from your 7.4.6 DB, what is the order like? I suspect it would be more to what you're expecting. In general, you should use the pg_dump corresponding to the version into which you are loading for best results.

8.1.2 has same problem (try attached script to see). Ed

[Attachment: test.sh]
[GENERAL] Why won't it index scan?
Can someone help me understand why the 8.1.2 query below is using a seq scan instead of an index scan? All relevant columns appear to be indexed and all tables vacuum analyzed.

$ psql -c "analyze verbose patient"
INFO:  analyzing "public.patient"
INFO:  "patient": scanned 3000 of 3353 pages, containing 117685 live rows and 5471 dead rows; 3000 rows in sample, 131533 estimated total rows
ANALYZE

$ psql -c "select count(1) from patient"
 count
--------
 131661
(1 row)

$ psql -c "analyze verbose visit"
INFO:  analyzing "public.visit"
INFO:  "visit": scanned 3000 of 19985 pages, containing 58520 live rows and 7045 dead rows; 3000 rows in sample, 389841 estimated total rows
ANALYZE

$ psql -c "select count(1) from visit"
 count
--------
 389102
(1 row)

$ psql -c "explain analyze select * from visit inner join patient on patient.key = visit.patient_key where nursestation_key = '40'"
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=27724.37..28457.01 rows=4956 width=421) (actual time=1819.993..2004.802 rows=6 loops=1)
   Merge Cond: ("outer".patient_key = "inner".key)
   ->  Sort  (cost=11859.31..11871.70 rows=4956 width=209) (actual time=0.416..0.426 rows=6 loops=1)
         Sort Key: visit.patient_key
         ->  Bitmap Heap Scan on visit  (cost=69.35..11555.14 rows=4956 width=209) (actual time=0.187..0.245 rows=6 loops=1)
               Recheck Cond: (nursestation_key = 40)
               ->  Bitmap Index Scan on idx_visit_nursestation_key  (cost=0.00..69.35 rows=4956 width=0) (actual time=0.158..0.158 rows=6 loops=1)
                     Index Cond: (nursestation_key = 40)
   ->  Sort  (cost=15865.05..16194.21 rows=131661 width=212) (actual time=1768.501..1856.334 rows=61954 loops=1)
         Sort Key: patient.key
         ->  Seq Scan on patient  (cost=0.00..4669.61 rows=131661 width=212) (actual time=0.010..355.299 rows=131661 loops=1)
 Total runtime: 2046.323 ms
(12 rows)

          Table "public.patient"
 Column |  Type   | Modifiers
--------+---------+--------------------------------------------------------
 key    | integer | not null default nextval('patient_key_seq'::regclass)
 ...
Indexes:
    "pk_patient" PRIMARY KEY, btree (key)
    ...

           Table "public.visit"
      Column      |  Type   | Modifiers
------------------+---------+-----------
 patient_key      | integer | not null
 nursestation_key | integer |
 ...
Indexes:
    "idx_visit_nursestation_key" btree (nursestation_key)
    "idx_visit_patient_key" btree (patient_key)
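A standard next diagnostic step here (not from the thread) is to check whether the planner simply prices the seq scan cheaper, rather than being unable to use an index at all: disable seq scans for the session and compare the two plans.

```sql
-- If an index plan is possible at all, this forces the planner to use it;
-- comparing its cost/actual time against the seqscan plan shows whether
-- this is a costing problem (e.g. random_page_cost) or something else.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT *
  FROM visit
 INNER JOIN patient ON patient.key = visit.patient_key
 WHERE nursestation_key = '40';
RESET enable_seqscan;
```

If the forced index plan turns out faster on this hardware, lowering random_page_cost for a well-cached database may get the planner there on its own.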