date:20051031

Re: [HACKERS] [PATCHES] TODO Item - Add system view to show free

2005-10-31 Thread Simon Riggs

On Fri, 2005-10-28 at 12:50 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  There are a few issues with current FSM implementation, IMHO, discussing
  as usual the very highest end of performance:
 
 Do you have any evidence that the FSM is actually a source of
 performance issues, or is this all hypothetical?

This was a side-bar issue for my current focus, as I already said, so
I'll skip what sounds like a lengthy debate on this for now.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

[HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Adam Witney


Just the one fail on OSX 10.3.9

 opr_sanity   ... FAILED

Is this a known problem, or something specific to my machine... I can post
regression.diffs (quite long) if required ...

Thanks

Adam


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: [HACKERS] LDAP Authentication?

2005-10-31 Thread Euler Taveira de Oliveira

--- Magnus Hagander wrote:

  It should be fairly easy to write a LDAP backend to password
  authentication using openldap, winldap or whatever ldap library is
  available.
  
I support the idea. It would be a good gain for PostgreSQL
authentication. 
If you want to discuss ideas, drop me a line.




Euler Taveira de Oliveira
euler[at]yahoo_com_br









Re: [HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Bruce Momjian

Adam Witney wrote:
 
 Just the one fail on OSX 10.3.9
 
  opr_sanity   ... FAILED
 
 Is this a known problem, or something specific to my machine... I can post
 regression.diffs (quite long) if required ...

Uh, regression.diffs is large?  My guess is your backend crashed, for
some unknown reason, so all the queries after the crash just failed.  I
can't think of another reason for that diff file to be large.  Is the
failure reproducible?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: [HACKERS] LDAP Authentication?

2005-10-31 Thread Bruno Almeida do Lago

I can help on this one too.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Euler Taveira de
Oliveira
Sent: Monday, October 31, 2005 9:44 AM
To: Satoshi Nagayasu; Magnus Hagander
Cc: PostgreSQL-development
Subject: Re: [HACKERS] LDAP Authentication?

--- Magnus Hagander wrote:

  It should be fairly easy to write a LDAP backend to password
  authentication using openldap, winldap or whatever ldap library is
  available.

I support the idea. It would be a good gain for PostgreSQL
authentication. 
If you want to discuss ideas, drop me a line.

Euler Taveira de Oliveira
euler[at]yahoo_com_br


Re: [HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Adam Witney

On 31/10/05 1:32 pm, Bruce Momjian pgman@candle.pha.pa.us wrote:

 Adam Witney wrote:
 
 Just the one fail on OSX 10.3.9
 
  opr_sanity   ... FAILED
 
 Is this a known problem, or something specific to my machine... I can post
 regression.diffs (quite long) if required ...
 
 Uh, regression.diffs is large?  My guess is your backend crashed, for
 some unknown reason, so all the queries after the crash just failed.  I
 can't think of another reason for that diff file to be large.  Is the
 failure reproducible?

Seems a bit random actually... Here are the results of 3 successive make
check's, the fourth passed all tests!

http://bugs.sgul.ac.uk/downloads/temp/regression1.diffs
http://bugs.sgul.ac.uk/downloads/temp/regression2.diffs
http://bugs.sgul.ac.uk/downloads/temp/regression3.diffs



Re: [HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Bruce Momjian

Adam Witney wrote:
 On 31/10/05 1:32 pm, Bruce Momjian pgman@candle.pha.pa.us wrote:
 
  Adam Witney wrote:
  
  Just the one fail on OSX 10.3.9
  
   opr_sanity   ... FAILED
  
  Is this a known problem, or something specific to my machine... I can post
  regression.diffs (quite long) if required ...
  
  Uh, regression.diffs is large?  My guess is your backend crashed, for
  some unknown reason, so all the queries after the crash just failed.  I
  can't think of another reason for that diff file to be large.  Is the
  failure reproducible?
 
 Seems a bit random actually... Here are the results of 3 successive make
 check's, the fourth passed all tests!
 
 http://bugs.sgul.ac.uk/downloads/temp/regression1.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression2.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression3.diffs

Yea, that helps.  The errors you have are really these:

! psql: could not fork new process for connection: Resource temporarily unavailable

and
! psql: could not send startup packet: Broken pipe

Is anything else big running on your machine?

I looked at the OSX configuration section here:

http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html

but didn't see anything significant.  My guess is that the parallel
nature of the regression tests is exhausting some system resource on
your machine.  Does the kernel log have anything of interest?
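
A minimal sketch of how the distinct psql errors in a large regression.diffs
could be tallied, so that a cascade of identical connection failures shows up
as a line or two. The function name, and the assumption that failures appear
on diff lines starting with "!" or "+", are illustrative and not part of the
regression driver.

```python
from collections import Counter


def summarize_psql_errors(diff_text):
    """Count distinct 'psql:' error messages found on changed diff lines."""
    counts = Counter()
    for line in diff_text.splitlines():
        # Drop the diff change marker ("!" or "+") and surrounding spaces.
        stripped = line.lstrip("!+ ").strip()
        if stripped.startswith("psql:"):
            counts[stripped] += 1
    return counts
```

If nearly every entry is a connection failure, the individual test diffs are
probably side effects rather than real regressions.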

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: [HACKERS] 8.1 Release Candidate 1 Bundled ...

2005-10-31 Thread Lamar Owen

 The developer.postgresql.org machine really isn't geared to handle
 downloads.. Any reason you can't just stick it on the standard ftp sites
 and have it mirrored along with everything else?

 This is taken from our spec:
 # Pre-release RPM's should not be put up on the public ftp.postgresql.org
 server
 # -- only test releases or full releases should be.

 So thinking that:

 * Beta and RC RPMs are used only by testers
 * We use the beta and RC steps to build the new RPM sets, so that means
 that actually they are not production quality looking from the RPM
 perspective.

By way of clarification, as I am the one who wrote that portion of the
spec file, an 'RPM prerelease' and a 'beta' weren't intended to be the
same thing; the line in the spec referenced was for my own use to remind
me that my own internal testing packages (with a release number 0.x)
weren't intended for public consumption.  Devrim, you can remove that
section of the spec file at any time at this point, because you are using
CVS for the purpose that I was using 'prerelease' RPMs.

Historically, beta and release candidate RPMs were put on the main ftp
site but flagged as beta quality.  I certainly appreciate your diligence
in following those instructions I wrote long ago, but, thanks to your
smoother release process (in no small part due to the use of CVS) those
instructions are obsolete.  Many thanks for being that diligent!
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu


Re: [HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Adam Witney

On 31/10/05 2:13 pm, Bruce Momjian pgman@candle.pha.pa.us wrote:

 Adam Witney wrote:
 On 31/10/05 1:32 pm, Bruce Momjian pgman@candle.pha.pa.us wrote:
 
 Adam Witney wrote:
 
 Just the one fail on OSX 10.3.9
 
  opr_sanity   ... FAILED
 
 Is this a known problem, or something specific to my machine... I can post
 regression.diffs (quite long) if required ...
 
 Uh, regression.diffs is large?  My guess is your backend crashed, for
 some unknown reason, so all the queries after the crash just failed.  I
 can't think of another reason for that diff file to be large.  Is the
 failure reproducible?
 
 Seems a bit random actually... Here are the results of 3 successive make
 check's, the fourth passed all tests!
 
 http://bugs.sgul.ac.uk/downloads/temp/regression1.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression2.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression3.diffs
 
 Yea, that helps.  The errors you have are really these:
 
 ! psql: could not fork new process for connection: Resource temporarily unavailable
 
 and
 ! psql: could not send startup packet: Broken pipe
 
 Is anything else big running on your machine?
 
 I looked at the OSX configuration section here:
 
 http://candle.pha.pa.us/main/writings/pgsql/sgml/kernel-resources.html
 
 but didn't see anything significant.  My guess is that the parallel
 nature of the regression tests is exhausting some system resource on
 your machine.  Does the kernel log have anything of interest?

Ah that probably explains it... It is my laptop and I have quite a few
things running... So should probably run the make check when I first start
it up maybe.

Thanks for the help

Adam



Re: [HACKERS] 8.1RC1 fails opr_sanity on osx

2005-10-31 Thread Tom Lane

Adam Witney [EMAIL PROTECTED] writes:
 http://bugs.sgul.ac.uk/downloads/temp/regression1.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression2.diffs
 http://bugs.sgul.ac.uk/downloads/temp/regression3.diffs

If you'd looked, you would have noticed that they're all variations on
psql: could not fork new process for connection: Resource temporarily unavailable

In other words, you've got a system resource limit problem.  See
http://developer.postgresql.org/docs/postgres/kernel-resources.html#AEN17862
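
As a rough illustration, the per-user process limit can be inspected with
Python's standard resource module. That the process limit is the resource
being exhausted here is an assumption; the thread only establishes that the
fork failed. The helper name is hypothetical.

```python
import resource


def describe_process_limit():
    """Return the soft and hard per-user process limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def fmt(value):
        # RLIM_INFINITY means no limit is enforced at this level.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return fmt(soft), fmt(hard)
```

A low soft limit combined with many other running programs would be
consistent with the intermittent failures reported above.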

regards, tom lane


[HACKERS] 8.1RC1 on Tru64

2005-10-31 Thread Honda Shigehiro

Hello, 

I tried RC1 on a Tru64 box (with Compaq C V6.1-011) and it succeeded:
bash-2.05b$ make MAX_CONNECTIONS=2 check
 ...
== shutting down postmaster   ==
postmaster stopped

==
 All 98 tests passed.
==

Environments: 
- $ uname -a
  OSF1 kiss.my.domain V5.0 910 alpha
- Compaq C V6.1-011 on Digital UNIX V5.0 (Rev. 910)
- GNU Make version 3.79.1, by Richard Stallman and Roland McGrath.
- result of pg_config
$ src/bin/pg_config/pg_config
BINDIR = /home/postgres/postgresql-8.1RC1/src/bin/pg_config
DOCDIR = /home/postgres/postgresql-8.1RC1/src/bin/doc
INCLUDEDIR = /home/postgres/postgresql-8.1RC1/src/bin/include
PKGINCLUDEDIR = /home/postgres/postgresql-8.1RC1/src/bin/include
INCLUDEDIR-SERVER = /home/postgres/postgresql-8.1RC1/src/bin/include/server
LIBDIR = /home/postgres/postgresql-8.1RC1/src/bin/lib
PKGLIBDIR = /home/postgres/postgresql-8.1RC1/src/bin/lib
LOCALEDIR =
MANDIR = /home/postgres/postgresql-8.1RC1/src/bin/man
SHAREDIR = /home/postgres/postgresql-8.1RC1/src/bin/share
SYSCONFDIR = /home/postgres/postgresql-8.1RC1/src/bin/etc
PGXS = /home/postgres/postgresql-8.1RC1/src/bin/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-includes=/usr/local/include'
CC = cc -std
CPPFLAGS = -I/usr/local/include
CFLAGS = -O -ieee
CFLAGS_SL =
LDFLAGS = -rpath /usr/local/pgsql/lib
LDFLAGS_SL =
LIBS = -lpgport -lz -lreadline -lresolv -lPW -lm -lbsd
VERSION = PostgreSQL 8.1RC1

regards,
-- 
  Shigehiro Honda


[HACKERS] platform test

2005-10-31 Thread Mike Rylander

4x AMD Opteron (tm) Processor 852

   -

[EMAIL PROTECTED] /tmp/pgtestbuild/postgresql-8.1RC1 $ uname -a
Linux localhost 2.6.12-gentoo-r10 #1 SMP Fri Sep 9 09:43:22 EDT 2005 x86_64 AMD Opteron (tm) Processor 852 AuthenticAMD GNU/Linux



[EMAIL PROTECTED] /tmp/pgtestbuild/postgresql-8.1RC1 $ file src/backend/postgres
src/backend/postgres: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), not stripped

   --


[EMAIL PROTECTED] /tmp/pgtestbuild/postgresql-8.1RC1 $ gcc --version
gcc (GCC) 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)

   --

[EMAIL PROTECTED] /tmp/pgtestbuild/postgresql-8.1RC1 $ src/bin/pg_config/pg_config
BINDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/pg_config
DOCDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/doc/postgresql
INCLUDEDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/include
PKGINCLUDEDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/include/postgresql
INCLUDEDIR-SERVER = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/include/postgresql/server
LIBDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/lib
PKGLIBDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/lib/postgresql
LOCALEDIR =
MANDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/man
SHAREDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/share/postgresql
SYSCONFDIR = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/etc/postgresql
PGXS = /tmp/pgtestbuild/postgresql-8.1RC1/src/bin/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-perl' '--with-openssl' '--enable-integer-datetimes' '--prefix=/tmp/pgtest/'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,/tmp/pgtest//lib
LDFLAGS_SL =
LIBS = -lpgport -lssl -lcrypto -lz -lreadline -lcrypt -lresolv -lnsl -ldl -lm -lbsd
VERSION = PostgreSQL 8.1RC1

  --

==
 All 98 tests passed.
==

--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org


Re: [HACKERS] 8.1RC1 on Tru64

2005-10-31 Thread Andrew Dunstan




Honda Shigehiro wrote:

Hello, 


I tried RC1 on a Tru64 box (with Compaq C V6.1-011) and it succeeded:
bash-2.05b$ make MAX_CONNECTIONS=2 check

 



That seems to be a very low setting for MAX_CONNECTIONS. Any particular
reason for that?


(side note - we'd very much welcome a Tru64 buildfarm member - if you're 
interested please email me off list).


cheers

andrew


Re: [HACKERS] 8.1RC1 on Tru64

2005-10-31 Thread Honda Shigehiro

From: Andrew Dunstan [EMAIL PROTECTED]
Subject: Re: [HACKERS] 8.1RC1 on Tru64
Date: Mon, 31 Oct 2005 11:12:09 -0500

 I tried RC1 on a Tru64 box (with Compaq C V6.1-011) and it succeeded:
 bash-2.05b$ make MAX_CONNECTIONS=2 check

 That seems to be a very low setting for MAX_CONNECTIONS. Any particular
 reason for that?
This is because my box has too little memory (64MB) to run without it. With
the default parameter, my box reported "Unable to obtain requested swap
space...".
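
The effect of MAX_CONNECTIONS can be pictured as splitting each parallel test
group into smaller batches. This is a sketch of that idea only, under the
assumption that a group is run batch by batch; it is not the regression
driver's actual code, and the function name is made up.

```python
def batch_parallel_group(tests, max_connections):
    """Split one parallel test group into batches of at most max_connections."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    return [
        tests[start:start + max_connections]
        for start in range(0, len(tests), max_connections)
    ]
```

With a limit of 2, a group of five tests would run as three consecutive
batches, which keeps the number of simultaneous backends small on a machine
with little memory.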

 (side note - we'd very much welcome a Tru64 buildfarm member - if you're 
 interested please email me off list).
... I have been trying to join the buildfarm since last week, but I cannot
compile CVS right now...

regards,
-- 
 Shigehiro Honda


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags & 0x01),)

2005-10-31 Thread Jim C. Nasby

On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
 I'd like Jim to test this theory by seeing if it helps to reverse the
 order of the if-test elements at lines 294/295, ie make it look like
 
 if (shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
 shared->page_number[slotno] != pageno)
 
 This won't do as a permanent patch, because it isn't guaranteed to fix
 the problem on machines that don't strongly order writes, but it should
 work on Opterons, at least well enough to confirm the diagnosis.

Given your proposed fix on -patches, do you still need me to test this?
Also, is there any heap corruption risk associated with this patch?

I'm also wondering what the effect of this is when assertions are turned
off. My client had to go back to running with assertions turned off
because of the performance impact. Are they now risking data corruption?
Is there a way to turn on the assertion just in this code segment?

This incident has made me wonder if it's worth creating two classes of
assertions. The (hopefully more common) set of assertions would be for
things that shouldn't happen but, if they go uncaught, won't result in heap
corruption. A new set (well, existing asserts, but just re-classified)
would be for things that if uncaught could result in heap corruption. My
hope is that the set of critical assertions could be turned on by
default, helping to identify race conditions and other bugs that
conventional testing is unlikely to find.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags & 0x01),)

2005-10-31 Thread Jim C. Nasby

Sorry, two more things...

Will increasing shared_buffers make this less likely to occur? Or is
this just something that's likely to happen when there are things like
seqscans that are putting buffers near the front of the LRU? (The 8.0.3
buffer manager does something like that, right?)

Is this something that a test case can be created for? I know someone
submitted a framework for doing concurrent testing...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: [HACKERS] FKs on temp tables: hard, or just omitted?

2005-10-31 Thread Jim C. Nasby

On Sun, Oct 30, 2005 at 05:31:07PM -0800, Josh Berkus wrote:
 Folks,
 
 Thanks, all!  Now, if only I could remember who asked me the question ...

ISTM we should add a note about this to the docs...

Here's a patch for create_table.sgml, though there's probably some other
places this could go...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

Index: doc/src/sgml/ref/create_table.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/create_table.sgml,v
retrieving revision 1.94
diff -u -r1.94 create_table.sgml
--- doc/src/sgml/ref/create_table.sgml  13 Aug 2005 02:48:18 -  1.94
+++ doc/src/sgml/ref/create_table.sgml  31 Oct 2005 17:54:10 -
@@ -421,7 +421,10 @@
   primary key of the <replaceable
   class="parameter">reftable</replaceable> is used.  The
   referenced columns must be the columns of a unique or primary
-  key constraint in the referenced table.
+  key constraint in the referenced table.  Note that foreign key
+  constraints may not be defined between temporary tables and permanent
+  tables. This is because doing so would eliminate most of the performance
+  gains of using a temporary table.
  </para>
 
  <para>


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags

2005-10-31 Thread Bruce Momjian

Jim C. Nasby wrote:
 On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
  I'd like Jim to test this theory by seeing if it helps to reverse the
  order of the if-test elements at lines 294/295, ie make it look like
  
  if (shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
  shared->page_number[slotno] != pageno)
  
  This won't do as a permanent patch, because it isn't guaranteed to fix
  the problem on machines that don't strongly order writes, but it should
  work on Opterons, at least well enough to confirm the diagnosis.
 
 Given your proposed fix on -patches, do you still need me to test this?
 Also, is there any heap corruption risk associated with this patch?

Because it is a test, I am not sure there is any way to know what the
possible impact of a bug is.  If we knew there was a bug in the patch,
it would have been fixed already.

 I'm also wondering what the effect of this is when assertions are turned
 off. My client had to go back to running with assertions turned off
 because of the performance impact. Are they now risking data corruption?
 Is there a way to turn on the assertion just in this code segment?
 
 This incident has made me wonder if it's worth creating two classes of
 assertions. The (hopefully more common) set of assertions would be for
 things that shouldn't happen but, if they go uncaught, won't result in heap
 corruption. A new set (well, existing asserts, but just re-classified)
 would be for things that if uncaught could result in heap corruption. My
 hope is that the set of critical assertions could be turned on by
 default, helping to identify race conditions and other bugs that
 conventional testing is unlikely to find.

That is probably overkill.  Running with test patches isn't something we
expect folks to do often.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags & 0x01),)

2005-10-31 Thread Tom Lane

Jim C. Nasby [EMAIL PROTECTED] writes:
 On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
 This won't do as a permanent patch, because it isn't guaranteed to fix
 the problem on machines that don't strongly order writes, but it should
 work on Opterons, at least well enough to confirm the diagnosis.

 Given your proposed fix on -patches, do you still need me to test this?

Yes; we still need to verify that my theory actually explains your
problem.  Given that I'm positing that you can repeatedly hit a
two-instruction window, this is by no means a sure thing.  We need
it tested (and with asserts on, so that we can tell if it's fixed
the problem or not).
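
The reordering under test relies on || evaluating its left operand first,
which fixes the order in which the two shared fields are read. A small sketch
of just that evaluation-order point follows, using Python's "or", which
short-circuits the same way. The class and function names are hypothetical,
and the sketch does not model concurrent writers or memory ordering.

```python
READ_IN_PROGRESS = "read_in_progress"  # stand-in for SLRU_PAGE_READ_IN_PROGRESS


class LoggingSlot:
    """Hypothetical slot that records the order in which its fields are read."""

    def __init__(self, page_number, page_status):
        self._fields = {"page_number": page_number, "page_status": page_status}
        self.reads = []

    def read(self, name):
        self.reads.append(name)
        return self._fields[name]


def changed_number_first(slot, pageno):
    # Operand order assumed for the existing test: page number, then status.
    return (slot.read("page_number") != pageno
            or slot.read("page_status") != READ_IN_PROGRESS)


def changed_status_first(slot, pageno):
    # Operand order proposed in the thread: status, then page number.
    return (slot.read("page_status") != READ_IN_PROGRESS
            or slot.read("page_number") != pageno)
```

Both functions return the same answer for any fixed slot contents; only the
read order differs, which matters when another process updates the two fields
between the reads.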

 Also, is there any heap corruption risk associated with this patch?

Look, Jim, I'm trying to help you fix this.  Are you going to help or not?
If you want some kind of written guarantee, you're not going to get one.

regards, tom lane


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags & 0x01),)

2005-10-31 Thread Jim C. Nasby

On Mon, Oct 31, 2005 at 01:05:06PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
  This won't do as a permanent patch, because it isn't guaranteed to fix
  the problem on machines that don't strongly order writes, but it should
  work on Opterons, at least well enough to confirm the diagnosis.
 
  Given your proposed fix on -patches, do you still need me to test this?
 
 Yes; we still need to verify that my theory actually explains your
 problem.  Given that I'm positing that you can repeatedly hit a
 two-instruction window, this is by no means a sure thing.  We need
 it tested (and with asserts on, so that we can tell if it's fixed
 the problem or not).

Ok, I'll work on getting this tested. Just to clarify, if this fixes it
then the problem wouldn't occur, or would we just see a different
assert?

  Also, is there any heap corruption risk associated with this patch?
 
 Look, Jim, I'm trying to help you fix this.  Are you going to help or not?
 If you want some kind of written guarantee, you're not going to get one.

Of course not, and I'm not looking for one. On the other hand, I don't
want to recommend something on a production system without understanding
what kind of risks are involved, and unfortunately much of this is still
over my head. I would really like to have a better idea of what the
impact of this bug is.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags

2005-10-31 Thread Bruce Momjian

Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
  This won't do as a permanent patch, because it isn't guaranteed to fix
  the problem on machines that don't strongly order writes, but it should
  work on Opterons, at least well enough to confirm the diagnosis.
 
  Given your proposed fix on -patches, do you still need me to test this?
 
 Yes; we still need to verify that my theory actually explains your
 problem.  Given that I'm positing that you can repeatedly hit a
 two-instruction window, this is by no means a sure thing.  We need
 it tested (and with asserts on, so that we can tell if it's fixed
 the problem or not).
 
  Also, is there any heap corruption risk associated with this patch?
 
 Look, Jim, I'm trying to help you fix this.  Are you going to help or not?
 If you want some kind of written guarantee, you're not going to get one.

I think we can say Jim gets his money back if he finds a bug.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags

2005-10-31 Thread Jim C. Nasby

On Mon, Oct 31, 2005 at 01:01:14PM -0500, Bruce Momjian wrote:
  This incident has made me wonder if it's worth creating two classes of
  assertions. The (hopefully more common) set of assertions would be for
  things that shouldn't happen but, if they go uncaught, won't result in heap
  corruption. A new set (well, existing asserts, but just re-classified)
  would be for things that if uncaught could result in heap corruption. My
  hope is that the set of critical assertions could be turned on by
  default, helping to identify race conditions and other bugs that
  conventional testing is unlikely to find.
 
 That is probably overkill.  Running with test patches isn't something we
 expect folks to do often.

I wasn't thinking about test patches.

My assumption is that the asserts that are currently in place fall into
one of two categories: either they check for something that if false
could result in data corruption in the heap, or they check for something
that shouldn't happen, but if it does it can't corrupt the heap. If that
assumption is correct then separating them might make it easier to run
with the set of critical asserts turned on. Currently, there can be a
substantial performance penalty with all asserts turned on, but I
suspect a lot of that penalty is from asserts in things like parsing and
planning code; code that pretty much couldn't corrupt data.
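
A minimal sketch of the two-class idea, assuming a hypothetical checker in
which critical checks always run and ordinary checks run only when debugging
is enabled. None of these names exist in PostgreSQL; checks are passed as
callables so that a disabled check is never evaluated.

```python
class Checker:
    """Hypothetical two-level assertion helper (not a PostgreSQL API)."""

    def __init__(self, debug_enabled=False):
        self.debug_enabled = debug_enabled

    def critical(self, check, message):
        # Always evaluated: for conditions whose failure could corrupt data.
        if not check():
            raise AssertionError("critical check failed: " + message)

    def debug(self, check, message):
        # Evaluated only when debugging is on: for "should not happen" cases.
        if self.debug_enabled and not check():
            raise AssertionError("debug check failed: " + message)
```

A production build would correspond to Checker(debug_enabled=False), paying
only for the critical checks.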
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags

2005-10-31 Thread Bruce Momjian

Jim C. Nasby wrote:
 On Mon, Oct 31, 2005 at 01:01:14PM -0500, Bruce Momjian wrote:
   This incident has made me wonder if it's worth creating two classes of
   assertions. The (hopefully more common) set of assertions would be for
   things that shouldn't happen but, if they go uncaught, won't result in heap
   corruption. A new set (well, existing asserts, but just re-classified)
   would be for things that if uncaught could result in heap corruption. My
   hope is that the set of critical assertions could be turned on by
   default, helping to identify race conditions and other bugs that
   conventional testing is unlikely to find.
  
  That is probably overkill.  Running with test patches isn't something we
  expect folks to do often.
 
 I wasn't thinking about test patches.
 
 My assumption is that the asserts that are currently in place fall into
 one of two categories: either they check for something that if false
 could result in data corruption in the heap, or they check for something
 that shouldn't happen, but if it does it can't corrupt the heap. If that
 assumption is correct then separating them might make it easier to run
 with the set of critical asserts turned on. Currently, there can be a
 substantial performance penalty with all asserts turned on, but I
 suspect a lot of that penalty is from asserts in things like parsing and
 planning code; code that pretty much couldn't corrupt data.

There is no way to know, if the system has some incorrect value, whether
that would later corrupt the data or not.  Anything the system does that it
shouldn't do is a potential corruption problem.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)->lp_flags

2005-10-31 Thread Jim C. Nasby

On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote:
 There is no way to know, if the system has some incorrect value, whether
 that would later corrupt the data or not.  Anything the system does that it
 shouldn't do is a potential corruption problem.

But is it safe to say that there are areas where a failed assert is far
more likely to result in data corruption? And that there are also areas
where bugs are likely to be difficult or impossible to find, such as
race conditions? ISTM that it would be valuable to do some additional
checking in these critical areas.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: [HACKERS] 8.1 Release Candidate 1 Coming ...

2005-10-31 Thread Chris Browne

[EMAIL PROTECTED] (Tom Lane) writes:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 hmm well -HEAD(and 8.0.4 too!) is broken on AIX 5.3ML3:
 http://archives.postgresql.org/pgsql-hackers/2005-10/msg01053.php

 [ shrug... ]  The reports of this problem have not given enough
 information to fix it, and since it's not a regression from 8.0,
 it's not going to hold up the 8.1 release.  When and if we receive
 enough info to fix it, we'll gladly do so, but ...

Well, we never had an AIX 5.3 system when 8.0 was released, so didn't
attempt a compile.  Seneca just tried out a build on 8.0.3 on AIX 5.3;
it appears to be experiencing the same problem with initdb, and a
slight modification of the previous fix appears to resolve the
issue.

Can you suggest what further we might provide that would help?

 (My guess is that the problem is a compiler or libc bug anyway,
 given that one report says that replacing a memcpy call with an
 equivalent loop makes the failure go away.)

It seems unlikely to be a compiler bug as the same issue has been
reported with both GCC and IBM XLC.  I could believe it being a libc
bug...

It would be terribly disappointing to have to report both internally
and externally that AIX 5.3 is not a usable platform for recent
releases of PostgreSQL...
-- 
cbbrowne,@,ntlug.org
http://cbbrowne.com/info/linuxdistributions.html
Never lend your car to anyone  to whom you have given birth to. 
--Erma Bombeck

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] 8.1 Release Candidate 1 Coming ...

2005-10-31 Thread Tom Lane

Chris Browne [EMAIL PROTECTED] writes:
 [EMAIL PROTECTED] (Tom Lane) writes:
 (My guess is that the problem is a compiler or libc bug anyway,
 given that one report says that replacing a memcpy call with an
 equivalent loop makes the failure go away.)

 It seems unlikely to be a compiler bug as the same issue has been
 reported with both GCC and IBM XLC.  I could believe it being a libc
 bug...

As best I can tell after poking at it on Stefan's machine, it's a linker
bug, or else there is something strange about memcpy as compared to,
say, memcmp.  A function pointer to memcmp works, a function pointer to
memcpy contains a bogus value that points entirely outside the program's
address space.  This despite the assembly code that generates them
looking just the same in both cases, viz

LC..12:
.tc memcmp[TC],memcmp[DS]
LC..14:
.tc memcpy[TC],memcpy[DS]

Even more interesting, if you start the postmaster under gdb and examine
the pointer, then set a breakpoint at main and say run, by the time
control arrives at main() the bogus value has changed to a different
bogus value.  So something in the basic C runtime support is frobbing it
--- incorrectly :-(.  I think all the signs point to incorrect
relocation data generated by the linker, though I have no idea why only
memcpy would be affected.
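
A standalone check along these lines might help when reporting the problem. It is a minimal sketch, not the code path PostgreSQL actually uses: the function name check_libc_pointers is made up for the example, and it only assumes that statically initialized function pointers to memcmp and memcpy go through the same kind of relocation described above.

```c
#include <string.h>

typedef void *(*memcpy_fn) (void *, const void *, size_t);
typedef int (*memcmp_fn) (const void *, const void *, size_t);

/*
 * Statically initialized pointers, so their values come from relocation
 * data rather than from code executed at run time.  volatile keeps the
 * compiler from turning the indirect calls back into direct ones.
 */
static memcpy_fn volatile copy_ptr = memcpy;
static memcmp_fn volatile cmp_ptr = memcmp;

/*
 * Returns 1 if both pointers are usable.  On a system showing the
 * failure, the call through copy_ptr would be expected to crash
 * instead of returning.
 */
int
check_libc_pointers(void)
{
    char        src[] = "relocation check";
    char        dst[sizeof src] = {0};

    copy_ptr(dst, src, sizeof src);
    return cmp_ptr(dst, src, sizeof src) == 0;
}
```

Calling check_libc_pointers() from a trivial main() and building it with the same compiler and linker flags as the server would show whether the problem reproduces outside PostgreSQL.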

 It would be terribly disappointing to have to report both internally
 and externally that AIX 5.3 is not a usable platform for recent
 releases of PostgreSQL...

According to Stefan it broke between 5.3ML1 and 5.3ML3.  I suggest
filing a defect report with IBM.  We're not going to stop using memcpy
because one version of one platform is broken.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags

2005-10-31 Thread Gregory Maxwell

On 10/31/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
 On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote:
  There is no way to know, if the system has some incorrect value, whether
  that would later corrupt the data or not.  Anything the system does that it
  shouldn't do is a potential corruption problem.
 But is it safe to say that there are areas where a failed assert is far
 more likely to result in data corruption? And that there are also areas
 where bugs are likely to be difficult or impossible to find, such as
 race conditions? ISTM that it would be valuable to do some additional
 checking in these critical areas.

There are, no doubt, also places where an assert has minimal to no
performance impact. I'd wager a guess that the intersection of low
impact asserts, and asserts which measure high risk activities, is
small enough to be uninteresting.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] 8.1 Release Candidate 1 Coming ...

2005-10-31 Thread Mag Gam

Is this issue only on AIX 5.3 ML1 thru ML 3?
Does the build work fine with 5.2 (ALL MLs)?



On 10/31/05, Tom Lane [EMAIL PROTECTED] wrote:
 Chris Browne [EMAIL PROTECTED] writes:
  [EMAIL PROTECTED] (Tom Lane) writes:
  (My guess is that the problem is a compiler or libc bug anyway,
  given that one report says that replacing a memcpy call with an
  equivalent loop makes the failure go away.)

  It seems unlikely to be a compiler bug as the same issue has been
  reported with both GCC and IBM XLC.  I could believe it being a libc
  bug...

 As best I can tell after poking at it on Stefan's machine, it's a linker
 bug, or else there is something strange about memcpy as compared to,
 say, memcmp.  A function pointer to memcmp works, a function pointer to
 memcpy contains a bogus value that points entirely outside the program's
 address space.  This despite the assembly code that generates them
 looking just the same in both cases, viz

 LC..12:
 .tc memcmp[TC],memcmp[DS]
 LC..14:
 .tc memcpy[TC],memcpy[DS]

 Even more interesting, if you start the postmaster under gdb and examine
 the pointer, then set a breakpoint at main and say run, by the time
 control arrives at main() the bogus value has changed to a different
 bogus value.  So something in the basic C runtime support is frobbing it
 --- incorrectly :-(.  I think all the signs point to incorrect
 relocation data generated by the linker, though I have no idea why only
 memcpy would be affected.

  It would be terribly disappointing to have to report both internally
  and externally that AIX 5.3 is not a usable platform for recent
  releases of PostgreSQL...

 According to Stefan it broke between 5.3ML1 and 5.3ML3.  I suggest
 filing a defect report with IBM.  We're not going to stop using memcpy
 because one version of one platform is broken.

 regards, tom lane



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] 8.1 Release Candidate 1 Coming ...

2005-10-31 Thread Tom Lane

Mag Gam [EMAIL PROTECTED] writes:
 Is this issue only on AIX 5.3 ML1 thru ML 3?
 Does the build work fine with 5.2 (ALL MLs)?

There's an AIX 5.2 machine in the buildfarm, and it seems happy.  I have
no idea about details beyond that ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)

2005-10-31 Thread Jim C. Nasby

Now that I've got a little better idea of what this code does, I've
noticed something interesting... this issue is happening on an 8-way
machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
greatly increase the odds of buffer conflicts? Bug aside, would it be
better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?

Also, something else to note is that this database can see a pretty high
transaction rate... I just checked and it was doing 200TPS, but iirc it
can hit 1000+ TPS during the day.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)

2005-10-31 Thread Alvaro Herrera

Jim C. Nasby wrote:
 Now that I've got a little better idea of what this code does, I've
 noticed something interesting... this issue is happening on an 8-way
 machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
 greatly increase the odds of buffer conflicts? Bug aside, would it be
 better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?

We had talked about increasing NUM_SLRU_BUFFERS depending on
shared_buffers, but it didn't get done.  Something to consider for 8.2
though.  I think you could have better performance by increasing that
setting, while at the same time diminishing the possibility that the race
condition appears.
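
Purely as an illustration of what "depending on shared_buffers" could look like, here is a sketch of a sizing rule. The ratio and the upper bound are invented for the example and are not taken from any PostgreSQL source; only the lower bound of 8 reflects the current compiled-in value mentioned above.

```c
/* Hypothetical bounds and ratio; chosen only to make the example concrete. */
#define SLRU_BUFFERS_MIN        8    /* the current fixed value */
#define SLRU_BUFFERS_MAX        64   /* assumed cap */
#define SHARED_BUFFERS_PER_SLRU 512  /* assumed scaling ratio */

/* Scale the SLRU buffer count with shared_buffers, within fixed bounds. */
int
slru_buffers_for(int shared_buffers)
{
    int         n = shared_buffers / SHARED_BUFFERS_PER_SLRU;

    if (n < SLRU_BUFFERS_MIN)
        n = SLRU_BUFFERS_MIN;
    if (n > SLRU_BUFFERS_MAX)
        n = SLRU_BUFFERS_MAX;
    return n;
}
```

Small installations keep today's value of 8, while larger settings grow the pool up to the cap. The same shape would work with a different driving setting.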

I think you should also consider increasing PGPROC_MAX_CACHED_SUBXIDS
(src/include/storage/proc.h), because that should decrease the chance
that the subtrans area needs to be scanned.  By how much, however, I
wouldn't know -- it depends on the number of subtransactions you
typically have; I guess you could activate the measuring code in
procarray.c to get a figure.

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
www.google.com: interfaz de línea de comando para la web.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches

2005-10-31 Thread Mark Wong

On Thu, 20 Oct 2005 23:03:47 +0100
Simon Riggs [EMAIL PROTECTED] wrote:

 On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:
   
   This isn't exactly elegant coding, but it provides a useful improvement
   on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
   pretty darn stupid. But it does follow the CPU optimization handbook
   advice and I did see a noticeable improvement in performance and a
   reduction in context switching.
 
   I'm not in a position to try this again now on 8.1beta, but I'd welcome
   a performance test result from anybody that is. I'll supply a patch
   against 8.1beta for anyone wanting to test this.
  
  Ok, I've produced a few results on a 4 way (8 core) POWER 5 system, which
  I've just set up and probably needs a bit of tuning.  I don't see much
  difference but I'm wondering if the cacheline sizes are dramatically
  different from Intel/AMD processors.  I still need to take a closer look
  to make sure I haven't grossly mistuned anything, but I'll let everyone
  take a look:
 
 Well, the Power 5 architecture probably has the lowest overall memory
 delay you can get currently so in some ways that would negate the
 effects of the patch. (Cacheline is still 128 bytes, AFAICS). But it's
 clear the patch isn't significantly better (like it was with 8.0 when we
 tried this on the 8-way Itanium in Feb).
 
  cvs 20051013
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/19/
  2501 notpm
  
  cvs 20051013 w/ lw.patch
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/20/
  2519 notpm
 
 Could you re-run with wal_buffers = 32 ? (Without patch) Thanks

Ok, sorry for the delay.  I've bumped up the wal_buffers to 2048 and
redid the disk layout.  Here's where I'm at now:

cvs 20051013
http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/40/
3257 notpm

cvs 20051013 w/ lw.patch
http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/42/
3285 notpm

Still not much of a difference with the patch.  A quick glance over the
iostat data suggests I'm still not i/o bound, but the i/o wait is rather
high according to vmstat.  Will try to see if there's anything else
obvious to get the load up higher.

Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)

2005-10-31 Thread Jim C. Nasby

On Mon, Oct 31, 2005 at 09:02:59PM -0300, Alvaro Herrera wrote:
 Jim C. Nasby wrote:
  Now that I've got a little better idea of what this code does, I've
  noticed something interesting... this issue is happening on an 8-way
  machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
  greatly increase the odds of buffer conflicts? Bug aside, would it be
  better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?
 
 We had talked about increasing NUM_SLRU_BUFFERS depending on
 shared_buffers, but it didn't get done.  Something to consider for 8.2
 though.  I think you could have better performance by increasing that
 setting, while at the same time diminishing the possibility that the race
 condition appears.
 
Ok, I'll look into that. This database is definitely having issues due
to the sheer transaction volume, so maybe that will help.

If NUM_SLRU_BUFFERS were to be tied to something, wouldn't it make more
sense to tie it to wal_buffers though? One example is a data warehouse
might have a very high shared_buffers, but most likely won't have a high
transaction rate. ISTM that most databases with a high transaction rate
are likely to have increased wal_buffers.

 I think you should also consider increasing PGPROC_MAX_CACHED_SUBXIDS
 (src/include/storage/proc.h), because that should decrease the chance
 that the subtrans area needs to be scanned.  By how much, however, I
 wouldn't know -- it depends on the number of subtransactions you
 typically have; I guess you could activate the measuring code in
 procarray.c to get a figure.

AFAIK they're not using subtransactions at all, but I'll check.

Is there anywhere this stuff is documented other than in code? It sounds
like an advanced tuning guide would be very valuable for environments
like this one...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

[HACKERS] regression failures on Windows in machines with some non-English locales

2005-10-31 Thread Andrew Dunstan



I have become aware that regression is failing due to ordering 
differences on Windows machines in some non-English locales 
(specifically, Czech, but the potential is there for more failures).


The problem seems to be that the regression suite and initdb don't do 
enough between them to ensure that the tests are run in C locale.


The simple solution seems to be to add --no-locale to the initdb args in 
pg_regress.sh. I have asked Petr Jelinek (one of our Czech users) to 
test this. If it works as I expect it to (buildfarm has done this for 
installcheck tests for some time)  I'd like to add this to both the HEAD 
and 8.0 branches. I know it's very late in the cycle, but it seems very 
low risk to me, and I'd like to have regression working on as broad a 
set of platforms as possible.


If people prefer, I could add it just for the Windows case - Unix 
platforms won't see the effect I propose to remedy, as their setlocale 
works from the environment, unlike Windows.
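
To show why the locale matters for expected test output, here is a small sketch that sorts strings with strcoll() under an explicitly named locale. The helper name sort_with_locale is made up for the example. It relies only on standard C behavior: in the "C" locale strcoll() orders strings the same way strcmp() does, while other locales may order them differently.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator that uses the current LC_COLLATE rules. */
static int
collate_cmp(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

/*
 * Sort items under the named locale.  Returns 0 if the locale could not
 * be selected, 1 on success.
 */
int
sort_with_locale(const char *locale_name, const char **items, size_t n)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return 0;
    qsort(items, n, sizeof items[0], collate_cmp);
    return 1;
}
```

Sorting {"b", "B", "a", "A"} with the "C" locale yields A, B, a, b (plain byte order). A locale with different collation rules can return another order for the same input, which is the kind of difference that shows up as a regression diff.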


Thoughts?

cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)

2005-10-31 Thread Tom Lane

Jim C. Nasby [EMAIL PROTECTED] writes:
 AFAIK they're not using subtransactions at all, but I'll check.

Well, yeah, they are ... else you'd never have seen this failure.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] regression failures on Windows in machines with some non-English locales

2005-10-31 Thread Tom Lane

Andrew Dunstan [EMAIL PROTECTED] writes:
 The simple solution seems to be to add --no-locale to the initdb args in 
 pg_regress.sh.

Er ... what exactly does that do that setting LC_ALL=C doesn't?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] regression failures on Windows in machines with some non-English

2005-10-31 Thread Petr Jelinek


Tom Lane wrote:


The simple solution seems to be to add --no-locale to the initdb args in 
pg_regress.sh.



Er ... what exactly does that do that setting LC_ALL=C doesn't?



Windows ignores locale environment variables, so you can't do that.

--
Regards
Petr Jelinek (PJMODOS)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches

2005-10-31 Thread Simon Riggs

On Mon, 2005-10-31 at 16:10 -0800, Mark Wong wrote:
 On Thu, 20 Oct 2005 23:03:47 +0100
 Simon Riggs [EMAIL PROTECTED] wrote:
 
  On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:

This isn't exactly elegant coding, but it provides a useful improvement
on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
pretty darn stupid. But it does follow the CPU optimization handbook
advice and I did see a noticeable improvement in performance and a
reduction in context switching.
  
I'm not in a position to try this again now on 8.1beta, but I'd welcome
a performance test result from anybody that is. I'll supply a patch
against 8.1beta for anyone wanting to test this.
   
   Ok, I've produced a few results on a 4 way (8 core) POWER 5 system, which
   I've just set up and probably needs a bit of tuning.  I don't see much
   difference but I'm wondering if the cacheline sizes are dramatically
   different from Intel/AMD processors.  I still need to take a closer look
   to make sure I haven't grossly mistuned anything, but I'll let everyone
   take a look:
  
  Well, the Power 5 architecture probably has the lowest overall memory
  delay you can get currently so in some ways that would negate the
  effects of the patch. (Cacheline is still 128 bytes, AFAICS). But it's
  clear the patch isn't significantly better (like it was with 8.0 when we
  tried this on the 8-way Itanium in Feb).
  
   cvs 20051013
   http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/19/
   2501 notpm
   
   cvs 20051013 w/ lw.patch
   http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/20/
   2519 notpm
  
  Could you re-run with wal_buffers = 32 ? (Without patch) Thanks
 
 Ok, sorry for the delay.  I've bumped up the wal_buffers to 2048 and
 redid the disk layout.  Here's where I'm at now:
 
 cvs 20051013
 http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/40/
 3257 notpm
 
 cvs 20051013 w/ lw.patch
 http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/42/
 3285 notpm
 
 Still not much of a difference with the patch.  A quick glance over the
 iostat data suggests I'm still not i/o bound, but the i/o wait is rather
 high according to vmstat.  Will try to see if there's anything else
 obvious to get the load up higher.

OK, that's fine. I'm glad there's some gain, but not much yet. I think we
should leave out doing any more tests on lw.patch for now.

Concerned about the awful checkpointing. Can you bump wal_buffers to
8192 just to make sure? That's way too high, but just to prove it.

We need to reduce the number of blocks to be written at checkpoint.

 bgwriter_all_maxpages   5 -> 15
 bgwriter_all_percent    0.333
 bgwriter_delay          200
 bgwriter_lru_maxpages   5 -> 7
 bgwriter_lru_percent    1

 shared_buffers  set lower to 10
 (which should cause some amusement on-list)

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] 8.1 Release Candidate 1 Coming ...

2005-10-31 Thread Stefan Kaltenbrunner

Mag Gam wrote:
 Is this issue only on AIX 5.3 ML1 thru ML 3?
 Does the build work fine with 5.2 (ALL MLs)?

5.3 ML1 works but it is affected by the System include Bug mentioned in
our AIX-FAQ. ML3 is supposed to fix that specific problem but breaks in
another more difficult way as it seems ...



Stefan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq
