Re: [HACKERS] TESTING in src/bin/pg_upgrade has incorrect documentation

2016-06-06 Thread Robert Haas
On Thu, Jun 2, 2016 at 12:35 PM, Andreas 'ads' Scherbaum wrote:
> the TESTING file in src/bin/pg_upgrade talks about a "check.sh script", but
> this seems to be a binary (check) now.

Nope:

[rhaas pgsql]$ git grep -F check.sh
[rhaas pgsql]$

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] TESTING in src/bin/pg_upgrade has incorrect documentation

2016-06-02 Thread Andreas 'ads' Scherbaum


Hi,

the TESTING file in src/bin/pg_upgrade talks about a "check.sh script", 
but this seems to be a binary (check) now.



Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project




Re: [HACKERS] Testing Postgresql 9.5 RC1 with Alfresco 5.0.d

2015-12-29 Thread Boriss Mejias

Simon Riggs wrote on 29/12/15 05:31:
[...]

Thanks for testing.

Do you have any comments on performance testing?


Not really, because I didn't have an equivalent environment to compare. These
tests took exactly the same time as an earlier run with PostgreSQL 9.3, with a
major difference in the server configuration:


Postgresql 9.5 RC1
1 GB of RAM
1 vCPU Intel
CentOS 6.5

vs

Postgresql 9.3
8 GB of RAM
4 vCPU LPAR
Redhat 6.5 PowerLinux

So, I can't really compare directly. Still, there was definitely no performance
degradation, and possibly a performance gain. But of course, this could all be down to Alfresco.


Discussing with the Alfresco community, I found a benchmarking test which I'm not
yet familiar with. I'll make sure to include that test next time I test a
pre-release version of PostgreSQL, and I'll make sure to have equivalent
environments to compare.



Are there any opportunities to improve Alfresco using the new features in
PostgreSQL 9.5?


I'm not an Alfresco developer, but I'll try to contact the Alfresco engineers to 
discuss this.



Thanks


You're welcome, and I hope the next tests will include some performance
measurements apart from just compliance tests.


Cheers
Boriss



--
Simon Riggs http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





[HACKERS] Testing Postgresql 9.5 RC1 with Alfresco 5.0.d

2015-12-28 Thread Boriss Mejias

Hi there,

The following wiki page:
https://wiki.postgresql.org/wiki/HowToBetaTest#How_to_Test
suggests that I should send the following report to this mailing list.

Name: Boriss Mejías (Tchorix)

Release: Postgresql 9.5 RC1

Test Type: Compatibility with Alfresco Community Edition 5.0.d

Test Detail: New Alfresco installation with Postgresql 9.5 RC1. Testing updates 
into the database.


Platform: CentOS 6.5

Installation Method: Compiled from tarball. Default parameters.

Platform Detail:
- VMWare virtual machine
- RAM: 1GB
- CPU: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
- Alfresco running on a different VM.

Test Procedure:
- Vanilla installation of Alfresco Community Edition 5.0.d
- Using the default jdbc driver shipped with Alfresco
- Test that Alfresco Shares works correctly with the basic functionality
- Script with multiple HTTP REST calls to create/read/delete users. Each 
operation/transaction involves several tables.


Failure? No

Test Results:
- Alfresco worked as expected.
- Scripts are run in parallel to provoke race conditions. Alfresco uses an
optimistic approach.
- Constraint violations on table keys are observed in the logs, which is
good, because data remains consistent and Alfresco handles the error messages
correctly.


Comments:
- I wrote my own scripts to test the consistency of user data; I'm not
part of Alfresco Inc., so I couldn't use their official test bed.
- Alfresco uses iBATIS to control the database model, hence I wasn't
expecting any failure.
- I don't have control over the table indexes, and I was not able to compare
the test with a previous version of PostgreSQL to measure performance.
- This is the first time I've run a test of a pre-release version of
PostgreSQL, so any comments or ideas are welcome so I can contribute again
next time.


Cheers
Boriss




Re: [HACKERS] Testing Postgresql 9.5 RC1 with Alfresco 5.0.d

2015-12-28 Thread Simon Riggs
On 28 December 2015 at 22:56, Boriss Mejias  wrote:

> Hi there,
>
> The following wiki page:
> https://wiki.postgresql.org/wiki/HowToBetaTest#How_to_Test
> suggests that I should send the following report to this mailing list.
>

Thanks for testing.

Do you have any comments on performance testing?

Are there any opportunities to improve Alfresco using the new features in
PostgreSQL 9.5?

Thanks

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Testing WAL replay by comparing before and after images again

2015-09-14 Thread Heikki Linnakangas

On 09/04/2015 09:30 PM, Simon Riggs wrote:

On 4 September 2015 at 13:45, Heikki Linnakangas  wrote:


Another issue was with the new speculative insertions. Replaying a
speculative insertion record sets the tuple's CTID to point to itself, like
in a regular insertion. But in the original system, the CTID is set to a
special speculative insertion token. The tool flagged up that difference.

I propose the attached patch (mark-speculative-insertions-in-replay.patch)
to fix that in the replay routine. This is not required for correctness,
but helps this tool, and seems like a good idea for debugging purposes
anyway.


ISTM that the WAL record should include the speculative insertion token, so
that replay can set it correctly.


I view this the same as command IDs. We don't restore the original 
command ID of a tuple at WAL replay either, because it's irrelevant for 
recovery and hot standby.



That way we can always re-check that the later update matches the
speculative insertion token we expect, in all cases.


Hmm, I guess that would give a tiny bit of extra sanity checking at 
replay. Doesn't really seem worth the trouble and extra WAL volume to me.



In any case, the assumption that we are replaying all changes in single
threaded mode is not appropriate for use with logical replication.


No such assumption here AFAICS.

- Heikki





Re: [HACKERS] Testing WAL replay by comparing before and after images again

2015-09-04 Thread Simon Riggs
On 4 September 2015 at 13:45, Heikki Linnakangas  wrote:


> Another issue was with the new speculative insertions. Replaying a
> speculative insertion record sets the tuple's CTID to point to itself, like
> in a regular insertion. But in the original system, the CTID is set to a
> special speculative insertion token. The tool flagged up that difference.
>
> I propose the attached patch (mark-speculative-insertions-in-replay.patch)
> to fix that in the replay routine. This is not required for correctness,
> but helps this tool, and seems like a good idea for debugging purposes
> anyway.
>

ISTM that the WAL record should include the speculative insertion token, so
that replay can set it correctly.

That way we can always re-check that the later update matches the
speculative insertion token we expect, in all cases.

In any case, the assumption that we are replaying all changes in single
threaded mode is not appropriate for use with logical replication.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


[HACKERS] Testing WAL replay by comparing before and after images again

2015-09-04 Thread Heikki Linnakangas
I reran my little testing tool that compares buffer page contents after 
every modification, in master and in WAL replay. Previously discussed 
here: http://www.postgresql.org/message-id/5357b582.7060...@vmware.com. 
Here's an updated version of my original hack, for current git master. 
(Michael posted less-hacky versions of that, but unfortunately I haven't 
gotten around to reviewing his stuff.)


I did not find any new bugs. There were a couple of false positives 
however. Firstly, the post-processing tool needed to be taught that BRIN 
pages can have the PD_HAS_FREE_LINES flag set, and ignore that (like it 
does for heap and other indexam pages).


Another issue was with the new speculative insertions. Replaying a 
speculative insertion record sets the tuple's CTID to point to itself, 
like in a regular insertion. But in the original system, the CTID is set 
to a special speculative insertion token. The tool flagged up that 
difference.


I propose the attached patch 
(mark-speculative-insertions-in-replay.patch) to fix that in the replay 
routine. This is not required for correctness, but helps this tool, and 
seems like a good idea for debugging purposes anyway.


Any objections?

- Heikki
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7776,8 +7776,27 @@ heap_xlog_insert(XLogReaderState *record)
 	XLogRedoAction action;
 
 	XLogRecGetBlockTag(record, 0, &target_node, NULL, &blkno);
-	ItemPointerSetBlockNumber(&target_tid, blkno);
-	ItemPointerSetOffsetNumber(&target_tid, xlrec->offnum);
+
+	if (xlrec->flags & XLH_INSERT_IS_SPECULATIVE)
+	{
+		/*
+		 * In a speculative insertion, the 'token' field is set to a token
+		 * used to arbitrate between concurrent insertions. The token is not
+		 * included in the WAL record, so we just set it to zero here. (There
+		 * can be no competing updates during recovery, so we don't need the
+		 * token.) It will be overwritten by a later XLOG_HEAP_CONFIRM or
+		 * XLOG_HEAP_DELETE record, or the transaction will abort, so the
+		 * value we store here doesn't matter, but it's nice to set it to
+		 * something that shows that this was a speculative insertion, for
+		 * debugging purposes.
+		 */
+		ItemPointerSet(&target_tid, 0, SpecTokenOffsetNumber);
+	}
+	else
+	{
+		/* In a normal insertion, point ctid to the tuple itself */
+		ItemPointerSet(&target_tid, blkno, xlrec->offnum);
+	}
 
 	/*
 	 * The visibility map may need to be fixed even if the heap page is
From 28b7382958272e399f5f611bf0d399e6778c9a59 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 7 Apr 2014 14:51:45 +0300
Subject: [PATCH 1/1] Write buffer image before and after every WAL record.


diff --git a/contrib/page_image_logging/Makefile b/contrib/page_image_logging/Makefile
new file mode 100644
index 000..9c68bbc
--- /dev/null
+++ b/contrib/page_image_logging/Makefile
@@ -0,0 +1,20 @@
+# contrib/page_image_logging/Makefile
+
+PGFILEDESC = "postprocess-images - "
+
+PROGRAM = postprocess-images
+OBJS	= postprocess-images.o
+
+PG_CPPFLAGS = -I$(libpq_srcdir)
+PG_LIBS = $(libpq_pgport)
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
subdir = contrib/page_image_logging
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/page_image_logging/README b/contrib/page_image_logging/README
new file mode 100644
index 000..e2f8225
--- /dev/null
+++ b/contrib/page_image_logging/README
@@ -0,0 +1,50 @@
+Usage
+-----
+
+1. Apply the patch
+
+2. Set up a master and standby.
+
+3. stop master, then standby.
+
+4. Remove $PGDATA/buffer-images from both servers.
+
+5. Start master and standby
+
+6. Run "make installcheck", or whatever you want to test
+
+7. Stop master, then standby
+
+8. compare the logged page images using the postprocessing tool:
+
+./postprocess-images ~/data-master/buffer-images ~/data-standby/buffer-images  > differences
+
+9. The 'differences' file should be empty. If not, investigate.
+
+Tips
+
+----
+The page images take up a lot of disk space! The PostgreSQL regression
+suite generates about 11GB - double that when the same is generated also
+in a standby.
+
+Always stop the master first, then standby. Otherwise, when you restart
+the standby, it will start WAL replay from the previous checkpoint, and
+log some page images already. Stopping the master creates a checkpoint
+record, avoiding the problem.
+
+If you get errors like this from postprocess-images:
+
+could not reorder line XXX
+
+It can be caused by an all-zeros page being logged with XLOG_HEAP_NEWPAGE
+records. Look at the line in the buffer-image file, and see if it's all zeros.
+This can happen e.g. when you change the tablespace of a table. See
+log_newpage() in xlog.c.
+
+You can use pg_xlogdump to see which WAL record a page image corresponds
+to. But beware that the LSN in the page image points to the *end* of the
+WAL record, while the LSN 

Re: [HACKERS] Testing DDL deparsing support

2014-12-09 Thread Bruce Momjian
On Mon, Dec  8, 2014 at 12:43:36PM -0500, Robert Haas wrote:
 On Sat, Dec 6, 2014 at 10:43 PM, Bruce Momjian br...@momjian.us wrote:
  This ensures that creation DDL is checked if it is used in the regression
  database, but what about ALTER and DROP?  pg_dump doesn't issue those,
  except in special cases like inheritance.
 
 The proposed testing mechanism should cover any ALTER commands that
 are in the regression tests provided that those objects are not
 subsequently dropped -- because if the ALTER commands aren't replayed
 properly, then the later pg_dump won't produce the same output.
 
 There probably are some gaps in our current regression tests in this
 area, but that's probably a good thing to fix regardless of this.

OK, I understand now that the ALTERs are being passed to the slave and
we can then test that against pg_dump --- sounds good.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Testing DDL deparsing support

2014-12-08 Thread Robert Haas
On Sat, Dec 6, 2014 at 10:43 PM, Bruce Momjian br...@momjian.us wrote:
 This ensures that creation DDL is checked if it is used in the regression
 database, but what about ALTER and DROP?  pg_dump doesn't issue those,
 except in special cases like inheritance.

The proposed testing mechanism should cover any ALTER commands that
are in the regression tests provided that those objects are not
subsequently dropped -- because if the ALTER commands aren't replayed
properly, then the later pg_dump won't produce the same output.

There probably are some gaps in our current regression tests in this
area, but that's probably a good thing to fix regardless of this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Testing DDL deparsing support

2014-12-07 Thread Ian Barwick
On 14/12/07 12:43, Bruce Momjian wrote:
 On Tue, Dec  2, 2014 at 03:13:07PM -0300, Alvaro Herrera wrote:
 Robert Haas wrote:
 On Thu, Nov 27, 2014 at 11:43 PM, Ian Barwick i...@2ndquadrant.com wrote:

 A simple schedule to demonstrate this is available; execute from the
 src/test/regress/ directory like this:

 ./pg_regress \
   --temp-install=./tmp_check \
   --top-builddir=../../.. \
   --dlpath=. \
   --schedule=./schedule_ddl_deparse_demo

 I haven't read the code, but this concept seems good to me.

 Excellent, thanks.

 It has the unfortunate weakness that a difference could exist during
 the *middle* of the regression test run that is gone by the *end* of
 the run, but our existing pg_upgrade testing has the same weakness, so
 I guess we can view this as one more reason not to be too aggressive
 about having regression tests drop the unshared objects they create.

 Agreed.  Not dropping objects also helps test pg_dump itself; the normal
 procedure there is to run the regression tests, then pg_dump the regression
 database.  Objects that are dropped never exercise their corresponding
 pg_dump support code, which I think is a bad thing.  I think we should
 institute a policy that regression tests must keep the objects they
 create; maybe not all of them, but at least a sample large enough to
 cover all interesting possibilities.
 
 This ensures that creation DDL is checked if it is used in the regression
 database, but what about ALTER and DROP?  pg_dump doesn't issue those,
 except in special cases like inheritance.

Sure, pg_dump won't contain ALTER/DROP DDL; we are using pg_dump
after replaying the DDL commands to compare the actual state of the
database with the expected state.

As I'm in the middle of writing these tests, before I go any further:
do you agree that the tests need to be included?


Regards

Ian Barwick

-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Testing DDL deparsing support

2014-12-06 Thread Bruce Momjian
On Tue, Dec  2, 2014 at 03:13:07PM -0300, Alvaro Herrera wrote:
 Robert Haas wrote:
  On Thu, Nov 27, 2014 at 11:43 PM, Ian Barwick i...@2ndquadrant.com wrote:
 
   A simple schedule to demonstrate this is available; execute from the
   src/test/regress/ directory like this:
  
   ./pg_regress \
 --temp-install=./tmp_check \
 --top-builddir=../../.. \
 --dlpath=. \
 --schedule=./schedule_ddl_deparse_demo
  
  I haven't read the code, but this concept seems good to me.
 
 Excellent, thanks.
 
  It has the unfortunate weakness that a difference could exist during
  the *middle* of the regression test run that is gone by the *end* of
  the run, but our existing pg_upgrade testing has the same weakness, so
  I guess we can view this as one more reason not to be too aggressive
  about having regression tests drop the unshared objects they create.
 
 Agreed.  Not dropping objects also helps test pg_dump itself; the normal
 procedure there is to run the regression tests, then pg_dump the regression
 database.  Objects that are dropped never exercise their corresponding
 pg_dump support code, which I think is a bad thing.  I think we should
 institute a policy that regression tests must keep the objects they
 create; maybe not all of them, but at least a sample large enough to
 cover all interesting possibilities.

This ensures that creation DDL is checked if it is used in the regression
database, but what about ALTER and DROP?  pg_dump doesn't issue those,
except in special cases like inheritance.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Testing DDL deparsing support

2014-12-05 Thread Bruce Momjian
On Fri, Nov 28, 2014 at 01:43:36PM +0900, Ian Barwick wrote:
 Standard regression tests are helpful because patch authors include new test
 cases in the patches that stress their new options or commands.  This new test
 framework needs to be something that internally runs the regression tests and
 exercises DDL deparsing as the regression tests execute DDL.  That would mean
 that new commands and options would automatically be deparse-tested by our new
 framework as soon as patch authors add standard regress support.

Are you saying that every time a new option is added to a command, a new
regression test needs to be added?  We don't normally do that, and it
could easily bloat the regression tests over time.  In summary, this
testing will help, but it will not be fully reliable.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Testing DDL deparsing support

2014-12-05 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Fri, Nov 28, 2014 at 01:43:36PM +0900, Ian Barwick wrote:
  Standard regression tests are helpful because patch authors include new test
  cases in the patches that stress their new options or commands.  This new 
  test
  framework needs to be something that internally runs the regression tests 
  and
  exercises DDL deparsing as the regression tests execute DDL.  That would 
  mean
  that new commands and options would automatically be deparse-tested by our 
  new
  framework as soon as patch authors add standard regress support.
 
 Are you saying every time a new option is added to a command that a new
 regression test needs to be added?

Not necessarily -- an existing test could be modified, as well.

 We don't normally do that,

I sure hope we do have all options covered by tests.

 and it could easily bloat the regression tests over time.

We had 103 regression tests in 8.2 and we have 145 in 9.4.  Does this
qualify as bloat?

 In summary, this testing will help, but it will not be fully reliable.

No testing is ever fully reliable.  If it were, there would never be
bugs.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Testing DDL deparsing support

2014-12-05 Thread Bruce Momjian
On Fri, Dec  5, 2014 at 09:29:59AM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
  On Fri, Nov 28, 2014 at 01:43:36PM +0900, Ian Barwick wrote:
   Standard regression tests are helpful because patch authors include new 
   test
   cases in the patches that stress their new options or commands.  This new 
   test
   framework needs to be something that internally runs the regression tests 
   and
   exercises DDL deparsing as the regression tests execute DDL.  That would 
   mean
   that new commands and options would automatically be deparse-tested by 
   our new
   framework as soon as patch authors add standard regress support.
  
  Are you saying every time a new option is added to a command that a new
  regression test needs to be added?
 
 Not necessarily -- an existing test could be modified, as well.
 
  We don't normally do that,
 
 I sure hope we do have all options covered by tests.

Are you saying that every combination of ALTER options is tested?  We
have rejected simple regression test additions on the basis that the
syntax works and is unlikely to break once the developer has tested it.

  and it could easily bloat the regression tests over time.
 
 We had 103 regression tests in 8.2 and we have 145 in 9.4.  Does this
 qualify as bloat?

No, that seems fine.  I am worried about having to have a test for every
syntax change, which we currently don't require.  Was that issue not clear
in my first email?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Testing DDL deparsing support

2014-12-05 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Fri, Dec  5, 2014 at 09:29:59AM -0300, Alvaro Herrera wrote:
  Bruce Momjian wrote:
   On Fri, Nov 28, 2014 at 01:43:36PM +0900, Ian Barwick wrote:
Standard regression tests are helpful because patch authors include new 
test
cases in the patches that stress their new options or commands.  This 
new test
framework needs to be something that internally runs the regression 
tests and
exercises DDL deparsing as the regression tests execute DDL.  That 
would mean
that new commands and options would automatically be deparse-tested by 
our new
framework as soon as patch authors add standard regress support.
   
   Are you saying every time a new option is added to a command that a new
   regression test needs to be added?
  
  Not necessarily -- an existing test could be modified, as well.
  
   We don't normally do that,
  
  I sure hope we do have all options covered by tests.
 
 Are you saying that every combination of ALTER options is tested?

Well, ALTER TABLE is special: you can give several subcommands, and each
subcommand can be one of a rather long list of possible subcommands.
Testing every combination would mean a combinatorial explosion, which
would indeed be too large.  But surely we want a small bunch of tests to
prove that having several subcommands works fine, and also at least one
test for every possible subcommand.

 We have rejected simple regression test additions on the basis that
 the syntax works and is unlikely to break once the developer has
 tested it.

This rationale doesn't sound so good to me.  Something might work fine
the minute it is committed, but someone else might break it
inadvertently later; this has actually happened.  Having no tests at all
for a feature isn't good.  

I know we have recently rejected patches that added tests only to
improve the coverage percent, for instance in CREATE DATABASE, because
the runtime of the tests got too large.  Are there other examples of
rejected tests?

   and it could easily bloat the regression tests over time.
  
  We had 103 regression tests in 8.2 and we have 145 in 9.4.  Does this
  qualify as bloat?
 
 No, that seems fine.  I am worried about having to have a test for every
 syntax change, which we currently don't do?  Was that issue not clear in
 my first email?

Well, if by "every syntax change" you mean every feature addition,
then I think we should have at least one test for each, yes.  It's not
like we add new syntax every day anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Testing DDL deparsing support

2014-12-05 Thread Bruce Momjian
On Fri, Dec  5, 2014 at 04:10:12PM -0300, Alvaro Herrera wrote:
 Well, ALTER TABLE is special: you can give several subcommands, and each
 subcommand can be one of a rather long list of possible subcommands.
 Testing every combination would mean a combinatorial explosion, which
 would indeed be too large.  But surely we want a small bunch of tests to
 prove that having several subcommands works fine, and also at least one
 test for every possible subcommand.
 
  We have rejected simple regression test additions on the basis that
  the syntax works and is unlikely to break once the developer has
  tested it.
 
 This rationale doesn't sound so good to me.  Something might work fine
 the minute it is committed, but someone else might break it
 inadvertently later; this has actually happened.  Having no tests at all
 for a feature isn't good.  
 
 I know we have recently rejected patches that added tests only to
 improve the coverage percent, for instance in CREATE DATABASE, because
 the runtime of the tests got too large.  Are there other examples of
 rejected tests?

Yes, there are many cases where we have added options or keywords but
didn't add a regression test.

and it could easily bloat the regression tests over time.
   
   We had 103 regression tests in 8.2 and we have 145 in 9.4.  Does this
   qualify as bloat?
  
  No, that seems fine.  I am worried about having to have a test for every
  syntax change, which we currently don't require.  Was that issue not
  clear in my first email?
 
 Well, if with every syntax change you mean every feature addition,
 then I think we should have at least one test for each, yes.  It's not
 like we add new syntax every day anyway.

Well, my point is that this is a new requirement we would have to adopt, at
least to test the logical DDL behavior --- I suppose the tests could be
removed after testing.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Testing DDL deparsing support

2014-12-02 Thread Robert Haas
On Thu, Nov 27, 2014 at 11:43 PM, Ian Barwick i...@2ndquadrant.com wrote:
 DDL deparsing is a feature that allows collection of DDL commands as they
 are executed in a server, in some flexible, complete, and fully-contained
 format that allows manipulation, storage, and transmission.  This feature
 has several use cases; the two best known ones are DDL replication and
 DDL auditing.

 We have come up with a design that uses a JSON structure to store commands.
 It is similar to the C sprintf() call in spirit: there is a base format
 string, which is a generic template for each command type, and contains
 placeholders that represent the variable parts of the command.  The values
 for the placeholders in each specific command are members of the JSON
 object.  A helper function is provided that expands the format string,
 replaces the placeholders with the values, and returns the SQL command as
 text.  This design lets the user operate on the JSON structure in either a
 read-only fashion (for example, to block table creation if the names don't
 satisfy a certain condition), or by modifying it (for example, to change
 the schema name so that tables are created in different schemas when they
 are replicated to some remote server).
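To make the sprintf-like scheme concrete, here is a minimal sketch of how such a JSON command node could be expanded back into SQL. This is an illustration only, not the actual deparse_utility.c output: the "fmt", "identity", and "columns" member names are hypothetical.

```python
import json

def expand(node):
    """Expand a deparse node: a dict whose "fmt" template contains
    %{name}s placeholders that are replaced by other members of the dict."""
    out = []
    fmt = node["fmt"]
    i = 0
    while i < len(fmt):
        if fmt.startswith("%{", i):
            end = fmt.index("}s", i)
            name = fmt[i + 2:end]
            val = node[name]
            # Nested objects are expanded recursively; plain values as-is.
            out.append(expand(val) if isinstance(val, dict) else str(val))
            i = end + 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)

# Hypothetical JSON for a CREATE TABLE command; a consumer could edit
# e.g. cmd["identity"] to redirect the table into another schema before
# the command is re-executed on a replica.
cmd = json.loads("""
{
  "fmt": "CREATE TABLE %{identity}s (%{columns}s)",
  "identity": "public.foo",
  "columns": "a integer, b text"
}
""")
print(expand(cmd))  # prints: CREATE TABLE public.foo (a integer, b text)
```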

 This design is mostly accepted by the community.  The one sticking point is
 testing: how do we ensure that the JSON representation we have created
 correctly deparses back into a command that has the same effect as the
 original command.  This was expressed by Robert Haas:
 http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com

 The problem cannot be solved by a standard regression test module which
 runs a bunch of previously-defined commands and verifies the output.  We
 need to not only check the output for the commands as they exist today,
 but also to ensure that this does not get broken as future patches modify
 the existing commands as well as create completely new commands.

 The challenge here is to create a new testing framework that ensures the DDL
 deparsing module will be maintained by future hackers as the DDL grammar is
 modified.

 What and How to Test
 

 Our goal should be that patch authors run "make check-world" in their
 patched copies and notice that the DDL deparse test is failing; they can
 then modify deparse_utility.c to add support for the new commands, which
 should in general be pretty straightforward.  This way, maintaining
 deparsing code would be part of new patches, just like we require pg_dump
 support and documentation for new features.

 It would not work to require patch authors to add their new commands to a
 new pg_regress test file, because most would not be aware of the need, or
 they would just forget to do it, and patches would be submitted and
 possibly even committed without any realization of the breakage caused.

 There are two things we can rely on: standard regression tests, and pg_dump.

 Standard regression tests are helpful because patch authors include new test
 cases in the patches that stress their new options or commands.  This new
 test
 framework needs to be something that internally runs the regression tests
 and
 exercises DDL deparsing as the regression tests execute DDL.  That would
 mean
 that new commands and options would automatically be deparse-tested by our
 new
 framework as soon as patch authors add standard regress support.

 One thing is first-grade testing, that is ensure that the deparsed version
 of
 a DDL command can be executed at all, for if the deparsed version throws an
 error, it's immediately obvious that the deparse code is bogus.  This is
 because we know the original command did not throw an error: otherwise, the
 deparse code would not have run at all, because ddl_command_end triggers are
 only executed once the original command has completed execution.  So
 first-grade testing ensures that no trivial bugs are present.

 But there's second-grade verification as well: is the object produced by the
 deparsed version identical to the one produced by the original command?  One
 trivial but incomplete approach is to run the command, then save the
 deparsed
 version; run the deparsed version, and deparse that one too; compare both.
 The problem with this approach is that if the deparse code is omitting some
 clause (say it omits IN TABLESPACE in a CREATE TABLE command), then both
 deparsed versions would contain the same bug yet they would compare equal.
 Therefore this approach is not good enough.

 The best idea we have so far to attack second-grade testing is to trust
 pg_dump to expose differences: accumulate commands as they run in the
 regression database, the run the deparsed versions in a different database;
 then pg_dump both databases and compare the dumped outputs.

 Proof-of-concept
 

 We have now implemented this as a proof-of-concept; the code is available
 in the deparse branch at:

   

Re: [HACKERS] Testing DDL deparsing support

2014-12-02 Thread Alvaro Herrera
Robert Haas wrote:
 On Thu, Nov 27, 2014 at 11:43 PM, Ian Barwick i...@2ndquadrant.com wrote:

  A simple schedule to demonstrate this is available; execute from the
  src/test/regress/ directory like this:
 
  ./pg_regress \
--temp-install=./tmp_check \
--top-builddir=../../.. \
--dlpath=. \
--schedule=./schedule_ddl_deparse_demo
 
 I haven't read the code, but this concept seems good to me.

Excellent, thanks.

 It has the unfortunate weakness that a difference could exist during
 the *middle* of the regression test run that is gone by the *end* of
 the run, but our existing pg_upgrade testing has the same weakness, so
 I guess we can view this as one more reason not to be too aggressive
 about having regression tests drop the unshared objects they create.

Agreed.  Not dropping objects also helps test pg_dump itself; the normal
procedure there is run the regression tests, then pg_dump the regression
database.  Objects that are dropped never exercise their corresponding
pg_dump support code, which I think is a bad thing.  I think we should
institute a policy that regression tests must keep the objects they
create; maybe not all of them, but at least a sample large enough to
cover all interesting possibilities.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Testing DDL deparsing support

2014-11-27 Thread Ian Barwick

DDL deparsing is a feature that allows collection of DDL commands as they are
executed in a server, in some flexible, complete, and fully-contained format
that allows manipulation, storage, and transmission.  This feature has several
use cases; the two best known ones are DDL replication and DDL auditing.

We have come up with a design that uses a JSON structure to store commands.
It is similar to the C sprintf() call in spirit: there is a base format
string, which is a generic template for each command type, and contains
placeholders that represent the variable parts of the command.  The values for
the placeholders in each specific command are members of the JSON object.  A
helper function is provided that expands the format string and replaces the
placeholders with the values, and returns the SQL command as text.  This
design lets the user operate on the JSON structure in either a read-only
fashion (for example to block table creation if the names don't satisfy a
certain condition), or by modifying it (for example, to change the schema name
so that tables are created in different schemas when they are replicated to
some remote server).
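As a toy illustration of the format-string idea (a Python sketch, for illustration only -- the real helper is C code in the server, and the `%{name}s` placeholder syntax and field names below are invented for this example):

```python
import re

def expand_fmt(node):
    """Expand a deparse-style JSON object into SQL text.

    "fmt" is a template; each %{name}s placeholder is replaced by the
    member of the same name, recursing into nested objects that carry
    their own "fmt".  Toy sketch only.
    """
    def lookup(match):
        value = node[match.group(1)]
        if isinstance(value, dict):   # nested element with its own fmt
            return expand_fmt(value)
        return str(value)
    return re.sub(r"%\{(\w+)\}s", lookup, node["fmt"])

cmd = {
    "fmt": "CREATE TABLE %{identity}s (%{columns}s)",
    "identity": {"fmt": "%{schema}s.%{name}s",
                 "schema": "public", "name": "t1"},
    "columns": "a integer",
}

# A user hook could rewrite cmd["identity"]["schema"] before expansion --
# the "change the schema name" use case described above.
print(expand_fmt(cmd))   # CREATE TABLE public.t1 (a integer)
```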

This design is mostly accepted by the community.  The one sticking point is
testing: how do we ensure that the JSON representation we have created
correctly deparses back into a command that has the same effect as the
original command?  This was expressed by Robert Haas:
http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com

The problem cannot be solved by a standard regression test module which runs a
bunch of previously-defined commands and verifies the output.  We need not
only check the output for the commands as they exist today, but also we need
to ensure that this does not get broken as future patches modify the existing
commands as well as create completely new commands.

The challenge here is to create a new testing framework that ensures the DDL
deparsing module will be maintained by future hackers as the DDL grammar is
modified.

What and How to Test


Our goal should be that patch authors run make check-world in their patched
copies and notice that the DDL deparse test is failing; they can then modify
deparse_utility.c to add support for the new commands, which should in general
be pretty straightforward.  This way, maintaining deparsing code would be part
of new patches just like we require pg_dump support and documentation for new
features.

It would not work to require patch authors to add their new commands to a new
pg_regress test file, because most would not be aware of the need, or they
would just forget to do it, and patches would be submitted and possibly even
committed without any realization of the breakage caused.

There are two things we can rely on: standard regression tests, and pg_dump.

Standard regression tests are helpful because patch authors include new test
cases in the patches that stress their new options or commands.  This new test
framework needs to be something that internally runs the regression tests and
exercises DDL deparsing as the regression tests execute DDL.  That would mean
that new commands and options would automatically be deparse-tested by our new
framework as soon as patch authors add standard regress support.

One thing is first-grade testing, that is, ensuring that the deparsed version of
a DDL command can be executed at all, for if the deparsed version throws an
error, it's immediately obvious that the deparse code is bogus.  This is
because we know the original command did not throw an error: otherwise, the
deparse code would not have run at all, because ddl_command_end triggers are
only executed once the original command has completed execution.  So
first-grade testing ensures that no trivial bugs are present.
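A rough sketch of that first-grade check, with SQLite's in-memory engine standing in for a scratch PostgreSQL session (purely for illustration -- the function name and the use of SQLite are inventions of this sketch):

```python
import sqlite3

def first_grade_check(deparsed_sql):
    """Return True if the deparsed command executes at all.

    SQLite's in-memory engine stands in for a scratch PostgreSQL
    session here; the real check would run the command server-side.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(deparsed_sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(first_grade_check("CREATE TABLE t1 (a integer)"))  # True
print(first_grade_check("CREATE TALBE t1 (a integer)"))  # False (bogus deparse)
```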

But there's second-grade verification as well: is the object produced by the
deparsed version identical to the one produced by the original command?  One
trivial but incomplete approach is to run the command, then save the deparsed
version; run the deparsed version, and deparse that one too; compare both.
The problem with this approach is that if the deparse code is omitting some
clause (say it omits IN TABLESPACE in a CREATE TABLE command), then both
deparsed versions would contain the same bug yet they would compare equal.
Therefore this approach is not good enough.
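A deliberately buggy toy deparser, in Python, makes the flaw concrete (all names here are invented for illustration; the real deparse code is C inside the server):

```python
def parse(sql):
    """Split off a trailing TABLESPACE clause (toy parser)."""
    parts = sql.split(" TABLESPACE ")
    return {"body": parts[0],
            "tablespace": parts[1] if len(parts) > 1 else None}

def buggy_deparse(cmd):
    return cmd["body"]            # BUG: the tablespace clause is lost

original = "CREATE TABLE t1 (a int) TABLESPACE fast_ssd"
first = buggy_deparse(parse(original))
second = buggy_deparse(parse(first))

assert first == second            # the two deparsed versions agree...
assert first != original          # ...yet both silently dropped a clause
print(first)                      # CREATE TABLE t1 (a int)
```

Because both rounds drop the clause the same way, a deparse/re-deparse comparison never notices.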

The best idea we have so far to attack second-grade testing is to trust
pg_dump to expose differences: accumulate commands as they run in the
regression database, then run the deparsed versions in a different database;
then pg_dump both databases and compare the dumped outputs.
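The comparison step could be sketched like so -- a Python illustration that assumes the two dumps have already been produced by running pg_dump against the two databases (the dump texts and file names below are stand-ins):

```python
import difflib

def dump_diff(dump_a, dump_b,
              name_a="regression.sql", name_b="deparsed.sql"):
    """Unified diff of two pg_dump outputs given as text.

    An empty result means the two databases dumped identically, i.e.
    the deparsed commands reproduced every object.
    """
    return "".join(difflib.unified_diff(
        dump_a.splitlines(keepends=True),
        dump_b.splitlines(keepends=True),
        fromfile=name_a, tofile=name_b))

# Stand-in dump texts; in reality these come from running pg_dump
# against the regression database and the deparse-replay database.
a = "CREATE TABLE t1 (a integer);\nCREATE INDEX i1 ON t1 (a);\n"
b = "CREATE TABLE t1 (a integer);\n"   # the replayed DB lost the index

report = dump_diff(a, b)
if report:
    print("deparse check FAILED:")
    print(report)
```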

Proof-of-concept


We have now implemented this as a proof-of-concept; the code is available
in the deparse branch at:

  http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git

a diff is attached for reference, but relies on the deparsing functionality
available in the deparse branch.

[HACKERS] Testing RLS with SECURITY DEFINER functions returning refcursors

2013-10-23 Thread Craig Ringer
Hi all

I'm going to be contributing a fair bit of time to RLS for 2ndQuadrant,
courtesy of the EU AXLE project (http://axleproject.eu/).

I've been catching up on Kohei KaiGai's work and I've been really
impressed by what's already done and working. I'm currently reading the
patches, mailing list discussion, and the tests.

Prompted by mailing list discussion on the topic, I added some tests to
the 9.4 v4 RLS patch to check behaviour around SECURITY DEFINER
functions and found a bit of an issue.

If one non-superuser user2 with limited row access creates a SECURITY
DEFINER function that returns a refcursor, and the other user user1
fetches from the cursor, the returned result set is what user1 sees when
selecting the table directly, not what user2 sees when it selects the
table. That is inconsistent with how SECURITY DEFINER behaves for other
rights. It's also inconsistent with a superuser-owned SECURITY DEFINER
function, where RLS doesn't add the predicate at all so all rows are
returned.

Another issue occurs when the superuser invokes the SECURITY DEFINER
function created by user2. There are no rows returned from a fetch of
the refcursor. This makes sense given that in the test the RLS condition
is (dauthor = current_user()) and there are no rows with dauthor set
to the superuser's username.

This asymmetry is a bug. Either RLS should be applied consistently for
the definer, or consistently as the caller. Currently it's the caller
unless the definer is superuser, in which case no checks are applied
because the RLS predicate never gets applied.

I'm doing these tests on top of the tables defined by the rowsecurity
test suite in the patch.

On a side note, I also noticed that while psql's \dt+ supports RLS, \d
or \d+ doesn't provide any indication that there's an RLS policy or what
the conditions are.

Anyway - the additional tests are attached, and can also be found in
https://github.com/ringerc/postgres/tree/rls-9.4 along with a patched
expected file showing what I think _should_ be happening. Comments would
be appreciated.

I'm also interested in more details on the mention of functions that
change the current user ID during a query that came up in prior RLS
discussion.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
new file mode 100644
index 55d4aad..d9a827f
*** a/src/test/regress/sql/rowsecurity.sql
--- b/src/test/regress/sql/rowsecurity.sql
*** SET SESSION AUTHORIZATION rls_regress_us
*** 277,282 
--- 277,352 
  EXPLAIN (costs off) DELETE FROM only t1 WHERE f_leak(b);
  EXPLAIN (costs off) DELETE FROM t1 WHERE f_leak(b);
  
+ 
+ 
+ -- Check refcursors returned from PL/PgSQL SECURITY DEFINER functions
+ 
+ RESET SESSION AUTHORIZATION;
+ 
+ CREATE OR REPLACE FUNCTION return_refcursor_assuper() RETURNS refcursor AS $$
+ DECLARE
+   curs1 refcursor;
+ BEGIN
+   curs1 = 'super_cursor';
+   OPEN curs1 FOR SELECT * FROM document;
+   RETURN curs1;
+ END;
+ $$
+ LANGUAGE plpgsql
+ SECURITY DEFINER;
+ 
+ -- Run the function entirely as rls_regress_user1
+ SET SESSION AUTHORIZATION rls_regress_user1;
+ BEGIN;
+ SELECT return_refcursor_assuper();
+ -- This fetch should return the full results, even though we are now
+ -- running as a user with much lower access according to the current
+ -- RLS policy.
+ FETCH ALL FROM super_cursor;
+ -- But this should still return the usual result set
+ SELECT * FROM document;
+ ROLLBACK;
+ 
+ -- Do the same check where we return a refcursor from one RLS-affected
+ -- user to another RLS-affected user.
+ 
+ SET SESSION AUTHORIZATION rls_regress_user2;
+ 
+ CREATE OR REPLACE FUNCTION return_refcursor_asuser2() RETURNS refcursor AS $$
+ DECLARE
+   curs1 refcursor;
+ BEGIN
+   curs1 = 'user2_cursor';
+   OPEN curs1 FOR SELECT * FROM document;
+   RETURN curs1;
+ END;
+ $$
+ LANGUAGE plpgsql
+ SECURITY DEFINER;
+ 
+ BEGIN;
+ SET SESSION AUTHORIZATION rls_regress_user1;
+ SELECT return_refcursor_asuser2();
+ -- Even though we're user1, we should see only user2's results from this.
+ -- This FAILS, returning user1's results.
+ FETCH ALL FROM user2_cursor;
+ -- but user1's results for this
+ SELECT * FROM document;
+ ROLLBACK;
+ 
+ -- Now as the superuser, see if the SECURITY DEFINER on an RLS-affected
+ -- user filters the rows the superuser sees. It should, for consistency.
+ 
+ BEGIN;
+ RESET SESSION AUTHORIZATION;
+ SELECT return_refcursor_asuser2();
+ -- Should see user2's results, but FAILS, instead returning an empty result set (!)
+ FETCH ALL FROM user2_cursor;
+ -- Should see superuser's results
+ SELECT * FROM document;
+ ROLLBACK;
+ 
+ 
  DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
  DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;
  


[HACKERS] Testing Cascading Replication

2013-06-26 Thread Josh Berkus
Folks,

Wanted to give you the below testing emails from DHAVAL JAISWAL.  He's
been testing 9.3's streaming-only cascading replication, and so far it
works as advertised.  What he found in his tests was:

a) he could not remaster to a former replica which was behind the replica
he was trying to remaster

b) when servers were correctly caught up, remastering worked correctly

So, all good so far.

Text follows

==

TEST 1: remastering failure due to picking the wrong replica

I have tested the below scenario of cascading replication for the PostgreSQL 9.3
beta version.

          A
         / \
        B   E
       / \
      C   D

  1) A is the master,
     B & E are pointing to A,
     C & D are pointing to B.

Tested scenarios are as follows:

a) When (A) failed, we were able to promote B or E as the master, and as
usual C & D would continue to talk with B if we promoted B as the master.
If we promoted E as the master, I changed the recovery.conf of C & D,
replacing the port and IP so they point to E. After restarting C & D,
they started to talk with E.


   b) When (B) failed, I changed the recovery.conf of C & D, replacing
the port and IP so they point to E. After restarting C & D, they
started to talk with E. In the end A is the master, E is pointing to
A, and C & D are pointing to E.



Now, in scenario a), when we promote B as the master on failure of A,
C & D continue to talk with B. However, when I change the
recovery.conf of E, replacing the port and IP with B's, it throws the
following errors:


  cp: cannot stat `/usr/local/arch/0002.history': No such file or
directory

cp: cannot stat `/usr/local/arch/0003.history': No such file or
directory

LOG: entering standby mode

cp: cannot stat `/usr/local/arch/0002.history': No such file or
directory

cp: cannot stat `/usr/local/arch/00020027': No such file or
directory

cp: cannot stat `/usr/local/arch/00010027': No such file or
directory

cp: cannot stat `/usr/local/arch/0002.history': No such file or
directory

FATAL: requested timeline 2 is not a child of this server's history

DETAIL: Latest checkpoint is at 0/272DE57C on timeline 1, but in the
history of the requested timeline, the server forked off from that timeline
at 0/272DC548

LOG: startup process (PID 6155) exited with exit code 1

LOG: aborting startup due to startup process failure

==

TEST 2: Remastering success

 Structure would be:

             A (Master)
            / \
  (Slave1) B   E (Slave2)
          / \
 (Slave3) C   D (Slave4)


 (1) Stopped node (A).


 (2) Following are the snaps of slave1 & slave2 after stopping node (A):

slave 1

postgres=# select pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp
-----------------------------------
 2013-06-26 12:13:54.056954+05:30    -- timing
(1 row)

postgres=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location
-------------------------------
 0/3E84                              -- received WAL location
(1 row)



slave 2

postgres=# select pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp
-----------------------------------
 2013-06-26 12:13:54.056954+05:30    -- timing
(1 row)

postgres=# select pg_last_xlog_receive_location();
 pg_last_xlog_receive_location
-------------------------------
 0/3E84                              -- received WAL location
(1 row)




(3) Following are the logs on slave1 while node (A) was stopped:

FATAL:  could not connect to the primary server: could not connect to
server: Connection refused
Is the server running on host 127.0.0.1 and accepting
TCP/IP connections on port 5432?



(4) Following are the logs on slave2 while node (A) was stopped:

FATAL:  could not connect to the primary server: could not connect to
server: Connection refused
Is the server running on host 127.0.0.1 and accepting
TCP/IP connections on port 5432?




(5) Below are the logs of slave1, when slave1 was promoted as the master:

LOG:  received promote request
LOG:  redo done at 0/3E24
LOG:  selected new timeline ID: 2
LOG:  archive recovery complete
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started



(6) Below are the logs after changing the recovery.conf of slave2 so that
it points to slave1, and restarting:

LOG:  database system was shut down in recovery at 2013-06-26 12:28:49 IST
LOG:  entering standby mode
LOG:  consistent recovery state reached at 0/3E84
LOG:  invalid record length at 0/3E84
LOG:  database system is ready to accept read only connections
LOG:  fetching timeline history file for 

Re: [HACKERS] Testing of various opclasses for ranges

2012-07-10 Thread Heikki Linnakangas

On 10.07.2012 02:33, Alexander Korotkov wrote:

Hackers,

I've tested various opclasses for ranges (including currently in-core one
and my patches). I've looked into scholar papers for which datasets they
are using for testing. The lists below show kinds of datasets used in
papers.


Great! That's a pretty comprehensive suite of datasets.


I've merged all 3 patches into 1 (see 2d_map_range_indexing.patch). In this
patch following opclasses are available for ranges:
1) range_ops - currently in-core GiST opclass
2) range_ops2 - GiST opclass based on 2d-mapping
3) range_ops_quad - SP-GiST quad tree based opclass
4) range_ops_kd - SP-GiST k-d tree based opclass


I think the ultimate question is, which ones of these should we include 
in core? We cannot drop the existing range_ops opclass, if only because 
that would break pg_upgrade. However, range_ops2 seems superior to it, 
so I think we should make that the default for new indexes.


For SP-GiST, I don't think we need to include both quad and k-d tree 
implementations. They have quite similar characteristics, so IMHO we 
should just pick one. Which one would you prefer? Is there any 
difference in terms of code complexity between them? Looking at the 
performance test results, quad tree seems to be somewhat slower to 
build, but is faster to query. Based on that, I think we should pick the 
quad tree, query performance seems more important.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Testing of various opclasses for ranges

2012-07-10 Thread Alexander Korotkov
On Tue, Jul 10, 2012 at 1:38 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 I think the ultimate question is, which ones of these should we include in
 core? We cannot drop the existing range_ops opclass, if only because that
 would break pg_upgrade. However, range_ops2 seems superior to it, so I
 think we should make that the default for new indexes.


Actually, I'm not fully satisfied with range_ops2. I expected it could be
recommended for all cases, but in fact it builds significantly slower and
sometimes requires more pages for a search. Likely, we will have to write
some recommendation in the docs about which opclass to use in particular
cases. Additionally, somebody could find GiST range indexing becoming tangled.

For SP-GiST, I don't think we need to include both quad and k-d tree
 implementations. They have quite similar characteristics, so IMHO we should
 just pick one. Which one would you prefer? Is there any difference in terms
 of code complexity between them? Looking at the performance test results,
 quad tree seems to be somewhat slower to build, but is faster to query.
 Based on that, I think we should pick the quad tree, query performance
 seems more important.


Agreed, I think we should stay with the quad tree implementation.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-20 Thread James Cloos
Updating pg_database to set datctype='C' did solve the speed issues with
the two large dbs.

Presumably, since LC_CTYPE=en_US.UTF-8 was in the env when I ran pg_restore,
it overrode the ctype setting in the dump files.

Some of the slow selects do use ilike; even w/ datctype='C' the indices
are skipped for at least this query:

# explain analyze SELECT mb_id FROM mb WHERE name ILIKE 'foo@bar' AND ownerid=7;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Seq Scan on mb  (cost=0.00..570.96 rows=3 width=4) (actual time=9.443..25.039 rows=1 loops=1)
   Filter: ((name ~~* 'foo@bar'::text) AND (ownerid = 7))
   Rows Removed by Filter: 34827
 Total runtime: 25.071 ms
(4 rows)

The mb table has several indices, including separate ones on name and ownerid.

(not my design, btw.  And I really do need to re-write the middleware)

Whether it is strcoll(3) (even though LC_COLLATE is explicitly C) or
LIKE, it does make a significant difference for those two apps.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-20 Thread James Cloos
As a followup, I find that I can avoid the seq scan by adding an index
to that table as:

   create index mb_name_own_idx on mb ( lower(name), ownerid );

and changing the query from using the idiom:

  WHERE name ILIKE 'foo@bar' AND ownerid=7;

to using:

  WHERE lower(name) = lower('foo@bar') AND ownerid=7;

which saves 20+ ms on each of the 30+ k such selects in a full run.

I haven't tested how fast it would be with that change and a utf8 ctype.

Because of how the middleware achieves its portability between pg, mysql
et al., changing it to use lower() and = will require significant surgery.

Is there any way to specify the index such that the ILIKE query will use
said index?

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-20 Thread Kevin Grittner
James Cloos cl...@jhcloos.com wrote:
 
create index mb_name_own_idx on mb ( lower(name), ownerid );
 
   WHERE lower(name) = lower('foo@bar') AND ownerid=7;
 
If you expect to be using an equality test on ownerid, you should
put that first in the index.
 
BTW, this is starting to sound more like something for the
pgsql-performance list than the pgsql-hackers list.
 
-Kevin



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-19 Thread Peter Eisentraut
On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote:
  JB == Josh Berkus j...@agliodbs.com writes:
 
 JB Can you check the collations of the two databases?  I'm wondering if 9.1
 JB is in C collation and 9.2 is something else.
 
 Thanks!
 
 pg_dump -C tells me these two differences:
 
  -SET client_encoding = 'SQL_ASCII';
  +SET client_encoding = 'UTF8';
 
  -CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' 
 LC_COLLATE = 'C' LC_CTYPE = 'C';
  +CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE 
 = 'C' LC_CTYPE = 'en_US.UTF-8';
 
 for every db in the clusters.
 
 I presume that lc_ctype is the significant difference?

It certainly makes some difference, but it's a bit shocking that it makes
things that much slower.

 LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2
 cluster, so it must have been overridden by pg_restore.  I see that my
 dist's /etc rc script now sets LC_CTYPE.  Would that explain why lc_ctype
 changed between the two clusters?

It's possible, depending on how exactly the start up script maze is set
up on your particular OS.

 Is there any way to alter a db's lc_ctype w/o dumping and restoring?  I
 want to preserve some of the changes made since I copied the 9.1 cluster.
 Alter database reports that lc_ctype cannot be changed.

Not really, but in practice you can probably just update pg_database
directly.  If you don't have any case-insensitive indexes, nothing
should change.  Worst case, reindex everything.





Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-19 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote:
 I presume that lc_ctype is the significant difference?

 It certainly makes some difference, but it's a bit shocking that it makes
 things that much slower.

If James is testing text-comparison-heavy operations, it doesn't seem
shocking in the least.  strcoll() in most non-C locales is a pig.

 LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2
 cluster, so it must have been overridden by pg_restore.  I see that my
 dist's /etc rc script now sets LC_CTYPE.  Would that explain why lc_ctype
 changed between the two clusters?

 It's possible, depending on how exactly the start up script maze is set
 up on your particular OS.

pg_dumpall should generate a script that correctly restores database
locales.  However, pg_dump+pg_restore is dependent on user creation
of the specific database, which is likely to be environment sensitive.
We really oughta do something about that ...

regards, tom lane



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-19 Thread Peter Eisentraut
On tis, 2012-06-19 at 02:38 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote:
  I presume that lc_ctype is the significant difference?
 
  It certainly makes some difference, but it's a bit shocking that it
  makes things that much slower.
 
 If James is testing text-comparison-heavy operations, it doesn't seem
 shocking in the least.  strcoll() in most non-C locales is a pig. 

Ah yes, of course, having lc_ctype != C also selects strcoll instead of
strcmp.




Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-19 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On tis, 2012-06-19 at 02:38 -0400, Tom Lane wrote:
 If James is testing text-comparison-heavy operations, it doesn't seem
 shocking in the least.  strcoll() in most non-C locales is a pig. 

 Ah yes, of course, having lc_ctype != C also selects strcoll instead of
 strcmp.

Come to think of it, another possible factor is that LIKE can't use
ordinary indexes on text if the locale isn't C.

regards, tom lane



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-19 Thread Peter Eisentraut
On tis, 2012-06-19 at 09:33 -0400, Tom Lane wrote:
 Come to think of it, another possible factor is that LIKE can't use
 ordinary indexes on text if the locale isn't C.

But he reported that the plans are the same.




Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-18 Thread Josh Berkus

 PE Compare the output of pg_config --configure from both installations.
 
 The only differences are 9.1 vs 9.2 in the paths.

Can you check the collations of the two databases?  I'm wondering if 9.1
is in C collation and 9.2 is something else.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com





Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-18 Thread James Cloos
 JB == Josh Berkus j...@agliodbs.com writes:

JB Can you check the collations of the two databases?  I'm wondering if 9.1
JB is in C collation and 9.2 is something else.

Thanks!

pg_dump -C tells me these two differences:

 -SET client_encoding = 'SQL_ASCII';
 +SET client_encoding = 'UTF8';

 -CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' 
LC_COLLATE = 'C' LC_CTYPE = 'C';
 +CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 
'C' LC_CTYPE = 'en_US.UTF-8';

for every db in the clusters.

I presume that lc_ctype is the significant difference?

LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2
cluster, so it must have been overridden by pg_restore.  I see that my
dist's /etc rc script now sets LC_CTYPE.  Would that explain why lc_ctype
changed between the two clusters?

Is there any way to alter a db's lc_ctype w/o dumping and restoring?  I
want to preserve some of the changes made since I copied the 9.1 cluster.
Alter database reports that lc_ctype cannot be changed.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



[HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
I'm giving 9.2-beta2 a test simulating a production workflow.

Everything looks OK except the speed.  Most (all?) queries take about
five to six times as long as they do with 9.1.

The configurations are essentially the same, the query plans are the same.

A (hot) example, pulled semi-randomly from a run, with the names mangled
to protect the innocent:

= 9.1 =
 Nested Loop  (cost=0.00..26.92 rows=1 width=28) (actual time=0.114..0.514 rows=19 loops=1)
   ->  Index Scan using ms_pkey on ms msg  (cost=0.00..26.03 rows=1 width=20) (actual time=0.026..0.207 rows=19 loops=1)
         Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222))
         Filter: (status = ANY ('{0,1,2}'::integer[]))
   ->  Index Scan using ph_pkey on ph pm  (cost=0.00..0.87 rows=1 width=16) (actual time=0.010..0.010 rows=1 loops=19)
         Index Cond: (id = msg.ph_id)
 Total runtime: 0.605 ms

= 9.2 =
 Nested Loop  (cost=0.00..30.12 rows=1 width=28) (actual time=0.439..2.540 rows=19 loops=1)
   ->  Index Scan using ms_pkey on ms msg  (cost=0.00..29.18 rows=1 width=20) (actual time=0.155..1.157 rows=19 loops=1)
         Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222))
         Filter: (status = ANY ('{0,1,2}'::integer[]))
   ->  Index Scan using ph_pkey on ph pm  (cost=0.00..0.93 rows=1 width=16) (actual time=0.053..0.054 rows=1 loops=19)
         Index Cond: (id = msg.ph_id)
 Total runtime: 2.752 ms

All of the tables and indices for the run in question fit into ram.  The
effective cache, work mem, costs, etc were optimized in 9.0, and kept
for 9.1 and the beta.  That the plans are the same suggests that isn't
the problem, yes?
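Identical plans can still differ in per-node timing and buffer traffic, so one way to dig further is to capture EXPLAIN (ANALYZE, BUFFERS) output from both servers and diff it. A sketch: the ports 5491/5492 for the two clusters are assumptions, the query is a placeholder, and the commands are only echoed here.

```shell
# Run the same query under EXPLAIN (ANALYZE, BUFFERS) on both servers and compare.
QUERY='EXPLAIN (ANALYZE, BUFFERS) SELECT 1'   # substitute the real slow query
for port in 5491 5492; do
  echo "psql -p $port -c \"$QUERY\" > plan-$port.txt"
done
echo "diff plan-5491.txt plan-5492.txt"
```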

I think I recall mention from a previous beta (but goog isn't helping me
confirm) that there is some extra debugging or such enabled in the betas.

If so, and if turning that off would provide a better comparison, where
in the src should I look?

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread Andres Freund
Hi,

On Monday, June 18, 2012 12:51:51 AM James Cloos wrote:
 I'm giving 9.2-beta2 a test simulating a production workflow.
 
 Everything looks OK except the speed.  Most (all?) queries take about
 five to six times as long as they do with 9.1.
 
 The configurations are essentially the same, the query plans are the same.
Is it possible that you compiled with assertions enabled? That would roughly 
fit that magnitude. SHOW debug_assertions; should show you whether it was 
enabled.
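A quick way to check both the running server and the installed binaries (a sketch; the sample configure line is made up for illustration):

```shell
# On a live server:  psql -Atc "SHOW debug_assertions"   -> on / off
# Offline, an assert-enabled build shows --enable-cassert in its configure line:
configure_line="'--prefix=/usr/lib64/postgresql-9.2' '--enable-nls'"   # sample pg_config --configure output
case "$configure_line" in
  *--enable-cassert*) verdict="assertions enabled" ;;
  *)                  verdict="assertions disabled" ;;
esac
echo "$verdict"
# -> assertions disabled
```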

Greetings,

Andres



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread Peter Eisentraut
On sön, 2012-06-17 at 18:51 -0400, James Cloos wrote:
 I think I recall mention from a previous beta (but goog isn't helping
 me confirm) that there is some extra debugging or such enabled in the
 betas.

That depends on how you built it.  Just being a beta by itself doesn't
turn on any extra debugging.
 
 If so, and if turning that off would provide a better comparison,
 where in the src should I look?

Compare the output of pg_config --configure from both installations.




Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
 AF == Andres Freund and...@2ndquadrant.com writes:

AF Is it possible that you compiled with assertions enabled? That would 
roughly 
AF fit that magnitude. SHOW debug_assertions; Should show you whether it was 
AF enabled.

Thanks, but SHOW debug_assertions reports off.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
 PE == Peter Eisentraut pete...@gmx.net writes:

PE That depends on how you built it.  Just being a beta by itself doesn't
PE turn on any extra debugging.

OK.  So either I misremembered or it was something no longer done.

PE That depends on how you built it.

It's a Gentoo box; both were built from their ebuilds, with the same gcc,
flags, etc.

PE Compare the output of pg_config --configure from both installations.

The only differences are 9.1 vs 9.2 in the paths.

Thanks,

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] testing ProcArrayLock patches

2011-11-22 Thread Pavan Deolasee
On Tue, Nov 22, 2011 at 4:40 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Pavan Deolasee pavan.deola...@gmail.com wrote:

 It will be a great help if you could spare a few minutes to also
 test the patch that moves the frequently accessed PGPROC members
 out to a separate array. We are seeing good improvements on the HPUX
 IA platform and the AMD Opteron, and it will be interesting to know
 what happens on the Intel platform too.

 For a read only comparison (which was run using the simple
 protocol), using identical settings to the previous master run, but
 with the PGPROC split patch:

 m32 tps = 201738.209348 (including connections establishing)
 p32 tps = 201620.966988 (including connections establishing)

 m128 tps = 352159.631878 (including connections establishing)
 p128 tps = 363998.703900 (including connections establishing)

 Clearly a win at 128 clients; not at 32.

 For updates:

 sm32 tps = 27392.393850 (including connections establishing)
 sp32 tps = 27995.784333 (including connections establishing)

 sm128 tps = 22261.902571 (including connections establishing)
 sp128 tps = 23690.408272 (including connections establishing)

 pm32 tps = 34983.352396 (including connections establishing)
 pp32 tps = 36076.373389 (including connections establishing)

 pm128 tps = 24164.441954 (including connections establishing)
 pp128 tps = 27070.824588 (including connections establishing)

 That's a pretty decisive win all around.


Thanks for running those tests. The numbers are not that bad, but
definitely not as good as we saw on some other platforms. But it's
possible that they may improve in percentage terms with an even larger
number of clients on this box. And given that we are seeing big gains
on other platforms, hopefully that will give us confidence to proceed
with the patch.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] testing ProcArrayLock patches

2011-11-22 Thread Kevin Grittner
Pavan Deolasee pavan.deola...@gmail.com wrote:
 
 The numbers are not that bad, but definitely not as good as we saw
 on some other platforms.
 
Well, this machine is definitely designed to hold up under high
concurrency.  As I understand it, each core is the memory manager
for two 4GB DIMMs, with two channels to them, each with two buffers.
The way the cores are connected, a core never needs to go through
more than one other core to get to memory not directly managed, and
that uses snoop technology which hands the cached data right over
from one core to the other when possible, rather than making the
core which now owns the cache line pull it from RAM.  It seems the
2.6.32 kernel is able to manage that technology in a reasonable
fashion.
 
At first I was surprised to see performance top out on the update
tests between 80 and 96 clients.  But then, that lands almost
exactly where my old reliable ((2 * core count) + effective spindle
count) would predict.  The SELECT only tests peaked at 64 clients,
but those were fully cached, so effective spindle count was zero,
again fitting the formula.  So these optimizations seem to me to
break down the barriers which had previously capped the number of
clients which could be handled, letting them peak at their natural
levels.
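As a quick sanity check of that rule of thumb (the box has 32 cores per the thread; the 16-spindle figure below is an assumption, since the array size isn't stated here):

```shell
# Kevin's sizing rule: pool size ~= (2 * core_count) + effective_spindle_count.
cores=32
spindles=16    # assumed; not stated in this message
pool=$(( 2 * cores + spindles ))
echo "$pool"   # prints 80, inside the observed 80-96 peak
```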
 
 But it's possible that they may improve in percentage terms with
 an even larger number of clients on this box.
 
I think so; I think this box is just so scalable that at 128
clients we were just barely getting past the knee in the
performance graphs to where these patches help most.
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-21 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I can run one more set of tests tonight before I have to give it
 back to the guy who's putting it into production.  It sounds like
 a set like the above except with synchronous_commit = off might be
 desirable?
 
OK, that's what I did.  This gave me my best numbers yet for an
updating run of pgbench: tps = 38039.724212 for prepared statements
using the flexlock patch.  This patch is a clear win when you get to
16 clients or more.
 
sm1 tps = 1312.501168 (including connections establishing)
sf1 tps = 1376.678293 (including connections establishing)
sm2 tps = 2705.571856 (including connections establishing)
sf2 tps = 2689.577938 (including connections establishing)
sm4 tps = 5461.403557 (including connections establishing)
sf4 tps = 5447.363103 (including connections establishing)
sm8 tps = 10524.695338 (including connections establishing)
sf8 tps = 10448.012069 (including connections establishing)
sm16 tps = 18952.968472 (including connections establishing)
sf16 tps = 18969.505631 (including connections establishing)
sm32 tps = 27392.393850 (including connections establishing)
sf32 tps = 29225.974112 (including connections establishing)
sm64 tps = 28947.675549 (including connections establishing)
sf64 tps = 31417.536816 (including connections establishing)
sm80 tps = 28053.684182 (including connections establishing)
sf80 tps = 29970.555401 (including connections establishing)
sm96 tps = 25885.679957 (including connections establishing)
sf96 tps = 28581.271436 (including connections establishing)
sm128 tps = 22261.902571 (including connections establishing)
sf128 tps = 24537.566960 (including connections establishing)
 
pm1 tps = 2082.958841 (including connections establishing)
pf1 tps = 2052.328339 (including connections establishing)
pm2 tps = 4287.257860 (including connections establishing)
pf2 tps = 4228.770795 (including connections establishing)
pm4 tps = 8653.196863 (including connections establishing)
pf4 tps = 8592.091631 (including connections establishing)
pm8 tps = 16071.432101 (including connections establishing)
pf8 tps = 16196.992207 (including connections establishing)
pm16 tps = 27146.441216 (including connections establishing)
pf16 tps = 27441.966562 (including connections establishing)
pm32 tps = 34983.352396 (including connections establishing)
pf32 tps = 38039.724212 (including connections establishing)
pm64 tps = 33182.643501 (including connections establishing)
pf64 tps = 34193.732669 (including connections establishing)
pm80 tps = 30686.712607 (including connections establishing)
pf80 tps = 6.011769 (including connections establishing)
pm96 tps = 24692.015615 (including connections establishing)
pf96 tps = 32907.472665 (including connections establishing)
pm128 tps = 24164.441954 (including connections establishing)
pf128 tps = 25742.670928 (including connections establishing)
 
At lower client numbers the tps values within each set of five
samples were very tightly grouped.  With either protocol, and
whether or not the patch was applied, the higher concurrency groups
tended to be bifurcated within a set of five samples between good
and bad numbers.  The patch seemed to increase the number of
clients which could be handled without collapse into the bad
numbers.  It really looks like there's some sort of performance
collapse at higher concurrency which may or may not happen in any
particular five minute run.  Just as one example, running the simple
protocol with the flexlock patch:
 
tps = 24491.653873 (including connections establishing)
tps = 24537.566960 (including connections establishing)
tps = 28462.276323 (including connections establishing)
tps = 24403.373002 (including connections establishing)
tps = 28458.902549 (including connections establishing)
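The tables in these messages each report the median of five such samples; a minimal sketch of that reduction in shell (the sf128 figure above is the median of exactly these five runs):

```shell
# Median of five tps samples: sort numerically, take the third line.
median5() { printf '%s\n' "$@" | sort -n | sed -n '3p'; }
median5 24491.653873 24537.566960 28462.276323 24403.373002 28458.902549
# -> 24537.566960
```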

-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-21 Thread Pavan Deolasee
On Mon, Nov 21, 2011 at 10:44 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 I can run one more set of tests tonight before I have to give it
 back to the guy who's putting it into production.  It sounds like
 a set like the above except with synchronous_commit = off might be
 desirable?

 OK, that's what I did.  This gave me my best numbers yet for an
 updating run of pgbench: tps = 38039.724212 for prepared statements
 using the flexlock patch.  This patch is a clear win when you get to
 16 clients or more.


It will be a great help if you could spare a few minutes to also test
the patch that moves the frequently accessed PGPROC members out to a
separate array. We are seeing good improvements on the HPUX IA platform
and the AMD Opteron, and it will be interesting to know what happens on
the Intel platform too.

http://archives.postgresql.org/message-id/4eb7c4c9.9070...@enterprisedb.com

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] testing ProcArrayLock patches

2011-11-21 Thread Kevin Grittner
Pavan Deolasee pavan.deola...@gmail.com wrote:
 
 It will be a great help if you could spare a few minutes to also
 test the patch that moves the frequently accessed PGPROC members
 out to a separate array. We are seeing good improvements on the HPUX
 IA platform and the AMD Opteron, and it will be interesting to know
 what happens on the Intel platform too.
 

http://archives.postgresql.org/message-id/4eb7c4c9.9070...@enterprisedb.com
 
It's going to be hard to arrange more of the 20-hour runs I've been
doing, but I can work in some more abbreviated tests.  What would be
the best test for this?  (I would hate to try and find out I didn't
exercise the right code path.)
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-21 Thread Pavan Deolasee
On Mon, Nov 21, 2011 at 11:01 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Pavan Deolasee pavan.deola...@gmail.com wrote:

 It will be a great help if you could spare a few minutes to also
 test the patch that moves the frequently accessed PGPROC members
 out to a separate array. We are seeing good improvements on the HPUX
 IA platform and the AMD Opteron, and it will be interesting to know
 what happens on the Intel platform too.


 http://archives.postgresql.org/message-id/4eb7c4c9.9070...@enterprisedb.com

 It's going to be hard to arrange more of the 20-hour runs I've been
 doing, but I can work in some more abbreviated tests.  What would be
 the best test for this?  (I would hate to try and find out I didn't
 exercise the right code path.)


I think 2-3 runs with 32 and 128 clients each with prepared statements
should suffice to quickly compare with the other numbers you posted
for the master.
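Sketched as a script (the 300 s duration is an assumption carried over from the earlier runs; the commands are only echoed here, not executed):

```shell
# Abbreviated comparison: 3 runs each at 32 and 128 clients, prepared protocol.
for clients in 32 128; do
  for run in 1 2 3; do
    echo "pgbench -M prepared -T 300 -c $clients -j $clients"
  done
done
```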

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] testing ProcArrayLock patches

2011-11-21 Thread Kevin Grittner
Pavan Deolasee pavan.deola...@gmail.com wrote:
 
 It will be a great help if you could spare a few minutes to also
 test the patch that moves the frequently accessed PGPROC members
 out to a separate array. We are seeing good improvements on the HPUX
 IA platform and the AMD Opteron, and it will be interesting to know
 what happens on the Intel platform too.
 
For a read only comparison (which was run using the simple
protocol), using identical settings to the previous master run, but
with the PGPROC split patch:
 
m32 tps = 201738.209348 (including connections establishing)
p32 tps = 201620.966988 (including connections establishing)

m128 tps = 352159.631878 (including connections establishing)
p128 tps = 363998.703900 (including connections establishing)
 
Clearly a win at 128 clients; not at 32.
 
For updates:
 
sm32 tps = 27392.393850 (including connections establishing)
sp32 tps = 27995.784333 (including connections establishing) 
 
sm128 tps = 22261.902571 (including connections establishing)
sp128 tps = 23690.408272 (including connections establishing)
 
pm32 tps = 34983.352396 (including connections establishing)
pp32 tps = 36076.373389 (including connections establishing)
 
pm128 tps = 24164.441954 (including connections establishing)
pp128 tps = 27070.824588 (including connections establishing)
 
That's a pretty decisive win all around.
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Hmm.  There's obviously something that's different in your
 environment or configuration from what I tested, but I don't know
 what it is. The fact that your scale factor is larger than
 shared_buffers might matter; or Intel vs. AMD.  Or maybe you're
 running with synchronous_commit=on?
 
Yes, I had synchronous_commit = on for these runs.  Here are the
settings:
 
cat > $PGDATA/postgresql.conf <<EOM;
max_connections = 200
max_pred_locks_per_transaction = 256
shared_buffers = 10GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
EOM
 
Is there any chance that having pg_xlog on a separate RAID 10 set of
drives with its own BBU controller would explain anything?  I mean,
I always knew that was a good idea for a big, heavily-loaded box,
but I remember being surprised at how *big* a difference that made
when a box accidentally went into production without moving the
pg_xlog directory there.
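For anyone replaying this setup, the usual relocation looks like the sketch below. The paths are hypothetical, the server must be stopped first, and the commands are only assembled here, not run.

```shell
# Move pg_xlog onto its own BBU-backed volume and leave a symlink behind.
PGDATA=/var/lib/pgsql/data        # hypothetical data directory
WALVOL=/mnt/wal                   # separate RAID 10 volume
steps="pg_ctl -D $PGDATA stop
mv $PGDATA/pg_xlog $WALVOL/pg_xlog
ln -s $WALVOL/pg_xlog $PGDATA/pg_xlog
pg_ctl -D $PGDATA start"
printf '%s\n' "$steps"
```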

There is one other thing that might matter: I didn't use the -n
pgbench option, and in the sample you showed, you were using it.
 
Here is the median of five from the latest runs.  On these
read/write tests there was very little spread within each set of
five samples, with no extreme outliers like I had on the SELECT-only
tests.  In the first position s means simple protocol and p means
prepared protocol.  In the second position m means master, f means
with the flexlock patch.
 
sm1 tps = 1092.269228 (including connections establishing)
sf1 tps = 1090.511552 (including connections establishing)
sm2 tps = 2171.867100 (including connections establishing)
sf2 tps = 2158.609189 (including connections establishing)
sm4 tps = 4278.541453 (including connections establishing)
sf4 tps = 4269.921594 (including connections establishing)
sm8 tps = 8472.257182 (including connections establishing)
sf8 tps = 8476.150588 (including connections establishing)
sm16 tps = 15905.074160 (including connections establishing)
sf16 tps = 15937.372689 (including connections establishing)
sm32 tps = 22331.817413 (including connections establishing)
sf32 tps = 22861.258757 (including connections establishing)
sm64 tps = 26388.391614 (including connections establishing)
sf64 tps = 26529.152361 (including connections establishing)
sm80 tps = 25617.651194 (including connections establishing)
sf80 tps = 26560.541237 (including connections establishing)
sm96 tps = 24105.455175 (including connections establishing)
sf96 tps = 26569.244384 (including connections establishing)
sm128 tps = 21467.530210 (including connections establishing)
sf128 tps = 25883.023093 (including connections establishing)
 
pm1 tps = 1629.265970 (including connections establishing)
pf1 tps = 1619.024905 (including connections establishing)
pm2 tps = 3164.061963 (including connections establishing)
pf2 tps = 3137.469377 (including connections establishing)
pm4 tps = 6114.787505 (including connections establishing)
pf4 tps = 6061.750200 (including connections establishing)
pm8 tps = 11884.534375 (including connections establishing)
pf8 tps = 11870.670086 (including connections establishing)
pm16 tps = 20575.737107 (including connections establishing)
pf16 tps = 20437.648809 (including connections establishing)
pm32 tps = 27664.381103 (including connections establishing)
pf32 tps = 28046.846479 (including connections establishing)
pm64 tps = 26764.294547 (including connections establishing)
pf64 tps = 26631.589294 (including connections establishing)
pm80 tps = 27716.198263 (including connections establishing)
pf80 tps = 28393.642871 (including connections establishing)
pm96 tps = 26616.076293 (including connections establishing)
pf96 tps = 28055.921427 (including connections establishing)
pm128 tps = 23282.912620 (including connections establishing)
pf128 tps = 23072.766829 (including connections establishing)
 
Note that on this 32 core box, performance on the read/write pgbench
is peaking at 64 clients, but without a lot of variance between 32
and 96 clients.  And with the patch, performance still hasn't fallen
off too badly at 128 clients. This is good news in terms of not
having to sweat connection pool sizing quite as much as in earlier
releases.
 
Next I will get the profile for the SELECT-only runs.  It seems to
make sense to profile at the peak performance level, which was 64
clients.
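A session producing that kind of listing might look like the sketch below, using oprofile's legacy opcontrol interface; the postgres binary path is hypothetical, and the commands are only printed here, not run.

```shell
# Profile the backends during a 64-client SELECT-only pgbench run (as root).
session='opcontrol --init
opcontrol --start --no-vmlinux
pgbench -S -M simple -T 300 -c 64 -j 64
opcontrol --stop
opreport -l /usr/lib/postgresql/bin/postgres | head -30
opcontrol --shutdown'
printf '%s\n' "$session"
```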
 
I can run one more set of tests tonight before I have to give it
back to the guy who's putting it into production.  It sounds like a
set like the above except with synchronous_commit = off might be
desirable?
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I was actually thinking it would be interesting to oprofile the
 read-only test; see if we can figure out where those slowdowns are
 coming from.
 
CPU: Intel Core/i7, speed 2262 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with
a unit mask of 0x00 (No unit mask) count 10
samples   %       image name  symbol name
3124242   5.7137  postgres    s_lock
254       4.6737  postgres    AllocSetAlloc
2403412   4.3954  postgres    GetSnapshotData
1967132   3.5975  postgres    SearchCatCache
1872176   3.4239  postgres    base_yyparse
1327256   2.4273  postgres    hash_search_with_hash_value
1040131   1.9022  postgres    _bt_compare
1038976   1.9001  postgres    LWLockAcquire
817122    1.4944  postgres    MemoryContextAllocZeroAligned
738321    1.3503  postgres    core_yylex
622613    1.1386  postgres    MemoryContextAlloc
597054    1.0919  postgres    PinBuffer
556138    1.0171  postgres    ScanKeywordLookup
552318    1.0101  postgres    expression_tree_walker
494279    0.9039  postgres    LWLockRelease
488628    0.8936  postgres    hash_any
472906    0.8649  postgres    nocachegetattr
396482    0.7251  postgres    grouping_planner
382974    0.7004  postgres    LockAcquireExtended
375186    0.6861  postgres    AllocSetFree
375072    0.6859  postgres    ProcArrayLockRelease
373668    0.6834  postgres    new_list
365917    0.6692  postgres    fmgr_info_cxt_security
301398    0.5512  postgres    ProcArrayLockAcquire
300647    0.5498  postgres    LockReleaseAll
292073    0.5341  postgres    DirectFunctionCall1Coll
285745    0.5226  postgres    MemoryContextAllocZero
284684    0.5206  postgres    FunctionCall2Coll
282701    0.5170  postgres    SearchSysCache
 
max_connections = 100
max_pred_locks_per_transaction = 64
shared_buffers = 8GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
 
pgbench -i -s 100
pgbench -S -M simple -T 300 -c 80 -j 80
 
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 80
number of threads: 80
duration: 300 s
number of transactions actually processed: 104391011
tps = 347964.636256 (including connections establishing)
tps = 347976.389034 (excluding connections establishing)
 
vmstat 1 showed differently this time -- no clue why.
 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
91  0   8196 4189436 203925700 5231449200 0 0 32255 1522807 85 13  1  0  0
92  0   8196 4189404 203925700 5231449200 0 0 32796 1525463 85 14  1  0  0
67  0   8196 4189404 203925700 5231448800 0 0 32343 1527988 85 13  1  0  0
93  0   8196 4189404 203925700 5231448800 0 0 32701 1535827 85 13  1  0  0
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Nate Boley's AMD 6128 box (which has 32 cores) and an HP Integrity
 server (also with 32 cores). 
 
 [clear improvement with flexlock patch]
 
Hmm.  We have a 32-core Intel box (4 x X7560 @ 2.27GHz) with 256 GB
RAM.  It's about a week from going into production, at which point
it will be extremely hard to schedule such tests, but for a few days
more I've got shots at it.  The flexlock patch doesn't appear to be
such a clear win here.
 
I started from Robert's tests, but used these settings so that I
could go to higher client counts and better test serializable
transactions.  Everything is fully cached.
 
max_connections = 200
max_pred_locks_per_transaction = 256
shared_buffers = 8GB
maintenance_work_mem = 1GB
checkpoint_segments = 30
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
 
Serializable results not shown here -- that's to gather information
for trying to improve SSI locking.
 
m1 tps = 7847.834544 (including connections establishing)
f1 tps = 7917.225382 (including connections establishing)
m2 tps = 18672.145526 (including connections establishing)
f2 tps = 17486.435322 (including connections establishing)
m4 tps = 34371.278253 (including connections establishing)
f4 tps = 34465.898173 (including connections establishing)
m8 tps = 68228.261694 (including connections establishing)
f8 tps = 68505.285830 (including connections establishing)
m16 tps = 127449.815100 (including connections establishing)
f16 tps = 127208.939670 (including connections establishing)
m32 tps = 201738.209348 (including connections establishing)
f32 tps = 201637.237903 (including connections establishing)
m64 tps = 380326.800557 (including connections establishing)
f64 tps = 380628.429408 (including connections establishing)
m80 tps = 366628.197546 (including connections establishing)
f80 tps = 162594.012051 (including connections establishing)
m96 tps = 360922.948775 (including connections establishing)
f96 tps = 366728.987041 (including connections establishing)
m128 tps = 352159.631878 (including connections establishing)
f128 tps = 355475.129448 (including connections establishing)
 
I did five runs each and took the median.  In most cases, the values
were pretty close to one another in a group, so confidence is pretty
high that this is meaningful.  There were a few anomalies where
performance for one or more samples was horrid.  This seems
consistent with the theory of pathological pileups on the LW locks
(or also flexlocks?).
 
The problem groups:
 
m64 tps = 380407.768906 (including connections establishing)
m64 tps = 79197.470389 (including connections establishing)
m64 tps = 381112.194105 (including connections establishing)
m64 tps = 378579.036542 (including connections establishing)
m64 tps = 380326.800557 (including connections establishing)

m96 tps = 360582.945291 (including connections establishing)
m96 tps = 363021.805138 (including connections establishing)
m96 tps = 362468.870516 (including connections establishing)
m96 tps = 59614.322351 (including connections establishing)
m96 tps = 360922.948775 (including connections establishing)

f80 tps = 158905.149822 (including connections establishing)
f80 tps = 157192.460599 (including connections establishing)
f80 tps = 370757.790443 (including connections establishing)
f80 tps = 162594.012051 (including connections establishing)
f80 tps = 372170.638516 (including connections establishing)

f96 tps = 366804.733788 (including connections establishing)
f96 tps = 366728.987041 (including connections establishing)
f96 tps = 365490.380848 (including connections establishing)
f96 tps = 366770.193305 (including connections establishing)
f96 tps = 125225.371140 (including connections establishing)
 
So the lows don't seem to be as low when they happen with the
flexlock patch, but they still happen -- possibly more often?
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Robert Haas
On Fri, Nov 18, 2011 at 11:26 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Nate Boley's AMD 6128 box (which has 32 cores) and an HP Integrity
 server (also with 32 cores).

 [clear improvement with flexlock patch]

 Hmm.  We have a 32-core Intel box (4 x X7560 @ 2.27GHz) with 256 GB
 RAM.  It's about a week from going into production, at which point
 it will be extremely hard to schedule such tests, but for a few days
 more I've got shots at it.  The flexlock patch doesn't appear to be
 such a clear win here.

 I started from Robert's tests, but used these settings so that I
 could go to higher client counts and better test serializable
 transactions.  Everything is fully cached.

 max_connections = 200
 max_pred_locks_per_transaction = 256
 shared_buffers = 8GB
 maintenance_work_mem = 1GB
 checkpoint_segments = 30
 checkpoint_timeout = 15min
 checkpoint_completion_target = 0.9
 seq_page_cost = 0.1
 random_page_cost = 0.1
 cpu_tuple_cost = 0.05
 effective_cache_size = 40GB
 default_transaction_isolation = '$iso'

I had a dismaying benchmarking experience recently that involved
settings very similar to the ones you've got there - in particular, I
also had checkpoint_segments set to 30.  When I raised it to 300,
performance improved dramatically at 8 clients and above.

Then again, is this a regular pgbench test or is this SELECT-only?
Because the absolute numbers you're posting are vastly higher than
anything I've ever seen on a write test.

Can you by any chance check top or vmstat during the 32-client test
and see what percentage you have of user time/system time/idle time?

What OS are you running?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Then again, is this a regular pgbench test or is this SELECT-only?
 
SELECT-only
 
 Can you by any chance check top or vmstat during the 32-client
 test and see what percentage you have of user time/system
 time/idle time?
 
You didn't say whether you wanted master or flexlock, but it turned
out that any difference was way too far into the noise to show. 
They both looked like this:
 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
38  0  352 1157400 207177020 5236047200 0 16 13345 1190230 40  7 53  0  0
37  0  352 1157480 207177020 5236047200 0  0 12953 1263310 40  8 52  0  0
36  0  352 1157484 207177020 5236047200 0  0 13411 1233365 38  7 54  0  0
37  0  352 1157476 207177020 5236047200 0  0 12780 1193575 41  7 51  0  0
 
Keep in mind that while there are really 32 cores, the cpu
percentages seem to be based on the threads from hyperthreading. 
Top showed pgbench (running on the same machine) as eating a pretty
steady 5.2 of the cores, leaving 26.8 cores to actually drive the 32
postgres processes.
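The core accounting above can be sanity-checked with a little arithmetic; a minimal sketch (the 64-thread count and the vmstat percentages are taken from this thread, everything else is illustrative):

```python
# vmstat reports percentages per hardware thread, and this box has 32
# physical cores but 64 hyperthreads, so ~47% busy means ~30 busy threads --
# consistent with 32 client backends plus pgbench sharing the machine.
hw_threads = 64
user_pct, system_pct = 40, 7          # typical vmstat row above

busy_threads = (user_pct + system_pct) / 100 * hw_threads
print(round(busy_threads))           # ~30 busy hyperthreads
```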
 
 What OS are you running?
 
Linux new-CIR 2.6.32.43-0.4-default #1 SMP 2011-07-14 14:47:44 +0200
x86_64 x86_64 x86_64 GNU/Linux
 
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 We have a 32-core Intel box (4 x X7560 @ 2.27GHz) with 256 GB
 RAM.
 
In case anyone cares, this is the same box for which I posted STREAM
test results a while back.  The PostgreSQL tests seem to peak on
this 32-core box at 64 clients, while the STREAM test of raw RAM
speed kept increasing up to 128 clients.  Overall, though, it's
impressive how close PostgreSQL is now coming to the raw RAM access
speed curve.
 
http://archives.postgresql.org/pgsql-hackers/2011-08/msg01306.php
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Robert Haas
On Fri, Nov 18, 2011 at 12:03 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Then again, is this a regular pgbench test or is this SELECT-only?

 SELECT-only

Ah, OK.  I would not expect flexlocks to help with that; Pavan's patch
might, though.

 Can you by any chance check top or vmstat during the 32-client
 test and see what percentage you have of user time/system
 time/idle time?

 You didn't say whether you wanted master or flexlock, but it turned
 out that any difference was way too far into the noise to show.
 They both looked like this:

 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
  r  b   swpd    free      buff    cache   si   so    bi    bo      in      cs us sy id wa st
 38  0    352 1157400 207177020 52360472    0    0     0    16   13345 1190230 40  7 53  0  0
 37  0    352 1157480 207177020 52360472    0    0     0     0   12953 1263310 40  8 52  0  0
 36  0    352 1157484 207177020 52360472    0    0     0     0   13411 1233365 38  7 54  0  0
 37  0    352 1157476 207177020 52360472    0    0     0     0   12780 1193575 41  7 51  0  0

 Keep in mind that while there are really 32 cores, the cpu
 percentages seem to be based on the threads from hyperthreading.
 Top showed pgbench (running on the same machine) as eating a pretty
 steady 5.2 of the cores, leaving 26.8 cores to actually drive the 32
 postgres processes.

It doesn't make any sense for PostgreSQL master to be using only 50%
of the CPU and leaving the rest idle on a lots-of-clients SELECT-only
test.  That could easily happen on 9.1, but my lock manager changes
eliminated the only place where anything gets put to sleep in that
path (except for the emergency sleeps done by s_lock, when a spinlock
is really badly contended).  So I'm confused by these results.  Are we
sure that the processes are being scheduled across all 32 physical
cores?

At any rate, I do think it's likely that you're being bitten by
spinlock contention, but we'd need to do some legwork to verify that
and work out the details.  Any chance you can run oprofile (on either
branch, don't really care) against the 32 client test and post the
results?  If it turns out s_lock is at the top of the heap, I can put
together a patch to help figure out which spinlock is the culprit.

Anyway, this is probably a digression as it relates to FlexLocks:
those are not optimizing for a read-only workload.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Then again, is this a regular pgbench test or is this
 SELECT-only?

 SELECT-only
 
 Ah, OK.  I would not expect flexlocks to help with that; Pavan's
 patch might, though.
 
OK.  Sorry for misunderstanding that.  I haven't gotten around to a
deep reading of the patch yet.  :-(  I based this on the test script
you posted here (with slight modifications for my preferred
directory structures):
 
http://archives.postgresql.org/pgsql-hackers/2011-10/msg00605.php
 
If I just drop the -S switch will I have a good test, or are there
other adjustments I should make (besides increasing checkpoint
segments)?  (Well, for the SELECT-only test I didn't bother putting
pg_xlog on a separate RAID 10 on it's own BBU controller as we
normally would for this machine, I'll cover that, too.)
 
 It doesn't make any sense for PostgreSQL master to be using only
 50% of the CPU and leaving the rest idle on a lots-of-clients
 SELECT-only test.  That could easily happen on 9.1, but my lock
 manager changes eliminated the only place where anything gets put
 to sleep in that path (except for the emergency sleeps done by
 s_lock, when a spinlock is really badly contended).  So I'm
 confused by these results. Are we sure that the processes are
 being scheduled across all 32 physical cores?
 
I think so.  My take was that it was showing 32 of 64 *threads*
active -- the hyperthreading funkiness.  Is there something in
particular you'd like me to check?
 
 At any rate, I do think it's likely that you're being bitten by
 spinlock contention, but we'd need to do some legwork to verify
 that and work out the details.  Any chance you can run oprofile
 (on either branch, don't really care) against the 32 client test
 and post the results?  If it turns out s_lock is at the top of the
 heap, I can put together a patch to help figure out which spinlock
 is the culprit.
 
oprofile isn't installed on this machine.  I'll take care of that
and post results when I can.
 
-Kevin




Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Robert Haas
On Fri, Nov 18, 2011 at 12:45 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 OK.  Sorry for misunderstanding that.  I haven't gotten around to a
 deep reading of the patch yet.  :-(  I based this on the test script
 you posted here (with slight modifications for my preferred
 directory structures):

 http://archives.postgresql.org/pgsql-hackers/2011-10/msg00605.php

 If I just drop the -S switch will I have a good test, or are there
 other adjustments I should make (besides increasing checkpoint
 segments)?  (Well, for the SELECT-only test I didn't bother putting
 pg_xlog on a separate RAID 10 on it's own BBU controller as we
 normally would for this machine, I'll cover that, too.)

Yeah, I'd just drop -S.  Make sure to use -c N -j N with pgbench, or
you'll probably not be able to saturate it.  I've also had good luck
with wal_writer_delay=20ms, although if you have synchronous_commit=on
that might not matter, and it's much less important since Simon's
recent patch in that area went in.

What scale factor are you testing at?

 It doesn't make any sense for PostgreSQL master to be using only
 50% of the CPU and leaving the rest idle on a lots-of-clients
 SELECT-only test.  That could easily happen on 9.1, but my lock
 manager changes eliminated the only place where anything gets put
 to sleep in that path (except for the emergency sleeps done by
 s_lock, when a spinlock is really badly contended).  So I'm
 confused by these results. Are we sure that the processes are
 being scheduled across all 32 physical cores?

 I think so.  My take was that it was showing 32 of 64 *threads*
 active -- the hyperthreading funkiness.  Is there something in
 particular you'd like me to check?

Not really, just don't understand the number.

 At any rate, I do think it's likely that you're being bitten by
 spinlock contention, but we'd need to do some legwork to verify
 that and work out the details.  Any chance you can run oprofile
 (on either branch, don't really care) against the 32 client test
 and post the results?  If it turns out s_lock is at the top of the
 heap, I can put together a patch to help figure out which spinlock
 is the culprit.

 oprofile isn't installed on this machine.  I'll take care of that
 and post results when I can.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Yeah, I'd just drop -S.
 
Easily done.
 
 Make sure to use -c N -j N with pgbench, or you'll probably not be
 able to saturate it.
 
Yeah, that's part of the script I copied from you.
 
 I've also had good luck with wal_writer_delay=20ms, although if
 you have synchronous_commit=on that might not matter, and it's
 much less important since Simon's recent patch in that area went
 in.
 
What the heck; will do.
 
 What scale factor are you testing at?
 
100.  Perhaps I should boost that since I'm going as far as 128
clients?
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Any chance you can run oprofile (on either branch, don't really
 care) against the 32 client test and post the results?
 
Besides the other changes we discussed, I boosted scale to 150 and
ran at READ COMMITTED isolation level (because all threads promptly
crashed and burned at REPEATABLE READ -- we desperately need a
pgbench option to retry a transaction on serialization failure). 
The oprofile hot spots at half a percent or higher:
 
CPU: Intel Core/i7, speed 2262 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with
a unit mask of 0x00 (No unit mask) count 10
samples   %       image name  symbol name
933394    4.9651  postgres    AllocSetAlloc
848476    4.5134  postgres    base_yyparse
719515    3.8274  postgres    SearchCatCache
461275    2.4537  postgres    hash_search_with_hash_value
426411    2.2682  postgres    GetSnapshotData
322938    1.7178  postgres    LWLockAcquire
322236    1.7141  postgres    core_yylex
305471    1.6249  postgres    MemoryContextAllocZeroAligned
281543    1.4976  postgres    expression_tree_walker
270241    1.4375  postgres    XLogInsert
234899    1.2495  postgres    MemoryContextAlloc
210137    1.1178  postgres    ScanKeywordLookup
184857    0.9833  postgres    heap_page_prune
173608    0.9235  postgres    hash_any
153011    0.8139  postgres    _bt_compare
144538    0.7689  postgres    nocachegetattr
131466    0.6993  postgres    fmgr_info_cxt_security
131001    0.6968  postgres    grouping_planner
130808    0.6958  postgres    LWLockRelease
124112    0.6602  postgres    PinBuffer
120745    0.6423  postgres    LockAcquireExtended
112992    0.6010  postgres    ExecInitExpr
112830    0.6002  postgres    lappend
112311    0.5974  postgres    new_list
110368    0.5871  postgres    check_stack_depth
106036    0.5640  postgres    AllocSetFree
102565    0.5456  postgres    MemoryContextAllocZero
94689     0.5037  postgres    SearchSysCache
 
Do you want line numbers or lower percentages?
 
Two runs:
 
tps = 21946.961196 (including connections establishing)
tps = 22911.873227 (including connections establishing)
 
For write transactions, that seems pretty respectable.
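On the retry wish above: the client-side loop pgbench would need for serialization failures (SQLSTATE 40001) can be sketched as follows. This is only an illustration; the exception class and workload are stand-ins, not a real pgbench or driver API:

```python
class SerializationFailure(Exception):
    """Stand-in for a database error with SQLSTATE 40001."""

def run_transaction(work, max_retries=10):
    # On serialization failure the whole transaction must be rolled back
    # and re-run from the top; only give up after max_retries attempts.
    for _ in range(max_retries):
        try:
            return work()
        except SerializationFailure:
            continue
    raise RuntimeError("gave up after %d serialization failures" % max_retries)

# Toy workload that conflicts twice before going through.
state = {"attempts": 0}
def flaky_transfer():
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise SerializationFailure
    return "committed"

print(run_transaction(flaky_transfer))   # prints: committed
```

A real benchmark driver would additionally count retries separately from completed transactions, so that a retry storm is visible in the results.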
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread anara...@anarazel.de


Kevin Grittner kevin.gritt...@wicourts.gov schrieb:

Robert Haas robertmh...@gmail.com wrote:
 
 Any chance you can run oprofile (on either branch, don't really
 care) against the 32 client test and post the results?
 
Besides the other changes we discussed, I boosted scale to 150 and
ran at READ COMMITTED isolation level (because all threads promptly
crashed and burned at REPEATABLE READ -- we desperately need a
pgbench option to retry a transaction on serialization failure). 
The oprofile hot spots at half a percent or higher:
 
CPU: Intel Core/i7, speed 2262 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with
a unit mask of 0x00 (No unit mask) count 10
samples   %       image name  symbol name
933394    4.9651  postgres    AllocSetAlloc
848476    4.5134  postgres    base_yyparse
719515    3.8274  postgres    SearchCatCache
461275    2.4537  postgres    hash_search_with_hash_value
426411    2.2682  postgres    GetSnapshotData
322938    1.7178  postgres    LWLockAcquire
322236    1.7141  postgres    core_yylex
305471    1.6249  postgres    MemoryContextAllocZeroAligned
281543    1.4976  postgres    expression_tree_walker
270241    1.4375  postgres    XLogInsert
234899    1.2495  postgres    MemoryContextAlloc
210137    1.1178  postgres    ScanKeywordLookup
184857    0.9833  postgres    heap_page_prune
173608    0.9235  postgres    hash_any
153011    0.8139  postgres    _bt_compare
144538    0.7689  postgres    nocachegetattr
131466    0.6993  postgres    fmgr_info_cxt_security
131001    0.6968  postgres    grouping_planner
130808    0.6958  postgres    LWLockRelease
124112    0.6602  postgres    PinBuffer
120745    0.6423  postgres    LockAcquireExtended
112992    0.6010  postgres    ExecInitExpr
112830    0.6002  postgres    lappend
112311    0.5974  postgres    new_list
110368    0.5871  postgres    check_stack_depth
106036    0.5640  postgres    AllocSetFree
102565    0.5456  postgres    MemoryContextAllocZero
94689     0.5037  postgres    SearchSysCache
That profile looks like you ran pgbench with -M simple. How does it look with
prepared instead?

Andres


Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
anara...@anarazel.de and...@anarazel.de wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov schrieb:
 
samples   %       image name  symbol name
933394    4.9651  postgres    AllocSetAlloc
848476    4.5134  postgres    base_yyparse
719515    3.8274  postgres    SearchCatCache
 
 That profile looks like you ran pgbench with -m simple. How does
 it look with prepared instead?
 
samples   %       image name  symbol name
495463    3.6718  postgres    hash_search_with_hash_value
490971    3.6385  postgres    GetSnapshotData
443965    3.2902  postgres    LWLockAcquire
443566    3.2872  postgres    AllocSetAlloc
302388    2.2409  postgres    XLogInsert
286889    2.1261  postgres    SearchCatCache
246417    1.8262  postgres    PostgresMain
235018    1.7417  postgres    heap_page_prune
198442    1.4706  postgres    _bt_compare
181446    1.3447  postgres    hash_any
177131    1.3127  postgres    ExecInitExpr
175775    1.3026  postgres    LWLockRelease
152324    1.1288  postgres    PinBuffer
150285    1.1137  postgres    exec_bind_message
145214    1.0762  postgres    fmgr_info_cxt_security
140493    1.0412  postgres    s_lock
124162    0.9201  postgres    LockAcquireExtended
120429    0.8925  postgres    MemoryContextAlloc
117076    0.8676  postgres    pfree
116493    0.8633  postgres    AllocSetFree
105027    0.7783  postgres    pgstat_report_activity
101407    0.7515  postgres    ProcArrayLockAcquire
100797    0.7470  postgres    MemoryContextAllocZeroAligned
98360     0.7289  postgres    ProcArrayLockRelease
86938     0.6443  postgres    heap_hot_search_buffer
82635     0.6124  postgres    hash_search
79902     0.5921  postgres    errstart
79465     0.5889  postgres    HeapTupleSatisfiesVacuum
78709     0.5833  postgres    ResourceOwnerReleaseInternal
76068     0.5637  postgres    ExecModifyTable
73043     0.5413  postgres    heap_update
72175     0.5349  postgres    strlcpy
71253     0.5280  postgres    MemoryContextAllocZero
 
tps = 27392.219364 (including connections establishing)
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 samples   %       image name  symbol name
 495463    3.6718  postgres    hash_search_with_hash_value
 
When lines like these show up in the annotated version, I'm
impressed that we're still finding gains as big as we are:
 
 44613  0.3306 :                if (segp == NULL)
               :                        hash_corrupted(hashp);
 
101910  0.7552 :        keysize = hashp->keysize;               /* ditto */
 
There goes over 1% of my server run time, right there!
 
Of course, these make no sense unless there is cache line
contention, which is why that area is bearing fruit.
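The "over 1%" figure is just the two annotated sample percentages above added up; as a quick check (percentages copied from the profile):

```python
# The two hot lines quoted above, as percentages of total samples.
if_segp_pct = 0.3306    # the "if (segp == NULL)" test
keysize_pct = 0.7552    # the "keysize = hashp->keysize" load

print(round(if_segp_pct + keysize_pct, 4))   # 1.0858 -> "over 1%" of run time
```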
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
  I think so.  My take was that it was showing 32 of 64 *threads*
 active -- the hyperthreading funkiness.  Is there something in
 particular you'd like me to check?
 
 Not really, just don't understand the number.
 
I'm having trouble resolving the vmstat numbers I got during the
32-client pgbench runs which modified data.
 
-M simple:
 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free      buff      cache   si   so    bi     bo      in      cs us sy id wa st
30  1   4464 513492 205564572  54472124    0    0     0  78170  621724 1246300 30  8 61  1  0
27  1   4464 509288 205564572  54474600    0    0     0 125620  599403 1192046 29  8 63  1  0
35  1   4464 508368 205564572  54476996    0    0     0  89801  595939 1186496 29  8 63  0  0
25  0   4464 506088 205564572  54478668    0    0     0  90121  594800 1189649 28  8 63  0  0
 
-M prepared:
 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free      buff      cache   si   so    bi     bo      in      cs us sy id wa st
28  0   5612 1204404 205107344  54230536    0    0     0  93212  527284 1456417 22  9 69  0  0
 8  1   5612 1202044 205107344    542600    0    0     0  93217  512819 1417457 21  9 70  1  0
17  1   5612 1201892 205107344  54236048    0    0     0 132699  502333 1412878 21  9 70  0  0
19  1   5612 1199208 205107344  54238936    0    0     0  93612  519113 1484386 21  9 69  0  0
 
So 60% or 70% idle without any I/O wait time.  I don't know how to
explain that.
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Andres Freund
On Friday, November 18, 2011 08:36:59 PM Kevin Grittner wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  samples   %       image name  symbol name
  495463    3.6718  postgres    hash_search_with_hash_value
 
 When lines like these show up in the annotated version, I'm
 impressed that we're still finding gains as big as we are:
 
  44613  0.3306 :                if (segp == NULL)
 
                :                        hash_corrupted(hashp);
 
  101910  0.7552 :        keysize = hashp->keysize;               /* ditto */
When doing line-level profiles I would suggest looking at the instructions. 
Quite often the line shown doesn't have much to do whats executed as the 
compiler tries to schedule instructions cleverly.
Also in many situations the shown cost doesn't actually lie in the instruction 
shown but in some previous one. The shown instruction e.g. has to wait for the 
result of the earlier instructions. Pipelining makes that hard to correctly 
observe.

A simplified example would be something like:

bool func(int a, int b, int c){
   int res = a / b;
   if(res == c){
   return true;
   }
   return false;
}

Likely the instruction showing up in the profile would be the comparison. Which 
obviously is not the really expensive part...


 There goes over 1% of my server run time, right there!
 
 Of course, these make no sense unless there is cache line
 contention, which is why that area is bearing fruit.
I don't think cache line contention is the most likely candidate here.  Simple 
cache-misses seem far more likely. In combination with pipeline stalls...

Newer cpus (nehalem+) can measure stalled cycles which can be really useful 
when analyzing performance. I don't remember how to do that with oprofile right 
now though as I use perf these days (its -e stalled-cycles{frontend|backend} 
there}).

Andres



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Andres Freund and...@anarazel.de wrote:
 
 When doing line-level profiles I would suggest looking at the
 instructions.
 
What's the best way to do that?
 
 I don't think cache line contention is the most likely candidate
 here.  Simple cache-misses seem far more likely. In combination
 with pipeline stalls...
 
 Newer cpus (nehalem+) can measure stalled cycles which can be
 really useful when analyzing performance. I don't remember how to
 do that with oprofile right now though as I use perf these days
 (its -e stalled-cycles{frontend|backend} there}).
 
When I run oprofile, I still always go back to this post by Tom:
 
http://archives.postgresql.org/pgsql-performance/2009-06/msg00154.php
 
Can anyone provide such a cheat sheet for perf?  I could give that
a try if I knew how.
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Andres Freund
On Friday, November 18, 2011 09:16:01 PM Kevin Grittner wrote:
 Andres Freund and...@anarazel.de wrote:
  When doing line-level profiles I would suggest looking at the
  instructions.
 What's the best way to do that?
I think opannotate -a -s produces output with instructions/code intermingled.

  I don't think cache line contention is the most likely candidate
  here.  Simple cache-misses seem far more likely. In combination
  with pipeline stalls...
  
  Newer cpus (nehalem+) can measure stalled cycles which can be
  really useful when analyzing performance. I don't remember how to
  do that with oprofile right now though as I use perf these days
  (its -e stalled-cycles{frontend|backend} there}).
 
 When I run oprofile, I still always go back to this post by Tom:
 http://archives.postgresql.org/pgsql-performance/2009-06/msg00154.php
Hrm. I am on the train and for unknown reasons the only sensibly working
protocols are smtp + pop. Waiting... Waiting...
Sorry, too slow/high latency atm. I wrote everything below and another mail
and the page still hasn't loaded.

oprofile can produce graphs as well (--callgraph). For both tools you need
-fno-omit-frame-pointer to get usable call graphs.

 Can anyone provide such a cheat sheet for perf?  I could give that
 a try if I knew how.
Unfortunately for sensible results the kernel needs to be rather new.
I would say >= 2.6.28 or so (just guessed).

# to record activity
perf record [-g|--call-graph] program|-p pid 

# to view a summation
perf report

graph:
# Overhead   Command  Shared Object   Symbol
# ........   .......  .............   ......
#
 4.09%  postgres  postgres   [.] slab_alloc_dyn
|
--- slab_alloc_dyn
   |  
   |--18.52%-- new_list
   |  |  
   |  |--63.79%-- lappend
   |  |  |  
   |  |  |--13.40%-- find_usable_indexes
   |  |  |  create_index_paths
   |  |  |  set_rel_pathlist
   |  |  |  make_one_rel

flat:

# Overhead   Command  Shared Object   Symbol
# ........   .......  .............   ......
#
 5.10%  postgres  [vdso] [.] 0x73d8d770  
 4.26%  postgres  postgres   [.] base_yyparse
 3.88%  postgres  postgres   [.] slab_alloc_dyn
 2.82%  postgres  postgres   [.] core_yylex
 2.37%  postgres  postgres   [.] SearchCatCache
 1.85%  postgres  libc-2.13.so   [.] __memcpy_ssse3
 1.66%  postgres  libc-2.13.so   [.] __GI___strcmp_ssse3
 1.23%  postgres  postgres   [.] MemoryContextAlloc


# to view a line/source/instruction level view
perf annotate -l symbol

...
 :
 :  /*
 :   * one-time startup overhead for each cache
 :   */
 :  if (cache->cc_tupdesc == NULL)
0.35 :6e81fd:   48 83 7f 28 00          cmpq   $0x0,0x28(%rdi)
 
/home/andres/src/postgresql/build/optimize/../../src/backend/utils/cache/catcache.c:1070
4.15 :6e8202:   0f 84 54 04 00 00       je     6e865c <SearchCatCache+0x47c>
 :  #endif
 :
 :  /*
 :   * initialize the search key information
 :   */
 :  memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
0.00 :6e8208:   48 8d bd a0 fe ff ff    lea    -0x160(%rbp),%rdi
0.17 :6e820f:   49 8d 77 70             lea    0x70(%r15),%rsi
0.00 :6e8213:   b9 24 00 00 00          mov    $0x24,%ecx
 
/home/andres/src/postgresql/build/optimize/../../src/backend/utils/cache/catcache.c:1080
   33.22 :6e8218:   f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
 :  cur_skey[0].sk_argument = v1;
 
/home/andres/src/postgresql/build/optimize/../../src/backend/utils/cache/catcache.c:1081
1.56 :6e821b:   48 89 9d e0 fe ff ff    mov    %rbx,-0x120(%rbp)
...

# get heaps of stats from something
perf stat -ddd someprogram|-p pid

     1242.409965 task-clock                #    0.824 CPUs utilized          [100.00%]
          14,572 context-switches          #    0.012 M/sec                  [100.00%]
             264 CPU-migrations            #    0.000 M/sec                  [100.00%]
               0 page-faults               #    0.000 M/sec
   2,854,775,135 cycles                    #    2.298 GHz                    [26.28%]
 <not supported> stalled-cycles-frontend
 <not supported> stalled-cycles-backend
   2,024,997,785 instructions              #    0.71  insns per cycle

Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Andres Freund
On Friday, November 18, 2011 11:12:02 PM Andres Freund wrote:
 On Friday, November 18, 2011 09:16:01 PM Kevin Grittner wrote:
  Andres Freund and...@anarazel.de wrote:
   When doing line-level profiles I would suggest looking at the
   instructions.
  
  What's the best way to do that?
 
 I think opannotate -a -s produces output with instructions/code
 intermingled.
 
   I don't think cache line contention is the most likely candidate
   here.  Simple cache-misses seem far more likely. In combination
   with pipeline stalls...
   
   Newer cpus (nehalem+) can measure stalled cycles which can be
   really useful when analyzing performance. I don't remember how to
   do that with oprofile right now though as I use perf these days
   (its -e stalled-cycles{frontend|backend} there}).
  
  When I run oprofile, I still always go back to this post by Tom:
  http://archives.postgresql.org/pgsql-performance/2009-06/msg00154.php
 
 Hrm. I am on the train and for unknown reasons the only sensible working
 protocols are smtp + pop Waiting Waiting
 Sorry, too slow/high latency atm. I wrote everything below and another mail
 and the page still hasn't loaded.
 
 oprofile can produces graphes as well (--callgraph). for both tools you
 need -fno-omit-frame-pointers to get usable graphs.
 
  Can anyone provide such a cheat sheet for perf?  I could give that
  a try if I knew how.
 
 Unfortunately for sensible results the kernel needs to be rather new.
 I would say  2.6.28 or so (just guessed).
 
 # to record activity
 perf record [-g|--call-graph] program|-p pid
 
 # to view a summation
 perf report

 # get heaps of stats from something
 perf stat -ddd someprogram|-p pid

 # show whats the system executing overall
 perf top -az
 
 # get help
 perf help (record|report|annotate|stat|...)
 
...
I forgot that there is also 

# get a list of event types
perf list

# measure something for a specific event
perf (record|stat|top) -e some_event_type



Andres



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Andres Freund and...@anarazel.de wrote:
 
 I think opannotate -a -s produces output with instructions/code
 intermingled.
 
Thanks.  I'll check out perf later (thanks for the tips!), but for
now, here's the function which was at the top of my oprofile
results, annotated with those options.  I'm afraid it's a bit
intimidating to me -- the last time I did much with X86 assembly
language was in the mid-80s, on an 80286.  :-/  Hopefully, since
this is at the top of the oprofile results when running with
prepared statements, it will be of use to somebody.
 
The instructions which are shown as having that 1% still seem odd to
me, but as you say, they were probably actually waiting for some
previous operation to finish:
 
 43329  0.3211 :  70b56a:   test   %rbp,%rbp
 
 99903  0.7404 :  70b58a:   mov%rax,0x18(%rsp)
 
If anyone wants any other detail from what I captured, let me know.
 
-Kevin

0070b520 <hash_search_with_hash_value>: /* hash_search_with_hash_value total: 495463  3.6718 */
   :hash_search_with_hash_value(HTAB *hashp,
   :                            const void *keyPtr,
   :                            uint32 hashvalue,
   :                            HASHACTION action,
   :                            bool *foundPtr)
   :{
  5023  0.0372 :  70b520:   push   %r15
  5967  0.0442 :  70b522:   push   %r14
  1407  0.0104 :  70b524:   mov    %rdi,%r14
30 2.2e-04 :  70b527:   push   %r13
  2495  0.0185 :  70b529:   push   %r12
  2631  0.0195 :  70b52b:   mov    %edx,%r12d
18 1.3e-04 :  70b52e:   push   %rbp
  1277  0.0095 :  70b52f:   push   %rbx
   :static inline uint32
   :calc_bucket(HASHHDR *hctl, uint32 hash_val)
   :{
   :uint32  bucket;
   :
   :bucket = hash_val & hctl->high_mask;
  2122  0.0157 :  70b530:   mov    %edx,%ebx
   :hash_search_with_hash_value(HTAB *hashp,
   :                            const void *keyPtr,
   :                            uint32 hashvalue,
   :                            HASHACTION action,
   :                            bool *foundPtr)
   :{
    247  0.0018 :  70b532:   sub    $0x58,%rsp
    236  0.0017 :  70b536:   mov    %rsi,0x10(%rsp)
   3851  0.0285 :  70b53b:   mov    %ecx,0xc(%rsp)
   2551  0.0189 :  70b53f:   mov    %r8,(%rsp)
    :HASHHDR    *hctl = hashp->hctl;
   2225  0.0165 :  70b543:   mov    (%rdi),%r15
   :static inline uint32
   :calc_bucket(HASHHDR *hctl, uint32 hash_val)
   :{
   :uint32  bucket;
   :
   :bucket = hash_val & hctl->high_mask;
  4544  0.0337 :  70b546:   and    0x2c(%r15),%ebx
   :if (bucket > hctl->max_bucket)
 53409  0.3958 :  70b54a:   cmp    0x28(%r15),%ebx
   :  70b54e:   jbe    70b554 <hash_search_with_hash_value+0x34>
   :bucket = bucket & hctl->low_mask;
  3324  0.0246 :  70b550:   and    0x30(%r15),%ebx
   :bucket = calc_bucket(hctl, hashvalue);
   :
   :segment_num = bucket >> hashp->sshift;
   :segment_ndx = MOD(bucket, hashp->ssize);
   :
   :segp = hashp->dir[segment_num];
  9702  0.0719 :  70b554:   mov    0x58(%r14),%ecx
  2428  0.0180 :  70b558:   mov    %ebx,%eax
   489  0.0036 :  70b55a:   mov    0x8(%r14),%rdx
   : * Do the initial lookup
   : */
   :bucket = calc_bucket(hctl, hashvalue);
   :
   :segment_num = bucket >> hashp->sshift;
   :segment_ndx = MOD(bucket, hashp->ssize);
   391  0.0029 :  70b55e:   mov    0x50(%r14),%r13
   :
   :segp = hashp->dir[segment_num];
  2062  0.0153 :  70b562:   shr    %cl,%eax
   309  0.0023 :  70b564:   mov    %eax,%eax
   643  0.0048 :  70b566:   mov    (%rdx,%rax,8),%rbp
   :
   :if (segp == NULL)
 43329  0.3211 :  70b56a:   test   %rbp,%rbp
  1284  0.0095 :  70b56d:   je     70b727 <hash_search_with_hash_value+0x207>
   :hash_corrupted(hashp);
   :
   :prevBucketPtr = &segp[segment_ndx];
  1878  0.0139 :  70b573:   lea    -0x1(%r13),%rax
   :currBucket = *prevBucketPtr;
   :
   :/*
   : * Follow collision chain looking for 

Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Robert Haas
On Fri, Nov 18, 2011 at 2:05 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Any chance you can run oprofile (on either branch, don't really
 care) against the 32 client test and post the results?

 [ oprofile results ]

Hmm.  That looks a lot like a profile with no lock contention at all.
Since I see XLogInsert in there, I assume this must be a pgbench write
test on unlogged tables?  How close am I?

I was actually thinking it would be interesting to oprofile the
read-only test; see if we can figure out where those slowdowns are
coming from.

 Two runs:

 tps = 21946.961196 (including connections establishing)
 tps = 22911.873227 (including connections establishing)

 For write transactions, that seems pretty respectable.

Very.  What do you get without the patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 Hmm.  That looks a lot like a profile with no lock contention at
 all.  Since I see XLogInsert in there, I assume this must be a
 pgbench write test on unlogged tables?  How close am I?
 
Not unless pgbench on HEAD does that by default.  Here are the
relevant statements:
 
$prefix/bin/pgbench -i -s 150
$prefix/bin/pgbench -T $time -c $clients -j $clients $resultfile
 
Perhaps the Intel cores implement the relevant primitives better? 
Maybe I didn't run the profile or reports the right way?
 
 I was actually thinking it would be interesting to oprofile the
 read-only test; see if we can figure out where those slowdowns are
 coming from.
 
I'll plan on doing that this weekend.
 
 tps = 21946.961196 (including connections establishing)
 tps = 22911.873227 (including connections establishing)

 For write transactions, that seems pretty respectable.
 
 Very.  What do you get without the patch?
 
[quick runs a couple tests that way]
 
Single run with -M simple:
 
tps = 23018.314292 (including connections establishing)
 
Single run with -M prepared:
 
tps = 27910.621044 (including connections establishing)
 
So, the patch appears to hinder performance in this environment,
although certainty is quite low with so few samples.  I'll schedule
a spectrum of runs before I leave this evening (very soon).
 
-Kevin



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Robert Haas
On Fri, Nov 18, 2011 at 6:46 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 tps = 21946.961196 (including connections establishing)
 tps = 22911.873227 (including connections establishing)

 For write transactions, that seems pretty respectable.

 Very.  What do you get without the patch?

 [quick runs a couple tests that way]

 Single run with -M simple:

 tps = 23018.314292 (including connections establishing)

 Single run with -M prepared:

 tps = 27910.621044 (including connections establishing)

 So, the patch appears to hinder performance in this environment,
 although certainty is quite low with so few samples.  I'll schedule
 a spectrum of runs before I leave this evening (very soon).

Hmm.  There's obviously something that's different in your environment
or configuration from what I tested, but I don't know what it is.  The
fact that your scale factor is larger than shared_buffers might
matter; or Intel vs. AMD.  Or maybe you're running with
synchronous_commit=on?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] testing ProcArrayLock patches

2011-11-18 Thread Andres Freund
On Saturday, November 19, 2011 12:18:07 AM Kevin Grittner wrote:
 Andres Freund and...@anarazel.de wrote:
  I think opannotate -a -s produces output with instructions/code
  intermingled.
 
 Thanks.  I'll check out perf later (thanks for the tips!), but for
 now, here's the function which was at the top of my oprofile
 results, annotated with those options.  I'm afraid it's a bit
 intimidating to me -- the last time I did much with X86 assembly
 language was in the mid-80s, on an 80286.  :-/ 
While my assembly knowledge surely isn't from the 80s, be assured that I find it 
intimidating as well ;)

 Hopefully, since
 this is at the top of the oprofile results when running with
 prepared statements, it will be of use to somebody.
I think in quite many situations hash_search_with_hash_value is rather 
noticeable in the profiles. Even without concurrency...

Looking at your annotation output the code seems to be almost entirely stalled 
waiting for memory.
The first stall is after the first reading memory access which is likely to be 
uncached (the first cacheline of the HTAB is accessed before but that will be 
in the cache). The interesting thing is that I would have expected a higher 
likelihood for this to stay in the cache.
  2225  0.0165 :  70b543:   mov    (%rdi),%r15
               :static inline uint32
               :calc_bucket(HASHHDR *hctl, uint32 hash_val)
               :{
               :    uint32      bucket;
               :
               :    bucket = hash_val & hctl->high_mask;
  4544  0.0337 :  70b546:   and    0x2c(%r15),%ebx
               :    if (bucket > hctl->max_bucket)
 53409  0.3958 :  70b54a:   cmp    0x28(%r15),%ebx
               :  70b54e:   jbe    70b554 <hash_search_with_hash_value+0x34>

So a stall here is not that surprising.


Here we fetch data from memory which is unlikely to be prefetchable and then 
require the result from that fetch. Note how segp = hashp->dir[segment_num]; 
is distributed over lines 52, 64, and 83.

               :    segp = hashp->dir[segment_num];
  2062  0.0153 :  70b562:   shr%cl,%eax
   309  0.0023 :  70b564:   mov%eax,%eax
   643  0.0048 :  70b566:   mov(%rdx,%rax,8),%rbp
   :
   :if (segp == NULL)
 43329  0.3211 :  70b56a:   test   %rbp,%rbp





The next cacheline is referenced here. Again a fetch from memory whose result 
is needed soon afterwards to continue.
Unless I misunderstood the code-flow this disproves my theory that we might 
have many collisions, as that test seems to be outside the loop:
               :    prevBucketPtr = &segp[segment_ndx];
               :    currBucket = *prevBucketPtr;
   122 9.0e-04 :  70b586:   mov    0x0(%rbp),%rbx
               :
               :    /*
               :     * Follow collision chain looking for matching key
               :     */
               :    match = hashp->match;       /* save one fetch in inner loop */
               :    keysize = hashp->keysize;   /* ditto */
 99903  0.7404 :  70b58a:   mov    %rax,0x18(%rsp)
               :
               :    while (currBucket != NULL)
  1066  0.0079 :  70b58f:   test   %rbx,%rbx




Line 136 is the first time the contents of the current bucket are needed. That's 
why the test is so noticeable.
               :    currBucket = *prevBucketPtr;
   655  0.0049 :  70b5a3:   mov    (%rbx),%rbx
               :     * Follow collision chain looking for matching key
               :     */
               :    match = hashp->match;       /* save one fetch in inner loop */
               :    keysize = hashp->keysize;   /* ditto */
               :
               :    while (currBucket != NULL)
   608  0.0045 :  70b5a6:   test   %rbx,%rbx
               :  70b5a9:   je     70b5d0 <hash_search_with_hash_value+0xb0>
               :    {
               :        if (currBucket->hashvalue == hashvalue &&
  3504  0.0260 :  70b5ab:   cmp    %r12d,0x8(%rbx)
 98486  0.7299 :  70b5af:   nop
  1233  0.0091 :  70b5b0:   jne    70b5a0 <hash_search_with_hash_value+0x80>


That covers all the slow points in the function. And unless I am missing 
something those are all the fetched cachelines of that function... For 
HASH_FIND that is.


So I think that reinforces my belief that ordinary cache misses are the culprit 
here. Which is to be expected in a hashtable...


Andres


PS: No idea whether that rambling made sense to anyone... But I looked at that 
function for the first time ;)



[HACKERS] Testing for safe fetching of TOAST values

2011-11-01 Thread Tom Lane
I'm working on fixing the stale-toast-pointer problem discussed in
http://archives.postgresql.org/message-id/2348.1319591...@sss.pgh.pa.us

In that thread, it was pointed out that it's unsafe to fetch a toasted
value unless we are advertising a MyProc->xmin that's old enough to
prevent removal of the tuple holding the toast pointer.  Otherwise,
someone could commit a deletion of that tuple and its subsidiary toast
tuples after we decide it's good, and then VACUUM could remove the toast
tuples before we finish fetching the toast tuples.  I tried to add an
Assert that at least checks that we're advertising *some* xmin at the
start of toast_fetch_datum():

Assert(TransactionIdIsNormal(MyProc->xmin));

The regression tests run through without complaint with this in place,
but initdb blows up while processing system_views.sql.  I found that the
problem is that exec_simple_query() issues a CommandCounterIncrement
between queries while not holding any snapshot.  (exec_simple_query
itself doesn't take one because the queries are all utility commands,
ie CREATE VIEW.  PortalRunUtility does take a snapshot while running
CREATE VIEW, but then drops it again.)  This causes a relcache flush
and reload for the just-created view (since it was just created,
RelationFlushRelation tries to rebuild not just drop the relcache
entry).  If the view's rule is long enough to be toasted, boom!

Now, in fact this access pattern is perfectly safe, because if the view
was created in the current transaction then there's no way for VACUUM to
remove the tuples describing it.  The whole thing seems rather delicate,
and yet I can't put my finger on anyplace that's doing something wrong.

I'm going to commit the patch without this Assert, but I wonder if
anyone has ideas about either a better test for dangerous fetches,
or a way to rejigger the code to make this test work.

regards, tom lane



Re: [HACKERS] testing nested case-when scoping

2011-07-05 Thread Robert Haas
On Wed, Jun 15, 2011 at 7:43 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello Heikki,

 probably I found a bug in patch:

 CREATE FUNCTION fx(i integer) RETURNS integer
    LANGUAGE plpgsql
    AS $$begin raise notice '%', i; return i; end;$$;

 CREATE FUNCTION fx1(integer) RETURNS text
    LANGUAGE sql
    AS $_$ select case $1 when 1 then 'A' else 'B' end$_$;

 CREATE FUNCTION fx2(text) RETURNS text
    LANGUAGE sql
    AS $_$ select case $1 when 'A' then 'a' else 'b' end$_$;

 CREATE TABLE foo (
    a integer
 );

 COPY foo (a) FROM stdin;
 1
 0
 \.

 postgres=# select fx2(fx1(fx(a))) from foo;
 NOTICE:  1
 ERROR:  invalid expression parameter reference (1 levels up, while
 stack is only 1 elements deep)

I can't reproduce this.  Perhaps it was fixed by one of the later commits?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] testing nested case-when scoping

2011-07-05 Thread Pavel Stehule
2011/7/5 Robert Haas robertmh...@gmail.com:
 On Wed, Jun 15, 2011 at 7:43 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello Heikki,

 probably I found a bug in patch:

 CREATE FUNCTION fx(i integer) RETURNS integer
    LANGUAGE plpgsql
    AS $$begin raise notice '%', i; return i; end;$$;

 CREATE FUNCTION fx1(integer) RETURNS text
    LANGUAGE sql
    AS $_$ select case $1 when 1 then 'A' else 'B' end$_$;

 CREATE FUNCTION fx2(text) RETURNS text
    LANGUAGE sql
    AS $_$ select case $1 when 'A' then 'a' else 'b' end$_$;

 CREATE TABLE foo (
    a integer
 );

 COPY foo (a) FROM stdin;
 1
 0
 \.

 postgres=# select fx2(fx1(fx(a))) from foo;
 NOTICE:  1
 ERROR:  invalid expression parameter reference (1 levels up, while
 stack is only 1 elements deep)

 I can't reproduce this.  Perhaps it was fixed by one of the later commits?


I haven't checked it again, because Tom rejected Heikki's design.

Regards

Pavel

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




[HACKERS] testing nested case-when scoping

2011-06-15 Thread Pavel Stehule
Hello Heikki,

probably I found a bug in patch:

CREATE FUNCTION fx(i integer) RETURNS integer
LANGUAGE plpgsql
AS $$begin raise notice '%', i; return i; end;$$;

CREATE FUNCTION fx1(integer) RETURNS text
LANGUAGE sql
AS $_$ select case $1 when 1 then 'A' else 'B' end$_$;

CREATE FUNCTION fx2(text) RETURNS text
LANGUAGE sql
AS $_$ select case $1 when 'A' then 'a' else 'b' end$_$;

CREATE TABLE foo (
a integer
);

COPY foo (a) FROM stdin;
1
0
\.

postgres=# select fx2(fx1(fx(a))) from foo;
NOTICE:  1
ERROR:  invalid expression parameter reference (1 levels up, while
stack is only 1 elements deep)

Regards

Pavel



Re: [HACKERS] Testing extension upgrade scripts

2011-03-03 Thread Chris Browne
da...@kineticode.com (David E. Wheeler) writes:
 You should blog this.

He just did, using the SMTP protocol...
-- 
select 'cbbrowne' || '@' || 'acm.org';
http://linuxdatabases.info/info/postgresql.html
Where do you want to Tell Microsoft To Go Today?



[HACKERS] Testing extension upgrade scripts

2011-03-02 Thread Tom Lane
It occurred to me that it might be a good idea to describe how
I've been testing extension upgrade scripts.  So:

1. Install the 9.0 version of the module in an empty 9.0 database.
pg_dump this database.

2. Load the pg_dump script into an empty 9.1 database, with the
underlying shared library (if any) available in $libdir but not
installed as SQL.  (If this fails, probably you removed a C
function from the library.)

3. Execute "CREATE EXTENSION whatever FROM unpackaged;".  (If this
fails, obviously you have work to do.)

4. pg_dump --binary-upgrade from the 9.1 database.  (You need
--binary-upgrade or pg_dump won't show the member objects of the
extension.)

5. Install the 9.1 version of the extension in another empty
database, using plain CREATE EXTENSION whatever.  Then
pg_dump --binary-upgrade from that.

6. Diff the two 9.1 dump scripts.  They should be the same except
for OID and frozenxid numbers.  If not, the upgrade script has
missed something it needs to do to update the catalog entries.


Of course, a similar approach will be useful for testing scripts
that are meant to update from one extension version to another,
once we arrive at the point of needing to do that.

regards, tom lane



Re: [HACKERS] Testing extension upgrade scripts

2011-03-02 Thread David E. Wheeler
You should blog this.

David

On Mar 2, 2011, at 11:58 AM, Tom Lane wrote:

 It occurred to me that it might be a good idea to describe how
 I've been testing extension upgrade scripts.  So:
 
 1. Install the 9.0 version of the module in an empty 9.0 database.
 pg_dump this database.
 
 2. Load the pg_dump script into an empty 9.1 database, with the
 underlying shared library (if any) available in $libdir but not
 installed as SQL.  (If this fails, probably you removed a C
 function from the library.)
 
 3. Execute "CREATE EXTENSION whatever FROM unpackaged;".  (If this
 fails, obviously you have work to do.)
 
 4. pg_dump --binary-upgrade from the 9.1 database.  (You need
 --binary-upgrade or pg_dump won't show the member objects of the
 extension.)
 
 5. Install the 9.1 version of the extension in another empty
 database, using plain CREATE EXTENSION whatever.  Then
 pg_dump --binary-upgrade from that.
 
 6. Diff the two 9.1 dump scripts.  They should be the same except
 for OID and frozenxid numbers.  If not, the upgrade script has
 missed something it needs to do to update the catalog entries.
 
 
 Of course, a similar approach will be useful for testing scripts
 that are meant to update from one extension version to another,
 once we arrive at the point of needing to do that.
 
   regards, tom lane
 




Re: [HACKERS] Testing extension upgrade scripts

2011-03-02 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 You should blog this.

[ shrug... ]  I don't own a blog, and if I did the entries in it would
not be included in the pgsql archives, which is where material like this
probably ought to be.

regards, tom lane



Re: [HACKERS] Testing extension upgrade scripts

2011-03-02 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:
 You should blog this.

He just did, didn't he? :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] testing plpython3u on 9.0beta3

2010-09-08 Thread Peter Eisentraut
On tis, 2010-07-13 at 20:28 -0500, Chris wrote:
 So if I have to explicitly set the python interpretor, how would you
 ever have a plpython2u and plpython3u function in the same DB
 (regardless of the fact that they can't run in the same session)?  The
 manual implies you could have them both installed since it says that
 there's a per session limitation.  After specifying the python3
 interpretor, I can indeed now run plpython3u, but I (rather obviously)
 can't createlang plpython2u now.
 I would think that the plpython section of the manual may want some
 reference to the fact that that compile flag needs to be set.

Added documentation about that.

 Additionally, the "What's New In Python 3.0" link for the beta 3 docs on
 http://www.postgresql.org/docs/9.0/static/plpython-python23.html is
 dead.

And fixed that.






Re: [HACKERS] testing plpython3u on 9.0beta3

2010-07-15 Thread Chris


 You'd have to build the two plpython.so's in separate compile operations.

regards, tom lane



I hadn't thought of that.  Tried it and it does work.  Thanks.
-- 
Chris Spotts
rfu...@gmail.com


[HACKERS] testing plpython3u on 9.0beta3

2010-07-13 Thread Chris
I'm testing beta 3 and ran across a PL/Python3u bug again.
This time it won't let me install it at all.
Kubuntu 10.04

./configure --with-pgport=5433 --with-python --with-ossp-uuid --with-libxml
--with-libxslt --with-perl --prefix=/usr/local/pgsqlb3
make
make check
sudo make install
All work fine.

However
postg...@cspotts-laptop:/usr/local/pgsql/lib$ createlang plpython3u
ERROR:  could not access file $libdir/plpython3: No such file or directory
STATEMENT:  CREATE LANGUAGE plpython3u;

createlang: language installation failed: ERROR:  could not access file
$libdir/plpython3: No such file or directory




Thoughts? Ideas? Am I doing something wrong? If I can provide more detail
about something specific, let me know.

-- 
Chris Spotts
rfu...@gmail.com


Re: [HACKERS] testing plpython3u on 9.0beta3

2010-07-13 Thread Peter Eisentraut
On tis, 2010-07-13 at 15:38 -0500, Chris wrote:
 I'm testing beta 3 and ran across a PL/Python3u bug again.
 This time it won't let me install it at all.
 Kubuntu 10.04
 
 ./configure --with-pgport=5433 --with-python --with-ossp-uuid
 --with-libxml
 --with-libxslt --with-perl --prefix=/usr/local/pgsqlb3

You probably need something like

./configure ... PYTHON=/usr/bin/python3 ...

here.  Otherwise it picks up /usr/bin/python, which is probably Python
2.




Re: [HACKERS] testing plpython3u on 9.0beta3

2010-07-13 Thread Chris
So if I have to explicitly set the python interpretor, how would you ever
have a plpython2u and plpython3u function in the same DB (regardless of the
fact that they can't run in the same session)?  The manual implies you could
have them both installed since it says that there's a per session
limitation.  After specifying the python3 interpretor, I can indeed now run
plpython3u, but I (rather obviously) can't createlang plpython2u now.
I would think that the plpython section of the manual may want some
reference to the fact that that compile flag needs to be set.

Additionally, the "What's New In Python 3.0" link
(http://docs.python.org/dev/3.0/whatsnew/3.0.html)
for the beta 3 docs on
http://www.postgresql.org/docs/9.0/static/plpython-python23.html is dead.


On Tue, Jul 13, 2010 at 4:02 PM, Peter Eisentraut pete...@gmx.net wrote:

 On tis, 2010-07-13 at 15:38 -0500, Chris wrote:
  I'm testing beta 3 and ran across a PL/Python3u bug again.
  This time it won't let me install it at all.
  Kubuntu 10.04
 
  ./configure --with-pgport=5433 --with-python --with-ossp-uuid
  --with-libxml
  --with-libxslt --with-perl --prefix=/usr/local/pgsqlb3

 You probably need something like

 ./configure ... PYTHON=/usr/bin/python3 ...

 here.  Otherwise it picks up /usr/bin/python, which is probably Python
 2.




-- 
Chris Spotts
rfu...@gmail.com


Re: [HACKERS] testing plpython3u on 9.0beta3

2010-07-13 Thread Tom Lane
Chris rfu...@gmail.com writes:
 So if I have to explicitly set the python interpretor, how would you ever
 have a plpython2u and plpython3u function in the same DB (regardless of the
 fact that they can't run in the same session)?

You'd have to build the two plpython.so's in separate compile operations.

regards, tom lane



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Robert Haas
On Fri, Jun 25, 2010 at 2:49 PM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2010-06-25 at 10:17 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  The problem is apparently that when CREATE LANGUAGE creates a language
  from a pg_pltemplate entry, it creates the proname from the tmplhandler
  name, and if it finds a fitting proname entry already, it uses that one.
  So when you create plpython2 first and plpython3 second, the pg_language
  entries of the latter point to the pg_proc entries of the former.

  If you fix that up manually (create additional pg_proc entries and fix
  the pg_language entries to point there), it works better.

 The fix ought to be to change the function nmes used by plpython3 ...

 Right.  What shall we do about the catversion?

We should go ahead and apply this, either with (my vote) or without
(Tom's vote) a catversion bump.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Bruce Momjian
Josh Berkus wrote:
 
  You could argue it either way.  The number of beta testers with
  plpython3 installations is probably very small, so I'm kinda leaning to
  just changing the code without a catversion bump.  OTOH, if we want to
  encourage testing of pg_upgrade ...
 
  FWIW, the last bump has led to a lot of testing of pg_upgrade.

And fixes, that will appear in 9.0 beta3.  :-)  Most fixes were related
to platform compile portability.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Josh Berkus wrote:
 You could argue it either way.  The number of beta testers with
 plpython3 installations is probably very small, so I'm kinda leaning to
 just changing the code without a catversion bump.  OTOH, if we want to
 encourage testing of pg_upgrade ...
 
 FWIW, the last bump has led to a lot of testing of pgupgrade.

 And fixes, that will appear in 9.0 beta3.  :-)  Most fixes were related
 to platform compile portability.

Well, if you think that pg_upgrade has changed materially since beta2,
that would be a good argument for getting some fresh testing for it,
which in turn argues for doing the catversion bump here.

regards, tom lane



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Josh Berkus wrote:
  You could argue it either way.  The number of beta testers with
  plpython3 installations is probably very small, so I'm kinda leaning to
  just changing the code without a catversion bump.  OTOH, if we want to
  encourage testing of pg_upgrade ...
  
  FWIW, the last bump has led to a lot of testing of pgupgrade.
 
  And fixes, that will appear in 9.0 beta3.  :-)  Most fixes were related
  to platform compile portability.
 
 Well, if you think that pg_upgrade has changed materially since beta2,
 that would be a good argument for getting some fresh testing for it,
 which in turn argues for doing the catversion bump here.

Attached are the changes since beta2;  they are pretty minor.  The good
news is I think all reporters eventually got it working.  I assume using
it for beta3 would allow it to work even better, and once you have used it
once, using it again is simple.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

/check.c
/dump.c
/function.c
/option.c
/pg_upgrade.c
/pg_upgrade.h
/version.c
/version_old_8_3.c
   momjian
 Have pg_upgrade create its output files in the current directory, rather
 than in a subdirectory of the $HOME directory, or $TMP in Windows.

---
/pg_upgrade.h
   momjian
 Update pg_upgrade C comment about cwd.

---
/pg_upgrade.h
/relfilenode.c
   momjian
 Properly define pg_upgrade global variable, per bug report from Chris
 Ruprecht on Mac (64 bit).

---
/option.c
   momjian
 Fix storage of getopt() return, should be 'int', for pg_upgrade.
 
 Steve Singer

---
/tablespace.c
   momjian
 Fix pg_upgrade to remove malloc(0) call.

---
/dump.c
/option.c
/pg_upgrade.h
/server.c
   momjian
 Fix pg_upgrade's use of pg_ctl on Win32 to not send command and server
 output to the same file, because it is impossible.
 
 Also set user name for pg_dumpall in pg_upgrade.

---
/check.c
/pg_upgrade.c
/server.c
   momjian
 Add username designations to all pg_upgrade utility calls that support it.

---



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Peter Eisentraut
On fre, 2010-06-25 at 18:57 -0400, Peter Eisentraut wrote:
 On fre, 2010-06-25 at 23:44 +0200, Andres Freund wrote:
  Has anybody actually researched if it is safe to run python2 and
  python3 in the same address space?
 
 You can't run plpython2 and plpython3 in the same session, because the
 libraries are loaded with dlopen(RTLD_GLOBAL) (with RTLD_LOCAL it would
 apparently work).  But you can use them in different sessions on the
 same database, for example.

By the way, I thought there had been some discussion about this in the
past, but I couldn't find it ...

Why are we using RTLD_GLOBAL?





Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-28 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Why are we using RTLD_GLOBAL?

Some guy named Eisentraut wanted it for plpython:
http://archives.postgresql.org/pgsql-hackers/2001-05/msg00563.php
http://archives.postgresql.org/pgsql-committers/2001-05/msg00283.php

Possibly that no longer applies, though.  In general it seems like
our usage of loadable modules treats them all as independent objects,
so that not using RTLD_GLOBAL would be a more sensible policy.  If it
won't break any of the existing PLs then I'm fine with removing it.

regards, tom lane



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 The problem is apparently that when CREATE LANGUAGE creates a language
 from a pg_pltemplate entry, it creates the proname from the tmplhandler
 name, and if it finds a fitting proname entry already, it uses that one.
 So when you create plpython2 first and plpython3 second, the pg_language
 entries of the latter point to the pg_proc entries of the former.

 If you fix that up manually (create additional pg_proc entries and fix
 the pg_language entries to point there), it works better.

The fix ought to be to change the function names used by plpython3 ...

regards, tom lane



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Peter Eisentraut
On fre, 2010-06-25 at 10:17 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  The problem is apparently that when CREATE LANGUAGE creates a language
  from a pg_pltemplate entry, it creates the proname from the tmplhandler
  name, and if it finds a fitting proname entry already, it uses that one.
  So when you create plpython2 first and plpython3 second, the pg_language
  entries of the latter point to the pg_proc entries of the former.
 
  If you fix that up manually (create additional pg_proc entries and fix
  the pg_language entries to point there), it works better.
 
 The fix ought to be to change the function names used by plpython3 ...

Right.  What shall we do about the catversion?
Index: src/include/catalog/pg_pltemplate.h
===
RCS file: /cvsroot/pgsql/src/include/catalog/pg_pltemplate.h,v
retrieving revision 1.13
diff -u -3 -p -r1.13 pg_pltemplate.h
--- src/include/catalog/pg_pltemplate.h	22 Jan 2010 15:45:15 -	1.13
+++ src/include/catalog/pg_pltemplate.h	25 Jun 2010 18:46:56 -
 DATA(insert ( "plperl"		t t "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ ));
 DATA(insert ( "plperlu"		f f "plperl_call_handler" "plperl_inline_handler" "plperl_validator" "$libdir/plperl" _null_ ));
 DATA(insert ( "plpythonu"	f f "plpython_call_handler" "plpython_inline_handler" _null_ "$libdir/plpython" _null_ ));
 DATA(insert ( "plpython2u"	f f "plpython_call_handler" "plpython_inline_handler" _null_ "$libdir/plpython2" _null_ ));
-DATA(insert ( "plpython3u"	f f "plpython_call_handler" "plpython_inline_handler" _null_ "$libdir/plpython3" _null_ ));
+DATA(insert ( "plpython3u"	f f "plpython3_call_handler" "plpython3_inline_handler" _null_ "$libdir/plpython3" _null_ ));
 
 #endif   /* PG_PLTEMPLATE_H */
Index: src/pl/plpython/plpython.c
===
RCS file: /cvsroot/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.144
diff -u -3 -p -r1.144 plpython.c
--- src/pl/plpython/plpython.c	10 Jun 2010 04:05:01 -	1.144
+++ src/pl/plpython/plpython.c	25 Jun 2010 18:46:56 -
@@ -244,6 +244,12 @@ typedef struct PLyResultObject
 
 /* function declarations */
 
+#if PY_MAJOR_VERSION >= 3
+/* Use separate names to avoid clash in pg_pltemplate */
+#define plpython_call_handler plpython3_call_handler
+#define plpython_inline_handler plpython3_inline_handler
+#endif
+
 /* exported functions */
 Datum		plpython_call_handler(PG_FUNCTION_ARGS);
 Datum		plpython_inline_handler(PG_FUNCTION_ARGS);



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On fre, 2010-06-25 at 10:17 -0400, Tom Lane wrote:
 The fix ought to be to change the function names used by plpython3 ...

 Right.  What shall we do about the catversion?

You could argue it either way.  The number of beta testers with
plpython3 installations is probably very small, so I'm kinda leaning to
just changing the code without a catversion bump.  OTOH, if we want to
encourage testing of pg_upgrade ...

regards, tom lane



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Josh Berkus

 You could argue it either way.  The number of beta testers with
 plpython3 installations is probably very small, so I'm kinda leaning to
 just changing the code without a catversion bump.  OTOH, if we want to
 encourage testing of pg_upgrade ...

FWIW, the last bump has led to a lot of testing of pg_upgrade.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Andres Freund
On Wednesday 23 June 2010 16:30:54 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I can reproduce this, here.  The problem seems to be that plpython
 only builds either plpython2.so or plpython3.so, but both languages
 expect a call handler called plpython_call_handler.  So once we load
  the shared library for one language, the other language just grabs the
  same call handler.
 
 We had better fix that --- I can definitely foresee installations
 wanting to use both plpython2 and plpython3.  This'd require a change in
 the default contents of pg_pltemplate, though.  Does it seem important
 enough to bump catversion for?
Has anybody actually researched if it is safe to run python2 and python3 in 
the same address space? I wouldn't be surprised at all if that were 
problematic.

Andres



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-25 Thread Peter Eisentraut
On fre, 2010-06-25 at 23:44 +0200, Andres Freund wrote:
 Has anybody actually researched if it is safe to run python2 and
 python3 in the same address space?

You can't run plpython2 and plpython3 in the same session, because the
libraries are loaded with dlopen(RTLD_GLOBAL) (with RTLD_LOCAL it would
apparently work).  But you can use them in different sessions on the
same database, for example.




Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-24 Thread Robert Haas
On Wed, Jun 23, 2010 at 10:49 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 23, 2010 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I can reproduce this, here.  The problem seems to be that plpython
 only builds either plpython2.so or plpython3.so, but both languages
 expect a call handler called plpython_call_handler.  So once we load
 the shared library for one language, the other language just grabs the
 same call handler.

 We had better fix that --- I can definitely foresee installations
 wanting to use both plpython2 and plpython3.  This'd require a change in
 the default contents of pg_pltemplate, though.  Does it seem important
 enough to bump catversion for?

 Yeah, I think so.  As for using both plpython2 and plpython3, it looks
 to me like right now you can only use one or the other.  I think if
 you somehow manage to install both, you're really just getting the
 same one twice (I have not tested this, however).

So, what's the right thing to do here?  Should we just fix it so that
creating the second language always fails, or should we try to make it
possible for both of them to coexist (which is probably a lot more
work)?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-24 Thread Peter Eisentraut
On ons, 2010-06-23 at 07:17 -0400, Robert Haas wrote:
 I can reproduce this, here.  The problem seems to be that plpython
 only builds either plpython2.so or plpython3.so, but both languages
 expect a call handler called plpython_call_handler.  So once we load
 the shared library for one language, the other language just grabs the
 same call handler.

The problem is apparently that when CREATE LANGUAGE creates a language
from a pg_pltemplate entry, it creates the proname from the tmplhandler
name, and if it finds a fitting proname entry already, it uses that one.
So when you create plpython2 first and plpython3 second, the pg_language
entries of the latter point to the pg_proc entries of the former.

If you fix that up manually (create additional pg_proc entries and fix
the pg_language entries to point there), it works better.





Re: [HACKERS] testing plpython3u on 9.0beta2

2010-06-23 Thread Robert Haas
On Tue, Jun 22, 2010 at 10:41 PM, Chris rfu...@gmail.com wrote:
 I received two errors (described below) in installing 9.0beta2 on Kubuntu
 10.04 , RhodiumToad on IRC recommended I post them here.
 I did not have a 2.x or 3.x python dev installed, but I was really only
 interested in python3 via plpython3u.
 So...
 sudo apt-get install python3-all-dev
 Configure works fine...
 ./configure --with-pgport=5433 --with-python --with-ossp-uuid --with-libxml
 --with-libxslt --with-perl
 make fails while trying to build plpython; cannot find Python.h, I
 believe. (I sadly did not save the text of the error and have since
 continued onward.  But it was not finding a file while building plpython.  I
 believe I would have to uninstall a few things to reproduce.)
 So I tried installing the 2.x dev.  After running sudo apt-get install
 python-all-dev then make;make check;sudo make install, it all worked fine.
 However,  when issuing a createlang plpython3u template1, I get
 createlang: language installation failed: ERROR:  could not access file
 $libdir/plpython3: No such file or directory.  But if I createlang
 plpython2u template1 first, then it will allow createlang plpython3u
 template1.
 Am I doing something incorrect, or...?
 (As an aside, functions created with LANGUAGE plpython3u appear to work.)

I can reproduce this, here.  The problem seems to be that plpython
only builds either plpython2.so or plpython3.so, but both languages
expect a call handler called plpython_call_handler.  So once we load
the shared library for one language, the other language just grabs the
same call handler.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


