Re: [HACKERS] too much pgbench init output

2012-12-26 Thread Jeevan Chalke
Looks good to me.

Will mark Ready for Committer

Thanks


On Thu, Dec 20, 2012 at 2:30 AM, Tomas Vondra t...@fuzzy.cz wrote:

 On 19.12.2012 06:30, Jeevan Chalke wrote:
 
 
 
  On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra t...@fuzzy.cz
  mailto:t...@fuzzy.cz wrote:
 
  Hi,
 
  attached is a new version of the patch that
 
  (a) converts the 'log_step_seconds' variable to a constant (and does
  not allow changing it using a command-line option etc.)
 
  (b) keeps the current logging as a default
 
  (c) adds a -q switch that enables the new logging with a 5-second
  interval
 
  I'm still not convinced there should be yet another know for tuning
 the
  log interval - opinions?
 
 
  It seems that you have generated a patch over your earlier version and
  due to that it is not cleanly applying on fresh sources.
  Please generate patch on fresh sources.

 Seems you're right - I've attached the proper patch against current master.

  However, I absolutely no issues with the design. Also code review is
  already done and looks good to me.
  I think to move forward on this we need someone from core-team. So I am
  planning to change its status to ready-for-committor.
 
  Before that please provide updated patch for final code review.

 thanks
 Tomas


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Jeevan B Chalke
Senior Software Engineer, RD
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.


Re: [HACKERS] proposal: regrole type?

2012-12-26 Thread Pavel Stehule
2012/12/26 Pavel Stehule pavel.steh...@gmail.com:
 2012/12/25 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 * We can reduce to half lot of functions \df has_* (84 functions)

 Not without breaking existing queries.  A function taking regrole might
 look like it substitutes for one taking a text-string user name as long
 as you only pass literal constants to it, but as soon as you pass
 non-constants you'll find out different.  (Unless your plan is to also
 create an implicit cast from text to regrole, which strikes me as a
 seriously bad idea.)

I was little bit surprised so regproc, regprocedure is not used  on
SQL level in our builtin functions - and I use both types often in our
custom queries.

So it can be similar with regrole and regaclrole - it can be addressed
for more orthogonal work with roles

I am sending patch, but I will not assign to commitfest now.

Regards

Pavel


regrole.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-26 Thread Robert Treat
On Mon, Dec 24, 2012 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 What the patch doesn't change is the requirement to have a file that
 causes the server to place itself into archive recovery. So there is
 no more recovery.conf and instead we have a file called
 recovery.trigger instead.

 Requiring a file in order to make a server a replica is what we should
 be trying to get away from.  It should be possible to configure a server
 as a replica by setting a GUC in PostgreSQL.conf (before first startup,
 obviously).

 I'm not entirely convinced about that, because if we do it like that, we
 will *never*, *ever* be able to store GUC settings except in a flat,
 editable textfile.  Now, that's fine by me personally, but there seem to
 be a lot of people around here with ambitions to bury those settings in
 not-so-editable places.  Including you, to judge by your next sentence:

 Naturally, this then links in with SET PERSISTENT or
 however we're calling it these days in order to take a server out of
 replica mode.

 People are going to want to be able to push a server into, and possibly
 out of, replica mode without necessarily having the server up at the
 time.  So I'm not real convinced that we want that flag to be a GUC.
 A trigger file is a lot easier to manipulate from places like shell
 scripts.


I'm not sure that my POV exactly matches up with Tom's, but on the
last point, I strongly agree that the use of the trigger file makes it
trivial to integrate Postgres warm standby management into 3rd party
tools. I'm not against coming up with a new API that's better for
postgres dedicated tools, but I think you're going to really make it
harder for people if you eliminate the trigger file method for coming
out of recovery.

Robert Treat
play: xzilla.net
work: omniti.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Smith
To try and speed up replicating this problem I switched to a smaller 
database scale, 100, and I was able to get a crash there.  Here's the 
latest:


2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 
blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero refcount 
is 1
TRAP: FailedAssertion(!(RefCountErrors == 0), File: bufmgr.c, Line: 
1720)


That's the same weird 1073741824 count as before.  I was planning to 
dump some index info, but then I saw this:


$ psql -d pgbench -c select relname,relkind,relfilenode from pg_class 
where relfilenode=57610

 relname  | relkind | relfilenode
--+-+-
 pgbench_accounts | r   |   57610

Making me think this isn't isolated to being an index problem.  I tried 
to soldier on with pg_filedump anyway.  It looks like the last version I 
saw there (9.2.0 from November) doesn't compile anymore:


$ make -f Makefile.contrib USE_PGXS=1
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -g -I. -I. 
-I/var/lib/pgsql/pgwork/inst/baseline/include/server 
-I/var/lib/pgsql/pgwork/inst/baseline/include/internal -D_GNU_SOURCE 
-c -o pg_filedump.o pg_filedump.c -MMD -MP -MF .deps/pg_filedump.Po

pg_filedump.c: In function ‘FormatHeader’:
pg_filedump.c:617: error: request for member ‘xlogid’ in something not a 
structure or union
pg_filedump.c:617: error: request for member ‘xrecoff’ in something not 
a structure or union

pg_filedump.c: In function ‘FormatItem’:
pg_filedump.c:904: error: invalid application of ‘sizeof’ to incomplete 
type ‘HeapTupleHeaderData’

...

Lots more after that.  Does this need an update or is there just a devel 
version I should grab?


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 To try and speed up replicating this problem I switched to a smaller 
 database scale, 100, and I was able to get a crash there.  Here's the 
 latest:

 2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 
 blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
 2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero refcount 
 is 1
 TRAP: FailedAssertion(!(RefCountErrors == 0), File: bufmgr.c, Line: 
 1720)

 That's the same weird 1073741824 count as before.  I was planning to 
 dump some index info, but then I saw this:

 $ psql -d pgbench -c select relname,relkind,relfilenode from pg_class 
 where relfilenode=57610
   relname  | relkind | relfilenode
 --+-+-
   pgbench_accounts | r   |   57610

 Making me think this isn't isolated to being an index problem.

Yeah, that destroys my theory that there's something broken about index
management specifically.  Now we're looking for something that can
affect any buffer's refcount, which more than likely means it has
nothing to do with the buffer's contents ...

 I tried 
 to soldier on with pg_filedump anyway.  It looks like the last version I 
 saw there (9.2.0 from November) doesn't compile anymore:

Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
I haven't gotten around to doing that yet, but would gladly take a
patch if anyone wants to do it.  However, I now doubt that examining
the buffer content will help much on this problem.

Now that we know the bug's reproducible on smaller instances, could you
put together an exact description of what you're doing to trigger
it?  What is the DB configuration, pgbench parameters, etc?

Also, it'd be worthwhile to just repeat the test a few more times
to see if there's any sort of pattern in which buffers get affected.
I'm now suspicious that it might not always be just one buffer,
for example.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread anara...@anarazel.de


Tom Lane t...@sss.pgh.pa.us schrieb:

Greg Smith g...@2ndquadrant.com writes:
 To try and speed up replicating this problem I switched to a smaller 
 database scale, 100, and I was able to get a crash there.  Here's the

 latest:

 2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of
base/16384/57610 
 blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
 2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero
refcount 
 is 1
 TRAP: FailedAssertion(!(RefCountErrors == 0), File: bufmgr.c,
Line: 
 1720)

 That's the same weird 1073741824 count as before.  I was planning to 
 dump some index info, but then I saw this:

 $ psql -d pgbench -c select relname,relkind,relfilenode from
pg_class 
 where relfilenode=57610
   relname  | relkind | relfilenode
 --+-+-
   pgbench_accounts | r   |   57610

 Making me think this isn't isolated to being an index problem.

Yeah, that destroys my theory that there's something broken about index
management specifically.  Now we're looking for something that can
affect any buffer's refcount, which more than likely means it has
nothing to do with the buffer's contents ...

 I tried 
 to soldier on with pg_filedump anyway.  It looks like the last
version I 
 saw there (9.2.0 from November) doesn't compile anymore:

Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
I haven't gotten around to doing that yet, but would gladly take a
patch if anyone wants to do it.  However, I now doubt that examining
the buffer content will help much on this problem.

Now that we know the bug's reproducible on smaller instances, could you
put together an exact description of what you're doing to trigger
it?  What is the DB configuration, pgbench parameters, etc?

Also, it'd be worthwhile to just repeat the test a few more times
to see if there's any sort of pattern in which buffers get affected.
I'm now suspicious that it might not always be just one buffer,
for example.

I don't think its necessarily only one buffer - if I read the above output 
correctly Greg used the suggested debug output which just put the elog(WARN) 
before the Assert...

Greg, could you output all bad buffers and only assert after the loop if 
there was at least one refcounted buffer?

Andres

--- 
Please excuse the brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Smith

On 12/26/12 1:58 PM, anara...@anarazel.de wrote:


I don't think its necessarily only one buffer - if I read the above output 
correctly Greg used the suggested debug output which just put the elog(WARN) 
before the Assert...

Greg, could you output all bad buffers and only assert after the loop if 
there was at least one refcounted buffer?


I've been doing that for a while, the only thing that's new is getting 
the block number.  It's only ever been one buffer involved.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-26 Thread Josh Berkus



I'm not sure that my POV exactly matches up with Tom's, but on the
last point, I strongly agree that the use of the trigger file makes it
trivial to integrate Postgres warm standby management into 3rd party
tools. I'm not against coming up with a new API that's better for
postgres dedicated tools, but I think you're going to really make it
harder for people if you eliminate the trigger file method for coming
out of recovery.


Huh.  My experience integrating PostgreSQL with Puppet or SALT 
infrastructures is that they don't understand trigger files, but they do 
understand configuration+restart/reload.  Before we get off on an 
argument about which is better, though, here's an important question: 
how difficult would it be to make the trigger file optional, but still 
effective?


That is, I personally don't care if other people use trigger files, I 
just hate to be forced to use them myself.  Is it possible to support 
both options without making either the code or the API hopelessly confusing?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-26 Thread Heikki Linnakangas

On 26.12.2012 21:55, Josh Berkus wrote:



I'm not sure that my POV exactly matches up with Tom's, but on the
last point, I strongly agree that the use of the trigger file makes it
trivial to integrate Postgres warm standby management into 3rd party
tools. I'm not against coming up with a new API that's better for
postgres dedicated tools, but I think you're going to really make it
harder for people if you eliminate the trigger file method for coming
out of recovery.


Huh. My experience integrating PostgreSQL with Puppet or SALT
infrastructures is that they don't understand trigger files, but they do
understand configuration+restart/reload. Before we get off on an
argument about which is better, though, here's an important question:
how difficult would it be to make the trigger file optional, but still
effective?

That is, I personally don't care if other people use trigger files, I
just hate to be forced to use them myself. Is it possible to support
both options without making either the code or the API hopelessly
confusing?


There already are two ways to promote a server out of recovery. One is 
creating the trigger file. The other is pg_ctl promote. (it uses a 
trigger file called $PGDATA/promote internally, but that's invisible to 
the user).


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Switching timeline over streaming replication

2012-12-26 Thread Heikki Linnakangas

On 23.12.2012 16:37, Fujii Masao wrote:

On Fri, Dec 21, 2012 at 1:48 AM, Fujii Masaomasao.fu...@gmail.com  wrote:

On Sat, Dec 15, 2012 at 9:36 AM, Fujii Masaomasao.fu...@gmail.com  wrote:

I found another requested timeline does not contain minimum recovery point
error scenario in HEAD:

1. Set up the master 'M', one standby 'S1', and one cascade standby 'S2'.
2. Shutdown the master 'M' and promote the standby 'S1', and wait for 'S2'
 to reconnect to 'S1'.
3. Set up new cascade standby 'S3' connecting to 'S2'.
 Then 'S3' fails to start the recovery because of the following error:

 FATAL:  requested timeline 2 does not contain minimum recovery
point 0/300 on timeline 1
 LOG:  startup process (PID 33104) exited with exit code 1
 LOG:  aborting startup due to startup process failure

The result of pg_controldata of 'S3' is:

Latest checkpoint location:   0/388
Prior checkpoint location:0/260
Latest checkpoint's REDO location:0/388
Latest checkpoint's REDO WAL file:00020003
Latest checkpoint's TimeLineID:   2
snip
Min recovery ending location: 0/300
Min recovery ending loc's timeline:   1
Backup start location:0/0
Backup end location:  0/0

The content of the timeline history file '0002.history' is:

1   0/388   no recovery target specified


I still could reproduce this problem. Attached is the shell script
which reproduces the problem.


This problem happens when new standby starts up from the backup
taken from another standby and its recovery starts from the shutdown
checkpoint record which causes timeline switch. In this case,
the timeline of minimum recovery point can be different from that of
latest checkpoint (i.e., shutdown checkpoint). But the following check
in StartupXLOG() assumes that they are always the same wrongly.
So the problem happens.

/*
 * The min recovery point should be part of the requested timeline's
 * history, too.
 */
if (!XLogRecPtrIsInvalid(ControlFile-minRecoveryPoint)
tliOfPointInHistory(ControlFile-minRecoveryPoint - 1, 
expectedTLEs) !=
ControlFile-minRecoveryPointTLI)
ereport(FATAL,
(errmsg(requested timeline %u does not contain 
minimum recovery
point %X/%X on timeline %u,
recoveryTargetTLI,
(uint32) 
(ControlFile-minRecoveryPoint  32),
(uint32) 
ControlFile-minRecoveryPoint,

ControlFile-minRecoveryPointTLI)));


No, it doesn't assume that min recovery point is on the same timeline as 
the checkpoint record. This is another variant of the timeline history 
files are not included in the backup problem discussed on the other 
thread with subject pg_basebackup from cascading standby after timeline 
switch. If you remove the min recovery point check above, the test case 
still fails, with a different error message:


LOG:  unexpected timeline ID 1 in log segment 00020003, 
offset 0


If you modify the test script to copy the 0002.history file to the 
data-standby3/pg_xlog after running pg_basebackup, the test case works. 
(we still need to fix it, of course)


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-26 Thread Josh Berkus



There already are two ways to promote a server out of recovery. One is
creating the trigger file. The other is pg_ctl promote. (it uses a
trigger file called $PGDATA/promote internally, but that's invisible to
the user).


Right, I was thinking of the trigger file to put a server *into* 
replication.  That is, recovery.conf.



--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Stark
On Wed, Dec 26, 2012 at 6:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that destroys my theory that there's something broken about index
 management specifically.  Now we're looking for something that can
 affect any buffer's refcount, which more than likely means it has
 nothing to do with the buffer's contents ...

Hardware problem is still a plausible candidate. Have you run any
memory checker software or seen anything else crash? Classically gcc
is as good at detecting memory problems as memory checking software.
Or a bad cpu can also sometimes cause problems like this.  Have you
been able to reproduce on any other machines?

Did you ever say what kind of hardware it was? This is the local
reference count so I can't see how it could be a race condition or
anything like that but it sure smells a bit like one.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Stark
On Wed, Dec 26, 2012 at 6:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that destroys my theory that there's something broken about index
 management specifically.  Now we're looking for something that can
 affect any buffer's refcount, which more than likely means it has
 nothing to do with the buffer's contents ...

Also, do you have the buffer id of the broken buffer? I wonder if it's
not just any buffer but always the same same buffer even if it's a
different block in that buffer. e.g. maybe it's always the first
buffer because something is overwriting past the end of the
BufferBlocks array which is declared immediately before
PrivateRefCount. (Or maybe your compiler is laying out these objects
in a different way from most people's compilers and we're overwriting
past the end of some other object routinely but yours is the only
place where it's being laid out preceding a critical data structure)

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Smith

On 12/26/12 5:40 PM, Greg Stark wrote:

Also, do you have the buffer id of the broken buffer? I wonder if it's
not just any buffer but always the same same buffer even if it's a
different block in that buffer.


I just added something looking for that.

Before I got to that I found another crash:

2012-12-26 18:01:42 EST [973]: WARNING:  refcount of base/16384/65553 
blockNum=22140, flags=0x1a7 is 1073741824 should be 0, globally: 0

2012-12-26 18:01:42 EST [973]: WARNING:  buffers with non-zero refcount is 1
-bash-4.1$ export PGPORT=5433
-bash-4.1$ psql -d pgbench -c select relname,relkind,relfilenode from 
pg_class where relfilenode=65553

relname| relkind | relfilenode
---+-+-
 pgbench_accounts_pkey | i   |   65553

So back to an index again.


(Or maybe your compiler is laying out these objects
in a different way from most people's compilers and we're overwriting
past the end of some other object routinely but yours is the only
place where it's being laid out preceding a critical data structure)


I doubt there is anything special about this compiler, given that it's 
the standard RedHat 6 build stack cloned via Scientific Linux 6.0.


The two things I expect I'm doing differently than most tests are:

-Using 2GB for shared_buffers
-Running a write heavy test that goes for many hours

It would be nice if this were just something like a memory issue on this 
system.  That I'm getting the same very odd value every time--this 
refcount of 1073741824--makes it seem less random than I expect from bad 
memory.  Once I get a few more crash samples (with buffer ids) I'll shut 
the system down for a pass of memtest86+.


Regardless, I've copied over the same source code and test configuration 
to a similar system here.  If I can reproduce this on a second system, 
I'll push all the details out to the list, hopeful that other people 
might see it too.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Smith

On 12/26/12 5:28 PM, Greg Stark wrote:

Did you ever say what kind of hardware it was? This is the local
reference count so I can't see how it could be a race condition or
anything like that but it sure smells a bit like one.


Agreed, that smell is the reason I'm proceeding so far like this is an 
obscure bug rather than something with my hardware or install.  The CPU 
is an Intel i7 870 using HyperThreading to simulate 8 cores, 16GB of 
RAM.  The database is on a single regular SATA drive.  I simplified the 
storage setup before I started public reports.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Greg Stark
On Wed, Dec 26, 2012 at 11:47 PM, Greg Smith g...@2ndquadrant.com wrote:
 It would be nice if this were just something like a memory issue on this
 system.  That I'm getting the same very odd value every time--this refcount
 of 1073741824--makes it seem less random than I expect from bad memory.
 Once I get a few more crash samples (with buffer ids) I'll shut the system
 down for a pass of memtest86+.

Well that's a one-bit error and it would never get detected until the
value was decremented down to what should be zero so that's pretty
much exactly what I would expect to see from a memory or cpu error.

What's odd is that it's always hitting the LocalRefCount array, not
any other large data structure. For 2GB of buffers the LocalRefCount
will be 1MB per client. That's a pretty big target but it's hardly the
only such data structure in Postgres.

It's also possible it's a bad cpu, not bad memory. If it affects
decrement or increment in particular it's possible that the pattern of
usage on LocalRefCount is particularly prone to triggering it.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Fabrízio de Royes Mello
Hi all,

This proposal is about add a column datcreated on pg_database to store
the timestamp of the database creation.

A couple weeks ago I had a trouble with a PostgreSQL instance, actually our
ERP had some strange behaviors with some data loss, but I searched for
ERRORs in log files (OS and PG) and I found nothing.

Looking at the files and directories in the cluster noticed something
strange, the date / time of the file base//PG_VERSION (database of
our ERP) was different compared to when we create it. So I used the
following SQL to check the date / time of creation of the databases in the
cluster:

fabrizio=# SELECT datname,
(pg_stat_file('base/'||oid||'/PG_VERSION')).modification AS datcreated
fabrizio-#   FROM pg_database;
  datname  |   datcreated
---+
 template1 | 2012-12-26 12:11:53-02
 template0 | 2012-12-26 12:11:54-02
 postgres  | 2012-12-26 12:11:54-02
 fabrizio  | 2012-12-26 12:12:02-02
(4 rows)

This isn't an elegant solution to do that, but worked fine. However, why
not we have a column to store this information?

Somebody have another idea?

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Josh Berkus

On 12/26/12 4:48 PM, Fabrízio de Royes Mello wrote:

Hi all,

This proposal is about add a column datcreated on pg_database to store
the timestamp of the database creation.


I agree that it would be useful.  However, if we're going to get into 
created dates, we should at least consider adding them to the other 
catalogs, particularly pg_class.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 On 12/26/12 4:48 PM, Fabrízio de Royes Mello wrote:
 This proposal is about add a column datcreated on pg_database to store
 the timestamp of the database creation.
 
 I agree that it would be useful.  However, if we're going to get
 into created dates, we should at least consider adding them to the
 other catalogs, particularly pg_class.

I was thinking more-or-less the same thing.  Along those lines, however,
perhaps we should put them into a separate catalog to avoid the
increased size of pg_class and friends..?  Also, we'd probably have 2 of
those, one for global and one for per-database objects, ala pg_depend
and pg_shdepend, and then a view that brings it all together, resolves
the OIDs to names, etc.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] buffer assertion tripping under repeat pgbench load

2012-12-26 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 On Wed, Dec 26, 2012 at 11:47 PM, Greg Smith g...@2ndquadrant.com wrote:
 It would be nice if this were just something like a memory issue on this
 system.  That I'm getting the same very odd value every time--this refcount
 of 1073741824--makes it seem less random than I expect from bad memory.
 Once I get a few more crash samples (with buffer ids) I'll shut the system
 down for a pass of memtest86+.

 Well that's a one-bit error and it would never get detected until the
 value was decremented down to what should be zero so that's pretty
 much exactly what I would expect to see from a memory or cpu error.

Yeah, the fact that it's always the same bit makes it seem like it could
be one bad physical bit.  (Does this machine have ECC memory??)

The thing that this theory has a hard time with is that the buffer's
global refcount is zero.  If you assume that there's a bit that
sometimes randomly goes to 1 when it should be 0, then what I'd expect
to typically happen is that UnpinBuffer sees nonzero LocalRefCount and
hence doesn't drop the session's global pin when it should.  The only
way that doesn't happen is if decrementing LocalRefCount to zero stores
a nonzero pattern when it should store zero, but nonetheless the CPU
thinks it stored zero.  As you say there's some small possibility of a
CPU glitch doing that, but then why is it only happening to
LocalRefCount and not any other similar coding?

At the moment I like the other theory you alluded to, that this is a
wild store from code that thinks it's manipulating some other data
structure entirely.  The buffer IDs will help confirm or refute that
perhaps.  No idea ATM how we would find the problem if it's like that
...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Josh Berkus (j...@agliodbs.com) wrote:
 On 12/26/12 4:48 PM, Fabrízio de Royes Mello wrote:
 This proposal is about add a column datcreated on pg_database to store
 the timestamp of the database creation.

 I agree that it would be useful.  However, if we're going to get
 into created dates, we should at least consider adding them to the
 other catalogs, particularly pg_class.

 I was thinking more-or-less the same thing.

This has been debated, and rejected, before.

To mention just one problem, are we going to add nonstandard,
non-backwards-compatible syntax to every single kind of CREATE to allow
pg_dump to preserve the creation dates?  Another interesting question is
whether we should likewise track the last ALTER time, or perhaps whether
a sufficiently major ALTER redefinition should update the creation time.

I'm inclined to think that anyone who really needs this should be
pointed at event triggers.  That feature (if it gets in) will allow
people to track creation/DDL-change times with exactly the behavior
they want.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 To mention just one problem, are we going to add nonstandard,
 non-backwards-compatible syntax to every single kind of CREATE to allow
 pg_dump to preserve the creation dates?

Perhaps 'ALTER' would be a better place to put it, but concerns around
how to make pg_dump work with it hardly strikes me as a serious argument
against this.  I agree that we may be overloading ourselves with syntax
but that's a compromise we made long ago in order to have pg_dump be
able to act like a regular 'user'.

 Another interesting question is
 whether we should likewise track the last ALTER time, or perhaps whether
 a sufficiently major ALTER redefinition should update the creation time.

Yes, tracking the last 'ALTER' time would be useful as well, as it's own
field.  'ALTER' wouldn't change the 'CREATE' time, except perhaps if it
has an explicit 'make the CREATE time X' option.

 I'm inclined to think that anyone who really needs this should be
 pointed at event triggers.  That feature (if it gets in) will allow
 people to track creation/DDL-change times with exactly the behavior
 they want.

I considered that and rejected it.  Event triggers will be great to
allow people to customize and/or specialize exactly what is tracked and
how, but I dislike that they would be the only way to get this
information.  I'm on the fence about if, assuming event triggers go in,
we provide this kind of information through a 'default' set of event
triggers.  I wouldn't want users to be able to modify those event
triggers and I'd expect the results to go into a system table that we
wouldn't want users messing with either.

This information could be extremely useful for forensics, debugging, ETL
processes (many of which create tables as part of their processes), etc.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Proposal: Store timestamptz of database creation on pg_database

2012-12-26 Thread Josh Berkus



This information could be extremely useful for forensics, debugging, ETL
processes (many of which create tables as part of their processes), etc.


I'd say moderately useful at best.  Quite a number of things could 
make the creation dates misleading or not distinctive (think partition 
replacement, restore from pg_dump, replicas, etc.).   ALTER dates would 
be more useful, but as Tom points out, would need the 
user-configurability which can only be delivered by something like event 
triggers.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] dynamic SQL - possible performance regression in 9.2

2012-12-26 Thread Pavel Stehule
Hello

I rechecked performance of dynamic SQL and it is significantly slower
in 9.2 than 9.1

-- 9.1
postgres=# create or replace function test() returns void as $$ begin
for i in 1..100 loop execute 'select 1'; end loop; end $$ language
plpgsql;
CREATE FUNCTION
postgres=# \timing
Timing is on.
postgres=# select test();
 test
--

(1 row)

Time: 7652.904 ms
postgres=# select test();
 test
--

(1 row)

Time: 7828.025 ms

-- 9.2
postgres=# create or replace function test() returns void as $$ begin
for i in 1..100 loop execute 'select 1'; end loop; end $$ language
plpgsql;
CREATE FUNCTION
Time: 59.272 ms
postgres=# select test();
 test
--

(1 row)

Time: 11153.646 ms
postgres=# select test();
 test
--

(1 row)

Time: 11081.899 ms

This test is synthetic, but it shows so somebody who use dynamic SQL
in triggers (for partitioning) can has slower operations.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers