date:20120904

On Tue, Sep 4, 2012 at 6:57 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Hi all,

 I found some whitespace characters in utility.c introduced by commit
 3a0e4d3.
 Please find attached a patch fixing that which can be applied on postgres
 master (commit 2f0c7d5).

That probably exists in many other places in the source as well, but
it's certainly fairly ugly. So I see nothing wrong cleaning it up whe
nspotted.

Thus, patch applied, thanks!

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] txid failed epoch increment, again, aka 6291

2012-09-04 Thread Daniel Farina

It seems like this has reproduced once more.  And once again, there
doesn't appear to be any funny business in pg_control (but the structs
are pasted here for your re-check), and there are successful sensical
updates to it.  The primary is running 9.0.6.

However, we do have a new piece of data: there was a very brief period
where txid_snapshot did report an xmin greater than 2^33, our next
epoch boundary, by a few thousand transactions.  That could be because
the reporting function GetNextXidAndEpoch does its own epoch
calculation before the checkpoint and then after a checkpoint that
forgets to increment the epoch there is no need to add post-facto
adjust the epoch anymore.

I've been reviewing the mechanism in CreateCheckPoint for this on and
off for a couple of days, but so far I haven't come up with a
convincing mechanism. However, given that it seems historically that
this bug is more likely to surface than *not* surface on this system,
perhaps we can try for a sometimes-reproducing test case. I'm still
struggling for a hint of a solution, though, so toss your thoughts
here.

$2 = {Insert = {LogwrtResult = {Write = {xlogid = 0, xrecoff = 0},
Flush = {xlogid = 0, xrecoff = 0}}, PrevRecord = {xlogid = 0,
  xrecoff = 0}, curridx = 0, currpage = 0x7ff4ed04a000, currpos =
0x0, RedoRecPtr = {xlogid = 18751, xrecoff = 1200832888},
forcePageWrites = 0 '\000'}, LogwrtRqst = {Write = {xlogid = 0,
xrecoff = 0}, Flush = {xlogid = 0, xrecoff = 0}}, LogwrtResult = {
Write = {xlogid = 0, xrecoff = 0}, Flush = {xlogid = 0, xrecoff =
0}}, ckptXidEpoch = 1, ckptXid = 9904084, asyncXactLSN = {
xlogid = 0, xrecoff = 0}, lastRemovedLog = 0, lastRemovedSeg = 0,
Write = {LogwrtResult = {Write = {xlogid = 0, xrecoff = 0},
  Flush = {xlogid = 0, xrecoff = 0}}, curridx = 0,
lastSegSwitchTime = 0}, pages = 0x7ff4ed04a000 , xlblocks =
0x7ff4ed0471d8,
  XLogCacheBlck = 1023, ThisTimeLineID = 0, RecoveryTargetTLI = 6,
archiveCleanupCommand = '\000' repeats 1023 times,
  SharedRecoveryInProgress = 1 '\001', lastCheckPointRecPtr = {xlogid
= 18751, xrecoff = 1671519088}, lastCheckPoint = {redo = {
  xlogid = 18751, xrecoff = 1200832888}, ThisTimeLineID = 6,
nextXidEpoch = 1, nextXid = 9904084, nextOid = 2047524,
nextMulti = 1119, nextMultiOffset = 3115, oldestXid = 4115479553,
oldestXidDB = 1, time = 1346746796, oldestActiveXid = 9776547},
  replayEndRecPtr = {xlogid = 18751, xrecoff = 1748623656},
recoveryLastRecPtr = {xlogid = 18751, xrecoff = 1748623656},
  recoveryLastXTime = 400062234671833, info_lck = 0 '\000'}
(gdb) p ControlFile
$3 = (ControlFileData *) 0x7ff4ed046bf8
(gdb) p *ControlFile
$4 = {system_identifier = 5613733157253676693, pg_control_version =
903, catalog_version_no = 201008051,
  state = DB_IN_ARCHIVE_RECOVERY, time = 1346746898, checkPoint =
{xlogid = 18751, xrecoff = 1072693824}, prevCheckPoint = {
xlogid = 18751, xrecoff = 1072693824}, checkPointCopy = {redo =
{xlogid = 18751, xrecoff = 602482536}, ThisTimeLineID = 6,
nextXidEpoch = 1, nextXid = 9904084, nextOid = 2047524, nextMulti
= 1119, nextMultiOffset = 3115, oldestXid = 4115479553,
oldestXidDB = 1, time = 1346746496, oldestActiveXid = 9558248},
minRecoveryPoint = {xlogid = 18751, xrecoff = 1748623656},
  backupStartPoint = {xlogid = 0, xrecoff = 0}, wal_level = 2,
MaxConnections = 500, max_prepared_xacts = 500, max_locks_per_xact =
64,
  maxAlign = 8, floatFormat = 1234567, blcksz = 8192, relseg_size =
131072, xlog_blcksz = 8192, xlog_seg_size = 16777216,
  nameDataLen = 64, indexMaxKeys = 32, toast_max_chunk_size = 1996,
enableIntTimes = 1 '\001', float4ByVal = 1 '\001',
  float8ByVal = 1 '\001', crc = 3725972657}

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Minor document updates

2012-09-04 Thread Etsuro Fujita

I noticed the syntax of the \copy command in the psql reference page is an old
style.  ISTM it's better to update the document.  Please find attached a patch.

Thanks,

Best regards,
Etsuro Fujita


psql-copy-ref.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

2012-09-04 Thread Dimitri Fontaine

Heikki Linnakangas hlinn...@iki.fi writes:
 Hmm, I was thinking that when walsender gets the position it can send the
 WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the current
 recovery timeline. If it has changed, refuse to send the new WAL and
 terminate. That would be a fairly small change, it would just close the
 window between requesting walsenders to terminate and them actually
 terminating.

It looks to me like a bug fix that also applies to non cascading
situation. Is that right?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-09-04 Thread Amit Kapila

On Tuesday, September 04, 2012 11:00 AM Andres Freund wrote:
On Tuesday, September 04, 2012 06:20:59 AM Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I can see why that would be nice, but is it really realistic? Don't we
  expect some more diligence in applications using this against letting
  such a child continue to run after ctrl-c/SIGTERMing e.g. pg_dump in
  comparison to closing a normal database connection?
 
 Er, what?  If you kill the client, the child postgres will see
 connection closure and will shut down.  I already tested that with the
 POC patch, it worked fine.

 Well, but that will make scripting harder because you cannot start another 
 single backend pg_dump before the old backend noticed it, checkpointed and 
 shut down.

  But isn't that behavior will be similar when currently server is shutting 
down due to 
  CTRL-C, and at that time new clients will not be allowed to connect. 
  As this new interface is an approach similar to embedded database where first 
API (StartServer)
  or at connect time it starts database and the other connection might not be 
allowed during 
  shutdown state.

With Regards,
Amit Kapila.
  




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multiple setup steps for isolation tests

 Tom Lane  wrote:
 Kevin Grittner  writes:
 Tom Lane wrote:
 The grammar changes look wrong: I think you eliminated the
 ability to have zero setup steps, no? Instead, setup_list should
 expand to either empty or setup_list setup.
 
 I tried that first, but had shift/reduce conflicts.
 
 [ scratches head ... ] Dunno what you did exactly, but the attached
 version works fine for me.
 
[ slaps forhead ]
 
Yeah, that should do it.  Will apply.
 
Thanks.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] index-only scans versus serializable transactions

Tom Lane  wrote:
 Kevin Grittner  writes:
 By not visiting the heap page for tuples, index-only scans fail to
 acquire all of the necessary predicate locks for correct behavior
 at the serializable transaction isolation level. The tag for the
 tuple-level predicate locks includes the xmin, to avoid possible
 problems with tid re-use. (This was not covered in initial
 pre-release versions of SSI, and testing actually hit the
 problem.)  When an index-only scan does need to look at the heap
 because the visibility map doesn't indicate that the tuple is
 visible to all transactions, the tuple-level predicate lock is
 acquired. The best we can do without visiting the heap is a page
 level lock on the heap page, so that is what the attached patch
 does.
 
 If there are no objections, I will apply to HEAD and 9.2.
 
 This isn't right in detail: there are paths through the loop where
 tuple is not NULL at the beginning of the next iteration
 (specifically, consider failure of a lossy-qual recheck). I think
 that only results in wasted work, but it's still not operating as
 intended. I'd suggest moving the declaration/initialization of the
 tuple variable to inside the while loop, since there's no desire
 for its value to carry across loops.
 
You're right.  It looks to me like moving the declaration (and
initialization) to more local scape (just inside the loop) fixes it.
 
New version attached.  Will apply if no further problems are found.
 
-Kevin




index-only-serializable-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] 9.2rc1 produces incorrect results

2012-09-04 Thread Vik Reykja

Hello.  It took me a while to get a version of this that was independent of
my data, but here it is.  I don't understand what's going wrong but if you
change any part of this query (or at least any part I tried), the correct
result is returned.

This script will reproduce it:

=

create table t1 (id integer primary key);
create table t2 (id integer primary key references t1 (id));

insert into t1 (id) select generate_series(1, 10); -- size matters
insert into t2 (id) values (1); -- get a known value in the table
insert into t2 (id) select g from generate_series(2, 10) g where
random()  0.01; -- size matters again

analyze t1;
analyze t2;

with
A as (
select t2.id,
   t2.id = 1 as is_something
from t2
join t1 on t1.id = t2.id
left join pg_class pg_c on pg_c.relname = t2.id::text -- I haven't
tried on a user table
where pg_c.oid is null
),

B as (
select A.id,
   row_number() over (partition by A.id) as order -- this seems
to be important, too
from A
)

select A.id, array(select B.id from B where B.id = A.id) from A where
A.is_something
union all
select A.id, array(select B.id from B where B.id = A.id) from A where
A.is_something;

=

As you can (hopefully) see, the two UNIONed queries are identical but do
not return the same values.  I wish I had the skills to attach a patch to
this message, but alas I do not.

Re: [HACKERS] [COMMITTERS] pgsql: Make a cut at a major-features list for 9.2.

On Mon, Sep 3, 2012 at 6:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Wed, Aug 22, 2012 at 11:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Make a cut at a major-features list for 9.2.

 This is open to debate of course, but it's past time we had *something*
 here.

 We have cascading replication as a major feature, which I agree it is.
 But I think we should add the (closely related) base backups from
 slave to the list as well - I've had more people say they're really
 going to use that one than the cascading replication. Perhaps add them
 both on the same bullet point?

 Sure, reword it however you want.  I've not been paying much attention
 to replication stuff, so I don't really know what's important there.

Done.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_dump incorrect output in plaintext mode

On Fri, Aug 31, 2012 at 2:05 PM, Magnus Hagander mag...@hagander.net wrote:

 On Aug 28, 2012 9:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Tue, Aug 28, 2012 at 6:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I don't see anything particularly incorrect about that.  The point of
  the --verbose switch is to track what pg_dump is doing, and if what
  it's doing involves going through RestoreArchive(), why should we try
  to hide the fact?

  restoring data for table 't' makes you think it's actuall restoring
  things. It's not. That dumping is implemented by calling an internal
  function called RestoreArchive() has to be an implementation detail...
  It certainly confuses users that we say restoring when we're not
  doing that...

 Well, why don't we just s/restoring/processing/ in the debug message,
 and call it good?

 Sure, that would work for me... I can go do that if there are no objections.

Done.

Are we allowed to backpatch things to 9.2 at this point that changes
strings for translators?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_upgrade docs

This doc sentence about pg_upgrade is now inaccurate:

   If doing option--check/ with a running old server of a pre-9.1 version,
   and the old server is using a Unix-domain socket directory that is
   different from the default built into the new productnamePostgreSQL/
   installation, set envarPGHOST/ to point to the socket location of the
   old server.  (This is not relevant on Windows.)

The new detail is that this also affects non-live check and non-check
upgrades because pg_ctl -w doesn't work for pre-9.1 servers with the
socket in the current directory --- that was not known when this
documentation paragraph was written.

Applied doc patch attached.  The wording became pretty complex so I
tried to simplify it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
new file mode 100644
index 9e43f3c..301222c
*** a/doc/src/sgml/pgupgrade.sgml
--- b/doc/src/sgml/pgupgrade.sgml
*** psql --username postgres --file script.s
*** 520,530 
/para
  
para
!If doing option--check/ with a running old server of a pre-9.1 version,
!and the old server is using a Unix-domain socket directory that is
!different from the default built into the new productnamePostgreSQL/
!installation, set envarPGHOST/ to point to the socket location of the
!old server.  (This is not relevant on Windows.)
/para
  
para
--- 520,529 
/para
  
para
!If using a pre-9.1 old server that is using a non-default Unix-domain
!socket directory or a default that differs from the default of the
!new cluster, set envarPGHOST/ to point to the old server's socket
!location.  (This is not relevant on Windows.)
/para
  
para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer Management

2012-09-04 Thread Amit kapila

On Tuesday, September 04, 2012 12:42 AM Jeff Janes wrote:
On Mon, Sep 3, 2012 at 7:15 AM, Amit kapila amit.kap...@huawei.com wrote:
 This patch is based on below Todo Item:

 Consider adding buffers the background writer finds reusable to the free
 list



 I have tried implementing it and taken the readings for Select when all the
 data is in either OS buffers

 or Shared Buffers.



 The Patch has simple implementation for  bgwriter or checkpoint process
 moving the unused buffers (unpinned with ZERO usage_count buffers) into
 freelist.

 I don't think InvalidateBuffer can be safely used in this way.  It
 says We assume
 that no other backend could possibly be interested in using the page,
 which is not true here.

As I understood and anlyzed based on above, that there is problem in attached 
patch such that in function
InvalidateBuffer(), after UnlockBufHdr() and before PartitionLock if some 
backend uses that buffer and increase the usage count to 1, still
InvalidateBuffer() will remove the buffer from hash table and put it in 
Freelist. 
I have modified the code to address above by checking refcount  usage_count  
inside Partition Lock 
, LockBufHdr and only after that move it to freelist which is similar to 
InvalidateBuffer. 
In actual code we can optimize the current code by using extra parameter in 
InvalidateBuffer. 

Please let me know if I understood you correctly or you want to say something 
else by above comment?

 Also, do we want to actually invalidate the buffers?  If someone does
 happen to want one after it is put on the freelist, making it read it
 in again into a different buffer doesn't seem like a nice thing to do,
 rather than just letting it reclaim it.

But even if bgwriter/checkpoint don't do, Backend needing new buffer will do 
similar things (remove from hash table) for this buffer as this is nextvictim 
buffer. 
The main intention of doing the MoveBufferToFreeList is to avoid contention of 
Partition Locks and BufFreeListLock among backends, which 
has given Performance improvement in high contention scenarios.

One problem I could see with proposed change is that in some cases the usage 
count will get decrement for a buffer allocated 
from free list immediately as it can be nextvictimbuffer.
However there can be solution to this problem.

Can you suggest some scenario's where I should do more performance test?

With Regards,
Amit Kapila.diff --git a/src/backend/storage/buffer/bufmgr.c 
b/src/backend/storage/buffer/bufmgr.c
index dba19eb..87446cb 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -955,6 +955,88 @@ retry:
StrategyFreeBuffer(buf);
 }
 
+
+/*
+ * MoveBufferToFreeList -- mark a shared buffer invalid and return it to the
+ * freelist. which is similar to InvalidateBuffer function.
+ */
+static void
+MoveBufferToFreeList(volatile BufferDesc *buf)
+{
+   BufferTag   oldTag;
+   uint32  oldHash;/* hash value 
for oldTag */
+   LWLockIdoldPartitionLock;   /* buffer partition 
lock for it */
+   BufFlagsoldFlags;
+
+   /* Save the original buffer tag before dropping the spinlock */
+   oldTag = buf-tag;
+
+   UnlockBufHdr(buf);
+
+   /*
+* Need to compute the old tag's hashcode and partition lock ID. XXX is 
it
+* worth storing the hashcode in BufferDesc so we need not recompute it
+* here?  Probably not.
+*/
+   oldHash = BufTableHashCode(oldTag);
+   oldPartitionLock = BufMappingPartitionLock(oldHash);
+
+
+   /*
+* Acquire exclusive mapping lock in preparation for changing the 
buffer's
+* association.
+*/
+   LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
+
+   /* Re-lock the buffer header */
+   LockBufHdr(buf);
+
+   /* If it's changed while we were waiting for lock, do nothing */
+   if (!BUFFERTAGS_EQUAL(buf-tag, oldTag))
+   {
+   UnlockBufHdr(buf);
+   LWLockRelease(oldPartitionLock);
+   return;
+   }
+
+   /*
+* Validate wheather we can add the buffer into freelist or not
+*/
+   if ((buf-refcount != 0) || (buf-usage_count != 0))
+   {
+   UnlockBufHdr(buf);
+   LWLockRelease(oldPartitionLock);
+   return;
+   }
+
+   /*
+* Clear out the buffer's tag and flags.  We must do this to ensure that
+* linear scans of the buffer array don't think the buffer is valid.
+*/
+   oldFlags = buf-flags;
+   CLEAR_BUFFERTAG(buf-tag);
+   buf-flags = 0;
+   buf-usage_count = 0;
+
+   UnlockBufHdr(buf);
+
+   /*
+* Remove the buffer from the lookup hashtable, if it was in there.
+*/
+   if (oldFlags  BM_TAG_VALID)
+   BufTableDelete(oldTag, oldHash);
+
+   /*
+* Done with mapping lock.
+*/
+

Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-09-04 Thread Alexander Korotkov

On Mon, Aug 27, 2012 at 5:00 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 On 24.08.2012 18:51, Heikki Linnakangas wrote:

 On 20.08.2012 00:31, Alexander Korotkov wrote:

 New version of patch.
 * Collect new stakind STATISTIC_KIND_BOUNDS_**HISTOGRAM, which is lower
 and
 upper bounds histograms combined into single ranges array, instead
 of STATISTIC_KIND_HISTOGRAM.


 One worry I have about that format for the histogram is that you
 deserialize all the values in the histogram, before you do the binary
 searches. That seems expensive if stats target is very high. I guess you
 could deserialize them lazily to alleviate that, though.

  * Selectivity estimations for,=,,= using this
 histogram.


 Thanks!

 I'm going to do the same for this that I did for the sp-gist patch, and
 punt on the more complicated parts for now, and review them separately.
 Attached is a heavily edited version that doesn't include the length
 histogram, and consequently doesn't do anything smart for the  and 
 operators.  is estimated using the bounds histograms. There's now a
 separate stakind for the empty range fraction, since it's not included
 in the length-histogram.

 I tested this on a dataset containing birth and death dates of persons
 that have a wikipedia page, obtained from the dbpedia.org project. I can
 send a copy if someone wants it. The estimates seem pretty accurate.

 Please take a look, to see if I messed up something.


 Committed this with some further changes.


Addon patch is attached. Actually, I don't get your intention of
introducing STATISTIC_KIND_RANGE_EMPTY_FRAC stakind. Did you plan to leave
it as empty frac in distinct stakind or replace this stakind
with STATISTIC_KIND_LENGTH_HISTOGRAM? In the attached
patch STATISTIC_KIND_RANGE_EMPTY_FRAC is replaced
with STATISTIC_KIND_LENGTH_HISTOGRAM.

--
With best regards,
Alexander Korotkov.


range_stat-addon-0.1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-09-04 Thread Alexander Korotkov

On Mon, Aug 20, 2012 at 12:25 AM, Jeff Davis pg...@j-davis.com wrote:

 I am taking a look at this patch now. A few quick comments:

 * It looks like bounds_adjacent modifies it's by-reference arguments,
 which is a little worrying to me. The lower/upper labels are flipped
 back, but the inclusivities are not. Maybe just pass by value instead?

 * Bounds_adjacent is sensitive to the argument order. Can't it just take
 bound1 and bound2?


Fixed. Patch is attached.

--
With best regards,
Alexander Korotkov.


range_spgist_adjacent-0.2.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 pg_upgrade regression tests on WIndows

On Mon, Sep  3, 2012 at 12:44:09PM -0400, Andrew Dunstan wrote:
 The attached very small patch allows pg_upgrade's make check to
 succeed on REL9_2_STABLE on my Mingw system.
 
 However, I consider the issue I mentioned earlier regarding use of
 forward slashes in the argument to rmdir to be a significant
 blocker, so I'm going to go and fix that and then pull this all
 together.
 
 cheers
 
 andrew

 diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
 index 6f993df..57ca1df 100644
 --- a/contrib/pg_upgrade/exec.c
 +++ b/contrib/pg_upgrade/exec.c
 @@ -91,10 +91,12 @@ exec_prog(bool throw_error, bool is_priv, const char 
 *log_file,
   else
   retval = 0;
  
 +#ifndef WIN32
   if ((log = fopen_priv(log_file, a+)) == NULL)
   pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
   fprintf(log, \n\n);
   fclose(log);
 +#endif
  
   return retval;
  }

I am confused by this fix.  If pg_ctl was keeping that log file open,
wouldn't the log write fail when pg_dump or psql was run later?  I am
trying to understand how a later commands would not also trigger an
error.  Is it a timing thing?  If that is it, I would like to know and
have that documented.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_dump incorrect output in plaintext mode

On Tue, Sep 4, 2012 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Aug 28, 2012 9:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, why don't we just s/restoring/processing/ in the debug message,
 and call it good?

 Are we allowed to backpatch things to 9.2 at this point that changes
 strings for translators?

 Well, not being a translator I'm not sure that I get a vote.
 But I'd think this is too minor to justify back-patching it.

Probably - that's why I didn't even consider going back beyond 9.2.

Anyway; I'll just leave it at master.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_dump incorrect output in plaintext mode

Magnus Hagander mag...@hagander.net writes:
 On Aug 28, 2012 9:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, why don't we just s/restoring/processing/ in the debug message,
 and call it good?

 Are we allowed to backpatch things to 9.2 at this point that changes
 strings for translators?

Well, not being a translator I'm not sure that I get a vote.
But I'd think this is too minor to justify back-patching it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Some whitespaces in utility.c

Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 4, 2012 at 6:57 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 I found some whitespace characters in utility.c introduced by commit
 3a0e4d3.
 Please find attached a patch fixing that which can be applied on postgres
 master (commit 2f0c7d5).

 That probably exists in many other places in the source as well, but
 it's certainly fairly ugly. So I see nothing wrong cleaning it up whe
 nspotted.

Just as a note: we generally leave it to pgindent to fix this sort of
thing.  I'm not sure it's worth the effort of submitting manual patches
for, unless you have reason to think the next pgindent run won't fix it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix

Excerpts from Andrew Dunstan's message of mar sep 04 01:16:39 -0300 2012:

 And here's the first Windows buildfarm check of pg_upgrade. 
 http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=pittadt=2012-09-04%2003%3A00%3A05stg=check-pg_upgrade

Great, thanks.

Who's going to work now on porting the shell script to Perl? ;-)

Somehow the verbose reporting of user relation files being copied does
not seem exceedingly useful; and I don't remember seeing that on Linux.

Should this be tweaked to avoid outputting the status message?

c:\mingw\msys\1.0\home\pgrunner\bf\root\HEAD\pgsql.7020\contrib\pg_upgradeecho
ECHO is on.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix

On Tue, Sep  4, 2012 at 11:42:58AM -0300, Alvaro Herrera wrote:
 Excerpts from Andrew Dunstan's message of mar sep 04 01:16:39 -0300 2012:
 
  And here's the first Windows buildfarm check of pg_upgrade. 
  http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=pittadt=2012-09-04%2003%3A00%3A05stg=check-pg_upgrade
 
 Great, thanks.
 
 Who's going to work now on porting the shell script to Perl? ;-)

Well, we require Perl for development, but not for usage, at least not
yet.  There was talk of needing Perl for doing standby pg_upgrade, but
there were too many concerns about that idea.

 Somehow the verbose reporting of user relation files being copied does
 not seem exceedingly useful; and I don't remember seeing that on Linux.
 
 Should this be tweaked to avoid outputting the status message?
 
 c:\mingw\msys\1.0\home\pgrunner\bf\root\HEAD\pgsql.7020\contrib\pg_upgradeecho
 ECHO is on.

Probably.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2rc1 produces incorrect results

Vik Reykja vikrey...@gmail.com writes:
 Hello.  It took me a while to get a version of this that was independent of
 my data, but here it is.  I don't understand what's going wrong but if you
 change any part of this query (or at least any part I tried), the correct
 result is returned.

Huh.  9.1 gets the wrong answer too, so this isn't a (very) new bug;
but curiously, 8.4 and 9.0 seem to get it right.  I think this is
probably related somehow to Adam Mackler's recent report --- multiple
scans of the same CTE seems to be a bit of a soft spot :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix



On 09/04/2012 10:42 AM, Alvaro Herrera wrote:

Excerpts from Andrew Dunstan's message of mar sep 04 01:16:39 -0300 2012:


And here's the first Windows buildfarm check of pg_upgrade.
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=pittadt=2012-09-04%2003%3A00%3A05stg=check-pg_upgrade

Great, thanks.

Who's going to work now on porting the shell script to Perl? ;-)


Probably me, one day ...



Somehow the verbose reporting of user relation files being copied does
not seem exceedingly useful; and I don't remember seeing that on Linux.


Yes, it's a pain. Not sure what causes it.



Should this be tweaked to avoid outputting the status message?

c:\mingw\msys\1.0\home\pgrunner\bf\root\HEAD\pgsql.7020\contrib\pg_upgradeecho
ECHO is on.




Already fixed.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] txid failed epoch increment, again, aka 6291

2012-09-04 Thread Noah Misch

On Tue, Sep 04, 2012 at 02:07:30AM -0700, Daniel Farina wrote:
 It seems like this has reproduced once more.  And once again, there
 doesn't appear to be any funny business in pg_control (but the structs
 are pasted here for your re-check), and there are successful sensical
 updates to it.  The primary is running 9.0.6.

What version is the standby running?

 However, we do have a new piece of data: there was a very brief period
 where txid_snapshot did report an xmin greater than 2^33, our next
 epoch boundary, by a few thousand transactions.  That could be because
 the reporting function GetNextXidAndEpoch does its own epoch
 calculation before the checkpoint and then after a checkpoint that
 forgets to increment the epoch there is no need to add post-facto
 adjust the epoch anymore.

That makes sense.

 I've been reviewing the mechanism in CreateCheckPoint for this on and
 off for a couple of days, but so far I haven't come up with a
 convincing mechanism. However, given that it seems historically that
 this bug is more likely to surface than *not* surface on this system,
 perhaps we can try for a sometimes-reproducing test case. I'm still
 struggling for a hint of a solution, though, so toss your thoughts
 here.

The cause is not apparent to me, either, and simple tests here do not
reproduce the problem.

I am suspicious of xlog_redo() updating ControlData-checkPointCopy without
acquiring ControlFileLock.  If a restartpoint is finishing at about the same
time, ControlFile-checkPointCopy.nextXid might come from the just-read
checkpoint record while ControlFile-checkPointCopy.nextXidEpoch bears the
value from the older checkpoint just adopted as a restartpoint.  The resulting
inconsistency would, however, vanish at the next ControlFile-checkPointCopy
update.  This does not explain the symptoms you have reported, and it cannot
explain much of anything on a primary.

 (gdb) p *ControlFile
 $4 = {system_identifier = 5613733157253676693, pg_control_version =
 903, catalog_version_no = 201008051,
   state = DB_IN_ARCHIVE_RECOVERY, time = 1346746898, checkPoint =

This capture, it seems, is from a standby.

 {xlogid = 18751, xrecoff = 1072693824}, prevCheckPoint = {
 xlogid = 18751, xrecoff = 1072693824}, checkPointCopy = {redo =
 {xlogid = 18751, xrecoff = 602482536}, ThisTimeLineID = 6,
 nextXidEpoch = 1, nextXid = 9904084, nextOid = 2047524, nextMulti
 = 1119, nextMultiOffset = 3115, oldestXid = 4115479553,
 oldestXidDB = 1, time = 1346746496, oldestActiveXid = 9558248},

You expected checkPointCopy = { ... nextXidEpoch = 2, ... }, correct?  (Well,
nextXidEpoch = 3 if you count the previous missed increment.)  Does pg_control
on the primary also bear epoch 1 where epoch 2 is expected?

Your last restartpoint was exactly five minutes before your last checkpoint,
so there's no evidence of a dearth of safe restartpoint opportunities.

nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)

Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Also, I am a bit doubtful about the advice on sizing the
 connection pool as applied to small servers:
 surely it's not sane to recommend that a single-processor system
 with one disk should have max_connections = 3.  At least, *I*
 don't think that's sane.

 I'm not sure it's wrong when combined with this: Remember that
 this sweet spot is for the number of connections that are
 actively doing work.  ...  You should always make max_connections
 a bit bigger than the number of connections you enable in your
 connection pool. That way there are always a few slots available
 for direct connections for system maintenance and monitoring. 
 Where would you expect the knee to be for connections
 concurrently actively doing work on a single-core, single-drive
 system ?
 
 I don't know.  But my experience with our customers is that people
 are often forced to set the size of the connection pool far larger
 than what that formula would suggest.  Many people are doing
 transaction-level pooling, and for those people, they've got to
 look at how many multi-statement transactions they've got and
 think about what the peak value for that quantity is.  It's still
 worth using pooling because it reduces the number of simultaneous
 connections, but it's not going to reduce it to the kind of values
 you're talking about.  Also, consider that transactions aren't all
 the same length.  Suppose 90% of your queries execute in 50ms, and
 10% execute in 60s.  Even though it's less efficient, you've got
 to set the connection pool large enough that at least some of the
 50 ms queries can continue to get processed even if the maximum
 number of 60s queries that you ever expect to see in parallel are
 already running.  This may seem like a theoretical problem but we
 have customers who use connection pools to get the number of
 simultaneous connections down to, say, 800.  I guarantee you that
 these people do not have 200 CPUs and 400 disks, but they're smart
 people and they find that smaller pool sizes don't work.
 
It is something which has to be considered, and I don't think it's
theoretical at all.  Here's how we deal with it.  We don't use a
plain FIFO queue for our transaction requests, but a prioritized
FIFO with 10 levels of priority (0 to 9).  The highest priority (9)
is reserved for utility requests -- where a running transaction
needs to spin off a related transaction to do some work for it.  For
the lowest level (0) we normally allocate only a single connection,
and it is used for very long-running reports which we want to queue
to run one-at-a-time.  As examples of how we categorize queries,
filling a large list in an interactive application will run at
priority 3, while translating a key which must cause a description
on the screen to display is run at priority 8.  Normal single-row
updates and deletes from an interactive application run at priority
5.
 
Each connection in the pool has a worker thread, and is assigned a
minimum priority that it will handle.  When all threads are busy and
transaction requests are queued, any thread completing a database
transaction pulls from the front of the highest priority queue with
a waiting request to run a transaction, looking only at priorities
which are not beneath it.  If there are no waiting requests of
high enough priority, the thread waits for one to arrive.
 
We have found that the formula I presented, when combined with
transactional request queuing like I describe here gives us our best
performance.  I don't have the exact numbers in front of me at the
moment, but on a machine with 16 cores and a 40-drive array (but
heavily cached, so that the effective spindle count was lower than
that), servicing thousands of concurrent web users with hundreds of
tps, we improved performance significantly by dropping our
connection pool size from about 60 to about 30, in addition to the
separate pool of six which are handling logical replication from
about 80 sources.  That was a real-life production situation, but we
ran a series of benchmarks and found that in a pretty wide spectrum
of situations the formula I gave fits pretty neatly.
 
If someone is using 800 connections for, say, a 32 core machine with
a 200 drive array I would suspect that they would get a lot of
benefit from a smarter connection pool.
 
 Sure, we can say, well, the fine print tells you that 2*CPUs+disks
 is not REALLY the formula you should use, but it's just so far off
 what I see in the field that I have a hard time thinking it's
 really helping people to give them that as a starting point.
 
The point is that it *is* generally really close to the numbers we
have seen here in both benchmarks and production, and I have gotten
comments both on and off the lists from people who have told me that
they tried that formula against their benchmark results and found
that it fit well.  Now, this may be dependent on OS or

Re: [HACKERS] 9.2 pg_upgrade regression tests on WIndows



On 09/04/2012 09:47 AM, Bruce Momjian wrote:

On Mon, Sep  3, 2012 at 12:44:09PM -0400, Andrew Dunstan wrote:

The attached very small patch allows pg_upgrade's make check to
succeed on REL9_2_STABLE on my Mingw system.

However, I consider the issue I mentioned earlier regarding use of
forward slashes in the argument to rmdir to be a significant
blocker, so I'm going to go and fix that and then pull this all
together.

cheers

andrew
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
index 6f993df..57ca1df 100644
--- a/contrib/pg_upgrade/exec.c
+++ b/contrib/pg_upgrade/exec.c
@@ -91,10 +91,12 @@ exec_prog(bool throw_error, bool is_priv, const char 
*log_file,
else
retval = 0;
  
+#ifndef WIN32

if ((log = fopen_priv(log_file, a+)) == NULL)
pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
fprintf(log, \n\n);
fclose(log);
+#endif
  
  	return retval;

  }

I am confused by this fix.  If pg_ctl was keeping that log file open,
wouldn't the log write fail when pg_dump or psql was run later?  I am
trying to understand how a later commands would not also trigger an
error.  Is it a timing thing?  If that is it, I would like to know and
have that documented.


Oh, hmm. I thought it was the postmaster holding the log, but now I see 
that we are giving it a different log file. Maybe it is a timing thing. 
I'll experiment and see if a sleep cures the problem.



...


Nope, still getting this after a sleep(5):

   cannot write to log file pg_upgrade_server_start.log
   Failure, exiting

...


[try some more] Nope, even in a loop lasting 60s I still got this.


So I'm a bit confused too. Seeing if I can narrow it down using ProcMon ...


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix

Bruce Momjian br...@momjian.us writes:
 On Tue, Sep  4, 2012 at 11:42:58AM -0300, Alvaro Herrera wrote:
 Who's going to work now on porting the shell script to Perl? ;-)

 Well, we require Perl for development, but not for usage, at least not
 yet.

This is a regression-test script, so that complaint doesn't seem to me
to have a lot of force ... especially not when set against the fact that
the shell script is useless on non-mingw Windows.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix

On Tue, Sep  4, 2012 at 11:12:52AM -0400, Andrew Dunstan wrote:
 
 On 09/04/2012 10:49 AM, Bruce Momjian wrote:
 On Tue, Sep  4, 2012 at 11:42:58AM -0300, Alvaro Herrera wrote:
 Excerpts from Andrew Dunstan's message of mar sep 04 01:16:39 -0300 2012:
 
 And here's the first Windows buildfarm check of pg_upgrade.
 http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=pittadt=2012-09-04%2003%3A00%3A05stg=check-pg_upgrade
 Great, thanks.
 
 Who's going to work now on porting the shell script to Perl? ;-)
 Well, we require Perl for development, but not for usage, at least not
 yet.  There was talk of needing Perl for doing standby pg_upgrade, but
 there were too many concerns about that idea.
 
 
 This is a test script, not what you should use in production. I
 don't see any reason why we shouldn't require Perl for running the
 standard test.

Oh, I thought he was talking about the scripts pg_upgrade creates for
users to run.  Sorry.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix

Andrew Dunstan and...@dunslane.net writes:
 This is a test script, not what you should use in production. I don't 
 see any reason why we shouldn't require Perl for running the standard test.

But on the third hand ... we've taken pains to ensure that you don't
*have* to have Perl to build from a tarball, and I think it is not
unreasonable that build should include being able to do make check.

Maybe we have to carry both this shell script and a Perl equivalent
for Windows.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix



On 09/04/2012 10:49 AM, Bruce Momjian wrote:

On Tue, Sep  4, 2012 at 11:42:58AM -0300, Alvaro Herrera wrote:

Excerpts from Andrew Dunstan's message of mar sep 04 01:16:39 -0300 2012:


And here's the first Windows buildfarm check of pg_upgrade.
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=pittadt=2012-09-04%2003%3A00%3A05stg=check-pg_upgrade

Great, thanks.

Who's going to work now on porting the shell script to Perl? ;-)

Well, we require Perl for development, but not for usage, at least not
yet.  There was talk of needing Perl for doing standby pg_upgrade, but
there were too many concerns about that idea.



This is a test script, not what you should use in production. I don't 
see any reason why we shouldn't require Perl for running the standard test.



cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gistchoose vs. bloat

2012-09-04 Thread Alexander Korotkov

On Mon, Aug 20, 2012 at 9:13 PM, Alexander Korotkov aekorot...@gmail.comwrote:

 Current gistchoose code has a bug. I've started separate thread about it.
 http://archives.postgresql.org/pgsql-hackers/2012-08/msg00544.php
 Also, it obviously needs more comments.

 Current state of patch is more proof of concept than something ready. I'm
 going to change it in following ways:
 1) We don't know how expensive user penalty function is. So, I'm going to
 change randomization algorithm so that it doesn't increase number of
 penalty calls in average.
 2) Since, randomization could produce additional IO, there are probably no
 optimal solution for all the cases. We could introduce user-visible option
 which enables or disables randomization. However, default value of this
 option is another question.


 Also, I think you should use random() rather than rand().


 Thanks, will fix.


New version of patch is attached. Parameter randomization was introduced.
It controls whether to randomize choose. Choose algorithm was rewritten.

--
With best regards,
Alexander Korotkov.


gist_choose_bloat-0.2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix



On 09/04/2012 11:21 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

This is a test script, not what you should use in production. I don't
see any reason why we shouldn't require Perl for running the standard test.

But on the third hand ... we've taken pains to ensure that you don't
*have* to have Perl to build from a tarball, and I think it is not
unreasonable that build should include being able to do make check.

Maybe we have to carry both this shell script and a Perl equivalent
for Windows.



Yeah. I think it will just be another target in vcregress.pl

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] txid failed epoch increment, again, aka 6291

2012-09-04 Thread Daniel Farina

On Tue, Sep 4, 2012 at 8:04 AM, Noah Misch n...@leadboat.com wrote:
 On Tue, Sep 04, 2012 at 02:07:30AM -0700, Daniel Farina wrote:
 It seems like this has reproduced once more.  And once again, there
 doesn't appear to be any funny business in pg_control (but the structs
 are pasted here for your re-check), and there are successful sensical
 updates to it.  The primary is running 9.0.6.

 What version is the standby running?

We have several, all of which so far agree on the same txid,
presumably syndicated from the primary: one is 9.0.7, another is
9.0.8.  Incidentally I am also currently preparing a 9.0.9 one that I
was going to use for some forensics.


 However, we do have a new piece of data: there was a very brief period
 where txid_snapshot did report an xmin greater than 2^33, our next
 epoch boundary, by a few thousand transactions.  That could be because
 the reporting function GetNextXidAndEpoch does its own epoch
 calculation before the checkpoint and then after a checkpoint that
 forgets to increment the epoch there is no need to add post-facto
 adjust the epoch anymore.

 That makes sense.

 I've been reviewing the mechanism in CreateCheckPoint for this on and
 off for a couple of days, but so far I haven't come up with a
 convincing mechanism. However, given that it seems historically that
 this bug is more likely to surface than *not* surface on this system,
 perhaps we can try for a sometimes-reproducing test case. I'm still
 struggling for a hint of a solution, though, so toss your thoughts
 here.

 The cause is not apparent to me, either, and simple tests here do not
 reproduce the problem.

 I am suspicious of xlog_redo() updating ControlData-checkPointCopy without
 acquiring ControlFileLock.  If a restartpoint is finishing at about the same
 time, ControlFile-checkPointCopy.nextXid might come from the just-read
 checkpoint record while ControlFile-checkPointCopy.nextXidEpoch bears the
 value from the older checkpoint just adopted as a restartpoint.  The resulting
 inconsistency would, however, vanish at the next ControlFile-checkPointCopy
 update.  This does not explain the symptoms you have reported, and it cannot
 explain much of anything on a primary.

 (gdb) p *ControlFile
 $4 = {system_identifier = 5613733157253676693, pg_control_version =
 903, catalog_version_no = 201008051,
   state = DB_IN_ARCHIVE_RECOVERY, time = 1346746898, checkPoint =

 This capture, it seems, is from a standby.

Yeah. I can probably also get one from the primary.

 {xlogid = 18751, xrecoff = 1072693824}, prevCheckPoint = {
 xlogid = 18751, xrecoff = 1072693824}, checkPointCopy = {redo =
 {xlogid = 18751, xrecoff = 602482536}, ThisTimeLineID = 6,
 nextXidEpoch = 1, nextXid = 9904084, nextOid = 2047524, nextMulti
 = 1119, nextMultiOffset = 3115, oldestXid = 4115479553,
 oldestXidDB = 1, time = 1346746496, oldestActiveXid = 9558248},

 You expected checkPointCopy = { ... nextXidEpoch = 2, ... }, correct?  (Well,
 nextXidEpoch = 3 if you count the previous missed increment.)  Does pg_control
 on the primary also bear epoch 1 where epoch 2 is expected?

Well, in the aggregate life of this thing as we know it we probably
expect more like 4, or even 5.  It bears 1 where 2 is expected,
though, to answer the original question.


 Your last restartpoint was exactly five minutes before your last checkpoint,
 so there's no evidence of a dearth of safe restartpoint opportunities.

 nm

I might try to find the segments leading up to the overflow point and
try xlogdumping them to see what we can see.

If there's anything to note about the workload, I'd say that it does
tend to make fairly pervasive use of long running transactions which
can span probably more than one checkpoint, and the txid reporting
functions, and a concurrency level of about 300 or so backends ... but
per my reading of the mechanism so far, it doesn't seem like any of
this should matter.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Yet another issue with pg_upgrade vs unix_socket_directories

On 9/3/12 5:00 PM, Tom Lane wrote:
 I went back for another try at building the Fedora packages with 9.2
 branch tip ... and it still failed at pg_upgrade's make check.
 The reason for this is that test.sh starts a couple of random
 postmasters, and those postmasters expect to put their sockets in
 the configured default location (which is now /var/run/postgresql
 on Fedora), and that's not there in a minimal build environment.

And if it's there, it might not be writable.

 I hacked it up with the attached quick-and-dirty patch, but I wonder
 if anyone's got a better idea.

Yeah, I have resorted to putting something like

export PGHOST=/tmp

in all my test scripts, because the above-mentioned issues have affected
Debian for a long time.  Welcome to the party. ;-)

It might actually be useful if the postmaster accepted PGHOST as the
default value for the -k option, just like it accepts PGPORT.  Then this
type setup will become much easier because clients and servers will use
the same defaults.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Yet another issue with pg_upgrade vs unix_socket_directories

On Tue, Sep  4, 2012 at 01:44:59PM -0400, Peter Eisentraut wrote:
 On 9/3/12 5:00 PM, Tom Lane wrote:
  I went back for another try at building the Fedora packages with 9.2
  branch tip ... and it still failed at pg_upgrade's make check.
  The reason for this is that test.sh starts a couple of random
  postmasters, and those postmasters expect to put their sockets in
  the configured default location (which is now /var/run/postgresql
  on Fedora), and that's not there in a minimal build environment.
 
 And if it's there, it might not be writable.
 
  I hacked it up with the attached quick-and-dirty patch, but I wonder
  if anyone's got a better idea.
 
 Yeah, I have resorted to putting something like
 
 export PGHOST=/tmp
 
 in all my test scripts, because the above-mentioned issues have affected
 Debian for a long time.  Welcome to the party. ;-)
 
 It might actually be useful if the postmaster accepted PGHOST as the
 default value for the -k option, just like it accepts PGPORT.  Then this
 type setup will become much easier because clients and servers will use
 the same defaults.

Interesting idea, but PGPORT controls both the tcp and unix domain
socket connections.  Wouldn't PGHOST just control just unix domain?   Is
that logical?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Yet another issue with pg_upgrade vs unix_socket_directories

Peter Eisentraut pete...@gmx.net writes:
 Yeah, I have resorted to putting something like
 export PGHOST=/tmp
 in all my test scripts, because the above-mentioned issues have affected
 Debian for a long time.  Welcome to the party. ;-)

Yeah, my current patch for Fedora does exactly that in pg_regress, and
has it force the test postmaster's unix_socket_directory as well.
The problem with pg_upgrade's shell script is that it's not going
through pg_regress: it launches some test postmasters directly, and
also fires up psql etc directly.  So it needs its own fix for this.

 It might actually be useful if the postmaster accepted PGHOST as the
 default value for the -k option, just like it accepts PGPORT.  Then this
 type setup will become much easier because clients and servers will use
 the same defaults.

Cute idea, but it'll fall down rather badly if PGHOST is a hostname...

There's no time to redesign this stuff for 9.2, but now that I've had
some exposure to the testing difficulties created by a nonstandard
default socket directory, I'm more interested in trying to fix these
issues in core.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix



On 09/04/2012 10:42 AM, Alvaro Herrera wrote:



Somehow the verbose reporting of user relation files being copied does
not seem exceedingly useful; and I don't remember seeing that on Linux.




Yeah, and it does something odd anyway when it's not writing to a 
terminal. Can we get rid of it, or make it only work in verbose mode?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2rc1 produces incorrect results

I wrote:
 Vik Reykja vikrey...@gmail.com writes:
 Hello.  It took me a while to get a version of this that was independent of
 my data, but here it is.  I don't understand what's going wrong but if you
 change any part of this query (or at least any part I tried), the correct
 result is returned.

 Huh.  9.1 gets the wrong answer too, so this isn't a (very) new bug;
 but curiously, 8.4 and 9.0 seem to get it right.  I think this is
 probably related somehow to Adam Mackler's recent report --- multiple
 scans of the same CTE seems to be a bit of a soft spot :-(

No, I'm mistaken: it's a planner bug.  The plan looks like this:

QUERY PLAN  
   
---
 Result  (cost=281.29..290.80 rows=20 width=36)
   CTE a
 -  Nested Loop  (cost=126.96..280.17 rows=19 width=4)
   -  Merge Right Join  (cost=126.96..220.17 rows=19 width=4)
 Merge Cond: (((pg_c.relname)::text) = ((t2.id)::text))
 Filter: (pg_c.oid IS NULL)
 -  Sort  (cost=61.86..63.72 rows=743 width=68)
   Sort Key: ((pg_c.relname)::text)
   -  Seq Scan on pg_class pg_c  (cost=0.00..26.43 
rows=743 width=68)
 -  Sort  (cost=65.10..67.61 rows=1004 width=4)
   Sort Key: ((t2.id)::text)
   -  Seq Scan on t2  (cost=0.00..15.04 rows=1004 width=4)
   -  Index Scan using t1_pkey on t1  (cost=0.00..3.14 rows=1 width=4)
 Index Cond: (id = t2.id***)
   CTE b
 -  WindowAgg  (cost=0.78..1.12 rows=19 width=4)
   -  Sort  (cost=0.78..0.83 rows=19 width=4)
 Sort Key: a.id
 -  CTE Scan on a  (cost=0.00..0.38 rows=19 width=4)
   -  Append  (cost=0.00..9.51 rows=20 width=36)
 -  CTE Scan on a  (cost=0.00..4.66 rows=10 width=4)
   Filter: is_something
   SubPlan 3
 -  CTE Scan on b  (cost=0.00..0.43 rows=1 width=4)
   Filter: (id = a.id***)
 -  CTE Scan on a  (cost=0.00..4.66 rows=10 width=4)
   Filter: is_something
   SubPlan 4
 -  CTE Scan on b  (cost=0.00..0.43 rows=1 width=4)
   Filter: (id = a.id***)
(30 rows)

The planner is assigning a single PARAM_EXEC slot for the parameter
passed into the inner indexscan in CTE a and for the parameters passed
into the two SubPlans that scan CTE b (the items I marked with ***
above).  This is safe enough so far as the two SubPlans are concerned,
because they don't execute concurrently --- but when SubPlan 3 is first
fired, it causes the remainder of CTE a to be computed, so that the
nestloop gets iterated some more times, and that overwrites the value of
a.id that already got passed down into SubPlan 3.

The reason this is so hard to replicate is that the PARAM_EXEC slot can
only get re-used for identical-looking Vars (same varno, varlevelsup,
vartype, etc) --- so even granted that you've got the right shape of
plan, minor unrelated changes in the query can stop the aliasing
from occurring.  Also, inner indexscans weren't using the PARAM_EXEC
mechanism until 9.1, so that's why the example doesn't fail before 9.1.

I've always been a bit suspicious of the theory espoused in
replace_outer_var that aliasing different Params is OK:

 * NOTE: in sufficiently complex querytrees, it is possible for the same
 * varno/abslevel to refer to different RTEs in different parts of the
 * parsetree, so that different fields might end up sharing the same Param
 * number.  As long as we check the vartype/typmod as well, I believe that
 * this sort of aliasing will cause no trouble.  The correct field should
 * get stored into the Param slot at execution in each part of the tree.

but I've never seen a provably wrong case before.  Most likely, this has
been broken since we introduced CTEs in 8.4: it's the decoupled timing
of execution of main query and CTE that's needed to allow execution of
different parts of the plan tree to overlap and thus create the risk.

(I get the impression that only recently have people been writing really
complex CTE queries, since we found another fundamental bug with them
just recently.)

I think probably the best fix is to rejigger things so that Params
assigned by different executions of SS_replace_correlation_vars and
createplan.c can't share PARAM_EXEC numbers.  This will result in
rather larger ecxt_param_exec_vals arrays at runtime, but the array
entries aren't very large, so I don't think it'll matter.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] XLogReader v2

Excerpts from Andres Freund's message of jue jul 19 06:29:03 -0400 2012:
 Hi,
 
 Attached is v2 of the patch.

Hello,

I gave this code a quick read some days ago.  Here's the stuff I would
change:

* There are way too many #ifdef VERBOSE_DEBUG stuff for my taste.  It
might look better if you had macros such as elog_debug() that are defined
to empty if VERBOSE_DEBUG is not defined.  (The problem with such an
approach is that you have to get into the business of creating one macro
for each different param count, so elog_debug1(), elog_debug2() and so
on.  It also means you have to count the number of args in each call to
ensure you're calling the right one.)

* In the code beautification front, there are a number of cuddled braces
and improperly indented function declarations.

* I noticed that you have the IDENTIFICATION tag wrong in both .c and .h
files: evidently you renamed the files from readxlog.[ch] to xlogreader.

* There are a few elog(PANIC) calls.  I am not sure that's a very good
idea.  It seems to me that you should be using elog(FATAL) there instead
... or do you really want to make the whole server crash?  OTOH if we
want to make it a true client program, all those elog() calls need to
go.

* XLogReaderRead() seems a bit too long to me.  I would split it with
auxiliary functions -- say read a header and read a record.  (I
mentioned this to Andres on IM and he says he tried that but couldn't
find any nice way to do it.  I may still try to do it.)

* xlogdump's Makefile trick to get all backend object files is ... ugly
(an understatement).  Really we need the *_desc() routines split so that
it can use only those functions, and have a client-side replacement for
StringInfo (discussed elsewhere) and some auxilliary functions such as
relpathbackend() so that it can compile like a normal client.

* why do we pass timeline_id to xlogdump?  I don't see that it's used
anywhere, but maybe I'm missing something?

This is not a full review.  After a new version with these fixes is
published (either by Andres or myself) some more review might find more
serious issues -- I didn't hunt for architectural problems in
XLogReader.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade diffs on WIndows



On 09/04/2012 03:09 PM, Andrew Dunstan wrote:
I realized this morning that I might have been a bit cavalier in using 
dos2unix to smooth away differences in the dumpfiles produced by 
pg_upgrade. Attached is a dump of the diff if this isn't done,  with 
Carriage Returns printed as '*' to make them visible. As can be seen, 
in function bodies dump2 has the Carriage Returns doubled. I have not 
had time to delve into how this comes about, and I need to attend to 
some income-producing activity for a bit, but I'd like to get it 
cleaned up ASAP. We are under the hammer for 9.2, so any help other 
people can give on this would be appreciated.





Actually, I have the answer - it's quite simple. We just need to open 
the output files in binary mode when we split the dumpall file. The 
attached patch fixes it. I think we should backpatch the first part to 9.0.


cheers

andrew
diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
index b905ab0..0a96dde 100644
--- a/contrib/pg_upgrade/dump.c
+++ b/contrib/pg_upgrade/dump.c
@@ -62,10 +62,10 @@ split_old_dump(void)
 	if ((all_dump = fopen(filename, r)) == NULL)
 		pg_log(PG_FATAL, Could not open dump file \%s\: %s\n, filename, getErrorText(errno));
 	snprintf(filename, sizeof(filename), %s, GLOBALS_DUMP_FILE);
-	if ((globals_dump = fopen_priv(filename, w)) == NULL)
+	if ((globals_dump = fopen_priv(filename, PG_BINARY_W)) == NULL)
 		pg_log(PG_FATAL, Could not write to dump file \%s\: %s\n, filename, getErrorText(errno));
 	snprintf(filename, sizeof(filename), %s, DB_DUMP_FILE);
-	if ((db_dump = fopen_priv(filename, w)) == NULL)
+	if ((db_dump = fopen_priv(filename, PG_BINARY_W)) == NULL)
 		pg_log(PG_FATAL, Could not write to dump file \%s\: %s\n, filename, getErrorText(errno));
 
 	current_output = globals_dump;
diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh
index d411ac6..3899600 100644
--- a/contrib/pg_upgrade/test.sh
+++ b/contrib/pg_upgrade/test.sh
@@ -128,10 +128,6 @@ else
 	sh ./delete_old_cluster.sh
 fi
 
-if [ $testhost = Msys ] ; then
-   dos2unix $temp_root/dump1.sql $temp_root/dump2.sql
-fi
-
 if diff -q $temp_root/dump1.sql $temp_root/dump2.sql; then
 	echo PASSED
 	exit 0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results

Excerpts from Robert Creager's message of sáb sep 01 12:12:51 -0400 2012:
 
 I change the build-farm.conf file to have the following make line:
 
 make = 'make -j 8', # or gmake if required. can include path if 
 necessary.
 
 2 pass, 4 fail.  Is this a build configuration you want to pursue?

Sure, why not?  Parallel building is going to become more common, so
it's a good idea to investigate the failures.  I would have it build
only HEAD though, because we're not likely to backpatch these fixes.

 I can either create a new machine, or change one of my existing machines.  
 Makes no difference to me.

If we want to have it run only HEAD I would say you should create a new
animal.

Perhaps you should wait until after 9.2 has been released, though, to
avoid distracting people :-)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results

On 9/1/12 12:12 PM, Robert Creager wrote:
 
 I change the build-farm.conf file to have the following make line:
 
 make = 'make -j 8', # or gmake if required. can include path if
 necessary.
 
 2 pass, 4 fail.  Is this a build configuration you want to pursue?

Sure that would be useful, but it's pretty clear that the check stages
don't work in parallel.  It think it's because the ports conflict, but
there could be any number of other problems.

That said, it would be useful, in my mind, to support parallel checks.
But unless someone is going to put in the work first, you should
restrict your parallel runs to the build and install phases.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] embedded list v2

Excerpts from Andres Freund's message of jue jun 28 17:06:49 -0400 2012:
 On Thursday, June 28, 2012 10:03:26 PM Andres Freund wrote:
  What I wonder is how hard it would be to remove catcache.h's structs into
  the  implementation. Thats the reason why the old and new list
  implementation currently is included all over the backend...
 Moving them into the implementation isn't possible, but catcache.h being 
 included just about everywhere simply isn't needed.

FWIW this got fixed during some header changes I made a couple of weeks
ago.  If you have similar fixes to propose, let me know.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results

Excerpts from Peter Eisentraut's message of mar sep 04 18:49:46 -0300 2012:
 On 9/1/12 12:12 PM, Robert Creager wrote:
  
  I change the build-farm.conf file to have the following make line:
  
  make = 'make -j 8', # or gmake if required. can include path if
  necessary.
  
  2 pass, 4 fail.  Is this a build configuration you want to pursue?
 
 Sure that would be useful, but it's pretty clear that the check stages
 don't work in parallel.  It think it's because the ports conflict, but
 there could be any number of other problems.

Is that really the problem?  As far as I know, buildfarm doesn't use
anything like installcheck-world or similar targets; each check target
is run separately, serially, by the BF script.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results



On 09/04/2012 05:49 PM, Peter Eisentraut wrote:

On 9/1/12 12:12 PM, Robert Creager wrote:

I change the build-farm.conf file to have the following make line:

 make = 'make -j 8', # or gmake if required. can include path if
necessary.

2 pass, 4 fail.  Is this a build configuration you want to pursue?

Sure that would be useful, but it's pretty clear that the check stages
don't work in parallel.  It think it's because the ports conflict, but
there could be any number of other problems.

That said, it would be useful, in my mind, to support parallel checks.
But unless someone is going to put in the work first, you should
restrict your parallel runs to the build and install phases.





The buildfarm code doesn't contain a facility to use a different make 
incantation for each step. It's pretty much an all or nothing deal. Of 
course, you can hack run_build.pl to make it do that, but I highly 
discourage that. For one thing, it makes upgrading that much more 
difficult. All the  tweaking is supposed to be done vie the config file. 
I guess I could add a setting that allowed for per step make flags.


Frankly, I have had enough failures of parallel make that I think doing 
this would generate a significant number of non-repeatable failures (I 
had one just the other day that took three invocations of make to get 
right). So I'm not sure doing this would advance us much, although I'm 
open to persuasion.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] too much pgbench init output

On 9/1/12 6:30 AM, Robert Haas wrote:
 On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut pete...@gmx.net wrote:
 When initializing a large database, pgbench writes tons of %d tuples
 done lines.  I propose to change this to a sort of progress counter
 that stays on the same line, as in the attached patch.
 
 I'm not sure I like this - what if the output is being saved off to a file?

I suppose we could print \n instead of \r then.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Some whitespaces in utility.c

2012-09-04 Thread Michael Paquier

On Tue, Sep 4, 2012 at 11:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Tue, Sep 4, 2012 at 6:57 AM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  I found some whitespace characters in utility.c introduced by commit
  3a0e4d3.
  Please find attached a patch fixing that which can be applied on
 postgres
  master (commit 2f0c7d5).

  That probably exists in many other places in the source as well, but
  it's certainly fairly ugly. So I see nothing wrong cleaning it up whe
  nspotted.

 Just as a note: we generally leave it to pgindent to fix this sort of
 thing.  I'm not sure it's worth the effort of submitting manual patches
 for, unless you have reason to think the next pgindent run won't fix it.

Understood, thanks for telling about that.
I just read some code and bumped into it, for sure doing such maintenance
all at once with pgindent saves time and effort.
-- 
Michael Paquier
http://michael.otacoo.com

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

2012-09-04 Thread Josh Berkus

Guys,

Is this a patch to 9.3?

i.e. do we need to delay the release for this?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'


On 04.09.2012 03:02, Dimitri Fontaine wrote:

Heikki Linnakangashlinn...@iki.fi  writes:

Hmm, I was thinking that when walsender gets the position it can send the
WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the current
recovery timeline. If it has changed, refuse to send the new WAL and
terminate. That would be a fairly small change, it would just close the
window between requesting walsenders to terminate and them actually
terminating.


It looks to me like a bug fix that also applies to non cascading
situation. Is that right?


No, only cascading replication is affected. In non-cascading situation, 
the timeline never changes in the master. It's only in cascading mode 
that you have a problem, where the standby can cross timelines while 
it's replaying the WAL, and also sending it over to cascading standby.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'


On 04.09.2012 15:41, Josh Berkus wrote:

Guys,

Is this a patch to 9.3?

i.e. do we need to delay the release for this?


It is for 9.2. I'll do a little bit more testing, and barring any 
issues, commit the patch. What exactly is the schedule? Do we need to do 
a RC2 because of this?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

2012-09-04 Thread Josh Berkus

Heikki,

 It is for 9.2. I'll do a little bit more testing, and barring any
 issues, commit the patch. What exactly is the schedule? Do we need to do
 a RC2 because of this?

We're currently scheduled to release next week.  If we need to do an
RC2, we're going to have to do some fast rescheduling; we've already
started the publicity machine.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results

2012-09-04 Thread Aidan Van Dyk

On Sep 4, 2012 6:06 PM, Andrew Dunstan and...@dunslane.net wrote:

Frankly, I have had enough failures of parallel make that I think doing
this would generate a significant number of non-repeatable failures (I had
one just the other day that took three invocations of make to get right).
So I'm not sure doing this would advance us much, although I'm open to
persuasion.

Seeing as most PostgreSQL bugs appear with concurrency, I think we should
leave our default config with 1 for max connections.

;-)

Parallel make failures are bugs in the dependencies as described in our
make files.

For the build phase, I don't recall parallel problems and as a habit I
usually use parallel makes. I would like that to be supported and I think
I've seen fixes applied when we had issues before. Cutting build times to
1/2 to 1/4 is good.

Checks and tests are harder because often they can't run in parallel. But
then we shouldn't have them listed as independent prerequisites for
targets. Ideally. But fixing it might not be worth the cost since an
acceptable work around (rely upon make to serialize the test sequences in
the particular order) is pretty painless (so far).

Of course, having the ability to run the tests 8 at a time (or more) and
reduce the time by 80% would be nice .;-)
On Sep 4, 2012 6:06 PM, Andrew Dunstan and...@dunslane.net wrote:

On 09/04/2012 05:49 PM, Peter Eisentraut wrote:

On 9/1/12 12:12 PM, Robert Creager wrote:

I change the build-farm.conf file to have the following make line:

make = 'make -j 8', # or gmake if required. can include path if
necessary.

2 pass, 4 fail. Is this a build configuration you want to pursue?

Sure that would be useful, but it's pretty clear that the check stages
don't work in parallel. It think it's because the ports conflict, but
there could be any number of other problems.

That said, it would be useful, in my mind, to support parallel checks.
But unless someone is going to put in the work first, you should
restrict your parallel runs to the build and install phases.

The buildfarm code doesn't contain a facility to use a different make
incantation for each step. It's pretty much an all or nothing deal. Of
course, you can hack run_build.pl to make it do that, but I highly
discourage that. For one thing, it makes upgrading that much more
difficult. All the tweaking is supposed to be done vie the config file. I
guess I could add a setting that allowed for per step make flags.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/**mailpref/pgsql-hackershttp://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

Josh Berkus j...@agliodbs.com writes:
 Heikki,
 It is for 9.2. I'll do a little bit more testing, and barring any
 issues, commit the patch. What exactly is the schedule? Do we need to do
 a RC2 because of this?

 We're currently scheduled to release next week.  If we need to do an
 RC2, we're going to have to do some fast rescheduling; we've already
 started the publicity machine.

At this point I would argue that the only thing that should abort the
launch is a bad regression.  Minor bugs in new features (and this must
be minor if it wasn't noticed before) don't qualify.

Having said that, it'd be good to get it fixed if we can.  The schedule
says to wrap 9.2.0 Thursday evening --- Heikki, can you get this fixed
tomorrow (Wednesday)?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

On 03.09.2012 17:40, Heikki Linnakangas wrote:

On 03.09.2012 16:26, Heikki Linnakangas wrote:

On 03.09.2012 16:25, Fujii Masao wrote:

On Tue, Sep 4, 2012 at 7:07 AM, Heikki Linnakangashlinn...@iki.fi
wrote:

Hmm, I was thinking that when walsender gets the position it can send
the
WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the
current
recovery timeline. If it has changed, refuse to send the new WAL and
terminate. That would be a fairly small change, it would just close the
window between requesting walsenders to terminate and them actually
terminating.

Yeah, sounds good. Could you implement the patch? If you don't have
time,
I will

I'll give it a shot..

So, this is what I came up with, please review.

While testing, I bumped into another related bug: When a WAL segment is
restored from the archive, we let a walsender to send that whole WAL
segment to a cascading standby. However, there's no guarantee that the
restored WAL segment is complete. In particular, if a timeline changes
within that segment, e.g 00010004, that segment will be
only partially full, and the WAL continues at segment
00020004, at the next timeline. This can also happen if
you copy a partial WAL segment to the archive, for example from a
crashed master server. Or if you have set up record-based WAL shipping
not using streaming replication, per
http://www.postgresql.org/docs/devel/static/log-shipping-alternative.html#WARM-STANDBY-RECORD.
That manual page says you can only deal with whole WAL files that way,
but I think with standby_mode='on', that's actually no longer true.

So all in all, it seems like a shaky assumption that once you've
restored a WAL file from the archive, you're free to stream it to a
cascading slave. I think it would be more robust to limit it to
streaming the file only up to the point that it's been replayed - and
thus verified - in the 1st standby. If everyone is OK with that change
in behavior, the fix is simple.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'


On 04.09.2012 16:50, Tom Lane wrote:

Josh Berkusj...@agliodbs.com  writes:

Heikki,

It is for 9.2. I'll do a little bit more testing, and barring any
issues, commit the patch. What exactly is the schedule? Do we need to do
a RC2 because of this?



We're currently scheduled to release next week.  If we need to do an
RC2, we're going to have to do some fast rescheduling; we've already
started the publicity machine.


At this point I would argue that the only thing that should abort the
launch is a bad regression.  Minor bugs in new features (and this must
be minor if it wasn't noticed before) don't qualify.

Having said that, it'd be good to get it fixed if we can.  The schedule
says to wrap 9.2.0 Thursday evening --- Heikki, can you get this fixed
tomorrow (Wednesday)?


The attached patch fixes it for me. It fixes the original problem, by 
adding the missing locking and terminating walsenders on a target 
timeline change, and also changes the behavior wrt. WAL segments 
restored from the archive, as I just suggested in another email 
(http://archives.postgresql.org/pgsql-hackers/2012-09/msg00206.php).


The test case I've been using is a master and two standbys. The first 
standby is set up to connect to the master with streaming replication, 
and the other standby is set up to connect to the 1st standby, ie. it's 
a cascading slave. In addition, the master is set up to do WAL archiving 
to a directory, and both standbys have a restore_command to read from 
that archive, and restore_target_timeline='latest'. After the master and 
both standbys are running, I create a dummy recovery.conf file in 
master's data directory, with just restore_command='/bin/false' in it, 
and restart the master. That forces a timeline change in the master. 
With the patch, the 1st standby will notice the new timeline in the 
archive, switch to that, and reconnect to the master. The cascading 
connection to the 2nd standby is terminated because of the timeline 
change, the 2nd standby will also scan the archive and pick up the new 
timeline, reconnect to the 1st standby, and be in sync again.


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b75dab5..be01d63 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -406,7 +406,6 @@ typedef struct XLogCtlData
 	XLogRecPtr *xlblocks;		/* 1st byte ptr-s + XLOG_BLCKSZ */
 	int			XLogCacheBlck;	/* highest allocated xlog buffer index */
 	TimeLineID	ThisTimeLineID;
-	TimeLineID	RecoveryTargetTLI;
 
 	/*
 	 * archiveCleanupCommand is read from recovery.conf but needs to be in
@@ -455,6 +454,8 @@ typedef struct XLogCtlData
 	XLogRecPtr	recoveryLastRecPtr;
 	/* timestamp of last COMMIT/ABORT record replayed (or being replayed) */
 	TimestampTz recoveryLastXTime;
+	/* current effective recovery target timeline */
+	TimeLineID	RecoveryTargetTLI;
 
 	/*
 	 * timestamp of when we started replaying the current chunk of WAL data,
@@ -4467,12 +4468,17 @@ rescanLatestTimeLine(void)
 			ThisTimeLineID)));
 		else
 		{
+			/* use volatile pointer to prevent code rearrangement */
+			volatile XLogCtlData *xlogctl = XLogCtl;
+
 			/* Switch target */
 			recoveryTargetTLI = newtarget;
 			list_free(expectedTLIs);
 			expectedTLIs = newExpectedTLIs;
 
-			XLogCtl-RecoveryTargetTLI = recoveryTargetTLI;
+			SpinLockAcquire(xlogctl-info_lck);
+			xlogctl-RecoveryTargetTLI = recoveryTargetTLI;
+			SpinLockRelease(xlogctl-info_lck);
 
 			ereport(LOG,
 	(errmsg(new target timeline is %u,
@@ -7519,13 +7525,20 @@ GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
 }
 
 /*
- * GetRecoveryTargetTLI - get the recovery target timeline ID
+ * GetRecoveryTargetTLI - get the current recovery target timeline ID
  */
 TimeLineID
 GetRecoveryTargetTLI(void)
 {
-	/* RecoveryTargetTLI doesn't change so we need no lock to copy it */
-	return XLogCtl-RecoveryTargetTLI;
+	/* use volatile pointer to prevent code rearrangement */
+	volatile XLogCtlData *xlogctl = XLogCtl;
+	TimeLineID result;
+
+	SpinLockAcquire(xlogctl-info_lck);
+	result = xlogctl-RecoveryTargetTLI;
+	SpinLockRelease(xlogctl-info_lck);
+
+	return result;
 }
 
 /*
@@ -8321,7 +8334,7 @@ CreateRestartPoint(int flags)
 		XLogRecPtr	endptr;
 
 		/* Get the current (or recent) end of xlog */
-		endptr = GetStandbyFlushRecPtr();
+		endptr = GetStandbyFlushRecPtr(NULL);
 
 		KeepLogSeg(endptr, _logId, _logSeg);
 		PrevLogSeg(_logId, _logSeg);
@@ -9837,14 +9850,13 @@ do_pg_abort_backup(void)
 /*
  * Get latest redo apply position.
  *
- * Optionally, returns the end byte position of the last restored
- * WAL segment. Callers not interested in that value may pass
- * NULL for restoreLastRecPtr.
+ * Optionally, returns the current recovery target timeline. Callers not
+ * interested in that may pass NULL for targetTLI.
  *
  * Exported to allow WALReceiver to read the pointer directly.
  */
 XLogRecPtr
-GetXLogReplayRecPtr(XLogRecPtr *restoreLastRecPtr)

Re: [HACKERS] too much pgbench init output

Peter Eisentraut pete...@gmx.net writes:
 On 9/1/12 6:30 AM, Robert Haas wrote:
 On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut pete...@gmx.net wrote:
 When initializing a large database, pgbench writes tons of %d tuples
 done lines.  I propose to change this to a sort of progress counter
 that stays on the same line, as in the attached patch.

 I'm not sure I like this - what if the output is being saved off to a file?

 I suppose we could print \n instead of \r then.

Possibly off-the-wall idea: we could fix the too much output problem
once and for all by going to a log scale.

10 tuples done
100 tuples done
1000 tuples done
1 tuples done
10 tuples done
...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results

Andrew Dunstan and...@dunslane.net writes:
 Frankly, I have had enough failures of parallel make that I think doing 
 this would generate a significant number of non-repeatable failures (I 
 had one just the other day that took three invocations of make to get 
 right). So I'm not sure doing this would advance us much, although I'm 
 open to persuasion.

Really?  I routinely use -j4 for building, and it's been a long time
since I've seen failures.  I can believe that for instance make check
in contrib would have a problem running in parallel, but the build
process per se seems reliable enough from here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade diffs on WIndows



On 09/04/2012 03:44 PM, Andrew Dunstan wrote:


On 09/04/2012 03:09 PM, Andrew Dunstan wrote:
I realized this morning that I might have been a bit cavalier in 
using dos2unix to smooth away differences in the dumpfiles produced 
by pg_upgrade. Attached is a dump of the diff if this isn't done,  
with Carriage Returns printed as '*' to make them visible. As can be 
seen, in function bodies dump2 has the Carriage Returns doubled. I 
have not had time to delve into how this comes about, and I need to 
attend to some income-producing activity for a bit, but I'd like to 
get it cleaned up ASAP. We are under the hammer for 9.2, so any help 
other people can give on this would be appreciated.





Actually, I have the answer - it's quite simple. We just need to open 
the output files in binary mode when we split the dumpall file. The 
attached patch fixes it. I think we should backpatch the first part to 
9.0.





OK, nobody else has reacted. I've spoken to Bruce and he seems happy 
with it, although, TBH, whe I talked to him I thought I understood it 
and now I'm not so sure. So we have 3 possibilities: leave it as is with 
an error-hiding hack in the test script, apply this patch which removes 
the hack and applies a fix that apparently works but which confuses us a 
bit, or go back to generating errors. The last choice would mean I would 
need to turn off pg_ugrade testing on Windows pending a fix. And we have 
to decide pretty much now so we can get 9.2 out the door.


Thoughts?

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] build farm machine using make -j 8 mixed results



On 09/04/2012 08:37 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

Frankly, I have had enough failures of parallel make that I think doing
this would generate a significant number of non-repeatable failures (I
had one just the other day that took three invocations of make to get
right). So I'm not sure doing this would advance us much, although I'm
open to persuasion.

Really?  I routinely use -j4 for building, and it's been a long time
since I've seen failures.  I can believe that for instance make check
in contrib would have a problem running in parallel, but the build
process per se seems reliable enough from here.





Both cases were vpath builds, which is what I usually use, if that's a 
useful data point.


Maybe I run on lower level hardware than you do. I saw this again this 
afternoon after I posted the above. In both cases this was the machine 
that runs the buildfarm's crake. I'll try to get a handle on it.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade del/rmdir path fix



On 09/04/2012 02:25 PM, Andrew Dunstan wrote:


On 09/04/2012 10:42 AM, Alvaro Herrera wrote:



Somehow the verbose reporting of user relation files being copied does
not seem exceedingly useful; and I don't remember seeing that on Linux.




Yeah, and it does something odd anyway when it's not writing to a 
terminal. Can we get rid of it, or make it only work in verbose mode?





The attached is an attempt to fix this. I think it handles most of 
what's wrong with this. (The patch is bigger because the code currently 
uses a variable called fileno - not a good idea.)


cheers

andrew


diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
index 33a867f..0a579d5 100644
--- a/contrib/pg_upgrade/relfilenode.c
+++ b/contrib/pg_upgrade/relfilenode.c
@@ -14,7 +14,6 @@
 #include catalog/pg_class.h
 #include access/transam.h
 
-
 static void transfer_single_new_db(pageCnvCtx *pageConverter,
 	   FileNameMap *maps, int size);
 static void transfer_relfile(pageCnvCtx *pageConverter,
@@ -136,7 +135,7 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
 	char		**namelist = NULL;
 	int			numFiles = 0;
 	int			mapnum;
-	int			fileno;
+	int			file_no;
 	bool		vm_crashsafe_change = false;
 
 	old_dir[0] = '\0';
@@ -156,8 +155,8 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
 		{
 			if (numFiles  0)
 			{
-for (fileno = 0; fileno  numFiles; fileno++)
-	pg_free(namelist[fileno]);
+for (file_no = 0; file_no  numFiles; file_no++)
+	pg_free(namelist[file_no]);
 pg_free(namelist);
 			}
 
@@ -171,7 +170,34 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
  maps[mapnum].old_relfilenode);
 		snprintf(new_file, sizeof(new_file), %s/%u, maps[mapnum].new_dir,
  maps[mapnum].new_relfilenode);
-		pg_log(PG_REPORT, OVERWRITE_MESSAGE, old_file);
+		if (!isatty(fileno(stdout)))
+		{
+			/* don't try to be cute if we're not interactive */
+			pg_log(PG_REPORT,   %s\n, old_file);
+		}
+		else
+		{
+			/* 
+			 * print the largest rightmost part of the name that can fit
+			 * within the message width.
+			 */
+			int remove = strlen(old_file) - atoi(MESSAGE_WIDTH);
+			char *start = old_file;
+			char *sep;
+
+			if ( remove  0 )	
+			{
+/*
+ * If it's a partial path. move past the
+ * first file path separator we find, so we don't
+ * print a partial path segment.
+ */
+start += remove;
+if ((sep = strpbrk(start,/\\)) != NULL)
+	start = sep + 1;
+			}
+			pg_log(PG_REPORT, OVERWRITE_MESSAGE, start);
+		}
 
 		/*
 		 * Copy/link the relation's primary file (segment 0 of main fork)
@@ -190,23 +216,23 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
 			snprintf(file_pattern, sizeof(file_pattern), %u_,
 	 maps[mapnum].old_relfilenode);
 
-			for (fileno = 0; fileno  numFiles; fileno++)
+			for (file_no = 0; file_no  numFiles; file_no++)
 			{
-char	   *vm_offset = strstr(namelist[fileno], _vm);
+char	   *vm_offset = strstr(namelist[file_no], _vm);
 bool		is_vm_file = false;
 
 /* Is a visibility map file? (name ends with _vm) */
 if (vm_offset  strlen(vm_offset) == strlen(_vm))
 	is_vm_file = true;
 
-if (strncmp(namelist[fileno], file_pattern,
+if (strncmp(namelist[file_no], file_pattern,
 			strlen(file_pattern)) == 0 
 	(!is_vm_file || !vm_crashsafe_change))
 {
 	snprintf(old_file, sizeof(old_file), %s/%s, maps[mapnum].old_dir,
-			 namelist[fileno]);
+			 namelist[file_no]);
 	snprintf(new_file, sizeof(new_file), %s/%u%s, maps[mapnum].new_dir,
-			 maps[mapnum].new_relfilenode, strchr(namelist[fileno], '_'));
+			 maps[mapnum].new_relfilenode, strchr(namelist[file_no], '_'));
 
 	unlink(new_file);
 	transfer_relfile(pageConverter, old_file, new_file,
@@ -225,15 +251,15 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
 		snprintf(file_pattern, sizeof(file_pattern), %u.,
  maps[mapnum].old_relfilenode);
 
-		for (fileno = 0; fileno  numFiles; fileno++)
+		for (file_no = 0; file_no  numFiles; file_no++)
 		{
-			if (strncmp(namelist[fileno], file_pattern,
+			if (strncmp(namelist[file_no], file_pattern,
 		strlen(file_pattern)) == 0)
 			{
 snprintf(old_file, sizeof(old_file), %s/%s, maps[mapnum].old_dir,
-		 namelist[fileno]);
+		 namelist[file_no]);
 snprintf(new_file, sizeof(new_file), %s/%u%s, maps[mapnum].new_dir,
-		 maps[mapnum].new_relfilenode, strchr(namelist[fileno], '.'));
+		 maps[mapnum].new_relfilenode, strchr(namelist[file_no], '.'));
 
 unlink(new_file);
 transfer_relfile(pageConverter, old_file, new_file,
@@ -244,8 +270,8 @@ transfer_single_new_db(pageCnvCtx *pageConverter,
 
 	if (numFiles  0)
 	{
-		for (fileno = 0; fileno  numFiles; fileno++)
-			pg_free(namelist[fileno]);
+		for (file_no = 0; file_no  numFiles; file_no++)
+			pg_free(namelist[file_no]);
 		pg_free(namelist);
 	}
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your

Re: [HACKERS] pg_upgrade diffs on WIndows

On Tue, Sep  4, 2012 at 08:46:53PM -0400, Andrew Dunstan wrote:
 
 On 09/04/2012 03:44 PM, Andrew Dunstan wrote:
 
 On 09/04/2012 03:09 PM, Andrew Dunstan wrote:
 I realized this morning that I might have been a bit cavalier in
 using dos2unix to smooth away differences in the dumpfiles
 produced by pg_upgrade. Attached is a dump of the diff if this
 isn't done,  with Carriage Returns printed as '*' to make them
 visible. As can be seen, in function bodies dump2 has the
 Carriage Returns doubled. I have not had time to delve into how
 this comes about, and I need to attend to some income-producing
 activity for a bit, but I'd like to get it cleaned up ASAP. We
 are under the hammer for 9.2, so any help other people can give
 on this would be appreciated.
 
 
 
 Actually, I have the answer - it's quite simple. We just need to
 open the output files in binary mode when we split the dumpall
 file. The attached patch fixes it. I think we should backpatch the
 first part to 9.0.
 
 
 
 OK, nobody else has reacted. I've spoken to Bruce and he seems happy
 with it, although, TBH, whe I talked to him I thought I understood
 it and now I'm not so sure. So we have 3 possibilities: leave it as
 is with an error-hiding hack in the test script, apply this patch
 which removes the hack and applies a fix that apparently works but
 which confuses us a bit, or go back to generating errors. The last
 choice would mean I would need to turn off pg_ugrade testing on
 Windows pending a fix. And we have to decide pretty much now so we
 can get 9.2 out the door.

I am very concerned about putting something into pg_upgrade that we
don't fully understand.  Adding stuff to pg_upgrade that we think we
understand is risky enough, as we have seen in the pg_upgrade churn of
the past week.  Let's work on chat to find the complete details --- same
goes for the log file change we are not sure about either.

pg_upgrade is so complicated that I have learned that if we don't fully
understand something, it can affect things we don't anticipate.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multiple setup steps for isolation tests

Kevin Grittner  wrote:
 Tom Lane  wrote:
 the attached version works fine for me.
  
 Yeah, that should do it.  Will apply.
 
Pushed to master and REL9_2_STABLE.
 
-Kevin



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] index-only scans versus serializable transactions

Kevin Grittner  wrote:
 
 New version attached. Will apply if no further problems are found.
 
Pushed to master and REL9_2_STABLE.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'


On 04.09.2012 17:34, Heikki Linnakangas wrote:

On 04.09.2012 16:50, Tom Lane wrote:

Josh Berkusj...@agliodbs.com writes:

Heikki,

It is for 9.2. I'll do a little bit more testing, and barring any
issues, commit the patch. What exactly is the schedule? Do we need
to do
a RC2 because of this?



We're currently scheduled to release next week. If we need to do an
RC2, we're going to have to do some fast rescheduling; we've already
started the publicity machine.


At this point I would argue that the only thing that should abort the
launch is a bad regression. Minor bugs in new features (and this must
be minor if it wasn't noticed before) don't qualify.

Having said that, it'd be good to get it fixed if we can. The schedule
says to wrap 9.2.0 Thursday evening --- Heikki, can you get this fixed
tomorrow (Wednesday)?


The attached patch fixes it for me. It fixes the original problem, by
adding the missing locking and terminating walsenders on a target
timeline change, and also changes the behavior wrt. WAL segments
restored from the archive, as I just suggested in another email
(http://archives.postgresql.org/pgsql-hackers/2012-09/msg00206.php).


Committed that.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] too much pgbench init output

2012-09-04 Thread Robert Haas

On Tue, Sep 4, 2012 at 8:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On 9/1/12 6:30 AM, Robert Haas wrote:
 On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut pete...@gmx.net wrote:
 When initializing a large database, pgbench writes tons of %d tuples
 done lines.  I propose to change this to a sort of progress counter
 that stays on the same line, as in the attached patch.

 I'm not sure I like this - what if the output is being saved off to a file?

 I suppose we could print \n instead of \r then.

 Possibly off-the-wall idea: we could fix the too much output problem
 once and for all by going to a log scale.

 10 tuples done
 100 tuples done
 1000 tuples done
 1 tuples done
 10 tuples done
 ...

I don't like that, because one of the things you can see by following
the current output is where the checkpoint stalls are happening during
the load.  You'd lose the ability to notice any kind of slowdown after
the first few tuples with this kind of format.

Actually, this whole things seems like a solution in search of a
problem to me.  We just reduced the verbosity of pgbench -i tenfold in
the very recent past - I would have thought that enough to address
this problem.  But maybe not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)

2012-09-04 Thread Robert Haas

On Tue, Sep 4, 2012 at 11:15 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 It is something which has to be considered, and I don't think it's
 theoretical at all.  Here's how we deal with it.  We don't use a
 plain FIFO queue for our transaction requests, but a prioritized
 FIFO with 10 levels of priority (0 to 9).  The highest priority (9)
 is reserved for utility requests -- where a running transaction
 needs to spin off a related transaction to do some work for it.  For
 the lowest level (0) we normally allocate only a single connection,
 and it is used for very long-running reports which we want to queue
 to run one-at-a-time.  As examples of how we categorize queries,
 filling a large list in an interactive application will run at
 priority 3, while translating a key which must cause a description
 on the screen to display is run at priority 8.  Normal single-row
 updates and deletes from an interactive application run at priority
 5.

 Each connection in the pool has a worker thread, and is assigned a
 minimum priority that it will handle.  When all threads are busy and
 transaction requests are queued, any thread completing a database
 transaction pulls from the front of the highest priority queue with
 a waiting request to run a transaction, looking only at priorities
 which are not beneath it.  If there are no waiting requests of
 high enough priority, the thread waits for one to arrive.

I well believe that with this sort of sophisticated system you can
make the connection pool much smaller and get a benefit out of it.
However, I think it's quite rare for people to have a system this
sophisticated.  I suspect that's why I typically see much larger pool
sizes.

Here's my other thought about this: we talk a lot about how a system
with 32 cores and 40 drives can't do more than 72 things at once, and
that's absolutely true.  But I think much of the reason why PostgreSQL
users get a benefit out of connection pooling is unrelated to that
effect.  What I think we're really working around, in many cases, is
internal lock contention.  That's why people are talking about
adjusting formulas for 9.2.  It's not that a system with 72 resources
can suddenly do more than 72 things; it's that in the old world lock
contention could easily make it a loser to have even half that many
tasks running at once, and now that's less true.  Hopefully we'll make
further improvements in the future and it'll become even less true
still.  So is the real issue the hardware limits of the server, or is
it the limits of our software?  The former is certainly in the mix,
but I personally believe the latter has a lot more to do with pool
size selection than we typically credit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] too much pgbench init output

On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote:
 Actually, this whole things seems like a solution in search of a
 problem to me.  We just reduced the verbosity of pgbench -i tenfold in
 the very recent past - I would have thought that enough to address
 this problem.  But maybe not.

The problem is that

a) It blasts out too much output and everything scrolls off the screen,
and

b) There is no indication of where the end is.

These are independent problems, and I'd be happy to address them
separately if there are such specific concerns attached to this.

Speaking of tenfold, we could reduce the output frequency tenfold to
once every 100, which would alleviate this problem for a while
longer.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade diffs on WIndows

On Tue, 2012-09-04 at 20:46 -0400, Andrew Dunstan wrote:
 OK, nobody else has reacted. I've spoken to Bruce and he seems happy 
 with it, although, TBH, whe I talked to him I thought I understood it 
 and now I'm not so sure. So we have 3 possibilities: leave it as is with 
 an error-hiding hack in the test script, apply this patch which removes 
 the hack and applies a fix that apparently works but which confuses us a 
 bit, or go back to generating errors. The last choice would mean I would 
 need to turn off pg_ugrade testing on Windows pending a fix. And we have 
 to decide pretty much now so we can get 9.2 out the door.

I think now is not the time to cram in poorly understood changes into a
release candidate.  There is no requirement to have the tests running
now or in time for the release, seeing also that no one has been
particularly bothered about it for the past 11 months.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] too much pgbench init output

Peter Eisentraut pete...@gmx.net writes:
 On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote:
 Actually, this whole things seems like a solution in search of a
 problem to me.  We just reduced the verbosity of pgbench -i tenfold in
 the very recent past - I would have thought that enough to address
 this problem.  But maybe not.

 The problem is that

 a) It blasts out too much output and everything scrolls off the screen,
 and

Robert evidently thinks that the verbosity of the output is a feature
not a bug.  I'm not convinced that eyeballing pgbench output is a
particularly useful way to measure checkpoint stalls, but ...

 b) There is no indication of where the end is.

Well, surely *that* can be fixed in a noncontroversial way: just
print M/N tuples done, where N is the target.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 pg_upgrade regression tests on WIndows

On Mon, Sep  3, 2012 at 12:44:09PM -0400, Andrew Dunstan wrote:
 The attached very small patch allows pg_upgrade's make check to
 succeed on REL9_2_STABLE on my Mingw system.
 
 However, I consider the issue I mentioned earlier regarding use of
 forward slashes in the argument to rmdir to be a significant
 blocker, so I'm going to go and fix that and then pull this all
 together.
 
 cheers
 
 andrew

 diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
 index 6f993df..57ca1df 100644
 --- a/contrib/pg_upgrade/exec.c
 +++ b/contrib/pg_upgrade/exec.c
 @@ -91,10 +91,12 @@ exec_prog(bool throw_error, bool is_priv, const char 
 *log_file,
   else
   retval = 0;
  
 +#ifndef WIN32
   if ((log = fopen_priv(log_file, a+)) == NULL)
   pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
   fprintf(log, \n\n);
   fclose(log);
 +#endif
  
   return retval;
  }

OK, I worked with Andrew on this issue, and have applied the attached
patch which explains what is happening in this case.  Andrew's #ifndef
WIN32 was the correct fix.  I consider this issue closed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
new file mode 100644
index ac46a9b..99f5006
*** a/contrib/pg_upgrade/exec.c
--- b/contrib/pg_upgrade/exec.c
*** exec_prog(const char *log_file, const ch
*** 63,70 
  	if (written = MAXCMDLEN)
  		pg_log(PG_FATAL, command too long\n);
  
! 	if ((log = fopen_priv(log_file, a+)) == NULL)
  		pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
  	pg_log(PG_VERBOSE, %s\n, cmd);
  	fprintf(log, command: %s\n, cmd);
  
--- 63,73 
  	if (written = MAXCMDLEN)
  		pg_log(PG_FATAL, command too long\n);
  
! 	if ((log = fopen_priv(log_file, a)) == NULL)
  		pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
+ #ifdef WIN32
+ 	fprintf(log, \n\n);
+ #endif
  	pg_log(PG_VERBOSE, %s\n, cmd);
  	fprintf(log, command: %s\n, cmd);
  
*** exec_prog(const char *log_file, const ch
*** 97,106 
  
  #ifndef WIN32
  	/* 
! 	 * Can't do this on Windows, postmaster will still hold the log file
! 	 * open if the command was pg_ctl start.
  	 */
! 	if ((log = fopen_priv(log_file, a+)) == NULL)
  		pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
  	fprintf(log, \n\n);
  	fclose(log);
--- 100,112 
  
  #ifndef WIN32
  	/* 
! 	 *	We can't do this on Windows because it will keep the pg_ctl start
! 	 *	output filename open until the server stops, so we do the \n\n above
! 	 *	on that platform.  We use a unique filename for pg_ctl start that is
! 	 *	never reused while the server is running, so it works fine.  We could
! 	 *	log these commands to a third file, but that just adds complexity.
  	 */
! 	if ((log = fopen_priv(log_file, a)) == NULL)
  		pg_log(PG_FATAL, cannot write to log file %s\n, log_file);
  	fprintf(log, \n\n);
  	fclose(log);
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index 195b927..3058343
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
*** extern char *output_files[];
*** 63,69 
  #define SERVER_STOP_LOG_FILE	SERVER_LOG_FILE
  #else
  #define SERVER_START_LOG_FILE	pg_upgrade_server_start.log
! /* pg_ctl stop doesn't keep the log file open, so reuse UTILITY_LOG_FILE */
  #define SERVER_STOP_LOG_FILE	UTILITY_LOG_FILE
  #endif
  
--- 63,73 
  #define SERVER_STOP_LOG_FILE	SERVER_LOG_FILE
  #else
  #define SERVER_START_LOG_FILE	pg_upgrade_server_start.log
! /*
!  *	pg_ctl start keeps SERVER_START_LOG_FILE and SERVER_LOG_FILE open
!  *	while the server is running, so we use UTILITY_LOG_FILE for pg_ctl
!  *	stop.
!  */
  #define SERVER_STOP_LOG_FILE	UTILITY_LOG_FILE
  #endif
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2rc1 produces incorrect results

I wrote:
 I think probably the best fix is to rejigger things so that Params
 assigned by different executions of SS_replace_correlation_vars and
 createplan.c can't share PARAM_EXEC numbers.  This will result in
 rather larger ecxt_param_exec_vals arrays at runtime, but the array
 entries aren't very large, so I don't think it'll matter.

Attached is a draft patch against HEAD for this.  I think it makes the
planner's handling of outer-level Params far less squishy than it's ever
been, but it is rather a large change.  Not sure whether to risk pushing
it into 9.2 right now, or wait till after we cut 9.2.0 ... thoughts?

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 1f2bb6cc72f1242f14d55eee7cdc8e0e0d0775a9..02a0f62a53a4e3d06a3ad48d523e959d5d6b2ab7 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outPlannerGlobal(StringInfo str, const 
*** 1666,1672 
  	WRITE_NODE_TYPE(PLANNERGLOBAL);
  
  	/* NB: this isn't a complete set of fields */
- 	WRITE_NODE_FIELD(paramlist);
  	WRITE_NODE_FIELD(subplans);
  	WRITE_BITMAPSET_FIELD(rewindPlanIDs);
  	WRITE_NODE_FIELD(finalrtable);
--- 1666,1671 
*** _outPlannerGlobal(StringInfo str, const 
*** 1674,1679 
--- 1673,1679 
  	WRITE_NODE_FIELD(resultRelations);
  	WRITE_NODE_FIELD(relationOids);
  	WRITE_NODE_FIELD(invalItems);
+ 	WRITE_INT_FIELD(nParamExec);
  	WRITE_UINT_FIELD(lastPHId);
  	WRITE_UINT_FIELD(lastRowMarkId);
  	WRITE_BOOL_FIELD(transientPlan);
*** _outPlannerInfo(StringInfo str, const Pl
*** 1688,1693 
--- 1688,1694 
  	WRITE_NODE_FIELD(parse);
  	WRITE_NODE_FIELD(glob);
  	WRITE_UINT_FIELD(query_level);
+ 	WRITE_NODE_FIELD(plan_params);
  	WRITE_BITMAPSET_FIELD(all_baserels);
  	WRITE_NODE_FIELD(join_rel_list);
  	WRITE_INT_FIELD(join_cur_level);
*** _outRelOptInfo(StringInfo str, const Rel
*** 1754,1759 
--- 1755,1761 
  	WRITE_FLOAT_FIELD(allvisfrac, %.6f);
  	WRITE_NODE_FIELD(subplan);
  	WRITE_NODE_FIELD(subroot);
+ 	WRITE_NODE_FIELD(subplan_params);
  	/* we don't try to print fdwroutine or fdw_private */
  	WRITE_NODE_FIELD(baserestrictinfo);
  	WRITE_NODE_FIELD(joininfo);
*** _outPlannerParamItem(StringInfo str, con
*** 1950,1956 
  	WRITE_NODE_TYPE(PLANNERPARAMITEM);
  
  	WRITE_NODE_FIELD(item);
! 	WRITE_UINT_FIELD(abslevel);
  }
  
  /*
--- 1952,1958 
  	WRITE_NODE_TYPE(PLANNERPARAMITEM);
  
  	WRITE_NODE_FIELD(item);
! 	WRITE_INT_FIELD(paramId);
  }
  
  /*
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 69a1b93b33746370457bff2daf4d4ece66535803..458dae0489c029bd743c75c82f8e5102067e89bf 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_subquery_pathlist(PlannerInfo *root,
*** 1145,1150 
--- 1145,1153 
  	else
  		tuple_fraction = root-tuple_fraction;
  
+ 	/* plan_params should not be in use in current query level */
+ 	Assert(root-plan_params == NIL);
+ 
  	/* Generate the plan for the subquery */
  	rel-subplan = subquery_planner(root-glob, subquery,
  	root,
*** set_subquery_pathlist(PlannerInfo *root,
*** 1152,1157 
--- 1155,1164 
  	subroot);
  	rel-subroot = subroot;
  
+ 	/* Isolate the params needed by this specific subplan */
+ 	rel-subplan_params = root-plan_params;
+ 	root-plan_params = NIL;
+ 
  	/*
  	 * It's possible that constraint exclusion proved the subquery empty. If
  	 * so, it's convenient to turn it back into a dummy path so that we will
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 5d3b293b88a3ee030adae2260520eda69caad4b7..030f420c90eb37946ee333250de54af61d9b82d7 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*** static HashJoin *create_hashjoin_plan(Pl
*** 84,90 
  	 Plan *outer_plan, Plan *inner_plan);
  static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
  static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
! static void identify_nestloop_extparams(PlannerInfo *root, Plan *subplan);
  static List *fix_indexqual_references(PlannerInfo *root, IndexPath *index_path);
  static List *fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path);
  static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
--- 84,91 
  	 Plan *outer_plan, Plan *inner_plan);
  static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
  static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
! static void process_subquery_nestloop_params(PlannerInfo *root,
!  List *subplan_params);

Re: [HACKERS] too much pgbench init output

2012-09-04 Thread Robert Haas

On Tue, Sep 4, 2012 at 11:31 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote:
 Actually, this whole things seems like a solution in search of a
 problem to me.  We just reduced the verbosity of pgbench -i tenfold in
 the very recent past - I would have thought that enough to address
 this problem.  But maybe not.

 The problem is that

 a) It blasts out too much output and everything scrolls off the screen,
 and

 b) There is no indication of where the end is.

 These are independent problems, and I'd be happy to address them
 separately if there are such specific concerns attached to this.

 Speaking of tenfold, we could reduce the output frequency tenfold to
 once every 100, which would alleviate this problem for a while
 longer.

Well, I wouldn't object to displaying a percentage on each output
line.  But I don't really like the idea of having them less frequent
than they already are, because if you run into a situation that makes
pgbench -i run slowly, as I occasionally do, it's marginal to tell the
difference between slow and completely hung even with the current
level of verbosity.

However, we could add a -q flag to run more quietly, or something like
that.  Actually, I'd even be fine with making the default quieter,
though we can't use -v for verbose since that's already taken.  But
I'd like to preserve the option of getting the current amount of
output because sometimes I need that to troubleshoot problems.
Actually it'd be nice to even get a bit more output: say, a timestamp
on each line, and a completion percentage... but now I'm getting
greedy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] too much pgbench init output

2012-09-04 Thread Pavan Deolasee

On Wed, Sep 5, 2012 at 9:47 AM, Robert Haas robertmh...@gmail.com wrote:


 Actually it'd be nice to even get a bit more output: say, a timestamp
 on each line, and a completion percentage... but now I'm getting
 greedy.


May be we need a verbosity level and print a lot less or a lot more
information than what we do today. That will satisfy everyone. Hopefully.

Thanks,
Pavan

Re: [HACKERS] pg_upgrade diffs on WIndows

Peter Eisentraut pete...@gmx.net writes:
 On Tue, 2012-09-04 at 20:46 -0400, Andrew Dunstan wrote:
 OK, nobody else has reacted. I've spoken to Bruce and he seems happy 
 with it, although, TBH, whe I talked to him I thought I understood it 
 and now I'm not so sure. So we have 3 possibilities: leave it as is with 
 an error-hiding hack in the test script, apply this patch which removes 
 the hack and applies a fix that apparently works but which confuses us a 
 bit, or go back to generating errors. The last choice would mean I would 
 need to turn off pg_ugrade testing on Windows pending a fix. And we have 
 to decide pretty much now so we can get 9.2 out the door.

 I think now is not the time to cram in poorly understood changes into a
 release candidate.  There is no requirement to have the tests running
 now or in time for the release, seeing also that no one has been
 particularly bothered about it for the past 11 months.

Also, the tests *are* passing right now.  I agree, let's not risk
destabilizing it.  pg_upgrade is way overdue for some quiet time so we
can verify a full day's buildfarm cycle on it before the release wrap.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'

On Tue, 2012-09-04 at 19:34 -0700, Heikki Linnakangas wrote:
  The attached patch fixes it for me. It fixes the original problem, by
  adding the missing locking and terminating walsenders on a target
  timeline change, and also changes the behavior wrt. WAL segments
  restored from the archive, as I just suggested in another email
  (http://archives.postgresql.org/pgsql-hackers/2012-09/msg00206.php).
 
 Committed that.

New compiler warnings:

xlog.c: In function ‘XLogFileRead’:
xlog.c:2785:14: error: unused variable ‘endptr’ [-Werror=unused-variable]
xlog.c:2784:25: error: unused variable ‘xlogctl’ [-Werror=unused-variable]




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cascading replication and recovery_target_timeline='latest'