Re: [HACKERS] [BUG?] lag of minRecoveryPont in archive recovery

2012-12-10 Thread Amit Kapila
Monday, December 10, 2012 7:16 AM Kyotaro HORIGUCHI wrote:
 Thank you.
 
  I think moving CheckRecoveryConsistency() after redo apply loop might
 cause
  a problem.
  As currently it is done before recoveryStopsHere() function, which can
 allow
  connections
  on HOTSTANDY. But now if due to some reason recovery pauses or stops
 due to
  above function,
  connections might not be allowed as CheckRecoveryConsistency() is not
  called.
 
 It depends on the precise meaning of minRecoveryPoint. I've not
 found the explicit explanation for it.
 
 Currently minRecoveryPoint is updated only in XLogFlush. Other
 updates of it seems to reset to InvalidXLogRecPtr. XLogFlush
 seems to be called AFTER the redo core operation has been done,
 at least in xact_redo_commit_internal. I shows me that the
 meaning of minRecoveryPoint is that Just AFTER applying the XLog
 at current LSN, the database files will be assumed to be
 consistent.
 
 At Mon, 10 Dec 2012 00:36:31 +0900, Fujii Masao masao.fu...@gmail.com
 wrote in
 cahgqgwg4w5qz7+ljimg8xxuevwz0bynihmzlzmwf0j6kbiu...@mail.gmail.com
  Yes, so we should just add the CheckRecoveryConsistency() call after
  rm_redo rather than moving it? This issue is related to the old
 discussion:
  http://archives.postgresql.org/pgsql-bugs/2012-09/msg00101.php
 
 Since I've not cleary understood the problem of missing it before
 redo, and it also seems to have no harm on performance, I have no
 objection to place it both before and after of redo.

I have tried that way as well, but it didn't completely resolved the problem
reported in above link.
As the situation of defect got arised when it does first time ReadRecord(). 

To resolve the defect mentioned in link by Fujii Masao, actually we need to
check and 
try to reset the backupStartPoint before each ReadRecord.
The reason is that in ReadRecord(), it can go and start waiting for records
from Master.
So now if backupStartPoint is not set and CheckRecoveryConsistency() is not
done, it can keep on waiting
Records from Master and no connections will be allowed on Standby.

I have modified the code of XLogPageRead(...) such that before it calls
WaitForWALToBecomeavailable(), following code will be added

if (!XLogRecPtrIsInvalid(ControlFile-backupEndPoint)  
 
XLByteLE(ControlFile-backupEndPoint, EndRecPtr)) 
{ 
/* 
 * We have reached the end of base
backup, the point where 
 * the minimum recovery point in
pg_control indicates. The 
 * data on disk is now consistent.
Reset backupStartPoint 
 * and backupEndPoint. 
 */ 
elog(DEBUG1, end of backup
reached); 

LWLockAcquire(ControlFileLock,
LW_EXCLUSIVE); 

 
MemSet(ControlFile-backupStartPoint, 0, sizeof(XLogRecPtr)); 
MemSet(ControlFile-backupEndPoint,
0, sizeof(XLogRecPtr)); 
ControlFile-backupEndRequired =
false; 
UpdateControlFile(); 

LWLockRelease(ControlFileLock); 
}

CheckRecoveryConsistency();

This had completely resolved the problem reported on above link for me.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2012-12-10 Thread Kyotaro HORIGUCHI
Thank you.

 heap_attr_get_length_and_check_equals: 
..
 - This function returns always false for attrnum = 0 as whole 
   tuple or some system attrs comparison regardless of the real 
   result, which is a bit different from the anticipation which 
   the name gives. If you need to keep this optimization, it 
   should have the name more specific to the purpose. 
 
 The heap_attr_get_length_and_check_equals function is similar to 
 heap_tuple_attr_equals, 
 the attrnum = 0 check is required for heap_tuple_attr_equals. 

Sorry, you're right.

 haap_delta_encode: 
  
 - Some misleading variable names (like match_not_found), 
   some reatitions of similiar codelets (att_align_pointer, pglz_out_tag), 
   misleading slight difference of the meanings of variables of 
   similar names(old_off and new_off and the similar pairs), 
   and bit tricky use of pglz_out_add and pglz_out_tag with length = 0. 
  
   These are welcome to be modified for better readability. 
 
 The variable names are modified, please check them once. 
 
 The (att_align_pointer, pglz_out_tag) repetition code is added to take care 
 of padding only incase of values are equal. 
 Use of pglz_out_add and pglz_out_tag with length = 0 is done because of code 
 readability. 

Oops! Sorry for mistake. My point was that the bases for old_off
(of match_off) and dp, not new_off. It is no unnatural. Namings
had not been the problem and the function was perfect as of the
last patch. I'd been confised by the asymmetry between match_off
to pglz_out_tag and dp to pglz_out_add.

 Another change is also done to handle the history size of 2 bytes which is 
 possible with the usage of LZ macro's for delta encoding.

Good catch. This seems to have been a potential bug which does no
harm when called from pglz_compress..

==

Looking into wal_update_changes_mod_lz_v6.patch, I understand
that this patch experimentally adds literal data segment which
have more than single byte in PG-LZ algorithm.  According to
pglz_find_match, memCMP is slower than 'while(*s  *s == *d)' if
len  16 and I suppose it is probably true at least for 4 byte
length data. This is also applied on encoding side. If this mod
does no harm to performance, I want to see this applied also to
pglz_comress.

 By the way, the comment on pg_lzcompress.c:690 seems to quite
differ from what the code does.


regards,


*1: 
http://archives.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C38285495B0@szxeml509-mbx

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage

2012-12-10 Thread Vik Reykja
On Sat, Dec 1, 2012 at 1:14 PM, Pavel Stehule pavel.steh...@gmail.comwrote:

 Hello

 
  Hi Pavel.
 
  I am trying to review this patch and on my work computer everything
 compiles
  and tests perfectly. However, on my laptop, the regression tests don't
 pass
  with cache lookup failed for type XYZ where XYZ is some number that
 does
  not appear to be any type oid.
 
  I don't really know where to go from here. I am asking that other people
 try
  this patch to see if they get errors as well.
 

 yes, I checked it on .x86_64 and I had a same problems

 probably there was more than one issue - I had to fix a creating a
 unpacked params and I had a issue with gcc optimalization when I used
 a stack variable for fcinfo.

 Now I fixed these issues and I hope  so it will work on all platforms


It appears to work a lot better, yes.  I played around with it a little bit
and wasn't able to break it, so I'm marking it as ready for committer.
Some wordsmithing will need to be done on the code comments.


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Simon Riggs
On 10 December 2012 06:03, Michael Paquier michael.paqu...@gmail.com wrote:
 On 2012-12-08 09:40:43 -0500, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
  I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
  preserve the index name exactly.  Something like adding or removing
  trailing underscores would probably serve to generate a nonconflicting
  name that's not too unsightly.  Or just generate a new name using the
  same rules that CREATE INDEX would when no name is specified.  Yeah,
  it's a hack, but what about the CONCURRENTLY commands isn't a hack?

 I have no problem with ending up with a new name or something like
 that. If that is what it takes: fine, no problem.

 For the indexes that are created internally by the system like toast or
 internal primary keys this is acceptable. However in the case of indexes
 that have been created externally I do not think it is acceptable as this
 impacts the user that created those indexes with a specific name.

If I have to choose between (1) keeping the same name OR (2) avoiding
an AccessExclusiveLock then I would choose (2). Most other people
would also, especially when all we would do is add/remove an
underscore. Even if that is user visible. And if it is we can support
a LOCK option that does (1) instead.

If we make it an additional constraint on naming, it won't be a
problem... namely that you can't create an index with/without an
underscore at the end, if a similar index already exists that has an
identical name apart from the suffix.

There are few, if any, commands that need the index name to remain the
same. For those, I think we can bend them to accept the index name and
then add/remove the underscore to get that to work.

That's all a little bit crappy, but this is too small a problem with
an important feature to allow us to skip.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Michael Paquier


--
Michael Paquier
http://michael.otacoo.com

On 2012/12/10, at 18:28, Simon Riggs si...@2ndquadrant.com wrote:

 On 10 December 2012 06:03, Michael Paquier michael.paqu...@gmail.com wrote:
 On 2012-12-08 09:40:43 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
 preserve the index name exactly.  Something like adding or removing
 trailing underscores would probably serve to generate a nonconflicting
 name that's not too unsightly.  Or just generate a new name using the
 same rules that CREATE INDEX would when no name is specified.  Yeah,
 it's a hack, but what about the CONCURRENTLY commands isn't a hack?
 
 I have no problem with ending up with a new name or something like
 that. If that is what it takes: fine, no problem.
 
 For the indexes that are created internally by the system like toast or
 internal primary keys this is acceptable. However in the case of indexes
 that have been created externally I do not think it is acceptable as this
 impacts the user that created those indexes with a specific name.
 
 If I have to choose between (1) keeping the same name OR (2) avoiding
 an AccessExclusiveLock then I would choose (2). Most other people
 would also, especially when all we would do is add/remove an
 underscore. Even if that is user visible. And if it is we can support
 a LOCK option that does (1) instead.
 
 If we make it an additional constraint on naming, it won't be a
 problem... namely that you can't create an index with/without an
 underscore at the end, if a similar index already exists that has an
 identical name apart from the suffix.
 
 There are few, if any, commands that need the index name to remain the
 same. For those, I think we can bend them to accept the index name and
 then add/remove the underscore to get that to work.
 
 That's all a little bit crappy, but this is too small a problem with
 an important feature to allow us to skip.
Ok. Removing the switch name part is only deleting 10 lines of code in 
index_concurrent_swap.
Then, do you guys have a preferred format for the concurrent index name? For 
the time being an inelegant _cct suffix is used. The underscore at the end?

Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG?] lag of minRecoveryPont in archive recovery

2012-12-10 Thread Heikki Linnakangas

On 10.12.2012 03:52, Kyotaro HORIGUCHI wrote:

I think that minRecoveryPoint should be updated before the data-file
changes, so XLogFlush() should be called before smgrtruncate(). No?


Mmm.. As far as I saw in xact_redo_commit_internal, XLogFlush
seems should be done AFTER redo substantial operation. Is it
wrong? If so, I suppose xact_redo_commit_internal could shold be
fixed in the same way.


So, two options:

1. Redo truncation, then XLogFlush()

There's a window where the original problem could still occur: The file 
is truncated, and you crash before XLogFlush() finishes.


2. XLogFlush(), then redo truncation.

If the truncation fails for some reason (disk failure?) and you have 
already updated minRecoveryPoint, you can not start up until you somehow 
fix the problem so that the truncation can succeed, and then finish 
recovery.


Both options have their merits. The window in 1. is very small, and in 
2., the probability that an unlink() or truncation fails is very small 
as well.


We've chosen 1. in xact_redo_commit_internal(), but I don't think that 
was a carefully made decision, it just imitates what happens in the 
corresponding non-redo commit path. In xact_redo_commit_internal(), it 
makes sense to do XLogFlush() afterwards, for CREATE DATABASE and CREATE 
TABLESPACE. Those create files, and if you e.g run out of disk space, or 
non-existent path, you don't want minRecoveryPoint to be updated yet. 
Otherwise you can no longer recover to the point just before the 
creation of those files. But in case of DROP DATABASE, you have the 
exact same situation as with truncation: if you have already deleted 
some files, you must not be able to stop recovery at a point before that.


So I'd say we should update minRecoveryPoint first, then 
truncate/delete. But we should still keep the XLogFlush() at the end of 
xact_redo_commit_internal(), for the case where files/directories are 
created. Patch attached.


- Heikki
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
***
*** 4617,4639  xact_redo_commit_internal(TransactionId xid, XLogRecPtr lsn,
  	}
  
  	/* Make sure files supposed to be dropped are dropped */
! 	for (i = 0; i  nrels; i++)
  	{
! 		SMgrRelation srel = smgropen(xnodes[i], InvalidBackendId);
! 		ForkNumber	fork;
  
! 		for (fork = 0; fork = MAX_FORKNUM; fork++)
! 			XLogDropRelation(xnodes[i], fork);
! 		smgrdounlink(srel, true);
! 		smgrclose(srel);
  	}
  
  	/*
  	 * We issue an XLogFlush() for the same reason we emit ForceSyncCommit()
! 	 * in normal operation. For example, in DROP DATABASE, we delete all the
! 	 * files belonging to the database, and then commit the transaction. If we
! 	 * crash after all the files have been deleted but before the commit, you
! 	 * have an entry in pg_database without any files. To minimize the window
  	 * for that, we use ForceSyncCommit() to rush the commit record to disk as
  	 * quick as possible. We have the same window during recovery, and forcing
  	 * an XLogFlush() (which updates minRecoveryPoint during recovery) helps
--- 4617,4660 
  	}
  
  	/* Make sure files supposed to be dropped are dropped */
! 	if (nrels  0)
  	{
! 		/*
! 		 * First update minimum recovery point to cover this WAL record. Once
! 		 * a relation is deleted, there's no going back. The buffer manager
! 		 * enforces the WAL-first rule for normal updates to relation files,
! 		 * so that the minimum recovery point is always updated before the
! 		 * corresponding change in the data file is flushed to disk, but we
! 		 * have to do the same here since we're bypassing the buffer manager.
! 		 *
! 		 * Doing this before the deleting the files means that if a deletion
! 		 * fails for some reason, you cannot start up the system even after
! 		 * restart, until you fix the underlying situation so that the
! 		 * deletion will succeed. Alternatively, we could update the minimum
! 		 * recovery point after deletion, but that would leave a small window
! 		 * where the WAL-first rule would be violated.
! 		 */
! 		XLogFlush(lsn);
  
! 		for (i = 0; i  nrels; i++)
! 		{
! 			SMgrRelation srel = smgropen(xnodes[i], InvalidBackendId);
! 			ForkNumber	fork;
! 
! 			for (fork = 0; fork = MAX_FORKNUM; fork++)
! XLogDropRelation(xnodes[i], fork);
! 			smgrdounlink(srel, true);
! 			smgrclose(srel);
! 		}
  	}
  
  	/*
  	 * We issue an XLogFlush() for the same reason we emit ForceSyncCommit()
! 	 * in normal operation. For example, in CREATE DATABASE, we copy all the
! 	 * files from the template database, and then commit the transaction. If we
! 	 * crash after all the files have been copied but before the commit, you
! 	 * have files in the data directory without an entry in pg_database.
! 	 * To minimize the window
  	 * for that, we use ForceSyncCommit() to rush the commit record to disk as
  	 * quick as possible. We have the same window during recovery, and forcing
  	 * an XLogFlush() (which updates minRecoveryPoint 

Re: [HACKERS] Switching timeline over streaming replication

2012-12-10 Thread Amit Kapila
 From: Heikki Linnakangas [mailto:hlinnakan...@vmware.com]
 Sent: Friday, December 07, 2012 9:22 PM
 To: Amit Kapila
 Cc: 'PostgreSQL-development'; 'Thom Brown'
 Subject: Re: [HACKERS] Switching timeline over streaming replication
 
 On 06.12.2012 15:39, Amit Kapila wrote:
  On Thursday, December 06, 2012 12:53 AM Heikki Linnakangas wrote:
  On 05.12.2012 14:32, Amit Kapila wrote:
  On Tuesday, December 04, 2012 10:01 PM Heikki Linnakangas wrote:
  After some diversions to fix bugs and refactor existing code, I've
  committed a couple of small parts of this patch, which just add
  some sanity checks to notice incorrect PITR scenarios. Here's a new
  version of the main patch based on current HEAD.
 
  After testing with the new patch, the following problems are
 observed.
 
  Defect - 1:
 
1. start primary A
2. start standby B following A
3. start cascade standby C following B.
4. start another standby D following C.
5. Promote standby B.
6. After successful time line switch in cascade standby C
 D,
  stop D.
7. Restart D, Startup is successful and connecting to standby
 C.
8. Stop C.
9. Restart C, startup is failing.
 
  Ok, the error I get in that scenario is:
 
  C 2012-12-05 19:55:43.840 EET 9283 FATAL:  requested timeline 2 does
  not contain minimum recovery point 0/3023F08 on timeline 1 C
  2012-12-05
  19:55:43.841 EET 9282 LOG:  startup process (PID 9283) exited with
  exit code 1 C 2012-12-05 19:55:43.841 EET 9282 LOG:  aborting startup
  due to startup process failure
 
 
 
 Well, it seems wrong for the control file to contain a situation like
 this:
 
 pg_control version number:932
 Catalog version number:   201211281
 Database system identifier:   5819228770976387006
 Database cluster state:   shut down in recovery
 pg_control last modified: pe  7. joulukuuta 2012 17.39.57
 Latest checkpoint location:   0/3023EA8
 Prior checkpoint location:0/260
 Latest checkpoint's REDO location:0/3023EA8
 Latest checkpoint's REDO WAL file:00020003
 Latest checkpoint's TimeLineID:   2
 ...
 Time of latest checkpoint:pe  7. joulukuuta 2012 17.39.49
 Min recovery ending location: 0/3023F08
 Min recovery ending loc's timeline:   1
 
 Note the latest checkpoint location and its TimelineID, and compare them
 with the min recovery ending location. The min recovery ending location
 is ahead of latest checkpoint's location; the min recovery ending
 location actually points to the end of the checkpoint record. But how
 come the min recovery ending location's timeline is 1, while the
 checkpoint record's timeline is 2.
 
 Now maybe that would happen to work if remove the sanity check, but it
 still seems horribly confusing. I'm afraid that discrepancy will come
 back to haunt us later if we leave it like that. So I'd like to fix
 that.
 
 Mulling over this for some more, I propose the attached patch. With the
 patch, we peek into the checkpoint record, and actually perform the
 timeline switch (by changing ThisTimeLineID) before replaying it. That
 way the checkpoint record is really considered to be on the new timeline
 for all purposes. At the moment, the only difference that makes in
 practice is that we set replayEndTLI, and thus minRecoveryPointTLI, to
 the new TLI, but it feels logically more correct to do it that way.

This has fixed both the problems reported in below link:
http://archives.postgresql.org/pgsql-hackers/2012-12/msg00267.php

The code is also fine.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Stephen Frost
Simon,

* Simon Riggs (si...@2ndquadrant.com) wrote:
 Agreed, but that is also be a silent change of current behaviour.

Sure, proper MVCC for catalog entries would be a change, but I think
it's one which would actually reduce the surprises for our users.  Today
we tell people we have transactional DDL, and that's somewhat true, but
it's not MVCC-safe and that can lead to surprises.

 But the above will only work for CREATE TABLE, not for TRUNCATE.

I'm trying to figure out why there are all the constraints around this
command to begin with.  If we're going to support this, why do we
require the user to create or truncate the table in the same
transaction?  Clearly that's going to reduce the usefulness of this
feature and it's not clear to me why that constraint is needed,
code-wise.

Also, what about adding FREEZE options to INSERT and maybe even UPDATE?
Surely it would reduce the overhead associated with those commands as
well.

 I've invested a lot of time in trying to improve the situation and
 investigated many routes forwards, none of them very satisfying. Until
 someone comes up with a better plan, FREEZE is a pragmatic way
 forwards that improves things now rather than waiting for the perfect
 solution.

I agree that the perfect can sometimes be the enemy of the good, but I
still feel that this is quite a slippery slope that's going to end up
getting ourselves into trouble- regardless of how much we document it or
set up constraints around it.

 And if we want checksums anytime soon we need ways to
 ameliorate the effect of hints on checksums, which this does,
 soemwhat.

I don't buy into this argument in the least.

 Better plans, with code, welcome.

While I appreciate the mentality that those-who-code-win, I don't
believe that it can be used in an argument based on *correctness*.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Andres Freund
On 2012-12-10 15:03:59 +0900, Michael Paquier wrote:
 I have updated the patch (v4) to take care of updating reltoastidxid for
 toast parent relations at the swap step by using index_update_stats. In
 prior versions of the patch this was done when concurrent index was built,
 leading to toast relations using invalid indexes if there was a failure
 before the swap phase. The update of reltoastidxids of toast relation is
 done with RowExclusiveLock.
 I also added a couple of tests in src/test/isolation. Btw, as for the time
 being the swap step uses AccessExclusiveLock to switch old and new
 relnames, it does not have any meaning to run them...

Btw, as an example of the problems caused by renaming:

postgres=# CREATE TABLE a (id serial primary key); CREATE TABLE b(id
serial primary key, a_id int REFERENCES a);
CREATE TABLE
Time: 137.840 ms
CREATE TABLE
Time: 143.500 ms
postgres=# \d b
 Table public.b
 Column |  Type   |   Modifiers
+-+
 id | integer | not null default nextval('b_id_seq'::regclass)
 a_id   | integer |
Indexes:
b_pkey PRIMARY KEY, btree (id)
Foreign-key constraints:
b_a_id_fkey FOREIGN KEY (a_id) REFERENCES a(id)

postgres=# REINDEX TABLE a CONCURRENTLY;
NOTICE:  drop cascades to constraint b_a_id_fkey on table b
REINDEX
Time: 248.992 ms
postgres=# \d b
 Table public.b
 Column |  Type   |   Modifiers
+-+
 id | integer | not null default nextval('b_id_seq'::regclass)
 a_id   | integer |
Indexes:
b_pkey PRIMARY KEY, btree (id)


Looking at the patch for a bit now.

Regards,

Andres
--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Andres Freund
On 2012-12-10 15:51:40 +0100, Andres Freund wrote:
 On 2012-12-10 15:03:59 +0900, Michael Paquier wrote:
  I have updated the patch (v4) to take care of updating reltoastidxid for
  toast parent relations at the swap step by using index_update_stats. In
  prior versions of the patch this was done when concurrent index was built,
  leading to toast relations using invalid indexes if there was a failure
  before the swap phase. The update of reltoastidxids of toast relation is
  done with RowExclusiveLock.
  I also added a couple of tests in src/test/isolation. Btw, as for the time
  being the swap step uses AccessExclusiveLock to switch old and new
  relnames, it does not have any meaning to run them...

 Btw, as an example of the problems caused by renaming:
 Looking at the patch for a bit now.

Some review comments:

* Some of the added !is_reindex in index_create don't seem safe to
  me. Why do we now support reindexing exlusion constraints?

* REINDEX DATABASE .. CONCURRENTLY doesn't work, a variant that does the
  concurrent reindexing for user-tables and non-concurrent for system
  tables would be very useful. E.g. for the upgrade from 9.1.5-9.1.6...

* ISTM index_concurrent_swap should get exlusive locks on the relation
  *before* printing their names. This shouldn't be required because we
  have a lock prohibiting schema changes on the parent table, but it
  feels safer.

* temporary index names during swapping should also be named via
  ChooseIndexName

* why does create_toast_table pass an unconditional 'is_reindex' to
  index_create?

* would be nice (but thats probably a step #2 thing) to do the
  individual steps of concurrent reindex over multiple relations to
  avoid too much overall waiting for other transactions.

* ReindexConcurrentIndexes:

  * says  Such indexes are simply bypassed if caller has not specified
anything. but ERROR's. Imo ERROR is fine, but the comment should be
adjusted...

  * should perhaps be names ReindexIndexesConcurrently?

  * Imo the PHASE 1 comment should be after gathering/validitating the
chosen indexes

  * It seems better to me to do use individual transactions + snapshots
for each index, no need to keep very long transactions open (PHASE
2/3)

  * s/same whing/same thing/

  * Shouldn't a CacheInvalidateRelcacheByRelid be done after PHASE 2 and
5 as well?

  * PHASE 6 should acquire exlusive locks on the indexes

* can some of index_concurrent_* infrastructure be reused for
  DROP INDEX CONCURRENTLY?

* in CREATE/DROP INDEX CONCURRENTLY 'CONCURRENTLY comes before the
  object name, should we keep that conventioN?


Thats all I have for now.


Very nice work! Imo the code looks cleaner after your patch...


Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest #3 and upcoming schedule

2012-12-10 Thread Andres Freund
On 2012-12-10 09:22:25 +0530, Amit Kapila wrote:
 On Sunday, December 09, 2012 9:27 PM Simon Riggs
  On 16 November 2012 07:20, Greg Smith g...@2ndquadrant.com wrote:
 
 
  Let's bring balance to the situation through our own actions. Please
  review one patch now for every one you submitted.

 In CF-3, I am Author of 5 and Reviewer of 5

 3 of my patches as Author have been moved from CF-2

You're not alone in that ;)

 4 of the patches where I am reviewer have been moved from CF-2

 One of my Patch : Patch for option in pg_resetxlog for restore from WAL
 files
 is dependent on another patch XLogReader, so I am expecting to get it done
 only after XLogReader.

Btw, I posted the current version of this at:
http://archives.postgresql.org/message-id/20121204175212.GB12055%40awork2.anarazel.de

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-12-10 Thread Andres Freund
On 2012-12-08 17:07:38 +0100, Tomas Vondra wrote:
 On 8.12.2012 15:49, Tomas Vondra wrote:
  On 8.12.2012 15:26, Andres Freund wrote:
  On 2012-12-06 23:38:59 +0100, Tomas Vondra wrote:
  I've re-run the tests with the current patch on my home workstation, and
  the results are these (again 10k tables, dropped either one-by-one or in
  batches of 100).
 
  1) unpatched
 
  dropping one-by-one:15.5 seconds
  dropping in batches of 100: 12.3 sec
 
  2) patched (v3.1)
 
  dropping one-by-one:32.8 seconds
  dropping in batches of 100:  3.0 sec
 
  The problem here is that when dropping the tables one-by-one, the
  bsearch overhead is tremendous and significantly increases the runtime.
  I've done a simple check (if dropping a single table, use the original
  simple comparison) and I got this:
 
  3) patched (v3.2)
 
  dropping one-by-one:16.0 seconds
  dropping in batches of 100:  3.3 sec
 
  i.e. the best of both. But it seems like an unnecessary complexity to me
  - if you need to drop a lot of tables you'll probably do that in a
  transaction anyway.
 
 
  Imo that's still a pretty bad performance difference. And your
  single-table optimization will probably fall short as soon as the table
  has indexes and/or a toast table...
 
  Why? I haven't checked the code but either those objects are droppped
  one-by-one (which seems unlikely) or they're added to the pending list
  and then the new code will kick in, making it actually faster.
 
  Yes, the original code might be faster even for 2 or 3 objects, but I
  can't think of a simple solution to fix this, given that it really
  depends on CPU/RAM speed and shared_buffers size.

 I've done some test and yes - once there are other objects the
 optimization falls short. For example for tables with one index, it
 looks like this:

   1) unpatched

   one by one:  28.9 s
   100 batches: 23.9 s

   2) patched

   one by one:  44.1 s
   100 batches:  4.7 s

 So the patched code is by about 50% slower, but this difference quickly
 disappears with the number of indexes / toast tables etc.

 I see this as an argument AGAINST such special-case optimization. My
 reasoning is this:

 * This difference is significant only if you're dropping a table with
   low number of indexes / toast tables. In reality this is not going to
   be very frequent.

 * If you're dropping a single table, it really does not matter - the
   difference will be like 100 ms vs. 200 ms or something like that.

I don't particularly buy that argument. There are good reasons (like
avoiding deadlocks, long transactions) to drop multiple tables
in individual transactions.
Not that I have a good plan to how to work around that though :(

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 On 2012/12/10, at 18:28, Simon Riggs si...@2ndquadrant.com wrote:
 If I have to choose between (1) keeping the same name OR (2) avoiding
 an AccessExclusiveLock then I would choose (2). Most other people
 would also, especially when all we would do is add/remove an
 underscore. Even if that is user visible. And if it is we can support
 a LOCK option that does (1) instead.

 Ok. Removing the switch name part is only deleting 10 lines of code in 
 index_concurrent_swap.
 Then, do you guys have a preferred format for the concurrent index name? For 
 the time being an inelegant _cct suffix is used. The underscore at the end?

You still need to avoid conflicting name assignments, so my
recommendation would really be to use the select-a-new-name code already
in use for CREATE INDEX without an index name.  The underscore idea is
cute, but I doubt it's worth the effort to implement, document, or
explain it in a way that copes with repeated REINDEXes and conflicts.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] replication optimization: page writes only at the slave

2012-12-10 Thread Xin Pan

Assumption: I have enough memory to cache all the database pages.
Goal:
Master never write pages. Slave replays logs from master and writes pages.
Benefits:
Reduce the page IO overhead at master, save money in EC2 cloud.

Question:
Can you give me some comments on this idea?
And I cannot turn of page writes in Postgresql.

I adjust the following parameters:
shared_buffers = 3GB

bgwriter_delay = 2000ms # 10-1ms between rounds
bgwriter_lru_maxpages = 0   # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 0 # 0-10.0 multipler on buffers 
scanned/round


checkpoint_segments = 256# in logfile segments, min 1, 
16MB each

checkpoint_timeout = 1h  # range 30s-1h
checkpoint_completion_target = 1.0  # checkpoint target duration, 
0.0 - 1.0


However, I still witness large amount of page writes.
Can anyone tell where are the page writes come from?
Can I turn off that part of page writes by configuration?
If not, which part of source code should I adjust to achieve my goal 
(turn of page writes)?



Thanks!

Xin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG?] lag of minRecoveryPont in archive recovery

2012-12-10 Thread Heikki Linnakangas

On 10.12.2012 13:50, Heikki Linnakangas wrote:

So I'd say we should update minRecoveryPoint first, then
truncate/delete. But we should still keep the XLogFlush() at the end of
xact_redo_commit_internal(), for the case where files/directories are
created. Patch attached.


Committed and backpatched that. Attached is a script I used to reproduce 
this problem, going back to 8.4.


- Heikki


minrecoverypoint-recipe.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication optimization: page writes only at the slave

2012-12-10 Thread Hannu Krosing

On 12/10/2012 04:56 PM, Xin Pan wrote:

Assumption: I have enough memory to cache all the database pages.
Goal:
Master never write pages. Slave replays logs from master and writes 
pages.

Benefits:
Reduce the page IO overhead at master, save money in EC2 cloud.


I have suggested something similar on table-by-table basis but this has 
not yet

generated much traction. I'll come back to this in coming weeks

For whole WAL you can achieve this by putting WAL on a large-enough 
ramdrive.


Hannu



Question:
Can you give me some comments on this idea?
And I cannot turn of page writes in Postgresql.

I adjust the following parameters:
shared_buffers = 3GB

bgwriter_delay = 2000ms # 10-1ms between rounds
bgwriter_lru_maxpages = 0   # 0-1000 max buffers 
written/round
bgwriter_lru_multiplier = 0 # 0-10.0 multipler on buffers 
scanned/round


checkpoint_segments = 256# in logfile segments, min 1, 
16MB each

checkpoint_timeout = 1h  # range 30s-1h
checkpoint_completion_target = 1.0  # checkpoint target duration, 
0.0 - 1.0


However, I still witness large amount of page writes.
Can anyone tell where are the page writes come from?
Can I turn off that part of page writes by configuration?
If not, which part of source code should I adjust to achieve my goal 
(turn of page writes)?



Thanks!

Xin







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-12-10 Thread Tomas Vondra

Dne 10.12.2012 16:38, Andres Freund napsal:

On 2012-12-08 17:07:38 +0100, Tomas Vondra wrote:

I've done some test and yes - once there are other objects the
optimization falls short. For example for tables with one index, it
looks like this:

  1) unpatched

  one by one:  28.9 s
  100 batches: 23.9 s

  2) patched

  one by one:  44.1 s
  100 batches:  4.7 s

So the patched code is by about 50% slower, but this difference 
quickly

disappears with the number of indexes / toast tables etc.

I see this as an argument AGAINST such special-case optimization. My
reasoning is this:

* This difference is significant only if you're dropping a table 
with
  low number of indexes / toast tables. In reality this is not going 
to

  be very frequent.

* If you're dropping a single table, it really does not matter - the
  difference will be like 100 ms vs. 200 ms or something like that.


I don't particularly buy that argument. There are good reasons (like
avoiding deadlocks, long transactions) to drop multiple tables
in individual transactions.
Not that I have a good plan to how to work around that though :(


Yeah, if you need to drop the tables one by one for some reason, you
can't get rid of the overhead this way :-(

OTOH in the example above the overhead is ~50%, i.e. 1.5ms / table with 
a
single index. Each such associated relation (index, TOAST table, ...) 
means

a relation that needs to be dropped and on my machine, once I reach ~5
relations there's almost no difference as the overhead is balanced by 
the

gains.

Not sure how to fix that in an elegant way, though :-(

Tomas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG?] lag of minRecoveryPont in archive recovery

2012-12-10 Thread Fujii Masao
On Tue, Dec 11, 2012 at 1:33 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 10.12.2012 13:50, Heikki Linnakangas wrote:

 So I'd say we should update minRecoveryPoint first, then
 truncate/delete. But we should still keep the XLogFlush() at the end of
 xact_redo_commit_internal(), for the case where files/directories are
 created. Patch attached.

Sounds reasonable.

 Committed and backpatched that. Attached is a script I used to reproduce
 this problem, going back to 8.4.

Thanks!

Unfortunately I could reproduce the problem even after that commit.
Attached is the script I used to reproduce the problem.

The cause is that CheckRecoveryConsistency() is called before rm_redo(),
as Horiguchi-san suggested upthead. Imagine the case where
minRecoveryPoint is set to the location of the XLOG_SMGR_TRUNCATE
record. When restarting the server with that minRecoveryPoint,
the followings would happen, and then PANIC occurs.

1. XLOG_SMGR_TRUNCATE record is read.
2. CheckRecoveryConsistency() is called, and database is marked as
consistent since we've reached minRecoveryPoint (i.e., the location
of XLOG_SMGR_TRUNCATE).
3. XLOG_SMGR_TRUNCATE record is replayed, and invalid page is
found.
4. Since the database has already been marked as consistent, an invalid
page leads to PANIC.

Regards,

-- 
Fujii Masao


fujii_test.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions

2012-12-10 Thread Karl O. Pinc
On 12/09/2012 10:33:59 PM, Karl O. Pinc wrote:
 Hi,
 
 I don't feel particularly qualified to comment, but in the
 interest of (hopefully) helping with the patch review process
 I'm going to comment anyway.

I've gone ahead and signed up to review this patch.

I can confirm that it compiles against head and
the tests pass.  I started up a server
and tried it and it works for me, trapping
the exception and executing the exception code.

So, looks good, as far as it goes.
I await your response to my previous message
in this thread before sending it on a
a committer.  There were 2 outstanding
issue raised:

Is this useful enough/proceeding in the right
direction to commit now?

Should some of the logic be moved out of a
subroutine and into the calling code?

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shuffling xlog header files

2012-12-10 Thread Andres Freund
Hi,

The xlog_fn.h patch was Alvaro's idea (and patch) btw, I previously had
used an ugly typedef for Datum to get arround defining
FRONTEND/including postgres.h...

On 2012-12-10 19:56:49 +0200, Heikki Linnakangas wrote:
 We still need the #define FRONTEND 1 ugly hack in pg_controldata and
 pg_resetxlog with this, but we get rid of it in pg_basebackup. I think
 that's reasonable, pg_controldata and pg_resetxlog are more low-level
 programs than pg_basebackup.

It's also for the pg_receivellog introduced in one of my other
patches... So it seems to be a good idea to me.

 The name xlog_internal.h is a bit of a misnomer now, it's
 not very internal anymore, if it can now actually be included by external
 programs. But the point is that the file contains declarations related to
 the WAL file format.

We could rename it to xlog_details.h or such, but I guess the noise
would outhweigh the benefits.


 Any objections?

Unsurprisingly none from my side.


Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_database_size issue an error (ERROR: could not stat file base/16384/20041: Permission denied)

2012-12-10 Thread Christophe GUILLOT
Hi all,
The usage of the pg_database_size function generate a lot of permission
denied log errors.
Typical message is: ERROR:  could not stat file base/16384/20041:
Permission denied.

As a short description:
- PostgreSQL 9.2.1/64 bits
- our system is running on windows platform (W7, W2008)
- frequent alter/create/drop of tables
- call to pg_database_size, sometimes.

When logging file system activity (using procmon, W7), the query made on
the file printed in the log message return a “DELETE PENDING” state and i
only find a handle to this file owned by a postgres executable.

After code exploration, it looks like the functions from dbsize.c should
use the same trick as found in md.c (i mean the FILE_POSSIBLY_DELETED
macro) for testing each stat() returned error code.

So, is it a good idea to patch dbsize.c with the same macro, which on
windows platform include the EACCES error code as a possibly deleted file,
or not. Or is it the wrong way ? I didn't try to patch yet.

Thanks for your help,

Christophe


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-12-10 Thread Robert Haas
On Tue, Nov 27, 2012 at 4:36 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/25/12 6:36 PM, Robert Haas wrote:
 I think that is true.  But for whatever it's worth, and at the risk of
 beating a horse that seems not to be dead yet in spite of the fact
 that I feel I've already administered one hell of a beating, the LPAD
 case is unambiguous, and therefore it is hard to see what sort of
 programming mistake we are protecting users against.

 Upstream applications passing wrong data down to the database.

The circumstantial evidence suggests that many users don't want to be
protected against that in the way that we are currently protecting
them - or at least not all of them do (see Merlin's email elsewhere on
this thread).  What's frustrating about the status quo is that not
only do you need lots of extra casts, but there's no real way to
improve the situation.  If you add an implicit cast from integer to
text, for example, then 4 || 'foo' breaks.  Now, you may think that
adding an implicit cast from integer to text is a dumb idea, and maybe
it is, but don't get too hung up on that example.  The point is that
if you're unhappy with the quote_literal() case (because it accepts
too much), or the lpad() case (because it doesn't accept enough), or
the foo(smallint) case (because it can't be happy with foo(42)), you
don't really have a lot of options for adjusting the behavior as
things stand today.  I accept that some people think that decorating
their code with lots of extra casts helps to avoid errors, and maybe
it does, but there is plenty of evidence that a lot of users don't
want to.  And we not only don't give them the behavior they want; we
don't even have a meaningful way to give the option of opting into
that behavior at initdb or create-database time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] The tarball's README has bad install instructions

2012-12-10 Thread Karl O. Pinc
Hi,

The top-level README in the source tarball says:

--snip---
See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Changes between all PostgreSQL releases are recorded in the
file HISTORY.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.
--snip---

There is no INSTALL file, and although there is documentation
it's scattered in a bunch of sgml files.

I was going send in a patch that referred the user to
the on-line postgres docs but it occurred to me that it
would be possible to include the documentation in plain text
form in the tarballs.  This would add at least 1.7M to
the size of the tarball, and complicate the building
of the tarball due to all the dependencies needed to build
the docs.

All the same, it's aesthetically pleasing to have build instructions
included in the tarball.

A minimal replacement of the above text might be:

--snip---
See the chapter titled Installation from Source Code found
in the PostgreSQL documentation available from
http://www.postgresql.org/docs/ for installation instructions,
supported platforms, and build requirements.  A record of the
changes made between PostgreSQL releases is recorded in
an appendix to the documentation.  Copyright and license 
information can be found in the file COPYRIGHT.
--snip---

Is anyone interested in a patch that includes plain text
documentation in the tarballs?  How about a patch to the
README per the text above?

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-12-10 Thread Jeff Davis
It looks like there are still some problems with this patch.

  CREATE TABLE foo(ir int4range);
  insert into foo select 'empty' from generate_series(1,1);
  insert into foo select int4range(NULL, g, '(]')
from generate_series(1,100) g;
  insert into foo select int4range(g, NULL, '[)')
from generate_series(1,100) g;
  insert into foo select int4range(g, ((g*1.01)+10)::int4, '[]')
from generate_series(1,100) g;
  CREATE TABLE bar(ir) AS select * from foo order by random();
  ANALYZE bar;

Now:
  EXPLAIN ANALYZE SELECT * FROM bar
WHERE ir @ int4range(1,2);

The estimates are -nan. Similar for many other queries.

And I have a few other questions/comments:

* Why is summ spelled with two ms? Is it short for summation? If
so, might be good to use summation of instead of integrate in the
comment.

* Why does get_length_hist_frac return 0.0 when i is the last value? Is
that a mistake?

* I am still confused by the distinction between rbound_bsearch and
rbound_bsearch_bin. What is the intuitive purpose of each?

* You use constant value in the comments in several places. Would
query value or search key be better?

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The tarball's README has bad install instructions

2012-12-10 Thread Heikki Linnakangas

On 10.12.2012 21:19, Karl O. Pinc wrote:

The top-level README in the source tarball says:

--snip---
See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Changes between all PostgreSQL releases are recorded in the
file HISTORY.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.
--snip---

There is no INSTALL file, and although there is documentation
it's scattered in a bunch of sgml files.


Which tarball did you look at? The INSTALL file is there in the 9.2.2 
tarball at least:


~$ tar tvjf postgresql-9.2.2.tar.bz2 | grep INSTALL
-rw-r--r-- pgsql/pgsql   76825 2012-12-03 22:34 postgresql-9.2.2/INSTALL

Note that the INSTALL file is not present in the git repository, it's 
generated and included in the tarball. See README.git.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The tarball's README has bad install instructions

2012-12-10 Thread Karl O. Pinc
On 12/10/2012 01:29:03 PM, Heikki Linnakangas wrote:
 On 10.12.2012 21:19, Karl O. Pinc wrote:
  The top-level README in the source tarball says:
 
  --snip---
  See the file INSTALL for instructions on how to build and install
  PostgreSQL.  That file also lists supported operating systems and
  hardware platforms and contains information regarding any other
  software packages that are required to build or run the PostgreSQL
  system.  Changes between all PostgreSQL releases are recorded in 
 the
  file HISTORY.  Copyright and license information can be found in 
 the
  file COPYRIGHT.  A comprehensive documentation set is included in
 this
  distribution; it can be read as described in the installation
  instructions.
  --snip---
 
  There is no INSTALL file, and although there is documentation
  it's scattered in a bunch of sgml files.
 
 Which tarball did you look at? 

I made a tarball from head, but did not look at it. :-(

 Note that the INSTALL file is not present in the git repository, it's 
 generated and included in the tarball. See README.git.

That's my problem, I had checked out from git.  
Thanks and sorry to bother you.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, distinguish sections with an empty row in error code table

2012-12-10 Thread Karl O. Pinc
On 11/08/2012 11:55:19 AM, Karl O. Pinc wrote:
 On 11/08/2012 11:10:39 AM, Robert Haas wrote:

  Ah, well, as to that, I think you'd have to take that suggestion to
  pgsql-www.  The style sheets used for the web site are - just to
 make
  things exciting - stored in a completely different source code
  repository to which I don't have access.  Some kind of CSS
  frobnication along the lines you suggest might be worth discussing,
  but I don't really work on that stuff.
 
 Without being able to pass additional style from the source
 docs through to the html it seems a bit spooky to do this.

Since the existing style sheets aren't maintained upstream
and don't pass the necessary style through to the generated
style sheets, and since even if it did the style sheets
of the official docs on postgresql.org would not reflect
any changes made here, it seems like this patch should be
rejected.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The tarball's README has bad install instructions

2012-12-10 Thread Andrew Dunstan


On 12/10/2012 02:42 PM, Karl O. Pinc wrote:


I made a tarball from head, but did not look at it. :-(


Note that the INSTALL file is not present in the git repository, it's
generated and included in the tarball. See README.git.

That's my problem, I had checked out from git.
Thanks and sorry to bother you.




To see what the tarball will contain, run make dist.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-12-10 Thread Heikki Linnakangas

(Offlist)

Just a quick note that I'm working on this patch now. I pushed some 
trivial fixes to my git repository at 
git://git.postgresql.org/git/users/heikki/postgres.git, xlogreader_v3 
branch.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-12-10 Thread Heikki Linnakangas

On 10.12.2012 22:22, Heikki Linnakangas wrote:

(Offlist)

Just a quick note that I'm working on this patch now. I pushed some
trivial fixes to my git repository at
git://git.postgresql.org/git/users/heikki/postgres.git, xlogreader_v3
branch.


Oops, wasn't offlist :-). Well, if anyone wants to take a look, feel free.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shuffling xlog header files

2012-12-10 Thread Simon Riggs
On 10 December 2012 17:56, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 Any objections?

No objections, though I'm concerned to make as few changes as possible. Thanks

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] enhanced error fields

2012-12-10 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Peter Geoghegan
 Sent: Monday, December 10, 2012 3:29 PM
 To: Pavel Stehule
 Cc: PostgreSQL Hackers; Alvaro Herrera; Tom Lane
 Subject: Re: [HACKERS] enhanced error fields
 
 
 Now, there are one or two places where these fields are not actually
 available even though they're formally required according to a literal
reading
 of the above. This is only because there is clearly no such field sensibly
 available, even in principle - to my mind this cannot be a problem,
because
 the application developer cannot have any reasonable expectation of a
field
 being set. I'm really talking about two cases in particular:
 
 * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide
 schema_name and table_name in the event of domains. This was previously
 identified as an issue. If it is judged better to not have any
requirements
 there at all, so be it.
 
 * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport
 call, we may not provide a constraint name iff a Constraint.connname is
 NULL. Since there isn't a constraint name to give even in principle, and
this is
 an isolated case, this seems reasonable.
 

Just skimming this topic but if these enhanced error fields are going to be
used by software, and we have 99% adherence to a standard, then my first
reaction is why not just supply Not Applicable (or Not Available as
appropriate) instead of suppressing the field altogether in these (and
possibly other, future) cases and make adherence for these fields 100%?

From an ease-of-use aspect for the API if I can simply always query each
of those fields and know I will be receiving a string it does at least seem
theoretically easier to interface with.  If I am expecting special string
values (enclosed in symbols making them invalid identifiers) I can then
handle those as desired without either receiving an error or a NULL when I
go to poll the missing field if those couple of instances.

I may be paranoid or mistaken regarding how this work but figured I'd at
least throw it out for consideration.

David J.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-12-10 Thread Pavel Stehule
Hello

2012/12/10 Robert Haas robertmh...@gmail.com:
 On Tue, Nov 27, 2012 at 4:36 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/25/12 6:36 PM, Robert Haas wrote:
 I think that is true.  But for whatever it's worth, and at the risk of
 beating a horse that seems not to be dead yet in spite of the fact
 that I feel I've already administered one hell of a beating, the LPAD
 case is unambiguous, and therefore it is hard to see what sort of
 programming mistake we are protecting users against.

 Upstream applications passing wrong data down to the database.

 The circumstantial evidence suggests that many users don't want to be
 protected against that in the way that we are currently protecting
 them - or at least not all of them do (see Merlin's email elsewhere on
 this thread).  What's frustrating about the status quo is that not
 only do you need lots of extra casts, but there's no real way to
 improve the situation.  If you add an implicit cast from integer to
 text, for example, then 4 || 'foo' breaks.  Now, you may think that
 adding an implicit cast from integer to text is a dumb idea, and maybe
 it is, but don't get too hung up on that example.  The point is that
 if you're unhappy with the quote_literal() case (because it accepts
 too much), or the lpad() case (because it doesn't accept enough), or
 the foo(smallint) case (because it can't be happy with foo(42)), you
 don't really have a lot of options for adjusting the behavior as
 things stand today.  I accept that some people think that decorating
 their code with lots of extra casts helps to avoid errors, and maybe
 it does, but there is plenty of evidence that a lot of users don't
 want to.  And we not only don't give them the behavior they want; we
 don't even have a meaningful way to give the option of opting into
 that behavior at initdb or create-database time.


it is looking so our design missing some feature, flag, that can
signalize safety of implicit cast - or allow more exactly to specify
casting rules for related functionality. For some functions we do this
magic inside parser and rewriter, but we don't allow this for custom
functions.

Few years ago I proposed a parser hooks, where this task can be
solved. The parser hook is probably too generic, but probably we don't
design good solution just by simple change of some parameter of
current design, and we should to enhance current design - maybe some
new parameter modifiers?

Regards

Pavel


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] enhanced error fields

2012-12-10 Thread Peter Geoghegan
On 10 December 2012 20:52, David Johnston pol...@yahoo.com wrote:
 Just skimming this topic but if these enhanced error fields are going to be
 used by software, and we have 99% adherence to a standard, then my first
 reaction is why not just supply Not Applicable (or Not Available as
 appropriate) instead of suppressing the field altogether in these (and
 possibly other, future) cases and make adherence for these fields 100%?

Well, this is an area that the standard doesn't have anything to say
about. The standard defines errcodes, but not what fields they make
available. I suppose you could say that the patch proposes that they
become a Postgres extension to the standard.

I probably could have found more places to set the fields. Certainly,
I've already set them in some places where they are not actually
required to be set by the new contract of errcodes.txt/errcodes.h. My
immediate concern is getting something minimal committed, though. I
think the latest revision handles all of the important cases (i.e.
cases where applications may want a well-principled way of detecting a
particular kind of error, like an error due to the violation of a
particular, known constraint).

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] pg_upgrade -o/-O regression in 9.2.2

2012-12-10 Thread Marti Raudsepp
Hi!

It seems that PostgreSQL 9.2.2 has a regression in pg_upgrade, the -o
and -O options forget to add a space before passing on user options,
thereby generating unparsable command lines.

For example:
pg_upgrade -b /usr/local/pg91/bin -B /usr/bin -d /tmp/91 -D /tmp/92 -O -F
[...]
Creating catalog dump   ok
*failure*
could not connect to new postmaster started with the command:
/usr/bin/pg_ctl -w -l pg_upgrade_server.log -D /tmp/92 -o -p
50432 -b -c synchronous_commit=off-F -c listen_addresses='' -c
unix_socket_permissions=0700 -c unix_socket_directory='/tmp' start

Notice the bad argument synchronous_commit=off-F

It's easy enough to work around by adding a space to the command line,
passing -O ' -F' instead of -O '-F'

Here's the bad commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ed5699dd1b883e193930448b7ad532e233de0bd7;hp=5ed6546cf75623ba426942a3b71659a66cf7ed68

The attached patch re-introduces the space at the necessary place.

Regards,
Marti


pg_upgrade_user_options-9.2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Peter Eisentraut
On 12/8/12 9:40 AM, Tom Lane wrote:
 I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
 preserve the index name exactly.  Something like adding or removing
 trailing underscores would probably serve to generate a nonconflicting
 name that's not too unsightly.

If you think you can rename an index without an exclusive lock, then why
not rename it back to the original name when you're done?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [SPAM?]: Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Simon Riggs
On 10 December 2012 22:18, Peter Eisentraut pete...@gmx.net wrote:
 On 12/8/12 9:40 AM, Tom Lane wrote:
 I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
 preserve the index name exactly.  Something like adding or removing
 trailing underscores would probably serve to generate a nonconflicting
 name that's not too unsightly.

 If you think you can rename an index without an exclusive lock, then why
 not rename it back to the original name when you're done?

Because the index isn't being renamed. An alternate equivalent index
is being created instead.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [SPAM?]: Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Peter Eisentraut
On 12/10/12 5:21 PM, Simon Riggs wrote:
 On 10 December 2012 22:18, Peter Eisentraut pete...@gmx.net wrote:
 On 12/8/12 9:40 AM, Tom Lane wrote:
 I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
 preserve the index name exactly.  Something like adding or removing
 trailing underscores would probably serve to generate a nonconflicting
 name that's not too unsightly.

 If you think you can rename an index without an exclusive lock, then why
 not rename it back to the original name when you're done?
 
 Because the index isn't being renamed. An alternate equivalent index
 is being created instead.

Right, basically, you can do this right now using

CREATE INDEX CONCURRENTLY ${name}_tmp ...
DROP INDEX CONCURRENTLY ${name};
ALTER INDEX ${name}_tmp RENAME TO ${name};

The only tricks here are if ${name}_tmp is already taken, in which case
you might as well just error out (or try a few different names), and if
${name} is already in use by the time you get to the last line, in which
case you can log a warning or an error.

What am I missing?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [SPAM?]: Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Simon Riggs
On 10 December 2012 22:27, Peter Eisentraut pete...@gmx.net wrote:
 On 12/10/12 5:21 PM, Simon Riggs wrote:
 On 10 December 2012 22:18, Peter Eisentraut pete...@gmx.net wrote:
 On 12/8/12 9:40 AM, Tom Lane wrote:
 I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
 preserve the index name exactly.  Something like adding or removing
 trailing underscores would probably serve to generate a nonconflicting
 name that's not too unsightly.

 If you think you can rename an index without an exclusive lock, then why
 not rename it back to the original name when you're done?

 Because the index isn't being renamed. An alternate equivalent index
 is being created instead.

 Right, basically, you can do this right now using

 CREATE INDEX CONCURRENTLY ${name}_tmp ...
 DROP INDEX CONCURRENTLY ${name};
 ALTER INDEX ${name}_tmp RENAME TO ${name};

 The only tricks here are if ${name}_tmp is already taken, in which case
 you might as well just error out (or try a few different names), and if
 ${name} is already in use by the time you get to the last line, in which
 case you can log a warning or an error.

 What am I missing?

That this is already recorded in my book ;-)

And also that REINDEX CONCURRENTLY doesn't work like that, yet.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [SPAM?]: Re: Support for REINDEX CONCURRENTLY

2012-12-10 Thread Andres Freund
On 2012-12-10 17:27:45 -0500, Peter Eisentraut wrote:
 On 12/10/12 5:21 PM, Simon Riggs wrote:
  On 10 December 2012 22:18, Peter Eisentraut pete...@gmx.net wrote:
  On 12/8/12 9:40 AM, Tom Lane wrote:
  I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
  preserve the index name exactly.  Something like adding or removing
  trailing underscores would probably serve to generate a nonconflicting
  name that's not too unsightly.
 
  If you think you can rename an index without an exclusive lock, then why
  not rename it back to the original name when you're done?
 
  Because the index isn't being renamed. An alternate equivalent index
  is being created instead.

 Right, basically, you can do this right now using

 CREATE INDEX CONCURRENTLY ${name}_tmp ...
 DROP INDEX CONCURRENTLY ${name};
 ALTER INDEX ${name}_tmp RENAME TO ${name};

 The only tricks here are if ${name}_tmp is already taken, in which case
 you might as well just error out (or try a few different names), and if
 ${name} is already in use by the time you get to the last line, in which
 case you can log a warning or an error.

 What am I missing?

I don't think this is the problematic side of the patch.

The question is rather how to transfer the dependencies without too much
ugliness or how to swap oids without a race. Either by accepting an
exlusive lock or by playing some games, the latter possibly being easier
with renaming...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [SPAM?]: Re: Support for REINDEX CONCURRENTLY

2012-12-10 Thread Andres Freund
On 2012-12-10 22:33:50 +, Simon Riggs wrote:
 On 10 December 2012 22:27, Peter Eisentraut pete...@gmx.net wrote:
  On 12/10/12 5:21 PM, Simon Riggs wrote:
  On 10 December 2012 22:18, Peter Eisentraut pete...@gmx.net wrote:
  On 12/8/12 9:40 AM, Tom Lane wrote:
  I'm tempted to propose that REINDEX CONCURRENTLY simply not try to
  preserve the index name exactly.  Something like adding or removing
  trailing underscores would probably serve to generate a nonconflicting
  name that's not too unsightly.
 
  If you think you can rename an index without an exclusive lock, then why
  not rename it back to the original name when you're done?
 
  Because the index isn't being renamed. An alternate equivalent index
  is being created instead.
 
  Right, basically, you can do this right now using
 
  CREATE INDEX CONCURRENTLY ${name}_tmp ...
  DROP INDEX CONCURRENTLY ${name};
  ALTER INDEX ${name}_tmp RENAME TO ${name};
 
  The only tricks here are if ${name}_tmp is already taken, in which case
  you might as well just error out (or try a few different names), and if
  ${name} is already in use by the time you get to the last line, in which
  case you can log a warning or an error.
 
  What am I missing?

 That this is already recorded in my book ;-)

 And also that REINDEX CONCURRENTLY doesn't work like that, yet.

The last submitted patch works pretty similar:

CREATE INDEX CONCURRENTLY $name_cct;
ALTER INDEX $name RENAME TO cct_$name;
ALTER INDEX $name_tmp RENAME TO $tmp;
ALTER INDEX $name_tmp RENAME TO $name_cct;
DROP INDEX CONURRENCTLY $name_cct;

It does that under an exlusive locks, but doesn't handle dependencies
yet...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sketch of a Hook into the Logging Collector

2012-12-10 Thread Daniel Farina
On Sat, Dec 8, 2012 at 10:40 AM, Daniel Farina dan...@heroku.com wrote:
 Hello all,

 I am approaching this from the angle of increasing power by exposing
 the log collector (syslogger) pipe protocol.

I just spotted a better, already-committed patch.  Thanks to Hannu for
pointing it out:

https://commitfest.postgresql.org/action/patch_view?id=717

I'll retract this patch, unless someone finds it interesting for some reason.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-12-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 FWICS, this kind of problem is endemic in OpenSSL, which
 also doesn't seem to believe in comprehensive documentation or code
 comments.  It would be nice if we had an API to some other, less
 crappy encryption library; or maybe even some generic API that lets
 you easily wire it into any library you happen to wish to use.

Awhile back Red Hat was trying to get people to switch to NSS or GnuTLS,
which apparently are better designed.

 Not that I'm volunteering to write the patch... :-(

Me either ... and in fact the lack of interest among upstreams in
rewriting their TLS code is what made the aforesaid effort crash and
burn.  But FWIW, there are better alternatives out there.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-12-10 Thread Jaime Casanova
On Fri, Nov 23, 2012 at 4:56 AM, Amit Kapila amit.kap...@huawei.com wrote:
 On Thursday, November 22, 2012 10:09 PM Fujii Masao wrote:

 Is it helpful to output the notice message like 'Run pg_reload_conf() or
 restart the server if you want new settings to take effect' always after
 SET PERSISTENT command?

 Not sure because if someone uses SET PERSISTENT command, he should know the
 effect or behavior of same.
 I think it's good to put this in UM along with PERSISTENT option
 explanation.


can we at least send the source file in the error message when a
parameter has a wrong value?

suppose you do: SET PERSISTENT shared_preload_libraries TO
'some_nonexisting_lib';
when you restart postgres and that lib doesn't exist you can get
confused when looking at postgresql.conf and find an empty string
there

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Noah Misch
On Mon, Dec 10, 2012 at 08:32:53AM -0500, Stephen Frost wrote:
 * Noah Misch (n...@leadboat.com) wrote:
  On Fri, Dec 07, 2012 at 06:51:18PM -0500, Stephen Frost wrote:
   Now, what I've honestly been hoping for on this thread was for someone
   to speak up and point out why I'm wrong about this concern and explain
   how this patch addresses that issue.  If that's been done, I've missed
   it..
 
 [...]
 
 So, apparently I'm not wrong about my concern, but no one seems to share
 my feelings on this change.
 
 I continue to hold that this could end up being a slippery slope for us
 to go down wrt 'correctness' vs. 'do whatever the user wants'.

I agree we should be reticent to compromise correctness for convenience.
Compromising mere bug-compatibility, trading one incorrect behavior for
another incorrect behavior, is not as bad.  Furthermore, today's behavior in
question is not something I can see applications deliberately and successfully
relying upon.

 If we
 keep this to only COPY and where the table has to be truncated/created
 in the same transaction (which requires the user to have sufficient
 privileges to do non-MVCC-safe things on the table already), perhaps
 it's alright.

Extending it to cases not involving a just-created or just-truncated table
really would compromise correctness; errors could leave the table in an
otherwise-impossible state.  Let's indeed not go there.

I see no particular need to restrict this to COPY; that's just the most
important case by far.  As a side note, the calculus for whether to extend the
optimization to INSERT and UPDATE differs from that for WAL avoidance.  WAL
avoidance can be a substantial loss when the total change is small.  For
pre-hinting/freezing, the worst case is having needlessly checked a few local
variables to rule out the optimization.

 It'll definitely reduce the interest in finding a real
 solution though, which is unfortunate.

That effect seems likely, but I do not find it unfortunate.  The change
variant I have advocated does not stand in contrast to some real solution to
PostgreSQL's treatment for readers of tables created or truncated by a
transaction not in the reader's snapshot.  The two topics interact at arm's
length.  Bundling them into one patch, artificially making them to stand or
fall as a pair, is not a win for PostgreSQL.

That does raise another disadvantage of making the change syntax-controlled:
if we someday implement the other improvement, COPY FREEZE will have minimal
reason not to be the default.  FREEZE then becomes a relic noise word.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-10 Thread Michael Paquier
On Mon, Dec 10, 2012 at 11:51 PM, Andres Freund and...@2ndquadrant.comwrote:

 Btw, as an example of the problems caused by renaming:

 postgres=# CREATE TABLE a (id serial primary key); CREATE TABLE b(id
 serial primary key, a_id int REFERENCES a);
 CREATE TABLE
 Time: 137.840 ms
 CREATE TABLE
 Time: 143.500 ms
 postgres=# \d b
  Table public.b
  Column |  Type   |   Modifiers
 +-+
  id | integer | not null default nextval('b_id_seq'::regclass)
  a_id   | integer |
 Indexes:
 b_pkey PRIMARY KEY, btree (id)
 Foreign-key constraints:
 b_a_id_fkey FOREIGN KEY (a_id) REFERENCES a(id)

 postgres=# REINDEX TABLE a CONCURRENTLY;
 NOTICE:  drop cascades to constraint b_a_id_fkey on table b
 REINDEX
 Time: 248.992 ms
 postgres=# \d b
  Table public.b
  Column |  Type   |   Modifiers
 +-+
  id | integer | not null default nextval('b_id_seq'::regclass)
  a_id   | integer |
 Indexes:
 b_pkey PRIMARY KEY, btree (id)

Oops. I will fix that in the next version of the patch. There should be an
elegant way to change the dependencies at the swap phase.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Robert Haas
On Sun, Dec 9, 2012 at 3:06 PM, Noah Misch n...@leadboat.com wrote:
 I favor[1] unconditionally letting older snapshots see the new rows after the
 CREATE+COPY transaction commits.  To recap, making affected scans see an empty
 table is as wrong as making them see those rows.  Robert also listed[2] that
 as a credible option, and I don't recall anyone opining against it in previous
 discussions.  I did perceive an undercurrent preference, all other things
 being equal, for an optimization free from semantic side-effects.  I shared
 that preference, but investigations showed that we must compromise something.

You know, I hadn't been taking that option terribly seriously, but
maybe we ought to reconsider it.  It would certainly be simpler, and
as you point out, it's not really any worse from an MVCC point of view
than anything else we do.  Moreover, it would make this available to
clients like pg_dump without further hackery.

I think the current behavior, where we treat FREEZE as a hint, is just
awful.  Regardless of whether the behavior is automatic or manually
requested, the idea that you might get the optimization or not
depending on the timing of relcache flushes seems very much
undesirable.  I mean, if the optimization is actually important for
performance, then you want to get it when you ask for it.  If it
isn't, then why bother having it at all?  Let's say that COPY FREEZE
normally doubles performance on a data load that therefore takes 8
hours - somebody who suddenly loses that benefit because of a relcache
flush that they can't prevent or control and ends up with a 16 hour
data load is going to pop a gasket.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] OAT_POST_ALTER object access hooks

2012-12-10 Thread Robert Haas
On Mon, Dec 3, 2012 at 9:59 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 As we discussed before, it is hard to determine which attributes shall
 be informed to extension via object_access_hook, so the proposed
 post-alter hook (that allows to compare older and newer versions)
 works fine on 99% cases.
 However, I'm inclined to handle SET TABLESPACE as an exception
 of this scenario. For example, an idea is to define OAT_PREP_ALTER
 event additionally, but only invoked very limited cases that takes
 unignorable side-effects until system catalog updates.
 For consistency of hook, OAT_POST_ALTER event may also ought
 to be invoked just after catalog updates of pg_class-reltablespace,
 but is_internal=true.

 How about your opinion?

I don't really like it - I think POST should mean POST.  You can
define some other hook that gets called somewhere else, but after
means after.  There are other plausible uses of these hooks besides
sepgsql; for example, logging the completion time of an action.
Putting the hooks in the wrong places because that happens to be
more convenient for sepgsql is going to render them useless for any
other purpose.  Maybe nobody else will use them anyway, but I'd rather
not render it impossible before we get off the ground.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Stephen Frost
Noah,

* Noah Misch (n...@leadboat.com) wrote:
 I agree we should be reticent to compromise correctness for convenience.
 Compromising mere bug-compatibility, trading one incorrect behavior for
 another incorrect behavior, is not as bad.  Furthermore, today's behavior in
 question is not something I can see applications deliberately and successfully
 relying upon.

I actually don't agree with the notion that one bad bug should allow us
to introduce additional such bugs.  I agree that it's unlikely that
applications are depending on today's behavior of TRUNCATE making
concurrent transactions see an empty table, but it does *not* follow
that applications *won't* start depending on this new behavior of COPY
FREEZE.

 Extending it to cases not involving a just-created or just-truncated table
 really would compromise correctness; errors could leave the table in an
 otherwise-impossible state.  Let's indeed not go there.

Even if we could fix that, I'd be against allowing it arbitrairly for
any regular user INSERT or UPDATE; I'm still not particularly happy with
this approach for COPY.

  It'll definitely reduce the interest in finding a real
  solution though, which is unfortunate.
 
 That effect seems likely, but I do not find it unfortunate.  The change
 variant I have advocated does not stand in contrast to some real solution to
 PostgreSQL's treatment for readers of tables created or truncated by a
 transaction not in the reader's snapshot.  The two topics interact at arm's
 length.  Bundling them into one patch, artificially making them to stand or
 fall as a pair, is not a win for PostgreSQL.

Having proper MVCC support for DDL *would* be a win for PostgreSQL and
this *does* reduce the chances of that ever happening.

 That does raise another disadvantage of making the change syntax-controlled:
 if we someday implement the other improvement, COPY FREEZE will have minimal
 reason not to be the default.  FREEZE then becomes a relic noise word.

Indeed, that's certainly unfortunate as well.  Really, though, it just
goes to show how much of a hack this is rather than a real solution.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 You know, I hadn't been taking that option terribly seriously, but
 maybe we ought to reconsider it.  It would certainly be simpler, and
 as you point out, it's not really any worse from an MVCC point of view
 than anything else we do.  Moreover, it would make this available to
 clients like pg_dump without further hackery.

I really don't agree with this notion that the behavior of TRUNCATE, a
top-level, seperately permissioned command, makes it OK to introduce
other busted behavior in existing commands.

 I think the current behavior, where we treat FREEZE as a hint, is just
 awful.  

I agree that it's pretty grotty, but I had assumed it was at least
deterministic, ala TRUNCATE/COPY and WAL...  If it isn't, then this
certainly gets really ugly really quickly.

I don't think that means we should go ahead and try to always optimize
it though- even when it isn't explicit, there will be an expectation
that it's going to work when all the 'right' conditions are met.  I know
that's certainly how I feel about TRUNCATE/COPY and WAL'ing.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Doc patch, further describe and-mask nature of the permission system

2012-12-10 Thread Karl O. Pinc
On 11/14/2012 02:35:54 PM, Karl O. Pinc wrote:
 On 11/13/2012 08:50:55 PM, Peter Eisentraut wrote:
  On Sat, 2012-09-29 at 01:16 -0500, Karl O. Pinc wrote:
   This patch makes some sweeping statements.
  
  Unfortunately, they are wrong.
 
 I will see if anything can be salvaged.

Here's another try.
(I bundled changes to both paragraphs into a single
patch.)

grants-of-roles-are-additive_v3.patch

Regards,


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/ref/grant.sgml b/doc/src/sgml/ref/grant.sgml
index fb81af4..b57000c 100644
--- a/doc/src/sgml/ref/grant.sgml
+++ b/doc/src/sgml/ref/grant.sgml
@@ -429,11 +429,32 @@ GRANT replaceable class=PARAMETERrole_name/replaceable [, ...] TO replace
/para
 
para
-A user may perform commandSELECT/, commandINSERT/, etc. on a
-column if he holds that privilege for either the specific column or
-its whole table.  Granting the privilege at the table level and then
+Permission granted to a table grants permission to all the columns
+of a table, regardless of permissions granted to the table's
+columns.  Granting a privilege at the table level and then
 revoking it for one column will not do what you might wish: the
-table-level grant is unaffected by a column-level operation.
+table-level grant is unaffected by a column-level operation.  But
+revoking permission at the table level and granting it at the
+column level does grant permission to the column.
+   /para
+
+   para
+Roles can be fashioned into a permission hierarchy.  Roles having
+the literalINHERIT/literal attribute (the default) that are
+assigned to other roles in a hierarchical fashion produce a
+permission system which behaves in the fashion of the
+databasetable/literal./databasecolumn/ hierarchy.
+E.g. a user's login role can be assigned a role of
+literalaccountant/ which is in turn assigned a role of
+literalemployee/.  The user would have all the permissions of
+an literalaccountant/ regardless of whether these permissions
+are revoked from the literalemployee/literal role.  And, by
+virtue of the literalemployee//literalaccountant/ role
+hierarchy, literalaccountant/s also have all permissions
+granted to literalemployee/s.  Unlike the fixed
+databasetable/literal./databasecolumn/ hierarchy the
+productnamePostgreSQL/ user is free to fashion roles into
+arbitrary hierarchical structures.
/para
 
para


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Noah Misch
On Mon, Dec 10, 2012 at 08:04:55PM -0500, Robert Haas wrote:
 I think the current behavior, where we treat FREEZE as a hint, is just
 awful.  Regardless of whether the behavior is automatic or manually
 requested, the idea that you might get the optimization or not
 depending on the timing of relcache flushes seems very much
 undesirable.  I mean, if the optimization is actually important for
 performance, then you want to get it when you ask for it.  If it
 isn't, then why bother having it at all?  Let's say that COPY FREEZE
 normally doubles performance on a data load that therefore takes 8
 hours - somebody who suddenly loses that benefit because of a relcache
 flush that they can't prevent or control and ends up with a 16 hour
 data load is going to pop a gasket.

Until these threads, I did not know that a relcache invalidation could trip up
the WAL avoidance optimization, and now this.  I poked at the relevant
relcache.c code, and it already takes pains to preserve the needed facts.  The
header comment of RelationCacheInvalidate() indicates that entries bearing an
rd_newRelfilenodeSubid can safely survive the invalidation, but the code does
not implement that.  I think the comment is right, and this is just an
oversight in the code going back to its beginning (fba8113c).

I doubt the comment at the declaration of rd_createSubid in rel.h, though I
can't presently say what correct comment should replace it.  CLUSTER does
preserve the old value, at least for the main table relation.  CLUSTER
probably should *set* rd_newRelfilenodeSubid.

Thanks,
nm
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***
*** 2149,2156  RelationCacheInvalidate(void)
/* Must close all smgr references to avoid leaving dangling 
ptrs */
RelationCloseSmgr(relation);
  
!   /* Ignore new relations, since they are never cross-backend 
targets */
!   if (relation-rd_createSubid != InvalidSubTransactionId)
continue;
  
relcacheInvalsReceived++;
--- 2149,2162 
/* Must close all smgr references to avoid leaving dangling 
ptrs */
RelationCloseSmgr(relation);
  
!   /*
!* Ignore new relations; no other backend will manipulate them 
before
!* we commit.  Likewise, before replacing a relation's 
relfilenode, we
!* shall have acquired AccessExclusiveLock and drained any 
applicable
!* pending invalidations.
!*/
!   if (relation-rd_createSubid != InvalidSubTransactionId ||
!   relation-rd_newRelfilenodeSubid != 
InvalidSubTransactionId)
continue;
  
relcacheInvalsReceived++;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multiple --table options for other commands

2012-12-10 Thread Karl O. Pinc
On 10/30/2012 10:14:19 PM, Josh Kupershmidt wrote:

 I went ahead and cooked up a patch to allow pg_restore, clusterdb,
 vacuumdb, and reindexdb to accept multiple --table switches. Hope I
 didn't overlook a similar tool, but these were the only other 
 commands
 I found taking a --table argument.

I believe you need ellipses behind --table in the syntax summaries
of the command reference docs.

(This was all I noticed while reading the patch.)

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Noah Misch
On Mon, Dec 10, 2012 at 08:54:04PM -0500, Stephen Frost wrote:
 I agree that it's unlikely that
 applications are depending on today's behavior of TRUNCATE making
 concurrent transactions see an empty table, but it does *not* follow
 that applications *won't* start depending on this new behavior of COPY
 FREEZE.

That is a good point.  I'm not sure whether that should worry us enough to
implement an error in the scenario before letting COPY write frozen tuples.
But it's the strongest argument I've seen for imposing that prerequisite.

 Even if we could fix that, I'd be against allowing it arbitrairly for
 any regular user INSERT or UPDATE; I'm still not particularly happy with
 this approach for COPY.

Sure; COPY is the 99% important case here.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_upgrade -o/-O regression in 9.2.2

2012-12-10 Thread Bruce Momjian
On Tue, Dec 11, 2012 at 12:17:11AM +0200, Marti Raudsepp wrote:
 Hi!
 
 It seems that PostgreSQL 9.2.2 has a regression in pg_upgrade, the -o
 and -O options forget to add a space before passing on user options,
 thereby generating unparsable command lines.
 
 For example:
 pg_upgrade -b /usr/local/pg91/bin -B /usr/bin -d /tmp/91 -D /tmp/92 -O -F
 [...]
 Creating catalog dump   ok
 *failure*
 could not connect to new postmaster started with the command:
 /usr/bin/pg_ctl -w -l pg_upgrade_server.log -D /tmp/92 -o -p
 50432 -b -c synchronous_commit=off-F -c listen_addresses='' -c
 unix_socket_permissions=0700 -c unix_socket_directory='/tmp' start
 
 Notice the bad argument synchronous_commit=off-F
 
 It's easy enough to work around by adding a space to the command line,
 passing -O ' -F' instead of -O '-F'
 
 Here's the bad commit:
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ed5699dd1b883e193930448b7ad532e233de0bd7;hp=5ed6546cf75623ba426942a3b71659a66cf7ed68
 
 The attached patch re-introduces the space at the necessary place.

I was super-paranoid about making any changes in that area, but it seems
I wasn't paranoid enough.

Patch applied to head and 9.2.  Thanks for the workaround idea too.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Pavan Deolasee
On Mon, Dec 10, 2012 at 7:02 PM, Stephen Frost sfr...@snowman.net wrote:


 I continue to hold that this could end up being a slippery slope for us
 to go down wrt 'correctness' vs. 'do whatever the user wants'.  If we
 keep this to only COPY and where the table has to be truncated/created
 in the same transaction (which requires the user to have sufficient
 privileges to do non-MVCC-safe things on the table already), perhaps
 it's alright.  It'll definitely reduce the interest in finding a real
 solution though, which is unfortunate.


I wonder if something more complete can be done by forcing COPY FREEZE
or whatever we call it to take an exclusive lock on the table and run
loading as an append-only operation. By taking a strong lock, we will
block out any concurrent read/writes to the table. If an error occurs
while loading the data, the table will be truncated at the previously
recorded size. We may need some additional book keeping and WAL
logging to handle crash recovery.

To solve the visibility issue for old snapshots that should not see
the new data, we can store some additional visibility information in
the pg_class itself. For example, we can store the size of the table
before the COPY FREEZE started and the XID of the COPY FREEZE
operation. An old snapshot that can not see the XID, can not see the
tuples inserted in the new blocks either. Once the COPY FREEZE
finishes and the lock on the relation is released, new transactions
can start writing to the table and write past the old size of the
table. But that should be OK. If an old snapshot can't see the  tuples
inserted by COPY FREEZE, AFAIK it can't see any of those other tuples
as well.

I'm sure there will still be challenges with this approach. But I
wonder if we can guarantee correctness by proper use of
synchronization and still avoid multiple writes for most common data
loading scenarios.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] allowing multiple PQclear() calls

2012-12-10 Thread Josh Kupershmidt
The documentation for PQclear() doesn't say whether it is safe to call
PQclear() more than once on the same PGresult pointer. In fact, it is
not safe, but apparently only because of this last step:
/* Free the PGresult structure itself */
free(res);

The other members of PGresult which may be freed by PQclear are set to
NULL or otherwise handled so as not to not be affected by a subsequent
PQclear().

I find that accounting for whether I've already PQclear'ed a given
PGresult can be quite tedious in some cases. For example, in the
cleanup code at the end of a function where control may goto in case
of a problem, it would be much simpler to unconditionally call
PQclear() without worrying about whether this was already done. One
can see an admittedly small illustration of this headache in
pqSetenvPoll() in our own codebase, where several times PQclear(res);
is called immediately before a goto error_return;

Would it be crazy to add an already_freed flag to the pg_result
struct which PQclear() would set, or some equivalent safety mechanism,
to avoid this hassle for users?

Josh


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-10 Thread Jeff Davis
On Mon, 2012-12-10 at 08:16 -0500, Stephen Frost wrote: 
 I'm trying to figure out why there are all the constraints around this
 command to begin with.  If we're going to support this, why do we
 require the user to create or truncate the table in the same
 transaction?  Clearly that's going to reduce the usefulness of this
 feature and it's not clear to me why that constraint is needed,
 code-wise.

There is a very specific reason:

If you insert frozen tuples into a table that already has tuples in it,
then you no longer have just an isolation issue, you have an *atomicity*
issue (and probably a number of other serious issues) because you can't
roll back. Doing it in the same transaction as the table was created
leaves you with a way to roll back: just delete the table (which will
happen implicitly because the creation is part of the same transaction
anyway).

Perhaps we can take some partial steps toward MVCC-safe access? For
instance, how about trying to recheck the xmin of a pg_class tuple when
starting a scan?

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-12-10 Thread Jeff Davis
On Mon, 2012-12-10 at 14:07 -0500, Robert Haas wrote: 
 And we not only don't give them the behavior they want; we
 don't even have a meaningful way to give the option of opting into
 that behavior at initdb or create-database time.

I strongly object to offering options that change the language in such a
substantial way. initdb-time options still mean that we are essentially
dividing our language, and therefore the applications that support
postgres, in half (or worse). One of the things I really like about
postgres is that we haven't forked the language with a million options
like mysql has. I don't even like the fact that we have a GUC to control
the output format of a BYTEA.

For every developer who says wow, that mysql query just worked without
modification there is another one who says oh, I forgot to test with
option XYZ... postgres is too complex to support, I'm going to drop it
from the list of supported databases.

I still don't see a compelling reason why opting out of overloading on a
per-function basis won't work. Your objections seemed fairly minor in
comparison to how strongly you are advocating this use case.

In particular, I didn't get a response to:

http://archives.postgresql.org/message-id/1354055056.1766.50.camel@sussancws0025

For what it's worth, I'm glad that people like you are pushing on these
usability issues, because it can be hard for insiders to see them
sometimes.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG?] lag of minRecoveryPont in archive recovery

2012-12-10 Thread Kyotaro HORIGUCHI
Hello, I've also found this does not fix this problem.

  So I'd say we should update minRecoveryPoint first, then
  truncate/delete. But we should still keep the XLogFlush() at the end of
  xact_redo_commit_internal(), for the case where files/directories are
  created. Patch attached.
 
 Sounds reasonable.

It makes perfectly sense.

  Committed and backpatched that. Attached is a script I used to reproduce
  this problem, going back to 8.4.
 
 Thanks!
 
 Unfortunately I could reproduce the problem even after that commit.
 Attached is the script I used to reproduce the problem.

Me too.

 The cause is that CheckRecoveryConsistency() is called before rm_redo(),
 as Horiguchi-san suggested upthead. Imagine the case where
 minRecoveryPoint is set to the location of the XLOG_SMGR_TRUNCATE
 record. When restarting the server with that minRecoveryPoint,
 the followings would happen, and then PANIC occurs.
 
 1. XLOG_SMGR_TRUNCATE record is read.
 2. CheckRecoveryConsistency() is called, and database is marked as
 consistent since we've reached minRecoveryPoint (i.e., the location
 of XLOG_SMGR_TRUNCATE).
 3. XLOG_SMGR_TRUNCATE record is replayed, and invalid page is
 found.
 4. Since the database has already been marked as consistent, an invalid
 page leads to PANIC.

Exactly.

In smgr_redo, EndRecPtr which is pointing the record next to
SMGR_TRUNCATE, is used as the new minRecoveryPoint. On the other
hand, during the second startup of the standby,
CheckRecoveryConsistency checks for consistency by
XLByteLE(minRecoveryPoint, EndRecPtr) which should be true at
just BEFORE the SMGR_TRUNCATE record is applied. So
reachedConsistency becomes true just before the SMGR_TRUNCATE
record will be applied. Bang!

I said I had no objection to placing CheckRecoveryConsistency
both before and after of rm_redo in previous message, but it was
wrong. Given aminRecoveryPoint value, it should be placed after
rm_redo from the point of view of when the database should be
considered to be consistent.

Actually, simply moving CheckRecoeverConsistency after rm_redo
turned into succeessfully startup, ignoring the another reason
for it should be before, which is unknown to me.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-12-10 Thread Darren Duncan

I agree with Jeff.

Options that change the language at initdb or create-database time just fragment 
the language.


It is best to just have 1 language where options are providable either 
dynamically per connection or otherwise lexically, so that then they are really 
just shorthands for the current local usage, and the language as a whole is the 
same.


That also means you can have example code out there and know it will work on any 
Postgres install, invariant of static global options.  If language modifiers are 
local or lexical, then any example code presumably would include the switches to 
turn them on for that example.


That all being said, I also think it is best to explicitly overload operators 
with extra parameter types, such as defining another operator with the signature 
of (Nunber,String) with the same base name as string catenation, rather than 
making numbers implicitly stringify.  But I can also accept implicit 
stringification / language behavior changes if it is a lexical/temporary effect 
that the same user is still explicitly turning on.


-- Darren Duncan

Jeff Davis wrote:
On Mon, 2012-12-10 at 14:07 -0500, Robert Haas wrote: 

And we not only don't give them the behavior they want; we
don't even have a meaningful way to give the option of opting into
that behavior at initdb or create-database time.


I strongly object to offering options that change the language in such a
substantial way. initdb-time options still mean that we are essentially
dividing our language, and therefore the applications that support
postgres, in half (or worse). One of the things I really like about
postgres is that we haven't forked the language with a million options
like mysql has. I don't even like the fact that we have a GUC to control
the output format of a BYTEA.

For every developer who says wow, that mysql query just worked without
modification there is another one who says oh, I forgot to test with
option XYZ... postgres is too complex to support, I'm going to drop it
from the list of supported databases.

I still don't see a compelling reason why opting out of overloading on a
per-function basis won't work. Your objections seemed fairly minor in
comparison to how strongly you are advocating this use case.

In particular, I didn't get a response to:

http://archives.postgresql.org/message-id/1354055056.1766.50.camel@sussancws0025

For what it's worth, I'm glad that people like you are pushing on these
usability issues, because it can be hard for insiders to see them
sometimes.

Regards,
Jeff Davis



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] allowing multiple PQclear() calls

2012-12-10 Thread Tom Lane
Josh Kupershmidt schmi...@gmail.com writes:
 Would it be crazy to add an already_freed flag to the pg_result
 struct which PQclear() would set, or some equivalent safety mechanism,
 to avoid this hassle for users?

Yes, it would.  Once the memory has been freed, malloc() is at liberty
to give it out for some other purpose.  The only way we could make that
work reliably is to permanently leak the memory occupied by the PGresult
... which I trust you will agree is a cure worse than the disease.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers