Re: [HACKERS] patch: shared session variables

2012-09-24 Thread Heikki Linnakangas
Having read through this thread, the consensus seems to be that we don't 
want this patch as it is (and I agree with that).


As I understand it, you are trying to solve two problems:

1. Passing parameters to a DO statement. You could quote the parameters 
and inline them in the block itself in the client, but quoting can be 
difficult to do correctly and is not as readable anyway, so I agree it 
would be good to be able to pass them separately (see the sketch after 
this list).


2. Returning results from a DO statement. At the moment, a DO statement 
is a utility statement, so it cannot return a result set.
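
(For illustration only: a minimal libpq sketch of the client-side quoting
workaround from point 1 -- the value has to be escaped and spliced into the
block's source text, because DO accepts no $n parameters. The helper name
is hypothetical.)

    #include <stdio.h>
    #include <string.h>
    #include <libpq-fe.h>

    /* Hypothetical helper: run a DO block with one "parameter" by inlining
     * it into the block source.  PQescapeLiteral() handles quote doubling,
     * but a value containing the dollar-quote delimiter ($$) would still
     * break the outer quoting -- exactly why this workaround is fragile. */
    static PGresult *
    do_with_param(PGconn *conn, const char *param)
    {
        char   *lit = PQescapeLiteral(conn, param, strlen(param));
        char    sql[1024];

        if (lit == NULL)
            return NULL;
        snprintf(sql, sizeof(sql),
                 "DO $$ BEGIN RAISE NOTICE 'param is: %%', %s; END $$;", lit);
        PQfreemem(lit);
        return PQexec(conn, sql);   /* DO cannot use PQexecParams($1, ...) */
    }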


Two different approaches to these problems have been discussed that have 
some support:


A) WITH FUNCTION syntax, to allow defining a temporary function for a 
single query, similar to the current WITH syntax. 
(http://archives.postgresql.org/pgsql-hackers/2012-07/msg00426.php)


B) DO ... USING parameter list syntax. This is a straightforward 
extension of the current DO syntax, just adding a parameter list to it. 
Not sure how to return a result set from this, perhaps also support a 
RETURNS keyword, similar to CREATE FUNCTION.


I'm ok with either of those approaches. A) would be more flexible, while 
B) would be a straightforward extension of what we already have.


I'm marking this patch as rejected in the commitfest app. Please pursue 
the WITH FUNCTION or DO ... USING syntax instead.


Thanks!

On 31.08.2012 21:27, Pavel Stehule wrote:

2012/8/31 Dimitri Fontaine <dimi...@2ndquadrant.fr>:

Pavel Stehule <pavel.steh...@gmail.com> writes:

Pavel, you didn't say what you think about the WITH FUNCTION proposal?


I don't like it - this proposal is too lispish - it is not SQL


We're not doing lambda here, only extending a facility that we rely on
today. The function would be named, for one. I don't know what you mean
by SQL being lispish here, and I can't imagine why it would be something
to avoid.


And you didn't say how you want to turn a utility statement into
something that is able to return a result set.


if we support real procedures a la Sybase procedures (MySQL, MSSQL, ...)
- then we can return a result with the same mechanism - there is no
significant difference between DO and CALL statements - you don't know
what the result type will be before you call it.


Currently we don't have CALL, and we have DO which is not a query but a
utility statement. Are you proposing to implement CALL? What would be
the difference between making DO a query and having CALL?


De facto, a CALL statement implementation can solve this issue.

The core of this issue is the impossibility of using parameters for
utility statements.  CALL and DO are utility statements - and if we
can use parameters for CALL, then we can do it for DO too.

A CALL statement starts a predefined batch - inside this batch, you can
do anything - use COMMIT, ROLLBACK, SELECTs, ... DO is a batch
with immediate start. Sure, there are relatively significant differences
between the stored procedures implemented in popular RDBMSs, and although
I don't like T-SQL too much, I like the Sybase concept of stored
procedures - it is strong and very useful for maintenance tasks.

Regards

Pavel






Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


- Heikki




Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-09-24 Thread Boszormenyi Zoltan

On 2012-09-22 20:49, Tom Lane wrote:

Boszormenyi Zoltan <z...@cybertec.at> writes:

new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.


Thank you for the kind words. :-)


To start at the bottom level, the changes to PGSemaphoreLock broke it,
and seem probably unnecessary anyway.  As coded, calling the condition
checker breaks handling of error returns from semop(), unless the checker
is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
shouldn't need to be anyway).


I would call it an oversight, nothing more.
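
(An editorial aside: a minimal sketch of the errno hazard described above --
the checker runs inside the semop() retry loop after semop() may have failed
with EINTR, so it must save and restore errno. Names are illustrative, not
the patch's actual code.)

    #include <errno.h>
    #include <stdbool.h>

    extern bool LmgrTimeoutCondition(void);  /* the patch's checker (assumed) */

    /* Wrap the condition check so semop()'s errno (e.g. EINTR) survives it;
     * the surrounding retry loop re-tests errno after we return. */
    static bool
    check_condition_preserving_errno(void)
    {
        int     save_errno = errno;
        bool    triggered = LmgrTimeoutCondition();

        errno = save_errno;
        return triggered;
    }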


   More, if the checker does return true,
it causes PGSemaphoreLock to utterly violate its contract: it returns to
the caller without having acquired the semaphore, and without even telling
the caller so.  Worse, if we *did* acquire the semaphore, we might still
exit via this path, since the placement of the condition check call
ignores the comment a few lines up:

  * Once we acquire the lock, we do NOT check for an interrupt before
  * returning.  The caller needs to be able to record ownership of the lock
  * before any interrupt can be accepted.

We could possibly fix all this with a redesigned API contract for
PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
with it at all.  We never needed to get it involved with deadlock check
handling, and I don't see why that needs to change for lock timeouts.

One very good reason why monkeying with PGSemaphoreLock is wrong is that
on some platforms a SIGALRM interrupt won't interrupt the semop() call,
and thus control would never reach the checker anyway.  If we're going
to throw an error, it must be thrown from the interrupt handler.


Then please explain to me: how on Earth can the current deadlock_timeout
report the error? Sure, I can see the PG_TRY() ... PG_END_TRY() block in
lock.c, but as far as I can see, nothing in the CheckDeadLock() ->
DeadLockCheck() -> DeadLockCheckRecurse() path diverts the code to return
to a different address from the signal handler, i.e. there is no
elog(ERROR) or ereport(ERROR) even in the DS_HARD_DEADLOCK case, so nothing
calls siglongjmp(). So logically the code shouldn't end up in the
PG_CATCH() branch.

semop() will only get errno = EINTR when returning from the signal handler
and would loop again. Then what makes it return, besides being able to lock
the semaphore? The conditions in ProcSleep() that e.g. print the lock stats
work somehow.


The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  In the first place, there
is nothing on the horizon suggesting that we need to let any plug-in code
get control here, and if anything the delicacy of what's going on leads me
to not wish to expose such a possibility.  In the second place, it isn't
adding any actually useful functionality, it's just agglomerating some
checks.  The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).

On the whole I think we could forget lmgrtimeout and just hardwire the
lock timeout and deadlock check cases.  But in any case we're going to
need support in timeout.c for enabling/disabling multiple timeouts at
once without extra setitimer calls.


OK, so you prefer the previous hardcoded PGSemaphoreTimedLock()
that makes every LockAcquire() check its return code, and the detailed
error message about failing to lock the given object?

I will add new functions to timeout.c to remove many timeout sources
at once, to decrease the number of syscalls needed.
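
(For illustration, one plausible shape for such batching -- mark all the
requested timeout sources inactive first, then re-arm the timer with a
single setitimer() call. All names here are hypothetical sketches, not
timeout.c's actual API.)

    #include <stdbool.h>
    #include <sys/time.h>

    #define MAX_TIMEOUTS 16

    typedef struct
    {
        bool            active;
        struct timeval  delay;      /* simplification: relative delay */
    } Timeout;

    static Timeout timeouts[MAX_TIMEOUTS];

    /* Re-arm the interval timer for the nearest active timeout (or disarm
     * it if none remain) -- the single syscall for the whole batch. */
    static void
    reschedule_timer(void)
    {
        struct itimerval it = {{0, 0}, {0, 0}};
        const struct timeval *earliest = NULL;

        for (int i = 0; i < MAX_TIMEOUTS; i++)
            if (timeouts[i].active &&
                (earliest == NULL || timercmp(&timeouts[i].delay, earliest, <)))
                earliest = &timeouts[i].delay;
        if (earliest)
            it.it_value = *earliest;
        setitimer(ITIMER_REAL, &it, NULL);
    }

    /* Disable several timeout sources at once: one setitimer() instead of
     * one per source. */
    static void
    disable_timeouts_batch(const int *ids, int count)
    {
        for (int i = 0; i < count; i++)
            timeouts[ids[i]].active = false;
        reschedule_timer();
    }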


I'm also not thrilled about the way in which the existing deadlock
checking code has been hacked up.  As an example, you added this to
DeadLockReport():

+   if (!DeadLockTimeoutCondition())
+   return;

which again causes it to violate its contract, namely to report a
deadlock, in the most fundamental way -- existing callers aren't
expecting it to return *at all*.


Existing caller*s*? There is only one caller. The reasoning behind this
change was that if the code reaches ReportLmgrTimeoutError() in lock.c,
then at least one timeout triggered, and one *Report function in the chain
will do ereport(ERROR). *THAT* would trigger the siglongjmp() ending
up in PG_CATCH(). The added lines in DeadLockReport() ensure that
the deadlock error is not reported if it didn't trigger.


   Surely we can decouple the deadlock
and lock timeout cases better than that; or at least if we can't it's
a delusion to propose anything like lmgrtimeout in the first place.

There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which 

Re: [HACKERS] Configuration include directory

2012-09-24 Thread Heikki Linnakangas

On 21.09.2012 00:10, Selena Deckelmann wrote:

Hello!

I've spent a little time with this patch and have attached revision 6.
  Thanks, Noah, for a fantastically detailed review.

The only thing I didn't do that Noah suggested was run pgindent on
guc-file.l. A cursory search did not reveal source compatible with my
operating system for 'indent'. If someone points me to it, I'd happily
also comply with the request to reindent. And document how to do that
on my platform(s). :)

I did just remove the references to the Apache project etc. I agree
that providing best practices is good, but I'm skeptical about
including best practices piecemeal. Adding it to earlier tutorial
sections would probably be a bit more visible IMO.


This seems pretty much ready to commit. One tiny detail that I'd like to 
clarify: the docs say:



Multiple files within an include directory are ordered by an alphanumeric 
sorting, so that ones beginning with numbers are considered before those 
starting with letters.


To be more precise, the patch uses strcmp() for the comparisons. That's 
also what apache seems to do, although I couldn't find it being 
mentioned explicitly in their docs. It's true that numbers are sorted 
before letters, but should we also mention that upper-case letters are 
sorted before lower-case ones, and that sorting of non-ASCII characters 
depends on the encoding, in often surprising ways? Is there a better 
term for what strcmp() does? ASCII order? Is there precedent somewhere 
else in the PostgreSQL codebase or docs for that?
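
(A throwaway illustration of the ordering strcmp() produces on file names --
digits sort before uppercase, which sorts before lowercase, because the
comparison is by raw byte values:)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int
    by_strcmp(const void *a, const void *b)
    {
        return strcmp(*(const char *const *) a, *(const char *const *) b);
    }

    int
    main(void)
    {
        const char *names[] = { "memory.conf", "01-base.conf", "Zz.conf" };

        qsort(names, 3, sizeof(names[0]), by_strcmp);
        for (int i = 0; i < 3; i++)
            puts(names[i]);   /* prints: 01-base.conf, Zz.conf, memory.conf */
        return 0;
    }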


- Heikki




Re: [HACKERS] Patch for option in pg_resetxlog for restore from WAL files

2012-09-24 Thread Heikki Linnakangas

On 18.07.2012 16:47, Amit kapila wrote:

Patch implementing the design in below mail chain is attached with this mail.


This patch copies the ReadRecord() function and a bunch of related 
functions from xlog.c into pg_resetxlog.c. There's a separate patch in 
the current commitfest to make that code reusable, without having to 
copy-paste it to every tool that wants to parse the WAL. See 
https://commitfest.postgresql.org/action/patch_view?id=860. This patch 
should be refactored to make use of that framework, as soon as it's 
committed.


- Heikki




Re: [HACKERS] 64-bit API for large object

2012-09-24 Thread Nozomi Anzai
Here is the 64-bit API for large objects, version 2 patch.

 I checked this patch. It can be applied onto the latest master branch
 without any problems. My comments are below.
 
 2012/9/11 Tatsuo Ishii <is...@postgresql.org>:
  Ok, here is the patch to implement the 64-bit API for large objects, to
  allow use of up to 4TB large objects (or 16TB if BLCKSZ is changed to
  32KB). The patch is based on Jeremy Drake's patch posted on September
  23, 2005
  (http://archives.postgresql.org/pgsql-hackers/2005-09/msg01026.php)
  and reasonably updated/edited to adapt to PostgreSQL 9.3 by Nozomi Anzai
  for the backend part and Yugo Nagata for the rest (including the
  documentation patch).
 
  Here are changes made in the patch:
 
  1) Frontend lo_* libpq functions (fe-lobj.c) (Yugo Nagata)
 
  lo_initialize() gathers the OIDs of the backend's 64-bit large object
  handling functions, namely lo_lseek64, lo_tell64, lo_truncate64.
 
  If the client calls lo_*64 functions and the backend does not support
  them, the lo_*64 functions return an error to the caller. There might be
  an argument that calls to lo_*64 functions could automatically be
  redirected to the older 32-bit API. I don't know if this is worth the
  trouble though.
 
 I think it should definitely return an error code when the user tries to
 use lo_*64 functions against a v9.2 or older backend, because
 falling back to the 32-bit API can raise unexpected errors if the
 application intends to seek beyond 2GB.
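
(For context, here is what client code using the proposed 64-bit calls
might look like -- a sketch assuming the patch's pg_int64-based signatures,
run inside a transaction as all large-object operations must be:)

    #include <stdio.h>
    #include <libpq-fe.h>
    #include <libpq/libpq-fs.h>     /* INV_READ */

    /* Seek to a 3GB offset -- unreachable through the 32-bit lo_lseek() --
     * then read a chunk.  Against a server without the lo_*64 functions,
     * lo_lseek64() is expected to fail rather than truncate the offset. */
    static int
    read_past_2gb(PGconn *conn, Oid lobj)
    {
        char        buf[8192];
        int         fd = lo_open(conn, lobj, INV_READ);
        pg_int64    target = (pg_int64) 3 * 1024 * 1024 * 1024;

        if (fd < 0)
            return -1;
        if (lo_lseek64(conn, fd, target, SEEK_SET) < 0)
            return -1;
        fprintf(stderr, "offset now %lld\n", (long long) lo_tell64(conn, fd));
        return lo_read(conn, fd, buf, sizeof(buf));
    }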

  Currently lo_initialize() throws an error if one of the OIDs is not
  available. I doubt we should do the same for the 64-bit functions, since
  this would make the 9.3 libpq unable to access large objects stored in
  PostgreSQL servers older than 9.3.
 
 It seems to me we should split the pre-9.2 and post-9.3 cases
 using a condition of conn->sversion >= 90300.

Fixed so, and tested it by deleting the lo_tell64 row from pg_proc.


  To pass a 64-bit integer to PQfn, PQArgBlock is used like this: int *ptr
  is a pointer to the 64-bit integer and the actual data is placed somewhere
  else. There might be another way: add a new member to union u to store
  the 64-bit integer:
 
  typedef struct
  {
      int         len;
      int         isint;
      union
      {
          int        *ptr;        /* can't use void (dec compiler barfs) */
          int         integer;
          int64       bigint;     /* 64-bit integer */
      }           u;
  } PQArgBlock;
 
  I'm a little bit worried about this way because PQArgBlock is a public
  interface.
 
 I'm inclined to add a new field to the union; that seems to me the
 straightforward approach.
 For example, the manner in lo_seek64() seems confusing to me:
 it sets isint = 1 even though a pointer is actually delivered.
 
 +   argv[1].isint = 1;
 +   argv[1].len = 8;
 +   argv[1].u.ptr = (int *) &len;

Your proposal was not adopted per discussion.


  Also we add new type pg_int64:
 
  #ifndef NO_PG_INT64
  #define HAVE_PG_INT64 1
  typedef long long int pg_int64;
  #endif
 
  in postgres_ext.h per suggestion from Tom Lane:
  http://archives.postgresql.org/pgsql-hackers/2005-09/msg01062.php
 
 I'm uncertain about the context of this discussion.
 
 Does it matter if we include stdint.h and use int64_t instead
 of the self-defined data type?

Your proposal was not adopted per discussion.
Per discussion, endianness translation was moved to fe-lobj.c.


  2) Backend lo_* functions (be-fsstubs.c) (Nozomi Anzai)
 
  Add lo_lseek64, lo_tell64, lo_truncate64 so that they can handle
  64-bit seek positions and data lengths. loread64 and lowrite64 are not
  added because if a program tries to read/write more than 2GB at once,
  it would be a sign that the program needs to be re-designed anyway.
 
 I think that is reasonable.
 
  3) Backend inv_api.c functions (Nozomi Anzai)
 
  No need to add new functions. Just extend them to handle 64-bit data.
 
  BTW, what will happen if an older 32-bit libpq accesses large objects
  over 2GB?
 
  lo_read and lo_write: they can read or write lobjs using the 32-bit API
  as long as the requested read/write data length is smaller than 2GB. So
  I think we can safely allow them to access lobjs over 2GB.
 
  lo_lseek: again, as long as the requested offset is smaller than 2GB,
  there would be no problem.
 
  lo_tell: if the current seek position is beyond 2GB, it returns an error.
 
 Even though iteration of lo_lseek() may move the offset to 4TB, that also
 makes it impossible to use lo_tell() to obtain the current offset, so I
 think it is reasonable behavior.
 
 However, the error code is not an appropriate one.
 
 +   if (INT_MAX < offset)
 +   {
 +       ereport(ERROR,
 +               (errcode(ERRCODE_UNDEFINED_OBJECT),
 +                errmsg("invalid large-object descriptor: %d", fd)));
 +       PG_RETURN_INT32(-1);
 +   }
 
 According to the manpage of lseek(2):
     EOVERFLOW
         The resulting file 

Re: [HACKERS] Patch for option in pg_resetxlog for restore from WAL files

2012-09-24 Thread Amit Kapila
 On Monday, September 24, 2012 2:30 PM Heikki Linnakangas wrote:
 On 18.07.2012 16:47, Amit kapila wrote:
  Patch implementing the design in below mail chain is attached with
 this mail.
 
 This patch copies the ReadRecord() function and a bunch of related
 functions from xlog.c into pg_resetxlog.c. There's a separate patch in
 the current commitfest to make that code reusable, without having to
 copy-paste it to every tool that wants to parse the WAL. See
 https://commitfest.postgresql.org/action/patch_view?id=860. This patch
 should be refactored to make use of that framework, as soon as it's
 committed.

Sure. Thanks for the feedback.

With Regards,
Amit Kapila.





[HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation

2012-09-24 Thread Amit kapila
From: Heikki Linnakangas [mailto:heikki(dot)linnakangas(at)enterprisedb(dot)com]
Sent: Monday, August 27, 2012 5:58 PM
To: Amit kapila
On 27.08.2012 15:18, Amit kapila wrote:
 I have implemented the WAL reduction patch for the case of HOT updates, as
pointed out by Simon and Robert. This patch only uses the optimized
WAL format for HOT updates, with the other restrictions the same as in the
previous patch.

 The performance numbers for this patch are attached in this mail. WAL volume
has improved by 90% if the page has fillfactor 80.

 Now going forward I have the following options:
 a. Upload the patch in the open CF for WAL reduction which contains
the reduction for HOT and non-HOT updates.
 b. Upload the patch in the open CF for WAL reduction which contains
the reduction for HOT updates.
 c. Upload both the patches as different versions.

 Let's do it for HOT updates only. Simon & Robert made good arguments on
 why this is a bad idea for non-HOT updates.

Okay, I shall do it that way.
So now I shall send information about all the testing I have done for this
patch and then upload it in the CF.
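
(As background on the general technique -- not necessarily this patch's
exact encoding -- one way an update's WAL record shrinks is by logging only
the bytes that differ between the old and new tuple versions, e.g. via a
prefix/suffix match. A conceptual sketch:)

    #include <stddef.h>

    typedef struct
    {
        size_t      prefix_len;  /* bytes identical at the start */
        size_t      suffix_len;  /* bytes identical at the end */
        const char *diff;        /* changed middle of the new tuple */
        size_t      diff_len;
    } TupleDelta;

    /* Log prefix_len, suffix_len, and diff[0..diff_len) instead of the
     * whole new tuple; redo rebuilds it from the old page version. */
    static TupleDelta
    compute_delta(const char *oldt, size_t oldlen,
                  const char *newt, size_t newlen)
    {
        TupleDelta d = {0, 0, NULL, 0};
        size_t     max_pre = oldlen < newlen ? oldlen : newlen;

        while (d.prefix_len < max_pre &&
               oldt[d.prefix_len] == newt[d.prefix_len])
            d.prefix_len++;
        while (d.suffix_len < max_pre - d.prefix_len &&
               oldt[oldlen - 1 - d.suffix_len] == newt[newlen - 1 - d.suffix_len])
            d.suffix_len++;
        d.diff = newt + d.prefix_len;
        d.diff_len = newlen - d.prefix_len - d.suffix_len;
        return d;
    }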



Rebased version of patch based on latest code.




With Regards,

Amit Kapila.


wal_update_changes_v2.patch
Description: wal_update_changes_v2.patch



Re: [HACKERS] Proof of concept: auto updatable views [Review of Patch]

2012-09-24 Thread Amit Kapila
  On Sunday, September 23, 2012 12:33 AM Dean Rasheed wrote:
 On 18 September 2012 14:23, Amit kapila <amit.kap...@huawei.com> wrote:
  Please find the review of the patch.
 
 
 Thanks for the review. Attached is an updated patch, and I've included
 some responses to specific review comments below.

I have verified your updated patch. It works fine, and in my opinion it is
ready for a committer to check.
I have updated the CF page to Ready for Committer.


With Regards,
Amit Kapila.





[HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes

2012-09-24 Thread Andres Freund
Hi,

Problem 1: concurrency:

Testcase:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 100000);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 100000);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)

Explanation:
index_drop does:
indexForm->indisvalid = false;  /* make unusable for queries */
indexForm->indisready = false;  /* make invisible to changes */

Setting indisready = false is problematic because it prevents index updates, 
which in turn breaks READ COMMITTED semantics. I think there needs to be one 
more phase that waits for concurrent users of the index to finish before 
setting indisready = false.


Problem 2: undroppable indexes:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 2:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
waiting
^CCancel request sent
ERROR:  canceling statement due to user request

Session 1:
ROLLBACK;
DROP TABLE test_drop_concurrently;
SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass, 
indisvalid, indisready FROM pg_index WHERE indexrelid = 
'test_drop_concurrently_data'::regclass;
 indexrelid | indrelid |          indexrelid          | indrelid | indisvalid | indisready
------------+----------+------------------------------+----------+------------+------------
      24703 |    24697 | test_drop_concurrently_data  |    24697 | f          | f
(1 row)

DROP INDEX test_drop_concurrently_data;
ERROR:  could not open relation with OID 24697

Haven't looked at this one at all.

Greetings,

Andres
-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes

2012-09-24 Thread Andres Freund
On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
 Problem 2: undroppable indexes:
 
 Session 1:
 CREATE TABLE test_drop_concurrently(id serial primary key, data int);
 CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
 BEGIN;
 EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
 
 Session 2:
 DROP INDEX CONCURRENTLY test_drop_concurrently_data;
 waiting
 ^CCancel request sent
 ERROR:  canceling statement due to user request
 
 Session 1:
 ROLLBACK;
 DROP TABLE test_drop_concurrently;
 SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass,
 indisvalid, indisready FROM pg_index WHERE indexrelid =
 'test_drop_concurrently_data'::regclass;
  indexrelid | indrelid |          indexrelid          | indrelid | indisvalid | indisready
 ------------+----------+------------------------------+----------+------------+------------
       24703 |    24697 | test_drop_concurrently_data  |    24697 | f          | f
 (1 row)
 
 DROP INDEX test_drop_concurrently_data;
 ERROR:  could not open relation with OID 24697
 
 Haven't looked at this one at all.
That's because it has to commit transactions in between to make the catalog 
changes visible. Unfortunately, at that point it has already deleted the 
dependencies in deleteOneObject...


Greetings,

Andres
-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:
 On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:
  Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' 
  works. --locale='zh_CN.UTF8' also works. But still the question is, 
  should the encoding name be case sensitive?
 
 PostgreSQL treats encoding names as case insensitive.
 
 But it depends on the operating system whether locale names are case
 sensitive.

I can confirm that pg_upgrade does case-insensitive comparisons of
encoding/locale names:

static void
check_locale_and_encoding(ControlData *oldctrl,
                          ControlData *newctrl)
{
    /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
    if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_collate values do not match\n");
    if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_ctype values do not match\n");
    if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
        pg_log(PG_FATAL,
               "old and new cluster encoding values do not match\n");
}

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Peter Eisentraut
On 9/24/12 8:55 AM, Bruce Momjian wrote:
 I can confirm that pg_upgrade does case-insensitive comparisons of
 encoding/locale names:
 
   static void
   check_locale_and_encoding(ControlData *oldctrl,
                             ControlData *newctrl)
   {
       /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
       if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
           pg_log(PG_FATAL,
                  "old and new cluster lc_collate values do not match\n");
       if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
           pg_log(PG_FATAL,
                  "old and new cluster lc_ctype values do not match\n");

I seem to recall that at some point in the distant past, somehow some
Linux distributions changed the canonical spelling of locale names from
xx_YY.UTF-8 to xx_YY.utf8.  So if people are upgrading old PostgreSQL
instances that use the old spelling, pg_upgrade will probably fail.  A
fix might be to take the locale name you find in pg_control and run it
through setlocale() to get the new canonical name.
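
(A best-effort sketch of that suggestion -- probe setlocale() for the
canonical spelling and restore the previous locale afterwards. As Tom notes
downthread, many platforms just echo the input back, so this only helps
where the C library actually canonicalizes. The helper name is illustrative.)

    #include <locale.h>
    #include <stdlib.h>
    #include <string.h>

    /* Returns a malloc'd canonical spelling of "name" according to the C
     * library, or NULL if the locale is not recognized.  Caller frees. */
    static char *
    canonicalize_locale(const char *name)
    {
        char   *save = strdup(setlocale(LC_COLLATE, NULL)); /* remember current */
        char   *res = setlocale(LC_COLLATE, name);          /* try to canonicalize */
        char   *canon = res ? strdup(res) : NULL;

        setlocale(LC_COLLATE, save);                        /* restore */
        free(save);
        return canon;
    }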





Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 09:06:04AM -0400, Peter Eisentraut wrote:
 On 9/24/12 8:55 AM, Bruce Momjian wrote:
  I can confirm that pg_upgrade does case-insensitive comparisons of
  encoding/locale names:
  
  static void
  check_locale_and_encoding(ControlData *oldctrl,
                            ControlData *newctrl)
  {
      /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
      if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
          pg_log(PG_FATAL,
                 "old and new cluster lc_collate values do not match\n");
      if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
          pg_log(PG_FATAL,
                 "old and new cluster lc_ctype values do not match\n");
 
 I seem to recall that at some point in the distant past, somehow some
 Linux distributions changed the canonical spelling of locale names from
 xx_YY.UTF-8 to xx_YY.utf8.  So if people are upgrading old PostgreSQL
 instances that use the old spelling, pg_upgrade will probably fail.  A
 fix might be to take the locale name you find in pg_control and run it
 through setlocale() to get the new canonical name.

Or we could just remove dashes from the name before comparisons.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +




[HACKERS] PostgreSQL in the French news

2012-09-24 Thread Fabien COELHO


Dear devs,

For your information, PostgreSQL appears explicitly in the list of 
Free software cited by the French Prime Minister, Jean-Marc Ayrault, in 
his Circulaire (i.e. a kind of instruction to civil servants) signed 
last week. See pages 9 and 18 of the PDF (not the document's own page 
numbers, which are shifted). Obviously, the document is in French...


http://circulaire.legifrance.gouv.fr/pdf/2012/09/cir_35837.pdf

The spirit of the document does not seem to be to make the use of free 
software mandatory, but to require that it is at least considered for any 
project in which the administration (i.e. the public sector, not the 
government as in the USA) is involved.


--
Fabien.




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Rural Hunter

On 2012/9/24 20:55, Bruce Momjian wrote:

On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:

On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:

Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8'
works. --locale='zh_CN.UTF8' also works. But still the question is,
should the encoding name be case sensitive?

PostgreSQL treats encoding names as case insensitive.

But it depends on the operating system whether locale names are case
sensitive.

I can confirm that pg_upgrade does case-insensitive comparisons of
encoding/locale names:

static void
check_locale_and_encoding(ControlData *oldctrl,
                          ControlData *newctrl)
{
    /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
    if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_collate values do not match\n");
    if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_ctype values do not match\n");
    if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
        pg_log(PG_FATAL,
               "old and new cluster encoding values do not match\n");
}


Strange, not sure what happened. I reviewed the log and here is what I did:
1. initdb without encoding/locale parameter:
$ initdb
The files belonging to this database system will be owned by user 
postgres.

This user must also own the server process.

The database cluster will be initialized with locale zh_CN.UTF-8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale 
zh_CN.UTF-8

The default text search configuration will be set to simple.

2. Run pg_upgrade:
$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B 
/opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok

old and new cluster lc_collate values do not match
Failure, exiting

3. initdb with --lc-collate:
$ initdb --lc-collate=zh_CN.utf8
The files belonging to this database system will be owned by user 
postgres.

This user must also own the server process.

The database cluster will be initialized with locales
COLLATE: zh_CN.utf8
CTYPE: zh_CN.UTF-8
MESSAGES: zh_CN.UTF-8
MONETARY: zh_CN.UTF-8
NUMERIC: zh_CN.UTF-8
TIME: zh_CN.UTF-8
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale 
zh_CN.UTF-8

The default text search configuration will be set to simple.

4. try pg_upgrade again:
$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B 
/opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok

old and new cluster lc_ctype values do not match
Failure, exiting

5. Run initdb with all those locale settings:
$ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 
--lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 
--lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8
The files belonging to this database system will be owned by user 
postgres.

This user must also own the server process.

The database cluster will be initialized with locale zh_CN.utf8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale 
zh_CN.utf8

The default text search configuration will be set to simple.

6. Run pg_upgrade. this time it worked.




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Tom Lane
Bruce Momjian <br...@momjian.us> writes:
 I can confirm that pg_upgrade does case-insensitive comparisons of
 encoding/locale names:

 Or we could just remove dashes from the name before comparisons.

That would merely move the breakage somewhere else.  I think you are
already assuming far too much about the OS' interpretation of locale
names by assuming they are case-insensitive.  Assuming that dashes
aren't significant seems 100% wrong.

FWIW, what I found out last time I touched this code is that on many
systems setlocale doesn't bother to return a canonicalized spelling;
it just gives back the string you gave it.  It might be worth doing
what Peter suggests, just to be consistent with what we are doing
elsewhere, but I'm not sure how much it will help.

regards, tom lane




Re: [HACKERS] Caught up

2012-09-24 Thread Alvaro Herrera
Excerpts from Thom Brown's message of Sat Sep 22 14:58:04 -0300 2012:
 On 3 September 2012 15:20, Bruce Momjian br...@momjian.us wrote:

  On a related note, I apologize that many of these items didn't make it
  into 9.2, though they are committed for 9.3.  The good news is that most
  of my work was in documentation improvements and clarifications, that
  could be backpatched to 9.2.  I did not bring this up earlier because I
  didn't want to distract the work of making the improvements.
 
  If anyone wants to look at backpatching some of these doc changes into
  9.2, I will not object.  ;-)  I am attaching a partial list of doc
  changes that might be considered.  (I need to improve my commit messages
  that reference earlier commits by including the old commit tag;  my
  apologies.)  No one has mentioned backpatching so perhaps these are all
  too minor, which is fine.
 
 That's a huge number of amendments. Thanks for giving these some
 attention Bruce.

FWIW I tried to backpatch the changes listed by Bruce and they all seem
to apply fine to 9.2 so maybe we should do that -- I see no reason not
to.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Configuration include directory

2012-09-24 Thread Tom Lane
Heikki Linnakangas <hlinnakan...@vmware.com> writes:
 This seems pretty much ready to commit. One tiny detail that I'd like to 
 clarify: the docs say:

 Multiple files within an include directory are ordered by an alphanumeric 
 sorting, so that ones beginning with numbers are considered before those 
 starting with letters.

 To be more precise, the patch uses strcmp() for the comparisons.

Just say it sorts the file names according to C locale rules.

regards, tom lane




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 09:59:02PM +0800, Rural Hunter wrote:
 On 2012/9/24 20:55, Bruce Momjian wrote:
 On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:
 On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:
 Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8'
 works. --locale='zh_CN.UTF8' also works. But still the question is,
 should the encoding name be case sensitive?
 PostgreSQL treats encoding names as case insensitive.
 
 But it depends on the operating system whether locale names are case
 sensitive.
 I can confirm that pg_upgrade does case-insensitive comparisons of
 encoding/locale names:
 
  static void
  check_locale_and_encoding(ControlData *oldctrl,
                            ControlData *newctrl)
  {
      /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
      if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
          pg_log(PG_FATAL,
                 "old and new cluster lc_collate values do not match\n");
      if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
          pg_log(PG_FATAL,
                 "old and new cluster lc_ctype values do not match\n");
      if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
          pg_log(PG_FATAL,
                 "old and new cluster encoding values do not match\n");
  }
 
 strange. not sure what happened. I reviewed the log and here is what I did:
 1. initdb without encoding/locale parameter:
 $ initdb
 The files belonging to this database system will be owned by user
 postgres.
 This user must also own the server process.
 
 The database cluster will be initialized with locale zh_CN.UTF-8.
 The default database encoding has accordingly been set to UTF8.
 initdb: could not find suitable text search configuration for locale
 zh_CN.UTF-8
 The default text search configuration will be set to simple.
 
 2. Run pg_upgrade:
 $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B
 /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
 Performing Consistency Checks
 -
 Checking current, bin, and data directories ok
 Checking cluster versions ok
 Checking database user is a superuser ok
 Checking for prepared transactions ok
 Checking for reg* system OID user data types ok
 Checking for contrib/isn with bigint-passing mismatch ok
 
 old and new cluster lc_collate values do not match
 Failure, exiting
 
 3. initdb with --lc-collate:
 $ initdb --lc-collate=zh_CN.utf8
 The files belonging to this database system will be owned by user
 postgres.
 This user must also own the server process.
 
 The database cluster will be initialized with locales
 COLLATE: zh_CN.utf8
 CTYPE: zh_CN.UTF-8
 MESSAGES: zh_CN.UTF-8
 MONETARY: zh_CN.UTF-8
 NUMERIC: zh_CN.UTF-8
 TIME: zh_CN.UTF-8
 The default database encoding has accordingly been set to UTF8.
 initdb: could not find suitable text search configuration for locale
 zh_CN.UTF-8
 The default text search configuration will be set to simple.
 
 4. try pg_upgrade again:
 $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B
 /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
 Performing Consistency Checks
 -
 Checking current, bin, and data directories ok
 Checking cluster versions ok
 Checking database user is a superuser ok
 Checking for prepared transactions ok
 Checking for reg* system OID user data types ok
 Checking for contrib/isn with bigint-passing mismatch ok
 
 old and new cluster lc_ctype values do not match
 Failure, exiting
 
 5. Run initdb with all those locale settings:
 $ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8
 --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8
 --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8
 The files belonging to this database system will be owned by user
 postgres.
 This user must also own the server process.
 
 The database cluster will be initialized with locale zh_CN.utf8.
 The default database encoding has accordingly been set to UTF8.
 initdb: could not find suitable text search configuration for locale
 zh_CN.utf8
 The default text search configuration will be set to simple.
 
 6. Run pg_upgrade. this time it worked.

OK, that is good information.  pg_upgrade gets the locale and encoding
from the template0 database settings:

    SELECT datcollate, datctype
    FROM   pg_catalog.pg_database
    WHERE  datname = 'template0';

If your operating system locale/encoding names changed after the initdb
of the old cluster, this would not be reflected in template0.  I think
Peter is right that this might be a dash issue, utf8 vs utf-8.  Look at
the initdb output:

 3. initdb with --lc-collate:
 $ initdb --lc-collate=zh_CN.utf8
 The files belonging to this database system will be owned by user
 postgres.
 This user must also own the server process.

 The database cluster will be initialized with locales
 COLLATE: 

Re: [HACKERS] Switching timeline over streaming replication

2012-09-24 Thread Amit Kapila
On Tuesday, September 11, 2012 10:53 PM Heikki Linnakangas wrote:
 I've been working on the often-requested feature to handle timeline
 changes over streaming replication. At the moment, if you kill the
 master and promote a standby server, and you have another standby
 server that you'd like to keep following the new master server, you
 need a WAL archive in addition to streaming replication to make it
 cross the timeline change. Streaming replication will just error out.
 Having a WAL archive is usually a good idea in complex replication
 scenarios anyway, but it would be good to not require it.

To confirm my understanding of this feature:

This feature is for the case when standby-1, which is going to be promoted
to master, has archive_mode 'on', as in that case its timeline will change.

If the above is right, then there are other similar scenarios where it can
be used:

Scenario-1 (1 master, 1 standby)
1. Master (archive_mode=on) goes down.
2. Master comes up again.
3. Standby tries to follow it.

In the above scenario too, the timeline mismatch currently gives an error,
but your patch should fix it.


 
 
 Some parts of this patch are just refactoring that probably make sense
 regardless of the new functionality. For example, I split off the
 timeline history file related functions to a new file, timeline.c.
 That's not very much code, but it's fairly isolated, and xlog.c is
 massive, so I feel that anything that we can move off from xlog.c is a
 good thing. I also moved off the two functions RestoreArchivedFile()
 and ExecuteRecoveryCommand(), to a separate file. Those are also not
 much code, but are fairly isolated. If no-one objects to those changes,
 and the general direction this work is going to, I'm going split off
 those refactorings to separate patches and commit them separately.
 
 I also made the timeline history file a bit more detailed: instead of
 recording just the WAL segment where the timeline was changed, it now
 records the exact XLogRecPtr. That was required for the walsender to
 know the switchpoint, without having to parse the XLOG records (it
 reads and parses the history file, instead)

IMO separating the timeline history file related functions into a new file
is good.
However I am not sure about splitting RestoreArchivedFile() and
ExecuteRecoveryCommand() into a separate file.
How about splitting out all archive-related functions:
static void XLogArchiveNotify(const char *xlog); 
static void XLogArchiveNotifySeg(XLogSegNo segno); 
static bool XLogArchiveCheckDone(const char *xlog); 
static bool XLogArchiveIsBusy(const char *xlog); 
static void XLogArchiveCleanup(const char *xlog);
..
..

In any case, it will be better if you can split it into multiple patches:
1. The new functionality of switching timeline over streaming
replication
2. Refactoring-related changes.

That would make my testing and review of the new feature patch a little
easier.

With Regards,
Amit Kapila.





Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I can confirm that pg_upgrade does case-insensitive comparisons of
  encoding/locale names:
 
  Or we could just remove dashes from the name before comparisons.
 
 That would merely move the breakage somewhere else.  I think you are
 already assuming far too much about the OS' interpretation of locale
 names by assuming they are case-insensitive.  Assuming that dashes
 aren't significant seems 100% wrong.
 
 FWIW, what I found out last time I touched this code is that on many
 systems setlocale doesn't bother to return a canonicalized spelling;
 it just gives back the string you gave it.  It might be worth doing
 what Peter suggests, just to be consistent with what we are doing
 elsewhere, but I'm not sure how much it will help.

This comment in initdb.c doesn't sound hopeful:

 * If successful, and canonname isn't NULL, a malloc'd copy of the locale's
 * canonical name is stored there.  This is especially useful for figuring out
 * what locale name "" means (ie, the environment value).  (Actually,
 * it seems that on most implementations that's the only thing it's good for;
 * we could wish that setlocale gave back a canonically spelled version of
 * the locale name, but typically it doesn't.)

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Tom Lane
Bruce Momjian <br...@momjian.us> writes:
 On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote:
 FWIW, what I found out last time I touched this code is that on many
 systems setlocale doesn't bother to return a canonicalized spelling;
 it just gives back the string you gave it.  It might be worth doing
 what Peter suggests, just to be consistent with what we are doing
 elsewhere, but I'm not sure how much it will help.

 This comment in initdb.c doesn't sound hopeful:

  * If successful, and canonname isn't NULL, a malloc'd copy of the locale's
  * canonical name is stored there.  This is especially useful for figuring out
  * what locale name "" means (ie, the environment value).  (Actually,
  * it seems that on most implementations that's the only thing it's good for;
  * we could wish that setlocale gave back a canonically spelled version of
  * the locale name, but typically it doesn't.)

Yeah, I wrote that.  We can hope that the OP is running on a platform
where setlocale does canonicalize the name, in which case doing the
same thing in pg_upgrade that initdb does would fix his problem.  But
I'm not going to predict success.

regards, tom lane




Re: [HACKERS] Doc patch to note which system catalogs have oids

2012-09-24 Thread Karl O. Pinc
On 09/23/2012 10:14:33 PM, Tom Lane wrote:
 Karl O. Pinc <k...@meme.com> writes:
  The attached patch documents the oid column of those
  system catalogs having an oid.
 
 I think this is fundamentally wrong, or at least misleading, because it
 documents OID as if it were an ordinary column.  Somebody who did
 "select * from pg_class" and didn't see any oid in the result would
 think the docs were wrong.

Ok. 

When I went looking at querying the
system catalogs I got confused some time ago because oids were
not listed along with the other columns.  (It didn't help that the
catalog I was looking at had another column of type oid.)

 
 It's possible that it's worth expending a boilerplate paragraph in
 each
 of those pages to say this catalog has OIDs (or that it doesn't).
 But this isn't the way.

How about modifying the (printed) table layout as attached?
It begins each (printed) table documenting a catalog with a
"Has OID column" Yes/No entry.

Also, I note that pg_constraint and pg_collation are not
collated properly in the docs.  (Constraint comes before collation
in the docs, although everything else is sorted by name.)

A second patch (applied on top of the first) fixes this.

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f999190..2dfb40f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -332,6 +332,19 @@
   <table>
    <title><structname>pg_aggregate</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -415,6 +428,19 @@
   <table>
    <title><structname>pg_am</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -671,6 +697,19 @@
   <table>
    <title><structname>pg_amop</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -807,6 +846,19 @@
   <table>
    <title><structname>pg_amproc</structname> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -890,6 +942,19 @@
   <table>
    <title><structname>pg_attrdef</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -967,6 +1032,19 @@
   <table>
    <title><structname>pg_attribute</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1247,6 +1325,19 @@
   <table>
    <title><structname>pg_authid</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="3">
     <thead>
      <row>
@@ -1377,6 +1468,19 @@
   <table>
    <title><structname>pg_auth_members</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1450,6 +1554,19 @@
   <table>
    <title><structname>pg_cast</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1565,6 +1682,19 @@
   <table>
    <title><structname>pg_class</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1877,6 +2007,19 @@
   <table>
    <title><structname>pg_event_trigger</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1972,6 +2115,19 @@
   <table>
    <title><structname>pg_constraint</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Has OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -2238,6 

Re: [HACKERS] Configuration include directory

2012-09-24 Thread Heikki Linnakangas

On 24.09.2012 17:24, Tom Lane wrote:

Heikki Linnakangas <hlinnakan...@vmware.com> writes:

This seems pretty much ready to commit. One tiny detail that I'd like to
clarify: the docs say:



Multiple files within an include directory are ordered by an alphanumeric 
sorting, so that ones beginning with numbers are considered before those 
starting with letters.



To be more precise, the patch uses strcmp() for the comparisons.


Just say it sorts the file names according to C locale rules.


Hmm, that's precise, but I don't think an average user necessarily 
knows what the C locale is. I think I'll go with:


Multiple files within an include directory are processed in filename 
order. The filenames are ordered by C locale rules, i.e. numbers before 
letters, and uppercase letters before lowercase ones.


- Heikki




Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Rural Hunter

On 2012/9/24 22:26, Bruce Momjian wrote:

On Mon, Sep 24, 2012 at 09:59:02PM +0800, Rural Hunter wrote:

On 2012/9/24 20:55, Bruce Momjian wrote:

On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:

On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:

Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8'
works. --locale='zh_CN.UTF8' also works. But still the question is,
should the encoding name be case sensitive?

PostgreSQL treats encoding names as case insensitive.

But it depends on the operating system whether locale names are case
sensitive.

I can confirm that pg_upgrade does case-insensitive comparisons of
encoding/locale names:

static void
check_locale_and_encoding(ControlData *oldctrl,
                          ControlData *newctrl)
{
    /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
    if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_collate values do not match\n");
    if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
        pg_log(PG_FATAL,
               "old and new cluster lc_ctype values do not match\n");
    if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
        pg_log(PG_FATAL,
               "old and new cluster encoding values do not match\n");
}


strange. not sure what happened. I reviewed the log and here is what I did:
1. initdb without encoding/locale parameter:
$ initdb
The files belonging to this database system will be owned by user
postgres.
This user must also own the server process.

The database cluster will be initialized with locale zh_CN.UTF-8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale
zh_CN.UTF-8
The default text search configuration will be set to simple.

2. Run pg_upgrade:
$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B
/opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok

old and new cluster lc_collate values do not match
Failure, exiting

3. initdb with --lc-collate:
$ initdb --lc-collate=zh_CN.utf8
The files belonging to this database system will be owned by user
postgres.
This user must also own the server process.

The database cluster will be initialized with locales
COLLATE: zh_CN.utf8
CTYPE: zh_CN.UTF-8
MESSAGES: zh_CN.UTF-8
MONETARY: zh_CN.UTF-8
NUMERIC: zh_CN.UTF-8
TIME: zh_CN.UTF-8
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale
zh_CN.UTF-8
The default text search configuration will be set to simple.

4. try pg_upgrade again:
$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B
/opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok

old and new cluster lc_ctype values do not match
Failure, exiting

5. Run initdb with all those locale settings:
$ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8
--lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8
--lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8
The files belonging to this database system will be owned by user
postgres.
This user must also own the server process.

The database cluster will be initialized with locale zh_CN.utf8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale
zh_CN.utf8
The default text search configuration will be set to simple.

6. Run pg_upgrade. this time it worked.

OK, that is good information.  pg_upgrade gets the locale and encoding
from the template0 database settings:

    SELECT datcollate, datctype
    FROM   pg_catalog.pg_database
    WHERE  datname = 'template0';

If your operating system locale/encoding names changed after the initdb
of the old cluster, this would not be reflected in template0.

No, it's not changed. Look at my system settings:
LANG=zh_CN.UTF-8
$ cat /var/lib/locales/supported.d/local
zh_CN.UTF-8 UTF-8

I think the problem is in the options I chose when I installed pgsql (both
9.1 and 9.2):

Select the locale to be used by the new database cluster.

Locale

[1] [Default locale]
[2] C
[3] POSIX
[4] zh_CN.utf8
[5] zh_HK.utf8
[6] zh_SG.utf8
[7] zh_TW.utf8

Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Well, if you run that query on template0 in the old and new cluster, you
 will see something different in the two of them.  Could you have used
 default in one and a non-dash in the other?  Did we change the way we
 canonicalize the locale between 9.1 and 9.2?

IIRC, we didn't try to canonicalize locale names at all before 9.2.
That initdb code you're quoting is of fairly recent vintage.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 11:24:04AM -0400, Peter Eisentraut wrote:
 On 9/24/12 10:13 AM, Tom Lane wrote:
  FWIW, what I found out last time I touched this code is that on many
  systems setlocale doesn't bother to return a canonicalized spelling;
  it just gives back the string you gave it.  It might be worth doing
  what Peter suggests, just to be consistent with what we are doing
  elsewhere, but I'm not sure how much it will help.
 
 It might not have anything to do with the current problem, but if initdb
 canonicalizes locale names, then pg_upgrade also has to.  Otherwise,
 whenever an operating system changes its locale canonicalization rules,
 pg_upgrade will fail.

Agreed. I will work on that soon.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote:
 On 9/24/12 11:04 AM, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
  Well, if you run that query on template0 in the old and new cluster, you
  will see something different in the two of them.  Could you have used
  default in one and a non-dash in the other?  Did we change the way we
  canonicalize the locale between 9.1 and 9.2?
  
  IIRC, we didn't try to canonicalize locale names at all before 9.2.
  That initdb code you're quoting is of fairly recent vintage.
 
 initdb has changed POSIX to C with glibc at least since 8.3.  The code
 you're quoting is just a refactoring, AFAICT.

Frankly, I assumed the values assigned in pg_database for template0 were
canonical.  Tom is saying that canonicalization behavior changed
between 9.1 and 9.2, and the user is reporting this.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Alvaro Herrera
Excerpts from Daniele Varrazzo's message of dom sep 23 22:02:51 -0300 2012:
 On Mon, Sep 24, 2012 at 12:23 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 
  As proposed by Masahiko, a single organization grouping all the tools (one
  repository per tool) would be enough. Please note that github can also host
  documentation. Bug tracker would be tool-dedicated in this case.
 
 From this PoV, pgFoundry allows your tool to be under
 http://yourtool.projects.postgresql.org instead of under a more
 generic namespace: I find it a nice and cozy place in the url space
 where to put your project. If pgFoundry will be dismissed I hope at
 least a hosting service for static pages will remain.

I don't think that has been offered.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote:
 initdb has changed POSIX to C with glibc at least since 8.3.  The code
 you're quoting is just a refactoring, AFAICT.

 Frankly, I assumed the values assigned in pg_database for template0 were
 canonical.  Tom is saying that canonicalization behavior changed
 between 9.1 to 9.2, and the user is reporting this.

It was not just a refactoring: we now pass the locale names through
setlocale() which we didn't before.  See commit
c7cea267de3ca05b29a57b9d113b95ef3793c8d8.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] External Replication

2012-09-24 Thread Dimitri Fontaine
m...@rpzdesign.com m...@rpzdesign.com writes:
 You may want to consider changing the command TRIGGER into a command FILTER
 and possibly post processing TRIGGER that
 is determined to be called INSIDE the FILTER.  Or some way to pass
 information between the FILTER and the post processing trigger.

The only current event supported by the system is the
ddl_command_start one. We mean to add some more, and triggers wanting
to communicate data in between ddl_command_start and ddl_command_end
(for example) will have to use something like a table.

 Also, some information as to whether a series of statements was ROLLED
 BACK would be helpful.

Event Triggers are not an autonomous transaction: any effect they have
in the database is rolled back when the main transaction is rolled
back. You can use LISTEN/NOTIFY or PGQ, which both know how to handle
those semantics.
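
For illustration, a minimal libpq sketch of that pattern (the connection
string and channel name here are placeholders, not from any patch): a
NOTIFY is delivered to listeners only when the sending transaction
commits, so events from a rolled-back transaction are simply never seen.

#include <stdio.h>
#include <sys/select.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=app");	/* placeholder conninfo */
	PGnotify   *note;

	if (PQstatus(conn) != CONNECTION_OK)
		return 1;
	PQclear(PQexec(conn, "LISTEN replication_events"));

	for (;;)
	{
		int			sock = PQsocket(conn);
		fd_set		mask;

		FD_ZERO(&mask);
		FD_SET(sock, &mask);
		select(sock + 1, &mask, NULL, NULL, NULL);	/* sleep until data arrives */

		PQconsumeInput(conn);
		while ((note = PQnotifies(conn)) != NULL)
		{
			/* only notifications from committed transactions arrive here */
			printf("pid %d: %s\n", note->be_pid, note->extra);
			PQfreemem(note);
		}
	}
}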

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Peter Eisentraut
On 9/24/12 11:04 AM, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
 Well, if you run that query on template0 in the old and new cluster, you
 will see something different in the two of them.  Could you have used
 default in one and a non-dash in the other?  Did we change the way we
 canonicalize the locale between 9.1 and 9.2?
 
 IIRC, we didn't try to canonicalize locale names at all before 9.2.
 That initdb code you're quoting is of fairly recent vintage.

initdb has changed POSIX to C with glibc at least since 8.3.  The code
you're quoting is just a refactoring, AFAICT.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Roberto Mello
On Sat, Sep 22, 2012 at 3:25 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

 To solve this problem, I would like to have some umbrella project.
 It would be called pg dba utils, or something like this.
 This umbrella project may contain several third-party tools (pg_reorg,
 pg_rman, pg_filedump, xlogdump, etc, etc...) as its sub-modules.

Great idea!

+1

Roberto Mello


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Well, if you run that query on template0 in the old and new cluster, you
  will see something different in the two of them.  Could you have used
  default in one and a non-dash in the other?  Did we change the way we
  canonicalize the locale between 9.1 and 9.2?
 
 IIRC, we didn't try to canonicalize locale names at all before 9.2.
 That initdb code you're quoting is of fairly recent vintage.

Ah, so that would explain the change he is seeing.  I will work on a
patch.  I am working on the information_schema patch now.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Simon Riggs
On 21 September 2012 08:42, Michael Paquier michael.paqu...@gmail.com wrote:


 On Fri, Sep 21, 2012 at 1:00 PM, Hitoshi Harada umi.tan...@gmail.com
 wrote:

 I'm not familiar with pg_reorg, but I wonder why we need a separate
 program for this task.  I know pg_reorg is ok as an external program
 per se, but if we could optimize CLUSTER (or VACUUM which I'm a little
 pessimistic about) in the same way, it's much nicer than having
 additional binary + extension.  Isn't it possible to do the same thing
 above within the CLUSTER command?  Maybe CLUSTER .. CONCURRENTLY?

 CLUSTER might be more adapted in this case as the purpose is to reorder the
 table.
 The same technique used by pg_reorg (aka table coupled with triggers) could
 lower the lock access of the table.
 Also, it could be possible to control each sub-operation in the same fashion
 as CREATE INDEX CONCURRENTLY.
 By the way, whatever the operation used, VACUUM or CLUSTER, I have a couple
 of doubts:
 1) isn't it too costly for a core operation, as pg_reorg really needs many
 temporary objects? It could be possible to reduce the number of objects
 created if added to core though...
 2) Do you think the current CLUSTER is enough and are there wishes to
 implement such an optimization directly in core?


For me, the Postgres user interface should include
* REINDEX CONCURRENTLY
* CLUSTER CONCURRENTLY
* ALTER TABLE CONCURRENTLY
and also that autovacuum would be expanded to include REINDEX and
CLUSTER, renaming it to automaint.

The actual implementation mechanism for those probably looks something
like pg_reorg, but I don't see it as preferable to include the utility
directly into core, though potentially some of the underlying code
might be.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Peter Eisentraut
On 9/24/12 10:13 AM, Tom Lane wrote:
 FWIW, what I found out last time I touched this code is that on many
 systems setlocale doesn't bother to return a canonicalized spelling;
 it just gives back the string you gave it.  It might be worth doing
 what Peter suggests, just to be consistent with what we are doing
 elsewhere, but I'm not sure how much it will help.

It might not have anything to do with the current problem, but if initdb
canonicalizes locale names, then pg_upgrade also has to.  Otherwise,
whenever an operating system changes its locale canonicalization rules,
pg_upgrade will fail.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] External Replication

2012-09-24 Thread m...@rpzdesign.com

Dimitri:

Thanks for the response.

I am resigned to just patching each major release with my own pile of
hook code that I can quickly graft into the code base, currently
grafting 9.2.0.

My strategy is to let the PG code base float around with all the work of
the fine hackers on this list, maybe debate a couple of things with some
friendly types, but really just put my effort into the logic of an
external replication daemon which will NOT change due to anything in the
PG core.

If one day the code base actually feeds me the event information I need,
maybe I will change it.

And I have not seen anybody request my hook code, but a few have
responded that they are working on things in the code base, release date
unknown.

Cheers,

marco

On 9/24/2012 10:20 AM, Dimitri Fontaine wrote:

m...@rpzdesign.com m...@rpzdesign.com writes:

You may want to consider changing the command TRIGGER into a command FILTER
and possibly post processing TRIGGER that
is determined to be called INSIDE the FILTER.  Or some way to pass
information between the FILTER and the post processing trigger.

The only current event supported by the system is the
ddl_command_start one. We mean to add some more, and triggers wanting
to communicate data in between ddl_command_start and ddl_command_end
(for example) will have to use something like a table.


Also, some information as to whether a series of statements was ROLLED
BACK would be helpful.

Event Triggers are not an autonomous transaction: any effect they have
in the database is rolled back when the main transaction is rolled
back. You can use LISTEN/NOTIFY or PGQ, which both know how to handle
those semantics.

Regards,




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes

2012-09-24 Thread Simon Riggs
On 24 September 2012 06:27, Andres Freund and...@2ndquadrant.com wrote:

 Problem 1: concurrency:

 Problem 2: undroppable indexes:

Thanks for posting. I'll think some more before replying.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote:
 If your operating system locale/encoding names changed after the initdb
 of the old cluster, this would not be reflected in template0.
 No, it's not changed. Look at my system settings:
 LANG=zh_CN.UTF-8
 $ cat /var/lib/locales/supported.d/local
 zh_CN.UTF-8 UTF-8
 
 I think the problem is in the options I chose when I installed pgsql (both
 9.1 and 9.2):
 Select the locale to be used by the new database cluster.
 
 Locale
 
 [1] [Default locale]
 [2] C
 [3] POSIX
 [4] zh_CN.utf8
 [5] zh_HK.utf8
 [6] zh_SG.utf8
 [7] zh_TW.utf8
 Please choose an option [1] : 4
 I chose 4 instead of 1. I guess the default locale (option 1) is with a dash.

Well, if you run that query on template0 in the old and new cluster, you
will see something different in the two of them.  Could you have used
default in one and a non-dash in the other?  Did we change the way we
canonicalize the locale between 9.1 and 9.2?

I can send you a patch to test if the setlocale canonicalization works. 
Can you test it if I send it?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XLogInsert scaling, revisited

2012-09-24 Thread Fujii Masao
On Fri, Sep 21, 2012 at 12:29 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 I've been slowly continuing the work that I started last winter to make
 XLogInsert scale better. I have tried quite a few different approaches since
 then, and have settled on the attached. This is similar but not exactly the
 same as what I did in the patches I posted earlier.

 The basic idea, like before, is to split WAL insertion into two phases:

 1. Reserve the right amount of WAL. This is done while holding just a
 spinlock. Thanks to the changes I made earlier to the WAL format, the space
 calculations are now much simpler and the critical section boils down to
 almost just CurBytePos += size_of_wal_record. See
 ReserveXLogInsertLocation() function.

 2. Copy the WAL record to the right location in the WAL buffers. This slower
 part can be done mostly in parallel.

 The difficult part is tracking which insertions are currently in progress,
 and being able to wait for an insertion to finish copying the record data in
 place. I'm using a small number (7 at the moment) of WAL insertion slots for
 that. The first thing that XLogInsert does is to grab one of the slots. Each
 slot is protected by a LWLock, and XLogInsert reserves a slot by acquiring
 its lock. It holds the lock until it has completely finished copying the WAL
 record in place. In each slot, there's an XLogRecPtr that indicates how far
 the current inserter has progressed with its insertion. Typically, for a
 short record that fits on a single page, it is updated after the insertion
 is finished, but if the insertion needs to wait for a WAL buffer to become
 available, it updates the XLogRecPtr before sleeping.

 To wait for all insertions up to a point to finish, you scan all the
 insertion slots, and observe that the XLogRecPtrs in them are >= the point
 you're interested in. The number of slots is a tradeoff: more slots allow
 more concurrency in inserting records, but makes it slower to determine how
 far it can be safely flushed.
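
To make the reservation phase concrete, here is a toy sketch (the names
and struct are illustrative only, not the actual patch code; it assumes
the usual backend headers): the whole critical section is one addition
under a spinlock.

#include "storage/spin.h"	/* sketch in backend style, assumes postgres.h */

typedef struct
{
	slock_t		insertpos_lck;	/* protects CurBytePos */
	uint64		CurBytePos;		/* next byte position in WAL to hand out */
} WALReserveState;

static uint64
ReserveXLogSpace(WALReserveState *state, uint64 size)
{
	uint64		startpos;

	SpinLockAcquire(&state->insertpos_lck);
	startpos = state->CurBytePos;
	state->CurBytePos += size;	/* the whole critical section */
	SpinLockRelease(&state->insertpos_lck);

	return startpos;			/* caller copies its record at startpos */
}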

 I did some performance tests with this, on an 8-core HP Proliant server, in
 a VM running under VMware vSphere 5.1. The tests were performed with Greg
 Smith's pgbench-tools kit, with one of two custom workload scripts:

 1. Insert 1000 rows in each transaction. This is exactly the sort of
 workload where WALInsertLock currently becomes a bottleneck. Without the
 patch, the test scales very badly, with about 420 TPS with a single client,
 peaking only at 520 TPS with two clients. With the patch, it scales up to
 about 1200 TPS, with 7 clients. I believe the test becomes I/O limited at
 that point; looking at iostat output while the test is running shows about
 200MB/s of writes, and that is roughly what the I/O subsystem of this
 machine can do, according to a simple test with 'dd ...; sync'. Or perhaps
 having more insertion slots would allow it to go higher - the patch uses
 exactly 7 slots at the moment.

 http://hlinnaka.iki.fi/xloginsert-scaling/results-1k/

 2. Insert only 10 rows in each transaction. This simulates an OLTP workload
 with fairly small transactions. The patch doesn't make a huge difference
 with that workload. It performs somewhat worse with 4-16 clients, but then
 somewhat better with  16 clients. The patch adds some overhead to flushing
 the WAL, I believe that's what's causing the slowdown with 4-16 clients. But
 with more clients, the WALInsertLock bottleneck becomes more significant,
 and you start to see a benefit again.

 http://hlinnaka.iki.fi/xloginsert-scaling/results-10/

 Overall, the results look pretty good. I'm going to take a closer look at
 the slowdown in the second test. I think it might be fixable with some
 changes to how WaitInsertionsToFinish() and WALWriteLock work together,
 although I'm not sure how exactly it ought to work.

 Comments, ideas?

Sounds good.

The patch applied cleanly and compiled successfully.
But when I ran initdb, I got the following assertion error:

--
$ initdb -D data --locale=C --encoding=UTF-8
The files belonging to this database system will be owned by user postgres.
This user must also own the server process.

The database cluster will be initialized with locale C.
The default text search configuration will be set to english.

creating directory data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... TRAP: FailedAssertion("!(((uint64)
currpos) % 8192 >= (((intptr_t) ((sizeof(XLogPageHeaderData))) + ((8)
- 1)) & ~((intptr_t) ((8) - 1))) || rdata_len == 0)", File: "xlog.c",
Line: 1363)
sh: line 1: 29537 Abort trap: 6   /dav/hoge/bin/postgres
--single -F -O -c search_path=pg_catalog -c exit_on_error=true
template1  /dev/null
child process exited with exit code 134

Re: [HACKERS] [PoC] load balancing in libpq

2012-09-24 Thread Satoshi Nagayasu

2012/09/24 1:07, Christopher Browne wrote:

We historically have connection pooling as an external thing; with the
high degree to which people keep implementing and reimplementing this, I
think *something* more than we have ought to be built in.

This, with perhaps better implementation, might be an apropos start.

Parallel with LDAP: it takes very much this approach, where
configuration typically offers a list of LDAP servers.  I am not certain
if OpenLDAP does round robin on the list, or if it tries targets in
 order, stopping when it succeeds.  A decent debate fits in there.

I could see this being implemented instead via something alongside
PGSERVICE; that already offers a well-defined way to capture a
registry of connection configuration.  Specifying a list of service
names would allow the command line configuration to remain short and yet
very flexible.


Thanks for the comment.

As you pointed out, I think it would be a starting point to
implement new, simple load-balancing stuff. That's what
I actually intended.

My clients often ask me for an easier way to take advantage of
replication and load balancing.

I know there are several considerations to be discussed,
such as the API compatibility issue, but it would be worth
having in the core (or around the core).

And I also know many people are struggling with load-balancing
and master-failover things for PostgreSQL replication.

If those people are implementing their own load-balancing
stuff in their apps again and again, it's time to consider
implementing it in core to deliver and/or leverage the potential
of PostgreSQL replication.

Regards,



On 2012-09-23 10:01 AM, Euler Taveira eu...@timbira.com wrote:

On 23-09-2012 07:50, Satoshi Nagayasu wrote:
  I have just written the first PoC code to enable load balancing
  in the libpq library.
 
Your POC is totally broken. Just to point out two problems: (i)
semicolon (;)
is a valid character for any option in the connection string and
(ii) you
didn't think about PQsetdb[Login](), PQconnectdbParams() and
PQconnectStartParams(). If you want to pursue this idea, you should
think of a way to support the same option multiple times (one idea is
host1, host2, etc).

Isn't it easier to add support on your application or polling software?


--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Satoshi Nagayasu

2012/09/25 0:15, Simon Riggs wrote:

On 21 September 2012 08:42, Michael Paquier michael.paqu...@gmail.com wrote:



On Fri, Sep 21, 2012 at 1:00 PM, Hitoshi Harada umi.tan...@gmail.com
wrote:


I'm not familiar with pg_reorg, but I wonder why we need a separate
program for this task.  I know pg_reorg is ok as an external program
per se, but if we could optimize CLUSTER (or VACUUM which I'm a little
pessimistic about) in the same way, it's much nicer than having
additional binary + extension.  Isn't it possible to do the same thing
above within the CLUSTER command?  Maybe CLUSTER .. CONCURRENTLY?


CLUSTER might be more adapted in this case as the purpose is to reorder the
table.
The same technique used by pg_reorg (aka table coupled with triggers) could
lower the lock access of the table.
Also, it could be possible to control each sub-operation in the same fashion
as CREATE INDEX CONCURRENTLY.
By the way, whatever the operation used, VACUUM or CLUSTER, I have a couple
of doubts:
1) isn't it too costly for a core operation, as pg_reorg really needs many
temporary objects? It could be possible to reduce the number of objects
created if added to core though...
2) Do you think the current CLUSTER is enough and are there wishes to
implement such an optimization directly in core?



For me, the Postgres user interface should include
* REINDEX CONCURRENTLY
* CLUSTER CONCURRENTLY
* ALTER TABLE CONCURRENTLY
and also that autovacuum would be expanded to include REINDEX and
CLUSTER, renaming it to automaint.

The actual implementation mechanism for those probably looks something
like pg_reorg, but I don't see it as preferable to include the utility
directly into core, though potentially some of the underlying code
might be.


I think it depends on what trade-off we can see.

AFAIK, rebuilding tables and/or indexes basically has
a trade-off between locking and disk space.

So, if we have enough disk space to build a temporary
table/index when rebuilding a table/index, CONCURRENTLY
would be a great option, and I would love to have it
in core.

Regards,
--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Well, if you run that query on template0 in the old and new cluster, you
  will see something different in the two of them.  Could you have used
  default in one and a non-dash in the other?  Did we change the way we
  canonicalize the locale between 9.1 and 9.2?
 
 IIRC, we didn't try to canonicalize locale names at all before 9.2.
 That initdb code you're quoting is of fairly recent vintage.

OK, I have developed two patches.  

The first fixes the problem of TOAST tables having OIDs >=
FirstNormalObjectId due to recreating the information_schema as outlined
in the 9.1 release notes.  In fact, there are several cases this fixes,
but information_schema was the one reported.  The basic problem is that
TOAST tables can't be restricted by schema --  you have to gather the
relations, and then get their TOAST tables.  The good news is that
pg_upgrade caught its own bug and threw an error.

I was able to test this patch by testing the information_schema
recreation, and I checked to see the regression database had the
expected info.c relation count.

The second patch canonicalizes the old cluster's collation and ctype
values pulled from the template0 database.  

I was unable to recreate the fix on my Debian Squeeze system.  Can someone
suggest a way?  I updated pg_database on the old 9.1 cluster to be
en_US.UTF8, while the new cluster defaults to en_US.UTF-8, but pg_upgrade
kept them the same after the setlocale() call and pg_upgrade threw a
mismatch error.
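
For reference, the canonicalization the second patch does can be sketched
like this (an illustrative helper, not the exact patch code; note Tom's
caveat upthread that on many systems setlocale() just echoes back the
spelling you gave it):

#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative sketch: ask setlocale() for its preferred spelling of a
 * locale name.  Returns a malloc'd canonical name, or a copy of the
 * input if the locale is unknown; error checks are omitted for brevity.
 */
static char *
canonicalize_locale_name(int category, const char *name)
{
	char	   *save = strdup(setlocale(category, NULL));	/* current locale */
	char	   *res = setlocale(category, name);			/* probe the name */
	char	   *canon = strdup(res ? res : name);

	setlocale(category, save);	/* restore the caller's locale */
	free(save);
	return canon;
}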

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
new file mode 100644
index 74b13e7..9d08f41
*** a/contrib/pg_upgrade/info.c
--- b/contrib/pg_upgrade/info.c
*** get_rel_infos(ClusterInfo *cluster, DbIn
*** 269,302 ****
  	 */
  
  	snprintf(query, sizeof(query),
! 			 "SELECT c.oid, n.nspname, c.relname, "
! 			 "	c.relfilenode, c.reltablespace, %s "
  			 "FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n "
  			 "	   ON c.relnamespace = n.oid "
! 			 "  LEFT OUTER JOIN pg_catalog.pg_tablespace t "
! 			 "	   ON c.reltablespace = t.oid "
! 			 "WHERE relkind IN ('r','t', 'i'%s) AND "
  	/* exclude possible orphaned temp tables */
  			 "  ((n.nspname !~ '^pg_temp_' AND "
  			 "    n.nspname !~ '^pg_toast_temp_' AND "
! 			 "    n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND "
  			 "	  c.oid >= %u) "
  			 "  OR (n.nspname = 'pg_catalog' AND "
! 			 "    relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) )) "
! 	/* we preserve pg_class.oid so we sort by it to match old/new */
! 			 "ORDER BY 1;",
! 	/* 9.2 removed the spclocation column */
! 			 (GET_MAJOR_VERSION(cluster->major_version) <= 901) ?
! 			 "t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid) AS spclocation",
  	/* see the comment at the top of old_8_3_create_sequence_script() */
  			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 803) ?
  			 "" : ", 'S'",
- 	/* this oid allows us to skip system toast tables */
  			 FirstNormalObjectId,
  	/* does pg_largeobject_metadata need to be migrated? */
  			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ?
  			 "" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'");
  
  	res = executeQueryOrDie(conn, "%s", query);
  
  	ntups = PQntuples(res);
--- 269,327 ----
  	 */
  
  	snprintf(query, sizeof(query),
! 			 "CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid "
  			 "FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n "
  			 "	   ON c.relnamespace = n.oid "
! 			 "WHERE relkind IN ('r', 'i'%s) AND "
  	/* exclude possible orphaned temp tables */
  			 "  ((n.nspname !~ '^pg_temp_' AND "
  			 "    n.nspname !~ '^pg_toast_temp_' AND "
! 	/* skip pg_toast because toast indexes have relkind == 'i', not 't' */
! 			 "    n.nspname NOT IN ('pg_catalog', 'information_schema', "
! 			 "    		'binary_upgrade', 'pg_toast') AND "
  			 "	  c.oid >= %u) "
  			 "  OR (n.nspname = 'pg_catalog' AND "
! 			 "    relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) ));",
  	/* see the comment at the top of old_8_3_create_sequence_script() */
  			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 803) ?
  			 "" : ", 'S'",
  			 FirstNormalObjectId,
  	/* does pg_largeobject_metadata need to be migrated? */
  			 (GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ?
  			 "" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'");
  
+ 	PQclear(executeQueryOrDie(conn, "%s", query));
+ 
+ 	/*
+ 	 *	Get TOAST tables and indexes;  we have to gather the TOAST tables in
+ 	 *	later steps because we can't schema-qualify TOAST tables.
+ 	 */
+ 	PQclear(executeQueryOrDie(conn,
+ 			  "INSERT INTO info_rels "
+ 			  "SELECT reltoastrelid "
+ 			  "FROM info_rels i JOIN pg_catalog.pg_class c "
+ 			  "		ON i.reloid = c.oid"));
+ 	

Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Christopher Browne
On Mon, Sep 24, 2012 at 10:17 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Excerpts from Daniele Varrazzo's message of dom sep 23 22:02:51 -0300 2012:
 On Mon, Sep 24, 2012 at 12:23 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:

  As proposed by Masahiko, a single organization grouping all the tools (one
  repository per tool) would be enough. Please note that github can also host
  documentation. Bug tracker would be tool-dedicated in this case.

 From this PoV, pgFoundry allows your tool to be under
 http://yourtool.projects.postgresql.org instead of under a more
 generic namespace: I find it a nice and cozy place in the url space
 where to put your project. If pgFoundry will be dismissed I hope at
 least a hosting service for static pages will remain.

 I don't think that has been offered.

But I don't think it's necessarily the case that pgFoundry is getting
dismissed, either.

I got a note from Marc Fournier not too long ago (sent to some
probably-not-small set of people with pgFoundry accounts) indicating
that they were planning to upgrade gForge as far as they could, and
then switch to FusionForge http://fusionforge.org/, which is
evidently the successor.  It shouldn't be assumed that the upgrade
process will be easy or quick.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuration include directory

2012-09-24 Thread Gavin Flower

On 25/09/12 02:41, Heikki Linnakangas wrote:

On 24.09.2012 17:24, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

This seems pretty much ready to commit. One tiny detail that I'd like to
clarify: the docs say:

Multiple files within an include directory are ordered by an
alphanumeric sorting, so that ones beginning with numbers are
considered before those starting with letters.



To be more precise, the patch uses strcmp() for the comparisons.


Just say it sorts the file names according to C locale rules.


Hmm, that's precise, but I don't think an average user necessarily
knows what the C locale is. I think I'll go with:


Multiple files within an include directory are processed in filename
order. The filenames are ordered by C locale rules, i.e. numbers before
letters, and uppercase letters before lowercase ones.


- Heikki



Even I can understand that!  :-)

More to the point: are full stops '.' sorted before digits?
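
In C-locale byte order the answer is yes: '.' is 0x2E, which sorts before
the digits 0x30-0x39. A self-contained demo, with made-up file names:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int
by_strcmp(const void *a, const void *b)
{
	return strcmp(*(const char *const *) a, *(const char *const *) b);
}

int
main(void)
{
	/* made-up names, just to show the ordering */
	const char *names[] = {"zz.conf", "10-main.conf", "Alpha.conf", ".hidden"};
	int			i;

	qsort(names, 4, sizeof(const char *), by_strcmp);
	for (i = 0; i < 4; i++)
		printf("%s\n", names[i]);	/* .hidden, 10-main.conf, Alpha.conf, zz.conf */
	return 0;
}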


Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuration include directory

2012-09-24 Thread Noah Misch
On Thu, Sep 20, 2012 at 02:10:58PM -0700, Selena Deckelmann wrote:
 The only thing I didn't do that Noah suggested was run pgindent on
 guc-file.l. A cursory search did not reveal source compatible with my
 operating system for 'indent'. If someone points me to it, I'd happily
 also comply with the request to reindent. And document how to do that
 on my platform(s). :)

For future reference, src/tools/pgindent/README points to the pg_bsd_indent
sources.  If pg_bsd_indent fails to build and run, post the details.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] External Replication

2012-09-24 Thread Alvaro Herrera
Excerpts from m...@rpzdesign.com's message of lun sep 24 14:24:31 -0300 2012:

 And I have not seen anybody request my hook code but a few have 
 responded that the are working
 on things in the code base, release date unknown.

Well, typically that's not how our development works -- people here
don't *request* your changes.  Instead, you submit them for inclusion,
people here criticize them a lot, it morphs from feedback (if you have
the energy and a strong enough skin), and eventually after much pain and
many tears it gets committed.

The result is typically much more solid than whatever you can build in
your own dark corner, but of course it takes longer.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-09-24 Thread Boszormenyi Zoltan

Hi,

On 2012-09-22 20:49, Tom Lane wrote:

Boszormenyi Zoltan z...@cybertec.at writes:

new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.


I hope you won't find this one a mess. I tried to address all your complaints.


[rather long diatribe on modifying PGSemaphoreLock improperly]


I have returned to the previous version that used PGSemaphoreTimedLock
and this time I save and restore errno around calling the timeout condition
checker.


The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  [...]


lmgrtimeout is no more, back to hardcoding things.


   The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).


I have added enable_multiple_timeouts() and disable_multiple_timeouts()
that minimize the number of setitimer() calls.


There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which both have specific references to deadlocks
being the reason for failure.


I modified the comment in question. I hope the wording is right.


Also, the per statement mode for lock timeout doesn't seem to be
any such thing, because it's implemented like this:

+			case LOCK_TIMEOUT_PER_STMT:
+				enable_timeout_at(LOCK_TIMEOUT,
+								  TimestampTzPlusMilliseconds(
+										GetCurrentStatementStartTimestamp(),
+										LockTimeout));
+				break;

That doesn't provide anything like "you can spend at most N milliseconds
waiting for locks during a statement".  What it is is "if you happen to be
waiting for a lock N milliseconds after the statement starts, or if you
attempt to acquire any lock more than N milliseconds after the statement
starts, you lose instantly".  I don't think that definition actually adds
any useful functionality compared to setting statement_timeout to N
milliseconds, and it's certainly wrongly documented.  To do what the
documentation implies would require tracking and adding up the time spent
waiting for locks during a statement.  Which might be a good thing to do,
especially if the required gettimeofday() calls could be shared with what
timeout.c probably has to do anyway at start and stop of a lock wait.
But this code doesn't do it.


The code now properly accumulates the time spent waiting for
LOCK_TIMEOUT. This means that if STATEMENT_TIMEOUT and
LOCK_TIMEOUT are set to the same value, STATEMENT_TIMEOUT
will trigger first, because it considers the time as one contiguous
unit, while LOCK_TIMEOUT only accounts for the time spent waiting,
not the time spent on useful work. This means that LOCK_TIMEOUT
doesn't need any special code in its handler function; it's a NOP.
The relation between timeouts is handled only by the timeout.c module.
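
To illustrate the accounting (hypothetical names, not the patch's actual
API): only the intervals spent blocked are summed, and the sum is
compared against the limit.

/*
 * Hypothetical sketch of per-statement lock-wait accounting, in backend
 * style (assumes the usual postgres.h environment).  TimestampTz values
 * are microseconds; LockTimeout is the GUC, in milliseconds.
 */
static int64 lock_wait_accum = 0;	/* reset to 0 at statement start */

static void
account_lock_wait(TimestampTz wait_start, TimestampTz wait_end)
{
	lock_wait_accum += (int64) (wait_end - wait_start);

	if (LockTimeout > 0 && lock_wait_accum >= (int64) LockTimeout * 1000)
		ereport(ERROR,
				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
				 errmsg("canceling statement due to lock timeout")));
}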


Lastly, I'm not sure where is the best place to be adding the control
logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
somewhere under storage/lmgr/, no?


The above change means that there is no control logic outside of
storage/lmgr now.

I temporarily abandoned the idea of detailed error reporting on
the object type and name/ID. WaitOnLock() reports "canceling statement
due to lock timeout" and LockAcquire() kept its previous semantics.
This can be quickly revived if there is demand; it would be another ~15K patch.

I hope you can find another time slot in this CF to review this one.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



2-lock_timeout-v26.patch.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PoC] load balancing in libpq

2012-09-24 Thread Simon Riggs
On 23 September 2012 05:50, Satoshi Nagayasu sn...@uptime.jp wrote:

 I have just written the first PoC code to enable load balancing
 in the libpq library.

 This libpq enhancement is intended to allow PostgreSQL users to
 take advantage of replication in an easier way.

 With using this patch, PQconnectdb() function accepts multiple
 connection info strings, and pick one of them by round-robin basis
 to connect.

It's certainly worth thinking about. New ideas are good, if they have
some justification.

A load balancing solution that only works at connection level isn't
much of a solution, IMHO.

If you have multiple connection strings in the client program, it's
trivial to put them in an array and pick one randomly, then use that.
So this is already available as a solution in user space.
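
Something like this, for instance (host names and database are
placeholders):

#include <stdlib.h>
#include <libpq-fe.h>

/*
 * User-space sketch: pick one of several conninfo strings at random and
 * connect with stock libpq; a real client would fall back to the other
 * candidates on failure.
 */
static PGconn *
connect_any(void)
{
	static const char *const conninfos[] = {
		"host=db1.example.com dbname=app",
		"host=db2.example.com dbname=app",
		"host=db3.example.com dbname=app",
	};
	int			n = sizeof(conninfos) / sizeof(conninfos[0]);
	PGconn	   *conn = PQconnectdb(conninfos[rand() % n]);

	if (PQstatus(conn) != CONNECTION_OK)
	{
		PQfinish(conn);
		return NULL;
	}
	return conn;
}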

Also, any client based solution presumes we can ship lists of cluster
metadata to clients, which we have no solution for, so it places a
much greater burden on the administrator. Load balancing config needs
to be held more centrally.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Josh Berkus

 For me, the Postgres user interface should include
 * REINDEX CONCURRENTLY

I don't see why we don't have REINDEX CONCURRENTLY now.  When I was
writing out the instructions for today's update, I was thinking we
already have all the commands for this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Simon Riggs
On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote:

 For me, the Postgres user interface should include
 * REINDEX CONCURRENTLY

 I don't see why we don't have REINDEX CONCURRENTLY now.

Same reason for everything on (anyone's) TODO list.

Lack of vision is not holding us back, we just need the vision to realise it.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Josh Berkus
On 9/24/12 3:43 PM, Simon Riggs wrote:
 On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote:

 For me, the Postgres user interface should include
 * REINDEX CONCURRENTLY

 I don't see why we don't have REINDEX CONCURRENTLY now.
 
 Same reason for everything on (anyone's) TODO list.

Yes, I'm just pointing out that it would be a very small patch for
someone, and that AFAIK it didn't make it on the TODO list yet.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Oid registry

2012-09-24 Thread Andrew Dunstan
This rather overdue mail arises out of the developer's meeting back in May,
where we discussed an item I raised suggesting an Oid registry.


The idea came from some difficulties I encountered when I did the 
backport of the JSON work we did in 9.2 to 9.1, but has wider 
application. Say someone writes an extension that defines type A. You 
want to write an extension that takes advantage of that type, but it's 
difficult if you don't know the type's Oid, and of course currently 
there is no way of knowing for sure what it is unless it's a builtin 
type. So the proposal is to have an Oid registry, in which authors could 
in effect reserve an Oid (or a couple of Oids) for a type. We would 
guarantee that these Oids would be reserved in just the same way Oids 
for builtins are reserved, and #define symbolic constants for the 
reserved Oids. To make that viable, we'd need to extend the CREATE 
commands for any objects we could reserve Oids for to allow for the 
specification of the Oids, somewhat like:


   CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID =
   234567);

I'm not sure what objects we would need this for other than types, but 
CREATE TYPE would be a good start anyway.


Another suggestion that was made during the discussion was that we 
should also reserve a range of Oids for private use, and guarantee that 
those would never be allocated, somewhat analogously to RFC1918 IP 
addresses.


thoughts?

If there is general agreement I want to get working on this fairly soon.

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Andres Freund
On Tuesday, September 25, 2012 12:55:35 AM Josh Berkus wrote:
 On 9/24/12 3:43 PM, Simon Riggs wrote:
  On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote:
  For me, the Postgres user interface should include
  * REINDEX CONCURRENTLY
  
  I don't see why we don't have REINDEX CONCURRENTLY now.
  
  Same reason for everything on (anyone's) TODO list.
 
 Yes, I'm just pointing out that it would be a very small patch for
 someone, and that AFAIK it didn't make it on the TODO list yet.
It's not *that* small.

1. You need more than you can do with CREATE INDEX CONCURRENTLY and DROP INDEX 
CONCURRENTLY because the index can e.g. be referenced by a foreign key 
constraint. So you need to replace the existing index oid with a new one by 
swapping the relfilenodes of both after verifying several side conditions 
(indcheckxmin, indisvalid, indisready).

It would probably have to look like:

- build new index with indisready = false
- newindex.indisready = true
- wait
- newindex.indisvalid = true
- wait
- swap(oldindex.relfilenode, newindex.relfilenode)
- oldindex.indisvalid = false
- wait
- oldindex.indisready = false
- wait
- drop new index with old relfilenode

Every wait indicates an externally visible state which you might encounter/need 
to clean up...

To make it viable to use that systemwide it might be necessary to batch the 
individual steps together for multiple indexes because all that waiting is 
going to suck if you do it for every single table in the database while you 
also have long-running queries...

2. no support for CONCURRENTLY on system tables (not easy for shared catalogs)

3. no support for the indexes of exclusion constraints (not hard I think)

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes

2012-09-24 Thread Andres Freund
Hi,

On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
 Hi,
 
 Problem 1: concurrency:
 
 Testcase:
 
 Session 1:
 CREATE TABLE test_drop_concurrently(id serial primary key, data int);
 INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
 10);
 CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
 BEGIN;
 EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
 SELECT * FROM test_drop_concurrently WHERE data = 34343;
 (1 row)
 
 Session 2:
 BEGIN;
 SELECT * FROM test_drop_concurrently WHERE data = 34343;
 
 Session 3:
 DROP INDEX CONCURRENTLY test_drop_concurrently_data;
 (in-progress)
 
 Session 2:
 INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
 10);
 COMMIT;
 
 Session 1:
 SELECT * FROM test_drop_concurrently WHERE data = 34343;
 (1 row)
 SET enable_bitmapscan = false;
 SET enable_indexscan = false;
 SELECT * FROM test_drop_concurrently WHERE data = 34343;
 (2 rows)
 
 Explanation:
 index_drop does:
  indexForm->indisvalid = false;  /* make unusable for queries */
  indexForm->indisready = false;  /* make invisible to changes */
 
 Setting indisready = false is problematic because that prevents index
 updates, which in turn breaks READ COMMITTED semantics. I think there needs
 to be one more phase that waits for concurrent users of the index to
 finish before setting indisready = false.
The attached patch fixes this issue. Haven't looked at the other one in detail 
yet.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 8891fcd59496483793aecc21a096fc0119369e73 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 25 Sep 2012 01:41:29 +0200
Subject: [PATCH] Fix concurrency issues in concurrent index drops

Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and*
indisready. That's problematic if some transaction is still looking at the
index and another transaction makes changes. See the example below.

Now we do the drop in three stages, just as in a concurrent index build. First
unset indisvalid, wait, unset indisready, wait, drop the index.

Example:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
10);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)
---
 src/backend/catalog/index.c | 99 ++---
 1 file changed, 84 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a61b424..3e1794f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1318,6 +1318,10 @@ index_drop(Oid indexId, bool concurrent)
 	 * table lock strong enough to prevent all queries on the table from
 	 * proceeding until we commit and send out a shared-cache-inval notice
 	 * that will make them update their index lists.
+	 *
+	 * In the concurrent case we make sure that nobody can be looking at the
+	 * indexes by dropping the index in multiple steps, so we don't need a
+	 * full-fledged AccessExclusiveLock yet.
 	 */
 	heapId = IndexGetRelation(indexId, false);
 	if (concurrent)
@@ -1338,7 +1342,19 @@ index_drop(Oid indexId, bool concurrent)
 
 	/*
 	 * Drop Index concurrently is similar in many ways to creating an index
-	 * concurrently, so some actions are similar to DefineIndex()
+	 * concurrently, so some actions are similar to DefineIndex() just in the
+	 * reverse order.
+	 *
+	 * First we unset indisvalid so queries starting afterwards don't use the
+	 * index to answer queries anymore. We have to keep indisready = true
+	 * though so transactions that are still using the index can continue to
+	 * see valid index contents. E.g. when they are using READ COMMITTED mode,
+	 * and another transaction that started later makes changes and
+	 * commits, they need to see those new tuples in the index.
+	 *
+	 * After all transactions that could possibly have used it for queries
+	 * ended we can unset indisready and wait till nobody could be updating it
+	 * anymore.
 	 */
 	if (concurrent)
 	{
@@ -1357,21 +1373,14 @@ index_drop(Oid indexId, bool concurrent)
  			elog(ERROR, "cache lookup failed for index %u", indexId);
 		

[HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Brian Weaver
While researching the way streaming replication works I was examining
the construction of the tar file header. By comparing documentation on
the tar header format from various sources, I am certain the following
patch should be applied so the group identifier is put into the
header properly.

While I realize that wikipedia isn't always the best source of
information, the header offsets seem to match the other documentation
I've found. The format is just easier to read on wikipedia

http://en.wikipedia.org/wiki/Tar_(file_format)#File_header

Here is the trivial patch:

diff --git a/src/backend/replication/basebackup.c
b/src/backend/replication/basebackup.c
index 4aaa9e3..524223e 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -871,7 +871,7 @@ _tarWriteHeader(const char *filename, const char
*linktarget,
sprintf(&h[108], "%07o ", statbuf->st_uid);

/* Group 8 */
-   sprintf(&h[117], "%07o ", statbuf->st_gid);
+   sprintf(&h[116], "%07o ", statbuf->st_gid);

/* File size 12 - 11 digits, 1 space, no NUL */
if (linktarget != NULL || S_ISDIR(statbuf->st_mode))
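
For anyone checking against the spec, the numeric fields around that
offset lay out as below; the snippet just illustrates the layout and is
not part of the patch:

/*
 * ustar numeric fields (offset into the 512-byte header, length in
 * bytes); each is written as zero-padded octal:
 *
 *   100  mode  (8)
 *   108  uid   (8)
 *   116  gid   (8)    <-- hence &h[116], not &h[117]
 *   124  size  (12)
 */
sprintf(&h[100], "%07o ", (int) (statbuf->st_mode & 07777));
sprintf(&h[108], "%07o ", statbuf->st_uid);
sprintf(&h[116], "%07o ", statbuf->st_gid);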


-- Brian

-- 

/* insert witty comment here */


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL in the French news

2012-09-24 Thread Michael Paquier
On Mon, Sep 24, 2012 at 10:37 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:


 Dear devs,

 For your information, PostgreSQL appears explicitly in the list of
 Free software cited by the French Prime Minister, Jean-Marc Ayrault, in
 his Circulaire (i.e. a kind of instruction to civil servants) signed
 last week. See pages 9 and 18 of the PDF (not the page numbers printed in
 the document, which are shifted). Obviously, the document is in French...

 http://circulaire.legifrance.gouv.fr/pdf/2012/09/cir_35837.pdf

 The spirit of the document does not seem to make the use of free software
 mandatory, but to require that it is at least considered for any project in
 which the administration (i.e. the public sector, not the government as
 in the USA) is involved.

+1. Such news is nice for Postgres.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes

2012-09-24 Thread Andres Freund
On Monday, September 24, 2012 01:37:59 PM Andres Freund wrote:
 On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
  Problem 2: undroppable indexes:
  
 
  Session 1:
  CREATE TABLE test_drop_concurrently(id serial primary key, data int);
  CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
  BEGIN;
  EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
 
  
 
  Session 2:
  DROP INDEX CONCURRENTLY test_drop_concurrently_data;
  waiting
  ^CCancel request sent
  ERROR:  canceling statement due to user request
 
  
 
  Session 1:
  ROLLBACK;
  DROP TABLE test_drop_concurrently;
  SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass,
  indisvalid, indisready FROM pg_index WHERE indexrelid =
  'test_drop_concurrently_data'::regclass;
 
   indexrelid | indrelid |         indexrelid          | indrelid | indisvalid | indisready
  ------------+----------+-----------------------------+----------+------------+------------
        24703 |    24697 | test_drop_concurrently_data |    24697 | f          | f
  (1 row)
 
  
 
  DROP INDEX test_drop_concurrently_data;
  ERROR:  could not open relation with OID 24697
 
  
 
  Haven't looked at this one at all.
 
 That's because it has to commit transactions in between to make the catalog
 changes visible. Unfortunately at that point it has already deleted the
 dependencies in deleteOneObject...
Seems to be solvable with some minor reshuffling in deleteOneObject. We can
only perform the deletion of outgoing pg_depend records *after* we have dropped
the object with doDeletion in the concurrent case. As the actual drop of the
relation happens in the same transaction that will then go on to drop the
dependency records, that seems to be fine.
I don't see any problems with that right now, will write a patch tomorrow. We
will see if that's problematic...

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Brian Weaver
Actually I found one other issue while continuing my investigation.
The insertion of the 'ustar' magic and the '00' version string puts the
version at the wrong offset. The patch is attached.
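
For reference, the field layout of a POSIX ustar header block is sketched
below (offsets per the spec; this is illustrative, not code from
basebackup.c):

/* POSIX ustar header layout (offsets per the spec; for reference only) */
struct ustar_header
{
	char		name[100];		/* offset   0 */
	char		mode[8];		/* offset 100 */
	char		uid[8];			/* offset 108 */
	char		gid[8];			/* offset 116 -- not 117 */
	char		size[12];		/* offset 124 */
	char		mtime[12];		/* offset 136 */
	char		chksum[8];		/* offset 148 */
	char		typeflag;		/* offset 156 */
	char		linkname[100];	/* offset 157 */
	char		magic[6];		/* offset 257, "ustar" plus NUL */
	char		version[2];		/* offset 263, "00" */
	char		uname[32];		/* offset 265 */
	char		gname[32];		/* offset 297 */
	char		devmajor[8];	/* offset 329 */
	char		devminor[8];	/* offset 337 */
	char		prefix[155];	/* offset 345 */
};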

-- Brian

On Mon, Sep 24, 2012 at 7:51 PM, Brian Weaver cmdrcluel...@gmail.com wrote:
 While researching the way streaming replication works I was examining
 the construction of the tar file header. By comparing documentation on
 the tar header format from various sources I am certain the following
 patch should be applied so the group identifier is put into the
 header properly.

 While I realize that wikipedia isn't always the best source of
 information, the header offsets seem to match the other documentation
 I've found. The format is just easier to read on wikipedia

 http://en.wikipedia.org/wiki/Tar_(file_format)#File_header

 Here is the trivial patch:

 diff --git a/src/backend/replication/basebackup.c
 b/src/backend/replication/basebackup.c
 index 4aaa9e3..524223e 100644
 --- a/src/backend/replication/basebackup.c
 +++ b/src/backend/replication/basebackup.c
 @@ -871,7 +871,7 @@ _tarWriteHeader(const char *filename, const char
 *linktarget,
 sprintf(&h[108], "%07o ", statbuf->st_uid);

 /* Group 8 */
 -   sprintf(&h[117], "%07o ", statbuf->st_gid);
 +   sprintf(&h[116], "%07o ", statbuf->st_gid);

 /* File size 12 - 11 digits, 1 space, no NUL */
 if (linktarget != NULL || S_ISDIR(statbuf->st_mode))


 -- Brian

 --

 /* insert witty comment here */



-- 

/* insert witty comment here */


pg-tar.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Brian Weaver
Um, I apologize for the third e-mail on the topic. It seems that my
C coding is a bit rusty from years of neglect. No sooner had I hit the
send button than I realized that trying to embed a null character in a
string might not work, especially when it's followed by two
consecutive zeros.

Here is a safer fix which is more in line with the rest of the code in
the file. I guess this is what I get for being clever.

Patch is attached
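
The idea, roughly (an illustrative sketch only; the actual fix is in the
attached patch, whose contents are not shown here):

#include <string.h>

/*
 * Write the ustar magic and version fields with memcpy rather than
 * sprintf, which would stop at the embedded NUL.  'h' points at the
 * 512-byte tar header block.
 */
static void
write_ustar_magic(char *h)
{
	memcpy(&h[257], "ustar", 6);	/* "ustar" plus its NUL terminator */
	memcpy(&h[263], "00", 2);		/* version: two ASCII '0's, no NUL */
}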

-- 

/* insert witty comment here */


pg-tar.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Rural Hunter

On 2012/9/24 22:57, Bruce Momjian wrote:

On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote:

If your operating system locale/encoding names changed after the initdb
of the old cluster, this would not be reflected in template0.

No, it's not changed. Look at my system settings:
LANG=zh_CN.UTF-8
$ cat /var/lib/locales/supported.d/local
zh_CN.UTF-8 UTF-8

I think the problem is on the options when I installed pgsql(both
9.1 and 9.2)
Select the locale to be used by the new database cluster.

Locale

[1] [Default locale]
[2] C
[3] POSIX
[4] zh_CN.utf8
[5] zh_HK.utf8
[6] zh_SG.utf8
[7] zh_TW.utf8
Please choose an option [1] : 4
I chose 4 instead of 1. I guess the default locale (option 1) is the one with a dash.

Well, if you run that query on template0 in the old and new cluster, you
will see something different in the two of them.  Could you have used the
default in one and a non-dash locale in the other?
Yes, that's true. The upgrade is fine with both fresh installs (9.1 and 
9.2) with the option above (without a dash). The problem only happens when I 
initialized the 9.2 db with the default locale (I guess that one has the dash). 
Just wondering why the pg installer provides options without a dash.

Did we change the way we
canonicalize the locale between 9.1 and 9.2?

I can send you a patch to test if the setlocale canonicalization works.
Can you test it if I send it?





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch to note which system catalogs have oids

2012-09-24 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I think this is fundamentally wrong, or at least misleading, because it
 documents OID as if it were an ordinary column.  Somebody who did
 "select * from pg_class" and didn't see any oid in the result would
 think the docs were wrong.

Given that it's been quite some time since we defaulted to including
OIDs in tables, and the high level of confusion that individuals trying
to join pg_class and pg_namespace together go through because "select *"
does not include the oid column, I wonder if perhaps we should consider
changing that.  It might be possible to do for just the catalog tables (to
minimize the risk of breaking poorly-written applications), or to provide
a GUC to control including the oid column in "select *".

 It's possible that it's worth expending a boilerplate paragraph in each
 of those pages to say this catalog has OIDs (or that it doesn't).
 But this isn't the way.

I'm afraid I disagree with this.  The oid column in the system
catalogs is user-facing, and I like having it included as a column in the
table in the docs, so users know what to use when doing joins.
Including something in the boilerplate about it not being shown by
default (or in the description in the table) might be alright, if we
don't change that.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Oid registry

2012-09-24 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 ...  So the proposal is to have an Oid registry, in which authors could 
 in effect reserve an Oid (or a couple of Oids) for a type. We would 
 guarantee that these Oids would be reserved in just the same way Oids 
 for builtins are reserved, and #define symbolic constants for the 
 reserved Oids. To make that viable, we'd need to extend the CREATE 
 commands for any objects we could reserve Oids for to allow for the 
 specification of the Oids, somewhat like:

 CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID =
 234567);

Well, of course, it's not clear how "reserved" an OID can be if you
provide SQL syntax that allows any Joe Blow to create a type with that
OID.  The possibilities for security holes are interesting to say the
least, especially if applications blindly assume that oid 123456 *must*
be the type they think it is.

 Another suggestion that was made during the discussion was that we 
 should also reserve a range of Oids for private use, and guarantee that 
 those would never be allocated, somewhat analogously to RFC1918 IP 
 addresses.

Likewise, "would never be allocated" and "we'll provide syntax that does
it trivially" seem a bit at odds.

Another point to think about is that it's a few years too late to
guarantee that any particular OID above 16384 is unused; we can't
do that now without possibly breaking pg_upgrade-ability of existing
databases.  While we could hand out some subrange of the OIDs below
that, there's not exactly a huge amount of space available.

 If there is general agreement I want to get working on this fairly soon.

Turning this into something actually useful seems to me to be a bit
harder than meets the eye.  I'm not opposed to it in principle, but the
management aspect seems much more difficult than the technical aspect.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-24 Thread Peter Eisentraut
On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:
 This rather overdue mail arises out of the developer's meeting back in
 May, where we discussed an item I raised suggesting an Oid registry.
 
 The idea came from some difficulties I encountered when I did the 
 backport of the JSON work we did in 9.2 to 9.1, but has wider 
 application. Say someone writes an extension that defines type A. You 
 want to write an extension that takes advantage of that type, but it's
 difficult if you don't know the type's Oid, 

Could you fill the rest of us in with some technical details about why
this might be necessary and what it aims to achieve?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] plpgsql gram.y make rule

2012-09-24 Thread Peter Eisentraut
I wanted to refactor the highly redundant flex and bison rules
throughout the source into common pattern rules.  (Besides saving some
redundant code, this could also help some occasionally flaky code in
pgxs modules.)  The only outlier that breaks this is in plpgsql:

pl_gram.c: gram.y

I would like to either rename the intermediate file(s) to gram.{c,h}, or
possibly rename the source file to pl_gram.y.  Any preferences or other
comments?




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Tom Lane
Brian Weaver cmdrcluel...@gmail.com writes:
 While researching the way streaming replication works I was examining
 the construction of the tar file header. By comparing documentation on
 the tar header format from various sources I am certain the following
 patch should be applied so the group identifier is put into the
 header properly.

Yeah, this is definitely wrong.

 While I realize that wikipedia isn't always the best source of
 information, the header offsets seem to match the other documentation
 I've found. The format is just easier to read on wikipedia

The authoritative specification can be found on the "pax" page of the
POSIX spec, which is available here:
http://pubs.opengroup.org/onlinepubs/9699919799/

I agree that the 117 number is bogus, and also that the magic "ustar"
string is written incorrectly.  What's more, it appears that the latter
error has been copied from pg_dump (but the 117 seems to be just a new
bug in pg_basebackup).  I wonder what else might be wrong hereabouts :-(
Will sit down and take a closer look.

I believe what we need to do about this is:

1. fix pg_dump and pg_basebackup output to conform to spec.

2. make sure pg_restore will accept both conformant and
   previous-generation files.

Am I right in believing that we don't have any code that's expected to
read pg_basebackup output?  We just feed it to tar, no?

I'm a bit concerned about backwards compatibility issues.  It looks to
me like existing versions of pg_restore will flat out reject files that
have a spec-compliant "ustar\0" MAGIC field.  Is it going to be
sufficient if we fix this in minor-version updates, or are we going to
need to have a switch that tells pg_dump to emit the incorrect old
format?  (Ick.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql gram.y make rule

2012-09-24 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 I wanted to refactor the highly redundant flex and bison rules
 throughout the source into common pattern rules.  (Besides saving some
 redundant code, this could also help some occasionally flaky code in
 pgxs modules.)  The only outlier that breaks this is in plpgsql

 pl_gram.c: gram.y

 I would like to either rename the intermediate file(s) to gram.{c,h}, or
 possibly rename the source file to pl_gram.y.  Any preferences or other
 comments?

Hmmm ... it's annoyed me for a long time that that file is named the
same as the core backend's gram.y.  So renaming to pl_gram.y might be
better.  On the other hand I have very little confidence in git's
ability to preserve change history if we do that.  Has anyone actually
done a file rename in a project with lots of history, and how well did
it turn out?  (For instance, does git blame still provide any useful
tracking of pre-rename changes?  If you try to cherry-pick a patch
against the new file into a pre-rename branch, does it work?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Brian Weaver
Tom,

I'm still investigating and I have been looking at various sources. I
have checked lots of pages on the web and I was just looking at the
libarchive source from github. I found an interesting sequence in
libarchive that implies that 'ustar00\0' marks the header as GNU
tar format.

Here are lines 321 through 329 of 'archive_read_support_format_tar.c'
from libarchive

 321 /* Recognize POSIX formats. */
 322 if ((memcmp(header->magic, "ustar\0", 6) == 0)
 323      && (memcmp(header->version, "00", 2) == 0))
 324 bid += 56;
 325
 326 /* Recognize GNU tar format. */
 327 if ((memcmp(header->magic, "ustar ", 6) == 0)
 328      && (memcmp(header->version, " \0", 2) == 0))
 329 bid += 56;

I'm wondering if the original committer put the 'ustar00\0' string in by design?

Regardless I'll look at it more tomorrow 'cause I'm calling it a
night. I need to send a note to the libarchive folks too because I
*think* I found a potential buffer overrun in one of their octal
conversion routines.

-- Brian

On Mon, Sep 24, 2012 at 10:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Brian Weaver cmdrcluel...@gmail.com writes:
 While researching the way streaming replication works I was examining
 the construction of the tar file header. By comparing documentation on
 the tar header format from various sources I am certain the following
 patch should be applied so the group identifier is put into the
 header properly.

 Yeah, this is definitely wrong.

 While I realize that wikipedia isn't always the best source of
 information, the header offsets seem to match the other documentation
 I've found. The format is just easier to read on wikipedia

 The authoritative specification can be found on the "pax" page of the
 POSIX spec, which is available here:
 http://pubs.opengroup.org/onlinepubs/9699919799/

 I agree that the 117 number is bogus, and also that the magic "ustar"
 string is written incorrectly.  What's more, it appears that the latter
 error has been copied from pg_dump (but the 117 seems to be just a new
 bug in pg_basebackup).  I wonder what else might be wrong hereabouts :-(
 Will sit down and take a closer look.

 I believe what we need to do about this is:

 1. fix pg_dump and pg_basebackup output to conform to spec.

 2. make sure pg_restore will accept both conformant and
previous-generation files.

 Am I right in believing that we don't have any code that's expected to
 read pg_basebackup output?  We just feed it to tar, no?

 I'm a bit concerned about backwards compatibility issues.  It looks to
 me like existing versions of pg_restore will flat out reject files that
 have a spec-compliant "ustar\0" MAGIC field.  Is it going to be
 sufficient if we fix this in minor-version updates, or are we going to
 need to have a switch that tells pg_dump to emit the incorrect old
 format?  (Ick.)

 regards, tom lane



-- 

/* insert witty comment here */


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-24 Thread Andrew Dunstan


On 09/24/2012 09:37 PM, Peter Eisentraut wrote:

On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote:

This rather overdue mail arises out of the developer's meeting back in
May, where we discussed an item I raised suggesting an Oid registry.

The idea came from some difficulties I encountered when I did the
backport of the JSON work we did in 9.2 to 9.1, but has wider
application. Say someone writes an extension that defines type A. You
want to write an extension that takes advantage of that type, but it's
difficult if you don't know the type's Oid,

Could you fill the rest of us in with some technical details about why
this might be necessary and what it aims to achieve?


Well, an obvious case is how record_to_json handles fields. If it knows 
nothing about the type, all it can do is output the string value. That 
doesn't work well for types such as hstore. If it could reliably 
recognize a field as an hstore it might well be able to do lots better.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch to note which system catalogs have oids

2012-09-24 Thread Karl O. Pinc
On 09/24/2012 09:38:53 AM, Karl O. Pinc wrote:
 On 09/23/2012 10:14:33 PM, Tom Lane wrote:
  Karl O. Pinc k...@meme.com writes:
   The attached patch documents the oid column of those
   system catalogs having an oid.
  
  I think this is fundamentally wrong, or at least misleading, because
  it documents OID as if it were an ordinary column.

 How about modifying the (printed) table layout as attached?
 It begins each (printed) table documenting a catalog with a
 "Has OID column" Yes/No.

Changed text from "Has OID column" to "Keyed with an OID column"
since this explains more and there's no worry about horizontal
space.

I like having the documentation of oid be part of the
(printed) table describing the columns, in some way or
another, since that's where the eye is drawn when
looking for column documentation.




Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f999190..2dfb40f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -332,6 +332,19 @@
   <table>
    <title><structname>pg_aggregate</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -415,6 +428,19 @@
   <table>
    <title><structname>pg_am</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -671,6 +697,19 @@
   <table>
    <title><structname>pg_amop</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -807,6 +846,19 @@
   <table>
    <title><structname>pg_amproc</structname> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -890,6 +942,19 @@
   <table>
    <title><structname>pg_attrdef</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -967,6 +1032,19 @@
   <table>
    <title><structname>pg_attribute</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1247,6 +1325,19 @@
   <table>
    <title><structname>pg_authid</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="3">
     <thead>
      <row>
@@ -1377,6 +1468,19 @@
   <table>
    <title><structname>pg_auth_members</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1450,6 +1554,19 @@
   <table>
    <title><structname>pg_cast</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1565,6 +1682,19 @@
   <table>
    <title><structname>pg_class</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1877,6 +2007,19 @@
   <table>
    <title><structname>pg_event_trigger</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>No</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -1972,6 +2115,19 @@
   <table>
    <title><structname>pg_constraint</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>
+
    <tgroup cols="4">
     <thead>
      <row>
@@ -2238,6 +2394,19 @@
   <table>
    <title><structname>pg_collation</> Columns</title>
 
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry>Keyed with an OID column</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>Yes</entry>
+     </row>
+    </tbody>
+   </tgroup>

Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-24 Thread Tom Lane
Brian Weaver cmdrcluel...@gmail.com writes:
 Here are lines 321 through 329 of 'archive_read_support_format_tar.c'
 from libarchive

  321 /* Recognize POSIX formats. */
  322 if ((memcmp(header->magic, "ustar\0", 6) == 0)
  323      && (memcmp(header->version, "00", 2) == 0))
  324 bid += 56;
  325
  326 /* Recognize GNU tar format. */
  327 if ((memcmp(header->magic, "ustar ", 6) == 0)
  328      && (memcmp(header->version, " \0", 2) == 0))
  329 bid += 56;

 I'm wondering if the original committer put the 'ustar00\0' string in by 
 design?

The second part of that looks to me like it matches "ustar  \0",
not "ustar00\0".  I think the pg_dump coding is just wrong.  I've
already noticed that its code for writing the checksum is pretty
brain-dead too :-(
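
(For reference, a spec-compliant checksum computation is simple enough --
the following is an illustrative sketch, not the pg_dump code:)

#include <stdio.h>
#include <string.h>

/*
 * Per POSIX, the checksum is the simple unsigned sum of all 512 header
 * bytes, with the chksum field itself counted as if it were all spaces.
 */
static void
tar_set_checksum(char *h)
{
	unsigned long	sum = 0;
	int				i;

	memset(&h[148], ' ', 8);		/* sum the chksum field as blanks */
	for (i = 0; i < 512; i++)
		sum += (unsigned char) h[i];

	/* store as six octal digits, a NUL, and a space */
	sprintf(&h[148], "%06lo", sum);
	h[154] = '\0';
	h[155] = ' ';
}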

Note that according to the wikipedia page, tar programs typically
accept files as pre-POSIX format if the checksum is okay, regardless of
what is in the magic field; and the fields that were added by POSIX
are noncritical so we'd likely never notice that they were being
ignored.  (In fact, looking closer, pg_dump isn't even filling those
fields anyway, so the fact that it's not producing a compliant magic
field may be a good thing ...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_reorg in core?

2012-09-24 Thread Michael Paquier
On Tue, Sep 25, 2012 at 8:13 AM, Andres Freund and...@2ndquadrant.com wrote:

 On Tuesday, September 25, 2012 12:55:35 AM Josh Berkus wrote:
  On 9/24/12 3:43 PM, Simon Riggs wrote:
   On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote:
   For me, the Postgres user interface should include
   * REINDEX CONCURRENTLY
  
   I don't see why we don't have REINDEX CONCURRENTLY now.
  
   Same reason for everything on (anyone's) TODO list.
 
  Yes, I'm just pointing out that it would be a very small patch for
  someone, and that AFAIK it didn't make it on the TODO list yet.
  It's not *that* small.

 1. You need more than you can do with CREATE INDEX CONCURRENTLY and DROP
 INDEX
 CONCURRENTLY because the index can e.g. be referenced by a foreign key
 constraint. So you need to replace the existing index oid with a new one by
 swapping the relfilenodes of both after verifying several side conditions
 (indcheckxmin, indisvalid, indisready).

 It would probably have to look like:

 - build new index with indisready = false
 - newindex.indisready = true
 - wait
 - newindex.indisvalid = true
 - wait
 - swap(oldindex.relfilenode, newindex.relfilenode)
 - oldindex.indisvalid = false
 - wait
 - oldindex.indisready = false
 - wait
 - drop new index with old relfilenode

  Every wait indicates an externally visible state which you might
  encounter/need to clean up...

Could you clarify what you mean here by cleanup?
I am afraid I do not get your point here.


 2. no support for concurrent on system tables (not easy for shared
 catalogs)

Doesn't this exclude all the tables that are in the schema catalog?



 3. no support for the indexes of exclusion constraints (not hard I think)

This just consists of a check of indisready in pg_index.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Oid registry

2012-09-24 Thread Andrew Dunstan


On 09/24/2012 09:24 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

...  So the proposal is to have an Oid registry, in which authors could
in effect reserve an Oid (or a couple of Oids) for a type. We would
guarantee that these Oids would be reserved in just the same way Oids
for builtins are reserved, and #define symbolic constants for the
reserved Oids. To make that viable, we'd need to extend the CREATE
commands for any objects we could reserve Oids for to allow for the
specification of the Oids, somewhat like:
 CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID =
 234567);

Well, of course, it's not clear how reserved an OID can be if you
provide SQL syntax that allows any Joe Blow to create a type with that
OID.  The possibilities for security holes are interesting to say the
least, especially if applications blindly assume that oid 123456 *must*
be the type they think it is.


I think this would probably be a superuser-only facility.





Another suggestion that was made during the discussion was that we
should also reserve a range of Oids for private use, and guarantee that
those would never be allocated, somewhat analogously to RFC1918 IP
addresses.

Likewise, would never be allocated and we'll provide syntax that does
it trivially seem a bit at odds.

Another point to think about is that it's a few years too late to
guarantee that any particular OID above 16384 is unused; we can't
do that now without possibly breaking pg_upgrade-ability of existing
databases.  While we could hand out some subrange of the OIDs below
that, there's not exactly a huge amount of space available.



we seem to have a fair bit of leeway between the top numbered Oid as 
reported by unused_oids (4332) and 16384.


I would expect a private range of a few hundred to be more than 
adequate, and a range for registered Oids of a couple of thousand to 
last for many years. I'm expecting the use of this over quite a few 
years to be numbered in tens, not thousands.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch to note which system catalogs have oids

2012-09-24 Thread Karl O. Pinc
On 09/24/2012 08:18:00 PM, Stephen Frost wrote:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:

  It's possible that it's worth expending a boilerplate paragraph in
 each
  of those pages to say this catalog has OIDs (or that it doesn't).
  But this isn't the way.
 
 I'm afraid I disagree with this.  The oid column, in the system
 catalog, is user-facing and I like having it included as a column in
 the
 table in the docs, so users know what to use when doing joins.
 Including something in the boilerplate about it not being shown by
 default (or in the description in the table) might be alright, if we
 don't change that.

Having information about oid included in the (printed) table
under a separate heading, as in v2 and v3 of this patch,
is something of a compromise.  It's hard to visualize
from the sgml so it might be worth building the docs
and viewing with a file:/// url.  The trouble is that
it's visually ugly because the two parts of the
table are of different widths.  There is almost surely a way
to change this in the xsl transformation to html/etc., but I
would probably do a bad job of it and can't
speak to the sanity of maintaining such a thing.
(So it's probably a bad idea.)


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Bruce Momjian
On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote:
 I think the problem is on the options when I installed pgsql(both
 9.1 and 9.2)
 Select the locale to be used by the new database cluster.
 
 Locale
 
 [1] [Default locale]
 [2] C
 [3] POSIX
 [4] zh_CN.utf8
 [5] zh_HK.utf8
 [6] zh_SG.utf8
 [7] zh_TW.utf8
 Please choose an option [1] : 4
 I chose 4 instead of 1. I guess the default locale (option 1) is the one with a dash.
 Well, if you run that query on template0 in the old and new cluster, you
 will see something different in the two of them.  Could you have used the
 default in one and a non-dash locale in the other?
 Yes, that's true. The upgrade is fine with both fresh installs (9.1
 and 9.2) with the option above (without a dash). The problem only happens
 when I initialized the 9.2 db with the default locale (I guess that one has

OK, that is good to know.  I developed the attached C program that does
the setlocale canonicalization test.  On Debian Squeeze, I could not see any
change:  if I pass en_US.UTF-8, I get en_US.UTF-8 returned;  if I pass
en_US.UTF8, I get en_US.UTF8 returned.  Can anyone test this and find a
case where the locale is canonicalized?  Run it this way:

$ canonical
LC_COLLATE = 3
LC_CTYPE = 0
$ canonical 0 en_US.UTF8
en_US.UTF8

We are looking for cases where the second argument produces a
non-matching locale name as output.

I have also attached a patch that reports the mismatching locale or
encoding names --- this should at least help with debugging and show
that a dash is the problem.

 the dash). Just wondering why the pg installer provides options
 without a dash.

No idea.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + It's impossible for everything to be true. +
#include <stdio.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
	char	   *save;
	char	   *res;
	int 		category;

	if (argc == 1)
	{
		printf("LC_COLLATE = %d\n", LC_COLLATE);
		printf("LC_CTYPE = %d\n", LC_CTYPE);
		return 0;
	}
	
	category = atoi(argv[1]);
	
	save = setlocale(category, NULL);
	if (!save)
	{
		printf("failed to get the current locale\n");
		return 0;
	}

	/* 'save' may be pointing at a modifiable scratch variable, so copy it. */
	save = strdup(save);

	/* set the locale with setlocale, to see if it accepts it. */
	res = setlocale(category, argv[2]);

	if (!res)
	{
		/* report the requested name, not 'res', which is NULL here */
		printf("failed to get system locale name for \"%s\"\n", argv[2]);
		return 0;
	}

	res = strdup(res);

	/* restore old value. */
	if (!setlocale(category, save))
	{
		printf("failed to restore old locale \"%s\"\n", save);
		return 0;
	}
		
	free(save);

	puts(res);
	return 0;
}
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index beb177d..e4fec34
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** static void
*** 406,421 ****
  check_locale_and_encoding(ControlData *oldctrl,
  						  ControlData *newctrl)
  {
! 	/* These are often defined with inconsistent case, so use pg_strcasecmp(). */
  	if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster lc_collate values do not match\n");
  	if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster lc_ctype values do not match\n");
  	if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster encoding values do not match\n");
  }
  
  
--- 406,428 ----
  check_locale_and_encoding(ControlData *oldctrl,
  						  ControlData *newctrl)
  {
! 	/*
! 	 *	These are often defined with inconsistent case, so use pg_strcasecmp().
! 	 *	They also often use inconsistent hyphenation, which we cannot fix, e.g.
! 	 *	UTF-8 vs. UTF8, so at least we display the mismatching values.
! 	 */
  	if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
  		pg_log(PG_FATAL,
! 			   "lc_collate cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->lc_collate, newctrl->lc_collate);
  	if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
  		pg_log(PG_FATAL,
! 			   "lc_ctype cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->lc_ctype, newctrl->lc_ctype);
  	if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
  		pg_log(PG_FATAL,
! 			   "encoding cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->encoding, newctrl->encoding);
  }
  
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-24 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/24/2012 09:24 PM, Tom Lane wrote:
 Another point to think about is that it's a few years too late to
 guarantee that any particular OID above 16384 is unused; we can't
 do that now without possibly breaking pg_upgrade-ability of existing
 databases.  While we could hand out some subrange of the OIDs below
 that, there's not exactly a huge amount of space available.

 we seem to have a fair bit of leeway between the top numbered Oid as 
 reported by unused_oids (4332) and 16384.

It's not nearly that simple.  Per transam.h:

 *		OIDs 1-9999 are reserved for manual assignment (see the files
 *		in src/include/catalog/).
 *
 *		OIDS 10000-16383 are reserved for assignment during initdb
 *		using the OID generator.  (We start the generator at 10000.)
 *
 *		OIDs beginning at 16384 are assigned from the OID generator
 *		during normal multiuser operation.  (We force the generator up to
 *		16384 as soon as we are in normal operation.)

pg_database.datlastsysoid is 12907 as of HEAD, so we're using about 2900
OIDs that are assigned during initdb.  We should expect continued growth
in that number.  For comparison, it was 10791 in 8.1, which was the first
version following this assignment scheme --- so the number of
auto-assigned OIDs has more than tripled in the last seven years.
And there's not room for another tripling.  So I think it's entirely
likely that we'll have to reduce FirstBootstrapObjectId to a smaller
number in the foreseeable future, or else increase FirstNormalObjectId
which would be a pg_upgrade breaking move.  Meanwhile, we do continue to
eat manually-assigned OIDs at a nontrivial pace as well.

So, yeah, we could reserve a couple hundred OIDs for a scheme like this
and (probably) not regret it later.  But a couple thousand would scare
me ... and I'm not exactly convinced that a couple hundred is enough,
if there's any demand out there at all.

I suggest we could consider handing out reserved OIDs one or two at a
time using the current manual assignment process, *without* any notion
of a reserved block of OIDs.  That is, when Joe Developer needs a couple
of OIDs for joes_whizzy_datatype, he comes to us and we say "sure, you
can have 4333 and 4334", which we then memorialize in a text file in
include/catalog/.  This doesn't preclude doing something differently
later of course, but it avoids setting aside OID address space that is
likely to be underutilized and fairly desperately needed at some future
time.

As you might guess from this, I'm not enamored of the "reserve some
private OID space" part at all.  There's not been any demand for that
that I've noticed.  In pretty much any development context, you could
use OIDs up around 4 billion for such purposes and not have any problem
at all --- it's only trying to *guarantee* they're not used anywhere out
in the field that is problematic.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-24 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
 Could you fill the rest of us in with some technical details about why
 this might be necessary and what it aims to achieve?

 Well, an obvious case is how record_to_json handles fields. If it knows 
 nothing about the type all it can do is output the string value. That 
 doesn't work well for types such as hstore. If it could reliably 
 recognize a field as an hstore it might well be able to do lots better.

Um ... that is an entirely unconvincing use case.  We would not put any
code into core that knows specifically about a non-core datatype, or
at least I sure hope we'd not do such a klugy thing.  It would be far
better to invent an extensibility scheme (plug-in of some sort) for
record_to_json.
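
To illustrate the rough shape such a plug-in scheme might take (every name
below is hypothetical; nothing like this exists in core):

/*
 * Hypothetical plug-in hook for record_to_json: an extension registers an
 * output callback for its type's OID at load time.  None of these names
 * exist in PostgreSQL; this is purely illustrative.
 */
typedef void (*JsonOutputHandler) (StringInfo result, Datum value);

extern void register_json_output_handler(Oid typoid, JsonOutputHandler fn);

/* hstore's _PG_init() might then do something like: */
void
_PG_init(void)
{
	register_json_output_handler(get_hstore_type_oid(),	/* hypothetical */
								 hstore_to_json_handler);	/* hypothetical */
}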

My recollection of the PGCon discussion is that people wanted to allow
client-side code to hard-wire type OIDs for add-on types, in more or
less the same way that things like JDBC know that OID 25 is "text".
That's not unreasonable, since the alternatives are complicated and
not all that reliable.  I'm not sure the usage applies to anything
except data types though, so at least for starters I'd only want to
accept reservations of OIDs for data types.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Switching timeline over streaming replication

2012-09-24 Thread Amit Kapila
 On Monday, September 24, 2012 9:08 PM m...@rpzdesign.com wrote:
 What a disaster waiting to happen. Maybe the only replication should be
 master-master replication
 so there is no need to sequence timelines or anything, all servers are
 ready masters, no backups or failovers.
 If you really do not want a master serving, then it should only be
 handled in the routing
 of traffic to that server and not the replication logic itself.  The
 only thing that ever came about
 from failovers was the failure to turn over.  The above is opinion
 only.

This feature is for users who want to use master-standby configurations.

What do you mean by:
"then it should only be handled in the routing of traffic to that server
and not the replication logic itself."

Do you have any idea other than the proposed implementation, or do you see
any problem in the currently proposed solution?


 
 On 9/24/2012 7:33 AM, Amit Kapila wrote:
  On Tuesday, September 11, 2012 10:53 PM Heikki Linnakangas wrote:
  I've been working on the often-requested feature to handle timeline
  changes over streaming replication. At the moment, if you kill the
  master and promote a standby server, and you have another standby
  server that you'd like to keep following the new master server, you
  need a WAL archive in addition to streaming replication to make it
  cross the timeline change. Streaming replication will just error
 out.
  Having a WAL archive is usually a good idea in complex replication
  scenarios anyway, but it would be good to not require it.
  Confirm my understanding of this feature:
 
  This feature is for the case when standby-1, which is going to be
  promoted to master, has archive mode 'on', as in that case only its
  timeline will change.
 
  If the above is right, then there can be other similar scenarios where
  it can be used:
 
  Scenario-1 (1 Master, 1 Stand-by)
  1. Master (archive_mode=on) goes down.
  2. Master again comes up
  3. Stand-by tries to follow it
 
  Now in the above scenario it also gives an error due to the timeline
  mismatch, but your patch should fix it.
 
 
 
  Some parts of this patch are just refactoring that probably make
 sense
  regardless of the new functionality. For example, I split off the
  timeline history file related functions to a new file, timeline.c.
  That's not very much code, but it's fairly isolated, and xlog.c is
  massive, so I feel that anything that we can move off from xlog.c is
 a
  good thing. I also moved off the two functions RestoreArchivedFile()
  and ExecuteRecoveryCommand(), to a separate file. Those are also not
  much code, but are fairly isolated. If no-one objects to those
 changes,
  and the general direction this work is going to, I'm going split off
  those refactorings to separate patches and commit them separately.
 
  I also made the timeline history file a bit more detailed: instead
 of
  recording just the WAL segment where the timeline was changed, it
 now
  records the exact XLogRecPtr. That was required for the walsender to
  know the switchpoint, without having to parse the XLOG records (it
  reads and parses the history file, instead)
  IMO separating the timeline history file related functions into a new
  file is good.
  However, I am not sure about splitting RestoreArchivedFile() and
  ExecuteRecoveryCommand() into a separate file.
  How about splitting out all archive-related functions:
  static void XLogArchiveNotify(const char *xlog);
  static void XLogArchiveNotifySeg(XLogSegNo segno);
  static bool XLogArchiveCheckDone(const char *xlog);
  static bool XLogArchiveIsBusy(const char *xlog);
  static void XLogArchiveCleanup(const char *xlog);
  ..
  ..
 
  In any case, it will be better if you can split it into multiple
  patches:
  1. The new functionality of switching timelines over streaming
  replication
  2. Refactoring-related changes.
 
  That would make my testing and review of the new feature patch a
  little easier.
 
  With Regards,
  Amit Kapila.
 
 
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-24 Thread Rural Hunter

On 2012/9/25 11:00, Bruce Momjian wrote:

On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote:

I think the problem is on the options when I installed pgsql(both
9.1 and 9.2)
Select the locale to be used by the new database cluster.

Locale

[1] [Default locale]
[2] C
[3] POSIX
[4] zh_CN.utf8
[5] zh_HK.utf8
[6] zh_SG.utf8
[7] zh_TW.utf8
Please choose an option [1] : 4
I chose 4 instead of 1. I guess the default locale(option 1) is with dash.

Well, if you run that query on template0 in the old and new cluster, you
will see something different in the two of them.  Could you have used
default in one and a non-dash in the other.

Yes, that's true. The upgrade is fine with both fresh installs(9.1
and 9.2) with option above(without-dash). The problem only happens
when I inited the 9.2 db with default locale(I guess that one has

OK, that is good to know.  I developed the attached C program that does
the setlocale canonicalization test.  On Debian Squeeze, I could not see any
change:  if I pass en_US.UTF-8, I get en_US.UTF-8 returned;  if I pass
en_US.UTF8, I get en_US.UTF8 returned.  Can anyone test this and find a
case where the locale is canonicalized?  Run it this way:

$ canonical
LC_COLLATE = 3
LC_CTYPE = 0
$ canonical 0 en_US.UTF8
en_US.UTF8

We are looking for cases where the second argument produces a
non-matching locale name as output.

It matches on my system (Ubuntu 10.10 server):
$ ./canonical
LC_COLLATE = 3
LC_CTYPE = 0
$ ./canonical 0 zh_CN.UTF-8
zh_CN.UTF-8
$ ./canonical 0 zh_CN.UTF8
zh_CN.UTF8
$ ./canonical 0 zh_CN.utf8
zh_CN.utf8
$ ./canonical 0 zh_CN.utf-8
zh_CN.utf-8

I tested the checker with the patch:
$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B 
/opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok

lc_collate cluster values do not match:  old "zh_CN.utf8", new "zh_CN.UTF-8"
Failure, exiting

zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system 
default.


I have also attached a patch that reports the mismatching locale or
encoding names --- this should at least help with debugging and show
that a dash is the problem.


the dash). Just wondering why the pg installer provides options
without a dash.

No idea.





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-24 Thread Hitoshi Harada
On Mon, Sep 24, 2012 at 8:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 09/24/2012 09:37 PM, Peter Eisentraut wrote:
 Could you fill the rest of us in with some technical details about why
 this might be necessary and what it aims to achieve?

 Well, an obvious case is how record_to_json handles fields. If it knows
 nothing about the type all it can do is output the string value. That
 doesn't work well for types such as hstore. If it could reliably
 recognize a field as an hstore it might well be able to do lots better.

 Um ... that is an entirely unconvincing use case.  We would not put any
 code into core that knows specifically about a non-core datatype, or
 at least I sure hope we'd not do such a klugy thing.  It would be far
 better to invent an extensibility scheme (plug-in of some sort) for
 record_to_json.


I don't think (and don't hope) the core should know about external data
types, but I have certainly seen a lot of use cases where an external
project wants to know about another external data type.  Say, if plv8
wants to convert an hstore into a javascript object.  Users are free to
define such a function that accepts hstore as an argument, but
how does plv8 know the input is actually hstore?  Of course you can
look up the type name by consulting SysCache and see if the name is
"hstore", but it's expensive to do that for every function invocation,
so caching hstore's oid in plv8 is the current workaround, which
is not truly safe because hstore's oid may change while it is cached.
I can tell similar stories around array creation, which needs the
element type oid.  The core code can do some special stuff by using
pre-defined type or function oid macros, and I guess it is reasonable
for external developers to hope to do such things.
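
For what it's worth, the caching workaround looks roughly like this (a
sketch assuming backend headers; the lookup and invalidation details are
simplified):

#include "postgres.h"
#include "catalog/namespace.h"
#include "utils/inval.h"
#include "utils/syscache.h"

static Oid	cached_hstore_oid = InvalidOid;

/* Forget the cached OID whenever pg_type changes. */
static void
hstore_oid_inval(Datum arg, int cacheid, uint32 hashvalue)
{
	cached_hstore_oid = InvalidOid;
}

/* Call once, e.g. from _PG_init(). */
static void
setup_hstore_oid_cache(void)
{
	CacheRegisterSyscacheCallback(TYPEOID, hstore_oid_inval, (Datum) 0);
}

static Oid
get_hstore_oid(void)
{
	if (!OidIsValid(cached_hstore_oid))
		cached_hstore_oid = TypenameGetTypid("hstore");	/* search_path lookup */
	return cached_hstore_oid;
}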

Thanks,
-- 
Hitoshi Harada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch to note which system catalogs have oids

2012-09-24 Thread Karl O. Pinc
On 09/23/2012 08:57:45 PM, Karl O. Pinc wrote:

 The attached patch documents the oid column of those
 system catalogs having an oid.

Don't use the first version of this patch (oid_doc.patch)
without discarding the last hunk.  The last hunk
introduces an error by duplicating the
documentation of the pg_roles.oid column.

(I am reluctant to post a revised version
since there's already 3 versions floating
around representing 2 different approaches
and no consensus as to which approach, if any, should
be taken.)

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers