Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote:
> daveg  writes:
> > On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> >> BTW ... what were the last versions you were running on which you had
> >> *not* seen the problem?  (Just wondering about the possibility that we
> >> back-patched some "fix" that broke things.  It would be good to have
> >> a version range before trawling the commit logs.)
> 
> > The first version we saw it on was 8.4.7.
> 
> Yeah, you said that.  I was wondering what you'd last run before 8.4.7.

Sorry, misunderstood. We were previously running 8.4.4, but have been on 8.4.7
since shortly after it was released. Prior to that we have had all the major
and most of the minor releases since 7.1.
 
-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf archive_command example

2011-09-07 Thread Simon Riggs
On Thu, Sep 8, 2011 at 7:05 AM, Fujii Masao  wrote:
> On Wed, Sep 7, 2011 at 11:53 PM, Robert Treat  wrote:
>> On Tue, Sep 6, 2011 at 10:11 PM, Fujii Masao  wrote:
>>> I agree that basically archive_command should not overwrite an existing 
>>> file.
>>> But if the size of existing file is less than 16MB, it should do that.
>>> Otherwise,
>>> that WAL file would be lost forever.
>>
>> I think best practice in this case is that if you ever find an
>> existing file with the same name already in place, you should error
>> and investigate. We don't ship around partially completed WAL files,
>> and finding an existing one probably means something went wrong. (Of
>> course, we use rsync instead of copy/move, so we have some better
>> guarantees about this).
>
> That's an option. But I don't think that finding an existing file is so 
> serious
> problem.

The recommendation should be that the archived files are never
overwritten because that prevents a huge range of data loss bugs and
kills them stone dead.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Simon Riggs
On Wed, Sep 7, 2011 at 9:02 PM, Tom Lane  wrote:
> Simon Riggs  writes:
>> Please lets not waste effort on refactoring efforts in mid dev cycle.
>
> Say what?  When else would you have us do it?

When else would you have us develop?

Major changes happen at start of a dev cycle, after some discussion.
Not in the middle and especially not for low priority items. It's not
even a bug, just code beautification.

We've all accepted the need for some change, but I would like to see
it happen slowly and carefully because of the very high risk of
introducing bugs into stable code that doesn't have a formal
verification mechanism currently. Anybody that could be trusted to
make those changes ought to have something better to do. If they don't
then that is in itself a much more serious issue than this.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf archive_command example

2011-09-07 Thread Fujii Masao
On Wed, Sep 7, 2011 at 11:53 PM, Robert Treat  wrote:
> On Tue, Sep 6, 2011 at 10:11 PM, Fujii Masao  wrote:
>> I agree that basically archive_command should not overwrite an existing file.
>> But if the size of existing file is less than 16MB, it should do that.
>> Otherwise,
>> that WAL file would be lost forever.
>
> I think best practice in this case is that if you ever find an
> existing file with the same name already in place, you should error
> and investigate. We don't ship around partially completed WAL files,
> and finding an existing one probably means something went wrong. (Of
> course, we use rsync instead of copy/move, so we have some better
> guarantees about this).

That's an option. But I don't think that finding an existing file is so serious
problem. The most common cases which cause a partially-filled archived
file are;

1. The server crashes while WAL file is being archived, and then the server
restarts. In this case, the restarted server would find partially-filled
archived file.

2. In replication environment, the master crashes while WAL file is being
archived, and then a failover happens. In this case, new master would
find partially-filled archived file.

In these cases, I don't think it's so unsafe to overwrite an existing file.

OTOH, the practice you explained might fill up an archive area and
pg_xlog directory and then cause a PANIC error. Such a PANIC error
is more serious thing at least for me. So I'd like to overwrite an exiting
file when its size is not 16MB.

>> I have another feature request;
>> (5) Maybe not in the initial version, but eventually it might be
>> nice to support calling posix_fadvise(POSIX_FADV_DONTNEED)
>> after copying a WAL file.
>>
>
> Can you go into more details on how you envision this working. I'm
> mostly curious because I think rsync might already support this, which
> would make it easy to incorporate.

I'm expecting that the executable is written in C, it calls posix_fadvice
against the file descriptor created when opening the WAL file in pg_xlog
directory, just before closing that descriptor.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] force_not_null option support for file_fdw

2011-09-07 Thread Shigeru Hanada
(2011/09/05 22:05), Kohei Kaigai wrote:
>> In my usual environment that test passed, but finally I've reproduced the 
>> failure with setting
>> $LC_COLLATE to "es_ES.UTF-8".  Do you have set any $LC_COLLATE in your test 
>> environment?
>>
> It is not set in my environment.
> 
> I checked the behavior of ORDER BY when we set a collation on the regular 
> relation, not a foreign table.
> Do we hit something other unexpected bug in collation here?
> 
> postgres=# CREATE TABLE t1 (word1 text);
> CREATE TABLE
> postgres=# INSERT INTO t1 VALUES ('ABC'),('abc'),('123'),('NULL');
> INSERT 0 4
> postgres=# ALTER TABLE t1 ALTER word1 TYPE text COLLATE "ja_JP.utf8";
> ALTER TABLE
> postgres=# SELECT * FROM t1 ORDER BY word1;
>   word1
> ---
>   123
>   ABC
>   NULL
>   abc
> (4 rows)
> 
> postgres=# ALTER TABLE t1 ALTER word1 TYPE text COLLATE "en_US.utf8";
> ALTER TABLE
> postgres=# SELECT * FROM t1 ORDER BY word1;
>   word1
> ---
>   123
>   abc
>   ABC
>   NULL
> (4 rows)

Thanks for the checking.  FYI, I mainly use Fedora 15 box with Japanese
environment for my development.

ISTM that your results are reasonable for each collation setting.
Former ordering is same as C locale, and in latter case alphabetical
order has priority over case distinctions.  Do you mean that ordering
used in file_fdw is affected from something unexpected, without
collation or locale setting?

BTW, I found a thread which is related to this issue.
  http://archives.postgresql.org/pgsql-hackers/2011-09/msg00130.php

I changed the test data so that it uses only upper case alphabets,
because case distinction is not important for that test.  I also removed
digits to avoid test failure in some locales which sort alphabets before
digits.

Regards,
-- 
Shigeru Hanada
diff --git a/contrib/file_fdw/data/text.csv b/contrib/file_fdw/data/text.csv
index ...09827e7 .
*** a/contrib/file_fdw/data/text.csv
--- b/contrib/file_fdw/data/text.csv
***
*** 0 
--- 1,4 
+ AAA,AAA
+ XYZ,XYZ
+ NULL,NULL
+ ABC,ABC
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 224e74f..548dcd2 100644
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 23,30 
--- 23,32 
  #include "foreign/fdwapi.h"
  #include "foreign/foreign.h"
  #include "miscadmin.h"
+ #include "nodes/makefuncs.h"
  #include "optimizer/cost.h"
  #include "utils/rel.h"
+ #include "utils/syscache.h"
  
  PG_MODULE_MAGIC;
  
*** static struct FileFdwOption valid_option
*** 57,72 
{"escape", ForeignTableRelationId},
{"null", ForeignTableRelationId},
{"encoding", ForeignTableRelationId},
  
/*
 * force_quote is not supported by file_fdw because it's for COPY TO.
 */
  
-   /*
-* force_not_null is not supported by file_fdw.  It would need a parser
-* for list of columns, not to mention a way to check the column list
-* against the table.
-*/
  
/* Sentinel */
{NULL, InvalidOid}
--- 59,70 
{"escape", ForeignTableRelationId},
{"null", ForeignTableRelationId},
{"encoding", ForeignTableRelationId},
+   {"force_not_null", AttributeRelationId},/* specified as boolean 
value */
  
/*
 * force_quote is not supported by file_fdw because it's for COPY TO.
 */
  
  
/* Sentinel */
{NULL, InvalidOid}
*** static void fileGetOptions(Oid foreignta
*** 112,117 
--- 110,116 
  static void estimate_costs(PlannerInfo *root, RelOptInfo *baserel,
   const char *filename,
   Cost *startup_cost, Cost *total_cost);
+ static List * get_force_not_null(Oid relid);
  
  
  /*
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 145,150 
--- 144,150 
List   *options_list = untransformRelOptions(PG_GETARG_DATUM(0));
Oid catalog = PG_GETARG_OID(1);
char   *filename = NULL;
+   char   *force_not_null = NULL;
List   *other_options = NIL;
ListCell   *cell;
  
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 198,204 
 buf.data)));
}
  
!   /* Separate out filename, since ProcessCopyOptions won't allow 
it */
if (strcmp(def->defname, "filename") == 0)
{
if (filename)
--- 198,207 
 buf.data)));
}
  
!   /*
!* Separate out filename and force_not_null, since 
ProcessCopyOptions
!* won't allow them.
!*/
if (strcmp(def->defname, "filename") == 0)
{
if (filename)
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 207,212 
--- 210,229 
   

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 6:25 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I thought about an error exit from client authentication, and that's a
>> somewhat appealing explanation, but I can't quite see why we wouldn't
>> clean up there the same as anywhere else.  The whole mechanism feels a
>> bit rickety to me - we don't actually release locks; we just abort the
>> transaction and *assume* that will cause locks to get released.
>
> Well, transaction abort will call LockReleaseAll, which is carefully
> coded to clean up the proclock lists regardless of what is in the
> locallocks table, so I'm not sure why you find that any more rickety
> than anything else.

Well, it's very clear that you CAN orphan locks if a backend holding a
session lock ever does CHECK_FOR_INTERRUPTS() outside of a
transaction.  Try the attached patch.

rhaas=# vacuum full foo;
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
The connection to the server was lost. Attempting reset: Succeeded.
rhaas=# vacuum full foo;
ERROR:  lock AccessExclusiveLock on object 16384/1431013/0 is already held

Now, I don't see any evidence of a live bug here (and on further
thought it can't be Dave's bug because he is orphaning
AccessShareLocks, not AccessExclusiveLocks), but I find this pretty
convincing as a demonstration of ricketiness.  It is certainly not
obvious on its face that a misplaced CHECK_FOR_INTERRUPTS() can result
in a backend exiting without cleaning up its locks, and I'd argue its
a bad idea to leave it that way even if there's no user-visible
problem there today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


break-vacuum.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg  writes:
> On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
>> BTW ... what were the last versions you were running on which you had
>> *not* seen the problem?  (Just wondering about the possibility that we
>> back-patched some "fix" that broke things.  It would be good to have
>> a version range before trawling the commit logs.)

> The first version we saw it on was 8.4.7.

Yeah, you said that.  I was wondering what you'd last run before 8.4.7.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> daveg  writes:
> > Also, this is very intermittant, we have seen it only in recent months
> > on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
> > feels like a few times a month. Possibly some new application behaviour
> > is provoking it, but I have no guesses as to what.
> 
> BTW ... what were the last versions you were running on which you had
> *not* seen the problem?  (Just wondering about the possibility that we
> back-patched some "fix" that broke things.  It would be good to have
> a version range before trawling the commit logs.)

The first version we saw it on was 8.4.7.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Moving core timestamp typedefs/macros somewhere else

2011-09-07 Thread Brendan Jurd
On 8 September 2011 10:22, Tom Lane  wrote:
> If you believe the idea I suggested a few days ago that we ought to try
> to push basic typedefs into a separate set of headers, then this could
> be the first instance of that, which would lead to naming it something
> like "datatype/timestamp.h".  If that seems premature, then I guess it
> ought to go into utils/, but then we need some other name because
> utils/timestamp.h is taken.

The separate headers for basic typedefs makes perfect sense to me.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Moving core timestamp typedefs/macros somewhere else

2011-09-07 Thread Tom Lane
In connection with doing this:
http://archives.postgresql.org/message-id/22214.1315343...@sss.pgh.pa.us
I've run into the problem that tz_acceptable(), which needs to be
available to frontend-ish code if initdb is to use it, depends on these
symbols:

#define UNIX_EPOCH_JDATE2440588 /* == date2j(1970, 1, 1) */
#define POSTGRES_EPOCH_JDATE2451545 /* == date2j(2000, 1, 1) */
#define SECS_PER_DAY 86400

which are defined in backend-only include files such as
utils/timestamp.h.  This immediately brought to mind the pgrminclude
fiasco of a couple days ago, which was at least in part due to the fact
that utils/timestamp.h got included into some very low-level header
files so that they could use typedef TimestampTz.  So I think it's time
to do something about that.

I propose moving the Timestamp/Interval typedefs, as well as some basic
associated macros such as the ones mentioned above (basically, lines
25-100 of utils/timestamp.h, plus the DT_NOBEGIN/DT_NOEND stuff, plus
the Julian-date macros in datetime.h), into a separate header file that
contains no backend-only declarations (eg, not fmgr.h stuff; right
offhand I don't think it would depend on anything except c.h).

If you believe the idea I suggested a few days ago that we ought to try
to push basic typedefs into a separate set of headers, then this could
be the first instance of that, which would lead to naming it something
like "datatype/timestamp.h".  If that seems premature, then I guess it
ought to go into utils/, but then we need some other name because
utils/timestamp.h is taken.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 5:24 PM, Alvaro Herrera
 wrote:
> Excerpts from Marti Raudsepp's message of mié sep 07 18:09:32 -0300 2011:
>> On Wed, Sep 7, 2011 at 22:42, Alvaro Herrera  
>> wrote:
>> > A mishandled encoding conversion could be problematic, so that needs to
>> > be carefully considered (perhaps just shut off unconditionally).
>>
>> Are you referring to log file encoding or something else? The log file
>> is already potentially mixed-encoding, as different databases may have
>> different encodings and backends just dump bytes to it in their
>> current encoding.
>
> I am referring to the fact that whatever the backend puts in shared
> memory is going to be in its encoding setting, which may not necessarily
> match the postmaster's.  And if it doesn't, the postmaster might try to
> convert it using settings not valid for the string, possibly leading to
> crashes.
>
> I remember we had bugs whereby an encoding conversion would fail,
> leading to elog trying to report this problem, but this attempt would
> also incur a conversion step, failing recursively until elog's stack got
> full.  I'm not saying this is impossible to solve, just something to
> keep in mind.

Can we do something like: pass through ASCII characters unchanged, and
output anything with the high-bit set as \x?  That
might be garbled in some cases, but the goal here is not perfection.
We're just trying to give the admin (or PostgreSQL-guru-for-hire) a
clue where to start looking for the problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg  writes:
> Also, this is very intermittant, we have seen it only in recent months
> on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
> feels like a few times a month. Possibly some new application behaviour
> is provoking it, but I have no guesses as to what.

BTW ... what were the last versions you were running on which you had
*not* seen the problem?  (Just wondering about the possibility that we
back-patched some "fix" that broke things.  It would be good to have
a version range before trawling the commit logs.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg  writes:
> On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote:
>> ...  But maybe it'd be interesting for Dave to stick a
>> LockReleaseAll call into ProcKill() and see if that makes things better.
>> (Dave: test that before you put it in production, I'm not totally sure
>> it's safe.)

> Re safety, what is the worst case here? 

I think a failure would be pretty obvious --- if it gets through
regression tests it's probably fine.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > I thought about an error exit from client authentication, and that's a
> > somewhat appealing explanation, but I can't quite see why we wouldn't
> > clean up there the same as anywhere else.  The whole mechanism feels a
> > bit rickety to me - we don't actually release locks; we just abort the
> > transaction and *assume* that will cause locks to get released.
> 
> Well, transaction abort will call LockReleaseAll, which is carefully
> coded to clean up the proclock lists regardless of what is in the
> locallocks table, so I'm not sure why you find that any more rickety
> than anything else.  But maybe it'd be interesting for Dave to stick a
> LockReleaseAll call into ProcKill() and see if that makes things better.
> (Dave: test that before you put it in production, I'm not totally sure
> it's safe.)

Re safety, what is the worst case here? 

Also, this is very intermittant, we have seen it only in recent months
on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
feels like a few times a month. Possibly some new application behaviour
is provoking it, but I have no guesses as to what.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:35:08PM -0400, Tom Lane wrote:
> daveg  writes:
> > It does not seem restricted to pg_authid:
> > 2011-08-24 18:35:57.445 24987  c23  apps  ERROR:  lock AccessShareLock on 
> > object 16403/2615/0 
> > And I think I've seen it on other tables too.
> 
> Hmm.  2615 = pg_namespace, which most likely is the first catalog
> accessed by just about any SQL command that's going to access tables at
> all, so I suspect that this is mostly just a "the first access failed"
> thing and not something peculiar to pg_namespace.  But we still don't
> have a clue why the locks are not getting released by the previous
> owner of the PGPROC slot.  Have you trawled your logs to see if there's
> any sign of any distress at all, shortly before the problem starts to
> happen?

Will do, but its a pretty big haystack. Sure wish I knew what the needle
looked like. ;-)

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg  writes:
> It does not seem restricted to pg_authid:
> 2011-08-24 18:35:57.445 24987  c23  apps  ERROR:  lock AccessShareLock on 
> object 16403/2615/0 
> And I think I've seen it on other tables too.

Hmm.  2615 = pg_namespace, which most likely is the first catalog
accessed by just about any SQL command that's going to access tables at
all, so I suspect that this is mostly just a "the first access failed"
thing and not something peculiar to pg_namespace.  But we still don't
have a clue why the locks are not getting released by the previous
owner of the PGPROC slot.  Have you trawled your logs to see if there's
any sign of any distress at all, shortly before the problem starts to
happen?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> I thought about an error exit from client authentication, and that's a
> somewhat appealing explanation, but I can't quite see why we wouldn't
> clean up there the same as anywhere else.  The whole mechanism feels a
> bit rickety to me - we don't actually release locks; we just abort the
> transaction and *assume* that will cause locks to get released.

Well, transaction abort will call LockReleaseAll, which is carefully
coded to clean up the proclock lists regardless of what is in the
locallocks table, so I'm not sure why you find that any more rickety
than anything else.  But maybe it'd be interesting for Dave to stick a
LockReleaseAll call into ProcKill() and see if that makes things better.
(Dave: test that before you put it in production, I'm not totally sure
it's safe.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 04:55:24PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > Tom's right to be skeptical of my theory, because it would require a
> > CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
> > pathways that use session-level locks, and I can't find one.
> 
> More to the point: session-level locks are released on error.  The only
> way to get out of a transaction while still holding one is for the
> VACUUM-or-whichever-command code to deliberately commit and exit while
> still holding it.  An error exit path would release the lock.
> 
> > OTOH, I'm skeptical of the theory that this involves userlocks,
> > because this whole thread started because of a complaint about lock
> > 0/1260/0 already being held.  That ain't no userlock.
> 
> Yeah, and for that matter it seems to let VACUUM off the hook too.
> If we assume that the reported object ID is non-corrupt (and if it's
> always the same, that seems like the way to bet) then this is a lock
> on pg_authid.
> 
> Hmmm ... could the pathway involve an error exit from client
> authentication?  We're still finding bugs in the 9.0 rewrite of
> auth-time database access.

It does not seem restricted to pg_authid:

2011-08-24 18:35:57.445 24987  c23  apps  ERROR:  lock AccessShareLock on 
object 16403/2615/0 

And I think I've seen it on other tables too.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:55 PM, Tom Lane  wrote:
> Yeah, and for that matter it seems to let VACUUM off the hook too.
> If we assume that the reported object ID is non-corrupt (and if it's
> always the same, that seems like the way to bet) then this is a lock
> on pg_authid.
>
> Hmmm ... could the pathway involve an error exit from client
> authentication?  We're still finding bugs in the 9.0 rewrite of
> auth-time database access.

Well, according to Dave's report upthread, it's not only this one relation:

DG> The recent errors are:
DG> lock AccessShareLock on object 16403/2615/0 is already held
DG> which is for pg_namespace in database c23.

I thought about an error exit from client authentication, and that's a
somewhat appealing explanation, but I can't quite see why we wouldn't
clean up there the same as anywhere else.  The whole mechanism feels a
bit rickety to me - we don't actually release locks; we just abort the
transaction and *assume* that will cause locks to get released.  And
it should.  But there's a bit more action-at-a-distance there than I'd
ideally like.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Alvaro Herrera
Excerpts from Marti Raudsepp's message of mié sep 07 18:09:32 -0300 2011:
> On Wed, Sep 7, 2011 at 22:42, Alvaro Herrera  
> wrote:
> > A mishandled encoding conversion could be problematic, so that needs to
> > be carefully considered (perhaps just shut off unconditionally).
> 
> Are you referring to log file encoding or something else? The log file
> is already potentially mixed-encoding, as different databases may have
> different encodings and backends just dump bytes to it in their
> current encoding.

I am referring to the fact that whatever the backend puts in shared
memory is going to be in its encoding setting, which may not necessarily
match the postmaster's.  And if it doesn't, the postmaster might try to
convert it using settings not valid for the string, possibly leading to
crashes.

I remember we had bugs whereby an encoding conversion would fail,
leading to elog trying to report this problem, but this attempt would
also incur a conversion step, failing recursively until elog's stack got
full.  I'm not saying this is impossible to solve, just something to
keep in mind.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> On Wed, Sep 7, 2011 at 4:15 PM, Tom Lane  wrote:
>> Keep in mind that in the postmaster, elog(ERROR) *is* a crash.

> Right... but a function that returns -1 to indicate that something
> didn't work should be OK, as long as whatever it does is otherwise
> extremely boring.

The more functionality you put in the postmaster, the more likely it is
to trip over an elog(ERROR) somewhere.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] timezone GUC

2011-09-07 Thread Tom Lane
Magnus Hagander  writes:
> On Tue, Sep 6, 2011 at 23:52, Robert Haas  wrote:
>> On Tue, Sep 6, 2011 at 5:16 PM, Tom Lane  wrote:
>>> Although there's always more than one way to skin a cat.  Consider
>>> this idea:
>>> 
>>> 1. The hard-wired default for timezone is always UTC (or something
>>> else not dependent on environment).
>>> 
>>> 2. We put the identify_system_timezone work into initdb, and have it
>>> inject a non-default entry into postgresql.conf in the usual way
>>> if it can identify what the system zone is.
>>> 
>>> 3. Run-time dependency on TZ environment disappears altogether.
>>> 
>>> This basically means that instead of incurring that search on every
>>> postmaster start, we do it once at initdb.  If you change the
>>> postmaster's timezone environment, well, you gotta go change
>>> postgresql.conf.

>> Seems reasonable to me...

> +1.

I spent a bit of time on this idea last night.  The most painful part
actually seems to be translating identify_system_timezone to run in a
non-backend environment (no elog, etc).  The one thing I've run into
that doesn't seem straightforward is to decide where to look for the
timezone files.  If we have --with-system-tzdata then of course it's a
constant, but should we honor initdb's -L switch otherwise?  And if so,
how should we pass that into the pg_TZDIR code?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Marti Raudsepp
On Wed, Sep 7, 2011 at 22:42, Alvaro Herrera  wrote:
> A mishandled encoding conversion could be problematic, so that needs to
> be carefully considered (perhaps just shut off unconditionally).

Are you referring to log file encoding or something else? The log file
is already potentially mixed-encoding, as different databases may have
different encodings and backends just dump bytes to it in their
current encoding.

pg_verify_mbstr() could potentially be used with noError=true if we
can figure out the backend's current encoding, but that indeed sounds
like something to avoid.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> Tom's right to be skeptical of my theory, because it would require a
> CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
> pathways that use session-level locks, and I can't find one.

More to the point: session-level locks are released on error.  The only
way to get out of a transaction while still holding one is for the
VACUUM-or-whichever-command code to deliberately commit and exit while
still holding it.  An error exit path would release the lock.

> OTOH, I'm skeptical of the theory that this involves userlocks,
> because this whole thread started because of a complaint about lock
> 0/1260/0 already being held.  That ain't no userlock.

Yeah, and for that matter it seems to let VACUUM off the hook too.
If we assume that the reported object ID is non-corrupt (and if it's
always the same, that seems like the way to bet) then this is a lock
on pg_authid.

Hmmm ... could the pathway involve an error exit from client
authentication?  We're still finding bugs in the 9.0 rewrite of
auth-time database access.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Don't truncate integer part in to_char for 'FM99.'

2011-09-07 Thread Tom Lane
Marti Raudsepp  writes:
> On Wed, Sep 7, 2011 at 21:37, Tom Lane  wrote:
>> Hmm. I agree that this is a bug, but the proposed fix seems like a bit
>> of a kluge. Wouldn't it be better to make get_last_relevant_decnum
>> honor its contract, that is not delete any relevant digits?

> You're right, it was a kludge.

> Here's an improved version. I need to take a NUMProc* argument to do
> that right, because that's how it knows how many '0's to keep from the
> format.

Yeah, after fooling with it myself I saw that it was the interconnection
of the don't-suppress-'0'-format-positions logic with the find-the-last-
nonzero-digit logic that was confusing matters.  (formatting.c may not
be the most spaghetti-ish code I've ever worked with, but it's up
there.)  However, I don't think that inserting knowledge of the other
consideration into get_last_relevant_decnum is really the way to make
things cleaner.  Also, the way yours is set up, I'm dubious that it
does the right thing when the last '0' specifier is to the left of the
decimal point.  I'm currently testing this patch:


diff --git a/src/backend/utils/adt/formatting.c 
b/src/backend/utils/adt/formatting.c
index 
7efd988362889346af86c642f6ee660a4ae1b3d2..a7000250b0363165bee5697ad72036aad28b830e
 100644
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
*** NUM_prepare_locale(NUMProc *Np)
*** 3908,3913 
--- 3908,3916 
  /* --
   * Return pointer of last relevant number after decimal point
   *12.0500 --> last relevant is '5'
+  *12. --> last relevant is '.'
+  * If there is no decimal point, return NULL (which will result in same
+  * behavior as if FM hadn't been specified).
   * --
   */
  static char *
*** get_last_relevant_decnum(char *num)
*** 3921,3927 
  #endif
  
if (!p)
!   p = num;
result = p;
  
while (*(++p))
--- 3924,3931 
  #endif
  
if (!p)
!   return NULL;
! 
result = p;
  
while (*(++p))
*** NUM_processor(FormatNode *node, NUMDesc 
*** 4458,4470 
{
Np->num_pre = plen;
  
!   if (IS_FILLMODE(Np->Num))
{
!   if (IS_DECIMAL(Np->Num))
!   Np->last_relevant = get_last_relevant_decnum(
!   
 Np->number +
!
((Np->Num->zero_end - Np->num_pre > 0) ?
! 
Np->Num->zero_end - Np->num_pre : 0));
}
  
if (Np->sign_wrote == FALSE && Np->num_pre == 0)
--- 4462,4483 
{
Np->num_pre = plen;
  
!   if (IS_FILLMODE(Np->Num) && IS_DECIMAL(Np->Num))
{
!   Np->last_relevant = 
get_last_relevant_decnum(Np->number);
! 
!   /*
!* If any '0' specifiers are present, make sure we 
don't strip
!* those digits.
!*/
!   if (Np->last_relevant && Np->Num->zero_end > 
Np->num_pre)
!   {
!   char   *last_zero;
! 
!   last_zero = Np->number + (Np->Num->zero_end - 
Np->num_pre);
!   if (Np->last_relevant < last_zero)
!   Np->last_relevant = last_zero;
!   }
}
  
if (Np->sign_wrote == FALSE && Np->num_pre == 0)


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Don't truncate integer part in to_char for 'FM99.'

2011-09-07 Thread Marti Raudsepp
On Wed, Sep 7, 2011 at 21:37, Tom Lane  wrote:
> Hmm.  I agree that this is a bug, but the proposed fix seems like a bit
> of a kluge. Wouldn't it be better to make get_last_relevant_decnum
> honor its contract, that is not delete any relevant digits?

You're right, it was a kludge.

Here's an improved version. I need to take a NUMProc* argument to do
that right, because that's how it knows how many '0's to keep from the
format.

What do you think?

Regards,
Marti
From d0264d8fe8179716bfd58e12201e456387ca5469 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Wed, 7 Sep 2011 20:12:58 +0300
Subject: [PATCH] Don't truncate integer part in to_char for 'FM99.'

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', now it will return '10'
---
 src/backend/utils/adt/formatting.c|   29 -
 src/test/regress/expected/numeric.out |6 ++
 src/test/regress/sql/numeric.sql  |2 ++
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 7efd988..eada4a8 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -979,7 +979,7 @@ static char *fill_str(char *str, int c, int max);
 static FormatNode *NUM_cache(int len, NUMDesc *Num, text *pars_str, bool *shouldFree);
 static char *int_to_roman(int number);
 static void NUM_prepare_locale(NUMProc *Np);
-static char *get_last_relevant_decnum(char *num);
+static char *get_last_relevant_decnum(NUMProc *Np);
 static void NUM_numpart_from_char(NUMProc *Np, int id, int plen);
 static void NUM_numpart_to_char(NUMProc *Np, int id);
 static char *NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
@@ -3911,17 +3911,26 @@ NUM_prepare_locale(NUMProc *Np)
  * --
  */
 static char *
-get_last_relevant_decnum(char *num)
+get_last_relevant_decnum(NUMProc *Np)
 {
 	char	   *result,
-			   *p = strchr(num, '.');
+			   *p;
 
 #ifdef DEBUG_TO_FROM_CHAR
 	elog(DEBUG_elog_output, "get_last_relevant_decnum()");
 #endif
 
-	if (!p)
-		p = num;
+	if (Np->Num->zero_end > Np->num_pre)
+		/* Keep digits according to '0's in the format */
+		p = Np->number + Np->Num->zero_end - Np->num_pre;
+	else
+	{
+		p = strchr(Np->number, '.');
+
+		if (!p)
+			return NULL;
+	}
+
 	result = p;
 
 	while (*(++p))
@@ -4458,14 +4467,8 @@ NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
 	{
 		Np->num_pre = plen;
 
-		if (IS_FILLMODE(Np->Num))
-		{
-			if (IS_DECIMAL(Np->Num))
-Np->last_relevant = get_last_relevant_decnum(
-			 Np->number +
-	 ((Np->Num->zero_end - Np->num_pre > 0) ?
-	  Np->Num->zero_end - Np->num_pre : 0));
-		}
+		if (IS_FILLMODE(Np->Num) && IS_DECIMAL(Np->Num))
+			Np->last_relevant = get_last_relevant_decnum(Np);
 
 		if (Np->sign_wrote == FALSE && Np->num_pre == 0)
 			++Np->num_count;
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index d9927b7..e12ab5b 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1154,6 +1154,12 @@ SELECT '' AS to_char_23, to_char(val, '9.999')FROM num_data;
 | -2.493e+07
 (10 rows)
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+ to_char_24 | to_char 
++-
+| 100
+(1 row)
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a1435ec..d552526 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -764,6 +764,8 @@ SELECT '' AS to_char_21, to_char(val, '99SG99')			FROM num_data;
 SELECT '' AS to_char_22, to_char(val, 'FM.999')	FROM num_data;
 SELECT '' AS to_char_23, to_char(val, '9.999')FROM num_data;
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
-- 
1.7.6.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:15 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Sep 7, 2011 at 3:42 PM, Alvaro Herrera
>>  wrote:
>>> A mishandled encoding conversion could be problematic, so that needs to
>>> be carefully considered (perhaps just shut off unconditionally).
>
>> It's not really a problem for the postmaster if something just plain
>> old fails.  Where we get into trouble is if it manages to (a) crash,
>> (b) take an excessive amount of time to complete, or (c) screw up the
>> postmaster state in some way we can't recover from.  But if any of
>> those are an issue then, yeah, just shut it off.
>
> Keep in mind that in the postmaster, elog(ERROR) *is* a crash.

Right... but a function that returns -1 to indicate that something
didn't work should be OK, as long as whatever it does is otherwise
extremely boring.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:22 PM, daveg  wrote:
> Yes, we make extensive use of advisory locks. That was my thought too when
> Robert mentioned session level locks.
>
> I'm happy to add any additional instrumentation, but my client would be
> happier to actually run it if there was a way to recover from this without
> an unplanned outage. Is there something I can do when the patch detects the
> problem to be able to continue without a restart?

Well, basically, you want to do the same thing that LockRelease()
would do - remove the PROCLOCK from the SHM_QUEUE and delete it from
the hash table, adjusting the counts in the LOCK object as
appropriate.  If you just ignore the failure, you'll get the "blah
blah blah is already held" messages you were having before.

Tom's right to be skeptical of my theory, because it would require a
CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
pathways that use session-level locks, and I can't find one.

OTOH, I'm skeptical of the theory that this involves userlocks,
because this whole thread started because of a complaint about lock
0/1260/0 already being held.  That ain't no userlock.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 10:20:24AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > After spending some time staring at the code, I do have one idea as to
> > what might be going on here.  When a backend is terminated,
> > ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
> > LockReleaseAll(USER_LOCKMETHOD, true).  The second call releases all
> > user locks, and the first one (or so we suppose) releases all normal
> > locks as part of aborting the transaction.  But what if there's no
> > transaction in progress?  In that case, AbortOutOfAnyTransaction()
> > won't do anything - which is fine as far as transaction-level locks
> > go, because we shouldn't be holding any of those anyway if there's no
> > transaction in progress.  However, if we hold a session-level lock at
> > that point, then we'd orphan it.  We don't make much use of session
> > locks.  As far as I can see, they are used by (1) VACUUM, (2) CREATE
> > INDEX CONCURRENTLY, (3) ALTER DATABASE .. SET TABLESPACE, and (4) on
> > standby servers, redo of DROP DATABASE actions.  Any chance one of
> > those died or was killed off around the time this happened?
> 
> I don't believe this theory at all, because if that were the issue,
> we'd have heard about it long since.  The correct theory has to involve
> a very-little-used triggering condition.  At the moment I'm wondering
> about advisory (userspace) locks ... Dave, do your apps use any of those?

Yes, we make extensive use of advisory locks. That was my thought too when
Robert mentioned session level locks.

I'm happy to add any additional instrumentation, but my client would be
happier to actually run it if there was a way to recover from this without
an unplanned outage. Is there something I can do when the patch detects the
problem to be able to continue without a restart? Is is save to just reset
the proclock queue? I don't think they would mind leaking locks, for instance,
and a later planned restart to clear it up as much as they mind unscheduled
downtime.

Thank

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> On Wed, Sep 7, 2011 at 3:42 PM, Alvaro Herrera
>  wrote:
>> A mishandled encoding conversion could be problematic, so that needs to
>> be carefully considered (perhaps just shut off unconditionally).

> It's not really a problem for the postmaster if something just plain
> old fails.  Where we get into trouble is if it manages to (a) crash,
> (b) take an excessive amount of time to complete, or (c) screw up the
> postmaster state in some way we can't recover from.  But if any of
> those are an issue then, yeah, just shut it off.

Keep in mind that in the postmaster, elog(ERROR) *is* a crash.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:00 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Excerpts from Robert Haas's message of mar sep 06 19:57:07 -0300 2011:
>>> TBH, I'm very unclear what could cause the postmaster to go belly-up
>>> copying a bounded amount of data out of shared memory for logging
>>> purposes only.  It's surely possible to make the code safe against any
>>> sequence of bytes that might be found there.
>
>> A mishandled encoding conversion could be problematic, so that needs to
>> be carefully considered (perhaps just shut off unconditionally).
>
> That, and the question of exactly what makes the amount bounded, and

The fact that it's a fixed-size chunk of shmem.  We won't copy more
bytes than the size of the buffer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Tom Lane
Simon Riggs  writes:
> Please lets not waste effort on refactoring efforts in mid dev cycle.

Say what?  When else would you have us do it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 3:42 PM, Alvaro Herrera
 wrote:
>> TBH, I'm very unclear what could cause the postmaster to go belly-up
>> copying a bounded amount of data out of shared memory for logging
>> purposes only.  It's surely possible to make the code safe against any
>> sequence of bytes that might be found there.
>
> A mishandled encoding conversion could be problematic, so that needs to
> be carefully considered (perhaps just shut off unconditionally).

It's not really a problem for the postmaster if something just plain
old fails.  Where we get into trouble is if it manages to (a) crash,
(b) take an excessive amount of time to complete, or (c) screw up the
postmaster state in some way we can't recover from.  But if any of
those are an issue then, yeah, just shut it off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Robert Haas's message of mar sep 06 19:57:07 -0300 2011:
>> TBH, I'm very unclear what could cause the postmaster to go belly-up
>> copying a bounded amount of data out of shared memory for logging
>> purposes only.  It's surely possible to make the code safe against any
>> sequence of bytes that might be found there.

> A mishandled encoding conversion could be problematic, so that needs to
> be carefully considered (perhaps just shut off unconditionally).

That, and the question of exactly what makes the amount bounded, and
probably six other things that could go wrong.  But I'm sure Andrew
won't be pleased with a proposal to inject unknown-encoding data into
the logs.

I remain of the opinion that this needs to be kept out of the postmaster.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem of win32.mak

2011-09-07 Thread Bruce Momjian
Hiroshi Saito wrote:
> Hi Magnus-san and Bruce-san.
> 
> I am sorry in a very late reaction...
> 
> Is it enough for a 9.1release?
> libpq of bcc32 and win32 has a problem.
> 
> == error of nmake ==
>? .\Release\libpqdll.lib ???
> .\Release\libpqdll.exp 
> libpq.lib(fe-connect.obj) : error LNK2019: ??
> _inet_net_ntop ??? _connectFailureMessage ?
> libpq.lib(getaddrinfo.obj) : error LNK2001: ??
> "_inet_net_ntop" ???
> libpq.lib(fe-connect.obj) : error LNK2019: ??
> _pg_get_encoding_from_locale ??? _PQsetClientEncoding ?
> .\Release\libpq.dll : fatal error LNK1120:  2 ???
> == end ==
> 
> Please take this into consideration.

Applied and backpatched to 9.1.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Simon Riggs
On Wed, Sep 7, 2011 at 7:12 PM, Tom Lane  wrote:
> Bruce Momjian  writes:
>> Robert Haas wrote:
>>> I was less concerned about the breakage that might be caused by
>>> variables acquiring unintended referents - which should be unlikely
>>> anyway given reasonable variable naming conventions - and more
>>> concerned that the associated refactoring would break recovery.  We
>>> have no recovery regression tests; that's not a good thing.
>
>> So we are talking about more than moving files between functions?  Yes,
>> it would be risky to restruction functions, but for someone who
>> understand that code, it might be safe.
>
> The pgrminclude-induced bug you just fixed shows a concrete way in which
> moving code from one file to another might silently break it, ie, it
> still compiles despite lack of definition of some symbol it's intended
> to see.
>
> Having said that, I tend to agree that xlog.c is getting so large and
> messy that it needs to be broken up.  But I'm not in favor of breaking
> up files just because they're large, eg, ruleutils.c is not in need of
> such treatment.  The problem with xlog.c is that it seems to be dealing
> with many more considerations than it originally did.

I agree as well, though we've spawned many new files and directories
in the last 7 years. Almost nothing has gone in there that didn't need
to.

As long as we accept its not a priority, I'll do some work to slide
things away and we can do it over time.

Please lets not waste effort on refactoring efforts in mid dev cycle.
Having this done by someone without good experience is just going to
waste all of our time and sneak bugs into something that does actually
work rather well.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-07 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar sep 06 19:57:07 -0300 2011:
> On Tue, Sep 6, 2011 at 6:05 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Tue, Sep 6, 2011 at 5:34 PM, Tom Lane  wrote:
> >>> And I doubt
> >>> that the goal is worth taking risks for.
> >
> >> I am unable to count the number of times that I have had a customer
> >> come to me and say "well, the backend crashed".  And I go look at
> >> their logs and I have no idea what happened.
> >
> > gdb and print debug_query_string?
> 
> Surely you're kidding.  These are customer systems which I frequently
> don't even have access to.  They don't always have gdb installed
> (sometimes they are Windows systems) and if they do the customer isn't
> likely to know how to use it, and even if they do they don't think the
> better of us for needing such a tool to troubleshoot a crash.

I'm with Robert on this ... been there, got that look.


> TBH, I'm very unclear what could cause the postmaster to go belly-up
> copying a bounded amount of data out of shared memory for logging
> purposes only.  It's surely possible to make the code safe against any
> sequence of bytes that might be found there.

A mishandled encoding conversion could be problematic, so that needs to
be carefully considered (perhaps just shut off unconditionally).

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Tom Lane's message of mié sep 07 14:49:45 -0300 2011:
>> I don't think we can just s/DEBUG3/LOG/, because of the
>> log clutter that will result when they all think there's a problem.

> I was thinking that the solution would be to promote DEBUG3 to LOG in
> the case of a custom variable.  This would be noisy as you say, but I
> don't see any other option; at least it would just be the custom
> variables.

That's not much of a fix because (a) the behavior is still pretty
undesirable, and (b) it supposes that this sort of failure can only
occur for custom variables.  The previous discussion arose from a
different case entirely --- IIRC, it was from trying to specify
client_encoding in postgresql.conf, which the postmaster was happy with
but some backends were not because they had a database_encoding for
which there was no conversion.  There are probably a bunch of other
possibilities out there that we haven't hit yet, and if not today, there
certainly will be more in the future.  So I'm not very excited by a
proposed fix that makes assumptions about the exact reason why a process
rejects a particular GUC value.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Andrew Dunstan



On 09/07/2011 03:18 PM, Robert Haas wrote:

Yeah, I guess we don't have many good short-term options.  I continue
to think that this whole facility is mis-designed, though.


I agree. I have tripped over it several times.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About the performance of startup after dropping many tables

2011-09-07 Thread Alvaro Herrera
Excerpts from Gan Jiadong's message of vie feb 18 03:42:02 -0300 2011:
> Hi,
> 
>   Thanks for your reply. 
>   Of course, we will think about whether 4 relations dropping is
> reasonable. In fact, this happens in a very special scenario . 
>   But when we analyzed this issue, we found the PG code can be rewritten to
> achieve better performance. Or we can say the arithmetic of this part is not
> good enough. 
>   For example, by doing the refactoring as we done, the startup time can be
> reduced from 3 minutes to 8 seconds, It is quite a great improvement,
> especially for the systems with low TTR (time to recovery) requirement.
> 
>   There is any problem or risk to change this part of code as we suggested?

The only way to know would be to show the changes.  If you were to
submit the patch, and assuming we agree on the design and
implementation, we could even consider including it (or, more likely,
some derivate of it).

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 2:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Sep 7, 2011 at 2:05 PM, Tom Lane  wrote:
>>> The ones in auto_explain and pg_stat_statements aren't too problematic
>>> because those modules are designed to be preloaded by the postmaster.
>>> We've avoided adding such variables in modules that aren't intended
>>> to be used that way.
>
>> What about plpgsql.variable_conflict?
>
> That one's not really meant to be changed on the fly either; we have
> #variable_conflict instead.
>
> The reason why this is hard, and not just a bug to be fixed, is that
> it's not clear what to do.  Part of the issue is that we don't remember
> whether the current setting was applied by a superuser or not, but even
> if we did remember that, what happens at LOAD time if it wasn't?
> Silently replacing the value isn't appealing, and neither are the other
> options.  It's better to not have such a variable in the first place.

Yeah, I guess we don't have many good short-term options.  I continue
to think that this whole facility is mis-designed, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mié sep 07 14:49:45 -0300 2011:
> Alvaro Herrera  writes:
> > Another thing we detected some days ago is that a custom variable in a
> > module not loaded by postmaster, does not seem to get reported
> > appropriately when an invalid value is set on postgresql.conf and
> > reloaded: backends report problems with DEBUG3, only postmaster uses
> > LOG, but since postmaster isn't loading the module, nothing happens.
> 
> This is just an instance of the general problem that the current design
> assumes all processes will have the same opinion about the validity of
> settings read from postgresql.conf.  We discussed that back in July:
> http://archives.postgresql.org/pgsql-hackers/2011-07/msg00850.php
> but it wasn't clear to me that any consensus had been reached about how
> to change the behavior.  I proposed that we should let processes adopt
> individual settings even if they thought other ones were invalid; that
> gets rid of some of the issues, but it doesn't really address how we
> should report problems when only some of the processes think there's a
> problem.

Yeah; in this particular case, the value is just plain invalid for
everybody.  I think it just happens that postmaster didn't see it
because it was valid when it was started (i.e. the file got edited and a
SIGHUP sent, introducing the invalid value but not adopted anywhere).

> I don't think we can just s/DEBUG3/LOG/, because of the
> log clutter that will result when they all think there's a problem.

I was thinking that the solution would be to promote DEBUG3 to LOG in
the case of a custom variable.  This would be noisy as you say, but I
don't see any other option; at least it would just be the custom
variables.  This didn't work for reasons that we haven't been
investigated yet (we discovered this late last week and I've been busy
with 9.1 translations).

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-09-07 Thread Bruce Momjian
Bruce Momjian wrote:
> Tom Lane wrote:
> > hubert depesz lubaczewski  writes:
> > > Worked a bit to get the ltree problem down to smallest possible, 
> > > repeatable, situation.
> > 
> > I looked at this again and verified that indeed, commit
> > 8eee65c996048848c20f6637c1d12b319a4ce244 introduced an incompatible
> > change into the on-disk format of ltree columns: it widened
> > ltree_level.len, which is one component of an ltree on disk.
> > So the crash is hardly surprising.  I think that the only thing
> > pg_upgrade could do about it is refuse to upgrade when ltree columns
> > are present in an 8.3 database.  I'm not sure though how you'd identify
> > contrib/ltree versus some random user-defined type named ltree.
> 
> It is actually easy to do using the attached patch.  I check for the
> functions that support the data type and check of they are from an
> 'ltree' shared object.  I don't check actual user table type names in
> this case.

Attached patch applied to 9.0, 9.1, and HEAD.  Doc changes included.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index 37c38c1..720f130
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** check_old_cluster(migratorContext *ctx, 
*** 72,77 
--- 72,78 
  	{
  		old_8_3_check_for_name_data_type_usage(ctx, CLUSTER_OLD);
  		old_8_3_check_for_tsquery_usage(ctx, CLUSTER_OLD);
+ 		old_8_3_check_ltree_usage(ctx, CLUSTER_OLD);
  		if (ctx->check)
  		{
  			old_8_3_rebuild_tsvector_tables(ctx, true, CLUSTER_OLD);
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index 41c4b11..7a02fa1
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
*** void old_8_3_check_for_name_data_type_us
*** 394,399 
--- 394,401 
  	   Cluster whichCluster);
  void old_8_3_check_for_tsquery_usage(migratorContext *ctx,
  Cluster whichCluster);
+ void old_8_3_check_ltree_usage(migratorContext *ctx,
+ Cluster whichCluster);
  void old_8_3_rebuild_tsvector_tables(migratorContext *ctx,
  bool check_mode, Cluster whichCluster);
  void old_8_3_invalidate_hash_gin_indexes(migratorContext *ctx,
diff --git a/contrib/pg_upgrade/version_old_8_3.c b/contrib/pg_upgrade/version_old_8_3.c
new file mode 100644
index 6fcd61b..7e3a7aa
*** a/contrib/pg_upgrade/version_old_8_3.c
--- b/contrib/pg_upgrade/version_old_8_3.c
*** old_8_3_check_for_tsquery_usage(migrator
*** 204,209 
--- 204,289 
  
  
  /*
+  *	old_8_3_check_ltree_usage()
+  *	8.3 -> 8.4
+  *	The internal ltree structure was changed in 8.4 so upgrading is impossible.
+  */
+ void
+ old_8_3_check_ltree_usage(migratorContext *ctx, Cluster whichCluster)
+ {
+ 	ClusterInfo *active_cluster = (whichCluster == CLUSTER_OLD) ?
+ 	&ctx->old : &ctx->new;
+ 	int			dbnum;
+ 	FILE	   *script = NULL;
+ 	bool		found = false;
+ 	char		output_path[MAXPGPATH];
+ 
+ 	prep_status(ctx, "Checking for /contrib/ltree");
+ 
+ 	snprintf(output_path, sizeof(output_path), "%s/contrib_ltree.txt",
+ 			 ctx->cwd);
+ 
+ 	for (dbnum = 0; dbnum < active_cluster->dbarr.ndbs; dbnum++)
+ 	{
+ 		PGresult   *res;
+ 		bool		db_used = false;
+ 		int			ntups;
+ 		int			rowno;
+ 		int			i_nspname,
+ 	i_proname;
+ 		DbInfo	   *active_db = &active_cluster->dbarr.dbs[dbnum];
+ 		PGconn	   *conn = connectToServer(ctx, active_db->db_name, whichCluster);
+ 
+ 		/* Find any functions coming from contrib/ltree */
+ 		res = executeQueryOrDie(ctx, conn,
+ "SELECT n.nspname, p.proname "
+ "FROM	pg_catalog.pg_proc p, "
+ "		pg_catalog.pg_namespace n "
+ "WHERE	p.pronamespace = n.oid AND "
+ "		p.probin = '$libdir/ltree'");
+ 
+ 		ntups = PQntuples(res);
+ 		i_nspname = PQfnumber(res, "nspname");
+ 		i_proname = PQfnumber(res, "proname");
+ 		for (rowno = 0; rowno < ntups; rowno++)
+ 		{
+ 			found = true;
+ 			if (script == NULL && (script = fopen(output_path, "w")) == NULL)
+ pg_log(ctx, PG_FATAL, "Could not create necessary file:  %s\n", output_path);
+ 			if (!db_used)
+ 			{
+ fprintf(script, "Database:  %s\n", active_db->db_name);
+ db_used = true;
+ 			}
+ 			fprintf(script, "  %s.%s\n",
+ 	PQgetvalue(res, rowno, i_nspname),
+ 	PQgetvalue(res, rowno, i_proname));
+ 		}
+ 
+ 		PQclear(res);
+ 
+ 		PQfinish(conn);
+ 	}
+ 
+ 	if (found)
+ 	{
+ 		fclose(script);
+ 		pg_log(ctx, PG_REPORT, "fatal\n");
+ 		pg_log(ctx, PG_FATAL,
+ 			   "| Your installation contains the \"ltree\" data type.  This data type\n"
+ 			   "| changed its internal storage format between your old and new clusters so this\n"
+ 			   "| cluster cannot currently be upgraded.  You can manually upgrade databases\n"
+ 			   "| that use \"contrib/ltree\" facilities and remove \"contrib/ltree\" 

Re: [HACKERS] [PATCH] Don't truncate integer part in to_char for 'FM99.'

2011-09-07 Thread Tom Lane
Marti Raudsepp  writes:
> This patch fixes an edge case bug in the numeric to_char() function.

> When the numeric to_char format used fillmode (FM), didn't contain 0s
> and had a trailing dot, the integer part of the number was truncated in
> error.

> to_char(10, 'FM99.') used to return '1', after this patch it will return '10'


Hmm.  I agree that this is a bug, but the proposed fix seems like a bit
of a kluge.  Wouldn't it be better to make get_last_relevant_decnum
honor its contract, that is not delete any relevant digits?  I'm
thinking instead of this

if (!p)
p = num;

when there is no decimal point it should do something like

if (!p)
return num + strlen(num) - 1;

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> On Wed, Sep 7, 2011 at 2:05 PM, Tom Lane  wrote:
>> The ones in auto_explain and pg_stat_statements aren't too problematic
>> because those modules are designed to be preloaded by the postmaster.
>> We've avoided adding such variables in modules that aren't intended
>> to be used that way.

> What about plpgsql.variable_conflict?

That one's not really meant to be changed on the fly either; we have
#variable_conflict instead.

The reason why this is hard, and not just a bug to be fixed, is that
it's not clear what to do.  Part of the issue is that we don't remember
whether the current setting was applied by a superuser or not, but even
if we did remember that, what happens at LOAD time if it wasn't?
Silently replacing the value isn't appealing, and neither are the other
options.  It's better to not have such a variable in the first place.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Tom Lane
Bruce Momjian  writes:
> Robert Haas wrote:
>> I was less concerned about the breakage that might be caused by
>> variables acquiring unintended referents - which should be unlikely
>> anyway given reasonable variable naming conventions - and more
>> concerned that the associated refactoring would break recovery.  We
>> have no recovery regression tests; that's not a good thing.

> So we are talking about more than moving files between functions?  Yes,
> it would be risky to restruction functions, but for someone who
> understand that code, it might be safe.

The pgrminclude-induced bug you just fixed shows a concrete way in which
moving code from one file to another might silently break it, ie, it
still compiles despite lack of definition of some symbol it's intended
to see.

Having said that, I tend to agree that xlog.c is getting so large and
messy that it needs to be broken up.  But I'm not in favor of breaking
up files just because they're large, eg, ruleutils.c is not in need of
such treatment.  The problem with xlog.c is that it seems to be dealing
with many more considerations than it originally did.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 2:05 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Sep 7, 2011 at 1:55 PM, Tom Lane  wrote:
>>> You can't set a custom SUSET variable in advance of loading the module,
>>> because the system has no way to know it should enforce superuser
>>> restrictions on setting it. For the most part, this is a good reason to
>>> avoid custom SUSET variables.
>
>> Apparently we haven't taken that advice ourselves?
>
> The ones in auto_explain and pg_stat_statements aren't too problematic
> because those modules are designed to be preloaded by the postmaster.
> We've avoided adding such variables in modules that aren't intended
> to be used that way.

What about plpgsql.variable_conflict?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> On Wed, Sep 7, 2011 at 1:55 PM, Tom Lane  wrote:
>> You can't set a custom SUSET variable in advance of loading the module,
>> because the system has no way to know it should enforce superuser
>> restrictions on setting it. For the most part, this is a good reason to
>> avoid custom SUSET variables.

> Apparently we haven't taken that advice ourselves?

The ones in auto_explain and pg_stat_statements aren't too problematic
because those modules are designed to be preloaded by the postmaster.
We've avoided adding such variables in modules that aren't intended
to be used that way.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 1:55 PM, Tom Lane  wrote:
> Pavel Stehule  writes:
>> When we have a custom SUSET variable, like plpgsql.variable_conflikt,
>> then setting this variable before plpgsql initalisation raises a
>> exception, but it raise a exception when some plpgsql function is
>> created. Try to execute a attached script - a set statement is ok, but
>> CREATE FUNCTION fails.
>
> You can't set a custom SUSET variable in advance of loading the module,
> because the system has no way to know it should enforce superuser
> restrictions on setting it.  For the most part, this is a good reason to
> avoid custom SUSET variables.

Apparently we haven't taken that advice ourselves?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Tom Lane
Pavel Stehule  writes:
> When we have a custom SUSET variable, like plpgsql.variable_conflikt,
> then setting this variable before plpgsql initalisation raises a
> exception, but it raise a exception when some plpgsql function is
> created. Try to execute a attached script - a set statement is ok, but
> CREATE FUNCTION fails.

You can't set a custom SUSET variable in advance of loading the module,
because the system has no way to know it should enforce superuser
restrictions on setting it.  For the most part, this is a good reason to
avoid custom SUSET variables.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Tom Lane
Alvaro Herrera  writes:
> Another thing we detected some days ago is that a custom variable in a
> module not loaded by postmaster, does not seem to get reported
> appropriately when an invalid value is set on postgresql.conf and
> reloaded: backends report problems with DEBUG3, only postmaster uses
> LOG, but since postmaster isn't loading the module, nothing happens.

This is just an instance of the general problem that the current design
assumes all processes will have the same opinion about the validity of
settings read from postgresql.conf.  We discussed that back in July:
http://archives.postgresql.org/pgsql-hackers/2011-07/msg00850.php
but it wasn't clear to me that any consensus had been reached about how
to change the behavior.  I proposed that we should let processes adopt
individual settings even if they thought other ones were invalid; that
gets rid of some of the issues, but it doesn't really address how we
should report problems when only some of the processes think there's a
problem.  I don't think we can just s/DEBUG3/LOG/, because of the
log clutter that will result when they all think there's a problem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Bruce Momjian
Robert Haas wrote:
> > IMHO, when manipulating code at this sort of macro scale, it's good to
> > be paranoid and exhaustive, particularly when it doesn't cost you
> > anything anyway. Unlike when writing or fixing a bit of code at a
> > time, which we're all used to, if the compiler doesn't tell you about
> > it, your chances of catching the problem before a bug manifests itself
> > are close to zero.
> 
> I was less concerned about the breakage that might be caused by
> variables acquiring unintended referents - which should be unlikely
> anyway given reasonable variable naming conventions - and more
> concerned that the associated refactoring would break recovery.  We
> have no recovery regression tests; that's not a good thing.

So we are talking about more than moving files between functions?  Yes,
it would be risky to restruction functions, but for someone who
understand that code, it might be safe.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-07 Thread Robert Haas
On Tue, Sep 6, 2011 at 9:14 PM, Peter Geoghegan  wrote:
> On 7 September 2011 01:18, Bruce Momjian  wrote:
>> I am confused how moving a function from one C file to another could
>> cause breakage?
>
> I'm really concerned about silent breakage, however unlikely - that is
> also Simon and Robert's concern, and rightly so. If it's possible in
> principle to guard against a certain type of problem, we should do so.
> While this is a mechanical process, it isn't quite that mechanical a
> process - I would not expect this work to be strictly limited to
> simply spreading some functions around different files. Certainly, if
> there are any other factors at all that could disrupt things, a
> problem that does not cause a compiler warning or error is vastly more
> troublesome than one that does, and the most plausible such error is
> if a symbol is somehow resolved differently when the function is
> moved. I suppose that the simplest way that this could happen is
> probably by somehow having a variable of the same name and type appear
> twice, once as a static, the other time as a global.
>
> IMHO, when manipulating code at this sort of macro scale, it's good to
> be paranoid and exhaustive, particularly when it doesn't cost you
> anything anyway. Unlike when writing or fixing a bit of code at a
> time, which we're all used to, if the compiler doesn't tell you about
> it, your chances of catching the problem before a bug manifests itself
> are close to zero.

I was less concerned about the breakage that might be caused by
variables acquiring unintended referents - which should be unlikely
anyway given reasonable variable naming conventions - and more
concerned that the associated refactoring would break recovery.  We
have no recovery regression tests; that's not a good thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Don't truncate integer part in to_char for 'FM99.'

2011-09-07 Thread Marti Raudsepp
Hi,

This patch fixes an edge case bug in the numeric to_char() function.

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', after this patch it will return '10'

This is known to affect the format() function in the mysqlcompat
pgFoundry project.

Regards,
Marti
From 848076a119c39782ef854ac940f14f5975430b4e Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Wed, 7 Sep 2011 20:12:58 +0300
Subject: [PATCH] Don't truncate integer part in to_char for 'FM99.'

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', now it will return '10'
---
 src/backend/utils/adt/formatting.c|6 +-
 src/test/regress/expected/numeric.out |6 ++
 src/test/regress/sql/numeric.sql  |2 ++
 3 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 7efd988..37a819e 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -4460,7 +4460,11 @@ NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
 
 		if (IS_FILLMODE(Np->Num))
 		{
-			if (IS_DECIMAL(Np->Num))
+			/*
+			 * Only call get_last_relevant_decnum if the number has fractional
+			 * digits, otherwise the result is bogus.
+			 */
+			if (IS_DECIMAL(Np->Num) && Np->Num->post > 0)
 Np->last_relevant = get_last_relevant_decnum(
 			 Np->number +
 	 ((Np->Num->zero_end - Np->num_pre > 0) ?
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index d9927b7..e12ab5b 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1154,6 +1154,12 @@ SELECT '' AS to_char_23, to_char(val, '9.999')FROM num_data;
 | -2.493e+07
 (10 rows)
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+ to_char_24 | to_char 
++-
+| 100
+(1 row)
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a1435ec..d552526 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -764,6 +764,8 @@ SELECT '' AS to_char_21, to_char(val, '99SG99')			FROM num_data;
 SELECT '' AS to_char_22, to_char(val, 'FM.999')	FROM num_data;
 SELECT '' AS to_char_23, to_char(val, '9.999')FROM num_data;
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
-- 
1.7.6.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] timezone GUC

2011-09-07 Thread Magnus Hagander
On Tue, Sep 6, 2011 at 23:52, Robert Haas  wrote:
> On Tue, Sep 6, 2011 at 5:16 PM, Tom Lane  wrote:
>> I wrote:
>>> Robert Haas  writes:
 I am (and, I think, Alvaro is also) of the opinion that the behavior
 here is still not really right.
>>
>>> I don't see a practical way to do better unless we can find a less
>>> horridly inefficient way of implementing identify_system_timezone().
>>
>> Although there's always more than one way to skin a cat.  Consider
>> this idea:
>>
>> 1. The hard-wired default for timezone is always UTC (or something
>> else not dependent on environment).
>>
>> 2. We put the identify_system_timezone work into initdb, and have it
>> inject a non-default entry into postgresql.conf in the usual way
>> if it can identify what the system zone is.
>>
>> 3. Run-time dependency on TZ environment disappears altogether.
>>
>> This basically means that instead of incurring that search on every
>> postmaster start, we do it once at initdb.  If you change the
>> postmaster's timezone environment, well, you gotta go change
>> postgresql.conf.
>>
>> IMO this would be less DBA-friendly in practice, but only very
>> marginally so; and getting rid of the special initialization behavior of
>> the timezone GUC might well be considered sufficient recompense.
>
> Seems reasonable to me...

+1.

I'm not sure I agree it's less DBA-friendly, really - given that it
makes it more consistent..


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Alvaro Herrera
Excerpts from Pavel Stehule's message of mié sep 07 14:23:43 -0300 2011:
> Hello
> 
> Andy Colson found a bug in GUC implementation.
> 
> When we have a custom SUSET variable, like plpgsql.variable_conflikt,
> then setting this variable before plpgsql initalisation raises a
> exception, but it raise a exception when some plpgsql function is
> created. Try to execute a attached script - a set statement is ok, but
> CREATE FUNCTION fails.

Another thing we detected some days ago is that a custom variable in a
module not loaded by postmaster, does not seem to get reported
appropriately when an invalid value is set on postgresql.conf and
reloaded: backends report problems with DEBUG3, only postmaster uses
LOG, but since postmaster isn't loading the module, nothing happens.

(Maybe this is our bug but it doesn't seem like it.)

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] typo

2011-09-07 Thread Tom Lane
Euler Taveira de Oliveira  writes:
> While updating the translation I noticed a typo in
> src/backend/commands/collationcmds.c circa line 126.
> parameter \"lc_collate\" parameter must be specified

Will fix, thanks for spotting it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] custom variables and PGC_SUSET issue

2011-09-07 Thread Pavel Stehule
Hello

Andy Colson found a bug in GUC implementation.

When we have a custom SUSET variable, like plpgsql.variable_conflikt,
then setting this variable before plpgsql initalisation raises a
exception, but it raise a exception when some plpgsql function is
created. Try to execute a attached script - a set statement is ok, but
CREATE FUNCTION fails.

repeated setting this GUC raise a strange message

postgres=# \i script.sql
SET
before create function
psql:script.sql:13: ERROR:  42501: permission denied to set parameter
"plpgsql.variable_conflict"
LOCATION:  set_config_option, guc.c:5208
after function
postgres=# \i script.sql
SET
before create function
psql:script.sql:13: ERROR:  XX000: attempt to redefine parameter
"plpgsql.variable_conflict"
LOCATION:  define_custom_variable, guc.c:6333
after function

Regards

Pavel Stehule
--load 'plpgsql';

\set VERBOSITY verbose

set plpgsql.variable_conflict = use_variable;

\echo before create function

create or replace function test1(a integer) returns integer as $$
begin
return a+1;
end;
$$ language plpgsql;

\echo after function

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 09/07/2011 12:05 PM, Tom Lane wrote:
>> LEAKPROOF isn't too bad.

> It's fairly opaque unless you know the context, although that might be 
> said of some of our other terms too. Someone coming across it is going 
> to think "What would it leak?"

Well, the whole point is that we want people to RTFM instead of assuming
they know what it means ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] First bug introduced by pgrminclude

2011-09-07 Thread Bruce Momjian
I have fixed a bug introduced by pgrminclude.  It turns out that
CppAsString2() will expand any symbol, even one that is undefined, so
pgrminclude thought that catalog/catversion was not needed.  This is
illustrated in the attached C file that outputs "1" and "y".

I have bumped the catalog version to force users to reload their
tablespaces (or use pg_upgrade) because the tablespace directory names
will not be expanded to the catalog version.

I have modified pgrminclude to skip files that use CppAsString2().

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
new file mode 100644
index 1e1e12d..e472e05
*** a/src/include/catalog/catalog.h
--- b/src/include/catalog/catalog.h
***
*** 14,19 
--- 14,24 
  #ifndef CATALOG_H
  #define CATALOG_H
  
+ /*
+  *	'pgrminclude ignore' needed here because CppAsString2() does not throw
+  *	an error if the symbol is not defined.
+  */
+ #include "catalog/catversion.h"	/* pgrminclude ignore */
  #include "catalog/pg_class.h"
  #include "storage/relfilenode.h"
  #include "utils/relcache.h"
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
new file mode 100644
index f5c9797..f3c8bb4
*** a/src/include/catalog/catversion.h
--- b/src/include/catalog/catversion.h
***
*** 53,58 
   */
  
  /*			mmddN */
! #define CATALOG_VERSION_NO	201108051
  
  #endif
--- 53,58 
   */
  
  /*			mmddN */
! #define CATALOG_VERSION_NO	201109071
  
  #endif
#include 
#include 
#include "/pg/include/c.h"

#define x 1
#define x1 CppAsString2(x)
#define x2 CppAsString2(y)

int
main(int argc, char **argv)
{
puts(x1);
puts(x2);
return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] typo

2011-09-07 Thread Euler Taveira de Oliveira

Hi,

While updating the translation I noticed a typo in
src/backend/commands/collationcmds.c circa line 126.

parameter \"lc_collate\" parameter must be specified


--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Kohei KaiGai
2011/9/7 Andrew Dunstan :
>> LEAKPROOF isn't too bad.
>>
>>
>
> It's fairly opaque unless you know the context, although that might be said
> of some of our other terms too. Someone coming across it is going to think
> "What would it leak?"
>
It is introduced in the documentation. I'll add a point to this
chapter in the introduction of this keyword.

http://developer.postgresql.org/pgdocs/postgres/rules-privileges.html

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OPERATOR FAMILY and pg_dump

2011-09-07 Thread Joe Abbate
On 09/07/2011 12:10 PM, Tom Lane wrote:
> I guess if it contains no opclasses and no operators either, this code
> won't dump it, but isn't it rather useless in such a case?

Yes, I think it's useless, like a book cover without the contents, but
ISTM it should still be dumped (perhaps someone started defining a
family and forgot about it--oh, the puns that come to mind).

Joe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Andrew Dunstan



On 09/07/2011 12:05 PM, Tom Lane wrote:


I agree that TRUSTED is a pretty bad choice here because of the high
probability that people will think it means something else than what
it really means.


Agreed.


LEAKPROOF isn't too bad.




It's fairly opaque unless you know the context, although that might be 
said of some of our other terms too. Someone coming across it is going 
to think "What would it leak?"


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Kohei KaiGai
2011/9/7 Tom Lane :
> Noah Misch  writes:
>> I liked NOLEAKY for its semantics, though I probably would have spelled it
>> "LEAKPROOF".  PostgreSQL will trust the function to implement a specific,
>> relatively-unintuitive security policy.  We want the function implementers to
>> read that policy closely and not rely on any intuition they have about the
>> "trusted" term of art.  Our use of TRUSTED in CREATE LANGUAGE is more
>> conventional, I think, as is the trusted nature of SECURITY DEFINER.  In that
>> vein, folks who actually need SECURITY DEFINER might first look at TRUSTED;
>> NOLEAKY would not attract the same unwarranted attention.
>
> I agree that TRUSTED is a pretty bad choice here because of the high
> probability that people will think it means something else than what
> it really means.  LEAKPROOF isn't too bad.
>
It seems to me LEAKPROOF is never confusable for everyone, and
no conflicts with other concept, although it was not in my vocaburary.

If no better idea anymore, I'll submit the patch again; with LEAKPROOF keyword.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error building head on OS X 10.7.1

2011-09-07 Thread Dave Cramer
On Wed, Sep 7, 2011 at 11:32 AM, Tom Lane  wrote:
> Dave Cramer  writes:
>> Well the problem is that buildfarm can't build HEAD on OS X 10.7.1
>
> HEAD builds fine on my 10.7.1 laptop.  If you're referring to orangutan,
> it's not failing on that, it's failing here:
>
> configure:3301: checking for C compiler default output file name
> configure:3323: ccache gcc /opt/local/include   conftest.c  >&5
> ld: in /opt/local/include, can't map file, errno=22 for architecture x86_64
> collect2: ld returned 1 exit status
>
> which is probably because you have a malformed value of CFLAGS:
>
>   'config_env' => {
>                                     'CFLAGS' => '/opt/local/include',
>
>                        regards, tom lane
>

Thanks Tom, that was it.


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OPERATOR FAMILY and pg_dump

2011-09-07 Thread Tom Lane
Joe Abbate  writes:
> If a basic operator family is created, e.g.,
> create operator family of1 using btree;
> shouldn't pg_dump include this in its output?  If not, why?

Quoting from the pg_dump source code:

 * We want to dump the opfamily only if (1) it contains "loose" operators
 * or functions, or (2) it contains an opclass with a different name or
 * owner.  Otherwise it's sufficient to let it be created during creation
 * of the contained opclass, and not dumping it improves portability of
 * the dump.

I guess if it contains no opclasses and no operators either, this code
won't dump it, but isn't it rather useless in such a case?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Tom Lane
Noah Misch  writes:
> I liked NOLEAKY for its semantics, though I probably would have spelled it
> "LEAKPROOF".  PostgreSQL will trust the function to implement a specific,
> relatively-unintuitive security policy.  We want the function implementers to
> read that policy closely and not rely on any intuition they have about the
> "trusted" term of art.  Our use of TRUSTED in CREATE LANGUAGE is more
> conventional, I think, as is the trusted nature of SECURITY DEFINER.  In that
> vein, folks who actually need SECURITY DEFINER might first look at TRUSTED;
> NOLEAKY would not attract the same unwarranted attention.

I agree that TRUSTED is a pretty bad choice here because of the high
probability that people will think it means something else than what
it really means.  LEAKPROOF isn't too bad.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Noah Misch
On Wed, Sep 07, 2011 at 02:09:15PM +0100, Thom Brown wrote:
> On 24 August 2011 13:38, Kohei Kaigai  wrote:
> 
> > The (2) is new stuff from the revision in commit-fest 1st. It enables to
> > supply "NOLEAKY" option on CREATE FUNCTION statement, then the function is
> > allowed to distribute across security barrier. Only superuser can set this
> > option.
> >
> 
> "NOLEAKY" doesn't really sound appropriate as it sounds like pidgin English.
>  Also, it could be read as "Don't allow leaks in this function".  Could we
> instead use something like TRUSTED or something akin to it being allowed to
> do more than safer functions?  It then describes its level of behaviour
> rather than what it promises not to do.

I liked NOLEAKY for its semantics, though I probably would have spelled it
"LEAKPROOF".  PostgreSQL will trust the function to implement a specific,
relatively-unintuitive security policy.  We want the function implementers to
read that policy closely and not rely on any intuition they have about the
"trusted" term of art.  Our use of TRUSTED in CREATE LANGUAGE is more
conventional, I think, as is the trusted nature of SECURITY DEFINER.  In that
vein, folks who actually need SECURITY DEFINER might first look at TRUSTED;
NOLEAKY would not attract the same unwarranted attention.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error building head on OS X 10.7.1

2011-09-07 Thread Tom Lane
Dave Cramer  writes:
> Well the problem is that buildfarm can't build HEAD on OS X 10.7.1

HEAD builds fine on my 10.7.1 laptop.  If you're referring to orangutan,
it's not failing on that, it's failing here:

configure:3301: checking for C compiler default output file name
configure:3323: ccache gcc /opt/local/include   conftest.c  >&5
ld: in /opt/local/include, can't map file, errno=22 for architecture x86_64
collect2: ld returned 1 exit status

which is probably because you have a malformed value of CFLAGS:

   'config_env' => {
 'CFLAGS' => '/opt/local/include',

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error building head on OS X 10.7.1

2011-09-07 Thread Dave Cramer
On Wed, Sep 7, 2011 at 11:16 AM, Tom Lane  wrote:
> Dave Cramer  writes:
>> Get the following error
>> configure:3274: ccache gcc -V >&5
>> llvm-gcc-4.2: argument to `-V' is missing
>
>> should be
>> ccache gcc -v >&5
>
> That's not an error, that's normal behavior.
>
> Mind you, I have no idea why autoconf chooses to do this when it's
> already tried -v, but this is not the source of whatever problem
> you're having.
>
>                        regards, tom lane
>

Well the problem is that buildfarm can't build HEAD on OS X 10.7.1

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error building head on OS X 10.7.1

2011-09-07 Thread Tom Lane
Dave Cramer  writes:
> Get the following error
> configure:3274: ccache gcc -V >&5
> llvm-gcc-4.2: argument to `-V' is missing

> should be
> ccache gcc -v >&5

That's not an error, that's normal behavior.

Mind you, I have no idea why autoconf chooses to do this when it's
already tried -v, but this is not the source of whatever problem
you're having.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] OPERATOR FAMILY and pg_dump

2011-09-07 Thread Joe Abbate
Hi,

If a basic operator family is created, e.g.,

create operator family of1 using btree;

shouldn't pg_dump include this in its output?  If not, why?

Joe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.1rc1 regression: EXPLAIN on information_schema.key_column_usage

2011-09-07 Thread Tom Lane
Marti Raudsepp  writes:
> It seems I have found a regression in PostgreSQL 9.1rc1 (from 9.0).

> In many cases, running the following query fails:
> db=# EXPLAIN select * from information_schema.key_column_usage;
> ERROR:  record type has not been registered

Looks like I overlooked a case in get_name_for_var_field.  Thanks,
will fix.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem of win32.mak

2011-09-07 Thread Hiroshi Saito

Hi.

We are looking forward to the great release.
thanks again!!

Regards,
Hiroshi Saito

(2011/09/07 23:46), Magnus Hagander wrote:

On Wed, Sep 7, 2011 at 16:43, Bruce Momjian  wrote:

Hiroshi Saito wrote:

Hi Magnus-san and Bruce-san.

I am sorry in a very late reaction...

Is it enough for a 9.1release?
libpq of bcc32 and win32 has a problem.

== error of nmake ==
? .\Release\libpqdll.lib ???
.\Release\libpqdll.exp 
libpq.lib(fe-connect.obj) : error LNK2019: ??
_inet_net_ntop ??? _connectFailureMessage ?
libpq.lib(getaddrinfo.obj) : error LNK2001: ??
"_inet_net_ntop" ???
libpq.lib(fe-connect.obj) : error LNK2019: ??
_pg_get_encoding_from_locale ??? _PQsetClientEncoding ?
.\Release\libpq.dll : fatal error LNK1120:  2 ???
== end ==


Yes, this is a problem.  I will apply your patch to 9.1 and head unless
I hear otherwise in a few hours.


Looks fine to me, go ahead.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf archive_command example

2011-09-07 Thread Robert Treat
On Tue, Sep 6, 2011 at 10:11 PM, Fujii Masao  wrote:
> On Sat, Sep 3, 2011 at 5:10 AM, Kevin Grittner
>  wrote:
>> (2)  It should copy, not move, with protection against overwriting
>> an existing file.
>
> I agree that basically archive_command should not overwrite an existing file.
> But if the size of existing file is less than 16MB, it should do that.
> Otherwise,
> that WAL file would be lost forever.
>

I think best practice in this case is that if you ever find an
existing file with the same name already in place, you should error
and investigate. We don't ship around partially completed WAL files,
and finding an existing one probably means something went wrong. (Of
course, we use rsync instead of copy/move, so we have some better
guarantees about this).

> I have another feature request;
> (5) Maybe not in the initial version, but eventually it might be
> nice to support calling posix_fadvise(POSIX_FADV_DONTNEED)
> after copying a WAL file.
>

Can you go into more details on how you envision this working. I'm
mostly curious because I think rsync might already support this, which
would make it easy to incorporate.

On a side note, seeing this thread hasn't died, I'd encourage everyone
to take another look at OmniPITR,
https://github.com/omniti-labs/omnipitr. It's postgresql licensed,
solves a lot of the problems listed here, and I think makes for a good
example for people who want to accomplish more advanced awl management
goals. So far the biggest criticism we've gotten is that it wasn't
written in python, for some of you that might be a plus though ;-)


Robert Treat
play: xzilla.net
work: omniti.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] error building head on OS X 10.7.1

2011-09-07 Thread Dave Cramer
Get the following error

configure:3274: ccache gcc -V >&5
llvm-gcc-4.2: argument to `-V' is missing

should be
ccache gcc -v >&5

Dave
Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem of win32.mak

2011-09-07 Thread Magnus Hagander
On Wed, Sep 7, 2011 at 16:43, Bruce Momjian  wrote:
> Hiroshi Saito wrote:
>> Hi Magnus-san and Bruce-san.
>>
>> I am sorry in a very late reaction...
>>
>> Is it enough for a 9.1release?
>> libpq of bcc32 and win32 has a problem.
>>
>> == error of nmake ==
>>    ? .\Release\libpqdll.lib ???
>> .\Release\libpqdll.exp 
>> libpq.lib(fe-connect.obj) : error LNK2019: ??
>> _inet_net_ntop ??? _connectFailureMessage ?
>> libpq.lib(getaddrinfo.obj) : error LNK2001: ??
>> "_inet_net_ntop" ???
>> libpq.lib(fe-connect.obj) : error LNK2019: ??
>> _pg_get_encoding_from_locale ??? _PQsetClientEncoding ?
>> .\Release\libpq.dll : fatal error LNK1120:  2 ???
>> == end ==
>
> Yes, this is a problem.  I will apply your patch to 9.1 and head unless
> I hear otherwise in a few hours.

Looks fine to me, go ahead.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem of win32.mak

2011-09-07 Thread Bruce Momjian
Hiroshi Saito wrote:
> Hi Magnus-san and Bruce-san.
> 
> I am sorry in a very late reaction...
> 
> Is it enough for a 9.1release?
> libpq of bcc32 and win32 has a problem.
> 
> == error of nmake ==
>? .\Release\libpqdll.lib ???
> .\Release\libpqdll.exp 
> libpq.lib(fe-connect.obj) : error LNK2019: ??
> _inet_net_ntop ??? _connectFailureMessage ?
> libpq.lib(getaddrinfo.obj) : error LNK2001: ??
> "_inet_net_ntop" ???
> libpq.lib(fe-connect.obj) : error LNK2019: ??
> _pg_get_encoding_from_locale ??? _PQsetClientEncoding ?
> .\Release\libpq.dll : fatal error LNK1120:  2 ???
> == end ==

Yes, this is a problem.  I will apply your patch to 9.1 and head unless
I hear otherwise in a few hours.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Yeb Havinga

On 2011-09-07 16:02, Kevin Grittner wrote:

Robert Haas  wrote:


Anyone else want to bikeshed?


I'm not sure they beat TRUSTED, but:

SECURE
OPAQUE


SAVE

-- Yeb


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.1rc1 regression: EXPLAIN on information_schema.key_column_usage

2011-09-07 Thread Marti Raudsepp
Hi list,

It seems I have found a regression in PostgreSQL 9.1rc1 (from 9.0).

In many cases, running the following query fails:
db=# EXPLAIN select * from information_schema.key_column_usage;
ERROR:  record type has not been registered

However, this is not always reproducible. It seems to occur more
likely on an empty database. At first I suspected uninitialized memory
access somewhere, but valgrind does not highlight anything obvious.
Trying to isolate the part of the view that causes the error also
didn't yield any results.

Similarly, information_schema.triggered_update_columns also
occasionally returns this error, but less reliably.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Kohei KaiGai
2011/9/7 Robert Haas :
> On Wed, Sep 7, 2011 at 9:39 AM, Thom Brown  wrote:
>> On 7 September 2011 14:34, Kohei KaiGai  wrote:
>>> 2011/9/7 Thom Brown :
>>> > On 24 August 2011 13:38, Kohei Kaigai  wrote:
>>> >>
>>> >> The (2) is new stuff from the revision in commit-fest 1st. It enables
>>> >> to
>>> >> supply "NOLEAKY" option on CREATE FUNCTION statement, then the function
>>> >> is
>>> >> allowed to distribute across security barrier. Only superuser can set
>>> >> this
>>> >> option.
>>> >
>>> > "NOLEAKY" doesn't really sound appropriate as it sounds like pidgin
>>> > English.
>>> >  Also, it could be read as "Don't allow leaks in this function".  Could
>>> > we
>>> > instead use something like TRUSTED or something akin to it being allowed
>>> > to
>>> > do more than safer functions?  It then describes its level of behaviour
>>> > rather than what it promises not to do.
>>> >
>>> Thanks for your comment. I'm not a native English specker, so it is
>>> helpful.
>>>
>>> "TRUSTED" sounds meaningful for me, however, it is confusable with a
>>> concept
>>> of "trusted procedure" in label-based MAC. It is not only SELinux,
>>> Oracle's label
>>> based security also uses this term to mean a procedure that switches
>>> user's
>>> credential during its execution.
>>>
>>>  http://download.oracle.com/docs/cd/B28359_01/network.111/b28529/storproc.htm
>>>
>>> So, how about "CREDIBLE", instead of "TRUSTED"?
>>
>> I can't say I'm keen on that alternative, but I'm probably not the one to
>> participate in bike-shedding here, so I'll leave comment to you hackers. :)
>
> I think TRUSTED actually does a reasonably good job capturing what
> we're after here, although I do share a bit of KaiGai's nervousness
> about terminological confusion.  Still, I'd be inclined to go that way
> if we can't come up with anything better.  CREDIBLE is definitely the
> wrong idea: that means "believable", which sounds more like a
> statement about the function's results than about its side-effects.  I
> thought about TACITURN, since we need the error messages to not be
> excessively informative, but that doesn't do a good job characterizing
> the hazard created by side-effects, or the potential for abuse due to
> - for example - deliberate division by zero.  I also thought about
> PURE, which is a term that's sometimes used to describe code that
> throws no errors and has no side effects, and comes pretty close to
> our actual requirement here, but doesn't necessarily convey that a
> security concern is involved.  Yet another idea would be to use a
> variant of TRUSTED, such as TRUSTWORTHY, just to avoid confusion with
> the idea of a trusted procedure, but I'm not that excited about that
> idea despite have no real specific gripe with it other than length.
> So at the moment I am leaning toward TRUSTED.
>
> Anyone else want to bikeshed?
>
I also become leaning toward TRUSTED, although we still have a bit risk of
terminology confusion, because I assume it is quite rare case to set this
option by DBA and we will able to expect DBAs who try to this option have
correct knowledge about background of the leaky-view problem.

I'll submit the patch again.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas  writes:
> After spending some time staring at the code, I do have one idea as to
> what might be going on here.  When a backend is terminated,
> ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
> LockReleaseAll(USER_LOCKMETHOD, true).  The second call releases all
> user locks, and the first one (or so we suppose) releases all normal
> locks as part of aborting the transaction.  But what if there's no
> transaction in progress?  In that case, AbortOutOfAnyTransaction()
> won't do anything - which is fine as far as transaction-level locks
> go, because we shouldn't be holding any of those anyway if there's no
> transaction in progress.  However, if we hold a session-level lock at
> that point, then we'd orphan it.  We don't make much use of session
> locks.  As far as I can see, they are used by (1) VACUUM, (2) CREATE
> INDEX CONCURRENTLY, (3) ALTER DATABASE .. SET TABLESPACE, and (4) on
> standby servers, redo of DROP DATABASE actions.  Any chance one of
> those died or was killed off around the time this happened?

I don't believe this theory at all, because if that were the issue,
we'd have heard about it long since.  The correct theory has to involve
a very-little-used triggering condition.  At the moment I'm wondering
about advisory (userspace) locks ... Dave, do your apps use any of those?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Kevin Grittner
Robert Haas  wrote:
 
> Anyone else want to bikeshed?
 
I'm not sure they beat TRUSTED, but:
 
SECURE
OPAQUE
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW proposal: a validator for configuration files

2011-09-07 Thread Tom Lane
Andy Colson  writes:
> Where did the other warnings go?  Its right though, line 570 is bad.  It also 
> seems to have killed the server.  I have not gotten through the history of 
> messages regarding this patch, but is it supposed to kill the server if there 
> is a syntax error in the config file?

The historical behavior is that a configuration file error detected
during postmaster startup should prevent the server from starting, but
an error detected during reload should only result in a LOG message and
the reload not occurring.  I don't believe anyone will accept a patch
that causes the server to quit on a failed reload.  There has however
been some debate about the exact extent of ignoring bad values during
reload --- currently the theory is "ignore the whole file if anything is
wrong", but there's some support for applying all non-bad values as long
as the overall file syntax is okay.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 9:39 AM, Thom Brown  wrote:
> On 7 September 2011 14:34, Kohei KaiGai  wrote:
>> 2011/9/7 Thom Brown :
>> > On 24 August 2011 13:38, Kohei Kaigai  wrote:
>> >>
>> >> The (2) is new stuff from the revision in commit-fest 1st. It enables
>> >> to
>> >> supply "NOLEAKY" option on CREATE FUNCTION statement, then the function
>> >> is
>> >> allowed to distribute across security barrier. Only superuser can set
>> >> this
>> >> option.
>> >
>> > "NOLEAKY" doesn't really sound appropriate as it sounds like pidgin
>> > English.
>> >  Also, it could be read as "Don't allow leaks in this function".  Could
>> > we
>> > instead use something like TRUSTED or something akin to it being allowed
>> > to
>> > do more than safer functions?  It then describes its level of behaviour
>> > rather than what it promises not to do.
>> >
>> Thanks for your comment. I'm not a native English specker, so it is
>> helpful.
>>
>> "TRUSTED" sounds meaningful for me, however, it is confusable with a
>> concept
>> of "trusted procedure" in label-based MAC. It is not only SELinux,
>> Oracle's label
>> based security also uses this term to mean a procedure that switches
>> user's
>> credential during its execution.
>>
>>  http://download.oracle.com/docs/cd/B28359_01/network.111/b28529/storproc.htm
>>
>> So, how about "CREDIBLE", instead of "TRUSTED"?
>
> I can't say I'm keen on that alternative, but I'm probably not the one to
> participate in bike-shedding here, so I'll leave comment to you hackers. :)

I think TRUSTED actually does a reasonably good job capturing what
we're after here, although I do share a bit of KaiGai's nervousness
about terminological confusion.  Still, I'd be inclined to go that way
if we can't come up with anything better.  CREDIBLE is definitely the
wrong idea: that means "believable", which sounds more like a
statement about the function's results than about its side-effects.  I
thought about TACITURN, since we need the error messages to not be
excessively informative, but that doesn't do a good job characterizing
the hazard created by side-effects, or the potential for abuse due to
- for example - deliberate division by zero.  I also thought about
PURE, which is a term that's sometimes used to describe code that
throws no errors and has no side effects, and comes pretty close to
our actual requirement here, but doesn't necessarily convey that a
security concern is involved.  Yet another idea would be to use a
variant of TRUSTED, such as TRUSTWORTHY, just to avoid confusion with
the idea of a trusted procedure, but I'm not that excited about that
idea despite have no real specific gripe with it other than length.
So at the moment I am leaning toward TRUSTED.

Anyone else want to bikeshed?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rectifying wrong Date outputs

2011-09-07 Thread Bruce Momjian

Applied, with a function rename.  The only odd case we have left is:

test=> select to_date('079', 'YYY');
  to_date

 1979-01-01
(1 row)

(Note the zero is ignored.)  I can't see an easy way to fix this and
continue to be easily documented.

---

Bruce Momjian wrote:
> Bruce Momjian wrote:
> > Piyush Newe wrote:
> > > Hi,
> > > 
> > > I was randomly testing some date related stuff on PG & observed that the
> > > outputs were wrong.
> > > 
> > > e.g.
> > > postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-YY');
> > >   to_date
> > > 
> > >  3910-01-01  <- Look at this
> > > (1 row)
> > > 
> > > postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-');
> > >   to_date
> > > 
> > >  2010-01-01
> > > (1 row)
> > 
> > I have done some work on this problem, and have developed the attached
> > patch.  It genarates the output in the final column of this table:
> > 
> > Oracle  PostgreSQL  
> > With PG Patch
> >  1  TO_DATE('01-jan-1',  'DD-MON-Y')01-JAN-2011 01-JAN-2001 
> > 01-JAN-2001+
> >  2  TO_DATE('01-jan-1',  'DD-MON-YY')   01-JAN-2001 01-JAN-2001 
> > 01-JAN-2001
> >  3  TO_DATE('01-jan-1',  'DD-MON-YYY')  01-JAN-2001 01-JAN-2001 
> > 01-JAN-2001
> >  4  TO_DATE('01-jan-1',  'DD-MON-') 01-JAN-0001 01-JAN-0001 
> > 01-JAN-0001
> >  5  TO_DATE('01-jan-10',  'DD-MON-Y')   Error   01-JAN-2010 
> > 01-JAN-2010
> >  6  TO_DATE('01-jan-10',  'DD-MON-YY')  01-JAN-2010 01-JAN-2010 
> > 01-JAN-2010
> >  7  TO_DATE('01-jan-10',  'DD-MON-YYY') 01-JAN-2010 01-JAN-2010 
> > 01-JAN-2010
> >  8  TO_DATE('01-jan-10',  'DD-MON-')01-JAN-0010 01-JAN-0010 
> > 01-JAN-0010
> >  9  TO_DATE('01-jan-067',  'DD-MON-Y')  Error   01-JAN-2067 
> > 01-JAN-2067
> > 10  TO_DATE('01-jan-111',  'DD-MON-YY') 01-JAN-0111 01-JAN-2011 
> > 01-JAN-2111*+
> > 11  TO_DATE('01-jan-678',  'DD-MON-YYY')01-JAN-2678 01-JAN-1678 
> > 01-JAN-1678+
> > 12  TO_DATE('01-jan-001',  'DD-MON-')   01-JAN-0001 01-JAN-0001 
> > 01-JAN-0001
> > 13  TO_DATE('01-jan-2010',  'DD-MON-Y') Error   01-JAN-4010 
> > 01-JAN-2010*
> > 14  TO_DATE('01-jan-2010',  'DD-MON-YY')01-JAN-2010 01-JAN-3910 
> > 01-JAN-2010*
> > 15  TO_DATE('01-jan-2010',  'DD-MON-YYY')   Error   01-JAN-3010 
> > 01-JAN-2010*
> > 16  TO_DATE('01-jan-2010',  'DD-MON-')  01-JAN-2010 01-JAN-2010 
> > 01-JAN-2010
> 
> In an attempt to make the to_date/to_timestamp behavior documentable, I
> have modified the patch to have dates adjust toward the year 2020, and
> added code so if four digits are supplied, we don't do any adjustment. 
> Here is the current odd behavior, which is fixed by the patch:
> 
>   test=> select to_date('222', 'YYY');
> to_date
>   
>-01-01
>   (1 row)
>   
>   test=> select to_date('0222', 'YYY');
> to_date
>   
>-01-01
>   (1 row)
> 
> If they supply a full 4-digit year, it seems we should honor that, even
> for YYY.   still does no adjustment, and I doubt we want to change
> that:
> 
>   test=> select to_date('222', '');
> to_date
>   
>0222-01-01
>   (1 row)
>   
>   test=> select to_date('0222', '');
> to_date
>   
>0222-01-01
>   (1 row)
> 
> -- 
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
> 
>   + It's impossible for everything to be true. +

[ text/x-diff is unsupported, treating like TEXT/PLAIN ]

> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> new file mode 100644
> index c03dd6c..282bb0d
> *** a/doc/src/sgml/func.sgml
> --- b/doc/src/sgml/func.sgml
> *** SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1
> *** 5550, 
> --- 5550,5564 
>   
>
> 
> +If the year format specification is less than four digits, e.g.
> +YYY, and the supplied year is less than four digits,
> +the year will be adjusted to be nearest to year 2020, e.g.
> +95 becomes 1995.
> +   
> +  
> + 
> +  
> +   
>  The  conversion from string to 
> timestamp or
>  date has a restriction when processing years with more 
> than 4 digits. You must
>  use some non-digit character or template after 
> ,
> diff --git a/src/backend/utils/adt/formatting.c 
> b/src/backend/utils/adt/formatting.c
> new file mode 100644
> index 726a1f4..1a3ec1c
> *** a/src/backend/utils/adt/formatting.c
> --- b/src/backend/utils/adt/formatting.c
> *** static void dump_node(FormatNode *node, 
> *** 964,969 
> --- 964,970 

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 5:16 AM, daveg  wrote:
> On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote:
>> On Mon, Aug 22, 2011 at 3:31 AM, daveg  wrote:
>> > So far I've got:
>> >
>> >  - affects system tables
>> >  - happens very soon after process startup
>> >  - in 8.4.7 and 9.0.4
>> >  - not likely to be hardware or OS related
>> >  - happens in clusters for period of a few second to many minutes
>> >
>> > I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
>> > hard to get downtime to pick up new builds. Any other ideas on getting to
>> > the bottom of this?
>>
>> I've been thinking this one over, and doing a little testing. I'm
>> still stumped, but I have a few thoughts.  What that error message is
>> really saying is that the LOCALLOCK bookkeeping doesn't match the
>> PROCLOCK bookkeeping; it doesn't tell us which one is to blame.
> ...
>> My second thought is that perhaps a process is occasionally managing
>> to exit without fully cleaning up the associated PROCLOCK entry.  At
>> first glance, it appears that this would explain the observed
>> symptoms.  A new backend gets the PGPROC belonging to the guy who
>> didn't clean up after himself, hits the error, and disconnects,
>> sticking himself right back on to the head of the SHM_QUEUE where the
>> next connection will inherit the same PGPROC and hit the same problem.
>>  But it's not clear to me what could cause the system to get into this
>> state in the first place, or how it would eventually right itself.
>>
>> It might be worth kludging up your system to add a test to
>> InitProcess() to verify that all of the myProcLocks SHM_QUEUEs are
>> either NULL or empty, along the lines of the attached patch (which
>> assumes that assertions are enabled; otherwise, put in an elog() of
>> some sort).  Actually, I wonder if we shouldn't move all the
>> SHMQueueInit() calls for myProcLocks to InitProcGlobal() rather than
>> doing it over again every time someone calls InitProcess().  Besides
>> being a waste of cycles, it's probably less robust this way.   If
>> there somehow are leftovers in one of those queues, the next
>> successful call to LockReleaseAll() ought to clean up the mess, but of
>> course there's no chance of that working if we've nuked the queue
>> pointers.
>
> I did this in the elog flavor as we don't build production images with 
> asserts.
> It has been running on all hosts for a few days. Today it hit the extra
> checks in initproc.
>
> 00:02:32.782  8626  [unknown] [unknown]  LOG:  connection received: host=bk0 
> port=42700
> 00:02:32.783  8627  [unknown] [unknown]  LOG:  connection received: host=op2 
> port=45876
> 00:02:32.783  8627  d61 apps  FATAL:  Initprocess myProclocks[4] not empty: 
> queue 0x2ae6b4b895f8 (prev 0x2ae6b4a2b558, next 0x2ae6b4a2b558)
> 00:02:32.783  8626  d35 postgres  LOG:  connection authorized: user=postgres 
> database=c35
> 00:02:32.783  21535  LOG:  server process (PID 8627) exited with exit code 1
> 00:02:32.783  21535  LOG:  terminating any other active server processes
> 00:02:32.783  8626  c35 postgres  WARNING:  terminating connection because of 
> crash of another server process
>
> The patch that produced this is attached. If you can think of anything I
> can add to this to help I'd be happy to do so. Also, can I clean this up
> and continue somehow? Maybe clear the queue instead having to have a restart?
> Or is there a way to just pause this proc here, maybe mark it not to be used
> and exit, or just to sleep forever so I can debug later?

I think what we really need to know here is: when the debugging code
you injected here fires, what happened to the previous owner of that
PGPROC that caused it to not clean up its state properly?  The PGPROC
that 8627 inherited in the above example is obviously messed up - but
what did the last guy do that messed it up?  It would be nice to log
the address of the PGPROC in every log message here - then you could
go back and look to see whether the last guy terminated in some
unusual way.  In the meantime, could you could look back a couple of
minutes from the time of the above occurrence and see if there are any
FATAL errors, or usual ERRORs?

After spending some time staring at the code, I do have one idea as to
what might be going on here.  When a backend is terminated,
ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
LockReleaseAll(USER_LOCKMETHOD, true).  The second call releases all
user locks, and the first one (or so we suppose) releases all normal
locks as part of aborting the transaction.  But what if there's no
transaction in progress?  In that case, AbortOutOfAnyTransaction()
won't do anything - which is fine as far as transaction-level locks
go, because we shouldn't be holding any of those anyway if there's no
transaction in progress.  However, if we hold a session-level lock at
that point, then we'd orphan it.  We don't make much use of session
locks.  As far as I can see, they are used by (1) VACUUM, (2) CRE

Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Thom Brown
On 7 September 2011 14:34, Kohei KaiGai  wrote:

> 2011/9/7 Thom Brown :
> > On 24 August 2011 13:38, Kohei Kaigai  wrote:
> >>
> >> The (2) is new stuff from the revision in commit-fest 1st. It enables to
> >> supply "NOLEAKY" option on CREATE FUNCTION statement, then the function
> is
> >> allowed to distribute across security barrier. Only superuser can set
> this
> >> option.
> >
> > "NOLEAKY" doesn't really sound appropriate as it sounds like pidgin
> English.
> >  Also, it could be read as "Don't allow leaks in this function".  Could
> we
> > instead use something like TRUSTED or something akin to it being allowed
> to
> > do more than safer functions?  It then describes its level of behaviour
> > rather than what it promises not to do.
> >
> Thanks for your comment. I'm not a native English specker, so it is
> helpful.
>
> "TRUSTED" sounds meaningful for me, however, it is confusable with a
> concept
> of "trusted procedure" in label-based MAC. It is not only SELinux,
> Oracle's label
> based security also uses this term to mean a procedure that switches user's
> credential during its execution.
>
> http://download.oracle.com/docs/cd/B28359_01/network.111/b28529/storproc.htm
>
> So, how about "CREDIBLE", instead of "TRUSTED"?
>

I can't say I'm keen on that alternative, but I'm probably not the one to
participate in bike-shedding here, so I'll leave comment to you hackers. :)

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Kohei KaiGai
2011/9/7 Thom Brown :
> On 24 August 2011 13:38, Kohei Kaigai  wrote:
>>
>> The (2) is new stuff from the revision in commit-fest 1st. It enables to
>> supply "NOLEAKY" option on CREATE FUNCTION statement, then the function is
>> allowed to distribute across security barrier. Only superuser can set this
>> option.
>
> "NOLEAKY" doesn't really sound appropriate as it sounds like pidgin English.
>  Also, it could be read as "Don't allow leaks in this function".  Could we
> instead use something like TRUSTED or something akin to it being allowed to
> do more than safer functions?  It then describes its level of behaviour
> rather than what it promises not to do.
>
Thanks for your comment. I'm not a native English specker, so it is helpful.

"TRUSTED" sounds meaningful for me, however, it is confusable with a concept
of "trusted procedure" in label-based MAC. It is not only SELinux,
Oracle's label
based security also uses this term to mean a procedure that switches user's
credential during its execution.
  http://download.oracle.com/docs/cd/B28359_01/network.111/b28529/storproc.htm

So, how about "CREDIBLE", instead of "TRUSTED"?

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] problem of win32.mak

2011-09-07 Thread Hiroshi Saito
Hi Magnus-san and Bruce-san.

I am sorry in a very late reaction...

Is it enough for a 9.1release?
libpq of bcc32 and win32 has a problem.

== error of nmake ==
   ライブラリ .\Release\libpqdll.lib とオブジェクト
.\Release\libpqdll.exp を作成中
libpq.lib(fe-connect.obj) : error LNK2019: 未解決の外部シンボル
_inet_net_ntop が関数 _connectFailureMessage で参照されました。
libpq.lib(getaddrinfo.obj) : error LNK2001: 外部シンボル
"_inet_net_ntop" は未解決です。
libpq.lib(fe-connect.obj) : error LNK2019: 未解決の外部シンボル
_pg_get_encoding_from_locale が関数 _PQsetClientEncoding で参照されました。
.\Release\libpq.dll : fatal error LNK1120: 外部参照 2 が未解決です。
== end ==

Please take this into consideration.
Thanks.

Regards,
Hiroshi Saito

*** src/interfaces/libpq/bcc32.mak.old  Fri Aug 19 06:23:13 2011
--- src/interfaces/libpq/bcc32.mak  Wed Sep  7 21:50:33 2011
***
*** 80,85 
--- 80,87 
-@erase "$(INTDIR)\inet_aton.obj"
-@erase "$(INTDIR)\crypt.obj"
-@erase "$(INTDIR)\noblock.obj"
+   -@erase "$(INTDIR)\chklocale.obj"
+   -@erase "$(INTDIR)\inet_net_ntop.obj"
-@erase "$(INTDIR)\md5.obj"
-@erase "$(INTDIR)\ip.obj"
-@erase "$(INTDIR)\fe-auth.obj"
***
*** 123,128 
--- 125,132 
"$(INTDIR)\inet_aton.obj" \
"$(INTDIR)\crypt.obj" \
"$(INTDIR)\noblock.obj" \
+   "$(INTDIR)\chklocale.obj" \
+   "$(INTDIR)\inet_net_ntop.obj" \
"$(INTDIR)\md5.obj" \
"$(INTDIR)\ip.obj" \
"$(INTDIR)\fe-auth.obj" \
***
*** 220,225 
--- 224,239 
  "$(INTDIR)\noblock.obj" : ..\..\port\noblock.c
$(CPP) @<<
$(CPP_PROJ) ..\..\port\noblock.c
+ <<
+ 
+ "$(INTDIR)\chklocale.obj" : ..\..\port\chklocale.c
+   $(CPP) @<<
+   $(CPP_PROJ) ..\..\port\chklocale.c
+ <<
+ 
+ "$(INTDIR)\inet_net_ntop.obj" : ..\..\port\inet_net_ntop.c
+   $(CPP) @<<
+   $(CPP_PROJ) ..\..\port\inet_net_ntop.c
  <<
  
  "$(INTDIR)\md5.obj" : ..\..\backend\libpq\md5.c
*** src/interfaces/libpq/win32.mak.old  Fri Aug 19 06:23:13 2011
--- src/interfaces/libpq/win32.mak  Wed Sep  7 21:45:22 2011
***
*** 87,92 
--- 87,94 
-@erase "$(INTDIR)\inet_aton.obj"
-@erase "$(INTDIR)\crypt.obj"
-@erase "$(INTDIR)\noblock.obj"
+   -@erase "$(INTDIR)\chklocale.obj" 
+   -@erase "$(INTDIR)\inet_net_ntop.obj" 
-@erase "$(INTDIR)\md5.obj"
-@erase "$(INTDIR)\ip.obj"
-@erase "$(INTDIR)\fe-auth.obj"
***
*** 132,137 
--- 134,141 
"$(INTDIR)\inet_aton.obj" \
"$(INTDIR)\crypt.obj" \
"$(INTDIR)\noblock.obj" \
+   "$(INTDIR)\chklocale.obj" \
+   "$(INTDIR)\inet_net_ntop.obj" \
"$(INTDIR)\md5.obj" \
"$(INTDIR)\ip.obj" \
"$(INTDIR)\fe-auth.obj" \
***
*** 258,263 
--- 262,277 
  "$(INTDIR)\noblock.obj" : ..\..\port\noblock.c
$(CPP) @<<
$(CPP_PROJ) ..\..\port\noblock.c
+ <<
+ 
+ "$(INTDIR)\chklocale.obj" : ..\..\port\chklocale.c
+   $(CPP) @<<
+   $(CPP_PROJ) ..\..\port\chklocale.c
+ <<
+ 
+ "$(INTDIR)\inet_net_ntop.obj" : ..\..\port\inet_net_ntop.c
+   $(CPP) @<<
+   $(CPP_PROJ) ..\..\port\inet_net_ntop.c
  <<
  
  "$(INTDIR)\md5.obj" : ..\..\backend\libpq\md5.c

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-07 Thread Thom Brown
On 24 August 2011 13:38, Kohei Kaigai  wrote:

> The (2) is new stuff from the revision in commit-fest 1st. It enables to
> supply "NOLEAKY" option on CREATE FUNCTION statement, then the function is
> allowed to distribute across security barrier. Only superuser can set this
> option.
>

"NOLEAKY" doesn't really sound appropriate as it sounds like pidgin English.
 Also, it could be read as "Don't allow leaks in this function".  Could we
instead use something like TRUSTED or something akin to it being allowed to
do more than safer functions?  It then describes its level of behaviour
rather than what it promises not to do.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Cascaded standby message

2011-09-07 Thread Fujii Masao
On Wed, Sep 7, 2011 at 9:10 PM, Thom Brown  wrote:
> On 7 September 2011 11:56, Simon Riggs  wrote:
>> > That's an idea. But what about the patch that I proposed before?
>> > http://archives.postgresql.org/pgsql-hackers/2011-08/msg00816.php
>>
>> Thanks for that. Committed.

Thanks!

> This appears to be the patch submitted to the commitfest.
>  https://commitfest.postgresql.org/action/patch_view?id=630  Can this now be
> marked as committed?

Yes. I did that. Thanks!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascaded standby message

2011-09-07 Thread Thom Brown
On 7 September 2011 11:56, Simon Riggs  wrote:

> On Wed, Sep 7, 2011 at 2:44 AM, Fujii Masao  wrote:
> > On Wed, Sep 7, 2011 at 5:22 AM, Simon Riggs 
> wrote:
> >> Fujii - the original goal here was to avoid having a unexplained
> >> disconnection in the logs. The only way to do this cleanly is to have
> >> a disconnection reason passed to the walsender, rather than just a
> >> blind signal. Looks like we need to multiplex or other mechanism.
> >
> > That's an idea. But what about the patch that I proposed before?
> > http://archives.postgresql.org/pgsql-hackers/2011-08/msg00816.php
>
> Thanks for that. Committed.
> 
>

This appears to be the patch submitted to the commitfest.
https://commitfest.postgresql.org/action/patch_view?id=630  Can this now be
marked as committed?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] savepoint commit performance

2011-09-07 Thread Simon Riggs
On Tue, Sep 6, 2011 at 9:18 PM, Simon Riggs  wrote:

> I'm back now and will act as advertised.

I've revoked the performance aspect. The difference between release
and commit has been maintained since it makes the code easier to
understand.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascaded standby message

2011-09-07 Thread Simon Riggs
On Wed, Sep 7, 2011 at 2:44 AM, Fujii Masao  wrote:
> On Wed, Sep 7, 2011 at 5:22 AM, Simon Riggs  wrote:
>> Fujii - the original goal here was to avoid having a unexplained
>> disconnection in the logs. The only way to do this cleanly is to have
>> a disconnection reason passed to the walsender, rather than just a
>> blind signal. Looks like we need to multiplex or other mechanism.
>
> That's an idea. But what about the patch that I proposed before?
> http://archives.postgresql.org/pgsql-hackers/2011-08/msg00816.php

Thanks for that. Committed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascaded standby message

2011-09-07 Thread Magnus Hagander
On Wed, Sep 7, 2011 at 03:44, Fujii Masao  wrote:
> On Wed, Sep 7, 2011 at 5:22 AM, Simon Riggs  wrote:
>> Fujii - the original goal here was to avoid having a unexplained
>> disconnection in the logs. The only way to do this cleanly is to have
>> a disconnection reason passed to the walsender, rather than just a
>> blind signal. Looks like we need to multiplex or other mechanism.
>
> That's an idea. But what about the patch that I proposed before?
> http://archives.postgresql.org/pgsql-hackers/2011-08/msg00816.php

Seems like a good solution to me - I hadn't noticed that patch before
posting my complaint.

I don't think it's a problem if it logs it multiple times when it
happens - I think it's a much bigger problem that it logs it when it
didn't actually do anything.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote:
> On Mon, Aug 22, 2011 at 3:31 AM, daveg  wrote:
> > So far I've got:
> >
> >  - affects system tables
> >  - happens very soon after process startup
> >  - in 8.4.7 and 9.0.4
> >  - not likely to be hardware or OS related
> >  - happens in clusters for period of a few second to many minutes
> >
> > I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
> > hard to get downtime to pick up new builds. Any other ideas on getting to
> > the bottom of this?
> 
> I've been thinking this one over, and doing a little testing. I'm
> still stumped, but I have a few thoughts.  What that error message is
> really saying is that the LOCALLOCK bookkeeping doesn't match the
> PROCLOCK bookkeeping; it doesn't tell us which one is to blame.
... 
> My second thought is that perhaps a process is occasionally managing
> to exit without fully cleaning up the associated PROCLOCK entry.  At
> first glance, it appears that this would explain the observed
> symptoms.  A new backend gets the PGPROC belonging to the guy who
> didn't clean up after himself, hits the error, and disconnects,
> sticking himself right back on to the head of the SHM_QUEUE where the
> next connection will inherit the same PGPROC and hit the same problem.
>  But it's not clear to me what could cause the system to get into this
> state in the first place, or how it would eventually right itself.
> 
> It might be worth kludging up your system to add a test to
> InitProcess() to verify that all of the myProcLocks SHM_QUEUEs are
> either NULL or empty, along the lines of the attached patch (which
> assumes that assertions are enabled; otherwise, put in an elog() of
> some sort).  Actually, I wonder if we shouldn't move all the
> SHMQueueInit() calls for myProcLocks to InitProcGlobal() rather than
> doing it over again every time someone calls InitProcess().  Besides
> being a waste of cycles, it's probably less robust this way.   If
> there somehow are leftovers in one of those queues, the next
> successful call to LockReleaseAll() ought to clean up the mess, but of
> course there's no chance of that working if we've nuked the queue
> pointers.

I did this in the elog flavor as we don't build production images with asserts.
It has been running on all hosts for a few days. Today it hit the extra
checks in initproc.

00:02:32.782  8626  [unknown] [unknown]  LOG:  connection received: host=bk0 
port=42700
00:02:32.783  8627  [unknown] [unknown]  LOG:  connection received: host=op2 
port=45876
00:02:32.783  8627  d61 apps  FATAL:  Initprocess myProclocks[4] not empty: 
queue 0x2ae6b4b895f8 (prev 0x2ae6b4a2b558, next 0x2ae6b4a2b558) 
00:02:32.783  8626  d35 postgres  LOG:  connection authorized: user=postgres 
database=c35
00:02:32.783  21535  LOG:  server process (PID 8627) exited with exit code 1
00:02:32.783  21535  LOG:  terminating any other active server processes
00:02:32.783  8626  c35 postgres  WARNING:  terminating connection because of 
crash of another server process

The patch that produced this is attached. If you can think of anything I
can add to this to help I'd be happy to do so. Also, can I clean this up
and continue somehow? Maybe clear the queue instead having to have a restart?
Or is there a way to just pause this proc here, maybe mark it not to be used
and exit, or just to sleep forever so I can debug later?

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
--- postgresql-9.0.4/src/backend/storage/lmgr/proc.c2011-04-14 
20:15:53.0 -0700
+++ postgresql-9.0.4.dg/src/backend/storage/lmgr/proc.c 2011-08-23 
17:30:03.505176019 -0700
@@ -323,7 +323,15 @@
MyProc->waitLock = NULL;
MyProc->waitProcLock = NULL;
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+   {
+   SHM_QUEUE *queue = &(MyProc->myProcLocks[i]);
+   if (! (!queue->prev || queue->prev == queue ||
+  !queue->next || queue->next == queue)
+   )
+   elog(FATAL, "Initprocess myProclocks[%d] not empty: 
queue %p (prev %p, next %p) ",
+   i, queue, queue->prev, queue->next);
SHMQueueInit(&(MyProc->myProcLocks[i]));
+   }
MyProc->recoveryConflictPending = false;
 
/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-09-07 Thread hubert depesz lubaczewski
On Tue, Sep 06, 2011 at 09:21:02PM -0400, Bruce Momjian wrote:
> Tom Lane wrote:
> > hubert depesz lubaczewski  writes:
> > > Worked a bit to get the ltree problem down to smallest possible, 
> > > repeatable, situation.
> > 
> > I looked at this again and verified that indeed, commit
> > 8eee65c996048848c20f6637c1d12b319a4ce244 introduced an incompatible
> > change into the on-disk format of ltree columns: it widened
> > ltree_level.len, which is one component of an ltree on disk.
> > So the crash is hardly surprising.  I think that the only thing
> > pg_upgrade could do about it is refuse to upgrade when ltree columns
> > are present in an 8.3 database.  I'm not sure though how you'd identify
> > contrib/ltree versus some random user-defined type named ltree.
> 
> It is actually easy to do using the attached patch.  I check for the
> functions that support the data type and check of they are from an
> 'ltree' shared object.  I don't check actual user table type names in
> this case.

While it will prevent failures in future, it doesn't solve my problem
now :(

Will try to do it via:
- drop indexes on ltree
- convert ltree to text
- upgrade
- convert text to ltree
- create indexes on ltree

Best regards,

depesz


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cheaper snapshots redux

2011-09-07 Thread Amit Kapila
I wanted to clarify my understanding and have some doubts.

>What I'm thinking about instead is using a ring buffer with three pointers:
a start pointer, a stop pointer, and a write pointer.  When a transaction
ends, we
>advance the write pointer, write the XIDs or a whole new snapshot into the
buffer, and then advance the stop pointer.  If we wrote a whole new
snapshot, 
>we advance the start pointer to the beginning of the data we just wrote.
>Someone who wants to take a snapshot must read the data between the start
and stop pointers, and must then check that the write pointer
>hasn't advanced so far in the meantime that the data they read might have
been overwritten before they finished reading it.  Obviously,
>that's a little risky, since we'll have to do the whole thing over if a
wraparound occurs, but if the ring buffer is large enough it shouldn't
happen very often.  
 
Clarification
--
1. With the above, you want to reduce/remove the concurrency issue between
the GetSnapshotData() [used at begining of sql command execution] and
ProcArrayEndTransaction() [used at end transaction]. The concurrency issue
is mainly ProcArrayLock which is taken by GetSnapshotData() in Shared mode
and by ProcArrayEndTransaction() in X mode.
There may be other instances for similar thing, but this the main thing
which you want to resolve.
 
2. You want to resolve it by using ring buffer such that readers don't need
to take any lock.
 
Is my above understanding correct?
 
Doubts


1. 2 Writers; Won't 2 different sessions who try to commit at same time will
get the same write pointer.
I assume it will be protected as even indicated in one of your replies
as I understood?
 
2. 1 Reader, 1 Writter; It might be case that some body has written a new
snapshot and advanced the stop pointer and at that point of time one reader
came and read between start pointer and stop pointer. Now the reader will
see as follows:
   snapshot, few XIDs, snapshot
 
So will it handle this situation such that it will only read latest
snapshot?
 
3. How will you detect overwrite.
 
4. Won't it effect if we don't update xmin everytime and just noting the
committed XIDs. The reason I am asking is that it is used in tuple
visibility check
so with new idea in some cases instead of just returning from begining
by checking xmin it has to go through the committed XID list.
I understand that there may be less cases or the improvement by your
idea can supesede this minimal effect. However some cases can be defeated.
 
 
-- 
With Regards,
Amit Kapila.




***
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

-Original Message-
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
Sent: Sunday, August 28, 2011 7:17 AM
To: Gokulakannan Somasundaram
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] cheaper snapshots redux

On Sat, Aug 27, 2011 at 1:38 AM, Gokulakannan Somasundaram
 wrote:
> First i respectfully disagree with you on the point of 80MB. I would say
> that its very rare that a small system( with <1 GB RAM ) might have a long
> running transaction sitting idle, while 10 million transactions are
sitting
> idle. Should an optimization be left, for the sake of a very small system
to
> achieve high enterprise workloads?

With the design where you track commit-visbility sequence numbers
instead of snapshots, you wouldn't need 10 million transactions that
were all still running.  You would just need a snapshot that had been
sitting around while 10 million transactions completed meanwhile.

That having been said, I don't necessarily think that design is
doomed.  I just think it's going to be trickier to get working than
the design I'm now hacking on, and a bigger change from what we do
now.  If this doesn't pan out, I might try that one, or something
else.

> Second, if we make use of the memory mapped files, why should we think,
that
> all the 80MB of data will always reside in memory? Won't they get paged
out
> by the  operating system, when it is in need of memory? Or do you have
some
> specific OS in mind?

No, I don't think it will all be in memory - but that's part of the
performance calculation.  If you need to check on the status of an XID
and find that you need to read a page of data in from disk, that's
going to be many orders of magnitude slower than anything we do with s
snapshot now.  Now, if you gain enough elsewhere, it could still be 

  1   2   >