date:20140612

Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Andres Freund

Hi Amit,

On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
> 1. In function StartupReplicationSlots(XLogRecPtr checkPointRedo),
> parameter checkPointRedo is not used.

It used to be included in a debug message. Apparently the message was
removed at some point (don't remember it, but I have a memory like a
sieve).

> 2. Few check are in different order in functions
> pg_create_physical_replication_slot() and
> pg_create_logical_replication_slot().
> 
> if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> elog(ERROR, "return type must be a row type");
> check_permissions();
> 
> CheckLogicalDecodingRequirements();
> 
> I don't think it makes any difference, however having checks in similar
> order seems better unless there is a reason for not doing so.

Can change it.

> 3. Currently there is any Assert (Assert(!MyReplicationSlot);) in
> pg_create_logical_replication_slot(), is there a need for similar
> Assert in pg_create_physical_replication_slot()?

Hm. Shouldn't really matter either way these days, but I guess it
doesn't hurt to add one.

> 4.
> StartupDecodingContext()
> {
> ..
> context = AllocSetContextCreate(CurrentMemoryContext,
> "Changeset Extraction Context",
> }
> 
> To be consistent, shall we name this context as
> logical | logical decoding?

Heh. That apparently escaped when things were renamed. Yes.

> 5.
> pg_create_logical_replication_slot()
> {
> ..
> CreateInitDecodingContext()
> 
> ..
> ReplicationSlotPersist()
> }
> 
> Function pg_create_logical_replication_slot() is trying to
> save slot twice once during CreateInitDecodingContext() and
> then in ReplicationSlotPersist(), isn't it better if we can make
> it do it just once?

Doesn't work here. In the first save it's not yet marked as persistent -
but we still need to safely reserve the xmin. It's also not something
that should happen very frequently, so I'm not worried about the
overhead.

> 6.
> elog(ERROR, "cannot handle changeset extraction yet");
> 
> Shouldn't it be better to use logical replication instead
> of changeset extraction?

Will change.

> 7.
> + * XXX: It might make sense to introduce ephemeral slots and always use
> + * the slot mechanism.
> 
> Already there are RS_EPHEMERAL slots, might be this
> comment needs bit of rephrasing.

> 8. Is there a chance of inconsistency, if pg_restxlog resets the
> xlog and after restart, one of the slots contains xlog position
> older than what is resetted by pg_resetxlog?

Yes. There's lots of ways to screw over your database by using
pg_resetxlog. I personally think there should be a required
--yes-i-am-breaking-my-database parameter for pg_resetxlog.

> 9.
> void
> LogicalIncreaseXminForSlot(XLogRecPtr current_lsn, TransactionId xmin)
> {
> ..
> /*
>  * don't overwrite if we already have a newer xmin. This can happen if we
>  * restart decoding in a slot.
>  */
> if (TransactionIdPrecedesOrEquals(xmin, slot->data.catalog_xmin))
> {
> }
> ..
> }
> 
> Should we just release spinlock in this loop and just return,
> rather than keeping no action loop?

Don't think that'd make it any faster/simpler. We'd be stuck with
duplicate codepaths.

> 10.
> * Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a
> suitable
>  * index. Otherwise, it should be InvalidOid.
>  */
> static void
> relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
>bool is_internal)
> 
> typo - *Iff*

iff := if and only if.

Thanks for the look.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Shared memory changes in 9.4?

2014-06-12 Thread Christoph Berg

[redirecting to -hackers]

Re: Robert Haas 2014-05-28 

> On Tue, May 27, 2014 at 8:22 PM, Maciek Sakrejda  wrote:
> > On Mon, May 26, 2014 at 12:24 AM, Andres Freund 
> > wrote:
> >> Any chance you're using a 9.3 configuration file instead of the one
> >> generated by initdb?
> >> dynamic_shared_memory_type defaults to 'posix' if not specified in the
> >> config file (on platforms supporting it). If initdb detects that 'posix'
> >> can't be used it'll emit a different value. If you're copying the config
> >> from 9.3 and your environment doesn't support posix shm that'll cause
> >> the above error.
> >> I still think dynamic_shared_memory_type should default to 'none'
> >> because of such problems
> >
> > It works with 'none' and 'sysv'--I think the issue is that technically our
> > environment does support 'posix', but '/dev/shm' is indeed not mounted in
> > the LXC container, leading to a discrepancy between what initdb decides and
> > what's actually possible. Thanks for your help.
> 
> I think it would be good to understand why initdb isn't getting this
> right.  Did you run initdb outside the LXC container, where /dev/shm
> would have worked, but then run postgres inside the LXC container,
> where /dev/shm does not work?  I ask because initdb is supposed to be
> doing the same thing that postgres does, so it really ought to come to
> the same conclusion about what will and won't work.
> 
> With regard to Andres' proposal, I'm not that keen on setting
> dynamic_shared_memory_type='none' by default.  Would we leave it that
> way until we get in-core users of the facility, and then change it?  I
> guess that'd be OK, but frankly if enabling dynamic_shared_memory_type
> by default is causing us many problems, then we'd better reconsider
> the design of the facility now, before we start adding more
> dependencies on it.  We've already fixed a bunch of DSM-related issues
> as a result of the fact that the default *isn't* none, and I dunno how
> many of those we would have found if the default had been none.  I
> tend to think DSM is an important facility that we're going to be
> wanting to build on in future releases, so I'm keen to have it
> available by default so that we can iron out any kinks before we get
> too far down that path.

Hi,

I've just run into the same problem. I am running the Debian
postgresql-common testsuite, which includes upgrade tests. On
upgrades, the old postgresql.conf is copied to the new server (after
initdb and/or pg_upgrade), and deprecated options are removed/renamed. [*]

In that chroot environment, 9.4 is running fine, except that upgrades
failed because /dev/shm wasn't mounted, and the old 9.3
postgresql.conf doesn't contain dynamic_shared_memory_type = 'sysv'.

To me, the following should be done:
* Make initdb determine the best shm type for this platform and write
  it into postgresql.conf as it does now.
* If no dynamic_shared_memory_type is found in the config, default to
  "none".
* Modify the three identical error messages concerned about shm
  segments to include the shm type instead of always just saying
  "FATAL:  could not open shared memory segment"
* Add a HINT to the POSIX error message:
  "HINT: This might indicate that /dev/shm is not mounted, or its
  permissions do not allow the database user to create files there"

Despite /dev/shm having been around for quite some time, many people
seem to be unaware what POSIX shared memory is, so the HINT is really
needed there. It certainly took me weeks after seeing the error
message the first time till I found the time to dig deeper to the
issure - it should just have been "oh yes, /dev/shm isn't mounted
there, that's why".

Christoph

[*] This might not be the smartest thing to do, but it's a simple
default approach to get the new cluster running with as much user
config from the old config as possible.
-- 
c...@df7cb.de | http://www.df7cb.de/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Shared memory changes in 9.4?

2014-06-12 Thread Andres Freund

Hi,

On 2014-06-12 11:07:31 +0200, Christoph Berg wrote:
> Re: Robert Haas 2014-05-28 
> 
> > On Tue, May 27, 2014 at 8:22 PM, Maciek Sakrejda  
> > wrote:
> > > On Mon, May 26, 2014 at 12:24 AM, Andres Freund 
> > > wrote:
> > >> Any chance you're using a 9.3 configuration file instead of the one
> > >> generated by initdb?
> > >> dynamic_shared_memory_type defaults to 'posix' if not specified in the
> > >> config file (on platforms supporting it). If initdb detects that 'posix'
> > >> can't be used it'll emit a different value. If you're copying the config
> > >> from 9.3 and your environment doesn't support posix shm that'll cause
> > >> the above error.
> > >> I still think dynamic_shared_memory_type should default to 'none'
> > >> because of such problems

> > With regard to Andres' proposal, I'm not that keen on setting
> > dynamic_shared_memory_type='none' by default.

Note that I'm not proposing to disable the whole thing. Just that a
unset dynamic_shared_memory_type doesn't configure dsm. Initdb would
still configure it after probing.

> To me, the following should be done:
> * Make initdb determine the best shm type for this platform and write
>   it into postgresql.conf as it does now.
> * If no dynamic_shared_memory_type is found in the config, default to
>   "none".
> * Modify the three identical error messages concerned about shm
>   segments to include the shm type instead of always just saying
>   "FATAL:  could not open shared memory segment"
> * Add a HINT to the POSIX error message:
>   "HINT: This might indicate that /dev/shm is not mounted, or its
>   permissions do not allow the database user to create files there"

Sounds like a sane plan to me.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "RETURNING PRIMARY KEY" syntax extension

2014-06-12 Thread Jochem van Dieten

On Wed, Jun 11, 2014 at 2:39 AM, Tom Lane wrote:
>
> I'm not even 100% sold that automatically returning the primary key
> is going to save any application logic.  Could somebody point out
> *exactly* where an app is going to save effort with this type of
> syntax, compared to requesting the columns it wants by name?

I haven't checked the code, but I am hoping it will help with the problem
where a RETURNING * is added to a statement that is not an insert or update
by the JDBC driver. That has been reported on the JDBC list at least twice,
and the proposed workaround is neither very elegant nor very robust:
https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ

Jochem

-- 
Jochem van Dieten
http://jochem.vandieten.net/

Re: [HACKERS] "RETURNING PRIMARY KEY" syntax extension

2014-06-12 Thread Ian Barwick

On 14/06/12 18:46, Jochem van Dieten wrote:
> On Wed, Jun 11, 2014 at 2:39 AM, Tom Lane wrote:
> 
> I'm not even 100% sold that automatically returning the primary key
> is going to save any application logic.  Could somebody point out
> *exactly* where an app is going to save effort with this type of
> syntax, compared to requesting the columns it wants by name?
> 
> 
> I haven't checked the code, but I am hoping it will help with the problem 
> where a RETURNING * is added to a statement that is not an insert or update
> by the JDBC driver. That has been reported on the JDBC list at least twice,
> and the proposed workaround is neither very elegant nor very robust:
> https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ

Unfortunately that seems to be a JDBC-specific issue, which is outside
of the scope of this particular patch (which proposes additional server-side
syntax intended to make RETURNING * operations more efficient for
certain use cases, but which is in itself not a JDBC change).


Regards

Ian Barwick

-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Andres Freund

Hi,

On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
> > 6.
> > elog(ERROR, "cannot handle changeset extraction yet");
> > 
> > Shouldn't it be better to use logical replication instead
> > of changeset extraction?
> 
> Will change.

I don't see that message anywhere in current code? All of those were
rephrased in b89e151054a05f0f6d356ca52e3b725dd0505e53 (where Robert
changed the term to logical decoding) and then implemented in
5a991ef8692ed0d170b44958a81a6bd70e90585c.

Pushed a fix for the rest of the ones I commented on earlier. Thanks!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] replication commands and log_statements

2014-06-12 Thread Fujii Masao

On Wed, Jun 11, 2014 at 11:55 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> Your wish just seems like a separate feature to me. Including
>> replication commands in 'all' seems correct independent of the desire
>> for a more granular control.
>
> No, I think I've got to vote with the other side on that.
>
> The reason we can have log_statement as a scalar progression
> "none < ddl < mod < all" is that there's little visible use-case
> for logging DML but not DDL, nor for logging SELECTS but not
> INSERT/UPDATE/DELETE.  However, logging replication commands seems
> like something people would reasonably want an orthogonal control for.
> There's no nice way to squeeze such a behavior into log_statement.
>
> I guess you could say that log_statement treats replication commands
> as if they were DDL, but is that really going to satisfy users?
>
> I think we should consider log_statement to control logging of
> SQL only, and invent a separate GUC (or, in the future, likely
> more than one GUC?) for logging of replication activity.

Seems reasonable. OK. The attached patch adds log_replication_command
parameter which causes replication commands to be logged. I added this to
next CF.

Regards,

-- 
Fujii Masao
From 9874e36a8f65667d1f4d3a9a8d1d87d2d20f5188 Mon Sep 17 00:00:00 2001
From: MasaoFujii 
Date: Thu, 12 Jun 2014 19:32:00 +0900
Subject: [PATCH] Add log_replication_command configuration parameter.

This GUC causes replication commands like IDENTIFY_SYSTEM
to be logged in the server log.
---
 doc/src/sgml/config.sgml  |   16 
 src/backend/replication/walsender.c   |   11 +--
 src/backend/utils/misc/guc.c  |9 +
 src/backend/utils/misc/postgresql.conf.sample |1 +
 src/include/replication/walsender.h   |1 +
 5 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 697cf99..8532f08 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4534,6 +4534,22 @@ FROM pg_stat_activity;
   
  
 
+ 
+  log_replication_command (boolean)
+  
+   log_replication_command configuration parameter
+  
+  
+  
+   
+Causes each replication command to be logged in the server log.
+See  for more information about
+replication command. The default value is off.
+Only superusers can change this setting.
+   
+  
+ 
+
  
   log_temp_files (integer)
   
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 088ee2c..009ad92 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -108,6 +108,7 @@ bool		am_db_walsender = false;	/* Connected to a database? */
 int			max_wal_senders = 0;	/* the maximum number of concurrent walsenders */
 int			wal_sender_timeout = 60 * 1000;		/* maximum time to send one
  * WAL data message */
+bool		log_replication_command = false;
 
 /*
  * State for WalSndWakeupRequest
@@ -1261,13 +1262,19 @@ exec_replication_command(const char *cmd_string)
 	MemoryContext old_context;
 
 	/*
+	 * Log replication command if log_replication_command is enabled.
+	 * Even when it's disabled, log the command with DEBUG1 level for
+	 * backward compatibility.
+	 */
+	ereport(log_replication_command ? LOG : DEBUG1,
+			(errmsg("received replication command: %s", cmd_string)));
+
+	/*
 	 * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next
 	 * command arrives. Clean up the old stuff if there's anything.
 	 */
 	SnapBuildClearExportedSnapshot();
 
-	elog(DEBUG1, "received replication command: %s", cmd_string);
-
 	CHECK_FOR_INTERRUPTS();
 
 	cmd_context = AllocSetContextCreate(CurrentMemoryContext,
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1d094f0..427af79 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -931,6 +931,15 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
+		{"log_replication_command", PGC_SUSET, LOGGING_WHAT,
+			gettext_noop("Logs each replication command."),
+			NULL
+		},
+		&log_replication_command,
+		false,
+		NULL, NULL, NULL
+	},
+	{
 		{"debug_assertions", PGC_USERSET, DEVELOPER_OPTIONS,
 			gettext_noop("Turns on various assertion checks."),
 			gettext_noop("This is a debugging aid."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index d109394..70d86cf 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -431,6 +431,7 @@
 	# e.g. '<%u%%%d> '
 #log_lock_waits = off			# log lock waits >= deadlock_timeout
 #log_statement = 'none'			# none, ddl, mod, all
+#log_replication_command = off
 #log_temp_files = -1			# log temporary files equal or larger
 	# than the specified size in kilobytes;
 	# -1 disabl

Re: [HACKERS] Inaccuracy in VACUUM's tuple count estimates

2014-06-12 Thread Andres Freund

Hi Tom,

On 2014-06-06 15:44:25 -0400, Tom Lane wrote:
> I figured it'd be easy enough to get a better estimate by adding another
> counter to count just LIVE and INSERT_IN_PROGRESS tuples (thus effectively
> assuming that in-progress inserts and deletes will both commit).

Did you plan to backpatch that? My inclination would be no...

>  I did
> that, and found that it helped Tim's test case not at all :-(.  A bit of
> sleuthing revealed that HeapTupleSatisfiesVacuum actually returns
> INSERT_IN_PROGRESS for any tuple whose xmin isn't committed, regardless of
> whether the transaction has since marked it for deletion:
> 
> /*
>  * It'd be possible to discern between INSERT/DELETE in progress
>  * here by looking at xmax - but that doesn't seem beneficial for
>  * the majority of callers and even detrimental for some. We'd
>  * rather have callers look at/wait for xmin than xmax. It's
>  * always correct to return INSERT_IN_PROGRESS because that's
>  * what's happening from the view of other backends.
>  */
> return HEAPTUPLE_INSERT_IN_PROGRESS;
> 
> It did not use to blow this question off: back around 8.3 you got
> DELETE_IN_PROGRESS if the tuple had a delete pending.  I think we need
> less laziness + fuzzy thinking here.  Maybe we should have a separate
> HEAPTUPLE_INSERT_AND_DELETE_IN_PROGRESS result code?  Is it *really*
> the case that callers other than VACUUM itself are okay with failing
> to make this distinction?  I'm dubious: there are very few if any
> callers that treat the INSERT and DELETE cases exactly alike.

My current position on this is that we should leave the code as is <9.4
and HEAPTUPLE_INSERT_IN_PROGRESS for the 9.4/master. Would you be ok
with that? The second best thing imo would be to discern and return
HEAPTUPLE_INSERT_IN_PROGRESS/HEAPTUPLE_DELETE_IN_PROGRESS for the
respective cases.
Which way would you like to go?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Audit of logout

2014-06-12 Thread Fujii Masao

Hi,

Some users enable log_disconnections in postgresql.conf to audit all logouts.
But since log_disconnections is defined with PGC_BACKEND, it can be changed
at connection start. This means that any client (even nonsuperuser) can freely
disable log_disconnections not to log his or her logout even when the
system admin
enables it in postgresql.conf. Isn't this problematic for audit?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "cancelling statement due to user request error" occurs but the transaction has committed.

2014-06-12 Thread Naoya Anzai

Hi,

> Well, the only other principled fix I can see is to add a new reponse
> along the lines of ERRORBUTITCOMMITTED, which does not seem attractive
> either, since all clients will have to be taught to understand it.

+1

I think current specification hard to understand for many users.
It is really good if PostgreSQL gave us a message such as a replication abort 
warning:
###
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been 
replicated to the standby.
###

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: anzai-na...@mxu.nes.nec.co.jp
---



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "RETURNING PRIMARY KEY" syntax extension

2014-06-12 Thread Jochem van Dieten

On Thu, Jun 12, 2014 at 12:25 PM, Ian Barwick wrote:

> On 14/06/12 18:46, Jochem van Dieten wrote:
> > I haven't checked the code, but I am hoping it will help with the problem
> > where a RETURNING * is added to a statement that is not an insert or
> update
> > by the JDBC driver. That has been reported on the JDBC list at least
> twice,
> > and the proposed workaround is neither very elegant nor very robust:
> >
> https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ
>
> Unfortunately that seems to be a JDBC-specific issue, which is outside
> of the scope of this particular patch (which proposes additional
> server-side
> syntax intended to make RETURNING * operations more efficient for
> certain use cases, but which is in itself not a JDBC change).
>

But the obvious way to fix the JDBC issue is not to fix it by adding a
'mini parser' on the JDBC side, but to make SELECT ... RETURNING PRIMARY
KEY a regular select that silently ignores the returning clause and doesn't
throw an error on the server-side.

That might still be outside the scope of this particular patch, but it
would provide (additional) justification if it were supported.

Jochem

-- 
Jochem van Dieten
http://jochem.vandieten.net/

Re: [HACKERS] "RETURNING PRIMARY KEY" syntax extension

2014-06-12 Thread Andres Freund

On 2014-06-12 13:58:31 +0200, Jochem van Dieten wrote:
> But the obvious way to fix the JDBC issue is not to fix it by adding a
> 'mini parser' on the JDBC side, but to make SELECT ... RETURNING PRIMARY
> KEY a regular select that silently ignores the returning clause and doesn't
> throw an error on the server-side.

Brr. Then it'd need to be added to all statements, not just SELECT. I've
my doubts that's going to happen.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "RETURNING PRIMARY KEY" syntax extension

2014-06-12 Thread Ian Barwick

On 14/06/12 20:58, Jochem van Dieten wrote:
> On Thu, Jun 12, 2014 at 12:25 PM, Ian Barwick wrote:
> 
> On 14/06/12 18:46, Jochem van Dieten wrote:
> > I haven't checked the code, but I am hoping it will help with the 
> problem
> > where a RETURNING * is added to a statement that is not an insert or 
> update
> > by the JDBC driver. That has been reported on the JDBC list at least 
> twice,
> > and the proposed workaround is neither very elegant nor very robust:
> > 
> https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ
> 
> Unfortunately that seems to be a JDBC-specific issue, which is outside
> of the scope of this particular patch (which proposes additional 
> server-side
> syntax intended to make RETURNING * operations more efficient for
> certain use cases, but which is in itself not a JDBC change).
> 
> 
> But the obvious way to fix the JDBC issue is not to fix it by adding a 'mini 
> parser' on
> the JDBC side, but to make SELECT ... RETURNING PRIMARY KEY a regular select 
> that silently
> ignores the returning clause and doesn't throw an error on the server-side.
> 
> That might still be outside the scope of this particular patch, but it would 
> provide 
> (additional) justification if it were supported.

That would be adding superfluous, unused and unusable syntax of no potential 
value
(there is no SELECT ... RETURNING and it wouldn't make any sense if there was) 
as a
workaround for a driver issue - not going to happen.

Regards

Ian Barwick


-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-12 Thread Kevin Grittner

Robert Haas  wrote:

> Even aside from security exposures, how
> does a non-superuser who runs pg_dump know whether they've got a
> complete backup or a filtered dump that's missing some rows?

This seems to me to be a killer objection to the feature as
proposed, and points out a huge difference between column level
security and the proposed implementation of row level security. 
(In fact it is a difference between just about any GRANTed
permission and row level security.)  If you try to SELECT * FROM
sometable and you don't have rights to all the columns, you get an
error.  A dump would always either work as expected or generate an
error.

test=# create user bob;
CREATE ROLE
test=# create user bill;
CREATE ROLE
test=# set role bob;
SET
test=> create table person (person_id int not null primary key,
name text not null, ssn text);
CREATE TABLE
test=> grant select (person_id, name) on table person to bill;
GRANT
test=> reset role;
RESET
test=# set role bill;
SET
test=> select person_id, name from person;
 person_id | name 
---+--
(0 rows)

test=> select * from person;
ERROR:  permission denied for relation person

The proposed approach would leave the validity of any dump which
was not run as a superuser in doubt.  The last thing we need, in
terms of improving security, is another thing you can't do without
connecting as a superuser.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Shared memory changes in 9.4?

2014-06-12 Thread Christoph Berg

Re: Andres Freund 2014-06-12 <20140612094112.gz8...@alap3.anarazel.de>
> > * Make initdb determine the best shm type for this platform and write
> >   it into postgresql.conf as it does now.
> > * If no dynamic_shared_memory_type is found in the config, default to
> >   "none".
> > * Modify the three identical error messages concerned about shm
> >   segments to include the shm type instead of always just saying
> >   "FATAL:  could not open shared memory segment"
> > * Add a HINT to the POSIX error message:
> >   "HINT: This might indicate that /dev/shm is not mounted, or its
> >   permissions do not allow the database user to create files there"
> 
> Sounds like a sane plan to me.

Here are two patches, one that implements the annotated error
messages, and one that selects none as default.

It might also make sense to add a Note that POSIX depends on /dev/shm,
and also a Note that dynamic_shared_memory_type is not related to
the shared_buffers shm segments, which I didn't include here.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
new file mode 100644
index 0819641..780e3a5
*** a/src/backend/storage/ipc/dsm_impl.c
--- b/src/backend/storage/ipc/dsm_impl.c
*** dsm_impl_posix(dsm_op op, dsm_handle han
*** 289,296 
  		if (errno != EEXIST)
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
! 	 errmsg("could not open shared memory segment \"%s\": %m",
! 			name)));
  		return false;
  	}
  
--- 289,299 
  		if (errno != EEXIST)
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
! 	 errmsg("could not open POSIX shared memory segment \"%s\": %m",
! 			name),
! 	 errhint("This error usually means that /dev/shm is not mounted, or its "
! 			 "permissions do not allow the database user to create files "
! 			 "there.")));
  		return false;
  	}
  
*** dsm_impl_posix(dsm_op op, dsm_handle han
*** 313,319 
  
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
! 	 errmsg("could not stat shared memory segment \"%s\": %m",
  			name)));
  			return false;
  		}
--- 316,322 
  
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
! 	 errmsg("could not stat POSIX shared memory segment \"%s\": %m",
  			name)));
  			return false;
  		}
*** dsm_impl_posix(dsm_op op, dsm_handle han
*** 332,338 
  
  		ereport(elevel,
  (errcode_for_dynamic_shared_memory(),
! 		 errmsg("could not resize shared memory segment %s to %zu bytes: %m",
  name, request_size)));
  		return false;
  	}
--- 335,341 
  
  		ereport(elevel,
  (errcode_for_dynamic_shared_memory(),
! 		 errmsg("could not resize POSIX shared memory segment %s to %zu bytes: %m",
  name, request_size)));
  		return false;
  	}
*** dsm_impl_posix(dsm_op op, dsm_handle han
*** 358,364 
  
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
!    errmsg("could not unmap shared memory segment \"%s\": %m",
  		  name)));
  			return false;
  		}
--- 361,367 
  
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
!    errmsg("could not unmap POSIX shared memory segment \"%s\": %m",
  		  name)));
  			return false;
  		}
*** dsm_impl_posix(dsm_op op, dsm_handle han
*** 382,388 
  
  		ereport(elevel,
  (errcode_for_dynamic_shared_memory(),
!  errmsg("could not map shared memory segment \"%s\": %m",
  		name)));
  		return false;
  	}
--- 385,391 
  
  		ereport(elevel,
  (errcode_for_dynamic_shared_memory(),
!  errmsg("could not map POSIX shared memory segment \"%s\": %m",
  		name)));
  		return false;
  	}
*** dsm_impl_sysv(dsm_op op, dsm_handle hand
*** 512,518 
  errno = save_errno;
  ereport(elevel,
  		(errcode_for_dynamic_shared_memory(),
! 		 errmsg("could not get shared memory segment: %m")));
  			}
  			return false;
  		}
--- 515,521 
  errno = save_errno;
  ereport(elevel,
  		(errcode_for_dynamic_shared_memory(),
! 		 errmsg("could not get System V shared memory segment: %m")));
  			}
  			return false;
  		}
*** dsm_impl_sysv(dsm_op op, dsm_handle hand
*** 530,536 
  		{
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
!    errmsg("could not unmap shared memory segment \"%s\": %m",
  		  name)));
  			return false;
  		}
--- 533,539 
  		{
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
!    errmsg("could not unmap System V shared memory segment \"%s\": %m",
  		  name)));
  			return false;
  		}
*** dsm_impl_sysv(dsm_op op, dsm_handle hand
*** 540,546 
  		{
  			ereport(elevel,
  	(errcode_for_dynamic_shared_memory(),
!   errmsg("could not remove shared memory segment \"%s\": %m",
  		 name)));
  			return false;
  		}
--- 543,549 
  		{

Re: [HACKERS] replication commands and log_statements

2014-06-12 Thread Robert Haas

On Wed, Jun 11, 2014 at 7:42 AM, Magnus Hagander  wrote:
>> Replication commands like IDENTIFY_COMMAND are not logged even when
>> log_statements is set to all. Some users who use log_statements to
>> audit *all* statements might dislike this current situation. So I'm
>> thinking to change log_statements or add something like log_replication
>> so that we can log replication commands. Thought?
>
> +1. I think adding a separate parameter is the way to go.
>
> The other option would be to turn log_statements into a parameter that you
> specify multiple ones

I kind of like this idea, but...

> - so instead of "all" today it would be "ddl,dml,all"
> or something like that, and then you'd also add "replication" as an option.
> But that would cause all sorts of backwards compatibility annoyances.. And
> do you really want to be able to say things like "ddl,all" meanin you'd get
> ddl and select but not dml?

...you lost me here.  I mean, I think it could be quite useful to
redefine the existing GUC as a list.  We could continue to have ddl,
dml, and all as tokens that would be in the list, but you wouldn't
write "ddl,dml,all" because "all" would include everything that those
other ones would log.  But then you could have combinations like
"dml,replication" and so on.  And you could do much more fine-grained
things, like allow log_statement=create,alter,drop to log all such
statements but not, for example, cluster.

I think if we go the route of adding a separate GUC for this, we're
going to get tired of adding GUCs way before we've come close to
meeting the actual requirements in this area.  A comma-separated list
of tokens seems to offer a lot more flexibility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-12 Thread Fujii Masao

On Tue, May 27, 2014 at 2:05 PM, Amit Kapila  wrote:
> On Sun, May 11, 2014 at 11:23 PM, Tom Lane  wrote:
>> I think it's clearly *necessary* to forbid setting data_directory in
>> postgresql.auto.conf.  The file is defined to be found in the data
>> directory, so any such setting is circular logic by definition;
>> no good can come of not rejecting it.
>>
>> We already have a GUC flag bit about disallowing certain variables
>> in the config file (though I don't remember if it's enforced or
>> just advisory).  It seems to me that we'd better invent one for
>> disallowing in ALTER SYSTEM, as well.
>
> I introduced a new flag bit (GUC_DISALLOW_IN_AUTO_FILE) to
> disallow parameters by Alter System and disallowed data_directory to
> be set by Alter System.

We should document what types of parameters are not allowed to be set by
ALTER SYSTEM SET?

data_directory was displayed when I typed "TAB" just after ALTER SYSTEM SET.
Probably tab-completion for ALTER SYSTEM SET needs to be changed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Amit Kapila

On Thu, Jun 12, 2014 at 5:02 PM, Andres Freund 
wrote:
> On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
> > > 6.
> > > elog(ERROR, "cannot handle changeset extraction yet");
> > >
> > > Shouldn't it be better to use logical replication instead
> > > of changeset extraction?
> >
> > Will change.
>
> I don't see that message anywhere in current code?

Right, actually I was reading code from Git History and also
referring latest code, so it seems I have seen that in original
commit and missed to verify it in latest code.

While checking latest code, I got usage of *changeset extraction*
in below comment:

/*

..

*

 * This is useful to initialize the cutoff xid after which a new *changeset*

 * *extraction* replication slot can start decoding changes.

 *

..

*/

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

Noah Misch  writes:
>> lo_new() or lo_make()?  An earlier draft of the patch that added
>> lo_create(oid, bytea) had a similar function named make_lo().

It appears that lo_make() has a small plurality, but before we lock
that name in, there was one other idea that occurred to me: the
underlying C function is currently named lo_create_bytea(), and
that seems like not an awful choice for the SQL name either.

Any other votes out there?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Andres Freund

On 2014-06-12 10:48:01 -0400, Tom Lane wrote:
> Noah Misch  writes:
> >> lo_new() or lo_make()?  An earlier draft of the patch that added
> >> lo_create(oid, bytea) had a similar function named make_lo().
> 
> It appears that lo_make() has a small plurality, but before we lock
> that name in, there was one other idea that occurred to me: the
> underlying C function is currently named lo_create_bytea(), and
> that seems like not an awful choice for the SQL name either.
> 
> Any other votes out there?

I prefer lo_create_bytea() or even lo_create_from_bytea(),
lo_from_bytea().

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Alvaro Herrera

Tom Lane wrote:
> Noah Misch  writes:
> >> lo_new() or lo_make()?  An earlier draft of the patch that added
> >> lo_create(oid, bytea) had a similar function named make_lo().
> 
> It appears that lo_make() has a small plurality, but before we lock
> that name in, there was one other idea that occurred to me: the
> underlying C function is currently named lo_create_bytea(), and
> that seems like not an awful choice for the SQL name either.
> 
> Any other votes out there?

I was also going to suggest lo_create_bytea().  Another similar
possibility would be lo_from_bytea() -- or, since we have overloading
(and we can actually use it in this case without breaking libpq), we
could just have lo_from(oid, bytea).

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Shared memory changes in 9.4?

2014-06-12 Thread Robert Haas

On Thu, Jun 12, 2014 at 5:41 AM, Andres Freund  wrote:
>> > With regard to Andres' proposal, I'm not that keen on setting
>> > dynamic_shared_memory_type='none' by default.
>
> Note that I'm not proposing to disable the whole thing. Just that a
> unset dynamic_shared_memory_type doesn't configure dsm. Initdb would
> still configure it after probing.

OK, I misunderstood your position; thanks for clarifying.  I think
that would be OK with me.  To some degree, I think that the test setup
is broken by design: while we try to maintain backward-compatibility
for postgresql.conf files, there's never been any guarantee that an
old postgresql.conf file will work on a newer server.  For example, a
whole lot of pre-8.4 users probably had max_fsm_pages and
max_fsm_relations configured, and with 8.4, those settings went away.
Fixing that kind of thing is an essential part of the upgrade process.

That having been said, in this particular case, we can probably ease
the pain without much downside by doing as you suggest.  The only
thing I'm worried about is that people will discover much later that
they don't have working dynamic shared memory, and be unhappy about
that.  Sometimes it's better to complain loudly at the beginning than
to leave buried problems for later.  But I'll defer to the majority on
what to do in his instance.

>> To me, the following should be done:
>> * Make initdb determine the best shm type for this platform and write
>>   it into postgresql.conf as it does now.
>> * If no dynamic_shared_memory_type is found in the config, default to
>>   "none".
>> * Modify the three identical error messages concerned about shm
>>   segments to include the shm type instead of always just saying
>>   "FATAL:  could not open shared memory segment"
>> * Add a HINT to the POSIX error message:
>>   "HINT: This might indicate that /dev/shm is not mounted, or its
>>   permissions do not allow the database user to create files there"
>
> Sounds like a sane plan to me.

+1 to the rest of this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Robert Haas

On Thu, Jun 12, 2014 at 10:56 AM, Alvaro Herrera
 wrote:
> Tom Lane wrote:
>> Noah Misch  writes:
>> >> lo_new() or lo_make()?  An earlier draft of the patch that added
>> >> lo_create(oid, bytea) had a similar function named make_lo().
>>
>> It appears that lo_make() has a small plurality, but before we lock
>> that name in, there was one other idea that occurred to me: the
>> underlying C function is currently named lo_create_bytea(), and
>> that seems like not an awful choice for the SQL name either.
>>
>> Any other votes out there?
>
> I was also going to suggest lo_create_bytea().

That sounds good to me, too.

Presumably we should also fix libpq to not be so dumb.  I mean, it
doesn't help with the immediate problem, since as you say there could
be non-upgraded copies of libpq out there for the indefinite future,
but it still seems like something we oughta fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

Alvaro Herrera  writes:
> Tom Lane wrote:
>> Any other votes out there?

> I was also going to suggest lo_create_bytea().  Another similar
> possibility would be lo_from_bytea() -- or, since we have overloading
> (and we can actually use it in this case without breaking libpq), we
> could just have lo_from(oid, bytea).

Andres also mentioned lo_from_bytea(), and I kinda like it too.
I don't like plain lo_from(), as it's not too apparent what it's
supposed to get data "from" --- someone might think the second
parameter was supposed to be a file name a la lo_import(),
for example.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Suppressing unused subquery output columns

2014-06-12 Thread Robert Haas

On Thu, Jun 5, 2014 at 10:27 PM, Tom Lane  wrote:
> The attached draft patch fixes this by deleting unused output expressions
> from unflattened subqueries, so that we get:
>
> regression=# explain select f1 from (select * from t1 left join t2 on f1=f2 
> limit 1) ss;
> QUERY PLAN
> --
>  Subquery Scan on ss  (cost=0.00..0.02 rows=1 width=4)
>->  Limit  (cost=0.00..0.01 rows=1 width=4)
>  ->  Seq Scan on t1  (cost=0.00..34.00 rows=2400 width=4)
>  Planning time: 0.193 ms
> (4 rows)
>
> I'm not entirely convinced that it's worth the extra planning cycles,
> though.  Given the small number of complaints to date, it might not
> be worth doing this.  Thoughts?

I've encountered this issue before and think it would be great to fix
it.  I'm not sure precisely how many cycles it's worth paying, but
definitely more than zero.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Pavel Stehule

Lo_from_bytea sounds me better than lo_create_bytea

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

Robert Haas  writes:
> Presumably we should also fix libpq to not be so dumb.  I mean, it
> doesn't help with the immediate problem, since as you say there could
> be non-upgraded copies of libpq out there for the indefinite future,
> but it still seems like something we oughta fix.

It's been in the back of my mind for awhile that doing a dynamic query at
all here is pretty pointless.  It's not like the OIDs of those functions
ever have or ever will move.  It would probably be more robust if we
just let libpq be a consumer of fmgroids.h and refer directly to the
constants F_LO_CREATE etc.

I think there was some previous discussion about this, possibly tied
to also having a better-defined way to let clients depend on hard-wired
type OIDs, but I'm too lazy to search the archives right now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Shared memory changes in 9.4?

2014-06-12 Thread Andres Freund

Hi,

On 2014-06-12 11:08:35 -0400, Robert Haas wrote:
> On Thu, Jun 12, 2014 at 5:41 AM, Andres Freund  wrote:
> >> > With regard to Andres' proposal, I'm not that keen on setting
> >> > dynamic_shared_memory_type='none' by default.
> >
> > Note that I'm not proposing to disable the whole thing. Just that a
> > unset dynamic_shared_memory_type doesn't configure dsm. Initdb would
> > still configure it after probing.
> 
> OK, I misunderstood your position; thanks for clarifying.  I think
> that would be OK with me.  To some degree, I think that the test setup
> is broken by design: while we try to maintain backward-compatibility
> for postgresql.conf files, there's never been any guarantee that an
> old postgresql.conf file will work on a newer server.  For example, a
> whole lot of pre-8.4 users probably had max_fsm_pages and
> max_fsm_relations configured, and with 8.4, those settings went away.
> Fixing that kind of thing is an essential part of the upgrade process.

While I think I see where you're coming from I don't think these cases
are comparable. In this case commenting out the GUC can prevent the
server from starting. That's pretty odd imo.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Suppressing unused subquery output columns

2014-06-12 Thread Tom Lane

Robert Haas  writes:
> On Thu, Jun 5, 2014 at 10:27 PM, Tom Lane  wrote:
>> The attached draft patch fixes this by deleting unused output expressions
>> from unflattened subqueries, so that we get:
>> ...
>> I'm not entirely convinced that it's worth the extra planning cycles,
>> though.  Given the small number of complaints to date, it might not
>> be worth doing this.  Thoughts?

> I've encountered this issue before and think it would be great to fix
> it.  I'm not sure precisely how many cycles it's worth paying, but
> definitely more than zero.

We have a couple votes for this patch and no one has spoken against it,
so I'll go ahead and push it into HEAD.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-06-12 Thread Robert Haas

On Thu, Jun 5, 2014 at 8:37 PM, Peter Geoghegan  wrote:
> * A configure AC_TRY_RUN tests the suitability of the system strxfrm()
> implementation for the purposes of this optimization. There can be no
> "header bytes" (people at pgCon reported redundant bytes on the Mac
> OSX implementation at pgCon, to my great surprise), ...

I'm attaching a small test program demonstrating the Mac behavior.
Here's some sample output:

[rhaas ~]$ ./strxfrm en_US "" a aa ab abc abcd abcde peter Geoghegan _ %
"" -> ""
"a" -> "001S001S"
"aa" -> "001S001S001S001S"
"ab" -> "001S001T001S001T"
"abc" -> "001S001T001U001S001T001U"
"abcd" -> "001S001T001U001V001S001T001U001V"
"abcde" -> "001S001T001U001V001W001S001T001U001V001W"
"peter" -> "001b001W001f001W001d001b001W001f001W001d"
"Geoghegan" -> 
"0019001W001a001Y001Z001W001Y001S001`0019001W001a001Y001Z001W001Y001S001`"
"_" -> "001Q001Q"
"%" -> "000W000W"

It appears that any string starting with the letter "a" will create
output that begins with 001S00 and the seventh character always
appears to be 0 or 1:

[rhaas ~]$ ./strxfrm en_US ab ac ad ae af a% a0 "a "
"ab" -> "001S001T001S001T"
"ac" -> "001S001U001S001U"
"ad" -> "001S001V001S001V"
"ae" -> "001S001W001S001W"
"af" -> "001S001X001S001X"
"a%" -> "001S000W001S000W"
"a0" -> "001S000b001S000b"
"a " -> "001S000R001S000R"

Also, the total number of bytes produced is apparently 8N+4, where N
is the length of the input string.  On a Linux system I tested, the
output included non-printable characters, so I adapted the test
program to print the results in hex.  Attaching that version, too.
Here, the length was 3N+2, except for 0-length strings which produce a
0-length result:

[rhaas@hydra ~]$ ./strxfrm-binary en_US "" a aa ab abc abcd abcde
peter Geoghegan _ %
"" ->  (0 bytes)
"a" -> 0c01020102 (5 bytes)
"aa" -> 0c0c010202010202 (8 bytes)
"ab" -> 0c0d010202010202 (8 bytes)
"abc" -> 0c0d0e0102020201020202 (11 bytes)
"abcd" -> 0c0d0e0f01020202020102020202 (14 bytes)
"abcde" -> 0c0d0e0f10010202020202010202020202 (17 bytes)
"peter" -> 1b101f101d010202020202010202020202 (17 bytes)
"Geoghegan" -> 12101a121310120c190102020202020202020201040202020202020202
(29 bytes)
"_" -> 0101010112 (5 bytes)
"%" -> 0101010139 (5 bytes)
[rhaas@hydra ~]$ ./strxfrm-binary en_US ab ac ad ae af a% a0 "a "
"ab" -> 0c0d010202010202 (8 bytes)
"ac" -> 0c0e010202010202 (8 bytes)
"ad" -> 0c0f010202010202 (8 bytes)
"ae" -> 0c10010202010202 (8 bytes)
"af" -> 0c11010202010202 (8 bytes)
"a%" -> 0c01020102010239 (8 bytes)
"a0" -> 0c02010202010202 (8 bytes)
"a " -> 0c01020102010211 (8 bytes)

Even though each input bytes is generating 3 output bytes, it's not
generating a group of output bytes for each input byte as appears to
be happening on MacOS X.  If it were doing that, then truncating the
blob to 8 bytes would only compare the first 2-3 bytes of the input
string.  In fact we do better.  If we change even the 8th letter in
the string to some other letter, the 8th output byte changes:

[rhaas@hydra ~]$ ./strxfrm-binary en_US  aaab
"" -> 0c0c0c0c0c0c0c0c010202020202020202010202020202020202 (26 bytes)
"aaab" -> 0c0c0c0c0c0c0c0d010202020202020202010202020202020202 (26 bytes)

If we change the capitalization of the eighth byte, then the change
happens further out:

[rhaas@hydra ~]$ ./strxfrm-binary en_US  aaaA
"" -> 0c0c0c0c0c0c0c0c010202020202020202010202020202020202 (26 bytes)
"aaaA" -> 0c0c0c0c0c0c0c0c010202020202020202010202020202020204 (26 bytes)

Still, it's fair to say that on this Linux system, the first 8 bytes
capture a significant portion of the entropy of the first 8 bytes of
the string, whereas on MacOS X you only get entropy from the first 2
bytes of the string.  It would be interesting to see results from
other platforms people might care about also.

> The cost of adding all of these ameliorating measures appears to be
> very low. We're so bottlenecked on memory bandwidth that the
> fixed-size overhead of maintaining poor man cardinality, and the extra
> cycles from hashing n keys for the benefit of HyperLogLog just don't
> seem to matter next to the big savings for n log n comparisons. The
> best case performance is seemingly just as good as before (about a
> 200% improvement in transaction throughput for one client, but closer
> to a 250% improvement with many clients and/or expensive collations),
> while the worst case is much much better.

I haven't looked at the patch yet, but this sounds promising.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
#include 
#include 
#include 
#include 
#include 

int
main(int argc, char **argv)
{
	char *buffer = NULL;
	size_t buflen = 0;

	if (argc < 3)
	{
		fprintf(stderr, "Usage: strxfrm {collation} {string}...\n");
		exit(1);
	}
	if (setlocale(LC_COLLATE, argv[1]) == NULL)
	{
		fprintf(stderr, "setlocale: %s\n", strerror(errno));
		exit(1);
	}
	argv +=

Re: [HACKERS] Proposing pg_hibernate

2014-06-12 Thread Robert Haas

On Thu, Jun 12, 2014 at 12:17 AM, Gurjeet Singh  wrote:
> On Wed, Jun 11, 2014 at 10:56 AM, Robert Haas  wrote:
>> On Tue, Jun 10, 2014 at 10:03 PM, Gurjeet Singh  wrote:
>>> And it's probably accepted by now that such a bahviour is not
>>> catastrophic, merely inconvenient.
>>
>> I think the whole argument for having pg_hibernator is that getting
>> the block cache properly initialized is important.  If it's not
>> important, then we don't need pg_hibernator in the first place.  But
>> if it is important, then I think not loading unrelated blocks into
>> shared_buffers is also important.
>
> I was constructing a contrived scenario, something that would rarely
> happen in reality. I feel that the benefits of this feature greatly
> outweigh the minor performance loss caused in such an unlikely scenario.

So, are you proposing this for inclusion in PostgreSQL core?

If not, I don't think there's much to discuss here - people can use it
or not as they see fit, and we'll see what happens and perhaps design
improvements will result, or not.

If so, that's different: you'll need to demonstrate the benefits via
convincing proof points, and you'll also need to show that the
disadvantages are in fact minor and that the scenario is in fact
unlikely.  So far there are zero performance numbers on this thread, a
situation that doesn't meet community standards for a performance
patch.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

refactoring allpaths.c (was Re: [HACKERS] Suppressing unused subquery output columns)

2014-06-12 Thread Tom Lane

I wrote:
> We have a couple votes for this patch and no one has spoken against it,
> so I'll go ahead and push it into HEAD.

BTW, I forgot to mention that while working on this patch I was thinking
it's past time to separate out the subquery support in allpaths.c into
its own file.  With this patch, allpaths.c is 2363 lines, about 690 of
which are set_subquery_pathlist and subsidiary functions.  While that
might not be quite tail-wagging-dog level, I think it's enough to justify
splitting that code out into a new file, say optimizer/path/subquerypath.c.

There are also about 630 lines devoted to appendrel path generation,
which might be thought enough to deserve a separate file of its own.
I'm a bit less excited about that though, mainly because the appendrel
logic has some not-quite-arms-length interactions with set_rel_size();
but there's certainly some case to be made for doing it.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-06-12 Thread Claudio Freire

On Thu, Jun 12, 2014 at 1:25 PM, Robert Haas  wrote:
>
> It appears that any string starting with the letter "a" will create
> output that begins with 001S00 and the seventh character always
> appears to be 0 or 1:
>
> [rhaas ~]$ ./strxfrm en_US ab ac ad ae af a% a0 "a "
> "ab" -> "001S001T001S001T"
> "ac" -> "001S001U001S001U"
> "ad" -> "001S001V001S001V"
> "ae" -> "001S001W001S001W"
> "af" -> "001S001X001S001X"
> "a%" -> "001S000W001S000W"
> "a0" -> "001S000b001S000b"
> "a " -> "001S000R001S000R"

...

> [rhaas@hydra ~]$ ./strxfrm-binary en_US  aaaA
> "" -> 0c0c0c0c0c0c0c0c010202020202020202010202020202020202 (26 bytes)
> "aaaA" -> 0c0c0c0c0c0c0c0c010202020202020202010202020202020204 (26 bytes)
>
> Still, it's fair to say that on this Linux system, the first 8 bytes
> capture a significant portion of the entropy of the first 8 bytes of
> the string, whereas on MacOS X you only get entropy from the first 2
> bytes of the string.  It would be interesting to see results from
> other platforms people might care about also.

If you knew mac's output character set with some certainty, you could
compress it rather efficiently by applying a tabulated decode back
into non-printable bytes. Say, like base64 decoding (the set appears
to be a subset of base64, but it's hard to be sure).


Ie,
x = strxfrm(s)
xz = hex(tab[x[0]] + 64*tab[x[1]] + 64^2*tab[x[2]] ... etc)

This can be made rather efficient. But the first step is defining with
some certainty the output character set.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] add line number as prompt option to psql

2014-06-12 Thread Sawada Masahiko

Hi all,

The attached IWP patch is one prompt option for psql, which shows
current line number.
If the user made syntax error with too long SQL then psql outputs
message as following.

ERROR:  syntax error at or near "a"
LINE 250: hoge
^
psql teaches me where syntax error is occurred, but it is not kind
when SQL is too long.
We can use write SQL with ¥e(editor) command(e.g., emacs) , and we can
know line number.
but it would make terminal log dirty . It will make analyzing of log
difficult after error is occurred.
(I think that we usually use copy & paste)

After attached this patch, we will be able to use %l option as prompting option.

e.g.,
$ cat ~/.psqlrc
\set PROMPT2 '%/[%l]%R%# '
\set PROMPT1 '%/[%l]%R%# '
$ psql -d postgres
postgres[1]=# select
postgres[2]-# *
postgres[3]-# from
postgres[4]-# hoge;

The past discussion is following.


Please give me feedback.

Regards,

---
Sawada Masahiko


psql-line-number_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

I wrote:
> Alvaro Herrera  writes:
>> I was also going to suggest lo_create_bytea().  Another similar
>> possibility would be lo_from_bytea() -- or, since we have overloading
>> (and we can actually use it in this case without breaking libpq), we
>> could just have lo_from(oid, bytea).

> Andres also mentioned lo_from_bytea(), and I kinda like it too.
> I don't like plain lo_from(), as it's not too apparent what it's
> supposed to get data "from" --- someone might think the second
> parameter was supposed to be a file name a la lo_import(),
> for example.

Since the discussion seems to have trailed off, I'm going to run with
lo_from_bytea().  The plan is:

1. Rename the function.
2. Add an opr_sanity regression test memorializing what we should get
from lo_initialize()'s query.
3. Make sure that the regression tests leave behind a few large objects,
so that testing of pg_dump/pg_restore using the regression database
will exercise the large-object code paths.

It'd be a good thing if the TAP tests for client programs included
testing of pg_dump/pg_restore, but that's a bit beyond my competence
with that tool ... anyone care to step up?

Redoing or getting rid of lo_initialize()'s query would be a good thing
too; but that does not seem like material for back-patching into 9.4,
while I think all the above are.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GENERAL] Question about partial functional indexes and the query planner

2014-06-12 Thread Keith Fiske

On Wed, Jun 11, 2014 at 7:24 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Tue, Jun 10, 2014 at 7:19 PM, Tom Lane  wrote:
> >> Given the lack of previous complaints, I'm not sure this amounts to
> >> a back-patchable bug, but it does seem like something worth fixing
> >> going forward.
>
> > Agreed, although I'd be willing to see us slip it into 9.4.  It's
> > doubtful that anyone will get upset if their query plans change
> > between beta1 and beta2, but the same cannot be said for released
> > branches.
>
> After further thought about this I realized that there's another category
> of proof possibilities that is pretty simple to add while we're at it.
> Namely, once we've found that both subexpressions of the two operator
> clauses are equal(), we can use btree opclass relationships to prove that,
> say, "x < y implies x <= y" or "x < y refutes x > y", independently of
> just what x and y are.  (Well, they have to be immutable expressions, but
> we'd not get this far if they're not.)  We already had pretty nearly all
> of the machinery for that, but again it was only used for proving cases
> involving comparisons to constants.
>
> A little bit of refactoring later, I offer the attached draft patch.
> I'm thinking this is probably more than we want to slip into 9.4
> at this point, but I'll commit it to HEAD soon if there are not
> objections.
>
> regards, tom lane
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
I applied Tom's patch to the latest HEAD
(e04a9ccd2ccd6e31cc4af6b08257a0a186d0fce8) and showed it to Brian. Looks to
solve the problem he originally reported

$ patch -p1 -i ../better-predicate-proofs-1.patch
(Stripping trailing CRs from patch.)
patching file src/backend/optimizer/util/predtest.c


$ /opt/pgsql_patch_review/bin/psql postgres
Timing is on.
Null display (null) is "«NULL»".
Expanded display (expanded) is used automatically.
psql (9.5devel)
Type "help" for help.

postgres=# CREATE OR REPLACE FUNCTION public.return_if_even(v_id integer)
returns integer
postgres-# LANGUAGE sql AS
postgres-# $$
postgres$# SELECT case when v_id % 2 = 1 then 0 else v_id end;
postgres$# $$;
CREATE FUNCTION
Time: 44.669 ms

postgres=# create table public.partial_functional_index_test as
postgres-# select id from generate_series(1,100) AS s(id);
SELECT 100
Time: 1037.993 ms

postgres=# create index partial_functional_idx ON
public.partial_functional_index_test
postgres-# USING btree ( public.return_if_even(id) )
postgres-# WHERE public.return_if_even(id) = id;
LOG:  sending cancel to blocking autovacuum PID 12521
DETAIL:  Process 12424 waits for ShareLock on relation 16385 of database
12217.
STATEMENT:  create index partial_functional_idx ON
public.partial_functional_index_test
USING btree ( public.return_if_even(id) )
WHERE public.return_if_even(id) = id;
ERROR:  canceling autovacuum task
CONTEXT:  automatic analyze of table
"postgres.public.partial_functional_index_test"
CREATE INDEX
Time: 1658.245 ms

postgres=# explain analyze select count(1) from
public.partial_functional_index_test where public.return_if_even(id) = id;
 QUERY
PLAN
-
 Aggregate  (cost=4818.05..4818.06 rows=1 width=0) (actual
time=2503.851..2503.854 rows=1 loops=1)
   ->  Bitmap Heap Scan on partial_functional_index_test
(cost=82.67..4805.55 rows=5000 width=0) (actual time=43.724..1309.309
rows=50 loops=1)
 Recheck Cond: (CASE WHEN ((id % 2) = 1) THEN 0 ELSE id END = id)
 Heap Blocks: exact=4425
 ->  Bitmap Index Scan on partial_functional_idx  (cost=0.00..81.42
rows=5000 width=0) (actual time=42.961..42.961 rows=50 loops=1)
 Planning time: 4.245 ms
 Execution time: 2505.281 ms
(7 rows)

Time: 2515.344 ms
postgres=# explain analyze select count(1) from
public.partial_functional_index_test where id = public.return_if_even(id);
 QUERY
PLAN
-
 Aggregate  (cost=4818.05..4818.06 rows=1 width=0) (actual
time=2483.862..2483.866 rows=1 loops=1)
   ->  Bitmap Heap Scan on partial_functional_index_test
(cost=82.67..4805.55 rows=5000 width=0) (actual time=40.704..1282.955
rows=50 loops=1)
 Recheck Cond: (CASE WHEN ((id % 2) = 1) THEN 0 ELSE id END = id)
 Heap Blocks: exact=4425
 ->  Bitmap Index Scan on partial_functional_idx  (cost=0.00..81.42
rows=5000 width=0) (actual time=39.657..39.657 rows=50 loops=1)
 Planning time: 0.127 ms
 Execution time: 2483.979 ms
(7 rows)


--
Keith Fiske
Databas

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Noah Misch

On Thu, Jun 12, 2014 at 01:53:19PM -0400, Tom Lane wrote:
> Since the discussion seems to have trailed off, I'm going to run with
> lo_from_bytea().  The plan is:
> 
> 1. Rename the function.
> 2. Add an opr_sanity regression test memorializing what we should get
> from lo_initialize()'s query.
> 3. Make sure that the regression tests leave behind a few large objects,
> so that testing of pg_dump/pg_restore using the regression database
> will exercise the large-object code paths.

Sounds good.

> It'd be a good thing if the TAP tests for client programs included
> testing of pg_dump/pg_restore, but that's a bit beyond my competence
> with that tool ... anyone care to step up?

The pg_upgrade test suite covers this well.

> Redoing or getting rid of lo_initialize()'s query would be a good thing
> too; but that does not seem like material for back-patching into 9.4,
> while I think all the above are.

I agree all the above make sense for 9.4.  I also wouldn't object to a
hardening of lo_initialize() sneaking in at this stage.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] updated emacs configuration

2014-06-12 Thread Noah Misch

On Tue, Jun 10, 2014 at 10:36:07AM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Mon, Jun 09, 2014 at 09:04:02PM -0400, Peter Eisentraut wrote:
> >> I'd consider just getting rid of the
> >> 
> >> (c-file-style . "bsd")
> >> 
> >> setting in .dir-locals.el and put the actual interesting settings from
> >> the style in there.
> >> 
> >> Another alternative is to change emacs.samples to use
> >> hack-local-variables-hook instead, as described here:
> >> http://www.emacswiki.org/emacs/LocalVariables
> 
> > Those are reasonable alternatives.  The ultimate effect looks similar for 
> > all
> > three options, to the extent that I don't see a basis for forming a strong
> > preference.  Do you have a recommendation?
> 
> The more Ludd^H^H^Hcautious among us run with
>   (setq enable-local-variables nil)
> and would rather not see the configuration recommendations overcomplicated
> due to an attempt to play nice with directory-local settings we're ignoring
> anyway.  So I'd vote for Peter's first option, ie, kluge up .dir-locals.el
> not the configuration code.

Seeing the two votes cast and only cosmetic differences between the options,
I'll just stick with my original -v1.  Thanks.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] make check For Extensions

2014-06-12 Thread Fabien COELHO



Andres said during the unconference last month that there was a way to 
get `make check` to work with PGXS. The idea is that it would initialize 
a temporary cluster, start it on an open port, install an extension, and 
run the extension's test suite. I think the pg_regress --temp-install, 
maybe? I poked through the PGXS makefiles, and although it looks like 
there *might* be something like this for in-core contrib extensions, but 
not for externally-distributed extensions.


Is there something I could add to my extension Makefiles so that `make 
check` or `make test` will do a pre-install test on a temporary cluster?


My 0.02€: It is expected to work, more or less, see the end of

http://www.postgresql.org/docs/9.3/static/extend-pgxs.html

It invokes "psql" which is expected to work directly. Note that there is 
no temporary installation, it is tested against the installed and running 
postgres. Maybe having the ability to create a temporary installation, as 
you suggest, would be a nice extension.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

Noah Misch  writes:
> On Thu, Jun 12, 2014 at 01:53:19PM -0400, Tom Lane wrote:
>> It'd be a good thing if the TAP tests for client programs included
>> testing of pg_dump/pg_restore, but that's a bit beyond my competence
>> with that tool ... anyone care to step up?

> The pg_upgrade test suite covers this well.

Um, not really: what pg_upgrade exercises is "pg_dump -s" which entirely
fails to cover the data-transfer code paths.  It would not have found
this problem.

BTW, after further testing I realized that it was quite accidental that
I found it either.  pg_restore only uses libpq's lo_create() function
when restoring an "old_blob_style" archive, ie one generated by 8.4
or earlier.  That's what I happened to try to do last night, but it's
pure luck that I did.

Poking around with making the largeobject regression test leave
some stuff behind, I found out that:

1. That regression test includes the text of a Robert Frost poem that
AFAICT is still under copyright.  I think we'd better replace it with
something by someone a bit more safely dead.

2. I tried to add a COMMENT ON LARGE OBJECT to one of the not-removed
blobs.  While pg_upgrade didn't fail on transferring the blob, it
*did* fail to transfer the comment on it, which seems like a bug.
Bruce, have you got any idea how to fix that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] make check For Extensions

2014-06-12 Thread David E. Wheeler

On Jun 12, 2014, at 11:28 AM, Fabien COELHO  wrote:

> My 0.02€: It is expected to work, more or less, see the end of
> 
> http://www.postgresql.org/docs/9.3/static/extend-pgxs.html

That says:

“The scripts listed in the REGRESS variable are used for regression testing of 
your module, which can be invoked by make installcheck after doing make 
install. For this to work you must have a running PostgreSQL server.”

That does not mean that it starts a new cluster on a port. It means it will 
test it against an existing cluster after you have installed into that cluster.

> It invokes "psql" which is expected to work directly. Note that there is no 
> temporary installation, it is tested against the installed and running 
> postgres. Maybe having the ability to create a temporary installation, as you 
> suggest, would be a nice extension.

Yes, that’s what I would like, so I could test *before* installing.

Best,

David

signature.asc
Description: Message signed with OpenPGP using GPGMail

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-12 Thread Gurjeet Singh

On Tue, Jun 10, 2014 at 12:05 PM, Tom Lane  wrote:
> If we're going to do this, I would say that we should also take the
> opportunity to get out from under the question of which kernel API
> we're dealing with.  So perhaps a design like this:
>
> 1. If the environment variable PG_OOM_ADJUST_FILE is defined, it's
> the name of a file to write something into after forking.  The typical
> value would be "/proc/self/oom_score_adj" or "/proc/self/oom_adj".
> If not set, nothing happens.
>
> 2. If the environment variable PG_OOM_ADJUST_VALUE is defined, that's
> the string we write, otherwise we write "0".

Please find attached the patch. It includes the doc changes as well.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 9fadef5..4492a1d 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1284,8 +1284,15 @@ echo -1000 > /proc/self/oom_score_adj
 in the postmaster's startup script just before invoking the postmaster.
 Note that this action must be done as root, or it will have no effect;
 so a root-owned startup script is the easiest place to do it.  If you
-do this, you may also wish to build PostgreSQL
-with -DLINUX_OOM_SCORE_ADJ=0 added to CPPFLAGS.
+do this, you may also wish to export these environment variables
+
+PG_OOM_ADJUST_FILE=oom_score_adj
+PG_OOM_ADJUST_VALUE=0
+
+export PG_OOM_ADJUST_FILE
+export PG_OOM_ADJUST_VALUE
+
+in the startup script, before invoking the postmaster.
 That will cause postmaster child processes to run with the normal
 oom_score_adj value of zero, so that the OOM killer can still
 target them at need.
@@ -1296,8 +1303,7 @@ echo -1000 > /proc/self/oom_score_adj
 but may have a previous version of the same functionality called
 /proc/self/oom_adj.  This works the same except the disable
 value is -17 not -1000.  The corresponding
-build flag for PostgreSQL is
--DLINUX_OOM_ADJ=0.
+value for PG_OOM_ADJUST_FILE should be oom_adj.

 

diff --git a/src/backend/postmaster/fork_process.c b/src/backend/postmaster/fork_process.c
index f6df2de..21559af 100644
--- a/src/backend/postmaster/fork_process.c
+++ b/src/backend/postmaster/fork_process.c
@@ -22,6 +22,13 @@
 #endif
 
 #ifndef WIN32
+
+#ifdef __linux__
+static bool	oom_env_checked = false;
+static char	oom_adj_file[MAXPGPATH];
+static int	oom_adj_value = 0;
+#endif	/* __linux__ */
+
 /*
  * Wrapper for fork(). Return values are the same as those for fork():
  * -1 if the fork failed, 0 in the child process, and the PID of the
@@ -78,13 +85,38 @@ fork_process(void)
 		 * LINUX_OOM_SCORE_ADJ #defined to 0, or to some other value that you
 		 * want child processes to adopt here.
 		 */
-#ifdef LINUX_OOM_SCORE_ADJ
+#ifdef __linux__
+		if (!oom_env_checked)
+		{
+			char *env;
+
+			oom_env_checked = true;
+
+			env = getenv("PG_OOM_ADJUST_FILE");
+
+			/* Don't allow a path separator in file name */
+			if (env && !strchr(env, '/') && !strchr(env, '\\'))
+			{
+snprintf(oom_adj_file, MAXPGPATH, "/proc/self/%s", env);
+
+env = getenv("PG_OOM_ADJUST_VALUE");
+if (env)
+{
+	oom_adj_value = atoi(env);
+}
+else
+	oom_adj_value = 0;
+			}
+			else
+oom_adj_file[0] = '\0';
+		}
+
 		{
 			/*
 			 * Use open() not stdio, to ensure we control the open flags. Some
 			 * Linux security environments reject anything but O_WRONLY.
 			 */
-			int			fd = open("/proc/self/oom_score_adj", O_WRONLY, 0);
+			int			fd = open(oom_adj_file, O_WRONLY, 0);
 
 			/* We ignore all errors */
 			if (fd >= 0)
@@ -92,41 +124,14 @@ fork_process(void)
 char		buf[16];
 int			rc;
 
-snprintf(buf, sizeof(buf), "%d\n", LINUX_OOM_SCORE_ADJ);
+snprintf(buf, sizeof(buf), "%d\n", oom_adj_value);
 rc = write(fd, buf, strlen(buf));
 (void) rc;
 close(fd);
 			}
 		}
-#endif   /* LINUX_OOM_SCORE_ADJ */
 
-		/*
-		 * Older Linux kernels have oom_adj not oom_score_adj.  This works
-		 * similarly except with a different scale of adjustment values. If
-		 * it's necessary to build Postgres to work with either API, you can
-		 * define both LINUX_OOM_SCORE_ADJ and LINUX_OOM_ADJ.
-		 */
-#ifdef LINUX_OOM_ADJ
-		{
-			/*
-			 * Use open() not stdio, to ensure we control the open flags. Some
-			 * Linux security environments reject anything but O_WRONLY.
-			 */
-			int			fd = open("/proc/self/oom_adj", O_WRONLY, 0);
-
-			/* We ignore all errors */
-			if (fd >= 0)
-			{
-char		buf[16];
-int			rc;
-
-snprintf(buf, sizeof(buf), "%d\n", LINUX_OOM_ADJ);
-rc = write(fd, buf, strlen(buf));
-(void) rc;
-close(fd);
-			}
-		}
-#endif   /* LINUX_OOM_ADJ */
+#endif   /* __linux__ */
 
 		/*
 		 * Make sure processes do not share OpenSSL randomness state.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql

[HACKERS] Crash on backend exit w/ OpenLDAP [2.4.24, 2.4.31]

2014-06-12 Thread Noah Misch

With LDAP support enabled, we link the backend with libldap, and we link libpq
with libldap_r.  Modules like dblink and postgres_fdw link to libpq, so
loading them results in a backend having both libldap and libdap_r loaded.
Those libraries export the same symbols, and the load order rule gives
priority to the libldap symbols.  So far, so good.  However, both libraries
declare a GCC destructor, ldap_int_destroy_global_options(), to free memory
reachable from a global variable, ldap_int_global_options.  In OpenLDAP
versions 2.4.24 through 2.4.31, that variable's structure type has an
incompatible layout in libldap vs. libldap_r.  When the libldap_r build of the
destructor uses pointers from the ldap_int_global_options of libldap, SIGSEGV
ensues.  This does not arise if nothing in the process ever caused OpenLDAP to
initialize itself, because the relevant bytes then happen to be all-zero in
either structure layout.  OpenLDAP 2.4.32 fixed this by making the libldap
structure a strict prefix of the libldap_r structure.  The OpenLDAP change log
merely says "Fixed libldap sasl handling (ITS#7118, ITS#7133)"; here are the
relevant commits:

  
http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=commitdiff;h=270ef33acf18dc13bfd07f8a8e66b446f80e7d27
  
http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=commitdiff;h=7ff18967d7d2e49baa9d80f1b9408b276f3982e0

You can cause the at-exit crash by building PostgreSQL against OpenLDAP
2.4.31, connecting with LDAP authentication, and issuing "LOAD 'dblink'".
Alternately, connect with any authentication method and create a dblink
connection that uses an LDAP-based pg_service.conf entry.  I'm attaching a
test suite patch to illustrate that second vector.

The popularity of the affected OpenLDAP versions is not clear to me.  RHEL 5,
6 and 7 all ship OpenLDAP either old enough or new enough to miss the problem.
Debian wheezy ships 2.4.31, an affected version.  I have not looked beyond
that.  I pondered what we could do short of treating this as a pure OpenLDAP
bug for packagers to fix in their OpenLDAP builds, but I did not come up with
anything singularly attractive.  Some possibilities:

1. Link the backend with libldap_r, so we never face the mismatch.  On some
platforms, this means also linking in threading libraries.

2. Link a second copy of libpq against plain libldap and use that in server
modules like dblink.

3. Wipe the memory of ldap_int_global_options so the destructor will have
nothing to do.  A challenge here is that we don't know the structure size;
it's not part of any installed header, and it varies in response to OpenLDAP
configuration options.  Still, this is achievable in principle.  This would be
easy if the destructor function weren't static, alas.

4. Detect older OpenLDAP versions at runtime, just before we would otherwise
initialize OpenLDAP, and raise an error.  Possibly make the same check at
compile time, for packager convenience.

5. Use only _exit(), not exit().

Hopefully I'm missing a great alternative, because each of those has something
substantial to dislike about it.  Thoughts welcome, especially from packagers
dealing with platforms that use affected OpenLDAP versions.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/contrib/dblink/expected/dblink.out 
b/contrib/dblink/expected/dblink.out
index f237c43..fd61985 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -103,6 +103,18 @@ SELECT *
 FROM dblink('SELECT * FROM foo') AS t(a int, b text, c text[])
 WHERE t.a > 7;
 ERROR:  connection not available
+-- The first-level connection's backend will crash on exit given OpenLDAP
+-- [2.4.24, 2.4.31].  Give the postmaster time to act on the crash.
+SELECT pg_sleep(1)
+FROM dblink('dbname=contrib_regression', $$SELECT * FROM
+   dblink('service=test_ldap dbname=contrib_regression',
+  'select 1') t(c int)$$)
+AS t(c int);
+ pg_sleep 
+--
+ 
+(1 row)
+
 -- create a persistent connection
 SELECT dblink_connect('dbname=contrib_regression');
  dblink_connect 
diff --git a/contrib/dblink/sql/dblink.sql b/contrib/dblink/sql/dblink.sql
index 2a10760..fa98285 100644
--- a/contrib/dblink/sql/dblink.sql
+++ b/contrib/dblink/sql/dblink.sql
@@ -65,6 +65,14 @@ SELECT *
 FROM dblink('SELECT * FROM foo') AS t(a int, b text, c text[])
 WHERE t.a > 7;
 
+-- The first-level connection's backend will crash on exit given OpenLDAP
+-- [2.4.24, 2.4.31].  Give the postmaster time to act on the crash.
+SELECT pg_sleep(1)
+FROM dblink('dbname=contrib_regression', $$SELECT * FROM
+   dblink('service=test_ldap dbname=contrib_regression',
+  'select 1') t(c int)$$)
+AS t(c int);
+
 -- create a persistent connection
 SELECT dblink_connect('dbname=contrib_regression');
 
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 4b31c0a..19bfe75 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.globa

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-06-12 Thread Tom Lane

Peter Geoghegan  writes:
> ... In any case it's pretty clear that a goal of
> the glibc implementation is to concentrate as much entropy as possible
> into the first part of the string, and that's the important point.
> This makes perfect sense, and is why I was so incredulous about the
> Mac behavior.

I think this may be another facet of something we were already aware of,
which is that the UTF8 locales on OS X pretty much suck.  It's fairly
clear that Apple has put no effort into achieving more than minimal
standards compliance for those.  Sorting doesn't work as expected in
those locales, for example.

Still, that's reality, and any proposal to rely on strxfrm is going to
have to deal with it :-(

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Noah Misch

On Thu, Jun 12, 2014 at 02:53:23PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Thu, Jun 12, 2014 at 01:53:19PM -0400, Tom Lane wrote:
> >> It'd be a good thing if the TAP tests for client programs included
> >> testing of pg_dump/pg_restore, but that's a bit beyond my competence
> >> with that tool ... anyone care to step up?
> 
> > The pg_upgrade test suite covers this well.
> 
> Um, not really: what pg_upgrade exercises is "pg_dump -s" which entirely
> fails to cover the data-transfer code paths.  It would not have found
> this problem.

I see.  TAP suite coverage for a data-included dump/restore would be worth its
weight, agreed.

> BTW, after further testing I realized that it was quite accidental that
> I found it either.  pg_restore only uses libpq's lo_create() function
> when restoring an "old_blob_style" archive, ie one generated by 8.4
> or earlier.  That's what I happened to try to do last night, but it's
> pure luck that I did.

That could have easily remained undiscovered until release day.  Not good.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-06-12 Thread Peter Geoghegan

Thanks for looking into this.

On Thu, Jun 12, 2014 at 9:25 AM, Robert Haas  wrote:
> Still, it's fair to say that on this Linux system, the first 8 bytes
> capture a significant portion of the entropy of the first 8 bytes of
> the string, whereas on MacOS X you only get entropy from the first 2
> bytes of the string.  It would be interesting to see results from
> other platforms people might care about also.

Right. It was a little bit incautious of me to say that we had the
full benefit of 8 bytes of storage with "en_US.UTF-8", since that is
only true of lower case characters (I think that FreeBSD can play
tricks with this. Sometimes, it will give you the benefit of 8 bytes
of entropy for an 8 byte string, with only non-differentiating
trailing bytes, so that the first 8 bytes of "Aaaa" are distinct
from the first eight bytes of "", while any trailing bytes are
non-distinct for both). In any case it's pretty clear that a goal of
the glibc implementation is to concentrate as much entropy as possible
into the first part of the string, and that's the important point.
This makes perfect sense, and is why I was so incredulous about the
Mac behavior. After all, the Open Group's strcoll() documentation
says:

"The strxfrm() and strcmp() functions should be used for sorting large lists."

Sorting text is hardly an infrequent requirement -- it's almost the
entire reason for having strxfrm() in the standard. You're always
going to want to have each strcmp() find differences as early as
possible.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-12 Thread Gregory Smith


On 6/11/14, 10:26 AM, Robert Haas wrote:
Now, as soon as we introduce the concept that selecting from a table 
might not really mean "read from the table" but "read from the table 
after applying this owner-specified qual", we're opening up a whole 
new set of attack surfaces. Every pg_dump is an opportunity to hack 
somebody else's account, or at least audit their activity. 


I'm in full agreement we should clearly communicate the issues around 
pg_dump in particular, because they can't necessarily be eliminated 
altogether without some major work that's going to take a while to 
finish.  And if the work-around is some sort of GUC for killing RLS 
altogether, that's ugly but not unacceptable to me as a short-term fix.


One of the difficult design requests in my inbox right now asks how 
pg_dump might be changed both to reduce its overlap with superuser 
permissions and to allow auditing of its activity.  Those requests 
aren't going away; their incoming frequency is actually rising quite 
fast right now.  They're both things people expect from serious SQL 
oriented commercial database products, and I'd like to see PostgreSQL 
continue to displace those as we reach feature parity in those areas.


Any way you implement finer grained user permissions and auditing 
features will be considered a new attack vector when you use those 
features.  The way the proposed RLS feature inserts an arbitrary 
function for reads has a similar new attack vector when you use that 
feature.


I'm kind of surprised to see this turn into a hot button all of the 
sudden though, because my thought on all that so far has been a giant so 
what?  This is what PostgreSQL does.


You wanna write your own C code and then link the thing right into the 
server, so that bugs can expose data and crash the whole server?  Not 
only can you shoot yourself in the foot that way, we supply a sample gun 
and bullets in contrib.  How about writing arbitrary code in any one of 
a dozen server-side languages of wildly varying quality, then hooking 
that code so it runs as a trigger function whenever you change a row?  
PostgreSQL is *on it*; we love letting people write some random thing, 
and then running that random thing against your data as a side-effect of 
doing an operation.  And if you like that...just wait until you learn 
about this half-assed rules feature we have too!


And when the database breaks because the functions people inserted were 
garbage, that's their fault, not a cause for a CVE.  And when someone 
blindly installs adminpack because it sounded like a pgAdmin 
requirement, lets a monitoring system run as root so it can watch 
pg_stat_activity, and then discovers that pair of reasonable decisions 
suddenly means any fool with monitoring access can call 
pg_file_unlink...that's their fault too. These are powerful tools with 
serious implications, and they're expected to be used by equally serious 
users.


We as a development community do need to put a major amount of work into 
refactoring all of these security mechanisms.  There should be less of 
these embarrassing incidents where bad software design really forced the 
insecure thing to happen, which I'd argue is the case for that 
pg_stat_activity example.  And luckily so far development resources are 
appearing for organizations I know of working in that direction 
recently, as fast as the requirements are rising.  I think there's a 
good outcome at the end of that road.


But let's not act like RLS is a scary bogeyman because it introduces a 
new way to hack the server or get surprising side-effects.  That's 
expected and possibly unavoidable behavior in a feature like this, and 
there are much worse instances of arbitrary function risk throughout the 
core code already.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] How to change the pgsql source code and build it??

2014-06-12 Thread Shreesha

Hello,
I need to initialize the db as the root and start the database server. In
order to accomplish this, I modified the initdb.c source file of pgsql
package and tried to compile it. Eventhough the build was successful, I
couldn't see the root user able to execute initdb executable generated by
the build. I wanted to know if there is any other procedure for building
the postgresql procedure?

Thanks
Shreesha.

P.S
Below is the changes done in initdb.c (shown in bold letters below)
---
static char *
get_id(void)
{
#ifndef WIN32

struct passwd *pw;

//  if (geteuid() == 0) /* 0 is root's uid */
*/*  {*
*fprintf(stderr,*
*_("%s: cannot be run as root\n"*
*  "Please log in (using, e.g., \"su\") as
the "*
*  "(unprivileged) user that will\n"*
*  "own the server process.\n"),*
*progname);*
*exit(1);*
*}*
**/*
...
}

Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-12 Thread Stephen Frost

Greg, all,

I will reply to the emails in detail when I get a chance but am out of town
at a funeral, so it'll likely be delayed. I did want to echo my agreement
for the most part with Greg and in particular...

On Thursday, June 12, 2014, Gregory Smith  wrote:

> On 6/11/14, 10:26 AM, Robert Haas wrote:
>
>> Now, as soon as we introduce the concept that selecting from a table
>> might not really mean "read from the table" but "read from the table after
>> applying this owner-specified qual", we're opening up a whole new set of
>> attack surfaces. Every pg_dump is an opportunity to hack somebody else's
>> account, or at least audit their activity.
>>
>
> I'm in full agreement we should clearly communicate the issues around
> pg_dump in particular, because they can't necessarily be eliminated
> altogether without some major work that's going to take a while to finish.
>  And if the work-around is some sort of GUC for killing RLS altogether,
> that's ugly but not unacceptable to me as a short-term fix.

A GUC which is enable / disable / error-instead may work quiet well, with
error-instead for pg_dump default if people really want it (there would
have to be a way to disable that though, imv).

Note that enable is default in general, disable would be for superuser only
(or on start-up) to disable everything, and error-instead anyone could use
but it would error instead of implementing RLS when querying an RLS-enabled
table.

This approach was suggested by an existing user testing out this RLS
approach, to be fair, but it looks pretty sane to me as a way to address
some of these concerns. Certainly open to other ideas and thoughts though.

Thanks,

Stephen

Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-12 Thread Abhijit Menon-Sen

At 2014-06-12 16:08:05 -0700, shreesha1...@gmail.com wrote:
>
> I need to initialize the db as the root and start the database server.

Why?

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-12 Thread Shreesha

I need to port pgsql onto a controller which doesn't have a framework of
creating multiple users for administrative purposes. The entire controller
is managed by a single root user and that is the reason I am trying to
change the pgsql initdb behavior. Do you think of any other better
alternative?

On Thu, Jun 12, 2014 at 5:40 PM, Abhijit Menon-Sen 
wrote:

> At 2014-06-12 16:08:05 -0700, shreesha1...@gmail.com wrote:
> >
> > I need to initialize the db as the root and start the database server.
>
> Why?
>
> -- Abhijit
>

-- 
~Shreesha.

Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-12 Thread Kyotaro HORIGUCHI

Hi,

> I need to port pgsql onto a controller which doesn't have a framework of
> creating multiple users for administrative purposes. The entire controller
> is managed by a single root user and that is the reason I am trying to
> change the pgsql initdb behavior. Do you think of any other better
> alternative?

The reason you didn't see initdb completed is that it execs
postgres on the way.

As you know, it is strongly discourged on ordinary environment,
but that framework sounds to be a single-user environment like
what MS-DOS was, where any security risk comes from the
characterisc is acceptable.

I could see initdb and postgres operating as root for the moment
(which means any possible side-effect is not checked) by making
changes at four point in the whole postgresql source
tree. Perhaps only two of them are needed for your wish.

postgresql $ find . -type f -print | xargs grep -nH 'geteuid() == 0'
./src/backend/main/main.c:377:  if (geteuid() == 0)
./src/bin/pg_ctl/pg_ctl.c:2121: if (geteuid() == 0)
./src/bin/initdb/initdb.c:778:  if (geteuid() == 0)  /* 0 
is root's uid */
./src/bin/pg_resetxlog/pg_resetxlog.c:250:  if (geteuid() == 0)

Try replacing these conditions with "(0 && geteuid() == 0)" and
you would see it run as root.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Something flaky in the "relfilenode mapping" infrastructure

2014-06-12 Thread Noah Misch

On Thu, Jun 12, 2014 at 02:44:10AM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-06-12 00:38:36 -0400, Noah Misch wrote:
> >> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-06-12%2000%3A17%3A07
> 
> > Hm. My guess it's that it's just a 'harmless' concurrency issue. The
> > test currently run in concurrency with others: I think what happens is
> > that the table gets dropped in the other relation after the query has
> > acquired the mvcc snapshot (used for the pg_class) test.
> > But why is it triggering on such a 'unusual' system and not on others?
> > That's what worries me a bit.

I can reproduce a similar disturbance in the test query using gdb and a
concurrent table drop, and the table reported in the prairiedog failure is a
table dropped in a concurrent test group.  That explanation may not be the
full story behind these particular failures, but it certainly could cause
similar failures in the future.

Let's prevent this by only reporting rows for relations that still exist after
the query is complete.

> prairiedog is pretty damn slow by modern standards.  OTOH, I think it
> is not the slowest machine in the buildfarm; hamster for instance seems
> to be at least a factor of 2 slower.  So I'm not sure whether to believe
> it's just a timing issue.

That kernel's process scheduler could be a factor.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/test/regress/expected/alter_table.out 
b/src/test/regress/expected/alter_table.out
index a182176..a274d82 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -2375,14 +2375,18 @@ Check constraints:
 
 DROP TABLE alter2.tt8;
 DROP SCHEMA alter2;
--- Check that we map relation oids to filenodes and back correctly.
--- Only display bad mappings so the test output doesn't change all the
--- time.
+-- Check that we map relation oids to filenodes and back correctly.  Only
+-- display bad mappings so the test output doesn't change all the time.  A
+-- filenode function call can return NULL for a relation dropped concurrently
+-- with the call's surrounding query, so check mappings only for relations
+-- that still exist after all calls finish.
+CREATE TEMP TABLE filenode_mapping AS
 SELECT
 oid, mapped_oid, reltablespace, relfilenode, relname
 FROM pg_class,
 pg_filenode_relation(reltablespace, pg_relation_filenode(oid)) AS 
mapped_oid
 WHERE relkind IN ('r', 'i', 'S', 't', 'm') AND mapped_oid IS DISTINCT FROM oid;
+SELECT m.* FROM filenode_mapping m JOIN pg_class c ON c.oid = m.oid;
  oid | mapped_oid | reltablespace | relfilenode | relname 
 -++---+-+-
 (0 rows)
diff --git a/src/test/regress/sql/alter_table.sql 
b/src/test/regress/sql/alter_table.sql
index 3f641f9..19e1229 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1582,15 +1582,20 @@ ALTER TABLE IF EXISTS tt8 SET SCHEMA alter2;
 DROP TABLE alter2.tt8;
 DROP SCHEMA alter2;
 
--- Check that we map relation oids to filenodes and back correctly.
--- Only display bad mappings so the test output doesn't change all the
--- time.
+-- Check that we map relation oids to filenodes and back correctly.  Only
+-- display bad mappings so the test output doesn't change all the time.  A
+-- filenode function call can return NULL for a relation dropped concurrently
+-- with the call's surrounding query, so check mappings only for relations
+-- that still exist after all calls finish.
+CREATE TEMP TABLE filenode_mapping AS
 SELECT
 oid, mapped_oid, reltablespace, relfilenode, relname
 FROM pg_class,
 pg_filenode_relation(reltablespace, pg_relation_filenode(oid)) AS 
mapped_oid
 WHERE relkind IN ('r', 'i', 'S', 't', 'm') AND mapped_oid IS DISTINCT FROM oid;
 
+SELECT m.* FROM filenode_mapping m JOIN pg_class c ON c.oid = m.oid;
+
 -- Checks on creating and manipulation of user defined relations in
 -- pg_catalog.
 --

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Something flaky in the "relfilenode mapping" infrastructure

2014-06-12 Thread Tom Lane

Noah Misch  writes:
> On Thu, Jun 12, 2014 at 02:44:10AM -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> On 2014-06-12 00:38:36 -0400, Noah Misch wrote:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-06-12%2000%3A17%3A07

>>> Hm. My guess it's that it's just a 'harmless' concurrency issue. The
>>> test currently run in concurrency with others: I think what happens is
>>> that the table gets dropped in the other relation after the query has
>>> acquired the mvcc snapshot (used for the pg_class) test.
>>> But why is it triggering on such a 'unusual' system and not on others?
>>> That's what worries me a bit.

> I can reproduce a similar disturbance in the test query using gdb and a
> concurrent table drop, and the table reported in the prairiedog failure is a
> table dropped in a concurrent test group.  That explanation may not be the
> full story behind these particular failures, but it certainly could cause
> similar failures in the future.

Yeah, that seems like a plausible explanation, since the table shown
in the failure report is one that would be getting dropped concurrently,
and the discrepancy is that we get NULL rather than the expected value
for the pg_filenode_relation result, which is expected if the table is
already dropped when the mapping function is called.

> Let's prevent this by only reporting rows for relations that still exist after
> the query is complete.

I think this is a bad solution though; it risks masking actual problems.

What seems like a better fix to me is to change the test

 mapped_oid IS DISTINCT FROM oid

to

 mapped_oid <> oid

pg_class.oid will certainly never read as NULL, so what this will do is
allow the single case where the function returns NULL.  AFAIK there is
no reason to suppose that a NULL result would mean anything except "the
table's been dropped", so changing it this way will allow only that case
and not any others.

Alternatively, we could do something like you suggest but adjust the
second join so that it suppresses only rows in which mapped_oid is null
*and* there's no longer a matching OID in pg_class.  That would provide
additional confidence that the null result is a valid indicator of a
just-dropped table.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] lo_create(oid, bytea) breaks every extant release of libpq

2014-06-12 Thread Tom Lane

I wrote:
> Robert Haas  writes:
>> Presumably we should also fix libpq to not be so dumb.  I mean, it
>> doesn't help with the immediate problem, since as you say there could
>> be non-upgraded copies of libpq out there for the indefinite future,
>> but it still seems like something we oughta fix.

> It's been in the back of my mind for awhile that doing a dynamic query at
> all here is pretty pointless.  It's not like the OIDs of those functions
> ever have or ever will move.  It would probably be more robust if we
> just let libpq be a consumer of fmgroids.h and refer directly to the
> constants F_LO_CREATE etc.

I thought a bit more about this.  Ignore for the moment the larger
question of whether we want to consider fmgroids.h as something we'd
export to clients outside the immediate core infrastructure; that
will definitely take more thought than we can expend if we want to
slip this into 9.4.  It still seems reasonable for libpq to use it.
The actual code changes in fe-lobj.c are trivial enough (and would
consist mostly of code removal).  We would need to deal with the fact
that some of the support functions are not present in older backends,
but I think testing PQserverVersion is adequate for that purpose.

The hard part seems to be making sure that fmgroids.h is available to
reference, since it's a generated file and not guaranteed to be there
a-priori.  In a gmake-driven build I have the skillz to deal with that,
but I am not sure what to do in the various Windows build systems,
especially for the client-code-only build methods.  The path of least
resistance would be to just assume fmgroids.h is available, which would
work fine when building from a tarball, or probably when doing a full
build including backend (MSVC builds aren't parallel are they?).  But
what about a client-only build?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] unable to attach client process to postgresql server using gdb

2014-06-12 Thread Rajmohan C

hi,

   I am working with PostgreSQL 9.3.4 source using Eclipse IDE in ubuntu
14.04. I am facing a problem in attaching client process to postgresql
server using gdb to debug. When I start the postmaster then I connect to it
from client on a terminal. It works fine. Queries get responses. When I run
debug config from eclipse then select postgres process id from list I get
error saying

*Can't find a source file at
"/build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c" *
*Locate the file or edit the source lookup path to include its location.*

 After this when I send any query from client, it just stucks. No response
comes. After attaching gdb to client process, client does not get any
response from postgres server. One thing to note is that I was able to
debug properly till yesterday. But now it is not working. I tried
reinstalling but did not help. How could I fix this issue? Kindly help.

Rajmohan

Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-12 Thread Amit Kapila

On Thu, Jun 12, 2014 at 7:26 PM, Fujii Masao  wrote:
>
> On Tue, May 27, 2014 at 2:05 PM, Amit Kapila 
wrote:
> > On Sun, May 11, 2014 at 11:23 PM, Tom Lane  wrote:
> >> I think it's clearly *necessary* to forbid setting data_directory in
> >> postgresql.auto.conf.  The file is defined to be found in the data
> >> directory, so any such setting is circular logic by definition;
> >> no good can come of not rejecting it.
> >>
> >> We already have a GUC flag bit about disallowing certain variables
> >> in the config file (though I don't remember if it's enforced or
> >> just advisory).  It seems to me that we'd better invent one for
> >> disallowing in ALTER SYSTEM, as well.
> >
> > I introduced a new flag bit (GUC_DISALLOW_IN_AUTO_FILE) to
> > disallow parameters by Alter System and disallowed data_directory to
> > be set by Alter System.
>
> We should document what types of parameters are not allowed to be set by
> ALTER SYSTEM SET?

Agreed, I had mentioned in Notes section of document.  Apart from that
I had disallowed parameters that are excluded from postgresql.conf by
initdb (Developer options) and they are recommended in user manual
to be not used in production.

> data_directory was displayed when I typed "TAB" just after ALTER SYSTEM
SET.
> Probably tab-completion for ALTER SYSTEM SET needs to be changed.

This information is not stored in pg_settings.  One way is to specify
manually all the parameters which are disallowed but it seems the query
will become clumsy, another could be to have another column in pg_settings.
Do you think of any other way?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


prohibit_data_dir_by_alter_system-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] view reloptions

2014-06-12 Thread Jaime Casanova

On Wed, Jun 11, 2014 at 2:46 PM, Alvaro Herrera
 wrote:
> I just noticed by chance that view relations are using StdRdOptions to
> allocate their reloptions.  I can't find any reason for this, other than
> someone failed to realize that they should instead have a struct defined
> of its own, just like (say) GIN indexes do.  Views using StdRdOptions is
> quite pointless, given that it's used for stuff like fillfactor and
> autovacuum, neither of which apply to views.
>
> 9.2 added security_barrier to StdRdOptions, and 9.4 is now adding
> check_option_offset, which is a string reloption with some funny rules.
>
[...]
> 2) it would mean that security_barrier would change for external code
> that expects StdRdOptions rather than, say, ViewOptions
> 3) I don't have time to do the legwork before CF1 anyway
>
> If we don't change it now, I'm afraid we'll be stuck with using
> StdRdOptions for views for all eternity.
>
> (It's a pity I didn't become aware of this earlier in 9.4 cycle when I
> added the multixact freeze reloptions ... I could have promoted a patch
> back then.)
>

Attached is a patch moving the reloptions of views into its own structure.
i also created a view_reloptions() function in order to not use
heap_reloptions() for views, but maybe that was too much?

i haven't seen the check_option_offset thingy yet, but i hope to take
a look at that tomorrow.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
From 87ed78a4f276484a37917c417286a16082030f13 Mon Sep 17 00:00:00 2001
From: Jaime Casanova 
Date: Fri, 13 Jun 2014 01:10:24 -0500
Subject: [PATCH] Move reloptions from views into its own structure.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per gripe from Álvaro Herrera.
---
 src/backend/access/common/reloptions.c |   42 ++-
 src/backend/commands/tablecmds.c   |9 +-
 src/include/access/reloptions.h|1 +
 src/include/utils/rel.h|   43 
 src/tools/pgindent/typedefs.list   |1 +
 5 files changed, 71 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 522b671..c7ad6f9 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -834,10 +834,12 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc, Oid amoptions)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_TOASTVALUE:
-		case RELKIND_VIEW:
 		case RELKIND_MATVIEW:
 			options = heap_reloptions(classForm->relkind, datum, false);
 			break;
+		case RELKIND_VIEW:
+			options = view_reloptions(datum, false);
+			break;
 		case RELKIND_INDEX:
 			options = index_reloptions(amoptions, datum, false);
 			break;
@@ -1200,10 +1202,6 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, analyze_scale_factor)},
-		{"security_barrier", RELOPT_TYPE_BOOL,
-		offsetof(StdRdOptions, security_barrier)},
-		{"check_option", RELOPT_TYPE_STRING,
-		offsetof(StdRdOptions, check_option_offset)},
 		{"user_catalog_table", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, user_catalog_table)}
 	};
@@ -1225,6 +1223,38 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 }
 
 /*
+ * Option parser for views
+ */
+bytea *
+view_reloptions(Datum reloptions, bool validate)
+{
+	relopt_value *options;
+	ViewOptions *vopts;
+	int			numoptions;
+	static const relopt_parse_elt tab[] = {
+		{"security_barrier", RELOPT_TYPE_BOOL,
+		offsetof(ViewOptions, security_barrier)},
+		{"check_option", RELOPT_TYPE_STRING,
+		offsetof(ViewOptions, check_option_offset)}
+	};
+
+	options = parseRelOptions(reloptions, validate, RELOPT_KIND_VIEW, &numoptions);
+
+	/* if none set, we're done */
+	if (numoptions == 0)
+		return NULL;
+
+	vopts = allocateReloptStruct(sizeof(ViewOptions), options, numoptions);
+
+	fillRelOptions((void *) vopts, sizeof(ViewOptions), options, numoptions,
+   validate, tab, lengthof(tab));
+
+	pfree(options);
+
+	return (bytea *) vopts;
+}
+
+/*
  * Parse options for heaps, views and toast tables.
  */
 bytea *
@@ -1248,8 +1278,6 @@ heap_reloptions(char relkind, Datum reloptions, bool validate)
 		case RELKIND_RELATION:
 		case RELKIND_MATVIEW:
 			return default_reloptions(reloptions, validate, RELOPT_KIND_HEAP);
-		case RELKIND_VIEW:
-			return default_reloptions(reloptions, validate, RELOPT_KIND_VIEW);
 		default:
 			/* other relkinds are not supported */
 			return NULL;
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 341262b..fd27cdb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -533,7 +

[HACKERS] loading .so file at run time

2014-06-12 Thread Rajmohan C

I am working with Postgresql 9.3.4 source using eclipse IDE in ubuntu
14.04. I have a library file abc.so to be loaded at run time when
postgresql server starts. When I run the server with

"-D /home/rajmohan/projects/TPCH_database"

 as argument in run configuration, I could see LOG message abc.so library
loaded and I am able to call library methods at run time. But this requires
a client to connect to server and pass cmds. But I want a stand-alone
postgres server such that I can give commands directly from eclipse command
prompt itself. For that when I set the argument in run configuration as

"--single -D /home/rajmohan/projects/TPCH_database tpch",

the library loaded log message doesnot appears and library does not load.
How to load the library on stand-alone server and work?

Re: [HACKERS] make check For Extensions

2014-06-12 Thread Fabien COELHO



That does not mean that it starts a new cluster on a port. It means it 
will test it against an existing cluster after you have installed into 
that cluster.


Yes, that is what I was saying.

It invokes "psql" which is expected to work directly. Note that there 
is no temporary installation, it is tested against the installed and 
running postgres. Maybe having the ability to create a temporary 
installation, as you suggest, would be a nice extension.


Yes, that’s what I would like, so I could test *before* installing.


I would suggest to add that to https://wiki.postgresql.org/wiki/Todo.

I may look into it when I have time, over the summer. The key point is 
that there is no need for a temporary installation, but only of a 
temporary cluster, and to trick this cluster into loading the uninstalled 
extension, maybe by playing with dynamic_library_path in the temporary 
cluster.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

60 matches

Mail list logo