Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Martijn van Oosterhout
On Tue, Nov 26, 2013 at 06:50:28PM -0500, Tom Lane wrote:
 I believe the reason for this is the mechanism that I speculated about in
 that previous thread.  The platform is blocking SIGALRM while it executes
 handle_sig_alarm(), and that calls LockTimeoutHandler() which does
 kill(MyProcPid, SIGINT), and that SIGINT is being delivered immediately
 (or at least before we can get out of handle_sig_alarm).  So now the
 platform blocks SIGINT, too, and calls StatementCancelHandler(), which
 proceeds to longjmp out of the whole signal-handling call stack.  So
 the signal unblocking that would have happened after the handlers
 returned doesn't happen.  In simpler cases we don't see an issue because
 the longjmp returns to the sigsetjmp(foo,1) call in postgres.c, which
 will result in restoring the signal mask that was active at that stack
 level, so we're all good.  However, PG_TRY() uses sigsetjmp(foo,0),
 which means that no signal mask restoration happens if we catch the
 longjmp and don't ever re-throw it.  Which is exactly what happens in
 plpgsql because of the EXCEPTION clause in the above example.
 
 I don't know how many platforms block signals during handlers in this way,
 but I'm seeing it on Linux (RHEL6.5 to be exact) and we know that at least
 OpenBSD acts likewise, so that's a pretty darn large chunk of the world.

Isn't this why sigsetjmp/siglongjmp were invented? Is there a
situation where you don't want the signal mask restored?
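A minimal standalone sketch of what the savemask argument controls (assuming
POSIX semantics and a signal() that blocks the signal during the handler, as
described above; not PostgreSQL code):

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf env;

static void
handler(int signo)
{
	/* jump out of the handler without returning from it */
	siglongjmp(env, 1);
}

int
main(void)
{
	sigset_t	cur;

	signal(SIGALRM, handler);

	/*
	 * Second argument 1: the current signal mask is saved here and restored
	 * by siglongjmp().  With 0, the mask installed while the handler ran
	 * (SIGALRM blocked) would remain in effect after the jump -- which is
	 * the situation PG_TRY()'s sigsetjmp(foo,0) leaves us in when the error
	 * is caught and never re-thrown.
	 */
	if (sigsetjmp(env, 1) == 0)
		raise(SIGALRM);

	sigprocmask(SIG_BLOCK, NULL, &cur);
	printf("SIGALRM still blocked: %d\n", sigismember(&cur, SIGALRM));
	return 0;
}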

BTW, seems on BSD systems sigsetjmp == setjmp:

http://www.gnu.org/software/libc/manual/html_node/Non_002dLocal-Exits-and-Signals.html

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 01:21, Andres Freund wrote:

On 2013-11-26 13:32:44 +0100, Andres Freund wrote:

This seems to be the case since
b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
scan_all to determine whether we can set new_frozen_xid. That's a slight
pessimization when we scan a relation fully without explicitly scanning
it in its entirety, but given this isn't the first bug around
scanned_pages/rel_pages I'd rather go that way. The aforementioned
commit wasn't primarily concerned with that.
Alternatively we could just compute new_frozen_xid et al before the
lazy_truncate_heap.


I've gone for the latter in this preliminary patch. Not increasing
relfrozenxid after an initial data load seems like a bit of a shame.

 I wonder if we should just do scan_all || vacrelstats->scanned_pages <
 vacrelstats->rel_pages?


Hmm, you did (scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages) for relminmxid, and just
(vacrelstats->scanned_pages < vacrelstats->rel_pages) for relfrozenxid.
That was probably not what you meant to do, the thing you did for 
relfrozenxid looks good to me.


Does the attached look correct to you?

- Heikki
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6778c7d..6688ab3 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -178,7 +178,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	int			usecs;
 	double		read_rate,
 write_rate;
-	bool		scan_all;
+	bool		scan_all;		/* should we scan all pages? */
+	bool		scanned_all;	/* did we actually scan all pages? */
 	TransactionId freezeTableLimit;
 	BlockNumber new_rel_pages;
 	double		new_rel_tuples;
@@ -226,6 +227,21 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	vac_close_indexes(nindexes, Irel, NoLock);
 
 	/*
+	 * Compute whether we actually scanned the whole relation. If we did, we
+	 * can adjust relfrozenxid and relminmxid.
+	 *
+	 * NB: We need to check this before truncating the relation, because that
+	 * will change ->rel_pages.
+	 */
+	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
+	{
+		Assert(!scan_all);
+		scanned_all = false;
+	}
+	else
+		scanned_all = true;
+
+	/*
 	 * Optionally truncate the relation.
 	 *
 	 * Don't even think about it unless we have a shot at releasing a goodly
@@ -254,8 +270,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	 * is all-visible we'd definitely like to know that.  But clamp the value
 	 * to be not more than what we're setting relpages to.
 	 *
-	 * Also, don't change relfrozenxid if we skipped any pages, since then we
-	 * don't know for certain that all tuples have a newer xmin.
+	 * Also, don't change relfrozenxid/relminmxid if we skipped any pages,
+	 * since then we don't know for certain that all tuples have a newer xmin.
 	 */
 	new_rel_pages = vacrelstats-rel_pages;
 	new_rel_tuples = vacrelstats-new_rel_tuples;
@@ -269,13 +285,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	if (new_rel_allvisible > new_rel_pages)
 		new_rel_allvisible = new_rel_pages;
 
-	new_frozen_xid = FreezeLimit;
-	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
-		new_frozen_xid = InvalidTransactionId;
-
-	new_min_multi = MultiXactCutoff;
-	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
-		new_min_multi = InvalidMultiXactId;
+	new_frozen_xid = scanned_all ? FreezeLimit : InvalidTransactionId;
+	new_min_multi = scanned_all ? MultiXactCutoff : InvalidMultiXactId;
 
 	vac_update_relstats(onerel,
 		new_rel_pages,



Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Andres Freund
Hi,

On 2013-11-26 18:50:28 -0500, Tom Lane wrote:
 I don't know how many platforms block signals during handlers in this way,
 but I'm seeing it on Linux (RHEL6.5 to be exact) and we know that at least
 OpenBSD acts likewise, so that's a pretty darn large chunk of the world.

Just as a datapoint, I think on linux (and likely on any system using
sigaction()) that depends on how you setup the signal handler. There's a
flag that controls that behaviour, namely SA_NODEFER.
When using signal(2) the behaviour depends on the flavor of system we're
running and even on some #defines.

I am not really sure whether using SA_NODEFER would be a good idea, even
if we could achieve that behaviour on all platforms.
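(For reference, a minimal sketch of where that flag would go when installing
a handler via sigaction(); hypothetical handler name, not a proposal for how
postgres should register its handlers:)

#include <signal.h>
#include <string.h>

static void
handle_alarm(int signo)
{
	/* ... */
}

int
main(void)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = handle_alarm;
	sigemptyset(&act.sa_mask);

	/*
	 * Default: the signal being handled is added to the blocked mask while
	 * its handler runs.  With SA_NODEFER it is not, so the handler can be
	 * interrupted by another occurrence of the same signal.
	 */
	act.sa_flags = 0;			/* or SA_NODEFER */

	sigaction(SIGALRM, &act, NULL);
	return 0;
}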

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Tue, Nov 26, 2013 at 8:19 PM, Peter Geoghegan p...@heroku.com wrote:
 There are some visibility-related race conditions even still

I also see this, sandwiched between the very many deadlock detected
errors recorded over 6 or so hours (this is in chronological order,
with no ERRORs omitted within the range shown):

ERROR:  deadlock detected
ERROR:  deadlock detected
ERROR:  deadlock detected
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  unable to fetch updated version of tuple
ERROR:  deadlock detected
ERROR:  deadlock detected
ERROR:  deadlock detected
ERROR:  deadlock detected

This, along with the already-discussed "attempted to update invisible
tuple", forms a full account of unexpected ERRORs seen during the
extended run of the test case, so far.

Since it took me a relatively long time to recreate this, it may not
be trivial to do so. Unless you don't think it's useful to do so, I'm
going to give this test a full 24 hours, just in case it shows up
anything else like this.

-- 
Peter Geoghegan




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Karsten Hilbert
On Tue, Nov 26, 2013 at 03:25:44PM -0800, Kevin Grittner wrote:

  doc patch?
 
 Instead of the fix you mean, or with it?  I don't see what we would
 change in the docs for the fix; the alternative might be to
 document that pg_dumpall output will fail to restore if any
 database (or the restoring user) has this property set.

a) since pg_dump is not planned to be changed to deal with
   it, a hint in the pg_dump docs would be helpful

I can fully understand the argument that if the dump
does NOT contain a create database the restore
should indeed honor the existing database setting.

In case it does contain a create database statement
the issue won't exist -- because the database will
need to be gone before the restore.

b) if pg_upgrade is not fixed then a hint in the docs
   is useful (see below)

  pg_upgrade probably needs a doc patch too, or a reset like
  pg_dumpall.  pg_upgrade is more like pg_dumpall, so a code patch
  seems most logical, again, assuming we decide that pg_dumpall is
  the right place for this reset of default_transaction_read_only.
 
 I don't have much opinion on what the pg_upgrade aspect except,

The situation is this:

naive user:
- installs new PG version
- upgrades old cluster with one default_read_only database
- upgrade fails (or check fails - latter requires patch)
- user asks someone
- user unsets default_read_only
- upgrades old cluster
- upgrade succeeds
- user re-sets default_read_only

careful user:
- installs new PG version
- reads pg_upgrade docs of new version (requires doc patch)
- user unsets default_read_only
- upgrades old cluster
- upgrade succeeds
- user re-sets default_read_only

I can't for the life of me think of a scenario where
one would go: "Oh, I've got a default_read_only
database -- let's NOT upgrade the cluster."

 check.  Passing the check but failing during the upgrade would not
 be very user-friendly.  Again, I'm not sure that a doc patch is
 needed to say that pg_upgrade works even when this option is set.
 Why would anyone assume otherwise?  Why would we list this property
 and not others?

A doc patch is only needed if pg_upgrade does NOT work
when some databases are default_read_only because THAT
is counterintuitive: To the naive user upgrading from
version to version is like making a copy, perhaps renaming
a file or two. It doesn't intuitively suggest a need
for WRITE access to individual databases themselves.

However, if the current behaviour is retained it would
be very useful to document that and also document one
or two ways around it (unsetting, PGOPTIONS, ...).

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote:
 On 11/27/13 01:21, Andres Freund wrote:
 On 2013-11-26 13:32:44 +0100, Andres Freund wrote:
 This seems to be the case since
 b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
 scan_all to determine whether we can set new_frozen_xid. That's a slight
 pessimization when we scan a relation fully without explicitly scanning
 it in its entirety, but given this isn't the first bug around
 scanned_pages/rel_pages I'd rather go that way. The aforementioned
 commit wasn't primarily concerned with that.
 Alternatively we could just compute new_frozen_xid et al before the
 lazy_truncate_heap.
 
 I've gone for the latter in this preliminary patch. Not increasing
 relfrozenxid after an initial data load seems like a bit of a shame.
 
 I wonder if we should just do scan_all || vacrelstats->scanned_pages <
 vacrelstats->rel_pages?
 
 Hmm, you did (scan_all || vacrelstats->scanned_pages <
 vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages
 < vacrelstats->rel_pages) for relfrozenxid. That was probably not what you
 meant to do, the thing you did for relfrozenxid looks good to me.

I said it's a preliminary patch ;), really, I wasn't sure what of both
to go for.

 Does the attached look correct to you?

Looks good.

I wonder if we need to integrate any mitigating logic? Currently the
corruption may only become apparent long after it occurred, that's
pretty bad. And instructing people to run a vacuum after the upgrade will
cause the corrupted data to be lost if they are already 2^31 xids. But
integrating logic to fix things into heap_page_prune() looks somewhat
ugly as well.
Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Andres Freund
On 2013-11-27 01:09:49 -0800, Peter Geoghegan wrote:
 On Tue, Nov 26, 2013 at 8:19 PM, Peter Geoghegan p...@heroku.com wrote:
  There are some visibility-related race conditions even still
 
 I also see this, sandwiched between the very many deadlock detected
 errors recorded over 6 or so hours (this is in chronological order,
 with no ERRORs omitted within the range shown):
 
 ERROR:  deadlock detected
 ERROR:  deadlock detected
 ERROR:  deadlock detected
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  unable to fetch updated version of tuple
 ERROR:  deadlock detected
 ERROR:  deadlock detected
 ERROR:  deadlock detected
 ERROR:  deadlock detected
 
 This, along with the already-discussed "attempted to update invisible
 tuple", forms a full account of unexpected ERRORs seen during the
 extended run of the test case, so far.

I think at least the "unable to fetch updated version of tuple" ERRORs
are likely to be an unrelated 9.3+ BUG that I've recently
reported. Alvaro has a patch. C.f. 20131124000203.ga4...@alap2.anarazel.de

Even the "deadlock detected" errors might be a fkey-locking issue. Bug
#8434, but that's really hard to know without more details.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application

2013-11-27 Thread Naoya Anzai
Hi, Rajeev

  I tested the latest patch. My observation is:
  If we give relative data directory path while registering the
  service, then service start fails.
  But same works if the data directory is absolute path.
  
  Looks like an existing issue. May be we need to internally
  convert relative data path to absolute.
 
 Since the mentioned issue is an existing issue and not because of this patch,
 can we take that as a separate defect and fix it? If so, then I can
 move this patch to ready for committer.

I think so too.
When booted as a Service, the CurrentDirectory seems to be C:/Windows/system32.
So, you have to set a relative data directory path whose starting point is
C:/Windows/system32.

 
 Thanks and Regards,
 Kumar Rajeev Rastogi
 
 

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Soft, Ltd.
E-Mail: anzai-na...@mxu.nes.nec.co.jp
---




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:20 AM, Andres Freund and...@2ndquadrant.com wrote:
 Even the deadlock detected errors might be a fkey-locking issue. Bug
 #8434, but that's really hard to know without more details.

Thanks, I was aware of that but didn't make the connection.

I've written a test-case that is designed to exercise one case that
deadlocks like crazy - deadlocking is the expected, correct behavior.
The deadlock errors are not in themselves suspicious. Actually, if
anything I find it suspicious that there aren't more deadlocks.


-- 
Peter Geoghegan




[HACKERS] Name type in postgres

2013-11-27 Thread mohsen soodkhah mohammadi
Hello.
I want to understand: can we have a name without a null terminator?
In src/backend/utils/adt/name.c, the nameout() function does:
Name s = PG_GETARG_NAME(0);
PG_RETURN_CSTRING(pstrdup(NameStr(*s)));
What does the pstrdup() function do?
Does it create a null-terminated string from the name?
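(For context, roughly what is involved here - a simplified sketch, not the
exact PostgreSQL source: a Name is a fixed-size, null-terminated NAMEDATALEN
buffer, NameStr() yields the char pointer into it, and pstrdup() returns a
palloc'd copy of a C string including its terminator.)

#include <stdlib.h>
#include <string.h>

#define NAMEDATALEN 64			/* as in pg_config_manual.h */

typedef struct nameData
{
	char		data[NAMEDATALEN];	/* padded, always null-terminated */
} NameData;
typedef NameData *Name;

#define NameStr(name)	((name).data)

/*
 * Simplified stand-in: the real pstrdup() allocates with palloc() in
 * CurrentMemoryContext rather than malloc(), but the effect is the same:
 * a fresh, null-terminated copy of the input string.
 */
static char *
pstrdup_sketch(const char *in)
{
	size_t		len = strlen(in) + 1;	/* copy includes the '\0' terminator */
	char	   *out = malloc(len);

	memcpy(out, in, len);
	return out;
}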


Re: [HACKERS] Assertions in PL/PgSQL

2013-11-27 Thread Pavel Stehule
2013/11/27 Peter Eisentraut pete...@gmx.net

 On Tue, 2013-11-19 at 10:40 -0500, Robert Haas wrote:
  I think the goal was to get to RAISE ASSERT
  WHEN ...; then, if assertions are off, you do nothing; if they're on,
  you error.  IF condition THEN RAISE... isn't a suitable surrogate in
  that case because you incur the overhead of testing the condition
  regardless.

 So if I do RAISE ASSERT WHEN condition and assertions are off, then
 condition wouldn't even be evaluated?  But what about RAISE NOTICE WHEN,
 when log_min_messages is error?  What about the side effects of the
 format string?  This is all just getting too weird.

 I don't see anything wrong with considering a separate ASSERT command
 with its own semantics, like in many other programming languages.


My objection against the ASSERT command was just one - it was too simple
(compared to the cost of a possible collision from introducing a new, widely
used keyword).

I can live with an ASSERT statement - but I expect at minimum the possibility
to specify a level (failure, tracing, ...) and a message related to the
assert. An assert with only an expression is not enough.

Regards

Pavel


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Shigeru Hanada
Hi Merlin,

2013/11/22 Merlin Moncure mmonc...@gmail.com:
 On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
 shigeru.han...@gmail.com wrote:
 2013/11/22 Tom Lane t...@sss.pgh.pa.us:
 Merlin Moncure mmonc...@gmail.com writes:
 On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian br...@momjian.us wrote:
 I know join pushdowns seem insignificant, but it helps to restrict what
 data must be passed back because you would only pass back joined rows.

 By 'insignificant' you mean 'necessary to do any non-trivial real
 work'.   Personally, I'd prefer it if FDW was extended to allow
 arbitrary parameterized queries like every other database connectivity
 API ever made ever.

 [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
 following the SQL standard's SQL/MED design, which does not do it
 like that.

 Pass-through mode mentioned in SQL/MED standard might be what he wants.

 happen to have a link handy?

The SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction with the client.  Besides that, I think it would be
nice to allow an arbitrary FDW as the backend of the dblink interface like this:

postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);

This provides a way to execute query without defining foreign table.
-- 
Shigeru HANADA




Re: [HACKERS] Get more from indices.

2013-11-27 Thread Etsuro Fujita
Kyotaro HORIGUCHI wrote:
  * In get_relation_info(), the patch determines the branch condition
  keyattno != ObjectIdAttributeNumber.  I fail to understand the
  meaning of this branch condition.  Could you explain about it?

 Literally answering, it means oid cannot be NULL (if it exists).

Understood.  Thank you for the detailed information.  But I'm not sure it's
worth complicating the code.  What use cases are you thinking of?

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application

2013-11-27 Thread Rajeev rastogi
On 27 November 2013, Naoya Anzai wrote:
 Hi, Rajeev
 
   I tested the latest patch. My observation is:
 If we give relative data directory path while registering the
   service, then service start fails.
 But same works if the data directory is absolute path.
  
 Looks like an existing issue. May be we need to internally
 convert
   relative data path to absolute.
 
  Since the mentioned issue is an existing issue and not because of
 this patch.
  So can we take that as separate defect and fix. If so, then I can
 move
  this patch to ready for committer.
 
 I think so too.
 When booted as a Service, the CurrentDirectory seems to be C:/Windows/system32.
 So, you have to set a relative data directory path whose starting
 point is C:/Windows/system32.
 

OK. Then I am moving it to ready for committer.




Re: [HACKERS] [PATCH] Add transforms feature

2013-11-27 Thread Hannu Krosing
On 11/15/2013 05:04 PM, Dimitri Fontaine wrote:
 Hi,

 Peter Eisentraut pete...@gmx.net writes:
 Rebased patch.  No changes except that merge conflicts were resolved,
 and I had to add some Data::Dumper tweaks to the regression tests so
 that the results came out in  consistent order on different versions of
 Perl.

 On the higher level design, the big question here is about selective
 behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
 then any plperl function will now receive its hstore arguments as a
 proper perl hash rather than a string.

 Any pre-existing plperl function with hstore arguments or return type
 then needs to be upgraded to handle the new types nicely, and some of
 those might not be under the direct control of the DBA running the
 CREATE TRANSFORM command, when using some plperl extensions for example.

 A mechanism allowing for the transform to only be used in some functions
 but not others might be useful. The simplest such mechanism I can think
 of is modeled against the PL/Java classpath facility as specified in the
 SQL standard: you attach a classpath per schema.
If we start adding granularity, then why not go all the way?

I mean, we could do it in the following way

1) create named transforms

CREATE [DEFAULT] TRANSFORM xformname FOR type LANGUAGE lang 
(...details...);

2) use it when declaring a function

CREATE function funcname(
IN argname type [[USING] [TRANSFORM] xformname],
INOUT argname type [[USING] [IN] [TRANSFORM] xformname] [[USING] 
[OUT] [TRANSFORM] xformname],
OUT argname type [[USING] [TRANSFORM] xformname],

 ... 
) LANGUAGE lang $$
funcdef
$$;

This approach allows full flexibility in using old packages, especially
if we define old transform behaviour as DEFAULT TRANSFORM

Default transforms also allow easy way for rewriting current type i/o
conversions between languages into transforms.

There are immediately a few transforms that I would find useful

A) pass field data to language as pairs of (typeoid, typebin)

this is useful for speed, especially if you do not want to use many
of the passed arguments on most invocations

B) pass field data in as (typeoid, typebin), except do not de-toast
values but pass in the toast ids, so the function is free to use only
parts of toasted values as it needs

C) pass field data in as string, probably the default behaviour for
languages like pl/tcl and pl/sh

D) and then of course just having sensible transforms for extension
types like the current patch provides.

 Worst case, that I really don't think we need, would be addressing that
 per-argument:

   CREATE FUNCTION foo (hash hstore WITH TRANSFORM, kv hstore) …

 I certainly hope we don't need that, and sure can't imagine use cases
 for that level of complexity at the time of writing this review.

A typical use case would be to have a short hstore always passed in as
a dictionary and have another, possibly large, hstore passed in as a
toast pointer.

And if we want to have all type conversions between postgres and pls
re-written as transforms, then we do need named transforms, not just
one per (pl, type) pair.

Also, if we allow flexibility, then it is probably a good idea to
implement full flexibility first and then look at making usage easy
after that, instead of adding flexibility in small steps.

Once we have per-argument transforms in place, we can look at setting
per-schema defaults for ease of use.

As a large part of this is actually abstracting i/o conversions out of
pl function code, I think we should look at allowing the conversion
functions to be written in the target pl language in addition to C.

I'll see if I can resurrect my patch for support of cstring and
internal types in pl/python function defs for this.

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 11:15, Andres Freund wrote:

On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote:

On 11/27/13 01:21, Andres Freund wrote:

On 2013-11-26 13:32:44 +0100, Andres Freund wrote:

This seems to be the case since
b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
scan_all to determine whether we can set new_frozen_xid. That's a slight
pessimization when we scan a relation fully without explicitly scanning
it in its entirety, but given this isn't the first bug around
scanned_pages/rel_pages I'd rather go that way. The aforementioned
commit wasn't primarily concerned with that.
Alternatively we could just compute new_frozen_xid et al before the
lazy_truncate_heap.


I've gone for the latter in this preliminary patch. Not increasing
relfrozenxid after an initial data load seems like a bit of a shame.

I wonder if we should just do scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages?


Hmm, you did (scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages
< vacrelstats->rel_pages) for relfrozenxid. That was probably not what you
meant to do, the thing you did for relfrozenxid looks good to me.


I said it's a preliminary patch ;), really, I wasn't sure what of both
to go for.


Does the attached look correct to you?


Looks good.


Ok, committed and backpatched that.


I wonder if we need to integrate any mitigating logic? Currently the
corruption may only become apparent long after it occurred, that's
pretty bad. And instructing people to run a vacuum after the upgrade will
cause the corrupted data to be lost if they are already 2^31 xids.


Ugh :-(. Running vacuum after the upgrade is the right thing to do to 
prevent further damage, but you're right that it will cause any 
already-wrapped around data to be lost forever. Nasty.



But integrating logic to fix things into heap_page_prune() looks
somewhat ugly as well.


I think any mitigating logic we might add should go into vacuum. It 
should be possible for a DBA to run a command, and after it's finished, 
be confident that you're safe. That means vacuum.



Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.


Hmm. If a page has its visibility-map flag set, but contains a tuple 
that appears to be dead because you've wrapped around, vacuum will give 
a warning: "page containing dead tuples is marked as all-visible in
relation \"%s\" page %u". So I think if a manual VACUUM FREEZE passes
without giving that warning, that vacuum hasn't destroyed any data. So 
we could advise to take a physical backup of the data directory, and run 
a manual VACUUM FREEZE on all databases. If it doesn't give a warning, 
you're safe from that point onwards. If it does, you'll want to recover 
from an older backup, or try to manually salvage just the lost rows from 
the backup, and re-index. Ugly, but hopefully rare in practice.


Unfortunately that doesn't mean that you haven't already lost some data. 
Wrap-around could've already happened, and vacuum might already have run 
and removed some rows. You'll want to examine your logs and grep for 
that warning.


- Heikki




Re: [HACKERS] PL/Python: domain over array support

2013-11-27 Thread Marko Kreen
On Tue, Nov 26, 2013 at 07:12:00PM -0200, Rodolfo Campero wrote:
 2013/11/26 Heikki Linnakangas hlinnakan...@vmware.com
  Oops, sorry about that. Fixed.
 
 Maybe you forgot to modify
 plpython_types_3.out

Yes.  Heikki, please fix plpython_types_3.out too.

See attached patch.

-- 
marko

diff --git a/src/pl/plpython/expected/plpython_types_3.out b/src/pl/plpython/expected/plpython_types_3.out
index 25331f2..e104356 100644
--- a/src/pl/plpython/expected/plpython_types_3.out
+++ b/src/pl/plpython/expected/plpython_types_3.out
@@ -664,6 +664,9 @@ SELECT * FROM test_type_conversion_array_error();
 ERROR:  return value of function with array return type is not a Python sequence
 CONTEXT:  while creating return value
 PL/Python function test_type_conversion_array_error
+--
+-- Domains over arrays
+--
 CREATE DOMAIN ordered_pair_domain AS integer[] CHECK (array_length(VALUE,1)=2 AND VALUE[1] < VALUE[2]);
 CREATE FUNCTION test_type_conversion_array_domain(x ordered_pair_domain) RETURNS ordered_pair_domain AS $$
 plpy.info(x, type(x))



Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 13:56:58 +0200, Heikki Linnakangas wrote:
 Ok, committed and backpatched that.

Thanks.

 I wonder if we need to integrate any mitigating logic? Currently the
 corruption may only become apparent long after it occurred, that's
 pretty bad. And instructing people to run a vacuum after the upgrade will
 cause the corrupted data to be lost if they are already 2^31 xids.
 
 Ugh :-(. Running vacuum after the upgrade is the right thing to do to
 prevent further damage, but you're right that it will cause any
 already-wrapped around data to be lost forever. Nasty.

 But integrating logic to fix things into heap_page_prune() looks
 somewhat ugly as well.
 
 I think any mitigating logic we might add should go into vacuum. It should
 be possible for a DBA to run a command, and after it's finished, be
 confident that you're safe. That means vacuum.

Well, heap_page_prune() is the first thing that's executed by
lazy_scan_heap(), that's why I was talking about it. So anything we do
needs to happen in there or before.

 Afaics the likelihood of the issue occurring on non-all-visible pages is
 pretty low, since they'd need to be skipped due to lock contention
 repeatedly.

 Hmm. If a page has its visibility-map flag set, but contains a tuple that
 appears to be dead because you've wrapped around, vacuum will give a
 warning: "page containing dead tuples is marked as all-visible in relation
 \"%s\" page %u".

I don't think this warning is likely to be hit as the code stands -
heap_page_prune() et al. will have removed all dead tuples already,
right, and so has_dead_tuples won't be set.

Independent from this, ISTM we should add a
   else if (PageIsAllVisible(page) && all_visible)
to those checks.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PL/Python: domain over array support

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 14:15, Marko Kreen wrote:

On Tue, Nov 26, 2013 at 07:12:00PM -0200, Rodolfo Campero wrote:

2013/11/26 Heikki Linnakangas hlinnakan...@vmware.com

Oops, sorry about that. Fixed.


Maybe you forgot to modify
plpython_types_3.out


Yes.  Heikki, please fix plpython_types_3.out too.

See attached patch.


Ah, sorry. Committed..

- Heikki




Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-27 Thread Pavel Stehule
I'll prepare a patch.


2013/11/27 Tom Lane t...@sss.pgh.pa.us

 Dean Rasheed dean.a.rash...@gmail.com writes:
  Actually the IF EXISTS in DROP TABLE now applies to the schema as
  well. Unfortunately there is currently no consistency across the
  various DROP commands --- some tolerate a non-existent schema, while
  others error out.

 Yeah.  I think now that we've had this discussion, we should make them
 all tolerate a non-existent schema.  I'm fine with having that happen
 over a series of patches rather than all at once though.

  Also amongst those that tolerate a non-existent
  schema, the resulting notices are not consistent --- some report the
  schema-qualified object name, while others just report the local
  object name.

 Less excited about this part, but on the whole I'd vote for the schema
 no_such_schema does not exist wording in cases where the schema isn't
 there.  The other way is less specific for no very good reason.

 regards, tom lane



Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 14:11, Andres Freund wrote:

On 2013-11-27 13:56:58 +0200, Heikki Linnakangas wrote:

Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.



Hmm. If a page has its visibility-map flag set, but contains a tuple that
appears to be dead because you've wrapped around, vacuum will give a
warning: "page containing dead tuples is marked as all-visible in relation
\"%s\" page %u".


I don't think this warning is likely to be hit as the code stands -
heap_page_prune() et al. will have removed all dead tuples already,
right, and so has_dead_tuples won't be set.


It might be a good idea to add such a warning to heap_page_prune(). Or
also emit the warning in lazy_scan_heap() if heap_page_prune() returned > 0.



Independent from this, ISTM we should add a
else if (PageIsAllVisible(page) && all_visible)
to those checks.


Can you elaborate, where should that be added?

- Heikki




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 14:45:25 +0200, Heikki Linnakangas wrote:
 On 11/27/13 14:11, Andres Freund wrote:
 I don't think this warning is likely to be hit as the code stands -
 heap_page_prune() et. al. will have removed all dead tuples already,
 right and so has_dead_tuples won't be set.
 
 It might be a good idea to add such a warning to heap_page_prune(). Or also
 emit the warning in lazy_scan_heap() if heap_page_prune() returned  0.

 Independent from this, ISTM we should add a
 else if (PageIsAllVisible(page) && all_visible)
 to those checks.
 
 Can you elaborate, where should that be added?

I was thinking of adding such a warning below
elog(WARNING, "page containing dead tuples is marked as all-visible in
relation \"%s\" page %u", ...)
but we cannot warn against that because GetOldestXmin() can go backwards...

I think it's probably sufficient to set has_dead_tuples = true in the
ItemIdIsDead() branch in lazy_scan_heap(). That should catch relevant
actions from heap_page_prune().
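(A rough sketch, from memory, of where that assignment would go in
lazy_scan_heap()'s item loop; only the has_dead_tuples line is the proposed
change, the rest paraphrases the existing code:)

if (ItemIdIsDead(itemid))
{
	lazy_record_dead_tuple(vacrelstats, &(tuple.t_self));
	all_visible = false;
	has_dead_tuples = true;		/* proposed: remember prune-removed tuples too */
	continue;
}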

Besides not warning against deletions from heap_page_prune(), the
current warning logic is also buggy for tables without indexes...

/*
 * If there are no indexes then we can vacuum the page right now
 * instead of doing a second scan.
 */
if (nindexes == 0 &&
vacrelstats->num_dead_tuples > 0)
{
/* Remove tuples from heap */
lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, vmbuffer);
has_dead_tuples = false;

That happens before the
else if (PageIsAllVisible(page) && has_dead_tuples)
check.

With regard to fixing things up, ISTM the best bet is heap_prune_chain()
so far. That's executed by vacuum and by opportunistic pruning and we
know we have the appropriate locks there. Looks relatively easy to fix
up things there. Not sure if there are any possible routes to WAL log
this but using log_newpage()?
I am really not sure what the best course of action is :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Alvaro Herrera
Shigeru Hanada escribió:

 The SQL/MED standard doesn't say much about PASS THROUGH mode, especially
 about interaction with the client.  Besides that, I think it would be
 nice to allow an arbitrary FDW as the backend of the dblink interface like this:
 
 postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
 postgres=> SELECT * FROM dblink('con1', 'some query written in remote
 syntax') as t(/* record type definition */...);
 
 This provides a way to execute query without defining foreign table.

Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Traffic jams in fn_extra

2013-11-27 Thread Greg Stark
On Sun, Nov 24, 2013 at 4:21 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Why do you need to do this dance with fn_extra?

 It's possible to allocate a hash table in a Transaction-lifetime
 memory context on first call into a function then cache things there.


fn_extra gives a handle per function call site.

It sounds to me like the complexity is coming not from having many Postgres
functions but from having lots of infrastructure backing up those
functions. So if many of their Postgres functions call a C function to do
various things and the C function wants to cache something somewhere
related to the object they've been passed then the natural thing to do is
have the Postgres function pass fn_extra down to the C function but if they
have many such C functions it gets a bit tricky.


But you could declare fn_extra to be a hash table yourself since it's your
Postgres function that gets to decide how fn_extra is going to be used. You
could then pass that hash table down to the various C functions to cache
state. However that might still be a bit odd. If you call the same C
function twice from the same Postgres function it'll get the same hash
table for both calls. fn_extra is per Postgres function call site.
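(For illustration, a minimal sketch of the fn_extra pattern being discussed;
MyCache is a hypothetical struct, while fn_extra and fn_mcxt are the real
FmgrInfo fields:)

#include "postgres.h"
#include "fmgr.h"

typedef struct MyCache
{
	/* whatever the shared C helpers want to remember */
	int			ncalls;
} MyCache;

PG_FUNCTION_INFO_V1(my_func);

Datum
my_func(PG_FUNCTION_ARGS)
{
	MyCache    *cache = (MyCache *) fcinfo->flinfo->fn_extra;

	if (cache == NULL)
	{
		/*
		 * First call at this call site: allocate in fn_mcxt so the cache
		 * lives exactly as long as the FmgrInfo itself.
		 */
		cache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
									   sizeof(MyCache));
		fcinfo->flinfo->fn_extra = cache;
	}

	cache->ncalls++;
	/* hand the per-call-site cache down to the shared C helpers here */

	PG_RETURN_VOID();
}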


-- 
greg


Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Heikki Linnakangas

On 11/22/13 15:04, Michael Paquier wrote:

On Thu, Nov 21, 2013 at 9:44 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

Here's a new version. To ease the review, I split the remaining patch again
into two, where the first patch is just yet more refactoring.

I also fixed some bugs in WAL logging and replay that I bumped into while
testing.

Cool. Here is the review of the two remaining patches.
1) More refactoring, general remarks:
- Code compiles without warnings
- Passes make check
- If I got it correctly... this patch separates the part managing data
to be inserted from ginbtree. I can understand the meaning behind that
if we consider that GinBtree is used only to define methods and search
conditions (flags and keys). However I am wondering if this does not
make the code more complicated...


Well, I think it's an improvement in readability to separate the
insertion payload from the search information. With the second patch it
becomes more important, because if an incompletely split page is
encountered while you're inserting a value, you need to insert the
downlink for the old incomplete split first, and then continue the
insertion of the original value where you left off. That context switching
becomes a lot easier when the value you're inserting is passed around
separately.



Particularly the flag isDelete that
is only passed to ginxlogInsert merits a comment IMO. Note that I
hadn't read the 2nd patch when writing that :)


Yeah, that gets removed in the second patch, which changes the WAL 
record format. :-)



1-2) Could it be possible to change the variable name of
GinBtreeEntryInsertData *entry in entryIsEnoughSpace? entry->entry
is kind of hard to apprehend... Renaming it to either insertEntry.
Another idea would be to rename entry in GinBtreeEntryInsertData to
entryData or entryTuple.


ok, renamed the variable to insertData.


1-3) This block of code is present two times:
+   if (!isleaf)
+   {
+   PostingItem *pitem = GinDataPageGetPostingItem(lpage, off);
+   PostingItemSetBlockNumber(pitem, updateblkno);
+   }
Should the update of a downlink to point to the next page be a
separate function?


Doesn't seem any better to me.

Thanks, committed this refactoring patch now. Will continue on the main 
patch.


- Heikki




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Seems to me that if you want to read remote tables without creating a
 foreign table, you could define them locally using something like the
 WITH syntax and then use them normally in the rest of the query.

I guess what's needed here is a kind of barrier that allows pushing a
whole arbitrary subquery (with joins and quals and whatnot) down to the
remote side.

My current thinking about how to solve that would be to add a notion of
FOREIGN VIEW in our system, which would basically implement that barrier
and send the view definition on the remote, with known quals values as
constants, or something like that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 On Tue, Nov 26, 2013 at 03:25:44PM -0800, Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:

 How are we handling breakage of pg_dump, not pg_dumpall?

 That was discussed.  Do you have something to add?

 I am confused what we are patching.  Are we patching pg_dump,
 pg_dumpall, or both?

Just pg_dumpall.c.

 Can I see the full patch?

It was attached to this post:

http://www.postgresql.org/message-id/1385225082.8248.yahoomail...@web162901.mail.bf1.yahoo.com

 Are we propagating other settings from pg_dump to pg_dumpall,
 like statement_timeout?

pg_dumpall output sets up the global objects (including their
properties) and then does a \connect to each database, followed by
the same output that pg_dump would generate for that database. 
That includes the SET statements for statement_timeout, etc.  The
patch does nothing to change what objects or properties the
pg_dumpall output tries to set up, it just sets a property *on the
current connection* to allow those statements to run without error.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-11-27 Thread Robert Haas
On Wed, Nov 27, 2013 at 12:56 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 - There is a comment "TODO: It would be nice to behave like the
 history and the source strings were concatenated, so that you could
 compress using the new data, too."  If we're not already doing that,
 then how are we managing to compress WAL by more than one-quarter in
 the "hundred tiny fields, all changed" case?

 The algorithm is not doing concatenation of history and source strings;
 the hash table is formed just with history data and then later,
 if a match is not found, it is added to the history, so this can be the
 reason for the above result.

From the compressor's point of view, that's pretty much equivalent to
behaving as if those strings were concatenated.

The point is that there's a difference between using the old tuple's
history entries to compress the new tuple and using the new tuple's
own history to compress it.  The former is delta-compression, which is
what we're supposedly doing here.  The latter is just plain
compression.  That doesn't *necessarily* make it a bad idea, but they
are clearly two different things.

 The basic idea is that you use a rolling hash function to divide up
 the history data into chunks of a given average size.  So we scan the
 history data, compute a rolling hash value at each point, and each
 time the bottom N bits are zero, we consider that to be the end of a
 chunk.  We enter all the chunks into a hash table.  The chunk size
 will vary, but on the average, given a reasonably well-behaved rolling
 hash function (the pglz one probably doesn't qualify) it'll happen
 every 2^N bytes, so perhaps for this purpose we'd choose N to be
 between 3 and 5.  Then, we scan the input we want to compress and
 divide it into chunks in the same way.  Chunks that don't exist in the
 history data get copied to the output, while those that do get
 replaced with a reference to their position in the history data.

 I think this idea looks better than the current one and it will definitely
 improve some of the cases, but I'm not sure we can win in all cases.
 We have tried one similar idea (reduce the size of the hash and
 eventually the comparison) by adding every 10 bytes to the history
 lookup table rather than every byte. It improved most cases but not
 all cases (the "hundred tiny fields, all changed" and
 "hundred tiny fields, half changed" tests were still slow).
 Patch and results are at link (refer approach-1):
 http://www.postgresql.org/message-id/001f01ce1c14$d3af0770$7b0d1650$@kap...@huawei.com

What you did there will, I think, tend to miss a lot of compression
opportunities.  Suppose for example that the old tuple is
ABCDEFGHIJKLMNOP and the new tuple is xABCDEFGHIJKLMNOP.  After
copying one literal byte we'll proceed to copy 9 more, missing the
fact that there was a long match available after the first byte.  The
advantage of the fingerprinting technique is that it's supposed to be
resistant to that sort of thing.
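For illustration, a minimal standalone sketch of the chunk-boundary rule
described above, using a toy rolling hash (certainly not the well-behaved one
that would actually be wanted) and hypothetical constants:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ROLL_WINDOW 8		/* bytes covered by the rolling hash */
#define CHUNK_BITS  4		/* boundary when low 4 bits are zero: ~16-byte chunks */
#define HASH_BASE   257u

/*
 * Scan the data, maintaining a rolling hash over the last ROLL_WINDOW bytes,
 * and report a chunk boundary wherever the hash's low CHUNK_BITS bits are
 * zero.  Applying the same rule to the history and to the new tuple yields
 * chunks that can be matched up through a hash table; an inserted byte only
 * perturbs the boundaries near the insertion point.
 */
static void
print_boundaries(const unsigned char *data, size_t len)
{
	uint32_t	hash = 0;
	uint32_t	pow = 1;
	size_t		i;

	for (i = 0; i < ROLL_WINDOW; i++)
		pow *= HASH_BASE;		/* HASH_BASE ^ ROLL_WINDOW, mod 2^32 */

	for (i = 0; i < len; i++)
	{
		hash = hash * HASH_BASE + data[i];
		if (i >= ROLL_WINDOW)
			hash -= pow * data[i - ROLL_WINDOW];	/* slide the window */

		if (i + 1 >= ROLL_WINDOW &&
			(hash & ((1u << CHUNK_BITS) - 1)) == 0)
			printf("chunk boundary after byte %zu\n", i);
	}
}

int
main(void)
{
	const unsigned char old[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEF";

	print_boundaries(old, strlen((const char *) old));
	return 0;
}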

 Now the tough question is what are the possible options for this patch
 and which one to pick:
 a. optimize encoding technique, so that it can improve results in most
 cases even if not all.
 b. have a table level option or guc to enable/disable WAL compression.
 c. use some heuristics to check if chances of compression are good,
 then only perform compression.
 1. apply this optimization for tuple size > 128 and < 2000
 2. apply this optimization if the number of modified columns is less
 than 25% (some threshold number) of total columns.
 I think we can get modified columns from target entry and use
 it if triggers haven't changed that tuple. I remember
 earlier there were concerns that this value can't be trusted
 completely, but I think using it as a heuristic is not a
 problem, even if this number is not right in some cases.
 d. forget about this optimization and reject the patch.
 I think by doing option 'b' and 'c' together can make this
 optimization usable in cases where it is actually useful.

I agree that we probably want to do (b), and I suspect we want both a
GUC and a reloption, assuming that can be done relatively cleanly.

However, I think we should explore (a) more before we explore (c).   I
think there's a good chance that we can reduce the CPU overhead of
this enough to feel comfortable having it enabled by default.  If we
proceed with heuristics as in approach (c), I don't think that's the
end of the world, but I think there will be more corner cases where we
lose and have to fiddle things manually.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Shave a few instructions from child-process startup sequence

2013-11-27 Thread Robert Haas
On Tue, Nov 26, 2013 at 10:12 PM, Gurjeet Singh gurj...@singh.im wrote:
 On Tue, Nov 26, 2013 at 9:42 PM, Peter Eisentraut pete...@gmx.net wrote:

 src/backend/postmaster/postmaster.c:2255: indent with spaces.
 +else
 src/backend/postmaster/postmaster.c:2267: indent with spaces.
 +break


 Not sure how that happened! Attached is the updated patch.

This is a performance patch, so it should come with benchmark results
demonstrating that it accomplishes its intended purpose.  I don't see
any.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] new unicode table border styles for psql

2013-11-27 Thread Pavel Stehule
2013/11/26 Pavel Stehule pavel.steh...@gmail.com




 2013/11/26 Peter Eisentraut pete...@gmx.net

 On 11/22/13, 3:26 AM, Pavel Stehule wrote:
  website is related to patch for 9.3 (I add note there)
 
  patch for 9.4 is fixed - and now with small doc

 I think it would help if we considered the new border styles and the new
 line styles separately.

 I don't find the new border styles to be particularly useful.  They just
 use up vertical screen space, which is usually more precious than
 horizontal space.  Is there a situation where you would find these styles
 to be more useful than the existing ones?  Keep in mind that pset is
 usually set permanently, so it wouldn't be practical to use a different
 border style depending on how the query results shape up (like \x auto).

 Now for the linestyles.  I can see how some of them are attractive, but
 several of them have poor aesthetics, I think.  I don't see a reason to
 accept 7 new styles just for fun.  If I had to choose, I'd consider
 -double1 and -double4 to be acceptable.


I am sending a reduced patch.

I added double1, double4 and double5 - renamed to double1, double2, double3.

Support for border 3 and 4 is removed.

postgres=# \pset linestyle double1 \pset border 2 \l
Line style (linestyle) is double1.
Border style (border) is 2.
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
╞═══╪══╪══╪═╪═╪═══╡
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   │
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   │
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=# \pset linestyle double2 \pset border 2 \l
Line style (linestyle) is double2.
Border style (border) is 2.
   List of databases
╔═══╤══╤══╤═╤═╤═══╗
║   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   ║
╟───┼──┼──┼─┼─┼───╢
║ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   ║
║ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   ║
║   │  │  │ │ │
postgres=CTc/postgres ║
║ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   ║
║   │  │  │ │ │
postgres=CTc/postgres ║
╚═══╧══╧══╧═╧═╧═══╝
(3 rows)

postgres=# \pset linestyle double3 \pset border 2 \l
Line style (linestyle) is double3.
Border style (border) is 2.
   List of databases
╔═══╦══╦══╦═╦═╦═══╗
║   Name║  Owner   ║ Encoding ║   Collate   ║Ctype║   Access
privileges   ║
╠═══╬══╬══╬═╬═╬═══╣
║ postgres  ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8
║   ║
║ template0 ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8 ║
=c/postgres   ║
║   ║  ║  ║ ║ ║
postgres=CTc/postgres ║
║ template1 ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8 ║
=c/postgres   ║
║   ║  ║  ║ ║ ║
postgres=CTc/postgres ║
╚═══╩══╩══╩═╩═╩═══╝
(3 rows)

Pavel



commit 47b3258db6e3e82399417f57ad551ad740348635
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Wed Nov 27 15:09:01 2013 +0100

initial

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 264cfe6..6e74690 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2088,9 +2088,10 @@ lo_import 152801
   <para>
   Sets the border line drawing style to one
   of <literal>ascii</literal>, <literal>old-ascii</literal>
-  or <literal>unicode</literal>.
-  Unique abbreviations are allowed.  (That would mean one
-  letter is enough.)
+  or <literal>unicode</literal>, <literal>double1</literal>,
+  <literal>double2</literal> or <literal>double3</literal>
+  (unicode styles). Unique abbreviations are allowed.
+  (That would mean one letter is enough.)
   The 

Re: [HACKERS] Extension Templates S03E11

2013-11-27 Thread Vik Fearing
On 11/26/2013 10:07 PM, Dimitri Fontaine wrote:
 What I now think we should do is only grant superusers the privileges to
 install an extension from a template they own or is owned by another
 superuser.

Say what?  Superusers bypass all privileges by definition.

-- 
Vik





Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:

 I happened to build in a shell that was still set up for the clang
 address sanitizer, and got the attached report.  On a rerun it was
 repeatable.  XLogInsert() seems to read past the end of a variable
 allocated on the stack in doPickSplit(). I haven't tried to analyze
 it past that, since this part of the code is unfamiliar to me.

 Yea, I've seen that one before as well and planned to report it at some
 point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
 up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
 toDelete.

Have you established whether having the CRC calculation read past
the end of the buffer can cause problems on recovery or standby
systems?  Should we try to get this fixed by Monday?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 3. Establish a coding rule that if you catch an error with
 PG_TRY() and don't re-throw it, you have to unblock signals in
 your PG_CATCH block.
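
For illustration, that rule would look roughly like this at a call site (a
minimal sketch using the existing PG_TRY/PG_CATCH, PG_SETMASK/UnBlockSig and
FlushErrorState facilities; the guarded function is hypothetical):

#include "postgres.h"
#include "libpq/pqsignal.h"     /* PG_SETMASK, UnBlockSig */

static void
call_and_swallow_error(void)
{
    PG_TRY();
    {
        risky_operation();      /* hypothetical; may elog(ERROR) */
    }
    PG_CATCH();
    {
        /*
         * We swallow the error instead of re-throwing it.  PG_TRY uses
         * sigsetjmp(..., 0), so nothing restores a signal mask that an
         * interrupted handler chain left blocked; unblock it explicitly,
         * per the proposed coding rule, then clean up the error state.
         */
        PG_SETMASK(&UnBlockSig);
        FlushErrorState();
    }
    PG_END_TRY();
}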

Could that be done in the PG_END_TRY macro?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-11-27 Thread Amit Kapila
On Wed, Nov 27, 2013 at 7:35 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Nov 27, 2013 at 12:56 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 The basic idea is that you use a rolling hash function to divide up
 the history data into chunks of a given average size.  So we scan the
 history data, compute a rolling hash value at each point, and each
 time the bottom N bits are zero, we consider that to be the end of a
 chunk.  We enter all the chunks into a hash table.  The chunk size
 will vary, but on the average, given a reasonably well-behaved rolling
 hash function (the pglz one probably doesn't qualify) it'll happen
 every 2^N bytes, so perhaps for this purpose we'd choose N to be
 between 3 and 5.  Then, we scan the input we want to compress and
 divide it into chunks in the same way.  Chunks that don't exist in the
 history data get copied to the output, while those that do get
 replaced with a reference to their position in the history data.
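
For illustration, a minimal sketch of the chunk-boundary selection described
above (the rolling hash and all names are invented for the example; a real
version would use a better-behaved hash and enter each history chunk into a
hash table keyed on its bytes):

/*
 * Return the length of the next chunk of "data": end a chunk whenever the
 * bottom bits of a rolling hash over the last WINDOW_SIZE bytes are zero,
 * so the average chunk size is about 2^N bytes (N = 4 here).
 */
#define ROLL_PRIME   31u
#define WINDOW_SIZE  4
#define CHUNK_MASK   ((1u << 4) - 1)

static int
next_chunk_length(const unsigned char *data, int len)
{
    /* ROLL_PRIME^WINDOW_SIZE, used to drop the oldest byte from the hash */
    const unsigned int drop = ROLL_PRIME * ROLL_PRIME * ROLL_PRIME * ROLL_PRIME;
    unsigned int hash = 0;
    int         i;

    for (i = 0; i < len; i++)
    {
        hash = hash * ROLL_PRIME + data[i];
        if (i >= WINDOW_SIZE)
            hash -= drop * data[i - WINDOW_SIZE];
        if (i + 1 >= WINDOW_SIZE && (hash & CHUNK_MASK) == 0)
            return i + 1;       /* chunk boundary after byte i */
    }
    return len;                 /* final (possibly short) chunk */
}

Because boundaries depend only on nearby content, an insertion near the start
of the new tuple shifts the bytes but generally not the chunk boundaries, so
later chunks still match entries from the history data.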

 I think this idea looks better than current and it will definately
 improve some of the cases, but not sure we can win in all cases.
 We have tried one similar idea (reduce the size of the hash and
 eventually the comparison) by adding every 10th byte to the history
 lookup table rather than every byte. It improved most cases but not
 all cases (hundred tiny fields, all changed,
  hundred tiny fields, half changed test were still slow).
 Patch and results are at link (refer approach-1):
 http://www.postgresql.org/message-id/001f01ce1c14$d3af0770$7b0d1650$@kap...@huawei.com

 What you did there will, I think, tend to miss a lot of compression
 opportunities.  Suppose for example that the old tuple is
 ABCDEFGHIJKLMNOP and the new tuple is xABCDEFGHIJKLMNOP.  After
 copying one literal byte we'll proceed to copy 9 more, missing the
 fact that there was a long match available after the first byte.

That is right, but one idea behind trying that out was to see if we could
reduce CPU usage at the cost of compression; however, we found that it
didn't completely eliminate that problem.

 The
 advantage of the fingerprinting technique is that it's supposed to be
 resistant to that sort of thing.

Okay, one question that arises here is whether it can be better in terms of
CPU usage compared to when we used the hash function for every 10th byte.
If you feel it can improve the situation, I can try a prototype
implementation to check the results.


 Now the tough question is what are the possible options for this patch
 and which one to pick:
 a. optimize encoding technique, so that it can improve results in most
 cases even if not all.
 b. have a table level option or guc to enable/disable WAL compression.
 c. use some heuristics to check if chances of compression are good,
 then only perform compression.
 1. apply this optimization for tuple size > 128 and < 2000
 2. apply this optimization if number of modified columns are less
 than 25% (some threshold number) of total columns.
 I think we can get modified columns from target entry and use
 it if triggers haven't changed that tuple. I remember
 earlier there were concerns that this value can't be trusted
 completely, but I think using it as a heuristic is not a
 problem, even if this number is not right in some cases.
 d. forget about this optimization and reject the patch.
 I think by doing option 'b' and 'c' together can make this
 optimization usable in cases where it is actually useful.
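
For illustration, the heuristic in option (c) could be as simple as something
like this (a sketch only; the thresholds are the ones floated above, and the
function name is made up, not part of any actual patch):

/*
 * Sketch of option (c): only attempt WAL delta compression when the new
 * tuple is mid-sized and only a minority of its columns were modified.
 */
static bool
wal_update_compression_worthwhile(Size new_tup_len,
                                  int changed_cols, int total_cols)
{
    if (new_tup_len <= 128 || new_tup_len >= 2000)
        return false;
    return changed_cols * 4 < total_cols;   /* less than 25% of columns */
}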

 I agree that we probably want to do (b), and I suspect we want both a
 GUC and a reloption, assuming that can be done relatively cleanly.

 However, I think we should explore (a) more before we explore (c).

Sure, but the scope of exploring (a) is a bit bigger. We have the following
options for (a):
1. try to optimize the existing algorithm as used in the patch, which we have
tried, though of course we can spend some more time to see if anything more
can be tried out.
2. try fingerprint technique as suggested by you above.
3. try some other standard methods like vcdiff, lz4 etc.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [HITB-Announce] #HITB2014AMS Call for Papers Now Open

2013-11-27 Thread Hafez Kamal

Hi everyone - The Call for Papers for the 5th annual HITB Security
Conference in Amsterdam is now open. #HITB2014AMS takes place at the
Beurs van Berlage from the 27th - 30th of May 2014. The official
conference hotel for the event is the Hilton DoubleTree.

As always we start with 2-days of hands on technical trainings followed
by a 2-day triple track conference. However, 2014 also sees the
introduction of a brand new addition to the HITB line up - the HITB
Haxpo or hacker expo!

With support and backing from iAmsterdam, HITB Haxpo is going to be
unlike anything that's been done before. Think IT security exhibition
meets Makerfaire with a generous touch of HITBSecConf flavor thrown in
for good measure.

What exactly does that mean? Imagine an area dedicated to hackerspaces;
makers with 3D printers, laser cutters and other fabrication goodies
coupled with TOOOL's Lock Picking Village, HITB and Mozilla's
HackWEEKDAY developer hackathon, a hackerspaces challenge featuring LEGO
Mindstorms EV3, our Capture the Flag 'live hacking' competition and
topped off by a 3 day IT exhibition featuring Microsoft and Google as
the main anchors. The best part? Entrance to the 3-day Haxpo is
COMPLETELY FREE!

As 2014 marks our 5th year anniversary of HITB in EU, #HITB2014AMS will
be a very special edition conference with an ALL WOMEN keynote line up:

Opening Keynote (29th May 2014):


Katie Moussouris (Lead, Microsoft Security Response Center)
Kristin Lovejoy (GM, IBM Security Services Division)

Closing Keynote (29th May 2014):
===

Mischel Kwon (Former Director, US Computer Emergency Response Team)

Opening Keynote (30th May 2014):


Pamela Fusco (Chief Information Security Officer, Apollo Group)
Jennifer Steffens (Chief Executive Officer, IOActive)

Closing Keynote (30th May 2014):


Jaya Baloo (Chief Information Security Officer, KPN Telecom)

---

We're looking for talks that are highly technical, but most importantly,
material which is new, fresh and hasn't been presented previously.

Submission Deadlines:

   Round #1 selection: 1st week January 2014
   Round #2 selection: 1st Week of February 2014


CFP submissions will be evaluated in 2 rounds. If all slots are filled
in the first selection round, we will close CFP early so DON'T DELAY
SUBMITTING

HITB CFP: http://cfp.hackinthebox.org/

Event Website: http://haxpo.nl || http://conference.hitb.org
(Opens 1st Dec)

===

Each accepted submission will entitle the speaker(s) to
accommodation for 3 nights / 4 days and travel expense reimbursement
up to EUR1200.00 per speaking slot.

Topics of interest include, but are not limited to the following:

  Cloud Security
  File System Security
  3G/4G/WIMAX Security
  SS7/GSM/VoIP Security
  Security of Medical Devices
  Critical Infrastructure Security
  Smartphone / MobileSecurity
  Smart Card and Physical Security
  Network Protocols, Analysis and Attacks
  Applications of Cryptographic Techniques
  Side Channel Analysis of Hardware Devices
  Analysis of Malicious Code / Viruses / Malware
  Data Recovery, Forensics and Incident Response
  Hardware based attacks and reverse engineering
  Windows / Linux / OS X / *NIX Security Vulnerabilities
  Next Generation Exploit and Exploit Mitigation Techniques
  NFC, WLAN, GPS, HAM Radio, Satellite, RFID and Bluetooth Security

WHITE PAPER: If your presentation is short listed for inclusion into the
conference program, a technical white paper must also be provided for
review (3000 - 5000 words).

Your submissions will be reviewed by The HITB CFP Review Committee:

Charlie Miller (formerly Principal Research Consultant, Accuvant Labs)
Katie Moussouris, Senior Security Strategist, Microsoft
Itzik Kotler, Chief Technology Officer, Security Art
Cesar Cerrudo, Chief Technology Officer, IOActive
Jeremiah Grossman, Founder, Whitehat Security
Andrew Cushman, Senior Director, Microsoft
Saumil Shah, Founder & CEO, Net-Square
Thanh 'RD' Nguyen, THC, VNSECURITY
Alexander Kornburst, Red Database
Fredric Raynal, QuarksLab
Shreeraj Shah, Founder, BlueInfy
Emmanuel Gadaix, Founder, TSTF
Andrea Barisani, Inverse Path
Philippe Langlois, TSTF
Ed Skoudis, InGuardians
Haroon Meer, Thinkst
Chris Evans, Google
Raoul Chiesa, TSTF/ISECOM
rsnake, SecTheory
Gal Diskin, Intel
Skyper, THC

Note: We do not accept product or vendor related pitches. If you would
like to showcase your company's products or technology at HITB Haxpo
(which also has its own set of speaking slots), please email
i...@haxpo.nl or conferencei...@hackinthebox.org

Regards,
Hafez Kamal
Hack in The Box (M) Sdn. Bhd
36th Floor, Menara Maxis
Kuala Lumpur City Centre
50088 Kuala Lumpur, Malaysia
Tel: +603-26157299
Fax: +603-26150088



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Name type in postgres

2013-11-27 Thread Tom Lane
mohsen soodkhah mohammadi mohsensoodk...@gmail.com writes:
 I want to understand: can we have a Name without a null terminator?

No.  Possibly that was the original intention, but for many years the
system has enforced that Name contains at least one trailing null byte, ie
the maximum usable length of identifiers is NAMEDATALEN-1.  So the code
you quote is valid.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Merlin Moncure
On Wed, Nov 27, 2013 at 4:20 AM, Shigeru Hanada
shigeru.han...@gmail.com wrote:
 Hi Merlin,

 2013/11/22 Merlin Moncure mmonc...@gmail.com:
 On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
 shigeru.han...@gmail.com wrote:
 2013/11/22 Tom Lane t...@sss.pgh.pa.us:
 Merlin Moncure mmonc...@gmail.com writes:
 On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian br...@momjian.us wrote:
 I know join pushdowns seem insignificant, but it helps to restrict what
 data must be passed back because you would only pass back joined rows.

 By 'insignificant' you mean 'necessary to do any non-trivial real
 work'.   Personally, I'd prefer it if FDW was extended to allow
 arbitrary parameterized queries like every other database connectivity
 API ever made ever.

 [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
 following the SQL standard's SQL/MED design, which does not do it
 like that.

 Pass-through mode mentioned in SQL/MED standard might be what he wants.

 happen to have a link handy?

 SQL/MED standard doesn't say much about PASS THROUGH mode, especially
 about interaction between client.  Besides it, I think it would be
 nice to allow arbitrary FDW as backend of dblink interface like this:

 postgres= SELECT dblink_connect('con1', 'server name of an FDW');
 postgres= SELECT * FROM dblink('con1', 'some query written in remote
 syntax') as t(/* record type definition */...);

 This provides a way to execute query without defining foreign table.

yeah.  (thanks for indulging -- this is barely on topic I guess).

if it were possible to create a supporting function (say, fdw_link)
that could somehow interface with a previously established server, it
could probably be worked out.   Then all FDW could leverage
parameterization without having to copy and paste the pgsql-fdw qual
push code.  But that would be a fairly large break from the rest of
the FDW syntax and having to define the record at each call site is
admittedly a bit of a headache.

Hm, another way to think about this would be to somehow abstract the
qual push into a library so that it could be accessed by other FDWs if
they opted in.  This would address my chief complaint that only the
pgsql-fdw (the only database for which we already have an in-core high
quality connection api) driver could tap the excellent work you've
done.  If this were even possible, it would probably result in more
fdw API changes.

If my:

SELECT * FROM big_sql_server_foreign_table WHERE id = x;

was fast, that'd be pretty nice.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Shigeru Hanada
2013/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Seems to me that if you want to read remote tables without creating a
 foreign table, you could define them locally using something like the
 WITH syntax and then use them normally in the rest of the query.

 I guess what's needed here is a kind of barrier that allows pushing a
 whole arbitrary subquery (with joins and quals and whatnot) down to the
 remote side.

Yes, a big problem is how to skip parsing the remote query in the PG context.
A bare query string (other than a string literal) is always parsed by the PG
parser, but the remote side would have different syntax and semantics; as
Dimitri says, we need to pass the whole arbitrary query string to the remote
side as-is.

 My current thinking about how to solve that would be to add a notion of
 FOREIGN VIEW in our system, which would basically implement that barrier
 and send the view definition on the remote, with known quals values as
 constants, or something like that.

I'm sorry but I don't see the point here.  Do you mean that user
executes CREATE FOREIGN VIEW in advance and uses the view in a
subsequent query? Or, allow new syntax like WITH alias AS FOREIGN VIEW
(remote query)?

I think it's nice to support executing ad-hoc remote query written in
the syntax which is valid only on remote data source through FDW, and
at the moment dblink interface seems feasible for that purpose.

-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Boszormenyi Zoltan

On 2013-11-23 22:01, Tom Lane wrote:

Boszormenyi Zoltan z...@cybertec.at writes:

Attached is the patch that modified the command tag returned by
the DECLARE CURSOR command. It returns DECLARE SCROLL CURSOR
or DECLARE NO SCROLL CURSOR depending on the cursor's
scrollable flag that can be determined internally even if neither is
asked explicitly.

This does not strike me as an acceptable change.  It will break any code
that's expecting the existing command tag, for little or no benefit
to most applications.  Even if backwards compatibility were of no concern,
I'm not convinced it's a good thing to expose the backend's internal
choices about query plans used for cursors, which is what this is
basically doing.


I saw code in the backend allowing a cursor to be scrollable, although
it was not declared as such. How about ripping that out?

That way there would be no incentive for lazy SQL coding using simple cursors.

You can argue that it would also break application compatibility, but on
the other hand such code has always been buggy and should be fixed.


It is expected by the ECPG cursor readahead code.

And that doesn't sound like a sufficient excuse.  You should only assume a
cursor is scrollable if SCROLL was specified in the cursor declaration
command, which it'd seem to me is something ECPG would or easily could
know about commands it issues.


Yes, it can and I have a patch in the series passing this info to ecpglib.

I am also arguing for backward compatibility from a different angle:
this small backend change would still allow using simple cursors
in ECPG while using the cursor readahead.

And it's not the first time drivers have to adapt to new PostgreSQL major 
versions.
If it was, I wouldn't have the courage to set a precedent either.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 06:23:38AM -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
 
  I happened to build in a shell that was still set up for the clang
  address sanitizer, and got the attached report.  On a rerun it was

(Kevin, I saw no attachment.)

  repeatable.  XLogInsert() seems to read past the end of a variable
  allocated on the stack in doPickSplit(). I haven't tried to analyze
  it past that, since this part of the code is unfamiliar to me.
 
  Yea, I've seen that one before as well and planned to report it at some
  point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
  up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
  toDelete.
 
 Have you established whether having the CRC calculation read past
 the end of the buffer can cause problems on recovery or standby
 systems?  Should we try to get this fixed by Monday?

The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of toDelete cannot
cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic.  We should fix
this in due course on code hygiene grounds, but I would not back-patch it.
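
For concreteness, a tiny standalone illustration of the rounding involved
(the macro bodies follow the usual definitions in c.h, assuming
MAXIMUM_ALIGNOF is 8):

#include <stdint.h>
#include <stdio.h>

/* essentially the definitions in src/include/c.h, with MAXIMUM_ALIGNOF == 8 */
#define TYPEALIGN(ALIGNVAL, LEN) \
    (((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))
#define MAXALIGN(LEN) TYPEALIGN(8, (LEN))

int
main(void)
{
    /*
     * A 2-byte slop array registered as MAXALIGN(2) = 8 bytes of rdata is
     * read as 8 bytes by the CRC code: 6 bytes past the variable here, and
     * never more than MAXIMUM_ALIGNOF - 1 = 7 bytes past it in general.
     */
    printf("MAXALIGN(2) = %zu, MAXALIGN(9) = %zu\n",
           (size_t) MAXALIGN(2), (size_t) MAXALIGN(9));
    return 0;
}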

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Andres Freund wrote:

 This seems simple to handle by adding the check you propose to the loop.
 Basically if the xmax doesn't match the xmin, we reached the end,
 there's nothing more to lock and we can return success without any
 further work:

As mentioned in the thread for bug #8434, the complete working patch for
this is attached.

  b) Check whether a chain element actually aborted - currently we're
 only doing that in the HEAP_KEYS_UPDATED updated case, but that seems
 wrong (we can't check for committed tho!).
 
 Let me point out that this is exactly the same code that would be
 affected by my proposed fix for #8434, which would have this check the
 updateXid in all cases, not only in KEYS_UPDATED as currently.

I posted a patch for this problem in the thread about #8434.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
commit 1a4cadc729e724fa1aa6f260a4988b3615fccd1b
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Wed Nov 27 12:12:42 2013 -0300

Compare new tuple's Xmin to previous Xmax while following an update chain

Not doing so causes us to traverse an update chain that has been broken
by concurrent page pruning.  All other code that traverses update chains
is careful to do this check, so replicate it here too.  Failure to do so
leads to erroneous CLOG, subtrans or multixact lookups.

Per bug report by J Smith in
cadfupgc5bmtv-yg9znxv-vcfkb+jprqs7m2oesqxam_4z1j...@mail.gmail.com
diagnosed by Andres Freund.

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c3b2108..5eb45ff 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4819,6 +4819,7 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
 old_infomask;
 	TransactionId xmax,
 new_xmax;
+	TransactionId priorXmax = InvalidTransactionId;
 
 	ItemPointerCopy(tid, &tupid);
 
@@ -4844,6 +4845,18 @@ l4:
 		CHECK_FOR_INTERRUPTS();
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 
+		/*
+		 * Check the tuple XMIN against prior XMAX, if any.  If we reached
+		 * the end of the chain, we're done, so return success.
+		 */
+		if (TransactionIdIsValid(priorXmax) &&
+			!TransactionIdEquals(HeapTupleHeaderGetXmin(mytup.t_data),
+ priorXmax))
+		{
+			UnlockReleaseBuffer(buf);
+			return HeapTupleMayBeUpdated;
+		}
+
 		old_infomask = mytup.t_data->t_infomask;
 		xmax = HeapTupleHeaderGetRawXmax(mytup.t_data);
 
@@ -4944,6 +4957,7 @@ l4:
 		}
 
 		/* tail recursion */
+		priorXmax = HeapTupleHeaderGetUpdateXid(mytup.t_data);
 		ItemPointerCopy(&(mytup.t_data->t_ctid), &tupid);
 		UnlockReleaseBuffer(buf);
 	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Shigeru Hanada shigeru.han...@gmail.com writes:
 I'm sorry but I don't see the point here.  Do you mean that user
 executes CREATE FOREIGN VIEW in advance and uses the view in a

Yes that's what I mean.

 I think it's nice to support executing ad-hoc remote query written in
 the syntax which is valid only on remote data source through FDW, and
 at the moment dblink interface seems feasible for that purpose.

I guess the view query would have to be validated by the FDW, which
would just receive a text.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock on object is already held

2013-11-27 Thread Tom Lane
Daniel Wood dw...@salesforce.com writes:
 Does the original version of my stress test not repro the problem on 9.2?

[ tries it ... ]  No, it doesn't, or at least the MTBF is a couple orders
of magnitude better than on 9.3.

Another odd thing (seen with my short version as well as your original)
is that 9.3/HEAD run the test case enormously faster than 9.2 and 9.1
do.  The older versions seem to spend a lot of time sleeping, which
I don't understand.

 Why does LockAcquireExtended() test for nLocks == 0 in the if
 (dontWait) block before calling RemoveLocalLock()?

Looks like a useless test to me --- we wouldn't be here at all if nLocks
had been positive to start with, and there's nothing in between that
could raise the count.  On the other hand, removing a LOCALLOCK that
did have positive count would be disastrous.  Maybe what would be
more appropriate is an Assert(nLocks == 0) in RemoveLocalLock().
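
For illustration, the suggested assertion would amount to something like this
(a sketch only; the rest of the function's cleanup is elided as a comment):

static void
RemoveLocalLock(LOCALLOCK *locallock)
{
    /* a LOCALLOCK that is still held must never be removed */
    Assert(locallock->nLocks == 0);

    /* ... existing removal of the LOCALLOCK hash table entry ... */
}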

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
 I guess the view query would have to be validated by the FDW, which
 would just receive a text.

+1

This is not exactly in context, but I and David Fetter discussed
recently how we could do a similar thing.

This would work, but how can we do it for FDWs which do not parse SQL?

Am I missing something here?

Regards,

Atri


-- 
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Atri Sharma atri.j...@gmail.com writes:
 This would work,but how can we do it for FDWs which do not parse SQL?
 Am I missing something here?

Worst case:

  CREATE FOREIGN VIEW foo
  AS $$
whatever syntax is accepted on the other side
  $$;

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
On Wed, Nov 27, 2013 at 11:08 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Atri Sharma atri.j...@gmail.com writes:
 This would work,but how can we do it for FDWs which do not parse SQL?
 Am I missing something here?

 Worst case:

   CREATE FOREIGN VIEW foo
   AS $$
 whatever syntax is accepted on the other side


That doesn't sound like a very good idea.

Can we add a function to the FDW API to define an SQL-to-foreign-server-side
conversion?

I am just musing though.

Regards,

Atri



-- 
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Jeff Janes
On Wed, Nov 13, 2013 at 8:49 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 Here's another part of my crusade against xlog cleanup routines. This
 series of patches gets rid of the gin_cleanup() function, which is
 currently used to finish splits of GIN b-tree pages, if the system crashes
 (or an error occurs) between splitting a page and inserting its downlink to
 the parent.

 The first three patches just move code around. IMHO they make the code
 more readable, so they should be committed in any case. The meat is in the
 fourth patch.

 Thoughts, objections?

 Alexander, I'm sorry if this conflicts with your GIN patches. Feel free to
 post the latest versions of your patches against the current master,
 ignoring patches. I can fix the bitrot. That said, I think these
 refactorings will make your code look a little bit nicer too, so you might
 want to rebase because of that anyway.


Hi Heikki,

The commit 04eee1fa9ee80dabf7 of this series causes a self-deadlock in the
LWLock code during the operation below, with it trying to take
an LW_EXCLUSIVE on a high, even-numbered lockid when it already holds the
same lockid.

CREATE INDEX planet_osm_ways_nodes ON planet_osm_ways USING gin (nodes)
 WITH (FASTUPDATE=OFF);

It happens pretty reliably using osm2pgsql.

I will try to come up with a simple reproducible demonstration, and stack
trace, over the weekend.

Cheers,

Jeff


Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Heikki Linnakangas

On 11/22/13 15:04, Michael Paquier wrote:

2) post recovery cleanup:
- OK, so roughly the soul of this patch is to change the update
mechanism for a left child gin page so that the parent split is always
done first before any new data is inserted in this child. And this
ensures that we can remove the xlog cleanup mechanism for gin page
splits in the same fashion as gist... xlog redo mechanism is then
adapted according to that.



- I did some tests with the patch:
-- Index creation time
vanilla: 3266.834
with the two patches: 3412.473 ms


Hmm. I didn't expect any change in index creation time. Is that 
repeatable, or within the margin of error?



2-1) In ginFindParents, is the case where the stack has no parent
possible (aka the stack is the root itself)? Shouldn't this code path
check if root is NULL or not?


No. A root split doesn't need to find the parent (because there isn't 
any), so we never call ginFindParents with a stack containing just the 
root. If you look at the only caller of ginFindParents, it's quite clear 
that stack-parent always exists.



2-2) Not sure that this structure is in-line with the project policy:
struct
{
BlockNumber left;
BlockNumber right;
} children;
Why not adding a complementary structure in gin_private.h doing that?
It could be used as well in ginxlogSplit to specify a left/right
family of block numbers.


I don't think we have a project policy against in-line structs. Might be 
better style to not do it, anyway, though.


Come to think of it, that children variable mustn't be declared inside 
the if-statement; it's no longer in scope when the XLogInsert is done, 
so children was pointing to a potentially uninitialized piece of 
stack. Valgrind would've complained.


I ended up using BlockIdData[2] for that.

Committed, thanks for the review!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 3. Establish a coding rule that if you catch an error with
 PG_TRY() and don't re-throw it, you have to unblock signals in
 your PG_CATCH block.

 Could that be done in the PG_END_TRY macro?

Interesting idea ... [ thinks for a bit ... ] but I'm not sure it's really
a great fix.  PG_END_TRY would have to unblock *all* signals, I think,
and it's not clear you want that in that mechanism.  Consider a PG_TRY
executing inside a signal handler.  It would be semantically reasonable
to use sigsetjmp(foo,1) in PG_TRY, but as I said, I'm trying to avoid
that on performance grounds.

In any case, after sleeping on it I've realized that the HOLD_INTERRUPTS
dance I proposed adding to handle_sig_alarm() is necessary but not
sufficient to deal with interruptions from SIGINT.  The real reason that
we need HOLD_INTERRUPTS there is that the signal handler may be executing
while ImmediateInterruptsOK is true.  So if a random SIGINT happened to
arrive while we're updating the timeout data structures, we could lose
control and leave partially updated (ie, totally corrupted) timeout info.
Adding HOLD_INTERRUPTS prevents that case.  But, imagine that SIGINT
happens just as we enter handle_sig_alarm(), before we can do
HOLD_INTERRUPTS.  Then the SIGINT handler could longjmp out of the SIGALRM
handler; the data structures are OK, but we've effectively lost that
interrupt, and no new one is scheduled.  This isn't much of a problem
for the existing timer event handlers, because we'd just cancel them
anyway on the way out of the (sub)transaction, cf LockErrorCleanup.
But it could be a problem for future usages of the timer infrastructure.
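
For illustration, the necessary-but-not-sufficient HOLD_INTERRUPTS dance
would look roughly like this (a sketch only; the real handle_sig_alarm()
does more, and the helper name is invented):

static void
handle_sig_alarm(SIGNAL_ARGS)
{
    int     save_errno = errno;

    /*
     * Keep a SIGINT handler from longjmp'ing away while the timeout data
     * structures are being modified; otherwise an interrupt arriving
     * mid-update could leave them corrupted.
     */
    HOLD_INTERRUPTS();

    service_expired_timeouts();     /* hypothetical: run handlers, reschedule */

    RESUME_INTERRUPTS();

    errno = save_errno;
}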

In the case where we longjmp all the way out to the idle loop, we cancel
all pending timeouts anyway in postgres.c's sigsetjmp cleanup stanza.
It would be reasonable to add a PG_SETMASK(&UnBlockSig) in the same area.

I am thinking that what we want to do is add a signal unblock step to
(sub)transaction abort, too, as well as a call to timeout.c to let it
re-establish its timer interrupt in case one got lost.  This would fix
the problem as long as you assume that anyone catching an arbitrary error
condition will do a subtransaction abort to get back into a good state.
But we're requiring that already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 On 2013-11-23 22:01, Tom Lane wrote:
 Boszormenyi Zoltan z...@cybertec.at writes:
 Attached is the patch that modified the command tag returned by
 the DECLARE CURSOR command. It returns DECLARE SCROLL CURSOR
 or DECLARE NO SCROLL CURSOR depending on the cursor's
 scrollable flag that can be determined internally even if neither is
 asked explicitly.

 This does not strike me as an acceptable change.  It will break any code
 that's expecting the existing command tag, for little or no benefit
 to most applications.  Even if backwards compatibility were of no concern,
 I'm not convinced it's a good thing to expose the backend's internal
 choices about query plans used for cursors, which is what this is
 basically doing.

 I saw code in the backend allowing a cursor to be scrollable, although
 it was not declared as such. How about ripping that out?

That also fails the unnecessary-backwards-compatibility-break test.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:09 AM, Peter Geoghegan p...@heroku.com wrote:
 Since it took me a relatively long time to recreate this, it may not
 be trivial to do so. Unless you don't think it's useful to do so, I'm
 going to give this test a full 24 hours, just in case it shows up
 anything else like this.

I see a further, distinct error message this morning:

ERROR:  unrecognized heap_lock_tuple status: 1

This is a would-be "attempted to lock invisible tuple" error, but with
the error raised by some heap_lock_tuple() call site, unlike the
previous situation where heap_lock_tuple() raised the error directly.
Since with the most recent revision, we handle this (newly possible)
return code in the new ExecLockHeapTupleForUpdateSpec() function, that
just leaves EvalPlanQualFetch() as a plausible place to see it, given
the codepaths exercised in the test case.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread David Fetter
On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
 Shigeru Hanada wrote:
 
  SQL/MED standard doesn't say much about PASS THROUGH mode, especially
  about interaction between client.  Besides it, I think it would be
  nice to allow arbitrary FDW as backend of dblink interface like this:
  
  postgres= SELECT dblink_connect('con1', 'server name of an FDW');
  postgres= SELECT * FROM dblink('con1', 'some query written in remote
  syntax') as t(/* record type definition */...);
  
  This provides a way to execute query without defining foreign table.
 
 Seems to me that if you want to read remote tables without creating a
 foreign table, you could define them locally using something like the
 WITH syntax and then use them normally in the rest of the query.

WITH, or SRF, or whatever, the point is that we need to be able to
specify what we're sending--probably single opaque strings delimited
just as we do other strings--and what we might get back--errors only,
rows, [sets of] refcursors are the ones I can think of offhand.

What we can't do is assume that our parser needs to, or even could, in
principle, understand these things in more detail than that.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread AK
I am not sure if I am posting in the right place - correct me if I am wrong.
The following is not precise:

13.2.1. Read Committed Isolation Level

Also note that two successive SELECT commands can see different data, even
though they are within a single transaction, if other transactions commit
changes during execution of the first SELECT.

I think it should be re-worded as follows

Also note that two successive SELECT commands can see different data, even
though they are within a single transaction, if other transactions commit
after the first SELECT starts, and before the second SELECT starts.

The reason: there could be other DML running between these two SELECTs.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Should-we-improve-documentation-on-isolation-levels-tp5780629.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Kevin Grittner
Noah Misch n...@leadboat.com wrote:

 (Kevin, I saw no attachment.)

Apologies.  Trying again.

 The threat is that rounding the read size up to the next MAXALIGN
 would cross into an unreadable memory page, resulting in a
 SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
 so the excess read of toDelete cannot cause a SIGSEGV.  For a
 stack variable, it depends on the ABI.  I'm not aware of an ABI
 where the four bytes past the end of this stack variable could be
 unreadable, which is not to claim I'm well-read on the topic.  We
 should fix this in due course on code hygiene grounds, but I
 would not back-patch it.

If you're sure.  I hadn't worked through the code, but had two
concerns (neither of which was about a SEGSEGV):

(1)  That multiple MAXALIGNs of shorter values could push the
structure into overlap with the next thing on the stack, allowing
one or the other to get stepped on.

(2)  That the CRC calculation might be picking up uninitialized data
which was not actually going to match what was used during
recovery, leading to end of recovery on replay.

If you are confident that neither of these is a real risk, I'll
relax about this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

==7232==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7fffbcb2e834 at pc 0xef0335 bp 0x7fffbcb29b30 sp 0x7fffbcb29b28
READ of size 1 at 0x7fffbcb2e834 thread T0
#0 0xef0334 in XLogInsert 
/home/kgrittn/pg/master/src/backend/access/transam/xlog.c:1040
#1 0xd53792 in doPickSplit 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:1391
#2 0xd0d9f1 in spgdoinsert 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:2008
#3 0xcbb4d1 in spginsert 
/home/kgrittn/pg/master/src/backend/access/spgist/spginsert.c:238
#4 0x46f9de3 in FunctionCall6Coll 
/home/kgrittn/pg/master/src/backend/utils/fmgr/fmgr.c:1436
#5 0xad43fb in index_insert 
/home/kgrittn/pg/master/src/backend/access/index/indexam.c:223
#6 0x2122fcd in ExecInsertIndexTuples 
/home/kgrittn/pg/master/src/backend/executor/execUtils.c:1104
#7 0x228413f in ExecInsert 
/home/kgrittn/pg/master/src/backend/executor/nodeModifyTable.c:274
#8 0x227fba8 in ExecModifyTable 
/home/kgrittn/pg/master/src/backend/executor/nodeModifyTable.c:1014
#9 0x2026b03 in ExecProcNode 
/home/kgrittn/pg/master/src/backend/executor/execProcnode.c:377
#10 0x1fef534 in ExecutePlan 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:1474
#11 0x1fee488 in standard_ExecutorRun 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:308
#12 0x1fec7e9 in ExecutorRun 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:256
#13 0x34f05ab in ProcessQuery 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:185
#14 0x34e8c3a in PortalRunMulti 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:1279
#15 0x34e1b9c in PortalRun 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:816
#16 0x34b6721 in exec_simple_query 
/home/kgrittn/pg/master/src/backend/tcop/postgres.c:1054
#17 0x34b1420 in PostgresMain 
/home/kgrittn/pg/master/src/backend/tcop/postgres.c:3998
#18 0x2f1f925 in BackendRun 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:4085
#19 0x2f1b830 in BackendStartup 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:3774
#20 0x2efcc96 in ServerLoop 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:1585
#21 0x2ef13ee in PostmasterMain 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:1240
#22 0x24dc3c3 in main /home/kgrittn/pg/master/src/backend/main/main.c:196
#23 0x2b89d526e76c in __libc_start_main 
/build/buildd/eglibc-2.15/csu/libc-start.c:226
#24 0x4dc3cc in _start ??:?

Address 0x7fffbcb2e834 is located in stack of thread T0 at offset 2644 in frame
#0 0xd2b5ef in doPickSplit 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:680

  This frame has 57 object(s):
[32, 40) ''
[96, 104) ''
[160, 168) ''
[224, 232) ''
[288, 296) ''
[352, 356) ''
[416, 417) ''
[480, 481) ''
[544, 545) 'insertedNew'
[608, 632) 'in'
[672, 720) 'out'
[768, 776) 'procinfo'
[832, 833) 'includeNew'
[896, 900) 'i'
[960, 964) 'max'
[1024, 1028) 'n'
[1088, 1096) 'innerTuple'
[1152, 1160) 'node'
[1216, 1224) 'nodes'
[1280, 1284) 'newInnerBuffer'
[1344, 1348) 'newLeafBuffer'
[1408, 1416) 'heapPtrs'
[1472, 1480) 'leafPageSelect'
[1536, 1544) 'leafSizes'
[1600, 1608) 'toDelete'
[1664, 1672) 'toInsert'
[1728, 1730) 'redirectTuplePos'
[1792, 1796) 'startOffsets'
[1856, 1864) 'newLeafs'
[1920, 1924) 'spaceToDelete'
[1984, 1988) 'currentFreeSpace'
[2048, 2052) 'totalLeafSizes'
[2112, 2113) 'allTheSame'
[2176, 2496) 'rdata'
[2528, 2532) 'nRdata'
[2592, 2644) 'xlrec' == Memory access 

Re: [HACKERS] logical changeset generation v6.7

2013-11-27 Thread Fabrízio de Royes Mello
On Thu, Nov 14, 2013 at 11:46 AM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2013-11-12 18:50:33 +0100, Andres Freund wrote:
   You've actually changed the meaning of this section (and not in a
good way):
  
be set at server start. varnamewal_level/ must be set
   -to literalarchive/ or literalhot_standby/ to allow
   -connections from standby servers.
   +to literalarchive/, literalhot_standby/ or
literallogical/
   +to allow connections from standby servers.
  
   I think that the previous text meant that you needed archive - or, if
   you want to allow connections, hot_standby.  The new text loses that
   nuance.
 
  Yea, that's because it was lost on me in the first place...

 I think that's because the nuance isn't actually in the text - note that
 it is talking about max_wal_senders and talking about connections
 *from*, not *to* standby servers.
 I've reformulated the wal_level paragraph and used or higher in
 several places now.

 Ok, so here's a rebased version of this. I tried to fix all the issues
 you mentioned, and it's based on the split off IsSystemRelation() patch,
 I've sent yesterday (included here).


Hello,

I'm trying to apply the patches, but some warnings/errors show up:

$ gunzip -c
/home/fabrizio/Downloads/0002-wal_decoding-Add-wal_level-logical-and-log-data-requ.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0005-wal_decoding-Introduce-wal-decoding-via-catalog-time.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0006-wal_decoding-Implement-VACUUM-FULL-CLUSTER-support-v.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0007-wal_decoding-Only-peg-the-xmin-horizon-for-catalog-t.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0011-wal_decoding-pg_recvlogical-Introduce-pg_receivexlog.patch.gz
| git apply -
error: patch failed: src/bin/pg_basebackup/streamutil.c:210
error: src/bin/pg_basebackup/streamutil.c: patch does not apply

The others apply correctly. The permission warning must be fixed and
0011 must be rebased.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Boszormenyi Zoltan

On 2013-11-27 19:16, Tom Lane wrote:

Boszormenyi Zoltan z...@cybertec.at writes:

On 2013-11-23 22:01, Tom Lane wrote:

Boszormenyi Zoltan z...@cybertec.at writes:

Attached is the patch that modified the command tag returned by
the DECLARE CURSOR command. It returns DECLARE SCROLL CURSOR
or DECLARE NO SCROLL CURSOR depending on the cursor's
scrollable flag that can be determined internally even if neither is
asked explicitly.

This does not strike me as an acceptable change.  It will break any code
that's expecting the existing command tag, for little or no benefit
to most applications.  Even if backwards compatibility were of no concern,
I'm not convinced it's a good thing to expose the backend's internal
choices about query plans used for cursors, which is what this is
basically doing.

I saw code in the backend allowing a cursor to be scrollable, although
it was not declared as such. How about ripping that out?

That also fails the unnecessary-backwards-compatibility-break test.


If you read the rest of the mail, it turns out it wasn't a serious question.

Getting the SCROLL / NO SCROLL flags from the preprocessor is no problem.
The only problem is when it's unspecified.

Treating it as NO SCROLL (or adding it to the DECLARE command behind
the application's back) would break apps that want to scroll backward.
(Not ABSOLUTE.)

On the other hand, what problems would one face when adding SCROLL
implicitly if it's unspecified? It's not a workable solution either, see below.

As the documentation suggests, for an application that wants to use
UPDATE/DELETE WHERE CURRENT OF ..., it is highly recommended
that the cursor be for a FOR UPDATE query. Watch this scenario:

zozo= begin;
BEGIN
zozo= declare mycur cursor for select * from t1 for update;
DECLARE CURSOR
zozo= fetch from mycur;
 id | t
+---
  1 | a
(1 row)

zozo= fetch from mycur;
 id | t
+---
  2 | b
(1 row)

zozo= update t1 set t=t||'_x' where current of mycur;
UPDATE 1
zozo= fetch from mycur;
 id | t
+---
  3 | c
(1 row)

zozo= delete from t1 where current of mycur;
DELETE 1
zozo= move absolute 0 in mycur;
MOVE 0
zozo= fetch from mycur;
 id | t
+---
  1 | a
(1 row)

zozo= fetch from mycur;
 id | t
+---
(0 rows)

Although the server complains about MOVE BACKWARD, it does not
complain about MOVE ABSOLUTE, even though that is clearly moving
backward. The cursor position is tracked in the backend in a long
variable and it has not overflowed. This is also legacy behaviour;
changing it would break backward compatibility.

The other problem I see is with the documentation: it says that
the INSENSITIVE keyword is just a placeholder and all cursors are insensitive.
That is clearly false: moving back to the start, previously existing rows
won't show up again. It's not strictly a sensitive cursor either,
because the row with id=2 would show up with the new value of t.
This neither-sensitive-nor-insensitive behaviour is what the SQL
standard calls an asensitive cursor. It would be worth a doc change.
This is what's written in 9.3:


If the cursor's query includes FOR UPDATE or FOR SHARE, then returned rows are locked at 
the time they are first fetched, in the same way as for a regular SELECT 
http://www.postgresql.org/docs/9.3/interactive/sql-select.html command with these 
options. In addition, the returned rows will be the most up-to-date versions; therefore 
these options provide the equivalent of what the SQL standard calls a sensitive cursor. 
(Specifying INSENSITIVE together with FOR UPDATE or FOR SHARE is an error.)


( http://www.postgresql.org/docs/9.3/interactive/sql-declare.html )

However, if the cursor is declared without FOR UPDATE, the SCROLL
keyword (explicit, or implicit if the query is simple), scrolling
backward and DML with WHERE CURRENT OF are all allowed.
In this case the cursor is really insensitive: FETCH statements
after MOVE ABSOLUTE 0 return all rows with their original data.

This is just to show that adding SCROLL behind the application's
back is also pointless. If the query (which can also be a prepared
statement in ECPG) contains FOR UPDATE, adding SCROLL to the
DECLARE statement would make it fail.

If you consider all these:

- certain combinations of query and DECLARE stmt flags fail;
- adding NO SCROLL is breaking backward compatibility;
- the readahead code has to really know whether the cursor is
  scrollable so it can behave just like the server;

then returning the SCROLL / NO SCROLL flag in the command tag is
not a bad solution in my view. In fact, this was the only workable
solution I could come up with to make it work reliably when neither
SCROLL nor NO SCROLL is specified by the application.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread Kevin Grittner
AK alk...@gmail.com wrote:

 The following is not precise:

 13.2.1. Read Committed Isolation Level

 Also note that two successive SELECT commands can see different data, even
 though they are within a single transaction, if other transactions commit
 changes during execution of the first SELECT.

 I think it should be re-worded as follows

 Also note that two successive SELECT commands can see different data, even
 though they are within a single transaction, if other transactions commit
 after the first SELECT starts, and before the second SELECT starts.

 The reason: there could be other DML running between these two SELECTs.

That's a fair point.  I'll commit something to the master branch.
I don't think this rises to the level of a documentation bug that
needs to be back-patched.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Alvaro Herrera
Boszormenyi Zoltan wrote:

 If you consider all these:
 
 - certain combinations of query and DECLARE stmt flags fail;
 - adding NO SCROLL is breaking backward compatibility;
 - the readahead code has to really know whether the cursor is
   scrollable so it can behave just like the server;
 
 then returning the SCROLL / NO SCROLL flag in the command tag is
 not a bad solution in my view. In fact, this was the only workable
 solution I could come up with to make it work reliably when neither
 SCROLL nor NO SCROLL is specified by the application.

Would it work to have a function of some sort to which you give a cursor
name and it returns whether it is scrollable or not?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Noah Misch
How would you characterize the chances of this happening with default
*vacuum_freeze_*_age settings?  Offhand, it seems you would need to encounter
this bug during each of ~10 generations of autovacuum_freeze_max_age before
the old rows actually become invisible.

On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
 With regard to fixing things up, ISTM the best bet is heap_prune_chain()
 so far. That's executed by vacuum and by opportunistic pruning and we
 know we have the appropriate locks there. Looks relatively easy to fix
 up things there. Not sure if there are any possible routes to WAL log
 this but using log_newpage()?
 I am really not sure what the best course of action is :(

Maximizing detection is valuable, and the prognosis for automated repair is
poor.  I would want a way to extract tuples having xmin outside the range of
CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible page.  At
first, I supposed we could offer a tool to blindly freeze such tuples.
However, there's no guarantee that they are in harmony with recent changes to
the database; transactions that wrongly considered those tuples invisible may
have made decisions incompatible with their existence.  For example, reviving
such a tuple could violate a UNIQUE constraint if the user had already
replaced the missing row manually.  A module that offers SELECT * FROM
rows_wrongly_invisible('anytable') would aid manual cleanup efforts.
freeze_if_wrongly_invisible(tid) would be useful, too.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread AK
I concur - the documentation is not incorrect, it is just incomplete.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Should-we-improve-documentation-on-isolation-levels-tp5780629p5780636.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
On Thu, Nov 28, 2013 at 12:54 AM, David Fetter da...@fetter.org wrote:
 On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
 Shigeru Hanada wrote:

  SQL/MED standard doesn't say much about PASS THROUGH mode, especially
  about interaction between client.  Besides it, I think it would be
  nice to allow arbitrary FDW as backend of dblink interface like this:
 
  postgres= SELECT dblink_connect('con1', 'server name of an FDW');
  postgres= SELECT * FROM dblink('con1', 'some query written in remote
  syntax') as t(/* record type definition */...);
 
  This provides a way to execute query without defining foreign table.

 Seems to me that if you want to read remote tables without creating a
 foreign table, you could define them locally using something like the
 WITH syntax and then use them normally in the rest of the query.

 WITH, or SRF, or whatever, the point is that we need to be able to
 specify what we're sending--probably single opaque strings delimited
 just as we do other strings--and what we might get back--errors only,
 rows, [sets of] refcursors are the ones I can think of offhand.

+1

The input-output formats need to be defined clearly.

How about sending parse trees? Is it even possible?

 What we can't do is assume that our parser needs to, or even could, in
 principle, understand these things in more detail than that.

Agreed.

I wonder if it's possible to give this task to the FDW implementation
instead, and get the FDW to translate to the required format.




-- 
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 11:38:23AM -0800, Kevin Grittner wrote:
 Noah Misch n...@leadboat.com wrote:
  The threat is that rounding the read size up to the next MAXALIGN
  would cross into an unreadable memory page, resulting in a
  SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
  so the excess read of toDelete cannot cause a SIGSEGV.  For a
  stack variable, it depends on the ABI.  I'm not aware of an ABI
  where the four bytes past the end of this stack variable could be
  unreadable, which is not to claim I'm well-read on the topic.  We
  should fix this in due course on code hygiene grounds, but I
  would not back-patch it.
 
 If you're sure.  I hadn't worked through the code, but had two
 concerns (neither of which was about a SIGSEGV):
 
 (1)  That multiple MAXALIGNs of shorter values could push the
 structure into overlap with the next thing on the stack, allowing
 one or the other to get stepped on.

These are out-of-bounds reads only, not writes.  Also, the excess doesn't
accumulate that way; the code reads beyond any particular stack variable or
palloc chunk by no more than 7 bytes.

 (2)  That the CRC calculation might pick up uninitialized data
 which was not actually going to match what was used during
 recovery, leading to a premature end of recovery on replay.

The CRC calculation does pick up unspecified bytes, but we copy those same
bytes into the actual WAL record.  The effect is similar to that of
unspecified pad bytes in the middle of xlrec structures.

 If you are confident that neither of these is a real risk, I'll
 relax about this.

If there is a real risk, I'm not seeing it.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread Kevin Grittner
AK alk...@gmail.com wrote:

 the documentation is not incorrect, it is just incomplete.

Yeah, that's exactly how I saw it.  :-)

Docs changed on master only.  Thanks for the report!

--

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 If you consider all these:

 - certain combinations of query and DECLARE stmt flags fail;
 - adding NO SCROLL is breaking backward compatibility;
 - the readahead code has to really know whether the cursor is
scrollable so it can behave just like the server;

If you're claiming that readahead inside ECPG will behave absolutely
transparently in all cases, I think that's bogus anyway.  Consider for
example a query that will get a divide-by-zero error when it computes
the 110th row --- but the application stops after fetching 100 rows.
Everything's fine, until you insert some readahead logic.  Queries
containing volatile functions might also not be happy about readahead.

Given these considerations, I think it'd be better to allow explicit
application control over whether read-ahead happens for a particular
query.  And I have no problem whatsoever with requiring that the cursor
be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.

regards, tom lane




[HACKERS] bytea_ouput = escape vs encode(byte, 'escape')

2013-11-27 Thread Jim Nasby
I'm wondering why bytea_output = escape produces different output than 
encode(byte, 'escape') does. Is this intentional? If so, why?

cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
  cr  |  lf  
--+--
 \x0d | \x0a
(1 row)

cnuapp_prod@postgres=# set bytea_output = escape;
SET
cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
  cr  |  lf  
--+--
 \015 | \012
(1 row)

cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr, 
encode(e'\n'::bytea, 'escape') AS lf;
 cr | lf 
+
 \r |   +
| 
(1 row)

cnuapp_prod@postgres=# 
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Robert Haas
On Tue, Nov 26, 2013 at 5:02 PM, Bruce Momjian br...@momjian.us wrote:
 On Tue, Nov 26, 2013 at 01:58:04PM -0300, Alvaro Herrera wrote:
 Bruce Momjian escribió:
  On Tue, Nov 26, 2013 at 11:22:39AM -0500, Tom Lane wrote:

Uh, I ended up mentioning no effect to highlight it does nothing,
rather than mention a warning.  Would people prefer I say warning?  
Or
should I say issues a warning because it has no effect or something?
It is easy to change.
  
   I'd revert the change Robert highlights above.  ISTM you've changed the
   code to match the documentation; why would you then change the docs?
 
  Well, I did it to make it consistent.  The question is what to write for
  _all_ of the new warnings, including SET.  Do we say warning, do we
  say it has no effect, or do we say both?  The ABORT is a just one case
  of that.

 Maybe it emits a warning and otherwise has no effect?  Emitting a
 warning is certainly not doing nothing; as has been pointed out in the
 SSL renegotiation thread, it might cause the log to fill disk.

 OK, doc patch attached.

Seems broadly reasonable, but I'd use "no other effect" throughout.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread David Fetter
On Thu, Nov 28, 2013 at 01:29:46AM +0530, Atri Sharma wrote:
 On Thu, Nov 28, 2013 at 12:54 AM, David Fetter da...@fetter.org wrote:
  On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
  Shigeru Hanada escribió:
 
   SQL/MED standard doesn't say much about PASS THROUGH mode, especially
   about interaction between client.  Besides it, I think it would be
   nice to allow arbitrary FDW as backend of dblink interface like this:
  
   postgres= SELECT dblink_connect('con1', 'server name of an FDW');
   postgres= SELECT * FROM dblink('con1', 'some query written in remote
   syntax') as t(/* record type definition */...);
  
   This provides a way to execute query without defining foreign table.
 
  Seems to me that if you want to read remote tables without creating a
  foreign table, you could define them locally using something like the
  WITH syntax and then use them normally in the rest of the query.
 
  WITH, or SRF, or whatever, the point is that we need to be able to
  specify what we're sending--probably single opaque strings delimited
  just as we do other strings--and what we might get back--errors only,
  rows, [sets of] refcursors are the ones I can think of offhand.
 
 +1
 
 The input-output formats need to be defined clearly.
 
 How about sending parse trees? Is it even possible?

I don't see why parse trees wouldn't be something that could
eventually be sent to other PostgreSQL servers, but I see that whole
discussion as orthogonal to this one.

My point here is that there needs to be an escape to native system
available in SQL so people can communicate directly with the remote
systems in the systems' own languages.  It's a little bit analogous to
making assembler available from C, or C from HLLs.

  What we can't do is assume that our parser needs to, or even could, in
  principle, understand these things in more detail than that.
 
 Agreed.
 
 I wonder if it's possible to give this task to the FDW implementing
 authority instead, and get FDW to translate to the required format.

I don't know that the FDW would necessarily need to get involved
except in the sense of full recognition before processing.
http://langsec.org/occupy/

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] bytea_ouput = escape vs encode(byte, 'escape')

2013-11-27 Thread David Johnston
Jim Nasby-2 wrote
 I'm wondering why bytea_output = escape produces different output than
 encode(byte, 'escape') does. Is this intentional? If so, why?
 
 cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
   cr  |  lf  
 --+--
  \x0d | \x0a
 (1 row)
 
 cnuapp_prod@postgres=# set bytea_output = escape;
 SET
 cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
   cr  |  lf  
 --+--
  \015 | \012
 (1 row)
 
 cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr,
 encode(e'\n'::bytea, 'escape') AS lf;
  cr | lf 
 +
  \r |   +
 | 
 (1 row)
 
 cnuapp_prod@postgres=# 

encode takes a bytea and provides what it would be as a text (using the
specified encoding to perform the conversion).

The bytea output examples are simply output of the contents of the
byte-array without any supposition as to what those bytes represent.  It is
strictly a serialization format and not an encoding/decoding of the
contents.

In this example the two functions are acting as paired input/output.

I'm thinking the direction you are assuming from the word "encode" is
confusing you - as it did me at first.

David J.








Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Peter Eisentraut
On 11/27/13, 2:49 PM, Alvaro Herrera wrote:
 Would it work to have a function of some sort to which you give a cursor
 name and it returns whether it is scrollable or not?

That might make sense.  I think this case is similar to the question
whether a view is updatable.  You wouldn't put that information in the
CREATE VIEW command tag.





Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Peter Eisentraut
On 11/27/13, 3:47 PM, Tom Lane wrote:
 Given these considerations, I think it'd be better to allow explicit
 application control over whether read-ahead happens for a particular
 query.  And I have no problem whatsoever with requiring that the cursor
 be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.

Well, technically, unspecified means NO SCROLL according to the SQL
standard.  A lot of applications in ECPG are ported from other systems,
which might make that assumption.  It wouldn't be very nice to have to
change all that.





Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-11-27 Thread Bruce Momjian
On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
 Amit Kapila wrote:
  On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
 
  Surely VACUUM FULL should rebuild the visibility map, and make
  tuples in the new relation all-visible, no?
 
 Certainly it seems odd to me that VACUUM FULL leaves the table
 in a less-well maintained state in terms of visibility than a
 normal vacuum. VACUUM FULL should not need to be followed by
 another VACUUM.
 
  I think it cannot be made all-visible.
 
 I don't think all tuples in the relation are necessarily visible to
 all transactions, but the ones which are should probably be flagged
 that way.

I have developed the attached proof-of-concept patch to fix the problem
of having no visibility map after CLUSTER or VACUUM FULL.  I tested with
these queries:

CREATE TABLE test(x INT PRIMARY KEY);
INSERT INTO test VALUES (1);
VACUUM FULL test; -- or CLUSTER
SELECT relfilenode FROM pg_class WHERE relname = 'test';
 relfilenode
-
   16399

Then 'ls -l data/base/16384/16399*' to see the *_vm file.  I am not sure
how to test that the vm contents are valid.

This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
does not do writes through the shared buffer cache, as outlined in this
C comment block:

 * We can't use the normal heap_insert function to insert into the new
 * heap, because heap_insert overwrites the visibility information.
 * We use a special-purpose raw_heap_insert function instead, which
 * is optimized for bulk inserting a lot of tuples, knowing that we have
 * exclusive access to the heap.  raw_heap_insert builds new pages in
 * local storage.  When a page is full, or at the end of the process,
 * we insert it to WAL as a single record and then write it to disk
 * directly through smgr.  Note, however, that any data sent to the new
 * heap's TOAST table will go through the normal bufmgr.

I originally tried to do this higher up in the stack but ran into
problems because I couldn't access the new heap page so I had to do it
at the non-shared-buffer page level.  I reused the lazy vacuum routines.

I need to know this is the right approach, and need to know what things
are wrong or missing.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
new file mode 100644
index 951894c..c01a6a8
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***
*** 107,112 
--- 107,114 
  #include access/rewriteheap.h
  #include access/transam.h
  #include access/tuptoaster.h
+ #include access/visibilitymap.h
+ #include commands/vacuum.h
  #include storage/bufmgr.h
  #include storage/smgr.h
  #include utils/memutils.h
*** typedef OldToNewMappingData *OldToNewMap
*** 172,177 
--- 174,180 
  
  /* prototypes for internal functions */
  static void raw_heap_insert(RewriteState state, HeapTuple tup);
+ static void update_page_vm(Relation relation, Page page, BlockNumber blkno);
  
  
  /*
*** end_heap_rewrite(RewriteState state)
*** 281,286 
--- 284,290 
  		RelationOpenSmgr(state-rs_new_rel);
  
  		PageSetChecksumInplace(state-rs_buffer, state-rs_blockno);
+ 		update_page_vm(state-rs_new_rel, state-rs_buffer, state-rs_blockno);
  
  		smgrextend(state-rs_new_rel-rd_smgr, MAIN_FORKNUM, state-rs_blockno,
     (char *) state-rs_buffer, true);
*** raw_heap_insert(RewriteState state, Heap
*** 633,638 
--- 637,643 
  			RelationOpenSmgr(state-rs_new_rel);
  
  			PageSetChecksumInplace(page, state-rs_blockno);
+ 			update_page_vm(state-rs_new_rel, page, state-rs_blockno);
  
  			smgrextend(state-rs_new_rel-rd_smgr, MAIN_FORKNUM,
  	   state-rs_blockno, (char *) page, true);
*** raw_heap_insert(RewriteState state, Heap
*** 677,679 
--- 682,704 
  	if (heaptup != tup)
  		heap_freetuple(heaptup);
  }
+ 
+ static void
+ update_page_vm(Relation relation, Page page, BlockNumber blkno)
+ {
+ 	Buffer		vmbuffer = InvalidBuffer;
+ 	TransactionId visibility_cutoff_xid;
+ 
+ 	visibilitymap_pin(relation, blkno, &vmbuffer);
+ 	Assert(BufferIsValid(vmbuffer));
+ 
+ 	if (!visibilitymap_test(relation, blkno, &vmbuffer) &&
+ 		heap_page_is_all_visible(relation, InvalidBuffer, page,
+  &visibility_cutoff_xid))
+ 	{
+ 		PageSetAllVisible(page);
+ 		visibilitymap_set(relation, blkno, InvalidBlockNumber,
+ 		  InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);
+ 	}
+ 	ReleaseBuffer(vmbuffer);
+ }
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
new file mode 100644
index 7f40d89..a42511b
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibilitymap.c
*** 

Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 06:05:13AM -0800, Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
  On Tue, Nov 26, 2013 at 03:25:44PM -0800, Kevin Grittner wrote:
  Bruce Momjian br...@momjian.us wrote:
 
  How are we handling breakage of pg_dump, not pg_dumpall?
 
  That was discussed.  Do you have something to add?
 
  I am confused what we are patching.  Are we patching pg_dump,
  pg_dumpall, or both?
 
 Just pg_dumpall.c.

OK, there was a pg_dump patch earlier which we are not using now.

  Are we propagating other settings from pg_dump to pg_dumpall,
  like statement_timeout?
 
 pg_dumpall output sets up the global objects (including their
 properties) and then does a \connect to each database, followed by
 the same output that pg_dump would generate for that database. 
 That includes the SET statements for statement_timeout, etc.  The
 patch does nothing to change what objects or properties the
 pg_dumpall output tries to set up, it just sets a property *on the
 current connection* to allow those statements to run without error.

What is the logic that has us setting statement_timeout in pg_dump but
default_transaction_read_only in pg_dumpall?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 14:53:27 -0500, Noah Misch wrote:
 How would you characterize the chances of this happening with default
 *vacuum_freeze_*_age settings?  Offhand, it seems you would need to encounter
 this bug during each of ~10 generations of autovacuum_freeze_max_age before
 the old rows actually become invisible.

I think realistically, to actually trigger the bug, it needs to happen
quite a bit more often. But in some workloads it's pretty darn easy to
hit. E.g. if significant parts of the table are regularly deleted, lots,
if not most, of your vacuums will spuriously increase relfrozenxid above
the actual value. Each time only by a small amount, but due to that
small increase there never will be an actual full table vacuum since
freeze_table_age will never even remotely be reached.

The client that made me look into the issue noticed problems on
pg_attribute - presumably because of temporary table usage primarily
affecting the tail end of pg_attribute.

 On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
  With regard to fixing things up, ISTM the best bet is heap_prune_chain()
  so far. That's executed by vacuum and by opportunistic pruning and we
  know we have the appropriate locks there. Looks relatively easy to fix
  up things there. Not sure if there are any possible routes to WAL log
  this but using log_newpage()?
  I am really not sure what the best course of action is :(
 
 Maximizing detection is valuable, and the prognosis for automated repair is
 poor.  I would want a way to extract tuples having xmin outside the range of
 CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible
 page.

I think the likelihood of the problem affecting !all-visible pages is
close to zero. Each vacuum will try to clean those, so they surely will
get vacuumed at some point. I think the only way that could happen is if
the ConditionalLockBufferForCleanup() fails in each vacuum. And that
seems a bit unlikely.

 At first, I supposed we could offer a tool to blindly freeze such tuples.
 However, there's no guarantee that they are in harmony with recent changes to
 the database; transactions that wrongly considered those tuples invisible may
 have made decisions incompatible with their existence.  For example, reviving
 such a tuple could violate a UNIQUE constraint if the user had already
 replaced the missing row manually.

Good point, although since they are all on all-visible pages sequential
scans will currently already find those. It's primarily index scans that
won't. So it's not really reviving them...
The primary reason why I think it might be a good idea to revive
automatically is, that an eventual full-table/freeze vacuum will
currently delete them which seems bad.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Andres Freund
On 2013-11-27 15:29:24 -0500, Noah Misch wrote:
  If you are confident that neither of these is a real risk, I'll
  relax about this.
 
 If there is a real risk, I'm not seeing it.

Me neither.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:59:31PM -0500, Robert Haas wrote:
 On Tue, Nov 26, 2013 at 5:02 PM, Bruce Momjian br...@momjian.us wrote:
  On Tue, Nov 26, 2013 at 01:58:04PM -0300, Alvaro Herrera wrote:
  Bruce Momjian escribió:
   On Tue, Nov 26, 2013 at 11:22:39AM -0500, Tom Lane wrote:
 
 Uh, I ended up mentioning no effect to highlight it does nothing,
 rather than mention a warning.  Would people prefer I say warning? 
  Or
 should I say issues a warning because it has no effect or 
 something?
 It is easy to change.
   
I'd revert the change Robert highlights above.  ISTM you've changed the
code to match the documentation; why would you then change the docs?
  
   Well, I did it to make it consistent.  The question is what to write for
   _all_ of the new warnings, including SET.  Do we say warning, do we
   say it has no effect, or do we say both?  The ABORT is a just one case
   of that.
 
  Maybe it emits a warning and otherwise has no effect?  Emitting a
  warning is certainly not doing nothing; as has been pointed out in the
  SSL renegotiation thread, it might cause the log to fill disk.
 
  OK, doc patch attached.
 
 Seems broadly reasonable, but I'd use "no other effect" throughout.

That sounds awkward, e.g.:

 Issuing <command>ROLLBACK</> outside of a transaction
 block emits a warning but has no other effect.

I could live with this:

 Issuing <command>ROLLBACK</> outside of a transaction
 block has no effect except emitting a warning.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] MultiXact pessmization in 9.3

2013-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:

 Correct.  The only difficulty here is that we would need to pass down
 the fact that we know for certain that this is only a locking Multixact.
 There are some callers that go to it indirectly via MultiXactIdWait or
 MultiXactIdExpand, but now that I look I think it's fine for those to
 pass false (i.e. assume there might be an update and disable the
 optimization), since those aren't hot compared to the other cases.
 
 This patch implements this idea, but I haven't tested it much beyond
 compiling and ensuring it passes the existing tests.

.. and it turns out it doesn't work.  To be really effective, we need
MultiXactIdIsRunning to be passed down the flag too, so it can pass it
to GetMultiXactIdMembers.

One other thought is that MultiXactIdIsRunning and GetMultiXactIdMembers
are public functions, so this patch would represent an API change in
9.3.  I doubt any external modules would be relying on these functions,
but there's so much care and thought put into avoiding API changes on
released versions that I'm nervous about doing it here.  So I think we'd
need to provide a compatibility shim to avoid that.

(I generally dislike keeping compatibility stuff forever, so I would
provide these backward-compatible functions in 9.3 only.  Anyone using it
would have to fix the code and recompile for 9.4+.  This means a #ifdef
in code meant to work on top of both 9.3 and 9.4.  Anyone opines
otherwise?)
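For concreteness, such a shim could be about this thin.  This sketch assumes
the extended four-argument GetMultiXactIdMembers() and two-argument
MultiXactIdIsRunning() from the attached patch were given new names; the Ext
suffixes below are made up for illustration (the posted patch extends the
existing names instead):

#include "access/multixact.h"

/*
 * Hypothetical 9.3-only wrappers preserving the released signatures.
 * Old-style callers cannot tell us whether the multi is known to be
 * lock-only, so pass false and keep the pessimistic pre-patch behavior.
 */
int
GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
					  bool allow_old)
{
	return GetMultiXactIdMembersExt(multi, members, allow_old, false);
}

bool
MultiXactIdIsRunning(MultiXactId multi)
{
	return MultiXactIdIsRunningExt(multi, false);
}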

The other idea is to just not backpatch this.

Other than that, this patch implements the optimization suggested here.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/contrib/pgrowlocks/pgrowlocks.c
--- b/contrib/pgrowlocks/pgrowlocks.c
***
*** 168,174  pgrowlocks(PG_FUNCTION_ARGS)
  
  allow_old = !(infomask  HEAP_LOCK_MASK) 
  	(infomask  HEAP_XMAX_LOCK_ONLY);
! nmembers = GetMultiXactIdMembers(xmax, members, allow_old);
  if (nmembers == -1)
  {
  	values[Atnum_xids] = {0};
--- 168,175 
  
  allow_old = !(infomask  HEAP_LOCK_MASK) 
  	(infomask  HEAP_XMAX_LOCK_ONLY);
! nmembers = GetMultiXactIdMembers(xmax, members, allow_old,
!  false);
  if (nmembers == -1)
  {
  	values[Atnum_xids] = {0};
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 3987,3993  l3:
  			 * the case, HeapTupleSatisfiesUpdate would have returned
  			 * MayBeUpdated and we wouldn't be here.
  			 */
! 			nmembers = GetMultiXactIdMembers(xwait, members, false);
  
  			for (i = 0; i  nmembers; i++)
  			{
--- 3987,3995 
  			 * the case, HeapTupleSatisfiesUpdate would have returned
  			 * MayBeUpdated and we wouldn't be here.
  			 */
! 			nmembers =
! GetMultiXactIdMembers(xwait, members, false,
! 	  HEAP_XMAX_IS_LOCKED_ONLY(infomask));
  
  			for (i = 0; i  nmembers; i++)
  			{
***
*** 4008,4014  l3:
  }
  			}
  
! 			pfree(members);
  		}
  
  		/*
--- 4010,4017 
  }
  			}
  
! 			if (members)
! pfree(members);
  		}
  
  		/*
***
*** 4157,4163  l3:
   * been the case, HeapTupleSatisfiesUpdate would have returned
   * MayBeUpdated and we wouldn't be here.
   */
! nmembers = GetMultiXactIdMembers(xwait, members, false);
  
  if (nmembers = 0)
  {
--- 4160,4168 
   * been the case, HeapTupleSatisfiesUpdate would have returned
   * MayBeUpdated and we wouldn't be here.
   */
! nmembers =
! 	GetMultiXactIdMembers(xwait, members, false,
! 		  HEAP_XMAX_IS_LOCKED_ONLY(infomask));
  
  if (nmembers = 0)
  {
***
*** 4626,4632  l5:
  		 * MultiXactIdExpand if we weren't to do this, so this check is not
  		 * incurring extra work anyhow.
  		 */
! 		if (!MultiXactIdIsRunning(xmax))
  		{
  			if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
  TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
--- 4631,4637 
  		 * MultiXactIdExpand if we weren't to do this, so this check is not
  		 * incurring extra work anyhow.
  		 */
! 		if (!MultiXactIdIsRunning(xmax, HEAP_XMAX_IS_LOCKED_ONLY(old_infomask)))
  		{
  			if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
  TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
***
*** 5217,5223  GetMultiXactIdHintBits(MultiXactId multi, uint16 *new_infomask,
  	 * We only use this in multis we just created, so they cannot be values
  	 * pre-pg_upgrade.
  	 */
! 	nmembers = GetMultiXactIdMembers(multi, members, false);
  
  	for (i = 0; i  nmembers; i++)
  	{
--- 5222,5228 
  	 * We only use this in multis we just created, so they cannot be values
  	 * pre-pg_upgrade.
  	 */
! 	nmembers = GetMultiXactIdMembers(multi, members, false, false);
  
  	for (i = 0; i  nmembers; i++)
  	{
***
*** 5293,5299  MultiXactIdGetUpdateXid(TransactionId xmax, uint16 t_infomask)
  	 * Since we know the LOCK_ONLY 

[HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
In LockReleaseAll, we have this coding:

for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
{
    LWLockId    partitionLock = FirstLockMgrLock + partition;
    SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

    proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                         offsetof(PROCLOCK, procLink));

    if (!proclock)
        continue;           /* needn't examine this partition */

    LWLockAcquire(partitionLock, LW_EXCLUSIVE);

    ... process the proclock list ...

    LWLockRelease(partitionLock);
}                           /* loop over partitions */


That is, we check the proclock list head before acquiring the matching
partitionLock, and assume we can skip scanning this partition if the list
is empty.  Once upon a time that was safe, but since the fast-path patch
it's not very safe, because some other backend could be in process of
promoting one of our fast-path locks (and hence changing the proclock
list).  If we could be sure that we had removed *all* our fastpath locks
in the earlier loop then it'd be okay, but because LockReleaseAll's name
is a bit of a misnomer, I don't think we can assume that.

The simplest fix is just to move the fetch of the list header down
past the LWLockAcquire.  This is pretty annoying, though, because in
typical small transactions that will require a lot of additional
partition lock acquire/releases.

I think it might be safe to check the list header, skip the partition
if it's null, but if it isn't then acquire the lock *and re-fetch the
list header*.  The idea here is that if there is someone else busy
promoting one of our fastpath locks, it must be a lock we'd decided not
to delete in the previous loop, and so we would again decide not to
delete it in the main loop.  Therefore, it doesn't matter if we decide
to skip a partition microseconds before somebody changes the list header
from null to not-null.  However, once we've decided to process a
partition, we'd better get an up-to-date view of the list state.
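Roughly, and untested, keeping the shape of the loop quoted above:

for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
{
    LWLockId    partitionLock = FirstLockMgrLock + partition;
    SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

    /* unlocked peek: anything being added concurrently can only be a
     * fast-path lock we already decided to keep, so skipping is safe */
    if (SHMQueueNext(procLocks, procLocks,
                     offsetof(PROCLOCK, procLink)) == NULL)
        continue;           /* needn't examine this partition */

    LWLockAcquire(partitionLock, LW_EXCLUSIVE);

    /* re-fetch the list header now that we hold the partition lock */
    proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                         offsetof(PROCLOCK, procLink));

    ... process the proclock list ...

    LWLockRelease(partitionLock);
}                           /* loop over partitions */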

This does do some damage to the original concept that allLocks mode
would guarantee to clean up all locks of the target lockmethod even
if the locallock table was damaged.  I wonder whether we shouldn't
get rid of the existing logic about fastpath in the loop over locallocks
entirely, and replace it with code that just zeroes the fastpath array
when lockmethodid == DEFAULT_LOCKMETHOD and allLocks.  That would likely
be faster than what you've got here as well as more robust.  Or we
could add a restriction to EligibleForRelationFastPath that restricts
the fast-path mechanism to non-session locks, in which case we'd not
need to make the zeroing contingent on allLocks either.  I don't think
we take any fast-path-eligible locks in session mode anyway, so this
wouldn't be giving up any functionality on that end.

Comments?

regards, tom lane




Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 04:44:02PM -0500, Bruce Momjian wrote:
 I could live with this:
 
  Issuing <command>ROLLBACK</> outside of a transaction
  block has no effect except emitting a warning.

Proposed doc patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/ref/abort.sgml b/doc/src/sgml/ref/abort.sgml
new file mode 100644
index f3a2fa8..ce70e7f
*** a/doc/src/sgml/ref/abort.sgml
--- b/doc/src/sgml/ref/abort.sgml
*** ABORT [ WORK | TRANSACTION ]
*** 63,69 
/para
  
para
!Issuing commandABORT/ outside of a transaction block has no effect.
/para
   /refsect1
  
--- 63,70 
/para
  
para
!Issuing commandABORT/ outside of a transaction block
!has no effect except emitting a warning.
/para
   /refsect1
  
diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml
new file mode 100644
index 4f79621..3465d51
*** a/doc/src/sgml/ref/rollback.sgml
--- b/doc/src/sgml/ref/rollback.sgml
*** ROLLBACK [ WORK | TRANSACTION ]
*** 60,66 
  
para
 Issuing commandROLLBACK/ outside of a transaction
!block has no effect.
/para
   /refsect1
  
--- 60,66 
  
para
 Issuing commandROLLBACK/ outside of a transaction
!block has no effect except emitting a warning.
/para
   /refsect1
  
diff --git a/doc/src/sgml/ref/set.sgml b/doc/src/sgml/ref/set.sgml
new file mode 100644
index 5a84f69..6ef06e6
*** a/doc/src/sgml/ref/set.sgml
--- b/doc/src/sgml/ref/set.sgml
*** SET [ SESSION | LOCAL ] TIME ZONE { rep
*** 110,117 
   para
Specifies that the command takes effect for only the current
transaction.  After commandCOMMIT/ or commandROLLBACK/,
!   the session-level setting takes effect again.  This has no effect
!   outside of a transaction block.
   /para
  /listitem
 /varlistentry
--- 110,117 
   para
Specifies that the command takes effect for only the current
transaction.  After commandCOMMIT/ or commandROLLBACK/,
!   the session-level setting takes effect again.  Issuing this
!   outside of a transaction block has no effect except emitting a warning.
   /para
  /listitem
 /varlistentry
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
new file mode 100644
index a33190c..541a50b
*** a/doc/src/sgml/ref/set_constraints.sgml
--- b/doc/src/sgml/ref/set_constraints.sgml
*** SET CONSTRAINTS { ALL | replaceable cla
*** 99,105 
  
para
 This command only alters the behavior of constraints within the
!current transaction.  This has no effect outside of a transaction block.
/para
   /refsect1
  
--- 99,106 
  
para
 This command only alters the behavior of constraints within the
!current transaction.  Issuing this outside of a transaction block
!has no effect except emitting a warning.
/para
   /refsect1
  
diff --git a/doc/src/sgml/ref/set_transaction.sgml b/doc/src/sgml/ref/set_transaction.sgml
new file mode 100644
index e90ff4a..1597657
*** a/doc/src/sgml/ref/set_transaction.sgml
--- b/doc/src/sgml/ref/set_transaction.sgml
*** SET SESSION CHARACTERISTICS AS TRANSACTI
*** 185,191 
para
 If commandSET TRANSACTION/command is executed without a prior
 commandSTART TRANSACTION/command or commandBEGIN/command,
!it will have no effect.
/para
  
para
--- 185,191 
para
 If commandSET TRANSACTION/command is executed without a prior
 commandSTART TRANSACTION/command or commandBEGIN/command,
!it has no effect except emitting a warning.
/para
  
para



Re: [HACKERS] [RFC] overflow checks optimized away

2013-11-27 Thread Bruce Momjian
On Mon, Sep  9, 2013 at 12:21:56PM -0400, Robert Haas wrote:
 On Sat, Sep 7, 2013 at 6:55 PM, Greg Stark st...@mit.edu wrote:
  Should these patches be applied?
 
  I have a copy of the program and was going to take care of this.
 
 When?

2.5 months later, status report?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Andres Freund
On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
 Or we
 could add a restriction to EligibleForRelationFastPath that restricts
 the fast-path mechanism to non-session locks, in which case we'd not
 need to make the zeroing contingent on allLocks either.  I don't think
 we take any fast-path-eligible locks in session mode anyway, so this
 wouldn't be giving up any functionality on that end.

That seems like the best thing to do to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] MultiXact pessmization in 9.3

2013-11-27 Thread Andres Freund
On 2013-11-27 19:24:35 -0300, Alvaro Herrera wrote:
 One other thought is that MultiXactIdIsRunning and GetMultiXactIdMembers
 are public functions, so this patch would represent an API change in
 9.3.  I doubt any external modules would be relying on these functions,
 but there's so many care and thought put into avoiding API changes on
 released versions that I'm nervous about doing it here.  So I think we'd
 need to provide a compatibility shim to avoid that.

-0.5 for providing compatibility shims for this. There really doesn't
seem to be a legitimate use for that API outside the guts of heapam.c
besides crude debugging hacks. So it seems like wasted effort.

 (I generally dislike to keep compatibility stuff forever, so I would
 provide this backward-compatible functions in 9.3 only.  Anyone using it
 would have to fix the code and recompile for 9.4+.  This means a #ifdef
 in code meant to work on top of both 9.3 and 9.4.  Anyone opines
 otherwise?)

If at all, we definitely should only do it for 9.3.

 The other idea is to just not backpatch this.

I think backpatching is a good idea; I have seen GetMultiXactIdMembers()
+ slru code take up 80% of the CPU in strange workloads. But it possibly might
be a good idea to wait till after the next point release to give people
at least a minimal chance of catching problems.

 Other than that, this patch implements the optimization suggested here.

Great!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-27 Thread Andres Freund
On 2013-11-27 13:57:52 -0300, Alvaro Herrera wrote:
 Per bug report by J Smith in
 cadfupgc5bmtv-yg9znxv-vcfkb+jprqs7m2oesqxam_4z1j...@mail.gmail.com
 diagnosed by Andres Freund.

Alvaro, do you see a way this could actually have caused J.'s problems?
I thought about a few, but each turned out to not really be
possible...
I can easily see why we would uselessly wait or worse, but the
pg_subtrans errors I can't really explain with it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 HeapTupleHeaderGetUpdateXid() ignores aborted updaters
 and returns InvalidTransactionId in that case, but
 HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...

That sure *sounds* like it should cause a problem for this code in
CheckForSerializableConflictOut():

    htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
    switch (htsvResult)
    {
    [ ... ]
    case HEAPTUPLE_DELETE_IN_PROGRESS:
    xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
    break;
    [ ... ]
    }
    Assert(TransactionIdIsValid(xid));

... however, I have not been able to trigger that Assert even with
gdb breakpoints at what I think are the right spots.  Any
suggestions?  How far back is it true that the above
HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
tuple structure can return InvalidTransactionId?  Is there some
other condition (besides a ROLLBACK of an UPDATE on the tuple being
read) which needs to be met?  Is any particular timing necessary?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 11/27/13, 3:47 PM, Tom Lane wrote:
 Given these considerations, I think it'd be better to allow explicit
 application control over whether read-ahead happens for a particular
 query.  And I have no problem whatsoever with requiring that the cursor
 be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.

 Well, technically, unspecified means NO SCROLL according to the SQL
 standard.  A lot of applications in ECPG are ported from other systems,
 which might make that assumption.  It wouldn't be very nice to have to
 change all that.

Hm.  So you're suggesting that ECPG fix this problem by inserting an
explicit NO SCROLL clause into translated DECLARE CURSOR commands, if
there's not a SCROLL clause?

That would solve the problem of the ECPG library not being sure which
behavior applies, but it might break existing apps that were unknowingly
relying on a simple cursor being scrollable.  OTOH any such app would be
subject to breakage anyway as a result of planner changes, so it's hard to
complain against this, as long as it's happening in a major version
update.

I'm for it.
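For illustration, marking the cursor explicitly is a one-word change on the
application side as well; a minimal ECPG sketch (connection target, table,
and column names are made up):

#include <stdio.h>

int
main(void)
{
	EXEC SQL BEGIN DECLARE SECTION;
	int		id;
	EXEC SQL END DECLARE SECTION;

	EXEC SQL CONNECT TO testdb;		/* made-up connection target */
	EXEC SQL WHENEVER NOT FOUND DO BREAK;

	/* explicitly NO SCROLL: the clause the preprocessor would otherwise add */
	EXEC SQL DECLARE cur NO SCROLL CURSOR FOR
		SELECT id FROM mytable ORDER BY id;
	EXEC SQL OPEN cur;

	while (1)
	{
		EXEC SQL FETCH NEXT FROM cur INTO :id;	/* read-ahead, if enabled, would batch behind this */
		printf("%d\n", id);
	}

	EXEC SQL CLOSE cur;
	EXEC SQL COMMIT;
	EXEC SQL DISCONNECT ALL;
	return 0;
}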

regards, tom lane




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 10:43:05PM +0100, Andres Freund wrote:
 On 2013-11-27 14:53:27 -0500, Noah Misch wrote:
  How would you characterize the chances of this happening with default
  *vacuum_freeze_*_age settings?  Offhand, it seems you would need to 
  encounter
  this bug during each of ~10 generations of autovacuum_freeze_max_age before
  the old rows actually become invisible.
 
 I think realistically, to actually trigger the bug, it needs to happen
 quite a bit more often. But in some workloads it's pretty darn easy to
 hit. E.g. if significant parts of the table are regularly deleted, lots,
 if not most, of your vacuums will spuriously increase relfrozenxid above
 the actual value. Each time only by a small amount, but due to that
 small increase there never will be an actual full table vacuum since
 freeze_table_age will never even remotely be reached.

That makes sense.

  Maximizing detection is valuable, and the prognosis for automated repair is
  poor.  I would want a way to extract tuples having xmin outside the range of
  CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible
  page.
 
 I think the likelihood of the problem affecting !all-visible pages is
 close to zero. Each vacuum will try to clean those, so they surely will
 get vacuumed at some point. I think the only way that could happen is if
 the ConditionalLockBufferForCleanup() fails in each vacuum. And that
 seems a bit unlikely.

The page could have sat all-visible (through multiple XID epochs, let's say)
until a recent UPDATE.

  At first, I supposed we could offer a tool to blindly freeze such tuples.
  However, there's no guarantee that they are in harmony with recent changes 
  to
  the database; transactions that wrongly considered those tuples invisible 
  may
  have made decisions incompatible with their existence.  For example, 
  reviving
  such a tuple could violate a UNIQUE constraint if the user had already
  replaced the missing row manually.
 
 Good point, although since they are all on all-visible pages sequential
 scans will currently already find those. It's primarily index scans that
 won't. So it's not really reviving them...

True.  Since a dump/reload of the database would already get the duplicate key
violation, the revival is not making anything clearly worse.  And if we hope
for manual repair, many DBAs just won't do that at all.

 The primary reason why I think it might be a good idea to revive
 automatically is, that an eventual full-table/freeze vacuum will
 currently delete them which seems bad.

Will it?  When the page became all-visible, the tuples were all hinted.  They
will never be considered dead.  Every 2B transactions, they will alternate
between live and not-yet-committed.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Andres Freund
On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  HeapTupleHeaderGetUpdateXid() ignores aborted updaters
  and returns InvalidTransactionId in that case, but
  HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...
 
 That sure *sounds* like it should cause a problem for this code in
 CheckForSerializableConflictOut():

Yea. IMNSHO the current state is a API design flaw. We really should be
returning the aborted xid since that's what happens for non-multixact
ids.

     htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
     switch (htsvResult)
     {
     [ ... ]
     case HEAPTUPLE_DELETE_IN_PROGRESS:
     xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
     break;
     [ ... ]
     }
     Assert(TransactionIdIsValid(xid));
 
 ... however, I have not been able to trigger that Assert even with
 gdb breakpoints at what I think are the right spots.  Any
 suggestions?  How far back is it true that the above
 HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
 but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
 tuple structure can return InvalidTransactionId?

What do you mean with "how far back"?

 Is there some
 other condition (besides a ROLLBACK of an UPDATE on the tuple being
 read) which needs to be met?  Is any particular timing necessary?

Afaics you need a multixact consisting of a) the updater and b) a
lock. That's probably easiest to get if you update a row in one session
without changing the primary key, and then key-share lock it in
another. Or the other way round.
Then abort the updater.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] MultiXact pessmization in 9.3

2013-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-11-27 19:24:35 -0300, Alvaro Herrera wrote:
 The other idea is to just not backpatch this.

 I think backpatching is a good idea, I have seen GetMultiXactIdMembers()
 + slru code take up 80% cpu in strange workloads. But it possibly might
 be a good idea to wait till after the next point release to give people
 at least a minimal chance of catching problems.

Agreed on both counts.  We're close enough now to Monday's wrap that we
should avoid anything that risks destabilizing 9.3.x, unless it's to fix
a serious bug.  AIUI this is just a performance issue, so let's wait till
after 9.3.2 is done to push in the fix.  But since it is a performance
regression from pre-9.3, never back-patching the fix at all isn't very
attractive.

regards, tom lane




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 On Wed, Nov 27, 2013 at 06:05:13AM -0800, Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:

 I am confused what we are patching.  Are we patching pg_dump,
 pg_dumpall, or both?

 Just pg_dumpall.c.

 OK, there was a pg_dump patch earlier which we are not using now.

Right, that was v1, early in the discussion.  This is v3, based on
responding to the discussion on this thread.

 What is the logic that has us setting statement_timeout in
 pg_dump but default_transaction_read_only in pg_dumpall?

Of the people who posted on this thread supporting that, I think
Tom said it best:

Tom Lane t...@sss.pgh.pa.us wrote:

 I'm inclined to agree with Kevin that this behavior is wrong and
 should be fixed (and back-patched), so far as pg_dumpall is concerned.
 pg_dumpall's charter is to be able to recreate a database cluster's
 contents in a virgin installation, but it's failing to honor that
 contract if the cluster has any ALTER DATABASE SET default_read_only
 settings.  Similarly, I think it's reasonable to try to make pg_upgrade
 cope with the case.

 I also agree with *not* changing pg_dump, since it is not the charter
 of pg_dump to recreate a whole cluster, and the objection about possibly
 restoring into a database that was meant to be protected by this setting
 seems to have some force.

For example, I have seen dumps accidentally restored to the
postgres database on multiple occasions.  You might, for example,
flag the postgres database with this, and thereby block such
accidents.  The patch as it stands would allow pg_dumpall to
replicate such a cluster, flag and all.  Without the patch you get
many errors.

It is also much easier to work around with pg_dump output.  You
could get a psql connection to a database, set this off for the
connection, and use \i to read the pg_dump output file.  Or you
could concatenate a SET statement in front of the pg_dump output
when piping it in.  There is no correspondingly easy solution for
pg_dumpall.

--
Kevin GrittnerEDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:

 ... however, I have not been able to trigger that Assert even with
 gdb breakpoints at what I think are the right spots.  Any
 suggestions?  How far back is it true that the above
 HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
 but HeapTupleHeaderGetUpdateXid(tuple-t_data) on the exact same
 tuple structure can return InvalidTransactionId?

 What do you mean with how far back?

What back-patching will be needed for a fix?  It sounds like 9.3?

 Afaics you need a multixact consisting out of a) the updater and b) a
 lock. That's probably easiest to get if you update a row in one session
 without changing the primary key, and then key-share lock it in
 another. Or the other way round.
 Then abort the updater.

Thanks!  I'll keep trying to generate a failure at that point.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Andres Freund
On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  What do you mean with how far back?
 
 What back-patching will be needed for a fix?  It sounds like 9.3?

Yep.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Atri Sharma atri.j...@gmail.com writes:
 Can we add a function to the FDW API to define a SQL to foreign server
 side conversion?

I think that's not tenable. Even if you limit the discussion to the
postgres_fdw, some queries against a past version will stop working
against a new version (keyword changes, catalogs, default settings, etc.).

I don't think you want to embed a full parser of every supported FOREIGN
version of PostgreSQL inside the postgres_fdw code, so I think the text
of the view needs to be an opaque string.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
 Or we
 could add a restriction to EligibleForRelationFastPath that restricts
 the fast-path mechanism to non-session locks, in which case we'd not
 need to make the zeroing contingent on allLocks either.  I don't think
 we take any fast-path-eligible locks in session mode anyway, so this
 wouldn't be giving up any functionality on that end.

 That seems like the best thing to do to me.

I poked a bit more at this and realized that what I had in mind above
doesn't work: a session lock isn't a distinct lock type, just a different
owner, and moreover usually we ask for a session lock on a rel that we
already have locked normally.  So it doesn't work to say "skip trying fast
path if sessionLock" --- we typically won't get that far in the code.
We could conceivably make it work if we were willing to forcibly promote
an existing fastpath lock to regular when a session lock gets asked for,
but that's complication I don't want much, especially since it would go
completely untested in normal use.  (I verified the claim above that we
never take session locks on any fast-path-eligible lock types.)

We could still do this if we were willing to actually reject requests
for session level locks on fast-path-eligible lock types.  At the moment
that costs us nothing really.  If anyone ever did have a use case, we
could consider adding the extra logic to support it.
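For illustration, the rejection could presumably sit right where
LockAcquireExtended() considers the fast path; something as small as this,
with the exact placement and message wording being assumptions:

	if (sessionLock && EligibleForRelationFastPath(locktag, lockmode))
		elog(ERROR, "cannot take session-level lock on a fast-path-eligible lock type");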

regards, tom lane




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Atri Sharma atri.j...@gmail.com writes:
 Can we add a function to the FDW API to define a SQL to foreign server
 side conversion?

 I think that's not tenable. Even if you limit the discussion to the
 postgres_fdw, some queries against past version will stop working
 against new version (keywords changes, catalogs, default settings, etc).

 I don't think you want to embed a full parser of every supported FOREIGN
 version of PostgreSQL inside the postgres_fdw code, so I think the text
 of the view needs to be an opaque string.

I'm not real sure what this'd buy us that wouldn't be done as well or
better by creating a view on the remote side.  (IOW, there's nothing
that says that the remote object backing a foreign table can't be a
view.)

regards, tom lane




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 18:18:02 -0500, Noah Misch wrote:
  I think the likelihood of the problem affecting !all-visible pages is
  close to zero. Each vacuum will try to clean those, so they surely will
  get vacuumed at some point. I think the only way that could happen is if
  the ConditionalLockBufferForCleanup() fails in each vacuum. And that
  seems a bit unlikely.
 
 The page could have sat all-visible (through multiple XID epochs, let's say)
 until a recent UPDATE.

Good point.

   At first, I supposed we could offer a tool to blindly freeze such tuples.
   However, there's no guarantee that they are in harmony with recent 
   changes to
   the database; transactions that wrongly considered those tuples invisible 
   may
   have made decisions incompatible with their existence.  For example, 
   reviving
   such a tuple could violate a UNIQUE constraint if the user had already
   replaced the missing row manually.
  
  Good point, although since they are all on all-visible pages sequential
  scans will currently already find those. It's primarily index scans that
  won't. So it's not really reviving them...
 
 True.  Since a dump/reload of the database would already get the duplicate key
 violation, the revival is not making anything clearly worse.  And if we hope
 for manual repair, many DBAs just won't do that at all.

Especially if it involves compiling C code...

  The primary reason why I think it might be a good idea to revive
  automatically is, that an eventual full-table/freeze vacuum will
  currently delete them which seems bad.
 
 Will it?  When the page became all-visible, the tuples were all hinted.  They
 will never be considered dead.  Every 2B transactions, they will alternate
 between live and not-yet-committed.

Good point again. And pretty damn good luck.

Although 9.3+ multixact rows look like they could return _DEAD since
we'll do a TransactionIdDidAbort() (via GetUpdateXid()) and
TransactionIdDidCommit() in there and we don't set XMAX_COMMITTED hint
bits for XMAX_IS_MULTI rows. As an additional problem, once multixact.c
has pruned the old multis away we'll get errors from there on.
But that's less likely to affect many rows.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Peter Eisentraut
On 11/26/13, 5:14 PM, Kevin Grittner wrote:
 I happened to build in a shell that was still set up for the clang
 address sanitizer, and got the attached report.  On a rerun it was
 repeatable.  XLogInsert() seems to read past the end of a variable
 allocated on the stack in doPickSplit(). I haven't tried to analyze
 it past that, since this part of the code is unfamiliar to me.

I also see that.  It only happens in 64-bit builds.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Proposed feature: Selective Foreign Keys

2013-11-27 Thread Tom Dunstan
Hi all!

The Problem
-
One case that traditional SQL doesn't handle very well is when you have a child 
entity which can be attached to a number of different parent entities. Examples 
might be comments, tags or file attachments - we might have 20 different 
entities in the system that we would like our users to be able to add comments to, 
but the existing solutions for mapping this all have downsides.

Existing solution 1: Join tables ahoy
If I can have a list of comments on every other object in the system, and I 
want to have referential integrity, then the obvious thing to do is create a 
join table between each entity and the comments table.
Pros:
 - Straightforward, traditional object-with-collection-of-child SQL structure
Cons:
 - If a parent object gets deleted here, we can't use foreign keys to delete 
e.g. a child comment, so we'll have to either explicitly do it as part of our 
delete logic or have a cleanup process to catch orphans. Or do a dance with 
delete triggers on the join tables deleting the comment.
 - For n entities requiring comments in the system, we need n join tables. If we 
want both comments and e.g. tags and likes on all of our entities, we now have 
3n join tables for what should be some relatively self-contained on-the-side 
data - this could be more tables than the entire rest of the system.
 - It's difficult to create any kind of self-contained component for building 
applications in this scenario, as it will need to know about every other entity 
in the system, or be able to poke around inside whatever ORM or data access 
system that you have to work out what join tables it needs when running queries.

Existing solution 2: Enter the matrix
Instead of having n join tables, let's just mash them all together, with a 
column per parent object, and a check constraint to force exactly one of those 
columns to be set (a sketch follows the cons below).
Pros:
 - Less bloat in the number of tables
Cons:
 - Doesn't solve orphan problem
 - Addition of a new entity which needs comments and we now have to add another 
column onto it, potentially rewriting the whole thing
 - Ugly
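
A minimal sketch of that shape (the blog/event columns are hypothetical
examples of parent entities):

CREATE TABLE comment_parent (
  comment_id int8 not null references comment(id) on delete cascade,
  blog_id    int8 references blog(id) on delete cascade,
  event_id   int8 references event(id) on delete cascade,
  check ((blog_id is not null)::int + (event_id is not null)::int = 1)
);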

Existing solution 3: Embed the matrix
Kinda like the dependency matrix table, except that all the columns referencing 
potential parent objects go into the comment table instead.
Pros:
 - Everything contained in the comment table
 - No orphans, since cascaded deletes will now delete the actual comment
Cons:
 - Comment table now has references to every single type that it may be 
attached to
 - Addition of a new entity and we probably have to rewrite the comment table 
now

Existing solution 4: Abandon ye all referential integrity
Have a column indicating parent type and another one for the id. In the case of 
comments this would be directly on the comment table itself. In the case of 
something like tags that we might expect to be shared between entities, it 
would be in a single join table. 
Pros:
 - Pretty self-contained
 - Data model which has neither lots of empty columns nor lots of tables
 - Can make new entities commentable without rewriting anything
 - Because it's self-contained, can build application components that don't 
need to know much about the rest of your system. For example this is the 
approach that the grails taggable and commentable plugins take.
Cons:
 - No referential integrity, since we can't have a single column pointing to 
different tables with existing foreign key infrastructure
 - Since there's no real db support for doing things this way, existing ORMs 
etc don't really know how to use a single column to join against multiple 
different tables based on a discriminator or 'switch' column.

Existing solution 5: Everything's a thing
Make your entity hierarchy have a base level object which can have comments 
attached, and then everything that you need to be commentable has to extend 
that. You can do that in an ORM, or with table inheritance in the database.
Pros:
 - Single top-level thing to hang your data on
Cons:
 - You've polluted your object hierarchy just to hang some stuff off of the end 
of it rather than it being driven by behaviours
 - You're going to be paying a performance penalty - everything that extends 
that base level object will now join against it incessantly, and you now have a 
global id sequence or whatever that you may not want.

Basically none of the above handle the situation very well. The cleanest is 
solution 4, but lack of RI sucks.

Feature Proposal: Selective foreign keys.
-
Allow foreign keys to have where clauses. The above comment example using 
solution 4 might then look like the following:

CREATE TABLE comment (
  id bigserial primary key,
  content text not null,
  parent_entity regclass not null,
  parent_id int8
);
ALTER TABLE comment ADD CONSTRAINT comment_blog_fk FOREIGN KEY (parent_id) 
REFERENCES blog(id) WHERE (parent_entity = 'blog');
ALTER TABLE comment ADD CONSTRAINT comment_event_fk FOREIGN KEY (parent_id) 
REFERENCES event(id) WHERE (parent_entity = 'event');
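
As a sketch of the intended behaviour, assuming the proposed WHERE clause 
were accepted (the event table is just another hypothetical parent):

INSERT INTO comment (content, parent_entity, parent_id)
VALUES ('great post', 'blog', 42);    -- only comment_blog_fk is checked

INSERT INTO comment (content, parent_entity, parent_id)
VALUES ('see you there', 'event', 7); -- only comment_event_fk is checked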

Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:09 AM, Peter Geoghegan p...@heroku.com wrote:
 This, along with the already-discussed "attempted to update invisible
 tuple" forms a full account of unexpected ERRORs seen during the
 extended run of the test case, so far.

Actually, it was slightly misleading of me to say it's the same
test-case; in fact, this time I ran each pgbench run with a variable,
random number of seconds between 2 and 20 inclusive (as opposed to
always 2 seconds). If you happen to need help recreating this, I am
happy to give it.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
I wrote:
 We could still do this if we were willing to actually reject requests
 for session level locks on fast-path-eligible lock types.  At the moment
 that costs us nothing really.  If anyone ever did have a use case, we
 could consider adding the extra logic to support it.

Nope, that *still* doesn't work, because in non-allLocks mode the main
loop won't clear any locks that have been promoted from fastpath to
regular.  Sigh.  For the moment I'm proposing that we just re-fetch
the list header after acquiring the lock.  The attached patch is slightly
more verbose than that, because I took the opportunity to reformulate the
while() loop as a for() loop and thereby eliminate some goto's.

regards, tom lane

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 072b276..75df457 100644
*** a/src/backend/storage/lmgr/lock.c
--- b/src/backend/storage/lmgr/lock.c
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2098,2116 ****
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
  
! 		proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			 offsetof(PROCLOCK, procLink));
! 
! 		if (!proclock)
  			continue;			/* needn't examine this partition */
  
  		LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
! 		while (proclock)
  		{
  			bool		wakeupNeeded = false;
- 			PROCLOCK   *nextplock;
  
  			/* Get link first, since we may unlink/delete this proclock */
  			nextplock = (PROCLOCK *)
--- 2098,2134 ----
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+ 		PROCLOCK   *nextplock;
  
! 		/*
! 		 * If the proclock list for this partition is empty, we can skip
! 		 * acquiring the partition lock.  This optimization is trickier than
! 		 * it looks, because another backend could be in process of adding
! 		 * something to our proclock list due to promoting one of our
! 		 * fast-path locks.  However, any such lock must be one that we
! 		 * decided not to delete above, so it's okay to skip it again now;
! 		 * we'd just decide not to delete it again.  We must, however, be
! 		 * careful to re-fetch the list header once we've acquired the
! 		 * partition lock, to be sure we see the up to date version.
! 		 *
! 		 * XXX This argument assumes that the locallock table correctly
! 		 * represents all of our fast-path locks.  While allLocks mode
! 		 * guarantees to clean up all of our normal locks regardless of the
! 		 * locallock situation, we lose that guarantee for fast-path locks.
! 		 * This is not ideal.
! 		 */
! 		if (SHMQueueNext(procLocks, procLocks,
! 		 offsetof(PROCLOCK, procLink)) == NULL)
  			continue;			/* needn't examine this partition */
  
  		LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
! 		for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			   offsetof(PROCLOCK, procLink));
! 			 proclock;
! 			 proclock = nextplock)
  		{
  			bool		wakeupNeeded = false;
  
  			/* Get link first, since we may unlink/delete this proclock */
  			nextplock = (PROCLOCK *)
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2123,2129 
  
  			/* Ignore items that are not of the lockmethod to be removed */
  			if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
! goto next_item;
  
  			/*
  			 * In allLocks mode, force release of all locks even if locallock
--- 2141,2147 
  
  			/* Ignore items that are not of the lockmethod to be removed */
  			if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
! continue;
  
  			/*
  			 * In allLocks mode, force release of all locks even if locallock
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2139,2145 
  			 * holdMask == 0 and are therefore recyclable
  			 */
  			if (proclock->releaseMask == 0 && proclock->holdMask != 0)
! goto next_item;
  
  			PROCLOCK_PRINT("LockReleaseAll", proclock);
  			LOCK_PRINT("LockReleaseAll", lock, 0);
--- 2157,2163 
  			 * holdMask == 0 and are therefore recyclable
  			 */
  			if (proclock->releaseMask == 0 && proclock->holdMask != 0)
! continue;
  
  			PROCLOCK_PRINT("LockReleaseAll", proclock);
  			LOCK_PRINT("LockReleaseAll", lock, 0);
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2168,2176 ****
  		lockMethodTable,
  		LockTagHashCode(&lock->tag),
  		wakeupNeeded);
- 
- 	next_item:
- 			proclock = nextplock;
  		}		/* loop over PROCLOCKs within this partition */
  
  		LWLockRelease(partitionLock);
--- 2186,2191 ----
*** PostPrepare_Locks(TransactionId xid)
*** 3142,3160 ****
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
  
! 		proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			 offsetof(PROCLOCK, procLink));
! 
! 		if (!proclock)
  			continue;			/* needn't examine this partition */
  
  		

Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:36:12PM -0800, Kevin Grittner wrote:
 Of the people who posted on this thread supporting that, I think
 Tom said it best:
 
 Tom Lane t...@sss.pgh.pa.us wrote:
 
  I'm inclined to agree with Kevin that this behavior is wrong and
  should be fixed (and back-patched), so far as pg_dumpall is concerned.
  pg_dumpall's charter is to be able to recreate a database cluster's
  contents in a virgin installation, but it's failing to honor that
  contract if the cluster has any ALTER DATABASE SET default_read_only
  settings.  Similarly, I think it's reasonable to try to make pg_upgrade
  cope with the case.
 
  I also agree with *not* changing pg_dump, since it is not the charter
  of pg_dump to recreate a whole cluster, and the objection about possibly
  restoring into a database that was meant to be protected by this setting
  seems to have some force.
 
 For example, I have seen dumps accidentally restored to the
 postgres database on multiple occasions.  You might, for example,
 flag the postgres database with this, and thereby block such
 accidents.  The patch as it stands would allow pg_dumpall to
 replicate such a cluster, flag and all.  Without the patch you get
 many errors.
 
 It is also much easier to work around with pg_dump output.  You
 could get a psql connection to a database, set this off for the
 connection, and use \i to read the pg_dump output file.  Or you
 could concatenate a SET statement in front of the pg_dump output
 when piping it in.  There is no correspondingly easy solution for
 pg_dumpall.
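
Concretely, sketches of both of those points (the dump file name is
hypothetical):

ALTER DATABASE postgres SET default_transaction_read_only = on;

-- when restoring pg_dump output into such a database, from psql:
SET default_transaction_read_only = off;
\i dump.sql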

Well, I can understand that, but part of the argument was that
default_transaction_read_only is not part of the database, but rather
just the transaction default:

 Karsten wrote:
 Maybe I am splitting hairs but setting transactions to readonly
 per default does not mean the database *as such* is to be readonly.
 It literally applies to the *default* state of transactions (as
 opposed to the ONLY state of transactions). No more, no less.

I ask again!

 What is the logic that has us setting statement_timeout in
 pg_dump but default_transaction_read_only in pg_dumpall?

Why can't I get an answer to that question?  Is it that
statement_timeout is less likely to lead to a restore failure?  Are all
the other settings output from pg_dump safe?  Is only
default_transaction_read_only a problem?  Whatever the answer is, the
patch should explain why we are singling out
default_transaction_read_only for pg_dumpall use and everything else is
in pg_dump.

Why does it feel like I am going around in circles here?  I feel like I
am reliving the materialized view record comparison thread all over
again.  :-(

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:36:12PM -0800, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 
  I'm inclined to agree with Kevin that this behavior is wrong and
  should be fixed (and back-patched), so far as pg_dumpall is concerned.
  pg_dumpall's charter is to be able to recreate a database cluster's
  contents in a virgin installation, but it's failing to honor that
  contract if the cluster has any ALTER DATABASE SET default_read_only
  settings.  Similarly, I think it's reasonable to try to make pg_upgrade
  cope with the case.
 
  I also agree with *not* changing pg_dump, since it is not the charter
  of pg_dump to recreate a whole cluster, and the objection about possibly
  restoring into a database that was meant to be protected by this setting
  seems to have some force.
 
 For example, I have seen dumps accidentally restored to the
 postgres database on multiple occasions.  You might, for example,
 flag the postgres database with this, and thereby block such
 accidents.  The patch as it stands would allow pg_dumpall to
 replicate such a cluster, flag and all.  Without the patch you get
 many errors.
 
 It is also much easier to work around with pg_dump output.  You
 could get a psql connection to a database, set this off for the
 connection, and use \i to read the pg_dump output file.  Or you
 could concatenate a SET statement in front of the pg_dump output
 when piping it in.  There is no correspondingly easy solution for
 pg_dumpall.

OK, I have read through this again, and now I see the idea is that we
are trying to have pg_dumpall override the default_transaction_read_only
value, but keep it for pg_dump restores.  That is a pretty tricky
use-case, so I think we need to mention this in the C comments.

Tom and you think we should backpatch, and it seems only I am against
it, so I suppose you can move forward. We are packaging soon for a minor
release.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

