Re: [HACKERS] autonomous transactions

2016-09-03 Thread Craig Ringer
On 3 Sep. 2016 20:27, "Greg Stark"  wrote:
>

> Well using a separate process also requires rewriting locking and
> deadlock detection since a reasonable user might expect that second
> process to have access to data locked in their current transaction.

The  user is going to hit some confusing issues.

Xact1: alter table... Add column...
Xact2: inserts data with new format
Xact2: autonomous commits leaving xact1 running
Xact2: rollback

There are similar issues with FKs and and sorts. It isn't limited to schema
changes.

I don't see how autonomous xacts that inherit parent xact's uncommitted
data, locks, etc can ever really be safe. Useful, but use at your own
(considerable) risk and unlikely to be worth the issues they introduce.

If they inherit locks but don't see uncommitted data it'd be even worse.

See the autonomous commit subxact thread for more on these issues.

>
> Parallel query is currently restricted to read-only queries however.
> Autonomous transactions will certainly need to be read-write so the
> question then is what problems led to the restriction in parallel
> query and are they any more tractable with autonomous transactions?

Parallel query is fairly different because it doesn't have it's own xact.
This should mean no need to be able to wait on a separate virtualxid or
xid, no need to have their own TransactionID allocated, no complex
considerations of visibility and most importantly can't commit separately.


Re: [HACKERS] LSN as a recovery target

2016-09-03 Thread Michael Paquier
On Sun, Sep 4, 2016 at 8:05 AM, Michael Paquier
 wrote:
> On Sun, Sep 4, 2016 at 1:57 AM, Simon Riggs  wrote:
>> On 24 August 2016 at 05:50, Michael Paquier  
>> wrote:
>>
> Everything else looks in good order.
>>
>> Committed. Thanks.
>
> Thanks for the commit!

By the way, what has been committed does not include the patch adding
the parsing context in case of an error as wanted upthread. Perhaps
that's not worth adding now as there is the GUC refactoring
potentially happening for the recovery parameters, so I don't mind
much. Just that's worth mentioning.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Transactional enum additions - was Re: [HACKERS] Alter or rename enum value

2016-09-03 Thread Tom Lane
Andrew Dunstan  writes:
> OK, did that. Here is a patch that is undocumented but I think is 
> otherwise complete. It's been tested a bit and we haven't been able to 
> break it. Comments welcome.

Got around to looking at this.  Attached is a revised version that I think
is in committable shape.  The main non-cosmetic change is that the test
for "type was created in same transaction as new value" now consists of
comparing the xmins of the pg_type and pg_enum rows, without consulting
GetCurrentTransactionId().  I did not like the original coding because
it would pointlessly disallow references to enum values that were created
in a parent transaction of the current subxact.  This way is still leaving
some subxact use-cases on the table, as noted in the code comments, but
it's more flexible than before.

Barring objections I'll push this soon.

regards, tom lane

diff --git a/doc/src/sgml/ref/alter_type.sgml b/doc/src/sgml/ref/alter_type.sgml
index 9789881..aec73f6 100644
*** a/doc/src/sgml/ref/alter_type.sgml
--- b/doc/src/sgml/ref/alter_type.sgml
*** ALTER TYPE t_data) == GetCurrentTransactionId() &&
- 		!(tup->t_data->t_infomask & HEAP_UPDATED))
- 		 /* safe to do inside transaction block */ ;
- 	else
- 		PreventTransactionChain(isTopLevel, "ALTER TYPE ... ADD");
- 
  	/* Check it's an enum and check user has permission to ALTER the enum */
  	checkEnumOwner(tup);
  
--- 1236,1241 
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index ac50c2a..ac64135 100644
*** a/src/backend/tcop/utility.c
--- b/src/backend/tcop/utility.c
*** ProcessUtilitySlow(Node *parsetree,
*** 1359,1365 
  break;
  
  			case T_AlterEnumStmt:		/* ALTER TYPE (enum) */
! address = AlterEnum((AlterEnumStmt *) parsetree, isTopLevel);
  break;
  
  			case T_ViewStmt:	/* CREATE VIEW */
--- 1359,1365 
  break;
  
  			case T_AlterEnumStmt:		/* ALTER TYPE (enum) */
! address = AlterEnum((AlterEnumStmt *) parsetree);
  break;
  
  			case T_ViewStmt:	/* CREATE VIEW */
diff --git a/src/backend/utils/adt/enum.c b/src/backend/utils/adt/enum.c
index 135a544..47d5355 100644
*** a/src/backend/utils/adt/enum.c
--- b/src/backend/utils/adt/enum.c
***
*** 19,24 
--- 19,25 
  #include "catalog/indexing.h"
  #include "catalog/pg_enum.h"
  #include "libpq/pqformat.h"
+ #include "storage/procarray.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/fmgroids.h"
*** static Oid	enum_endpoint(Oid enumtypoid,
*** 31,36 
--- 32,124 
  static ArrayType *enum_range_internal(Oid enumtypoid, Oid lower, Oid upper);
  
  
+ /*
+  * Disallow use of an uncommitted pg_enum tuple.
+  *
+  * We need to make sure that uncommitted enum values don't get into indexes.
+  * If they did, and if we then rolled back the pg_enum addition, we'd have
+  * broken the index because value comparisons will not work reliably without
+  * an underlying pg_enum entry.  (Note that removal of the heap entry
+  * containing an enum value is not sufficient to ensure that it doesn't appear
+  * in upper levels of indexes.)  To do this we prevent an uncommitted row from
+  * being used for any SQL-level purpose.  This is stronger than necessary,
+  * since the value might not be getting inserted into a table or there might
+  * be no index on its column, but it's easy to enforce centrally.
+  *
+  * However, it's okay to allow use of uncommitted values belonging to enum
+  * types that were themselves created in the same transaction, because then
+  * any such index would also be new and would go away altogether on rollback.
+  * (This case is required by pg_upgrade.)
+  *
+  * This function needs to be called (directly or indirectly) in any of the
+  * functions below that could return an enum value to SQL operations.
+  */
+ static void
+ check_safe_enum_use(HeapTuple enumval_tup)
+ {
+ 	TransactionId xmin;
+ 	Form_pg_enum en;
+ 	HeapTuple	enumtyp_tup;
+ 
+ 	/*
+ 	 * If the row is hinted as committed, it's surely safe.  This provides a
+ 	 * fast path for all normal use-cases.
+ 	 */
+ 	if (HeapTupleHeaderXminCommitted(enumval_tup->t_data))
+ 		return;
+ 
+ 	/*
+ 	 * Usually, a row would get hinted as committed when it's read or loaded
+ 	 * into syscache; but just in case not, let's check the xmin directly.
+ 	 */
+ 	xmin = HeapTupleHeaderGetXmin(enumval_tup->t_data);
+ 	if (!TransactionIdIsInProgress(xmin) &&
+ 		TransactionIdDidCommit(xmin))
+ 		return;
+ 
+ 	/* It is a new enum value, so check to see if the whole enum is new */
+ 	en = (Form_pg_enum) GETSTRUCT(enumval_tup);
+ 	enumtyp_tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(en->enumtypid));
+ 	if (!HeapTupleIsValid(enumtyp_tup))
+ 		elog(ERROR, "cache lookup failed for type %u", en->enumtypid);
+ 
+ 	/*
+ 	 * We insist that the type have been created in the same (sub)transaction
+ 	 * as the enum value.  It would be safe to allow the type's 

Re: [HACKERS] Long options for pg_ctl waiting

2016-09-03 Thread Michael Paquier
On Sun, Sep 4, 2016 at 5:57 AM, Vik Fearing  wrote:
> One thing that has been irking me ever since I came to PostgreSQL is the
> fact that pg_ctl -w (and -W) don't have longhand equivalents.  I like to
> use the long version in scripts and such as extra documentation, and
> I've never been able to with these.  What's more, I keep forgetting that
> --wait (and --no-wait) aren't a thing.
>
> Trivial patch attached.

Nit: Like --nosync we could use --nowait, without an hyphen.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LSN as a recovery target

2016-09-03 Thread Michael Paquier
On Sun, Sep 4, 2016 at 1:57 AM, Simon Riggs  wrote:
> On 24 August 2016 at 05:50, Michael Paquier  wrote:
>
 Everything else looks in good order.
>
> Committed. Thanks.

Thanks for the commit!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2016-09-03 Thread Gavin Flower

On 04/09/16 06:41, Vik Fearing wrote:

On 08/31/2016 06:22 AM, Peter Eisentraut wrote:

Here is a patch that adds the notion of a data type to a sequence.  So
it might be CREATE SEQUENCE foo AS integer.  The types are restricted to
int{2,4,8} as now.

This patch does not apply cleanly to current master (=600dc4c).


Must admit I first thought of: "2, 4, 8, who do we appreciate!"


MORE SERIOUSLY:

Would a possibly future expansion be to include numeric?

Of hand, I can't see any value for using other than integers of 2, 4, & 
8 bytes (not a criticism! - may simply be a failure of imagination on my 
part).


I am curious as to the use cases for other possibilities.


Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autonomous transactions

2016-09-03 Thread Serge Rielau

> On Sep 3, 2016, at 5:25 AM, Greg Stark  wrote:
> 
> On Sat, Sep 3, 2016 at 12:09 PM, Simon Riggs  wrote:
>> So doing autonomous transactions inside a single backend doesn't gain
>> you very much, yet it is an enormously invasive patch to do it that
>> way, not least because it requires you to rewrite locking and
>> deadlocks to make them work correctly when proc is not 1:1 with xid.
>> And as Serge points out it introduces other restrictions that we know
>> about now, perhaps more as well.
> 
> Well using a separate process also requires rewriting locking and
> deadlock detection since a reasonable user might expect that second
> process to have access to data locked in their current transaction.The
> plus side is that we're already facing that issue with parallel query
> so at least it's something that only has to be solved once instead of
> a new problem.
I can’t speak for reasonable users, (or persons in fact ;-)
But… previous implementations of ATs do fully expect them to deadlock on their 
parents and not see uncommitted changes.
So if one’s goal is to merely match competitors’ behavior then that part is a 
non issue.
I do not recall the single backend approach having been particularly invasive.
We managed to do a the 9.4 -> 9.5 merge with little problem despite it.

IMHO, solving the problem of passing variables to and from an AT is required 
for viability of the feature.
How else would the AT know what it’s supposed to do?
Starting an AT within a DDL transaction seems a much more narrow use case to me.

Interestingly, despite being supported in PL/SQL on nested BEGIN END blocks, 
we nearly exclusively see AT’s covering the entire function or trigger.
This usage property can be used to narrow the scope of variable passing to 
function parameters.

Cheers
Serge Rielau
salesforce.com

  

 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Long options for pg_ctl waiting

2016-09-03 Thread Vik Fearing
One thing that has been irking me ever since I came to PostgreSQL is the
fact that pg_ctl -w (and -W) don't have longhand equivalents.  I like to
use the long version in scripts and such as extra documentation, and
I've never been able to with these.  What's more, I keep forgetting that
--wait (and --no-wait) aren't a thing.

Trivial patch attached.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


pg_ctl_wait-01.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Alter or rename enum value

2016-09-03 Thread Tom Lane
Emre Hasegeli  writes:
> Other than these, it looks good to me.  I am marking it as Ready for 
> Committer.

I started looking at this patch.  I'm kind of unhappy with having *both*
IF EXISTS and IF NOT EXISTS options on the statement, especially since
the locations of those phrases in the syntax seem to have been chosen
with a dartboard.  This feels way more confusing than it is useful.
Is there really a strong use-case for either option?  I note that
ALTER TABLE RENAME COLUMN, which is probably used a thousand times
more often than this will be, has so far not grown either kind of option,
which sure makes me think that this proposal is getting ahead of itself.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \timing interval

2016-09-03 Thread Tom Lane
I wrote:
> Attached is an updated patch that does it like that.  Sample output
> (generated by forcing specific arguments to PrintTiming):

> Time: 0.100 ms
> Time: 1.200 ms
> Time: 1001.200 ms (00:01.001)
> Time: 12001.200 ms (00:12.001)
> Time: 60001.200 ms (01:00.001)
> Time: 720001.200 ms (12:00.001)
> Time: 3660001.200 ms (01:01:00.001)
> Time: 43920001.200 ms (12:12:00.001)
> Time: 176460001.200 ms (2 01:01:00.001)
> Time: 216720001.200 ms (2 12:12:00.001)
> Time: 8816460001.200 ms (102 01:01:00.001)
> Time: 8856720001.200 ms (102 12:12:00.001)

After further thought I concluded that not providing any labeling of
days is a bad idea.  The hours, minutes, and seconds fields seem
reasonably self-explanatory given the formatting, but days not so much.
(I'm not sure whether that is the whole of Peter van H's objection,
but surely it's part of it.)  I pushed the patch using this:

Time: 176460001.200 ms (2 d 01:01:00.001)

and all else as before.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-09-03 Thread Petr Jelinek

On 03/09/16 18:04, Tom Lane wrote:

Petr Jelinek  writes:

On 02/09/16 22:57, Peter Eisentraut wrote:

The new system catalog pg_publication_rel has columns pubid, relid,
and does not use the customary column name prefixes.  Maybe that is OK
here.  I can't actually think of a naming scheme that wouldn't make
things worse.



Yeah, well I could not either and thee are some catalogs that don't use
the prefixes so I figured it's probably not big deal.


The ones that don't are not models to be emulated.  They are cases
where somebody ignored project convention and it wasn't caught until
too late.



Okay but if I follow the convention the names of those fields would be 
something like pubrelpubid and pubrelrelid which does not seem like 
improvement to me. Maybe the catalog should be pg_publication_map then 
as that would make it seem less ugly although less future proof (as 
we'll want to add more things to publications than just tables and they 
might need different catalogs).


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2016-09-03 Thread Vik Fearing
On 08/31/2016 06:22 AM, Peter Eisentraut wrote:
> Here is a patch that adds the notion of a data type to a sequence.  So
> it might be CREATE SEQUENCE foo AS integer.  The types are restricted to
> int{2,4,8} as now.

This patch does not apply cleanly to current master (=600dc4c).
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (Comment)Bug in CteScanNext

2016-09-03 Thread Tom Lane
Jim Nasby  writes:
> In CteScanNext():
>   /*
>* If we are not at the end of the tuplestore, or are going 
> backwards, try
>* to fetch a tuple from tuplestore.
>*/
>   eof_tuplestore = tuplestore_ateof(tuplestorestate);

>   if (!forward && eof_tuplestore)

> For the comment to be correct, wouldn't the condition need to be 
> (!forward || !eof_tuplestore)?

No.  The comment is describing the overall strategy for the next ~30
lines.  That first if-test is dealing with a sub-case, ie reversing
direction after reaching EOF.  The code's fine, and the comments are
okay as far as they go, but maybe some rearrangement would be in order.
Or we could add something like "But first, we must deal with the special
case of reversing direction after reaching EOF."

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in numeric aggregate

2016-09-03 Thread Tom Lane
Andreas Seltenreich  writes:
> Digging in the coredumps, it looks like set_var_from_num() is invoked on
> an uninitialized NumericVar.  Sample gdb session below.

Hm, yeah, looks like numeric_poly_deserialize is missing a required
init_var() step.  Slightly astonishing that this got through even
minimal testing.

For the archives' sake: I could not reproduce this at default settings,
because the aggregate calls don't get parallelized.  But it fails fairly
quickly with
set parallel_tuple_cost = 0;
set parallel_setup_cost = 0;
set min_parallel_relation_size = 0;

Thanks for the report!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT .. SET syntax

2016-09-03 Thread Vik Fearing
On 08/31/2016 04:12 PM, Marko Tiikkaja wrote:
> Hello hello,
> 
> Here's a rebased and updated patch for $SUBJECT for the September commit
> fest.

Review:

This patch is pretty straightforward, using mostly already existing
infrastructure.  I tried to break it in various ways and could not.  I
do have a few comments, though.

In insert.sgml, some things are  and some
are .  I don't expect this patch to fix
those inconsistencies, but it should certainly not perpetuate them.

This code comment in gram.y took me a while to figure out:

+/*
+ * This is different from set_clause_list used in UPDATE because the
SelectStmt
+ * syntax already does everything you might want to do in an in INSERT.
+ */

If the SelectStmt is all we need, why is this patch here?  I would
prefer wording such as "This is different from set_clause_list used in
UPDATE because we don't want multiple_set_clause.  The INSERT INTO ...
SELECT variant may be more appropriate in such cases."  Or something.

Aside from those trivialities, the main question about this patch is if
we actually want it.  It is not standard SQL syntax, and the only other
product I've located that uses it (or anything like it) is MySQL.
However, I can see how it would be a huge win for very wide tables and
so my personal opinion is to accept this syntax as a PostgreSQL
extension to the standard.

Marking ready for committer, the minor gripes above notwithstanding.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LSN as a recovery target

2016-09-03 Thread Simon Riggs
On 24 August 2016 at 05:50, Michael Paquier  wrote:

>>> Everything else looks in good order.

Committed. Thanks.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-03 Thread Vladimir Sitnikov
Ildar>The reason why this doesn't work is that '~~' operator (which is a
Ildar>synonym for 'like') isn't supported by operator class for btree. Since
Ildar>the only operators supported by btree are <, <=, =, >=, >, you can use
Ildar>it with queries like:

Ildar>And in 3rd query 'OFFSET' statement prevents rewriter from
Ildar>transforming the query, so it is possible to use index only scan on
Ildar>subquery and then filter the result of subquery with '~~' operator.

I'm afraid I do not follow you.
Note: query 3 is 100% equivalent of query 2, however query 3 takes 55 times
less reads.
It looks like either an optimizer bug, or some missing feature in the
"index only scan" logic.

Here's quote from "query 2" (note % are at both ends):  ... where type=42)
as x where upper_vc like '%ABC%';

Note: I do NOT use "indexed scan" for the like operator. I'm very well aware
that LIKE patterns with leading % cannot be optimized to a btree range scan.
What I want is "use the first indexed column as index scan, then use the
second column
for filtering".

As shown in "query 2" vs "query 3", PostgreSQL cannot come up with such a
plan on its own
for some reason.

This is not a theoretical issue, but it is something that I use a lot with
Oracle DB (it just creates a good plan for "query 2").

Vladimir


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-03 Thread Tom Lane
"Daniel Verite"  writes:
>   Rushabh Lathia wrote:
>> so its good to have all the set variable related checks inside
>> SetVariable().

> There's already another way to skip the \set check:
>   select 'on' as "AUTOCOMMIT" \gset

Hmm, that might be a bug.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-09-03 Thread Tom Lane
Petr Jelinek  writes:
> On 02/09/16 22:57, Peter Eisentraut wrote:
>> The new system catalog pg_publication_rel has columns pubid, relid,
>> and does not use the customary column name prefixes.  Maybe that is OK
>> here.  I can't actually think of a naming scheme that wouldn't make
>> things worse.

> Yeah, well I could not either and thee are some catalogs that don't use 
> the prefixes so I figured it's probably not big deal.

The ones that don't are not models to be emulated.  They are cases
where somebody ignored project convention and it wasn't caught until
too late.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_file_settings view patch

2016-09-03 Thread Simon Riggs
On 15 August 2016 at 12:17, Haribabu Kommi  wrote:

> comments?

This looks like a good feature contribution, thanks.

At present the patch doesn't apply cleanly, please rebase.

The patch doesn't contain any tests, which means I can't see what the
output looks like, so I can't judge the exact usefulness of this
particular patch. ISTM likely that there will be some detailed points
to review in there somewhere.

Do we want line number, or priority order? i.e. do we want to count
comment lines or just main rule lines? I prefer latter.
Various other questions along those lines needed, once I can see the output.

What is push_jsonb_string_value() etc..?
If there is required infrastructure there is no explanation of why.
Suggest you explain and/or split into two.

pg_hba_file_settings seems a clumsy name. I'd prefer pg_hba_settings,
since that name could live longer than the concept of pg_hba.conf,
which seems likely to become part of ALTER SYSTEM in future, so we
wouldn't really want the word "file" in there.

I've not seen anything yet to make me think a commit for this wouldn't
happen once we've worked the detail.

Thanks

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-03 Thread Magnus Hagander
On Thu, Sep 1, 2016 at 2:39 PM, Michael Paquier 
wrote:

> On Thu, Sep 1, 2016 at 5:13 PM, Magnus Hagander 
> wrote:
> > We don't seem to check for similar issues as the one just found in the
> > existing tests though, do we? As in, we don't actually verify that the
> xlog
> > files being streamed are 16Mb? (Or for that matter that the tarfile
> emitted
> > by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)
>
> No. There is no checks on the WAL file size (you should use the output
> of pg_controldata to see how large the segments should be). For the
> tar file, the complication is in its untar... Perl provides some ways
> to untar things, though the oldest version that we support in the TAP
> tests does not offer that :(
>

Ugh. That would be nice to have, but I think that's outside the scope of
this patch.

PFA is an updated version of this patch that:
* documents a magic value passed to zlib (which is in their documentation
as being a magic value, but has no define)
* fixes the padding of tar files
* adds a most basic test that the -X stream -Ft does produce a tarfile

As for using XLOGDIR to drive the name of the tarfile. pg_basebackup is
already hardcoded to use pg_xlog. And so are the tests. We probably want to
fix that, but that's a separate step and this patch will be easier to
review and test if we keep it out for now.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 03615da..981d201 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
 target directory, the tar contents will be written to
 standard output, suitable for piping to for example
 gzip. This is only possible if
-the cluster has no additional tablespaces.
+the cluster has no additional tablespaces and transaction
+log streaming is not used.


  
@@ -323,6 +324,10 @@ PostgreSQL documentation
  If the log has been rotated when it's time to transfer it, the
  backup will fail and be unusable.

+   
+The transaction log files will be written to
+ the base.tar file.
+   
   
  
 
@@ -339,6 +344,9 @@ PostgreSQL documentation
  client can keep up with transaction log received, using this mode
  requires no extra transaction logs to be saved on the master.

+   The transactionn log files are written to a separate file
+called pg_xlog.tar.
+   
   
  
 
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
 
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
 
 all: pg_basebackup pg_receivexlog pg_recvlogical
 
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 351a420..58c0821 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -365,7 +365,7 @@ typedef struct
 {
 	PGconn	   *bgconn;
 	XLogRecPtr	startptr;
-	char		xlogdir[MAXPGPATH];
+	char		xlog[MAXPGPATH];	/* directory or tarfile depending on mode */
 	char	   *sysidentifier;
 	int			timeline;
 } logstreamer_param;
@@ -383,9 +383,13 @@ LogStreamerMain(logstreamer_param *param)
 	stream.standby_message_timeout = standby_message_timeout;
 	stream.synchronous = false;
 	stream.mark_done = true;
-	stream.basedir = param->xlogdir;
 	stream.partial_suffix = NULL;
 
+	if (format == 'p')
+		stream.walmethod = CreateWalDirectoryMethod(param->xlog);
+	else
+		stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel);
+
 	if (!ReceiveXlogStream(param->bgconn, ))
 
 		/*
@@ -395,6 +399,14 @@ LogStreamerMain(logstreamer_param *param)
 		 */
 		return 1;
 
+	if (!stream.walmethod->finish())
+	{
+		fprintf(stderr,
+_("%s: could not finish writing WAL files: %s\n"),
+progname, strerror(errno));
+		return 1;
+	}
+
 	PQfinish(param->bgconn);
 	return 0;
 }
@@ -445,22 +457,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
 		/* Error message already written in GetConnection() */
 		exit(1);
 
-	snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
-	/*
-	 * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
-	 * basedir/pg_xlog as the directory entry in the tar file may arrive
-	 * later.
-	 */
-	snprintf(statusdir, sizeof(statusdir), 

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-03 Thread Michael Paquier
On Sat, Sep 3, 2016 at 10:26 PM, Magnus Hagander  wrote:
> Yes, we should definitely not allow that combination. In fact, maybe that
> argument in itself is enough not to have it for pg_receivexlog -- the cause
> of confusion.
>
> I can see how you might want to avoid it for pg_basebackup during testing
> for example,. but the overhead on pg_receivexlog shouldn't be as bad in
> testing, should it?

No, I haven't tested directly but it should not be.. pg_basebackup
does always a bulk write, while pg_receivexlog depends on the server
activity.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speedup twophase transactions

2016-09-03 Thread Michael Paquier
On Fri, Sep 2, 2016 at 5:06 AM, Simon Riggs  wrote:
> On 13 April 2016 at 15:31, Stas Kelvich  wrote:
>
>> Fixed patch attached. There already was infrastructure to skip currently
>> held locks during replay of standby_redo() and I’ve extended that with check 
>> for
>> prepared xids.
>
> Please confirm that everything still works on current HEAD for the new
> CF, so review can start.

The patch does not apply cleanly. Stas, could you rebase? I am
switching the patch to "waiting on author" for now.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-03 Thread Magnus Hagander
On Sat, Sep 3, 2016 at 3:21 PM, Michael Paquier 
wrote:

> On Sat, Sep 3, 2016 at 12:42 AM, Magnus Hagander 
> wrote:
> > On Fri, Sep 2, 2016 at 8:50 AM, Michael Paquier <
> michael.paqu...@gmail.com>
> > wrote:
> >> On Fri, Sep 2, 2016 at 2:20 AM, Peter Eisentraut
> >>  wrote:
> >> > On 5/13/16 2:39 AM, Michael Paquier wrote:
> >> What do others think about that? I could implement that on top of 0002
> >> with some extra options. But to be honest that looks to be just some
> >> extra sugar for what is basically a bug fix... And I am feeling that
> >> providing such a switch to users would be a way for one to shoot
> >> himself badly, particularly for pg_receivexlog where a crash can cause
> >> segments to go missing.
> >>
> >
> > Well, why do we provide a --nosync option for initdb? Wouldn't the
> argument
> > basically be the same?
>
> Yes, the good-for-testing-but-not-production argument.
>
> > I agree it kind of feels like overkill, but it would be consistent
> overkill?
> > :)
>
> Oh, well. I have just implemented it on top of the two other patches
> for pg_basebackup. For pg_receivexlog, I am wondering if it makes
> sense to have it. That would be trivial to implement it, and I think
> that we had better make the combination of --synchronous and --nosync
> just leave with an error. Thoughts about having that for
> pg_receivexlog?
>

Yes, we should definitely not allow that combination. In fact, maybe that
argument in itself is enough not to have it for pg_receivexlog -- the cause
of confusion.

I can see how you might want to avoid it for pg_basebackup during testing
for example,. but the overhead on pg_receivexlog shouldn't be as bad in
testing, should it?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-03 Thread Michael Paquier
On Sat, Sep 3, 2016 at 10:22 PM, Michael Paquier
 wrote:
> On Sat, Sep 3, 2016 at 10:21 PM, Michael Paquier
>  wrote:
>> Oh, well. I have just implemented it on top of the two other patches
>> for pg_basebackup. For pg_receivexlog, I am wondering if it makes
>> sense to have it. That would be trivial to implement it, and I think
>> that we had better make the combination of --synchronous and --nosync
>> just leave with an error. Thoughts about having that for
>> pg_receivexlog?
>
> With patches that's actually better..

Meh, meh et meh.
-- 
Michael
From be6888046cc7dcfde33c22294e8d94a9369ff6b5 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 2 Sep 2016 15:19:11 +0900
Subject: [PATCH 1/3] Relocation fsync routines of initdb into src/common

Those are aimed at being used by other utilities, pg_basebackup mainly,
and at some other degree by pg_dump and pg_receivexlog.
---
 src/bin/initdb/initdb.c | 266 +-
 src/common/Makefile |   2 +-
 src/common/file_utils.c | 279 
 src/include/common/file_utils.h |  22 
 src/tools/msvc/Mkvcbuild.pm |   2 +-
 5 files changed, 310 insertions(+), 261 deletions(-)
 create mode 100644 src/common/file_utils.c
 create mode 100644 src/include/common/file_utils.h

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 9407492..0b74ff9 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -61,6 +61,7 @@
 #endif
 
 #include "catalog/catalog.h"
+#include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
 #include "mb/pg_wchar.h"
@@ -70,13 +71,6 @@
 #include "fe_utils/string_utils.h"
 
 
-/* Define PG_FLUSH_DATA_WORKS if we have an implementation for pg_flush_data */
-#if defined(HAVE_SYNC_FILE_RANGE)
-#define PG_FLUSH_DATA_WORKS 1
-#elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
-#define PG_FLUSH_DATA_WORKS 1
-#endif
-
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
 extern const char *select_default_timezone(const char *share_path);
 
@@ -237,13 +231,6 @@ static char **filter_lines_with_token(char **lines, const char *token);
 #endif
 static char **readfile(const char *path);
 static void writefile(char *path, char **lines);
-static void walkdir(const char *path,
-		void (*action) (const char *fname, bool isdir),
-		bool process_symlinks);
-#ifdef PG_FLUSH_DATA_WORKS
-static void pre_sync_fname(const char *fname, bool isdir);
-#endif
-static void fsync_fname_ext(const char *fname, bool isdir);
 static FILE *popen_check(const char *command, const char *mode);
 static void exit_nicely(void);
 static char *get_id(void);
@@ -270,7 +257,6 @@ static void load_plpgsql(FILE *cmdfd);
 static void vacuum_db(FILE *cmdfd);
 static void make_template0(FILE *cmdfd);
 static void make_postgres(FILE *cmdfd);
-static void fsync_pgdata(void);
 static void trapsig(int signum);
 static void check_ok(void);
 static char *escape_quotes(const char *src);
@@ -529,177 +515,6 @@ writefile(char *path, char **lines)
 }
 
 /*
- * walkdir: recursively walk a directory, applying the action to each
- * regular file and directory (including the named directory itself).
- *
- * If process_symlinks is true, the action and recursion are also applied
- * to regular files and directories that are pointed to by symlinks in the
- * given directory; otherwise symlinks are ignored.  Symlinks are always
- * ignored in subdirectories, ie we intentionally don't pass down the
- * process_symlinks flag to recursive calls.
- *
- * Errors are reported but not considered fatal.
- *
- * See also walkdir in fd.c, which is a backend version of this logic.
- */
-static void
-walkdir(const char *path,
-		void (*action) (const char *fname, bool isdir),
-		bool process_symlinks)
-{
-	DIR		   *dir;
-	struct dirent *de;
-
-	dir = opendir(path);
-	if (dir == NULL)
-	{
-		fprintf(stderr, _("%s: could not open directory \"%s\": %s\n"),
-progname, path, strerror(errno));
-		return;
-	}
-
-	while (errno = 0, (de = readdir(dir)) != NULL)
-	{
-		char		subpath[MAXPGPATH];
-		struct stat fst;
-		int			sret;
-
-		if (strcmp(de->d_name, ".") == 0 ||
-			strcmp(de->d_name, "..") == 0)
-			continue;
-
-		snprintf(subpath, MAXPGPATH, "%s/%s", path, de->d_name);
-
-		if (process_symlinks)
-			sret = stat(subpath, );
-		else
-			sret = lstat(subpath, );
-
-		if (sret < 0)
-		{
-			fprintf(stderr, _("%s: could not stat file \"%s\": %s\n"),
-	progname, subpath, strerror(errno));
-			continue;
-		}
-
-		if (S_ISREG(fst.st_mode))
-			(*action) (subpath, false);
-		else if (S_ISDIR(fst.st_mode))
-			walkdir(subpath, action, false);
-	}
-
-	if (errno)
-		fprintf(stderr, _("%s: could not read directory \"%s\": %s\n"),
-progname, path, strerror(errno));
-
-	(void) closedir(dir);
-
-	/*
-	 * It's important to fsync the destination directory 

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-03 Thread Michael Paquier
On Sat, Sep 3, 2016 at 10:21 PM, Michael Paquier
 wrote:
> On Sat, Sep 3, 2016 at 12:42 AM, Magnus Hagander  wrote:
>> On Fri, Sep 2, 2016 at 8:50 AM, Michael Paquier 
>> wrote:
>>> On Fri, Sep 2, 2016 at 2:20 AM, Peter Eisentraut
>>>  wrote:
>>> > On 5/13/16 2:39 AM, Michael Paquier wrote:
>>> What do others think about that? I could implement that on top of 0002
>>> with some extra options. But to be honest that looks to be just some
>>> extra sugar for what is basically a bug fix... And I am feeling that
>>> providing such a switch to users would be a way for one to shoot
>>> himself badly, particularly for pg_receivexlog where a crash can cause
>>> segments to go missing.
>>>
>>
>> Well, why do we provide a --nosync option for initdb? Wouldn't the argument
>> basically be the same?
>
> Yes, the good-for-testing-but-not-production argument.
>
>> I agree it kind of feels like overkill, but it would be consistent overkill?
>> :)
>
> Oh, well. I have just implemented it on top of the two other patches
> for pg_basebackup. For pg_receivexlog, I am wondering if it makes
> sense to have it. That would be trivial to implement it, and I think
> that we had better make the combination of --synchronous and --nosync
> just leave with an error. Thoughts about having that for
> pg_receivexlog?

With patches that's actually better..
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-03 Thread Michael Paquier
On Sat, Sep 3, 2016 at 12:42 AM, Magnus Hagander  wrote:
> On Fri, Sep 2, 2016 at 8:50 AM, Michael Paquier 
> wrote:
>> On Fri, Sep 2, 2016 at 2:20 AM, Peter Eisentraut
>>  wrote:
>> > On 5/13/16 2:39 AM, Michael Paquier wrote:
>> What do others think about that? I could implement that on top of 0002
>> with some extra options. But to be honest that looks to be just some
>> extra sugar for what is basically a bug fix... And I am feeling that
>> providing such a switch to users would be a way for one to shoot
>> himself badly, particularly for pg_receivexlog where a crash can cause
>> segments to go missing.
>>
>
> Well, why do we provide a --nosync option for initdb? Wouldn't the argument
> basically be the same?

Yes, the good-for-testing-but-not-production argument.

> I agree it kind of feels like overkill, but it would be consistent overkill?
> :)

Oh, well. I have just implemented it on top of the two other patches
for pg_basebackup. For pg_receivexlog, I am wondering if it makes
sense to have it. That would be trivial to implement it, and I think
that we had better make the combination of --synchronous and --nosync
just leave with an error. Thoughts about having that for
pg_receivexlog?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in numeric aggregate

2016-09-03 Thread Andreas Seltenreich
Hi,

updating master from be7f7ee..39b691f, the following assertion is
triggered frequently by sqlsmith:

TRAP: BadArgument("!(((context) != ((void *)0) && (const 
Node*)((context)))->type) == T_AllocSetContext", File: "mcxt.c", Line: 1010)

Digging in the coredumps, it looks like set_var_from_num() is invoked on
an uninitialized NumericVar.  Sample gdb session below.

Below is also one of the generated queries that eventually triggers it
for me when invoked a dozen times or so.

regards,
Andreas

--8<---cut here---start->8---
select
  subq_0.c0 as c0,
  subq_0.c0 as c1,
  5 as c2,
  (select pg_catalog.min(class) from public.f_star)
 as c3
from
  (select
sample_2.cc as c0
  from
public.shoelace_arrive as ref_0
  inner join public.hub as sample_1
right join public.e_star as sample_2
on (sample_1.name = sample_2.class )
  on (ref_0.arr_name = sample_2.class )
  limit 63) as subq_0
where ((subq_0.c0 is not NULL)
and ((select pg_catalog.var_pop(enumsortorder) from pg_catalog.pg_enum)
 is not NULL))
  and (((select pg_catalog.var_samp(random) from public.bt_txt_heap)
 is NULL)
or ((select m from public.money_data limit 1 offset 1)
 <> (select pg_catalog.min(salary) from public.rtest_empmass)
));
--8<---cut here---end--->8---

(gdb) bt
#0  0x7ff011f221c8 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7ff011f2364a in __GI_abort () at abort.c:89
#2  0x007ef1b1 in ExceptionalCondition 
(conditionName=conditionName@entry=0x9d26c8 "!(((context) != ((void *)0) && 
(const Node*)((context)))->type) == T_AllocSetContext", 
errorType=errorType@entry=0x835c25 "BadArgument", 
fileName=fileName@entry=0x9d2640 "mcxt.c", lineNumber=lineNumber@entry=1010) at 
assert.c:54
#3  0x00813561 in pfree (pointer=) at mcxt.c:1010
#4  0x00773169 in alloc_var (var=var@entry=0x7ffe3a6d18d0, 
ndigits=ndigits@entry=6) at numeric.c:5387
#5  0x00774230 in set_var_from_num (num=0x1e49180, dest=0x7ffe3a6d18d0) 
at numeric.c:5608
#6  0x0077be2c in numeric_poly_deserialize (fcinfo=) at 
numeric.c:4196
#7  0x005ec48c in combine_aggregates (aggstate=0x1e255d8, 
pergroup=) at nodeAgg.c:986
#8  0x005edcc5 in agg_retrieve_direct (aggstate=0x1e255d8) at 
nodeAgg.c:2095
#9  ExecAgg (node=node@entry=0x1e255d8) at nodeAgg.c:1837
#10 0x005e0078 in ExecProcNode (node=node@entry=0x1e255d8) at 
execProcnode.c:503
#11 0x0060173c in ExecSetParamPlan (node=, 
econtext=0x1e2e710) at nodeSubplan.c:995
#12 0x005e4f75 in ExecEvalParamExec (exprstate=, 
econtext=, isNull=0x7ffe3a6d1b3f "", isDone=) at 
execQual.c:1140
#13 0x005e14c6 in ExecEvalNullTest (nstate=0x1e2ec50, 
econtext=0x1e2e710, isNull=0x7ffe3a6d1b3f "", isDone=0x0) at execQual.c:3902
#14 0x005e0656 in ExecEvalOr (orExpr=, 
econtext=0x1e2e710, isNull=0x7ffe3a6d1b3f "", isDone=) at 
execQual.c:2809
#15 0x005e7089 in ExecQual (qual=, 
econtext=econtext@entry=0x1e2e710, resultForNull=resultForNull@entry=0 '\000') 
at execQual.c:5379
#16 0x005fd6b1 in ExecResult (node=node@entry=0x1e2e5f8) at 
nodeResult.c:82
#17 0x005e01f8 in ExecProcNode (node=node@entry=0x1e2e5f8) at 
execProcnode.c:392
#18 0x005dc27e in ExecutePlan (dest=0x7ff0129e22b0, 
direction=, numberTuples=0, sendTuples=, 
operation=CMD_SELECT, use_parallel_mode=, planstate=0x1e2e5f8, 
estate=0x1e1aba8) at execMain.c:1567
#19 standard_ExecutorRun (queryDesc=0x1d563b8, direction=, 
count=0) at execMain.c:338
#20 0x006faad8 in PortalRunSelect (portal=portal@entry=0x1def878, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x7ff0129e22b0) at pquery.c:948
#21 0x006fc04e in PortalRun (portal=portal@entry=0x1def878, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', 
dest=dest@entry=0x7ff0129e22b0, altdest=altdest@entry=0x7ff0129e22b0, 
completionTag=completionTag@entry=0x7ffe3a6d1fa0 "") at pquery.c:789
#22 0x006f8deb in exec_simple_query (query_string=0x1dc2e58 "select  \n 
 subq_0.c0 as c0, \n  subq_0.c0 as c1, \n  5 as c2, \n  (select 
pg_catalog.min(class) from public.f_star)\n as c3\nfrom \n  (select  \n 
   sample_2.cc as c0\n  from \npublic.shoel"...) at postgres.c:1094
#23 PostgresMain (argc=, argv=argv@entry=0x1d64730, 
dbname=0x1d64590 "regression", username=) at postgres.c:4070
#24 0x0046cf81 in BackendRun (port=0x1d4ffd0) at postmaster.c:4260
#25 BackendStartup (port=0x1d4ffd0) at postmaster.c:3934
#26 ServerLoop () at postmaster.c:1691
#27 0x00693634 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x1d2c5d0) at postmaster.c:1299
#28 0x0046e0d6 in main (argc=3, argv=0x1d2c5d0) at main.c:228
(gdb) frame 5
#5  0x00774230 in 

Re: [HACKERS] autonomous transactions

2016-09-03 Thread Greg Stark
On Sat, Sep 3, 2016 at 12:09 PM, Simon Riggs  wrote:
> So doing autonomous transactions inside a single backend doesn't gain
> you very much, yet it is an enormously invasive patch to do it that
> way, not least because it requires you to rewrite locking and
> deadlocks to make them work correctly when proc is not 1:1 with xid.
> And as Serge points out it introduces other restrictions that we know
> about now, perhaps more as well.

Well using a separate process also requires rewriting locking and
deadlock detection since a reasonable user might expect that second
process to have access to data locked in their current transaction.The
plus side is that we're already facing that issue with parallel query
so at least it's something that only has to be solved once instead of
a new problem.

Parallel query is currently restricted to read-only queries however.
Autonomous transactions will certainly need to be read-write so the
question then is what problems led to the restriction in parallel
query and are they any more tractable with autonomous transactions?

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-03 Thread Tomas Vondra

Hi Ildar,

I've looked at this patch again today to do a bit more thorough review, 
and I think it's fine. There are a few comments (particularly in the new 
code in check_index_only) that need improving, and also a few small 
tweaks in the walkers.


Attached is a modified v5 patch with the proposed changes - it's 
probably easier than explaining what the changes should/might be.


I've added an XXX comment in check_index_only_expr_walker - ISTM we're 
first explicitly matching  Vars to index attributes, and then dealing 
with expressions. But we loop over all index columns (including those 
that can only really match Vars). Not sure if it makes any difference, 
but is it worth differentiating between Var and non-Var expressions? Why 
not to simply call match_index_to_operand() in both cases?


I've also tweaked a few comments to match project code style, and moved 
a few variables into the block where they are used. But the latter is 
probably matter of personal taste, I guess.



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


indexonlyscan5-tomas.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autonomous transactions

2016-09-03 Thread Simon Riggs
On 2 September 2016 at 09:45, Robert Haas  wrote:
> On Wed, Aug 31, 2016 at 7:20 AM, Peter Eisentraut
>  wrote:
>> I would like to propose the attached patch implementing autonomous
>> transactions for discussion and review.
>
> I'm pretty skeptical of this approach.  Like Greg Stark, Serge Rielau,
> and Constantin Pan, I had expected that autonomous transactions should
> be implemented inside of a single backend, without relying on workers.

The problem is that we have limits on the number of concurrent
transactions, which is currently limited by the number of procs.

We could expand that but given the current way we do snapshots there
will be realistically low limits however we do this.

So doing autonomous transactions inside a single backend doesn't gain
you very much, yet it is an enormously invasive patch to do it that
way, not least because it requires you to rewrite locking and
deadlocks to make them work correctly when proc is not 1:1 with xid.
And as Serge points out it introduces other restrictions that we know
about now, perhaps more as well.

Given that I've already looked into the "single backend" approach in
detail, it looks to me that Peter's approach is viable and robust.

> That approach would be much less likely to run afoul of limits on the
> number of background workers

That would also be an argument against using them for parallelism, yet
we used background workers there.

> , and it will probably perform
> considerably better too, especially when the autonomous transaction
> does only a small amount of work, like inserting a log message
> someplace.  That is not to say that providing an interface to some
> pg_background-like functionality is a bad idea; there's been enough
> interest in that from various quarters to suggest that it's actually
> quite useful, and I don't even think that it's a bad plan to integrate
> that with the PLs in some way.  However, I think that it's a different
> feature than autonomous transactions.  As others have also noted, it
> can be used to fire-and-forget a command, or to run a command while
> foreground processing continues, both of which would be out of scope
> for an autonomous transaction facility per se.  So my suggestion is
> that you pursue the work but change the name.

We agree that we like the infrastructure Peter has provided is useful,
so that part can go ahead.

If that infrastructure can be used in a simple syntactic way to
provide Autonomous Transactions, then I say let that happen. Peter,
please comment on how much infrastructure is required to implement ATs
this way?

If somebody believes there is a better way for ATs, that is just an
optimization of the limits and can occur later, if the people who
believe there is a better way can prove that is the case and come up
with a patch. It's OK for features to go in now and have better
infrastructure later, e.g. LISTEN/NOTIFY.

I would very much like to see ATs in v10 and this is a viable approach
with working code.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-03 Thread Daniel Verite
Rushabh Lathia wrote:

> It might happen that SetVariable() can be called from other places in
> future,
> so its good to have all the set variable related checks inside
> SetVariable().

There's already another way to skip the \set check:
  select 'on' as "AUTOCOMMIT" \gset

But there's a function in startup.c which might be the ideal location
for the check, as it looks like the one and only place where the
autocommit flag actually changes:

static void
autocommit_hook(const char *newval)
{
 pset.autocommit = ParseVariableBool(newval, "AUTOCOMMIT");
}



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-03 Thread Daniel Verite
Rahila Syed wrote:

> >Above error coming because in below code block change, you don't have
> check for
> >command (you should check opt0 for AUTOCOMMIT command)

> Corrected in the attached.

There are other values than ON: true/yes and theirs abbreviations,
the value 1, and anything that doesn't resolve to OFF is taken as ON.
ParseVariableBool() in commands.c already does the job of converting
these to bool, the new code should probably just call that function
instead of parsing the value itself.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-09-03 Thread Petr Jelinek

On 02/09/16 22:57, Peter Eisentraut wrote:

Review of 0001-Add-PUBLICATION-catalogs-and-DDL.patch:



Thanks!


The new system catalog pg_publication_rel has columns pubid, relid,
and does not use the customary column name prefixes.  Maybe that is OK
here.  I can't actually think of a naming scheme that wouldn't make
things worse.



Yeah, well I could not either and thee are some catalogs that don't use 
the prefixes so I figured it's probably not big deal.




In psql, the code psql_error("The server (version %d.%d) does not
...") should be updated to use the new formatPGVersionNumber()
function.



Right, same thing will be in the 2nd patch.


In psql, psql \dr is already for "roles" (\drds).  You are adding \drp
for publications.  Maybe use big R for replication-related describes?



Seems reasonable.


There should be some documentation about how TRUNCATE commands are
handled by publications.  Patch 0005 mentions TRUNCATE in the general
documentation, but I would have questions when reading the CREATE
PUBLICATION reference page.



That's actually bug in the 0005 patch, TRUNCATE is not handled ATM, but 
that should be probably documented as well.



Also, document how publications deal with INSERT ON CONFLICT.



Okay, they just replicate whatever was the result of that operation (if 
any).



In some places, the new publication object type is just added to the
end of a list instead of some alphabetical place, e.g.,
event_trigger.c, gram.y (drop_type).


Hmm, what is and what isn't alphabetically sorted is quite unclear for 
me as we have mix of both everywhere. For example, if you consider 
drop_type to be alphabetically sorted then our locales are much more 
different than I thought :)




What are the BKI_ROWTYPE_OID assignments for?  Are they necessary
here?  (Maybe this was just copied from pg_subscription?)



Yes they are.


I think some or all of replication/logical/publication.c should be
catalog/pg_publication.c.  There are various different precedents in
how this can be split up, but I kind of like having command/foocmds.c
call into catalog/pg_foo.c.



Okay, I prefer grouping the code by functionality (as in terms of this 
is replication) rather than architectures (as in terms this is catalog) 
but no problem moving it. Again same thing will be in 2nd patch.



Also, some things could be in lsyscache.c, although not too many new
things go in there now.


TBH I dislike the whole lsyscache concept of just random lookup 
functions piled in one huge module and would rather not add to it.




Most calls of the GetPublication() function could be changed to a
simpler get_publication_name(Oid), because that is all it is used for
so far.  (It will be used later in 0003, but only in one specific
case.)


You mean the calls from objectaddress? Will change that - I actually 
added the get_publication_name much later in the development and didn't 
go back to use it in preexisting code.




If I add a table to a publication, it requires a primary key.  But
after the table is added, I can remove the primary key.  There is code
in publication_add_relation() to record dependencies for that, but it
doesn't seem to do its job right.



I need to rewrite that part. That's actually something I could use other 
people opinion on - currently the pg_publication_rel does not have 
records for the alltables publication as that seemed redundant so it 
will need some special handling in tablecmds.c for the "dependency" 
tracking and possibly elsewhere for other things. I do wonder though if 
we should instead just add records to the pg_publication_rel catalog.



Relatedly, the error messages in check_publication_add_relation() and
AlterPublicationOptions() conflate replica identity index and primary
key.  (I suppose the whole user-facing presentation of what replica
identity indexes are, which have so far been a rather obscure feature,
will need some polishing during this.)



Those are copy/paste issues from pglogical. It should say replica 
identity index everywhere. But yes it might be needed to make it more 
obvious what replica identity indexes are.



I think the syntax could be made prettier.  For example, instead of

CREATE PUBLICATION testpib_ins_trunct WITH noreplicate_delete
noreplicate_update;

how about something like

CREATE PUBLICATION foo (REPLICATE DELETE, NO REPLICATE UPDATE);



I went with the same syntax style as CREATE ROLE, but I am open to changes.


I also found ALTER PUBLICATION FOR TABLE / FOR ALL TABLES confusing.
Maybe that should be SET TABLE or something.



Yeah I am not sure what is the best option there. SET was also what I 
was thinking but then it does not map well to the CREATE PUBLICATION 
syntax and I would like to have some harmony there.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your 

Re: [HACKERS] standalone backend PANICs during recovery

2016-09-03 Thread Tom Lane
Bernd Helmle  writes:
> Well, the "use case" was a crashed hot standby at a customer site (it kept
> PANICing after restarting), where i decided to have a look through the
> recovery process with gdb. The exact PANIC was

> 2016-03-29 15:12:40 CEST [16097]: [26-1] user=,db=,xid=0,vxid=1/0 FATAL:
> unexpected GIN leaf action: 0

BTW, that didn't happen to be on big-endian hardware did it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers