Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-05 14:02 GMT+01:00 Jan Michálek :

>
>
> 2017-03-05 13:39 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2017-03-05 13:22 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-05 13:08 GMT+01:00 Jan Michálek :
>>>
 It is a question whether it is really a new format, because the formatting
 is the same as the aligned/wrapped format; only the style of the lines changes.

>>>
>>> Please, don't do top posting https://en.wikipedia.org/wiki/
>>> Posting_style#Top-posting
>>>
>>>
 2017-03-05 12:36 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-05 11:40 GMT+01:00 Jan Michálek :
>
>> I know, but both new linestyles are created literally by cloning the
>> ascii linestyle and a few lines in print_aligned_text. Both work with
>> the "aligned" and "wrapped" formats. In rst the wrapped format is useful,
>> in my opinion; in markdown I can't find how to get a newline in a record
>> (maybe it is not possible in the main markdown variants). That is why I
>> added markdown and rst as new linestyles. But it is not a problem to
>> change the command to use "\pset format"; I just think this is cleaner.
>>
>
> Using a special linestyle for a new format is probably possible, but a new
> format should be switched with the \pset format command.
>

changed

>
> Not sure if the wrapped or aligned behavior is correct for markdown - that
> is a task for the markdown processor, not for psql.
>

>>>
>>>
>>> In this case I am inclined to prefer setting it via the format setting - you
>>> can set a linestyle and border in one step, and then it is easy to return
>>> to the previous format. I don't see a big benefit in enhancing the set of
>>> ascii linestyles. Collecting new features in formats is more intuitive
>>> (for me).
>>>
>>
>>  We can discuss which of these we prefer and which we want to implement:
>>
>>
>>
>> 1. Nicely formatted markdown tables
>>
>>
>> | Tables| Are   | Cool  |
>> | - |:-:| -:|
>> | col 3 is  | right-aligned | $1600 |
>> | col 2 is  | centered  |   $12 |
>> | zebra stripes | are neat  |$1 |
>>
>> or 2. just enough formatting
>>
>>
>> Markdown | Less | Pretty
>> --- | --- | ---
>> *Still* | `renders` | **nicely**
>> 1 | 2 | 3
>>
>> I personally prefer the nicely formatted table, because the document
> source is more comfortable to read and easier to edit in blocks (deleting
> whole columns etc.).
> I will change \pset to format.
> I found that adding <\br> for a newline works in retext. I will try to add
> it to the patch.
>
> | Tables | Are | Cool |
>
> | - |:-:| -:|
>
> | col 3 is | right-aligned | $1600 |
>
> | col 2 is | centered | $12 |
>
> | zebra stripes | are neat | $1 |
>
>
> Jan
>
>
>
>> Pavel
>>
>>
>>
>>
>
>
> --
> Jelen
> Starší čeledín datovýho chlíva
>



-- 
Jelen
Starší čeledín datovýho chlíva
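As an example of the <\br> idea above, a record containing a newline could be
sketched like this in markdown (an illustration only, not actual output of the
patch):

| Tables | Are   | Cool |
| - |:-:| -:|
| multi<br>line | centered  |  $12 |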
diff -ru a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
--- a/doc/src/sgml/ref/psql-ref.sgml2017-02-06 22:45:25.0 +0100
+++ b/doc/src/sgml/ref/psql-ref.sgml2017-03-06 01:35:46.0 +0100
@@ -2326,7 +2326,8 @@
   aligned, wrapped,
   html, asciidoc,
   latex (uses tabular),
-  latex-longtable, or
+  latex-longtable, 
+ rst, markdown, or
   troff-ms.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
@@ -2354,7 +2355,8 @@
 
   
   The html, asciidoc, latex,
-  latex-longtable, and troff-ms
+  latex-longtable, troff-ms,
+  markdown, and rst
   formats put out tables that are intended to
   be included in documents using the respective mark-up
   language. They are not complete documents! This might not be
diff -ru a/src/bin/psql/command.c b/src/bin/psql/command.c
--- a/src/bin/psql/command.c2017-02-06 22:45:25.0 +0100
+++ b/src/bin/psql/command.c2017-03-06 03:27:07.0 +0100
@@ -2494,6 +2494,12 @@
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_MARKDOWN:
+   return "markdown";
+   break;
+   case PRINT_RST:
+   return "rst";
+   break;
}
return "unknown";
 }
@@ -2565,9 +2571,13 @@
popt->topt.format = PRINT_LATEX_LONGTABLE;
else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
popt->topt.format = PRINT_TROFF_MS;
+   else if (pg_strncasecmp("markdown", value, vallen) == 0) 
/*markdown*/
+   popt->topt.format = PRINT_MARKDOWN;
+   else if (pg_strncasecmp("rst", value, vallen) == 0) /*rst*/
+   popt->topt.format = PRINT_RST;
else
{
-   psql_error("\\pset: allowed formats are unaligned, 
aligned, wrapped, html, 

Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-06 Thread Aleksander Alekseev
Hello.

OK, here is a patch.

Benchmark, before:

```
number of transactions actually processed: 1823
latency average = 1153.495 ms
latency stddev = 154.366 ms
tps = 6.061104 (including connections establishing)
tps = 6.061211 (excluding connections establishing)
```

Benchmark, after:

```
number of transactions actually processed: 2462
latency average = 853.862 ms
latency stddev = 112.270 ms
tps = 8.191861 (including connections establishing)
tps = 8.192028 (excluding connections establishing)
```

+35% TPS, just as expected. Feel free to run your own benchmarks on
different datasets and workloads. `perf top` shows that the first bottleneck
is completely eliminated. I did nothing about the second bottleneck
since, as Amit mentioned, partition-pruning should solve it anyway, and
micro-optimizations there apparently aren't worth the effort.

I also checked test coverage using lcov to make sure that this code is
covered by tests. The exact script I'm using can be found here [1].

[1] https://github.com/afiskon/pgscripts/blob/master/code-coverage.sh
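
The core of the fix is a rel_id lookup along these lines (a simplified
sketch, not the exact code; the entry layout follows the TabStatHashEntry
in the patch below):

```
/* Sketch: find the PgStat_TableStatus for a relation via dynahash. */
static PgStat_TableStatus *
find_tabstat_entry(Oid rel_id)
{
	TabStatHashEntry *hash_entry;

	if (pgStatTabHash == NULL)
		return NULL;			/* lookup table not created yet */

	hash_entry = hash_search(pgStatTabHash, &rel_id, HASH_FIND, NULL);
	if (hash_entry == NULL)
		return NULL;

	return hash_entry->tsa_entry;
}
```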

On Wed, Mar 01, 2017 at 10:36:24AM +0900, Amit Langote wrote:
> Hi,
> 
> On 2017/02/28 23:25, Aleksander Alekseev wrote:
> > Hello.
> > 
> > I decided to figure out whether current implementation of declarative
> > partitioning has any bottlenecks when there is a lot of partitions. Here
> > is what I did [1].
> 
> Thanks for sharing.
> 
> > Then:
> > 
> > ```
> > # 2580 is some pk that exists
> > echo 'select * from part_test where pk = 2580;' > t.sql
> > pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax
> > ```
> > 
> > `perf top` showed two bottlenecks [2]. A stacktrace for the first one
> > looks like this [3]:
> > 
> > ```
> > 0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') 
> > at pgstat.c:1689
> > 1689if (entry->t_id == rel_id)
> > #0  0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 
> > '\000') at pgstat.c:1689
> > #1  0x007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at 
> > pgstat.c:1666
> > #2  0x004c7090 in relation_open (relationId=25696, lockmode=0) at 
> > heapam.c:1137
> > #3  0x004c72c9 in heap_open (relationId=25696, lockmode=0) at 
> > heapam.c:1291
> > (skipped)
> > ```
> > 
> > And here is a stacktrace for the second bottleneck [4]:
> > 
> > ```
> > 0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
> > numparents=0x0) at pg_inherits.c:199
> > 199 forboth(lo, rels_list, li, rel_numparents)
> > #0  0x00584fb1 in find_all_inheritors (parentrelId=16393, 
> > lockmode=1, numparents=0x0) at pg_inherits.c:199
> > #1  0x0077fc9f in expand_inherited_rtentry (root=0x1badcb8, 
> > rte=0x1b630b8, rti=1) at prepunion.c:1408
> > #2  0x0077fb67 in expand_inherited_tables (root=0x1badcb8) at 
> > prepunion.c:1335
> > #3  0x00767526 in subquery_planner (glob=0x1b63cc0, 
> > parse=0x1b62fa0, parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) 
> > at planner.c:568
> > (skipped)
> > ```
> > 
> > The first one could be easily fixed by introducing a hash table
> > (rel_id -> pgStatList entry). Perhaps hash table should be used only
> > after some threshold. Unless there are any objections I will send a
> > corresponding patch shortly.
> 
> I have never thought about this one really.
> 
> > I haven't explored the second bottleneck closely yet, but at first glance
> > it doesn't look much more complicated.
> 
> I don't know which way you're thinking of fixing this, but a planner patch
> to implement faster partition-pruning will have taken care of this, I
> think.  As you may know, even declarative partitioned tables currently
> depend on constraint exclusion for partition-pruning and planner's current
> > approach of handling inheritance requires opening all the child tables
> (partitions), whereas the new approach hopefully shouldn't need to do
> that.  I am not sure if looking for a more localized fix for this would be
> worthwhile, although I may be wrong.
> 
> Thanks,
> Amit
> 
> 

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 2fb9a8bf58..fa906e7930 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -160,6 +160,16 @@ typedef struct TabStatusArray
 
 static TabStatusArray *pgStatTabList = NULL;
 
+/* pgStatTabHash entry */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/* Hash table for faster t_id -> tsa_entry lookup */
+static HTAB *pgStatTabHash = NULL;
+
 /*
  * Backends store per-function info that's waiting to be sent to the collector
  * in this hash table (indexed by function OID).
@@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
 pgstat_send_tabstat(this_msg);
 this_msg->m_nentries = 0;
 			}
+
+			/*
+			 * Entry will be freed soon so we need to remove it from the lookup table.
+			 */
+			hash_search(pgStatTabH

Re: [HACKERS] RADIUS fallback servers

2017-03-06 Thread Adam Brightwell
>> I wonder if removing the complexity of maintaining two separate lists
>> for the server and port would be a better/less complex approach.  For
>> instance, why not go with a list of typical 'host:port' strings for
>> 'radiusservers'?  If no port is specified, then simply use the default
>> for that specific host. Therefore, we would not have to worry about
>> keeping the two lists in sync. Thoughts?
>
>
> If we do that we should do it for all the parameters, no? So not just
> host:port, but something like host:port:secret:identifier? Mixing the two
> ways of doing it would be quite confusing I think.
>
> And I wonder if that format wouldn't get even more confusing if you for
> example want to use default ports, but non-default secrets.

Yes, I agree. Such a format would be more confusing and I certainly
wouldn't be in favor of it.

> I can see how it would probably be easier in some of the simple cases, but I
> wonder if it wouldn't make it worse in a lot of other cases.

Ultimately, I think that it would be better off in a separate
configuration file. Something to the effect of each line representing
a server, something like:

'<server> <port> <secret> <identifier>'

With 'radiusservers' simply being the path to that file, while
'radiusserver', etc. would remain as is, and only one or the other
could be provided, but not both. Though that's perhaps
beyond the scope of this patch.
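
For example (all names and values invented):

# server port secret identifier
radius1.example.com 1812 secret1 site1
radius2.example.com 1812 secret2 site2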

At any rate, I'm going to continue moving forward with testing this patch as is.

-Adam




Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread Dagfinn Ilmari Mannsåker
Hi David,

Here's a review of your patch.

David Christensen  writes:

> Throws a build error if we encounter a different number of fields in a
> DATA() line than we expect for the catalog in question.

The patch is a good idea, and as-is implements the suggested feature.
Tested by removing an attribute from a line in catalog/pg_proc.h and
observing the expected failure.  With the attribute in place, it builds and
passes make check-world.

Detailed code review:

[…]
> diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
[…]
> + check_natts($filename, $catalog{natts},$3) or
> +   die sprintf "Wrong number of Natts in DATA() 
> line %s:%d\n", $input_file,INPUT_FILE->input_line_number;

Including the expected/actual number of attributes in the error message
would be nice.  In fact, the 'die' could be moved into check_natts(),
where the actual number is already available, and would clutter the main
loop less.

> + unless ($natts)
> + {
> + die "Could not find definition for Natts_${catname} before 
> start of DATA()\n";
> + }

More idiomatically written as:

die "Could not find definition for Natts_${catname} before start of DATA()\n"
  unless $natts;

> diff --git a/src/backend/utils/Gen_fmgrtab.pl 
> b/src/backend/utils/Gen_fmgrtab.pl

The changes to this file are redundant, since it calls Catalog::Catalogs(),
which already checks that the number of attributes matches.

Attached is a suggested revised patch.

- ilmari

-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl

>From b78b281983e7a0406c15461340391c21d6cddef9 Mon Sep 17 00:00:00 2001
From: Dagfinn Ilmari Mannsåker 
Date: Mon, 6 Mar 2017 14:51:50 +
Subject: [PATCH] Teach Catalog.pm how many attributes there should be per
 DATA() line

Throws a build error if we encounter a different number of fields in a
DATA() line than we expect for the catalog in question.

Previously, it was possible to silently ignore any mismatches at build
time which could result in symbol undefined errors at link time.  Now
we stop and identify the infringing line as soon as we encounter it,
which greatly speeds up the debugging process.
---
 src/backend/catalog/Catalog.pm | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index e1f3c3a5ee..85a537ad95 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -46,6 +46,9 @@ sub Catalogs
 
 		open(INPUT_FILE, '<', $input_file) || die "$input_file: $!";
 
+		my ($filename) = ($input_file =~ m/(\w+)\.h$/);
+		my $natts_pat = "Natts_$filename";
+
 		# Scan the input file.
		while (<INPUT_FILE>)
 		{
@@ -70,8 +73,15 @@ sub Catalogs
 			s/\s+/ /g;
 
 			# Push the data into the appropriate data structure.
-			if (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
+			if (/$natts_pat\s+(\d+)/)
+			{
+$catalog{natts} = $1;
+			}
+			elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
 			{
+check_natts($filename, $catalog{natts}, $3,
+			$input_file, INPUT_FILE->input_line_number);
+
 push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
 			}
 			elsif (/^DESCR\(\"(.*)\"\)$/)
@@ -216,4 +226,20 @@ sub RenameTempFile
 	rename($temp_name, $final_name) || die "rename: $temp_name: $!";
 }
 
+# verify the number of fields in the passed-in bki structure
+sub check_natts
+{
+	my ($catname, $natts, $bki_val, $file, $line) = @_;
+	die "Could not find definition for Natts_${catname} before start of DATA() in $file\n"
+		unless defined $natts;
+
+	# we're working with a copy and need to count the fields only, so collapse
+	$bki_val =~ s/"[^"]*?"/xxx/g;
+	my @atts = split /\s+/ => $bki_val;
+
+	die sprintf
+		"Wrong number of attributes in DATA() entry at %s:%d (expected %d but got %d)\n",
+		$file, $line, $natts, scalar @atts
+	  unless $natts == @atts;
+}
 1;
-- 
2.11.0




Re: [HACKERS] Logical replication existing data copy

2017-03-06 Thread Erik Rijkers

On 2017-03-06 11:27, Petr Jelinek wrote:

Hi,

updated and rebased version of the patch attached.



I compiled with /only/ this one latest patch:
   0001-Logical-replication-support-for-initial-data-copy-v6.patch

Is that correct, or are other patches still needed on top, or 
underneath?


Anyway, with that one patch, and even after
  alter role ... set synchronous_commit = off;
the process is very slow. (sufficiently slow that I haven't
had the patience to see it to completion yet)

What am I doing wrong?

thanks,

Erik Rijkers




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

Yeah, I'm thinking about that too. Here is a full list of the original
type values,

"Schema"
"Name"
"table"
"view"
"materialized view"
"index"
"sequence"
"special"
"foreign table"
"table"

What else do you think will benefit from extra type information?

regards,
Amos

Stephen Frost  writes:

> Amos,
>
> * Amos Bird (amosb...@gmail.com) wrote:
>> Well, the prefix is used to differentiate other \d commands, like
>> this,
>
> Ah, ok, fair enough.
>
> Should we consider differentiating different table types also?  I
> suppose those are primairly just logged and unlogged, but I could see
> that being useful information to know when doing a \dt.  Not a big deal
> either way though, and this patch stands on its own certainly.
>
> Thanks!
>
> Stephen





Re: [HACKERS] [BUGS] Seems bug in postgres_fdw?

2017-03-06 Thread Rader, David
On Sat, Mar 4, 2017 at 12:52 AM, Robert Haas  wrote:

> On Thu, Mar 2, 2017 at 3:28 AM, Rader, David  wrote:
> > Attached is a doc patch that updates the documentation for postgres-fdw
> to
> > include the actual values for the 4 session variables that are set. Does
> > that make sense to clarify?
>
> From my point of view, this would be a sensible thing to document,
> although I feel like the grammar is not quite right, because after
> "establishes remote session settings for the parameters" my brain
> expects a list of parameters rather than a list of parameters and the
> corresponding values.  I think it reads better if you change "for the
> parameters" to "for various parameters".
>
> 
>

Revised doc patch attached with various parameters.
From 7bf12d7ed4e38b9c3d37ce2b2f786480f8fd764f Mon Sep 17 00:00:00 2001
From: David Rader 
Date: Mon, 6 Mar 2017 09:38:52 -0500
Subject: [PATCH] Document postgres-fdw session settings for parameters

---
 doc/src/sgml/postgres-fdw.sgml | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index b31f373..7a9b655 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -532,11 +532,32 @@
 
   
postgres_fdw likewise establishes remote session settings
-   for the parameters <xref linkend="guc-datestyle">,
-   <xref linkend="guc-intervalstyle">, <xref linkend="guc-timezone">,
-   and <xref linkend="guc-extra-float-digits">.  These are less likely
-   to be problematic than <varname>search_path</varname>, but can be handled
-   with function <literal>SET</literal> options if the need arises.
+   for various parameters:
+   <itemizedlist spacing="compact">
+    <listitem>
+     <para>
+      <xref linkend="guc-timezone"> is set to <literal>UTC</literal>
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      <xref linkend="guc-datestyle"> is set to <literal>ISO</literal>
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      <xref linkend="guc-intervalstyle"> is set to <literal>postgres</literal>
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      <xref linkend="guc-extra-float-digits"> is set to <literal>3</literal> for remote
+      servers 9.0 and newer and is set to <literal>2</literal> for older versions
+     </para>
+    </listitem>
+   </itemizedlist>
+   These are less likely to be problematic than <varname>search_path</varname>, but
+   can be handled with function <literal>SET</literal> options if the need arises.
   
 
   
-- 
2.5.0




Re: [HACKERS] [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver

2017-03-06 Thread Peter Eisentraut
On 3/4/17 01:45, Petr Jelinek wrote:
> I can see one difference though (I didn't see this code before) and that
> is, the connectDBComplete starts with waiting for socket to become
> writable and only then calls PQconnectPoll, while my patch starts with
> PQconnectPoll call. And I see following comment in connectDBstart
>>  /*
>>   * The code for processing CONNECTION_NEEDED state is in 
>> PQconnectPoll(),
>>   * so that it can easily be re-executed if needed again during the
>>   * asynchronous startup process.  However, we must run it once here,
>>   * because callers expect a success return from this routine to mean 
>> that
>>   * we are in PGRES_POLLING_WRITING connection state.
>>   */

Yes, the libpq documentation also says that.

> If that's the case, the attached should fix it, but I have no way of
> testing it on windows, I can only say that it still works on my machine
> so at least it hopefully does not make things worse.

Committed that.  Let's see how it goes.

I rewrote the while loop as a for loop so that the control logic isn't
spread out over three screens.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Stephen Frost
Amos,

* Amos Bird (amosb...@gmail.com) wrote:
> Well, the prefix is used to differentiate other \d commands, like
> this,

Ah, ok, fair enough.

Should we consider differentiating different table types also?  I
suppose those are primairly just logged and unlogged, but I could see
that being useful information to know when doing a \dt.  Not a big deal
either way though, and this patch stands on its own certainly.

Thanks!

Stephen




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

Hello Stephen,

Well, the prefix is used to differentiate other \d commands, like
this,

amos=# \ditv
  List of relations
 Schema |Name| Type | Owner |  Table
++--+---+-
 public | i  | table| amos  |
 public | ii | index: gist  | amos  | i
 public | j  | table| amos  |
 public | jj | index: gin   | amos  | i
 public | jp | index: btree | amos  | i
 public | js | index: brin  | amos  | i
 public | numbers| table| amos  |
 public | numbers_mod2   | index: gin   | amos  | numbers
 public | numbers_mod2_btree | index: btree | amos  | numbers
 public | ts | table| amos  |
(10 rows)

regards,
Amos

Stephen Frost  writes:

> Greetings,
>
> * Amos Bird (amosb...@gmail.com) wrote:
>> psql currently supports \di+ to view indexes,
>> 
>>   List of relations
>>  Schema |Name| Type  | Owner |  Table  |  Size  | Description
>> ++---+---+-++-
>>  public | ii | index | amos  | i   | 131 MB |
>>  public | jj | index | amos  | i   | 12 MB  |
>>  public | kk | index | amos  | i   | 456 kB |
>>  public | numbers_mod2   | index | amos  | numbers | 10 MB  |
>>  public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
>> (5 rows)
>> 
>> The column "Type" is kinda useless (they all equal "index"). Representing
>> the actual index type would be more interesting:
>
> Agreed.
>
>>  Schema |Name| Type | Owner |  Table  |  Size  | 
>> Description
>> ++--+---+-++-
>>  public | ii | index: gist  | amos  | i   | 131 MB |
>>  public | jj | index: gin   | amos  | i   | 12 MB  |
>>  public | kk | index: btree | amos  | i   | 456 kB |
>>  public | numbers_mod2   | index: gin   | amos  | numbers | 10 MB  |
>>  public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
>> (5 rows)
>> 
>> I'm not sure where to add documentation for this patch, or whether it
>> needs any. Please help
>> me if you think this patch is useful.
>
> I'm not sure why it's useful to keep the 'index:'?  I would suggest we
> just drop that and keep only the actual index type (gist, gin, etc).
>
> Thanks!
>
> Stephen





Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread David Steele
On 3/6/17 1:58 AM, Andres Freund wrote:
> On 2017-03-03 15:33:15 -0500, David Steele wrote:
> 
>> I propose we move this to the 2017-07 CF so the idea can be more fully
>> developed.
> 
> I don't see that being warranted in this case, we're really not talking
> about something complicated:

<...>

> excepting tests and docs, this is very little actual code.

Fair enough.  From my read-through it appeared a redesign was
required/requested.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread David Steele
On 3/6/17 8:17 AM, Robert Haas wrote:
> On Mon, Mar 6, 2017 at 7:38 AM, Tom Lane  wrote:
>> Simon Riggs  writes:
>>> On 1 March 2017 at 01:58, David Steele  wrote:
 PostgreSQL currently requires the file mode mask (umask) to be 0077.
 However, this precludes the possibility of a user in the postgres group
 performing a backup (or whatever).  Now that
 pg_start_backup()/pg_stop_backup() privileges can be delegated to an
 unprivileged user, it makes sense to also allow a (relatively)
 unprivileged user to perform the backup at the file system level as well.
>>
>>> +1
>>
>> I'd ask what is the point, considering that we don't view "cp -a" as a
>> supported backup technique in the first place.
> 
> /me is confused.
> 
> Surely the idea is that you'd like an unprivileged database user to
> run pg_start_backup(), an operating system user that can read but not
> write the database files to copy them, and then the unprivileged to
> then run pg_stop_backup().  I have no opinion on the patch, but I
> support the goal.  As I said on the surprisingly-controversial thread
> about ripping out hard-coded superuser checks, reducing the level of
> privilege which someone must have in order to perform a necessary
> operation leads to better security.  An exclusive backup taken via the
> filesystem (probably not via cp, but say via tar or cpio) inevitably
> requires the backup user to be able to read the entire cluster
> directory, but it doesn't inherently require the backup user to be
> able to write the cluster directory.

Limiting privileges also serves to guard against any bugs in tools that
run directly against $PGDATA and do not require write privileges.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Stephen Frost
Greetings,

* Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
> I've seen that pg_dump executes the dump of any comment on a TSDictionary
> without
> specifying the namespace where it is defined:

Great catch!

> This is actually a problem if the TSDictionary is created in a different
> schema from the one specified by
> the dumped search_path setting. I'd propose changing the current call in
> src/bin/pg_dump/pg_dump.c:

Right.  I've got a few other patches queued up for pg_dump, so I'll
take care of this also.

Thanks!

Stephen




Re: [HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Stephen Frost
Greetings,

* Amos Bird (amosb...@gmail.com) wrote:
> psql currently supports \di+ to view indexes,
> 
>   List of relations
>  Schema |Name| Type  | Owner |  Table  |  Size  | Description
> ++---+---+-++-
>  public | ii | index | amos  | i   | 131 MB |
>  public | jj | index | amos  | i   | 12 MB  |
>  public | kk | index | amos  | i   | 456 kB |
>  public | numbers_mod2   | index | amos  | numbers | 10 MB  |
>  public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
> (5 rows)
> 
> The column "Type" is kinda useless (they all equal "index"). Representing
> the actual index type would be more interesting:

Agreed.

>  Schema |Name| Type | Owner |  Table  |  Size  | 
> Description
> ++--+---+-++-
>  public | ii | index: gist  | amos  | i   | 131 MB |
>  public | jj | index: gin   | amos  | i   | 12 MB  |
>  public | kk | index: btree | amos  | i   | 456 kB |
>  public | numbers_mod2   | index: gin   | amos  | numbers | 10 MB  |
>  public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
> (5 rows)
> 
> I'm not sure where to add documentation for this patch, or whether it
> needs any. Please help
> me if you think this patch is useful.

I'm not sure why it's useful to keep the 'index:'?  I would suggest we
just drop that and keep only the actual index type (gist, gin, etc).

Thanks!

Stephen




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread David Steele
On 3/6/17 8:50 AM, Stephen Frost wrote:

> * Simon Riggs (si...@2ndquadrant.com) wrote:
>>> to allow the default mode of files and directories
>>> in the $PGDATA directory to be modified.
>>
>> Are you saying if this is changed all files/directories will be
>> changed to the new mode?
> 
> No, new files will be created with the new mode and existing files will
> be allowed to have the mode set.  Changing all of the existing files
> didn't seem like something we should be trying to do at server start.
> 
>> It seems like it would be annoying to have some files in one mode,
>> some in another.
> 
> It's not intended for that to happen, but it is possible for it to.  The
> alternative is to try and forcibly change all files at server start time
> to match what is configured but that didn't seem like a great idea.

Agreed.  It would definitely affect server start time, perhaps
significantly.

I have added a note to the docs that a change in file_mode_mask does not
automatically change the mode of existing files on disk.

This patch applies cleanly on 6f3a13f.

-- 
-David
da...@pgmasters.net
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec87..98f8170 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1858,8 +1858,7 @@ qtext_store(const char *query, int query_len,
*query_offset = off;
 
/* Now write the data into the successfully-reserved part of the file */
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY,
-  S_IRUSR | S_IWUSR);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY);
if (fd < 0)
goto error;
 
@@ -1923,7 +1922,7 @@ qtext_load_file(Size *buffer_size)
int fd;
struct stat stat;
 
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY, 0);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY);
if (fd < 0)
{
if (errno != ENOENT)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd82c04..6c84082 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -812,6 +812,44 @@ include_dir 'conf.d'
   
  
 
+     <varlistentry id="guc-file-mode-mask" xreflabel="file_mode_mask">
+      <term><varname>file_mode_mask</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>file_mode_mask</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the file mode mask (umask) for the data directory. The parameter
+        value is expected to be a numeric mode specified in the format
+        accepted by the <function>chmod</function> and
+        <function>umask</function> system calls. (To use the customary octal
+        format the number must start with a <literal>0</literal> (zero).)
+       </para>
+
+       <para>
+        The default <varname>file_mode_mask</varname> is <literal>0077</literal>,
+        meaning that only the database owner can read and write files in
+        the data directory.  For example, setting the
+        <varname>file_mode_mask</varname> to <literal>0027</literal> would allow
+        any user in the same group as the database owner to read all database
+        files, which would be useful for producing a backup using a relatively
+        unprivileged user.
+       </para>
+
+       <para>
+        Note that changing this parameter does not automatically change the
+        mode of existing files in the cluster.  This must be done manually by
+        an administrator.
+       </para>
+
+       <para>
+        This parameter can only be set at server start.
+       </para>
+      </listitem>
+     </varlistentry>
+
  
   bonjour (boolean)
   
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 1aaa490..bd1f849 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -296,6 +296,25 @@ PostgreSQL documentation
  
 
  
+  <varlistentry>
+   <term><option>-u <replaceable class="parameter">mask</replaceable></option></term>
+   <term><option>--file-mode-mask=<replaceable class="parameter">mask</replaceable></option></term>
+   <listitem>
+    <para>
+     Sets the file mode mask (umask) for the data directory. The parameter
+     value is expected to be a numeric mode specified in the format
+     accepted by the <function>chmod</function> and
+     <function>umask</function> system calls. (To use the customary octal
+     format the number must start with a <literal>0</literal> (zero).)
+    </para>
+
+    <para>
+     Specifying this parameter will automatically set
+     <xref linkend="guc-file-mode-mask"> in <filename>postgresql.conf</filename>.
+    </para>
+   </listitem>
+  </varlistentry>
+
   -W
   --pwprompt
   
diff --git a/src/backend/access/heap/rewriteheap.c 
b/src/backend/access/heap/rewriteheap.c
index c7b283c..53a2acc 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1010,8 +1010,7 @@ logical_rewrite_log_mapping(RewriteState state, 
TransactionId xid,
src->off = 0;
memcpy(src->path, path, sizeof(path));
src->vfd = PathNameOpenFile(path,
-   O_CREAT 
| O_EXCL | O_WRONLY | PG_BINARY,
-   S_IRUSR 
| S_IWUSR);
+

Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 1:37 PM, Kyotaro HORIGUCHI
 wrote:
> Ok, I think I understand the complete picture.
>
> At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
>> > I can guess two ways to fix this. One is change the definition of
>> > T_NAME.
>> >
>> > | #define T_NAME(l) \
>> > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
>> > |LWLockTrancheArray[(l)->tranche] : \
>> > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED]
>> >
>> > It makes the patch small but I don't think the shape is
>> > desirable.
>> >
>> > Then, the other way is registering named tranches into the main
>> > tranche array. The number and names of the requested named
>> > tranches are known to postmaster so they can be allocated and
>> > initialized at the time.
>> >
>> > The mapping of the shared memory is inherited by backends, so
>> > pointing to the addresses in shared memory will work in the
>> > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
>> > in EXEC_BACKEND case.
>
> But this doesn't work for
> LWLockNewTrancheId/LWLockRegisterTranche, and that is a valuable
> interface. So the measure we can take is redefining T_NAME.
>

In RegisterLWLockTranches(), we do register the named tranches as well,
which should make them available in LWLockTrancheArray.  Do you see any
reason why that shouldn't work?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Stephen Frost
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 3/6/17 03:33, Michael Banck wrote:
> > Would this be a candidate for backpatching, or is the behaviour change
> > in pg_dump trumping the issues it solves?
> 
> Unless someone literally has a materialized view on pg_policy, it
> wouldn't make a difference, so I'm not very keen on bothering to
> backpatch this.

Agreed.

Thanks!

Stephen




Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-06 Thread Thomas Munro
On Wed, Mar 1, 2017 at 10:29 PM, Thomas Munro
 wrote:
> I'm testing a patch that lets you set up a fixed sized
> SharedBufFileSet object in a DSM segment, with its own refcount for
> the reason you explained.  It supports a dynamically expandable set of
> numbered files, so each participant gets to export file 0, file 1,
> file 2 and so on as required, in any order.  I think this should suit
> both Parallel Tuplesort which needs to export just one file from each
> participant, and Parallel Shared Hash which doesn't know up front how
> many batches it will produce.  Not quite ready but I will post a
> version tomorrow to get Peter's reaction.

See 0007-hj-shared-buf-file-v6.patch in the v6 tarball in the parallel
shared hash thread.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-06 Thread Thomas Munro
On Wed, Mar 1, 2017 at 10:40 PM, Thomas Munro
 wrote:
> I'm testing a new version which incorporates feedback from Andres and
> Ashutosh, and is refactored to use a new SharedBufFileSet component to
> handle batch files, replacing the straw-man implementation from the v5
> patch series.  I've set this to waiting-on-author and will post v6
> tomorrow.

I created a system for reference counted partitioned temporary files
called SharedBufFileSet: see 0007-hj-shared-buf-file.patch.  Then I
ripped out the code for sharing batch files that I previously had
cluttering up nodeHashjoin.c, and refactored it into a new component
called a SharedTuplestore which wraps a SharedBufFileSet and gives it
a tuple-based interface: see 0008-hj-shared-tuplestore.patch.  The
name implies aspirations of becoming a more generally useful shared
analogue of tuplestore, but for now it supports only the exact access
pattern needed for hash join batches ($10 wrench).

It creates temporary files like this:

  base/pgsql_tmp/pgsql_tmp[pid].[set].[partition].[participant].[segment]

I'm not sure why nodeHashjoin.c is doing raw batchfile read/write
operations anyway; why not use tuplestore.c for that (as
tuplestore.c's comments incorrectly say is the case)?  Maybe because
Tuplestore's interface doesn't support storing the extra hash value.
In SharedTuplestore I solved that problem by introducing an optional
fixed sized piece of per-tuple meta-data.  Another thing that is
different about SharedTuplestore is that it supports partitions, which
is convenient for this project and probably other parallel projects
too.

In order for workers to be able to participate in reference counting
schemes based on DSM segment lifetime, I had to give the
Exec*InitializeWorker() functions access to the dsm_segment object,
whereas previously they received only the shm_toc in order to access
its contents.  I invented ParallelWorkerContext which has just two
members 'seg' and 'toc': see
0005-hj-let-node-have-seg-in-worker.patch.  I didn't touch the FDW API
or custom scan API where they currently take toc, though I can see
that there is an argument that they should; changing those APIs seems
like a bigger deal.  Another approach would be to use ParallelContext,
as passed into ExecXXXInitializeDSM, with the members that are not
applicable to workers zeroed out.  Thoughts?
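
For reference, a minimal sketch of that context struct as described (the
exact header it would live in is left open):

/* Sketch of ParallelWorkerContext: the two members workers need. */
typedef struct ParallelWorkerContext
{
	dsm_segment *seg;		/* DSM segment, for refcounting schemes */
	shm_toc	   *toc;		/* table of contents, as before */
} ParallelWorkerContext;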

I got rid of the ExecDetachXXX stuff I had invented in the last
version, because acf555bc fixed the problem a better way.

I found that I needed to use more than one toc entry for a single
executor node, in order to reserve space for the inner and outer
SharedTuplestore objects.  So I invented a way to make more extra keys
with PARALLEL_KEY_EXECUTOR_NTH(plan_node_id, N).

-- 
Thomas Munro
http://www.enterprisedb.com


parallel-shared-hash-v6.tgz
Description: GNU Zip compressed data



Re: [HACKERS] Faster methods for getting SPI results

2017-03-06 Thread Jim Nasby

On 3/2/17 8:03 AM, Peter Eisentraut wrote:

On 12/20/16 23:14, Jim Nasby wrote:

I've been looking at the performance of SPI calls within plpython.
There's a roughly 1.5x difference from equivalent python code just in
pulling data out of the SPI tuplestore. Some of that is due to an
inefficiency in how plpython is creating result dictionaries, but fixing
that is ultimately a dead-end: if you're dealing with a lot of results
in python, you want a tuple of arrays, not an array of tuples.


There is nothing that requires us to materialize the results into an
actual list of actual rows.  We could wrap the SPI_tuptable into a
Python object and implement __getitem__ or __iter__ to emulate sequence
or mapping access.


Would it be possible to have that just pull tuples directly from the 
executor? The overhead of populating the tuplestore just to drain it 
again can become quite significant, and AFAICT it's completely unnecessary.


Unfortunately, I think adding support for that would be even more 
invasive, which is why I haven't attempted it. On the flip side, I 
believe that kind of an interface would be usable by plpgsql, whereas 
the DestReceiver approach is not (AFAICT).
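
For illustration, the callback side of such an interface can be quite small;
a sketch against the DestReceiver struct, with invented names and details
assumed, not code from the patch:

/* Sketch: receive each tuple as the executor produces it. */
typedef struct MyReceiver
{
	DestReceiver pub;		/* standard callbacks; must be first */
	uint64		ntuples;	/* example per-query state */
} MyReceiver;

static bool
my_receive(TupleTableSlot *slot, DestReceiver *self)
{
	MyReceiver *my = (MyReceiver *) self;

	/* pull values out with slot_getattr() etc. here */
	my->ntuples++;
	return true;			/* keep executing */
}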

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Jim Nasby

On 3/4/17 11:49 AM, Peter Eisentraut wrote:

I wonder whether we should emphasize this change by assigning
DO_REFRESH_MATVIEW a higher number, like 100?

Since there wasn't any interest in that idea, I have committed Jim's
patch as is.


Thanks. Something else that seems somewhat useful would be to have the 
sort defined by an array of the ENUM values in the correct order, and 
then have the code do the mechanical map generation. I'm guessing the 
only reasonable way to make that work would be to have some kind of a 
last item indicator value, so you know how many values were in the ENUM. 
Maybe there's a better way to do that...
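
Roughly this kind of sketch, where DO_LAST_ITEM is the hypothetical
last-item indicator and the order array is abbreviated:

/* The desired dump order, written out once as data. */
static const DumpableObjectType dbObjectTypeOrder[] = {
	DO_NAMESPACE,
	DO_EXTENSION,
	/* ... the remaining types, in dump order ... */
	DO_REFRESH_MATVIEW
};

/* Mechanically generated priority map. */
static int	dbObjectTypePriority[DO_LAST_ITEM];

static void
build_object_type_priority(void)
{
	int			i;

	for (i = 0; i < lengthof(dbObjectTypeOrder); i++)
		dbObjectTypePriority[dbObjectTypeOrder[i]] = i;
}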

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] objsubid vs subobjid

2017-03-06 Thread Jim Nasby

On 3/1/17 9:24 AM, Peter Eisentraut wrote:

On 3/1/17 09:51, Alvaro Herrera wrote:

Peter Eisentraut wrote:

On 2/22/17 19:35, Jim Nasby wrote:

pg_get_object_address() currently returns a field called subobjid, while
pg_depend calls that objsubid. I'm guessing that wasn't on purpose
(especially because internally the function uses objsubid), and it'd be
nice to fix it.


I'm in favor of changing it, but it could theoretically break someone's
code.


Yes, it was an oversight.  +1 for changing.


OK done.


BTW, did you backpatch as well? The function was added in 9.5. 
Presumably we wouldn't normally do that, but if we think this is unused 
enough maybe it's worth it.

--
Jim Nasby, Chief Data Architect, OpenSCG




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-03-06 Thread Jim Nasby

On 2/28/17 9:42 PM, Jim Nasby wrote:


I'll post a plpython patch that doesn't add the output format control.


I've attached the results of that. Unfortunately the speed improvement
is only 27% at this point (with 999 tuples). Presumably that's
because it's constructing a brand new dictionary from scratch for each
tuple.


I found a couple bugs. New patches attached.
--
Jim Nasby, Chief Data Architect, OpenSCG
From 116b6a45b0146e42f1faa130d78e9362950c18c3 Mon Sep 17 00:00:00 2001
From: Jim Nasby 
Date: Wed, 1 Mar 2017 15:45:51 -0600
Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based
 DestReceiver.

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 79 --
 src/backend/tcop/dest.c| 11 +++
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h|  1 +
 4 files changed, 85 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 55f97b14e6..ffeba679da 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, 
SPIPlanPtr plan);
 
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  Snapshot snapshot, Snapshot 
crosscheck_snapshot,
- bool read_only, bool fire_triggers, uint64 
tcount);
+ bool read_only, bool fire_triggers, uint64 
tcount,
+ DestReceiver *callback);
 
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
@@ -320,7 +321,34 @@ SPI_execute(const char *src, bool read_only, long tcount)
 
res = _SPI_execute_plan(&plan, NULL,
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+   DestReceiver *callback)
+{
+   _SPI_plan   plan;
+   int res;
+
+   if (src == NULL || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   memset(&plan, 0, sizeof(_SPI_plan));
+   plan.magic = _SPI_PLAN_MAGIC;
+   plan.cursor_options = 0;
+
+   _SPI_prepare_oneshot_plan(src, &plan);
+
+   res = _SPI_execute_plan(&plan, NULL,
+   InvalidSnapshot, 
InvalidSnapshot,
+   read_only, true, 
tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -354,7 +382,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const 
char *Nulls,

_SPI_convert_params(plan->nargs, plan->argtypes,

Values, Nulls),
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+
+/* Execute a previously prepared plan with a callback Destination */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+bool read_only, long tcount, DestReceiver 
*callback)
+{
+   int res;
+
+   if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   if (plan->nargs > 0 && Values == NULL)
+   return SPI_ERROR_PARAM;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   res = _SPI_execute_plan(plan,
+   
_SPI_convert_params(plan->nargs, plan->argtypes,
+   
Values, Nulls),
+   InvalidSnapshot, 
InvalidSnapshot,
+   read_only, true, 
tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -383,7 +438,7 @@ SPI_execute_plan_with_paramlist(SPIPlanPtr plan, 
ParamListInfo params,
 
res = _SPI_execute_plan(plan, params,
InvalidSnapshot, 
InvalidSnapshot,
-  

Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Simon Riggs  writes:
> > On 1 March 2017 at 01:58, David Steele  wrote:
> >> PostgreSQL currently requires the file mode mask (umask) to be 0077.
> >> However, this precludes the possibility of a user in the postgres group
> >> performing a backup (or whatever).  Now that
> >> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> >> unprivileged user, it makes sense to also allow a (relatively)
> >> unprivileged user to perform the backup at the file system level as well.
> 
> > +1
> 
> I'd ask what is the point, considering that we don't view "cp -a" as a
> supported backup technique in the first place.

The point is to allow backups to be performed as a user who only has
read-only access to the files and is a non-superuser in the database.
With the changes to allow GRANT'ing of the pg_start/stop_backup and
related functions and these changes to allow the files to be group
readable, that will be possible using a tool such as pgbackrest, not
just with a "cp -a".

Thanks!

Stephen




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Stephen Frost
Greetings,

* Simon Riggs (si...@2ndquadrant.com) wrote:
> On 1 March 2017 at 01:58, David Steele  wrote:
> > PostgreSQL currently requires the file mode mask (umask) to be 0077.
> > However, this precludes the possibility of a user in the postgres group
> > performing a backup (or whatever).  Now that
> > pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> > unprivileged user, it makes sense to also allow a (relatively)
> > unprivileged user to perform the backup at the file system level as well.
> 
> +1
> 
> > This patch introduces a new initdb param, -u/-file-mode-mask, and a new
> > GUC, file_mode_mask,
> 
> Why both initdb and at server start? Seems like initdb is OK, or in 
> pg_control.

One could imagine someone wishing to change their mind regarding the
permissions after initdb, and existing systems may wish to move
to allowing group-read in an environment where that can be done safely,
without having to re-initdb.

> > to allow the default mode of files and directories
> > in the $PGDATA directory to be modified.
> 
> Are you saying if this is changed all files/directories will be
> changed to the new mode?

No, new files will be created with the new mode and existing files will
be allowed to have the mode set.  Changing all of the existing files
didn't seem like something we should be trying to do at server start.

> It seems like it would be annoying to have some files in one mode,
> some in another.

It's not intended for that to happen, but it is possible for it to.  The
alternative is to try and forcibly change all files at server start time
to match what is configured but that didn't seem like a great idea.

Thanks!

Stephen




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 6:49 PM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:
>
> I was going to do it after index and index-only scans and parallel
> bitmap heap scan were committed, but I've been holding off on
> committing parallel bitmap heap scan waiting for Andres to fix the
> simplehash regressions, so maybe I should just go do an update now and
> another one later once that goes in.
>

As you wish, but one point to note is that as of now parallelism for
index scans can be influenced by the table-level parameter
parallel_workers.  It sounds slightly awkward, but if we want to keep
it that way, then maybe we can update the docs to say so.
Another option is to have a separate parameter for index scans.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:
> On Mon, Mar 6, 2017 at 4:57 PM, Michael Banck  
> wrote:
>> Hi,
>>
>> On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
>>> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
>>> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  
>>> > wrote:
>>> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
>>> >> wrote:>
>>> >>> support related patch.  In anycase, to avoid confusion I am attaching
>>> >>> all the three patches with this e-mail.
>>> >>
>>> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
>>> >> documentation and GUC descriptions.
>>> >
>>> > And committed parallel_index_opt_exec_support_v10.patch as well, with
>>> > a couple of minor tweaks.
>>>
>>> Thanks a lot!  I think this is a big step forward for parallelism in
>>> PostgreSQL.  Now, we have another way to drive parallel scans and I
>>> hope many more queries can use parallelism.
>>
>> Shouldn't the chapter 15.3 "Parallel Plans" in the documentation be
>> updated for this as well, or is this going to be taken care of as a batch
>> at the end of the development cycle, pending other added parallelization
>> features?
>>
>
> Robert mentioned up thread that it is better to update it once at end
> rather than with each feature.

I was going to do it after index and index-only scans and parallel
bitmap heap scan were committed, but I've been holding off on
committing parallel bitmap heap scan waiting for Andres to fix the
simplehash regressions, so maybe I should just go do an update now and
another one later once that goes in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 7:38 AM, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 1 March 2017 at 01:58, David Steele  wrote:
>>> PostgreSQL currently requires the file mode mask (umask) to be 0077.
>>> However, this precludes the possibility of a user in the postgres group
>>> performing a backup (or whatever).  Now that
>>> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
>>> unprivileged user, it makes sense to also allow a (relatively)
>>> unprivileged user to perform the backup at the file system level as well.
>
>> +1
>
> I'd ask what is the point, considering that we don't view "cp -a" as a
> supported backup technique in the first place.

/me is confused.

Surely the idea is that you'd like an unprivileged database user to
run pg_start_backup(), an operating system user that can read but not
write the database files to copy them, and then the unprivileged to
then run pg_stop_backup().  I have no opinion on the patch, but I
support the goal.  As I said on the surprisingly-controversial thread
about ripping out hard-coded superuser checks, reducing the level of
privilege which someone must have in order to perform a necessary
operation leads to better security.  An exclusive backup taken via the
filesystem (probably not via cp, but say via tar or cpio) inevitably
requires the backup user to be able to read the entire cluster
directory, but it doesn't inherently require the backup user to be
able to write the cluster directory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal : Parallel Merge Join

2017-03-06 Thread Dilip Kumar
On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas  wrote:
> I'm not happy with the way this patch can just happen to latch on to a
> path that's not parallel-safe rather than one that is and then just
> give up on a merge join in that case.  I already made this argument in
> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com
> and my opinion hasn't changed.  I've subsequently come to realize that
> this problem is more widespread that I initially realized: not only do
> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
> without searching for the cheapest parallel-safe path, but the latter
> later calls get_cheapest_path_for_pathkeys() and then just pretends it
> didn't find anything unless the result is parallel-safe.  This makes
> no sense to me: the cheapest parallel-safe path could be 1% more
> expensive than the cheapest path of any kind, but because it's not the
> cheapest we arbitrarily skip it, even though parallelizing more of the
> tree might leave us way ahead overall.

For sort_inner_and_outer I have followed the same logic that
hash_inner_and_outer uses.  I also moved the logic of selecting the
cheapest_safe_inner out of the pathkey loop.

>
> I suggest that we add an additional "bool require_parallel_safe"
> argument to get_cheapest_path_for_pathkeys(); when false, the function
> will behave as at present, but when true, it will skip paths that are
> not parallel-safe.  And similarly have a function to find the cheapest
> parallel-safe path if the first one in the list isn't it.  See
> attached draft patch.  Then this code can search for the correct path
> instead of searching using the wrong criteria and then giving up if it
> doesn't get the result it wants.

Done as suggested.  Also, rebased the path_search_for_mergejoin on
head and updated the comments of get_cheapest_path_for_pathkeys for
the new argument.

>
> +if (!(joinrel->consider_parallel &&
> +  save_jointype != JOIN_UNIQUE_OUTER &&
> +  save_jointype != JOIN_FULL &&
> +  save_jointype != JOIN_RIGHT &&
> +  outerrel->partial_pathlist != NIL &&
> +  bms_is_empty(joinrel->lateral_relids)))
>
> I would distribute the negation, so that this reads if
> (!joinrel->consider_parallel || save_jointype == JOIN_UNIQUE_OUTER ||
> save_jointype == JOIN_FULL || ...).  Or maybe better yet, just drop
> the ! and the return that follows and put the
> consider_parallel_nestloop and consider_parallel_mergejoin calls
> inside the if-block.

Done
>
> You could Assert(nestjoinOK) instead of testing it, although I guess
> it doesn't really matter.

left as it is.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


parallel_mergejoin_v9.patch
Description: Binary data


path-search-for-mergejoin-v2.patch
Description: Binary data



[HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Giuseppe Broccolo
Hi hackers,

I've seen that pg_dump dumps the comment of a TSDictionary, if any,
without specifying the namespace where the dictionary is defined:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L13542

This is actually a problem if the TSDictionary is created in a schema
different from the one specified by the dumped search_path setting. I'd
propose to change the current call in src/bin/pg_dump/pg_dump.c:

dumpComment(fout, labelq->data,
            NULL, dictinfo->rolname,
            dictinfo->dobj.catId, 0, dictinfo->dobj.dumpId);

with the following one:

dumpComment(fout, labelq->data,
            dictinfo->dobj.namespace->dobj.name, dictinfo->rolname,
            dictinfo->dobj.catId, 0, dictinfo->dobj.dumpId);

This is present in the master branch too, so potentially all PostgreSQL
versions are affected.

Let me know what you think about this change.

Regards,
Giuseppe.

-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL & PostGIS Training, Services and Support
giuseppe.brocc...@2ndquadrant.it | www.2ndQuadrant.it


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-03-06 Thread Christoph Berg
Re: Daniel Verite 2017-03-03 
<4d84079e-325b-48c5-83e6-bb54bb567...@manitou-mail.org>
> - tab-completion: works but the list in tab-complete.c:backslash_commands[]
> is sorted alphabetically so "\\gx" should come after "\\gset"

Good catch, it was still in that place from when it was named \G.

In the \? output I left it directly after \g because the help text
starts with "as \g ..." and it is IMHO more closely related to \g than
to \gexec and \gset, which differ more in functionality.

> unsigned short int expanded;   /* expanded/vertical output (if supported
>                                 * by output format); 0=no, 1=yes, 2=auto */
> Although there is still code that puts true/false in this variable
> (probably because it was a bool before?), I assume that for new
> code, assigning 0/1/2 should be preferred.

Right. I had seen both being used and went with "true", but "1" is
better style.

Both fixed, thanks for the review! Version 3 attached.

Christoph
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index ae58708..e0302ea
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** Tue Oct 26 21:40:57 CEST 1999
*** 1891,1896 
--- 1891,1908 
  
  

+ \gx [ filename ]
+ \gx [ |command ]
+ 
+ 
+ \gx is equivalent to \g, but
+ forces expanded output mode for this query.
+ 
+ 
+   
+ 
+ 
+   
  \gexec
  
  
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
new file mode 100644
index a52adc8..07efc27
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 906,913 
  		free(fname);
  	}
  
! 	/* \g [filename] -- send query, optionally with output to file/pipe */
! 	else if (strcmp(cmd, "g") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
--- 906,916 
  		free(fname);
  	}
  
! 	/*
! 	 * \g [filename] -- send query, optionally with output to file/pipe
! 	 * \gx [filename] -- same as \g, with expanded mode forced
! 	 */
! 	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
*** exec_command(const char *cmd,
*** 920,925 
--- 923,930 
  			pset.gfname = pg_strdup(fname);
  		}
  		free(fname);
+ 		if (strcmp(cmd, "gx") == 0)
+ 			pset.g_expanded = true;
  		status = PSQL_CMD_SEND;
  	}
  
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
new file mode 100644
index 5349c39..1aa56ab
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** PrintQueryTuples(const PGresult *results
*** 770,775 
--- 770,779 
  {
  	printQueryOpt my_popt = pset.popt;
  
+ 	/* one-shot expanded output requested via \gx */
+ 	if (pset.g_expanded)
+ 		my_popt.topt.expanded = 1;
+ 
  	/* write output to \g argument, if any */
  	if (pset.gfname)
  	{
*** sendquery_cleanup:
*** 1410,1415 
--- 1414,1422 
  		pset.gfname = NULL;
  	}
  
+ 	/* reset \gx's expanded-mode flag */
+ 	pset.g_expanded = false;
+ 
  	/* reset \gset trigger */
  	if (pset.gset_prefix)
  	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
new file mode 100644
index 91cf0be..c390f87
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 173,178 
--- 173,179 
  	fprintf(output, _("  \\copyright show PostgreSQL usage and distribution terms\n"));
  	fprintf(output, _("  \\errverboseshow most recent error message at maximum verbosity\n"));
  	fprintf(output, _("  \\g [FILE] or ; execute query (and send results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\gx [FILE] as \\g, but force expanded output\n"));
  	fprintf(output, _("  \\gexec execute query, then execute each value in its result\n"));
  	fprintf(output, _("  \\gset [PREFIX] execute query and store results in psql variables\n"));
  	fprintf(output, _("  \\q quit psql\n"));
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
new file mode 100644
index 195f5a1..70ff181
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
*** typedef struct _psqlSettings
*** 91,96 
--- 91,97 
  	printQueryOpt popt;
  
  	char	   *gfname;			/* one-shot file output argument for \g */
+ 	bool		g_expanded;		/* one-shot expanded output requested via \gx */
  	char	   *gset_prefix;	/* one-shot prefix argument for \gset */
  	bool		gexec_flag;		/* one-shot flag to execut

Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Tom Lane
Simon Riggs  writes:
> On 1 March 2017 at 01:58, David Steele  wrote:
>> PostgreSQL currently requires the file mode mask (umask) to be 0077.
>> However, this precludes the possibility of a user in the postgres group
>> performing a backup (or whatever).  Now that
>> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
>> unprivileged user, it makes sense to also allow a (relatively)
>> unprivileged user to perform the backup at the file system level as well.

> +1

I'd ask what is the point, considering that we don't view "cp -a" as a
supported backup technique in the first place.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-06 Thread Masahiko Sawada
On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs  wrote:
> Allow vacuums to report oldestxmin
>
> Allow VACUUM and Autovacuum to report the oldestxmin value they
> used while cleaning tables, helping to make better sense out of
> the other statistics we report in various cases.
>
> Branch
> --
> master
>
> Details
> ---
> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>
> Modified Files
> --
> src/backend/commands/vacuumlazy.c | 9 +
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
>

Should we change the example in the vacuum.sgml file as well? Patch attached.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_vacuum_example_in_doc.patch
Description: Binary data



Re: [HACKERS] objsubid vs subobjid

2017-03-06 Thread Peter Eisentraut
On 3/5/17 16:10, Jim Nasby wrote:
> BTW, did you backpatch as well? The function was added in 9.5. 
> Presumably we wouldn't normally do that, but if we think this is unused 
> enough maybe it's worth it.

It's a catalog change, so we can't backpatch it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Peter Eisentraut
On 3/6/17 03:33, Michael Banck wrote:
> Would this be a candidate for backpatching, or is the behaviour change
> in pg_dump trumping the issues it solves?

Unless someone literally has a materialized view on pg_policy, it
wouldn't make a difference, so I'm not very keen on bothering to
backpatch this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 1:29 PM, David Rowley
 wrote:
> On 6 March 2017 at 18:51, Etsuro Fujita  wrote:
>> On 2017/03/06 11:05, David Rowley wrote:
>>> The attached patch, based on 9.6,  fixes the problem by properly
>>> processing the foreign server options in
>>> postgresGetForeignJoinPaths().
>>
>> I think the fix would work well, but another way I think is much simpler and
>> more consistent with the existing code is to (1) move code for getting the
>> server info from the outer's fpinfo before calling is_foreign_expr() in
>> foreign_join_ok() and (2) add code for getting the shippable extensions info
>> from the outer's fpinfo before calling that function, like the attached.
>
> Thanks for looking over my patch.
>
> I looked over yours and was surprised to see:
>
> + /* is_foreign_expr might need server and shippable-extensions info. */
> + fpinfo->server = fpinfo_o->server;
> + fpinfo->shippable_extensions = fpinfo_o->shippable_extensions;
>
> That appears to do nothing for the other server options. For example
> use_remote_estimate, which is used in estimate_path_cost_size(). As of
> now, that's also broken. My patch fixes that problem too, yours
> appears not to.
>
> Even if you expanded the above code to process all server options, if
> someone comes along later and adds a new option, my code will pick it
> up, yours could very easily be forgotten about and be the cause of yet
> more bugs.
>
> It seems like a much better idea to keep the server option processing
> in one location, which is what I did.

I agree with this. However:
1. apply_server_options() parses the option strings again and
again, which seems a waste of CPU cycles. It should probably pick up
the options from one of the joining relations. Also, the patch calls
GetForeignServer(), which is not needed; foreign_join_ok() will
copy it from the outer relation anyway.
2. Some server options like use_remote_estimate and fetch_size are
overridden by corresponding table-level options. For a join relation
the values of these options have to be derived from some combination of
the table-level options.

I think we should write a separate function,
apply_fdw_options_for_joinrel(), and pass it the fpinfos of the joinrel
and the outer and inner relations. The function will copy the values of
server-level options and derive values for table-level options. We would
add a note there to keep this function in sync with apply_*_options(). I
don't think there's any better way to keep the options in sync for base
relations and join relations.
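
To make that concrete, here is a minimal sketch of such a helper, assuming
PgFdwRelationInfo fields as in postgres_fdw; the combination rules shown
for the table-level options are illustrative assumptions, not necessarily
what the attached patch implements:

/*
 * Sketch only: derive FDW options for a join relation from its outer
 * and inner sides.  Keep in sync with apply_server_options() and
 * apply_table_options().
 */
static void
apply_fdw_options_for_joinrel(PgFdwRelationInfo *fpinfo,
                              PgFdwRelationInfo *fpinfo_o,
                              PgFdwRelationInfo *fpinfo_i)
{
    /*
     * Server-level options are necessarily the same on both sides of a
     * pushed-down join, so copy them from the outer side.
     */
    fpinfo->server = fpinfo_o->server;
    fpinfo->shippable_extensions = fpinfo_o->shippable_extensions;

    /*
     * Table-level options can differ per side, so derive a combined
     * value: use remote estimates if either side does, and take the
     * larger fetch_size (assumed rules, for illustration only).
     */
    fpinfo->use_remote_estimate = fpinfo_o->use_remote_estimate ||
        fpinfo_i->use_remote_estimate;
    fpinfo->fetch_size = Max(fpinfo_o->fetch_size, fpinfo_i->fetch_size);
}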

Here's the patch attached.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


foreign_join_upperrel_pushdown_fix.patch
Description: Binary data



Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Amit Kapila
On Mon, Mar 6, 2017 at 4:57 PM, Michael Banck  wrote:
> Hi,
>
> On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
>> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
>> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  wrote:
>> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
>> >> wrote:>
>> >>> support related patch.  In any case, to avoid confusion I am attaching
>> >>> all the three patches with this e-mail.
>> >>
>> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
>> >> documentation and GUC descriptions.
>> >
>> > And committed parallel_index_opt_exec_support_v10.patch as well, with
>> > a couple of minor tweaks.
>>
>> Thanks a lot!  I think this is a big step forward for parallelism in
>> PostgreSQL.  Now, we have another way to drive parallel scans and I
>> hope many more queries can use parallelism.
>
> Shouldn't chapter 15.3 "Parallel Plans" in the documentation be
> updated for this as well, or is this going to be taken care of as a
> batch at the end of the development cycle, pending other added
> parallelization features?
>

Robert mentioned upthread that it is better to update it once at the end
rather than with each feature.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-06 Thread Simon Riggs
On 1 March 2017 at 01:58, David Steele  wrote:
> PostgreSQL currently requires the file mode mask (umask) to be 0077.
> However, this precludes the possibility of a user in the postgres group
> performing a backup (or whatever).  Now that
> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> unprivileged user, it makes sense to also allow a (relatively)
> unprivileged user to perform the backup at the file system level as well.

+1

> This patch introduces a new initdb param, -u/--file-mode-mask, and a new
> GUC, file_mode_mask,

Why both initdb and at server start? Seems like initdb is OK, or in pg_control.

> to allow the default mode of files and directories
> in the $PGDATA directory to be modified.

Are you saying that if this is changed, all files/directories will be
changed to the new mode?

It seems like it would be annoying to have some files in one mode,
some in another.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Michael Banck
Hi,

On Thu, Feb 16, 2017 at 08:14:28AM +0530, Amit Kapila wrote:
> On Thu, Feb 16, 2017 at 12:27 AM, Robert Haas  wrote:
> > On Wed, Feb 15, 2017 at 1:39 PM, Robert Haas  wrote:
> >> On Wed, Feb 15, 2017 at 7:11 AM, Amit Kapila  
> >> wrote:>
> >>> support related patch.  In any case, to avoid confusion I am attaching
> >>> all the three patches with this e-mail.
> >>
> >> Committed guc_parallel_index_scan_v1.patch, with changes to the
> >> documentation and GUC descriptions.
> >
> > And committed parallel_index_opt_exec_support_v10.patch as well, with
> > a couple of minor tweaks.
> 
> Thanks a lot!  I think this is a big step forward for parallelism in
> PostgreSQL.  Now, we have another way to drive parallel scans and I
> hope many more queries can use parallelism.

Shouldn't chapter 15.3 "Parallel Plans" in the documentation be
updated for this as well, or is this going to be taken care of as a
batch at the end of the development cycle, pending other added
parallelization features?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




[HACKERS] PATCH: psql show index with type info

2017-03-06 Thread Amos Bird

psql currently supports \di+ to view indexes,

                          List of relations
 Schema |        Name        | Type  | Owner |  Table  |  Size  | Description
--------+--------------------+-------+-------+---------+--------+-------------
 public | ii                 | index | amos  | i       | 131 MB |
 public | jj                 | index | amos  | i       | 12 MB  |
 public | kk                 | index | amos  | i       | 456 kB |
 public | numbers_mod2       | index | amos  | numbers | 10 MB  |
 public | numbers_mod2_btree | index | amos  | numbers | 214 MB |
(5 rows)

The column "Type" is kinda useless (all values equal "index"). Showing
the actual index type would be more interesting:

 Schema |        Name        |     Type     | Owner |  Table  |  Size  | Description
--------+--------------------+--------------+-------+---------+--------+-------------
 public | ii                 | index: gist  | amos  | i       | 131 MB |
 public | jj                 | index: gin   | amos  | i       | 12 MB  |
 public | kk                 | index: btree | amos  | i       | 456 kB |
 public | numbers_mod2       | index: gin   | amos  | numbers | 10 MB  |
 public | numbers_mod2_btree | index: btree | amos  | numbers | 214 MB |
(5 rows)

I'm not sure where to add documentation for this patch, or whether any is
needed. Please let me know if you think this patch is useful.

Best regards,
Amos

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e2e4cbc..ac27662 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3169,7 +3169,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
  " WHEN 'r' THEN '%s'"
  " WHEN 'v' THEN '%s'"
  " WHEN 'm' THEN '%s'"
- " WHEN 'i' THEN '%s'"
+ " WHEN 'i' THEN %s"
  " WHEN 'S' THEN '%s'"
  " WHEN 's' THEN '%s'"
  " WHEN 'f' THEN '%s'"
@@ -3181,7 +3181,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
  gettext_noop("table"),
  gettext_noop("view"),
  gettext_noop("materialized view"),
- gettext_noop("index"),
+ gettext_noop("'index: '||(select amname from pg_am a where a.oid = c.relam)"),
  gettext_noop("sequence"),
  gettext_noop("special"),
  gettext_noop("foreign table"),



Re: [HACKERS] Logical Replication and Character encoding

2017-03-06 Thread Petr Jelinek
On 06/03/17 11:06, Kyotaro HORIGUCHI wrote:
> At Fri, 3 Mar 2017 21:31:24 -0500, Peter Eisentraut 
>  wrote in 
> <88397afa-a8ec-8d8a-1c94-94a4795a3...@2ndquadrant.com>
>> On 3/3/17 14:51, Petr Jelinek wrote:
>>> On 03/03/17 20:37, Peter Eisentraut wrote:
 On 2/27/17 00:23, Kyotaro HORIGUCHI wrote:
> Yeah, the patch sends the converted string with the original
> length. Usually encoding conversion changes the length of
> a string. I doubt that the reverse case was working correctly.

 I think we shouldn't send the length value at all.  This might have been
 a leftover from an earlier version of the patch.

 See attached patch that removes the length value.

>>>
>>> Well the length is necessary to be able to add binary format support in
>>> future so it's definitely not an omission.
>>
>> Right.  So it appears the right function to use here is
>> pq_sendcountedtext().  However, this sends the strings without null
>> termination, so we'd have to do extra processing on the receiving end.
>> Or we could add an option to pq_sendcountedtext() to make it do what we
>> want.  I'd rather stick to standard pqformat.c functions for handling
>> the protocol.
> 
> It seems reasonable. I changed the prototype of
> pq_sendcountedtext as follows:
> 
> | extern void pq_sendcountedtext(StringInfo buf, const char *str, int slen,
> |bool countincludesself, bool binary);
> 
> I think 'binary' seems fine for the parameter here. The patch
> size increased a bit.
> 

Hmm, I find it a bit weird that binary = true means NULL-terminated while
false means not NULL-terminated; I would think that the behaviour would
be exactly the opposite, given that text strings are the ones where NULL
termination matters?

> By the way, I noticed that the postmaster launches the logical
> replication launcher even if wal_level < logical. Is that right?

Yes, that came up a couple of times in various threads. The logical
replication launcher is needed only to start subscriptions, and
subscriptions don't need wal_level to be logical; only publications need
that.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical Replication and Character encoding

2017-03-06 Thread Kyotaro HORIGUCHI
At Fri, 3 Mar 2017 21:31:24 -0500, Peter Eisentraut 
 wrote in 
<88397afa-a8ec-8d8a-1c94-94a4795a3...@2ndquadrant.com>
> On 3/3/17 14:51, Petr Jelinek wrote:
> > On 03/03/17 20:37, Peter Eisentraut wrote:
> >> On 2/27/17 00:23, Kyotaro HORIGUCHI wrote:
> >>> Yeah, the patch sends the converted string with the original
> >>> length. Usually encoding conversion changes the length of
> >>> a string. I doubt that the reverse case was working correctly.
> >>
> >> I think we shouldn't send the length value at all.  This might have been
> >> a leftover from an earlier version of the patch.
> >>
> >> See attached patch that removes the length value.
> >>
> > 
> > Well the length is necessary to be able to add binary format support in
> > future so it's definitely not an omission.
> 
> Right.  So it appears the right function to use here is
> pq_sendcountedtext().  However, this sends the strings without null
> termination, so we'd have to do extra processing on the receiving end.
> Or we could add an option to pq_sendcountedtext() to make it do what we
> want.  I'd rather stick to standard pqformat.c functions for handling
> the protocol.

It seems reasonable. I changed the prototype of
pq_sendcountedtext as follows:

| extern void pq_sendcountedtext(StringInfo buf, const char *str, int slen,
|  bool countincludesself, bool binary);

I think 'binary' seems fine for the parameter here. The patch
size increased a bit.

By the way, I noticed that the postmaster launches the logical
replication launcher even if wal_level < logical. Is that right?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 5fe1c72..62fa70d 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -96,7 +96,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	pq_sendcountedtext(&buf,
 	   VARDATA_ANY(t),
 	   VARSIZE_ANY_EXHDR(t),
-	   false);
+	   false, false);
 }
 break;
 
@@ -106,7 +106,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	char	str[12];	/* sign, 10 digits and '\0' */
 
 	pg_ltoa(num, str);
-	pq_sendcountedtext(&buf, str, strlen(str), false);
+	pq_sendcountedtext(&buf, str, strlen(str), false, false);
 }
 break;
 
@@ -116,7 +116,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	char	str[23];	/* sign, 21 digits and '\0' */
 
 	pg_lltoa(num, str);
-	pq_sendcountedtext(&buf, str, strlen(str), false);
+	pq_sendcountedtext(&buf, str, strlen(str), false, false);
 }
 break;
 
diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index a2ca2d7..1ebc50f 100644
--- a/src/backend/access/common/printtup.c
+++ b/src/backend/access/common/printtup.c
@@ -353,7 +353,8 @@ printtup(TupleTableSlot *slot, DestReceiver *self)
 			char	   *outputstr;
 
 			outputstr = OutputFunctionCall(&thisState->finfo, attr);
-			pq_sendcountedtext(&buf, outputstr, strlen(outputstr), false);
+			pq_sendcountedtext(&buf, outputstr, strlen(outputstr),
+			   false, false);
 		}
 		else
 		{
@@ -442,7 +443,7 @@ printtup_20(TupleTableSlot *slot, DestReceiver *self)
 		Assert(thisState->format == 0);
 
 		outputstr = OutputFunctionCall(&thisState->finfo, attr);
-		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true);
+		pq_sendcountedtext(&buf, outputstr, strlen(outputstr), true, false);
 	}
 
 	pq_endmessage(&buf);
diff --git a/src/backend/libpq/pqformat.c b/src/backend/libpq/pqformat.c
index c8cf67c..f799130 100644
--- a/src/backend/libpq/pqformat.c
+++ b/src/backend/libpq/pqformat.c
@@ -129,23 +129,24 @@ pq_sendbytes(StringInfo buf, const char *data, int datalen)
  */
 void
 pq_sendcountedtext(StringInfo buf, const char *str, int slen,
-   bool countincludesself)
+   bool countincludesself, bool binary)
 {
-	int			extra = countincludesself ? 4 : 0;
+	int			extra_self		= countincludesself ? 4 : 0;
+	int			extra_binary	= binary ? 1 : 0;
 	char	   *p;
 
 	p = pg_server_to_client(str, slen);
 	if (p != str)/* actual conversion has been done? */
 	{
 		slen = strlen(p);
-		pq_sendint(buf, slen + extra, 4);
-		appendBinaryStringInfo(buf, p, slen);
+		pq_sendint(buf, slen + extra_self + extra_binary, 4);
+		appendBinaryStringInfo(buf, p, slen + extra_binary);
 		pfree(p);
 	}
 	else
 	{
-		pq_sendint(buf, slen + extra, 4);
-		appendBinaryStringInfo(buf, str, slen);
+		pq_sendint(buf, slen + extra_self + extra_binary, 4);
+		appendBinaryStringInfo(buf, str, slen + extra_binary);
 	}
 }
 
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index bc6e9b5..c5e4214 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -417,7 +417,6 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple)
 		Form_pg_type typclass

Re: [HACKERS] UPDATE of partition key

2017-03-06 Thread Amit Langote
Hi,

On 2017/03/02 15:23, Amit Khandekar wrote:
> On 23 February 2017 at 16:02, Amit Langote
>  wrote:
>>
>>> 2. In the patch, as part of the row movement, ExecDelete() is called
>>> followed by ExecInsert(). This is done that way, because we want to
>>> have the ROW triggers on that (sub)partition executed. If a user has
>>> explicitly created DELETE and INSERT BR triggers for this partition, I
>>> think we should run those. While at the same time, another question
>>> is, what about UPDATE trigger on the same table ? Here again, one can
>>> argue that because this UPDATE has been transformed into a
>>> DELETE-INSERT, we should not run UPDATE trigger for row-movement. But
>>> there can be a counter-argument. For e.g. if a user needs to make sure
>>> about logging updates of particular columns of a row, he will expect
>>> the logging to happen even when that row was transparently moved. In
>>> the patch, I have retained the firing of UPDATE BR trigger.
>>
>> What of UPDATE AR triggers?
> 
> I think it does not make sense to run after-row triggers in case of
> row movement. No update has happened on that leaf partition. This
> reasoning can also apply to BR update triggers. But the reasons for
> having a BR trigger and AR triggers are quite different. Generally, a
> user needs to make some modifications to the row before getting the
> final NEW row into the database, and hence [s]he defines a BR trigger
> for that. And we can't just silently skip this step only because the
> final row went into some other partition; in fact the row movement
> itself might depend on what the BR trigger did with the row. Whereas
> AR triggers are typically written to perform some other operation once
> it is certain the row was actually updated. In case of row movement,
> it is not actually updated.

OK, so it'd be better to clarify in the documentation that that's the case.

>> As a comment on how row-movement is being handled in code, I wonder if it
>> could be be made to look similar structurally to the code in ExecInsert()
>> that handles ON CONFLICT DO UPDATE.  That is,
>>
>> if (partition constraint fails)
>> {
>> /* row movement */
>> }
>> else
>> {
>> /* ExecConstraints() */
>> /* heap_update(), EvalPlanQual(), and ExecInsertIndexTuples() */
>> }
> 
> I guess this is what has been effectively done for row movement, no ?

Yes, although it seems nice how the formatting of the code in ExecInsert()
makes it apparent that they are distinct code paths.  OTOH, the additional
diffs caused by the suggested formatting might confuse other reviewers.

> Looking at that, I found that in the current patch, if there is no
> row-movement happening, ExecPartitionCheck() effectively gets called
> twice: first when ExecPartitionCheck() is explicitly called for the
> row-movement-required check, and a second time in the ExecConstraints()
> call. Maybe there should be two separate functions,
> ExecCheckConstraints() and ExecPartitionConstraints(), and also
> ExecConstraints() that just calls both. This way we can call the
> appropriate functions in the row-movement case, and the
> other callers would continue to call ExecConstraints().

One random idea: we could add a bool ri_PartitionCheckOK which is set to
true after it is checked in ExecConstraints().  And modify the condition
in ExecConstraints() as follows:

if (resultRelInfo->ri_PartitionCheck &&
+   !resultRelInfo->ri_PartitionCheckOK &&
!ExecPartitionCheck(resultRelInfo, slot, estate))

>>> 3. In case of a concurrent update/delete, suppose session A has locked
>>> the row for deleting it. Now a session B has decided to update this
>>> row and that is going to cause row movement, which means it will
>>> delete it first. But when session A is finished deleting it, session B
>>> finds that it is already deleted. In such case, it should not go ahead
>>> with inserting a new row as part of the row movement. For that, I have
>>> added a new parameter 'already_delete' for ExecDelete().
>>
>> Makes sense.  Maybe: already_deleted -> concurrently_deleted.
> 
> Right, concurrently_deleted sounds more accurate. In the next patch, I
> will change that.

Okay, thanks.

>>> Of course, this still won't completely solve the concurrency anomaly.
>>> In the above case, the UPDATE of Session B gets lost. May be, for a
>>> user that does not tolerate this, we can have a table-level option
>>> that disallows row movement, or will cause an error to be thrown for
>>> one of the concurrent session.
>>
>> Will this table-level option be specified for a partitioned table once or
>> for individual partitions?
> 
> My opinion is, if we decide to have a table-level option, it should be
> on the root partition, to keep it simple.

I see.

>> But that starts to sound less attractive when one realizes that
>> that will occur for every row that wants to move.
> 
> If we manage to call ExecSetupPartitionTupleRouting() during execution
> phase only once for the very first time we

Re: [HACKERS] Print correct startup cost for the group aggregate.

2017-03-06 Thread Ashutosh Bapat
>
>
> I understood your reasoning for why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting. But what I didn't understand is
> why a higher startup cost for aggregation by sorting would force hash
> aggregation to be chosen? I am not clear about this part.

See this comment in cost_agg()
 * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
 * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
 * input path is already sorted appropriately, AGG_SORTED should be
 * preferred (since it has no risk of memory overflow).

AFAIU, if the input is already sorted, aggregation by sorting and
aggregation by hashing will have almost the same cost, with the startup
cost of AGG_SORTED being lower than that of AGG_HASHED. Because of the
lower startup cost, AGG_SORTED gets chosen by add_path() over the
AGG_HASHED path.
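
As a toy illustration of that tie-break (numbers taken from the plans
upthread, with the hashed total cost assumed equal; this is a
self-contained toy model, not the real add_path()/compare_path_costs()
code in src/backend/optimizer/util/pathnode.c):

#include <stdio.h>

/* Toy model of the cost tie-break described above; not PostgreSQL code. */
typedef struct
{
    double startup_cost;
    double total_cost;
} ToyPath;

int
main(void)
{
    ToyPath sorted_agg = {81634.33, 85102.04};  /* AGG_SORTED */
    ToyPath hashed_agg = {85102.04, 85102.04};  /* AGG_HASHED, same total */

    /* At equal total cost, the lower startup cost wins the comparison. */
    if (sorted_agg.total_cost <= hashed_agg.total_cost &&
        sorted_agg.startup_cost < hashed_agg.startup_cost)
        printf("AGG_SORTED kept: cheaper startup at equal total cost\n");

    return 0;
}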

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread vinayak


On 2017/03/06 17:02, Amit Langote wrote:

Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:

The attached patch reports the different phases of the ANALYZE command.
Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+
+ computing heap stats
+ 
+   VACUUM is currently computing heap stats.
+ 
+
+
+ computing index stats
+ 
+   VACUUM is currently computing index stats.
+ 
+
+
+ cleaning up indexes
+ 
+   ANALYZE is currently cleaning up indexes.
+ 
+
+   
+   
+  

The entries mentioning VACUUM should actually say ANALYZE.

Yes. Thank you.
I will fix it.

Regards,
Vinayak Pokale
NTT Open Source Software Center




Re: [HACKERS] Restricting maximum keep segments by repslots

2017-03-06 Thread Kyotaro HORIGUCHI
Thank you for the comment.

At Fri, 3 Mar 2017 14:47:20 -0500, Peter Eisentraut 
 wrote in 

> On 3/1/17 19:54, Kyotaro HORIGUCHI wrote:
> >> Please measure it in size, not in number of segments.
> > It was difficult to decide which is reasonable, but I named it
> > after wal_keep_segments because it has a similar effect.
> > 
> > In bytes(or LSN)
> >  max_wal_size
> >  min_wal_size
> >  wal_write_flush_after
> > 
> > In segments
> >  wal_keep_segments
> 
> We have been moving away from measuring in segments.  For example,
> checkpoint_segments was replaced by max_wal_size.
> 
> Also, with the proposed patch that allows changing the segment size more
> easily, this will become more important.  (I wonder if that will require
> wal_keep_segments to change somehow.)

Agreed. It is 'max_slot_wal_keep_size' in the new version.

wal_keep_segments should perhaps be removed someday.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From a646db76ac71ba1bff1105d55b8042fb451022bf Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 28 Feb 2017 11:39:48 +0900
Subject: [PATCH] Add WAL relief vent for replication slots

Adds a capability to limit the number of segments kept by replication
slots by a GUC variable.
---
 src/backend/access/transam/xlog.c | 12 
 src/backend/utils/misc/guc.c  | 11 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/xlog.h |  1 +
 4 files changed, 25 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8973583..cb23fbc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -104,6 +104,7 @@ int			wal_level = WAL_LEVEL_MINIMAL;
 int			CommitDelay = 0;	/* precommit delay in microseconds */
 int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int			wal_retrieve_retry_interval = 5000;
+int			max_slot_wal_keep_size = 0;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -9267,6 +9268,17 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 
 		XLByteToSeg(keep, slotSegNo);
 
+		/* emergency vent */
+		if (max_slot_wal_keep_size > 0 &&
+			segno - slotSegNo > max_slot_wal_keep_size)
+		{
+			ereport(WARNING,
+	(errmsg ("restart LSN of replication slots is ignored by checkpoint"),
+	 errdetail("Some replication slots lose required WAL segments to continue.")));
+			/* slotSegNo cannot be negative here */
+			slotSegNo = segno - max_slot_wal_keep_size;
+		}
+
 		if (slotSegNo <= 0)
 			segno = 1;
 		else if (slotSegNo < segno)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0707f66..20fe29a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2335,6 +2335,17 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
+		{"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
+			gettext_noop("Sets the maximum size of extra WALs kept by replication slots."),
+		 NULL,
+		 GUC_UNIT_XSEGS
+		},
+		&max_slot_wal_keep_size,
+		0, 0, INT_MAX,
+		NULL, NULL, NULL
+	},
+
+	{
 		{"wal_sender_timeout", PGC_SIGHUP, REPLICATION_SENDING,
 			gettext_noop("Sets the maximum time to wait for WAL replication."),
 			NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 157d775..93e2f13 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -233,6 +233,7 @@
 #max_wal_senders = 10		# max number of walsender processes
 # (change requires restart)
 #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
+#max_slot_wal_keep_size = 0	# measured in bytes; 0 disables
 #wal_sender_timeout = 60s	# in milliseconds; 0 disables
 
 #max_replication_slots = 10	# max number of replication slots
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 9f036c7..93ee819 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -97,6 +97,7 @@ extern bool reachedConsistency;
 extern int	min_wal_size;
 extern int	max_wal_size;
 extern int	wal_keep_segments;
+extern int	max_slot_wal_keep_size;
 extern int	XLOGbuffers;
 extern int	XLogArchiveTimeout;
 extern int	wal_retrieve_retry_interval;
-- 
2.9.2




Re: [HACKERS] dropping partitioned tables without CASCADE

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 11:12 AM, Ashutosh Bapat
 wrote:
> On Mon, Mar 6, 2017 at 11:05 AM, Simon Riggs  wrote:
>> On 6 March 2017 at 05:29, Ashutosh Bapat
>>  wrote:
>>
>>> Just to confirm, you want the output to look like this
> \d+ t1
>                                       Table "public.t1"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           | not null |         | plain   |              |
> Partition key: RANGE (a)
> Partitions: t1p1 FOR VALUES FROM (0) TO (100), HAS PARTITIONS
>             t1p2 FOR VALUES FROM (100) TO (200)
>>>

 lowercase please
>>>
>>> Except for HAS PARTITIONS, everything is part of today's output. Given
>>> the current output, HAS PARTITIONS should be in upper case.
>>
>> "has partitions" is not part of the DDL, whereas "FOR VALUES FROM (0)
>> TO (100)" is. So ISTM sensible to differentiate between DDL and
>> non-ddl using upper and lower case.
>>
>
> Make sense. Will try to cook up a patch soon.

Here's the patch. I have added a testcase in insert.sql to test \d+
output for a partitioned table which has some partitions that are further
partitioned and some that are not. I have also refactored a statement a
few lines above, replacing an if condition with the ? : operator, similar
to the code a few lines below.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pg_part_desc.patch
Description: Binary data



Re: [HACKERS] Print correct startup cost for the group aggregate.

2017-03-06 Thread Rushabh Lathia
Thanks Ashutosh & Robert for the explanation.

On Mon, Mar 6, 2017 at 10:02 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Sat, Mar 4, 2017 at 2:50 PM, Robert Haas  wrote:
> > On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
> >  wrote:
> >> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lat...@gmail.com> wrote:
> >>> While reading through the cost_agg() I found that startup cost for the
> >>> group aggregate is not correctly assigned. Due to this explain plan is
> >>> not printing the correct startup cost.
> >>>
> >>> Without patch:
> >>>
> >>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> >>> filler like '%foo%' group by aid;
> >>>                                  QUERY PLAN
> >>> -----------------------------------------------------------------------------
> >>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
> >>>    Group Key: aid
> >>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
> >>>          Sort Key: aid
> >>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155 width=8)
> >>>                Filter: (filler ~~ '%foo%'::text)
> >>> (6 rows)
> >>>
> >>> With patch:
> >>>
> >>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> >>> filler like '%foo%' group by aid;
> >>>                                  QUERY PLAN
> >>> -----------------------------------------------------------------------------
> >>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
> >>>    Group Key: aid
> >>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
> >>>          Sort Key: aid
> >>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155 width=8)
> >>>                Filter: (filler ~~ '%foo%'::text)
> >>> (6 rows)
> >>>
> >>
> >> The reason why startup_cost = input_startup_cost and not
> >> input_total_cost for aggregation by sorting is we don't need the whole
> >> input before the Group/Agg plan can produce the first row. But I think
> >> setting startup_cost = input_startup_cost is also not exactly correct.
> >> Before the plan can produce one row, it has to transit through all the
> >> rows belonging to the group to which the first row belongs. On an
> >> average it has to scan (total number of rows)/(number of groups)
> >> before producing the first aggregated row. startup_cost will be
> >> input_startup_cost + cost to scan (total number of rows)/(number of
> >> groups) rows + cost of transiting over those many rows. Total cost =
> >> startup_cost + cost of scanning and transiting through the remaining
> >> number of input rows.
> >
> > While that idea has some merit, I think it's inconsistent with current
> > practice.  cost_seqscan(), for example, doesn't include the cost of
> > reading the first page in the startup cost, even though that certainly
> > must be done before returning the first row.
>
> OTOH, while costing for merge join, initial_cost_mergejoin() seems to
> consider the cost of scanning outer and inner relations up to the first
> matching tuple on both sides in the startup_cost. Different policies
> at different places.
>
> > I think there have been
> > previous discussions of switching over to the practice for which you
> > are advocating here, but my impression (without researching) is that
> > the current practice is more like what Rushabh did.
> >
>
> I am not sure Rushabh's approach is correct. Here's the excerpt from my
> mail.
>
> >> The reason why startup_cost = input_startup_cost and not
> >> input_total_cost for aggregation by sorting is we don't need the whole
> >> input before the Group/Agg plan can produce the first row.
>
>

> With Rushabh's approach the startup cost of aggregation by sorting
> would be way higher than it is right now. Secondly, it would match that
> of hash aggregation and thus force hash aggregation to be chosen in
> almost all cases.
>

I understood your reasoning for why startup_cost = input_startup_cost and not
input_total_cost for aggregation by sorting. But what I didn't understand is
why a higher startup cost for aggregation by sorting would force hash
aggregation to be chosen? I am not clear about this part.

--
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



Regards.
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Logical replication and inheritance

2017-03-06 Thread Amit Langote
On 2017/03/06 18:04, Amit Langote wrote:
> One more option is for OpenTableList() called by CreatePublication() and
> AlterPublicationTables() to not disregard inheritance, as if ONLY was
> specified.

Oops, meant to say: One more option is for OpenTableList to disregard
inheritance...

Thanks,
Amit






Re: [HACKERS] Logical replication and inheritance

2017-03-06 Thread Amit Langote
On 2017/03/04 4:24, Peter Eisentraut wrote:
> On 2/27/17 01:57, Amit Langote wrote:
>> I see that if the table is a inheritance parent, and ONLY is not
>> specified, the child tables are also added to the publication.
> 
>> If the child table is later removed from the inheritance hierarchy, it
>> continues to be a part of the publication.
> 
>> Perhaps that's intentional?
> 
> I came across this the other day as well.  It's not clear what the best
> way for this to behave would be.  Another option would be to check the
> then-current inheritance relationships in the output plugin.

I thought removal of a table from the inheritance hierarchy would delete it
from publications that its parents are part of.

One more option is for OpenTableList() called by CreatePublication() and
AlterPublicationTables() to not disregard inheritance, as if ONLY was
specified.

Related: I noticed that if you dump a database containing a publication
and an inheritance parent is published by it, loading that into another
database causes the following error for each child.

ERROR:  relation "bar" is already member of publication "foo_pub"

Because when foo, the parent, is added to the foo_pub publication, bar (a
child of foo) is added implicitly.

Thanks,
Amit






Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Kyotaro HORIGUCHI
By the way,

At Mon, 06 Mar 2017 17:07:55 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170306.170755.68410634.horiguchi.kyot...@lab.ntt.co.jp>
> Ok, I think I understand the complete picture.
> 
> At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
> > > I can guess two ways to fix this. One is change the definition of
> > > T_NAME.
> > > 
> > > | #define T_NAME(l) \
> > > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
> > > |LWLockTrancheArray[(l)->tranche] : \
> > > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED]
> > > 
> > It makes the patch small but I don't think the shape is
> > > desirable.
> > > 
> > > Then, the other way is registering named tranches into the main
> > > tranche array. The number and names of the requested named
> > tranches are known to the postmaster, so they can be allocated and
> > initialized at that time.
> > > 
> > The mapping of the shared memory is inherited by backends, so
> > > pointing to the addresses in shared memory will work in the
> > > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
> > > in EXEC_BACKEND case.
> 
> But this doesn't work for
LWLockNewTrancheId/LWLockRegisterTranche, and that is a valuable
interface. So the measure we can take is to redefine T_NAME.

I realized that *I* have used the interface
RequestNamedLWLockTranche in certain extensions, but I had completely
forgotten that and only faintly recalled it after one of my colleagues
pointed it out.

Sorry for going off-topic.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Change in "policy" on dump ordering?

2017-03-06 Thread Michael Banck
Hi,

On Sat, Mar 04, 2017 at 02:49:36PM -0500, Peter Eisentraut wrote:
> On 3/1/17 08:36, Peter Eisentraut wrote:
> > On 2/22/17 18:24, Jim Nasby wrote:
> >>> Yes, by that logic matview refresh should always be last.
> >>
> >> Patches for head attached.
> >>
> >> RLS was the first item added after DO_REFRESH_MATVIEW, which was added 
> >> in 9.5. So if we want to treat this as a bug, they'd need to be patched 
> >> as well, which is a simple matter of swapping 33 and 34.
> > 
> > I wonder whether we should emphasize this change by assigning
> > DO_REFRESH_MATVIEW a higher number, like 100?
> 
> Since there wasn't any interest in that idea, I have committed Jim's
> patch as is.

Would this be a candidate for backpatching, or is the behaviour change
in pg_dump trumping the issues it solves?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Ashutosh Bapat
>
> We can leave it for the committer to decide, maybe.  Committers often
> rewrite surrounding comments to improve wording, correcting factual
> errors, etc.
>

Sure.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] wait events for disk I/O

2017-03-06 Thread Rushabh Lathia
On Sat, Mar 4, 2017 at 7:53 PM, Amit Kapila  wrote:

> On Mon, Feb 20, 2017 at 4:04 PM, Rushabh Lathia
>  wrote:
> >
> > My colleague Rahila reported compilation issue with
> > the patch. Issue was only coming with we do the clean
> > build on the branch.
> >
> > Fixed the same into latest version of patch.
> >
>
> Few assorted comments:
>
> 1.
> +
> + WriteBuffile
> + Waiting during
> buffile read operation.
> +
>
> Operation name and definition are not matching.
>
>
Will fix this.


>
> 2.
> +FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
>  {
>  #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
>   int returnCode;
> @@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
>   if (returnCode < 0)
>   return returnCode;
>
> + pgstat_report_wait_start(wait_event_info);
>   returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
> POSIX_FADV_WILLNEED);
> + pgstat_report_wait_end();
>
> AFAIK, this call is non-blocking and will just initiate a read, so do
> you think we should record wait event for such a call.
>
>
Yes, the prefetch call is non-blocking and will just initiate a read. But
including this prefetch info in the wait events will give more information
about the system.

3.
> - written = FileWrite(src->vfd, waldata_start, len);
> + written = FileWrite(src->vfd, waldata_start, len,
> + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
>   if (written != len)
>   ereport(ERROR,
>   (errcode_for_file_access(),
> @@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
>   hash_seq_init(&seq_status, state->rs_logical_mappings);
>   while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) !=
> NULL)
>   {
> - if (FileSync(src->vfd) != 0)
> + if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
>
>
> Do we want to consider recording wait event for both write and sync?
> It seems to me OS level writes are relatively cheap and sync calls are
> expensive, so shouldn't we just log for sync calls?  I could see the
> similar usage at multiple places in the patch.
>
>
Yes, I thought of adding a wait event only for the sync, but then decided
on recording the wait event for both write and sync. I understand that
OS-level writes are cheap, but they still have some cost attached. Also, I
thought a monitoring tool built on these wait events would capture more
useful data if we collect as much info as we can. Or maybe not.

I am open to other opinions/suggestions.


--
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>



-- 
Rushabh Lathia


Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Amit Langote
On 2017/03/06 17:01, Ashutosh Bapat wrote:
> On Mon, Mar 6, 2017 at 1:26 PM, Amit Langote
>  wrote:
>> On 2017/03/06 16:49, Ashutosh Bapat wrote:
>>> On Mon, Mar 6, 2017 at 12:48 PM, Amit Langote wrote:
 On 2017/03/06 15:41, Michael Paquier wrote:
> This comment is not completely correct. Children can be temp tables,
> they just cannot be temp tables of other backends. It seems to me that
> you could still keep this code simple and remove has_child..

I updated the comment.  I recall having posted a patch for that once, but
perhaps it went unnoticed.
>>>
>>> The existing comment only specifies "temp tables" and not "temp table
>>> of other backends". The new comment keeps that part same and adds
>>> partitioned table case. So, I don't see any reason to change the "temp
>>> tables" to "temp table of other backends" in this patch.
>>
>> Hmm.  A separate patch might be fine but why not fix the incorrect part
>> while we are updating the whole comment anyway.
> 
> There must be a reason why that comment is written the way it's
> written. I guess the "from other backends" part of the "temp tables"
> story has been explained just a few lines above and the author/s didn't
> want to repeat it again.

That's perhaps true.  I guess it depends on who reads it.  Someone might
think the comment is incorrect because *not all* temporary child tables
are excluded.

> There's no point in changing it while we are not
> touching temp tables handling.

We can leave it for the committer to decide, maybe.  Committers often
rewrite surrounding comments to improve wording, correcting factual
errors, etc.

Thanks,
Amit






Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Michael Paquier
On Mon, Mar 6, 2017 at 4:18 PM, Amit Langote
 wrote:
> About autovacuum_* parameters - we currently don't handle partitioned
> tables in autovacuum.c, because no statistics are reported for them. That
> is, relation_needs_vacanalyze() will never return true for dovacuum,
> doanalyze and wraparound if it is passed a RELKIND_PARTITIONED_TABLE
> relation.  That's something to be fixed separately though.  When we add
> autovacuum support for partitioned tables, we may want to add a new set of
> reloptions (new because partitioned tables still won't support all options
> returned by heap_reloptions()).  Am I missing something?

OK. I got confused by the fact that settings on parents should
supersede the settings of the children. Or, if you want, a value set
on the parent would by default apply to a child that has no value set,
which is where autovacuum_enabled makes sense even for partitioned
tables. That leads to the point that parents could have reloptions,
with a new category of the type RELOPT_KIND_PARTITION. Still, now that
I read it, it is also sensible to bypass the parents in autovacuum.
And the handling is simpler.
-- 
Michael




Re: [HACKERS] Radix tree for character conversion

2017-03-06 Thread Kyotaro HORIGUCHI
Hello,

At Fri, 3 Mar 2017 12:53:04 +0900, Michael Paquier  
wrote in 
> On Thu, Mar 2, 2017 at 2:20 PM, Kyotaro HORIGUCHI
>  wrote:
> > 5) Just remove plain map files and all related code. In addition to
> >    that, the Makefile stores a hash digest of the authority files in
> >    Unicode/authority_hashes.txt or something that is managed by
> >    git.
> 
> That may be an idea to check for differences across upstream versions.
> But that sounds like a separate discussion to me.

Fine with me either.

> > This digest may differ among platforms (typically from CR/LF
> > differences), but we can assume *nix for this usage.
> >
> > I will send the next version after this discussion is settled.
> 
> Sure. There is not much point to move on without Heikki's opinion at
> least, or anybody else like Ishii-san or Tom who are familiar with
> this code. I would think that Heikki would be the committer to pick up
> this change though.

So, this is the latest version of this patch, in the shape of
option 1.


| need some extra opinions is what to do with the old maps:
| 1) Just remove them, replacing the old maps by the new radix tree maps.
| 2) Keep them around in the backend code, even if they are useless.
| 3) Use a GUC to be able to switch from one to the other, giving a
| fallback method in case of emergency.
| 4) Use an extension module to store the old maps with as well the
| previous build code, so as sanity checks can still be performed on the
| new maps.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


0001-Use-radix-tree-for-character-conversion_20170306.patch.gz
Description: Binary data



Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray

2017-03-06 Thread Kyotaro HORIGUCHI
Ok, I think I understand the complete picture.

At Mon, 06 Mar 2017 15:58:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170306.155856.198084190.horiguchi.kyot...@lab.ntt.co.jp>
> > I can guess two ways to fix this. One is to change the definition of
> > T_NAME.
> > 
> > | #define T_NAME(l) \
> > |   ((l)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
> > |LWLockTrancheArray[(l)->tranche] : \
> > |NamedLWLockTrancheArray[(l)->tranche - LWTRANCHE_FIRST_USER_DEFINED]
> > 
> > It makes the patch small, but I don't think the shape is
> > desirable.
> > 
> > Then, the other way is registering named tranches into the main
> > tranche array. The number and names of the requested named
> > tranches are known to the postmaster, so they can be allocated and
> > initialized at that time.
> > 
> > The mapping of the shared memory is inherited by backends, so
> > pointing to the addresses in shared memory will work in the
> > !EXEC_BACKEND case. I confirmed that the behavior is ensured also
> > in the EXEC_BACKEND case.

But this doesn't work for LWLockNewTrancheId/LWLockRegisterTranche,
and that is a valuable interface. So the measure we can take is
redefining T_NAME, as in the diff below; a sketch of where such
tranche ids come from follows.
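
For context, a hedged extension-side sketch of where user-defined
tranche ids (those at or above LWTRANCHE_FIRST_USER_DEFINED) come
from; the hook names are hypothetical, but RequestNamedLWLockTranche,
GetNamedLWLockTranche, LWLockNewTrancheId and LWLockRegisterTranche
are the real entry points.

#include "postgres.h"
#include "storage/lwlock.h"

static LWLock *fixed_lock;

/* Hypothetical hook run while shared memory is being sized. */
void
my_request_hook(void)
{
    /* Path 1: a named tranche, known to the postmaster up front. */
    RequestNamedLWLockTranche("my_extension", 1);
}

/* Hypothetical hook run once shared memory exists. */
void
my_startup_hook(void)
{
    int         tranche_id;

    fixed_lock = &(GetNamedLWLockTranche("my_extension"))->lock;

    /* Path 2: a dynamic id; each backend must register its name. */
    tranche_id = LWLockNewTrancheId();
    LWLockRegisterTranche(tranche_id, "my_dynamic_tranche");
}

Only path 1 is known to the postmaster, so a postmaster-built array
cannot cover path 2; hence the T_NAME() change below.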

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index ab81d94..7c4c8f4 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -115,7 +115,9 @@ static char **LWLockTrancheArray = NULL;
 static int	LWLockTranchesAllocated = 0;
 
 #define T_NAME(lock) \
-	(LWLockTrancheArray[(lock)->tranche])
+	((lock)->tranche < LWTRANCHE_FIRST_USER_DEFINED ? \
+	LWLockTrancheArray[(lock)->tranche] : \
+	 NamedLWLockTrancheArray[(lock)->tranche - LWTRANCHE_FIRST_USER_DEFINED].trancheName)
 
 /*
  * This points to the main array of LWLocks in shared memory.  Backends inherit



[HACKERS] pg_dump segfaults with publication

2017-03-06 Thread Amit Langote
Hi,

pg_dump segfaults if there is more than one DO_PUBLICATION_REL object
to dump.

create table foo (a int);
create publication foo_pub;
alter publication foo_pub add table foo;

$ pg_dump


create table bar (a int);
alter publication foo_pub add table bar;

$ pg_dump -s
Segmentation fault (core dumped)

The reason is that DumpableObject.name is not being set in
getPublicationTables().  The attached patch fixes that; a standalone
illustration of the failure mode follows below.
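
Here is a minimal standalone illustration of why a single such object
is harmless but two crash, assuming the sort comparator ends up
strcmp()ing the NULL names; DummyObj and cmp_by_name are mine, not
pg_dump's.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    const char *name;           /* stands in for DumpableObject.name */
} DummyObj;

static int
cmp_by_name(const void *a, const void *b)
{
    const DummyObj *o1 = (const DummyObj *) a;
    const DummyObj *o2 = (const DummyObj *) b;

    /* Undefined behavior (usually a segfault) if a name is NULL. */
    return strcmp(o1->name, o2->name);
}

int
main(void)
{
    DummyObj    one[1] = {{NULL}};
    DummyObj    two[2] = {{NULL}, {NULL}};

    qsort(one, 1, sizeof(DummyObj), cmp_by_name);   /* no comparison needed */
    printf("one unnamed object: fine\n");
    qsort(two, 2, sizeof(DummyObj), cmp_by_name);   /* strcmp(NULL, NULL) */
    printf("not reached in practice\n");
    return 0;
}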

Thanks,
Amit
>From 384fef77172168452efb22123e01b0a6349683e8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 6 Mar 2017 16:44:13 +0900
Subject: [PATCH] Set DumpableObject.name for PublicationRelInfo objects

---
 src/bin/pg_dump/pg_dump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7273ec8fe2..83c9b014e7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3532,6 +3532,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 			pubrinfo[j].dobj.catId.oid = atooid(PQgetvalue(res, j, i_oid));
 			AssignDumpId(&pubrinfo[j].dobj);
 			pubrinfo[j].dobj.namespace = tbinfo->dobj.namespace;
+			pubrinfo[j].dobj.name = tbinfo->dobj.name;
 			pubrinfo[j].pubname = pg_strdup(PQgetvalue(res, j, i_pubname));
 			pubrinfo[j].pubtable = tbinfo;
 		}
-- 
2.11.0




Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread Amit Langote
Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:
> The attached patch reports the different phases of analyze command.
> Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+    <row>
+     <entry><literal>computing heap stats</literal></entry>
+     <entry>
+       <command>VACUUM</command> is currently computing heap stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>computing index stats</literal></entry>
+     <entry>
+       <command>VACUUM</command> is currently computing index stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>cleaning up indexes</literal></entry>
+     <entry>
+       <command>ANALYZE</command> is currently cleaning up indexes.
+     </entry>
+    </row>
+   </tbody>
+  </tgroup>
+ </table>

The entries mentioning VACUUM should actually say ANALYZE.
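
Beyond the doc nitpick, a hedged sketch of how such a patch presumably
reports its phases, reusing the progress API that
pg_stat_progress_vacuum already relies on; the PROGRESS_ANALYZE_*
values here are illustrative placeholders, not the patch's actual
definitions.

#include "postgres.h"
#include "pgstat.h"

/* Illustrative slot and phase numbers, not taken from the patch. */
#define PROGRESS_ANALYZE_PHASE                      0
#define PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS   2

static void
report_compute_heap_stats(void)
{
    /* Updates the phase column shown in the progress view. */
    pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
                                 PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);
}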

Thanks,
Amit






Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Ashutosh Bapat
On Mon, Mar 6, 2017 at 1:26 PM, Amit Langote
 wrote:
> On 2017/03/06 16:49, Ashutosh Bapat wrote:
>> On Mon, Mar 6, 2017 at 12:48 PM, Amit Langote wrote:
>>> On 2017/03/06 15:41, Michael Paquier wrote:
 This comment is not completely correct. Children can be temp tables,
 they just cannot be temp tables of other backends. It seems to me that
 you could still keep this code simple and remove has_child..
>>>
>>> I updated the comment.  I recall having posted a patch for that once,
>>> but it perhaps went unnoticed.
>>
>> The existing comment only specifies "temp tables" and not "temp table
>> of other backends". The new comment keeps that part the same and adds
>> the partitioned table case. So, I don't see any reason to change the
>> "temp tables" to "temp table of other backends" in this patch.
>
> Hmm.  A separate patch might be fine, but why not fix the incorrect
> part while we are updating the whole comment anyway?

There must be a reason why that comment is written the way it is. I
guess the "from other backends" part of the "temp tables" story has
been explained just a few lines above, and the author(s) didn't want
to repeat it. There's no point in changing it while we are not
touching temp table handling.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



