Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-10 Thread Fabien COELHO



Ok, so that's not just PROMPT_READY, that's every prompt...which might be
ok. ? is a great optional cue, and you're thinking on 2 levels deep, 2nd
level always being '.'?


Yep. The idea is to keep it short, but to still have something to say 
"there are more levels" in the stack, hence the one dot. Basically I just 
compressed your 4 level proposal, and added a separator to deal with the 
preceding stuff and suggest the conditional.


--
Fabien.




Re: [HACKERS] Parallel bitmap heap scan

2017-02-10 Thread Dilip Kumar
On Wed, Feb 8, 2017 at 8:07 AM, Robert Haas  wrote:

Thanks for the detailed review and your valuable feedback.

> I think it would be useful to break the remaining patch into two
> parts, one that enhances the tidbitmap.c API and another that uses
> that to implement Parallel Bitmap Heap Scan.

I have done that.

> There are several cosmetic problems here.  You may have noticed that
> all function prototypes in PostgreSQL header files are explicitly
> declared extern; yours should be, too.  Also, there is extra
> whitespace before some of the variable names here, like
> "ParallelTBMInfo * tbminfo" instead of "ParallelTBMInfo *tbminfo". If
> that is due to pgindent, the solution is to add the typedef names to
> your local typedef list.  Also, tbm_restore_shared_members doesn't
> actually exist.

Fixed

> 1. Add an additional argument to tbm_create(), dsa_area *dsa.  If it's
> NULL, we allocate backend-private memory for the hash entries as
> normal.  If it's not NULL, we use the dsa_area to store the hash entries,
> using the infrastructure added by your 0002 and revised in
> c3c4f6e1740be256178cd7551d5b8a7677159b74.  (You can use a flag in the
> BitmapIndexScan and BitmapOr to decide whether to pass NULL or
> es_query_dsa.)
Done

>
> 2. Add a new function dsa_pointer tbm_prepare_shared_iterate(TIDBitmap
> *tbm) which allocates shared iteration arrays using the DSA passed to
> tbm_create() and returns a dsa_pointer to the result.  Arrange this so
> that if it's called more than once, each call returns a separate
> iterator, so that you can call it once to get the main iterator and a
> second time for the prefetch iterator, but have both of those point to
> the same underlying iteration arrays.
>
> 3. Add a new function TBMSharedIterator
> *tbm_attach_shared_iterate(dsa_area *dsa, dsa_pointer) which is called
> once per backend and gets passed the dsa_pointer from the previous
> step.
>
> 4. Add a new function TBMIterateResult
> *tbm_shared_iterate(TBMSharedIterator *) to fetch the next result.

I have implemented these three APIs as you explained, with one change:
tbm_attach_shared_iterate also needs to take the TBM, because each
worker creates its own copy of the TBM and needs to get the TBM-related
information from the shared location. Some of the information (like the
chunk and page counts) can be read directly from the shared location,
but other information (like the base pointer to the dsa element,
RelptrPagetableEntry, etc.) must be local to each worker, after
conversion from the dsa pointer to a local pointer.
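
To make the intended call flow concrete, here is a small standalone toy
of the three-call pattern (shared state prepared once by the leader,
attached per backend, iterated cooperatively). The names and structs
below just mimic the proposal; this is not the patch's code and not
PostgreSQL's DSA machinery:

#include <stdio.h>
#include <stdlib.h>

/* Shared iteration state: in the real thing this lives in a dsa_area and
 * is referenced via a dsa_pointer; advancing "next" would be done under
 * a lock, since multiple workers cooperate on one position. */
typedef struct ToySharedState
{
	int			next;
	int			nitems;
	int		   *items;		/* stands in for the shared page array */
} ToySharedState;

typedef struct ToyIterator
{
	ToySharedState *state;	/* per-backend local pointer to shared state */
} ToyIterator;

/* like tbm_prepare_shared_iterate(): done once, by the leader */
static ToySharedState *
toy_prepare_shared_iterate(int *items, int nitems)
{
	ToySharedState *s = malloc(sizeof(ToySharedState));

	s->next = 0;
	s->nitems = nitems;
	s->items = items;
	return s;
}

/* like tbm_attach_shared_iterate(): cheap, once per backend */
static ToyIterator
toy_attach_shared_iterate(ToySharedState *s)
{
	ToyIterator it = {s};

	return it;
}

/* like tbm_shared_iterate(): fetch the next result, -1 when exhausted */
static int
toy_shared_iterate(ToyIterator *it)
{
	ToySharedState *s = it->state;

	return (s->next < s->nitems) ? s->items[s->next++] : -1;
}

int
main(void)
{
	int			pages[] = {3, 7, 42};
	ToySharedState *state = toy_prepare_shared_iterate(pages, 3);
	ToyIterator w1 = toy_attach_shared_iterate(state);
	ToyIterator w2 = toy_attach_shared_iterate(state);
	int			page;

	while ((page = toy_shared_iterate(&w1)) != -1)
		printf("worker 1 fetched page %d\n", page);
	printf("worker 2 gets %d: the shared position is already exhausted\n",
		   toy_shared_iterate(&w2));
	free(state);
	return 0;
}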

>
> As compared with your proposal, this involves only 3 functions instead
> of 7 (plus one new argument to another function), and I think it's a
> lot clearer what those functions are actually doing.  You don't need
> tbm_estimate_parallel_tbminfo() any more because the data being passed
> from one backend to another is precisely a dsa_pointer, and the bitmap
> scan can just leave space for that in its own estimator function.  You
> don't need tbm_update_shared_members() separately from
> tbm_begin_shared_iterate() separately from tbm_init_shared_iterator()
> because tbm_prepare_shared_iterate() can do all that stuff.  You don't
> need tbm_set_parallel() because the additional argument to
> tbm_create() takes care of that need.

Right

>
> The way you've got ParallelTBMInfo set up right now doesn't respect
> the separation of concerns between nodeBitmapHeapscan.c and
> tidbitmap.c properly. tidbitmap.c shouldn't know that the caller
> intends on creating two iterators, one of which is for prefetching.
> The design above hopefully addresses that: each call to
> tbm_prepare_shared_iterate returns a dsa_pointer to a separate chunk
> of shared iterator state, but tidbitmap.c does not need to know how
> many times that will be called.

Done.

> Apart from the above, this patch will need a rebase over today's
> commits,

Done.

> and please make sure all functions you add have high-quality
> header comments.

I tried my best, please check the latest patch (0001).

Apart from those, there are some more changes.
1. I have moved the dsa_pointer and dsa_area declarations from dsa.h to
postgres.h; an independent patch is attached for this. We need to
declare function prototypes taking dsa_pointer in tidbitmap.h, but
tidbitmap.h is also used in FRONTEND code and therefore cannot include
dsa.h (as it includes atomics.h).

2. I noticed a defect in my code related to handling TBM_ONE_PAGE. The
initial version had no problem, but it got introduced somewhere in the
intermediate versions.

I have fixed it. There were two options for doing that:

1. Don't switch to TBM_ONE_PAGE in parallel mode (ideally this cannot
happen; only a bad estimate can get us there) and go directly to
TBM_HASH.
2. Keep space in the shared information for sharing a PagetableEntry.

I have implemented the 2nd option (in the initial versions I had
implemented the 1st).

Note: Patch is also rebased on 

Re: [HACKERS] possibility to specify template database for pg_regress

2017-02-10 Thread Pavel Stehule
Hi

2017-02-10 6:00 GMT+01:00 Michael Paquier :

> On Thu, Feb 9, 2017 at 5:13 AM, Pavel Stehule 
> wrote:
> > here is a patch
>
> Thanks.
>
> -   for (sl = dblist; sl; sl = sl->next)
> -   create_database(sl->str);
> +   if (templatelist != NULL)
> +   {
> +   _stringlist *tl;
> +
> +   for (sl = dblist, tl = templatelist; sl; sl = sl->next, tl
> = tl->next)
> +   {
> +   if (tl != NULL)
> +   create_database(sl->str, tl->str);
> +   else
> +   {
> +   fprintf(stderr, _("%s: the template list is
> shorter than database list\n"),
> +   progname);
> +   exit(2);
> +   }
> +   }
> +   }
> +   else
> +   for (sl = dblist; sl; sl = sl->next)
> +   create_database(sl->str, "template0");
> There is one problem here: if the length of the template list is
> shorter than the database list, databases get partially created, then
> pg_regress complains, leaving the instance in a half-way state. I
> think that you had better do any sanity checks before creating or even
> dropping existing databases.
>

Here is a new update - the check is done before any database is created.
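
For illustration, a run could then look something like this (hypothetical
invocation; the n-th database is created from the n-th template, and a
length mismatch between the two lists now fails before anything is
created or dropped):

    pg_regress --dbname=db1,db2 --template=tpl1,tpl2 ...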

Regards

Pavel


> --
> Michael
>
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index d4d00d9..b5f5c2f 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -68,6 +68,7 @@ const char *pretty_diff_opts = "-w -C3";
 
 /* options settable from command line */
 _stringlist *dblist = NULL;
+_stringlist *templatelist = NULL;
 bool		debug = false;
 char	   *inputdir = ".";
 char	   *outputdir = ".";
@@ -1907,7 +1908,7 @@ drop_database_if_exists(const char *dbname)
 }
 
 static void
-create_database(const char *dbname)
+create_database(const char *dbname, const char *template)
 {
 	_stringlist *sl;
 
@@ -1917,10 +1918,12 @@ create_database(const char *dbname)
 	 */
 	header(_("creating database \"%s\""), dbname);
 	if (encoding)
-		psql_command("postgres", "CREATE DATABASE \"%s\" TEMPLATE=template0 ENCODING='%s'%s", dbname, encoding,
+		psql_command("postgres", "CREATE DATABASE \"%s\" TEMPLATE=\"%s\" ENCODING='%s'%s",
+	 dbname, template, encoding,
 	 (nolocale) ? " LC_COLLATE='C' LC_CTYPE='C'" : "");
 	else
-		psql_command("postgres", "CREATE DATABASE \"%s\" TEMPLATE=template0%s", dbname,
+		psql_command("postgres", "CREATE DATABASE \"%s\" TEMPLATE=\"%s\"%s",
+	 dbname, template,
 	 (nolocale) ? " LC_COLLATE='C' LC_CTYPE='C'" : "");
 	psql_command(dbname,
  "ALTER DATABASE \"%s\" SET lc_messages TO 'C';"
@@ -1995,6 +1998,7 @@ help(void)
 	printf(_("  --outputdir=DIR   place output files in DIR (default \".\")\n"));
 	printf(_("  --schedule=FILE   use test ordering schedule from FILE\n"));
 	printf(_("(can be used multiple times to concatenate)\n"));
+	printf(_("  --template=DB use template DB (default \"template0\")\n"));
 	printf(_("  --temp-instance=DIR   create a temporary instance in DIR\n"));
 	printf(_("  --use-existing        use an existing installation\n"));
 	printf(_("\n"));
@@ -2041,6 +2045,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		{"launcher", required_argument, NULL, 21},
 		{"load-extension", required_argument, NULL, 22},
 		{"config-auth", required_argument, NULL, 24},
+		{"template", required_argument, NULL, 25},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -2154,6 +2159,16 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 			case 24:
 config_auth_datadir = pg_strdup(optarg);
 break;
+			case 25:
+
+				/*
+				 * If a default template was specified, we need to remove it
+				 * before we add the specified one.
+				 */
+				free_stringlist(&templatelist);
+				split_to_stringlist(optarg, ",", &templatelist);
+				break;
+
 			default:
 /* getopt_long already emitted a complaint */
 fprintf(stderr, _("\nTry \"%s -h\" for more information.\n"),
@@ -2454,8 +2469,35 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 	 */
 	if (!use_existing)
 	{
-		for (sl = dblist; sl; sl = sl->next)
-			create_database(sl->str);
+		if (templatelist != NULL)
+		{
+			_stringlist *tl;
+
+			/*
+			 * The template list should have the same length as the database list.
+			 * Check it before any database creation.
+			 */
+			for (sl = dblist, tl = templatelist; sl; sl = sl->next, tl = tl->next)
+				if (tl == NULL)
+				{
+					fprintf(stderr, _("%s: the template list is shorter than the database list\n"),
+							progname);
+					exit(2);
+				}
+			if (tl != NULL)
+			{
+				fprintf(stderr, _("%s: the template list is longer than the database list\n"),
+						progname);
+				exit(2);
+			}
+
+			for (sl = dblist, tl = templatelist; sl; sl = sl->next, tl = tl->next)
+				create_database(sl->str, tl->str);
+		}
+		

Re: [HACKERS] Parallel Index Scans

2017-02-10 Thread Amit Kapila
On Fri, Feb 10, 2017 at 11:27 PM, Robert Haas  wrote:
> On Wed, Feb 1, 2017 at 8:20 AM, Amit Kapila  wrote:
>>> The hunk in indexam.c looks like a leftover that can be omitted.
>>
>> It is not a leftover hunk. Earlier, the patch had the same check in
>> btparallelrescan, but based on your comment upthread [1] (Why does
>> btparallelrescan cater to the case where scan->parallel_scan == NULL?),
>> it has been moved to indexam.c.
>
> That's not my point.  The point is, if it's not a parallel scan,
> index_parallelrescan() should never get called in the first place.  So
> therefore it shouldn't need to worry about scan->parallel_scan being
> NULL.  It seems possibly reasonable to put something like
> Assert(scan->parallel_scan != NULL) in there, but arranging to return
> without doing anything in that case seems like it just masks possible
> bugs in the calling code.  And really I'm not sure we even need the
> Assert.
>

This is just a safety check, so probably we can change it to Assert.

>
>>> That
>>> is a little odd.  Another, possibly related thing that is odd is that
>>> when _bt_steppage() finds no tuples and decides to advance to a new
>>> page again, there's a very short comment in the forward case and a
>>> very long comment in the backward case:
>>>
>>> /* nope, keep going */
>>> vs.
>>> /*
>>>  * For parallel scans, get the last page scanned as it is quite
>>>  * possible that by the time we try to fetch previous page, 
>>> other
>>>  * worker has also decided to scan that previous page.  We could
>>>  * avoid that by doing _bt_parallel_release once we have read 
>>> the
>>>  * current page, but it is bad to make other workers wait till 
>>> we
>>>  * read the page.
>>>  */
>>>
>>> Now it seems to me that these cases are symmetric and the issues are
>>> identical.  It's basically that, while we were judging whether the
>>> current page has useful contents, some other process could have
>>> advanced the scan (since _bt_readpage un-seized it).
>>
>> Yeah, but the reason of difference in comments is that for
>> non-parallel backwards scans there is no code at that place to move to
>> previous page and it basically relies on next call to _bt_walk_left()
>> whereas for parallel-scans, we can't simply rely on _bt_walk_left().
>> I have slightly modified the  comments for backward scan case, see if
>> that looks better and if not, then suggest what you think is better.
>
> Why can't we rely on _bt_walk_left?
>

The reason is mentioned in the comments, but let me try to explain with
an example.  When we reach that point in the code, it means that either
the current page (assume page number 10) doesn't contain any matching
items or it is a half-dead page, both of which indicate that we have to
move to the previous page.  Now, before checking whether the current
page contains matching items, we signal the parallel machinery (via
_bt_parallel_release) to allow workers to read the previous page
(assume its page number is 9).  So it is quite possible that, after
deciding that the current page (page 10) doesn't contain any matching
tuples, if we move directly to the previous page (page 9) using
_bt_walk_left, some other worker has already read page 9.  In short, if
we use _bt_walk_left() directly, we are prone to returning some values
twice, because multiple workers can read the same page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Passing query string to workers

2017-02-10 Thread Rafia Sabih
> >
> > There are some spacing issues in the code. For example,
> > +   estate->es_queryString = (char
> > *)palloc0(strlen(queryDesc->sourceText) + 1);
> > +   /*Estimate space for query text. */
> > pgindent might be helpful to track all such changes.
> >
>
Fixed.


> > +#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE010)
> > I'm uncomfortable with declaring the same macro in two
> > files(parallel.c, execParallel.c). My suggestion would be to move
> > pgstat_report_activity in ParallelQueryMain instead of
> > ParallelWorkerMain. Then, you can remove the macro definition from
> > parallel.c. Thoughts?
> >

> Yes, I also don't see any need of defining it in parallel.c.  I think
> she has kept it there to report it in pg_stat_activity, but I feel that
> code can also be moved to execParallel.c.

Agreed and fixed.


> Another question is don't we need to set debug_query_string in worker?

In the updated version I am setting it in ParallelQueryMain.

Please find the attached file for the revised version.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


pass_queryText_to_workers_v4.patch
Description: Binary data



Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-10 Thread Tomas Vondra

On 02/11/2017 02:44 AM, Ashutosh Sharma wrote:

Hi,

I am currently testing this patch on a large machine and will share the
test results in a few days' time.



FWIW it might be interesting to have comparable results from the same 
benchmarks I did. The scripts available in the git repositories should 
not be that hard to tweak. Let me know if you're interested and need 
help with that.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-10 Thread Ashutosh Sharma
Hi,

I am currently testing this patch on a large machine and will share the
test results in a few days' time.

Please excuse any grammatical errors as I am using my mobile device. Thanks.

On Feb 11, 2017 04:59, "Tomas Vondra"  wrote:

> Hi,
>
> As discussed at the Developer meeting ~ a week ago, I've run a number of
> benchmarks on the commit, on small/medium-size x86 machines. I currently
> don't have access to a machine as big as used by Alexander (with 72
> physical cores), but it seems useful to verify the patch does not have
> negative impact on smaller machines.
>
> In particular I've ran these tests:
>
> * r/o pgbench
> * r/w pgbench
> * 90% reads, 10% writes
> * pgbench with skewed distribution
> * pgbench with skewed distribution and skipping
>
> And each of that with a number of clients, depending on the number of
> cores available. I've used the usual two boxes I use for all benchmarks,
> i.e. a small i5-2500k machine (8GB RAM, 4 cores), and a medium e5-2620v4
> box (32GB RAM, 16/32 cores).
>
> Comparing averages of tps, measured on 5 runs (each 5 minutes long), the
> difference between master and patched master is usually within 2%, which is
> pretty much within noise.
>
> I'm attaching spreadsheets with a summary of the results, so that we have it
> in the archives. As usual, the scripts and much more detailed results are
> available here:
>
> * e5-2620: https://bitbucket.org/tvondra/test-xact-alignment
> * i5-2500k: https://bitbucket.org/tvondra/test-xact-alignment-i5
>
> I do plan to run these benchmarks on the Power8 box I have access to, but
> that will have to wait for a bit, because it's currently doing something
> else.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-10 Thread Andres Freund
On 2017-02-11 02:13:59 +0100, Tomas Vondra wrote:
> > > + /* move the whole block to the right place in the freelist */
> > > + dlist_delete(&block->node);
> > > + dlist_push_head(&slab->freelist[block->nfree], &block->node);
> > 
> > Hm.  What if we, instead of the array of doubly linked lists, just kept
> > a single linked list of blocks, and keep that list sorted by number of
> > free chunks?  Given that freeing / allocation never changes the number
> > of allocated chunks by more than 1, we'll never have to move an entry
> > far in that list to keep it sorted.
> > 
> 
> Only assuming that there'll be only a few blocks with the same number of free
> chunks. If that's not the case, you'll have to walk many blocks to move the
> block to the right place in the list. The array of lists handles such cases
> way more efficiently, and I think we should keep it.

The proper data structure would probably be a heap.  Right now
binaryheap.h is fixed-size - probably not too hard to change.

Greetings,

Andres Freund




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-02-10 Thread Peter Geoghegan
On Thu, Feb 9, 2017 at 2:32 PM, Andres Freund  wrote:
> Except that the proposed names aren't remotely like that... ;).

Revision attached -- V5. We now REVOKE ALL on both functions, as
Robert suggested, instead of the previous approach of enforcing a
hard-coded superuser check.

> And I proposed documenting named parameters, and
> check_btree(performing_check_requiring_exclusive_locks => true) is just
> about as expressive.

I have not done this, nor have I renamed the functions. I still think
that this is something that we can fix by adding a boolean argument to
each function in the future, or something along those lines. I
*really* hate the idea of having one function with non-obvious,
variable requirements on locking, with locking implications that are
not knowable when we PREPARE an SQL statement calling the function. It
also removes a useful way for superusers to discriminate against the
stronger locking variant bt_index_parent_check() by not granting
execute on it (as an anti-footgun measure). I don't think that it's
too much to ask for you to concede this one point to me (and to
Robert).

I have acted on every other item you raised, without exception,
including retaining the locks for the duration of the transaction. I
also made the functions PARALLEL RESTRICTED, which seemed like a good
idea.

Other notes:

* A nice thing about the additional level check that you suggested is
that it's a cross-level check that is perfectly safe to perform with
only an AccessShareLock, unlike the extra bt_index_parent_check()
check, since the number of levels only ever increases, no matter what.
There is no possible reason why we should ever find that a child page
does not agree with its self-described parent about the level it's on.
I'm glad you thought of this.

* I kept ereport() DEBUG1 calls, but now use a non-error errcode for
those. I thought that preserving the various ereport() fields was
worth it, since these traces are rather information dense in a way
that is harder to deal with when using raw elog()s. If you hate this,
you can still change it.

* I have not added code to check the relation permissions, which might
matter if and when the superuser grants execute privs on amcheck
functions. Anyone want to argue that I should have added
relation-level permissions checks? I tend to think it's pointless,
since the superuser almost certainly won't think to consider that
additional factor. I cannot imagine how either function could be used
to leak information, in any case. No error message dumps the contents
of pages (only item pointers).

* pgindent was run against this revision. You were right to want to
get that out of the way now, since it needed to be fixed up by hand to
look reasonable. typedef list also updated.

-- 
Peter Geoghegan


0001-Add-amcheck-extension-to-contrib.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-10 Thread Andres Freund
On 2017-02-11 02:13:59 +0100, Tomas Vondra wrote:
> On 02/09/2017 10:37 PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2016-12-13 01:45:13 +0100, Tomas Vondra wrote:
> > >  src/backend/utils/mmgr/Makefile   |   2 +-
> > >  src/backend/utils/mmgr/aset.c | 115 +
> > >  src/backend/utils/mmgr/memdebug.c | 131 
> > > ++
> > >  src/include/utils/memdebug.h  |  22 +++
> > >  4 files changed, 156 insertions(+), 114 deletions(-)
> > >  create mode 100644 src/backend/utils/mmgr/memdebug.c
> > 
> > I'm a bit loath to move these to a .c file - won't this likely make
> > these debugging tools even slower?  Seems better to put some of them
> > them in a header as static inlines (not randomize, but the rest).
> > 
> 
> Do you have any numbers to support that? AFAICS compilers got really good
> at inlining stuff on their own.

Unless you use LTO, they can't inline across translation units. And
using LTO is slow enough for linking that it's not that much fun to use,
as it makes compile-edit-compile cycles essentially take as long as a
full rebuild.


> > > From 43aaabf70b979b172fd659ef4d0ef129fd78d72d Mon Sep 17 00:00:00 2001
> > > From: Tomas Vondra 
> > > Date: Wed, 30 Nov 2016 15:36:23 +0100
> > > Subject: [PATCH 2/3] slab allocator
> > > 
> > > ---
> > >  src/backend/replication/logical/reorderbuffer.c |  82 +--
> > >  src/backend/utils/mmgr/Makefile |   2 +-
> > >  src/backend/utils/mmgr/slab.c   | 803 
> > > 
> > >  src/include/nodes/memnodes.h|   2 +-
> > >  src/include/nodes/nodes.h   |   1 +
> > >  src/include/replication/reorderbuffer.h |  15 +-
> > >  src/include/utils/memutils.h|   9 +
> > 
> > I'd like to see the reorderbuffer changes split into a separate commit
> > from the slab allocator introduction.
> > 
> 
> I rather dislike patches that only add a bunch of code, without actually
> using it anywhere.

> But if needed, this is trivial to do at commit time - just don't
> commit the reorderbuffer bits.

Meh.


> > > + *   Each block includes a simple bitmap tracking which chunks are
> > > + *   used/free. This makes it trivial to check if all chunks on the
> > > + *   block are free, and eventually free the whole block (which is
> > > + *   almost impossible with a global freelist of chunks, storing
> > > + *   chunks from all blocks).
> > 
> > Why is checking a potentially somewhat long-ish bitmap better than a
> > simple counter, or a "linked list" of "next free chunk-number" or such
> > (where free chunks simply contain the id of the subsequent chunk)?
> > Using a list instead of a bitmap would also make it possible to get
> > 'lifo' behaviour, which is good for cache efficiency.  A simple
> > chunk-number based singly linked list would only imply a minimum
> > allocation size of 4 - that seems perfectly reasonable?
> > 
> 
> A block-level counter would be enough to decide if all chunks on the block
> are free, but it's not sufficient to identify which chunks are free /
> available for reuse.
> 
> The bitmap only has a single bit per chunk, so I find "potentially long-ish"
> is a bit misleading. Any linked list implementation will require much more
> per-chunk overhead - as the chunks are fixed-length, it's possible to use
> chunk index (instead of 64-bit pointers), to save some space. But with large
> blocks / small chunks that's still at least 2 or 4 bytes per index, and
> you'll need two (to implement doubly-linked list, to make add/remove
> efficient).

> For example assume 8kB block and 64B chunks, i.e. 128 chunks. With bitmap
> that's 16B to track all free space on the block. Doubly linked list would
> require 1B per chunk index, 2 indexes per chunk. That's 128*2 = 256B.

> I have a hard time believing that the cache efficiency of linked lists
> (which may or may not be real in this case) outweighs this, but if you
> want to try, be my guest.

I'm not following - why would there be overhead in anything for
allocations bigger than 4 (or maybe 8) bytes? You can store the list
(via chunk ids, not pointers) inside the chunks themselves, where the
data otherwise would be.  And I don't see why you'd need a doubly linked
list, as the only operations that are needed are to push to the front of
the list, and to pop from the front of the list - and both operations
are simple to do with a singly linked list?
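
For illustration, here is a minimal standalone version of that idea:
free chunks store the index of the next free chunk in their own bytes,
so the only per-block overhead is a single head index. This is just the
generic technique, not code from the patch:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define NCHUNKS		8
#define CHUNK_SIZE	64		/* >= 4 bytes, enough to hold an int32 id */

static char		block[NCHUNKS][CHUNK_SIZE];
static int32_t	freehead;	/* index of first free chunk, -1 if none */

/* read/write the "next free" id stored inside a free chunk's bytes */
static int32_t
get_next(int32_t id)
{
	int32_t		n;

	memcpy(&n, block[id], sizeof(n));
	return n;
}

static void
set_next(int32_t id, int32_t n)
{
	memcpy(block[id], &n, sizeof(n));
}

static void
init_block(void)
{
	for (int32_t i = 0; i < NCHUNKS; i++)
		set_next(i, (i + 1 < NCHUNKS) ? i + 1 : -1);
	freehead = 0;
}

/* pop from the front: LIFO reuse keeps recently-freed chunks cache-hot */
static int32_t
alloc_chunk(void)
{
	int32_t		id = freehead;

	if (id != -1)
		freehead = get_next(id);
	return id;
}

/* push to the front */
static void
free_chunk(int32_t id)
{
	set_next(id, freehead);
	freehead = id;
}

int
main(void)
{
	init_block();
	int32_t		a = alloc_chunk();
	int32_t		b = alloc_chunk();

	free_chunk(a);
	printf("allocated %d, %d; after freeing %d the next alloc returns %d\n",
		   a, b, a, alloc_chunk());
	return 0;
}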


> > Thirdly, isn't that approach going to result in a quite long freelists
> > array, when you have small items and a decent blocksize? That seems like
> > a fairly reasonable thing to do?
> > 
> 
> I'm confused. Why wouldn't that be reasonable? Or rather, what would be a
> more reasonable way?

If I understood correctly, you have an array of doubly linked lists.
A block is stored in the list at the index #block's-free-elements.  Is that
right?

If so, if 

Re: [HACKERS] multivariate statistics (v19)

2017-02-10 Thread Tomas Vondra

On 02/08/2017 07:40 PM, Dean Rasheed wrote:

On 8 February 2017 at 16:09, David Fetter  wrote:

Combinations are n!/(k! * (n-k)!), so computing those is more
along the lines of:

unsigned long long
choose(unsigned long long n, unsigned long long k) {
if (k > n) {
return 0;
}
unsigned long long r = 1;
for (unsigned long long d = 1; d <= k; ++d) {
r *= n--;
r /= d;
}
return r;
}

which greatly reduces the chance of overflow.



Hmm, but that doesn't actually prevent overflows, since it can
overflow in the multiplication step, and there is no protection
against that.

In the algorithm I presented, the inputs and the intermediate result
are kept below INT_MAX, so the multiplication step cannot overflow the
64-bit integer, and it will only raise an overflow error if the actual
result won't fit in a 32-bit int. Actually a crucial part of that,
which I failed to mention previously, is the first step replacing k
with min(k, n-k). This is necessary for inputs like (100,99), which
should return 100, and which must be computed as 100 choose 1, not 100
choose 99, otherwise it will overflow internally before getting to the
final result.



Thanks for the feedback, I'll fix this. I've allowed myself to be a bit 
sloppy because the number of attributes in the statistics is currently 
limited to 8, so the overflows are currently not an issue. But it 
doesn't hurt to make it future-proof, in case we change that mostly 
artificial limit sometime in the future.
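
For the archives, here is a standalone sketch of the scheme Dean
describes (my reconstruction from the description above, not code from
any patch): replace k with min(k, n-k) first, then check the running
value against INT_MAX after each step, which also guarantees the 64-bit
multiplication itself cannot overflow:

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* n choose k, or -1 if the result would not fit in a 32-bit int */
static int64_t
choose_checked(uint32_t n, uint32_t k)
{
	uint64_t	r = 1;

	if (k > n)
		return 0;
	if (k > n - k)
		k = n - k;			/* e.g. 100 choose 99 -> 100 choose 1 */

	for (uint32_t d = 1; d <= k; d++)
	{
		r *= n--;			/* r <= INT_MAX and n < 2^32: fits in uint64 */
		r /= d;				/* exact: r is now C(n_orig, d) */
		if (r > INT_MAX)	/* intermediates only grow while d <= k <= n/2 */
			return -1;
	}
	return (int64_t) r;
}

int
main(void)
{
	printf("%lld\n", (long long) choose_checked(100, 99));	/* 100 */
	printf("%lld\n", (long long) choose_checked(100, 50));	/* -1: overflow */
	return 0;
}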


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-10 Thread Tomas Vondra

On 02/09/2017 10:37 PM, Andres Freund wrote:

Hi,

On 2016-12-13 01:45:13 +0100, Tomas Vondra wrote:

 src/backend/utils/mmgr/Makefile   |   2 +-
 src/backend/utils/mmgr/aset.c | 115 +
 src/backend/utils/mmgr/memdebug.c | 131 ++
 src/include/utils/memdebug.h  |  22 +++
 4 files changed, 156 insertions(+), 114 deletions(-)
 create mode 100644 src/backend/utils/mmgr/memdebug.c


I'm a bit loath to move these to a .c file - won't this likely make
these debugging tools even slower?  Seems better to put some of them
them in a header as static inlines (not randomize, but the rest).



Do you have any numbers to support that? AFAICS compilers got really 
good at inlining stuff on their own.





From 43aaabf70b979b172fd659ef4d0ef129fd78d72d Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Wed, 30 Nov 2016 15:36:23 +0100
Subject: [PATCH 2/3] slab allocator

---
 src/backend/replication/logical/reorderbuffer.c |  82 +--
 src/backend/utils/mmgr/Makefile |   2 +-
 src/backend/utils/mmgr/slab.c   | 803 
 src/include/nodes/memnodes.h|   2 +-
 src/include/nodes/nodes.h   |   1 +
 src/include/replication/reorderbuffer.h |  15 +-
 src/include/utils/memutils.h|   9 +


I'd like to see the reorderbuffer changes split into a separate commit
from the slab allocator introduction.



I rather dislike patches that only add a bunch of code, without actually 
using it anywhere. But if needed, this is trivial to do at commit time - 
just don't commit the reorderbuffer bits.







+/*-
+ *
+ * slab.c
+ *   SLAB allocator definitions.
+ *
+ * SLAB is a custom MemoryContext implementation designed for cases of
+ * equally-sized objects.
+ *
+ *
+ * Portions Copyright (c) 2016, PostgreSQL Global Development Group


Bump, before a committer forgets it.



OK.




+ * IDENTIFICATION
+ *   src/backend/utils/mmgr/slab.c
+ *
+ *
+ * The constant allocation size allows significant simplification and
+ * various optimizations. Firstly, we can get rid of the doubling and
+ * carve the blocks into chunks of exactly the right size (plus
+ * alignment), not wasting memory.


Getting rid of it relative to what? I'd try to phrase it so these
comments stand on their own.



OK, will reword.




+ * Each block includes a simple bitmap tracking which chunks are used/free.
+ * This makes it trivial to check if all chunks on the block are free, and
+ * eventually free the whole block (which is almost impossible with a
+ * global freelist of chunks, storing chunks from all blocks).


Why is checking a potentially somewhat long-ish bitmap better than a
simple counter, or a "linked list" of "next free chunk-number" or such
(where free chunks simply contain the id of the subsequent chunk)?
Using a list instead of a bitmap would also make it possible to get
'lifo' behaviour, which is good for cache efficiency.  A simple
chunk-number based singly linked list would only imply a minimum
allocation size of 4 - that seems perfectly reasonable?



A block-level counter would be enough to decide if all chunks on the 
block are free, but it's not sufficient to identify which chunks are 
free / available for reuse.


The bitmap only has a single bit per chunk, so I find "potentially 
long-ish" is a bit misleading. Any linked list implementation will 
require much more per-chunk overhead - as the chunks are fixed-length, 
it's possible to use chunk index (instead of 64-bit pointers), to save 
some space. But with large blocks / small chunks that's still at least 2 
or 4 bytes per index, and you'll need two (to implement doubly-linked 
list, to make add/remove efficient).


For example assume 8kB block and 64B chunks, i.e. 128 chunks. With 
bitmap that's 16B to track all free space on the block. Doubly linked 
list would require 1B per chunk index, 2 indexes per chunk. That's 128*2 
= 256B.
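
As a standalone illustration of the bitmap side of this trade-off
(generic sketch, not the patch's code; it uses the GCC/Clang builtin
__builtin_ctzll): 128 chunks need only two 64-bit words, and both "find
a free chunk" and "is the whole block free" are cheap word-wise checks:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define NCHUNKS	128				/* e.g. 8kB block / 64B chunks */
#define NWORDS	(NCHUNKS / 64)

static uint64_t freemap[NWORDS];	/* bit set = chunk free; 16 bytes total */

static void
init_block(void)
{
	for (int w = 0; w < NWORDS; w++)
		freemap[w] = ~UINT64_C(0);
}

/* find and claim the first free chunk, or return -1 if the block is full */
static int
alloc_chunk(void)
{
	for (int w = 0; w < NWORDS; w++)
		if (freemap[w] != 0)
		{
			int		bit = __builtin_ctzll(freemap[w]);	/* lowest set bit */

			freemap[w] &= freemap[w] - 1;				/* clear it */
			return w * 64 + bit;
		}
	return -1;
}

static void
free_chunk(int id)
{
	freemap[id / 64] |= UINT64_C(1) << (id % 64);
}

/* the check that motivates the bitmap: is every chunk on the block free? */
static bool
block_is_empty(void)
{
	for (int w = 0; w < NWORDS; w++)
		if (freemap[w] != ~UINT64_C(0))
			return false;
	return true;
}

int
main(void)
{
	init_block();
	int		a = alloc_chunk();

	printf("allocated chunk %d, block empty: %d\n", a, block_is_empty());
	free_chunk(a);
	printf("freed it, block empty: %d\n", block_is_empty());
	return 0;
}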


I have a hard time believing that the cache efficiency of linked lists 
(which may or may not be real in this case) outweighs this, but if you 
want to try, be my guest.





+ * At the context level, we use 'freelist' to track blocks ordered by
+ * number of free chunks, starting with blocks having a single
+ * allocated chunk, and with completely full blocks on the tail.


Why that way round? Filling chunks up as much as possible is good for
cache and TLB efficiency, and allows for earlier de-allocation of
partially used blocks?  Oh, I see you do that in the next comment,
but it still leaves me wondering.

Also, is this actually a list? It's more an array of lists, right?
I.e. it should be named freelists?



Possibly. Naming things is hard.

>

Thirdly, isn't that approach going to result in a quite long freelists
array, when 

Re: [HACKERS] Checksums by default?

2017-02-10 Thread Tomas Vondra

Hi,

I've repeated those benchmarks on a much smaller/older machine, with 
only minimal changes (mostly related to RAM and cores available). I 
expected to see more significant differences, assuming that newer CPUs 
will handle the checksumming better, but to my surprise the impact of 
enabling checksums on this machine is ~2%.


As usual, full results and statistics are available for review here:

https://bitbucket.org/tvondra/checksum-bench-i5

Looking at average TPS (measured over 2 hours, with a checkpoint every 
30 minutes), I see this:


 test         scale   checksums   no-checksums    ratio
 -------------------------------------------------------
 pgbench         50        7444           7518   99.02%
                300        6863           6936   98.95%
               1000        4195           4295   97.67%

 read-write      50       48858          48832  100.05%
                300       41999          42302   99.28%
               1000       16539              1   99.24%

 skewed          50        7485           7480  100.07%
                300        7245           7280   99.52%
               1000        5950           6050   98.35%

 skewed-n        50       10234          10226  100.08%
                300        9618           9649   99.68%
               1000        7371           7393   99.70%

And the amount of WAL produced looks like this:

 test         scale   checksums   no-checksums    ratio
 -------------------------------------------------------
 pgbench         50       24.89          24.67  100.89%
                300       37.94          37.54  101.07%
               1000       65.91          64.88  101.58%

 read-write      50       10.00           9.98  100.11%
                300       23.28          23.35   99.66%
               1000       54.20          53.20  101.89%

 skewed          50       24.35          24.01  101.43%
                300       35.12          34.51  101.77%
               1000       52.14          51.15  101.93%

 skewed-n        50       21.71          21.13  102.73%
                300       32.23          31.54  102.18%
               1000       53.24          51.94  102.50%

Again, this is hardly proof that there is no workload where data 
checksums have a much worse impact, but I expected to see a much more 
significant impact on these workloads.



Incidentally, I've been dealing with a checksum failure reported by a 
customer last week, and based on the experience I tend to agree that we 
don't have the tools needed to deal with checksum failures. I think such 
tooling should be a 'must have' for enabling checksums by default.


In this particular case the checksum failure is particularly annoying 
because it happens during recovery (on a standby, after a restart), 
during startup, so FATAL means shutdown.


I've managed to inspect the page in a different way (dd and pageinspect 
from another instance), and it looks fine - no obvious data corruption, 
the only thing that seems borked is the checksum itself, and only three 
consecutive bits are flipped in the checksum. So this doesn't seem like 
a "stale checksum" - hardware issue is a possibility (the machine has 
ECC RAM though), but it might just as easily be a bug in PostgreSQL, 
when something scribbles over the checksum due to a buffer overflow, 
just before we write the buffer to the OS. So 'false failures' are not 
an entirely impossible thing.


And no, backups may not be a suitable solution - the failure happens on 
a standby, and the page (luckily) is not corrupted on the master. Which 
means that perhaps the standby got corrupted by WAL replay, which would 
affect the backups too. I can't verify this, though, because the WAL has 
already been removed from the archive. But it's a possibility.


So I think we're not ready to enable checksums by default for everyone, 
not until we can provide tools to deal with failures like this (I don't 
think users will be amused if we tell them to use 'dd' and inspect the 
pages in a hex editor).


ISTM the way forward is to keep the current default (disabled), but to 
allow enabling checksums on the fly. That will mostly fix the issue for 
people who actually want checksums but don't realize they need to enable 
them at initdb time (and starting from scratch is not an option for 
them), are running on good hardware and are capable of dealing with 
checksum errors if needed, even without more built-in tooling.


Being able to disable checksums on the fly is nice, but it only really 
solves the issue of extra overhead - it doesn't really help with the 
failures (particularly when you can't even start the database because 
of a checksum failure in the startup phase).


So, shall we discuss what tooling 

Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-10 Thread Tomas Vondra

Hi,

As discussed at the Developer meeting ~ a week ago, I've run a number of 
benchmarks on the commit, on small/medium-size x86 machines. I 
currently don't have access to a machine as big as used by Alexander 
(with 72 physical cores), but it seems useful to verify the patch does 
not have negative impact on smaller machines.


In particular I've ran these tests:

* r/o pgbench
* r/w pgbench
* 90% reads, 10% writes
* pgbench with skewed distribution
* pgbench with skewed distribution and skipping

And each of that with a number of clients, depending on the number of 
cores available. I've used the usual two boxes I use for all benchmarks, 
i.e. a small i5-2500k machine (8GB RAM, 4 cores), and a medium e5-2620v4 
box (32GB RAM, 16/32 cores).


Comparing averages of tps, measured on 5 runs (each 5 minutes long), the 
difference between master and patched master is usually within 2%, which 
is pretty much within noise.


I'm attaching spreadsheets with a summary of the results, so that we have 
it in the archives. As usual, the scripts and much more detailed results 
are available here:


* e5-2620: https://bitbucket.org/tvondra/test-xact-alignment
* i5-2500k: https://bitbucket.org/tvondra/test-xact-alignment-i5

I do plan to run these benchmarks on the Power8 box I have access to, but 
that will have to wait for a bit, because it's currently doing something 
else.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


xact-e5-2620.ods
Description: application/vnd.oasis.opendocument.spreadsheet


xact-i5-2500k.ods
Description: application/vnd.oasis.opendocument.spreadsheet



Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andres Freund
On 2017-02-10 10:22:34 -0800, Andres Freund wrote:
> On 2017-02-10 12:07:15 -0500, Robert Haas wrote:
> > But also, I'm not really sure whether this conversion makes sense.  I
> > mean, any build system is going to require some work, and accordingly
> > our present build systems require some work.  cmake will require
> > different work, but not necessarily less.  The current system has a
> > long history; we pretty much know it works.  Switching will inevitably
> > break some things.  Maybe we'll end up better off in the long term,
> > but maybe we won't.  Things are pretty good now, so it seems like it
> > would be easy for them to get worse rather than better.
> 
> I do think it's kinda ok for people who've dealt with this for some
> time.  But for the first few times having to write autoconf macros is
> quite intimidating. A somewhat less obscure way to deal with that is
> worth something.  As is better file dependency management, across
> different environments.  As is the IDE integration cmake is more and
> more getting.  As is properly builtin working caching of configure tests
> (llvm first cmake run, 7.7s, second 2.54s). As is the fact that a lot of
> big projects (llvm, kde, qt, and a lot of others) migrated to it.
> 
> For me personally those benefits aren't worth that much.  I've learned
> autoconf stuff.  I've scripted around autoconf caching.  But for newer
> people that's not true.

Craig's email just now reminded me of another advantage: Using cmake
across the board, would mean we'd use the same ./configure alike logic
on both Windows and Linux.  To me that seems quite an advantage over
managing pg_config.h.win32/config_default.pl manually/separately - we
obviously have screwed up quite badly there in the past
(cf376a4adc0805b0).




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Craig Ringer
On 11 Feb. 2017 08:42, "Andrew Dunstan" 
wrote:



On 02/10/2017 01:27 PM, Magnus Hagander wrote:
> On Fri, Feb 10, 2017 at 6:07 PM, Robert Haas wrote:
>
> On Mon, Jan 30, 2017 at 10:26 AM, Peter Eisentraut
> wrote:
> > On 1/30/17 1:28 AM, Andres Freund wrote:
> >> Given that fact, I just don't buy why it's a good idea to not also
> >> replace autoconf initially.
> >
> > Well, I find it a bit scary.  If you do the big switch all at
> once, then
> > you will have to dedicate the following 3 months to fixing
> complaints
> > from developers and build farmers.
>
> I agree with that.  I think replacing the Windows build system first
> and then the non-Windows build system later is a better plan than
> replacing both at the same time.
>
> But also, I'm not really sure whether this conversion makes sense.  I
> mean, any build system is going to require some work, and accordingly
> our present build systems require some work.  cmake will require
> different work, but not necessarily less.  The current system has a
> long history; we pretty much know it works.  Switching will inevitably
> break some things.  Maybe we'll end up better off in the long term,
> but maybe we won't.  Things are pretty good now, so it seems like it
> would be easy for them to get worse rather than better.  If nothing
> else, everybody who has to learn the new system either to use it for
> development or because they are doing packaging will have to do some
> amount of extra work as a result of any switch.
>
>
> For me a killer feature would be if/when we can get to a point where
> we can have something pgxs-style on cmake that also works on windows.
>
> Our homemade Windows build system works OK for postgres, and while
> ugly it is as you say well tested by now. But it doesn't do *anything*
> to help people build extensions on Windows.
>
>


Watch this space. There is work being done on building extensions out of
tree using CMake on Windows. It's pretty neat, and I'm hoping to demo it
publicly soon. But it's not reliant on CMake in core.


Yeah. In fact it's completely independent of core's build system and
works with any supported Pg, using pg_config for discovery.

We don't need cmake in core for pgxs-like functionality on Windows.

I'd like to use it instead of our Perl/MSBuild Windows stuff in core,
though. I wonder if it's practical to generate the cmake file lists etc.
from our Makefiles initially, like we do for the current system, and just
replace the MSVC/msbuild project generators with cmake.


Re: [HACKERS] Reporting xmin from VACUUMs

2017-02-10 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Feb 10, 2017 at 3:30 AM, Simon Riggs  wrote:
> > We've added xmin info to pg_stat_activity and pg_stat_replication, but
> > VACUUM doesn't yet report which xmin value it used when it ran.
> >
> > Small patch to add this info to VACUUM output.
> 
> It seems sensible to me to do something like this.  We already report
> a lot of other fine details, so what's one more?  And it could be
> useful.

Yeah, I can see how this can be useful to debug some hard-to-track
problems.  The patch looks sensible to me.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-10 Thread Corey Huinker
>
>   calvin=> \if true
>   calvin?t=> SELECT 1 +
>   calvin?t->   2;
> 3
>   calvin?t=> \if true
>   calvin?t=>   \echo hello
> hello
>   calvin?t=> \endif
>   calvin?t=> \else
>   calvin?z=>   \echo ignored
>   calvin?t=> \endif
>   calvin=>
>

Ok, so that's not just PROMPT_READY, that's every prompt...which might be
ok. ? is a great optional cue, and you're thinking on 2 levels deep, 2nd
level always being '.'? I'll give that a shot.


Re: [HACKERS] Reporting xmin from VACUUMs

2017-02-10 Thread Robert Haas
On Fri, Feb 10, 2017 at 3:30 AM, Simon Riggs  wrote:
> We've added xmin info to pg_stat_activity and pg_stat_replication, but
> VACUUM doesn't yet report which xmin value it used when it ran.
>
> Small patch to add this info to VACUUM output.

It seems sensible to me to do something like this.  We already report
a lot of other fine details, so what's one more?  And it could be
useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-10 Thread Fabien COELHO


Hello,

I'm looking forward to the doc update.

My 0.02€ about prompting:


Given that there is no more barking, then having some prompt indication
that the code is inside a conditional branch becomes more important, so
ISTM that there should be some plan to add it.


Yeah, prompting just got more important. I see a few ways to go about this:

1. Add a new prompt type, either %T for true (heh, pun) or %Y for
branching. It would print a string of chained 't' (branch is true), 'f'
(branch is false), 'z' (branch already had its true section). The depth
traversal would have a limit, say 3 levels deep, and if the tree goes more
than that deep, then '...' would be printed in place of any deeper
values. So the prompt would change through a session like:

command   prompt is now
---   ---
\echo bob ''   = initial state, no branch going on at all
\if yes   't' = inside a true branch
\if no'tf' = false inside a true
\endif't' = back to just the true branch
\if yes   'tt'
\if yes   'ttt'
\if yes   '...ttt' = only show the last 3, but let it be known that
there's at least one more
\else '...ttz' = past the point of a true bit of this branch


I like the "tfz" idea. I'm not sure whether showing up to 6 characters 
is a good idea, though.
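
For what it's worth, the truncation rule could be as simple as this
standalone sketch (illustrative only; psql's actual if-stack
representation would differ):

#include <stdio.h>
#include <string.h>

#define MAX_SHOWN 3

/* stack: one of 't'/'f'/'z' per \if level, innermost state last */
static void
branch_indicator(const char *stack, char *buf, size_t buflen)
{
	size_t		depth = strlen(stack);

	if (depth <= MAX_SHOWN)
		snprintf(buf, buflen, "%s", stack);
	else						/* show only the last 3, hint at the rest */
		snprintf(buf, buflen, "...%s", stack + depth - MAX_SHOWN);
}

int
main(void)
{
	char		buf[8];

	branch_indicator("tttz", buf, sizeof(buf));
	printf("%s\n", buf);		/* prints "...ttz", as in the table above */
	return 0;
}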



2. The printing of #1 could be integrated into %R only in PROMPT_READY
cases, either prepended or appended to the !/=/^, possibly separated by a :


Hmmm. Logically I would say prepend, but the default prompt is with the 
dbname, which is mostly letters, so it means adding a separator as well.



3. Like #2, but prepended/appended in all circumstances


I would say yes.


4. Keep %T (or %Y), and reflect the state of pset.active_branch within %R,
a single t/f/z


Yep, but with a separator?


5. Like #4, but also printing the if-stack depth if > 1


Hmmm, not sure...

Based on the your ideas above, I would suggest the following:

  calvin=> \if true
  calvin?t=> SELECT 1 +
  calvin?t->   2;
3
  calvin?t=> \if true
  calvin?t=>   \echo hello
hello
  calvin?t=> \endif
  calvin?t=> \else
  calvin?z=>   \echo ignored
  calvin?t=> \endif
  calvin=>

Or maybe use "?.t" for the nested if...

--
Fabien


Re: [HACKERS] Add doc advice about systemd RemoveIPC

2017-02-10 Thread Peter Eisentraut
On 12/31/16 11:43 AM, Tom Lane wrote:
> Magnus Hagander  writes:
>> I still think that some wording in the direction of the fact that the
>> majority of all users won't actually have this problem is the right thing
>> to do (regardless of our previous history in the area as pointed out by
>> Craig)
> 
> +1.  The use-a-system-user solution is the one that's in place on the
> ground for most current PG users on affected platforms.  We should explain
> that one first and make clear that platform-specific packages attempt to
> set it up that way for you.  The RemoveIPC technique should be documented
> as a fallback to be used if you can't/won't use a system userid.

How about this version, which shifts the emphasis a bit, as suggested?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 5b4ba436d3882bfa2ce0e6243b9ab2ece66a4da4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 10 Feb 2017 16:34:20 -0500
Subject: [PATCH v2] doc: Add advice about systemd RemoveIPC

---
 doc/src/sgml/runtime.sgml | 77 +++
 1 file changed, 77 insertions(+)

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 130c386462..25c57192db 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1165,6 +1165,83 @@ System V IPC Parameters
 
   
 
+  
+   systemd RemoveIPC
+
+   
+systemd
+RemoveIPC
+   
+
+   
+If systemd is in use, some care must be taken
+that IPC resources (shared memory and semaphores) are not prematurely
+removed by the operating system.  This is especially of concern when
+installing PostgreSQL from source.  Users of distribution packages of
+PostgreSQL are less likely to be affected.
+   
+
+   
+The setting RemoveIPC
+in logind.conf controls whether IPC objects are
+removed when a user fully logs out.  System users are exempt.  This
+setting defaults to on in stock systemd, but
+some operating system distributions default it to off.
+   
+
+   
+A typical observed effect when this setting is on is that the semaphore
+objects used by a PostgreSQL server are removed at apparently random
+times, leading to the server crashing with log messages like
+
+LOG: semctl(1234567890, 0, IPC_RMID, ...) failed: Invalid argument
+
+Different types of IPC objects (shared memory vs. semaphores, System V
+vs. POSIX) are treated slightly differently
+by systemd, so one might observe that some IPC
+resources are not removed in the same way as others.  But it is not
+advisable to rely on these subtle differences.
+   
+
+   
+A user logging out might happen as part of a maintenance
+job or manually when an administrator logs in as
+the postgres user or similar, so it is hard to prevent
+in general.
+   
+
+   
+What is a system user is determined
+at systemd compile time from
+the SYS_UID_MAX setting
+in /etc/login.defs.
+   
+
+   
+Packaging and deployment scripts should be careful to create
+the postgres user as a system user by
+using useradd -r, adduser --system,
+or equivalent.
+   
+
+   
+Alternatively, if the user account was created incorrectly or cannot be
+changed, it is recommended to set
+
+RemoveIPC=no
+
+in /etc/systemd/logind.conf or another appropriate
+configuration file.
+   
+
+   
+
+ At least one of these two things has to be ensured, or the PostgreSQL
+ server will be very unreliable.
+
+   
+  
+
   
Resource Limits
 
-- 
2.11.1




Re: [HACKERS] Add support to COMMENT ON CURRENT DATABASE

2017-02-10 Thread Peter Eisentraut
On 1/28/17 1:33 PM, Fabrízio de Royes Mello wrote:
> I did what you suggested making CURRENT_DATABASE reserved but I got the
> following error during the bootstrap:

current_database is also used as a function name, so you need to do some
parser work to get it working in all the right ways.  Hard to tell
without a patch to look at.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] sequence data type

2017-02-10 Thread Peter Eisentraut
On 2/1/17 10:01 PM, Michael Paquier wrote:
> On Wed, Feb 1, 2017 at 10:02 AM, Michael Paquier
>  wrote:
>> On Wed, Feb 1, 2017 at 1:11 AM, Peter Eisentraut
>>  wrote:
>>> And here is a rebased patch for the original feature.  I think this
>>> addresses all raised concerns and suggestions now.
>>
>> Thanks for the new version. That looks good to me after an extra lookup.
> 
> Moved to CF 2017-03.

committed, thanks

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] log_autovacuum_min_duration doesn't log VACUUMs

2017-02-10 Thread Robert Haas
On Fri, Feb 10, 2017 at 3:38 AM, Simon Riggs  wrote:
> I guess its fairly obvious in the title, but
> log_autovacuum_min_duration doesn't log VACUUMs, only autovacuums.
>
> What isn't obvious is why that restriction is useful.
>
> I say that it would be helpful to log all kinds of VACUUM, so we get
> similar output from all methods of submission.
>
> So, changes would be
>
> 1. Allow logging whether or not it is an autovacuum (attached)
>
> 2. Change name of parameter to ...
> a) log_vacuum_min_duration
> b) log_maintenance_min_duration and have it log CLUSTER, CREATE INDEX etc also
> c) log_ddl_min_duration and just time any DDL that takes a long time
> We could do any of those and they are all about as easy as one
> another, though the last one will be a bigger patch, so a) might be
> simpler.
>
> The following patch implements (1), but not yet (2) to allow debate.

Right now, the mission of log_autovacuum_min_duration is not to log
all automatic vacuums, but to log all actions taken by the autovacuum
workers, which include both VACUUM and ANALYZE.  With your proposed
patch, we'd log all VACUUM operations (whether or not performed by
autovacuum) and some ANALYZE operations (only if performed by
autovacuum).  That's not very logical, so -1 for the patch as
proposed.

That having been said, I think it could certainly be useful to have
more control over what DDL gets logged in foreground processes.  Right
now you can say log_statement=ddl, but you can't for example say
log_statement=vacuum,create_index,analyze,truncate.  That might not be
the right interface but certainly I think people would appreciate the
ability to log some DDL without logging all of it.  And maybe it's
useful to have a way to provide the same kind of logging that
log_autovacuum_min_duration does for foreground vacuums (and
analyzes?).  I'd make that a separate GUC, though.

One of the things to keep in mind here is that most people would log
everything if it had no downsides.  But it does - it takes CPU time
and I/O bandwidth and it makes it harder to find things in the logs.
So fine-grained control over what gets logged is important.  It's
probably not wise to just rip out log_autovacuum_min_duration and
replace it with something that has a much broader remit, because then
people who are happy with the amount of logging they're getting now
might suddenly find that they're getting too much.  I think the way
forward is to allow new options that add more logging rather than
making the existing options log more than they do now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Andrew Dunstan


On 02/10/2017 01:27 PM, Josh Berkus wrote:
> On 02/10/2017 10:18 AM, Peter Geoghegan wrote:
>> On Fri, Feb 10, 2017 at 3:28 AM, Robert Haas  wrote:
>>>>> Works for me.
>>>>
>>>> +1
>>> OK, that's three votes in favor of removing tsearch2 (from core,
>>> anyone who wants it can maintain a copy elsewhere).
>> +1.
>>
>> I'd also be in favor of either removing contrib/isn, or changing it so
>> that the ISBN country code prefix enforcement went away. That would
>> actually not imply any real loss of functionality from a practical
>> perspective, since you can still enforce that the check digit is
>> correct without any of that. I think that the existing design of some
>> parts of contrib/isn is just horrible.
> +1 to quick-fix it, -1 to just delete it.
>
> There's a bunch of these things in /contrib which really ought to be
> PGXN extensions (also CUBE, earthdistance, etc.).  However, one of the
> steps in that would be getting the mainstream platforms to package them
> so that users have a reasonable upgrade path, so I would not propose
> doing it for 10.
>



Part of the reason for keeping a number of extensions is that it helps
test our extension infrastructure. Also, they serve as good pieces of
example code. So I don't want to get rid of them all, or even any of
them that have any degree of significant use. I think these days
tsearch2 is very largely redundant, so that means there's a good reason
not to keep it. But that's not true of cube, isn etc.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-02-10 Thread Robert Haas
On Thu, Feb 9, 2017 at 1:47 AM, Beena Emerson  wrote:
>> Won't it be simpler if you consider -1 as a value to just load the library?
>>  For *_interval = -1, it will neither load nor dump.
>>
> +1
> That is what I thought was the behaviour we decided upon for -1.

Right.  I can't see why you'd want to be able to separately control
those two things.  If you're not dumping, you don't want to load; if
you're not loading, you don't want to dump.

I think what should happen is that there should be only one worker.
If the GUC is -1, it never gets registered.  Otherwise, it starts at
database startup time and runs until shutdown.  At startup time, it
loads buffers until we run out of free buffers or until all saved
buffers are loaded, whichever happens first.  Buffers should be sorted
by relfilenode (in any order) and block number (in increasing order).
Once it finishes loading buffers, it repeatedly sleeps for the amount
of time indicated by the GUC (or indefinitely if the GUC is 0),
dumping after each sleep and at shutdown.
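
In rough C, that control flow might look like the following; every name
here (autoprewarm_interval, load_saved_buffers, and so on) is invented
for illustration, so this is a sketch of the proposed design, not the
patch's code:

#include <stdbool.h>

extern int  autoprewarm_interval;       /* the GUC: -1, 0, or seconds */
extern volatile bool got_sigterm;       /* set by a SIGTERM handler */
extern void load_saved_buffers(void);   /* sorted by relfilenode/block */
extern void dump_buffers(void);
extern void sleep_seconds(int secs);    /* 0 = wait until shutdown */

void
autoprewarm_main(void)
{
    /* with autoprewarm_interval == -1 the worker is never registered */

    load_saved_buffers();       /* once, at database startup */

    while (!got_sigterm)
    {
        sleep_seconds(autoprewarm_interval);
        if (got_sigterm)
            break;
        dump_buffers();         /* dump after each sleep */
    }
    dump_buffers();             /* and a final dump at shutdown */
}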

Shutting down one worker to start up another doesn't seem to make
sense.  If for some reason you want the code for those in separate
functions, you can call one function and then call the other.  Putting
them in completely separate processes doesn't buy anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-02-10 Thread Robert Haas
On Tue, Feb 7, 2017 at 2:04 AM, Amit Kapila  wrote:
> On Tue, Feb 7, 2017 at 10:44 AM, Mithun Cy  wrote:
>>
>> ==
>> One problem I have left open is that multiple "auto pg_prewarm dump"
>> workers can be launched, even if a dump/load worker is already running,
>> by calling launch_pg_prewarm_dump. I can avoid this by dropping a
>> lock file before starting the bgworkers. But if there is a simpler
>> method to avoid launching a duplicate bgworker, I can do that instead.
>>
>
> How about keeping a variable in the PROC_HDR structure to indicate
> whether a dump worker is already running, and then not allowing a new
> one to start?

A contrib module shouldn't change core (and shouldn't need to).  It
can register its own shared memory area if it wants.
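
For reference, the standard pattern for that looks roughly like this (the
struct and names are invented here, and the module must be loaded via
shared_preload_libraries for RequestAddinShmemSpace to take effect):

#include "postgres.h"
#include "fmgr.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"

PG_MODULE_MAGIC;

typedef struct AutoPrewarmShared
{
    pid_t       dump_worker_pid;    /* 0 if no dump worker is running */
} AutoPrewarmShared;

static AutoPrewarmShared *apw_shared = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;

static void
apw_shmem_startup(void)
{
    bool        found;

    if (prev_shmem_startup_hook)
        prev_shmem_startup_hook();

    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    apw_shared = ShmemInitStruct("auto_prewarm",
                                 sizeof(AutoPrewarmShared), &found);
    if (!found)
        apw_shared->dump_worker_pid = 0;    /* first-time initialization */
    LWLockRelease(AddinShmemInitLock);
}

void
_PG_init(void)
{
    /* reserve our own shared memory instead of touching core structs */
    RequestAddinShmemSpace(MAXALIGN(sizeof(AutoPrewarmShared)));
    prev_shmem_startup_hook = shmem_startup_hook;
    shmem_startup_hook = apw_shmem_startup;
}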

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andrew Dunstan


On 02/10/2017 01:27 PM, Magnus Hagander wrote:
> On Fri, Feb 10, 2017 at 6:07 PM, Robert Haas wrote:
>
> On Mon, Jan 30, 2017 at 10:26 AM, Peter Eisentraut wrote:
> > On 1/30/17 1:28 AM, Andres Freund wrote:
> >> Given that fact, I just don't buy why it's a good idea to not also
> >> replace autoconf initially.
> >
> > Well, I find it a bit scary.  If you do the big switch all at once, then
> > you will have to dedicate the following 3 months to fixing complaints
> > from developers and build farmers.
>
> I agree with that.  I think replacing the Windows build system first
> and then the non-Windows build system later is a better plan than
> replacing both at the same time.
>
> But also, I'm not really sure whether this conversion makes sense.  I
> mean, any build system is going to require some work, and accordingly
> our present build systems require some work.  cmake will require
> different work, but not necessarily less.  The current system has a
> long history; we pretty much know it works.  Switching will inevitably
> break some things.  Maybe we'll end up better off in the long term,
> but maybe we won't.  Things are pretty good now, so it seems like it
> would be easy for them to get worse rather than better.  If nothing
> else, everybody who has to learn the new system either to use it for
> development or because they are doing packaging will have to do some
> amount of extra work as a result of any switch.
>
>
> For me a killer feature would be if/when we can get to a point where
> we can have something pgxs-style on cmake that also works on windows.
>
> Our homemade Windows build system works OK for postgres, and while
> ugly it is as you say well tested by now. But it doesn't do *anything*
> to help people build extensions on Windows. 
>
>


Watch this space. There is work being done on building extensions out of
tree using CMake on Windows. It's pretty neat, and I'm hoping to demo it
publicly soon. But it's not reliant on CMake in core.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-10 Thread Masahiko Sawada
On Thu, Feb 9, 2017 at 12:44 AM, Petr Jelinek
 wrote:
> On 08/02/17 07:40, Masahiko Sawada wrote:
>> On Wed, Feb 8, 2017 at 9:01 AM, Michael Paquier
>>  wrote:
>>> On Wed, Feb 8, 2017 at 1:30 AM, Fujii Masao  wrote:
>>>> On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek wrote:
>>>>> For example what happens if apply crashes during the DROP
>>>>> SUBSCRIPTION/COMMIT and is not started because the delete from catalog
>>>>> is now visible so the subscription is no longer there?
>>>>
>>>> Another idea is to treat DROP SUBSCRIPTION in the same way as VACUUM, i.e.,
>>>> make it emit an error if it's executed within user's transaction block.
>>>
>>> It seems to me that this is exactly Petr's point: using
>>> PreventTransactionChain() to prevent things to happen.
>>
>> Agreed. It's better to prevent it from being executed inside a user
>> transaction block. And I understand there are too many failure
>> scenarios we need to handle.
>>
>>>> Also DROP SUBSCRIPTION should call CommitTransactionCommand() just
>>>> after removing the entry from pg_subscription, then connect to the publisher
>>>> and remove the replication slot.
>>>
>>> For consistency that may be important.
>>
>> Agreed.
>>
>> Attached patch, please give me feedback.
>>
>
> This looks good (and similar to what initial patch had btw). Works fine
> for me as well.
>
> Remaining issue is, what to do about CREATE SUBSCRIPTION then, there are
> similar failure scenarios there, should we prevent it from running
> inside transaction as well?
>

Hmm, on further thought I have doubts about the approach currently under
discussion. For example, consider the case where CREATE SUBSCRIPTION
creates the subscription successfully but fails to create the replication
slot for whatever reason, and then DROP SUBSCRIPTION drops the
subscription but fails to drop the replication slot. In such a case,
CREATE SUBSCRIPTION and DROP SUBSCRIPTION both return ERROR, but the
subscription is nevertheless created and dropped successfully. I think
this behaviour would confuse the user.

I think we should just prevent calling DROP SUBSCRIPTION inside a user's
transaction block. Or I guess it could be better to separate
starting/stopping logical replication from subscription management.
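
In code form, that amounts to reusing the guard VACUUM already uses; a
minimal sketch, with the function shape simplified from the actual
subscription code:

#include "postgres.h"
#include "access/xact.h"
#include "nodes/parsenodes.h"

void
DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
{
    /* errors out if we are inside an explicit transaction block */
    PreventTransactionChain(isTopLevel, "DROP SUBSCRIPTION");

    /* ... remove the catalog entry, commit, then drop the remote slot ... */
}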

Thoughts?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Magnus Hagander
On Feb 10, 2017 19:41, "Andres Freund"  wrote:

On 2017-02-10 19:33:18 +0100, Magnus Hagander wrote:
> I guess we wouldn't, but we'd still need the "replacement for autoconf"
> part. So then we're back to maintaining multiple buildsystems.

Hm? Do we really need that?  Most of the things in an extension you do
*not* want to determine separately from the backend.  It's not like pgxs
atm really allows to differ wildly from autoconf's results. And most of
the relevant determinations made by autoconf are available in headers
and/or we can generate a cmake include file with the results of
autoconf.


Yeah, you're right. You need the output from the process, not the
process itself.

/Magnus


Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-10 Thread Corey Huinker
>
> Shouldn't there be some documentation changes to reflect the behavior on
> errors? A precise paragraph about that would be welcome, IMHO.
>

Oddly enough, the documentation I wrote hadn't addressed invalid booleans;
only the error messages did that.

The new behavior certainly warrants a mention, and I'll add that.


> Given that there is no more barking, then having some prompt indication
> that the code is inside a conditional branch becomes more important, so
> ISTM that there should be some plan to add it.


Yeah, prompting just got more important. I see a few ways to go about this:

1. Add a new prompt type, either %T for true (heh, pun) or %Y for
branching. It would print a string of chained 't' (branch is true), 'f'
(branch is false), and 'z' (branch already had its true section). The depth
traversal would have a limit, say 3 levels deep, and if the tree goes more
than that deep, then '...' would be printed in place of any deeper values
(a sketch of this logic follows the list below). So the prompt would change
through a session like:

command   prompt is now
---   ---
\echo bob ''   = initial state, no branch going on at all
\if yes   't' = inside a true branch
\if no'tf' = false inside a true
\endif't' = back to just the true branch
\if yes   'tt'
\if yes   'ttt'
\if yes   '...ttt' = only show the last 3, but let it be known that there's at least one more
\else '...ttz' = past the point of a true bit of this branch

2. The printing of #1 could be integrated into %R only in PROMPT_READY
cases, either prepended or appended to the !/=/^, possibly separated by a :
3. Like #2, but prepended/appended in all circumstances
4. Keep %T (or %Y), and reflect the state of pset.active_branch within %R,
a single t/f/z
5. Like #4, but also printing the if-stack depth if > 1
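
A minimal C sketch of the option-1 rendering, assuming the branch stack is
tracked as one character per level ('t'/'f'/'z'); all of this is
illustrative, not psql code:

#include <stdio.h>

#define PROMPT_MAX_SHOWN 3

/*
 * Render the \if stack into buf: show at most the last three levels,
 * prefixed with "..." when the stack is deeper than that.
 */
static void
branch_prompt(const char *stack, int depth, char *buf, size_t buflen)
{
    if (depth <= 0)
        buf[0] = '\0';          /* no branching: empty fragment */
    else if (depth <= PROMPT_MAX_SHOWN)
        snprintf(buf, buflen, "%.*s", depth, stack);
    else
        snprintf(buf, buflen, "...%.*s", PROMPT_MAX_SHOWN,
                 stack + depth - PROMPT_MAX_SHOWN);
}

For example, branch_prompt("tttz", 4, buf, sizeof(buf)) yields "...ttz",
matching the last row of the table above.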


Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andres Freund
On 2017-02-10 19:33:18 +0100, Magnus Hagander wrote:
> I guess we wouldn't, but we'd still need the "replacement for autoconf"
> part. So then we're back to maintaining multiple buildsystems.

Hm? Do we really need that?  Most of the things in an extension you do
*not* want to determine separately from the backend.  It's not like pgxs
atm really allows to differ wildly from autoconf's results. And most of
the relevant determinations made by autoconf are available in headers
and/or we can generate a cmake include file with the results of
autoconf.

- Andres




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Robert Haas
On Fri, Feb 10, 2017 at 12:32 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> That's not a bad idea, but I think it's an independent issue.  If the
>> hacks are still needed for an external module, we shouldn't go out of
>> our way to remove them even if we nuke tsearch2 (but we don't need to
>> maintain them going forward unless we get a complaint).  If the hacks
>> aren't still needed, they could be removed whether or not we keep
>> tsearch2 in contrib.  Unless I'm confused?
>
> Technically, it's probably independent of whether we keep tsearch2 in
> contrib.  I think (but haven't researched it) that it's more a matter
> of whether we care to still support direct upgrades from pre-release-N
> versions of tsearch2, for some N.  Politically though, I think it'll
> be easier to make an aggressive decision in that regard if we are
> tossing tsearch2 out of contrib.

I agree that it's OK to desupport direct upgrades from ancient
releases to the very latest code at some point in time, and if
somebody thinks that's important work to reduce future maintenance or
just to tidy up, I'm not going to oppose it strenuously.  But I
wouldn't be in any rush either.

Anyway, it seems like the consensus here is unanimous.  Unless there
are a LARGE number of contrary votes in the meanwhile, I'll go make
this happen sometime next week.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andres Freund
On 2017-02-10 20:31:12 +0200, Heikki Linnakangas wrote:
> On 02/10/2017 08:27 PM, Magnus Hagander wrote:
> > For me a killer feature would be if/when we can get to a point where we can
> > have something pgxs-style on cmake that also works on windows.
> > 
> > Our homemade Windows build system works OK for postgres, and while ugly it
> > is as you say well tested by now. But it doesn't do *anything* to help
> > people build extensions on Windows.
> 
> Do we need to build PostgreSQL itself using cmake, to achieve that? Could we
> write something like pgxs for cmake, only for extensions?

I don't see why it'd need to be done together.  The minimal version
would be a simple cmake file that just sets a bunch of variables from
pg_config, provides a few rules for specifying the current pgxs stuff,
and an example stanza showing how to include that file.  We'd have to duplicate
some of the pgxs specific logic, but that's probably not too bad.

- Andres




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Magnus Hagander
On Fri, Feb 10, 2017 at 7:31 PM, Heikki Linnakangas  wrote:

> On 02/10/2017 08:27 PM, Magnus Hagander wrote:
>
>> For me a killer feature would be if/when we can get to a point where we
>> can
>> have something pgxs-style on cmake that also works on windows.
>>
>> Our homemade Windows build system works OK for postgres, and while ugly it
>> is as you say well tested by now. But it doesn't do *anything* to help
>> people build extensions on Windows.
>>
>
> Do we need to build PostgreSQL itself using cmake, to achieve that? Could
> we write something like pgxs for cmake, only for extensions?
>
>
I guess we wouldn't, but we'd still need the "replacement for autoconf"
part. So then we're back to maintaining multiple buildsystems.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andres Freund
On 2017-02-10 19:27:55 +0100, Magnus Hagander wrote:
> For me a killer feature would be if/when we can get to a point where we can
> have something pgxs-style on cmake that also works on windows.

That should be quite possible.  The ugliest part will be to determine
where to include a cmake file from (since that'll be copied in every
such project), but after that it should be nearly trivial.

- Andres




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Heikki Linnakangas

On 02/10/2017 08:27 PM, Magnus Hagander wrote:

For me a killer feature would be if/when we can get to a point where we can
have something pgxs-style on cmake that also works on windows.

Our homemade Windows build system works OK for postgres, and while ugly it
is as you say well tested by now. But it doesn't do *anything* to help
people build extensions on Windows.


Do we need to build PostgreSQL itself using cmake, to achieve that? 
Could we write something like pgxs for cmake, only for extensions?


- Heikki





Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Magnus Hagander
On Fri, Feb 10, 2017 at 6:07 PM, Robert Haas  wrote:

> On Mon, Jan 30, 2017 at 10:26 AM, Peter Eisentraut
>  wrote:
> > On 1/30/17 1:28 AM, Andres Freund wrote:
> >> Given that fact, I just don't buy why it's a good idea to not also
> >> replace autoconf initially.
> >
> > Well, I find it a bit scary.  If you do the big switch all at once, then
> > you will have to dedicate the following 3 months to fixing complaints
> > from developers and build farmers.
>
> I agree with that.  I think replacing the Windows build system first
> and then the non-Windows build system later is a better plan than
> replacing both at the same time.
>
> But also, I'm not really sure whether this conversion makes sense.  I
> mean, any build system is going to require some work, and accordingly
> our present build systems require some work.  cmake will require
> different work, but not necessarily less.  The current system has a
> long history; we pretty much know it works.  Switching will inevitably
> break some things.  Maybe we'll end up better off in the long term,
> but maybe we won't.  Things are pretty good now, so it seems like it
> would be easy for them to get worse rather than better.  If nothing
> else, everybody who has to learn the new system either to use it for
> development or because they are doing packaging will have to do some
> amount of extra work as a result of any switch.
>
>
For me a killer feature would be if/when we can get to a point where we can
have something pgxs-style on cmake that also works on windows.

Our homemade Windows build system works OK for postgres, and while ugly it
is as you say well tested by now. But it doesn't do *anything* to help
people build extensions on Windows.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] removing tsearch2

2017-02-10 Thread Josh Berkus
On 02/10/2017 10:18 AM, Peter Geoghegan wrote:
> On Fri, Feb 10, 2017 at 3:28 AM, Robert Haas  wrote:
>>>> Works for me.
>>>
>>> +1
>>
>> OK, that's three votes in favor of removing tsearch2 (from core,
>> anyone who wants it can maintain a copy elsewhere).
> 
> +1.
> 
> I'd also be in favor of either removing contrib/isn, or changing it so
> that the ISBN country code prefix enforcement went away. That would
> actually not imply any real loss of functionality from a practical
> perspective, since you can still enforce that the check digit is
> correct without any of that. I think that the existing design of some
> parts of contrib/isn is just horrible.

+1 to quick-fix it, -1 to just delete it.

There's a bunch of these things in /contrib which really ought to be
PGXN extensions (also CUBE, earthdistance, etc.).  However, one of the
steps in that would be getting the mainstream platforms to package them
so that users have a reasonable upgrade path, so I would not propose
doing it for 10.

-- 
Josh Berkus
Containers & Databases Oh My!




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Robert Haas
On Fri, Feb 10, 2017 at 1:18 PM, Peter Geoghegan  wrote:
> I'd also be in favor of either removing contrib/isn, or changing it so
> that the ISBN country code prefix enforcement went away. That would
> actually not imply any real loss of functionality from a practical
> perspective, since you can still enforce that the check digit is
> correct without any of that. I think that the existing design of some
> parts of contrib/isn is just horrible.

There's no question that contrib/isn is not well-designed, but whether
to remove it is a question for a different thread than this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Index Scans

2017-02-10 Thread Robert Haas
On Thu, Feb 9, 2017 at 1:33 AM, Amit Kapila  wrote:
> compute_index_pages_v2.patch - This patch extracts the computation
> of index pages to be scanned into a separate function and uses it in
> existing code.  You will notice that I have pulled up the logic of
> conversion of clauses to indexquals from create_index_path to
> build_index_paths as that is required to compute the number of index
> and heap pages to be scanned by the scan in the patch
> parallel_index_opt_exec_support_v8.patch.  This doesn't impact any
> existing functionality.

This design presupposes that every AM that might ever want to do
parallel scans is happy with genericcostestimate()'s method of
computing the number of index pages that will be fetched.  However,
that might not be true for every possible AM.  In fact, it's already
not true for BRIN, which always reads the whole index.  Now, since
we're only concerned with btree at the moment, nothing would be
immediately broken by this approach but it seems we're just setting
ourselves up for future pain.

I have what I think might be a better idea: let's make
amcostestimate() responsible for returning a suggested parallel degree
for the path via an additional out parameter.  cost_index() can then
reduce that value if it seems like not enough heap pages will be
fetched to justify the return value, or it can override it completely
if parallel_degree is set for the relation.  Then we don't need to run
this logic twice to compute the same value, and we don't violate the
AM abstraction layer.
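
To make that concrete, here is a hypothetical sketch of the suggested API
change; the new out parameter and the helper are invented, and the
signature is simplified relative to the real amcostestimate interface:

#include "postgres.h"
#include "nodes/relation.h"

/* invented helper: derive a worker count from estimated pages fetched */
extern int  workers_for_pages(double pages_fetched);

void
btcostestimate_sketch(PlannerInfo *root, IndexPath *path, double loop_count,
                      Cost *indexStartupCost, Cost *indexTotalCost,
                      Selectivity *indexSelectivity, double *indexCorrelation,
                      int *suggestedWorkers)    /* new, hypothetical */
{
    double      pages_fetched = 0;

    /* ... existing genericcostestimate()-based logic sets pages_fetched ... */

    /*
     * Each AM applies its own notion of pages fetched here; BRIN, which
     * always reads the whole index, would use the full index size instead.
     * cost_index() can later clamp or override the suggestion.
     */
    *suggestedWorkers = workers_for_pages(pages_fetched);
}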

BTW, the changes to add_partial_path() aren't needed, because an
IndexPath only gets reused if you stick a Bitmap{Heap,And,Or}Path on
top of it, and that won't be the case with this or any other pending
patch.  If we get the ability to have a Parallel Bitmap Heap Scan that
takes a parallel index scan rather than a standard index scan as
input, then we'll need something like this.  But for now it's probably
best to just update the comments and remove the Assert().

I think you can also leave out the documentation changes from these
patches.  I'll do some general rewriting of the parallel query section
once we know exactly what capabilities we'll have in this release; I
think that will work better than trying to update them a bit at a time
for each patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Andres Freund
On 2017-02-10 12:07:15 -0500, Robert Haas wrote:
> On Mon, Jan 30, 2017 at 10:26 AM, Peter Eisentraut
>  wrote:
> > On 1/30/17 1:28 AM, Andres Freund wrote:
> >> Given that fact, I just don't buy why it's a good idea to not also
> >> replace autoconf initially.
> >
> > Well, I find it a bit scary.  If you do the big switch all at once, then
> > you will have to dedicate the following 3 months to fixing complaints
> > from developers and build farmers.
> 
> I agree with that.  I think replacing the Windows build system first
> and then the non-Windows build system later is a better plan than
> replacing both at the same time.

Obviously I disagree ;)


But I'm more replying because of:

> But also, I'm not really sure whether this conversion makes sense.  I
> mean, any build system is going to require some work, and accordingly
> our present build systems require some work.  cmake will require
> different work, but not necessarily less.  The current system has a
> long history; we pretty much know it works.  Switching will inevitably
> break some things.  Maybe we'll end up better off in the long term,
> but maybe we won't.  Things are pretty good now, so it seems like it
> would be easy for them to get worse rather than better.

I do think it's kinda ok for people who've dealt with this for some
time.  But for the first few times having to write autoconf macros is
quite intimidating. A somewhat less obscure way to deal with that is
worth something.  As is better file dependency management, across
different environments.  As is the IDE integration cmake is more and
more getting.  As is properly working built-in caching of configure tests
(llvm first cmake run, 7.7s, second 2.54s). As is the fact that a lot of
big projects (llvm, kde, qt, and a lot of others) migrated to it.

For me personally those benefits aren't worth that much.  I've learned
autoconf stuff.  I've scripted around autoconf caching.  But for newer
people that's not true.

Greetings,

Andres Freund




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Peter Geoghegan
On Fri, Feb 10, 2017 at 3:28 AM, Robert Haas  wrote:
>>> Works for me.
>>
>> +1
>
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).

+1.

I'd also be in favor of either removing contrib/isn, or changing it so
that the ISBN country code prefix enforcement went away. That would
actually not imply any real loss of functionality from a practical
perspective, since you can still enforce that the check digit is
correct without any of that. I think that the existing design of some
parts of contrib/isn is just horrible.

-- 
Peter Geoghegan




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Josh Berkus
On 02/10/2017 06:41 AM, David Steele wrote:
> On 2/10/17 6:28 AM, Robert Haas wrote:
>> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
>>> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>>>> * Robert Haas (robertmh...@gmail.com) wrote:
>>>>> On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane wrote:
>>>>>> Also, our experience with contrib/tsearch2 suggests that the extension
>>>>>> shouldn't be part of contrib, because we have zero track record of getting
>>>>>> rid of stuff in contrib, no matter how dead it is.
>>>>>
>>>>> Let's nuke tsearch2 to remove this adverse precedent, and then add the
>>>>> new thing.
>>>>>
>>>>> Anybody who still wants tsearch2 can go get it from an old version, or
>>>>> somebody can maintain a fork on github.
>>>>
>>>> Works for me.
>>>
>>> +1
>>
>> OK, that's three votes in favor of removing tsearch2 (from core,
>> anyone who wants it can maintain a copy elsewhere).  Starting a new
>> thread to make sure we collect all the relevant votes, but I really,
>> really think it's past time for this to go away.  The last actual
>> change to tsearch2 which wasn't part of a wider cleanup was
>> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
>> 7 years since there's been any real work done on this -- and the
>> release where we brought tsearch into core is now EOL, plus three more
>> releases besides.
> 
> +1
> 

+1

-- 
Josh Berkus
Containers & Databases Oh My!




Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2017-02-10 Thread Andres Freund
Hi,

On 2017-02-10 18:18:13 +0100, Markus Nullmeier wrote:
> Well, if this thread of thought about hand-crafted JIT should be taken
> up again by someone at some point in time, I guess it could be possible
> to reuse tools that are already out there, such as "DynASM"
> ( http://luajit.org/dynasm_features.html ) from the LuaJIT project
> (currently offers x86-64, x86-32, ARM-32, PPC-32, and MIPS-32).
> Maybe one could even recycle parts of the LuaJIT project itself
> ( http://luajit.org/luajit.html ,
>   http://article.gmane.org/gmane.comp.lang.lua.general/58908 ).

FWIW, I'd looked at dynasm/luajit.

One big reason to go for LLVM is that it has nearly all the
infrastructure to make backend-functions/operators inlineable.
Especially for some of the arithmetic operations and such, that'd be
quite useful performance-wise.  With LLVM you can just use clang on C to
generate the IR, do some work to boil down the IR modules to the
relevant functions (i.e. remove non-SQL-callable functions), for which
LLVM has infrastructure, and then inline the functions that way.  That's
a lot harder to do with nearly everything else (save gcc's jit library,
but the licensing and stability situation makes that unattractive).
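
For illustration, the pruning step could look roughly like this with the
LLVM C API; is_sql_callable() is a made-up predicate, and real code would
also have to keep any functions referenced by the survivors:

#include <llvm-c/Core.h>
#include <llvm-c/IRReader.h>

extern int  is_sql_callable(const char *name);  /* hypothetical predicate */

LLVMModuleRef
load_and_prune(LLVMContextRef ctx, const char *path)
{
    char       *err = NULL;
    LLVMMemoryBufferRef buf;
    LLVMModuleRef mod;
    LLVMValueRef fn;

    if (LLVMCreateMemoryBufferWithContentsOfFile(path, &buf, &err))
        return NULL;            /* couldn't read the clang-generated IR */
    if (LLVMParseIRInContext(ctx, buf, &mod, &err))
        return NULL;

    fn = LLVMGetFirstFunction(mod);
    while (fn != NULL)
    {
        /* grab the successor before possibly deleting this function */
        LLVMValueRef next = LLVMGetNextFunction(fn);

        if (!LLVMIsDeclaration(fn) &&
            !is_sql_callable(LLVMGetValueName(fn)))
            LLVMDeleteFunction(fn);
        fn = next;
    }
    return mod;
}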

Greetings,

Andres Freund




Re: [HACKERS] Parallel Index Scans

2017-02-10 Thread Robert Haas
On Wed, Feb 1, 2017 at 8:20 AM, Amit Kapila  wrote:
>> The hunk in indexam.c looks like a leftover that can be omitted.
>
> It is not a leftover hunk. Earlier, the patch had the same check in
> btparallelrescan, but based on your comment upthread [1] (Why does
> btparallelrescan cater to the case where scan->parallel_scan == NULL?),
> it has been moved to indexam.c.

That's not my point.  The point is, if it's not a parallel scan,
index_parallelrescan() should never get called in the first place.  So
therefore it shouldn't need to worry about scan->parallel_scan being
NULL.  It seems possibly reasonable to put something like
Assert(scan->parallel_scan != NULL) in there, but arranging to return
without doing anything in that case seems like it just masks possible
bugs in the calling code.  And really I'm not sure we even need the
Assert.
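
Concretely, the shape being suggested is roughly this (sketch only, body
elided):

#include "postgres.h"
#include "access/relscan.h"

void
index_parallelrescan(IndexScanDesc scan)
{
    /*
     * Callers must not get here for non-parallel scans; asserting the
     * contract beats silently returning, which would hide caller bugs.
     */
    Assert(scan->parallel_scan != NULL);

    /* ... reset the shared scan state via the AM's rescan support ... */
}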

>> I am a bit mystified about how this manages to work with array keys.
>> _bt_parallel_done() won't set btps_pageStatus to BTPARALLEL_DONE
>> unless so->arrayKeyCount >= btscan->btps_arrayKeyCount, but
>> _bt_parallel_advance_scan() won't do anything unless btps_pageStatus
>> is already BTPARALLEL_DONE.
>
> This is just to ensure that btps_arrayKeyCount is advanced once and
> btps_pageStatus is changed to BTPARALLEL_DONE once per array element.
> So it goes something like this: if we have an array with values [1,2,3],
> then all the workers will complete the scan with key 1 and one of them will
> mark btps_pageStatus as BTPARALLEL_DONE and then first one to hit
> _bt_parallel_advance_scan will increment the value of
> btps_arrayKeyCount, then same will happen for key 2 and 3.  It is
> quite possible that by the time one of the participants advances its
> local key, the scan for that key is already finished, and we handle
> that situation in _bt_parallel_seize() with below check:
>
> if (so->arrayKeyCount < btscan->btps_arrayKeyCount)
> *status = false;
>
> I think Rahila has also mentioned something along these lines; let us
> know if we are missing something here. Do you want to add more
> comments in the code to explain this handling? If yes, then where (on
> top of the function _bt_parallel_advance_scan)?
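
Reduced to a sketch, the handshake described above is something like the
following; the field names follow the discussion, but the types are toy
stand-ins here and the real logic (including the lock that guards the
shared state) lives in the patch:

#define BTPARALLEL_NOT_INITIALIZED  0
#define BTPARALLEL_DONE             1

typedef struct BTParallelScanSketch
{
    int     btps_arrayKeyCount; /* which array element is being scanned */
    int     btps_pageStatus;    /* BTPARALLEL_DONE once a key's scan ends */
} BTParallelScanSketch;

static void
bt_parallel_advance_scan_sketch(BTParallelScanSketch *btscan,
                                int localKeyCount)
{
    /* the real code holds the scan's lock around this test-and-set */
    if (btscan->btps_pageStatus == BTPARALLEL_DONE &&
        localKeyCount == btscan->btps_arrayKeyCount)
    {
        /* only the first participant to arrive advances the shared key */
        btscan->btps_arrayKeyCount++;
        btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
    }
    /* each backend advances its own local key count separately */
}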

You know, I just misread this code.  Both of you are right, and I am wrong.

>> That
>> is a little odd.  Another, possibly related thing that is odd is that
>> when _bt_steppage() finds no tuples and decides to advance to a new
>> page again, there's a very short comment in the forward case and a
>> very long comment in the backward case:
>>
>> /* nope, keep going */
>> vs.
>> /*
>>  * For parallel scans, get the last page scanned as it is quite
>>  * possible that by the time we try to fetch previous page, other
>>  * worker has also decided to scan that previous page.  We could
>>  * avoid that by doing _bt_parallel_release once we have read the
>>  * current page, but it is bad to make other workers wait till we
>>  * read the page.
>>  */
>>
>> Now it seems to me that these cases are symmetric and the issues are
>> identical.  It's basically that, while we were judging whether the
>> current page has useful contents, some other process could have
>> advanced the scan (since _bt_readpage un-seized it).
>
> Yeah, but the reason for the difference in comments is that for
> non-parallel backward scans there is no code at that place to move to
> the previous page; that case basically relies on the next call to
> _bt_walk_left(), whereas for parallel scans we can't simply rely on
> _bt_walk_left().  I have slightly modified the comments for the
> backward scan case; see if that looks better, and if not, suggest
> what you think is better.

Why can't we rely on _bt_walk_left?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Index Scans

2017-02-10 Thread Robert Haas
On Thu, Feb 9, 2017 at 5:34 AM, Amit Kapila  wrote:
>> What about parallel CREATE INDEX? The patch currently uses
>> min_parallel_relation_size as an input into the optimizer's custom
>> cost model. I had wondered if that made sense. Note that another such
>> input is the projected size of the final index.
>
> If projected index size is available, then I think Create Index can
> also use a somewhat similar formula where we cap the maximum number of
> workers based on the size of the index.  Now, I am not sure if the
> threshold values of the GUCs kept for the scan are realistic for the
> CREATE INDEX operation.

I think that would be an abuse of the GUC, because the idea of the
existing GUC - and the new one we're proposing to create here - has
always been about the amount of data being fed into the parallel
operation.  In the case of CREATE INDEX, the resulting index is an
output, not an input.  So if I were Peter and wanted to reuse the
existing GUCs, I'd reuse the one for the table size, because that's
what is being scanned.  No index is going to get scanned.

Of course, it's possible that the sensible amount of parallelism for
CREATE INDEX is higher or lower than for other sequential scans, so
that might not be the right thing to do.  It might need its own knob,
or some other refinement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Tom Lane
Robert Haas  writes:
> That's not a bad idea, but I think it's an independent issue.  If the
> hacks are still needed for an external module, we shouldn't go out of
> our way to remove them even if we nuke tsearch2 (but we don't need to
> maintain them going forward unless we get a complaint).  If the hacks
> aren't still needed, they could be removed whether or not we keep
> tsearch2 in contrib.  Unless I'm confused?

Technically, it's probably independent of whether we keep tsearch2 in
contrib.  I think (but haven't researched it) that it's more a matter
of whether we care to still support direct upgrades from pre-release-N
versions of tsearch2, for some N.  Politically though, I think it'll
be easier to make an aggressive decision in that regard if we are
tossing tsearch2 out of contrib.

regards, tom lane




Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2017-02-10 Thread Markus Nullmeier
On 12/06/16 21:40, Andres Freund wrote:
> On 2016-12-06 14:35:43 -0600, Nico Williams wrote:
>> On Tue, Dec 06, 2016 at 12:27:51PM -0800, Andres Freund wrote:
>>> On 2016-12-06 14:19:21 -0600, Nico Williams wrote:
> I concur with your feeling that hand-rolled JIT is right out.  But

 Yeah, that way lies maintenance madness.
>>>
>>> I'm not quite that sure about that. I had a lot of fun doing some
>>> hand-rolled x86 JITing. Not that is a ward against me being mad.  But
>>> more seriously: Manually doing a JIT gives you a lot faster compilation
>>> times, which makes JIT applicable in a lot more situations.
>>
>> What I meant is that each time there are new ISA extensions, or
>> differences in how relevant/significant different implementations of the
>> same ISA implement certain instructions, and/or every time you want to
>> add a new architecture... someone has to do a lot of very low-level
>> work.
> 
> Yea, that's why I didn't pursue this path further. I *personally* think
> it'd be perfectly fine to only support JITing on linux x86_64 and
> aarch64 for now. And those I'd be willing to work on. But since I know
> that's not project policy...

Well, if this thread of thought about hand-crafted JIT should be taken
up again by someone at some point in time, I guess it could be possible
to reuse tools that are already out there, such as "DynASM"
( http://luajit.org/dynasm_features.html ) from the LuaJIT project
(currently offers x86-64, x86-32, ARM-32, PPC-32, and MIPS-32).
Maybe one could even recycle parts of the LuaJIT project itself
( http://luajit.org/luajit.html ,
  http://article.gmane.org/gmane.comp.lang.lua.general/58908 ).

-- 
Markus Nullmeier                http://www.g-vo.org
German Astrophysical Virtual Observatory (GAVO)





Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-10 Thread Robert Haas
On Wed, Feb 8, 2017 at 5:31 PM, Jim Nasby  wrote:
>   11 There existed a race condition /where/ if CREATE INDEX CONCURRENTLY was
> called on a column that had not been indexed before, then rows that were
> updated by transactions running at the same time as the CREATE INDEX
> CONCURRENTLY command /may not/ have been indexed incorrectly.

I think this is moot at this point, but "may not have been indexed
incorrectly" seems to have two negatives where there should be only
one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: About CMake v2

2017-02-10 Thread Robert Haas
On Mon, Jan 30, 2017 at 10:26 AM, Peter Eisentraut
 wrote:
> On 1/30/17 1:28 AM, Andres Freund wrote:
>> Given that fact, I just don't buy why it's a good idea to not also
>> replace autoconf initially.
>
> Well, I find it a bit scary.  If you do the big switch all at once, then
> you will have to dedicate the following 3 months to fixing complaints
> from developers and build farmers.

I agree with that.  I think replacing the Windows build system first
and then the non-Windows build system later is a better plan than
replacing both at the same time.

But also, I'm not really sure whether this conversion makes sense.  I
mean, any build system is going to require some work, and accordingly
our present build systems require some work.  cmake will require
different work, but not necessarily less.  The current system has a
long history; we pretty much know it works.  Switching will inevitably
break some things.  Maybe we'll end up better off in the long term,
but maybe we won't.  Things are pretty good now, so it seems like it
would be easy for them to get worse rather than better.  If nothing
else, everybody who has to learn the new system either to use it for
development or because they are doing packaging will have to do some
amount of extra work as a result of any switch.

I do agree that - in theory - one build system is better than two.
But two well-tested, reliable build systems could easily be better
than one system with a bunch of problems.  And the points downthread
about our two existing systems being not entirely separate are on
point, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Robert Haas
On Fri, Feb 10, 2017 at 9:28 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> OK, that's three votes in favor of removing tsearch2 (from core,
>> anyone who wants it can maintain a copy elsewhere).  Starting a new
>> thread to make sure we collect all the relevant votes, but I really,
>> really think it's past time for this to go away.  The last actual
>> change to tsearch2 which wasn't part of a wider cleanup was
>> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
>> 7 years since there's been any real work done on this -- and the
>> release where we brought tsearch into core is now EOL, plus three more
>> releases besides.
>
> I'm abstaining on the actual vote, but I just wanted to note that there's
> at least two layers of stub compatibility functions in core that are
> there only to support tsearch2, cf commits
> eb43e851d6b3fa0c4516efcfdf29a183f7717a43
> 6595dd04d136d5c97ae05fc580572c8f00042143
> ... and there might be more that I missed in quickly scanning the
> commit log.  We should probably investigate whether those can be
> gotten rid of without damaging the maintainability of the external
> module.  I *think* these were only needed as short-term hacks but
> don't want to say for sure because ENOCAFFEINE.

That's not a bad idea, but I think it's an independent issue.  If the
hacks are still needed for an external module, we shouldn't go out of
our way to remove them even if we nuke tsearch2 (but we don't need to
maintain them going forward unless we get a complaint).  If the hacks
aren't still needed, they could be removed whether or not we keep
tsearch2 in contrib.  Unless I'm confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GSoC 2017

2017-02-10 Thread Dmitry Melnik
The expected result for this work is a push-based executor working for many
types of queries (currently we aim at TPC-H), but it's unlikely to be a
production-ready patch to commit into mainline at that stage. This work is
the actual topic for our student's thesis, so he has already started, and
has working prototypes for very simple plans. Also, he won't be working on
this alone, but rather will make use of support and experience of our team
(as well as mentor's help).
So this is not about replacing the current pull executor right away, but
rather about developing a working prototype to find out about the benefits
of switching from the pull to the push model (for both the interpreter and
the LLVM JIT).
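
As a toy illustration of the difference (plain C, not PostgreSQL code),
compare the two control-flow styles:

#include <stdio.h>

typedef struct Tuple { int val; } Tuple;

/* pull (Volcano) style: the consumer demands tuples one at a time */
static Tuple *
scan_next(void)
{
    static Tuple t;
    static int  i = 0;

    if (i >= 3)
        return NULL;            /* exhausted */
    t.val = i++;
    return &t;
}

static void
run_pull(void)
{
    Tuple      *t;

    while ((t = scan_next()) != NULL)   /* ExecProcNode()-like loop */
        printf("pull got %d\n", t->val);
}

/* push style: the scan drives the loop and hands tuples upward */
static void
consume(Tuple *t)                       /* stands in for a parent operator */
{
    printf("push got %d\n", t->val);
}

static void
run_push(void)
{
    Tuple       t;

    for (t.val = 0; t.val < 3; t.val++)
        consume(&t);
}

int
main(void)
{
    run_pull();
    run_push();
    return 0;
}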

On Wed, Feb 8, 2017 at 7:06 PM, Robert Haas  wrote:

> On Mon, Feb 6, 2017 at 6:51 AM, Ruben Buchatskiy  wrote:
> > 2017-01-10 12:53 GMT+03:00 Alexander Korotkov:
> >> 1. What project ideas we have?
> >
> > We would like to propose a project on rewriting PostgreSQL executor from
> > traditional Volcano-style [1] to so-called push-based architecture as
> > implemented in Hyper [2][3] and VitesseDB [4]. The idea is to reverse the
> > direction of data flow control: instead of pulling up tuples one-by-one
> > with ExecProcNode(), we suggest pushing them from below to top until
> > blocking operator (e.g. Aggregation) is encountered. There's a good
> > example and more detailed explanation for this approach in [2].
>
> I think this is very possibly a good idea but extremely unlikely to be
> something that a college student or graduate student can complete in
> one summer.  More like an existing expert developer and a year of
> doing not much else.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



-- 
Best regards,
  Dmitry


Re: [HACKERS] removing tsearch2

2017-02-10 Thread David Steele
On 2/10/17 6:28 AM, Robert Haas wrote:
> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
>> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>>> * Robert Haas (robertmh...@gmail.com) wrote:
>>>> On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane wrote:
>>>>> Also, our experience with contrib/tsearch2 suggests that the extension
>>>>> shouldn't be part of contrib, because we have zero track record of getting
>>>>> rid of stuff in contrib, no matter how dead it is.
>>>>
>>>> Let's nuke tsearch2 to remove this adverse precedent, and then add the
>>>> new thing.
>>>>
>>>> Anybody who still wants tsearch2 can go get it from an old version, or
>>>> somebody can maintain a fork on github.
>>>
>>> Works for me.
>>
>> +1
> 
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).  Starting a new
> thread to make sure we collect all the relevant votes, but I really,
> really think it's past time for this to go away.  The last actual
> change to tsearch2 which wasn't part of a wider cleanup was
> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
> 7 years since there's been any real work done on this -- and the
> release where we brought tsearch into core is now EOL, plus three more
> releases besides.

+1

-- 
-David
da...@pgmasters.net




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Alexander Korotkov
On Fri, Feb 10, 2017 at 2:28 PM, Robert Haas  wrote:

> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
> > On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
> >> * Robert Haas (robertmh...@gmail.com) wrote:
> >> > On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane  wrote:
> >> > > Also, our experience with contrib/tsearch2 suggests that the extension
> >> > > shouldn't be part of contrib, because we have zero track record of getting
> >> > > rid of stuff in contrib, no matter how dead it is.
> >> >
> >> > Let's nuke tsearch2 to remove this adverse precedent, and then add the
> >> > new thing.
> >> >
> >> > Anybody who still wants tsearch2 can go get it from an old version, or
> >> > somebody can maintain a fork on github.
> >>
> >> Works for me.
> >
> > +1
>
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).  Starting a new
> thread to make sure we collect all the relevant votes, but I really,
> really think it's past time for this to go away.  The last actual
> change to tsearch2 which wasn't part of a wider cleanup was
> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
> 7 years since there's been any real work done on this -- and the
> release where we brought tsearch into core is now EOL, plus three more
> releases besides.
>

+1

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] removing tsearch2

2017-02-10 Thread Tom Lane
Robert Haas  writes:
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).  Starting a new
> thread to make sure we collect all the relevant votes, but I really,
> really think it's past time for this to go away.  The last actual
> change to tsearch2 which wasn't part of a wider cleanup was
> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
> 7 years since there's been any real work done on this -- and the
> release where we brought tsearch into core is now EOL, plus three more
> releases besides.

I'm abstaining on the actual vote, but I just wanted to note that there's
at least two layers of stub compatibility functions in core that are
there only to support tsearch2, cf commits
eb43e851d6b3fa0c4516efcfdf29a183f7717a43
6595dd04d136d5c97ae05fc580572c8f00042143
... and there might be more that I missed in quickly scanning the
commit log.  We should probably investigate whether those can be
gotten rid of without damaging the maintainability of the external
module.  I *think* these were only needed as short-term hacks but
don't want to say for sure because ENOCAFFEINE.

regards, tom lane




Re: [HACKERS] Parameterization of partial path

2017-02-10 Thread Antonin Houska
Robert Haas  wrote:

> On Thu, Feb 9, 2017 at 12:36 PM, Antonin Houska  wrote:
> > When looking at try_partial_hashjoin_path and try_partial_nestloop_path
> > functions I'm failing to understand the comment "Parameterized partial paths
> > are not supported.".
> >
> > It's clear to me that GatherPath can't have parameters because repeated
> > execution of parallel plan with adjusted parameters is not supported.
> 
> Actually, I think in theory that is fine.  You'd start up and shut
> down workers for every execution, which is likely to stink in terms of
> performance, but it should work OK.  The only problem is that it'll
> only work if you pass the values of the parameters down to the worker
> processes, which the code currently doesn't. Bind parameters sent by
> the user are passed down, but there's no provision to pass down
> parameters populated at execution time.  Amit has written code for
> that, though, and it's been posted here.  It just hasn't gotten into
> the tree yet for one reason or another.

ok, I also thought it was about a missing implementation rather than a
fundamental issue.

> > But the
> > fact that generic partial path has parameters should not be a problem if 
> > those
> > parameters are satisfied below the nearest GatherPath node higher in the 
> > plan
> > tree. Do I miss anything of the concept?
> 
> Yes, I think you're missing a key point.  A parameterized path would
> normally be used for a query like small LJ (big1 IJ big2 ON big1.x =
> big2.x) ON big1.y = small.y.  The plan might look like this:

Thanks for the detailed explanation. I think the missing point in my thoughts
was that the only way to satisfy the parameters of a relation (so that Gather
does not
have to pass any parameters) is to put the relation on the inner side of a
join. However the inner side must provide the whole result set, not just part
of it, else the result is wrong.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at




Re: [HACKERS] removing tsearch2

2017-02-10 Thread Daniel Gustafsson
> On 10 Feb 2017, at 12:28, Robert Haas  wrote:
> 
> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
>> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>>> * Robert Haas (robertmh...@gmail.com) wrote:
>>>> On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane wrote:
>>>>> Also, our experience with contrib/tsearch2 suggests that the extension
>>>>> shouldn't be part of contrib, because we have zero track record of getting
>>>>> rid of stuff in contrib, no matter how dead it is.
>>>>
>>>> Let's nuke tsearch2 to remove this adverse precedent, and then add the
>>>> new thing.
>>>>
>>>> Anybody who still wants tsearch2 can go get it from an old version, or
>>>> somebody can maintain a fork on github.
>>> 
>>> Works for me.
>> 
>> +1
> 
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).  Starting a new
> thread to make sure we collect all the relevant votes, but I really,
> really think it's past time for this to go away.  The last actual
> change to tsearch2 which wasn't part of a wider cleanup was
> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
> 7 years since there's been any real work done on this -- and the
> release where we brought tsearch into core is now EOL, plus three more
> releases besides.

FWIW, +1

cheers ./daniel



Re: [HACKERS] Passing query string to workers

2017-02-10 Thread Amit Kapila
On Fri, Feb 10, 2017 at 2:54 PM, Kuntal Ghosh
 wrote:
> On Tue, Feb 7, 2017 at 10:19 AM, Rafia Sabih
>  wrote:
>> Thanks a lot Kuntal for the review, please find attached patch for the
>> revised version.
> Few comments on the patch:
>
> There are some spacing issues in the code. For example,
> +   estate->es_queryString = (char
> *)palloc0(strlen(queryDesc->sourceText) + 1);
> +   /*Estimate space for query text. */
> pgindent might be helpful to track all such changes.
>
> +#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE010)
> I'm uncomfortable with declaring the same macro in two
> files(parallel.c, execParallel.c). My suggestion would be to move
> pgstat_report_activity in ParallelQueryMain instead of
> ParallelWorkerMain. Then, you can remove the macro definition from
> parallel.c. Thoughts?
>

Yes, I also don't see any need to define it in parallel.c.  I think
she has kept it there to report the query in pg_stat_activity, but I
feel that code can also be moved to execParallel.c.

Another question: don't we need to set debug_query_string in the worker?
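
A sketch of what the worker side could do once the query text is in shared
memory, which would address both points; PARALLEL_KEY_QUERY_TEXT is the
patch's toc key, and the two-argument shm_toc_lookup() signature is the
one current at the time of writing:

#include "postgres.h"
#include "pgstat.h"
#include "storage/shm_toc.h"
#include "tcop/tcopprot.h"      /* declares debug_query_string */

static void
report_worker_query(shm_toc *toc)
{
    char       *query_text = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT);

    debug_query_string = query_text;                    /* elog context */
    pgstat_report_activity(STATE_RUNNING, query_text);  /* pg_stat_activity */
}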

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parameterization of partial path

2017-02-10 Thread Amit Kapila
On Thu, Feb 9, 2017 at 11:36 PM, Robert Haas  wrote:
> On Thu, Feb 9, 2017 at 12:36 PM, Antonin Houska  wrote:
>> When looking at try_partial_hashjoin_path and try_partial_nestloop_path
>> functions I'm failing to understand the comment "Parameterized partial paths
>> are not supported.".
>>
>> It's clear to me that GatherPath can't have parameters because repeated
>> execution of parallel plan with adjusted parameters is not supported.
>
> Actually, I think in theory that is fine.  You'd start up and shut
> down workers for every execution, which is likely to stink in terms of
> performance, but it should work OK.  The only problem is that it'll
> only work if you pass the values of the parameters down to the worker
> processes, which the code currently doesn't.
>

Another thing is that currently we reuse the existing DSM for rescan.
If we really want to pass the params during rescan, we would need some
work so that we can obtain more memory at rescan time if the new
parameter values require it.  Now that we have DSA, I think we could
use that to pass the params during rescan if required.  The important
point here, however, is that I have not come across any plan (in
benchmarks like TPC-H) where it would be beneficial to pass params
during rescan, so I am not sure it is worth the effort to invent
something for that.  It is possible that I am missing something, but
if someone can present a use case, then I think we can try to support
such parallel plans.
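
(For concreteness, such a use case would have roughly the shape below:
a parameterized parallel inner side that is re-executed with a fresh
outer value on every rescan.  Whether the planner would ever choose
such a plan is exactly the open question; the tables are made up.)

    explain select *
    from t1
    where exists (select 1 from t2 where t2.j = t1.j);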

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-02-10 Thread Robert Haas
On Thu, Feb 9, 2017 at 6:38 PM, Thomas Munro
 wrote:
> Here's my thought process... please tell me where I'm going wrong:
>
> I have been assuming that it's not enough to just deal with this when
> the leader detaches on the theory that other participants will always
> detach first: that probably isn't true in some error cases, and could
> contribute to spurious racy errors where other workers complain about
> disappearing files if the leader somehow shuts down and cleans up
> while a worker is still running.  Therefore we need *some* kind of
> refcounting, whether it's a new kind or a new mechanism based on the
> existing kind.

+1.

> I have also been assuming that we don't want to teach dsm.c directly
> about this stuff; it shouldn't need to know about other modules, so we
> don't want it talking to buffile.c directly and managing a special
> table of files; instead we want a system of callbacks.  Therefore
> client code needs to do something after attaching to the segment in
> each backend.

+1.

> It doesn't matter whether we use an on_dsm_detach() callback and
> manage our own refcount to infer that destruction is imminent, or a
> new on_dsm_destroy() callback which tells us so explicitly: both ways
> we'll need to make sure that anyone who attaches to the segment also
> "attaches" to this shared BufFile manager object inside it, because
> any backend might turn out to be the one that is last to detach.

Not entirely.  In the first case, you don't need the requirement that
everyone who attaches to the segment must attach to the shared BufFile
manager.  In the second case, you do.
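
To make the first case concrete, here is a rough sketch; the struct
and function names are mine, not a proposal for the actual API:

    /* shared state, stored inside the DSM segment */
    typedef struct SharedBufFileSet
    {
        slock_t     mutex;
        int         refcnt;     /* bumped by each backend that attaches */
        /* ... identifiers of the temp files to clean up ... */
    } SharedBufFileSet;

    /* registered via on_dsm_detach() by each backend that attaches */
    static void
    shared_buffile_on_detach(dsm_segment *seg, Datum arg)
    {
        SharedBufFileSet *set = (SharedBufFileSet *) DatumGetPointer(arg);
        bool        last;

        SpinLockAcquire(&set->mutex);
        last = (--set->refcnt == 0);
        SpinLockRelease(&set->mutex);

        /* the last backend to detach deletes the underlying files */
        if (last)
            DeleteSharedBufFiles(set);  /* hypothetical cleanup routine */
    }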

> That brings us to the race you mentioned.  Isn't it sufficient to say
> that you aren't allowed to do anything that might throw in between
> attaching to the segment and attaching to the SharedBufFileManager
> that it contains?

That would be sufficient, but I think it's not a very good design.  It
means, for example, that nothing between the time you attach to the
segment and the time you attach to this manager can palloc() anything.
So, for example, it would have to happen before ParallelWorkerMain
reaches the call to shm_mq_attach, which kinda sucks because we want
to do that as soon as possible after attaching to the DSM segment so
that errors are reported properly thereafter.  Note that's the very
first thing we do now, except for working out what the arguments to
that call need to be.

Also, while it's currently safe to assume that shm_toc_attach() and
shm_toc_lookup() don't throw errors, I've thought about the
possibility of installing some sort of cache in shm_toc_lookup() to
amortize the cost of lookups, if the number of keys ever got too
large.  And that would then require a palloc().  Generally, backend
code should be free to throw errors.  When it's absolutely necessary
for a short segment of code to avoid that, then we do, but you can't
really rely on any substantial amount of code to be that way, or stay
that way.

And in this case, even if we didn't mind those problems or had some
solution to them, I think that the shared buffer manager shouldn't
have to be something that is whacked directly into parallel.c all the
way at the beginning of the initialization sequence so that nothing
can fail before it happens.  I think it should be an optional data
structure that clients of the parallel infrastructure can decide to
use, or to not use.  It should be at arm's length from the core code,
just like the way ParallelQueryMain() is distinct from
ParallelWorkerMain() and sets up its own set of data structures with
their own set of keys.  All that stuff is happy to happen after
whatever ParallelWorkerMain() feels that it needs to do, even if
ParallelWorkerMain might throw errors for any number of unknown
reasons.  Similarly, I think this new thing should be something that
an executor node can decide to create inside its own per-node space --
reserved via ExecParallelEstimate, initialized via
ExecParallelInitializeDSM, etc.  There's no need for it to be deeply
coupled to parallel.c itself unless we force that choice by sticking a
no-fail requirement in there.
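
(That is, the usual per-node pattern, sketched here with placeholder
Shared* names rather than real functions:)

    /* ExecXXXEstimate: reserve space for the node's shared state */
    shm_toc_estimate_chunk(&pcxt->estimator, sizeof(SharedBufFileSet));
    shm_toc_estimate_keys(&pcxt->estimator, 1);

    /* ExecXXXInitializeDSM: allocate it and advertise it under the
       node's plan_node_id, as other parallel-aware nodes do */
    set = shm_toc_allocate(pcxt->toc, sizeof(SharedBufFileSet));
    SharedBufFileSetInit(set);          /* hypothetical */
    shm_toc_insert(pcxt->toc, node->ps.plan->plan_node_id, set);

    /* ExecXXXInitializeWorker: look it up and attach */
    set = shm_toc_lookup(toc, node->ps.plan->plan_node_id);
    SharedBufFileSetAttach(set);        /* hypothetical */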

> Up until two minutes ago I assumed that policy would leave only two
> possibilities: you attach to the DSM segment and attach to the
> SharedBufFileManager successfully or you attach to the DSM segment and
> then die horribly (but not throw) and the postmaster restarts the
> whole cluster and blows all temp files away with RemovePgTempFiles().
> But I see now in the comment of that function that crash-induced
> restarts don't call that because "someone might want to examine the
> temp files for debugging purposes".  Given that policy for regular
> private BufFiles, I don't see why that shouldn't apply equally to
> shared files: after a crash restart, you may have some junk files that
> won't be cleaned up until your next clean restart, whether they were
> private or shared BufFiles.

I think most people (other than Tom) would agree that 

Re: [HACKERS] removing tsearch2

2017-02-10 Thread Michael Paquier
On Fri, Feb 10, 2017 at 8:28 PM, Robert Haas  wrote:
> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
>> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>>> * Robert Haas (robertmh...@gmail.com) wrote:
>>> > On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane  wrote:
>>> > > Also, our experience with contrib/tsearch2 suggests that the extension
>>> > > shouldn't be part of contrib, because we have zero track record of 
>>> > > getting
>>> > > rid of stuff in contrib, no matter how dead it is.
>>> >
>>> > Let's nuke tsearch2 to remove this adverse precedent, and then add the
>>> > new thing.
>>> >
>>> > Anybody who still wants tsearch2 can go get it from an old version, or
>>> > somebody can maintain a fork on github.
>>>
>>> Works for me.
>>
>> +1
>
> OK, that's three votes in favor of removing tsearch2 (from core,
> anyone who wants it can maintain a copy elsewhere).  Starting a new
> thread to make sure we collect all the relevant votes, but I really,
> really think it's past time for this to go away.  The last actual
> change to tsearch2 which wasn't part of a wider cleanup was
> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
> 7 years since there's been any real work done on this -- and the
> release where we brought tsearch into core is now EOL, plus three more
> releases besides.

+1.
-- 
Michael




[HACKERS] removing tsearch2

2017-02-10 Thread Robert Haas
On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund  wrote:
> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>> * Robert Haas (robertmh...@gmail.com) wrote:
>> > On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane  wrote:
>> > > Also, our experience with contrib/tsearch2 suggests that the extension
>> > > shouldn't be part of contrib, because we have zero track record of 
>> > > getting
>> > > rid of stuff in contrib, no matter how dead it is.
>> >
>> > Let's nuke tsearch2 to remove this adverse precedent, and then add the
>> > new thing.
>> >
>> > Anybody who still wants tsearch2 can go get it from an old version, or
>> > somebody can maintain a fork on github.
>>
>> Works for me.
>
> +1

OK, that's three votes in favor of removing tsearch2 (from core,
anyone who wants it can maintain a copy elsewhere).  Starting a new
thread to make sure we collect all the relevant votes, but I really,
really think it's past time for this to go away.  The last actual
change to tsearch2 which wasn't part of a wider cleanup was
3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
7 years since there's been any real work done on this -- and the
release where we brought tsearch into core is now EOL, plus three more
releases besides.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] parallelize queries containing initplans

2017-02-10 Thread Amit Kapila
On Tue, Jan 31, 2017 at 4:16 PM, Amit Kapila  wrote:
> On Wed, Dec 28, 2016 at 5:20 PM, Amit Kapila  wrote:
>
>>   The drawback of the second approach is
>> that we need to evaluate the initplan before it is actually required
>> which means that we might evaluate it even when it is not required.  I
>> am not sure if it is always safe to assume that we can evaluate the
>> initplan before pushing it to workers especially for the cases when it
>> is far enough down in the plan tree which we are parallelizing,
>>
>
> I think we can always pull up un-correlated initplans at Gather node,
> however, if there is a correlated initplan, then it is better not to
> allow such initplans for being pushed below gather.  Ex. of correlated
> initplans:
>
> postgres=# explain (costs off) select * from t1 where t1.i in (select
> t2.i from t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
>   QUERY PLAN
> --
>  Seq Scan on t1
>Filter: (SubPlan 2)
>SubPlan 2
>  ->  Gather
>Workers Planned: 1
>Params Evaluated: $1
>InitPlan 1 (returns $1)
>  ->  Aggregate
>->  Seq Scan on t3
>  Filter: (i = t1.i)
>->  Result
>  One-Time Filter: (t1.k = $1)
>  ->  Parallel Seq Scan on t2
> (13 rows)
>
> It might be safe to allow above plan, but in general, such plans
> should not be allowed, because it might not be feasible to compute
> such initplan references at Gather node.  I am still thinking on the
> best way to deal with such initplans.
>

I can see two possibilities for determining whether the plan (for
which we are going to generate an initplan) contains a reference to a
correlated var param node.  One is to write a plan or path walker to
detect any such reference, and the second is to keep the information
about the correlated param in the path node.  The drawback of the
first approach is that traversing the path tree during generation of
the initplan can be costly, so for now I have kept the information in
the path node to prohibit generating parallel initplans which contain
a reference to correlated vars.  We can go with the first approach of
using a path walker if people feel that is better than maintaining a
reference in the path.  The attached patch
prohibit_parallel_correl_params_v1.patch implements the second
approach of keeping the correlated var param reference in the path
node, and pq_pushdown_initplan_v2.patch uses that to generate parallel
initplans.
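
For reference, the first approach would look roughly like the sketch
below; the function name and the safe-param bookkeeping are mine, not
the patch's:

    /*
     * Return true if the expression tree references a PARAM_EXEC param
     * that is not produced within the subtree itself, i.e. a correlated
     * parameter that could not be evaluated before launching workers.
     */
    static bool
    contains_correlated_params_walker(Node *node, Bitmapset *safe_params)
    {
        if (node == NULL)
            return false;
        if (IsA(node, Param))
        {
            Param  *param = (Param *) node;

            return (param->paramkind == PARAM_EXEC &&
                    !bms_is_member(param->paramid, safe_params));
        }
        return expression_tree_walker(node,
                                      contains_correlated_params_walker,
                                      (void *) safe_params);
    }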

Thoughts?

These patches build on top of the parallel subplan patch [1].

[1] - 
https://www.postgresql.org/message-id/caa4ek1kyqjqzqmpez+qra2fmim386gqlqbef+p2wmtqjh1r...@mail.gmail.com


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


prohibit_parallel_correl_params_v1.patch
Description: Binary data


pq_pushdown_initplan_v2.patch
Description: Binary data



Re: [HACKERS] GUC for cleanup indexes threshold.

2017-02-10 Thread Kuntal Ghosh
On Wed, Jan 4, 2017 at 1:51 PM, Masahiko Sawada  wrote:

> The attached patch introduces a new GUC parameter,
> vacuum_cleanup_index_scale_factor, which specifies the fraction of
> table pages containing dead tuples needed to trigger a cleanup of the
> indexes. The default is 0.0, which means that index cleanup is not
> invoked if there were no updates on the table. In other words, if the
> table is completely frozen then lazy vacuum can skip the index scans
> as well. Increasing this value could reduce the total time of lazy
> vacuum, but the statistics and the free space map of the indexes are
> not updated.
>
I was looking into your patch and trying to understand how the
following piece of code works.
+if (vacuumed_pages > cleanupidx_thresh)
+{
+    for (i = 0; i < nindexes; i++)
+        lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+}
So, you are skipping the index cleanup if the threshold is not
reached, but you are still removing from the heap all the dead tuples
that those indexes point to. Hence, the indexes will contain entries
with invalid references. How does that work? How will you remove those
index entries later? (I'm a newbie.)

+This parameter can only be set anywhere.
Oxymoron. :-)
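
As an aside, this is how I understand the parameter would be used
(name as in the patch; the threshold value is invented):

    -- skip the index cleanup phase unless at least 1% of the table's
    -- pages contained dead tuples
    SET vacuum_cleanup_index_scale_factor = 0.01;
    VACUUM my_table;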


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Documentation improvements for partitioning

2017-02-10 Thread Simon Riggs
On 10 February 2017 at 08:18, Amit Langote
 wrote:

> I agree that getting the proposed documentation changes committed would be
> a step ahead.

Committed. I tested how it works and added documentation that matched
my experiences, correcting what you'd said and adding more information
for clarity as I went.

A few points from the tests:

* "ERROR:  cannot create index on partitioned table "measurement_year_month""
is misleading because you can create indexes on partitions

* You can create an additional CHECK (column IS NOT NULL) check
constraint, even though you can't create a NOT NULL constraint

* The OID inheritance needs work - you shouldn't need to specify that
a partition needs OIDS if the parent already has them.

* There's no tab completion, which prevents people from testing this
(maybe better now with some docs)

* ERROR:  no partition of relation "measurement_year_month" found for row
DETAIL:  Failing row contains (2016-12-02, 1, 1).
should not return the whole row, just the partition keys

* It's very weird you can't DROP a partitioned table. I think you need
to add dependencies.

* ERROR:  TO must specify exactly one value per partitioning column
should say something more like "you specified one column value,
whereas the partitioning key requires two columns"
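
On the first point, the reproduction is simply (table definition
assumed to match the docs example):

    CREATE TABLE measurement_year_month (
        logdate date,
        peaktemp int,
        unitsales int
    ) PARTITION BY RANGE (EXTRACT(YEAR FROM logdate),
                          EXTRACT(MONTH FROM logdate));

    CREATE INDEX ON measurement_year_month (logdate);
    -- ERROR:  cannot create index on partitioned table "measurement_year_month"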


Good work so far, but there is still a very significant amount of work
to do. And as this feature evolves it must now contain full
documentation at every step, otherwise it won't be able to receive
full and fair review. So please make sure each new patch contains docs
changes for that patch.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Passing query string to workers

2017-02-10 Thread Kuntal Ghosh
On Tue, Feb 7, 2017 at 10:19 AM, Rafia Sabih
 wrote:
> Thanks a lot Kuntal for the review, please find attached patch for the
> revised version.
Few comments on the patch:

There are some spacing issues in the code. For example,
+   estate->es_queryString = (char
*)palloc0(strlen(queryDesc->sourceText) + 1);
+   /*Estimate space for query text. */
pgindent might be helpful to track all such changes.

+#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE010)
I'm uncomfortable with declaring the same macro in two
files (parallel.c, execParallel.c). My suggestion would be to move
pgstat_report_activity in ParallelQueryMain instead of
ParallelWorkerMain. Then, you can remove the macro definition from
parallel.c. Thoughts?

And, the value of the macro seems pretty random to me. IMO, it should
be UINT64CONST(0xE007).

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] log_autovacuum_min_duration doesn't log VACUUMs

2017-02-10 Thread Simon Riggs
I guess it's fairly obvious from the title, but
log_autovacuum_min_duration doesn't log VACUUMs, only autovacuums.

What isn't obvious is why that restriction is useful.

I'd say it would be helpful to log all kinds of VACUUM, so we get
similar output however the command was submitted.

So, changes would be

1. Allow logging whether or not it is an autovacuum (attached)

2. Change the name of the parameter to ...
a) log_vacuum_min_duration
b) log_maintenance_min_duration, and have it log CLUSTER, CREATE INDEX etc. as well
c) log_ddl_min_duration, and just time any DDL that takes a long time
We could do any of those and they are all about as easy as one
another, though the last one would be a bigger patch, so a) might be
simplest.
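
E.g. with option a), the setting would read like this in
postgresql.conf (name per the above, not a committed GUC):

    # log any VACUUM, manual or automatic, taking longer than 250ms
    log_vacuum_min_duration = 250ms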

The following patch implements (1), but not yet (2) to allow debate.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


all_vacuums_log_min_duration.v1.patch
Description: Binary data



[HACKERS] Reporting xmin from VACUUMs

2017-02-10 Thread Simon Riggs
We've added xmin info to pg_stat_activity and pg_stat_replication, but
VACUUM doesn't yet report which xmin value it used when it ran.

Small patch to add this info to VACUUM output.
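
For illustration only, the kind of extra detail this adds to VACUUM
VERBOSE output (the exact wording is whatever the patch emits, and the
xmin value here is made up):

    VACUUM VERBOSE t;
    INFO:  vacuuming "public.t"
    DETAIL:  oldest xmin: 12345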

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


vacuum_report_oldestxmin.v1.patch
Description: Binary data



Re: [HACKERS] Documentation improvements for partitioning

2017-02-10 Thread Amit Langote
On 2017/02/10 17:00, Simon Riggs wrote:
> On 10 February 2017 at 07:35, Amit Langote
>  wrote:
> 
>>> A final note, because I'm really familiar with partitioning on Postgres and
>>> other databases, documentation which is clear to me might not be to someone
>>> less familiar with partitioning. Maybe we want another reviewer for that?
>>
>> More eyeballs will only help make this better.
> 
> Given that we already have partitioning feature committed, we really
> need to have the docs committed as well.
> 
> Without claiming I'm happy about this, I think the best way to improve
> the number of eyeballs on this is to commit these docs as is.
> 
> For me, the most important thing is understanding the feature, not
> (yet) discussing what the docs should look like. This is especially
> true if other patches reference the way partitioning works and nobody
> can comment on those patches because they don't understand it.
> 
> Any issues with that?

I agree that getting the proposed documentation changes committed would be
a step ahead.  I saw in the logical replication thread that dealing with
partitioned tables without the docs explaining what they are has been
difficult.  Hopefully the proposed documentation improvements help make
progress in that regard.  Partitioned tables, at this point, have certain
limitations which affect their interaction with other features (old or new);
documenting those limitations will be helpful not only to the users
deciding whether to start using the new partitioned tables right away, but
also to the developers of other features who want to understand what
partitioned tables are.

Thanks,
Amit






Re: [HACKERS] Documentation improvements for partitioning

2017-02-10 Thread Simon Riggs
On 10 February 2017 at 07:35, Amit Langote
 wrote:

>> A final note, because I'm really familiar with partitioning on Postgres and
>> other databases, documentation which is clear to me might not be to someone
>> less familiar with partitioning. Maybe we want another reviewer for that?
>
> More eyeballs will only help make this better.

Given that we already have partitioning feature committed, we really
need to have the docs committed as well.

Without claiming I'm happy about this, I think the best way to improve
the number of eyeballs on this is to commit these docs as is.

For me, the most important thing is understanding the feature, not
(yet) discussing what the docs should look like. This is especially
true if other patches reference the way partitioning works and nobody
can comment on those patches because they don't understand

Any issues with that?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

