Re: [HACKERS] assessing parallel-safety

2015-03-14 Thread Noah Misch
On Thu, Mar 12, 2015 at 11:21:37AM -0400, Robert Haas wrote:
> On Thu, Feb 19, 2015 at 1:19 AM, Noah Misch  wrote:
> > Rereading my previous message, I failed to make the bottom line clear: I
> > recommend marking eqsel etc. PROPARALLEL_UNSAFE but *not* checking an
> > estimator's proparallel before calling it in the planner.
> 
> But what do these functions do that is actually unsafe?

They call the oprcode function of any operator naming them as an estimator.
Users can add operators that use eqsel() as an estimator, and we have no bound
on what those operators' oprcode can do.  (In practice, parallel-unsafe
operators using eqsel() as an estimator will be rare.)
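
For reference, the mechanism is roughly this (paraphrased from
var_eq_const() in selfuncs.c, with the surrounding variables elided):

FmgrInfo	eqproc;
int			i;

/*
 * "operator" is whichever operator named eqsel() as its restriction
 * estimator; its oprcode can be arbitrary user-defined code, and we
 * call it once per most-common value taken from pg_statistic.
 */
fmgr_info(get_opcode(operator), &eqproc);
for (i = 0; i < nvalues; i++)
{
	if (DatumGetBool(FunctionCall2Coll(&eqproc, collation,
									   values[i], constval)))
		break;				/* the constant is this MCV */
}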

> >> Hmm.  Altogether prohibiting XLogInsert() in parallel mode seems
> >> unwise, because it would foreclose heap_page_prune_opt() in workers.
> >> I realize there's separate conversation about whether pruning during
> >> SELECT queries is good policy, but in the interested of separating
> >> mechanism from policy, and in the sure knowledge that allowing at
> >> least some writes in parallel mode is certainly going to be something
> >> people will want, it seems better to look into propagating
> >> XactLastRecEnd.
> >
> > Good points; that works for me.
> 
> The key design decision here seems to be this: How eagerly do we need
> to synchronize XactLastRecEnd?  What exactly goes wrong if it isn't
> synchronized?  For example, if the value were absolutely critical in
> all circumstances, one could imagine storing a shared XactLastRecEnd
> in shared memory.  This doesn't appear to be the case: the main
> requirement is that we definitely need an up-to-date value at commit
time.  Also, at abort time, we don't really need the value for anything
> critical, but it's worth kicking the WAL writer so that any
> accumulated WAL gets flushed.

Exactly.  At entry to RecordTransactionCommit(), XactLastRecEnd must point
past the end of all records whose replay is required to satisfy the user's
expectation of transaction durability.  At other times, harm from its value
being wrong is negligible.  I do suggest adding a comment to its definition
explaining when one can rely on it.
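
Something along these lines at the variable's definition in xlog.c would
do (the wording is only a sketch of what such a comment might say):

/*
 * XactLastRecEnd points past the end of the last WAL record written by
 * this backend.  The only hard requirement is at entry to
 * RecordTransactionCommit(): there it must point past the end of every
 * record whose replay is required to satisfy the user's expectation of
 * transaction durability.  At all other times it is merely advisory
 * (e.g. for nudging the WAL writer at abort), and a stale value is
 * harmless.
 */
XLogRecPtr	XactLastRecEnd = InvalidXLogRecPtr;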

> Here's an incremental patch - which I will incorporate into the
> parallel mode patch if it seems about right to you - that tries to
> tackle all this.

I reviewed this.  Minor things:

> @@ -73,10 +74,15 @@ typedef struct FixedParallelState
>   /* Entrypoint for parallel workers. */
>   parallel_worker_main_type   entrypoint;
>  
> - /* Track whether workers have attached. */
> + /* Mutex protects remaining fiedlds. */

Typo.

> @@ -2564,10 +2578,19 @@ AbortTransaction(void)
>  * worker, skip this; the user backend must be the one to write the abort
>  * record.
>  */
> - if (parallel)
> - latestXid = InvalidTransactionId;
> - else
> + if (!parallel)
>   latestXid = RecordTransactionAbort(false);
> + else
> + {
> + latestXid = InvalidTransactionId;
> +
> + /*
> +  * Since the parallel master won't get our value of XactLastRecEnd in
> +  * this case, we nudge the WAL writer ourselves.  See related comments
> +  * in RecordTransactionAbort for why this matters.
> +  */
> + XLogSetAsyncXactLSN(XactLastRecEnd);

RecordTransactionAbort() skips this for subtransaction aborts.  I would omit
it here, because a parallel worker abort is, in this respect, more like a
subtransaction abort than like a top-level transaction abort.
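
That is, the branch would reduce to something like this (a sketch of the
recommendation, not the actual patch):

if (!parallel)
	latestXid = RecordTransactionAbort(false);
else
{
	/*
	 * A parallel worker abort is like a subtransaction abort: the master
	 * writes the abort record and handles the durability bookkeeping, so
	 * we neither write WAL here nor nudge the WAL writer.
	 */
	latestXid = InvalidTransactionId;
}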


While studying README.parallel from parallel-mode-v7.patch to understand the
concept of worker commit/abort, this part gave me the wrong idea:

> +Transaction Integration
> +=======================
...
> +Transaction commit or abort requires careful coordination between backends.
> +Each backend has its own resource owners: buffer pins, catcache or relcache
> +reference counts, tuple descriptors, and so on are managed separately by each
> +backend, and each backend is separately responsible for releasing such
> +resources.  Generally, the commit or abort of a parallel worker is much like
> +a top-transaction commit or abort, but there are a few exceptions.  Most
> +importantly:
> +
> +  - No commit or abort record is written; the initiating backend is
> +responsible for this.
> +
> +  - Cleanup of pg_temp namespaces is not done.  The initiating backend is
> +responsible for this, too.
> +
> +The master kills off all remaining workers as part of commit or abort
> +processing.  It must not only kill the workers but wait for them to actually

I took this to mean that workers normally persist until the master commits a
transaction.  Rather, their lifecycle is like the lifecycle of a buffer pin.
When an error interrupts a parallel operation, transaction abort destroys
workers.  Otherwise, the code driving the specific parallel task destroys them
as early as is practical.  (Strictly to c

Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-14 Thread Andreas Karlsson

On 03/15/2015 04:25 AM, Andreas Karlsson wrote:

> Nice. You will also want to apply the attached patch which fixes support
> for the --no-tablespaces flag.

Just realized that --no-tablespaces needs to be fixed for pg_restore too.

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-14 Thread Andreas Karlsson

On 03/14/2015 05:59 PM, Julien Tachoires wrote:

> On 14/03/2015 16:10, Andreas Karlsson wrote:
>
>> Noticed a bug when playing around some more with pg_dump. It does not
>> seem to dump a custom tablespace for the table and the default
>> tablespace for the TOAST table correctly. It forgets about the TOAST
>> table being in pg_default.
>
> Good catch. This is now fixed.

Nice. You will also want to apply the attached patch which fixes support
for the --no-tablespaces flag.


--
Andreas Karlsson
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bc4a0b1..8ef2df9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -13746,7 +13746,8 @@ dumpTableSchema(Archive *fout, DumpOptions *dopt, TableInfo *tbinfo)
 			destroyPQExpBuffer(result);
 
 			/* Change TOAST tablespace */
-			if (strcmp(tbinfo->reltablespace, tbinfo->reltoasttablespace) != 0)
+			if (!dopt->outputNoTablespaces &&
+				strcmp(tbinfo->reltablespace, tbinfo->reltoasttablespace) != 0)
 			{
 appendPQExpBuffer(q, "\nALTER MATERIALIZED VIEW %s ",
 	fmtId(tbinfo->dobj.name));
@@ -14032,8 +14033,9 @@ dumpTableSchema(Archive *fout, DumpOptions *dopt, TableInfo *tbinfo)
 	}
 
 	/* Change TOAST tablespace */
-	if (strcmp(tbinfo->reltoasttablespace, tbinfo->reltablespace) != 0 &&
-			tbinfo->relkind == RELKIND_RELATION)
+	if (!dopt->outputNoTablespaces &&
+		strcmp(tbinfo->reltoasttablespace, tbinfo->reltablespace) != 0 &&
+		tbinfo->relkind == RELKIND_RELATION)
 	{
 		appendPQExpBuffer(q, "\nALTER TABLE %s ",
 			fmtId(tbinfo->dobj.name));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2015-03-14 Thread Alvaro Herrera
Amit Kapila wrote:

> Getting the below linking error with Asserts enabled in a Windows build.
> 
> 1>xlogreader.obj : error LNK2019: unresolved external symbol
> ExceptionalCondition referenced in function XLogReadRecord
> 1>.\Debug\pg_rewind\pg_rewind.exe : fatal error LNK1120: 1 unresolved
> externals
> 
> Am I doing anything wrong while building?

Assert() gets defined in terms of ExceptionalCondition if building in
!FRONTEND, so it seems like the compile flags are wrong for pg_rewind's
copy of xlogreader.
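
For reference, the arrangement in c.h is roughly this (paraphrased, not
verbatim):

#ifdef FRONTEND
#include <assert.h>
#define Assert(condition)	assert(condition)
#else
/* backend: route through the server's error machinery */
#define Assert(condition) \
	do { \
		if (!(condition)) \
			ExceptionalCondition(#condition, "FailedAssertion", \
								 __FILE__, __LINE__); \
	} while (0)
#endif

So an xlogreader.c compiled without -DFRONTEND (and with asserts on)
references a symbol that only exists in the server binary.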

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-14 Thread Kouhei Kaigai
> On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> Another bit of this that I think we could commit without fretting
> >> about it too much is the code adding set_join_pathlist_hook.  This is
> >> - I think - analogous to set_rel_pathlist_hook, and like that hook,
> >> could be used for other purposes than custom plan generation - e.g. to
> >> delete paths we do not want to use.  I've extracted this portion of
> >> the patch and adjusted the comments; if there are no objections, I
> >> will commit this bit also.
> >
> > I don't object to the concept, but I think that is a pretty bad place
> > to put the hook call: add_paths_to_joinrel is typically called multiple
> > (perhaps *many*) times per joinrel and thus this placement would force
> > any user of the hook to do a lot of repetitive work.
> 
> Interesting point.  I guess the question is whether some or all
> callers are going to actually *want* a separate call for each
> invocation of add_paths_to_joinrel(), or whether they'll be happy to
> operate on the otherwise-complete path list.  It's true that if your
> goal is to delete paths, it's probably best to be called just once
> after the path list is complete, and there might be a use case for
> that, but I guess it's less useful than for baserels.  For a baserel,
> as long as you don't nuke the sequential-scan path, there is always
> going to be a way to complete the plan; so this would be a fine way to
> implement a disable-an-index extension.  But for joinrels, it's not so
> easy to rule out, say, a hash-join here.  Neither hook placement is
> much good for that; the path you want to get rid of may have already
> dominated paths you want to keep.
> 
From the standpoint of extension development, I'm uncertain whether a hook
at standard_join_search() could easily reproduce the information needed to
compute alternative paths, the way a hook at add_paths_to_joinrel() can.
(Please correct me if I've misunderstood.)

For example, it is not obvious which paths form the inner and outer sides
of the joinrel for which the custom-scan provider is trying to add an
alternative path. The extension would probably need to dig the paths of
the source relations out of the join_rel_level[] array. Also, how do we
get at the SpecialJoinInfo? It contains the information needed to identify
the required join type (like JOIN_LEFT), but the extension would need to
search join_info_list by relids again if the hook were located at
standard_join_search(). Even if the number of hook invocations is larger
with the hook in add_paths_to_joinrel(), it allows extensions to be
designed more simply, I think.
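
For instance, just to recover the SpecialJoinInfo that
add_paths_to_joinrel() receives as an argument, a hook at
standard_join_search() would need something like the following (a rough
sketch; the real matching logic in make_join_rel() is more careful):

static SpecialJoinInfo *
find_sjinfo_for_joinrel(PlannerInfo *root, RelOptInfo *joinrel)
{
	ListCell   *lc;

	foreach(lc, root->join_info_list)
	{
		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);

		/* does this special join fall entirely within the joinrel? */
		if (bms_is_subset(sjinfo->min_lefthand, joinrel->relids) &&
			bms_is_subset(sjinfo->min_righthand, joinrel->relids))
			return sjinfo;		/* gives jointype, semi_operators, ... */
	}
	return NULL;				/* treat as a plain inner join */
}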

> Suppose you want to add paths - e.g. you have an extension that goes
> and looks for a materialized view that matches this subtree of the
> query, and if it finds one, it substitutes a scan of the materialized
> view for a scan of the baserel.  Or, as in KaiGai's case, you have an
> extension that can perform the whole join in GPU-land and produce the
> same results we would have gotten via normal execution.  Either way,
> you want - and this is the central point of the whole patch here - to
> inject a scan path into a joinrel.  It is not altogether obvious to me
> what the best placement for this is.  In the materialized view case,
> you probably need a perfect match between the baserels in the view and
> the baserels in the joinrel to do anything.  There's no point in
> re-checking that for every innerrels/outerrels combination.  I don't
> know enough about the GPU case to reason about it intelligently; maybe
> KaiGai can comment.
>
In the GPU case, the extension will add alternative paths based on the
hash-join and nested-loop algorithms, each with its own cost estimate, as
long as the device can execute the join condition. It expects the planner
(set_cheapest) to choose the best path among the built-in and additional
ones. So it would be more reasonable, to me, if the extension could use
the same common infrastructure the built-in logic
(hash-join/merge-join/nested-loop) uses to compute its cost estimates.
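
As a sketch, with the hook at add_paths_to_joinrel() the extension's job
stays simple (the hook signature below is an assumption about how
set_join_pathlist_hook could look, and the gpu_* helpers and path-methods
table are hypothetical):

static void
gpujoin_pathlist_hook(PlannerInfo *root, RelOptInfo *joinrel,
					  RelOptInfo *outerrel, RelOptInfo *innerrel,
					  JoinType jointype, JoinPathExtraData *extra)
{
	CustomPath *cpath;

	/* bail out unless the device can evaluate every join qual */
	if (!gpu_can_execute_quals(extra->restrictlist))
		return;

	cpath = makeNode(CustomPath);
	cpath->path.pathtype = T_CustomScan;
	cpath->path.parent = joinrel;
	cpath->methods = &gpujoin_path_methods;	/* hypothetical callbacks */

	/* estimate startup/total cost the same way the built-in joins do */
	gpu_cost_hashjoin(root, cpath, outerrel, innerrel, extra);

	add_path(joinrel, &cpath->path);	/* set_cheapest() arbitrates later */
}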

> But there's another possible approach: suppose that
> join_search_one_level, after considering left-sided and right-sided
> joins and after considering bushy joins, checks whether every relation
> it's got is from the same foreign server, and if so, asks that foreign
> server whether it would like to contribute any paths. Would that be
> better or worse?  A disadvantage is that if you've got something like
> A LEFT JOIN B LEFT JOIN C LEFT JOIN D LEFT JOIN E LEFT JOIN F LEFT
> JOIN G LEFT JOIN H LEFT JOIN I but none of the joins can be pushed
> down (say, each join clause calls a non-pushdown-safe function) you'll
> end up examining a pile of joinrels - at every level of the join tree
> - and individually rejecting each one.  With the
> build-it-up-incrementally approach, you'll figure that all out at
> level 2, and then after that there's nothing to do but give up
> quickly.  On the other hand, I'm afraid the incremental approach might
> miss a trick: consider small LEFT JOIN (big INNER JOIN huge

Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-14 Thread Pavel Stehule
2015-03-14 19:33 GMT+01:00 Pavel Stehule :

>
>
> 2015-03-13 23:43 GMT+01:00 Josh Berkus :
>
>> On 03/13/2015 10:01 AM, Pavel Stehule wrote:
>> >
>> > 2015-03-13 17:39 GMT+01:00 Robert Haas:
>> >
>> > On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule wrote:
>> > > we found a possible bug in pg_dump. It raises an error only when
>> > > all of the specified tables don't exist. When it finds any table,
>> > > it ignores the other missing ones.
>> > >
>> > > /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null; echo $?
>> > >
>> > > Foo doesn't exist - it creates a broken backup due to the missing
>> > > "Foo" table
>> > >
>> > > [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t omegaa -s postgres > /dev/null; echo $?
>> > > pg_dump: No matching tables were found
>> > > 1
>> > >
>> > > Is it OK? I think it is potentially dangerous. Any explicitly
>> > > specified table should exist.
>> >
>> > Keep in mind that the argument to -t is a pattern, not just a table
>> > name.  I'm not sure how much that affects the calculus here, but it's
>> > something to think about.
>> >
>> > yes, that makes sense, although now I don't think it was a good idea.
>> > There should be some difference between a table name and a table
>> > pattern.
>>
>> There was a long discussion about this when the feature was introduced
>> in 7.3(?) IIRC.  Changing it now would break backwards-compatibility, so
>> you'd really need to introduce a new option.
>>
>> And, if you introduce a new option which is a specific table name, would
>> you require a schema name or not?
>>
>
> We can use the same rules we use for the STRICT clause elsewhere. The
> option should match exactly one table. If qualification is not
> necessary, the bare name is enough; otherwise a qualified name is
> required.
>
> What would be a good name for this option? Maybe a long name only -
> something like 'required-table'?
>

Here is another variant, which I hope is better than the previous one. We
can introduce a new long option "--strict". With this option active, every
pattern specified by the -t option has to identify exactly one table. It
can be used for other "should exist" patterns too - schemas. An initial
implementation is attached.

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>>
>> --
>> Josh Berkus
>> PostgreSQL Experts Inc.
>> http://pgexperts.com
>>
>
>
commit 3d3c6f6583c44a06633aea7978ad0631fed1679b
Author: Pavel Stehule 
Date:   Sun Mar 15 00:53:49 2015 +0100

initial

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index fdfb431..2a0d50f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -134,7 +134,8 @@ static void expand_schema_name_patterns(Archive *fout,
 			SimpleOidList *oids);
 static void expand_table_name_patterns(Archive *fout,
 		   SimpleStringList *patterns,
-		   SimpleOidList *oids);
+		   SimpleOidList *oids,
+		   bool strict_table_names);
 static NamespaceInfo *findNamespace(Archive *fout, Oid nsoid, Oid objoid);
 static void dumpTableData(Archive *fout, DumpOptions *dopt, TableDataInfo *tdinfo);
 static void refreshMatViewData(Archive *fout, TableDataInfo *tdinfo);
@@ -253,6 +254,8 @@ static char *get_synchronized_snapshot(Archive *fout);
 static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
 static void setupDumpWorker(Archive *AHX, DumpOptions *dopt, RestoreOptions *ropt);
 
+static int			strict_table_names = false;
+
 
 int
 main(int argc, char **argv)
@@ -332,6 +335,7 @@ main(int argc, char **argv)
 		{"section", required_argument, NULL, 5},
 		{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
 		{"snapshot", required_argument, NULL, 6},
+		{"strict", no_argument, &strict_table_names, 1},
 		{"use-set-session-authorization", no_argument, &dopt.use_setsessauth, 1},
 		{"no-security-labels", no_argument, &dopt.no_security_labels, 1},
 		{"no-synchronized-snapshots", no_argument, &dopt.no_synchronized_snapshots, 1},
@@ -700,15 +704,18 @@ main(int argc, char **argv)
 	if (table_include_patterns.head != NULL)
 	{
 		expand_table_name_patterns(fout, &table_include_patterns,
-   &table_include_oids);
+   &table_include_oids,
+   strict_table_names);
 		if (table_include_oids.head == NULL)
 			exit_horribly(NULL, "No matching tables were found\n");
 	}
 	expand_table_name_patterns(fout, &table_exclude_patterns,
-			   &table_exclude_oids);
+			   &table_exclude_oids,
+			   false);
 
 	expand_table_name_patterns(fout, &tabledata_exclude_patterns,
-			   &tabledata_exclude_oids);
+			   &tabledata_exclude_oids,
+			   false);
 
 	/* non-matching exclusion patterns aren't an error */
 
@@ -1173,13 +1180,30 @@ expand_schema_name_patterns(Archive *fout,
 	destroyPQExpBuffer(query);
 }
 
+static

[HACKERS] Faster setup_param_list() in plpgsql

2015-03-14 Thread Tom Lane
Given the rather hostile response I got to
http://www.postgresql.org/message-id/4146.1425872...@sss.pgh.pa.us
I was not planning to bring this topic up again until 9.6 development
starts.  However, as I said in that thread, this work is getting done now
because of $dayjob deadlines, and I've realized that it would actually
make a lot of sense to apply it before my expanded-arrays patch that's
pending in the current commitfest.  So I'm going to put on my flameproof
long johns and post it anyway.  I will add it to the 2015-06 commitfest,
but I'd really rather deal with it now ...

What this patch does is to remove setup_param_list() overhead for the
common case of PLPGSQL_DTYPE_VAR variables (ie, any non-composite type).
It does that by the expedient of keeping the ParamExternData image of such
a variable valid at all times.  That adds a few cycles to assignments to
these variables, but removes more cycles from each use of them.  Unless
you believe that common plpgsql functions contain lots of dead stores,
this is a guaranteed win overall.
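
The assignment-side bookkeeping amounts to something like this (a
simplified sketch, not the patch itself; the real code must also handle
datum setup and type details):

static void
assign_simple_var(PLpgSQL_execstate *estate, PLpgSQL_var *var,
				  Datum newvalue, bool isnull)
{
	/* the variable's permanently-assigned parameter slot */
	ParamExternData *prm = &estate->paramLI->params[var->dno];

	var->value = newvalue;
	var->isnull = isnull;

	/*
	 * Refresh the ParamExternData image on every store, so that
	 * setup_param_list() never needs to visit DTYPE_VAR datums at all.
	 */
	prm->value = newvalue;
	prm->isnull = isnull;
	prm->pflags = PARAM_FLAG_CONST;
	prm->ptype = var->datatype->typoid;
}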

I'm seeing about 10% overall speedup (vs HEAD, with casserts off) for
realistic simple plpgsql logic, such as this test case:

create or replace function typicalspeed(n int) returns bigint as $$
declare res bigint := 0;
begin
  for i in 1 .. n loop
res := res + i;
if i % 10 = 0 then res := res / 10; end if;
  end loop;
  return res;
end
$$ language plpgsql strict stable;

For functions with lots of variables (or even just lots of expressions,
since each one of those is a PLpgSQL_datum too), it's even more helpful.
I have found no cases where it makes things worse, at least to within
measurement error (run-to-run variability is a percent or two for me).

The reason I would like to apply this now rather than wait for 9.6
is that by making parameter management more explicit it removes the
need for the klugy changes in exec_eval_datum() that exist in
http://www.postgresql.org/message-id/22945.1424982...@sss.pgh.pa.us
Instead, we could leave exec_eval_datum() alone and substitute read-only
pointers only when manufacturing the parameter image of an expanded-object
variable.  If we do it in the other order then we'll be making an API
change for exec_eval_datum() in 9.5 (assuming said patch gets in) and then
reverting it come 9.6.

So there you have it.  Now, where'd I put those long johns ...

regards, tom lane
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 650cc48..9c201a7 100644
*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
*** static Node *make_datum_param(PLpgSQL_ex
*** 104,109 
--- 104,111 
  static PLpgSQL_row *build_row_from_class(Oid classOid);
  static PLpgSQL_row *build_row_from_vars(PLpgSQL_variable **vars, int numvars);
  static PLpgSQL_type *build_datatype(HeapTuple typeTup, int32 typmod, Oid collation);
+ static void plpgsql_start_datums(void);
+ static void plpgsql_finish_datums(PLpgSQL_function *function);
  static void compute_function_hashkey(FunctionCallInfo fcinfo,
  		 Form_pg_proc procStruct,
  		 PLpgSQL_func_hashkey *hashkey,
*** do_compile(FunctionCallInfo fcinfo,
*** 371,383 
  	plpgsql_ns_init();
  	plpgsql_ns_push(NameStr(procStruct->proname));
  	plpgsql_DumpExecTree = false;
! 
! 	datums_alloc = 128;
! 	plpgsql_nDatums = 0;
! 	/* This is short-lived, so needn't allocate in function's cxt */
! 	plpgsql_Datums = MemoryContextAlloc(compile_tmp_cxt,
! 	 sizeof(PLpgSQL_datum *) * datums_alloc);
! 	datums_last = 0;
  
  	switch (function->fn_is_trigger)
  	{
--- 373,379 
  	plpgsql_ns_init();
  	plpgsql_ns_push(NameStr(procStruct->proname));
  	plpgsql_DumpExecTree = false;
! 	plpgsql_start_datums();
  
  	switch (function->fn_is_trigger)
  	{
*** do_compile(FunctionCallInfo fcinfo,
*** 758,767 
  	function->fn_nargs = procStruct->pronargs;
  	for (i = 0; i < function->fn_nargs; i++)
  		function->fn_argvarnos[i] = in_arg_varnos[i];
! 	function->ndatums = plpgsql_nDatums;
! 	function->datums = palloc(sizeof(PLpgSQL_datum *) * plpgsql_nDatums);
! 	for (i = 0; i < plpgsql_nDatums; i++)
! 		function->datums[i] = plpgsql_Datums[i];
  
  	/* Debug dump for completed functions */
  	if (plpgsql_DumpExecTree)
--- 754,761 
  	function->fn_nargs = procStruct->pronargs;
  	for (i = 0; i < function->fn_nargs; i++)
  		function->fn_argvarnos[i] = in_arg_varnos[i];
! 
! 	plpgsql_finish_datums(function);
  
  	/* Debug dump for completed functions */
  	if (plpgsql_DumpExecTree)
*** plpgsql_compile_inline(char *proc_source
*** 804,810 
  	PLpgSQL_variable *var;
  	int			parse_rc;
  	MemoryContext func_cxt;
- 	int			i;
  
  	/*
  	 * Setup the scanner input and error info.  We assume that this function
--- 798,803 
*** plpgsql_compile_inline(char *proc_source
*** 860,870 
  	plpgsql_ns_init();
  	plpgsql_ns_push(func_name);
  	plpgsql_DumpExecTree = false;
! 
! 	datums_allo

Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-14 Thread Peter Geoghegan
On Fri, Mar 13, 2015 at 7:51 PM, Andrew Gierth
 wrote:
> Since ApplySortComparator returns int, and "compare" is used to store
> the return value of comparetup_datum which is also declared int, this
> seems inappropriate even as a "stylistic tweak".

Consistency matters. That change, and the other changes to the
structure of comparetup_datum(), make it match the other comparetup_*
cases more closely. I don't think it's a big deal, and I'm surprised
you're calling it out specifically.

> Also, your changes to the block comment for SortTuple now hide the fact
> that datum1 is potentially the abbreviated value in tuple as well as
> single-Datum cases.  Here are the versions for comparison (mine is
> first):

I don't think that I've failed to convey that, even just based on the
diff you included. My remarks apply to the new case, datum sorting,
where there is a new convention that must consider whether the type is
pass-by-value or pass-by-reference. Surely if it's not clear that
abbreviation is used for the heap tuple case (I think that's what you
mean), then that's a problem with the extant code in the master branch
that has nothing to do with this.

The whole point of this patch is that virtually every reasonable case
for abbreviation is now supported. That uniformity clarifies things.
So of course sortSupport (and abbreviation) is supported by every case
other than the hash case, where it clearly cannot apply. We just
needed to add a note on what the datum sort case does with
pass-by-value/pass-by-reference-ness with respect to abbreviation,
alongside the existing discussion of that, since that becomes a
special case.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-14 Thread Pavel Stehule
2015-03-13 23:43 GMT+01:00 Josh Berkus :

> On 03/13/2015 10:01 AM, Pavel Stehule wrote:
> >
> >
> > 2015-03-13 17:39 GMT+01:00 Robert Haas:
> >
> > On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule wrote:
> > > we found a possible bug in pg_dump. It raises an error only when all
> > > of the specified tables don't exist. When it finds any table, it
> > > ignores the other missing ones.
> > >
> > > /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null; echo $?
> > >
> > > Foo doesn't exist - it creates a broken backup due to the missing
> > > "Foo" table
> > >
> > > [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t omegaa -s postgres > /dev/null; echo $?
> > > pg_dump: No matching tables were found
> > > 1
> > >
> > > Is it OK? I think it is potentially dangerous. Any explicitly
> > > specified table should exist.
> >
> > Keep in mind that the argument to -t is a pattern, not just a table
> > name.  I'm not sure how much that affects the calculus here, but it's
> > something to think about.
> >
> >
> > yes, that makes sense, although now I don't think it was a good idea.
> > There should be some difference between a table name and a table
> > pattern.
>
> There was a long discussion about this when the feature was introduced
> in 7.3(?) IIRC.  Changing it now would break backwards-compatibility, so
> you'd really need to introduce a new option.
>
> And, if you introduce a new option which is a specific table name, would
> you require a schema name or not?
>

We can use the same rules we use for the STRICT clause elsewhere. The
option should match exactly one table. If qualification is not necessary,
the bare name is enough; otherwise a qualified name is required.

What would be a good name for this option? Maybe a long name only -
something like 'required-table'?

Regards

Pavel


>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-14 Thread Julien Tachoires
On 14/03/2015 16:10, Andreas Karlsson wrote:
> Noticed a bug when playing round some more with pg_dump. It does not 
> seem to dump custom table space for the table and default table space 
> for the toast correctly. It forgets about the toast table being in 
> pg_default.

Good catch. This is now fixed.

--
Julien



set_toast_tablespace_v0.14.patch.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ranlib bleating about dirmod.o being empty

2015-03-14 Thread Tom Lane
I'm getting rather tired of reading these warning messages in OS X builds:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: libpgport.a(dirmod.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: libpgport_srv.a(dirmod_srv.o) has no symbols

The reason for this warning is that dirmod.o contains no functions that
aren't Windows-specific.  As such, it seems like putting it into the
list of "always compiled" port modules is really a mistake.  Any
objections to switching it to be built only on Windows?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-14 Thread Simon Riggs
On 13 March 2015 at 15:41, Kevin Grittner  wrote:

> The feedback was generally fairly positive except for the fact that
> snapshot "age" (for purposes of being too old) was measured in
> transaction IDs assigned.  There seemed to be a pretty universal
> feeling that this needed to be changed to a time-based setting.

-1 for a time based setting.

After years of consideration, bloat is now controllable by altering
the size of the undo tablespace.

I think PostgreSQL needs something size-based also. It would need some
estimation to get it to work like that, true, but it is actually the
size of the bloat we care about, not the time. So we should be
thinking in terms of limits that we actually care about.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-14 Thread Simon Riggs
On 13 March 2015 at 13:17, Robert Haas  wrote:
> On Fri, Mar 13, 2015 at 8:52 AM, Simon Riggs  wrote:
>> On 15 February 2015 at 00:19, Kevin Grittner  wrote:
>>> What they wanted was what happened in the other database product --
>>> if a snapshot got sufficiently old that cleaning up the MVCC data
>>> was a problem *and* the snapshot was used again *and* it read a
>>> page which had been modified far enough back that it was not
>>> possible to return correct data, then they wanted to receive a
>>> "snapshot too old" error.  I wrote a patch to do that...
>>
>> So, please lets see the patch. It seems useful for core Postgres.
>
> It was submitted here:
>
> http://www.postgresql.org/message-id/136937748.3364317.1423964815320.javamail.ya...@mail.yahoo.com

Thanks. I have +1'd that patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow "snapshot too old" error, to prevent bloat

2015-03-14 Thread Simon Riggs
On 17 February 2015 at 06:35, Magnus Hagander  wrote:
>
> On Feb 17, 2015 12:26 AM, "Andres Freund"  wrote:
>>
>> On 2015-02-16 16:35:46 -0500, Bruce Momjian wrote:
>> > It seems we already have a mechanism in place that allows tuning of
>> > query cancel on standbys vs. preventing standby queries from seeing old
>> > data, specifically
>> > max_standby_streaming_delay/max_standby_archive_delay.  We obsessed
>> > about how users were going to react to these odd variables, but there
>> > has been little negative feedback.
>>
>> FWIW, I think that's a somewhat skewed perception. I think it was right to
>> introduce those, because we didn't really have any alternatives.
>>
>> But max_standby_streaming_delay, max_standby_archive_delay and
>> hot_standby_feedback are among the most frequent triggers for questions
>> and complaints that I/we see.
>>
>
> Agreed.
>
> And a really bad one used to be vacuum_defer_cleanup_age, because of
> confusing units amongst other things. Which in turn seems fairly close to
> Kevin's suggestions, unfortunately.

I agree with Bruce that we already have a mechanism similar to this
for Hot Standby, so it should therefore be OK to have it for normal
running when users desire it. The key difference here is that in this
patch we use the LSN to look for changed data blocks, allowing queries
to proceed for longer than they would do on Hot Standby where they
currently just get blown away regardless. I like the proposal in this
patch but it is more extreme than the mechanism Hot Standby provides.
(If we implement this then I would want to see it work for Hot Standby
also so we can keep the mechanisms in step, I think that is too late
in the day to add that for 9.5.)

I also agree with Andres and Magnus in saying that in the current
definition of the tunable parameter it would be hard to use.

In response to Tom's perspective that it is the single most annoying
feature of Oracle, I agree. It just happens we have a similar problem:
bloat.

Having a parameter to define "maximum acceptable bloat" would be a
very useful thing in PostgreSQL.  IIRC other DBMS had a lot of work to
make "snapshot too old" occur in controllable circumstances.  So I
would call having a very crap parameter like "old_snapshot_limit" a
good start, but clearly not the end point of an initiative to improve
our control of bloat.

+1 to include this patch in 9.5, as long as there is agreement to
improve this in the future.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-14 Thread Andreas Karlsson
Noticed a bug when playing around some more with pg_dump. It does not
seem to dump a custom tablespace for the table and the default tablespace
for the TOAST table correctly. It forgets about the TOAST table being in
pg_default.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-14 Thread Andreas Karlsson

On 03/13/2015 11:48 AM, Julien Tachoires wrote:

> Here is a new version rebased against head and considering Andreas' last
> comments:
>
>    - New regression tests about the fact that a notice is raised when
>      ALTER TABLE SET TOAST TABLESPACE is done for a non-TOASTed table.
>    - New regression tests to check that a TOAST index follows the TOAST
>      table when it's moved.
>    - Some fixes in the documentation.
>    - psql's '\d' command now shows both table and TOAST table tablespaces
>      when they are different, even if one of them is pg_default.
>    - Patch added to the next open commitfest:
>      https://commitfest.postgresql.org/5/188/

Nice, I have no remaining comments on this patch other than some
incorrect mixing of whitespace in the indentation of the "Case when
TOAST and table tablespaces are different and als" comment.


Once that has been fixed I would say this patch is technically ready for 
committer, but since it is in the open commitfest I do not think it will 
get committed any time soon.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2015-03-14 Thread Petr Jelinek

On 22/02/15 01:59, Petr Jelinek wrote:

> On 21/02/15 22:09, Andrew Dunstan wrote:
>
>> On 02/16/2015 09:05 PM, Petr Jelinek wrote:
>>
>>> I found one more issue with the 1.2--1.3 upgrade script, the DROP
>>> FUNCTION pg_stat_statements(); should be DROP FUNCTION
>>> pg_stat_statements(bool); since in 1.2 the function identity has
>>> changed.
>>
>> I think all the outstanding issues are fixed in this patch.
>
> I think PGSS_FILE_HEADER should be also updated, otherwise it's good.

I see you marked this as 'needs review', I am marking it as 'ready for
committer' as it looks good to me.

Just don't forget to update the PGSS_FILE_HEADER while committing (I
think that one can be treated the same way as catversion, i.e. be
updated by the committer and not in the patch submission).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2015-03-14 Thread Amit Kapila
On Fri, Mar 13, 2015 at 1:13 AM, Heikki Linnakangas  wrote:
>
> The cause was a silly typo in truncate_target_file:
>
>> @@ -397,7 +397,7 @@ truncate_target_file(const char *path, off_t newsize)
>>
>> 	snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
>>
>> -	fd = open(path, O_WRONLY, 0);
>> +	fd = open(dstpath, O_WRONLY, 0);
>> 	if (fd < 0)
>> 		pg_fatal("could not open file \"%s\" for truncation: %s\n",
>> 				 dstpath, strerror(errno));
>
>
> Attached is a new version of the patch, including that fix, and rebased
> over current git master.
>

I have verified that the new patch has fixed the problem.

A few more observations:

Getting the below linking error with Asserts enabled in a Windows build.

1>xlogreader.obj : error LNK2019: unresolved external symbol
ExceptionalCondition referenced in function XLogReadRecord
1>.\Debug\pg_rewind\pg_rewind.exe : fatal error LNK1120: 1 unresolved
externals

Am I doing anything wrong while building?

2.
msvc\clean.bat has the below way to clean xlogreader.c for pg_xlogdump;
shouldn't something similar be required for pg_rewind?

REM clean up files copied into contrib\pg_xlogdump
if exist contrib\pg_xlogdump\xlogreader.c del /q contrib\pg_xlogdump\xlogreader.c
for %%f in (contrib\pg_xlogdump\*desc.c) do if not %%f==contrib\pg_xlogdump\rmgrdesc.c del /q %%f

3.
+void
+remove_target(file_entry_t *entry)
{
..
+	case FILE_TYPE_REGULAR:
+		remove_target_symlink(entry->path);
+		break;
+
+	case FILE_TYPE_SYMLINK:
+		remove_target_file(entry->path);
+		break;
}

It seems the function calls *_symlink and *_file are reversed. In reality
it might not matter much because the code of the two functions is the
same, but they might still diverge in the future.

4.
The copyright notices vary in the years they cover:

+ * Copyright (c) 2010-2015, PostgreSQL Global Development Group
+ * Copyright (c) 2013-2015, PostgreSQL Global Development Group

+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group

Is there a particular reason for that?

5.
+ * relation files. Other forks are alwayes copied in toto, because we cannot
+ * reliably track changes to the, because WAL only contains block references
+ * for the main fork.
+ */
+static bool
+isRelDataFile(const char *path)

Sentence near "track changes to the, .." looks incomplete.


6.
+libpqConnect(const char *connstr)
{
..
+	/*
+	 * Also check that full_page_writes is enabled. We can get torn pages if
+	 * a page is modified while we read it with pg_read_binary_file(), and we
+	 * rely on full page images to fix them.
+	 */
+	str = run_simple_query("SHOW full_page_writes");
+	if (strcmp(str, "on") != 0)
+		pg_fatal("full_page_writes must be enabled in the source server\n");
+	pg_free(str);
..
}

Do you think it is worth mentioning this in the docs?


7.
The function execute_pagemap() exists in both copy_fetch.c and
libpq_fetch.c; do you expect them to diverge in the future?



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-03-14 Thread David Rowley
On 14 March 2015 at 14:51, David Rowley  wrote:

> On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp> wrote:
>
>> Unfortunately I can't decide this to be 'ready for committer' for now.
>> I think we should have this with a smaller footprint, using a method
>> without separate exhaustive searching. I also had a very similar
>> problem in the past, but I haven't found such a way for my problem.
>>
> I don't think it's ready yet either. I've just been going over a few
> things and looking at Tom's recent commit b557226 in pathnode.c. I've
> got a feeling that this patch would need to re-factor some code that's
> been modified around the usage of relation_has_unique_index_for(): when
> that code is called, the semi joins have already been analysed to see if
> they're unique, so it could be just a case of ripping all of that out of
> create_unique_path() and just putting a check on rel->is_unique_join in
> there. But if I do that then I'm not quite sure whether
> SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators
> would still be needed at all. These were only added in b557226. Changing
> this would help reduce the extra planning time when the query contains
> semi-joins. To be quite honest, this type of analysis belongs in
> analyzejoins.c anyway. I'm tempted to hack at this part some more, but
> I'd rather Tom had a quick glance at what I'm trying to do here first.
>
>

I decided to hack away at the code Tom added in b557226. I've changed it
so that create_unique_path() now simply just uses if
(rel->is_unique_join), instead of all the calls to
relation_has_unique_index_for() and query_is_distinct_for(). This vastly
simplifies that code. One small change is that Tom's checks for
uniqueness on semi joins included checks for volatile functions; this
check didn't exist in the original join removal code, so I've left it
out. We'll never match an expression containing a volatile function to a
unique index, as indexes don't allow volatile function expressions
anyway. So as I understand it, that check only serves as a fast path out
when the join condition has a volatile function... but I'd assume the
check itself is not all that cheap.
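
The resulting shape in create_unique_path() is roughly this (a sketch of
the idea, not the actual diff):

if (rel->is_unique_join)	/* precomputed by the new analyzejoins.c code */
{
	/*
	 * The inner side is already provably unique on the join clauses, so
	 * no sort- or hash-based unique-ification is needed; emit a no-op
	 * unique path that passes the subpath through.  This is what the
	 * relation_has_unique_index_for() / query_is_distinct_for() calls
	 * used to establish.
	 */
	pathnode->umethod = UNIQUE_PATH_NOOP;
	pathnode->path.rows = rel->rows;
	pathnode->path.startup_cost = subpath->startup_cost;
	pathnode->path.total_cost = subpath->total_cost;
	pathnode->path.pathkeys = subpath->pathkeys;
}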

I ended up making query_supports_distinctness() and query_is_distinct_for()
static in analyzejoins.c as they're not used in any other files.
relation_has_unique_index_for() is also now only used in analyzejoins.c,
but I've not moved it into that file yet as I don't want to bloat the
patch. I just added a comment to say it needs to be moved.

I've also added a small amount of code to query_is_distinct_for() which
allows subqueries such as (select 1 a offset 0) to be marked as unique. I
thought it was a little silly that these were not being detected as unique,
so I fixed it. This has the side effect of allowing left join removals for
queries such as: select t1.* from t1 left join (select 1 a offset 0) a on
t1.id=a.a;

Updated patch attached.

Regards

David Rowley


unijoin_2015-03-14_81bd96a.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-03-14 Thread David Rowley
On 14 March 2015 at 14:51, David Rowley  wrote:

> On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp> wrote:
>
>>
>>
>> For all that, I agree with the opinion that this kind of separate
>> multiple-nested loop over relations, joins, ECs and so on, searching
>> for something, should be avoided. I personally feel that additional
>> time of such an extent (around 1%) would be tolerable if it affected a
>> wide range of queries or brought more obvious gains.
>>
>>
> For testing, I added some code to mark_unique_joins() to spit out a NOTICE:
>
> if (eclassjoin_is_unique_join(root, joinlist, rtr))
> {
> root->simple_rel_array[rtr->rtindex]->is_unique_join = true;
> elog(NOTICE, "Unique Join: Yes");
> }
> else
> elog(NOTICE, "Unique Join: No");
>
> and the same below for special joins too.
>
> On running the regression tests I see:
>
> "Unique Join: Yes"  1557 times
> "Unique Join: No" 11563 times
>

With this notice emitting code in place, I opened up pgAdmin and had a
click around for a few minutes.

If I search the log file I see:

Unique Join: No  940 times
Unique Join: Yes  585 times

It seems that joins with a unique inner side are quite common here.

Regards

David Rowley


Re: [HACKERS] MD5 authentication needs help -SCRAM

2015-03-14 Thread Heikki Linnakangas

On 03/09/2015 04:43 PM, Abhijit Menon-Sen wrote:

> At 2015-03-09 13:52:10 +0200, hlinn...@iki.fi wrote:
>
>> Do you have any insight on why the IETF working group didn't choose a
>> PAKE protocol instead of or in addition to SCRAM, when SCRAM was
>> standardized?
>
> Hi Heikki.
>
> It was a long time ago, but I recall that SRP was patent-encumbered:
>
> https://datatracker.ietf.org/ipr/search/?rfc=2945&submit=rfc
>
> The Wikipedia page says the relevant patents expired in 2011 and 2013.
> I haven't followed SRP development since then, maybe it's been revised.
>
> When SCRAM was being discussed, I can't recall any other proposals for
> PAKE protocols. Besides, as you may already know, anyone can submit an
> internet-draft about anything. It needs to gain general support for an
> extended period in order to advance through the standards process.

Ok, makes sense. Perhaps it would be time to restart the discussion on
standardizing SRP as a SASL mechanism in IETF. Or we could just
implement the draft as it is.

> Could you please explain what exactly you mean about a SCRAM
> eavesdropper gaining some advantage in being able to mount a
> dictionary attack? I didn't follow that part.

Assume that the connection is not encrypted, and Eve captures the SCRAM
handshake between Alice and Bob. Using the captured handshake, she can
try to guess the password, offline. With a PAKE protocol, she cannot do
that.
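
To make that concrete: the salt, iteration count, AuthMessage and
ClientProof all travel in the clear, so each password guess can be
verified locally. A rough sketch of the per-guess check (SCRAM with
SHA-256, per RFC 5802's construction, using OpenSSL; the captured values
are assumed to have been parsed out of the handshake):

#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <openssl/sha.h>

static int
guess_matches(const char *guess,
			  const unsigned char *salt, int saltlen, int iterations,
			  const unsigned char *auth_msg, size_t auth_msg_len,
			  const unsigned char captured_proof[32])
{
	unsigned char salted[32], client_key[32], stored_key[32], sig[32];
	unsigned int len;
	int			i;

	/* SaltedPassword := PBKDF2(guess, salt, iterations) */
	PKCS5_PBKDF2_HMAC(guess, (int) strlen(guess), salt, saltlen,
					  iterations, EVP_sha256(), sizeof(salted), salted);
	/* ClientKey := HMAC(SaltedPassword, "Client Key") */
	HMAC(EVP_sha256(), salted, sizeof(salted),
		 (const unsigned char *) "Client Key", 10, client_key, &len);
	/* StoredKey := H(ClientKey) */
	SHA256(client_key, sizeof(client_key), stored_key);
	/* ClientSignature := HMAC(StoredKey, AuthMessage) */
	HMAC(EVP_sha256(), stored_key, sizeof(stored_key),
		 auth_msg, auth_msg_len, sig, &len);
	/* ClientProof = ClientKey XOR ClientSignature */
	for (i = 0; i < 32; i++)
		if ((client_key[i] ^ sig[i]) != captured_proof[i])
			return 0;
	return 1;					/* guess reproduces the captured proof */
}

Eve just runs this over a dictionary. A PAKE forecloses exactly this,
because a passive transcript is not enough to test a guess.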


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers