Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi,

On 2015/12/03 19:05, Kyotaro HORIGUCHI wrote:
> At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote wrote:
>> By the way, there are some non-st_progress_* fields that I was talking
>> about in my previous message. For example, st_activity_flag (which I have
>> suggested to rename to st_command instead). It needs to be set once at the
>> beginning of the command processing and need not be touched again. I think
>> it may be a better idea to do the same for the table name or OID. It also
>> won't change over the duration of the command execution. So, we could set
>> it once at the beginning where that becomes known. Currently in the patch,
>> it's reported in st_progress_message[0] by every pgstat_report_progress()
>> invocation. So, the table name will be strcpy()'d to shared memory for
>> every scanned block that's reported.
>
> If we don't have dedicated reporting functions for each phase (that is, a
> static-data phase and a progress phase), the one function
> pgstat_report_progress does that by having some instruction from the
> caller, and it would be a bitfield.
>
> Reading the flags to decide whether to copy each field and branching on
> that seems too much for int32 (or maybe 64?) fields. Although it would be
> effective when we have progress_message fields, I don't think it is a good
> idea without having progress_message.
>
>> pgstat_report_progress(uint32 *param1, uint16 param1_bitmap,
>>                        char param2[][..], uint16 param2_bitmap)
>> {
>>     ...
>>     for (i = 0; i < 16; i++)
>>     {
>>         if (param1_bitmap & (1 << i))
>>             beentry->st_progress_param[i] = param1[i];
>>         if (param2_bitmap & (1 << i))
>>             strcpy(beentry->..., param2[i]);
>>     }
>
> Thoughts?

Hm, I guess progress messages would not change that much over the course of a long-running command.
How about we pass only the array(s) that we change, and NULL for the others? With the following prototype for pgstat_report_progress:

void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
                            bool *message_param[], int num_message_param);

If we just wanted to change, say, scanned_heap_pages, then we report that using:

uint32_param[1] = scanned_heap_pages;
pgstat_report_progress(uint32_param, 3, NULL, 0);

Also, pgstat_report_progress() should check each of its parameters for NULL before iterating to copy. So in most reports over the course of a command, the message_param would be NULL and hence not copied.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making tab-complete.c easier to maintain
On Tue, Nov 17, 2015 at 12:19 AM, Alvaro Herrera wrote:
> Thomas Munro wrote:
>> New version attached, merging recent changes.
>
> I wonder about the TailMatches and Matches macros --- wouldn't it be
> better to have a single one, renaming TailMatches to Matches and
> replacing the current Matches() with an initial token that corresponds
> to anchoring to start of command? Just wondering, not terribly attached
> to the idea.

+ /* TODO:TM -- begin temporary, not part of the patch! */
+ Assert(!word_matches(NULL, ""));
+ [...]
+ Assert(!word_matches("foo", ""));
+ /* TODO:TM -- end temporary */
Be sure not to forget to remove that later.

- else if (pg_strcasecmp(prev5_wd, "DEFAULT") == 0 &&
-          pg_strcasecmp(prev4_wd, "PRIVILEGES") == 0 &&
-          (pg_strcasecmp(prev3_wd, "FOR") == 0 ||
-           pg_strcasecmp(prev3_wd, "IN") == 0))
- {
-     static const char *const list_ALTER_DEFAULT_PRIVILEGES_REST[] =
-     {"GRANT", "REVOKE", NULL};
-
-     COMPLETE_WITH_LIST(list_ALTER_DEFAULT_PRIVILEGES_REST);
- }
+ else if (TailMatches5("DEFAULT", "PRIVILEGES", "FOR", "ROLE", MatchAny) ||
+          TailMatches5("DEFAULT", "PRIVILEGES", "IN", "SCHEMA", MatchAny))
+     CompleteWithList2("GRANT", "REVOKE");
For this chunk I think that you need to check for ROLE|USER and not only ROLE.

+ else if (TailMatches4("ALTER", "DOMAIN", MatchAny, "RENAME"))
  {
      static const char *const list_ALTERDOMAIN[] =
      {"CONSTRAINT", "TO", NULL};
I think you should remove COMPLETE_WITH_LIST here for consistency with the rest.

- else if (pg_strcasecmp(prev5_wd, "DOMAIN") == 0 &&
-          pg_strcasecmp(prev3_wd, "RENAME") == 0 &&
-          pg_strcasecmp(prev2_wd, "CONSTRAINT") == 0)
+ else if (TailMatches3("RENAME", "CONSTRAINT", MatchAny))
      COMPLETE_WITH_CONST("TO");
Perhaps you are missing DOMAIN here?
- else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
-          pg_strcasecmp(prev3_wd, "SEQUENCE") == 0 &&
-          pg_strcasecmp(prev_wd, "NO") == 0)
- {
-     static const char *const list_ALTERSEQUENCE2[] =
-     {"MINVALUE", "MAXVALUE", "CYCLE", NULL};
-
-     COMPLETE_WITH_LIST(list_ALTERSEQUENCE2);
- }
+ else if (TailMatches4("ALTER", "SEQUEMCE", MatchAny, "NO"))
+     CompleteWithList3("MINVALUE", "MAXVALUE", "CYCLE");
Typo here: s/SEQUEMCE/SEQUENCE.

- else if (pg_strcasecmp(prev5_wd, "TABLE") == 0 &&
-          pg_strcasecmp(prev3_wd, "RENAME") == 0 &&
-          (pg_strcasecmp(prev2_wd, "COLUMN") == 0 ||
-           pg_strcasecmp(prev2_wd, "CONSTRAINT") == 0) &&
-          pg_strcasecmp(prev_wd, "TO") != 0)
+ else if (TailMatches6("ALTER", "TABLE", MatchAny, "RENAME", "COLUMN|CONSTRAINT", MatchAny) &&
+          !TailMatches1("TO"))
Shouldn't this use TailMatches5 without ALTER, for consistency with the existing code?

- else if (pg_strcasecmp(prev_wd, "CLUSTER") == 0 &&
-          pg_strcasecmp(prev2_wd, "WITHOUT") != 0)
+ else if (TailMatches1("CLUSTER") && !TailMatches2("WITHOUT", "CLUSTER"))
Here removing CLUSTER should be fine.

- else if (pg_strcasecmp(prev2_wd, "CLUSTER") == 0 &&
-          pg_strcasecmp(prev_wd, "ON") != 0 &&
-          pg_strcasecmp(prev_wd, "VERBOSE") != 0)
- {
+ else if (TailMatches2("CLUSTER", MatchAny) && !TailMatches1("VERBOSE"))
Handling of ON has been forgotten.

- else if (pg_strcasecmp(prev3_wd, "CREATE") == 0 &&
-          !(pg_strcasecmp(prev2_wd, "USER") == 0 && pg_strcasecmp(prev_wd, "MAPPING") == 0) &&
-          (pg_strcasecmp(prev2_wd, "ROLE") == 0 ||
-           pg_strcasecmp(prev2_wd, "GROUP") == 0 || pg_strcasecmp(prev2_wd, "USER") == 0))
+ else if (TailMatches3("CREATE", "ROLE|GROUP|USER", MatchAny) &&
+          !TailMatches3("CREATE", "USER", "MAPPING"))
!TailMatches2("USER", "MAPPING")?

- else if (pg_strcasecmp(prev4_wd, "CREATE") == 0 &&
-          pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
-          pg_strcasecmp(prev2_wd, "VIEW") == 0)
+ else if (TailMatches3("CREATE", "MATERIALIZED", "VIEW"))
Forgot a MatchAny here?
- else if (pg_strcasecmp(prev_wd, "DELETE") == 0 &&
-          !(pg_strcasecmp(prev2_wd, "ON") == 0 ||
-            pg_strcasecmp(prev2_wd, "GRANT") == 0 ||
-            pg_strcasecmp(prev2_wd, "BEFORE") == 0 ||
-            pg_strcasecmp(prev2_wd, "AFTER") == 0))
+ else if (TailMatches1("DELETE") && !TailMatches2("ON|GRANT|BEFORE|AFTER", "DELETE"))
COMPLETE_WITH_CONST("FR
Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries
On 2015/12/07 2:52, Andreas Seltenreich wrote: > Hi, > > I've added new grammar rules to sqlsmith and improved some older ones. > This was rewarded with a return of "failed to generate plan" errors. > The failing queries all contain a lateral subquery. The shortest of the > failing queries are below. They were run against the regression db of > master as of db07236. > > smith=# select msg, query from error where >(firstline(msg) ~~ 'ERROR: failed to build any%' >or firstline(msg) ~~ 'ERROR: could not devise a query plan%') > and t > now() - interval '1 day' order by length(query) asc limit 3; > > ERROR: failed to build any 8-way joins > select > ref_96.foreign_table_schema as c0, > sample_87.is_supported as c1 > from > information_schema.sql_packages as sample_87 tablesample system (0.2) > right join information_schema._pg_foreign_tables as ref_96 > on (sample_87.feature_id = ref_96.foreign_table_catalog ), > lateral (select > sample_87.is_verified_by as c0, > ref_97.indexed_col as c1, > coalesce(sample_87.feature_id, ref_96.foreign_server_name) as c2, > 4 as c3 > from > public.comment_test as ref_97 > where ref_97.id ~>~ ref_97.indexed_col > fetch first 73 rows only) as subq_33 > where ref_96.foreign_table_name ~~ subq_33.c1 > > ERROR: could not devise a query plan for the given query > select > subq_43.c0 as c0 > from > (select > ref_181.installed as c0 > from > pg_catalog.pg_available_extension_versions as ref_181, > lateral (select > ref_181.name as c0, > ref_181.installed as c1 > from > pg_catalog.pg_conversion as ref_182 > where ref_182.conname ~~* ref_181.version > fetch first 98 rows only) as subq_42 > where (subq_42.c0 is not NULL) > or (subq_42.c1 is NULL)) as subq_43 > right join pg_catalog.pg_language as sample_177 tablesample system (2.8) > on (subq_43.c0 = sample_177.lanispl ) > where sample_177.lanowner < sample_177.lanvalidator > > ERROR: failed to build any 5-way joins > select > ref_239.id2 as c0, > 40 as c1, > ref_239.id2 as c2, > ref_238.aa as 
c3
> from
>   public.tt5 as sample_289 tablesample system (8.1)
>     inner join information_schema.element_types as ref_237
>     on (sample_289.x = ref_237.character_maximum_length )
>     left join public.b as ref_238
>     on (ref_237.character_maximum_length = ref_238.aa )
>     left join public.num_exp_mul as ref_239
>     on (ref_237.numeric_precision_radix = ref_239.id1 ),
>   lateral (select
>         sample_290.b as c0,
>         sample_289.y as c1,
>         ref_239.id2 as c2
>       from
>         public.rtest_t8 as sample_290 tablesample bernoulli (4.6)
>       where (sample_290.b > ref_238.bb)
>         and (sample_289.y > ref_239.expected)
>       fetch first 91 rows only) as subq_64
> where (subq_64.c1 > sample_289.y)
>   and (sample_289.y = ref_239.expected)
> fetch first 133 rows only

FWIW, I'm seeing the following behaviors:

* Removing the limit (fetch first ...) in the lateral sub-queries makes the errors go away for all of the above queries.
* For the last query, which produces the "failed to build any 5-way joins" error, setting join_collapse_limit to 1, 2 or 4 makes the error go away irrespective of whether the limit is kept or not.

Thanks,
Amit
Re: [HACKERS] Making tab-complete.c easier to maintain
On Thu, Nov 26, 2015 at 2:45 PM, Kyotaro HORIGUCHI wrote:
> What do you think about this solution?

This patch fails to compile on OSX:

Undefined symbols for architecture x86_64:
  "_ExceptionalCondition", referenced from:
      _pg_regexec in regexec.o
      _cfind in regexec.o
      _find in regexec.o
      _newdfa in regexec.o
      _cfindloop in regexec.o
      _shortest in regexec.o
      _cdissect in regexec.o
      ...

So, to begin with, this may be better if replugged as a standalone library, aka moving the regexp code into src/common for example, or similar. Also, per the comments on top of rcancelrequested and rstacktoodeep, returning 0 unconditionally is not a good idea for -DFRONTEND. Callbacks should be defined and made available for callers.

- {"EVENT TRIGGER", NULL, NULL},
+ {"EVENT TRIGGER", Query_for_list_of_event_triggers, NULL},
  {"EXTENSION", Query_for_list_of_extensions},
- {"FOREIGN DATA WRAPPER", NULL, NULL},
+ {"FOREIGN DATA WRAPPER", Query_for_list_of_fdws, NULL},
  {"FOREIGN TABLE", NULL, NULL},
  {"FUNCTION", NULL, &Query_for_list_of_functions},
  {"GROUP", Query_for_list_of_roles},
  {"LANGUAGE", Query_for_list_of_languages},
  {"INDEX", NULL, &Query_for_list_of_indexes},
- {"MATERIALIZED VIEW", NULL, NULL},
+ {"MATERIALIZED VIEW", NULL, &Query_for_list_of_matviews},

This has value as a separate patch. The patch also contains many whitespace changes and unrelated diffs.
--
Michael
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On Mon, Dec 7, 2015 at 12:06 PM, Noah Misch wrote: > On Wed, Dec 02, 2015 at 06:59:09PM -0300, Alvaro Herrera wrote: >> --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl >> +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl > >> -like($recovery_conf, qr/^standby_mode = 'on[']$/m, 'recovery.conf sets >> standby_mode'); >> -like($recovery_conf, qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, >> 'recovery.conf sets primary_conninfo'); >> - >> -command_ok([ 'pg_basebackup', '-D', "$tempdir/backupxf", '-X', 'fetch' ], >> +like( >> + $recovery_conf, >> + qr/^standby_mode = 'on[']$/m, >> + 'recovery.conf sets standby_mode'); >> +like( >> + $recovery_conf, >> + qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, >> + 'recovery.conf sets primary_conninfo'); > > This now elicits a diagnostic: > > Use of uninitialized value $ENV{"PGPORT"} in regexp compilation at > t/010_pg_basebackup.pl line 175, <> line 1. Fixed. >> --- a/src/bin/pg_controldata/t/001_pg_controldata.pl >> +++ b/src/bin/pg_controldata/t/001_pg_controldata.pl > >> -standard_initdb "$tempdir/data"; >> -command_like([ 'pg_controldata', "$tempdir/data" ], >> + >> +my $node = get_new_node(); >> +$node->init; >> +$node->start; > > Before the commit, this test did not start a server. Fixed. >> --- /dev/null >> +++ b/src/test/perl/PostgresNode.pm > >> +sub _update_pid >> +{ >> + my $self = shift; >> + >> + # If we can open the PID file, read its first line and that's the PID >> we >> + # want. If the file cannot be opened, presumably the server is not >> + # running; don't be noisy in that case. >> + open my $pidfile, $self->data_dir . 
"/postmaster.pid"; >> + if (not defined $pidfile) > > $ grep -h 'at /.*line' src/bin/*/tmp_check/log/* |sort|uniq -c > 1 cannot remove directory for > /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ: Directory not > empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 > 1 cannot remove directory for > /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base/13264: > Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 > 1 cannot remove directory for > /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base: > Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 > 1 cannot remove directory for > /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata: Directory > not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 > 28 readline() on closed filehandle $pidfile at > /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm > line 308. > 28 Use of uninitialized value in concatenation (.) or string at > /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm > line 309. > > $pidfile is always defined. Test the open() return value. That's something I have addressed here: http://www.postgresql.org/message-id/cab7npqtop28zxv_sxfo2axgjoesfvllmo6syddafv0duvsf...@mail.gmail.com I am including the fix as well here to do a group shot. >> + { >> + $self->{_pid} = undef; >> + print "# No postmaster PID\n"; >> + return; >> + } >> + >> + $self->{_pid} = <$pidfile>; > > chomp() that value to remove its trailing newline. Right. >> + print "# Postmaster PID is $self->{_pid}\n"; >> + close $pidfile; >> +} > >> + my $devnull = $TestLib::windows_os ? "nul" : "/dev/null"; > > Unused variable. Removed. 
>> +sub DESTROY
>> +{
>> +	my $self = shift;
>> +	return if not defined $self->{_pid};
>> +	print "# signalling QUIT to $self->{_pid}\n";
>> +	kill 'QUIT', $self->{_pid};
>
> On Windows, that kill() is the wrong thing. I suspect "pg_ctl kill" will be
> the right thing, but that warrants specific testing.

I don't directly see any limitation with the use of kill on Windows:
http://perldoc.perl.org/functions/kill.html
But indeed, using pg_ctl kill directly seems like a better fit for the PG infrastructure.

> Postmaster log file names became less informative. Before the commit:
> Should nodes get a name, so we instead see master_57834.log?

I am not sure that this is necessary. There is definitely a limitation in that log files of nodes get overwritten after each test, hence I would tend to just prepend the test name to the node_* files, similarly to the other files.

> See also the things Tom Lane identified in <27550.1449342...@sss.pgh.pa.us>.

Yep, I marked this email as something to address when it was sent.

On Sun, Dec 6, 2015 at 4:09 AM, Tom Lane wrote:
> BTW, if we consider that gospel, why has PostgresNode.pm got this in its
> BEGIN block?
>
> # PGHOST is set once and for all through a single series of tests when
> # this module is loaded.
> $test_pghost =
>   $TestLib::windows_os ? "127.0.0.1" : TestLib::t
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/12/05 5:15, Robert Haas wrote:
> On Tue, Dec 1, 2015 at 10:20 PM, Etsuro Fujita wrote:
>> One thing I can think of is that we can keep both the structure of a
>> ForeignPath node and the API of create_foreignscan_path as-is. The latter
>> is a good thing for FDW authors. And IIUC the patch you posted today, I
>> think we could make create_foreignscan_plan a bit simpler too. Ie, in
>> your patch, you modified that function as follows:
>>
>> @@ -2129,7 +2134,9 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
>>      */
>>     scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,
>>                                                 best_path,
>> -                                               tlist, scan_clauses);
>> +                                               tlist,
>> +                                               scan_clauses);
>> +   outerPlan(scan_plan) = fdw_outerplan;
>>
>> I think that would be OK, but I think we would have to do a bit more here
>> about the fdw_outerplan's targetlist and qual; I think that the
>> targetlist needs to be changed to fdw_scan_tlist, as in the patch [1],
>> and that it'd be better to change the qual to remote conditions, ie,
>> quals not in the scan_plan's scan.plan.qual, to avoid duplicate
>> evaluation of local conditions. (In the patch [1], I didn't do anything
>> about the qual because the current postgres_fdw join pushdown patch
>> assumes that all the scan_plan's scan.plan.qual are pushed down.) Or, FDW
>> authors might want to do something about fdw_recheck_quals for a
>> foreign-join while creating the fdw_outerplan. So if we do that during
>> GetForeignPlan, I think we could make create_foreignscan_plan a bit
>> simpler, or provide flexibility to FDW authors.
>
> It's certainly true that we need the alternative plan's tlist to match
> that of the main plan; otherwise, it's going to be difficult for the FDW
> to make use of that alternative subplan to fill its slot, which is kinda
> the point of all this.

OK.

> However, I'm quite reluctant to introduce code into
> create_foreignscan_plan() that forces the subplan's tlist to match that
> of the main plan.
> For one thing, that would likely foreclose the possibility of an FDW ever
> using the outer plan for any purpose other than EPQ rechecks. It may be
> hard to imagine what else you'd do with the outer plan as things are
> today, but right now the two halves of the patch - letting FDWs have an
> outer subplan, and providing them with a way of overriding the EPQ
> recheck behavior - are technically independent. Putting tlist-altering
> behavior into create_foreignscan_plan() ties those two things together
> irrevocably.

Agreed.

> Instead, I think we should go the opposite direction and pass the
> outerplan to GetForeignPlan after all. I was lulled into a false sense of
> security by the realization that every FDW that uses this feature MUST
> want to do outerPlan(scan_plan) = fdw_outerplan. That's true, but
> irrelevant. The point is that the FDW might want to do something
> additional, like frob the outer plan's tlist, and it can't do that if we
> don't pass it fdw_outerplan. So we should do that, after all.

As I proposed upthread, another idea would be 1) to store an fdw_outerpath in the fdw_private list of a ForeignPath node, and then 2) to create an fdw_outerplan from *the fdw_outerpath stored in the fdw_private* in GetForeignPlan. One good thing about this is that we keep the API of create_foreignscan_path as-is. What do you think about that?

> Updated patch attached. This fixes a couple of whitespace issues that
> were pointed out, also.

Thanks for updating the patch!

Best regards,
Etsuro Fujita
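The two-step idea, 1) stash the fdw_outerpath in the path's fdw_private and 2) turn it into a plan inside GetForeignPlan, might look roughly like the following sketch. This is illustrative pseudocode against PostgreSQL planner internals, not a working FDW: the my_fdw_* names are invented, and the exact create_foreignscan_path arguments and the plan-creation helper depend on the server version and on which form of the patch is adopted.

```c
/* 1) Path creation: keep the EPQ-recheck subpath in fdw_private. */
static void
my_fdw_get_foreign_paths(PlannerInfo *root, RelOptInfo *rel, Oid foreigntableid)
{
    Path        *fdw_outerpath = ...;   /* locally executable alternative path */
    ForeignPath *path;

    path = create_foreignscan_path(root, rel,
                                   rows, startup_cost, total_cost,
                                   NIL,                      /* pathkeys */
                                   NULL,                     /* required_outer */
                                   list_make1(fdw_outerpath));   /* fdw_private */
    add_path(rel, (Path *) path);
}

/* 2) Plan creation: build the outer plan from the stored path, attach it. */
static ForeignScan *
my_fdw_get_foreign_plan(PlannerInfo *root, RelOptInfo *rel, Oid foreigntableid,
                        ForeignPath *best_path, List *tlist, List *scan_clauses)
{
    Path        *fdw_outerpath = (Path *) linitial(best_path->fdw_private);
    Plan        *fdw_outerplan = create_plan_for_path(root, fdw_outerpath);
                                        /* hypothetical helper */
    ForeignScan *scan = make_foreignscan(...);

    outerPlan(scan) = fdw_outerplan;    /* used for EPQ rechecks */
    return scan;
}
```

The appeal of this variant is exactly what the message says: fdw_private is an opaque List the FDW already owns, so create_foreignscan_path needs no signature change, while the FDW still gets a chance to frob the outer plan's tlist before attaching it.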
Re: [HACKERS] jsonb_delete not documented
Peter Eisentraut writes:
> I see. The reference from pg_operator to pg_proc is by OID rather than
> function name, so I didn't find them. Is that because the function is
> overloaded?

Yeah, I suppose so --- regproc can't resolve overloaded function names.

> It's kind of odd that these are the only operators (at first glance) that
> are set up like that.

I think the customary thing when creating functions meant as operator support is to give them unique names. These weren't done that way ... I wasn't involved, but I wonder whether there was uncertainty as to whether these should be documented as functions or operators.

regards, tom lane
Re: [HACKERS] jsonb_delete not documented
On 12/6/15 9:51 PM, Tom Lane wrote:
> Peter Eisentraut writes:
>> The new function jsonb_delete does not appear to be documented. Is that
>> intentional?
>
>> The only thing that's documented is the #- operator for
>> jsonb_delete_path. But jsonb_delete(jsonb, text) and
>> jsonb_delete(jsonb, int) are not documented. (Those don't have an
>> operator.)
>
> Yeah they do ...
> regression=# \do+ -
> ...
> pg_catalog | - | jsonb | integer | jsonb | pg_catalog.jsonb_delete | delete array element
> pg_catalog | - | jsonb | text    | jsonb | pg_catalog.jsonb_delete | delete object field
> ...

I see. The reference from pg_operator to pg_proc is by OID rather than function name, so I didn't find them. Is that because the function is overloaded? It's kind of odd that these are the only operators (at first glance) that are set up like that.
Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates
On Sun, Dec 6, 2015 at 7:14 PM, Andreas Karlsson wrote:
> I can also confirm the roughly 25% speedup in the best case (numerics
> which are all distinct) with no measurable slowdown in the worst case.
>
> Given this speedup and the small size of the patch I think we should
> apply it. I will set this patch to "Ready for Committer".

Thanks for the review, Andreas.
--
Peter Geoghegan
Re: [HACKERS] Logical replication and multimaster
> There are definitely two clear places where additional help would be
> useful and welcome right now.

Here's a shortlist of replication-related open items I put together earlier. These are all independent things it'd be helpful to have when working with logical replication in PostgreSQL. There are a lot of medium-sized pieces of independent work still to be done to make some things work well. For example, you can't have multimaster logical replication with physical failover nodes for each MM node unless we replicate slot create/drop/advance over physical replication and copy them with pg_basebackup.

Handling big transactions better:
* logical decoding interleaved transaction streaming, as discussed earlier

Automatic DDL replication:
* DDL deparse extension (Álvaro started some work on this)
* Better way to link the pg_temp_nnn tables generated during table rewrite to the original table in decoding
* logical decoding support for prepared xacts (before commit prepared). Useful for DDL replication, other consensus operations.

Using physical replication/PITR with logical replication:
* replication slots replicated to physical standbys ("Failover slots")
* pg_basebackup should copy slots
* slots should follow timeline changes

Failover between logical replicas:
* logical decoding support for sequences
* logical decoding support for slot create/drop/advance
* Separate slot WAL retention from confirmed commit point so you can say "I've replayed up to 1/AB but you should keep from 1/01". Needed in async MM to cope with node loss properly. Will write it up separately.

Multimaster:
* Sequence Access Method

Other logical decoding enhancements:
* Support exporting a new snapshot from an existing logical slot. Useful for COPYing new tables not previously replicated when added to a replication set, for resyncing tables, comparing tables, etc.
* WAL messages. Useful for voting, replay confirmation, etc. Rendered largely unnecessary by xact interleaved streaming.
Misc:
* An API to enumerate currently registered bgworkers
* A way to securely make a libpq connection from a bgworker without messing with passwords etc. Generate one-time cookies, something like that.
* (unimportant but nice API fix): BGW_NO_RESTART_ON_CRASH

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates
Hi,

I have reviewed this patch and it still applies to master, compiles, and passes the test suite. I like the goal of the patch: making use of the already existing abbreviation machinery in more cases is something we should do, and the patch itself looks clean.

I can also confirm the roughly 25% speedup in the best case (numerics which are all distinct) with no measurable slowdown in the worst case.

Given this speedup and the small size of the patch I think we should apply it. I will set this patch to "Ready for Committer".

Andreas
Re: [HACKERS] pglogical_output - a general purpose logical decoding output plugin
On 2 December 2015 at 18:38, Petr Jelinek wrote:
> First, I wonder if it would be useful to mention somewhere, even if it's
> only here in the mailing list, how the protocol can be extended in a
> non-breaking way in future for transaction streaming, if we ever get
> that.

Good point. I'll address that in the DESIGN.md in the next rev. Separately, it's looking like xact streaming is possibly more complex than I hoped due to cache invalidation issues, but I haven't been able to fully understand the problem yet.

> The other thing is that I think we don't need the "forward_changesets"
> parameter which currently decides whether to forward changes that didn't
> originate on the local node. There is already a hook for origin filtering
> which provides the same functionality in a more flexible way, so it seems
> redundant to also have a special boolean option for it.

Removed, change pushed. Also pushed a change to expose the decoded row data to row filter hooks. I won't cut a v4 for this, as I'm working on merging the SGML-ified docs and will do a v4 with that and the above readme change once that's done.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On Wed, Dec 02, 2015 at 06:59:09PM -0300, Alvaro Herrera wrote: > It seemed better to me to have the import list be empty, i.e. "use > TestLib ()" and then qualify the routine names inside PostgresNode, > instead of having to list the names of the routines to import, so I > pushed it that way after running the tests a few more times. I inspected commit 1caef31: > --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl > +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl > -like($recovery_conf, qr/^standby_mode = 'on[']$/m, 'recovery.conf sets > standby_mode'); > -like($recovery_conf, qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, > 'recovery.conf sets primary_conninfo'); > - > -command_ok([ 'pg_basebackup', '-D', "$tempdir/backupxf", '-X', 'fetch' ], > +like( > + $recovery_conf, > + qr/^standby_mode = 'on[']$/m, > + 'recovery.conf sets standby_mode'); > +like( > + $recovery_conf, > + qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, > + 'recovery.conf sets primary_conninfo'); This now elicits a diagnostic: Use of uninitialized value $ENV{"PGPORT"} in regexp compilation at t/010_pg_basebackup.pl line 175, <> line 1. > --- a/src/bin/pg_controldata/t/001_pg_controldata.pl > +++ b/src/bin/pg_controldata/t/001_pg_controldata.pl > -standard_initdb "$tempdir/data"; > -command_like([ 'pg_controldata', "$tempdir/data" ], > + > +my $node = get_new_node(); > +$node->init; > +$node->start; Before the commit, this test did not start a server. > --- /dev/null > +++ b/src/test/perl/PostgresNode.pm > +sub _update_pid > +{ > + my $self = shift; > + > + # If we can open the PID file, read its first line and that's the PID we > + # want. If the file cannot be opened, presumably the server is not > + # running; don't be noisy in that case. > + open my $pidfile, $self->data_dir . 
"/postmaster.pid"; > + if (not defined $pidfile) $ grep -h 'at /.*line' src/bin/*/tmp_check/log/* |sort|uniq -c 1 cannot remove directory for /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ: Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 1 cannot remove directory for /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base/13264: Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 1 cannot remove directory for /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base: Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 1 cannot remove directory for /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata: Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784 28 readline() on closed filehandle $pidfile at /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm line 308. 28 Use of uninitialized value in concatenation (.) or string at /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm line 309. $pidfile is always defined. Test the open() return value. > + { > + $self->{_pid} = undef; > + print "# No postmaster PID\n"; > + return; > + } > + > + $self->{_pid} = <$pidfile>; chomp() that value to remove its trailing newline. > + print "# Postmaster PID is $self->{_pid}\n"; > + close $pidfile; > +} > + my $devnull = $TestLib::windows_os ? "nul" : "/dev/null"; Unused variable. > +sub DESTROY > +{ > + my $self = shift; > + return if not defined $self->{_pid}; > + print "# signalling QUIT to $self->{_pid}\n"; > + kill 'QUIT', $self->{_pid}; On Windows, that kill() is the wrong thing. I suspect "pg_ctl kill" will be the right thing, but that warrants specific testing. Postmaster log file names became less informative. Before the commit: -rw---. 1 nmisch nmisch 211265 2015-12-06 22:35:59.931114215 + regress_log_001_basic -rw---. 
1 nmisch nmisch 268887 2015-12-06 22:36:21.871165951 + regress_log_002_databases -rw---. 1 nmisch nmisch 206808 2015-12-06 22:36:41.861213736 + regress_log_003_extrafiles -rw---. 1 nmisch nmisch 7464 2015-12-06 22:37:00.371256795 + master.log -rw---. 1 nmisch nmisch 6648 2015-12-06 22:37:01.381259211 + standby.log -rw---. 1 nmisch nmisch 208561 2015-12-06 22:37:02.381261374 + regress_log_004_pg_xlog_symlink After: -rw---. 1 nmisch nmisch 219581 2015-12-06 23:00:50.504643175 + regress_log_001_basic -rw---. 1 nmisch nmisch 276315 2015-12-06 23:01:12.924697085 + regress_log_002_databases -rw---. 1 nmisch nmisch 213940 2015-12-06 23:01:33.574747195 + regress_log_003_extrafiles -rw---. 1 nmisch nmisch 4114 2015-12-06 23:01:40.914764850 + node_57834.log -rw---. 1 nmisch nmisch 6166 2015-12-06 23:01:43.184770615 + node_57833.log -rw---. 1 nmisch nmisch 5550 2015-12-06 23:01:52.504792997 + node_57835.log -rw--
Re: [HACKERS] Logical replication and multimaster
On 7 December 2015 at 01:39, Konstantin Knizhnik wrote:
> I have integrated pglogical_output in multimaster

Excellent. I just pushed a change to pglogical_output that exposes the row contents (and the rest of the reorder change buffer contents) to hooks that want it, by the way.

> using bdr_apply from BDR as a template for the implementation of the
> receiver part.

Yep, that'll tide you over. We're working hard on getting the downstream part ready, and you'll find it more flexible.

> The time of insert is reduced almost 10 times compared with logical
> replication based on the decoder_raw/receiver_raw plugins, which perform
> logical replication using SQL statements. But unfortunately the time of
> updates is almost unchanged.

That's not too surprising, given that you'll have significant overheads for checking if keys are present when doing updates.

> This field is assigned by the pglogical_init_api() function. And I can
> extend this PGLogicalProtoAPI structure by adding some protocol-specific
> fields.

Yep, that's the idea.

> typedef struct PGLogicalProtoMM
> {
>     PGLogicalProtoAPI api;
>     bool isLocal; /* mark transaction as local */
> } PGLogicalProtoMM;

I'm curious about what you're using the 'isLocal' field for. For MM you should only need to examine the replication origin assigned to the transaction to determine whether you're going to forward it or not. Were you not able to achieve what you wanted with a hook? If not, then we might need another hook. Could you explain what it's for in more detail?

What I suggest is: have your downstream client install a pglogical_output hook for the transaction filter hook. There, examine the replication origin passed to the hook. If you want to forward locally originated xacts only (such as for mesh multimaster) you can just filter out everything where the origin is not InvalidRepOriginId. There are example hooks in contrib/pglogical_output_plhooks.
There'll be a simple MM example using filter hooks in the pglogical downstream, btw; we're working hard to get that out.

> But I have to add "PGLogicalOutputData *data" parameter to
> pglogical_write_rel_fn function.
> Do you think that it will be better to pass this parameter to all
> functions?

Yes, I agree that it should be passed to the API for the output protocol. It's pretty harmless. Please feel free to send a pull req.

Note that we haven't made that pluggable from the outside though; there's no way to load a new protocol distributed separately from pglogical_output. The idea is really to make sure that between the binary protocol and the json protocol we meet the reasonably expected set of use cases and don't need pluggable protocols. Perhaps that's over-optimistic, but we've already got an output plugin that has plug-in hooks: a plugin for a plugin. Do we need another?

Also, if we allow dynamic loading of new protocols then we'll have a much harder time changing the protocol implementation API later, so it's not something I'm keen to do. Also, to make it secure to allow users to specify the protocol, we'd have to make protocols implement an extension with a C function in pg_proc to return its API struct, like we do for hooks. So there'd be more hoop-jumping required to figure out how to talk to the client.

If possible I'd like to find any oversights and omissions in the current protocol and its APIs to meet future use cases without having to introduce protocol plugins for an output plugin.

> May be it is not intended way of passing custom data to this functions...

Yeah, we weren't really thinking of the protocol API as intended to be pluggable and extensible. If you define new protocols you have to change the rest of the output plugin code anyway. Let's look at what protocol changes are needed to address your use case and see whether it's necessary to take the step of making the protocol fully pluggable or not.
-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] jsonb_delete not documented
Peter Eisentraut writes:
> The new function jsonb_delete does not appear to be documented. Is that
> intentional?
> The only thing that's documented is the #- operator for
> jsonb_delete_path. But jsonb_delete(jsonb, text) and
> jsonb_delete(jsonb, int) are not documented. (Those don't have an
> operator.)

Yeah they do ...

regression=# \df+ jsonb_delete
                                                       List of functions
   Schema   |     Name     | Result data type | Argument data types |  Type  | Security | Volatility |  Owner   | Language |   Source code    |         Description
------------+--------------+------------------+---------------------+--------+----------+------------+----------+----------+------------------+------------------------------
 pg_catalog | jsonb_delete | jsonb            | jsonb, integer      | normal | invoker  | immutable  | postgres | internal | jsonb_delete_idx | implementation of - operator
 pg_catalog | jsonb_delete | jsonb            | jsonb, text         | normal | invoker  | immutable  | postgres | internal | jsonb_delete     | implementation of - operator
(2 rows)

regression=# \do+ -
...
 pg_catalog | - | jsonb | integer | jsonb | pg_catalog.jsonb_delete | delete array element
 pg_catalog | - | jsonb | text    | jsonb | pg_catalog.jsonb_delete | delete object field
...

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Redundant sentence within INSERT documentation page (exclusion constraints)
There is a redundant second appearance of more or less the same sentence at one point in the new INSERT documentation -- I missed this during the recent overhaul of the 9.5+ INSERT documentation. Attached is a trivial patch that fixes this issue.

-- Peter Geoghegan

diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 945eb69..21b28a6 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -482,8 +482,7 @@ INSERT INTO table_name [ AS
 ON CONFLICT DO UPDATE.
+arbiter index or constraint.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] jsonb_delete not documented
The new function jsonb_delete does not appear to be documented. Is that intentional? The only thing that's documented is the #- operator for jsonb_delete_path. But jsonb_delete(jsonb, text) and jsonb_delete(jsonb, int) are not documented. (Those don't have an operator.) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Attach comments to functions' parameters and return value
On 9/15/15 12:35 AM, Charles Clavadetscher wrote:
> COMMENT ON FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
>     PARAMETER param_position IS 'text';
> COMMENT ON FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
>     RETURN VALUE IS 'text';
> An alternative to "RETURN VALUE" could be "RETURNS", which would make the
> statement shorter, but I think this may be confusing.

I like RETURN VALUE better myself.

> The parameter list of the function is only required to identify the function,
> also in cases where it exists with the same name in different flavours. This
> sticks to the general syntax of the command and should be easy to understand.

Right. You should be able to use the current COMMENT ON code. Note however that the last time I looked it does NOT support the full syntax that CREATE FUNCTION does. It'd be nice to fix that, but that's mostly a separate matter. Though, it would probably be nice if all of this stuff (along with the regprocedure input function) could be factored into a single piece of code...

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Double linking MemoryContext children
On 9/14/15 7:24 AM, Jan Wieck wrote: On 09/12/2015 11:35 AM, Kevin Grittner wrote: On the other hand, a grep indicates that there are two places that MemoryContextData.nextchild is set (and we therefore probably need to also set the new field), and Jan's proposed patch only changes one of them. If we do this, I think we need to change both places that are affected, so ResourceOwnerCreate() in resowner.c would need a line or two added. ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not MemoryContextData.nextchild. Anything ever happen with this? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort for every external sort run
On Tue, Nov 24, 2015 at 4:46 PM, Simon Riggs wrote:
>> Parallel sort is very important. Robert, Amit and I had a call about
>> this earlier today. We're all in agreement that this should be
>> extended in that direction, and have a rough idea about how it ought
>> to fit together with the parallelism primitives. Parallel sort in 9.6
>> could certainly happen -- that's what I'm aiming for. I haven't really
>> done preliminary research yet; I'll know more in a little while.
>
> Glad to hear it, I was hoping to see that.

As I mentioned just now, I'm working on parallel CREATE INDEX currently, which seems like a good proving ground for parallel sort, as it's where the majority of really expensive sorts occur. It would be nice to get parallel-aware sort nodes in 9.6, but I don't think I'll be able to make that happen in time. The required work in the optimizer is just too complicated.

The basic idea is that we use the parallel heapam interface, and have backends sort and write runs as with an external sort (if those runs are would-be internal sorts, we still write them to tape in the manner of external sorts). When done, worker processes release memory, but not tapes, initially. The leader reassembles an in-memory representation of the tapes that is basically consistent with it having generated those runs itself (using my new approach to external sorting). Then, it performs an on-the-fly merge, as before.

At the moment, I have the sorting of runs within workers using the parallel heapam interface more or less working, with workers dumping out the runs to tape. I'll work on reassembling the state of the tapes within the leader in the coming week. It's all still rather rough, but I think I'll have benchmarks before people start taking time off later in the month, and possibly even code. Cutting the scope of parallel sort in 9.6 to only cover parallel CREATE INDEX will make it likely that I'll be able to deliver something acceptable for that release.
-- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort for every external sort run
On Tue, Nov 24, 2015 at 4:33 PM, Peter Geoghegan wrote: > So, the bottom line is: This patch seems very good, is unlikely to > have any notable downside (no case has been shown to be regressed), > but has yet to receive code review. I am working on a new version with > the first two commits consolidated, and better comments, but that will > have the same code, unless I find bugs or am dissatisfied. It mostly > needs thorough code review, and to a lesser extent some more > performance testing. I'm currently spending a lot of time working on parallel CREATE INDEX. I should not delay posting a new version of my patch series any further, though. I hope to polish up parallel CREATE INDEX to be able to show people something in a couple of weeks. This version features consolidated commits, the removal of the multipass_warning parameter, and improved comments and commit messages. It has almost entirely unchanged functionality. The only functional changes are: * The function useselection() is taught to distrust an obviously bogus caller reltuples hint (when it's already less than half of what we know to be the minimum number of tuples that the sort must sort, immediately after LACKMEM() first becomes true -- this is probably a generic estimate). * Prefetching only occurs when writing tuples. Explicit prefetching appears to hurt in some cases, as David Rowley has shown over on the dedicated thread. But it might still be that writing tuples is a case that is simple enough to benefit consistently, due to the relatively uniform processing that memory latency can hide behind for that case (before, the same prefetching instructions were used for CREATE INDEX and for aggregates, for example). Maybe we should consider trying to get patch 0002 (the memory pool/merge patch) committed first, something Greg Stark suggested privately. 
That might actually be an easier way of integrating this work, since it changes nothing about the algorithm we use for merging (it only improves memory locality), and so is really an independent piece of work (albeit one that makes a huge overall difference due to the other patches increasing the time spent merging in absolute terms, and especially as a proportion of the total).

-- Peter Geoghegan

From 7ff4a3e4448b25ba524e40c0db57307466f167d9 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan
Date: Sun, 12 Jul 2015 13:14:01 -0700
Subject: [PATCH 3/6] Perform memory prefetching when writing memtuples

This patch is based on, but quite distinct from, a separately submitted more general version which performs prefetching in several places [1]. This version now only performs prefetching of each "tuple proper" during the writing of batches of tuples (an entire run, written following a quicksort).

The case for prefetching each "tuple proper" at several sites now seems weak due to differences in CPU microarchitecture. However, it might still be that there is a consistent improvement observable when writing out tuples, because that involves a particularly tight inner loop, with relatively predictable processing to hide memory latency behind. A helpful generic prefetch hint may be possible for this case, even if it proves impossible elsewhere. This has been shown to appreciably help on both a POWER7 server processor [2], and an Intel Mobile processor.
[1] https://commitfest.postgresql.org/6/305/
[2] CAM3SWZR5rv3+F3FOKf35=dti7otmmcdfoe2vogur0pddg3j...@mail.gmail.com
---
 config/c-compiler.m4               | 17 +
 configure                          | 31 +++
 configure.in                       |  1 +
 src/backend/utils/sort/tuplesort.c | 15 +++
 src/include/c.h                    | 14 ++
 src/include/pg_config.h.in         |  3 +++
 src/include/pg_config.h.win32      |  3 +++
 src/include/pg_config_manual.h     | 10 ++
 8 files changed, 94 insertions(+)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 550d034..8be2122 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -271,6 +271,23 @@ fi])# PGAC_C_BUILTIN_UNREACHABLE
 
+# PGAC_C_BUILTIN_PREFETCH
+# -
+# Check if the C compiler understands __builtin_prefetch(),
+# and define HAVE__BUILTIN_PREFETCH if so.
+AC_DEFUN([PGAC_C_BUILTIN_PREFETCH],
+[AC_CACHE_CHECK(for __builtin_prefetch, pgac_cv__builtin_prefetch,
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([],
+[int i = 0;__builtin_prefetch(&i, 0, 3);])],
+[pgac_cv__builtin_prefetch=yes],
+[pgac_cv__builtin_prefetch=no])])
+if test x"$pgac_cv__builtin_prefetch" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_PREFETCH, 1,
+          [Define to 1 if your compiler understands __builtin_prefetch.])
+fi])# PGAC_C_BUILTIN_PREFETCH
+
+
 # PGAC_C_VA_ARGS
 # --
 # Check if the C compiler understands C99-style variadic macros,
diff --git a/configure b/configure
index 8c61390..cd8db5d 100755
--- a/configure
+++ b/configure
@@ -11338,6 +11338,37 @@ if test x"$pgac_cv__builtin_unreachable" = xyes
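Independent of the patch itself, the builtin that configure probes for is easy to demonstrate. The sketch below shows `__builtin_prefetch` in a tight loop of the general shape the commit message describes; it illustrates the hint only, it is not the patch's tuplesort code, and the look-ahead distance of 8 elements is an arbitrary assumption:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum an array while prefetching a few elements ahead of the current
 * position.  __builtin_prefetch(addr, rw, locality) is a pure hint
 * understood by GCC and Clang: it never changes results, it only asks
 * the CPU to start fetching the cache line early so that memory
 * latency hides behind the (predictable) work on earlier elements.
 * The configure test above defines HAVE__BUILTIN_PREFETCH when the
 * compiler supports it.
 */
static uint64_t
sum_with_prefetch(const uint64_t *a, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i + 8 < n)
            __builtin_prefetch(&a[i + 8], 0, 3);    /* read, keep in cache */
        sum += a[i];
    }
    return sum;
}
```

Because the intrinsic is only a hint, the function returns exactly the same sum with or without it; the difference shows up only in memory-bound timings.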
Re: [HACKERS] Using quicksort for every external sort run
On Sun, Dec 6, 2015 at 3:59 PM, Jeff Janes wrote: > My column has the format of ABC-123-456-789-0 > > The name-space identifier ("ABC-") is the same in 99.99% of the > cases. And to date, as well as in my extrapolation, the first two > digits of the numeric part are leading zeros and the third one is > mostly 0,1,2. So the first 8 bytes really have less than 2 bits worth > of information. So yeah, not surprising abbreviation was not useful. I think that given you're using the "C" collation, abbreviation should still go ahead. I posted a patch to do that, which I need to further justify per Robert's request (currently, we do nothing special based on collation). Abbreviation should help in surprisingly marginal cases, since far fewer memory accesses will be required in the early stages of the sort with only (say) 5 distinct abbreviated keys. Once abbreviated comparisons start to not help at all (with quicksort, at some partition), there's a good chance that the full keys can be reused to some extent, before being evicted from CPU caches. >> BTW, roughly what does this CREATE INDEX look like? Is it a composite >> index, for example? > > Nope, just a single column index. In the extrapolated data set, each > distinct value shows up a couple hundred times on average. I'm > thinking of converting it to a btree_gin index once I've tested them a > bit more, as the compression benefits are substantial. Unfortunately, that cannot use tuplesort.c at all. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort for every external sort run
On Sun, Nov 29, 2015 at 2:01 AM, Peter Geoghegan wrote:
> On Sat, Nov 28, 2015 at 4:05 PM, Peter Geoghegan wrote:
>> So there was 27.76 seconds spent copying tuples into local memory
>> ahead of the quicksort, 2 minutes 56.68 seconds spent actually
>> quicksorting, and a trifling 10.32 seconds actually writing the run! I
>> bet that the quicksort really didn't use up too much memory bandwidth
>> on the system as a whole, since abbreviated keys are used with a cache
>> oblivious internal sorting algorithm.
>
> Uh, actually, that isn't so:
>
> LOG: begin index sort: unique = f, workMem = 1048576, randomAccess = f
> LOG: bttext_abbrev: abbrev_distinct after 160: 1.000489
> (key_distinct: 40.802210, norm_abbrev_card: 0.006253, prop_card: 0.20)
> LOG: bttext_abbrev: aborted abbreviation at 160 (abbrev_distinct:
> 1.000489, key_distinct: 40.802210, prop_card: 0.20)
>
> Abbreviation is aborted in all cases that you tested. Arguably this
> should happen significantly less frequently with the "C" locale,
> possibly almost never, but it makes this case less than representative
> of most people's workloads. I think that at least the first several
> hundred leading attribute tuples are duplicates.

I guess I wasn't paying sufficient attention to that part of trace_sort; I was not familiar enough with the abbreviation feature to interpret what it meant. I had thought we used 16 bytes for abbreviation, but now I see it is only 8 bytes.

My column has the format of ABC-123-456-789-0

The name-space identifier ("ABC-") is the same in 99.99% of the cases. And to date, as well as in my extrapolation, the first two digits of the numeric part are leading zeros and the third one is mostly 0, 1, 2. So the first 8 bytes really have less than 2 bits worth of information. So yeah, not surprising abbreviation was not useful.
(When I created the system, I did tests that showed it doesn't make much difference whether I used the format natively, or stripped it to something more compact on input and reformatted it on output. That was before abbreviation features existed) > > BTW, roughly what does this CREATE INDEX look like? Is it a composite > index, for example? Nope, just a single column index. In the extrapolated data set, each distinct value shows up a couple hundred times on average. I'm thinking of converting it to a btree_gin index once I've tested them a bit more, as the compression benefits are substantial. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
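The effect Jeff describes can be reproduced with a toy model of abbreviation: pack the first 8 bytes of each key into an integer and compare those first. With the shared "ABC-123-" prefix, every abbreviated key collapses to the same value, so the cheap integer comparisons resolve nothing. This is a simplified sketch for the "C" collation, not tuplesort.c's actual abbreviation machinery:

```c
#include <stdint.h>
#include <string.h>

/*
 * Toy abbreviated key for the "C" collation: the first 8 bytes of the
 * string packed big-endian into a uint64_t, so that comparing the
 * integers agrees with memcmp() on the 8-byte prefix.  Keys shorter
 * than 8 bytes are zero-padded, which sorts them before longer keys
 * sharing the same prefix, just as strcmp() would.
 */
static uint64_t
abbrev_key(const char *s)
{
    unsigned char buf[8] = {0};
    size_t        len = strlen(s);
    uint64_t      k = 0;

    memcpy(buf, s, len < 8 ? len : 8);
    for (int i = 0; i < 8; i++)
        k = (k << 8) | buf[i];
    return k;
}
```

For Jeff's data, `abbrev_key("ABC-123-456-789-0")` and `abbrev_key("ABC-123-999-999-9")` are identical, which is exactly why trace_sort reported abbrev_distinct near 1.0 and aborted abbreviation.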
Re: [HACKERS] PATCH: index-only scans with partial indexes
Hello Kyotaro-san,

Sorry for the long delay since your response in this thread :-(

On 10/14/2015 08:06 AM, Kyotaro HORIGUCHI wrote:
> The table t is referred to twice by different (alias) names (though the
> difference is made by EXPLAIN, it shows that they are different rels in
> plantree). So we'll have these indexrinfos:
>
> aidx: {b=2}
> bidx: {a=1}

Yes, but each of them belongs to different rels.

> So, which makes index only scans unusable.
>
> They are usable.

Yes, you're of course right. I got confused by the comment at IndexOptInfo that says "per-index information", but as you've pointed out it's really "per-index per-reloptinfo", which makes it perfectly suitable for keeping the info we need.

However, I'm not sure the changes to check_partial_indexes() are correct - particularly the first loop:

    /*
     * Frequently, there will be no partial indexes, so first check to
     * make sure there's something useful to do here.
     */
    have_partial = false;
    foreach(lc, rel->indexlist)
    {
        IndexOptInfo *index = (IndexOptInfo *) lfirst(lc);

        /*
         * index rinfos are the same as baserestrictinfos for non-partial
         * indexes
         */
        index->indrinfos = rel->baserestrictinfo;

        if (index->indpred == NIL)
            continue;           /* ignore non-partial indexes */

        if (index->predOK)
            continue;           /* don't repeat work if already proven OK */

        have_partial = true;
        break;
    }

Now, imagine we have two indexes - partial and regular. In such a case the loop will terminate after the first index (assuming predOK=true), so the regular index won't have indrinfos set. I think we need to remove the 'break' at the end.

BTW it took me a while to understand the change at the end:

    /* Search for the first rinfo that is implied by indpred */
    foreach (lcr, rel->baserestrictinfo)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lcr);

        if (predicate_implied_by(list_make1(rinfo->clause), index->indpred))
            break;
    }

    /* This index needs rinfos different from baserestrictinfo */
    if (lcr)
    {
        ... filter implied conditions ...
    }

Is there a reason why not to simply move the second block into the if block in the foreach loop, like this?

    /* Search for the first rinfo that is implied by indpred */
    foreach (lcr, rel->baserestrictinfo)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lcr);

        if (predicate_implied_by(list_make1(rinfo->clause), index->indpred))
        {
            ... filter implied conditions ...
            break;
        }
    }

Otherwise the reworked patch seems fine to me, but I'll give it a bit more testing over the next few days. Thanks for the help so far!

-- Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
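The first problem Tomas points out (the per-index default assignment must run for every index, so the "is there work to do" scan must not break out early) reduces to a small self-contained analog in plain C; the types here are hypothetical stand-ins, not the planner's:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IndexOptInfo, just enough for the demo. */
typedef struct ToyIndex
{
    bool partial;   /* has a predicate (index->indpred != NIL) */
    bool pred_ok;   /* predicate already proven (index->predOK) */
    int  rinfos;    /* stand-in for index->indrinfos */
} ToyIndex;

/*
 * Analog of the corrected first loop of check_partial_indexes():
 * every index receives the default rinfos, and we merely note whether
 * any partial, not-yet-proven index exists.  Breaking out of the loop
 * as soon as one is found (the bug) would leave later indexes with
 * their rinfos never assigned.
 */
static bool
scan_indexes(ToyIndex *idx, size_t n, int default_rinfos)
{
    bool have_partial = false;

    for (size_t i = 0; i < n; i++)
    {
        idx[i].rinfos = default_rinfos;     /* must happen for ALL indexes */
        if (idx[i].partial && !idx[i].pred_ok)
            have_partial = true;            /* note it, but keep going */
    }
    return have_partial;
}
```

With the early `break` in place, an array of {partial, regular} would leave the regular index's `rinfos` untouched, which is exactly the bug described above.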
[HACKERS] Unicode collations in FreeBSD 11, DragonFly BSD 4.4 without ICU
Hi Since the ICU patch from the BSD ports trees has been discussed on this list a few times I thought people might be interested in this news: https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057781.html https://www.dragonflybsd.org/release44/ https://forums.freebsd.org/threads/this-is-how-i-like-open-source.53943/ (Thanks to pstef on #postgresql for pointing me at this). -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] a word-choice question
On 12/06/2015 11:03 AM, Tom Lane wrote: Chapman Flack writes: In PL/Java's function annotations, 'type' has been used for the volatility category: @Function(type=IMMUTABLE) public static String hello() { return "Hello, world!"; } It seems a bit infelicitous because you would probably sooner guess that 'type' was telling you something about the function's _data_ type Yeah ... But I've been trying think of something short, clear, preferably monosyllabic, less geeky than 'v8y', and I don't have a winner yet. Instead of thinking about "volatility", maybe something based on "stability" or "constancy"? I suppose "const" is too much of a conflict, but "stable=IMMUTABLE" doesn't seem awful. I would stick with "volatility". It makes for consistency - pg_proc has "provolatile" for example. I think it's a moderately well understood term in the Postgres world. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries
Hi,

I've added new grammar rules to sqlsmith and improved some older ones. This was rewarded with a return of "failed to generate plan" errors. The failing queries all contain a lateral subquery. The shortest of the failing queries are below. They were run against the regression db of master as of db07236.

regards,
Andreas

smith=# select msg, query from error
        where (firstline(msg) ~~ 'ERROR: failed to build any%'
               or firstline(msg) ~~ 'ERROR: could not devise a query plan%')
          and t > now() - interval '1 day'
        order by length(query) asc limit 3;

ERROR: failed to build any 8-way joins

select ref_96.foreign_table_schema as c0, sample_87.is_supported as c1
from information_schema.sql_packages as sample_87 tablesample system (0.2)
  right join information_schema._pg_foreign_tables as ref_96
    on (sample_87.feature_id = ref_96.foreign_table_catalog),
  lateral (select sample_87.is_verified_by as c0,
                  ref_97.indexed_col as c1,
                  coalesce(sample_87.feature_id, ref_96.foreign_server_name) as c2,
                  4 as c3
           from public.comment_test as ref_97
           where ref_97.id ~>~ ref_97.indexed_col
           fetch first 73 rows only) as subq_33
where ref_96.foreign_table_name ~~ subq_33.c1

ERROR: could not devise a query plan for the given query

select subq_43.c0 as c0
from (select ref_181.installed as c0
      from pg_catalog.pg_available_extension_versions as ref_181,
        lateral (select ref_181.name as c0, ref_181.installed as c1
                 from pg_catalog.pg_conversion as ref_182
                 where ref_182.conname ~~* ref_181.version
                 fetch first 98 rows only) as subq_42
      where (subq_42.c0 is not NULL) or (subq_42.c1 is NULL)) as subq_43
  right join pg_catalog.pg_language as sample_177 tablesample system (2.8)
    on (subq_43.c0 = sample_177.lanispl)
where sample_177.lanowner < sample_177.lanvalidator

ERROR: failed to build any 5-way joins

select ref_239.id2 as c0, 40 as c1, ref_239.id2 as c2, ref_238.aa as c3
from public.tt5 as sample_289 tablesample system (8.1)
  inner join information_schema.element_types as ref_237
    on (sample_289.x = ref_237.character_maximum_length)
  left join public.b as ref_238
    on (ref_237.character_maximum_length = ref_238.aa)
  left join public.num_exp_mul as ref_239
    on (ref_237.numeric_precision_radix = ref_239.id1),
  lateral (select sample_290.b as c0, sample_289.y as c1, ref_239.id2 as c2
           from public.rtest_t8 as sample_290 tablesample bernoulli (4.6)
           where (sample_290.b > ref_238.bb)
             and (sample_289.y > ref_239.expected)
           fetch first 91 rows only) as subq_64
where (subq_64.c1 > sample_289.y) and (sample_289.y = ref_239.expected)
fetch first 133 rows only

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical replication and multimaster
Hi,

I have integrated pglogical_output in multimaster, using bdr_apply from BDR as a template for the implementation of the receiver part. The time of insert is reduced almost 10 times compared with logical replication based on the decoder_raw/receiver_raw plugins, which perform logical replication using SQL statements. But unfortunately the time of updates is almost unchanged. This is the expected result, because I didn't see any functions related to SQL parsing/preparing in the profile. Now in both cases the profile is similar:

  4.62%  postgres  [.] HeapTupleSatisfiesMVCC
  2.99%  postgres  [.] heapgetpage
  2.10%  postgres  [.] hash_search_with_hash_value
  1.86%  postgres  [.] ExecProject
  1.80%  postgres  [.] heap_getnext
  1.79%  postgres  [.] PgXidInMVCCSnapshot

By the way, you asked about comments concerning pglogical_output. I have one: most of the pglogical protocol functions have a "PGLogicalOutputData *data" parameter. There are a few exceptions: write_startup_message_fn, pglogical_write_origin_fn, pglogical_write_rel_fn. PGLogicalOutputData is the only way to pass protocol-specific data, using the "PGLogicalProtoAPI *api" field. This field is assigned by the pglogical_init_api() function. And I can extend this PGLogicalProtoAPI structure by adding some protocol-specific fields.
For example, this is how it is done now for multimaster:

typedef struct PGLogicalProtoMM
{
    PGLogicalProtoAPI api;
    bool isLocal;   /* mark transaction as local */
} PGLogicalProtoMM;

PGLogicalProtoAPI *
pglogical_init_api(PGLogicalProtoType typ)
{
    PGLogicalProtoMM  *pmm = palloc0(sizeof(PGLogicalProtoMM));
    PGLogicalProtoAPI *res = &pmm->api;

    pmm->isLocal = false;
    res->write_rel = pglogical_write_rel;
    res->write_begin = pglogical_write_begin;
    res->write_commit = pglogical_write_commit;
    res->write_insert = pglogical_write_insert;
    res->write_update = pglogical_write_update;
    res->write_delete = pglogical_write_delete;
    res->write_startup_message = write_startup_message;
    return res;
}

But I have to add a "PGLogicalOutputData *data" parameter to the pglogical_write_rel_fn function. Do you think that it would be better to pass this parameter to all functions? Maybe this is not the intended way of passing custom data to these functions... Certainly it is possible to use static variables for this purpose. But I think that passing user-specific data through PGLogicalOutputData is a safer and more flexible solution.

On 12/03/2015 04:53 PM, Craig Ringer wrote:
On 3 December 2015 at 19:06, konstantin knizhnik wrote:
On Dec 3, 2015, at 10:34 AM, Craig Ringer wrote:
On 3 December 2015 at 14:54, konstantin knizhnik wrote:

    I'd really like to collaborate using pglogical_output if at all possible. Petr's working really hard to get the pglogical downstream out too, with me helping where I can.

    And where can I get the pglogical_output plugin? Sorry, but I can't quickly find a reference with Google...

It's been submitted to this CF.

https://commitfest.postgresql.org/7/418/
https://github.com/2ndQuadrant/postgres/tree/dev/pglogical-output

Any tests and comments would be greatly appreciated.

Thank you. I wonder if there is an opposite part of the pipe for pglogical_output - an analog of receiver_raw?
It's pglogical, and it's in progress, due to be released at the same time as 9.5. We're holding it a little longer to nail down the user interface a bit better, etc, and because sometimes the real world gets in the way. The catalogs and UI are very different to BDR, it's much more extensible/modular, it supports much more flexible topologies, etc... but lots of the core concepts are very similar. So if you go take a look at the BDR code that'll give you a pretty solid idea of how a lot of it works, though BDR has whole subsystems pglogical doesn't (global ddl lock, ddl replication, etc). -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] Equivalence Class Filters
David Rowley writes: > On 6 December 2015 at 06:07, Tom Lane wrote: >> Another issue that would need consideration is how to avoid skewing >> planner selectivity estimates with derived clauses that are fully >> redundant with other clauses. > Could you give me an example of where this is a problem? Using the regression database, try explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand; explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100; explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100 and b.thousand < 100; The first two give pretty accurate estimates for the join size, the third not so much, because it thinks b.thousand < 100 is an independent constraint. > I've tried fooling > the planner into giving me bad estimates, but I've not managed to. > # explain analyze select * from t1 where t1.id < 10 and t1.id < 100 and > t1.id < 1000; That particular case works because clausesel.c's AddRangeClause figures out that the similar inequalities are redundant and drops all but one, on its way to (not) identifying a range constraint for t1.id. There's nothing comparable for constraints on different variables, though, especially not constraints on Vars of different relations, which will never even appear in the same restriction list. > if so, is the current eclass code not prone to the exact same problem? No, and I already explained why not. >> Lastly, in most cases knowing that t2.id <= 10 is just not worth all >> that much; it's certainly far less useful than an equality condition. > I'd have thought that my link to a thread of a reported 1100 to 2200 times > performance improvement might have made you think twice about claiming the > uselessness of this idea. I said "in most cases". You can find example cases to support almost any weird planner optimization no matter how expensive and single-purpose; but that is the wrong way to think about it. 
What you have to think about is average cases, and in particular, not putting a drag on planning time in cases where no benefit ensues. We're not committing any patches that give one uncommon case an 1100X speedup by penalizing every other query 10%, or even 1%; especially not when there may be other ways to fix it. The EC machinery wasn't designed in a vacuum. It is interested in deriving equality conditions because those are useful for merge and hash joins. It's also tightly tied in with the planner's understanding of sort ordering and distinct-ness. None of those considerations provide any reason to build a similar thing for inequalities. It may be that computing derived inequalities is something that's worth adding, but we're going to have to take a very hard look at the costs and benefits; it's not a slam dunk that such a patch would get committed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
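The skew in Tom's tenk1 example follows directly from multiplying the selectivities of clauses the planner treats as independent. A back-of-envelope sketch, using assumed round figures (10000 rows per table, 1000 distinct values of `thousand`, so `thousand < 100` keeps roughly 10%); this is not the planner's actual costing code:

```c
/*
 * Rough join-size estimate under the independence assumption: the
 * cross product scaled by the join selectivity (~1/num_distinct for
 * an equijoin) and by the product of all extra filter selectivities.
 * The trouble Tom describes: when b.thousand < 100 is fully implied
 * by a.thousand = b.thousand AND a.thousand < 100, multiplying in its
 * 0.1 selectivity anyway shrinks the estimate about 10x below reality.
 */
static double
est_rows(double outer_rows, double inner_rows,
         double join_sel, double extra_sel)
{
    return outer_rows * inner_rows * join_sel * extra_sel;
}
```

Plugging in the three queries: the plain join estimates ~100000 rows; adding `a.thousand < 100` gives ~10000 (which matches reality, since the join carries the restriction over to b); adding the redundant `b.thousand < 100` multiplies in another 0.1 and drops the estimate to ~1000, roughly 10x too low.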
Re: [HACKERS] a word-choice question
Chapman Flack writes:
> In PL/Java's function annotations, 'type' has been used for the
> volatility category:
>
>   @Function(type=IMMUTABLE)
>   public static String hello() { return "Hello, world!"; }
>
> It seems a bit infelicitous because you would probably sooner guess that
> 'type' was telling you something about the function's _data_ type

Yeah ...

> But I've been trying to think of something short, clear, preferably
> monosyllabic, less geeky than 'v8y', and I don't have a winner yet.

Instead of thinking about "volatility", maybe something based on
"stability" or "constancy"?  I suppose "const" is too much of a conflict,
but "stable=IMMUTABLE" doesn't seem awful.

			regards, tom lane
Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)
> On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund wrote:
> > On 2015-12-02 08:52:20 +, Kouhei Kaigai wrote:
> >> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> >> index 26264cb..c4bb76e 100644
> >> --- a/src/backend/nodes/copyfuncs.c
> >> +++ b/src/backend/nodes/copyfuncs.c
> >> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
> >>  static ForeignScan *
> >>  _copyForeignScan(const ForeignScan *from)
> >>  {
> >> -	ForeignScan *newnode = makeNode(ForeignScan);
> >> -
> >> +	const ExtensibleNodeMethods *methods
> >> +		= GetExtensibleNodeMethods(from->extnodename, true);
> >> +	ForeignScan *newnode = (ForeignScan *) newNode(!methods
> >> +												? sizeof(ForeignScan)
> >> +												: methods->node_size,
> >> +												T_ForeignScan);
> >
> > Changing the FDW API seems to put this squarely into 9.6
> > territory. Agreed?
>
> I don't think anybody thought this patch was 9.5 material.  I didn't,
> anyway.  The FDW changes for the join-pushdown-EPQ problem are another
> issue, but that can be discussed on the other thread.
>
I don't expect it to be a 9.5 feature.  The name of the attached patch
was "pgsql-v9.6-custom-private.v2.patch".

> >>  	/*
> >>  	 * copy node superclass fields
> >>  	 */
> >> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
> >>  	/*
> >>  	 * copy remainder of node
> >>  	 */
> >> +	COPY_STRING_FIELD(extnodename);
> >>  	COPY_SCALAR_FIELD(fs_server);
> >>  	COPY_NODE_FIELD(fdw_exprs);
> >>  	COPY_NODE_FIELD(fdw_private);
> >> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
> >>  	COPY_NODE_FIELD(fdw_recheck_quals);
> >>  	COPY_BITMAPSET_FIELD(fs_relids);
> >>  	COPY_SCALAR_FIELD(fsSystemCol);
> >> +	if (methods && methods->nodeCopy)
> >> +		methods->nodeCopy((Node *) newnode, (const Node *) from);
> >
> > I wonder if we shouldn't instead "recurse" into a generic routine for
> > extensible nodes here.
>
> I don't know what that means.
> >> +void
> >> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
> >> +{
> >> +	uint32		hash;
> >> +	int			index;
> >> +	ListCell   *lc;
> >> +	MemoryContext oldcxt;
> >> +
> >> +	if (!xnodes_methods_slots)
> >> +		xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
> >> +								sizeof(List *) * XNODES_NUM_SLOTS);
> >> +
> >> +	hash = hash_any(methods->extnodename, strlen(methods->extnodename));
> >> +	index = hash % XNODES_NUM_SLOTS;
> >> +
> >> +	/* duplication check */
> >> +	foreach (lc, xnodes_methods_slots[index])
> >> +	{
> >> +		const ExtensibleNodeMethods *curr = lfirst(lc);
> >> +
> >> +		if (strcmp(methods->extnodename, curr->extnodename) == 0)
> >> +			ereport(ERROR,
> >> +					(errcode(ERRCODE_DUPLICATE_OBJECT),
> >> +					 errmsg("ExtensibleNodeMethods \"%s\" already exists",
> >> +							methods->extnodename)));
> >> +	}
> >> +	/* ok, register this entry */
> >> +	oldcxt = MemoryContextSwitchTo(TopMemoryContext);
> >> +	xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
> >> +										  (void *) methods);
> >> +	MemoryContextSwitchTo(oldcxt);
> >> +}
> >
> > Why aren't you using dynahash here, and instead reimplement a hashtable?
>
> I had thought we might assume that the number of extensible nodes was
> small enough that we could use an array for this and just use linear
> search.  But if we want to keep the door open to having enough
> extensible nodes that this would be inefficient, then I agree that
> dynahash is better than a hand-rolled hash table.
>
Hmm.  I also expected that the number of extensible nodes would not be
large, so a self-defined hash table seemed sufficient.  However, that is
not a strong reason.  Let me revise the hash implementation.
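[Editor's illustration] The register-with-duplicate-check pattern in the quoted hunk can be sketched without the PostgreSQL memory-context and List machinery. The names, slot count, and hash function below are invented for illustration; the real patch uses hash_any(), List cells, and ereport() instead:

```c
#include <stdlib.h>
#include <string.h>

#define NUM_SLOTS 32            /* stands in for XNODES_NUM_SLOTS */

typedef struct Entry
{
    const char *name;
    struct Entry *next;
} Entry;

static Entry *slots[NUM_SLOTS]; /* stands in for xnodes_methods_slots */

/* Trivial djb2-style string hash; stands in for hash_any(). */
static unsigned
hash_name(const char *s)
{
    unsigned h = 5381;

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h % NUM_SLOTS;
}

/*
 * Register a name, refusing duplicates.  Returns 0 on success, -1 on a
 * duplicate (where the real code raises ERRCODE_DUPLICATE_OBJECT).
 */
static int
register_entry(const char *name)
{
    unsigned idx = hash_name(name);
    Entry *e;

    /* duplication check: walk the collision chain for this slot */
    for (e = slots[idx]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return -1;

    /* ok, prepend a new entry to the chain */
    e = malloc(sizeof(Entry));
    e->name = name;
    e->next = slots[idx];
    slots[idx] = e;
    return 0;
}
```

Lookup follows the same hash-then-walk-chain shape, which is exactly what dynahash would provide for free; Robert's point is that once the entry count can grow, reusing dynahash beats maintaining this by hand.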
> >>  static ForeignScan *
> >>  _readForeignScan(void)
> >>  {
> >> +	const ExtensibleNodeMethods *methods;
> >>  	READ_LOCALS(ForeignScan);
> >>
> >>  	ReadCommonScan(&local_node->scan);
> >>
> >> +	READ_STRING_FIELD(extnodename);
> >>  	READ_OID_FIELD(fs_server);
> >>  	READ_NODE_FIELD(fdw_exprs);
> >>  	READ_NODE_FIELD(fdw_private);
> >> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
> >>  	READ_BITMAPSET_FIELD(fs_relids);
> >>  	READ_BOOL_FIELD(fsSystemCol);
> >>
> >> +	methods = GetExtensibleNodeMethods(local_node->extnodename, true);
> >> +	if (methods && methods->nodeRead)
> >> +	{
> >> +		ForeignScan *local_temp = palloc0(methods->node_size);
> >> +
> >> +		Assert(methods->node_size >= sizeof(ForeignScan));
> >> +
> >> +		memcpy(local_temp, local_node, sizeof(ForeignScan));
> >> +		methods->nodeRead((Node *) local_temp);
> >> +
> >> +		pfree(lo
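[Editor's illustration] The enlarge-and-copy pattern in the hunk above (read the base node, then move it into an allocation of the extension's declared size) can be shown in isolation. The struct names are invented, and calloc/free stand in for palloc0/pfree:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a plan node such as ForeignScan. */
typedef struct BaseNode
{
    int tag;
    int field;
} BaseNode;

/* An extension's node embeds the base node as its first member,
 * as required for the memcpy below to be valid. */
typedef struct ExtNode
{
    BaseNode base;
    int extra;
} ExtNode;

/*
 * Re-home a base node into a larger, zero-initialized allocation of
 * ext_size bytes, as _readForeignScan does with methods->node_size.
 * The caller's pointer is consumed; the enlarged node is returned.
 */
static void *
enlarge_node(BaseNode *node, size_t ext_size)
{
    BaseNode *bigger;

    assert(ext_size >= sizeof(BaseNode));   /* mirrors the patch's Assert */
    bigger = calloc(1, ext_size);
    memcpy(bigger, node, sizeof(BaseNode)); /* copy base fields */
    free(node);                             /* mirrors the pfree */
    return bigger;                          /* extra fields are zeroed */
}
```

After this step the extension's nodeRead callback can fill in the trailing private fields, while all base fields read by the generic code are preserved.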
Re: [HACKERS] proposal: multiple psql option -c
On Sun, Dec 6, 2015 at 10:56 PM, Michael Paquier wrote:
> Thanks, I looked at that again and problem is fixed as attached.

Er, not exactly... poll_query_until in PostgresNode.pm is using psql -c
without the --no-psqlrc switch, so this patch causes the regression tests
of pg_rewind to fail.  Fixed as attached.
--
Michael

20151206_psql_commands_v4.patch
Description: binary/octet-stream
Re: [HACKERS] proposal: multiple psql option -c
On Fri, Dec 4, 2015 at 11:47 PM, Robert Haas wrote:
> On Wed, Dec 2, 2015 at 12:33 AM, Pavel Stehule wrote:
>>> Yeah, I don't think that's a big issue either to be honest. The code
>>> is kept consistent a maximum with what is there previously.
>>>
>>> Patch is switched to ready for committer.
>>
>> perfect
>>
>> Thank you very much to all
>
> I did some edits on this patch and was all set to commit it when I ran
> the regression tests and discovered that this breaks 130 out of the
> 160 regression tests.  Allow me to suggest that before submitting a
> patch, or marking it ready for committer, you test that 'make check'
> passes.

Mea culpa.  I thought I did a check-world run... But well...

> For the most part, the cleanups in this version are just cosmetic: I
> fixed some whitespace damage, and reverted some needless changes to
> the psql reference page that were whitespace-only adjustments.  In a
> few places, I tweaked documentation or comment language.  I also
> hoisted the psqlrc handling out of an if statement where it was the
> same in both branches.  Other than that, this version is, I believe,
> the same as Pavel's last version.

Thanks, I looked at that again and the problem is fixed as attached.
--
Michael

20151206_psql_commands_v3.patch
Description: binary/octet-stream
Re: [HACKERS] [PATCH] Equivalence Class Filters
On 6 December 2015 at 06:07, Tom Lane wrote:
> David Rowley writes:
>> As of today these Equivalence Classes only incorporate expressions
>> which use the equality operator, but what if the above query had been
>> written as:
>>
>>   select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;
>>
>> Should we not be able to assume that t2.id is also <= 10?
>
> This sort of thing has been suggested multiple times before, and I've
> rejected it every time on the grounds that it would likely add far more
> planner cycles than it'd be worth, eg, time would be spent on searches
> for matching subexpressions whether or not anything was learned (and
> often nothing would be learned).

I guess if it keeps coming up then people must be hitting this
limitation.  Perhaps continually rejecting it based on your original
opinion is not the best way to move forward with improving the query
planner.

> While I've only read your description of the patch not the patch
> itself, the search methodology you propose seems pretty brute-force and
> unlikely to solve that issue.  It's particularly important to avoid
> O(N^2) behaviors when there are N expressions ...

Likely it would be possible to do something to improve on that.  Perhaps
if the list of filter clauses grows beyond a certain number then a hash
table can be built on the ec_members list, with the em_expr as the key
and the EquivalenceClass as the data.  Although likely we don't currently
have anything available for generating hash values for Expr nodes.  On
checking, it seems that 32 is the estimated tipping point for hashing in
find_join_rel(); perhaps something similar would suit this.

> Another issue that would need consideration is how to avoid skewing
> planner selectivity estimates with derived clauses that are fully
> redundant with other clauses.  The existing EC machinery is mostly
> able to dodge that problem by generating just a minimal set of equality
> clauses from an EC, but I don't see how we'd make that work here.
Could you give me an example of where this is a problem?  I've tried
fooling the planner into giving me bad estimates, but I've not managed
to.

# explain analyze select * from t1 where t1.id < 10 and t1.id < 100 and
t1.id < 1000;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using t1_pkey on t1  (cost=0.29..8.49 rows=9 width=8)
   (actual time=0.155..0.160 rows=9 loops=1)
   Index Cond: ((id < 10) AND (id < 100) AND (id < 1000))

I'd say the t1.id < 100 AND t1.id < 1000 are fully redundant here.  The
estimate given by the planner is bang-on.  Perhaps you're talking about
another column making the pushed-down qual redundant?  If so, is the
current eclass code not prone to the exact same problem?

> I'm also wondering why you want to limit it to comparisons to
> constants; that seems rather arbitrary.
>
> Do you have other useful cases in mind?

The other one I considered was "expr op expr" clauses, where both those
exprs were in eclasses.  For example:

select * from t1 inner join t2 on t1.col1 = t2.col1 and t1.col2 = t2.col2
where t1.col1 < t1.col2;

I'd imagine that would have a narrower use case.  I simply stopped with
"Expr op Const" because I thought that would be more common and less
planner overhead to detect.  Given that below you also raise concerns
that "expr op const" is likely not that useful in enough cases, I'm not
holding out too much hope of adding more complexity to the detection
mechanism for less common wins.

> Lastly, in most cases knowing that t2.id <= 10 is just not worth all
> that much; it's certainly far less useful than an equality condition.
> It's not difficult to imagine that this would often be a net drag on
> runtime performance (never mind planner performance) by doing nothing
> except creating additional filter conditions the executor has to check.
> Certainly it would be valuable to know this if it let us exclude some
> partition of a table, but that's only going to happen in a small
> minority of queries.
I'd have thought that my link to a thread of a reported 1100 to 2200
times performance improvement might have made you think twice about
claiming the uselessness of this idea.

I'm also quite surprised at you mentioning partitions here.  Personally
I'd be thinking of indexes long before thinking of partitions.  I don't
think I need to remind you that btree indexes handle >= and <= just
fine.  It's not all that hard to imagine that if someone is using that
column for a join condition and is serious about performance, they just
might also have an index defined on that column too.

I'd imagine the only case we might have to worry about is when the
selectivity of the pushed qual is getting close to 1.  Perhaps the
RestrictInfos could be marked as "optional" somehow, and we could simply
remove them when their selectivity estimates are too high.  I'm not
necessarily oppos